Introduce new LSM interfaces to allow an FS to deal with their own mount
options. This includes a new string parsing function exported from the
LSM that an FS can use to get a security data blob and a new security
data blob. This is particularly useful for an FS which uses binary
mount data, like NFS, which does not pass strings into the vfs to be
handled by the loaded LSM. Also fix a BUG() in both SELinux and SMACK
when dealing with binary mount data. If the binary mount data is less
than one page the copy_page() in security_sb_copy_data() can cause an
illegal page fault and boom. Remove all NFSisms from the SELinux code
since they were broken by past NFS changes.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
Update the Smack LSM to allow the registration of the capability "module"
as a secondary LSM. Integrate the new hooks required for file based
capabilities.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Paul Moore <paul.moore@hp.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simplify the uid equivalence check in cap_task_kill(). Anyone can kill a
process owned by the same uid.
Without this patch wireshark is reported to fail.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Smack uses CIPSO labeling, but allows for unlabeled packets by
specifying an "ambient" label that is applied to incoming unlabeled
packets.
Because the other end of the connection may dislike IP options, and ssh
is one know application that behaves thus, it is prudent to respond in
kind.
This patch changes the network labeling behavior such that an outgoing
packet that would be given a CIPSO label that matches the ambient label
is left unlabeled. An "unlbl" domain is added and the netlabel
defaulting mechanism invoked rather than assuming that everything is
CIPSO. Locking has been added around changes to the ambient label as
the mechanisms used to do so are more involved.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
audit_log_d_path() is a d_path() wrapper that is used by the audit code. To
use a struct path in audit_log_d_path() I need to embed it into struct
avc_audit_data.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.
Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
<dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
struct path in every place where the stack can be traversed
- it reduces the overall code size:
without patch series:
text data bss dec hex filename
5321639 858418 715768 6895825 6938d1 vmlinux
with patch series:
text data bss dec hex filename
5320026 858418 715768 6894212 693284 vmlinux
This patch:
Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's a small problem with smack and NFS. A similar report was also
sent here: http://lkml.org/lkml/2007/10/27/85
I've also added similar checks in inode_{get/set}security(). Cheating from
SELinux post_create_socket(), it does the same.
[akpm@linux-foundation.org: remove uneeded BUG_ON()]
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Acked-by: Casey Schaufler <casey@schuafler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix SELinux to handle 64-bit capabilities correctly, and to catch
future extensions of capabilities beyond 64 bits to ensure that SELinux
is properly updated.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
The security_get_policycaps() functions has a couple of bugs in it and it
isn't currently used by any in-tree code, so get rid of it and all of it's
bugginess.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@localhost.localdomain>
Since it was decided that low memory protection from userspace couldn't
be turned on by default add a Kconfig option to allow users/distros to
set a default at compile time. This value is still tunable after boot
in /proc/sys/vm/mmap_min_addr
Discussion:
http://www.mail-archive.com/linux-security-module@vger.kernel.org/msg02543.html
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Smack is the Simplified Mandatory Access Control Kernel.
Smack implements mandatory access control (MAC) using labels
attached to tasks and data containers, including files, SVIPC,
and other tasks. Smack is a kernel based scheme that requires
an absolute minimum of application support and a very small
amount of configuration data.
Smack uses extended attributes and
provides a set of general mount options, borrowing technics used
elsewhere. Smack uses netlabel for CIPSO labeling. Smack provides
a pseudo-filesystem smackfs that is used for manipulation of
system Smack attributes.
The patch, patches for ls and sshd, a README, a startup script,
and x86 binaries for ls and sshd are also available on
http://www.schaufler-ca.com
Development has been done using Fedora Core 7 in a virtual machine
environment and on an old Sony laptop.
Smack provides mandatory access controls based on the label attached
to a task and the label attached to the object it is attempting to
access. Smack labels are deliberately short (1-23 characters) text
strings. Single character labels using special characters are reserved
for system use. The only operation applied to Smack labels is equality
comparison. No wildcards or expressions, regular or otherwise, are
used. Smack labels are composed of printable characters and may not
include "/".
A file always gets the Smack label of the task that created it.
Smack defines and uses these labels:
"*" - pronounced "star"
"_" - pronounced "floor"
"^" - pronounced "hat"
"?" - pronounced "huh"
The access rules enforced by Smack are, in order:
1. Any access requested by a task labeled "*" is denied.
2. A read or execute access requested by a task labeled "^"
is permitted.
3. A read or execute access requested on an object labeled "_"
is permitted.
4. Any access requested on an object labeled "*" is permitted.
5. Any access requested by a task on an object with the same
label is permitted.
6. Any access requested that is explicitly defined in the loaded
rule set is permitted.
7. Any other access is denied.
Rules may be explicitly defined by writing subject,object,access
triples to /smack/load.
Smack rule sets can be easily defined that describe Bell&LaPadula
sensitivity, Biba integrity, and a variety of interesting
configurations. Smack rule sets can be modified on the fly to
accommodate changes in the operating environment or even the time
of day.
Some practical use cases:
Hierarchical levels. The less common of the two usual uses
for MLS systems is to define hierarchical levels, often
unclassified, confidential, secret, and so on. To set up smack
to support this, these rules could be defined:
C Unclass rx
S C rx
S Unclass rx
TS S rx
TS C rx
TS Unclass rx
A TS process can read S, C, and Unclass data, but cannot write it.
An S process can read C and Unclass. Note that specifying that
TS can read S and S can read C does not imply TS can read C, it
has to be explicitly stated.
Non-hierarchical categories. This is the more common of the
usual uses for an MLS system. Since the default rule is that a
subject cannot access an object with a different label no
access rules are required to implement compartmentalization.
A case that the Bell & LaPadula policy does not allow is demonstrated
with this Smack access rule:
A case that Bell&LaPadula does not allow that Smack does:
ESPN ABC r
ABC ESPN r
On my portable video device I have two applications, one that
shows ABC programming and the other ESPN programming. ESPN wants
to show me sport stories that show up as news, and ABC will
only provide minimal information about a sports story if ESPN
is covering it. Each side can look at the other's info, neither
can change the other. Neither can see what FOX is up to, which
is just as well all things considered.
Another case that I especially like:
SatData Guard w
Guard Publish w
A program running with the Guard label opens a UDP socket and
accepts messages sent by a program running with a SatData label.
The Guard program inspects the message to ensure it is wholesome
and if it is sends it to a program running with the Publish label.
This program then puts the information passed in an appropriate
place. Note that the Guard program cannot write to a Publish
file system object because file system semanitic require read as
well as write.
The four cases (categories, levels, mutual read, guardbox) here
are all quite real, and problems I've been asked to solve over
the years. The first two are easy to do with traditonal MLS systems
while the last two you can't without invoking privilege, at least
for a while.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: Joshua Brindle <method@manicmethod.com>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Ahmed S. Darwish" <darwish.07@gmail.com>
Cc: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The capability bounding set is a set beyond which capabilities cannot grow.
Currently cap_bset is per-system. It can be manipulated through sysctl,
but only init can add capabilities. Root can remove capabilities. By
default it includes all caps except CAP_SETPCAP.
This patch makes the bounding set per-process when file capabilities are
enabled. It is inherited at fork from parent. Noone can add elements,
CAP_SETPCAP is required to remove them.
One example use of this is to start a safer container. For instance, until
device namespaces or per-container device whitelists are introduced, it is
best to take CAP_MKNOD away from a container.
The bounding set will not affect pP and pE immediately. It will only
affect pP' and pE' after subsequent exec()s. It also does not affect pI,
and exec() does not constrain pI'. So to really start a shell with no way
of regain CAP_MKNOD, you would do
prctl(PR_CAPBSET_DROP, CAP_MKNOD);
cap_t cap = cap_get_proc();
cap_value_t caparray[1];
caparray[0] = CAP_MKNOD;
cap_set_flag(cap, CAP_INHERITABLE, 1, caparray, CAP_DROP);
cap_set_proc(cap);
cap_free(cap);
The following test program will get and set the bounding
set (but not pI). For instance
./bset get
(lists capabilities in bset)
./bset drop cap_net_raw
(starts shell with new bset)
(use capset, setuid binary, or binary with
file capabilities to try to increase caps)
************************************************************
cap_bound.c
************************************************************
#include <sys/prctl.h>
#include <linux/capability.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifndef PR_CAPBSET_READ
#define PR_CAPBSET_READ 23
#endif
#ifndef PR_CAPBSET_DROP
#define PR_CAPBSET_DROP 24
#endif
int usage(char *me)
{
printf("Usage: %s get\n", me);
printf(" %s drop <capability>\n", me);
return 1;
}
#define numcaps 32
char *captable[numcaps] = {
"cap_chown",
"cap_dac_override",
"cap_dac_read_search",
"cap_fowner",
"cap_fsetid",
"cap_kill",
"cap_setgid",
"cap_setuid",
"cap_setpcap",
"cap_linux_immutable",
"cap_net_bind_service",
"cap_net_broadcast",
"cap_net_admin",
"cap_net_raw",
"cap_ipc_lock",
"cap_ipc_owner",
"cap_sys_module",
"cap_sys_rawio",
"cap_sys_chroot",
"cap_sys_ptrace",
"cap_sys_pacct",
"cap_sys_admin",
"cap_sys_boot",
"cap_sys_nice",
"cap_sys_resource",
"cap_sys_time",
"cap_sys_tty_config",
"cap_mknod",
"cap_lease",
"cap_audit_write",
"cap_audit_control",
"cap_setfcap"
};
int getbcap(void)
{
int comma=0;
unsigned long i;
int ret;
printf("i know of %d capabilities\n", numcaps);
printf("capability bounding set:");
for (i=0; i<numcaps; i++) {
ret = prctl(PR_CAPBSET_READ, i);
if (ret < 0)
perror("prctl");
else if (ret==1)
printf("%s%s", (comma++) ? ", " : " ", captable[i]);
}
printf("\n");
return 0;
}
int capdrop(char *str)
{
unsigned long i;
int found=0;
for (i=0; i<numcaps; i++) {
if (strcmp(captable[i], str) == 0) {
found=1;
break;
}
}
if (!found)
return 1;
if (prctl(PR_CAPBSET_DROP, i)) {
perror("prctl");
return 1;
}
return 0;
}
int main(int argc, char *argv[])
{
if (argc<2)
return usage(argv[0]);
if (strcmp(argv[1], "get")==0)
return getbcap();
if (strcmp(argv[1], "drop")!=0 || argc<3)
return usage(argv[0]);
if (capdrop(argv[2])) {
printf("unknown capability\n");
return 1;
}
return execl("/bin/bash", "/bin/bash", NULL);
}
************************************************************
[serue@us.ibm.com: fix typo]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>a
Signed-off-by: "Serge E. Hallyn" <serue@us.ibm.com>
Tested-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch supports legacy (32-bit) capability userspace, and where possible
translates 32-bit capabilities to/from userspace and the VFS to 64-bit
kernel space capabilities. If a capability set cannot be compressed into
32-bits for consumption by user space, the system call fails, with -ERANGE.
FWIW libcap-2.00 supports this change (and earlier capability formats)
http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/
[akpm@linux-foundation.org: coding-syle fixes]
[akpm@linux-foundation.org: use get_task_comm()]
[ezk@cs.sunysb.edu: build fix]
[akpm@linux-foundation.org: do not initialise statics to 0 or NULL]
[akpm@linux-foundation.org: unused var]
[serue@us.ibm.com: export __cap_ symbols]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert b68680e473 to make way for the next
patch: "Add 64-bit capability support to the kernel".
We want to keep the vfs_cap_data.data[] structure, using two 'data's for
64-bit caps (and later three for 96-bit caps), whereas
b68680e473 had gotten rid of the 'data' struct
made its members inline.
The 64-bit caps patch keeps the stack abuse fix at get_file_caps(), which was
the more important part of that patch.
[akpm@linux-foundation.org: coding-style fixes]
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch modifies the interface to inode_getsecurity to have the function
return a buffer containing the security blob and its length via parameters
instead of relying on the calling function to give it an appropriately sized
buffer.
Security blobs obtained with this function should be freed using the
release_secctx LSM hook. This alleviates the problem of the caller having to
guess a length and preallocate a buffer for this function allowing it to be
used elsewhere for Labeled NFS.
The patch also removed the unused err parameter. The conversion is similar to
the one performed by Al Viro for the security_getprocattr hook.
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order to correlate audit records to an individual login add a session
id. This is incremented every time a user logs in and is included in
almost all messages which currently output the auid. The field is
labeled ses= or oses=
Signed-off-by: Eric Paris <eparis@redhat.com>
As pointed out by Adrian Bunk, commit
45c950e0f8 ("fix memory leak in netlabel
code") caused a double-free when security_netlbl_sid_to_secattr()
fails. This patch fixes this by removing the netlbl_secattr_destroy()
call from that function since we are already releasing the secattr
memory in selinux_netlbl_sock_setsid().
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Capabilities have long been the default when CONFIG_SECURITY=n,
and its help text suggests turning it on when CONFIG_SECURITY=y.
But it is set to default n.
Default it to y instead.
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Matt LaPlante <kernel1@cyberdogtech.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently network traffic can be sliently dropped due to non-avc errors which
can lead to much confusion when trying to debug the problem. This patch adds
warning messages so that when these events occur there is a user visible
notification.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
This patch implements packet ingress/egress controls for SELinux which allow
SELinux security policy to control the flow of all IPv4 and IPv6 packets into
and out of the system. Currently SELinux does not have proper control over
forwarded packets and this patch corrects this problem.
Special thanks to Venkat Yekkirala <vyekkirala@trustedcs.com> whose earlier
work on this topic eventually led to this patch.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Now that the SELinux NetLabel "base SID" is always the netmsg initial SID we
can do a big optimization - caching the SID and not just the MLS attributes.
This not only saves a lot of per-packet memory allocations and copies but it
has a nice side effect of removing a chunk of code.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
This patch introduces a mechanism for checking when labeled IPsec or SECMARK
are in use by keeping introducing a configuration reference counter for each
subsystem. In the case of labeled IPsec, whenever a labeled SA or SPD entry
is created the labeled IPsec/XFRM reference count is increased and when the
entry is removed it is decreased. In the case of SECMARK, when a SECMARK
target is created the reference count is increased and later decreased when the
target is removed. These reference counters allow SELinux to quickly determine
if either of these subsystems are enabled.
NetLabel already has a similar mechanism which provides the netlbl_enabled()
function.
This patch also renames the selinux_relabel_packet_permission() function to
selinux_secmark_relabel_packet_permission() as the original name and
description were misleading in that they referenced a single packet label which
is not the case.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Rework the handling of network peer labels so that the different peer labeling
subsystems work better together. This includes moving both subsystems to a
single "peer" object class which involves not only changes to the permission
checks but an improved method of consolidating multiple packet peer labels.
As part of this work the inbound packet permission check code has been heavily
modified to handle both the old and new behavior in as sane a fashion as
possible.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add additional Flask definitions to support the new "peer" object class and
additional permissions to the netif, node, and packet object classes. Also,
bring the kernel Flask definitions up to date with the Fedora SELinux policies
by adding the "flow_in" and "flow_out" permissions to the "packet" class.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add a new policy capabilities bitmap to SELinux policy version 22. This bitmap
will enable the security server to query the policy to determine which features
it supports.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds a SELinux IP address/node SID caching mechanism similar to the
sel_netif_*() functions. The node SID queries in the SELinux hooks files are
also modified to take advantage of this new functionality. In addition, remove
the address length information from the sk_buff parsing routines as it is
redundant since we already have the address family.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Instead of storing the packet's network interface name store the ifindex. This
allows us to defer the need to lookup the net_device structure until the audit
record is generated meaning that in the majority of cases we never need to
bother with this at all.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The current SELinux netif code requires the caller have a valid net_device
struct pointer to lookup network interface information. However, we don't
always have a valid net_device pointer so convert the netif code to use
the ifindex values we always have as part of the sk_buff. This patch also
removes the default message SID from the network interface record, it is
not being used and therefore is "dead code".
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
In order to do any sort of IP header inspection of incoming packets we need to
know which address family, AF_INET/AF_INET6/etc., it belongs to and since the
sk_buff structure does not store this information we need to pass along the
address family separate from the packet itself.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds support to the NetLabel LSM secattr struct for a secid token
and a type field, paving the way for full LSM/SELinux context support and
"static" or "fallback" labels. In addition, this patch adds a fair amount
of documentation to the core NetLabel structures used as part of the
NetLabel kernel API.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The IPv4 and IPv6 hook values are identical, yet some code tries to figure
out the "correct" value by looking at the address family. Introduce NF_INET_*
values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
section for userspace compatibility.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The proc net rewrite had a side effect on selinux, leading it to mislabel
the /proc/net inodes, thereby leading to incorrect denials. Fix
security_genfs_sid to ignore extra leading / characters in the path supplied
by selinux_proc_get_sid since we now get "//net/..." rather than "/net/...".
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
selinux: make mls_compute_sid always polyinstantiate
security/selinux: constify function pointer tables and fields
security: add a secctx_to_secid() hook
security: call security_file_permission from rw_verify_area
security: remove security_sb_post_mountroot hook
Security: remove security.h include from mm.h
Security: remove security_file_mmap hook sparse-warnings (NULL as 0).
Security: add get, set, and cloning of superblock security information
security/selinux: Add missing "space"
There is no need for kobject_unregister() anymore, thanks to Kay's
kobject cleanup changes, so replace all instances of it with
kobject_put().
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
kernel_kset does not need to be a kset, but a much simpler kobject now
that we have kobj_attributes.
We also rename kernel_kset to kernel_kobj to catch all users of this
symbol with a build error instead of an easy-to-ignore build warning.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dynamically create the kset instead of declaring it statically. We also
rename kernel_subsys to kernel_kset to catch all users of this symbol
with a build error instead of an easy-to-ignore build warning.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We don't need a kset here, a simple kobject will do just fine, so
dynamically create the kobject and use it.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We don't need a "default" ktype for a kset. We should set this
explicitly every time for each kset. This change is needed so that we
can make ksets dynamic, and cleans up one of the odd, undocumented
assumption that the kset/kobject/ktype model has.
This patch is based on a lot of help from Kay Sievers.
Nasty bug in the block code was found by Dave Young
<hidave.darkstar@gmail.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes the requirement that the new and related object types
differ in order to polyinstantiate by MLS level. This allows MLS
polyinstantiation to occur in the absence of explicit type_member rules or
when the type has not changed.
Potential users of this support include pam_namespace.so (directory
polyinstantiation) and the SELinux X support (property polyinstantiation).
Signed-off-by: Eamon Walsh <ewalsh@tycho.nsa.gov>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Add a secctx_to_secid() LSM hook to go along with the existing
secid_to_secctx() LSM hook. This patch also includes the SELinux
implementation for this hook.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
The security_sb_post_mountroot() hook is long-since obsolete, and is
fundamentally broken: it is never invoked if someone uses initramfs.
This is particularly damaging, because the existence of this hook has
been used as motivation for not using initramfs.
Stephen Smalley confirmed on 2007-07-19 that this hook was originally
used by SELinux but can now be safely removed:
http://marc.info/?l=linux-kernel&m=118485683612916&w=2
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: James Morris <jmorris@namei.org>
Adds security_get_sb_mnt_opts, security_set_sb_mnt_opts, and
security_clont_sb_mnt_opts to the LSM and to SELinux. This will allow
filesystems to directly own and control all of their mount options if they
so choose. This interface deals only with option identifiers and strings so
it should generic enough for any LSM which may come in the future.
Filesystems which pass text mount data around in the kernel (almost all of
them) need not currently make use of this interface when dealing with
SELinux since it will still parse those strings as it always has. I assume
future LSM's would do the same. NFS is the primary FS which does not use
text mount data and thus must make use of this interface.
An LSM would need to implement these functions only if they had mount time
options, such as selinux has context= or fscontext=. If the LSM has no
mount time options they could simply not implement and let the dummy ops
take care of things.
An LSM other than SELinux would need to define new option numbers in
security.h and any FS which decides to own there own security options would
need to be patched to use this new interface for every possible LSM. This
is because it was stated to me very clearly that LSM's should not attempt to
understand FS mount data and the burdon to understand security should be in
the FS which owns the options.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
In linux-2.6.24-rc1, security/commoncap.c:cap_inh_is_capped() was
introduced. It has the exact reverse of its intended behavior. This
led to an unintended privilege esculation involving a process'
inheritable capability set.
To be exposed to this bug, you need to have Filesystem Capabilities
enabled and in use. That is:
- CONFIG_SECURITY_FILE_CAPABILITIES must be defined for the buggy code
to be compiled in.
- You also need to have files on your system marked with fI bits raised.
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@akpm@linux-foundation.org>
Fix a memory leak in security_netlbl_sid_to_secattr() as reported here:
* https://bugzilla.redhat.com/show_bug.cgi?id=352281
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
On a kernel with CONFIG_SECURITY but without an LSM which implements
security_file_mmap it is impossible for an application to mmap addresses
lower than mmap_min_addr. Based on a suggestion from a developer in the
openwall community this patch adds a check for CAP_SYS_RAWIO. It is
assumed that any process with this capability can harm the system a lot
more easily than writing some stuff on the zero page and then trying to
get the kernel to trip over itself. It also means that programs like X
on i686 which use vm86 emulation can work even with mmap_min_addr set.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Instead of using f_op to detect dead booleans, check the inode index
against the number of booleans and check the dentry name against the
boolean name for that index on reads and writes. This prevents
incorrect use of a boolean file opened prior to a policy reload while
allowing valid use of it as long as it still corresponds to the same
boolean in the policy.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Do not clear f_op when removing entries since it isn't safe to do.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
An unprivileged process must be able to kill a setuid root program started
by the same user. This is legacy behavior needed for instance for xinit to
kill X when the window manager exits.
When an unprivileged user runs a setuid root program in !SECURE_NOROOT
mode, fP, fI, and fE are set full on, so pP' and pE' are full on. Then
cap_task_kill() prevents the user from signaling the setuid root task.
This is a change in behavior compared to when
!CONFIG_SECURITY_FILE_CAPABILITIES.
This patch introduces a special check into cap_task_kill() just to check
whether a non-root user is signaling a setuid root program started by the
same user. If so, then signal is allowed.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Andrew Morgan <morgan@kernel.org>
Cc: Stephen Smalley <sds@epoch.ncsc.mil>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix http://bugzilla.kernel.org/show_bug.cgi?id=9247
Allow sigcont to be sent to a process with greater capabilities if it is in
the same session. Otherwise, a shell from which I've started a root shell
and done 'suspend' can't be restarted by the parent shell.
Also don't do file-capabilities signaling checks when uids for the
processes don't match, since the standard check_kill_permission will have
done those checks.
[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: Andrew Morgan <morgan@kernel.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Tested-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Stephen Smalley <sds@epoch.ncsc.mil>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add more validity checks at policy load time to reject malformed
policies and prevent subsequent out-of-range indexing when in permissive
mode. Resolves the NULL pointer dereference reported in
https://bugzilla.redhat.com/show_bug.cgi?id=357541.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
The "e_iter = e_iter->next;" statement in the inner for loop is primally
bug. It should be moved to outside of the for loop.
Signed-off-by: KaiGai Kohei <kaigai@kaigai.gr.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
On PowerPC allmodconfig build we get this:
security/selinux/xfrm.c:214: warning: comparison is always false due to limited range of data type
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: James Morris <jmorris@namei.org>
When checking if we can wait on a child we were looking at
p->exit_signal and trying to make the decision based on if the signal
would eventually be allowed. One big flaw is that p->exit_signal is -1
for NPTL threads and so aignal_to_av was not actually checking SIGCHLD
which is what would have been sent. Even is exit_signal was set to
something strange it wouldn't change the fact that the child was there
and needed to be waited on. This patch just assumes wait is based on
SIGCHLD. Specific permission checks are made when the child actually
attempts to send a signal.
This resolves the problem of things like using GDB on confined domains
such as in RH BZ 232371. The confined domain did not have permission to
send a generic signal (exit_signal == -1) back to the unconfined GDB.
With this patch the GDB wait works and since the actual signal sent is
allowed everything functions as it should.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Simplify the vfs_cap_data structure.
Also fix get_file_caps which was declaring
__le32 v1caps[XATTR_CAPS_SZ] on the stack, but
XATTR_CAPS_SZ is already * sizeof(__le32).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Andrew Morgan <morgan@kernel.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().
A cgroup init has it's tsk->pid == 1.
A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
Changelog:
2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().
2.6.21-mm2-pidns2:
- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.
[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Herbert Poetzel <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Get rid of sparse related warnings from places that use integer as NULL
pointer.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1,
can change the capabilities of another process, p2. This is not the
meaning that was intended for this capability at all, and this
implementation came about purely because, without filesystem capabilities,
there was no way to use capabilities without one process bestowing them on
another.
Since we now have a filesystem support for capabilities we can fix the
implementation of CAP_SETPCAP.
The most significant thing about this change is that, with it in effect, no
process can set the capabilities of another process.
The capabilities of a program are set via the capability convolution
rules:
pI(post-exec) = pI(pre-exec)
pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI)
pE(post-exec) = fE ? pP(post-exec) : 0
at exec() time. As such, the only influence the pre-exec() program can
have on the post-exec() program's capabilities are through the pI
capability set.
The correct implementation for CAP_SETPCAP (and that enabled by this patch)
is that it can be used to add extra pI capabilities to the current process
- to be picked up by subsequent exec()s when the above convolution rules
are applied.
Here is how it works:
Let's say we have a process, p. It has capability sets, pE, pP and pI.
Generally, p, can change the value of its own pI to pI' where
(pI' & ~pI) & ~pP = 0.
That is, the only new things in pI' that were not present in pI need to
be present in pP.
The role of CAP_SETPCAP is basically to permit changes to pI beyond
the above:
if (pE & CAP_SETPCAP) {
pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0 */
}
This capability is useful for things like login, which (say, via
pam_cap) might want to raise certain inheritable capabilities for use
by the children of the logged-in user's shell, but those capabilities
are not useful to or needed by the login program itself.
One such use might be to limit who can run ping. You set the
capabilities of the 'ping' program to be "= cap_net_raw+i", and then
only shells that have (pI & CAP_NET_RAW) will be able to run
it. Without CAP_SETPCAP implemented as described above, login(pam_cap)
would have to also have (pP & CAP_NET_RAW) in order to raise this
capability and pass it on through the inheritable set.
Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains the following cleanups that are now possible:
- remove the unused security_operations->inode_xattr_getsuffix
- remove the no longer used security_operations->unregister_security
- remove some no longer required exit code
- remove a bunch of no longer used exports
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement file posix capabilities. This allows programs to be given a
subset of root's powers regardless of who runs them, without having to use
setuid and giving the binary all of root's powers.
This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/index.php. For more information on how to use this
patch, Chris Friedhoff has posted a nice page at
http://www.friedhoff.org/fscaps.html.
Changelog:
Nov 27:
Incorporate fixes from Andrew Morton
(security-introduce-file-caps-tweaks and
security-introduce-file-caps-warning-fix)
Fix Kconfig dependency.
Fix change signaling behavior when file caps are not compiled in.
Nov 13:
Integrate comments from Alexey: Remove CONFIG_ ifdef from
capability.h, and use %zd for printing a size_t.
Nov 13:
Fix endianness warnings by sparse as suggested by Alexey
Dobriyan.
Nov 09:
Address warnings of unused variables at cap_bprm_set_security
when file capabilities are disabled, and simultaneously clean
up the code a little, by pulling the new code into a helper
function.
Nov 08:
For pointers to required userspace tools and how to use
them, see http://www.friedhoff.org/fscaps.html.
Nov 07:
Fix the calculation of the highest bit checked in
check_cap_sanity().
Nov 07:
Allow file caps to be enabled without CONFIG_SECURITY, since
capabilities are the default.
Hook cap_task_setscheduler when !CONFIG_SECURITY.
Move capable(TASK_KILL) to end of cap_task_kill to reduce
audit messages.
Nov 05:
Add secondary calls in selinux/hooks.c to task_setioprio and
task_setscheduler so that selinux and capabilities with file
cap support can be stacked.
Sep 05:
As Seth Arnold points out, uid checks are out of place
for capability code.
Sep 01:
Define task_setscheduler, task_setioprio, cap_task_kill, and
task_setnice to make sure a user cannot affect a process in which
they called a program with some fscaps.
One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?
It is a semantic change, as without fsccaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.
task_setscheduler
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit
Aug 21:
Updated secureexec implementation to reflect the fact that
euid and uid might be the same and nonzero, but the process
might still have elevated caps.
Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.
Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.
[morgan@kernel.org: file-caps clean up for linux/capability.h]
[bunk@kernel.org: unexport cap_inode_killpriv]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert LSM into a static interface, as the ability to unload a security
module is not required by in-tree users and potentially complicates the
overall security architecture.
Needlessly exported LSM symbols have been unexported, to help reduce API
abuse.
Parameters for the capability and root_plug modules are now specified
at boot.
The SECURITY_FRAMEWORK_VERSION macro has also been removed.
In a nutshell, there is no safe way to unload an LSM. The modular interface
is thus unecessary and broken infrastructure. It is used only by out-of-tree
modules, which are often binary-only, illegal, abusive of the API and
dangerous, e.g. silently re-vectoring SELinux.
[akpm@linux-foundation.org: cleanups]
[akpm@linux-foundation.org: USB Kconfig fix]
[randy.dunlap@oracle.com: fix LSM kernel-doc]
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Acked-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make request_key() and co fundamentally asynchronous to make it easier for
NFS to make use of them. There are now accessor functions that do
asynchronous constructions, a wait function to wait for construction to
complete, and a completion function for the key type to indicate completion
of construction.
Note that the construction queue is now gone. Instead, keys under
construction are linked in to the appropriate keyring in advance, and that
anyone encountering one must wait for it to be complete before they can use
it. This is done automatically for userspace.
The following auxiliary changes are also made:
(1) Key type implementation stuff is split from linux/key.h into
linux/key-type.h.
(2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does
not need to call key_instantiate_and_link() directly.
(3) Adjust the debugging macros so that they're -Wformat checked even if
they are disabled, and make it so they can be enabled simply by defining
__KDEBUG to be consistent with other code of mine.
(3) Documentation.
[alan@lxorguk.ukuu.org.uk: keys: missing word in documentation]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch kills ugly warnings when the "Improve SELinux performance
when ACV misses" patch.
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: James Morris <jmorris@namei.org>
* We add ebitmap_for_each_positive_bit() which enables to walk on
any positive bit on the given ebitmap, to improve its performance
using common bit-operations defined in linux/bitops.h.
In the previous version, this logic was implemented using a combination
of ebitmap_for_each_bit() and ebitmap_node_get_bit(), but is was worse
in performance aspect.
This logic is most frequestly used to compute a new AVC entry,
so this patch can improve SELinux performance when AVC misses are happen.
* struct ebitmap_node is redefined as an array of "unsigned long", to get
suitable for using find_next_bit() which is fasted than iteration of
shift and logical operation, and to maximize memory usage allocated
from general purpose slab.
* Any ebitmap_for_each_bit() are repleced by the new implementation
in ss/service.c and ss/mls.c. Some of related implementation are
changed, however, there is no incompatibility with the previous
version.
* The width of any new line are less or equal than 80-chars.
The following benchmark shows the effect of this patch, when we
access many files which have different security context one after
another. The number is more than /selinux/avc/cache_threshold, so
any access always causes AVC misses.
selinux-2.6 selinux-2.6-ebitmap
AVG: 22.763 [s] 8.750 [s]
STD: 0.265 0.019
------------------------------------------
1st: 22.558 [s] 8.786 [s]
2nd: 22.458 [s] 8.750 [s]
3rd: 22.478 [s] 8.754 [s]
4th: 22.724 [s] 8.745 [s]
5th: 22.918 [s] 8.748 [s]
6th: 22.905 [s] 8.764 [s]
7th: 23.238 [s] 8.726 [s]
8th: 22.822 [s] 8.729 [s]
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Allow policy to select, in much the same way as it selects MLS support, how
the kernel should handle access decisions which contain either unknown
classes or unknown permissions in known classes. The three choices for the
policy flags are
0 - Deny unknown security access. (default)
2 - reject loading policy if it does not contain all definitions
4 - allow unknown security access
The policy's choice is exported through 2 booleans in
selinuxfs. /selinux/deny_unknown and /selinux/reject_unknown.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
It reduces the selinux overhead on read/write by only revalidating
permissions in selinux_file_permission if the task or inode labels have
changed or the policy has changed since the open-time check. A new LSM
hook, security_dentry_open, is added to capture the necessary state at open
time to allow this optimization.
(see http://marc.info/?l=selinux&m=118972995207740&w=2)
Signed-off-by: Yuichi Nakamura<ynakam@hitachisoft.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
This patch reduces memory usage of SELinux by tuning avtab. Number of hash
slots in avtab was 32768. Unused slots used memory when number of rules is
fewer. This patch decides number of hash slots dynamically based on number
of rules. (chain length)^2 is also printed out in avtab_hash_eval to see
standard deviation of avtab hash table.
Signed-off-by: Yuichi Nakamura<ynakam@hitachisoft.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Expansion of original idea from Denis V. Lunev <den@openvz.org>
Add robustness and locking to the local_port_range sysctl.
1. Enforce that low < high when setting.
2. Use seqlock to ensure atomic update.
The locking might seem like overkill, but there are
cases where sysadmin might want to change value in the
middle of a DoS attack.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each netlink socket will live in exactly one network namespace,
this includes the controlling kernel sockets.
This patch updates all of the existing netlink protocols
to only support the initial network namespace. Request
by clients in other namespaces will get -ECONREFUSED.
As they would if the kernel did not have the support for
that netlink protocol compiled in.
As each netlink protocol is updated to be multiple network
namespace safe it can register multiple kernel sockets
to acquire a presence in the rest of the network namespaces.
The implementation in af_netlink is a simple filter implementation
at hash table insertion and hash table look up time.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Every user of the network device notifiers is either a protocol
stack or a pseudo device. If a protocol stack that does not have
support for multiple network namespaces receives an event for a
device that is not in the initial network namespace it quite possibly
can get confused and do the wrong thing.
To avoid problems until all of the protocol stacks are converted
this patch modifies all netdev event handlers to ignore events on
devices that are not in the initial network namespace.
As the rest of the code is made network namespace aware these
checks can be removed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Given an illegal selinux option it was possible for match_token to work in
random memory at the end of the match_table_t array.
Note that privilege is required to perform a context mount, so this issue is
effectively limited to root only.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Clear parent death signal on SID transitions to prevent unauthorized
signaling between SIDs.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@localhost.localdomain>
The new exec code inserts an accounted vma into an mm struct which is not
current->mm. The existing memory check code has a hard coded assumption
that this does not happen as does the security code.
As the correct mm is known we pass the mm to the security method and the
helper function. A new security test is added for the case where we need
to pass the mm and the existing one is modified to pass current->mm to
avoid the need to change large amounts of code.
(Thanks to Tobias for fixing rejects and testing)
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: WU Fengguang <wfg@mail.ustc.edu.cn>
Cc: James Morris <jmorris@redhat.com>
Cc: Tobias Diedrich <ranma+kernel@tdiedrich.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corrects an error code so that it is valid to pass to userspace.
Signed-off-by: Steve Grubb <linux_4ever@yahoo.com>
Signed-off-by: James Morris <jmorris@halo.namei>
We don't need to check for NULL pointers before calling kfree().
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
A small fix to the SELinux/NetLabel glue code to ensure that the NetLabel
cache is utilized when possible. This was broken when the SELinux/NetLabel
glue code was reorganized in the last kernel release.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
xfrm_audit_log() expects the context string to be null-terminated
which currently doesn't happen with user-supplied contexts.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Fix memory leak in security_netlbl_cache_add()
Note: The Coverity checker gets credit for spotting this one.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Selinux folks had been complaining about the lack of AVC_PATH
records when audit is disabled. I must admit my stupidity - I assumed
that avc_audit() really couldn't use audit_log_d_path() because of
deadlocks (== could be called with dcache_lock or vfsmount_lock held).
Shouldn't have made that assumption - it never gets called that way.
It _is_ called under spinlocks, but not those.
Since audit_log_d_path() uses ab->gfp_mask for allocations,
kmalloc() in there is not a problem. IOW, the simple fix is sufficient:
let's rip AUDIT_AVC_PATH out and simply generate pathname as part of main
record. It's trivial to do.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: James Morris <jmorris@namei.org>
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.
This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel
SELinux: enable dynamic activation/deactivation of NetLabel/SELinux enforcement
This patch changes mm_struct.dumpable to a pair of bit flags.
set_dumpable() converts three-value dumpable to two flags and stores it into
lower two bits of mm_struct.flags instead of mm_struct.dumpable.
get_dumpable() behaves in the opposite way.
[akpm@linux-foundation.org: export set_dumpable]
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These changes will make NetLabel behave like labeled IPsec where there is an
access check for both labeled and unlabeled packets as well as providing the
ability to restrict domains to receiving only labeled packets when NetLabel is
in use. The changes to the policy are straight forward with the following
necessary to receive labeled traffic (with SECINITSID_NETMSG defined as
"netlabel_peer_t"):
allow mydom_t netlabel_peer_t:{ tcp_socket udp_socket rawip_socket } recvfrom;
The policy for unlabeled traffic would be:
allow mydom_t unlabeled_t:{ tcp_socket udp_socket rawip_socket } recvfrom;
These policy changes, as well as more general NetLabel support, are included in
the latest SELinux Reference Policy release 20070629 or later. Users who make
use of NetLabel are strongly encouraged to upgrade their policy to avoid
network problems. Users who do not make use of NetLabel will not notice any
difference.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Create a new NetLabel KAPI interface, netlbl_enabled(), which reports on the
current runtime status of NetLabel based on the existing configuration. LSMs
that make use of NetLabel, i.e. SELinux, can use this new function to determine
if they should perform NetLabel access checks. This patch changes the
NetLabel/SELinux glue code such that SELinux only enforces NetLabel related
access checks when netlbl_enabled() returns true.
At present NetLabel is considered to be enabled when there is at least one
labeled protocol configuration present. The result is that by default NetLabel
is considered to be disabled, however, as soon as an administrator configured
a CIPSO DOI definition NetLabel is enabled and SELinux starts enforcing
NetLabel related access controls - including unlabeled packet controls.
This patch also tries to consolidate the multiple "#ifdef CONFIG_NETLABEL"
blocks into a single block to ease future review as recommended by Linus.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Rather than using a tri-state integer for the wait flag in
call_usermodehelper_exec, define a proper enum, and use that. I've
preserved the integer values so that any callers I've missed should
still work OK.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: David Howells <dhowells@redhat.com>
Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]
The (current->fsuid != inode->i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add TTY input auditing, used to audit system administrator's actions. This is
required by various security standards such as DCID 6/3 and PCI to provide
non-repudiation of administrator's actions and to allow a review of past
actions if the administrator seems to overstep their duties or if the system
becomes misconfigured for unknown reasons. These requirements do not make it
necessary to audit TTY output as well.
Compared to an user-space keylogger, this approach records TTY input using the
audit subsystem, correlated with other audit events, and it is completely
transparent to the user-space application (e.g. the console ioctls still
work).
TTY input auditing works on a higher level than auditing all system calls
within the session, which would produce an overwhelming amount of mostly
useless audit events.
Add an "audit_tty" attribute, inherited across fork (). Data read from TTYs
by process with the attribute is sent to the audit subsystem by the kernel.
The audit netlink interface is extended to allow modifying the audit_tty
attribute, and to allow sending explanatory audit events from user-space (for
example, a shell might send an event containing the final command, after the
interactive command-line editing and history expansion is performed, which
might be difficult to decipher from the TTY input alone).
Because the "audit_tty" attribute is inherited across fork (), it would be set
e.g. for sshd restarted within an audited session. To prevent this, the
audit_tty attribute is cleared when a process with no open TTY file
descriptors (e.g. after daemon startup) opens a TTY.
See https://www.redhat.com/archives/linux-audit/2007-June/msg00000.html for a
more detailed rationale document for an older version of this patch.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Miloslav Trmac <mitr@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 9faf65fb6e.
It bit people like Michal Piotrowski:
"My system is too secure, I can not login :)"
because it changed how CONFIG_NETLABEL worked, and broke older SElinux
policies.
As a result, quoth James Morris:
"Can you please revert this patch?
We thought it only affected people running MLS, but it will affect others.
Sorry for the hassle."
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Cc: Paul Moore <paul.moore@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These changes will make NetLabel behave like labeled IPsec where there is an
access check for both labeled and unlabeled packets as well as providing the
ability to restrict domains to receiving only labeled packets when NetLabel
is in use. The changes to the policy are straight forward with the
following necessary to receive labeled traffic (with SECINITSID_NETMSG
defined as "netlabel_peer_t"):
allow mydom_t netlabel_peer_t:{ tcp_socket udp_socket rawip_socket } recvfrom;
The policy for unlabeled traffic would be:
allow mydom_t unlabeled_t:{ tcp_socket udp_socket rawip_socket } recvfrom;
These policy changes, as well as more general NetLabel support, are included
in the SELinux Reference Policy SVN tree, r2352 or later. Users who enable
NetLabel support in the kernel are strongly encouraged to upgrade their
policy to avoid network problems.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add a new security check on mmap operations to see if the user is attempting
to mmap to low area of the address space. The amount of space protected is
indicated by the new proc tunable /proc/sys/vm/mmap_min_addr and defaults to
0, preserving existing behavior.
This patch uses a new SELinux security class "memprotect." Policy already
contains a number of allow rules like a_t self:process * (unconfined_t being
one of them) which mean that putting this check in the process class (its
best current fit) would make it useless as all user processes, which we also
want to protect against, would be allowed. By taking the memprotect name of
the new class it will also make it possible for us to move some of the other
memory protect permissions out of 'process' and into the new class next time
we bump the policy version number (which I also think is a good future idea)
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Inode numbers are unsigned long and so need to %lu as format string of printf.
Signed-off-by: Tobias Oed <tobias.oed@octant-fr.com>
Signed-off-by: James Morris <jmorris@namei.org>
In security_get_user_sids, move the transition permission checks
outside of the section holding the policy rdlock, and use the AVC to
perform the checks, calling cond_resched after each one. These
changes should allow preemption between the individual checks and
enable caching of the results. It may however increase the overall
time spent in the function in some cases, particularly in the cache
miss case.
The long term fix will be to take much of this logic to userspace by
exporting additional state via selinuxfs, and ultimately deprecating
and eliminating this interface from the kernel.
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
During the LSPP testing we found that it was possible for
policydb_destroy() to take 10+ seconds of kernel time to complete.
Basically all policydb_destroy() does is walk some (possibly long) lists
and free the memory it finds. Turning off slab debugging config options
made the problem go away since the actual functions which took most of
the time were (as seen by oprofile)
> 121202 23.9879 .check_poison_obj
> 78247 15.4864 .check_slabp
were caused by that. So I decided to also add some voluntary schedule
points in that code so config voluntary preempt would be enough to solve
the problem. Something similar was done in places like
shmem_free_pages() when we have to walk a list of memory and free it.
This was tested by the LSPP group on the hardware which could reproduce
the problem just loading a new policy and was found to not trigger the
softlock detector. It takes just as much processing time, but the
kernel doesn't spend all that time stuck doing one thing and never
scheduling.
Someday a better way to handle memory might make the time needed in this
function a lot less, but this fixes the current issue as it stands
today.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The structure is as follows (relative to selinuxfs root):
/class/file/index
/class/file/perms/read
/class/file/perms/write
...
Each class is allocated 33 inodes, 1 for the class index and 32 for
permissions. Relative to SEL_CLASS_INO_OFFSET, the inode of the index file
DIV 33 is the class number. The inode of the permission file % 33 is the
index of the permission for that class.
Signed-off-by: Christopher J. PeBenito <cpebenito@tresys.com>
Signed-off-by: James Morris <jmorris@namei.org>
Specify the inode counter explicitly in sel_make_dir(), rather than always
using sel_last_ino.
Signed-off-by: Christopher J. PeBenito <cpebenito@tresys.com>
Signed-off-by: James Morris <jmorris@namei.org>
sel_remove_bools() will also be used by the object class discovery, rename
it for more general use.
Signed-off-by: Christopher J. PeBenito <cpebenito@tresys.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add support to the SELinux security server for obtaining a list of classes,
and for obtaining a list of permissions for a specified class.
Signed-off-by: Christopher J. PeBenito <cpebenito@tresys.com>
Signed-off-by: James Morris <jmorris@namei.org>
The current NetLabel code has some redundant APIs which allow both
"struct socket" and "struct sock" types to be used; this may have made
sense at some point but it is wasteful now. Remove the functions that
operate on sockets and convert the callers. Not only does this make
the code smaller and more consistent but it pushes the locking burden
up to the caller which can be more intelligent about the locks. Also,
perform the same conversion (socket to sock) on the SELinux/NetLabel
glue code where it make sense.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.
Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While researching the tty layer pid leaks I found a weird case in selinux when
we drop a controlling tty because of inadequate permissions we don't do the
normal hangup processing. Which is a problem if it happens the session leader
has exec'd something that can no longer access the tty.
We already have code in the kernel to handle this case in the form of the
TIOCNOTTY ioctl. So this patch factors out a helper function that is the
essence of that ioctl and calls it from the selinux code.
This removes the inconsistency in handling dropping of a controlling tty and
who knows it might even make some part of user space happy because it received
a SIGHUP it was expecting.
In addition since this removes the last user of proc_set_tty outside of
tty_io.c proc_set_tty is made static and removed from tty.h
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to work on cleaning up the relationship between kobjects, ksets and
ktypes. The removal of 'struct subsystem' is the first step of this,
especially as it is not really needed at all.
Thanks to Kay for fixing the bugs in this patch.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
selinux: preserve boolean values across policy reloads
selinux: change numbering of boolean directory inodes in selinuxfs
selinux: remove unused enumeration constant from selinuxfs
selinux: explicitly number all selinuxfs inodes
selinux: export initial SID contexts via selinuxfs
selinux: remove userland security class and permission definitions
SELinux: move security_skb_extlbl_sid() out of the security server
MAINTAINERS: update selinux entry
SELinux: rename selinux_netlabel.h to netlabel.h
SELinux: extract the NetLabel SELinux support from the security server
NetLabel: convert a BUG_ON in the CIPSO code to a runtime check
NetLabel: cleanup and document CIPSO constants
Export the keyring key type definition and document its availability.
Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At present, the userland policy loading code has to go through contortions to preserve
boolean values across policy reloads, and cannot do so atomically.
As this is what we always want to do for reloads, let the kernel preserve them instead.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Karl MacMillan <kmacmillan@mentalrootkit.com>
Signed-off-by: James Morris <jmorris@namei.org>
Change the numbering of the booleans directory inodes in selinuxfs to
provide more room for new inodes without a conflict in inode numbers and
to be consistent with how inode numbering is done in the
initial_contexts directory.
Signed-off-by: James Carter <jwcart2@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@parisplace.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Remove the unused enumeration constant, SEL_AVC, from the sel_inos
enumeration in selinuxfs.
Signed-off-by: James Carter <jwcart2@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@parisplace.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Explicitly number all selinuxfs inodes to prevent a conflict between
inodes numbered using last_ino when created with new_inode() and those
labeled explicitly.
Signed-off-by: James Carter <jwcart2@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@parisplace.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Make the initial SID contexts accessible to userspace via selinuxfs.
An initial use of this support will be to make the unlabeled context
available to libselinux for use for invalidated userspace SIDs.
Signed-off-by: James Carter <jwcart2@tycho.nsa.gov>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Remove userland security class and permission definitions from the kernel
as the kernel only needs to use and validate its own class and permission
definitions and userland definitions may change.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
As suggested, move the security_skb_extlbl_sid() function out of the security
server and into the SELinux hooks file.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
In the beginning I named the file selinux_netlabel.h to avoid potential
namespace colisions. However, over time I have realized that there are several
other similar cases of multiple header files with the same name so I'm changing
the name to something which better fits with existing naming conventions.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Up until this patch the functions which have provided NetLabel support to
SELinux have been integrated into the SELinux security server, which for
various reasons is not really ideal. This patch makes an effort to extract as
much of the NetLabel support from the security server as possibile and move it
into it's own file within the SELinux directory structure.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
number of direct accesses to skb->data and for consistency with all the other
cast skb member helpers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)
Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the quite common 'skb->nh.raw - skb->data' sequence.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
have it return the buffer it had allocated
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Always initialize *scontext and *scontext_len in security_sid_to_context.
(via http://lkml.org/lkml/2007/2/23/135)
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Below is a patch which demotes many printk lines to KERN_DEBUG from
KERN_INFO. It should help stop the spamming of logs with messages in
which users are not interested nor is there any action that users should
take. It also promotes some KERN_INFO to KERN_ERR such as when there
are improper attempts to register/unregister security modules.
A similar patch was discussed a while back on list:
http://marc.theaimsgroup.com/?t=116656343500003&r=1&w=2
This patch addresses almost all of the issues raised. I believe the
only advice not taken was in the demoting of messages related to
undefined permissions and classes.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
security/selinux/hooks.c | 20 ++++++++++----------
security/selinux/ss/avtab.c | 2 +-
security/selinux/ss/policydb.c | 6 +++---
security/selinux/ss/sidtab.c | 2 +-
4 files changed, 15 insertions(+), 15 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
Hmmm...turns out to not be quite enough, as the /proc/sys inodes aren't truly
private to the fs, so we can run into them in a variety of security hooks
beyond just the inode hooks, such as security_file_permission (when reading
and writing them via the vfs helpers), security_sb_mount (when mounting other
filesystems on directories in proc like binfmt_misc), and deeper within the
security module itself (as in flush_unauthorized_files upon inheritance across
execve). So I think we have to add an IS_PRIVATE() guard within SELinux, as
below. Note however that the use of the private flag here could be confusing,
as these inodes are _not_ private to the fs, are exposed to userspace, and
security modules must implement the sysctl hook to get any access control over
them.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I goofed and when reenabling the fine grained selinux labels for
sysctls and forgot to add the "/sys" prefix before consulting
the policy database. When computing the same path using
proc_dir_entries we got the "/sys" for free as it was part
of the tree, but it isn't true for clt_table trees.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It isn't needed anymore, all of the users are gone, and all of the ctl_table
initializers have been converted to use explicit names of the fields they are
initializing.
[akpm@osdl.org: NTFS fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there. Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.
To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.
Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm. I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).
Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace a small number of expressions with a call to the "container_of()"
macro.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
corresponding "kmem_cache_zalloc()" call.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the key serial number collision avoidance code in key_alloc_serial().
This didn't use to be so much of a problem as the key serial numbers were
allocated from a simple incremental counter, and it would have to go through
two billion keys before it could possibly encounter a collision. However, now
that random numbers are used instead, collisions are much more likely.
This is fixed by finding a hole in the rbtree where the next unused serial
number ought to be and using that by going almost back to the top of the
insertion routine and redoing the insertion with the new serial number rather
than trying to be clever and attempting to work out the insertion point
pointer directly.
This fixes kernel BZ #7727.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is an incremental fix to the flow_cache_genid
patch for selinux that breaks the build of 2.6.20-rc6 when
xfrm is not configured.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, old flow cache entries remain valid even after
a reload of SELinux policy.
This patch increments the flow cache generation id
on policy (re)loads so that flow cache entries are
revalidated as needed.
Thanks to Herbet Xu for pointing this out. See:
http://marc.theaimsgroup.com/?l=linux-netdev&m=116841378704536&w=2
There's also a general issue as well as a solution proposed
by David Miller for when flow_cache_genid wraps. I might be
submitting a separate patch for that later.
I request that this be applied to 2.6.20 since it's
a security relevant fix.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The spinlock protecting the update of the "sksec->nlbl_state" variable is not
currently softirq safe which can lead to problems. This patch fixes this by
changing the spin_{un}lock() functions into spin_{un}lock_bh() functions.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
This deletes mls_copy_context() in favor of mls_context_cpy() and
replaces mls_scopy_context() with mls_context_cpy_low().
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
do not call a sleeping lock API in an RCU read section.
lock_sock_nested can sleep, its BH counterpart doesn't.
selinux_netlbl_inode_permission() needs to use the BH counterpart
unconditionally.
Compile tested.
From: Ingo Molnar <mingo@elte.hu>
added BH disabling, because this function can be called from non-atomic
contexts too, so a naked bh_lock_sock() would be deadlock-prone.
Boot-tested the resulting kernel.
Signed-off-by: Parag Warudkar <paragw@paragw.zapto.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently, each fdtable supports three dynamically-sized arrays of data: the
fdarray and two fdsets. The code allows the number of fds supported by the
fdarray (fdtable->max_fds) to differ from the number of fds supported by each
of the fdsets (fdtable->max_fdset).
In practice, it is wasteful for these two sizes to differ: whenever we hit a
limit on the smaller-capacity structure, we will reallocate the entire fdtable
and all the dynamic arrays within it, so any delta in the memory used by the
larger-capacity structure will never be touched at all.
Rather than hogging this excess, we shouldn't even allocate it in the first
place, and keep the capacities of the fdarray and the fdsets equal. This
patch removes fdtable->max_fdset. As an added bonus, most of the supporting
code becomes simpler.
Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the locking of signal->tty.
Use ->sighand->siglock to protect ->signal->tty; this lock is already used
by most other members of ->signal/->sighand. And unless we are 'current'
or the tasklist_lock is held we need ->siglock to access ->signal anyway.
(NOTE: sys_unshare() is broken wrt ->sighand locking rules)
Note that tty_mutex is held over tty destruction, so while holding
tty_mutex any tty pointer remains valid. Otherwise the lifetime of ttys
are governed by their open file handles. This leaves some holes for tty
access from signal->tty (or any other non file related tty access).
It solves the tty SLAB scribbles we were seeing.
(NOTE: the change from group_send_sig_info to __group_send_sig_info needs to
be examined by someone familiar with the security framework, I think
it is safe given the SEND_SIG_PRIV from other __group_send_sig_info
invocations)
[schwidefsky@de.ibm.com: 3270 fix]
[akpm@osdl.org: various post-viro fixes]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Alan Cox <alan@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Name some of the remaning 'old_style_spin_init' locks
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace all uses of kmem_cache_t with struct kmem_cache.
The patch was generated using the following script:
#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#
set -e
for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done
The script was run like this
sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_KERNEL is an alias of GFP_KERNEL.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_ATOMIC is an alias of GFP_ATOMIC
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>