1
0
Fork 0
Commit Graph

82 Commits (a862f68a8b360086f248cbc3606029441b5f5197)

Author SHA1 Message Date
David Howells 822ad64d7e keys: Fix dependency loop between construction record and auth key
In the request_key() upcall mechanism there's a dependency loop by which if
a key type driver overrides the ->request_key hook and the userspace side
manages to lose the authorisation key, the auth key and the internal
construction record (struct key_construction) can keep each other pinned.

Fix this by the following changes:

 (1) Killing off the construction record and using the auth key instead.

 (2) Including the operation name in the auth key payload and making the
     payload available outside of security/keys/.

 (3) The ->request_key hook is given the authkey instead of the cons
     record and operation name.

Changes (2) and (3) allow the auth key to naturally be cleaned up if the
keyring it is in is destroyed or cleared or the auth key is unlinked.

Fixes: 7ee02a316600 ("keys: Fix dependency loop between construction record and auth key")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
2019-02-15 14:12:09 -08:00
Linus Torvalds 75f95da078 libnvdimm for 4.21
* Add support for the security features of nvdimm devices that
   implement a security model similar to ATA hard drive security. The
   security model supports locking access to the media at
   device-power-loss, to be unlocked with a passphrase, and secure-erase
   (crypto-scramble).
 
   Unlike the ATA security case where the kernel expects device
   security to be managed in a pre-OS environment, the libnvdimm security
   implementation allows key provisioning and key-operations at OS
   runtime. Keys are managed with the kernel's encrypted-keys facility to
   provide data-at-rest security for the libnvdimm key material. The
   usage model mirrors fscrypt key management, but is driven via
   libnvdimm sysfs.
 
 * Miscellaneous updates for api usage and comment fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcJalaAAoJEB7SkWpmfYgCmpkP/35Isou2xxbYdICt9HR4WjiE
 E4r66Ck+ivZW4oEAub3RQw/SwcZePjuHArO5MgVGFoM1dBMABE+JvKiZm1xybCil
 iuWT/V7ozlNf5LuvacvjUXe/suUIoqji9yzLEEx9mDaISehrYvvy579DFxkE8Aww
 8Vcx8E1DQkXiWY4d9nmDmAjt8GQPNmsdHfTwMcJg7XSywTkURF64vW7/94aN8QtV
 rH5ZKKOra8Lqn0S05RoH4XegZcE6TZxgE23ZXsobBJrWdbGVFmgbz6AaewJc/+QU
 ZW3dx/1NAe7Op25xewJOjdG/Kl0gzWtqCrHCu8LeaSjwtHsgMQ/FBhL9g/7GRtHt
 hcQy12Iu9YTn3FOl8yfNRKb7lE5/1nJ2MtW8Z55WKHONHzN9cgrZLOwmpTYgrR/r
 SmIzwKNFavLwwz9bj+paHhmTngLDnNbvR6FwAKwSbWinotbLEyfMhU1jJ9RmHxSV
 M+jqS1BJ9IvH+WVN5bVa7oCNpVeq6yMw0Ow2vWXMPbiwh17ShACI59paqZKRiR06
 WH8SgdTPOZuMndtQV+m8zFBiJtmgAtGSiN8c1A/1QZJHDTpXz2MdWhtx5McvFsZ9
 Jot4Vd8mFWI7oVLdb+PF/AGZnKNAnD8PdsAlFws0k7+rM6lf6gf5n/CkjjMuSapG
 Q/kf0cOpPq8wdsrnOzyc
 =7KYd
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "The vast bulk of this update is the new support for the security
  capabilities of some nvdimms.

  The userspace tooling for this capability is still a work in progress,
  but the changes survive the existing libnvdimm unit tests. The changes
  also pass manual checkout on hardware and the new nfit_test emulation
  of the security capability.

  The touches of the security/keys/ files have received the necessary
  acks from Mimi and David. Those changes were necessary to allow for a
  new generic encrypted-key type, and allow the nvdimm sub-system to
  lookup key material referenced by the libnvdimm-sysfs interface.

  Summary:

   - Add support for the security features of nvdimm devices that
     implement a security model similar to ATA hard drive security. The
     security model supports locking access to the media at
     device-power-loss, to be unlocked with a passphrase, and
     secure-erase (crypto-scramble).

     Unlike the ATA security case where the kernel expects device
     security to be managed in a pre-OS environment, the libnvdimm
     security implementation allows key provisioning and key-operations
     at OS runtime. Keys are managed with the kernel's encrypted-keys
     facility to provide data-at-rest security for the libnvdimm key
     material. The usage model mirrors fscrypt key management, but is
     driven via libnvdimm sysfs.

   - Miscellaneous updates for api usage and comment fixes"

* tag 'libnvdimm-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits)
  libnvdimm/security: Quiet security operations
  libnvdimm/security: Add documentation for nvdimm security support
  tools/testing/nvdimm: add Intel DSM 1.8 support for nfit_test
  tools/testing/nvdimm: Add overwrite support for nfit_test
  tools/testing/nvdimm: Add test support for Intel nvdimm security DSMs
  acpi/nfit, libnvdimm/security: add Intel DSM 1.8 master passphrase support
  acpi/nfit, libnvdimm/security: Add security DSM overwrite support
  acpi/nfit, libnvdimm: Add support for issue secure erase DSM to Intel nvdimm
  acpi/nfit, libnvdimm: Add enable/update passphrase support for Intel nvdimms
  acpi/nfit, libnvdimm: Add disable passphrase support to Intel nvdimm.
  acpi/nfit, libnvdimm: Add unlock of nvdimm support for Intel DIMMs
  acpi/nfit, libnvdimm: Add freeze security support to Intel nvdimm
  acpi/nfit, libnvdimm: Introduce nvdimm_security_ops
  keys-encrypted: add nvdimm key format type to encrypted keys
  keys: Export lookup_user_key to external users
  acpi/nfit, libnvdimm: Store dimm id as a member to struct nvdimm
  libnvdimm, namespace: Replace kmemdup() with kstrndup()
  libnvdimm, label: Switch to bitmap_zalloc()
  ACPI/nfit: Adjust annotation for why return 0 if fail to find NFIT at start
  libnvdimm, bus: Check id immediately following ida_simple_get
  ...
2018-12-28 15:05:13 -08:00
Dave Jiang 76ef5e1725 keys: Export lookup_user_key to external users
Export lookup_user_key() symbol in order to allow nvdimm passphrase
update to retrieve user injected keys.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-12-13 17:54:12 -08:00
Paul Gortmaker 876979c930 security: audit and remove any unnecessary uses of module.h
Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends.  That changed
when we forked out support for the latter into the export.h file.
This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig.

The advantage in removing such instances is that module.h itself
sources about 15 other headers; adding significantly to what we feed
cpp, and it can obscure what headers we are effectively using.

Since module.h might have been the implicit source for init.h
(for __init) and for export.h (for EXPORT_SYMBOL) we consider each
instance for the presence of either and replace as needed.

Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-security-module@vger.kernel.org
Cc: linux-integrity@vger.kernel.org
Cc: keyrings@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
2018-12-12 14:58:51 -08:00
Baolin Wang 074d589895 security: keys: Replace time_t/timespec with time64_t
The 'struct key' will use 'time_t' which we try to remove in the
kernel, since 'time_t' is not year 2038 safe on 32bit systems.
Also the 'struct keyring_search_context' will use 'timespec' type
to record current time, which is also not year 2038 safe on 32bit
systems.

Thus this patch replaces 'time_t' with 'time64_t' which is year 2038
safe for 'struct key', and replace 'timespec' with 'time64_t' for the
'struct keyring_search_context', since we only look at the the seconds
part of 'timespec' variable. Moreover we also change the codes where
using the 'time_t' and 'timespec', and we can get current time by
ktime_get_real_seconds() instead of current_kernel_time(), and use
'TIME64_MAX' macro to initialize the 'time64_t' type variable.

Especially in proc.c file, we have replaced 'unsigned long' and 'timespec'
type with 'u64' and 'time64_t' type to save the timeout value, which means
user will get one 'u64' type timeout value by issuing proc_keys_show()
function.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-11-15 16:38:45 +00:00
David Howells 363b02dab0 KEYS: Fix race between updating and finding a negative key
Consolidate KEY_FLAG_INSTANTIATED, KEY_FLAG_NEGATIVE and the rejection
error into one field such that:

 (1) The instantiation state can be modified/read atomically.

 (2) The error can be accessed atomically with the state.

 (3) The error isn't stored unioned with the payload pointers.

This deals with the problem that the state is spread over three different
objects (two bits and a separate variable) and reading or updating them
atomically isn't practical, given that not only can uninstantiated keys
change into instantiated or rejected keys, but rejected keys can also turn
into instantiated keys - and someone accessing the key might not be using
any locking.

The main side effect of this problem is that what was held in the payload
may change, depending on the state.  For instance, you might observe the
key to be in the rejected state.  You then read the cached error, but if
the key semaphore wasn't locked, the key might've become instantiated
between the two reads - and you might now have something in hand that isn't
actually an error code.

The state is now KEY_IS_UNINSTANTIATED, KEY_IS_POSITIVE or a negative error
code if the key is negatively instantiated.  The key_is_instantiated()
function is replaced with key_is_positive() to avoid confusion as negative
keys are also 'instantiated'.

Additionally, barriering is included:

 (1) Order payload-set before state-set during instantiation.

 (2) Order state-read before payload-read when using the key.

Further separate barriering is necessary if RCU is being used to access the
payload content after reading the payload pointers.

Fixes: 146aa8b145 ("KEYS: Merge the type-specific data with the payload data")
Cc: stable@vger.kernel.org # v4.4+
Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
2017-10-18 09:12:40 +01:00
Eric Biggers 237bbd29f7 KEYS: prevent creating a different user's keyrings
It was possible for an unprivileged user to create the user and user
session keyrings for another user.  For example:

    sudo -u '#3000' sh -c 'keyctl add keyring _uid.4000 "" @u
                           keyctl add keyring _uid_ses.4000 "" @u
                           sleep 15' &
    sleep 1
    sudo -u '#4000' keyctl describe @u
    sudo -u '#4000' keyctl describe @us

This is problematic because these "fake" keyrings won't have the right
permissions.  In particular, the user who created them first will own
them and will have full access to them via the possessor permissions,
which can be used to compromise the security of a user's keys:

    -4: alswrv-----v------------  3000     0 keyring: _uid.4000
    -5: alswrv-----v------------  3000     0 keyring: _uid_ses.4000

Fix it by marking user and user session keyrings with a flag
KEY_FLAG_UID_KEYRING.  Then, when searching for a user or user session
keyring by name, skip all keyrings that don't have the flag set.

Fixes: 69664cf16a ("keys: don't generate user and user session keyrings unless they're accessed")
Cc: <stable@vger.kernel.org>	[v2.6.26+]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-09-25 15:19:57 +01:00
Eric Biggers d636bd9f12 KEYS: put keyring if install_session_keyring_to_cred() fails
In join_session_keyring(), if install_session_keyring_to_cred() were to
fail, we would leak the keyring reference, just like in the bug fixed by
commit 23567fd052 ("KEYS: Fix keyring ref leak in
join_session_keyring()").  Fortunately this cannot happen currently, but
we really should be more careful.  Do this by adding and using a new
error label at which the keyring reference is dropped.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-06-09 13:29:46 +10:00
Linus Torvalds 0302e28dee Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

  IMA:
   - provide ">" and "<" operators for fowner/uid/euid rules

  KEYS:
   - add a system blacklist keyring

   - add KEYCTL_RESTRICT_KEYRING, exposes keyring link restriction
     functionality to userland via keyctl()

  LSM:
   - harden LSM API with __ro_after_init

   - add prlmit security hook, implement for SELinux

   - revive security_task_alloc hook

  TPM:
   - implement contextual TPM command 'spaces'"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (98 commits)
  tpm: Fix reference count to main device
  tpm_tis: convert to using locality callbacks
  tpm: fix handling of the TPM 2.0 event logs
  tpm_crb: remove a cruft constant
  keys: select CONFIG_CRYPTO when selecting DH / KDF
  apparmor: Make path_max parameter readonly
  apparmor: fix parameters so that the permission test is bypassed at boot
  apparmor: fix invalid reference to index variable of iterator line 836
  apparmor: use SHASH_DESC_ON_STACK
  security/apparmor/lsm.c: set debug messages
  apparmor: fix boolreturn.cocci warnings
  Smack: Use GFP_KERNEL for smk_netlbl_mls().
  smack: fix double free in smack_parse_opts_str()
  KEYS: add SP800-56A KDF support for DH
  KEYS: Keyring asymmetric key restrict method with chaining
  KEYS: Restrict asymmetric key linkage using a specific keychain
  KEYS: Add a lookup_restriction function for the asymmetric key type
  KEYS: Add KEYCTL_RESTRICT_KEYRING
  KEYS: Consistent ordering for __key_link_begin and restrict check
  KEYS: Add an optional lookup_restriction hook to key_type
  ...
2017-05-03 08:50:52 -07:00
Eric Biggers c9f838d104 KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings
This fixes CVE-2017-7472.

Running the following program as an unprivileged user exhausts kernel
memory by leaking thread keyrings:

	#include <keyutils.h>

	int main()
	{
		for (;;)
			keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
	}

Fix it by only creating a new thread keyring if there wasn't one before.
To make things more consistent, make install_thread_keyring_to_cred()
and install_process_keyring_to_cred() both return 0 if the corresponding
keyring is already present.

Fixes: d84f4f992c ("CRED: Inaugurate COW credentials")
Cc: stable@vger.kernel.org # 2.6.29+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-04-18 15:31:49 +01:00
Elena Reshetova ddb99e118e security, keys: convert key_user.usage from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-04-03 10:49:06 +10:00
Ingo Molnar 8703e8a465 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h>
We are going to split <linux/sched/user.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/user.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
David Howells 5ac7eace2d KEYS: Add a facility to restrict new links into a keyring
Add a facility whereby proposed new links to be added to a keyring can be
vetted, permitting them to be rejected if necessary.  This can be used to
block public keys from which the signature cannot be verified or for which
the signature verification fails.  It could also be used to provide
blacklisting.

This affects operations like add_key(), KEYCTL_LINK and KEYCTL_INSTANTIATE.

To this end:

 (1) A function pointer is added to the key struct that, if set, points to
     the vetting function.  This is called as:

	int (*restrict_link)(struct key *keyring,
			     const struct key_type *key_type,
			     unsigned long key_flags,
			     const union key_payload *key_payload),

     where 'keyring' will be the keyring being added to, key_type and
     key_payload will describe the key being added and key_flags[*] can be
     AND'ed with KEY_FLAG_TRUSTED.

     [*] This parameter will be removed in a later patch when
     	 KEY_FLAG_TRUSTED is removed.

     The function should return 0 to allow the link to take place or an
     error (typically -ENOKEY, -ENOPKG or -EKEYREJECTED) to reject the
     link.

     The pointer should not be set directly, but rather should be set
     through keyring_alloc().

     Note that if called during add_key(), preparse is called before this
     method, but a key isn't actually allocated until after this function
     is called.

 (2) KEY_ALLOC_BYPASS_RESTRICTION is added.  This can be passed to
     key_create_or_update() or key_instantiate_and_link() to bypass the
     restriction check.

 (3) KEY_FLAG_TRUSTED_ONLY is removed.  The entire contents of a keyring
     with this restriction emplaced can be considered 'trustworthy' by
     virtue of being in the keyring when that keyring is consulted.

 (4) key_alloc() and keyring_alloc() take an extra argument that will be
     used to set restrict_link in the new key.  This ensures that the
     pointer is set before the key is published, thus preventing a window
     of unrestrictedness.  Normally this argument will be NULL.

 (5) As a temporary affair, keyring_restrict_trusted_only() is added.  It
     should be passed to keyring_alloc() as the extra argument instead of
     setting KEY_FLAG_TRUSTED_ONLY on a keyring.  This will be replaced in
     a later patch with functions that look in the appropriate places for
     authoritative keys.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-04-11 22:37:37 +01:00
Yevgeny Pats 23567fd052 KEYS: Fix keyring ref leak in join_session_keyring()
This fixes CVE-2016-0728.

If a thread is asked to join as a session keyring the keyring that's already
set as its session, we leak a keyring reference.

This can be tested with the following program:

	#include <stddef.h>
	#include <stdio.h>
	#include <sys/types.h>
	#include <keyutils.h>

	int main(int argc, const char *argv[])
	{
		int i = 0;
		key_serial_t serial;

		serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING,
				"leaked-keyring");
		if (serial < 0) {
			perror("keyctl");
			return -1;
		}

		if (keyctl(KEYCTL_SETPERM, serial,
			   KEY_POS_ALL | KEY_USR_ALL) < 0) {
			perror("keyctl");
			return -1;
		}

		for (i = 0; i < 100; i++) {
			serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING,
					"leaked-keyring");
			if (serial < 0) {
				perror("keyctl");
				return -1;
			}
		}

		return 0;
	}

If, after the program has run, there something like the following line in
/proc/keys:

3f3d898f I--Q---   100 perm 3f3f0000     0     0 keyring   leaked-keyring: empty

with a usage count of 100 * the number of times the program has been run,
then the kernel is malfunctioning.  If leaked-keyring has zero usages or
has been garbage collected, then the problem is fixed.

Reported-by: Yevgeny Pats <yevgeny@perception-point.io>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-01-20 10:50:48 +11:00
David Howells 146aa8b145 KEYS: Merge the type-specific data with the payload data
Merge the type-specific data with the payload data into one four-word chunk
as it seems pointless to keep them separate.

Use user_key_payload() for accessing the payloads of overloaded
user-defined keys.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cifs@vger.kernel.org
cc: ecryptfs@vger.kernel.org
cc: linux-ext4@vger.kernel.org
cc: linux-f2fs-devel@lists.sourceforge.net
cc: linux-nfs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: linux-ima-devel@lists.sourceforge.net
2015-10-21 15:18:36 +01:00
Andy Lutomirski 58319057b7 capabilities: ambient capabilities
Credit where credit is due: this idea comes from Christoph Lameter with
a lot of valuable input from Serge Hallyn.  This patch is heavily based
on Christoph's patch.

===== The status quo =====

On Linux, there are a number of capabilities defined by the kernel.  To
perform various privileged tasks, processes can wield capabilities that
they hold.

Each task has four capability masks: effective (pE), permitted (pP),
inheritable (pI), and a bounding set (X).  When the kernel checks for a
capability, it checks pE.  The other capability masks serve to modify
what capabilities can be in pE.

Any task can remove capabilities from pE, pP, or pI at any time.  If a
task has a capability in pP, it can add that capability to pE and/or pI.
If a task has CAP_SETPCAP, then it can add any capability to pI, and it
can remove capabilities from X.

Tasks are not the only things that can have capabilities; files can also
have capabilities.  A file can have no capabilty information at all [1].
If a file has capability information, then it has a permitted mask (fP)
and an inheritable mask (fI) as well as a single effective bit (fE) [2].
File capabilities modify the capabilities of tasks that execve(2) them.

A task that successfully calls execve has its capabilities modified for
the file ultimately being excecuted (i.e.  the binary itself if that
binary is ELF or for the interpreter if the binary is a script.) [3] In
the capability evolution rules, for each mask Z, pZ represents the old
value and pZ' represents the new value.  The rules are:

  pP' = (X & fP) | (pI & fI)
  pI' = pI
  pE' = (fE ? pP' : 0)
  X is unchanged

For setuid binaries, fP, fI, and fE are modified by a moderately
complicated set of rules that emulate POSIX behavior.  Similarly, if
euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
(primary, fP and fI usually end up being the full set).  For nonroot
users executing binaries with neither setuid nor file caps, fI and fP
are empty and fE is false.

As an extra complication, if you execute a process as nonroot and fE is
set, then the "secure exec" rules are in effect: AT_SECURE gets set,
LD_PRELOAD doesn't work, etc.

This is rather messy.  We've learned that making any changes is
dangerous, though: if a new kernel version allows an unprivileged
program to change its security state in a way that persists cross
execution of a setuid program or a program with file caps, this
persistent state is surprisingly likely to allow setuid or file-capped
programs to be exploited for privilege escalation.

===== The problem =====

Capability inheritance is basically useless.

If you aren't root and you execute an ordinary binary, fI is zero, so
your capabilities have no effect whatsoever on pP'.  This means that you
can't usefully execute a helper process or a shell command with elevated
capabilities if you aren't root.

On current kernels, you can sort of work around this by setting fI to
the full set for most or all non-setuid executable files.  This causes
pP' = pI for nonroot, and inheritance works.  No one does this because
it's a PITA and it isn't even supported on most filesystems.

If you try this, you'll discover that every nonroot program ends up with
secure exec rules, breaking many things.

This is a problem that has bitten many people who have tried to use
capabilities for anything useful.

===== The proposed change =====

This patch adds a fifth capability mask called the ambient mask (pA).
pA does what most people expect pI to do.

pA obeys the invariant that no bit can ever be set in pA if it is not
set in both pP and pI.  Dropping a bit from pP or pI drops that bit from
pA.  This ensures that existing programs that try to drop capabilities
still do so, with a complication.  Because capability inheritance is so
broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
then calling execve effectively drops capabilities.  Therefore,
setresuid from root to nonroot conditionally clears pA unless
SECBIT_NO_SETUID_FIXUP is set.  Processes that don't like this can
re-add bits to pA afterwards.

The capability evolution rules are changed:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (X & fP) | (pI & fI) | pA'
  pI' = pI
  pE' = (fE ? pP' : pA')
  X is unchanged

If you are nonroot but you have a capability, you can add it to pA.  If
you do so, your children get that capability in pA, pP, and pE.  For
example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
automatically bind low-numbered ports.  Hallelujah!

Unprivileged users can create user namespaces, map themselves to a
nonzero uid, and create both privileged (relative to their namespace)
and unprivileged process trees.  This is currently more or less
impossible.  Hallelujah!

You cannot use pA to try to subvert a setuid, setgid, or file-capped
program: if you execute any such program, pA gets cleared and the
resulting evolution rules are unchanged by this patch.

Users with nonzero pA are unlikely to unintentionally leak that
capability.  If they run programs that try to drop privileges, dropping
privileges will still work.

It's worth noting that the degree of paranoia in this patch could
possibly be reduced without causing serious problems.  Specifically, if
we allowed pA to persist across executing non-pA-aware setuid binaries
and across setresuid, then, naively, the only capabilities that could
leak as a result would be the capabilities in pA, and any attacker
*already* has those capabilities.  This would make me nervous, though --
setuid binaries that tried to privilege-separate might fail to do so,
and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
unexpected side effects.  (Whether these unexpected side effects would
be exploitable is an open question.) I've therefore taken the more
paranoid route.  We can revisit this later.

An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
ambient capabilities.  I think that this would be annoying and would
make granting otherwise unprivileged users minor ambient capabilities
(CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
it is with this patch.

===== Footnotes =====

[1] Files that are missing the "security.capability" xattr or that have
unrecognized values for that xattr end up with has_cap set to false.
The code that does that appears to be complicated for no good reason.

[2] The libcap capability mask parsers and formatters are dangerously
misleading and the documentation is flat-out wrong.  fE is *not* a mask;
it's a single bit.  This has probably confused every single person who
has tried to use file capabilities.

[3] Linux very confusingly processes both the script and the interpreter
if applicable, for reasons that elude me.  The results from thinking
about a script's file capabilities and/or setuid bits are mostly
discarded.

Preliminary userspace code is here, but it needs updating:
https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2

Here is a test program that can be used to verify the functionality
(from Christoph):

/*
 * Test program for the ambient capabilities. This program spawns a shell
 * that allows running processes with a defined set of capabilities.
 *
 * (C) 2015 Christoph Lameter <cl@linux.com>
 * Released under: GPL v3 or later.
 *
 *
 * Compile using:
 *
 *	gcc -o ambient_test ambient_test.o -lcap-ng
 *
 * This program must have the following capabilities to run properly:
 * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
 *
 * A command to equip the binary with the right caps is:
 *
 *	setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
 *
 *
 * To get a shell with additional caps that can be inherited by other processes:
 *
 *	./ambient_test /bin/bash
 *
 *
 * Verifying that it works:
 *
 * From the bash spawed by ambient_test run
 *
 *	cat /proc/$$/status
 *
 * and have a look at the capabilities.
 */

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <cap-ng.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Definitions from the kernel header files. These are going to be removed
 * when the /usr/include files have these defined.
 */
#define PR_CAP_AMBIENT 47
#define PR_CAP_AMBIENT_IS_SET 1
#define PR_CAP_AMBIENT_RAISE 2
#define PR_CAP_AMBIENT_LOWER 3
#define PR_CAP_AMBIENT_CLEAR_ALL 4

static void set_ambient_cap(int cap)
{
	int rc;

	capng_get_caps_process();
	rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
	if (rc) {
		printf("Cannot add inheritable cap\n");
		exit(2);
	}
	capng_apply(CAPNG_SELECT_CAPS);

	/* Note the two 0s at the end. Kernel checks for these */
	if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
		perror("Cannot set cap");
		exit(1);
	}
}

int main(int argc, char **argv)
{
	int rc;

	set_ambient_cap(CAP_NET_RAW);
	set_ambient_cap(CAP_NET_ADMIN);
	set_ambient_cap(CAP_SYS_NICE);

	printf("Ambient_test forking shell\n");
	if (execv(argv[1], argv + 1))
		perror("Cannot exec");

	return 0;
}

Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
David Howells 0c903ab64f KEYS: Make the key matching functions return bool
Make the key matching functions pointed to by key_match_data::cmp return bool
rather than int.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:08 +01:00
David Howells 462919591a KEYS: Preparse match data
Preparse the match data.  This provides several advantages:

 (1) The preparser can reject invalid criteria up front.

 (2) The preparser can convert the criteria to binary data if necessary (the
     asymmetric key type really wants to do binary comparison of the key IDs).

 (3) The preparser can set the type of search to be performed.  This means
     that it's not then a one-off setting in the key type.

 (4) The preparser can set an appropriate comparator function.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:02 +01:00
Mimi Zohar c124bde28b KEYS: initialize root uid and session keyrings early
In order to create the integrity keyrings (eg. _evm, _ima), root's
uid and session keyrings need to be initialized early.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-25 17:17:01 +01:00
David Howells ccc3e6d9c9 KEYS: Define a __key_get() wrapper to use rather than atomic_inc()
Define a __key_get() wrapper to use rather than atomic_inc() on the key usage
count as this makes it easier to hook in refcount error debugging.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:16 +01:00
David Howells 4bdf0bc300 KEYS: Introduce a search context structure
Search functions pass around a bunch of arguments, each of which gets copied
with each call.  Introduce a search context structure to hold these.

Whilst we're at it, create a search flag that indicates whether the search
should be directly to the description or whether it should iterate through all
keys looking for a non-description match.

This will be useful when keyrings use a generic data struct with generic
routines to manage their content as the search terms can just be passed
through to the iterator callback function.

Also, for future use, the data to be supplied to the match function is
separated from the description pointer in the search context.  This makes it
clear which is being supplied.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:15 +01:00
David Howells 61ea0c0ba9 KEYS: Skip key state checks when checking for possession
Skip key state checks (invalidation, revocation and expiration) when checking
for possession.  Without this, keys that have been marked invalid, revoked
keys and expired keys are not given a possession attribute - which means the
possessor is not granted any possession permits and cannot do anything with
them unless they also have one a user, group or other permit.

This causes failures in the keyutils test suite's revocation and expiration
tests now that commit 96b5c8fea6 reduced the
initial permissions granted to a key.

The failures are due to accesses to revoked and expired keys being given
EACCES instead of EKEYREVOKED or EKEYEXPIRED.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:13 +01:00
David Howells 0da9dfdd2c keys: fix race with concurrent install_user_keyrings()
This fixes CVE-2013-1792.

There is a race in install_user_keyrings() that can cause a NULL pointer
dereference when called concurrently for the same user if the uid and
uid-session keyrings are not yet created.  It might be possible for an
unprivileged user to trigger this by calling keyctl() from userspace in
parallel immediately after logging in.

Assume that we have two threads both executing lookup_user_key(), both
looking for KEY_SPEC_USER_SESSION_KEYRING.

	THREAD A			THREAD B
	===============================	===============================
					==>call install_user_keyrings();
	if (!cred->user->session_keyring)
	==>call install_user_keyrings()
					...
					user->uid_keyring = uid_keyring;
	if (user->uid_keyring)
		return 0;
	<==
	key = cred->user->session_keyring [== NULL]
					user->session_keyring = session_keyring;
	atomic_inc(&key->usage); [oops]

At the point thread A dereferences cred->user->session_keyring, thread B
hasn't updated user->session_keyring yet, but thread A assumes it is
populated because install_user_keyrings() returned ok.

The race window is really small but can be exploited if, for example,
thread B is interrupted or preempted after initializing uid_keyring, but
before doing setting session_keyring.

This couldn't be reproduced on a stock kernel.  However, after placing
systemtap probe on 'user->session_keyring = session_keyring;' that
introduced some delay, the kernel could be crashed reliably.

Fix this by checking both pointers before deciding whether to return.
Alternatively, the test could be done away with entirely as it is checked
inside the mutex - but since the mutex is global, that may not be the best
way.

Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-12 16:44:31 +11:00
Eric W. Biederman ba0e3427b0 userns: Stop oopsing in key_change_session_keyring
Dave Jones <davej@redhat.com> writes:
> Just hit this on Linus' current tree.
>
> [   89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
> [   89.623111] IP: [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0
> [   89.624901] Oops: 0000 [#1] PREEMPT SMP
> [   89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii
> [   89.637846] CPU 2
> [   89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H
> [   89.639850] RIP: 0010:[<ffffffff810784b0>]  [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.641161] RSP: 0018:ffff880115657eb8  EFLAGS: 00010207
> [   89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000
> [   89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600
> [   89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000
> [   89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600
> [   89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000
> [   89.647431] FS:  00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000
> [   89.648660] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0
> [   89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [   89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490)
> [   89.654128] Stack:
> [   89.654433]  0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78
> [   89.655769]  ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000
> [   89.657073]  ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58
> [   89.658399] Call Trace:
> [   89.658822]  [<ffffffff812c7d9b>] key_change_session_keyring+0xfb/0x140
> [   89.659845]  [<ffffffff8106c665>] task_work_run+0xa5/0xd0
> [   89.660698]  [<ffffffff81002911>] do_notify_resume+0x71/0xb0
> [   89.661581]  [<ffffffff816c9a4a>] int_signal+0x12/0x17
> [   89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b
> [   89.667778] RIP  [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [   89.668733]  RSP <ffff880115657eb8>
> [   89.669301] CR2: 00000000000000c8
>
> My fastest trinity induced oops yet!
>
>
> Appears to be..
>
>                 if ((set_ns == subset_ns->parent)  &&
>      850:       48 8b 8a c8 00 00 00    mov    0xc8(%rdx),%rcx
>
> from the inlined cred_cap_issubset

By historical accident we have been reading trying to set new->user_ns
from new->user_ns.  Which is totally silly as new->user_ns is NULL (as
is every other field in new except session_keyring at that point).

The intent is clearly to copy all of the fields from old to new so copy
old->user_ns into  into new->user_ns.

Cc: stable@vger.kernel.org
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-03 19:35:38 -08:00
David Howells fe9453a1dc KEYS: Revert one application of "Fix unreachable code" patch
A patch to fix some unreachable code in search_my_process_keyrings() got
applied twice by two different routes upstream as commits e67eab39be
and b010520ab3 (both "fix unreachable code").

Unfortunately, the second application removed something it shouldn't
have and this wasn't detected by GIT.  This is due to the patch not
having sufficient lines of context to distinguish the two places of
application.

The effect of this is relatively minor: inside the kernel, the keyring
search routines may search multiple keyrings and then prioritise the
errors if no keys or negative keys are found in any of them.  With the
extra deletion, the presence of a negative key in the thread keyring
(causing ENOKEY) is incorrectly overridden by an error searching the
process keyring.

So revert the second application of the patch.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 07:56:25 -08:00
Alan Cox e67eab39be keys: fix unreachable code
We set ret to NULL then test it. Remove the bogus test

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 17:40:21 -08:00
Linus Torvalds 2a74dbb9a8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "A quiet cycle for the security subsystem with just a few maintenance
  updates."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  Smack: create a sysfs mount point for smackfs
  Smack: use select not depends in Kconfig
  Yama: remove locking from delete path
  Yama: add RCU to drop read locking
  drivers/char/tpm: remove tasklet and cleanup
  KEYS: Use keyring_alloc() to create special keyrings
  KEYS: Reduce initial permissions on keys
  KEYS: Make the session and process keyrings per-thread
  seccomp: Make syscall skipping and nr changes more consistent
  key: Fix resource leak
  keys: Fix unreachable code
  KEYS: Add payload preparsing opportunity prior to key instantiate or update
2012-12-16 15:40:50 -08:00
Jiri Kosina 3bd7bf1f0f Merge branch 'master' into for-next
Sync up with Linus' tree to be able to apply Cesar's patch
against newer version of the code.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-10-28 19:29:19 +01:00
Alan Cox b010520ab3 keys: Fix unreachable code
We set ret to NULL then test it. Remove the bogus test

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-10-25 18:00:27 +02:00
David Howells 96b5c8fea6 KEYS: Reduce initial permissions on keys
Reduce the initial permissions on new keys to grant the possessor everything,
view permission only to the user (so the keys can be seen in /proc/keys) and
nothing else.

This gives the creator a chance to adjust the permissions mask before other
processes can access the new key or create a link to it.

To aid with this, keyring_alloc() now takes a permission argument rather than
setting the permissions itself.

The following permissions are now set:

 (1) The user and user-session keyrings grant the user that owns them full
     permissions and grant a possessor everything bar SETATTR.

 (2) The process and thread keyrings grant the possessor full permissions but
     only grant the user VIEW.  This permits the user to see them in
     /proc/keys, but not to do anything with them.

 (3) Anonymous session keyrings grant the possessor full permissions, but only
     grant the user VIEW and READ.  This means that the user can see them in
     /proc/keys and can list them, but nothing else.  Possibly READ shouldn't
     be provided either.

 (4) Named session keyrings grant everything an anonymous session keyring does,
     plus they grant the user LINK permission.  The whole point of named
     session keyrings is that others can also subscribe to them.  Possibly this
     should be a separate permission to LINK.

 (5) The temporary session keyring created by call_sbin_request_key() gets the
     same permissions as an anonymous session keyring.

 (6) Keys created by add_key() get VIEW, SEARCH, LINK and SETATTR for the
     possessor, plus READ and/or WRITE if the key type supports them.  The used
     only gets VIEW now.

 (7) Keys created by request_key() now get the same as those created by
     add_key().

Reported-by: Lennart Poettering <lennart@poettering.net>
Reported-by: Stef Walter <stefw@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-10-02 19:24:56 +01:00
David Howells 3a50597de8 KEYS: Make the session and process keyrings per-thread
Make the session keyring per-thread rather than per-process, but still
inherited from the parent thread to solve a problem with PAM and gdm.

The problem is that join_session_keyring() will reject attempts to change the
session keyring of a multithreaded program but gdm is now multithreaded before
it gets to the point of starting PAM and running pam_keyinit to create the
session keyring.  See:

	https://bugs.freedesktop.org/show_bug.cgi?id=49211

The reason that join_session_keyring() will only change the session keyring
under a single-threaded environment is that it's hard to alter the other
thread's credentials to effect the change in a multi-threaded program.  The
problems are such as:

 (1) How to prevent two threads both running join_session_keyring() from
     racing.

 (2) Another thread's credentials may not be modified directly by this process.

 (3) The number of threads is uncertain whilst we're not holding the
     appropriate spinlock, making preallocation slightly tricky.

 (4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get
     another thread to replace its keyring, but that means preallocating for
     each thread.

A reasonable way around this is to make the session keyring per-thread rather
than per-process and just document that if you want a common session keyring,
you must get it before you spawn any threads - which is the current situation
anyway.

Whilst we're at it, we can the process keyring behave in the same way.  This
means we can clean up some of the ickyness in the creds code.

Basically, after this patch, the session, process and thread keyrings are about
inheritance rules only and not about sharing changes of keyring.

Reported-by: Mantas M. <grawity@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ray Strode <rstrode@redhat.com>
2012-10-02 19:24:29 +01:00
Alan Cox 631527703d keys: Fix unreachable code
We set ret to NULL then test it. Remove the bogus test

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-09-28 12:20:02 +01:00
Eric W. Biederman 9a56c2db49 userns: Convert security/keys to the new userns infrastructure
- Replace key_user ->user_ns equality checks with kuid_has_mapping checks.
- Use from_kuid to generate key descriptions
- Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t
- Avoid potential problems with file descriptor passing by displaying
  keys in the user namespace of the opener of key status proc files.

Cc: linux-security-module@vger.kernel.org
Cc: keyrings@linux-nfs.org
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-13 18:28:02 -07:00
Al Viro 67d1214551 merge task_work and rcu_head, get rid of separate allocation for keyring case
task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:57:56 +04:00
Al Viro 41f9d29f09 trimming task_work: kill ->data
get rid of the only user of ->data; this is _not_ the final variant - in the
end we'll have task_work and rcu_head identical and just use cred->rcu,
at which point the separate allocation will be gone completely.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:57:54 +04:00
Oleg Nesterov 413cd3d9ab keys: change keyctl_session_to_parent() to use task_work_add()
Change keyctl_session_to_parent() to use task_work_add() and move
key_replace_session_keyring() logic into task_work->func().

Note that we do task_work_cancel() before task_work_add() to ensure that
only one work can be pending at any time.  This is important, we must not
allow user-space to abuse the parent's ->task_works list.

The callback, replace_session_keyring(), checks PF_EXITING.  I guess this
is not really needed but looks better.

As a side effect, this fixes the (unlikely) race.  The callers of
key_replace_session_keyring() and keyctl_session_to_parent() lack the
necessary barriers, the parent can miss the request.

Now we can remove task_struct->replacement_session_keyring and related
code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Smith <dsmith@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-23 22:11:23 -04:00
Linus Torvalds 644473e9c6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace enhancements from Eric Biederman:
 "This is a course correction for the user namespace, so that we can
  reach an inexpensive, maintainable, and reasonably complete
  implementation.

  Highlights:
   - Config guards make it impossible to enable the user namespace and
     code that has not been converted to be user namespace safe.

   - Use of the new kuid_t type ensures the if you somehow get past the
     config guards the kernel will encounter type errors if you enable
     user namespaces and attempt to compile in code whose permission
     checks have not been updated to be user namespace safe.

   - All uids from child user namespaces are mapped into the initial
     user namespace before they are processed.  Removing the need to add
     an additional check to see if the user namespace of the compared
     uids remains the same.

   - With the user namespaces compiled out the performance is as good or
     better than it is today.

   - For most operations absolutely nothing changes performance or
     operationally with the user namespace enabled.

   - The worst case performance I could come up with was timing 1
     billion cache cold stat operations with the user namespace code
     enabled.  This went from 156s to 164s on my laptop (or 156ns to
     164ns per stat operation).

   - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
     Most uid/gid setting system calls treat these value specially
     anyway so attempting to use -1 as a uid would likely cause
     entertaining failures in userspace.

   - If setuid is called with a uid that can not be mapped setuid fails.
     I have looked at sendmail, login, ssh and every other program I
     could think of that would call setuid and they all check for and
     handle the case where setuid fails.

   - If stat or a similar system call is called from a context in which
     we can not map a uid we lie and return overflowuid.  The LFS
     experience suggests not lying and returning an error code might be
     better, but the historical precedent with uids is different and I
     can not think of anything that would break by lying about a uid we
     can't map.

   - Capabilities are localized to the current user namespace making it
     safe to give the initial user in a user namespace all capabilities.

  My git tree covers all of the modifications needed to convert the core
  kernel and enough changes to make a system bootable to runlevel 1."

Fix up trivial conflicts due to nearby independent changes in fs/stat.c

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
  userns:  Silence silly gcc warning.
  cred: use correct cred accessor with regards to rcu read lock
  userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
  userns: Convert cgroup permission checks to use uid_eq
  userns: Convert tmpfs to use kuid and kgid where appropriate
  userns: Convert sysfs to use kgid/kuid where appropriate
  userns: Convert sysctl permission checks to use kuid and kgids.
  userns: Convert proc to use kuid/kgid where appropriate
  userns: Convert ext4 to user kuid/kgid where appropriate
  userns: Convert ext3 to use kuid/kgid where appropriate
  userns: Convert ext2 to use kuid/kgid where appropriate.
  userns: Convert devpts to use kuid/kgid where appropriate
  userns: Convert binary formats to use kuid/kgid where appropriate
  userns: Add negative depends on entries to avoid building code that is userns unsafe
  userns: signal remove unnecessary map_cred_ns
  userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
  userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
  userns: Convert stat to return values mapped from kuids and kgids
  userns: Convert user specfied uids and gids in chown into kuids and kgid
  userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
  ...
2012-05-23 17:42:39 -07:00
David Howells 31d5a79d7f KEYS: Do LRU discard in full keyrings
Do an LRU discard in keyrings that are full rather than returning ENFILE.  To
perform this, a time_t is added to the key struct and updated by the creation
of a link to a key and by a key being found as the result of a search.  At the
completion of a successful search, the keyrings in the path between the root of
the search and the first found link to it also have their last-used times
updated.

Note that discarding a link to a key from a keyring does not necessarily
destroy the key as there may be references held by other places.

An alternate discard method that might suffice is to perform FIFO discard from
the keyring, using the spare 2-byte hole in the keylist header as the index of
the next link to be discarded.

This is useful when using a keyring as a cache for DNS results or foreign
filesystem IDs.


This can be tested by the following.  As root do:

	echo 1000 >/proc/sys/kernel/keys/root_maxkeys

	kr=`keyctl newring foo @s`
	for ((i=0; i<2000; i++)); do keyctl add user a$i a $kr; done

Without this patch ENFILE should be reported when the keyring fills up.  With
this patch, the keyring discards keys in an LRU fashion.  Note that the stored
LRU time has a granularity of 1s.

After doing this, /proc/key-users can be observed and should show that most of
the 2000 keys have been discarded:

	[root@andromeda ~]# cat /proc/key-users
	    0:   517 516/516 513/1000 5249/20000

The "513/1000" here is the number of quota-accounted keys present for this user
out of the maximum permitted.

In /proc/keys, the keyring shows the number of keys it has and the number of
slots it has allocated:

	[root@andromeda ~]# grep foo /proc/keys
	200c64c4 I--Q--     1 perm 3b3f0000     0     0 keyring   foo: 509/509

The maximum is (PAGE_SIZE - header) / key pointer size.  That's typically 509
on a 64-bit system and 1020 on a 32-bit system.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-05-11 10:56:56 +01:00
Eric W. Biederman 0093ccb68f cred: Refcount the user_ns pointed to by the cred.
struct user_struct will shortly loose it's user_ns reference
so make the cred user_ns reference a proper reference complete
with reference counting.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-04-07 16:55:52 -07:00
Eric W. Biederman c4a4d60379 userns: Use cred->user_ns instead of cred->user->user_ns
Optimize performance and prepare for the removal of the user_ns reference
from user_struct.  Remove the slow long walk through cred->user->user_ns and
instead go straight to cred->user_ns.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-04-07 16:55:51 -07:00
Dan Carpenter f67dabbdde KEYS: testing wrong bit for KEY_FLAG_REVOKED
The test for "if (cred->request_key_auth->flags & KEY_FLAG_REVOKED) {"
should actually testing that the (1 << KEY_FLAG_REVOKED) bit is set.
The current code actually checks for KEY_FLAG_DEAD.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-03-07 11:12:06 +11:00
David Howells 3ecf1b4f34 KEYS: keyctl_get_keyring_ID() should create a session keyring if create flag set
The keyctl call:

	keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 1)

should create a session keyring if the process doesn't have one of its own
because the create flag argument is set - rather than subscribing to and
returning the user-session keyring as:

	keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0)

will do.

This can be tested by commenting out pam_keyinit in the /etc/pam.d files and
running the following program a couple of times in a row:

	#include <stdio.h>
	#include <stdlib.h>
	#include <keyutils.h>
	int main(int argc, char *argv[])
	{
		key_serial_t uk, usk, sk, nsk;
		uk  = keyctl_get_keyring_ID(KEY_SPEC_USER_KEYRING, 0);
		usk = keyctl_get_keyring_ID(KEY_SPEC_USER_SESSION_KEYRING, 0);
		sk  = keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0);
		nsk = keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 1);
		printf("keys: %08x %08x %08x %08x\n", uk, usk, sk, nsk);
		return 0;
	}

Without this patch, I see:

	keys: 3975ddc7 119c0c66 119c0c66 119c0c66
	keys: 3975ddc7 119c0c66 119c0c66 119c0c66

With this patch, I see:

	keys: 2cb4997b 34112878 34112878 17db2ce3
	keys: 2cb4997b 34112878 34112878 39f3c73e

As can be seen, the session keyring starts off the same as the user-session
keyring each time, but with the patch a new session keyring is created when
the create flag is set.

Reported-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-23 09:57:34 +10:00
David Howells 995995378f KEYS: If install_session_keyring() is given a keyring, it should install it
If install_session_keyring() is given a keyring, it should install it rather
than just creating a new one anyway.  This was accidentally broken in:

	commit d84f4f992c
	Author: David Howells <dhowells@redhat.com>
	Date:   Fri Nov 14 10:39:23 2008 +1100
	Subject: CRED: Inaugurate COW credentials

The impact of that commit is that pam_keyinit no longer works correctly if
'force' isn't specified against a login process. This is because:

	keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0)

now always creates a new session keyring and thus the check whether the session
keyring and the user-session keyring are the same is always false.  This leads
pam_keyinit to conclude that a session keyring is installed and it shouldn't be
revoked by pam_keyinit here if 'revoke' is specified.

Any system that specifies 'force' against pam_keyinit in the PAM configuration
files for login methods (login, ssh, su -l, kdm, etc.) is not affected since
that bypasses the broken check and forces the creation of a new session keyring
anyway (for which the revoke flag is not cleared) - and any subsequent call to
pam_keyinit really does have a session keyring already installed, and so the
check works correctly there.

Reverting to the previous behaviour will cause the kernel to subscribe the
process to the user-session keyring as its session keyring if it doesn't have a
session keyring of its own.  pam_keyinit will detect this and install a new
session keyring anyway (and won't clear the revert flag).

This can be tested by commenting out pam_keyinit in the /etc/pam.d files and
running the following program a couple of times in a row:

	#include <stdio.h>
	#include <stdlib.h>
	#include <keyutils.h>
	int main(int argc, char *argv[])
	{
		key_serial_t uk, usk, sk;
		uk = keyctl_get_keyring_ID(KEY_SPEC_USER_KEYRING, 0);
		usk = keyctl_get_keyring_ID(KEY_SPEC_USER_SESSION_KEYRING, 0);
		sk = keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0);
		printf("keys: %08x %08x %08x\n", uk, usk, sk);
		return 0;
	}

Without the patch, I see:

	keys: 3884e281 24c4dfcf 22825f8e
	keys: 3884e281 24c4dfcf 068772be

With the patch, I see:

	keys: 26be9c83 0e755ce0 0e755ce0
	keys: 26be9c83 0e755ce0 0e755ce0

As can be seen, with the patch, the session keyring is the same as the
user-session keyring each time; without the patch a new session keyring is
generated each time.

Reported-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-23 09:57:33 +10:00
Serge E. Hallyn f7285b5d63 Set cred->user_ns in key_replace_session_keyring
Since this cred was not created with copy_creds(), it needs to get
initialized.  Otherwise use of syscall(__NR_keyctl, KEYCTL_SESSION_TO_PARENT);
can lead to a NULL deref.  Thanks to Robert for finding this.

But introduced by commit 47a150edc2 ("Cache user_ns in struct cred").

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Reported-by: Robert Święcki <robert@swiecki.net>
Cc: David Howells <dhowells@redhat.com>
Cc: stable@kernel.org (2.6.39)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 13:49:19 -07:00
David Howells 78b7280cce KEYS: Improve /proc/keys
Improve /proc/keys by:

 (1) Don't attempt to summarise the payload of a negated key.  It won't have
     one.  To this end, a helper function - key_is_instantiated() has been
     added that allows the caller to find out whether the key is positively
     instantiated (as opposed to being uninstantiated or negatively
     instantiated).

 (2) Do show keys that are negative, expired or revoked rather than hiding
     them.  This requires an override flag (no_state_check) to be passed to
     search_my_process_keyrings() and keyring_search_aux() to suppress this
     check.

     Without this, keys that are possessed by the caller, but only grant
     permissions to the caller if possessed are skipped as the possession check
     fails.

     Keys that are visible due to user, group or other checks are visible with
     or without this patch.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-03-17 11:59:32 +11:00
David Howells 973c9f4f49 KEYS: Fix up comments in key management code
Fix up comments in the key management code.  No functional changes.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-21 14:59:30 -08:00
David Howells a8b17ed019 KEYS: Do some style cleanup in the key management code.
Do a bit of a style clean up in the key management code.  No functional
changes.

Done using:

  perl -p -i -e 's!^/[*]*/\n!!' security/keys/*.c
  perl -p -i -e 's!} /[*] end [a-z0-9_]*[(][)] [*]/\n!}\n!' security/keys/*.c
  sed -i -s -e ": next" -e N -e 's/^\n[}]$/}/' -e t -e P -e 's/^.*\n//' -e "b next" security/keys/*.c

To remove /*****/ lines, remove comments on the closing brace of a
function to name the function and remove blank lines before the closing
brace of a function.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-21 14:59:29 -08:00
Andi Kleen 27d6379894 Fix install_process_keyring error handling
Fix an incorrect error check that returns 1 for error instead of the
expected error code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-28 09:02:15 -07:00
David Howells 927942aabb KEYS: Make /proc/keys check to see if a key is possessed before security check
Make /proc/keys check to see if the calling process possesses each key before
performing the security check.  The possession check can be skipped if the key
doesn't have the possessor-view permission bit set.

This causes the keys a process possesses to show up in /proc/keys, even if they
don't have matching user/group/other view permissions.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:34:27 +10:00