Commit graph

1578 commits

Author SHA1 Message Date
Stephen Smalley b78b7d59bd selinux: make default_noexec read-only after init
SELinux checks whether VM_EXEC is set in the VM_DATA_DEFAULT_FLAGS
during initialization and saves the result in default_noexec for use
in its mmap and mprotect hook function implementations to decide
whether to apply EXECMEM, EXECHEAP, EXECSTACK, and EXECMOD checks.
Mark default_noexec as ro_after_init to prevent later clearing it
and thereby disabling these checks.  It is only set legitimately from
init code.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-01-10 12:26:20 -05:00
Ravi Kumar Siddojigari fe49c7e4f8 selinux: move ibpkeys code under CONFIG_SECURITY_INFINIBAND.
Move cache based  pkey sid  retrieval code which was added
with commit "409dcf31" under CONFIG_SECURITY_INFINIBAND.
As its  going to alloc a new cache which impacts
low RAM devices which was enabled by default.

Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
[PM: checkpatch.pl cleanups, fixed capitalization in the description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-01-10 11:56:37 -05:00
Huaisheng Ye b82f3f6894 selinux: remove redundant msg_msg_alloc_security
selinux_msg_msg_alloc_security only calls msg_msg_alloc_security but
do nothing else. And also msg_msg_alloc_security is just used by the
former.

Remove the redundant function to simplify the code.

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-01-10 11:32:13 -05:00
Stephen Smalley d41415eb5e Documentation,selinux: fix references to old selinuxfs mount point
selinuxfs was originally mounted on /selinux, and various docs and
kconfig help texts referred to nodes under it.  In Linux 3.0,
/sys/fs/selinux was introduced as the preferred mount point for selinuxfs.
Fix all the old references to /selinux/ to /sys/fs/selinux/.
While we are there, update the description of the selinux boot parameter
to reflect the fact that the default value is always 1 since
commit be6ec88f41 ("selinux: Remove SECURITY_SELINUX_BOOTPARAM_VALUE")
and drop discussion of runtime disable since it is deprecated.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-01-07 12:46:53 -05:00
Paul Moore 89b223bfb8 selinux: deprecate disabling SELinux and runtime
Deprecate the CONFIG_SECURITY_SELINUX_DISABLE functionality.  The
code was originally developed to make it easier for Linux
distributions to support architectures where adding parameters to the
kernel command line was difficult.  Unfortunately, supporting runtime
disable meant we had to make some security trade-offs when it came to
the LSM hooks, as documented in the Kconfig help text:

  NOTE: selecting this option will disable the '__ro_after_init'
  kernel hardening feature for security hooks.   Please consider
  using the selinux=0 boot parameter instead of enabling this
  option.

Fortunately it looks as if that the original motivation for the
runtime disable functionality is gone, and Fedora/RHEL appears to be
the only major distribution enabling this capability at build time
so we are now taking steps to remove it entirely from the kernel.
The first step is to mark the functionality as deprecated and print
an error when it is used (what this patch is doing).  As Fedora/RHEL
makes progress in transitioning the distribution away from runtime
disable, we will introduce follow-up patches over several kernel
releases which will block for increasing periods of time when the
runtime disable is used.  Finally we will remove the option entirely
once we believe all users have moved to the kernel cmdline approach.

Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-01-07 10:19:43 -05:00
Hridya Valsaraju 7a4b519474 selinux: allow per-file labelling for binderfs
This patch allows genfscon per-file labeling for binderfs.
This is required to have separate permissions to allow
access to binder, hwbinder and vndbinder devices which are
relocating to binderfs.

Acked-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-01-06 21:11:18 -05:00
liuyang34 7e78c87514 selinuxfs: use scnprintf to get real length for inode
The return value of snprintf maybe over the size of TMPBUFLEN, use
scnprintf instead in sel_read_class and sel_read_perm.

Signed-off-by: liuyang34 <liuyang34@xiaomi.com>
[PM: cleaned up the description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-01-06 21:05:57 -05:00
YueHaibing f126853402 selinux: remove set but not used variable 'sidtab'
security/selinux/ss/services.c: In function security_port_sid:
security/selinux/ss/services.c:2346:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]
security/selinux/ss/services.c: In function security_ib_endport_sid:
security/selinux/ss/services.c:2435:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]
security/selinux/ss/services.c: In function security_netif_sid:
security/selinux/ss/services.c:2480:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]
security/selinux/ss/services.c: In function security_fs_use:
security/selinux/ss/services.c:2831:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]

Since commit 66f8e2f03c ("selinux: sidtab reverse lookup hash table")
'sidtab' is not used any more, so remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-24 14:34:01 -05:00
Paul Moore 15b590a81f selinux: ensure the policy has been loaded before reading the sidtab stats
Check to make sure we have loaded a policy before we query the
sidtab's hash stats.  Failure to do so could result in a kernel
panic/oops due to a dereferenced NULL pointer.

Fixes: 66f8e2f03c ("selinux: sidtab reverse lookup hash table")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-23 16:38:36 -05:00
Jaihind Yadav 030b995ad9 selinux: ensure we cleanup the internal AVC counters on error in avc_update()
In AVC update we don't call avc_node_kill() when avc_xperms_populate()
fails, resulting in the avc->avc_cache.active_nodes counter having a
false value.  In last patch this changes was missed , so correcting it.

Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Signed-off-by: Jaihind Yadav <jaihindyadav@codeaurora.org>
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
[PM: merge fuzz, minor description cleanup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-21 10:59:21 -05:00
Stephen Smalley 5c108d4e18 selinux: randomize layout of key structures
Randomize the layout of key selinux data structures.
Initially this is applied to the selinux_state, selinux_ss,
policydb, and task_security_struct data structures.

NB To test/use this mechanism, one must install the
necessary build-time dependencies, e.g. gcc-plugin-devel on Fedora,
and enable CONFIG_GCC_PLUGIN_RANDSTRUCT in the kernel configuration.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Kees Cook <keescook@chromium.org>
[PM: double semi-colon fixed]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-18 21:26:06 -05:00
Stephen Smalley 6c5a682e64 selinux: clean up selinux_enabled/disabled/enforcing_boot
Rename selinux_enabled to selinux_enabled_boot to make it clear that
it only reflects whether SELinux was enabled at boot.  Replace the
references to it in the MAC_STATUS audit log in sel_write_enforce()
with hardcoded "1" values because this code is only reachable if SELinux
is enabled and does not change its value, and update the corresponding
MAC_STATUS audit log in sel_write_disable().  Stop clearing
selinux_enabled in selinux_disable() since it is not used outside of
initialization code that runs before selinux_disable() can be reached.
Mark both selinux_enabled_boot and selinux_enforcing_boot as __initdata
since they are only used in initialization code.

Wrap the disabled field in the struct selinux_state with
CONFIG_SECURITY_SELINUX_DISABLE since it is only used for
runtime disable.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-18 21:22:46 -05:00
Yang Guo 210a292874 selinux: remove unnecessary selinux cred request
task_security_struct was obtained at the beginning of may_create
and selinux_inode_init_security, no need to obtain again.
may_create will be called very frequently when create dir and file.

Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Yang Guo <guoyang2@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-12 08:50:39 -05:00
Paul Moore d8db60cb23 selinux: ensure we cleanup the internal AVC counters on error in avc_insert()
Fix avc_insert() to call avc_node_kill() if we've already allocated
an AVC node and the code fails to insert the node in the cache.

Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Reported-by: rsiddoji@codeaurora.org
Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-10 14:16:53 -05:00
Stephen Smalley 5298d0b9b9 selinux: clean up selinux_inode_permission MAY_NOT_BLOCK tests
Through a somewhat convoluted series of changes, we have ended up
with multiple unnecessary occurrences of (flags & MAY_NOT_BLOCK)
tests in selinux_inode_permission().  Clean it up and simplify.
No functional change.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-09 18:47:27 -05:00
Stephen Smalley 0188d5c025 selinux: fall back to ref-walk if audit is required
commit bda0be7ad9 ("security: make inode_follow_link RCU-walk aware")
passed down the rcu flag to the SELinux AVC, but failed to adjust the
test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY.
Previously, we only returned -ECHILD if generating an audit record with
LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission.
Move the handling of MAY_NOT_BLOCK to avc_audit() and its inlined
equivalent in selinux_inode_permission() immediately after we determine
that audit is required, and always fall back to ref-walk in this case.

Fixes: bda0be7ad9 ("security: make inode_follow_link RCU-walk aware")
Reported-by: Will Deacon <will@kernel.org>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-09 18:37:47 -05:00
Stephen Smalley 1a37079c23 selinux: revert "stop passing MAY_NOT_BLOCK to the AVC upon follow_link"
This reverts commit e46e01eebb ("selinux: stop passing MAY_NOT_BLOCK
to the AVC upon follow_link"). The correct fix is to instead fall
back to ref-walk if audit is required irrespective of the specific
audit data type.  This is done in the next commit.

Fixes: e46e01eebb ("selinux: stop passing MAY_NOT_BLOCK to the AVC upon follow_link")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-09 18:28:56 -05:00
Stephen Smalley 59438b4647 security,lockdown,selinux: implement SELinux lockdown
Implement a SELinux hook for lockdown.  If the lockdown module is also
enabled, then a denial by the lockdown module will take precedence over
SELinux, so SELinux can only further restrict lockdown decisions.
The SELinux hook only distinguishes at the granularity of integrity
versus confidentiality similar to the lockdown module, but includes the
full lockdown reason as part of the audit record as a hint in diagnosing
what triggered the denial.  To support this auditing, move the
lockdown_reasons[] string array from being private to the lockdown
module to the security framework so that it can be used by the lsm audit
code and so that it is always available even when the lockdown module
is disabled.

Note that the SELinux implementation allows the integrity and
confidentiality reasons to be controlled independently from one another.
Thus, in an SELinux policy, one could allow operations that specify
an integrity reason while blocking operations that specify a
confidentiality reason. The SELinux hook implementation is
stricter than the lockdown module in validating the provided reason value.

Sample AVC audit output from denials:
avc:  denied  { integrity } for pid=3402 comm="fwupd"
 lockdown_reason="/dev/mem,kmem,port" scontext=system_u:system_r:fwupd_t:s0
 tcontext=system_u:system_r:fwupd_t:s0 tclass=lockdown permissive=0

avc:  denied  { confidentiality } for pid=4628 comm="cp"
 lockdown_reason="/proc/kcore access"
 scontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
 tcontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
 tclass=lockdown permissive=0

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
[PM: some merge fuzz do the the perf hooks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-09 17:53:58 -05:00
Ondrej Mosnacek d97bd23c2d selinux: cache the SID -> context string translation
Translating a context struct to string can be quite slow, especially if
the context has a lot of category bits set. This can cause quite
noticeable performance impact in situations where the translation needs
to be done repeatedly. A common example is a UNIX datagram socket with
the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
when receiving log messages via datagram socket. This scenario can be
reproduced with:

    cat /dev/urandom | base64 | logger &
    timeout 30s perf record -p $(pidof systemd-journald) -a -g
    kill %1
    perf report -g none --pretty raw | grep security_secid_to_secctx

Before the caching introduced by this patch, computing the context
string (security_secid_to_secctx() function) takes up ~65% of
systemd-journald's CPU time (assuming a context with 1024 categories
set and Fedora x86_64 release kernel configs). After this patch
(assuming near-perfect cache hit ratio) this overhead is reduced to just
~2%.

This patch addresses the issue by caching a certain number (compile-time
configurable) of recently used context strings to speed up repeated
translations of the same context, while using only a small amount of
memory.

The cache is integrated into the existing sidtab table by adding a field
to each entry, which when not NULL contains an RCU-protected pointer to
a cache entry containing the cached string. The cache entries are kept
in a linked list sorted according to how recently they were used. On a
cache miss when the cache is full, the least recently used entry is
removed to make space for the new entry.

The patch migrates security_sid_to_context_core() to use the cache (also
a few other functions where it was possible without too much fuss, but
these mostly use the translation for logging in case of error, which is
rare).

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
Cc: Michal Sekletar <msekleta@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
[PM: lots of merge fixups due to collisions with other sidtab patches]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-09 16:14:51 -05:00
Jeff Vander Stoep 66f8e2f03c selinux: sidtab reverse lookup hash table
This replaces the reverse table lookup and reverse cache with a
hashtable which improves cache-miss reverse-lookup times from
O(n) to O(1)* and maintains the same performance as a reverse
cache hit.

This reduces the time needed to add a new sidtab entry from ~500us
to 5us on a Pixel 3 when there are ~10,000 sidtab entries.

The implementation uses the kernel's generic hashtable API,
It uses the context's string represtation as the hash source,
and the kernels generic string hashing algorithm full_name_hash()
to reduce the string to a 32 bit value.

This change also maintains the improvement introduced in
commit ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve
performance") which removed the need to keep the current sidtab
locked during policy reload. It does however introduce periodic
locking of the target sidtab while converting the hashtable. Sidtab
entries are never modified or removed, so the context struct stored
in the sid_to_context tree can also be used for the context_to_sid
hashtable to reduce memory usage.

This bug was reported by:
- On the selinux bug tracker.
  BUG: kernel softlockup due to too many SIDs/contexts #37
  https://github.com/SELinuxProject/selinux-kernel/issues/37
- Jovana Knezevic on Android's bugtracker.
  Bug: 140252993
  "During multi-user performance testing, we create and remove users
  many times. selinux_android_restorecon_pkgdir goes from 1ms to over
  20ms after about 200 user creations and removals. Accumulated over
  ~280 packages, that adds a significant time to user creation,
  making perf benchmarks unreliable."

* Hashtable lookup is only O(1) when n < the number of buckets.

Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Reported-by: Jovana Knezevic <jovanak@google.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: subj tweak, removed changelog from patch description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-12-09 16:14:51 -05:00
Linus Torvalds ceb3074745 y2038: syscall implementation cleanups
This is a series of cleanups for the y2038 work, mostly intended
 for namespace cleaning: the kernel defines the traditional
 time_t, timeval and timespec types that often lead to y2038-unsafe
 code. Even though the unsafe usage is mostly gone from the kernel,
 having the types and associated functions around means that we
 can still grow new users, and that we may be missing conversions
 to safe types that actually matter.
 
 There are still a number of driver specific patches needed to
 get the last users of these types removed, those have been
 submitted to the respective maintainers.
 
 Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJd3D+wAAoJEJpsee/mABjZfdcQAJvl6e+4ddKoDMIVJqVCE25N
 meFRgA7S8jy6BefEVeUgI8TxK+amGO36szMBUEnZxSSxq9u+gd13m5bEK6Xq/ov7
 4KTAiA3Irm/W5FBTktu1zc5ROIra1Xj7jLdubf8wEC3viSXIXB3+68Y28iBN7D2O
 k9kSpwINC5lWeC8guZy2I+2yc4ywUEXao9nVh8C/J+FQtU02TcdLtZop9OhpAa8u
 U19VVH3WHkQI7ZfLvBTUiYK6tlYTiYCnpr8l6sm850CnVv1fzBW+DzmVhPJ6FdFd
 4m5staC0sQ6gVqtjVMBOtT5CdzREse6hpwbKo2GRWFroO5W9tljMOJJXHvv/f6kz
 DxrpUmj37JuRbqAbr8KDmQqPo6M2CRkxFxjol1yh5ER63u1xMwLm/PQITZIMDvPO
 jrFc2C2SdM2E9bKP/RMCVoKSoRwxCJ5IwJ2AF237rrU0sx/zB2xsrOGssx5CWEgc
 3bbk6tDQujJJubnCfgRy1tTxpLZOHEEKw8YhFLLbR2LCtA9pA/0rfLLad16cjA5e
 5jIHxfsFc23zgpzrJeB7kAF/9xgu1tlA5BotOs3VBE89LtWOA9nK5dbPXng6qlUe
 er3xLCfS38ovhUw6DusQpaYLuaYuLM7DKO4iav9kuTMcY9GkbPk7vDD3KPGh2goy
 hY5cSM8+kT1q/THLnUBH
 =Bdbv
 -----END PGP SIGNATURE-----

Merge tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground

Pull y2038 cleanups from Arnd Bergmann:
 "y2038 syscall implementation cleanups

  This is a series of cleanups for the y2038 work, mostly intended for
  namespace cleaning: the kernel defines the traditional time_t, timeval
  and timespec types that often lead to y2038-unsafe code. Even though
  the unsafe usage is mostly gone from the kernel, having the types and
  associated functions around means that we can still grow new users,
  and that we may be missing conversions to safe types that actually
  matter.

  There are still a number of driver specific patches needed to get the
  last users of these types removed, those have been submitted to the
  respective maintainers"

Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/

* tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (26 commits)
  y2038: alarm: fix half-second cut-off
  y2038: ipc: fix x32 ABI breakage
  y2038: fix typo in powerpc vdso "LOPART"
  y2038: allow disabling time32 system calls
  y2038: itimer: change implementation to timespec64
  y2038: move itimer reset into itimer.c
  y2038: use compat_{get,set}_itimer on alpha
  y2038: itimer: compat handling to itimer.c
  y2038: time: avoid timespec usage in settimeofday()
  y2038: timerfd: Use timespec64 internally
  y2038: elfcore: Use __kernel_old_timeval for process times
  y2038: make ns_to_compat_timeval use __kernel_old_timeval
  y2038: socket: use __kernel_old_timespec instead of timespec
  y2038: socket: remove timespec reference in timestamping
  y2038: syscalls: change remaining timeval to __kernel_old_timeval
  y2038: rusage: use __kernel_old_timeval
  y2038: uapi: change __kernel_time_t to __kernel_old_time_t
  y2038: stat: avoid 'time_t' in 'struct stat'
  y2038: ipc: remove __kernel_time_t reference from headers
  y2038: vdso: powerpc: avoid timespec references
  ...
2019-12-01 14:00:59 -08:00
Linus Torvalds ba75082efc selinux/stable-5.5 PR 20191126
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl3dbJIUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNEyA/+MYEvQz0BhGBdpHgoCLnUGbEafsxH
 2pWnHSupsCi1/9gal4ztVqW6GMQAQe3C4PhEYAnJ0/sIvix5uBMDFehGEwxzj8aQ
 fOAHYROOsQVQ4D2sfxBfliY0gouTTxFpCK95hvgXR1Y6WP6RBSowS5fmsvfqvz7i
 qDDZvCLOg5ULPHvpmJR3wRjs8MIbZTOiKg1K0VjJgJGzdvEoFfm2faw6WiHkPwm5
 GFiW5PABv1c1Jbe4NL1r85r5iPPZgt+4BygS2QWkx7IVoAvPGhtlkB4Eh0pSESbg
 n141DwJ79uwI2UDUi0Zfcr1Y9OtH1YveUtxZUNi/NYSdxc9R4SjVZJC0TdjrwjWy
 K2jKV6gTn6NBLPuk+HAlj29ETF07G/BJDNkaIGqJcJ97t3kjuODb+QA+eFyM/1t7
 V1oGMtKm6qGhb6y2bMMzmgftFXB0qrppdklkj+Gs8bNEUTfxPRcJS7tk6CZgOTI0
 D5Ikfeu0muflIqug7w7wK8I6s63ZMNoDxZ5Nyy3idm5ZjxcQXp1Ys5x6Axir4lwe
 2tvXwgvNV+/1ZEXNBu8Yhh5EDnZ9Wp1ENd/MT8VlZ77KfAPNnhq48uBKjpQD3KY5
 wPavTRHTu+mnElqkO2cnjDYVhMYrhtbTCeQBY40wLyQDzbqzCOfLav9jisJnblzU
 R8iH6YlokDKPTeQ=
 =x1yB
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "Only three SELinux patches for v5.5:

   - Remove the size limit on SELinux policies, the limitation was a
     lingering vestige and no longer necessary.

   - Allow file labeling before the policy is loaded. This should ease
     some of the burden when the policy is initially loaded (no need to
     relabel files), but it should also help enable some new system
     concepts which dynamically create the root filesystem in the
     initrd.

   - Add support for the "greatest lower bound" policy construct which
     is defined as the intersection of the MLS range of two SELinux
     labels"

* tag 'selinux-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: default_range glblub implementation
  selinux: allow labeling before policy is loaded
  selinux: remove load size limit
2019-11-30 16:55:37 -08:00
Linus Torvalds 8c39f71ee2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "This is mostly to fix the iwlwifi regression:

  1) Flush GRO state properly in iwlwifi driver, from Alexander Lobakin.

  2) Validate TIPC link name with properly length macro, from John
     Rutherford.

  3) Fix completion init and device query timeouts in ibmvnic, from
     Thomas Falcon.

  4) Fix SKB size calculation for netlink messages in psample, from
     Nikolay Aleksandrov.

  5) Similar kind of fix for OVS flow dumps, from Paolo Abeni.

  6) Handle queue allocation failure unwind properly in gve driver, we
     could try to release pages we didn't allocate. From Jeroen de
     Borst.

  7) Serialize TX queue SKB list accesses properly in mscc ocelot
     driver. From Yangbo Lu"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
  net: usb: aqc111: Use the correct style for SPDX License Identifier
  net: phy: Use the correct style for SPDX License Identifier
  net: wireless: intel: iwlwifi: fix GRO_NORMAL packet stalling
  net: mscc: ocelot: use skb queue instead of skbs list
  net: mscc: ocelot: avoid incorrect consuming in skbs list
  gve: Fix the queue page list allocated pages count
  net: inet_is_local_reserved_port() port arg should be unsigned short
  openvswitch: fix flow command message size
  net: phy: dp83869: Fix return paths to return proper values
  net: psample: fix skb_over_panic
  net: usbnet: Fix -Wcast-function-type
  net: hso: Fix -Wcast-function-type
  net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port)
  ibmvnic: Serialize device queries
  ibmvnic: Bound waits for device queries
  ibmvnic: Terminate waiting device threads after loss of service
  ibmvnic: Fix completion structure initialization
  net-sctp: replace some sock_net(sk) with just 'net'
  net: Fix a documentation bug wrt. ip_unprivileged_port_start
  tipc: fix link name length check
2019-11-27 17:17:40 -08:00
Linus Torvalds 3f59dbcace Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "The main kernel side changes in this cycle were:

   - Various Intel-PT updates and optimizations (Alexander Shishkin)

   - Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu)

   - Add support for LSM and SELinux checks to control access to the
     perf syscall (Joel Fernandes)

   - Misc other changes, optimizations, fixes and cleanups - see the
     shortlog for details.

  There were numerous tooling changes as well - 254 non-merge commits.
  Here are the main changes - too many to list in detail:

   - Enhancements to core tooling infrastructure, perf.data, libperf,
     libtraceevent, event parsing, vendor events, Intel PT, callchains,
     BPF support and instruction decoding.

   - There were updates to the following tools:

        perf annotate
        perf diff
        perf inject
        perf kvm
        perf list
        perf maps
        perf parse
        perf probe
        perf record
        perf report
        perf script
        perf stat
        perf test
        perf trace

   - And a lot of other changes: please see the shortlog and Git log for
     more details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits)
  perf parse: Fix potential memory leak when handling tracepoint errors
  perf probe: Fix spelling mistake "addrees" -> "address"
  libtraceevent: Fix memory leakage in copy_filter_type
  libtraceevent: Fix header installation
  perf intel-bts: Does not support AUX area sampling
  perf intel-pt: Add support for decoding AUX area samples
  perf intel-pt: Add support for recording AUX area samples
  perf pmu: When using default config, record which bits of config were changed by the user
  perf auxtrace: Add support for queuing AUX area samples
  perf session: Add facility to peek at all events
  perf auxtrace: Add support for dumping AUX area samples
  perf inject: Cut AUX area samples
  perf record: Add aux-sample-size config term
  perf record: Add support for AUX area sampling
  perf auxtrace: Add support for AUX area sample recording
  perf auxtrace: Move perf_evsel__find_pmu()
  perf record: Add a function to test for kernel support for AUX area sampling
  perf tools: Add kernel AUX area sampling definitions
  perf/core: Make the mlock accounting simple again
  perf report: Jump to symbol source view from total cycles view
  ...
2019-11-26 15:04:47 -08:00
Maciej Żenczykowski 82f31ebf61 net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port)
Note that the sysctl write accessor functions guarantee that:
  net->ipv4.sysctl_ip_prot_sock <= net->ipv4.ip_local_ports.range[0]
invariant is maintained, and as such the max() in selinux hooks is actually spurious.

ie. even though
  if (snum < max(inet_prot_sock(sock_net(sk)), low) || snum > high) {
per logic is the same as
  if ((snum < inet_prot_sock(sock_net(sk)) && snum < low) || snum > high) {
it is actually functionally equivalent to:
  if (snum < low || snum > high) {
which is equivalent to:
  if (snum < inet_prot_sock(sock_net(sk)) || snum < low || snum > high) {
even though the first clause is spurious.

But we want to hold on to it in case we ever want to change what what
inet_port_requires_bind_service() means (for example by changing
it from a, by default, [0..1024) range to some sort of set).

Test: builds, git 'grep inet_prot_sock' finds no other references
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-26 13:20:46 -08:00
Arnd Bergmann ddbc7d0657 y2038: move itimer reset into itimer.c
Preparing for a change to the itimer internals, stop using the
do_setitimer() symbol and instead use a new higher-level interface.

The do_getitimer()/do_setitimer functions can now be made static,
allowing the compiler to potentially produce better object code.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-11-15 14:38:30 +01:00
David S. Miller 2f184393e0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Several cases of overlapping changes which were for the most
part trivially resolvable.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-20 10:43:00 -07:00
Joel Fernandes (Google) da97e18458 perf_event: Add support for LSM and SELinux checks
In current mainline, the degree of access to perf_event_open(2) system
call depends on the perf_event_paranoid sysctl.  This has a number of
limitations:

1. The sysctl is only a single value. Many types of accesses are controlled
   based on the single value thus making the control very limited and
   coarse grained.
2. The sysctl is global, so if the sysctl is changed, then that means
   all processes get access to perf_event_open(2) opening the door to
   security issues.

This patch adds LSM and SELinux access checking which will be used in
Android to access perf_event_open(2) for the purposes of attaching BPF
programs to tracepoints, perf profiling and other operations from
userspace. These operations are intended for production systems.

5 new LSM hooks are added:
1. perf_event_open: This controls access during the perf_event_open(2)
   syscall itself. The hook is called from all the places that the
   perf_event_paranoid sysctl is checked to keep it consistent with the
   systctl. The hook gets passed a 'type' argument which controls CPU,
   kernel and tracepoint accesses (in this context, CPU, kernel and
   tracepoint have the same semantics as the perf_event_paranoid sysctl).
   Additionally, I added an 'open' type which is similar to
   perf_event_paranoid sysctl == 3 patch carried in Android and several other
   distros but was rejected in mainline [1] in 2016.

2. perf_event_alloc: This allocates a new security object for the event
   which stores the current SID within the event. It will be useful when
   the perf event's FD is passed through IPC to another process which may
   try to read the FD. Appropriate security checks will limit access.

3. perf_event_free: Called when the event is closed.

4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.

5. perf_event_write: Called from the ioctl(2) syscalls for the event.

[1] https://lwn.net/Articles/696240/

Since Peter had suggest LSM hooks in 2016 [1], I am adding his
Suggested-by tag below.

To use this patch, we set the perf_event_paranoid sysctl to -1 and then
apply selinux checking as appropriate (default deny everything, and then
add policy rules to give access to domains that need it). In the future
we can remove the perf_event_paranoid sysctl altogether.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: rostedt@goodmis.org
Cc: Yonghong Song <yhs@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: jeffv@google.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: primiano@google.com
Cc: Song Liu <songliubraving@fb.com>
Cc: rsavitski@google.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Matthew Garrett <matthewgarrett@google.com>
Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org
2019-10-17 21:31:55 +02:00
Linus Torvalds 2ef459167a selinux/stable-5.4 PR 20191007
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl2bu6kUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMsxhAAtoljww3Xur0JpD7y+g2yzKGZqn9F
 ovqH103NOdpXY3vRN5TL0ZfKEWZz/a2Rjyjz/9+Ix5kKFQuaguk9TVenp4LuAWjy
 yyo8aSArqwJEpPbrgQDRkjvq08zCcsHSQHwyR44L5MEB8w03Hr+GKFbroR7DkB8R
 qthF5nRoarblEpdc88s3WbPN/Yz32zRwl3EppSRriIBSBUNr6OP5yO6YDvBdwJso
 CvmQybMK/iGiZrDzm5jAXzUyI79MHkrrB55roNXIdam9Rnyb9Wqjt9SQgzDLTvO1
 Z7c4pXqDn1iMSECAqR7EeKLmsEvnp8omDMqbZOsGiWwka93nuNM4NRhswMF6X3pf
 EbmBAuj0CokWlRoJAxyxrw/Tn+KXWjyOpOMoNQR7dyyewenzPTWw4zLhiSsl4Epo
 e1+3PDkJeZhlrtqMcQhep/OgfnPp/8FlgZXNkq1wsMK6SawIiwvxH3mpELE4I8Zk
 3yzYZvnxIDNLcx6TmDgDcJyp+P/iuFGK707G6ogCoCK9VqyTs+nwdZn3s2o1KRDW
 00LdiuXiqOyfdDthfY/q5suKJoWExh+K1dhQ7Llx169yx3uOjlnzTaSTt8dcvhkh
 Y+Nf5pEk0MVgnldaIRy/Zzr4y81Q7QW6ZwD62NHCIhcSevYczFOP7K6V/mYFmDT1
 xlCDPXeHyuR5DrM=
 =btWt
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20191007' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinuxfix from Paul Moore:
 "One patch to ensure we don't copy bad memory up into userspace"

* tag 'selinux-pr-20191007' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix context string corruption in convert_context()
2019-10-08 10:51:37 -07:00
Joshua Brindle 42345b68c2 selinux: default_range glblub implementation
A policy developer can now specify glblub as a default_range default and
the computed transition will be the intersection of the mls range of
the two contexts.

The glb (greatest lower bound) lub (lowest upper bound) of a range is calculated
as the greater of the low sensitivities and the lower of the high sensitivities
and the and of each category bitmap.

This can be used by MLS solution developers to compute a context that satisfies,
for example, the range of a network interface and the range of a user logging in.

Some examples are:

User Permitted Range | Network Device Label | Computed Label
---------------------|----------------------|----------------
s0-s1:c0.c12         | s0                   | s0
s0-s1:c0.c12         | s0-s1:c0.c1023       | s0-s1:c0.c12
s0-s4:c0.c512        | s1-s1:c0.c1023       | s1-s1:c0.c512
s0-s15:c0,c2         | s4-s6:c0.c128        | s4-s6:c0,c2
s0-s4                | s2-s6                | s2-s4
s0-s4                | s5-s8                | INVALID
s5-s8                | s0-s4                | INVALID

Signed-off-by: Joshua Brindle <joshua.brindle@crunchydata.com>
[PM: subject lines and checkpatch.pl fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-10-07 19:01:35 -04:00
Ondrej Mosnacek 2a5243937c selinux: fix context string corruption in convert_context()
string_to_context_struct() may garble the context string, so we need to
copy back the contents again from the old context struct to avoid
storing the corrupted context.

Since string_to_context_struct() tokenizes (and therefore truncates) the
context string and we are later potentially copying it with kstrdup(),
this may eventually cause pieces of uninitialized kernel memory to be
disclosed to userspace (when copying to userspace based on the stored
length and not the null character).

How to reproduce on Fedora and similar:
    # dnf install -y memcached
    # systemctl start memcached
    # semodule -d memcached
    # load_policy
    # load_policy
    # systemctl stop memcached
    # ausearch -m AVC
    type=AVC msg=audit(1570090572.648:313): avc:  denied  { signal } for  pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=process permissive=0 trawcon=73797374656D5F75007400000000000070BE6E847296FFFF726F6D000096FFFF76

Cc: stable@vger.kernel.org
Reported-by: Milos Malik <mmalik@redhat.com>
Fixes: ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve performance")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-10-03 14:13:36 -04:00
Jiri Pirko 36fbf1e52b net: rtnetlink: add linkprop commands to add and delete alternative ifnames
Add two commands to add and delete list of link properties. Implement
the first property type along - alternative ifnames.
Each net device can have multiple alternative names.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-01 14:47:19 -07:00
Jonathan Lebon 3e3e24b420 selinux: allow labeling before policy is loaded
Currently, the SELinux LSM prevents one from setting the
`security.selinux` xattr on an inode without a policy first being
loaded. However, this restriction is problematic: it makes it impossible
to have newly created files with the correct label before actually
loading the policy.

This is relevant in distributions like Fedora, where the policy is
loaded by systemd shortly after pivoting out of the initrd. In such
instances, all files created prior to pivoting will be unlabeled. One
then has to relabel them after pivoting, an operation which inherently
races with other processes trying to access those same files.

Going further, there are use cases for creating the entire root
filesystem on first boot from the initrd (e.g. Container Linux supports
this today[1], and we'd like to support it in Fedora CoreOS as well[2]).
One can imagine doing this in two ways: at the block device level (e.g.
laying down a disk image), or at the filesystem level. In the former,
labeling can simply be part of the image. But even in the latter
scenario, one still really wants to be able to set the right labels when
populating the new filesystem.

This patch enables this by changing behaviour in the following two ways:
1. allow `setxattr` if we're not initialized
2. don't try to set the in-core inode SID if we're not initialized;
   instead leave it as `LABEL_INVALID` so that revalidation may be
   attempted at a later time

Note the first hunk of this patch is mostly the same as a previously
discussed one[3], though it was part of a larger series which wasn't
accepted.

[1] https://coreos.com/os/docs/latest/root-filesystem-placement.html
[2] https://github.com/coreos/fedora-coreos-tracker/issues/94
[3] https://www.spinics.net/lists/linux-initramfs/msg04593.html

Co-developed-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Jonathan Lebon <jlebon@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-10-01 09:45:35 -04:00
zhanglin e40642dc01 selinux: remove load size limit
Load size was limited to 64MB, this was legacy limitation due to vmalloc()
which was removed a while ago.

Signed-off-by: zhanglin <zhang.lin16@zte.com.cn>
[PM: removed comments in the description about 'real world use cases']
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-10-01 09:29:04 -04:00
Linus Torvalds 5825a95fe9 selinux/stable-5.4 PR 20190917
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl2BLvcUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXP9pA/+Ls9sRGZoEipycbgRnwkL9/6yFtn4
 UCFGMP0eobrjL82i8uMOa/72Budsp3ZaZRxf36NpbMDPyB9ohp5jf7o1WFTELESv
 EwxVvOMNwrxO2UbzRv3iywnhdPVJ4gHPa4GWfBHu2EEfhz3/Bv0tPIBdeXAbq4aC
 R0p+M9X0FFEp9eP4ftwOvFGpbZ8zKo1kwgdvCnqLhHDkyqtapqO/ByCTe1VATERP
 fyxjYDZNnITmI0plaIxCeeudklOTtVSAL4JPh1rk8rZIkUznZ4EBDHxdKiaz3j9C
 ZtAthiAA9PfAwf4DZSPHnGsfINxeNBKLD65jZn/PUne/gNJEx4DK041X9HXBNwjv
 OoArw58LCzxtTNZ//WB4CovRpeSdKvmKv0oh61k8cdQahLeHhzXE1wLQbnnBJLI3
 CTsumIp4ZPEOX5r4ogdS3UIQpo3KrZump7VO85yUTRni150JpZR3egYpmcJ0So1A
 QTPemBhC2CHJVTpycYZ9fVTlPeC4oNwosPmvpB8XeGu3w5JpuNSId+BDR/ZlQAmq
 xWiIocGL3UMuPuJUrTGChifqBAgzK+gLa7S7RYPEnTCkj6LVQwsuP4gBXf75QTG4
 FPwVcoMSDFxUDF0oFqwz4GfJlCxBSzX+BkWUn6jIiXKXBnQjU+1gu6KTwE25mf/j
 snJznFk25hFYFaM=
 =n4ht
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20190917' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Add LSM hooks, and SELinux access control hooks, for dnotify,
   fanotify, and inotify watches. This has been discussed with both the
   LSM and fs/notify folks and everybody is good with these new hooks.

 - The LSM stacking changes missed a few calls to current_security() in
   the SELinux code; we fix those and remove current_security() for
   good.

 - Improve our network object labeling cache so that we always return
   the object's label, even when under memory pressure. Previously we
   would return an error if we couldn't allocate a new cache entry, now
   we always return the label even if we can't create a new cache entry
   for it.

 - Convert the sidtab atomic_t counter to a normal u32 with
   READ/WRITE_ONCE() and memory barrier protection.

 - A few patches to policydb.c to clean things up (remove forward
   declarations, long lines, bad variable names, etc)

* tag 'selinux-pr-20190917' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  lsm: remove current_security()
  selinux: fix residual uses of current_security() for the SELinux blob
  selinux: avoid atomic_t usage in sidtab
  fanotify, inotify, dnotify, security: add security hook for fs notifications
  selinux: always return a secid from the network caches if we find one
  selinux: policydb - rename type_val_to_struct_array
  selinux: policydb - fix some checkpatch.pl warnings
  selinux: shuffle around policydb.c to get rid of forward declarations
2019-09-23 11:21:04 -07:00
Stephen Smalley 169ce0c081 selinux: fix residual uses of current_security() for the SELinux blob
We need to use selinux_cred() to fetch the SELinux cred blob instead
of directly using current->security or current_security().  There
were a couple of lingering uses of current_security() in the SELinux code
that were apparently missed during the earlier conversions. IIUC, this
would only manifest as a bug if multiple security modules including
SELinux are enabled and SELinux is not first in the lsm order. After
this change, there appear to be no other users of current_security()
in-tree; perhaps we should remove it altogether.

Fixes: bbd3662a83 ("Infrastructure management of the cred security blob")
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-09-04 18:41:12 -04:00
Ondrej Mosnacek 116f21bb96 selinux: avoid atomic_t usage in sidtab
As noted in Documentation/atomic_t.txt, if we don't need the RMW atomic
operations, we should only use READ_ONCE()/WRITE_ONCE() +
smp_rmb()/smp_wmb() where necessary (or the combined variants
smp_load_acquire()/smp_store_release()).

This patch converts the sidtab code to use regular u32 for the counter
and reverse lookup cache and use the appropriate operations instead of
atomic_get()/atomic_set(). Note that when reading/updating the reverse
lookup cache we don't need memory barriers as it doesn't need to be
consistent or accurate. We can now also replace some atomic ops with
regular loads (when under spinlock) and stores (for conversion target
fields that are always accessed under the master table's spinlock).

We can now also bump SIDTAB_MAX to U32_MAX as we can use the full u32
range again.

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-27 13:26:13 -04:00
Aaron Goidel ac5656d8a4 fanotify, inotify, dnotify, security: add security hook for fs notifications
As of now, setting watches on filesystem objects has, at most, applied a
check for read access to the inode, and in the case of fanotify, requires
CAP_SYS_ADMIN. No specific security hook or permission check has been
provided to control the setting of watches. Using any of inotify, dnotify,
or fanotify, it is possible to observe, not only write-like operations, but
even read access to a file. Modeling the watch as being merely a read from
the file is insufficient for the needs of SELinux. This is due to the fact
that read access should not necessarily imply access to information about
when another process reads from a file. Furthermore, fanotify watches grant
more power to an application in the form of permission events. While
notification events are solely, unidirectional (i.e. they only pass
information to the receiving application), permission events are blocking.
Permission events make a request to the receiving application which will
then reply with a decision as to whether or not that action may be
completed. This causes the issue of the watching application having the
ability to exercise control over the triggering process. Without drawing a
distinction within the permission check, the ability to read would imply
the greater ability to control an application. Additionally, mount and
superblock watches apply to all files within the same mount or superblock.
Read access to one file should not necessarily imply the ability to watch
all files accessed within a given mount or superblock.

In order to solve these issues, a new LSM hook is implemented and has been
placed within the system calls for marking filesystem objects with inotify,
fanotify, and dnotify watches. These calls to the hook are placed at the
point at which the target path has been resolved and are provided with the
path struct, the mask of requested notification events, and the type of
object on which the mark is being set (inode, superblock, or mount). The
mask and obj_type have already been translated into common FS_* values
shared by the entirety of the fs notification infrastructure. The path
struct is passed rather than just the inode so that the mount is available,
particularly for mount watches. This also allows for use of the hook by
pathname-based security modules. However, since the hook is intended for
use even by inode based security modules, it is not placed under the
CONFIG_SECURITY_PATH conditional. Otherwise, the inode-based security
modules would need to enable all of the path hooks, even though they do not
use any of them.

This only provides a hook at the point of setting a watch, and presumes
that permission to set a particular watch implies the ability to receive
all notification about that object which match the mask. This is all that
is required for SELinux. If other security modules require additional hooks
or infrastructure to control delivery of notification, these can be added
by them. It does not make sense for us to propose hooks for which we have
no implementation. The understanding that all notifications received by the
requesting application are all strictly of a type for which the application
has been granted permission shows that this implementation is sufficient in
its coverage.

Security modules wishing to provide complete control over fanotify must
also implement a security_file_open hook that validates that the access
requested by the watching application is authorized. Fanotify has the issue
that it returns a file descriptor with the file mode specified during
fanotify_init() to the watching process on event. This is already covered
by the LSM security_file_open hook if the security module implements
checking of the requested file mode there. Otherwise, a watching process
can obtain escalated access to a file for which it has not been authorized.

The selinux_path_notify hook implementation works by adding five new file
permissions: watch, watch_mount, watch_sb, watch_reads, and watch_with_perm
(descriptions about which will follow), and one new filesystem permission:
watch (which is applied to superblock checks). The hook then decides which
subset of these permissions must be held by the requesting application
based on the contents of the provided mask and the obj_type. The
selinux_file_open hook already checks the requested file mode and therefore
ensures that a watching process cannot escalate its access through
fanotify.

The watch, watch_mount, and watch_sb permissions are the baseline
permissions for setting a watch on an object and each are a requirement for
any watch to be set on a file, mount, or superblock respectively. It should
be noted that having either of the other two permissions (watch_reads and
watch_with_perm) does not imply the watch, watch_mount, or watch_sb
permission. Superblock watches further require the filesystem watch
permission to the superblock. As there is no labeled object in view for
mounts, there is no specific check for mount watches beyond watch_mount to
the inode. Such a check could be added in the future, if a suitable labeled
object existed representing the mount.

The watch_reads permission is required to receive notifications from
read-exclusive events on filesystem objects. These events include accessing
a file for the purpose of reading and closing a file which has been opened
read-only. This distinction has been drawn in order to provide a direct
indication in the policy for this otherwise not obvious capability. Read
access to a file should not necessarily imply the ability to observe read
events on a file.

Finally, watch_with_perm only applies to fanotify masks since it is the
only way to set a mask which allows for the blocking, permission event.
This permission is needed for any watch which is of this type. Though
fanotify requires CAP_SYS_ADMIN, this is insufficient as it gives implicit
trust to root, which we do not do, and does not support least privilege.

Signed-off-by: Aaron Goidel <acgoide@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-12 17:45:39 -04:00
Paul Moore 9b80c36353 selinux: always return a secid from the network caches if we find one
Previously if we couldn't find an entry in the cache and we failed to
allocate memory for a new cache entry we would fail the network object
label lookup; this is obviously not ideal.  This patch fixes this so
that we return the object label even if we can't cache the object at
this point in time due to memory pressure.

The GitHub issue tracker is below:
 * https://github.com/SELinuxProject/selinux-kernel/issues/3

Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05 16:49:55 -04:00
Ondrej Mosnacek f07ea1d4ed selinux: policydb - rename type_val_to_struct_array
The name is overly long and inconsistent with the other *_val_to_struct
members. Dropping the "_array" prefix makes the code easier to read and
gets rid of one line over 80 characters warning.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05 16:21:06 -04:00
Ondrej Mosnacek 2492acaf1e selinux: policydb - fix some checkpatch.pl warnings
Fix most of the code style warnings discovered when moving code around.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05 16:17:56 -04:00
Paul Moore 0eb2f29624 selinux: shuffle around policydb.c to get rid of forward declarations
No code changes, but move a lot of the policydb destructors higher up
so we can get rid of a forward declaration.

This patch does expose a few old checkpatch.pl errors, but those will
be dealt with in a separate (set of) patches.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-08-05 15:58:57 -04:00
Linus Torvalds 4f1a6ef1df selinux/stable-5.3 PR 20190801
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl1DbfsUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPB9A/+Kr17ng4Oygg0fIO+dW1KrHu64ZCm
 TkLff1+9uNmWSu1NOsctJDQ5kSbBV7XgCT/wv8dT0TfA55D3CX11LtbhqVsIaASA
 8iSq2FNgt91d8AIlw0X+5tXljswWLHTJw29ROY/SC2Eyhj5G2fT8eOwMtz59AmJv
 rHlFt9VfAw7Faf4/egccmxS6fqE7p6gt4Prf77ZYSB8r9dlLDKqW8HT59UyE58MU
 09mK1hqE40U6+wZVuU95ATqtQRMrn4pRgTOEgO9j7xUeLKC6z9cbVRAWtzAcWMRr
 /bHuRm30ij83kHI18gYvXjMBr9Jierg+brW1s/sTV7KSXAyTYYXzUnQYgTHqbhJq
 Do+dggZwCbze19IGfPafI8fjUoGU1tBuPkcy3+Ag8r4+2yB+z+fuN1PxP+AqWZZC
 X1lQhtUlNfHNFmB/1XBTVzDaozKmKp56DiDjCmPvgcH5kWtc35ZTUuXk1YmYtB+a
 O76haRE5386K0SzEAJ4SaPpHPyWzg1Qgi7EQlJy2x8uGc2R4QkXZrj/uGyOL90QJ
 zjPNUPtqSAoLVzemA+PG7BZ/gcGVXuwrwHIPHprg/l/VVNl+4azW5b595pyHh5xL
 0d8A0j/zz1E+A8vzqK9/G0nlLgYw6+yIuI42aT3qBhbxDJDRzvZH8w07W93F4+df
 9+y0Fx+2HSsvbVA=
 =pIeX
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20190801' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "One more small fix for a potential memory leak in an error path"

* tag 'selinux-pr-20190801' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix memory leak in policydb_init()
2019-08-02 18:40:49 -07:00
Ondrej Mosnacek 45385237f6 selinux: fix memory leak in policydb_init()
Since roles_init() adds some entries to the role hash table, we need to
destroy also its keys/values on error, otherwise we get a memory leak in
the error path.

Cc: <stable@vger.kernel.org>
Reported-by: syzbot+fee3a14d4cdf92646287@syzkaller.appspotmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-07-31 16:51:23 -04:00
Linus Torvalds 40233e7c44 selinux/stable-5.3 PR 20190726
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl07eQIUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXORlg/9GoL9NMb5A6SMkkd+FrMM3Gn4A8iC
 15jn2AXBL8WMY64EB8DofPPhBrdss5EFVLNSeZTvpOVko3aITemsFdrytKNHqY9v
 FbtKAfNOuJI7DWak1yMeuKrgSurxd/ZFfze3qUxlwDzO6recf9RbNQkZ60n/LIr2
 vSnJDWqlDOQiUN5+qNzTL6ztpXAhmoT2D0Nx6GjZd/XBuvcY5Xf4gX+/UhlGpL3O
 e8bJO3b8kQbyBb3aaak/YYsesfzsPxzy5eGZdKFWmNnbsRL6L6Y4vDHP3xxNPsSd
 s0rhibAYNYzeM0MJNj5TD0KDl/vxildaLKPtmRo+vvLGtZeyKxPgyrmnA7AlBa7K
 6yQ9X4nM5VS/Gs68gzLzpz9IzViJBuX18+oMbCdUDM5Xfu+9/zpKBFW06OMEdxcr
 MIbXpCD02Zq6KrduAWP4WSdni2oTTXzOjY9YbyDjhKvo/xF9vloY8XJ9JfyQXqJi
 6uNG1rGhPgF9cQKHX6M84lp0PXdwLB1sUo0BqJvU39+tOqmBxfvOJmghDSuqbJPa
 BKuWNnsPhiRqRN6LIw/yCTxxlF2+cg0fywFl1981PIxUDnfTTYuNL+Rb+cyzo3L2
 QLABdl2sLTfl7GOXOKcBQEE6yHs11m6eYLOKhvdDNhFVy5EmOFF4IUFO69I6YNok
 R3IUowNF8JLYByE=
 =Bojy
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20190726' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "One small SELinux patch to add some proper bounds/overflow checking
  when adding a new sid/secid"

* tag 'selinux-pr-20190726' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: check sidtab limit before adding a new entry
2019-07-26 19:13:38 -07:00
Ondrej Mosnacek acbc372e61 selinux: check sidtab limit before adding a new entry
We need to error out when trying to add an entry above SIDTAB_MAX in
sidtab_reverse_lookup() to avoid overflow on the odd chance that this
happens.

Cc: stable@vger.kernel.org
Fixes: ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve performance")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-07-24 11:13:34 -04:00
Linus Torvalds 933a90bf4f Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs mount updates from Al Viro:
 "The first part of mount updates.

  Convert filesystems to use the new mount API"

* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  mnt_init(): call shmem_init() unconditionally
  constify ksys_mount() string arguments
  don't bother with registering rootfs
  init_rootfs(): don't bother with init_ramfs_fs()
  vfs: Convert smackfs to use the new mount API
  vfs: Convert selinuxfs to use the new mount API
  vfs: Convert securityfs to use the new mount API
  vfs: Convert apparmorfs to use the new mount API
  vfs: Convert openpromfs to use the new mount API
  vfs: Convert xenfs to use the new mount API
  vfs: Convert gadgetfs to use the new mount API
  vfs: Convert oprofilefs to use the new mount API
  vfs: Convert ibmasmfs to use the new mount API
  vfs: Convert qib_fs/ipathfs to use the new mount API
  vfs: Convert efivarfs to use the new mount API
  vfs: Convert configfs to use the new mount API
  vfs: Convert binfmt_misc to use the new mount API
  convenience helper: get_tree_single()
  convenience helper get_tree_nodev()
  vfs: Kill sget_userns()
  ...
2019-07-19 10:42:02 -07:00
Linus Torvalds 237f83dfbe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Some highlights from this development cycle:

   1) Big refactoring of ipv6 route and neigh handling to support
      nexthop objects configurable as units from userspace. From David
      Ahern.

   2) Convert explored_states in BPF verifier into a hash table,
      significantly decreased state held for programs with bpf2bpf
      calls, from Alexei Starovoitov.

   3) Implement bpf_send_signal() helper, from Yonghong Song.

   4) Various classifier enhancements to mvpp2 driver, from Maxime
      Chevallier.

   5) Add aRFS support to hns3 driver, from Jian Shen.

   6) Fix use after free in inet frags by allocating fqdirs dynamically
      and reworking how rhashtable dismantle occurs, from Eric Dumazet.

   7) Add act_ctinfo packet classifier action, from Kevin
      Darbyshire-Bryant.

   8) Add TFO key backup infrastructure, from Jason Baron.

   9) Remove several old and unused ISDN drivers, from Arnd Bergmann.

  10) Add devlink notifications for flash update status to mlxsw driver,
      from Jiri Pirko.

  11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski.

  12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes.

  13) Various enhancements to ipv6 flow label handling, from Eric
      Dumazet and Willem de Bruijn.

  14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van
      der Merwe, and others.

  15) Various improvements to axienet driver including converting it to
      phylink, from Robert Hancock.

  16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean.

  17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana
      Radulescu.

  18) Add devlink health reporting to mlx5, from Moshe Shemesh.

  19) Convert stmmac over to phylink, from Jose Abreu.

  20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from
      Shalom Toledo.

  21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera.

  22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel.

  23) Track spill/fill of constants in BPF verifier, from Alexei
      Starovoitov.

  24) Support bounded loops in BPF, from Alexei Starovoitov.

  25) Various page_pool API fixes and improvements, from Jesper Dangaard
      Brouer.

  26) Just like ipv4, support ref-countless ipv6 route handling. From
      Wei Wang.

  27) Support VLAN offloading in aquantia driver, from Igor Russkikh.

  28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy.

  29) Add flower GRE encap/decap support to nfp driver, from Pieter
      Jansen van Vuuren.

  30) Protect against stack overflow when using act_mirred, from John
      Hurley.

  31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen.

  32) Use page_pool API in netsec driver, Ilias Apalodimas.

  33) Add Google gve network driver, from Catherine Sullivan.

  34) More indirect call avoidance, from Paolo Abeni.

  35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan.

  36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek.

  37) Add MPLS manipulation actions to TC, from John Hurley.

  38) Add sending a packet to connection tracking from TC actions, and
      then allow flower classifier matching on conntrack state. From
      Paul Blakey.

  39) Netfilter hw offload support, from Pablo Neira Ayuso"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits)
  net/mlx5e: Return in default case statement in tx_post_resync_params
  mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync().
  net: dsa: add support for BRIDGE_MROUTER attribute
  pkt_sched: Include const.h
  net: netsec: remove static declaration for netsec_set_tx_de()
  net: netsec: remove superfluous if statement
  netfilter: nf_tables: add hardware offload support
  net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload
  net: flow_offload: add flow_block_cb_is_busy() and use it
  net: sched: remove tcf block API
  drivers: net: use flow block API
  net: sched: use flow block API
  net: flow_offload: add flow_block_cb_{priv, incref, decref}()
  net: flow_offload: add list handling functions
  net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()
  net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_*
  net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND
  net: flow_offload: add flow_block_cb_setup_simple()
  net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC
  net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC
  ...
2019-07-11 10:55:49 -07:00
Linus Torvalds 028db3e290 Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs"
This reverts merge 0f75ef6a9c (and thus
effectively commits

   7a1ade8475 ("keys: Provide KEYCTL_GRANT_PERMISSION")
   2e12256b9a ("keys: Replace uid/gid/perm permissions checking with an ACL")

that the merge brought in).

It turns out that it breaks booting with an encrypted volume, and Eric
biggers reports that it also breaks the fscrypt tests [1] and loading of
in-kernel X.509 certificates [2].

The root cause of all the breakage is likely the same, but David Howells
is off email so rather than try to work it out it's getting reverted in
order to not impact the rest of the merge window.

 [1] https://lore.kernel.org/lkml/20190710011559.GA7973@sol.localdomain/
 [2] https://lore.kernel.org/lkml/20190710013225.GB7973@sol.localdomain/

Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/
Reported-by: Eric Biggers <ebiggers@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-10 18:43:43 -07:00
Linus Torvalds 8b68150883 Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
 "Bug fixes, code clean up, and new features:

   - IMA policy rules can be defined in terms of LSM labels, making the
     IMA policy dependent on LSM policy label changes, in particular LSM
     label deletions. The new environment, in which IMA-appraisal is
     being used, frequently updates the LSM policy and permits LSM label
     deletions.

   - Prevent an mmap'ed shared file opened for write from also being
     mmap'ed execute. In the long term, making this and other similar
     changes at the VFS layer would be preferable.

   - The IMA per policy rule template format support is needed for a
     couple of new/proposed features (eg. kexec boot command line
     measurement, appended signatures, and VFS provided file hashes).

   - Other than the "boot-aggregate" record in the IMA measuremeent
     list, all other measurements are of file data. Measuring and
     storing the kexec boot command line in the IMA measurement list is
     the first buffer based measurement included in the measurement
     list"

* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  integrity: Introduce struct evm_xattr
  ima: Update MAX_TEMPLATE_NAME_LEN to fit largest reasonable definition
  KEXEC: Call ima_kexec_cmdline to measure the boot command line args
  IMA: Define a new template field buf
  IMA: Define a new hook to measure the kexec boot command line arguments
  IMA: support for per policy rule template formats
  integrity: Fix __integrity_init_keyring() section mismatch
  ima: Use designated initializers for struct ima_event_data
  ima: use the lsm policy update notifier
  LSM: switch to blocking policy update notifiers
  x86/ima: fix the Kconfig dependency for IMA_ARCH_POLICY
  ima: Make arch_policy_entry static
  ima: prevent a file already mmap'ed write to be mmap'ed execute
  x86/ima: check EFI SetupMode too
2019-07-08 20:28:59 -07:00