1
0
Fork 0
Commit Graph

574187 Commits (020bf042e5b397479c1174081b935d0ff15d1a64)

Author SHA1 Message Date
Stefan Haberland 020bf042e5 s390/dasd: prevent incorrect length error under z/VM after PAV changes
The channel checks the specified length and the provided amount of
data for CCWs and provides an incorrect length error if the size does
not match. Under z/VM with simulation activated the length may get
changed. Having the suppress length indication bit set is stated as
good CCW coding practice and avoids errors under z/VM.

Cc: stable@vger.kernel.org
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-11 13:05:53 +01:00
Christian Borntraeger 0986d97741 s390: fix DAT off memory access, e.g. on kdump
commit 204ee2c564 ("s390/irqflags: optimize irq restore") optimized
irqrestore to really only care about interrupts and adapted the
remaining low level users. One spot (memcpy_real) was not touched,
though - fix it. Otherwise a kdump kernel will fail while reading
the old kernel. As we re-enable irqs with a non-standard function
we have to tell lockdep about that.

Fixes: 204ee2c564 ("s390/irqflags: optimize irq restore")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-11 13:05:51 +01:00
Heiko Carstens 232f5dd785 s390/oprofile: fix address range for asynchronous stack
git commit dc7ee00d47 ("s390: lowcore stack pointer offsets")
introduced a regression in regard to s390_backtrace(). The stack
pointer for the asynchronous stack in the lowcore now has an
additional offset applied. This offset needs to be taken into account
in the calculation for the low and high address for the stack.

This bug was already partially fixed with commit 9cc5c206d9
("s390/dumpstack: fix address ranges for asynchronous and panic
stack"). This patch fixes it also for the oprofile code.

Fixes: dc7ee00d47 ("s390: lowcore stack pointer offsets")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-10 09:25:22 +01:00
Heiko Carstens 1f8cbb9c83 s390/perf_event: fix address range for asynchronous stack
git commit dc7ee00d47 ("s390: lowcore stack pointer offsets")
introduced a regression in regard to perf_callchain_kernel(). The
stack pointer for the asynchronous stack in the lowcore now has an
additional offset applied. This offset needs to be taken into account
in the calculation for the low and high address for the stack.

This bug was already partially fixed with 9cc5c206d9
("s390/dumpstack: fix address ranges for asynchronous and panic
stack"). This patch fixes it also for the perf_event code.

Fixes: dc7ee00d47 ("s390: lowcore stack pointer offsets")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-10 09:25:22 +01:00
Pratyush Anand e0115875c0 s390/stacktrace: add save_stack_trace_regs()
Implement save_stack_trace_regs, so that a stack trace of a kprobe
event can be obtained.

Without this we see following warning:
"save_stack_trace_regs() not implemented yet."
when we execute:
echo stacktrace > /sys/kernel/debug/tracing/trace_options
echo "p kfree" >> /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable

Reported-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Pratyush Anand <panand@redhat.com>
[heiko.carstens@de.ibm.com]: changed patch to use __save_stack_trace()
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-10 09:25:22 +01:00
Heiko Carstens 66adce8f1f s390/stacktrace: save full stack traces
save_stack_trace() only saves the stack trace of the current context
(interrupt or process context). This is different to what other
architectures like x86 do, which save the full stack trace across
different contexts.

Also extract a __save_stack_trace() helper function which will be used
by a follow on patch.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-10 09:25:21 +01:00
Heiko Carstens f6331aaccb s390/stacktrace: add missing end marker
save_stack_trace() did not write the ULONG_MAX end marker if there is
enough space left. So simply follow x86 and arm64.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-10 09:25:21 +01:00
Heiko Carstens 9900c48c46 s390/stacktrace: fix address ranges for asynchronous and panic stack
git commit dc7ee00d47 ("s390: lowcore stack pointer offsets")
introduced a regression in regard to save_stack_trace(). The stack
pointer for the asynchronous and the panic stack in the lowcore now
have an additional offset applied to them. This offset needs to be
taken into account in the calculation for the low and high address for
the stacks.

This bug was already partially fixed with 9cc5c206d9
("s390/dumpstack: fix address ranges for asynchronous and panic
stack"). This patch fixes it also for the stacktrace code.

Fixes: dc7ee00d47 ("s390: lowcore stack pointer offsets")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-10 09:25:20 +01:00
Heiko Carstens 665ca9187c s390/stacktrace: fix save_stack_trace_tsk() for current task
The function save_stack_trace_tsk() did not consider that it can be
used for tsk == current, for which the current stack pointer obviously
cannot be found in the thread structure.

Fix this and get the stack pointer with an inline assembly.

This fixes e.g. the output of "cat /proc/self/stack".

Before:
[<0000000000000000>]           (null)
[<ffffffffffffffff>] 0xffffffffffffffff

After:
[<000000000011b3ee>] save_stack_trace_tsk+0x56/0x98
[<0000000000366cde>] proc_pid_stack+0xae/0x108
[<00000000003636f0>] proc_single_show+0x70/0xc0
[<0000000000311fbc>] seq_read+0xcc/0x448
[<00000000002e7716>] __vfs_read+0x36/0x100
[<00000000002e872e>] vfs_read+0x76/0x130
[<00000000002e975e>] SyS_read+0x66/0xd8
[<000000000089490e>] system_call+0xd6/0x264
[<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-10 09:25:20 +01:00
Linus Torvalds 2178cbc68f Fix for async_probe module param added in 4.3 (clearly not widely used yet),
and a much more interesting kallsyms race which has been around approximately
 forever.  This fix is more invasive, and will require some care in backporting,
 but I hated all the bandaids I could think of, so...
 
 There are some more coming, which are only for breakages introduced this
 cycle (livepatch), but wanted these in now.
 
 Thanks,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWuoNiAAoJENkgDmzRrbjxsoIP/3xdco6C6DnjmrRgriyZABbU
 zQ+imTUwkoCtEHuZj9nqPecsuDw13p4UsOtqQ63J3I60tr1u595sQ3JG7HM9sLZy
 DvZzpoCcakJDR5If4+C1NmNT/GH8/discJ5ncu2LE+dBmtaKsFgTLG+hV4zaY62E
 goVM1X/0NR933VNw7RnZoUeLSeYDqs1EhKjkewp5GLR/XLo3wRduYfqPt2luXrfQ
 +qXU7AwfNni//HsJfkxpW9KwI9VwzWFtB2j71FGQ9fhHAzhEOHWQBOIZuhPNqE3V
 xcHrCdup31q/TSkH7jEv9CUKNdqDiMFto3PWJirOLgABb6TLUA6qw/9ABP3CH+su
 /y9iDXKnhGi1qLN8qBcgyIAf3QsbJIhwYzk0Qi0ovrWoEILLkw4ULGFiWOWV2xTc
 hmLtdPGWpH60A1q1DAMX4HnxTQ7/3TXMCIHRk3PQGvBfq8jXJy3ckmY/lvjh6r5s
 Oe2PIiSAykrFUD1TE5AcjjclrR5KlxPsUuAkw/44VVp4SmojGwOBQqNgmC5+yxJ3
 sh/AF2PnkFxzFNo+PdJvl0tE5oVo0Jao1hy525CiqYThaSWwIowhDmkO7iffM/y7
 kf05o45sHXExO5NAb3zMNqK+5G21kBHIWsBfNxllClDJRhWcV1oKxofqr9iNtOI0
 RlhuwHi5sr5QCR5L1F60
 =Lt9v
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module fixes from Rusty Russell:
 "Fix for async_probe module param added in 4.3 (clearly not widely used
  yet), and a much more interesting kallsyms race which has been around
  approximately forever.  This fix is more invasive, and will require
  some care in backporting, but I hated all the bandaids I could think
  of, so...

  There are some more coming, which are only for breakages introduced
  this cycle (livepatch), but wanted these in now"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  modules: fix longstanding /proc/kallsyms vs module insertion race.
  module: wrapper for symbol name.
  modules: fix modparam async_probe request
2016-02-09 16:40:59 -08:00
Linus Torvalds 7cf91adc0d Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

  API:
   - Fix async algif_skcipher, it was broken by recent fixes.
   - Fix potential race condition in algif_skcipher with ctx.
   - Fix potential memory corruption in algif_skcipher.
   - Add missing lock to crypto_user when doing an alg dump.

  Drivers:
   - marvell/cesa was testing the wrong variable for NULL after
     allocation.
   - Fix potential double-free in atmel-sha.
   - Fix illegal call to sleepin function from atomic context in
     atmel-sha"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: marvell/cesa - fix test in mv_cesa_dev_dma_init()
  crypto: atmel-sha - remove calls of clk_prepare() from atomic contexts
  crypto: atmel-sha - fix atmel_sha_remove()
  crypto: algif_skcipher - Do not set MAY_BACKLOG on the async path
  crypto: algif_skcipher - Do not dereference ctx without socket lock
  crypto: algif_skcipher - Do not assume that req is unchanged
  crypto: user - lock crypto_alg_list on alg dump
2016-02-09 10:15:16 -08:00
J. Bruce Fields b64e86cdf6 scripts: add "prune-kernel" script to clean up old kernel images
Long ago, Dave Jones complained about CONFIG_LOCALVERSION_AUTO:
 "I don't use the auto config, because I end up filling up /boot unless
  I go through and clean them out by hand every time I install a new one
  (which I do probably a dozen or so times a day).  Is there some easy
  way to prune old builds I'm missing?"

To which Bruce replied:
 "I run this by hand every now and then.  I'm probably doing it all wrong"

And if he is running it wrong, then so am I - because I've been using
this script ever since.  It is true that CONFIG_LOCALVERSION_AUTO easily
ends up filling your /boot partition if you don't clean up old versions
regularly, and this script helps make that easier.

Checked with Bruce to see that it's fine to add this to the kernel
scripts.  Maybe people will come up with enhancements, but more
importantly, this way I won't misplace this script whenever I install a
new machine and start doing custom kernels for it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-09 10:09:52 -08:00
Linus Torvalds 765bdb406d KVM-ARM fixes, mostly coming from the PMU work.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWuLKAAAoJEL/70l94x66DfoMH+wUcYQ3we2STZc23yj9LIj3o
 xTLwBLHv3ZIjJhjhyYNkQNey+TXbnzf1oL1xeT5JZTMeVIf9KDP8KW9tuKJ4vDjf
 q02WT/uKkZLUAaOlsQ8k+izfqfnp2Q4wcsrBOepaUqmLzonOcAtSfBQq2s1YCa5f
 wtK1mojgKXgC0Kke5D61gTgSLaNQWghaMm09UB8Wg3QPcwu5VLmJIPhnWwS/QVG/
 tNDIkK4+pyY7vNAIp2t13tUa4/9UsC2U99Pl8iVdzKKefv49t+iBI4FeR9zTlBSq
 2dXoemCGWePf77M6myagczNb9BRFweu8bsVeQuBo2M8UbNUsowkvBYyhdkihJHg=
 =LCsT
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "KVM-ARM fixes, mostly coming from the PMU work"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  arm64: KVM: Fix guest dead loop when register accessor returns false
  arm64: KVM: Fix comments of the CP handler
  arm64: KVM: Fix wrong use of the CPSR MODE mask for 32bit guests
  arm64: KVM: Obey RES0/1 reserved bits when setting CPTR_EL2
  arm64: KVM: Fix AArch64 guest userspace exception injection
2016-02-08 10:32:30 -08:00
Linus Torvalds 92e6edd685 regmap: mmio: Revert to v4.4 endianness handling
Commit 29bb45f25f (regmap-mmio: Use native endianness for read/write)
 attempted to fix some long standing bugs in the MMIO implementation for
 big endian systems caused by duplicate byte swapping in both regmap and
 readl()/writel() which affected MIPS systems as when they are in big
 endian mode they flip the endianness of all registers in the system, not
 just the CPU.  MIPS systems had worked around this by declaring regmap
 using IPs as little endian which is inaccurate, unfortunately the issue
 had not been reported.
 
 Sadly the fix makes things worse rather than better.  By changing the
 behaviour to match the documentation it caused behaviour changes for
 other IPs which broke them and by using the __raw I/O accessors to avoid
 the endianness swapping in readl()/writel() it removed some memory
 ordering guarantees and could potentially generate unvirtualisable
 instructions on some architectures.
 
 Unfortunately sorting out all this mess in any half way sensible fashion
 was far too invasive to go in during an -rc cycle so instead let's go
 back to the old broken behaviour for v4.5, the better fixes are already
 queued for v4.6.  This does mean that we keep the broken MIPS DTs for
 another release but that seems the least bad way of handling the
 situation.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWtIjbAAoJECTWi3JdVIfQs8QH/jNpfio4klDkdlH/KpPZXlrp
 FzASbGePNtLqZXFL5WcG//ni3EYdbaiXZIdLBKDx9K4F2ca9FAF8aAnZAZ5uefGx
 bnloYpV34DqQwS5f9FrrNsm+YVTTuUIt0dx4ZRGCEdMTzW7i3efs/9eVEITUixK6
 U1obTJovAl33bihadsC9hzJVwfOq3H4aFFWc/EFZzbQaU2/so2eiA1dhPr/YErRJ
 dMR8drWxpYXuBsrk5T647R0sUw7pA4Zw+WAF032TPQf/1Fy9Vk1/yXbTyccZzFzo
 bfupRA/HpeLNZ9cN9z9y3Fa0je4UNbBZh0poB5B773af84NnhX7Ytenjo+peVxI=
 =+Q6E
 -----END PGP SIGNATURE-----

Merge tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "A single revert back to v4.4 endianness handling.

  Commit 29bb45f25f ("regmap-mmio: Use native endianness for
  read/write") attempted to fix some long standing bugs in the MMIO
  implementation for big endian systems caused by duplicate byte
  swapping in both regmap and readl()/writel().  Sadly the fix makes
  things worse rather than better, so revert it for now"

* tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: mmio: Revert to v4.4 endianness handling
2016-02-08 10:20:06 -08:00
Masahiro Yamada 4ba6a2b28f scatterlist: fix a typo in comment block of sg_miter_stop()
Fix the doubled "started" and tidy up the following sentences.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-08 10:15:17 -08:00
Paolo Bonzini afc6074381 KVM/ARM fixes for v4.5-rc2
A few random fixes, mostly coming from the PMU work by Shannon:
 
 - fix for injecting faults coming from the guest's userspace
 - cleanup for our CPTR_EL2 accessors (reserved bits)
 - fix for a bug impacting perf (user/kernel discrimination)
 - fix for a 32bit sysreg handling bug
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWqehPAAoJECPQ0LrRPXpDn6gP/2lrJ9lV5I3MxLzUytmRY8EM
 Xl8WnNEQJ0e7oEdb1l6k4DR8D/HefzXpp/YWHY1WdDZSej0b2egro1xsFWdgaOr9
 NVGJnoQBlCFqSIf2szml4ftpHXZZ/kMF/EvhtzEL6cpUdqeA/tkS6HoCMQknhCbx
 3zOYnNKCGQUkFhTKJUSXB6NcZ/950uqkQxAdCPNUTGg1YzkNfbcgTewqKsmb25Cv
 /sOUFmrq2AlnWkdH+QWP0BtNFUX9saOSXvxrABT6nfiXSpUeF6Rprcgi9gdoNhAD
 mfE5IFw0dOEo2XThZTchKu3FBSMAkDadvC9yWFr88dr62E6EKFPzY3vHLCA8QoT/
 zk5beGSjyWGe7FZZJ4CKdO4EWBZr/WSlSVzOfG4ZBVPUoh2AZcUEhzzrzTezzocO
 71/5ZVpQ6O8+Pxwyy85Vd2drf7OZLagGNydNx46RHXrRxl+q0c5vFTVh4Txbd4YU
 XNsd+kA62/OYyPHbtVzTzAPPKG7aM8hLzdy8dkTgvuDzWHmxFWhD/HgiMHfFrQqs
 WCafvBhTc4375dvwYOupxaU2ncHKvt/zQJtBOw6bEwAIUa5c1IkIUr0i8XgRq6lr
 x/YvhFIwiVyXVnrDt3ZSIx79Oajf541uJg7vLFyPBQkcnQlJ6T7oy7qJlqhM0567
 Sr6G0/YXa1ccIfmKyeh4
 =36kx
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM fixes for v4.5-rc2

A few random fixes, mostly coming from the PMU work by Shannon:

- fix for injecting faults coming from the guest's userspace
- cleanup for our CPTR_EL2 accessors (reserved bits)
- fix for a bug impacting perf (user/kernel discrimination)
- fix for a 32bit sysreg handling bug
2016-02-08 16:20:51 +01:00
Linus Torvalds 388f7b1d6e Linux 4.5-rc3 2016-02-07 15:38:30 -08:00
Linus Torvalds c17dfb019d ARM: SoC fixes for v4.5-rc
The first real batch of fixes for this release cycle, so there are a few more
 than usual.
 
 Most of these are fixes and tweaks to board support (DT bugfixes, etc). I've
 also picked up a couple of small cleanups that seemed innocent enough that
 there was little reason to wait (const/__initconst and Kconfig deps).
 
 Quite a bit of the changes on OMAP were due to fixes to no longer write to
 rodata from assembly when ARM_KERNMEM_PERMS was enabled, but there were also
 other fixes.
 
 Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC fixes
 on OMAP5, and Nomadik had changes to MMC parameters in DT.
 
 All in all, mostly the usual mix of various fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWt8K0AAoJEIwa5zzehBx3HxsQAJMqKkTCr/2hzHTw5V8sTgDf
 zrVYEi5WF5IGLR4eON31rF31tbEmQd0bqVlsTLy/yK3hu1gTQwDyqBJqoEQBMQUW
 lBShtVERP3mNUm0yICeupaWIhoRqaymlwFKKfq93f+YTn27pEDQ1ImEHuARlbAKa
 3zCd91ClRRm3WxrBXj9srt/NyMX7BlcHLjcN1BurpVkR0aciW1B692Lb8LotEP4k
 D1CLNZeQEwV+uOHcJsvjEdB/Uh42+dpsxbIAaBW2cFB0iuX3BsnmferoFe0cXmpC
 wO5ffvzr0LCMsrUzUsbvn0RgRtMDi2RxrS1n0cXrAVPP6OEeOaMLwGdPUGvQ2EVI
 cvCHpw3qXRz7CTERpy7bv0YugIY3vZPukJrne2ZEH7cpA/JLsuqlKm/cOmPRB7gJ
 tC2mXlP5jHbbGRiq/Kk3QB7QsKIxHfIalCZMMiRe0ldWSDW6jDpvrv4Nsfzs3etN
 LaB0iIm3f5DqOFjjZi+LVUJUGE3M8/3Fs2f70rCdPKDGq9fTqD3+2mK7l80ZaYXG
 J3wPKM+9WXGISakS/biQzvYA9iDnbDZCTUxBIM6VlvcHmARJEH3TS5ZjR0eaIb7w
 Sqx7e2ufm/2wpGINDoT1qms14cI8ayj7iq+8fDnI3R9XSXxeKk5J5jo9fKnbnDWP
 4A4Ai+NYBv/rDWjkg19s
 =1iBu
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "The first real batch of fixes for this release cycle, so there are a
  few more than usual.

  Most of these are fixes and tweaks to board support (DT bugfixes,
  etc).  I've also picked up a couple of small cleanups that seemed
  innocent enough that there was little reason to wait (const/
  __initconst and Kconfig deps).

  Quite a bit of the changes on OMAP were due to fixes to no longer
  write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but
  there were also other fixes.

  Kirkwood had a bunch of gpio fixes for some boards.  OMAP had RTC
  fixes on OMAP5, and Nomadik had changes to MMC parameters in DT.

  All in all, mostly the usual mix of various fixes"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
  ARM: multi_v7_defconfig: enable DW_WATCHDOG
  ARM: nomadik: fix up SD/MMC DT settings
  ARM64: tegra: Add chosen node for tegra132 norrin
  ARM: realview: use "depends on" instead of "if" after prompt
  ARM: tango: use "depends on" instead of "if" after prompt
  ARM: tango: use const and __initconst for smp_operations
  ARM: realview: use const and __initconst for smp_operations
  bus: uniphier-system-bus: revive tristate prompt
  arm64: dts: Add missing DMA Abort interrupt to Juno
  bus: vexpress-config: Add missing of_node_put
  ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings
  ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux
  ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux
  ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency
  ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2
  ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address
  ARM: dts: LogicPD Torpedo: Revert Duplicative Entries
  ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types
  ARM: dts: am4372: fix irq type for arm twd and global timer
  ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
  ...
2016-02-07 15:23:20 -08:00
Linus Torvalds 63fee123da Merge branch 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox fixes from Jassi Brar:

 - fix getting element from the pcc-channels array by simply indexing
   into it

 - prevent building mailbox-test driver for archs that don't have IOMEM

* 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: Fix dependencies for !HAS_IOMEM archs
  mailbox: pcc: fix channel calculation in get_pcc_channel()
2016-02-07 15:17:47 -08:00
Linus Torvalds 46df55ceea USB fixes for 4.5-rc3
Here are some USB fixes for 4.5-rc3.
 
 The usual, xhci fixes for reported issues, combined with some small
 gadget driver fixes, and a MAINTAINERS file update.  All have been in
 linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAla205wACgkQMUfUDdst+ymVcACgksg9fcSsphvjo4ruxN79QrjW
 GaEAn3VmLc2WyLf13on9cXmBi8KrS+WY
 =sFPR
 -----END PGP SIGNATURE-----

Merge tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some USB fixes for 4.5-rc3.

  The usual, xhci fixes for reported issues, combined with some small
  gadget driver fixes, and a MAINTAINERS file update.  All have been in
  linux-next with no reported issues"

* tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: harden xhci_find_next_ext_cap against device removal
  xhci: Fix list corruption in urb dequeue at host removal
  usb: host: xhci-plat: fix NULL pointer in probe for device tree case
  usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling
  usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT
  usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms
  usb: xhci: set SSIC port unused only if xhci_suspend succeeds
  usb: xhci: add a quirk bit for ssic port unused
  usb: xhci: handle both SSIC ports in PME stuck quirk
  usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver.
  Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"
  MAINTAINERS: fix my email address
  usb: dwc2: Fix probe problem on bcm2835
  Revert "usb: dwc2: Move reset into dwc2_get_hwparams()"
  usb: musb: ux500: Fix NULL pointer dereference at system PM
  usb: phy: mxs: declare variable with initialized value
  usb: phy: msm: fix error handling in probe.
2016-02-06 22:14:46 -08:00
Linus Torvalds dacd53c805 Staging / IIO driver fixes for 4.5-rc3
Here are some IIO and staging driver fixes for 4.5-rc3.  All of them,
 except one, are for IIO drivers, and one is for a speakup driver fix
 caused by some earlier patches, to resolve a reported build failure.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAla21D8ACgkQMUfUDdst+ykHiQCgqYizz86HYi3ZR0LRgGwm3Gws
 4sEAn2x230E41WXDhLTLT3N6fAaetYl9
 =wa/Y
 -----END PGP SIGNATURE-----

Merge tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging and IIO driver fixes from Greg KH:
 "Here are some IIO and staging driver fixes for 4.5-rc3.

  All of them, except one, are for IIO drivers, and one is for a speakup
  driver fix caused by some earlier patches, to resolve a reported build
  failure"

* tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  Staging: speakup: Fix allyesconfig build on mn10300
  iio: dht11: Use boottime
  iio: ade7753: avoid uninitialized data
  iio: pressure: mpl115: fix temperature offset sign
  iio: imu: Fix dependencies for !HAS_IOMEM archs
  staging: iio: Fix dependencies for !HAS_IOMEM archs
  iio: adc: Fix dependencies for !HAS_IOMEM archs
  iio: inkern: fix a NULL dereference on error
  iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer.
  iio: light: acpi-als: Report data as processed
  iio: dac: mcp4725: set iio name property in sysfs
  iio: add HAS_IOMEM dependency to VF610_ADC
  iio: add IIO_TRIGGER dependency to STK8BA50
  iio: proximity: lidar: correct return value
  iio-light: Use a signed return type for ltr501_match_samp_freq()
2016-02-06 22:13:16 -08:00
Boris BREZILLON 8a3978ad55 crypto: marvell/cesa - fix test in mv_cesa_dev_dma_init()
We are checking twice if dma->cache_pool is not NULL but are never testing
dma->padding_pool value.

Cc: stable@vger.kernel.org
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-06 15:23:56 +08:00
Cyrille Pitchen c033042aa8 crypto: atmel-sha - remove calls of clk_prepare() from atomic contexts
clk_prepare()/clk_unprepare() must not be called within atomic context.

This patch calls clk_prepare() once for all from atmel_sha_probe() and
clk_unprepare() from atmel_sha_remove().

Then calls of clk_prepare_enable()/clk_disable_unprepare() were replaced
by calls of clk_enable()/clk_disable().

Cc: stable@vger.kernel.org
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reported-by: Matthias Mayr <matthias.mayr@student.kit.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-06 15:23:56 +08:00
Cyrille Pitchen d961436c11 crypto: atmel-sha - fix atmel_sha_remove()
Since atmel_sha_probe() uses devm_xxx functions to allocate resources,
atmel_sha_remove() should no longer explicitly release them.

Cc: stable@vger.kernel.org
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Fixes: b0e8b3417a ("crypto: atmel - use devm_xxx() managed function")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-06 15:23:56 +08:00
Herbert Xu dad4199706 crypto: algif_skcipher - Do not set MAY_BACKLOG on the async path
The async path cannot use MAY_BACKLOG because it is not meant to
block, which is what MAY_BACKLOG does.  On the other hand, both
the sync and async paths can make use of MAY_SLEEP.

Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-06 15:23:55 +08:00
Herbert Xu 6454c2b83f crypto: algif_skcipher - Do not dereference ctx without socket lock
Any access to non-constant bits of the private context must be
done under the socket lock, in particular, this includes ctx->req.

This patch moves such accesses under the lock, and fetches the
tfm from the parent socket which is guaranteed to be constant,
rather than from ctx->req.

Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-06 15:23:55 +08:00
Herbert Xu ec69bbfb99 crypto: algif_skcipher - Do not assume that req is unchanged
The async path in algif_skcipher assumes that the crypto completion
function will be called with the original request.  This is not
necessarily the case.  In fact there is no need for this anyway
since we already embed information into the request with struct
skcipher_async_req.

This patch adds a pointer to that struct and then passes it as
the data to the callback function.

Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Tadeusz Struk <tadeusz.struk@intel.com>
2016-02-06 15:23:55 +08:00
Mathias Krause 63e41ebc66 crypto: user - lock crypto_alg_list on alg dump
We miss to take the crypto_alg_sem semaphore when traversing the
crypto_alg_list for CRYPTO_MSG_GETALG dumps. This allows a race with
crypto_unregister_alg() removing algorithms from the list while we're
still traversing it, thereby leading to a use-after-free as show below:

[ 3482.071639] general protection fault: 0000 [#1] SMP
[ 3482.075639] Modules linked in: aes_x86_64 glue_helper lrw ablk_helper cryptd gf128mul ipv6 pcspkr serio_raw virtio_net microcode virtio_pci virtio_ring virtio sr_mod cdrom [last unloaded: aesni_intel]
[ 3482.075639] CPU: 1 PID: 11065 Comm: crconf Not tainted 4.3.4-grsec+ #126
[ 3482.075639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 3482.075639] task: ffff88001cd41a40 ti: ffff88001cd422c8 task.ti: ffff88001cd422c8
[ 3482.075639] RIP: 0010:[<ffffffff93722bd3>]  [<ffffffff93722bd3>] strncpy+0x13/0x30
[ 3482.075639] RSP: 0018:ffff88001f713b60  EFLAGS: 00010202
[ 3482.075639] RAX: ffff88001f6c4430 RBX: ffff88001f6c43a0 RCX: ffff88001f6c4430
[ 3482.075639] RDX: 0000000000000040 RSI: fefefefefefeff16 RDI: ffff88001f6c4430
[ 3482.075639] RBP: ffff88001f713b60 R08: ffff88001f6c4470 R09: ffff88001f6c4480
[ 3482.075639] R10: 0000000000000002 R11: 0000000000000246 R12: ffff88001ce2aa28
[ 3482.075639] R13: ffff880000093700 R14: ffff88001f5e4bf8 R15: 0000000000003b20
[ 3482.075639] FS:  0000033826fa2700(0000) GS:ffff88001e900000(0000) knlGS:0000000000000000
[ 3482.075639] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3482.075639] CR2: ffffffffff600400 CR3: 00000000139ec000 CR4: 00000000001606f0
[ 3482.075639] Stack:
[ 3482.075639]  ffff88001f713bd8 ffffffff936ccd00 ffff88001e5c4200 ffff880000093700
[ 3482.075639]  ffff88001f713bd0 ffffffff938ef4bf 0000000000000000 0000000000003b20
[ 3482.075639]  ffff88001f5e4bf8 ffff88001f5e4848 0000000000000000 0000000000003b20
[ 3482.075639] Call Trace:
[ 3482.075639]  [<ffffffff936ccd00>] crypto_report_alg+0xc0/0x3e0
[ 3482.075639]  [<ffffffff938ef4bf>] ? __alloc_skb+0x16f/0x300
[ 3482.075639]  [<ffffffff936cd08a>] crypto_dump_report+0x6a/0x90
[ 3482.075639]  [<ffffffff93935707>] netlink_dump+0x147/0x2e0
[ 3482.075639]  [<ffffffff93935f99>] __netlink_dump_start+0x159/0x190
[ 3482.075639]  [<ffffffff936ccb13>] crypto_user_rcv_msg+0xc3/0x130
[ 3482.075639]  [<ffffffff936cd020>] ? crypto_report_alg+0x3e0/0x3e0
[ 3482.075639]  [<ffffffff936cc4b0>] ? alg_test_crc32c+0x120/0x120
[ 3482.075639]  [<ffffffff93933145>] ? __netlink_lookup+0xd5/0x120
[ 3482.075639]  [<ffffffff936cca50>] ? crypto_add_alg+0x1d0/0x1d0
[ 3482.075639]  [<ffffffff93938141>] netlink_rcv_skb+0xe1/0x130
[ 3482.075639]  [<ffffffff936cc4f8>] crypto_netlink_rcv+0x28/0x40
[ 3482.075639]  [<ffffffff939375a8>] netlink_unicast+0x108/0x180
[ 3482.075639]  [<ffffffff93937c21>] netlink_sendmsg+0x541/0x770
[ 3482.075639]  [<ffffffff938e31e1>] sock_sendmsg+0x21/0x40
[ 3482.075639]  [<ffffffff938e4763>] SyS_sendto+0xf3/0x130
[ 3482.075639]  [<ffffffff93444203>] ? bad_area_nosemaphore+0x13/0x20
[ 3482.075639]  [<ffffffff93444470>] ? __do_page_fault+0x80/0x3a0
[ 3482.075639]  [<ffffffff939d80cb>] entry_SYSCALL_64_fastpath+0x12/0x6e
[ 3482.075639] Code: 88 4a ff 75 ed 5d 48 0f ba 2c 24 3f c3 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 85 d2 48 89 f8 48 89 f9 4c 8d 04 17 48 89 e5 74 15 <0f> b6 16 80 fa 01 88 11 48 83 de ff 48 83 c1 01 4c 39 c1 75 eb
[ 3482.075639] RIP  [<ffffffff93722bd3>] strncpy+0x13/0x30

To trigger the race run the following loops simultaneously for a while:
  $ while : ; do modprobe aesni-intel; rmmod aesni-intel; done
  $ while : ; do crconf show all > /dev/null; done

Fix the race by taking the crypto_alg_sem read lock, thereby preventing
crypto_unregister_alg() from modifying the algorithm list during the
dump.

This bug has been detected by the PaX memory sanitize feature.

Cc: stable@vger.kernel.org
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: PaX Team <pageexec@freemail.hu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-06 15:23:55 +08:00
Linus Torvalds 5af9c2e19d Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "22 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
  epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT
  radix-tree: fix oops after radix_tree_iter_retry
  MAINTAINERS: trim the file triggers for ABI/API
  dax: dirty inode only if required
  thp: make deferred_split_scan() work again
  mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
  ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup
  um: asm/page.h: remove the pte_high member from struct pte_t
  mm, hugetlb: don't require CMA for runtime gigantic pages
  mm/hugetlb: fix gigantic page initialization/allocation
  mm: downgrade VM_BUG in isolate_lru_page() to warning
  mempolicy: do not try to queue pages from !vma_migratable()
  mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
  vmstat: make vmstat_update deferrable
  mm, vmstat: make quiet_vmstat lighter
  mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT
  memblock: don't mark memblock_phys_mem_size() as __init
  dump_stack: avoid potential deadlocks
  mm: validate_mm browse_rb SMP race condition
  m32r: fix build failure due to SMP and MMU
  ...
2016-02-05 20:20:07 -08:00
Linus Torvalds 5d6a6a75e0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
 "We have a few wire protocol compatibility fixes, ports of a few recent
  CRUSH mapping changes, and a couple error path fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: MOSDOpReply v7 encoding
  libceph: advertise support for TUNABLES5
  crush: decode and initialize chooseleaf_stable
  crush: add chooseleaf_stable tunable
  crush: ensure take bucket value is valid
  crush: ensure bucket id is valid before indexing buckets array
  ceph: fix snap context leak in error path
  ceph: checking for IS_ERR instead of NULL
2016-02-05 19:52:57 -08:00
Linus Torvalds 9b108828ed Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Fixes all over the place:

   - amdkfd: two static checker fixes
   - mst: a bunch of static checker and spec/hw interaction fixes
   - amdgpu: fix Iceland hw properly, and some fiji bugs, along with
     some write-combining fixes.
   - exynos: some regression fixes
   - adv7511: fix some EDID reading issues"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits)
  drm/dp/mst: deallocate payload on port destruction
  drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
  drm/dp/mst: move GUID storage from mgr, port to only mst branch
  drm/dp/mst: change MST detection scheme
  drm/dp/mst: Calculate MST PBN with 31.32 fixed point
  drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
  drm/mst: Add range check for max_payloads during init
  drm/mst: Don't ignore the MST PBN self-test result
  drm: fix missing reference counting decrease
  drm/amdgpu: disable uvd and vce clockgating on Fiji
  drm/amdgpu: remove exp hardware support from iceland
  drm/amdgpu: load MEC ucode manually on iceland
  drm/amdgpu: don't load MEC2 on topaz
  drm/amdgpu: drop topaz support from gmc8 module
  drm/amdgpu: pull topaz gmc bits into gmc_v7
  drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
  drm/amdgpu: iceland use CI based MC IP
  drm/amdgpu: move gmc7 support out of CIK dependency
  drm/amdgpu/gfx7: enable cp inst/reg error interrupts
  drm/amdgpu/gfx8: enable cp inst/reg error interrupts
  ...
2016-02-05 19:38:15 -08:00
Linus Torvalds 22f60701d5 Power management and ACPI fixes for v4.5-rc3
- PM core fix to avoid false-positive warnings generated when
    the pm_domain field is cleared for a device that appears to
    be bound to a driver (Rafael Wysocki).
 
  - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).
 
  - Fix for an "unused function" compiler warning in the generic
    power domains framework (Ulf Hansson).
 
  - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set
    the PM domain pointer of a device properly in one place that
    was overlooked by a recent PM core update (Andy Shevchenko).
 
  - Removal of a redundant function declaration in the ACPI CPPC
    core code (Timur Tabi).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJWtTQKAAoJEILEb/54YlRxQboP+gOgld+AZxMTe/88umM92utO
 M9PqzJhFGQ1kCUQliJhcgB2eggwXQwb8dKo+THv8Ql8DEzPLERcRMD/jffdRj2vQ
 3LQBVkD48nGwu6HOMZ0+e/61AkSkORhq9W7W3wkLcNdUhmetHpoG9+By/5ekf9WR
 BEDdvdEOyWHx044GWIgennaJC7wrmzrzxbjiVW1+aD0p/yRd2eFqNGSFcc2U4hoG
 gylay5sGJ/XU7v6kYg3mr04w+UXOkR/TrbsKheg0BTw/RJzgO69rZnrpZkjymHb0
 98XMNwGu+W2E1lRaKXBqctpuGD5NDOvuZIlqow5JBQXVLZbHE8gkGXJDET72cwRB
 wBPwiiYmdMxZrFU7pwP8FeeX39EDlWCQw4ji4DPqk8VAFjKu6HnHRd0K9vgJ46Gh
 YEGIQVvnuPB7DC3dPpa+FBaL1uIysZSri6/hTJQ9i7oz+WSoW1oNfatBtMtBFKvs
 BZ31QUq2rOf+fjjRiggHmSu/gzHd0ydOKaI3awm1aNJ9wP/J4xj1wBmhrIx9Tbxw
 NXu3wyf1LpRA4Z3s8TUIygWMAPhdoBW9TzgrIlzf26SB42CY+1ftub1xkLqTqTmU
 G1lQcc9wFdYvwUQVAy1Gq1iuwUM/V/UEKZVcttG8yEe5w0qKDdjARNqPFTGcEF0G
 LqLTypyGztWfPOxHs74m
 =DFSF
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These are: a fix for a recently introduced false-positive warnings
  about PM domain pointers being changed inappropriately (harmless but
  annoying), an MCH size workaround quirk for one more platform, a
  compiler warning fix (generic power domains framework), an ACPI LPSS
  (Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code.

  Specifics:

   - PM core fix to avoid false-positive warnings generated when the
     pm_domain field is cleared for a device that appears to be bound to
     a driver (Rafael Wysocki).

   - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).

   - Fix for an "unused function" compiler warning in the generic power
     domains framework (Ulf Hansson).

   - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM
     domain pointer of a device properly in one place that was
     overlooked by a recent PM core update (Andy Shevchenko).

   - Removal of a redundant function declaration in the ACPI CPPC core
     code (Timur Tabi)"

* tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: Avoid false-positive warnings in dev_pm_domain_set()
  PM / Domains: Silence compiler warning for an unused function
  ACPI / CPPC: remove redundant mbox_send_message() declaration
  ACPI / LPSS: set PM domain via helper setter
  PNP: Add Haswell-ULT to Intel MCH size workaround
2016-02-05 18:11:23 -08:00
Jason Baron b6a515c8a0 epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT
In the current implementation of the EPOLLEXCLUSIVE flag (added for
4.5-rc1), if epoll waiters create different POLL* sets and register them
as exclusive against the same target fd, the current implementation will
stop waking any further waiters once it finds the first idle waiter.
This means that waiters could miss wakeups in certain cases.

For example, when we wake up a pipe for reading we do:
wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if
one epoll set or epfd is added to pipe p with POLLIN and a second set
epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the
wakeup since the current implementation will stop after it finds any
intersection of events with a waiter that is blocked in epoll_wait().

We could potentially address this by requiring all epoll waiters that
are added to p be required to pass the same set of POLL* events.  IE the
first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL*
flags to be used by any other epfds that are added as EPOLLEXCLUSIVE.
However, I think it might be somewhat confusing interface as we would
have to reference count the number of users for that set, and so
userspace would have to keep track of that count, or we would need a
more involved interface.  It also adds some shared state that we'd have
store somewhere.  I don't think anybody will want to bloat
__wait_queue_head for this.

I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE
such that it can only be specified with EPOLLIN and/or EPOLLOUT.  So
that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop
once we hit the first idle waiter that specifies the EPOLLIN bit, since
any remaining waiters that only have 'POLLOUT' set wouldn't need to be
woken.  Likewise, we can do the same thing if 'POLLOUT' is in the wakeup
bit set and not 'POLLIN'.  If both 'POLLOUT' and 'POLLIN' are set in the
wake bit set (there is at least one example of this I saw in fs/pipe.c),
then we just wake the entire exclusive list.  Having both 'POLLOUT' and
'POLLIN' both set should not be on any performance critical path, so I
think that's ok (in fs/pipe.c its in pipe_release()).  We also continue
to include EPOLLERR and EPOLLHUP by default in any exclusive set.  Thus,
the user can specify EPOLLERR and/or EPOLLHUP but is not required to do
so.

Since epoll waiters may be interested in other events as well besides
EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by
doing a 'dup' call on the target fd and adding that as one normally
would with EPOLL_CTL_ADD.  Since I think that the POLLIN and POLLOUT
events are what we are interest in balancing, I think that the 'dup'
thing could perhaps be added to only one of the waiter threads.
However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be
sufficient for the majority of use-cases.

Since EPOLLEXCLUSIVE is intended to be used with a target fd shared
among multiple epfds, where between 1 and n of the epfds may receive an
event, it does not satisfy the semantics of EPOLLONESHOT where only 1
epfd would get an event.  Thus, it is not allowed to be specified in
conjunction with EPOLLEXCLUSIVE.

EPOLL_CTL_MOD is also not allowed if the fd was previously added as
EPOLLEXCLUSIVE.  It seems with the limited number of flags to not be as
interesting, but this could be relaxed at some further point.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Tested-by: Madars Vitolins <m@silodev.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Konstantin Khlebnikov 732042821c radix-tree: fix oops after radix_tree_iter_retry
Helper radix_tree_iter_retry() resets next_index to the current index.
In following radix_tree_next_slot current chunk size becomes zero.  This
isn't checked and it tries to dereference null pointer in slot.

Tagged iterator is fine because retry happens only at slot 0 where tag
bitmask in iter->tags is filled with single bit.

Fixes: 46437f9a55 ("radix-tree: fix race in gang lookup")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Michael Kerrisk (man-pages) b14fd334ff MAINTAINERS: trim the file triggers for ABI/API
Commit ea8f8fc863 ("MAINTAINERS: add linux-api for review of API/ABI
changes") added file triggers for various paths that likely indicated
API/ABI changes.  However, catching all changes in Documentation/ABI/
and include/uapi/ produces a large volume of mail to linux-api, rather
than only API/ABI changes.  Drop those two entries, but leave
include/linux/syscalls.h and kernel/sys_ni.c to catch syscall-related
changes.

[josh@joshtriplett.org: redid changelog]
Signed-off-by: Michael Kerrisk <mtk.man-pages@gmail.com>
Acked-by: Shuah khan <shuahkh@osg.samsung.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Dmitry Monakhov d2b2a28e64 dax: dirty inode only if required
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Kirill A. Shutemov ae026204a2 thp: make deferred_split_scan() work again
We need to iterate over split_queue, not local empty list to get
anything split from the shrinker.

Fixes: e3ae19535c ("thp: limit number of object to scan on deferred_split_scan()")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Konstantin Khlebnikov 12352d3cae mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
anon_vma appeared between lock and unlock.  We have to check anon_vma
first or call anon_vma_prepare() to be sure that it's here.  There are
only few users of these legacy helpers.  Let's get rid of them.

This patch fixes anon_vma lock imbalance in validate_mm().  Write lock
isn't required here, read lock is enough.

And reorders expand_downwards/expand_upwards: security_mmap_addr() and
wrapping-around check don't have to be under anon vma lock.

Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
xuejiufei c95a51807b ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup
When recovery master down, dlm_do_local_recovery_cleanup() only remove
the $RECOVERY lock owned by dead node, but do not clear the refmap bit.
Which will make umount thread falling in dead loop migrating $RECOVERY
to the dead node.

Signed-off-by: xuejiufei <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Nicolai Stange 012a4163be um: asm/page.h: remove the pte_high member from struct pte_t
Commit 16da306849 ("um: kill pfn_t") introduced a compile warning for
defconfig (SUBARCH=i386):

  arch/um/kernel/skas/mmu.c:38:206:
      warning: right shift count >= width of type [-Wshift-count-overflow]

Aforementioned patch changes the definition of the phys_to_pfn() macro
from

  ((pfn_t) ((p) >> PAGE_SHIFT))

to

  ((p) >> PAGE_SHIFT)

This effectively changes the phys_to_pfn() expansion's type from
unsigned long long to unsigned long.

Through the callchain init_stub_pte() => mk_pte(), the expansion of
phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
pte_set_val(pte, phys, prot) macro, eventually leading to

  (pte).pte_high = (phys) >> 32;

This results in the warning from above.

Since UML only deals with 32 bit addresses, the upper 32 bits from
'phys' used to be always zero anyway.  Also, all page protection flags
defined by UML don't use any bits beyond bit 9.  Since the contents of a
PTE are defined within architecture scope only, the ->pte_high member
can be safely removed.

Remove the ->pte_high member from struct pte_t.
Rename ->pte_low to ->pte.
Adapt the pte helper macros in arch/um/include/asm/page.h.

Noteworthy is the pte_copy() macro where a smp_wmb() gets dropped.  This
write barrier doesn't seem to be paired with any read barrier though and
thus, was useless anyway.

Fixes: 16da306849 ("um: kill pfn_t")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Vlastimil Babka 080fe2068e mm, hugetlb: don't require CMA for runtime gigantic pages
Commit 944d9fec8d ("hugetlb: add support for gigantic page allocation
at runtime") has added the runtime gigantic page allocation via
alloc_contig_range(), making this support available only when CONFIG_CMA
is enabled.  Because it doesn't depend on MIGRATE_CMA pageblocks and the
associated infrastructure, it is possible with few simple adjustments to
require only CONFIG_MEMORY_ISOLATION instead of full CONFIG_CMA.

After this patch, alloc_contig_range() and related functions are
available and used for gigantic pages with just CONFIG_MEMORY_ISOLATION
enabled.  Note CONFIG_CMA selects CONFIG_MEMORY_ISOLATION.  This allows
supporting runtime gigantic pages without the CMA-specific checks in
page allocator fastpaths.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Mike Kravetz b4330afbed mm/hugetlb: fix gigantic page initialization/allocation
Attempting to preallocate 1G gigantic huge pages at boot time with
"hugepagesz=1G hugepages=1" on the kernel command line will prevent
booting with the following:

  kernel BUG at mm/hugetlb.c:1218!

When mapcount accounting was reworked, the setting of
compound_mapcount_ptr in prep_compound_gigantic_page was overlooked.  As
a result, the validation of mapcount in free_huge_page fails.

The "BUG_ON" checks in free_huge_page were also changed to
"VM_BUG_ON_PAGE" to assist with debugging.

Fixes: 53f9263bab ("mm: rework mapcount accounting to enable 4k mapping of THPs")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Kirill A. Shutemov cf2a82ee43 mm: downgrade VM_BUG in isolate_lru_page() to warning
Calling isolate_lru_page() is wrong and shouldn't happen, but it not
nessesary fatal: the page just will not be isolated if it's not on LRU.

Let's downgrade the VM_BUG_ON_PAGE() to WARN_RATELIMIT().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Kirill A. Shutemov 77bf45e780 mempolicy: do not try to queue pages from !vma_migratable()
Maybe I miss some point, but I don't see a reason why we try to queue
pages from non migratable VMAs.

This testcase steps on VM_BUG_ON_PAGE() in isolate_lru_page():

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <numaif.h>

    #define SIZE 0x2000

    int foo;

    int main()
    {
        int fd;
        char *p;
        unsigned long mask = 2;

        fd = open("/dev/sg0", O_RDWR);
        p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        /* Faultin pages */
        foo = p[0] + p[0x1000];
        mbind(p, SIZE, MPOL_BIND, &mask, 4, MPOL_MF_MOVE | MPOL_MF_STRICT);
        return 0;
    }

The only case when we can queue pages from such VMA is MPOL_MF_STRICT
plus MPOL_MF_MOVE or MPOL_MF_MOVE_ALL for VMA which has pages on LRU,
but gfp mask is not sutable for migaration (see mapping_gfp_mask() check
in vma_migratable()).  That's looks like a bug to me.

Let's filter out non-migratable vma at start of queue_pages_test_walk()
and go to queue_pages_pte_range() only if MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL flag is set.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Tetsuo Handa 564e81a57f mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
Jan Stancek has reported that system occasionally hanging after "oom01"
testcase from LTP triggers OOM.  Guessing from a result that there is a
kworker thread doing memory allocation and the values between "Node 0
Normal free:" and "Node 0 Normal:" differs when hanging, vmstat is not
up-to-date for some reason.

According to commit 373ccbe592 ("mm, vmstat: allow WQ concurrency to
discover memory reclaim doesn't make any progress"), it meant to force
the kworker thread to take a short sleep, but it by error used
schedule_timeout(1).  We missed that schedule_timeout() in state
TASK_RUNNING doesn't do anything.

Fix it by using schedule_timeout_uninterruptible(1) which forces the
kworker thread to take a short sleep in order to make sure that vmstat
is up-to-date.

Fixes: 373ccbe592 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Cristopher Lameter <clameter@sgi.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Arkadiusz Miskiewicz <arekm@maven.pl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Michal Hocko ccde8bd401 vmstat: make vmstat_update deferrable
Commit 0eb77e9880 ("vmstat: make vmstat_updater deferrable again and
shut down on idle") made vmstat_shepherd deferrable.  vmstat_update
itself is still useing standard timer which might interrupt idle task.
This is possible because "mm, vmstat: make quiet_vmstat lighter" removed
cancel_delayed_work from the quiet_vmstat.

Change vmstat_work to use DEFERRABLE_WORK to prevent from pointless
wakeups from the idle context.

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Michal Hocko f01f17d370 mm, vmstat: make quiet_vmstat lighter
Mike has reported a considerable overhead of refresh_cpu_vm_stats from
the idle entry during pipe test:

    12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
     4.75%  [kernel]       [k] __schedule
     4.70%  [kernel]       [k] mutex_unlock
     3.14%  [kernel]       [k] __switch_to

This is caused by commit 0eb77e9880 ("vmstat: make vmstat_updater
deferrable again and shut down on idle") which has placed quiet_vmstat
into cpu_idle_loop.  The main reason here seems to be that the idle
entry has to get over all zones and perform atomic operations for each
vmstat entry even though there might be no per cpu diffs.  This is a
pointless overhead for _each_ idle entry.

Make sure that quiet_vmstat is as light as possible.

First of all it doesn't make any sense to do any local sync if the
current cpu is already set in oncpu_stat_off because vmstat_update puts
itself there only if there is nothing to do.

Then we can check need_update which should be a cheap way to check for
potential per-cpu diffs and only then do refresh_cpu_vm_stats.

The original patch also did cancel_delayed_work which we are not doing
here.  There are two reasons for that.  Firstly cancel_delayed_work from
idle context will blow up on RT kernels (reported by Mike):

  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
  Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
  Call Trace:
    dump_stack+0x49/0x67
    ___might_sleep+0xf5/0x180
    rt_spin_lock+0x20/0x50
    try_to_grab_pending+0x69/0x240
    cancel_delayed_work+0x26/0xe0
    quiet_vmstat+0x75/0xa0
    cpu_idle_loop+0x38/0x3e0
    cpu_startup_entry+0x13/0x20
    start_secondary+0x114/0x140

And secondly, even on !RT kernels it might add some non trivial overhead
which is not necessary.  Even if the vmstat worker wakes up and preempts
idle then it will be most likely a single shot noop because the stats
were already synced and so it would end up on the oncpu_stat_off anyway.
We just need to teach both vmstat_shepherd and vmstat_update to stop
scheduling the worker if there is nothing to do.

[mgalbraith@suse.de: cancel pending work of the cpu_stat_off CPU]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Vlastimil Babka 1ce221036b mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT
The description mentions kswapd threads, while the deferred struct page
initialization is actually done by one-off "pgdatinitX" threads.

Fix the description so that potentially users are not confused about
pgdatinit threads using CPU after boot instead of kswapd.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
David Gibson 1f1ffb8a15 memblock: don't mark memblock_phys_mem_size() as __init
At the moment memblock_phys_mem_size() is marked as __init, and so is
discarded after boot.  This is different from most of the memblock
functions which are marked __init_memblock, and are only discarded after
boot if memory hotplug is not configured.

To allow for upcoming code which will need memblock_phys_mem_size() in
the hotplug path, change it from __init to __init_memblock.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Eric Dumazet d7ce369243 dump_stack: avoid potential deadlocks
Some servers experienced fatal deadlocks because of a combination of
bugs, leading to multiple cpus calling dump_stack().

The checksumming bug was fixed in commit 34ae6a1aa0 ("ipv6: update
skb->csum when CE mark is propagated").

The second problem is a faulty locking in dump_stack()

CPU1 runs in process context and calls dump_stack(), grabs dump_lock.

   CPU2 receives a TCP packet under softirq, grabs socket spinlock, and
   call dump_stack() from netdev_rx_csum_fault().

   dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
   dump_lock is owned by CPU1

While dumping its stack, CPU1 is interrupted by a softirq, and happens
to process a packet for the TCP socket locked by CPU2.

CPU1 spins forever in spin_lock() : deadlock

Stack trace on CPU1 looked like :

    NMI backtrace for cpu 1
    RIP: _raw_spin_lock+0x25/0x30
    ...
    Call Trace:
      <IRQ>
      tcp_v6_rcv+0x243/0x620
      ip6_input_finish+0x11f/0x330
      ip6_input+0x38/0x40
      ip6_rcv_finish+0x3c/0x90
      ipv6_rcv+0x2a9/0x500
      process_backlog+0x461/0xaa0
      net_rx_action+0x147/0x430
      __do_softirq+0x167/0x2d0
      call_softirq+0x1c/0x30
      do_softirq+0x3f/0x80
      irq_exit+0x6e/0xc0
      smp_call_function_single_interrupt+0x35/0x40
      call_function_single_interrupt+0x6a/0x70
      <EOI>
      printk+0x4d/0x4f
      printk_address+0x31/0x33
      print_trace_address+0x33/0x3c
      print_context_stack+0x7f/0x119
      dump_trace+0x26b/0x28e
      show_trace_log_lvl+0x4f/0x5c
      show_stack_log_lvl+0x104/0x113
      show_stack+0x42/0x44
      dump_stack+0x46/0x58
      netdev_rx_csum_fault+0x38/0x3c
      __skb_checksum_complete_head+0x6e/0x80
      __skb_checksum_complete+0x11/0x20
      tcp_rcv_established+0x2bd5/0x2fd0
      tcp_v6_do_rcv+0x13c/0x620
      sk_backlog_rcv+0x15/0x30
      release_sock+0xd2/0x150
      tcp_recvmsg+0x1c1/0xfc0
      inet_recvmsg+0x7d/0x90
      sock_recvmsg+0xaf/0xe0
      ___sys_recvmsg+0x111/0x3b0
      SyS_recvmsg+0x5c/0xb0
      system_call_fastpath+0x16/0x1b

Fixes: b58d977432 ("dump_stack: serialize the output from dump_stack()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00