alistair23-linux/drivers
David Hildenbrand 381eab4a6e mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
There seem to be some problems as a result of 30467e0b3b ("mm, hotplug:
fix concurrent memory hot-add deadlock"), which tried to fix a possible
lock inversion reported and discussed in [1] due to the two locks
	a) device_lock()
	b) mem_hotplug_lock

While add_memory() first takes b), followed by a) during
bus_probe_device(), onlining of memory from user space first took a),
followed by b), exposing a possible deadlock.

In [1], it was decided to not make use of device_hotplug_lock, but
rather to enforce a locking order.

The problems I spotted related to this:

1. Memory block device attributes: While .state first calls
   mem_hotplug_begin() and then calls device_online() - which takes
   device_lock() - .online no longer calls mem_hotplug_begin(), so it
   effectively calls online_pages() without mem_hotplug_lock.

2. device_online() should be called under device_hotplug_lock, however
   onlining memory during add_memory() does not take care of that.

In addition, I think there is also something wrong about the locking in

3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
   without locks. This was introduced after 30467e0b3b. And skimming over
   the code, I assume it could need some more care with regard to locking
   (e.g. device_online() called without device_hotplug_lock). This will
   be addressed in the following patches.

Now that we hold the device_hotplug_lock when
- adding memory (e.g. via add_memory()/add_memory_resource())
- removing memory (e.g. via remove_memory())
- device_online()/device_offline()

We can move mem_hotplug_lock usage back into
online_pages()/offline_pages().
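
The nesting described above can be sketched in userspace C. This is only a
rough model under stated assumptions, not the kernel implementation: the
pthread mutexes stand in for the kernel locks, and every *_model name
(including the sysfs_online_model() caller) is hypothetical.

```c
#include <pthread.h>

/* Userspace stand-ins for the two kernel locks (assumption: plain mutexes). */
static pthread_mutex_t device_hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* outer */
static pthread_mutex_t mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;    /* inner */

static int device_hotplug_held; /* set while the outer lock is held */
static int online_blocks;       /* state protected by mem_hotplug_lock */

static void lock_device_hotplug_model(void)
{
	pthread_mutex_lock(&device_hotplug_lock);
	device_hotplug_held = 1;
}

static void unlock_device_hotplug_model(void)
{
	device_hotplug_held = 0;
	pthread_mutex_unlock(&device_hotplug_lock);
}

/*
 * Model of online_pages(): it takes the inner lock itself, so callers
 * cannot forget it. The flag check is a single-threaded sanity check in
 * the spirit of a lockdep assertion; it is not a real ownership test.
 */
static int online_pages_model(void)
{
	if (!device_hotplug_held)
		return -1; /* caller skipped the outer lock */

	pthread_mutex_lock(&mem_hotplug_lock);
	online_blocks++;
	pthread_mutex_unlock(&mem_hotplug_lock);
	return 0;
}

/* Hypothetical caller: takes the outer lock, then onlines. */
static int sysfs_online_model(void)
{
	int ret;

	lock_device_hotplug_model();
	ret = online_pages_model();
	unlock_device_hotplug_model();
	return ret;
}
```

In this model, calling online_pages_model() directly fails, while going
through sysfs_online_model() succeeds and always holds both locks in the
same order.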

Why is mem_hotplug_lock still needed? Essentially to make
get_online_mems()/put_online_mems() be very fast (relying on
device_hotplug_lock would be very slow), and to serialize against
addition of memory that does not create memory block devices (e.g. HMM).

[1] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2015-February/065324.html

This patch is partly based on a patch by Vitaly Kuznetsov.

Link: http://lkml.kernel.org/r/20180925091457.28651-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:17 -07:00
..
accessibility
acpi mm/memory_hotplug: make add_memory() take the device_hotplug_lock 2018-10-31 08:54:17 -07:00
amba
android
ata libata: Apply NOLPM quirk for SAMSUNG MZ7TD256HAFV-000L9 2018-10-26 08:21:04 -06:00
atm atm: zatm: Fix empty body Clang warnings 2018-10-18 15:39:10 -07:00
auxdisplay
base mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock 2018-10-31 08:54:17 -07:00
bcma
block powerpc updates for 4.20 2018-10-26 14:36:21 -07:00
bluetooth Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-10-24 14:43:41 +01:00
bus ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
cdrom gdrom: fix mistake in assignment of error 2018-10-25 11:17:40 -06:00
char RTC for 4.20 2018-10-27 09:24:24 -07:00
clk memblock: stop using implicit alignment to SMP_CACHE_BYTES 2018-10-31 08:54:16 -07:00
clocksource RISC-V Patches for the 4.20 Merge Window, Part 1 2018-10-25 18:01:29 -07:00
connector
cpufreq cpufreq: remove unused arm_big_little_dt driver 2018-10-25 18:39:02 +02:00
cpuidle More power management updates for 4.20-rc1 2018-10-30 09:08:07 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-10-25 16:43:35 -07:00
dax
dca
devfreq
dio
dma pci-v4.20-changes 2018-10-25 06:50:48 -07:00
dma-buf
edac ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
eisa
extcon
firewire
firmware memblock: stop using implicit alignment to SMP_CACHE_BYTES 2018-10-31 08:54:16 -07:00
fmc
fpga fpga: add devm_fpga_region_create 2018-10-16 11:13:50 +02:00
fsi
gnss
gpio pci-v4.20-changes 2018-10-25 06:50:48 -07:00
gpu media updates for v4.20-rc1 2018-10-29 14:29:58 -07:00
hid media updates for v4.20-rc1 2018-10-29 14:29:58 -07:00
hsi
hv hv_balloon: Replace spin_is_locked() with lockdep 2018-10-15 20:54:17 +02:00
hwmon Lots of small changes to the IPMI driver. Most of the changes 2018-10-23 09:42:05 +01:00
hwspinlock
hwtracing
i2c More ACPI updates for 4.20-rc1 2018-10-30 09:15:31 -07:00
ide
idle Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
iio Staging/IIO patches for 4.20-rc1 2018-10-29 10:38:10 -07:00
infiniband Revert "mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks" 2018-10-26 16:25:19 -07:00
input Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
iommu mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
ipack
irqchip This tag contains the Linux port for C-SKY(csky) based on linux-4.19 2018-10-29 08:25:00 -07:00
isdn Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-10-24 14:43:41 +01:00
leds leds: gpio: set led_dat->gpiod pointer for OF defined GPIO leds 2018-10-26 20:51:36 +02:00
lightnvm
macintosh memblock: stop using implicit alignment to SMP_CACHE_BYTES 2018-10-31 08:54:16 -07:00
mailbox - Convert print users to use the %pOFn format specifier 2018-10-29 10:30:44 -07:00
mcb
md Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2018-10-26 13:00:44 -07:00
media media updates for v4.20-rc1 2018-10-29 14:29:58 -07:00
memory
memstick
message
mfd - New Drivers 2018-10-25 06:19:15 -07:00
misc Merge branch 'i2c/for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2018-10-29 14:44:03 -07:00
mmc Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
mtd mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
mux This is the bulk of GPIO changes for the v4.20 series: 2018-10-23 08:45:05 +01:00
net mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
nfc NFC: nfcmrvl_uart: fix OF child-node lookup 2018-10-23 13:28:53 -05:00
ntb
nubus
nvdimm libnvdimm for 4.20 2018-10-25 06:31:56 -07:00
nvme pci-v4.20-changes 2018-10-25 06:50:48 -07:00
nvmem nvmem: hide unused nvmem_find_cell_by_index function 2018-10-15 15:56:15 +02:00
of memblock: stop using implicit alignment to SMP_CACHE_BYTES 2018-10-31 08:54:16 -07:00
opp
oprofile
parisc parisc: Add alternative coding infrastructure 2018-10-17 17:22:26 +02:00
parport
pci Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
pcmcia powerpc updates for 4.20 2018-10-26 14:36:21 -07:00
perf arm64 updates for 4.20: 2018-10-22 17:30:06 +01:00
phy USB/PHY patches for 4.20-rc1 2018-10-26 08:14:13 -07:00
pinctrl This is the bulk of GPIO changes for the v4.20 series: 2018-10-23 08:45:05 +01:00
platform Char/Misc driver patches for 4.20-rc1 2018-10-26 09:11:43 -07:00
pnp
power Devicetree updates for 4.20: 2018-10-26 12:09:58 -07:00
powercap Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
pps
ps3
ptp ptp: drop redundant kasprintf() to create worker name 2018-10-28 19:20:06 -07:00
pwm
rapidio
ras
regulator regulator: Regulator updates for next release 2018-10-23 01:54:44 +01:00
remoteproc remoteproc: qcom: q6v5-mss: Register segments/dumpfn for coredump 2018-10-19 12:54:03 -07:00
reset ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
rpmsg
rtc rtc: sc27xx: Always read normal alarm when registering RTC device 2018-10-25 02:35:42 +02:00
s390 mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
sbus
scsi for-linus-20181026 2018-10-26 12:43:13 -07:00
sfi mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
sh
siox
slimbus
sn
soc ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
soundwire
spi - New Drivers 2018-10-25 06:19:15 -07:00
spmi
ssb
staging mm: remove CONFIG_HAVE_MEMBLOCK 2018-10-31 08:54:15 -07:00
target SCSI misc on 20181024 2018-10-25 07:40:30 -07:00
tc
tee
thermal Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal 2018-10-26 12:04:29 -07:00
thunderbolt
tty mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
uio
usb mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
uwb
vfio KVM: PPC: Optimize clearing TCEs for sparse tables 2018-10-20 20:47:02 +11:00
vhost
video media updates for v4.20-rc1 2018-10-29 14:29:58 -07:00
virt
virtio
visorbus
vlynq
vme
w1 w1: IAD Register is yet readable trough iad sys file. Fix snprintf (%u for unsigned, count for max size). 2018-10-15 20:50:32 +02:00
watchdog watchdog: ts4800: release syscon device node in ts4800_wdt_probe() 2018-10-22 10:16:28 +02:00
xen mm/memory_hotplug: make add_memory() take the device_hotplug_lock 2018-10-31 08:54:17 -07:00
zorro
Kconfig
Makefile