1
0
Fork 0
alistair23-linux/drivers
Robert Richter d2136afb51 EDAC/ghes: Fix locking and memory barrier issues
[ Upstream commit 23f61b9fc5 ]

The ghes registration and refcount is broken in several ways:

 * ghes_edac_register() returns with success for a 2nd instance
   even if a first instance's registration is still running. This is
   not correct as the first instance may fail later. A subsequent
   registration may not finish before the first. Parallel registrations
   must be avoided.

 * The refcount was increased even if a registration failed. This
   leads to stale counters preventing the device from being released.

 * The ghes refcount may not be decremented properly on unregistration.
   Always decrement the refcount once ghes_edac_unregister() is called to
   keep the refcount sane.

 * The ghes_pvt pointer is handed to the irq handler before registration
   finished.

 * The mci structure could be freed while the irq handler is running.

Fix this by adding a mutex to ghes_edac_register(). This mutex
serializes instances to register and unregister. The refcount is only
increased if the registration succeeded. This makes sure the refcount is
in a consistent state after registering or unregistering a device.

Note: A spinlock cannot be used here as the code section may sleep.

The ghes_pvt is protected by ghes_lock now. This ensures the pointer is
not updated before registration was finished or while the irq handler is
running. It is unset before unregistering the device including necessary
(implicit) memory barriers making the changes visible to other CPUs.
Thus, the device can not be used anymore by an interrupt.

Also, rename ghes_init to ghes_refcount for better readability and
switch to refcount API.

A refcount is needed because there can be multiple GHES structures being
defined (see ACPI 6.3 specification, 18.3.2.7 Generic Hardware Error
Source, "Some platforms may describe multiple Generic Hardware Error
Source structures with different notification types, ...").

Another approach to use the mci's device refcount (get_device()) and
have a release function does not work here. A release function will be
called only for device_release() with the last put_device() call. The
device must be deleted *before* that with device_del(). This is only
possible by maintaining an own refcount.

 [ bp: touchups. ]

Fixes: 0fe5f281f7 ("EDAC, ghes: Model a single, logical memory controller")
Fixes: 1e72e673b9 ("EDAC/ghes: Fix Use after free in ghes_edac remove path")
Co-developed-by: James Morse <james.morse@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Co-developed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20191105200732.3053-1-rrichter@marvell.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13 08:43:30 +01:00
..
accessibility
acpi Power management fix for 5.4-rc6 2019-11-01 09:30:48 -07:00
amba ARM updates for 5.4-rc: 2019-10-23 06:26:33 -04:00
android binder: Handle start==NULL in binder_update_page_range() 2019-12-13 08:43:25 +01:00
ata ata: libahci_platform: Fix regulator_get_optional() misuse 2019-10-25 14:22:20 -06:00
atm atm: he: clean up an indentation issue 2019-09-25 13:54:45 +02:00
auxdisplay It's a somewhat calmer cycle for docs this time, as the churn of the mass 2019-09-17 16:22:26 -07:00
base driver core: platform: use the correct callback type for bus_find_device 2019-12-04 22:30:45 +01:00
bcma bcma: make arrays pwr_info_offset and sprom_sizes static const, shrinks object size 2019-09-13 16:44:49 +03:00
block nbd: prevent memory leak 2019-11-29 10:09:47 +01:00
bluetooth Revert "Bluetooth: hci_ll: set operational frequency earlier" 2019-11-29 10:09:43 +01:00
bus bus: ti-sysc: Fix watchdog quirk handling 2019-10-18 08:45:32 -07:00
cdrom
char lp: fix sparc64 LPSETTIMEOUT ioctl 2019-12-13 08:42:17 +01:00
clk Fixes for various clk driver issues that happened because of code we 2019-11-08 08:15:01 -08:00
clocksource - Fix scary messages in sh_mtu2 by using platform_irq_count() helper 2019-11-04 18:43:23 +01:00
connector
counter
cpufreq cpufreq: imx-cpufreq-dt: Correct i.MX8MN's default speed grade value 2019-12-13 08:43:28 +01:00
cpuidle cpuidle: haltpoll: Take 'idle=' override into account 2019-10-22 11:43:17 +02:00
crypto crypto: ccp - fix uninitialized list head 2019-12-13 08:43:08 +01:00
dax
dca
devfreq
dio
dma dmaengine: cppi41: Fix cppi41_dma_prep_slave_sg() when idle 2019-10-23 21:15:21 +05:30
dma-buf dma-buf/resv: fix exclusive fence get 2019-10-10 17:05:20 +02:00
edac EDAC/ghes: Fix locking and memory barrier issues 2019-12-13 08:43:30 +01:00
eisa
extcon chrome platform changes for v5.4 2019-09-19 14:14:28 -07:00
firewire
firmware efi/efi_test: Lock down /dev/efi_test and require CAP_SYS_ADMIN 2019-10-31 09:40:21 +01:00
fpga Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
fsi
gnss
gpio gpio fixes for v5.4-rc8 2019-11-13 22:58:01 +01:00
gpu drm/mcde: Fix an error handling path in 'mcde_probe()' 2019-12-13 08:43:29 +01:00
greybus
hid HID: core: check whether Usage Page item is after Usage ID items 2019-12-04 22:31:07 +01:00
hsi HSI changes for the 5.4 series 2019-09-22 12:02:21 -07:00
hv Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM_SLEEP 2019-10-01 14:49:45 -04:00
hwmon hwmon: (ina3221) Fix read timeout issue 2019-10-28 18:46:55 -07:00
hwspinlock
hwtracing coresight: etm4x: Fix input validation for sysfs. 2019-12-13 08:42:43 +01:00
i2c i2c: core: fix use after free in of_i2c_notify 2019-11-15 22:01:13 +01:00
i3c
ide
idle
iio iio: adc: stm32-adc: fix stopping dma 2019-10-27 15:57:19 +00:00
infiniband RDMA/qib: Validate ->show()/store() callbacks before calling them 2019-12-13 08:43:16 +01:00
input Input: Fix memory leak in psxpad_spi_probe 2019-12-13 08:42:44 +01:00
interconnect interconnect: Add locking in icc_set_tag() 2019-10-20 12:14:41 +03:00
iommu iommu/vt-d: Fix panic after kexec -p for kdump 2019-10-30 10:30:22 +01:00
ipack
irqchip irqchip updates for 5.4, take 2 2019-10-25 14:25:15 +02:00
isdn net: use skb_queue_empty_lockless() in poll() handlers 2019-10-28 13:33:41 -07:00
leds leds: lm3532: Fix optional led-max-microamp prop error handling 2019-09-12 20:45:52 +02:00
lightnvm lightnvm: print error when target is not found 2019-09-05 13:17:01 -06:00
macintosh cpufreq: Use per-policy frequency QoS 2019-10-21 02:05:21 +02:00
mailbox mailbox: tegra: Fix superfluous IRQ error message 2019-12-13 08:42:19 +01:00
mcb
md md/raid0: Fix an error message in raid0_make_request() 2019-12-13 08:43:29 +01:00
media media: rc: mark input device as pointing stick 2019-12-13 08:42:45 +01:00
memory
memstick memstick: jmb38x_ms: Fix an error handling path in 'jmb38x_ms_probe()' 2019-10-09 11:08:03 +02:00
message
mfd mfd: mt6397: Fix probe after changing mt6397-core 2019-10-24 08:49:25 +01:00
misc mei: me: add comet point V device id 2019-12-04 22:30:49 +01:00
mmc mmc: sdhci-of-at91: fix quirk2 overwrite 2019-11-14 14:57:53 +01:00
mtd mtd: rawnand: au1550nd: Fix au_read_buf16() prototype 2019-10-07 09:56:36 +02:00
mux
net can: ucan: fix non-atomic allocation in completion handler 2019-12-13 08:43:16 +01:00
nfc nfc: port100: handle command failure cleanly 2019-11-21 11:48:17 -08:00
ntb NTB: fix IDT Kconfig typos/spellos 2019-09-23 17:20:40 -04:00
nubus
nvdimm libnvdimm fixes v5.4-rc1 2019-09-29 10:33:41 -07:00
nvme for-linus-2019-11-08 2019-11-08 18:15:55 -08:00
nvmem Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
of of: reserved_mem: add missing of_node_put() for proper ref-counting 2019-10-23 15:15:05 -05:00
opp opp: Reinitialize the list_kref before adding the static OPPs again 2019-10-23 10:58:44 +05:30
oprofile
parisc parisc: Remove 32-bit DMA enforcement from sba_iommu 2019-10-14 21:44:26 +02:00
parport Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
pci PCI: PM: Fix pci_power_up() 2019-10-15 23:51:36 +02:00
pcmcia Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-09-28 08:14:15 -07:00
perf
phy pci-v5.4-changes 2019-09-23 19:16:01 -07:00
pinctrl pinctrl: stmfx: fix valid_mask init sequence 2019-11-07 10:06:46 +01:00
platform platform/x86: hp-wmi: Fix ACPI errors caused by passing 0 as input size 2019-12-04 22:31:08 +01:00
pnp
power power supply and reset changes for the v5.4 series 2019-09-22 12:04:59 -07:00
powercap Power management updates for 5.4-rc1 2019-09-17 19:15:14 -07:00
pps
ps3
ptp ptp: Introduce strict checking of external time stamp options. 2019-11-15 12:48:32 -08:00
pwm pwm: bcm-iproc: Prevent unloading the driver module while in use 2019-11-08 18:38:06 +01:00
rapidio
ras
regulator regulator: Fixes for v5.4 2019-10-23 15:31:17 -04:00
remoteproc remoteproc updates for v5.4 2019-09-22 10:55:08 -07:00
reset reset: fix of_reset_control_get_count kerneldoc comment 2019-10-24 10:26:33 +02:00
rpmsg rpmsg: glink-smem: Name the edge based on parent remoteproc 2019-09-17 15:33:31 -07:00
rtc RTC for 5.4 2019-09-22 11:05:43 -07:00
s390 s390/qeth: return proper errno on IO error 2019-11-20 12:29:47 -08:00
sbus
scsi SCSI fixes on 20191111 2019-11-11 09:14:36 -08:00
sfi
sh
siox
slimbus
soc soc: mediatek: cmdq: fixup wrong input order of write api 2019-12-13 08:42:40 +01:00
soundwire soundwire: slave: fix scanf format 2019-10-24 16:55:45 +05:30
spi spi: Fix NULL pointer when setting SPI_CS_HIGH for GPIO CS 2019-12-13 08:43:15 +01:00
spmi
ssb ssb: make array pwr_info_offset static const, makes object smaller 2019-09-13 17:23:18 +03:00
staging staging/octeon: Use stubs for MIPS && !CAVIUM_OCTEON_SOC 2019-12-13 08:42:19 +01:00
target SCSI fixes on 20191101 2019-11-02 11:15:52 -07:00
tc
tee tee/shm: untag user pointers in tee_shm_register 2019-09-25 17:51:41 -07:00
thermal thermal: Fix deadlock in thermal thermal_zone_device_check 2019-12-13 08:43:21 +01:00
thunderbolt thunderbolt: Power cycle the router if NVM authentication fails 2019-12-04 22:30:50 +01:00
tty Revert "serial/8250: Add support for NI-Serial PXI/PXIe+485 devices" 2019-12-13 08:43:22 +01:00
uio Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
usb usb: gadget: u_serial: add missing port entry locking 2019-12-13 08:42:20 +01:00
vfio vfio/type1: Initialize resv_msi_base 2019-10-15 14:07:01 -06:00
vhost vringh: fix copy direction of vringh_iov_push_kern() 2019-10-28 04:25:04 -04:00
video - Some new documentation for GEM shmem madvise helpers 2019-11-08 12:12:57 +10:00
virt virt: vbox: fix memory leak in hgcm_call_preprocess_linaddr 2019-10-10 14:50:32 +02:00
virtio virtio_balloon: fix shrinker count 2019-11-20 02:15:57 -05:00
visorbus
vlynq
vme
w1 w1: ds250x: Fix build error without CRC16 2019-10-10 15:35:41 +02:00
watchdog watchdog: aspeed: Fix clock behaviour for ast2600 2019-12-13 08:43:30 +01:00
xen Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-10-19 17:09:11 -04:00
zorro
Kconfig Staging/IIO driver patches for 5.4-rc1 2019-09-18 11:05:34 -07:00
Makefile Staging/IIO driver patches for 5.4-rc1 2019-09-18 11:05:34 -07:00