alistair23-linux/drivers/edac
Chen Gong e35fca4791 edac: avoid mce decoding crash after edac driver unloaded
Some edac drivers register themselves as mce decoders via
notifier_chain. However, the current notifier_chain implementation
does not support registering the same notifier twice; doing so
corrupts the list when elements are added or removed. For example,
on a SandyBridge platform, removing the sb_edac module and then
triggering an error causes an oops: no mce decoder is registered
any more, but the notifier_chain still points to an invalid callback
function. Here is an example:

Call Trace:
 [<ffffffff8150ef6a>] atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff8102b936>] mce_log+0x46/0x180
 [<ffffffff8102eaea>] apei_mce_report_mem_error+0x4a/0x60
 [<ffffffff812e19d2>] ghes_do_proc+0x192/0x210
 [<ffffffff812e2066>] ghes_proc+0x46/0x70
 [<ffffffff812e20d8>] ghes_notify_sci+0x48/0x80
 [<ffffffff8150ef05>] notifier_call_chain+0x55/0x80
 [<ffffffff81076f1a>] __blocking_notifier_call_chain+0x5a/0x80
 [<ffffffff812aea11>] ? acpi_os_wait_events_complete+0x23/0x23
 [<ffffffff81076f56>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff812ddc4d>] acpi_hed_notify+0x19/0x1b
 [<ffffffff812b16bd>] acpi_device_notify+0x19/0x1b
 [<ffffffff812beb38>] acpi_ev_notify_dispatch+0x67/0x7f
 [<ffffffff812aea3a>] acpi_os_execute_deferred+0x29/0x36
 [<ffffffff81069dc2>] process_one_work+0x132/0x450
 [<ffffffff8106bbcb>] worker_thread+0x17b/0x3c0
 [<ffffffff8106ba50>] ? manage_workers+0x120/0x120
 [<ffffffff81070aee>] kthread+0x9e/0xb0
 [<ffffffff81514724>] kernel_thread_helper+0x4/0x10
 [<ffffffff81070a50>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff81514720>] ? gs_change+0x13/0x13
Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41
0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d <4c> 8b
79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41
RIP  [<ffffffff8150eef6>] notifier_call_chain+0x46/0x80
 RSP <ffff88042868fb20>
CR2: ffffffffa01af838
---[ end trace 0100930068e73e6f ]---
BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [<ffffffff810705b0>] kthread_data+0x10/0x20
PGD 1a0d067 PUD 1a0e067 PMD 0
Oops: 0000 [#2] SMP

Only i7core_edac and sb_edac have this issue because they can handle
more than one memory controller, which means they register the mce
decoder multiple times.

Cc: <stable@vger.kernel.org> # 3.2 and upper
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-06-11 11:49:51 -03:00
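
The failure mode described in the commit message can be reproduced outside the kernel with a simplified model of a singly linked notifier chain. The sketch below is a user-space illustration only, not the kernel's code: `chain_register`, `chain_unregister`, `demo_decode` and `stale_after_unload` are hypothetical names, and the list logic is an assumption about how a priority-ordered chain without duplicate checks behaves. Registering the same block twice makes it point at itself, so unregistering it never empties the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a notifier block (hypothetical, user space). */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb, unsigned long val,
			     void *data);
	struct notifier_block *next;
	int priority;
};

/* Insert n in priority order; no check for n already being on the list. */
static void chain_register(struct notifier_block **nl,
			   struct notifier_block *n)
{
	while (*nl != NULL) {
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
}

/* Unlink the first occurrence of n; returns 0 on success, -1 if absent. */
static int chain_unregister(struct notifier_block **nl,
			    struct notifier_block *n)
{
	while (*nl != NULL) {
		if (*nl == n) {
			*nl = n->next;
			return 0;
		}
		nl = &(*nl)->next;
	}
	return -1;
}

/* Placeholder decoder callback; it is never invoked in this sketch. */
static int demo_decode(struct notifier_block *nb, unsigned long val,
		       void *data)
{
	(void)nb; (void)val; (void)data;
	return 0;
}

/*
 * Model a driver load/unload cycle for a given number of memory
 * controllers. If per_controller is non-zero, the same block is
 * registered once per controller; otherwise once per driver.
 * Returns 1 if the chain still references the block after unload.
 */
static int stale_after_unload(int controllers, int per_controller)
{
	struct notifier_block *chain = NULL;
	struct notifier_block dec = { demo_decode, NULL, 0 };
	int i, n;

	assert(controllers > 0);
	n = per_controller ? controllers : 1;

	for (i = 0; i < n; i++)
		chain_register(&chain, &dec);
	for (i = 0; i < n; i++)
		chain_unregister(&chain, &dec);

	return chain != NULL;
}
```

With two controllers, `stale_after_unload(2, 1)` reports a stale entry because the second registration sets the block's `next` pointer to itself, while `stale_after_unload(2, 0)` leaves the chain empty. In the real driver the stale entry would point into unloaded module memory, which is consistent with the oops shown in the trace.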
amd64_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
amd64_edac.h amd64_edac: Erratum #637 workaround 2011-04-26 16:18:56 +02:00
amd64_edac_dbg.c
amd64_edac_inj.c
amd76x_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
amd8111_edac.c edac: Drop __DATE__ usage 2011-04-19 00:23:22 +02:00
amd8111_edac.h
amd8131_edac.c edac: Drop __DATE__ usage 2011-04-19 00:23:22 +02:00
amd8131_edac.h
cell_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
cpc925_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
e7xxx_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
e752x_edac.c e752x_edac: provide more info about how DIMMS/ranks are mapped 2012-05-28 19:13:53 -03:00
edac_core.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac 2012-05-29 18:32:37 -07:00
edac_device.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac 2012-05-29 18:32:37 -07:00
edac_device_sysfs.c edac: convert sysdev_class to a regular subsystem 2011-12-14 15:21:07 -08:00
edac_mc.c edac: Initialize the dimm label with the known information 2012-05-28 19:13:50 -03:00
edac_mc_sysfs.c edac: Initialize the dimm label with the known information 2012-05-28 19:13:50 -03:00
edac_module.c edac: Drop __DATE__ usage 2011-04-19 00:23:22 +02:00
edac_module.h edac: rewrite edac_align_ptr() 2012-05-28 19:10:59 -03:00
edac_pci.c edac: rewrite edac_align_ptr() 2012-05-28 19:10:59 -03:00
edac_pci_sysfs.c edac: convert sysdev_class to a regular subsystem 2011-12-14 15:21:07 -08:00
edac_stub.c device.h: cleanup users outside of linux/include (C files) 2012-03-11 14:27:37 -04:00
i7core_edac.c edac: avoid mce decoding crash after edac driver unloaded 2012-06-11 11:49:51 -03:00
i3000_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
i3200_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
i5000_edac.c i5000: Fix the fatal error handling 2012-05-28 19:13:54 -03:00
i5100_edac.c i5100_edac: Fix a warning when compiled with 32 bits 2012-05-28 19:13:54 -03:00
i5400_edac.c i5400_edac: improve debug messages to better represent the filled memory 2012-05-28 19:13:51 -03:00
i7300_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
i82443bxgx_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
i82860_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
i82875p_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
i82975x_edac.c i82975x_edac: Test nr_pages earlier to save a few CPU cycles 2012-05-28 19:13:53 -03:00
Kconfig edac: sb_edac: Let the driver depend on PCI_MMCONFIG 2012-03-21 15:19:56 -03:00
Makefile edac: sb_edac: Add it to the building system 2011-11-01 10:01:54 -02:00
mce_amd.c MCE, AMD: Drop too granulary family model checks 2012-04-04 15:50:11 +02:00
mce_amd.h x86/bitops: Move BIT_64() for a wider use 2012-05-23 17:16:42 +02:00
mce_amd_inj.c device.h: cleanup users outside of linux/include (C files) 2012-03-11 14:27:37 -04:00
mpc85xx_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
mpc85xx_edac.h edac: Drop __DATE__ usage 2011-04-19 00:23:22 +02:00
mv64x60_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
mv64x60_edac.h edac: Drop __DATE__ usage 2011-04-19 00:23:22 +02:00
pasemi_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
ppc4xx_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
ppc4xx_edac.h
r82600_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
sb_edac.c edac: avoid mce decoding crash after edac driver unloaded 2012-06-11 11:49:51 -03:00
tile_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00
x38_edac.c edac: Remove the legacy EDAC ABI 2012-05-28 19:13:50 -03:00