1
0
Fork 0
alistair23-linux/arch
Tony Luck 9a20d6c327 x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned
commit 17fae1294a upstream.

An interesting thing happened when a guest Linux instance took a machine
check. The VMM unmapped the bad page from guest physical space and
passed the machine check to the guest.

Linux took all the normal actions to offline the page from the process
that was using it. But then guest Linux crashed because it said there
was a second machine check inside the kernel with this stack trace:

do_memory_failure
    set_mce_nospec
         set_memory_uc
              _set_memory_uc
                   change_page_attr_set_clr
                        cpa_flush
                             clflush_cache_range_opt

This was odd, because a CLFLUSH instruction shouldn't raise a machine
check (it isn't consuming the data). Further investigation showed that
the VMM had passed in another machine check because is appeared that the
guest was accessing the bad page.

Fix is to check the scope of the poison by checking the MCi_MISC register.
If the entire page is affected, then unmap the page. If only part of the
page is affected, then mark the page as uncacheable.

This assumes that VMMs will do the logical thing and pass in the "whole
page scope" via the MCi_MISC register (since they unmapped the entire
page).

  [ bp: Adjust to x86/entry changes. ]

Fixes: 284ce4011b ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
Reported-by: Jue Wang <juew@google.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jue Wang <juew@google.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200520163546.GA7977@agluck-desk2.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-17 16:40:31 +02:00
..
alpha mm: introduce MADV_PAGEOUT 2019-09-25 17:51:41 -07:00
arc ARC: [plat-eznps]: Restrict to CONFIG_ISA_ARCOMPACT 2020-06-07 13:18:50 +02:00
arm ARM: 8977/1: ptrace: Fix mask for thumb breakpoint hook 2020-06-17 16:40:21 +02:00
arm64 arm64: acpi: fix UBSAN warning 2020-06-17 16:40:28 +02:00
c6x mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
csky csky: Fixup abiv2 syscall_trace break a4 & a5 2020-06-17 16:40:21 +02:00
h8300 mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
hexagon hexagon: define ioremap_uc 2020-05-10 10:31:31 +02:00
ia64 mm/memory_hotplug: shrink zones when offlining memory 2020-01-09 10:19:56 +01:00
m68k mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
microblaze microblaze: Prevent the overflow of the start 2020-02-24 08:37:02 +01:00
mips MIPS: OCTEON: irq: Fix potential NULL pointer dereference 2020-04-17 10:50:12 +02:00
nds32 asm-generic/nds32: don't redefine cacheflush primitives 2020-01-17 19:48:43 +01:00
nios2 nios2 update for v5.4-rc1 2019-09-27 13:02:19 -07:00
openrisc mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
parisc parisc: Fix kernel panic in mem_init() 2020-06-03 08:21:28 +02:00
powerpc powerpc/ptdump: Properly handle non standard page size 2020-06-17 16:40:26 +02:00
riscv riscv: stacktrace: Fix undefined reference to `walk_stackframe' 2020-06-03 08:21:13 +02:00
s390 s390/pci: Log new handle in clp_disable_fh() 2020-06-17 16:40:23 +02:00
sh pinctrl: sh-pfc: sh7269: Fix CAN function GPIOs 2020-02-24 08:36:41 +01:00
sparc sparc: Add .exit.data section. 2020-02-24 08:36:27 +01:00
um um: ensure `make ARCH=um mrproper` removes arch/$(SUBARCH)/include/generated/ 2020-05-02 08:48:53 +02:00
unicore32 mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
x86 x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned 2020-06-17 16:40:31 +02:00
xtensa xtensa: Implement copy_thread_tls 2020-01-14 20:08:35 +01:00
.gitignore
Kconfig asm-generic/tlb: add missing CONFIG symbol 2020-02-24 08:37:02 +01:00