alistair23-linux/arch/x86/kernel
Andy Lutomirski 7536656f08 x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
Right after SYSENTER, we can get a #DB or NMI.  On x86_32, there's no IST,
so the exception handler is invoked on the temporary SYSENTER stack.

Because the SYSENTER stack is very small, we have a fixup to switch
off the stack quickly when this happens.  The old fixup had several issues:

 1. It checked the interrupt frame's CS and EIP.  This wasn't
    obviously correct on Xen or if vm86 mode was in use [1].

 2. In the NMI handler, it did some frightening digging into the
    stack frame.  I'm not convinced this digging was correct.

 3. The fixup didn't switch stacks and then switch back.  Instead, it
    synthesized a brand new stack frame that would redirect the IRET
    back to the SYSENTER code.  That frame was highly questionable.
    For one thing, if NMI nested inside #DB, we would effectively
    abort the #DB prologue, which was probably safe but was
    frightening.  For another, the code used PUSHFL to write the
    FLAGS portion of the frame, which was simply bogus -- by the time
    PUSHFL was called, at least TF, NT, VM, and all of the arithmetic
    flags were clobbered.

Simplify this considerably.  Instead of looking at the saved frame
to see where we came from, check the hardware ESP register against
the SYSENTER stack directly.  Malicious user code cannot spoof the
kernel ESP register, and by moving the check after SAVE_ALL, we can
use normal PER_CPU accesses to find all the relevant addresses.

With this patch applied, the improved syscall_nt_32 test finally
passes on 32-bit kernels.

[1] It isn't obviously correct, but it is nonetheless safe from vm86
    shenanigans as far as I can tell.  A user can't point EIP at
    entry_SYSENTER_32 while in vm86 mode because entry_SYSENTER_32,
    like all kernel addresses, is greater than 0xffff and would thus
    violate the CS segment limit.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/b2cdbc037031c07ecf2c40a96069318aec0e7971.1457578375.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-10 09:48:14 +01:00
..
acpi PM / sleep / x86: Fix crash on graph trace through x86 suspend 2016-03-03 02:28:28 +01:00
apic Merge branch 'x86/urgent' into x86/asm, to pick up fixes 2016-02-18 09:28:03 +01:00
cpu x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled 2016-03-08 14:16:44 +01:00
fpu x86/fpu: Disable AVX when eagerfpu is off 2016-01-12 11:51:21 +01:00
kprobes Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-14 14:37:47 -07:00
.gitignore
alternative.c x86/alternatives: Make optimize_nops() interrupt safe and synced 2015-09-03 21:27:47 +02:00
amd_gart_64.c
amd_nb.c x86/gart: Check for GART support before accessing GART registers 2015-05-06 11:15:53 +02:00
apb_timer.c x86/asm/tsc: Rename native_read_tsc() to rdtsc() 2015-07-06 15:23:28 +02:00
aperture_64.c x86/gart: Check for GART support before accessing GART registers 2015-05-06 11:15:53 +02:00
apm_32.c apm32: Fix cputime == jiffies assumption 2015-07-29 15:44:58 +02:00
asm-offsets.c x86/asm-offsets: Remove PARAVIRT_enabled 2016-03-08 14:16:44 +01:00
asm-offsets_32.c x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup 2016-03-10 09:48:14 +01:00
asm-offsets_64.c x86/syscalls: Add syscall entry qualifiers 2016-01-29 09:46:38 +01:00
audit_64.c x86: hook up execveat system call 2014-12-13 12:42:51 -08:00
bootflag.c x86: don't use module_init for non-modular core bootflag code 2015-06-16 14:12:34 -04:00
check.c Linux 4.2-rc8 2015-08-25 09:59:19 +02:00
cpuid.c new helpers: no_seek_end_llseek{,_size}() 2015-12-23 10:41:31 -05:00
crash.c perf, x86: Stop Intel PT before kdump starts 2015-11-23 09:58:26 +01:00
crash_dump_32.c
crash_dump_64.c
devicetree.c Replace module_init with equivalent device_initcall in non modules. 2015-07-02 10:30:48 -07:00
doublefault.c
dumpstack.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:23:34 -07:00
dumpstack_32.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:23:34 -07:00
dumpstack_64.c x86/kernel: Use kstack_end() in dumpstack_64.c 2015-02-23 18:37:13 +01:00
e820.c x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
early-quirks.c Linux 4.4-rc2 2015-11-23 09:04:05 +01:00
early_printk.c x86/early_printk: Set __iomem address space for IO 2015-10-13 21:45:56 +02:00
espfix_64.c Merge branch 'x86/urgent' into x86/asm, before applying dependent patches 2015-07-31 10:23:35 +02:00
ftrace.c x86/ftrace, x86/asm: Kill ftrace_caller_end label 2016-02-17 08:47:22 +01:00
head.c
head32.c x86: Store a per-cpu shadow copy of CR4 2015-02-04 12:10:42 +01:00
head64.c x86/platform/intel-mid: Enable 64-bit build 2016-01-19 08:39:56 +01:00
head_32.S x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
head_64.S x86/asm: Remove unused L3_PAGE_OFFSET 2016-01-27 11:37:49 +01:00
hpet.c x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
hw_breakpoint.c x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros 2015-12-19 11:49:55 +01:00
i386_ksyms_32.c preempt: Use preempt_schedule_context() as the official tracing preemption point 2015-06-07 15:57:42 +02:00
i8237.c
i8253.c clockevents/drivers/i8253: Migrate to new 'set-state' interface 2015-08-10 11:40:30 +02:00
i8259.c x86/irq: Probe for PIC presence before allocating descs for legacy IRQs 2015-11-07 10:37:37 +01:00
io_delay.c
ioport.c x86/asm/entry: Rename 'init_tss' to 'cpu_tss' 2015-03-06 08:32:58 +01:00
irq.c x86/irq: Call irq_force_move_complete with irq descriptor 2016-01-15 13:44:01 +01:00
irq_32.c genirq: Remove irq argument from irq flow handlers 2015-09-16 15:47:51 +02:00
irq_64.c x86/irq: Drop unlikely before IS_ERR_OR_NULL 2015-10-01 11:08:56 +02:00
irq_work.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
irqinit.c x86/irq: Store irq descriptor in vector array 2015-08-06 00:14:59 +02:00
jump_label.c jump_label: Rename JUMP_LABEL_{EN,DIS}ABLE to JUMP_LABEL_{JMP,NOP} 2015-08-03 11:34:12 +02:00
kdebugfs.c
kexec-bzimage64.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2015-09-08 12:41:25 -07:00
kgdb.c x86/kgdb: Replace bool_int_array[NR_CPUS] with bitmap 2015-09-28 10:13:31 +02:00
ksysfs.c
kvm.c The bulk of the changes here is for x86. And for once it's not 2015-06-24 09:36:49 -07:00
kvmclock.c x86/vdso: Remove pvclock fixmap machinery 2015-12-11 08:56:03 +01:00
ldt.c x86/ldt: Fix small LDT allocation for Xen 2015-09-14 12:10:50 +02:00
livepatch.c livepatch: Cleanup module page permission changes 2015-12-04 22:51:07 +01:00
machine_kexec_32.c x86, irq: Move IOAPIC related declarations from hw_irq.h into io_apic.h 2014-12-16 14:08:17 +01:00
machine_kexec_64.c kexec: move some memembers and definitions within the scope of CONFIG_KEXEC_FILE 2016-01-20 17:09:18 -08:00
Makefile kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
mcount_64.S x86/ftrace, x86/asm: Kill ftrace_caller_end label 2016-02-17 08:47:22 +01:00
mmconf-fam10h_64.c
module.c x86/mm/KASLR: Propagate KASLR status to kernel proper 2015-04-03 15:26:15 +02:00
mpparse.c x86: Cleanup irq_domain ops 2015-04-24 15:36:55 +02:00
msr.c x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
nmi.c x86/nmi: Save regs in crash dump on external NMI 2015-12-19 11:07:01 +01:00
nmi_selftest.c
paravirt-spinlocks.c locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS 2015-05-11 09:52:09 +02:00
paravirt.c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-01-11 16:26:03 -08:00
paravirt_patch_32.c x86/paravirt: Remove the unused irq_enable_sysexit pv op 2015-11-23 10:48:16 +01:00
paravirt_patch_64.c x86/entry, x86/paravirt: Remove the unused usergs_sysret32 PV op 2015-11-23 10:48:16 +01:00
pci-calgary_64.c x86/platform/calgary: Constify cal_chipset_ops structures 2015-11-29 08:50:58 +01:00
pci-dma.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
pci-iommu_table.c
pci-nommu.c
pci-swiotlb.c x86/mm/64: Enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN 2015-12-06 12:46:31 +01:00
pcspeaker.c
perf_regs.c perf/x86/64: Report regs_user->ax too in get_regs_user() 2015-04-11 13:08:53 +02:00
pmem.c libnvdimm, e820: skip module loading when no type-12 2015-11-30 09:10:33 -08:00
probe_roms.c
process.c x86/vm86: Set thread.vm86 to NULL on fork/clone 2015-10-31 09:50:25 +01:00
process_32.c sched/core, sched/x86: Kill thread_info::saved_preempt_count 2015-10-06 17:08:18 +02:00
process_64.c x86/LDT: Print the real LDT base address 2015-12-29 12:34:38 +01:00
ptrace.c arch/x86/kernel/ptrace.c: Remove unused arg_offs_table 2015-12-29 11:35:34 +01:00
pvclock.c x86/vdso: Remove pvclock fixmap machinery 2015-12-11 08:56:03 +01:00
quirks.c timers/x86/hpet: Type adjustments 2015-10-21 11:17:32 +02:00
reboot.c x86/reboot/quirks: Add iMac10,1 to pci_reboot_dmi_table[] 2016-01-12 12:27:36 +01:00
reboot_fixups_32.c
relocate_kernel_32.S x86/asm: Optimize unnecessarily wide TEST instructions 2015-03-07 11:12:43 +01:00
relocate_kernel_64.S x86/asm: Replace "MOVQ $imm, %reg" with MOVL 2015-04-01 13:17:39 +02:00
resource.c
rtc.c x86/paravirt: Prevent rtc_cmos platform device init on PV guests 2015-12-19 21:35:13 +01:00
setup.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-01-11 17:16:01 -08:00
setup_percpu.c x86: Convert a few more per-CPU items to read-mostly ones 2014-11-04 20:13:28 +01:00
signal.c x86/signal/64: Re-add support for SS in the 64-bit signal context 2016-02-17 08:32:11 +01:00
signal_compat.c x86/compat: Move copy_siginfo_*_user32() to signal_compat.c 2015-07-06 15:28:55 +02:00
smp.c x86/smp: Remove single IPI wrapper 2015-11-05 13:07:54 +01:00
smpboot.c x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros 2015-12-19 11:49:55 +01:00
stacktrace.c
step.c Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes 2015-08-18 09:39:47 +02:00
sys_x86_64.c x86/mm: Improve AMD Bulldozer ASLR workaround 2015-03-31 10:01:17 +02:00
sysfb.c x86/sysfb: Use PTR_ERR_OR_ZERO 2014-10-17 13:40:52 -07:00
sysfb_efi.c
sysfb_simplefb.c x86/simplefb: Use PTR_ERR_OR_ZERO 2014-10-17 13:40:52 -07:00
tboot.c
tce_64.c
test_nx.c
test_rodata.c treewide: Fix typo in printk messages 2015-03-06 23:05:39 +01:00
time.c x86/asm/entry: Change all 'user_mode_vm()' calls to 'user_mode()' 2015-03-23 11:14:17 +01:00
tls.c x86, tls: Interpret an all-zero struct user_desc as "no segment" 2015-01-22 21:45:07 +01:00
tls.h
topology.c x86: Drop bogus __ref / __refdata annotations 2015-07-20 18:57:20 +02:00
trace_clock.c x86/asm/tsc: Add rdtsc_ordered() and use it in trivial call sites 2015-07-06 15:23:29 +02:00
tracepoint.c
traps.c x86/entry: Vastly simplify SYSENTER TF (single-step) handling 2016-03-10 09:48:13 +01:00
tsc.c x86/tsc: Remove unused tsc_pre_init() hook 2015-11-19 11:03:13 +01:00
tsc_msr.c
tsc_sync.c x86/asm/tsc/sync: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers 2015-07-06 15:23:29 +02:00
uprobes.c uprobes/x86: Make arch_uretprobe_is_alive(RP_CHECK_CALL) more clever 2015-07-31 10:38:06 +02:00
verify_cpu.S x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
vm86_32.c x86/cpufeature: Replace the old static_cpu_has() with safe variant 2016-01-30 11:22:18 +01:00
vmlinux.lds.S x86/alternatives: Add an auxilary section 2016-01-30 11:22:20 +01:00
vsmp_64.c x86: replace __init_or_module with __init in non-modular vsmp_64.c 2015-06-16 14:12:41 -04:00
x86_init.c x86/tsc: Remove unused tsc_pre_init() hook 2015-11-19 11:03:13 +01:00
x8664_ksyms_64.c preempt: Use preempt_schedule_context() as the official tracing preemption point 2015-06-07 15:57:42 +02:00