alistair23-linux/arch/x86
Sean Christopherson 45def77ebf KVM: x86: update %rip after emulating IO
Most (all?) x86 platforms provide a port IO based reset mechanism, e.g.
OUT 92h or CF9h.  Userspace may emulate said mechanism, i.e. reset a
vCPU in response to KVM_EXIT_IO, without explicitly announcing to KVM
that it is doing a reset, e.g. Qemu jams vCPU state and resumes running.

To avoid corruping %rip after such a reset, commit 0967b7bf1c ("KVM:
Skip pio instruction when it is emulated, not executed") changed the
behavior of PIO handlers, i.e. today's "fast" PIO handling to skip the
instruction prior to exiting to userspace.  Full emulation doesn't need
such tricks becase re-emulating the instruction will naturally handle
%rip being changed to point at the reset vector.

Updating %rip prior to executing to userspace has several drawbacks:

  - Userspace sees the wrong %rip on the exit, e.g. if PIO emulation
    fails it will likely yell about the wrong address.
  - Single step exits to userspace for are effectively dropped as
    KVM_EXIT_DEBUG is overwritten with KVM_EXIT_IO.
  - Behavior of PIO emulation is different depending on whether it
    goes down the fast path or the slow path.

Rather than skip the PIO instruction before exiting to userspace,
snapshot the linear %rip and cancel PIO completion if the current
value does not match the snapshot.  For a 64-bit vCPU, i.e. the most
common scenario, the snapshot and comparison has negligible overhead
as VMCS.GUEST_RIP will be cached regardless, i.e. there is no extra
VMREAD in this case.

All other alternatives to snapshotting the linear %rip that don't
rely on an explicit reset announcenment suffer from one corner case
or another.  For example, canceling PIO completion on any write to
%rip fails if userspace does a save/restore of %rip, and attempting to
avoid that issue by canceling PIO only if %rip changed then fails if PIO
collides with the reset %rip.  Attempting to zero in on the exact reset
vector won't work for APs, which means adding more hooks such as the
vCPU's MP_STATE, and so on and so forth.

Checking for a linear %rip match technically suffers from corner cases,
e.g. userspace could theoretically rewrite the underlying code page and
expect a different instruction to execute, or the guest hardcodes a PIO
reset at 0xfffffff0, but those are far, far outside of what can be
considered normal operation.

Fixes: 432baf60ee ("KVM: VMX: use kvm_fast_pio_in for handling IN I/O")
Cc: <stable@vger.kernel.org>
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-28 17:29:04 +01:00
..
boot x86/boot: Restrict header scope to make Clang happy 2019-03-21 12:24:38 +01:00
configs Merge branch 'akpm' (patches from Andrew) 2019-03-07 19:25:37 -08:00
crypto crypto: x86/poly1305 - Clear key material from stack in SSE2 variant 2019-02-28 14:17:59 +08:00
entry pidfd patches for v5.1-rc1 2019-03-16 13:47:14 -07:00
events perf/x86/intel: Make dev_attr_allow_tsx_force_abort static 2019-03-17 08:40:18 +01:00
hyperv x86/hyperv: Prevent potential NULL pointer dereference 2019-03-21 12:24:39 +01:00
ia32 a.out: remove core dumping support 2019-03-05 10:00:35 -08:00
include KVM: x86: update %rip after emulating IO 2019-03-28 17:29:04 +01:00
kernel x86/gart: Exclude GART aperture from kcore 2019-03-23 12:11:49 +01:00
kvm KVM: x86: update %rip after emulating IO 2019-03-28 17:29:04 +01:00
lib x86/lib: Fix indentation issue, remove extra tab 2019-03-21 12:24:38 +01:00
math-emu Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
mm x86/mm/pti: Make local symbols static 2019-03-22 13:31:28 +01:00
net x32: bpf: implement jitting of JMP32 2019-01-26 13:33:02 -08:00
oprofile x86/oprofile: Fix bogus GCC-8 warning in nmi_setup() 2018-02-21 09:54:17 +01:00
pci x86/PCI: Fixup RTIT_BAR of Intel Denverton Trace Hub 2019-02-07 08:43:58 -06:00
platform treewide: add checks for the return value of memblock_alloc*() 2019-03-12 10:04:02 -07:00
power mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
purgatory kbuild: move bin2c back to scripts/ from scripts/basic/ 2018-07-18 01:18:05 +09:00
ras
realmode Kbuild updates for v5.1 2019-03-10 17:48:21 -07:00
tools x86: Clean up 'sizeof x' => 'sizeof(x)' 2018-10-29 07:13:28 +01:00
um Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-05 14:08:26 -08:00
video
xen Merge branch 'akpm' (patches from Andrew) 2019-03-12 10:39:53 -07:00
.gitignore x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore 2018-02-13 14:10:29 +01:00
Kbuild KVM: x86: Allow Qemu/KVM to use PVH entry point 2018-12-13 13:41:49 -05:00
Kconfig DMA mapping updates for 5.1 2019-03-10 11:54:48 -07:00
Kconfig.cpu x86/cpu: Create Hygon Dhyana architecture support file 2018-09-27 16:14:05 +02:00
Kconfig.debug efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation 2019-02-04 08:27:30 +01:00
Makefile Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-07 17:09:28 -08:00
Makefile.um x86, powerpc: Remove -funit-at-a-time compiler option entirely 2018-12-09 11:55:32 +01:00
Makefile_32.cpu