kvm: x86: mmu: Update documentation for fast page fault mechanism
Add a brief description of the lockless access tracking mechanism to the
documentation of fast page faults in locking.txt.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

parent f160c7b7bb
commit 63dbe14d39
Documentation/virtual/kvm/locking.txt

@@ -26,9 +26,16 @@ sections.
 Fast page fault:
 
 Fast page fault is the fast path which fixes the guest page fault out of
-the mmu-lock on x86. Currently, the page fault can be fast only if the
-shadow page table is present and it is caused by write-protect, that means
-we just need change the W bit of the spte.
+the mmu-lock on x86. Currently, the page fault can be fast in one of the
+following two cases:
+
+1. Access Tracking: The SPTE is not present, but it is marked for access
+tracking i.e. the SPTE_SPECIAL_MASK is set. That means we need to
+restore the saved R/X bits. This is described in more detail later below.
+
+2. Write-Protection: The SPTE is present and the fault is
+caused by write-protect. That means we just need to change the W bit of the
+spte.
 
 What we use to avoid all the race is the SPTE_HOST_WRITEABLE bit and
 SPTE_MMU_WRITEABLE bit on the spte:
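To make the two-case dispatch in the hunk above concrete, here is a minimal
user-space sketch. The EPT-style R/W/X bit layout and the use of ignored bit
62 as SPTE_SPECIAL_MASK follow the documentation text; the predicate names
are hypothetical stand-ins, not KVM's actual helpers.

	#include <stdbool.h>
	#include <stdint.h>

	#define SPTE_R            (1ULL << 0)   /* EPT read */
	#define SPTE_W            (1ULL << 1)   /* EPT write */
	#define SPTE_X            (1ULL << 2)   /* EPT execute */
	#define SPTE_PRESENT_MASK (SPTE_R | SPTE_W | SPTE_X)
	#define SPTE_SPECIAL_MASK (1ULL << 62)  /* ignored bit, marks tracking */

	/* Case 1: not present, but marked for access tracking. */
	static bool is_access_track_spte(uint64_t spte)
	{
		return !(spte & SPTE_PRESENT_MASK) && (spte & SPTE_SPECIAL_MASK);
	}

	/* Case 2: present, and the fault is a write to a write-protected spte. */
	static bool is_write_protect_fault(uint64_t spte, bool write_fault)
	{
		return (spte & SPTE_PRESENT_MASK) && write_fault && !(spte & SPTE_W);
	}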
@@ -38,7 +45,8 @@ SPTE_MMU_WRITEABLE bit on the spte:
    page write-protection.
 
 On fast page fault path, we will use cmpxchg to atomically set the spte W
-bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, this
+bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, or
+restore the saved R/X bits if VMX_EPT_TRACK_ACCESS mask is set, or both. This
 is safe because whenever changing these bits can be detected by cmpxchg.
 
 But we need carefully check these cases:
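The safety argument in this hunk boils down to publishing the repaired spte
with a single 64-bit compare-and-swap. A sketch of that step, assuming
GCC/Clang __atomic builtins in place of the kernel's cmpxchg64();
try_fix_spte() is a hypothetical name used only for illustration.

	#include <stdbool.h>
	#include <stdint.h>

	/*
	 * Install 'new' only if the spte still holds the snapshot 'old' that
	 * 'new' was computed from. Any concurrent change to the W bit, the
	 * saved R/X bits, or the tracking bit makes the exchange fail, and
	 * the caller simply retries the whole fault.
	 */
	static bool try_fix_spte(uint64_t *sptep, uint64_t old, uint64_t new)
	{
		return __atomic_compare_exchange_n(sptep, &old, new, false,
						   __ATOMIC_ACQ_REL,
						   __ATOMIC_ACQUIRE);
	}

Doing the whole repair as one atomic exchange is what allows it to run
outside mmu-lock: a racing update shows up as a failed cmpxchg instead of a
silently lost write.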
@@ -142,6 +150,21 @@ Since the spte is "volatile" if it can be updated out of mmu-lock, we always
 atomically update the spte, the race caused by fast page fault can be avoided,
 See the comments in spte_has_volatile_bits() and mmu_spte_update().
 
+Lockless Access Tracking:
+
+This is used for Intel CPUs that are using EPT but do not support the EPT A/D
+bits. In this case, when the KVM MMU notifier is called to track accesses to a
+page (via kvm_mmu_notifier_clear_flush_young), it marks the PTE as not-present
+by clearing the RWX bits in the PTE and storing the original R & X bits in
+some unused/ignored bits. In addition, the SPTE_SPECIAL_MASK is also set on the
+PTE (using the ignored bit 62). When the VM tries to access the page later on,
+a fault is generated and the fast page fault mechanism described above is used
+to atomically restore the PTE to a Present state. The W bit is not saved when
+the PTE is marked for access tracking and during restoration to the Present
+state, the W bit is set depending on whether or not it was a write access. If
+it wasn't, then the W bit will remain clear until a write access happens, at
+which time it will be set using the Dirty tracking mechanism described above.
+
 3. Reference
 ------------
 
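The save/clear/restore dance described in the new Lockless Access Tracking
section can be sketched as below. Bit 62 as SPTE_SPECIAL_MASK comes from the
text above; the shift used to stash the saved R/X copy in ignored bits is an
assumption made for illustration (the real masks live in arch/x86/kvm/mmu.c).

	#include <stdbool.h>
	#include <stdint.h>

	#define SPTE_R            (1ULL << 0)
	#define SPTE_W            (1ULL << 1)
	#define SPTE_X            (1ULL << 2)
	#define SPTE_RWX_MASK     (SPTE_R | SPTE_W | SPTE_X)
	#define SPTE_SPECIAL_MASK (1ULL << 62)
	/* Assumed location of the saved R/X copy, in ignored PTE bits. */
	#define SAVED_RX_SHIFT    52
	#define SAVED_RX_MASK     ((SPTE_R | SPTE_X) << SAVED_RX_SHIFT)

	static uint64_t mark_spte_for_access_track(uint64_t spte)
	{
		/* Save R and X (W is deliberately not saved), then clear RWX. */
		uint64_t saved = (spte & (SPTE_R | SPTE_X)) << SAVED_RX_SHIFT;

		spte &= ~(SPTE_RWX_MASK | SAVED_RX_MASK);
		return spte | saved | SPTE_SPECIAL_MASK;
	}

	static uint64_t restore_acc_track_spte(uint64_t spte, bool write_fault)
	{
		/* Move the saved R/X back into place, drop the tracking bit. */
		uint64_t rx = (spte & SAVED_RX_MASK) >> SAVED_RX_SHIFT;

		spte &= ~(SAVED_RX_MASK | SPTE_SPECIAL_MASK);
		spte |= rx;
		/* W is granted only if this fault was a write access... */
		if (write_fault)
			spte |= SPTE_W;
		/* ...otherwise it stays clear until dirty tracking sets it. */
		return spte;
	}

The new spte produced here would then be installed with the single cmpxchg
step sketched earlier.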