From: Junaid Shahid <junaids@google.com>
To: kvm@vger.kernel.org
Cc: andreslc@google.com, pfeiner@google.com, pbonzini@redhat.com, guangrong.xiao@linux.intel.com
Subject: [PATCH v3 8/8] kvm: x86: mmu: Update documentation for fast page fault mechanism
Date: Tue, 6 Dec 2016 16:46:17 -0800
Message-ID: <1481071577-40250-9-git-send-email-junaids@google.com>
In-Reply-To: <1481071577-40250-1-git-send-email-junaids@google.com>
References: <1481071577-40250-1-git-send-email-junaids@google.com>

Add a brief description of the lockless access tracking mechanism to the
documentation of fast page faults in locking.txt.

Signed-off-by: Junaid Shahid <junaids@google.com>
---
 Documentation/virtual/kvm/locking.txt | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/locking.txt b/Documentation/virtual/kvm/locking.txt
index e5dd9f4..51c435a 100644
--- a/Documentation/virtual/kvm/locking.txt
+++ b/Documentation/virtual/kvm/locking.txt
@@ -22,9 +22,16 @@ else is a leaf: no other lock is taken inside the critical sections.
 Fast page fault:
 
 Fast page fault is the fast path which fixes the guest page fault out of
-the mmu-lock on x86. Currently, the page fault can be fast only if the
-shadow page table is present and it is caused by write-protect, that means
-we just need change the W bit of the spte.
+the mmu-lock on x86. Currently, the page fault can be fast in one of the
+following two cases:
+
+1. Access Tracking: The SPTE is not present, but it is marked for access
+tracking i.e. the SPTE_SPECIAL_MASK is set. That means we need to
+restore the saved R/X bits. This is described in more detail later below.
+
+2. Write-Protection: The SPTE is present and the fault is
+caused by write-protect. That means we just need to change the W bit of the
+spte.
 
 What we use to avoid all the race is the SPTE_HOST_WRITEABLE bit and
 SPTE_MMU_WRITEABLE bit on the spte:
@@ -34,7 +41,8 @@ SPTE_MMU_WRITEABLE bit on the spte:
    page write-protection.
 
 On fast page fault path, we will use cmpxchg to atomically set the spte W
-bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, this
+bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, or
+restore the saved R/X bits if VMX_EPT_TRACK_ACCESS mask is set, or both. This
 is safe because whenever changing these bits can be detected by cmpxchg.
 
 But we need carefully check these cases:
@@ -138,6 +146,21 @@ Since the spte is "volatile" if it can be updated out of mmu-lock, we always
 atomically update the spte, the race caused by fast page fault can be avoided,
 See the comments in spte_has_volatile_bits() and mmu_spte_update().
 
+Lockless Access Tracking:
+
+This is used for Intel CPUs that are using EPT but do not support the EPT A/D
+bits. In this case, when the KVM MMU notifier is called to track accesses to a
+page (via kvm_mmu_notifier_clear_flush_young), it marks the PTE as not-present
+by clearing the RWX bits in the PTE and storing the original R & X bits in
+some unused/ignored bits. In addition, the SPTE_SPECIAL_MASK is also set on the
+PTE (using the ignored bit 62). When the VM tries to access the page later on,
+a fault is generated and the fast page fault mechanism described above is used
+to atomically restore the PTE to a Present state. The W bit is not saved when
+the PTE is marked for access tracking and during restoration to the Present
+state, the W bit is set depending on whether or not it was a write access. If
+it wasn't, then the W bit will remain clear until a write access happens, at
+which time it will be set using the Dirty tracking mechanism described above.
+
 3. Reference
 ------------
 
-- 
2.8.0.rc3.226.g39d4020
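
As an aside, the standalone C sketch below (not part of the patch) shows the
kind of bit manipulation the new text describes: save the R/X bits into
ignored bits, clear RWX to make the SPTE not-present, and later restore them
with a compare-and-exchange from the fault path. All names, masks and shift
values here are illustrative only and are not the actual KVM definitions,
which live in arch/x86/kvm/mmu.c.

/*
 * Illustrative sketch of access tracking on an SPTE-like 64-bit value.
 * Bit positions and helper names are made up for this example.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SPTE_R           (1ull << 0)            /* readable              */
#define SPTE_W           (1ull << 1)            /* writable              */
#define SPTE_X           (1ull << 2)            /* executable            */
#define SPTE_RWX_MASK    (SPTE_R | SPTE_W | SPTE_X)
#define SPTE_SPECIAL_BIT (1ull << 62)           /* "access tracked" mark */

/* Hypothetical ignored-bit range used to stash the saved R and X bits. */
#define SAVED_SHIFT      52
#define SAVED_MASK       ((SPTE_R | SPTE_X) << SAVED_SHIFT)

/* Mark an SPTE not-present for access tracking: save R/X, clear RWX. */
static uint64_t mark_for_access_track(uint64_t spte)
{
	uint64_t saved = (spte & (SPTE_R | SPTE_X)) << SAVED_SHIFT;

	spte &= ~(SPTE_RWX_MASK | SAVED_MASK);
	return spte | saved | SPTE_SPECIAL_BIT;
}

/*
 * Compute the restored (Present) value. W is only set for a write fault;
 * otherwise it stays clear until dirty tracking sets it on a later write.
 */
static uint64_t restore_access_track(uint64_t spte, bool write_fault)
{
	uint64_t saved = (spte & SAVED_MASK) >> SAVED_SHIFT;

	spte &= ~(SAVED_MASK | SPTE_SPECIAL_BIT);
	spte |= saved;
	if (write_fault)
		spte |= SPTE_W;
	return spte;
}

/*
 * Lockless restore in the spirit of the fast page fault path: re-read the
 * SPTE and only install the new value if it has not changed under us.
 */
static bool try_restore_locklessly(uint64_t *sptep, bool write_fault)
{
	uint64_t old = __atomic_load_n(sptep, __ATOMIC_RELAXED);

	if (!(old & SPTE_SPECIAL_BIT))
		return false;	/* not access-tracked (or already restored) */

	uint64_t new = restore_access_track(old, write_fault);

	return __atomic_compare_exchange_n(sptep, &old, new, false,
					   __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

int main(void)
{
	/* A fake SPTE: some PFN bits plus full RWX permissions. */
	uint64_t spte = (0x12345ull << 12) | SPTE_RWX_MASK;

	spte = mark_for_access_track(spte);
	printf("tracked:  %#llx (RWX clear, marker set)\n",
	       (unsigned long long)spte);

	/* A read fault restores R/X but leaves W clear. */
	try_restore_locklessly(&spte, false);
	printf("restored: %#llx (R/X back, W still clear)\n",
	       (unsigned long long)spte);
	return 0;
}

The compare-and-exchange is what makes the restore safe without holding the
mmu-lock: if the SPTE changed between the read and the update, the exchange
fails and the fault is simply retried.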