* x86 MMU: RMap Interface
From: contact @ 2020-07-19 22:32 UTC (permalink / raw)
  To: kvm

Hi,

I'm a bit confused by the interface for interacting with the page rmap. 
For context, on a TDP-enabled x86-64 host, I'm logging each time a 
GFN->PFN mapping is created/modified/removed for a non-MMIO page (kernel 
version 5.4).

First, my understanding is that the page rmap is a mapping of non-MMIO 
PFNs back to the GFNs that use them. The interface for creating an rmap 
entry (and thus, a new GFN->PFN mapping) appears to be rmap_add() and is 
quite straightforward. However, rmap_remove() does not appear to be the 
(only) function for removing an entry from the page rmap. For instance, 
kvm_zap_rmapp()---used by the mmu_notifier for invalidations---jumps 
straight to pte_list_remove(), while drop_spte() uses rmap_remove(). 
Would it be fair to say that mmu_spte_clear_track_bits() is found on all 
paths for removing an entry from the page rmap?

Second, for updates to the frame numbers in an existing SPTE, there are 
both mmu_set_spte() and mmu_spte_set(). Could someone please clarify the 
difference between these functions?

Finally, much of the logic between the page rmap and parent PTE rmaps 
(understandably) overlaps. However, with TDP enabled, I'm not entirely 
sure what the role of the parent PTE rmaps is relative to the page rmap. 
Could someone possibly clarify?

Thanks, and best wishes,

Kevin

* Re: x86 MMU: RMap Interface
From: Sean Christopherson @ 2020-07-20 15:49 UTC (permalink / raw)
  To: contact; +Cc: kvm

On Sun, Jul 19, 2020 at 06:32:22PM -0400, contact@kevinloughlin.org wrote:
> Hi,
> 
> I'm a bit confused by the interface for interacting with the page rmap. For
> context, on a TDP-enabled x86-64 host, I'm logging each time a GFN->PFN
> mapping is created/modified/removed for a non-MMIO page (kernel version
> 5.4).
> 
> First, my understanding is that the page rmap is a mapping of non-MMIO PFNs
> back to the GFNs that use them. The interface for creating an rmap entry
> (and thus, a new GFN->PFN mapping) appears to be rmap_add() and is quite
> straightforward. However, rmap_remove() does not appear to be the (only)
> function for removing an entry from the page rmap. For instance,
> kvm_zap_rmapp()---used by the mmu_notifier for invalidations---jumps
> straight to pte_list_remove(), while drop_spte() uses rmap_remove().

The rmaps are associated with the memslot.  The drop_spte() path allows KVM
to clean up SPTEs without having to guarantee the validity of the memslot
that was used to create the SPTE.

> Would it be fair to say that mmu_spte_clear_track_bits() is found on all
> paths for removing an entry from the page rmap?

Yes, that should hold true.
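
To make the two removal paths concrete, here is a minimal user-space sketch,
loosely modeled on the description above.  It is not kernel code: every
toy_* name, the fixed-size array, and the "nonzero means present" convention
are assumptions made for illustration.  It only shows why a hook in the
shared clear helper would see removals from both paths.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RMAP_MAX 8

/* Toy rmap: a flat list of pointers to SPTEs that map one GFN. */
struct toy_rmap {
	uint64_t *sptes[TOY_RMAP_MAX];
	int count;
};

/* Counts present->non-present transitions; a stand-in for a log hook. */
static int toy_clear_calls;

/* Shared helper: clear the SPTE and return its old value. */
static uint64_t toy_clear_track_bits(uint64_t *sptep)
{
	uint64_t old = *sptep;

	*sptep = 0;
	if (old)
		toy_clear_calls++;
	return old;
}

/* Remove one SPTE pointer from the list (order is not preserved). */
static void toy_rmap_unlink(struct toy_rmap *rmap, uint64_t *sptep)
{
	for (int i = 0; i < rmap->count; i++) {
		if (rmap->sptes[i] == sptep) {
			rmap->sptes[i] = rmap->sptes[--rmap->count];
			return;
		}
	}
	assert(!"sptep not found in rmap");
}

/* Path A: start from the SPTE, then look up its rmap entry. */
static void toy_drop_spte(struct toy_rmap *rmap, uint64_t *sptep)
{
	if (toy_clear_track_bits(sptep))
		toy_rmap_unlink(rmap, sptep);
}

/* Path B: start from the rmap head and zap everything on it. */
static int toy_zap_rmap(struct toy_rmap *rmap)
{
	int zapped = 0;

	while (rmap->count > 0) {
		uint64_t *sptep = rmap->sptes[rmap->count - 1];

		toy_clear_track_bits(sptep);
		toy_rmap_unlink(rmap, sptep);
		zapped++;
	}
	return zapped;
}
```

Dropping one SPTE via toy_drop_spte() and the rest via toy_zap_rmap() bumps
toy_clear_calls once per SPTE, regardless of which path removed it.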
 
> Second, for updates to the frame numbers in an existing SPTE, there are both
> mmu_set_spte() and mmu_spte_set(). Could someone please clarify the
> difference between these functions?

mmu_set_spte() is the higher level helper that is used during a page fault
or prefetch to convert a host PFN and basic access permissions into a SPTE
value, handle large/huge page interactions and accounting, add the rmap,
etc., and of course eventually update the SPTE.

mmu_spte_set() is a low level helper that does nothing more than write a
SPTE.  It's just a wrapper to __set_spte() that also WARNs if the old SPTE
is present.
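
The split between the two helpers can be sketched in user space as below.
This is a toy model under stated assumptions, not the kernel's code: the
toy_* names, the bit layout, and the simplified "update in place" branch are
all hypothetical, and the real mmu_set_spte() does considerably more (huge
page handling, accounting, and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit layout for the toy SPTE; not KVM's real encoding. */
#define TOY_PRESENT   (1ull << 0)
#define TOY_WRITABLE  (1ull << 1)
#define TOY_PFN_SHIFT 12
#define TOY_RMAP_MAX  8

struct toy_rmap {
	uint64_t *sptes[TOY_RMAP_MAX];
	int count;
};

static int toy_warn_count;

/* Low-level helper: write the value, warn if the old SPTE was present. */
static void toy_spte_set(uint64_t *sptep, uint64_t new_spte)
{
	if (*sptep & TOY_PRESENT) {
		toy_warn_count++;
		fprintf(stderr, "WARN: overwriting a present SPTE\n");
	}
	*sptep = new_spte;
}

/*
 * High-level helper: build the SPTE value from a PFN and permissions,
 * add an rmap entry on the first install, then write the SPTE.
 * Returns true if the SPTE was already present (no new rmap entry).
 */
static bool toy_set_spte(struct toy_rmap *rmap, uint64_t *sptep,
			 uint64_t pfn, bool writable)
{
	uint64_t spte = (pfn << TOY_PFN_SHIFT) | TOY_PRESENT;
	bool was_present = (*sptep & TOY_PRESENT) != 0;

	if (writable)
		spte |= TOY_WRITABLE;

	if (!was_present) {
		assert(rmap->count < TOY_RMAP_MAX);
		rmap->sptes[rmap->count++] = sptep;
		toy_spte_set(sptep, spte);
	} else {
		*sptep = spte;	/* toy in-place update, rmap unchanged */
	}
	return was_present;
}
```

In this model, only the high-level helper touches the rmap; calling
toy_spte_set() directly on a present SPTE trips the warning, which is the
property described above.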

> Finally, much of the logic between the page rmap and parent PTE rmaps
> (understandably) overlaps. However, with TDP enabled, I'm not entirely sure
> what the role of the parent PTE rmaps is relative to the page rmap. Could
> someone possibly clarify?

KVM needs the backpointers to remove the SPTE for a shadow page, which
exists in the parent shadow page, when the child is zapped, e.g. if a
level-2 SP is removed, its SPTE in the level-3 SP needs to be updated.
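
A rough user-space sketch of that backpointer idea follows.  The structure
layout, the toy_* names, and the use of 1 as a "present" marker are
assumptions for illustration only; the real shadow page structures differ.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ENTRIES     4
#define TOY_MAX_PARENTS 4

/* Toy shadow page: a small table plus backpointers to parent entries. */
struct toy_sp {
	int level;
	uint64_t spt[TOY_ENTRIES];		/* nonzero means present */
	uint64_t *parent_ptes[TOY_MAX_PARENTS];	/* SPTEs that point at us */
	int nr_parents;
};

/* Install the child in parent->spt[index] and record the backpointer. */
static void toy_link_child(struct toy_sp *parent, int index,
			   struct toy_sp *child)
{
	assert(index >= 0 && index < TOY_ENTRIES);
	assert(child->nr_parents < TOY_MAX_PARENTS);

	parent->spt[index] = 1;	/* toy "present" marker */
	child->parent_ptes[child->nr_parents++] = &parent->spt[index];
}

/* Zap the child: clear every parent entry that still points at it. */
static int toy_zap_child(struct toy_sp *child)
{
	int cleared = 0;

	while (child->nr_parents > 0) {
		uint64_t *parent_pte = child->parent_ptes[--child->nr_parents];

		*parent_pte = 0;
		cleared++;
	}
	return cleared;
}
```

Without the parent_ptes list, zapping the child would leave a stale entry
in the parent table with no cheap way to find it.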

* Re: x86 MMU: RMap Interface
From: Kevin Loughlin @ 2020-08-15 21:08 UTC (permalink / raw)
  To: kvm

Given this info, am I correct in saying that all non-MMIO guest pages
are (1) added to the rmap upon being marked present, and (2) removed
from the rmap upon being marked non-present?

I primarily ask because I'm observing behavior (running x86-64 guest
with TDP/EPT enabled) wherein multiple SPTEs appear to be added to the
rmap for the same GFN<->PFN mapping (sometimes later followed by
multiple removals of the same GFN<->PFN mapping). My understanding was
that, for a given guest, each GFN<->PFN mapping corresponds to exactly
one rmap entry (and vice versa). Is this incorrect?

I observe the behavior I mentioned whether I log upon rmap updates, or
upon mmu_spte_set() (for non-present->present) and
mmu_spte_clear_track_bits() (for present->non-present). Perhaps I'm
missing a more obvious interface for logging when the PFNs backing guest
pages are marked as present/non-present?

Best wishes, and thanks again for the help,

Kevin

* Re: x86 MMU: RMap Interface
From: Sean Christopherson @ 2020-08-17 16:54 UTC (permalink / raw)
  To: contact; +Cc: kvm

On Fri, Aug 14, 2020 at 11:44:49PM -0400, contact@kevinloughlin.org wrote:
> Thanks!
> 
> Given this info, am I correct in saying that all non-MMIO guest pages are
> (1) added to the rmap upon being marked present, and (2) removed from the
> rmap upon being marked non-present?
> 
> I primarily ask because I'm observing behavior (running x86-64 guest with
> TDP/EPT enabled) wherein multiple SPTEs appear to be added to the rmap for
> the same GFN<->PFN mapping (sometimes later followed by multiple removals of
> the same GFN<->PFN mapping). My understanding was that, for a given guest,
> each GFN<->PFN mapping corresponds to exactly one rmap entry (and vice
> versa). Is this incorrect?
> 
> I observe the behavior I mentioned whether I log upon rmap updates, or upon
> mmu_spte_set() (for non-present->present) and mmu_spte_clear_track_bits()
> (for present->non-present). Perhaps I'm missing a more obvious interface for
> logging when the PFNs backing guest pages are marked as present/non-present?

The basic premise is correct, but there are exceptions (or rather, at least
one exception that immediately comes to mind).  With TDP and no nested VMs,
a given instance of the MMU will have a 1:1 GFN:PFN mapping.  But, if the
MMU is recreated (reloaded with a different EPTP), e.g. as part of a fast
zap, then there may be mappings for the GFN:PFN in both the old MMU/EPTP
instance and the new MMU/EPTP instance, and thus multiple rmaps.

KVM currently does a fast zap (and MMU reload) when deleting memslots, which
happens multiple times during boot, so the behavior you're observing is
expected.
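
If the goal is to log only when a GFN first becomes mapped and when its last
mapping goes away, one possible approach is to count live SPTEs per GFN and
log on the 0->1 and 1->0 transitions.  The sketch below is a user-space toy
under stated assumptions: the toy_* names and the fixed-size table are
hypothetical, and it assumes all live SPTEs for a GFN point at the same PFN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_GFNS 16

/* Per-GFN count of live SPTEs, across old and new MMU instances. */
struct toy_track {
	uint64_t pfn[TOY_MAX_GFNS];
	int refs[TOY_MAX_GFNS];
};

/* Returns true only for the first live SPTE of this GFN ("mapped"). */
static bool toy_track_add(struct toy_track *t, uint64_t gfn, uint64_t pfn)
{
	assert(gfn < TOY_MAX_GFNS);
	/* Assumption: concurrent SPTEs for one GFN share a single PFN. */
	assert(t->refs[gfn] == 0 || t->pfn[gfn] == pfn);

	t->pfn[gfn] = pfn;
	return t->refs[gfn]++ == 0;
}

/* Returns true only when the last live SPTE goes away ("unmapped"). */
static bool toy_track_remove(struct toy_track *t, uint64_t gfn)
{
	assert(gfn < TOY_MAX_GFNS && t->refs[gfn] > 0);
	return --t->refs[gfn] == 0;
}
```

With this scheme, a second rmap add for the same GFN:PFN from a reloaded MMU
instance is counted but not logged, and only the final removal is reported.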
