All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ben Gardon <bgardon@google.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Peter Xu <peterx@redhat.com>, Peter Shier <pshier@google.com>,
	Peter Feiner <pfeiner@google.com>,
	Junaid Shahid <junaids@google.com>,
	Jim Mattson <jmattson@google.com>,
	Yulei Zhang <yulei.kernel@gmail.com>,
	Wanpeng Li <kernellwp@gmail.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>
Subject: Re: [PATCH 19/24] kvm: x86/mmu: Protect tdp_mmu_pages with a lock
Date: Tue, 26 Jan 2021 14:02:25 -0800	[thread overview]
Message-ID: <YBCRcalZJwAlkO9F@google.com> (raw)
In-Reply-To: <335d27f7-0849-de37-f380-a5018c5c5535@redhat.com>

On Tue, Jan 26, 2021, Paolo Bonzini wrote:
> On 21/01/21 22:32, Sean Christopherson wrote:
> > Coming back to this series, I wonder if the RCU approach is truly necessary to
> > get the desired scalability.  If both zap_collapsible_sptes() and NX huge page
> > recovery zap_only_  leaf SPTEs, then the only path that can actually unlink a
> > shadow page while holding the lock for read is the page fault path that installs
> > a huge page over an existing shadow page.
> > 
> > Assuming the above analysis is correct, I think it's worth exploring alternatives
> > to using RCU to defer freeing the SP memory, e.g. promoting to a write lock in
> > the specific case of overwriting a SP (though that may not exist for rwlocks),
> > or maybe something entirely different?
> 
> You can do the deferred freeing with a short write-side critical section to
> ensure all readers have terminated.

Hmm, the most obvious downside I see is that the zap_collapsible_sptes() case
will not scale as well as the RCU approach.  E.g. the lock may be heavily
contested when refaulting all of guest memory to (re)install huge pages after a
failed migration.

Though I wonder, could we do something even more clever for that particular
case?  And I suppose it would apply to NX huge pages as well.  Instead of
zapping the leaf PTEs and letting the fault handler install the huge page, do an
in-place promotion when dirty logging is disabled.  That could all be done under
the read lock, and with Paolo's method for deferred free on the back end.  That
way only the thread doing the memslot update would take mmu_lock for write, and
only once per memslot update.

> If the bool argument to handle_disconnected_tdp_mmu_page is true(*), the
> pages would be added to an llist, instead of being freed immediately. At the
> end of a shared critical section you would do
> 
> 	if (!llist_empty(&kvm->arch.tdp_mmu_disconnected_pages)) {
> 		struct llist_node *first;
> 		kvm_mmu_lock(kvm);
> 		first = __list_del_all(&kvm->arch.tdp_mmu_disconnected_pages);
> 		kvm_mmu_unlock(kvm);
> 
> 		/*
> 		 * All vCPUs have already stopped using the pages when
> 		 * their TLBs were flushed.  The exclusive critical
> 		 * section above means that there can be no readers
> 		 * either.
> 		 */
> 		tdp_mmu_free_disconnected_pages(first);
> 	}
> 
> So this is still deferred reclamation, but it's done by one of the vCPUs
> rather than a worker RCU thread.  This would replace patches 11/12/13 and
> probably would be implemented after patch 18.
> 
> Paolo
> 
> (*) this idea is what prompted the comment about s/atomic/shared/
> 

  parent reply	other threads:[~2021-01-27  3:40 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-12 18:10 [PATCH 00/24] Allow parallel page faults with TDP MMU Ben Gardon
2021-01-12 18:10 ` [PATCH 01/24] locking/rwlocks: Add contention detection for rwlocks Ben Gardon
2021-01-12 21:20   ` kernel test robot
2021-01-12 18:10 ` [PATCH 02/24] sched: Add needbreak " Ben Gardon
2021-01-12 18:10 ` [PATCH 03/24] sched: Add cond_resched_rwlock Ben Gardon
2021-01-12 18:10 ` [PATCH 04/24] kvm: x86/mmu: change TDP MMU yield function returns to match cond_resched Ben Gardon
2021-01-20 18:38   ` Sean Christopherson
2021-01-21 20:22     ` Paolo Bonzini
2021-01-26 14:11     ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 05/24] kvm: x86/mmu: Fix yielding in TDP MMU Ben Gardon
2021-01-20 19:28   ` Sean Christopherson
2021-01-22  1:06     ` Ben Gardon
2021-01-12 18:10 ` [PATCH 06/24] kvm: x86/mmu: Skip no-op changes in TDP MMU functions Ben Gardon
2021-01-20 19:51   ` Sean Christopherson
2021-01-25 23:51     ` Ben Gardon
2021-01-12 18:10 ` [PATCH 07/24] kvm: x86/mmu: Add comment on __tdp_mmu_set_spte Ben Gardon
2021-01-26 14:13   ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 08/24] kvm: x86/mmu: Add lockdep when setting a TDP MMU SPTE Ben Gardon
2021-01-20 19:58   ` Sean Christopherson
2021-01-26 14:13   ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 09/24] kvm: x86/mmu: Don't redundantly clear TDP MMU pt memory Ben Gardon
2021-01-20 20:06   ` Sean Christopherson
2021-01-26 14:14   ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 10/24] kvm: x86/mmu: Factor out handle disconnected pt Ben Gardon
2021-01-20 20:30   ` Sean Christopherson
2021-01-26 14:14   ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 11/24] kvm: x86/mmu: Put TDP MMU PT walks in RCU read-critical section Ben Gardon
2021-01-20 22:19   ` Sean Christopherson
2021-01-12 18:10 ` [PATCH 12/24] kvm: x86/kvm: RCU dereference tdp mmu page table links Ben Gardon
2021-01-22 18:32   ` Sean Christopherson
2021-01-26 18:17     ` Ben Gardon
2021-01-12 18:10 ` [PATCH 13/24] kvm: x86/mmu: Only free tdp_mmu pages after a grace period Ben Gardon
2021-01-12 18:10 ` [PATCH 14/24] kvm: mmu: Wrap mmu_lock lock / unlock in a function Ben Gardon
2021-01-13  2:35   ` kernel test robot
2021-01-13  2:35     ` kernel test robot
2021-01-12 18:10 ` [PATCH 15/24] kvm: mmu: Wrap mmu_lock cond_resched and needbreak Ben Gardon
2021-01-21  0:19   ` Sean Christopherson
2021-01-21 20:17     ` Paolo Bonzini
2021-01-26 14:38     ` Paolo Bonzini
2021-01-26 17:47       ` Ben Gardon
2021-01-26 17:55         ` Paolo Bonzini
2021-01-26 18:11           ` Ben Gardon
2021-01-26 20:47             ` Paolo Bonzini
2021-01-27 20:08               ` Ben Gardon
2021-01-27 20:55                 ` Paolo Bonzini
2021-01-27 21:20                   ` Ben Gardon
2021-01-28  8:18                     ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 16/24] kvm: mmu: Wrap mmu_lock assertions Ben Gardon
2021-01-26 14:29   ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 17/24] kvm: mmu: Move mmu_lock to struct kvm_arch Ben Gardon
2021-01-12 18:10 ` [PATCH 18/24] kvm: x86/mmu: Use an rwlock for the x86 TDP MMU Ben Gardon
2021-01-21  0:45   ` Sean Christopherson
2021-01-12 18:10 ` [PATCH 19/24] kvm: x86/mmu: Protect tdp_mmu_pages with a lock Ben Gardon
2021-01-21 19:22   ` Sean Christopherson
2021-01-21 21:32     ` Sean Christopherson
2021-01-26 14:27       ` Paolo Bonzini
2021-01-26 21:47         ` Ben Gardon
2021-01-26 22:02         ` Sean Christopherson [this message]
2021-01-26 22:09           ` Sean Christopherson
2021-01-27 12:40           ` Paolo Bonzini
2021-01-26 13:37   ` Paolo Bonzini
2021-01-26 21:07     ` Ben Gardon
2021-01-12 18:10 ` [PATCH 20/24] kvm: x86/mmu: Add atomic option for setting SPTEs Ben Gardon
2021-01-13  0:05   ` kernel test robot
2021-01-13  0:05     ` kernel test robot
2021-01-26 14:21   ` Paolo Bonzini
2021-01-12 18:10 ` [PATCH 21/24] kvm: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map Ben Gardon
2021-01-12 18:10 ` [PATCH 22/24] kvm: x86/mmu: Flush TLBs after zap in TDP MMU PF handler Ben Gardon
2021-01-21  0:05   ` Sean Christopherson
2021-01-12 18:10 ` [PATCH 23/24] kvm: x86/mmu: Freeze SPTEs in disconnected pages Ben Gardon
2021-01-12 18:10 ` [PATCH 24/24] kvm: x86/mmu: Allow parallel page faults for the TDP MMU Ben Gardon
2021-01-21  0:55   ` Sean Christopherson
2021-01-26 21:57     ` Ben Gardon
2021-01-27 17:14       ` Sean Christopherson
2021-01-26 13:37   ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YBCRcalZJwAlkO9F@google.com \
    --to=seanjc@google.com \
    --cc=bgardon@google.com \
    --cc=jmattson@google.com \
    --cc=junaids@google.com \
    --cc=kernellwp@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=pfeiner@google.com \
    --cc=pshier@google.com \
    --cc=vkuznets@redhat.com \
    --cc=xiaoguangrong.eric@gmail.com \
    --cc=yulei.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.