All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Ben Gardon <bgardon@google.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Peter Shier <pshier@google.com>,
	David Matlack <dmatlack@google.com>,
	Mingwei Zhang <mizhang@google.com>,
	Yulei Zhang <yulei.kernel@gmail.com>,
	Wanpeng Li <kernellwp@gmail.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Kai Huang <kai.huang@intel.com>,
	Keqian Zhu <zhukeqian1@huawei.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH 15/15] KVM: x86/mmu: Promote pages in-place when disabling dirty logging
Date: Tue, 30 Nov 2021 15:28:07 +0800	[thread overview]
Message-ID: <YaXSh6RUOH7NHG8G@xz-m1.local> (raw)
In-Reply-To: <CANgfPd9pK83S+yoRokLg7wiroE6-OkieATTqgGn3yCCzwNFi4A@mail.gmail.com>

On Mon, Nov 29, 2021 at 10:31:14AM -0800, Ben Gardon wrote:
> > As comment above handle_removed_tdp_mmu_page() showed, at this point IIUC
> > current thread should have exclusive ownership of this orphaned and abandoned
> > pgtable page, then why in handle_removed_tdp_mmu_page() we still need all the
> > atomic operations and REMOVED_SPTE tricks to protect from concurrent access?
> > Since that's cmpxchg-ed out of the old pgtable, what can be accessing it
> > besides the current thread?
> 
> The cmpxchg does nothing to guarantee that other threads can't have a
> pointer to the page table, only that this thread knows it's the one
> that removed it from the page table. Other threads could still have
> pointers to it in two ways:
> 1. A kernel thread could be in the process of modifying an SPTE in the
> page table, under the MMU lock in read mode. In that case, there's no
> guarantee that there's not another kernel thread with a pointer to the
> SPTE until the end of an RCU grace period.

Right, I definitely missed that whole picture of the RCU usage.  Thanks.

> 2. There could be a pointer to the page table in a vCPU's paging
> structure caches, which are similar to the TLB but cache partial
> translations. These are also cleared out on TLB flush.

Could you elaborate what's the structure cache that you mentioned?  I thought
the processor page walker will just use the data cache (L1-L3) as pgtable
caches, in which case IIUC the invalidation happens when we do WRITE_ONCE()
that'll invalidate all the rest data cache besides the writter core.  But I
could be completely missing something..

> Sean's recent series linked the RCU grace period and TLB flush in a
> clever way so that we can ensure that the end of a grace period
> implies that the necessary flushes have happened already, but we still
> need to clear out the disconnected page table with atomic operations.
> We need to clear it out mostly to collect dirty / accessed bits and
> update page size stats.

Yes, this sounds reasonable too.

-- 
Peter Xu


  parent reply	other threads:[~2021-11-30  7:28 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-15 23:45 [PATCH 00/15] Currently disabling dirty logging with the TDP MMU is extremely slow. On a 96 vCPU / 96G VM it takes ~45 seconds to disable dirty logging with the TDP MMU, as opposed to ~3.5 seconds with the legacy MMU. This series optimizes TLB flushes and introduces in-place large page promotion, to bring the disable dirty log time down to ~2 seconds Ben Gardon
2021-11-15 23:45 ` [PATCH 01/15] KVM: x86/mmu: Remove redundant flushes when disabling dirty logging Ben Gardon
2021-11-18  8:26   ` Paolo Bonzini
2021-11-15 23:45 ` [PATCH 02/15] KVM: x86/mmu: Introduce vcpu_make_spte Ben Gardon
2021-11-15 23:45 ` [PATCH 03/15] KVM: x86/mmu: Factor wrprot for nested PML out of make_spte Ben Gardon
2021-11-15 23:45 ` [PATCH 04/15] KVM: x86/mmu: Factor mt_mask " Ben Gardon
2021-11-15 23:45 ` [PATCH 05/15] KVM: x86/mmu: Remove need for a vcpu from kvm_slot_page_track_is_active Ben Gardon
2021-11-18  8:25   ` Paolo Bonzini
2021-11-15 23:45 ` [PATCH 06/15] KVM: x86/mmu: Remove need for a vcpu from mmu_try_to_unsync_pages Ben Gardon
2021-11-18  8:25   ` Paolo Bonzini
2021-11-15 23:45 ` [PATCH 07/15] KVM: x86/mmu: Factor shadow_zero_check out of make_spte Ben Gardon
2021-11-15 23:45 ` [PATCH 08/15] KVM: x86/mmu: Replace vcpu argument with kvm pointer in make_spte Ben Gardon
2021-11-15 23:45 ` [PATCH 09/15] KVM: x86/mmu: Factor out the meat of reset_tdp_shadow_zero_bits_mask Ben Gardon
2021-11-15 23:45 ` [PATCH 10/15] KVM: x86/mmu: Propagate memslot const qualifier Ben Gardon
2021-11-18  8:27   ` Paolo Bonzini
2021-11-15 23:45 ` [PATCH 11/15] KVM: x86/MMU: Refactor vmx_get_mt_mask Ben Gardon
2021-11-18  8:30   ` Paolo Bonzini
2021-11-18 15:30     ` Sean Christopherson
2021-11-19  9:02       ` Paolo Bonzini
2021-11-22 18:11         ` Ben Gardon
2021-11-22 18:46           ` Sean Christopherson
2021-11-15 23:46 ` [PATCH 12/15] KVM: x86/mmu: Factor out part of vmx_get_mt_mask which does not depend on vcpu Ben Gardon
2021-11-15 23:46 ` [PATCH 13/15] KVM: x86/mmu: Add try_get_mt_mask to x86_ops Ben Gardon
2021-11-15 23:46 ` [PATCH 14/15] KVM: x86/mmu: Make kvm_is_mmio_pfn usable outside of spte.c Ben Gardon
2021-11-15 23:46 ` [PATCH 15/15] KVM: x86/mmu: Promote pages in-place when disabling dirty logging Ben Gardon
2021-11-25  4:18   ` Peter Xu
2021-11-29 18:31     ` Ben Gardon
2021-11-30  0:13       ` Sean Christopherson
2021-11-30  7:28       ` Peter Xu [this message]
2021-11-30 16:01         ` Sean Christopherson
2021-12-01  1:59           ` Peter Xu
2021-11-15 23:58 ` [PATCH 00/15] Currently disabling dirty logging with the TDP MMU is extremely slow. On a 96 vCPU / 96G VM it takes ~45 seconds to disable dirty logging with the TDP MMU, as opposed to ~3.5 seconds with the legacy MMU. This series optimizes TLB flushes and introduces in-place large page promotion, to bring the disable dirty log time down to ~2 seconds Ben Gardon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YaXSh6RUOH7NHG8G@xz-m1.local \
    --to=peterx@redhat.com \
    --cc=bgardon@google.com \
    --cc=david@redhat.com \
    --cc=dmatlack@google.com \
    --cc=kai.huang@intel.com \
    --cc=kernellwp@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mizhang@google.com \
    --cc=pbonzini@redhat.com \
    --cc=pshier@google.com \
    --cc=seanjc@google.com \
    --cc=xiaoguangrong.eric@gmail.com \
    --cc=yulei.kernel@gmail.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.