From: Peter Xu <peterx@redhat.com>
To: Ben Gardon <bgardon@google.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Peter Feiner <pfeiner@google.com>,
	Peter Shier <pshier@google.com>,
	Junaid Shahid <junaids@google.com>,
	Jim Mattson <jmattson@google.com>
Subject: Re: [RFC PATCH 16/28] kvm: mmu: Add direct MMU page fault handler
Date: Wed, 8 Jan 2020 12:20:11 -0500
Message-ID: <20200108172011.GB7096@xz-x1>
In-Reply-To: <20190926231824.149014-17-bgardon@google.com>

On Thu, Sep 26, 2019 at 04:18:12PM -0700, Ben Gardon wrote:

[...]

> +static int handle_direct_page_fault(struct kvm_vcpu *vcpu,
> +		unsigned long mmu_seq, int write, int map_writable, int level,
> +		gpa_t gpa, gfn_t gfn, kvm_pfn_t pfn, bool prefault)
> +{
> +	struct direct_walk_iterator iter;
> +	struct kvm_mmu_memory_cache *pf_pt_cache = &vcpu->arch.mmu_page_cache;
> +	u64 *child_pt;
> +	u64 new_pte;
> +	int ret = RET_PF_RETRY;
> +
> +	direct_walk_iterator_setup_walk(&iter, vcpu->kvm,
> +			kvm_arch_vcpu_memslots_id(vcpu), gpa >> PAGE_SHIFT,
> +			(gpa >> PAGE_SHIFT) + 1, MMU_READ_LOCK);
> +	while (direct_walk_iterator_next_pte(&iter)) {
> +		if (iter.level == level) {
> +			ret = direct_page_fault_handle_target_level(vcpu,
> +					write, map_writable, &iter, pfn,
> +					prefault);
> +
> +			break;
> +		} else if (!is_present_direct_pte(iter.old_pte) ||
> +			   is_large_pte(iter.old_pte)) {
> +			/*
> +			 * The leaf PTE for this fault must be mapped at a
> +			 * lower level, so a non-leaf PTE must be inserted into
> +			 * the paging structure. If the assignment below
> +			 * succeeds, it will add the non-leaf PTE and a new
> +			 * page of page table memory. Then the iterator can
> +			 * traverse into that new page. If the atomic compare/
> +			 * exchange fails, the iterator will repeat the current
> +			 * PTE, so the only thing this function must do
> +			 * differently is return the page table memory to the
> +			 * vCPU's fault cache.
> +			 */
> +			child_pt = mmu_memory_cache_alloc(pf_pt_cache);
> +			new_pte = generate_nonleaf_pte(child_pt, false);
> +
> +			if (!direct_walk_iterator_set_pte(&iter, new_pte))
> +				mmu_memory_cache_return(pf_pt_cache, child_pt);
> +		}
> +	}

I have a question on how this will guarantee safe concurrency...

As you mentioned somewhere previously, the design mimics how the core
mm works with process page tables, and IIUC the rwlock here works much
like the mmap_sem that we have for the process mm.  So with this series
we can have multiple page faults reaching this point concurrently, each
holding mmu_lock for read.

Then I'm imagining a case where two vcpu threads fault on the same
address range but want to do different things: (1) the vcpu1 thread
wants to map it as a 2M huge page, while (2) the vcpu2 thread wants to
map it as a 4K page.  Is it possible that vcpu2 is faster, so it first
sets up the pmd as a page table page (via
direct_walk_iterator_set_pte above), and then vcpu1 quickly overwrites
it as a huge page (via direct_page_fault_handle_target_level,
level=2)?  In that case I feel like the page table page that vcpu2
set up could be lost unnoticed.

I think the general process page table does not have this issue
because it has a per-pmd lock, so anyone who changes the pmd or
anything beneath it needs to take that lock.  Here we don't have one;
we depend only on the atomic ops, which seems not to be enough for
this?
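
To make the interleaving concrete, here is a minimal userspace sketch
of the scenario.  It is not the patch's code: the PTE bit layout,
set_pte_cmpxchg(), and replay_race() are hypothetical names made up for
illustration, and it assumes that both vcpus read the pmd while it was
still empty.  It replays the race sequentially and compares an
unconditional store with a compare/exchange against the value seen
during the walk.  Whether direct_page_fault_handle_target_level
actually goes through such a compare/exchange is not visible in the
hunk quoted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE encodings, for illustration only. */
#define PTE_PRESENT (1ull << 0)
#define PTE_LARGE   (1ull << 7)

static uint64_t make_nonleaf(uint64_t child_pa)
{
	return child_pa | PTE_PRESENT;
}

static uint64_t make_huge(uint64_t page_pa)
{
	return page_pa | PTE_PRESENT | PTE_LARGE;
}

/* Install new_pte only if the entry still holds the value seen in the walk. */
static bool set_pte_cmpxchg(_Atomic uint64_t *ptep, uint64_t old_pte,
			    uint64_t new_pte)
{
	return atomic_compare_exchange_strong(ptep, &old_pte, new_pte);
}

/*
 * Replay the race in program order: both vcpus read an empty pmd, vcpu2
 * installs a page table page first, then vcpu1 installs a 2M mapping.
 * Returns true if vcpu1's install took effect; *final_pte is the pmd
 * value left behind.
 */
static bool replay_race(bool vcpu1_uses_cmpxchg, uint64_t *final_pte)
{
	_Atomic uint64_t pmd;
	uint64_t seen_by_vcpu1, seen_by_vcpu2;
	bool vcpu1_ok;

	atomic_init(&pmd, 0);
	seen_by_vcpu1 = atomic_load(&pmd);
	seen_by_vcpu2 = atomic_load(&pmd);

	/* vcpu2 is faster: the pmd is still empty, so this must succeed. */
	bool vcpu2_ok = set_pte_cmpxchg(&pmd, seen_by_vcpu2,
					make_nonleaf(0x1000));
	assert(vcpu2_ok);

	if (vcpu1_uses_cmpxchg) {
		/* Stale expected value: the exchange fails, pmd is kept. */
		vcpu1_ok = set_pte_cmpxchg(&pmd, seen_by_vcpu1,
					   make_huge(0x200000));
	} else {
		/* Blind store: vcpu2's page table page is silently dropped. */
		atomic_store(&pmd, make_huge(0x200000));
		vcpu1_ok = true;
	}

	*final_pte = atomic_load(&pmd);
	return vcpu1_ok;
}
```

Calling replay_race(false, &pte) leaves the huge PTE in place and the
page table page unreferenced, which is the loss I am worried about.
replay_race(true, &pte) fails on vcpu1's side and keeps vcpu2's entry,
so vcpu1 would have to retry.  The sketch says nothing about the case
where vcpu1 reads the pmd after vcpu2's update and then replaces a
present non-leaf entry on purpose.
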

Thanks,

> +	direct_walk_iterator_end_traversal(&iter);
> +
> +	/* If emulating, flush this vcpu's TLB. */
> +	if (ret == RET_PF_EMULATE)
> +		kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
> +
> +	return ret;
> +}

-- 
Peter Xu

