From: "Nikunj A. Dadhania" <nikunj@amd.com>
To: Mingwei Zhang <mizhang@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Brijesh Singh <brijesh.singh@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Peter Gonda <pgonda@google.com>, Bharata B Rao <bharata@amd.com>,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	David Hildenbrand <david@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v1 5/9] KVM: SVM: Implement demand page pinning
Date: Mon, 21 Mar 2022 14:49:27 +0530
Message-ID: <22268ddb-5643-f35e-6c34-eb5c2b0ad4cb@amd.com>
In-Reply-To: <YjgXIyrcDA5+u8d+@google.com>

On 3/21/2022 11:41 AM, Mingwei Zhang wrote:
> On Wed, Mar 09, 2022, Nikunj A. Dadhania wrote:
>> On 3/9/2022 3:23 AM, Mingwei Zhang wrote:
>>> On Tue, Mar 08, 2022, Nikunj A Dadhania wrote:
>>>> Use the memslot metadata to store the pinned data along with the pfns.
>>>> This improves the SEV guest startup time from O(n) to a constant by
>>>> deferring guest page pinning until the pages are used to satisfy
>>>> nested page faults. The page reference will be dropped in the memslot
>>>> free path or deallocation path.
>>>>
>>>> Reuse enc_region structure definition as pinned_region to maintain
>>>> pages that are pinned outside of MMU demand pinning. Remove rest of
>>>> the code which did upfront pinning, as they are no longer needed in
>>>> view of the demand pinning support.
>>>
>>> I don't quite understand why we still need the enc_region. I have
>>> several concerns. Details below.
>>
>> With patch 9, the enc_region is used only for memory that was pinned before
>> the vcpu is online (i.e. the MMU is not yet usable).
>>
>>>>
>>>> Retain svm_register_enc_region() and svm_unregister_enc_region() with
>>>> required checks for resource limit.
>>>>
>>>> Guest boot time comparison
>>>>   +---------------+----------------+-------------------+
>>>>   | Guest Memory  |   baseline     |  Demand Pinning   |
>>>>   | Size (GB)     |    (secs)      |     (secs)        |
>>>>   +---------------+----------------+-------------------+
>>>>   |      4        |     6.16       |      5.71         |
>>>>   +---------------+----------------+-------------------+
>>>>   |     16        |     7.38       |      5.91         |
>>>>   +---------------+----------------+-------------------+
>>>>   |     64        |    12.17       |      6.16         |
>>>>   +---------------+----------------+-------------------+
>>>>   |    128        |    18.20       |      6.50         |
>>>>   +---------------+----------------+-------------------+
>>>>   |    192        |    24.56       |      6.80         |
>>>>   +---------------+----------------+-------------------+
>>>>
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>>> ---
>>>>  arch/x86/kvm/svm/sev.c | 304 ++++++++++++++++++++++++++---------------
>>>>  arch/x86/kvm/svm/svm.c |   1 +
>>>>  arch/x86/kvm/svm/svm.h |   6 +-
>>>>  3 files changed, 200 insertions(+), 111 deletions(-)
>>>>

<SNIP>

>>>>  static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>>  				    unsigned long ulen, unsigned long *n,
>>>>  				    int write)
>>>>  {
>>>>  	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
>>>> +	struct pinned_region *region;
>>>>  	unsigned long npages, size;
>>>>  	int npinned;
>>>> -	unsigned long locked, lock_limit;
>>>>  	struct page **pages;
>>>> -	unsigned long first, last;
>>>>  	int ret;
>>>>  
>>>>  	lockdep_assert_held(&kvm->lock);
>>>> @@ -395,15 +413,12 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>>  	if (ulen == 0 || uaddr + ulen < uaddr)
>>>>  		return ERR_PTR(-EINVAL);
>>>>  
>>>> -	/* Calculate number of pages. */
>>>> -	first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
>>>> -	last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
>>>> -	npages = (last - first + 1);
>>>> +	npages = get_npages(uaddr, ulen);
>>>>  
>>>> -	locked = sev->pages_locked + npages;
>>>> -	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>>>> -	if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
>>>> -		pr_err("SEV: %lu locked pages exceed the lock limit of %lu.\n", locked, lock_limit);
>>>> +	if (rlimit_memlock_exceeds(sev->pages_to_lock, npages)) {
>>>> +		pr_err("SEV: %lu locked pages exceed the lock limit of %lu.\n",
>>>> +			sev->pages_to_lock + npages,
>>>> +			(rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT));
>>>>  		return ERR_PTR(-ENOMEM);
>>>>  	}
>>>>  
>>>> @@ -429,7 +444,19 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>>  	}
>>>>  
>>>>  	*n = npages;
>>>> -	sev->pages_locked = locked;
>>>> +	sev->pages_to_lock += npages;
>>>> +
>>>> +	/* Maintain region list that is pinned to be unpinned in vm destroy path */
>>>> +	region = kzalloc(sizeof(*region), GFP_KERNEL_ACCOUNT);
>>>> +	if (!region) {
>>>> +		ret = -ENOMEM;
>>>> +		goto err;
>>>> +	}
>>>> +	region->uaddr = uaddr;
>>>> +	region->size = ulen;
>>>> +	region->pages = pages;
>>>> +	region->npages = npages;
>>>> +	list_add_tail(&region->list, &sev->pinned_regions_list);
>>>
>>> Hmm. I see a duplication of the metadata. We already store the pfns in
>>> memslot. But now we also do it in regions. Is this one used for
>>> migration purpose?
>>
>> We are not duplicating: the enc_region holds regions that are pinned by
>> paths other than svm_register_enc_region(). Later patches add infrastructure
>> to directly fault in those pages, which will use memslot->pfns.
>>
>>>
>>> I might miss some of the context here. 
>>
>> More context here:
>> https://lore.kernel.org/kvm/CAMkAt6p1-82LTRNB3pkPRwYh=wGpreUN=jcUeBj_dZt8ss9w0Q@mail.gmail.com/
> 
> Hmm. I think I might have got the point. However, logically, I still think
> we might not need double data structures for pinning. When the vcpu is not
> online, we could use the array in the memslot to contain the pinned
> pages, right?

Yes.

> Since user-level code is not allowed to pin arbitrary regions of HVA, we
> could check that and bail out early if the region goes out of a memslot.
> 
> From that point, the only requirement is that we need a valid memslot
> before doing memory encryption and pinning. So enc_region is still not
> needed from this point.
> 
> This should save some time to avoid double pinning and make the pinning
> information clear.

Agreed, I think that should be possible:

* Check that addr/end fall within a memslot.
* Error out if the range is not part of any memslot.
* Add __sev_pin_pfn(), which does not depend on the vcpu arg.
* Iterate over the pages and use the __sev_pin_pfn() routine to pin them:

	slots = kvm_memslots(kvm);
	kvm_for_each_memslot_in_hva_range(node, slots, addr, end) {
		slot = container_of(node, struct kvm_memory_slot,
			    hva_node[slots->node_idx]);
		/* Clamp the requested HVA range to this memslot. */
		slot_start = slot->userspace_addr;
		slot_end = slot_start + (slot->npages << PAGE_SHIFT);
		hva_start = max(addr, slot_start);
		hva_end = min(end, slot_end);
		/* Pin one 4K page at a time within the clamped range. */
		for (uaddr = hva_start; uaddr < hva_end; uaddr += PAGE_SIZE) {
			__sev_pin_pfn(slot, uaddr, PG_LEVEL_4K);
		}
	}

This will make sure the memslot-based data structure is used, and enc_region
can then be removed.
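
As a rough illustration only, the steps above can be modelled in plain
user-space C. This is a sketch under stated assumptions, not the KVM code:
struct model_slot, model_pin_page() and model_pin_range() are hypothetical
stand-ins for struct kvm_memory_slot, __sev_pin_pfn() and the pinning path.
The slots are assumed to be sorted by start address and non-overlapping, and
addr/end are assumed to be page aligned. The model first checks that the whole
range is backed by slots, then clamps the range to each slot and "pins" page
by page (here it only counts pages).

```c
#include <errno.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical stand-in for struct kvm_memory_slot. */
struct model_slot {
	unsigned long userspace_addr;	/* page-aligned start HVA */
	unsigned long npages;		/* slot length in pages */
	unsigned long pinned;		/* pages pinned so far (model only) */
};

/* Hypothetical stand-in for __sev_pin_pfn(): only counts the page. */
static void model_pin_page(struct model_slot *slot, unsigned long uaddr)
{
	(void)uaddr;
	slot->pinned++;
}

/*
 * Pin the HVA range [addr, end). Assumes slots[] is sorted by
 * userspace_addr and non-overlapping, and that addr/end are page aligned.
 * Returns 0 on success, or -EINVAL if the range is empty, unaligned, or
 * not fully covered by slots. Nothing is pinned on failure.
 */
static int model_pin_range(struct model_slot *slots, size_t nslots,
			   unsigned long addr, unsigned long end)
{
	unsigned long next = addr;	/* lowest HVA not yet known to be covered */
	unsigned long slot_start, slot_end, hva_start, hva_end, uaddr;
	size_t i;

	if (addr >= end || ((addr | end) & (PAGE_SIZE - 1)))
		return -EINVAL;

	/* Steps 1 and 2: bail out early if any part lies outside a slot. */
	for (i = 0; i < nslots && next < end; i++) {
		slot_start = slots[i].userspace_addr;
		slot_end = slot_start + (slots[i].npages << PAGE_SHIFT);
		if (slot_end <= next)
			continue;	/* slot lies entirely below the range */
		if (slot_start > next)
			return -EINVAL;	/* hole before this slot */
		next = slot_end;
	}
	if (next < end)
		return -EINVAL;

	/* Steps 3 and 4: clamp the range to each slot and pin per page. */
	for (i = 0; i < nslots; i++) {
		slot_start = slots[i].userspace_addr;
		slot_end = slot_start + (slots[i].npages << PAGE_SHIFT);
		hva_start = addr > slot_start ? addr : slot_start;
		hva_end = end < slot_end ? end : slot_end;
		for (uaddr = hva_start; uaddr < hva_end; uaddr += PAGE_SIZE)
			model_pin_page(&slots[i], uaddr);
	}
	return 0;
}
```

Doing the coverage check as a separate pass keeps the failure path free of
any unpinning; whether the real implementation would check up front or unwind
on error is left open here.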

Regards
Nikunj
  


