From: "Nikunj A. Dadhania" <nikunj@amd.com>
To: Mingwei Zhang <mizhang@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Brijesh Singh <brijesh.singh@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Peter Gonda <pgonda@google.com>, Bharata B Rao <bharata@amd.com>,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	David Hildenbrand <david@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v1 5/9] KVM: SVM: Implement demand page pinning
Date: Mon, 21 Mar 2022 14:49:27 +0530
Message-ID: <22268ddb-5643-f35e-6c34-eb5c2b0ad4cb@amd.com>
In-Reply-To: <YjgXIyrcDA5+u8d+@google.com>

On 3/21/2022 11:41 AM, Mingwei Zhang wrote:
> On Wed, Mar 09, 2022, Nikunj A. Dadhania wrote:
>> On 3/9/2022 3:23 AM, Mingwei Zhang wrote:
>>> On Tue, Mar 08, 2022, Nikunj A Dadhania wrote:
>>>> Use the memslot metadata to store the pinned data along with the pfns.
>>>> This improves the SEV guest startup time from O(n) to a constant by
>>>> deferring guest page pinning until the pages are used to satisfy
>>>> nested page faults. The page reference will be dropped in the memslot
>>>> free path or deallocation path.
>>>>
>>>> Reuse enc_region structure definition as pinned_region to maintain
>>>> pages that are pinned outside of MMU demand pinning. Remove the rest of
>>>> the code that did upfront pinning, as it is no longer needed in
>>>> view of the demand pinning support.
>>>
>>> I don't quite understand why we still need the enc_region. I have
>>> several concerns. Details below.
>>
>> With patch 9 the enc_region is used only for memory that was pinned before
>> the vcpu is online (i.e. the MMU is not yet usable).
>>
>>>>
>>>> Retain svm_register_enc_region() and svm_unregister_enc_region() with
>>>> required checks for resource limit.
>>>>
>>>> Guest boot time comparison
>>>>   +---------------+----------------+-------------------+
>>>>   | Guest Memory  |   baseline     |  Demand Pinning   |
>>>>   | Size (GB)     |    (secs)      |     (secs)        |
>>>>   +---------------+----------------+-------------------+
>>>>   |      4        |     6.16       |      5.71         |
>>>>   +---------------+----------------+-------------------+
>>>>   |     16        |     7.38       |      5.91         |
>>>>   +---------------+----------------+-------------------+
>>>>   |     64        |    12.17       |      6.16         |
>>>>   +---------------+----------------+-------------------+
>>>>   |    128        |    18.20       |      6.50         |
>>>>   +---------------+----------------+-------------------+
>>>>   |    192        |    24.56       |      6.80         |
>>>>   +---------------+----------------+-------------------+
>>>>
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>>> ---
>>>>  arch/x86/kvm/svm/sev.c | 304 ++++++++++++++++++++++++++---------------
>>>>  arch/x86/kvm/svm/svm.c |   1 +
>>>>  arch/x86/kvm/svm/svm.h |   6 +-
>>>>  3 files changed, 200 insertions(+), 111 deletions(-)
>>>>

<SNIP>

>>>>  static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>>  				    unsigned long ulen, unsigned long *n,
>>>>  				    int write)
>>>>  {
>>>>  	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
>>>> +	struct pinned_region *region;
>>>>  	unsigned long npages, size;
>>>>  	int npinned;
>>>> -	unsigned long locked, lock_limit;
>>>>  	struct page **pages;
>>>> -	unsigned long first, last;
>>>>  	int ret;
>>>>  
>>>>  	lockdep_assert_held(&kvm->lock);
>>>> @@ -395,15 +413,12 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>>  	if (ulen == 0 || uaddr + ulen < uaddr)
>>>>  		return ERR_PTR(-EINVAL);
>>>>  
>>>> -	/* Calculate number of pages. */
>>>> -	first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
>>>> -	last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
>>>> -	npages = (last - first + 1);
>>>> +	npages = get_npages(uaddr, ulen);
>>>>  
>>>> -	locked = sev->pages_locked + npages;
>>>> -	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>>>> -	if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
>>>> -		pr_err("SEV: %lu locked pages exceed the lock limit of %lu.\n", locked, lock_limit);
>>>> +	if (rlimit_memlock_exceeds(sev->pages_to_lock, npages)) {
>>>> +		pr_err("SEV: %lu locked pages exceed the lock limit of %lu.\n",
>>>> +			sev->pages_to_lock + npages,
>>>> +			(rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT));
>>>>  		return ERR_PTR(-ENOMEM);
>>>>  	}
>>>>  
>>>> @@ -429,7 +444,19 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>>  	}
>>>>  
>>>>  	*n = npages;
>>>> -	sev->pages_locked = locked;
>>>> +	sev->pages_to_lock += npages;
>>>> +
>>>> +	/* Maintain region list that is pinned to be unpinned in vm destroy path */
>>>> +	region = kzalloc(sizeof(*region), GFP_KERNEL_ACCOUNT);
>>>> +	if (!region) {
>>>> +		ret = -ENOMEM;
>>>> +		goto err;
>>>> +	}
>>>> +	region->uaddr = uaddr;
>>>> +	region->size = ulen;
>>>> +	region->pages = pages;
>>>> +	region->npages = npages;
>>>> +	list_add_tail(&region->list, &sev->pinned_regions_list);
>>>
>>> Hmm. I see a duplication of the metadata. We already store the pfns in
>>> memslot. But now we also do it in regions. Is this one used for
>>> migration purpose?
>>
>> We are not duplicating: the enc_region list holds regions that are pinned by
>> paths other than svm_register_enc_region(). Later patches add infrastructure
>> to directly fault in those pages, which will use memslot->pfns.
>>
>>>
>>> I might miss some of the context here. 
>>
>> More context here:
>> https://lore.kernel.org/kvm/CAMkAt6p1-82LTRNB3pkPRwYh=wGpreUN=jcUeBj_dZt8ss9w0Q@mail.gmail.com/
> 
> hmm. I think I might have got the point. However, logically, I still think we
> might not need double data structures for pinning. When the vcpu is not
> online, we could use the array in the memslot to contain the pinned
> pages, right?

Yes.

> Since user-level code is not allowed to pin arbitrary regions of HVA, we
> could check that and bail out early if the region goes out of a memslot.
> 
> From that point, the only requirement is that we need a valid memslot
> before doing memory encryption and pinning. So enc_region is still not
> needed from this point.
> 
> This should save some time by avoiding double pinning and make the pinning
> information clearer.

Agreed, I think that should be possible:

* Check for addr/end being part of a memslot.
* Error out in case it is not part of any memslot.
* Add a __sev_pin_pfn() helper that does not depend on the vcpu arg
  (rough sketch further below).
* Iterate over the pages and use the __sev_pin_pfn() routine to pin:
	slots = kvm_memslots(kvm);
	kvm_for_each_memslot_in_hva_range(node, slots, addr, end) {
		slot = container_of(node, struct kvm_memory_slot,
				    hva_node[slots->node_idx]);
		slot_start = slot->userspace_addr;
		slot_end = slot_start + (slot->npages << PAGE_SHIFT);
		hva_start = max(addr, slot_start);
		hva_end = min(end, slot_end);
		for (uaddr = hva_start; uaddr < hva_end; uaddr += PAGE_SIZE) {
			__sev_pin_pfn(slot, uaddr, PG_LEVEL_4K);
		}
	}

This will make sure the memslot-based data structure is used and enc_region can be removed.
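
For illustration only, a rough sketch of what such a __sev_pin_pfn() helper
could look like. The pin_user_pages_fast() call, the slot->arch.pfns field
name and the omission of the RLIMIT_MEMLOCK accounting are assumptions made
for this sketch, not necessarily what the series will end up with:

	/*
	 * Sketch: pin the single 4K page backing @uaddr and record its pfn in
	 * the per-memslot pinning metadata. Only PG_LEVEL_4K is handled here;
	 * unpin/error handling and locked-pages accounting are omitted.
	 */
	static int __sev_pin_pfn(struct kvm_memory_slot *slot,
				 unsigned long uaddr, int level)
	{
		struct page *page;
		unsigned long idx;

		if (WARN_ON_ONCE(level != PG_LEVEL_4K))
			return -EINVAL;

		/* Pin with write access so the guest can modify the page. */
		if (pin_user_pages_fast(uaddr & PAGE_MASK, 1, FOLL_WRITE, &page) != 1)
			return -ENOMEM;

		/* Index of the page within this memslot. */
		idx = (uaddr - slot->userspace_addr) >> PAGE_SHIFT;

		/* Placeholder for the metadata added in the arch memslot patch. */
		slot->arch.pfns[idx] = page_to_pfn(page);

		return 0;
	}

The real helper would plug into whatever pinning metadata patch 4 adds to the
arch memslot and keep the existing sev->pages_to_lock accounting.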

Regards
Nikunj