From: "Nikunj A. Dadhania" <nikunj@amd.com>
To: Mingwei Zhang <mizhang@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Brijesh Singh <brijesh.singh@amd.com>,
Tom Lendacky <thomas.lendacky@amd.com>,
Peter Gonda <pgonda@google.com>, Bharata B Rao <bharata@amd.com>,
"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
David Hildenbrand <david@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v1 5/9] KVM: SVM: Implement demand page pinning
Date: Mon, 21 Mar 2022 14:49:27 +0530 [thread overview]
Message-ID: <22268ddb-5643-f35e-6c34-eb5c2b0ad4cb@amd.com> (raw)
In-Reply-To: <YjgXIyrcDA5+u8d+@google.com>
On 3/21/2022 11:41 AM, Mingwei Zhang wrote:
> On Wed, Mar 09, 2022, Nikunj A. Dadhania wrote:
>> On 3/9/2022 3:23 AM, Mingwei Zhang wrote:
>>> On Tue, Mar 08, 2022, Nikunj A Dadhania wrote:
>>>> Use the memslot metadata to store the pinned data along with the pfns.
>>>> This improves the SEV guest startup time from O(n) to a constant by
>>>> deferring guest page pinning until the pages are used to satisfy
>>>> nested page faults. The page reference will be dropped in the memslot
>>>> free path or deallocation path.
>>>>
>>>> Reuse the enc_region structure definition as pinned_region to maintain
>>>> pages that are pinned outside of MMU demand pinning. Remove the rest of
>>>> the code that did upfront pinning, as it is no longer needed with
>>>> demand pinning support.
>>>
>>> I don't quite understand why we still need the enc_region. I have
>>> several concerns. Details below.
>>
>> With patch 9, the enc_region is used only for memory that was pinned before
>> the vcpu comes online (i.e. the MMU is not yet usable).
>>
>>>>
>>>> Retain svm_register_enc_region() and svm_unregister_enc_region() with
>>>> required checks for resource limit.
>>>>
>>>> Guest boot time comparison
>>>> +---------------+----------------+-------------------+
>>>> | Guest Memory | baseline | Demand Pinning |
>>>> | Size (GB) | (secs) | (secs) |
>>>> +---------------+----------------+-------------------+
>>>> | 4 | 6.16 | 5.71 |
>>>> +---------------+----------------+-------------------+
>>>> | 16 | 7.38 | 5.91 |
>>>> +---------------+----------------+-------------------+
>>>> | 64 | 12.17 | 6.16 |
>>>> +---------------+----------------+-------------------+
>>>> | 128 | 18.20 | 6.50 |
>>>> +---------------+----------------+-------------------+
>>>> | 192 | 24.56 | 6.80 |
>>>> +---------------+----------------+-------------------+
>>>>
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>>> ---
>>>> arch/x86/kvm/svm/sev.c | 304 ++++++++++++++++++++++++++---------------
>>>> arch/x86/kvm/svm/svm.c | 1 +
>>>> arch/x86/kvm/svm/svm.h | 6 +-
>>>> 3 files changed, 200 insertions(+), 111 deletions(-)
>>>>
<SNIP>
>>>> static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>> unsigned long ulen, unsigned long *n,
>>>> int write)
>>>> {
>>>> struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
>>>> + struct pinned_region *region;
>>>> unsigned long npages, size;
>>>> int npinned;
>>>> - unsigned long locked, lock_limit;
>>>> struct page **pages;
>>>> - unsigned long first, last;
>>>> int ret;
>>>>
>>>> lockdep_assert_held(&kvm->lock);
>>>> @@ -395,15 +413,12 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>> if (ulen == 0 || uaddr + ulen < uaddr)
>>>> return ERR_PTR(-EINVAL);
>>>>
>>>> - /* Calculate number of pages. */
>>>> - first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
>>>> - last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
>>>> - npages = (last - first + 1);
>>>> + npages = get_npages(uaddr, ulen);
>>>>
>>>> - locked = sev->pages_locked + npages;
>>>> - lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>>>> - if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
>>>> - pr_err("SEV: %lu locked pages exceed the lock limit of %lu.\n", locked, lock_limit);
>>>> + if (rlimit_memlock_exceeds(sev->pages_to_lock, npages)) {
>>>> + pr_err("SEV: %lu locked pages exceed the lock limit of %lu.\n",
>>>> + sev->pages_to_lock + npages,
>>>> + (rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT));
>>>> return ERR_PTR(-ENOMEM);
>>>> }
>>>>
>>>> @@ -429,7 +444,19 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
>>>> }
>>>>
>>>> *n = npages;
>>>> - sev->pages_locked = locked;
>>>> + sev->pages_to_lock += npages;
>>>> +
>>>> + /* Maintain region list that is pinned to be unpinned in vm destroy path */
>>>> + region = kzalloc(sizeof(*region), GFP_KERNEL_ACCOUNT);
>>>> + if (!region) {
>>>> + ret = -ENOMEM;
>>>> + goto err;
>>>> + }
>>>> + region->uaddr = uaddr;
>>>> + region->size = ulen;
>>>> + region->pages = pages;
>>>> + region->npages = npages;
>>>> + list_add_tail(&region->list, &sev->pinned_regions_list);
>>>
>>> Hmm. I see a duplication of the metadata. We already store the pfns in
>>> memslot. But now we also do it in regions. Is this one used for
>>> migration purpose?
>>
>> We are not duplicating; the enc_region holds regions that are pinned by
>> paths other than svm_register_enc_region(). Later patches add infrastructure
>> to directly fault in those pages, which will use memslot->pfns.
>>
>>>
>>> I might miss some of the context here.
>>
>> More context here:
>> https://lore.kernel.org/kvm/CAMkAt6p1-82LTRNB3pkPRwYh=wGpreUN=jcUeBj_dZt8ss9w0Q@mail.gmail.com/
>
> hmm. I think I might have got the point. However, logically, I still think
> we might not need double data structures for pinning. When the vcpu is not
> online, we could use the array in the memslot to contain the pinned
> pages, right?
Yes.
> Since user-level code is not allowed to pin arbitrary regions of HVA, we
> could check that and bail out early if the region goes out of a memslot.
>
> From that point, the only requirement is that we need a valid memslot
> before doing memory encryption and pinning. So enc_region is still not
> needed from this point.
>
> This should save some time by avoiding double pinning and make the pinning
> information clearer.
Agreed, I think that should be possible:
* Check whether addr/end fall within a memslot
* Error out if the range is not part of any memslot
* Add __sev_pin_pfn(), which does not depend on a vcpu argument
* Iterate over the pages and pin each with the __sev_pin_pfn() routine
	slots = kvm_memslots(kvm);
	kvm_for_each_memslot_in_hva_range(node, slots, addr, end) {
		slot = container_of(node, struct kvm_memory_slot,
				    hva_node[slots->node_idx]);
		slot_start = slot->userspace_addr;
		slot_end = slot_start + (slot->npages << PAGE_SHIFT);
		hva_start = max(addr, slot_start);
		hva_end = min(end, slot_end);
		for (uaddr = hva_start; uaddr < hva_end; uaddr += PAGE_SIZE)
			__sev_pin_pfn(slot, uaddr, PG_LEVEL_4K);
	}
This will make sure the memslot-based data structure is used and the enc_region can be removed.
Regards
Nikunj
Thread overview: 29+ messages
2022-03-08 4:38 [PATCH RFC v1 0/9] KVM: SVM: Defer page pinning for SEV guests Nikunj A Dadhania
2022-03-08 4:38 ` [PATCH RFC v1 1/9] KVM: Introduce pinning flag to hva_to_pfn* Nikunj A Dadhania
2022-03-08 4:38 ` [PATCH RFC v1 2/9] KVM: x86/mmu: Move hugepage adjust to direct_page_fault Nikunj A Dadhania
2022-03-28 21:04 ` Sean Christopherson
2022-03-08 4:38 ` [PATCH RFC v1 3/9] KVM: x86/mmu: Add hook to pin PFNs on demand in MMU Nikunj A Dadhania
2022-03-08 4:38 ` [PATCH RFC v1 4/9] KVM: SVM: Add pinning metadata in the arch memslot Nikunj A Dadhania
2022-03-08 4:38 ` [PATCH RFC v1 5/9] KVM: SVM: Implement demand page pinning Nikunj A Dadhania
2022-03-08 21:53 ` Mingwei Zhang
2022-03-09 5:10 ` Nikunj A. Dadhania
2022-03-21 6:11 ` Mingwei Zhang
2022-03-21 9:19 ` Nikunj A. Dadhania [this message]
2022-03-08 4:38 ` [PATCH RFC v1 6/9] KVM: x86/mmu: Introduce kvm_mmu_map_tdp_page() for use by SEV/TDX Nikunj A Dadhania
2022-03-08 4:38 ` [PATCH RFC v1 7/9] KVM: SEV: Carve out routine for allocation of pages Nikunj A Dadhania
2022-03-08 4:38 ` [PATCH RFC v1 8/9] KVM: Move kvm_for_each_memslot_in_hva_range() to be used in SVM Nikunj A Dadhania
2022-03-08 4:38 ` [PATCH RFC v1 9/9] KVM: SVM: Pin SEV pages in MMU during sev_launch_update_data() Nikunj A Dadhania
2022-03-09 16:57 ` Maciej S. Szmigiero
2022-03-09 17:47 ` Nikunj A. Dadhania
2022-03-28 21:00 ` [PATCH RFC v1 0/9] KVM: SVM: Defer page pinning for SEV guests Sean Christopherson
2022-03-30 4:42 ` Nikunj A. Dadhania
2022-03-30 19:47 ` Sean Christopherson
2022-03-31 4:48 ` Nikunj A. Dadhania
2022-03-31 18:32 ` Peter Gonda
2022-03-31 19:00 ` Sean Christopherson
2022-04-01 3:22 ` Nikunj A. Dadhania
2022-04-01 14:54 ` Sean Christopherson
2022-04-01 15:39 ` Nikunj A. Dadhania
2022-04-01 17:28 ` Marc Orr
2022-04-01 18:02 ` Sean Christopherson
2022-04-01 18:19 ` Marc Orr