From: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>,
Igor Mammedov <imammedo@redhat.com>,
Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>,
Julien Thierry <julien.thierry.kdev@gmail.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Paul Mackerras <paulus@ozlabs.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
David Hildenbrand <david@redhat.com>,
Cornelia Huck <cohuck@redhat.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Joerg Roedel <joro@8bytes.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/8] KVM: x86: Cache total page count to avoid traversing the memslot array
Date: Fri, 21 May 2021 09:03:23 +0200
Message-ID: <e3769513-d2d8-e3fb-7887-3c8872b0f00c@maciej.szmigiero.name>
In-Reply-To: <YKV8hHDS489g9JBS@google.com>
On 19.05.2021 23:00, Sean Christopherson wrote:
> On Sun, May 16, 2021, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> There is no point in recalculating from scratch the total number of pages
>> in all memslots each time a memslot is created or deleted.
>>
>> Just cache the value and update it accordingly on each such operation so
>> the code doesn't need to traverse the whole memslot array each time.
>>
>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>> ---
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 5bd550eaf683..8c7738b75393 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -11112,9 +11112,21 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>>  				const struct kvm_memory_slot *new,
>>  				enum kvm_mr_change change)
>>  {
>> -	if (!kvm->arch.n_requested_mmu_pages)
>> -		kvm_mmu_change_mmu_pages(kvm,
>> -				kvm_mmu_calculate_default_mmu_pages(kvm));
>> +	if (change == KVM_MR_CREATE)
>> +		kvm->arch.n_memslots_pages += new->npages;
>> +	else if (change == KVM_MR_DELETE) {
>> +		WARN_ON(kvm->arch.n_memslots_pages < old->npages);
>
> Heh, so I think this WARN can be triggered at will by userspace on 32-bit KVM by
> causing the running count to wrap. KVM artificially caps the size of a single
> memslot at ((1UL << 31) - 1), but userspace could create multiple gigantic slots
> to overflow arch.n_memslots_pages.
>
> I _think_ changing it to a u64 would fix the problem since KVM forbids overlapping
> memslots in the GPA space.
You are right, n_memslots_pages needs to be a u64 so that it does not
overflow on 32-bit KVM.

The memslot count is limited to 32k in each of the 2 address spaces, so in
the worst case the variable has to hold a 15 bits + 1 bit + 31 bits = 47-bit
number.
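The bit count above can be checked with a small user-space sketch. The
SKETCH_* constants and helper names below are hypothetical; they only encode
the limits quoted in this mail (32k memslots per address space, 2 address
spaces, at most (1UL << 31) - 1 pages per memslot) and are not taken from
kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative limits taken from the figures in this mail, not from
 * kernel headers: 32k memslots per address space, 2 address spaces,
 * and at most (2^31 - 1) pages in a single memslot.
 */
#define SKETCH_SLOTS_PER_AS	(UINT64_C(1) << 15)
#define SKETCH_ADDRESS_SPACES	UINT64_C(2)
#define SKETCH_MAX_SLOT_NPAGES	((UINT64_C(1) << 31) - 1)

/* Largest value the running page count could reach under those limits. */
static uint64_t sketch_worst_case_pages(void)
{
	return SKETCH_SLOTS_PER_AS * SKETCH_ADDRESS_SPACES *
	       SKETCH_MAX_SLOT_NPAGES;
}

/* Number of bits needed to represent v (0 for v == 0). */
static unsigned int sketch_bits_needed(uint64_t v)
{
	unsigned int bits = 0;

	while (v) {
		bits++;
		v >>= 1;
	}
	return bits;
}

/* Returns 1 if nslots maximum-size slots no longer fit a 32-bit counter. */
static int sketch_wraps_u32(unsigned int nslots)
{
	uint64_t total = (uint64_t)nslots * SKETCH_MAX_SLOT_NPAGES;

	return total > UINT32_MAX;
}

/* Self-check of the arithmetic above. */
static void sketch_check(void)
{
	assert(sketch_bits_needed(sketch_worst_case_pages()) == 47);
	assert(!sketch_wraps_u32(2));	/* 2^32 - 2 still fits */
	assert(sketch_wraps_u32(3));	/* a third slot wraps a u32 */
}
```

Under these assumed limits, three maximum-size memslots are already enough to
wrap a 32-bit counter, while the 47-bit worst case fits comfortably in a u64.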
> Also, what about moving the check-and-WARN to prepare_memory_region() so that
> KVM can error out if the check fails? Doesn't really matter, but an explicit
> error for userspace is preferable to underflowing the number of pages and getting
> weird MMU errors/behavior down the line.
In principle this is possible. However, it is the more regression-prone
option, in case something has (perhaps unintentionally) come to rely on the
kvm_mmu_zap_oldest_mmu_pages() call from kvm_mmu_change_mmu_pages() being
made only in the memslot commit function.
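For reference, the check-in-prepare alternative could look roughly like the
user-space sketch below. All sketch_* names and types are hypothetical
stand-ins, not the kernel's, and the sketch deliberately does not model where
kvm_mmu_zap_oldest_mmu_pages() ends up being called, which is the actual
regression concern here.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; only what is used here. */
enum sketch_mr_change { SKETCH_MR_CREATE, SKETCH_MR_DELETE, SKETCH_MR_OTHER };

struct sketch_vm {
	uint64_t n_memslots_pages;	/* cached total across all memslots */
};

/*
 * Prepare step: validate before anything is modified, so that a request
 * which would underflow the count can still be refused with an error.
 */
static int sketch_prepare(const struct sketch_vm *vm,
			  enum sketch_mr_change change, uint64_t old_npages)
{
	if (change == SKETCH_MR_DELETE && vm->n_memslots_pages < old_npages)
		return -EINVAL;
	return 0;
}

/* Commit step: cannot fail, it only updates the cached count. */
static void sketch_commit(struct sketch_vm *vm, enum sketch_mr_change change,
			  uint64_t old_npages, uint64_t new_npages)
{
	if (change == SKETCH_MR_CREATE)
		vm->n_memslots_pages += new_npages;
	else if (change == SKETCH_MR_DELETE)
		vm->n_memslots_pages -= old_npages;
}
```

A caller would run sketch_prepare() first and only call sketch_commit() when
it returned 0, so the subtraction in the commit step can never underflow.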
>> +		kvm->arch.n_memslots_pages -= old->npages;
>> +	}
>> +
>> +	if (!kvm->arch.n_requested_mmu_pages) {
>
> If we're going to bother caching the number of pages then we should also skip
> the update when the number of pages isn't changing, e.g.
>
> 	if (change == KVM_MR_CREATE || change == KVM_MR_DELETE) {
> 		if (change == KVM_MR_CREATE)
> 			kvm->arch.n_memslots_pages += new->npages;
> 		else
> 			kvm->arch.n_memslots_pages -= old->npages;
>
> 		if (!kvm->arch.n_requested_mmu_pages) {
> 			unsigned long nr_mmu_pages;
>
> 			nr_mmu_pages = kvm->arch.n_memslots_pages *
> 				       KVM_PERMILLE_MMU_PAGES / 1000;
> 			nr_mmu_pages = max(nr_mmu_pages, KVM_MIN_ALLOC_MMU_PAGES);
> 			kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
> 		}
> 	}
The old code did it that way (unconditionally) and, as in the case above,
I didn't want to risk a regression.

If we are going to change this behavior, I think it should happen in a
separate patch.
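To make the behavior being preserved concrete, here is a minimal user-space
model of the patch as posted: the cached count changes only on create and
delete, but the MMU page limit is recomputed on every commit. All sketch_*
names are hypothetical, and the ratio (20 per mille) and floor (64) are
assumed values chosen for illustration.

```c
#include <stdint.h>

#define SKETCH_PERMILLE_MMU_PAGES	20	/* assumed ratio */
#define SKETCH_MIN_ALLOC_MMU_PAGES	64	/* assumed floor */

enum sketch_change {
	SKETCH_CREATE, SKETCH_DELETE, SKETCH_MOVE, SKETCH_FLAGS_ONLY
};

struct sketch_kvm {
	uint64_t n_memslots_pages;	/* cached total page count */
	uint64_t n_requested_mmu_pages;	/* nonzero: limit set by userspace */
	uint64_t n_max_mmu_pages;	/* current MMU page limit */
	unsigned int nr_limit_updates;	/* call counter, illustration only */
};

/* Stand-in for the limit update; the real function may also zap pages. */
static void sketch_change_mmu_pages(struct sketch_kvm *kvm, uint64_t goal)
{
	kvm->n_max_mmu_pages = goal;
	kvm->nr_limit_updates++;
}

static void sketch_commit_memory_region(struct sketch_kvm *kvm,
					enum sketch_change change,
					uint64_t old_npages,
					uint64_t new_npages)
{
	/* Only create and delete change the total page count. */
	if (change == SKETCH_CREATE)
		kvm->n_memslots_pages += new_npages;
	else if (change == SKETCH_DELETE)
		kvm->n_memslots_pages -= old_npages;

	/* Recomputed for every change type, as the old code did. */
	if (!kvm->n_requested_mmu_pages) {
		uint64_t nr = kvm->n_memslots_pages *
			      SKETCH_PERMILLE_MMU_PAGES / 1000;

		if (nr < SKETCH_MIN_ALLOC_MMU_PAGES)
			nr = SKETCH_MIN_ALLOC_MMU_PAGES;
		sketch_change_mmu_pages(kvm, nr);
	}
}
```

In this model a flags-only change still calls sketch_change_mmu_pages() with
an unchanged goal. Skipping that call is the optimization suggested above,
which is why it would be a visible behavior change and fits better in its own
patch.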
Thanks,
Maciej