From: Sean Christopherson <seanjc@google.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>,
kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH RFC 3/4] KVM: Define KVM_USER_MEM_SLOTS in arch-neutral include/linux/kvm_host.h
Date: Wed, 20 Jan 2021 09:25:13 -0800 [thread overview]
Message-ID: <YAhnebWjQCOfLtJ0@google.com> (raw)
In-Reply-To: <87czxz6cl9.fsf@vitty.brq.redhat.com>
On Wed, Jan 20, 2021, Vitaly Kuznetsov wrote:
> Igor Mammedov <imammedo@redhat.com> writes:
>
> > On Tue, 19 Jan 2021 09:20:42 -0800
> > Sean Christopherson <seanjc@google.com> wrote:
> >
> >>
> >> Were you planning on adding a capability to check for the new and improved
> >> memslots limit, e.g. to know whether or not KVM might die on a large VM?
> >> If so, requiring the VMM to call an ioctl() to set a higher (or lower?) limit
> >> would be another option. That wouldn't have the same permission requirements as
> >> a module param, but it would likely be a more effective safeguard in practice,
> >> e.g. use cases with a fixed number of memslots or a well-defined upper bound
> >> could use the capability to limit themselves.
> > Currently QEMU uses KVM_CAP_NR_MEMSLOTS to get limit, and depending on place the
> > limit is reached it either fails gracefully (i.e. it checks if free slot is
> > available before slot allocation) or aborts (in case where it tries to allocate
> > slot without check).
>
> FWIW, 'synic problem' causes it to abort.
>
> > New ioctl() seems redundant as we already have upper limit check
> > (unless it would allow go over that limit, which in its turn defeats purpose of
> > the limit).
Gah, and I even looked for an ioctl(). No idea how I didn't find NR_MEMSLOTS.
> Right, I didn't plan to add any new CAP as what we already have should
> be enough to query the limits. Having an ioctl to set the upper limit
> seems complicated: with device and CPU hotplug it may not be easy to
> guess what it should be upfront so VMMs will likely just add a call to
> raise the limit in memslot modification code so it won't be providing
> any new protection.
True. Maybe the best approach, if we want to make the limit configurable, would
be to make a lower limit opt-in. I.e. default=KVM_CAP_NR_MEMSLOTS=SHRT_MAX,
with an ioctl() to set a lower limit. That would also allow us to defer adding
a new ioctl() until someone actually plans on using it.
> >> Thoughts? An ioctl() feels a little over-engineered, but I suspect that adding
> >> a module param that defaults to N*KVM_MAX_VPCUS will be a waste, e.g. no one
> >> will ever touch the param and we'll end up with dead, rarely-tested code.
>
> Alternatively, we can hard-code KVM_USER_MEM_SLOTS to N*KVM_MAX_VPCUS so
> no new parameter is needed but personally, I'd prefer to have it
> configurable (in case we decide to set it to something lower than
> SHRT_MAX of course) even if it won't be altered very often (which is a
> good thing for 'general purpose' usage, right?).
>
> First, it will allow tightening the limit for some very specific deployments
> (e.g. FaaS/ Firecracker-style) to say '20' which should be enough.
A module param likely isn't usable for many such deployments though, as it would
require a completely isolated pool of systems. That's why an ioctl() is
appealing; the expected number of memslots is a property of the VM, not of the
platform.
> Second, we may be overlooking some configurations where the number of
> memslots is actually dependent on the number of vCPUs but nobody complained
> so far just because these configutrarions use a farly small number and the
> ceiling wasn't hit yet.
>
> One more spare thought. Per-vCPU memslots may come handy if someone
> decides to move some of the KVM PV features to userspace. E.g. I can
> imagine an attempt to move async_pf out of kernel.
Memslots aren't per-vCPU though, any userspace feature that tries to treat them
as such will be flawed in some way. Practically speaking, per-vCPU memslots
just aren't feasible, at least not without KVM changes that are likely
unpalatable, e.g. KVM would need to incorporate the existence of a vCPU specific
memslot into the MMU role.
next prev parent reply other threads:[~2021-01-20 17:34 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-15 13:18 [PATCH RFC 0/4] KVM: x86: Drastically raise KVM_USER_MEM_SLOTS limit Vitaly Kuznetsov
2021-01-15 13:18 ` [PATCH RFC 1/4] KVM: x86: Drop redundant KVM_MEM_SLOTS_NUM definition Vitaly Kuznetsov
2021-01-15 16:47 ` Sean Christopherson
2021-01-15 13:18 ` [PATCH RFC 2/4] KVM: mips: Drop KVM_PRIVATE_MEM_SLOTS definition Vitaly Kuznetsov
2021-01-15 13:18 ` [PATCH RFC 3/4] KVM: Define KVM_USER_MEM_SLOTS in arch-neutral include/linux/kvm_host.h Vitaly Kuznetsov
2021-01-15 17:05 ` Sean Christopherson
2021-01-18 9:52 ` Vitaly Kuznetsov
2021-01-19 17:20 ` Sean Christopherson
2021-01-20 11:34 ` Igor Mammedov
2021-01-20 12:02 ` Vitaly Kuznetsov
2021-01-20 17:25 ` Sean Christopherson [this message]
2021-01-15 13:18 ` [PATCH RFC 4/4] KVM: x86: Stop limiting KVM_USER_MEM_SLOTS Vitaly Kuznetsov
2021-01-15 16:03 ` [PATCH RFC 0/4] KVM: x86: Drastically raise KVM_USER_MEM_SLOTS limit Sean Christopherson
2021-01-15 16:23 ` Vitaly Kuznetsov
2021-01-15 16:45 ` Sean Christopherson
2021-01-15 18:47 ` Maciej S. Szmigiero
2021-01-27 17:55 ` Vitaly Kuznetsov
2021-01-27 22:26 ` Maciej S. Szmigiero
2021-01-28 8:47 ` Vitaly Kuznetsov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YAhnebWjQCOfLtJ0@google.com \
--to=seanjc@google.com \
--cc=imammedo@redhat.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).