From: Sean Christopherson <seanjc@google.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>,
	kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH RFC 3/4] KVM: Define KVM_USER_MEM_SLOTS in arch-neutral include/linux/kvm_host.h
Date: Wed, 20 Jan 2021 09:25:13 -0800	[thread overview]
Message-ID: <YAhnebWjQCOfLtJ0@google.com> (raw)
In-Reply-To: <87czxz6cl9.fsf@vitty.brq.redhat.com>

On Wed, Jan 20, 2021, Vitaly Kuznetsov wrote:
> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Tue, 19 Jan 2021 09:20:42 -0800
> > Sean Christopherson <seanjc@google.com> wrote:
> >
> >> 
> >> Were you planning on adding a capability to check for the new and improved
> >> memslots limit, e.g. to know whether or not KVM might die on a large VM?
> >> If so, requiring the VMM to call an ioctl() to set a higher (or lower?) limit
> >> would be another option.  That wouldn't have the same permission requirements as
> >> a module param, but it would likely be a more effective safeguard in practice,
> >> e.g. use cases with a fixed number of memslots or a well-defined upper bound
> >> could use the capability to limit themselves.
> > Currently QEMU uses KVM_CAP_NR_MEMSLOTS to get the limit, and depending on where
> > the limit is reached it either fails gracefully (i.e. it checks whether a free
> > slot is available before allocating one) or aborts (in the cases where it tries
> > to allocate a slot without checking).
> 
> FWIW, the 'SynIC problem' causes it to abort.
> 
> > A new ioctl() seems redundant as we already have the upper limit check
> > (unless it would allow going over that limit, which in turn defeats the purpose
> > of the limit).

Gah, and I even looked for an ioctl().  No idea how I didn't find NR_MEMSLOTS.

> Right, I didn't plan to add any new CAP as what we already have should
> be enough to query the limits. Having an ioctl() to set the upper limit
> seems complicated: with device and CPU hotplug it may not be easy to
> guess what it should be upfront, so VMMs will likely just add a call to
> raise the limit in their memslot modification code, and then it won't be
> providing any new protection.

True.  Maybe the best approach, if we want to make the limit configurable, would
be to make a lower limit opt-in.  I.e. default=KVM_CAP_NR_MEMSLOTS=SHRT_MAX,
with an ioctl() to set a lower limit.  That would also allow us to defer adding
a new ioctl() until someone actually plans on using it.
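
For reference, a rough sketch of what that could look like from the VMM side.
The KVM_CHECK_EXTENSION(KVM_CAP_NR_MEMSLOTS) query already exists; the
KVM_CAP_USER_MEM_SLOTS_LIMIT name and the KVM_ENABLE_CAP call that lowers the
ceiling are purely hypothetical, only there to show the opt-in shape:

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  int main(void)
  {
          int kvm = open("/dev/kvm", O_RDWR);
          if (kvm < 0)
                  return 1;

          /* Existing query: max number of user memslots (KVM_USER_MEM_SLOTS). */
          int max_slots = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_NR_MEMSLOTS);
          printf("KVM supports up to %d user memslots\n", max_slots);

          int vm = ioctl(kvm, KVM_CREATE_VM, 0);
          if (vm < 0)
                  return 1;

  #ifdef KVM_CAP_USER_MEM_SLOTS_LIMIT /* hypothetical cap, does not exist upstream */
          /* Opt in to a lower, VM-specific ceiling, e.g. for a fixed-layout VMM. */
          struct kvm_enable_cap cap = {
                  .cap = KVM_CAP_USER_MEM_SLOTS_LIMIT,
                  .args = { 20 },
          };
          ioctl(vm, KVM_ENABLE_CAP, &cap);
  #endif
          return 0;
  }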

> >> Thoughts?  An ioctl() feels a little over-engineered, but I suspect that adding
> >> a module param that defaults to N*KVM_MAX_VCPUS will be a waste, e.g. no one
> >> will ever touch the param and we'll end up with dead, rarely-tested code.
> 
> Alternatively, we can hard-code KVM_USER_MEM_SLOTS to N*KVM_MAX_VCPUS so
> that no new parameter is needed, but personally I'd prefer to have it
> configurable (in case we decide to set it to something lower than
> SHRT_MAX, of course) even if it won't be altered very often (which is a
> good thing for 'general purpose' usage, right?).
>
> First, it will allow tightening the limit for some very specific deployments
> (e.g. FaaS / Firecracker-style) to, say, '20', which should be enough.

A module param likely isn't usable for many such deployments though, as it would
require a completely isolated pool of systems.  That's why an ioctl() is
appealing; the expected number of memslots is a property of the VM, not of the
platform.
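
To make the "fails gracefully" path Igor describes concrete, here's a minimal
sketch (not QEMU code) of the check-before-allocate pattern around
KVM_SET_USER_MEMORY_REGION; the helper name and the used_slots/max_slots
bookkeeping are illustrative only:

  #include <errno.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int used_slots;

  static int add_memslot(int vm_fd, int max_slots, uint64_t gpa,
                         uint64_t size, void *hva)
  {
          /* Check against the KVM_CAP_NR_MEMSLOTS limit before allocating. */
          if (used_slots >= max_slots)
                  return -ENOSPC; /* caller can report a clean error instead of aborting */

          struct kvm_userspace_memory_region region = {
                  .slot = used_slots,
                  .guest_phys_addr = gpa,
                  .memory_size = size,
                  .userspace_addr = (uint64_t)(uintptr_t)hva,
          };

          if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region) < 0)
                  return -errno;

          used_slots++;
          return 0;
  }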

> Second, we may be overlooking some configurations where the number of
> memslots actually depends on the number of vCPUs, but nobody has complained
> so far just because these configurations use a fairly small number and the
> ceiling hasn't been hit yet.
> 
> One more spare thought: per-vCPU memslots may come in handy if someone
> decides to move some of the KVM PV features to userspace, e.g. I can
> imagine an attempt to move async_pf out of the kernel.

Memslots aren't per-vCPU though; any userspace feature that tries to treat them
as such will be flawed in some way.  Practically speaking, per-vCPU memslots
just aren't feasible, at least not without KVM changes that are likely
unpalatable, e.g. KVM would need to incorporate the existence of a vCPU-specific
memslot into the MMU role.

Thread overview: 19+ messages
2021-01-15 13:18 [PATCH RFC 0/4] KVM: x86: Drastically raise KVM_USER_MEM_SLOTS limit Vitaly Kuznetsov
2021-01-15 13:18 ` [PATCH RFC 1/4] KVM: x86: Drop redundant KVM_MEM_SLOTS_NUM definition Vitaly Kuznetsov
2021-01-15 16:47   ` Sean Christopherson
2021-01-15 13:18 ` [PATCH RFC 2/4] KVM: mips: Drop KVM_PRIVATE_MEM_SLOTS definition Vitaly Kuznetsov
2021-01-15 13:18 ` [PATCH RFC 3/4] KVM: Define KVM_USER_MEM_SLOTS in arch-neutral include/linux/kvm_host.h Vitaly Kuznetsov
2021-01-15 17:05   ` Sean Christopherson
2021-01-18  9:52     ` Vitaly Kuznetsov
2021-01-19 17:20       ` Sean Christopherson
2021-01-20 11:34         ` Igor Mammedov
2021-01-20 12:02           ` Vitaly Kuznetsov
2021-01-20 17:25             ` Sean Christopherson [this message]
2021-01-15 13:18 ` [PATCH RFC 4/4] KVM: x86: Stop limiting KVM_USER_MEM_SLOTS Vitaly Kuznetsov
2021-01-15 16:03 ` [PATCH RFC 0/4] KVM: x86: Drastically raise KVM_USER_MEM_SLOTS limit Sean Christopherson
2021-01-15 16:23   ` Vitaly Kuznetsov
2021-01-15 16:45     ` Sean Christopherson
2021-01-15 18:47 ` Maciej S. Szmigiero
2021-01-27 17:55   ` Vitaly Kuznetsov
2021-01-27 22:26     ` Maciej S. Szmigiero
2021-01-28  8:47       ` Vitaly Kuznetsov
