From: Elena Afanasova <eafanasova@gmail.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: kvm@vger.kernel.org, jag.raman@oracle.com, elena.ufimtseva@oracle.com
Subject: Re: [RFC v2 3/4] KVM: add support for ioregionfd cmds/replies serialization
Date: Wed, 03 Feb 2021 06:10:25 -0800
Message-ID: <dc35fdd3eb2febbe49cfd6561da6faf045f12ee3.camel@gmail.com>
In-Reply-To: <20210130185415.GD98016@stefanha-x1.localdomain>
On Sat, 2021-01-30 at 18:54 +0000, Stefan Hajnoczi wrote:
> On Thu, Jan 28, 2021 at 09:32:22PM +0300, Elena Afanasova wrote:
> > Add ioregionfd context and kvm_io_device_ops->prepare/finish()
> > in order to serialize all bytes requested by guest.
> >
> > Signed-off-by: Elena Afanasova <eafanasova@gmail.com>
> > ---
> > arch/x86/kvm/x86.c | 19 ++++++++
> > include/kvm/iodev.h | 14 ++++++
> > include/linux/kvm_host.h | 4 ++
> > virt/kvm/ioregion.c | 102 +++++++++++++++++++++++++++++++++------
> > virt/kvm/kvm_main.c | 32 ++++++++++++
> > 5 files changed, 157 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index a04516b531da..393fb0f4bf46 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5802,6 +5802,8 @@ static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,
> > int ret = 0;
> > bool is_apic;
> >
> > + kvm_io_bus_prepare(vcpu, KVM_MMIO_BUS, addr, len);
> > +
> > do {
> > n = min(len, 8);
> > is_apic = lapic_in_kernel(vcpu) &&
> > @@ -5823,8 +5825,10 @@ static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,
> > if (ret == -EINTR) {
> > vcpu->run->exit_reason = KVM_EXIT_INTR;
> > ++vcpu->stat.signal_exits;
> > + return handled;
> > }
> > #endif
> > + kvm_io_bus_finish(vcpu, KVM_MMIO_BUS, addr, len);
>
> Hmm...it would be nice for kvm_io_bus_prepare() to return the idx or the
> device pointer so the devices don't need to be searched in
> read/write/finish. However, it's complicated by the loop which may
> access multiple devices.
>
Agreed.
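
For illustration, a minimal userspace sketch of the "return the idx from prepare" idea. It only covers the simple case where one access hits exactly one device; the multi-device loop mentioned above is not handled. Every mock_* name is hypothetical and is not part of this patch or of the KVM API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm_io_device / kvm_io_bus; not kernel code. */
struct mock_dev {
	uint64_t addr;   /* first guest address covered by the device */
	uint64_t len;    /* number of bytes covered */
	int prepared;    /* prepare() calls not yet matched by finish() */
};

struct mock_bus {
	struct mock_dev *devs;
	size_t ndevs;
};

/* Linear search for the device that fully contains [addr, addr + len). */
static int mock_bus_find(const struct mock_bus *bus, uint64_t addr, uint64_t len)
{
	for (size_t i = 0; i < bus->ndevs; i++) {
		const struct mock_dev *d = &bus->devs[i];

		if (addr >= d->addr && len <= d->len &&
		    addr - d->addr <= d->len - len)
			return (int)i;
	}
	return -1;
}

/* Search once, mark the device as prepared, and return its index. */
static int mock_bus_prepare(struct mock_bus *bus, uint64_t addr, uint64_t len)
{
	int idx = mock_bus_find(bus, addr, len);

	if (idx >= 0)
		bus->devs[idx].prepared++;
	return idx;
}

/* finish reuses the cached index instead of searching the bus again. */
static void mock_bus_finish(struct mock_bus *bus, int idx)
{
	if (idx < 0)
		return;
	assert((size_t)idx < bus->ndevs && bus->devs[idx].prepared > 0);
	bus->devs[idx].prepared--;
}
```

A caller would keep the returned index for the duration of the access and pass it to mock_bus_finish(); a negative index means no device matched and nothing needs to be finished.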
> > @@ -9309,6 +9325,7 @@ static int complete_ioregion_mmio(struct kvm_vcpu *vcpu)
> > vcpu->mmio_cur_fragment++;
> > }
> >
> > + vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> > vcpu->mmio_needed = 0;
> > if (!vcpu->ioregion_ctx.in) {
> > srcu_read_unlock(&vcpu->kvm->srcu, idx);
> > @@ -9333,6 +9350,7 @@ static int complete_ioregion_pio(struct kvm_vcpu *vcpu)
> > vcpu->ioregion_ctx.val += vcpu->ioregion_ctx.len;
> > }
> >
> > + vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> > if (vcpu->ioregion_ctx.in)
> > r = kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
> > srcu_read_unlock(&vcpu->kvm->srcu, idx);
> > @@ -9352,6 +9370,7 @@ static int complete_ioregion_fast_pio(struct kvm_vcpu *vcpu)
> > complete_ioregion_access(vcpu, vcpu->ioregion_ctx.addr,
> > vcpu->ioregion_ctx.len,
> > vcpu->ioregion_ctx.val);
> > + vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> > srcu_read_unlock(&vcpu->kvm->srcu, idx);
> >
> > if (vcpu->ioregion_ctx.in) {
>
> Normally userspace will invoke ioctl(KVM_RUN) and reach one of these
> completion functions, but what if the vcpu fd is closed instead?
> ->finish() should still be called to avoid leaks.
>
Will fix
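
One possible shape for that fix, sketched in userspace C: route both the completion path and the vcpu release path through a single idempotent helper, so ->finish() runs exactly once whichever path is taken. All mock_* names are hypothetical; this is an assumption about how the fix could look, not the code that will be posted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ioregionfd device. */
struct mock_dev {
	int busy;   /* 1 between prepare() and finish() */
};

/* Hypothetical stand-in for the per-vcpu ioregion context. */
struct mock_vcpu {
	struct mock_dev *pending;   /* device with an unfinished access, or NULL */
};

/* Start an access: the device stays busy until it is finished. */
static void mock_begin_access(struct mock_vcpu *vcpu, struct mock_dev *dev)
{
	assert(!vcpu->pending && !dev->busy);
	dev->busy = 1;
	vcpu->pending = dev;
}

/* Idempotent: finishes the pending access, if any, exactly once. */
static void mock_finish_pending(struct mock_vcpu *vcpu)
{
	if (!vcpu->pending)
		return;
	vcpu->pending->busy = 0;
	vcpu->pending = NULL;
}

/* Normal path: userspace re-enters the run loop and the access completes. */
static void mock_complete_access(struct mock_vcpu *vcpu)
{
	mock_finish_pending(vcpu);
}

/* Teardown path: the vcpu fd is closed while an access is still pending. */
static void mock_vcpu_release(struct mock_vcpu *vcpu)
{
	mock_finish_pending(vcpu);
}
```

Clearing the pending pointer inside the helper is what makes a release after a normal completion harmless.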
> > diff --git a/include/kvm/iodev.h b/include/kvm/iodev.h
> > index d75fc4365746..db8a3c69b7bb 100644
> > --- a/include/kvm/iodev.h
> > +++ b/include/kvm/iodev.h
> > @@ -25,6 +25,8 @@ struct kvm_io_device_ops {
> > gpa_t addr,
> > int len,
> > const void *val);
> > + void (*prepare)(struct kvm_io_device *this);
> > + void (*finish)(struct kvm_io_device *this);
> > void (*destructor)(struct kvm_io_device *this);
> > };
> >
> > @@ -55,6 +57,18 @@ static inline int kvm_iodevice_write(struct kvm_vcpu *vcpu,
> > : -EOPNOTSUPP;
> > }
> >
> > +static inline void kvm_iodevice_prepare(struct kvm_io_device *dev)
> > +{
> > + if (dev->ops->prepare)
> > + dev->ops->prepare(dev);
> > +}
> > +
> > +static inline void kvm_iodevice_finish(struct kvm_io_device *dev)
> > +{
> > + if (dev->ops->finish)
> > + dev->ops->finish(dev);
> > +}
>
> A performance optimization: keep a separate list of struct
> kvm_io_devices that implement prepare/finish. That way the search
> doesn't need to iterate over devices that don't support this
> interface.
>
Thanks for the idea
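
A rough userspace sketch of the separate-list idea, assuming the bus keeps a second array holding only the devices whose ops provide prepare or finish. The mock_* types and functions are hypothetical, address matching is left out for brevity, and none of this is existing KVM code.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_DEVS 16

struct mock_dev;

/* Hypothetical subset of kvm_io_device_ops. */
struct mock_ops {
	void (*prepare)(struct mock_dev *dev);   /* optional, may be NULL */
	void (*finish)(struct mock_dev *dev);    /* optional, may be NULL */
};

struct mock_dev {
	const struct mock_ops *ops;
	int prepare_calls;
};

struct mock_bus {
	struct mock_dev *all[MOCK_MAX_DEVS];          /* every registered device */
	size_t nall;
	struct mock_dev *serialized[MOCK_MAX_DEVS];   /* only prepare/finish users */
	size_t nserialized;
};

/* Register a device; track it separately if it implements prepare/finish. */
static int mock_bus_register(struct mock_bus *bus, struct mock_dev *dev)
{
	if (bus->nall == MOCK_MAX_DEVS)
		return -1;
	bus->all[bus->nall++] = dev;
	if (dev->ops->prepare || dev->ops->finish)
		bus->serialized[bus->nserialized++] = dev;
	return 0;
}

/* Walk only the short list; returns how many devices were visited. */
static size_t mock_bus_prepare_all(struct mock_bus *bus)
{
	for (size_t i = 0; i < bus->nserialized; i++) {
		struct mock_dev *dev = bus->serialized[i];

		if (dev->ops->prepare)
			dev->ops->prepare(dev);
	}
	return bus->nserialized;
}

/* Example prepare callback that just counts invocations. */
static void mock_count_prepare(struct mock_dev *dev)
{
	dev->prepare_calls++;
}
```

With many ioeventfd-style devices and few ioregionfd devices, the walk touches only the latter, which is the case the suggestion targets.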
> Before implementing an optimization like this it would be good to check
> how this patch affects performance on guests with many in-kernel devices
> (e.g. a guest that has many multi-queue virtio-net/blk devices with
> ioeventfd). ioregionfd shouldn't reduce performance of existing KVM
> configurations, so it's worth measuring.
>
> > diff --git a/virt/kvm/ioregion.c b/virt/kvm/ioregion.c
> > index da38124e1418..3474090ccc8c 100644
> > --- a/virt/kvm/ioregion.c
> > +++ b/virt/kvm/ioregion.c
> > @@ -1,6 +1,6 @@
> > // SPDX-License-Identifier: GPL-2.0-only
> > #include <linux/kvm_host.h>
> > -#include <linux/fs.h>
> > +#include <linux/wait.h>
> > #include <kvm/iodev.h>
> > #include "eventfd.h"
> > #include <uapi/linux/ioregion.h>
> > @@ -12,15 +12,23 @@ kvm_ioregionfd_init(struct kvm *kvm)
> > INIT_LIST_HEAD(&kvm->ioregions_pio);
> > }
> >
> > +/* Serializes ioregionfd cmds/replies */
>
> Please expand on this comment:
>
> ioregions that share the same rfd are serialized so that only one vCPU
> thread sends a struct ioregionfd_cmd to userspace at a time. This
> ensures that the struct ioregionfd_resp received from userspace will
> be processed by the one and only vCPU thread that sent it.
>
> A waitqueue is used to wake up waiting vCPU threads in order. Most of
> the time the waitqueue is unused and the lock is not contended.
> For best performance userspace should set up ioregionfds so that there
> is no contention (e.g. dedicated ioregionfds for queue doorbell
> registers on multi-queue devices).
>
> A comment along these lines will give readers an idea of why the code
> does this.
Ok, thank you
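
To make the ordering property concrete, here is a small single-threaded model of "one sender at a time, waiters served in arrival order". It uses a ticket counter in place of the lock and waitqueue, so it is a different technique from the one in the patch and only shows the invariant. The mock_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-rfd serialization state (ticket model, no real locking). */
struct mock_ctx {
	unsigned int next_ticket;   /* handed to the next arriving vCPU */
	unsigned int now_serving;   /* ticket currently allowed to send a cmd */
};

/* A vCPU that wants to send a cmd takes a ticket on arrival. */
static unsigned int mock_ctx_enqueue(struct mock_ctx *ctx)
{
	return ctx->next_ticket++;
}

/* Only the holder of the ticket being served may send and read the reply. */
static bool mock_ctx_may_send(const struct mock_ctx *ctx, unsigned int ticket)
{
	return ticket == ctx->now_serving;
}

/* After consuming its reply the owner lets the next waiter proceed. */
static void mock_ctx_finish(struct mock_ctx *ctx, unsigned int ticket)
{
	assert(ticket == ctx->now_serving);
	ctx->now_serving++;
}
```

In kernel code the "may send" check would be the wait condition, and finishing would wake the next waiter; with a dedicated rfd per doorbell, the first ticket is always served immediately.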