kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Elena Afanasova <eafanasova@gmail.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: kvm@vger.kernel.org, jag.raman@oracle.com, elena.ufimtseva@oracle.com
Subject: Re: [RFC v2 3/4] KVM: add support for ioregionfd cmds/replies serialization
Date: Wed, 03 Feb 2021 06:10:25 -0800	[thread overview]
Message-ID: <dc35fdd3eb2febbe49cfd6561da6faf045f12ee3.camel@gmail.com> (raw)
In-Reply-To: <20210130185415.GD98016@stefanha-x1.localdomain>

On Sat, 2021-01-30 at 18:54 +0000, Stefan Hajnoczi wrote:
> On Thu, Jan 28, 2021 at 09:32:22PM +0300, Elena Afanasova wrote:
> > Add ioregionfd context and kvm_io_device_ops->prepare/finish()
> > in order to serialize all bytes requested by guest.
> > 
> > Signed-off-by: Elena Afanasova <eafanasova@gmail.com>
> > ---
> >  arch/x86/kvm/x86.c       |  19 ++++++++
> >  include/kvm/iodev.h      |  14 ++++++
> >  include/linux/kvm_host.h |   4 ++
> >  virt/kvm/ioregion.c      | 102 +++++++++++++++++++++++++++++++++
> > ------
> >  virt/kvm/kvm_main.c      |  32 ++++++++++++
> >  5 files changed, 157 insertions(+), 14 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index a04516b531da..393fb0f4bf46 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5802,6 +5802,8 @@ static int vcpu_mmio_write(struct kvm_vcpu
> > *vcpu, gpa_t addr, int len,
> >  	int ret = 0;
> >  	bool is_apic;
> >  
> > +	kvm_io_bus_prepare(vcpu, KVM_MMIO_BUS, addr, len);
> > +
> >  	do {
> >  		n = min(len, 8);
> >  		is_apic = lapic_in_kernel(vcpu) &&
> > @@ -5823,8 +5825,10 @@ static int vcpu_mmio_write(struct kvm_vcpu
> > *vcpu, gpa_t addr, int len,
> >  	if (ret == -EINTR) {
> >  		vcpu->run->exit_reason = KVM_EXIT_INTR;
> >  		++vcpu->stat.signal_exits;
> > +		return handled;
> >  	}
> >  #endif
> > +	kvm_io_bus_finish(vcpu, KVM_MMIO_BUS, addr, len);
> 
> Hmm...it would be nice for kvm_io_bus_prepare() to return the idx or
> the
> device pointer so the devices don't need to be searched in
> read/write/finish. However, it's complicated by the loop which may
> access multiple devices.
> 
Agree

> > @@ -9309,6 +9325,7 @@ static int complete_ioregion_mmio(struct
> > kvm_vcpu *vcpu)
> >  		vcpu->mmio_cur_fragment++;
> >  	}
> >  
> > +	vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> >  	vcpu->mmio_needed = 0;
> >  	if (!vcpu->ioregion_ctx.in) {
> >  		srcu_read_unlock(&vcpu->kvm->srcu, idx);
> > @@ -9333,6 +9350,7 @@ static int complete_ioregion_pio(struct
> > kvm_vcpu *vcpu)
> >  		vcpu->ioregion_ctx.val += vcpu->ioregion_ctx.len;
> >  	}
> >  
> > +	vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> >  	if (vcpu->ioregion_ctx.in)
> >  		r = kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
> >  	srcu_read_unlock(&vcpu->kvm->srcu, idx);
> > @@ -9352,6 +9370,7 @@ static int complete_ioregion_fast_pio(struct
> > kvm_vcpu *vcpu)
> >  	complete_ioregion_access(vcpu, vcpu->ioregion_ctx.addr,
> >  				 vcpu->ioregion_ctx.len,
> >  				 vcpu->ioregion_ctx.val);
> > +	vcpu->ioregion_ctx.dev->ops->finish(vcpu->ioregion_ctx.dev);
> >  	srcu_read_unlock(&vcpu->kvm->srcu, idx);
> >  
> >  	if (vcpu->ioregion_ctx.in) {
> 
> Normally userspace will invoke ioctl(KVM_RUN) and reach one of these
> completion functions, but what if the vcpu fd is closed instead?
> ->finish() should still be called to avoid leaks.
> 
Will fix

> > diff --git a/include/kvm/iodev.h b/include/kvm/iodev.h
> > index d75fc4365746..db8a3c69b7bb 100644
> > --- a/include/kvm/iodev.h
> > +++ b/include/kvm/iodev.h
> > @@ -25,6 +25,8 @@ struct kvm_io_device_ops {
> >  		     gpa_t addr,
> >  		     int len,
> >  		     const void *val);
> > +	void (*prepare)(struct kvm_io_device *this);
> > +	void (*finish)(struct kvm_io_device *this);
> >  	void (*destructor)(struct kvm_io_device *this);
> >  };
> >  
> > @@ -55,6 +57,18 @@ static inline int kvm_iodevice_write(struct
> > kvm_vcpu *vcpu,
> >  				 : -EOPNOTSUPP;
> >  }
> >  
> > +static inline void kvm_iodevice_prepare(struct kvm_io_device *dev)
> > +{
> > +	if (dev->ops->prepare)
> > +		dev->ops->prepare(dev);
> > +}
> > +
> > +static inline void kvm_iodevice_finish(struct kvm_io_device *dev)
> > +{
> > +	if (dev->ops->finish)
> > +		dev->ops->finish(dev);
> > +}
> 
> A performance optimization: keep a separate list of struct
> kvm_io_devices that implement prepare/finish. That way the search
> doesn't need to iterate over devices that don't support this
> interface.
> 
Thanks for the idea

> Before implementing an optimization like this it would be good to
> check
> how this patch affects performance on guests with many in-kernel
> devices
> (e.g. a guest that has many multi-queue virtio-net/blk devices with
> ioeventfd). ioregionfd shouldn't reduce performance of existing KVM
> configurations, so it's worth measuring.
> 
> > diff --git a/virt/kvm/ioregion.c b/virt/kvm/ioregion.c
> > index da38124e1418..3474090ccc8c 100644
> > --- a/virt/kvm/ioregion.c
> > +++ b/virt/kvm/ioregion.c
> > @@ -1,6 +1,6 @@
> >  // SPDX-License-Identifier: GPL-2.0-only
> >  #include <linux/kvm_host.h>
> > -#include <linux/fs.h>
> > +#include <linux/wait.h>
> >  #include <kvm/iodev.h>
> >  #include "eventfd.h"
> >  #include <uapi/linux/ioregion.h>
> > @@ -12,15 +12,23 @@ kvm_ioregionfd_init(struct kvm *kvm)
> >  	INIT_LIST_HEAD(&kvm->ioregions_pio);
> >  }
> >  
> > +/* Serializes ioregionfd cmds/replies */
> 
> Please expand on this comment:
> 
>   ioregions that share the same rfd are serialized so that only one
> vCPU
>   thread sends a struct ioregionfd_cmd to userspace at a time. This
>   ensures that the struct ioregionfd_resp received from userspace
> will
>   be processed by the one and only vCPU thread that sent it.
> 
>   A waitqueue is used to wake up waiting vCPU threads in order. Most
> of
>   the time the waitqueue is unused and the lock is not contended.
>   For best performance userspace should set up ioregionfds so that
> there
>   is no contention (e.g. dedicated ioregionfds for queue doorbell
>   registers on multi-queue devices).
> 
> A comment along these lines will give readers an idea of why the code
> does this.

Ok, thank you


  reply	other threads:[~2021-02-03 14:11 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-28 18:32 [RFC v2 0/4] Introduce MMIO/PIO dispatch file descriptors (ioregionfd) Elena Afanasova
2021-01-28 18:32 ` [RFC v2 2/4] KVM: x86: add support for ioregionfd signal handling Elena Afanasova
2021-01-30 16:58   ` Stefan Hajnoczi
2021-02-03 14:00     ` Elena Afanasova
2021-02-09  6:21   ` Jason Wang
2021-02-09 14:49     ` Stefan Hajnoczi
2021-02-10 19:06     ` Elena Afanasova
2021-02-09  6:26   ` Jason Wang
2021-01-28 18:32 ` [RFC v2 3/4] KVM: add support for ioregionfd cmds/replies serialization Elena Afanasova
2021-01-30 18:54   ` Stefan Hajnoczi
2021-02-03 14:10     ` Elena Afanasova [this message]
2021-01-28 18:32 ` [RFC v2 4/4] KVM: enforce NR_IOBUS_DEVS limit if kmemcg is disabled Elena Afanasova
2021-01-29 18:48 ` [RESEND RFC v2 1/4] KVM: add initial support for KVM_SET_IOREGION Elena Afanasova
2021-01-30 15:04   ` Stefan Hajnoczi
2021-02-04 13:03   ` Cornelia Huck
2021-02-05 18:39     ` Elena Afanasova
2021-02-08 11:49       ` Cornelia Huck
2021-02-08  6:21   ` Jason Wang
2021-02-09 14:59     ` Stefan Hajnoczi
2021-02-18  6:17       ` Jason Wang
2021-02-10 19:31     ` Elena Afanasova
2021-02-11 14:59       ` Stefan Hajnoczi
2021-02-17 23:05         ` Elena Afanasova
2021-02-18  6:22         ` Jason Wang
2021-02-18  6:20       ` Jason Wang
2021-01-30 14:56 ` [RFC v2 0/4] Introduce MMIO/PIO dispatch file descriptors (ioregionfd) Stefan Hajnoczi
2021-02-02 14:59 ` Stefan Hajnoczi
2021-02-08  6:02 ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dc35fdd3eb2febbe49cfd6561da6faf045f12ee3.camel@gmail.com \
    --to=eafanasova@gmail.com \
    --cc=elena.ufimtseva@oracle.com \
    --cc=jag.raman@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).