From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: virtio-dev@lists.oasis-open.org,
	Zha Bin <zhabin@linux.alibaba.com>,
	Jing Liu <jing2.liu@linux.intel.com>,
	Chao Peng <chao.p.peng@linux.intel.com>,
	cohuck@redhat.com, Jan Kiszka <jan.kiszka@siemens.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [virtio-dev] On doorbells (queue notifications)
Date: Thu, 16 Jul 2020 11:00:51 +0100	[thread overview]
Message-ID: <20200716100051.GC85868@stefanha-x1.localdomain> (raw)
In-Reply-To: <871rlcybni.fsf@linaro.org>

On Wed, Jul 15, 2020 at 05:40:33PM +0100, Alex Bennée wrote:
> 
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > On Wed, Jul 15, 2020 at 02:29:04PM +0100, Alex Bennée wrote:
> >> Stefan Hajnoczi <stefanha@redhat.com> writes:
> >> > On Tue, Jul 14, 2020 at 10:43:36PM +0100, Alex Bennée wrote:
> >> >> Finally I'm curious if this is just a problem avoided by the s390
> >> >> channel approach? Does the use of messages over a channel just avoid the
> >> >> sort of bouncing back and forth that other hypervisors have to do when
> >> >> emulating a device?
> >> >
> >> > What does "bouncing back and forth" mean exactly?
> >> 
> >> Context switching between guest and hypervisor.
> >
> > I have CCed Cornelia Huck, who can explain the lifecycle of an I/O
> > request on s390 channel I/O.
> 
> Thanks.
> 
> I was also wondering about the efficiency of doorbells/notifications the
> other way. AFAIUI for both PCI and MMIO only a single write to the
> notify flag is required, which causes a trap to the hypervisor where
> the rest of the processing happens. The hypervisor doesn't incur the
> cost of multiple exits to read the guest state, although it obviously
> wants to be as efficient as possible in passing the data up to
> whatever is handling the backend of the device so it doesn't need to
> do multiple context switches.
> 
> Has there been any investigation into other mechanisms for notifying the
> hypervisor of an event - for example using a HYP call or similar
> mechanism?
> 
> My gut tells me this probably doesn't make any difference as a trap to
> the hypervisor is likely to cost the same either way because you still
> need to save the guest context before actioning something but it would
> be interesting to know if anyone has looked at it. Perhaps there is a
> benefit in partitioned systems where the core running the guest can
> return straight away after initiating whatever it needs to internally
> in the hypervisor to pass the notification to something that can deal
> with it?

It's very architecture-specific. This is something Michael Tsirkin
looked into in the past. He found that MMIO and PIO perform differently
on x86. VIRTIO supports both, so the device can be configured optimally.
There was an old discussion from 2013 here:
https://lkml.org/lkml/2013/4/4/299

Without nested page tables MMIO was slower than PIO. But with nested
page tables it was faster.

Another option on x86 is using Model-Specific Registers (for hypercalls)
but this doesn't fit into the PCI device model.

A bigger issue than vmexit latency is device emulation thread wakeup
latency. There is a thread (QEMU, vhost-user, vhost, etc) monitoring the
ioeventfd but it may be descheduled. Its physical CPU may be in a low
power state. I ran a benchmark late last year with QEMU's AioContext
adaptive polling disabled so we could measure the wakeup latency:

       CPU 0/KVM 26102 [000] 85626.737072:       kvm:kvm_fast_mmio:
fast mmio at gpa 0xfde03000
    IO iothread1 26099 [001] 85626.737076: syscalls:sys_exit_ppoll: 0x1
                   4 microseconds ------^

(I did not manually configure physical CPU power states or use the
idle=poll host kernel parameter.)

Each virtqueue kick had 4 microseconds of latency before the device
emulation thread had a chance to process the virtqueue. This means the
maximum I/O Operations Per Second (IOPS) is capped at 250k before
virtqueue processing has even begun!

QEMU AioContext adaptive polling helps here because we skip the vmexit
entirely while the IOThread is polling the vring (for up to 32
microseconds by default).

It would be great if more people dug into this and optimized
notifications further.

Stefan


