linux-kernel.vger.kernel.org archive mirror
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Jason Wang <jasowang@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Ming Lei <ming.lei@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	virtualization@lists.linux-foundation.org,
	Linux PM <linux-pm@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [RFC 0/3] cpuidle: add poll_source API and virtio vq polling
Date: Mon, 26 Jul 2021 18:37:13 +0200	[thread overview]
Message-ID: <CAJZ5v0i4+2xda4Z6=JwRQf4ZzM2_agiyCwhMDRzAC-yz39fGzg@mail.gmail.com> (raw)
In-Reply-To: <YP7cTjrfipfsJe9O@stefanha-x1.localdomain>

On Mon, Jul 26, 2021 at 6:04 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Mon, Jul 26, 2021 at 05:47:19PM +0200, Rafael J. Wysocki wrote:
> > On Mon, Jul 26, 2021 at 5:17 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > >
> > > On Thu, Jul 22, 2021 at 05:04:57PM +0800, Jason Wang wrote:
> > > >
> > > > On 2021/7/21 5:41 PM, Stefan Hajnoczi wrote:
> > > > > On Wed, Jul 21, 2021 at 11:29:55AM +0800, Jason Wang wrote:
> > > > > > On 2021/7/14 12:19 AM, Stefan Hajnoczi wrote:
> > > > > > > These patches are not polished yet, but I would like to request feedback on
> > > > > > > this approach and share performance results with you.
> > > > > > >
> > > > > > > Idle CPUs tentatively enter a busy wait loop before halting when the cpuidle
> > > > > > > haltpoll driver is enabled inside a virtual machine. This reduces wakeup
> > > > > > > latency for events that occur soon after the vCPU becomes idle.
> > > > > > >
> > > > > > > This patch series extends the cpuidle busy wait loop with the new poll_source
> > > > > > > API so drivers can participate in polling. Such polling-aware drivers disable
> > > > > > > their device's irq during the busy wait loop to avoid the cost of interrupts.
> > > > > > > This reduces latency further than regular cpuidle haltpoll, which still relies
> > > > > > > on irqs.
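(For readers who have not looked at the patches: RFC 1/3 adds the poll_source API itself. A rough sketch of what such an interface could look like is below; the names, fields, and signatures are illustrative guesses rather than the exact code from the series.)

struct poll_source;

struct poll_source_ops {
        /* invoked when the idle loop starts/stops busy waiting on a CPU */
        void (*start)(struct poll_source *src);
        void (*stop)(struct poll_source *src);

        /* invoked repeatedly during the busy wait to check for new work */
        void (*poll)(struct poll_source *src);
};

struct poll_source {
        const struct poll_source_ops *ops;
        struct list_head node;  /* entry in a per-CPU list of sources */
        int cpu;                /* CPU whose idle loop polls this source */
};

int poll_source_register(struct poll_source *src);
int poll_source_unregister(struct poll_source *src);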
> > > > > > >
> > > > > > > Virtio drivers are modified to use the poll_source API so all virtio device
> > > > > > > types get this feature. The following virtio-blk fio benchmark results show the
> > > > > > > improvement:
> > > > > > >
> > > > > > >                IOPS (numjobs=4, iodepth=1, 4 virtqueues)
> > > > > > >                  before   poll_source      io_poll
> > > > > > > 4k randread    167102  186049 (+11%)  186654 (+11%)
> > > > > > > 4k randwrite   162204  181214 (+11%)  181850 (+12%)
> > > > > > > 4k randrw      159520  177071 (+11%)  177928 (+11%)
> > > > > > >
> > > > > > > The comparison against io_poll shows that cpuidle poll_source achieves
> > > > > > > equivalent performance to the block layer's io_poll feature (which I
> > > > > > > implemented in a separate patch series [1]).
> > > > > > >
> > > > > > > The advantage of poll_source is that applications do not need to explicitly set
> > > > > > > the RWF_HIPRI I/O request flag. The poll_source approach is attractive because
> > > > > > > few applications actually use RWF_HIPRI and it takes advantage of CPU cycles we
> > > > > > > would have spent in cpuidle haltpoll anyway.
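(For comparison, here is the per-request opt-in that the block layer's io_poll path expects from applications and that poll_source makes unnecessary. This is only an illustrative userspace snippet; io_poll also needs polling enabled on the queue and is normally used with O_DIRECT.)

#define _GNU_SOURCE
#include <sys/uio.h>

/* Submit a read with RWF_HIPRI so the kernel polls for its completion. */
static ssize_t hipri_pread(int fd, void *buf, size_t len, off_t offset)
{
        struct iovec iov = {
                .iov_base = buf,
                .iov_len  = len,
        };

        return preadv2(fd, &iov, 1, offset, RWF_HIPRI);
}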
> > > > > > >
> > > > > > > The current series does not improve virtio-net. I haven't investigated deeply,
> > > > > > > but it is possible that NAPI and poll_source do not combine well. See the final
> > > > > > > patch for a starting point on making the two work together.
> > > > > > >
> > > > > > > I have not tried this on bare metal but it might help there too. The cost of
> > > > > > > disabling a device's irq must be less than the savings from avoiding irq
> > > > > > > handling for this optimization to make sense.
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/linux-block/20210520141305.355961-1-stefanha@redhat.com/
> > > > > >
> > > > > > Hi Stefan:
> > > > > >
> > > > > > Some questions:
> > > > > >
> > > > > > 1) What are the advantages of introducing polling at the virtio level instead of
> > > > > > doing it in each subsystem? Polling at the virtio level may only work well if
> > > > > > all (or most) of the devices are virtio
> > > > > I'm not sure I understand the question. cpuidle haltpoll benefits all
> > > > > devices today, except it incurs interrupt latency. The poll_source API
> > > > > eliminates the interrupt latency for drivers that can disable device
> > > > > interrupts cheaply.
> > > > >
> > > > > This patch adds poll_source to core virtio code so that all virtio
> > > > > drivers get this feature for free. No driver-specific changes are
> > > > > needed.
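(To make the "for free" part concrete, here is a sketch of how per-virtqueue poll_source callbacks might be wired up on top of the existing virtqueue helpers. Only virtqueue_disable_cb(), virtqueue_enable_cb() and vring_interrupt() are existing kernel APIs; the wrapper struct and the function names are made up for illustration and are not necessarily what the series does.)

#include <linux/virtio.h>
#include <linux/virtio_ring.h>

/* Hypothetical per-virtqueue polling state; struct poll_source is the
 * illustrative API sketched earlier, not an upstream type.
 */
struct vq_poll {
        struct poll_source src;
        struct virtqueue *vq;
};

static void vq_poll_start(struct poll_source *src)
{
        struct vq_poll *vp = container_of(src, struct vq_poll, src);

        /* Suppress used-buffer notifications (the irq) while polling. */
        virtqueue_disable_cb(vp->vq);
}

static void vq_poll(struct poll_source *src)
{
        struct vq_poll *vp = container_of(src, struct vq_poll, src);

        /* Run the same completion path the irq handler would have run;
         * vring_interrupt() does nothing if no buffers have been used.
         */
        vring_interrupt(0, vp->vq);
}

static void vq_poll_stop(struct poll_source *src)
{
        struct vq_poll *vp = container_of(src, struct vq_poll, src);

        /* Re-arm the interrupt and catch completions that raced with it. */
        if (!virtqueue_enable_cb(vp->vq))
                vring_interrupt(0, vp->vq);
}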
> > > > >
> > > > > If you mean networking, block layer, etc. by "subsystems" then there's
> > > > > nothing those subsystems can do to help. Whether poll_source can be used
> > > > > depends on the specific driver, not the subsystem. If you consider
> > > > > drivers/virtio/ a subsystem, then that's exactly what the patch series
> > > > > is doing.
> > > >
> > > >
> > > > I meant, if we choose to use idle polling, we have several choices:
> > > >
> > > > 1) bus level (e.g. virtio)
> > > > 2) subsystem level (e.g. networking and block)
> > > >
> > > > I'm not sure which one is better.
> > >
> > > This API is intended to be driver- or bus-level. I don't think
> > > subsystems can do very much since they don't know the hardware
> > > capabilities (cheap interrupt disabling) and in most cases there's no
> > > advantage in plumbing it through subsystems when drivers can call the
> > > API directly.
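(Continuing the sketch above, the driver- or bus-level hookup would then be little more than filling in the ops and registering one source per queue. Again, poll_source_register() and the field names are illustrative, not upstream code.)

static const struct poll_source_ops vq_poll_source_ops = {
        .start = vq_poll_start,
        .stop  = vq_poll_stop,
        .poll  = vq_poll,
};

/* Hypothetical probe-time hookup: one poll source per virtqueue, bound
 * to the CPU that services that queue.
 */
static int vq_poll_enable(struct vq_poll *vp, struct virtqueue *vq, int cpu)
{
        vp->vq = vq;
        vp->src.ops = &vq_poll_source_ops;
        vp->src.cpu = cpu;

        return poll_source_register(&vp->src);
}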
> > >
> > > > > > 2) What are the advantages of using cpuidle instead of using a thread (and
> > > > > > leveraging the scheduler)?
> > > > > In order to combine with the existing cpuidle infrastructure. No new
> > > > > polling loop is introduced and no additional CPU cycles are spent on
> > > > > polling.
> > > > >
> > > > > If cpuidle itself is converted to threads then poll_source would
> > > > > automatically operate in a thread too, but this patch series doesn't
> > > > > change how the core cpuidle code works.
> > > >
> > > >
> > > > So the networking subsystem can use NAPI busy polling in process context,
> > > > which means it can be managed by the scheduler.
> > > >
> > > > I'm not sure it's a good idea to poll drivers for a specific bus in the
> > > > general cpu idle layer.
> > >
> > > Why? Maybe because the cpuidle execution environment is a little special?
> >
> > Well, this would be prone to abuse.
> >
> > The time spent in that driver callback counts as CPU idle time while
> > it really is the driver running, and there is no limit on how much
> > time the callback can take.  Doing costly things in the idle loop
> > is generally avoided, because on wakeup the CPU needs to be available
> > to the task needing it as soon as possible.  IOW, the callback
> > potentially adds unbounded latency to the CPU wakeup path.
>
> How is this different from driver interrupt handlers running during
> cpuidle?

The time spent on handling interrupts does not count as CPU idle time.
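(For context, the existing poll_idle() state already bounds its busy wait; condensed, the loop looks roughly like the sketch below, based on drivers/cpuidle/poll_state.c. A poll_source callback would be invoked from inside this loop, so whatever time it consumes is accounted as idle time and delays the need_resched() check, which is the latency concern raised above.)

#include <linux/cpuidle.h>
#include <linux/sched.h>
#include <linux/sched/clock.h>
#include <linux/sched/idle.h>

/* Condensed sketch of the bounded busy wait done by poll_idle(): spin
 * only until a task needs the CPU or a governor-chosen time limit expires.
 */
static void bounded_busy_wait(struct cpuidle_driver *drv,
                              struct cpuidle_device *dev)
{
        u64 limit = cpuidle_poll_time(drv, dev);
        u64 start = local_clock();

        while (!need_resched()) {
                /* a poll_source callback would run about here */
                cpu_relax();

                if (local_clock() - start > limit)
                        break;  /* give up and pick a real idle state */
        }
}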

Thread overview: 14+ messages
2021-07-13 16:19 [RFC 0/3] cpuidle: add poll_source API and virtio vq polling Stefan Hajnoczi
2021-07-13 16:19 ` [RFC 1/3] cpuidle: add poll_source API Stefan Hajnoczi
2021-07-19 21:03   ` Marcelo Tosatti
2021-07-20 14:15     ` Stefan Hajnoczi
2021-07-13 16:19 ` [RFC 2/3] virtio: add poll_source virtqueue polling Stefan Hajnoczi
2021-07-13 16:19 ` [RFC 3/3] softirq: participate in cpuidle polling Stefan Hajnoczi
2021-07-21  3:29 ` [RFC 0/3] cpuidle: add poll_source API and virtio vq polling Jason Wang
2021-07-21  9:41   ` Stefan Hajnoczi
2021-07-22  9:04     ` Jason Wang
2021-07-26 15:17       ` Stefan Hajnoczi
2021-07-26 15:47         ` Rafael J. Wysocki
2021-07-26 16:01           ` Stefan Hajnoczi
2021-07-26 16:37             ` Rafael J. Wysocki [this message]
2021-07-27 13:32               ` Stefan Hajnoczi
