From: Stefan Hajnoczi <stefanha@gmail.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Laurent Vivier <lvivier@redhat.com>, Fam Zheng <famz@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Michael Roth <mdroth@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread
Date: Thu, 7 Sep 2017 17:53:37 +0100
Message-ID: <CAJSP0QWT2fNs16_mAy-iO_zc=ocYegCVwgV6UAXAL8S_GFjGXg@mail.gmail.com>
In-Reply-To: <20170907120227.GE23040@pxdev.xzpeter.org>

On Thu, Sep 7, 2017 at 1:02 PM, Peter Xu <peterx@redhat.com> wrote:
> On Thu, Sep 07, 2017 at 11:09:29AM +0100, Stefan Hajnoczi wrote:
>> On Thu, Sep 7, 2017 at 10:35 AM, Dr. David Alan Gilbert
>> <dgilbert@redhat.com> wrote:
>> > * Stefan Hajnoczi (stefanha@gmail.com) wrote:
>> >> On Wed, Sep 6, 2017 at 4:14 PM, Dr. David Alan Gilbert
>> >> <dgilbert@redhat.com> wrote:
>> >> > * Stefan Hajnoczi (stefanha@gmail.com) wrote:
>> >> >> On Wed, Aug 23, 2017 at 02:51:03PM +0800, Peter Xu wrote:
>> >> >> > The root problem is that, monitor commands are all handled in main
>> >> >> > loop thread now, no matter how many monitors we specify. And, if main
>> >> >> > loop thread hangs due to some reason, all monitors will be stuck.
>> >> >>
>> >> >> I see a larger issue with postcopy: existing QEMU code assumes that
>> >> >> guest memory access is instantaneous.
>> >> >>
>> >> >> Postcopy breaks this assumption and introduces blocking points that can
>> >> >> now take unbounded time.
>> >> >>
>> >> >> This problem isn't specific to the monitor.  It can also happen to other
>> >> >> components in QEMU like the gdbstub.
>> >> >>
>> >> >> Do we need an asynchronous memory API?  Synchronous memory access should
>> >> >> only be allowed in vcpu threads.
>> >> >
>> >> > It would probably be useful for gdbstub where the overhead of async
>> >> > doesn't matter;  but doing that for all IO emulation is hard.
>> >>
>> >> Why is it hard?
>> >>
>> >> Memory access can be synchronous in the vcpu thread.  That eliminates
>> >> a lot of code straight away.
>> >>
>> >> Anything using dma-helpers.c is already async.  They just don't know
>> >> that the memory access part is being made async too :).
>> >
>> > Can you point me to some info on that ?
>>
>> IDE and SCSI use dma-helpers.c to perform I/O:
>> hw/ide/core.c:892:        s->bus->dma->aiocb =
>> dma_blk_io(blk_get_aio_context(s->blk),
>> hw/ide/macio.c:189:        s->bus->dma->aiocb =
>> dma_blk_io(blk_get_aio_context(s->blk), &s->sg,
>> hw/scsi/scsi-disk.c:348:        r->req.aiocb =
>> dma_blk_io(blk_get_aio_context(s->qdev.conf.blk),
>> hw/scsi/scsi-disk.c:551:        r->req.aiocb =
>> dma_blk_io(blk_get_aio_context(s->qdev.conf.blk),
>>
>> They pass a scatter-gather list of guest RAM addresses to
>> dma-helpers.c.  They receive a callback when I/O has finished.
>>
>> Try following the code path.  Request submission may be from a vcpu
>> thread or IOThread.  Completion occurs in the main loop or an
>> IOThread.
>>
>> The main point is that this API is already asynchronous.  If any
>> changes are needed for async guest memory access (not sure, I haven't
>> checked), then at least the dma-helpers.c users do not need to be
>> modified.
>>
>> >> The remaining cases are virtio and some other devices.
>> >>
>> >> If you are worried about performance, the first rule is that async
>> >> memory access is only needed on the destination side when post-copy is
>> >> active.  Maybe use setjmp to return from the signal handler and queue
>> >> a callback for when the page has been loaded.
>> >
>> > I'm not sure it's worth trying to be too clever at avoiding this;
>> > I see the fact that we're doing IO with the bql held as a more
>> > fundamental problem.
>>
>> QEMU should be doing I/O syscalls in async fashion or threadpool
>> workers (no BQL) so the BQL is not an issue.  Anything else could
>> cause unbounded waits even without postcopy.
>
> E.g. when a vcpu takes a page fault with the BQL held, while the main
> thread needs the BQL to dispatch anything, including monitor commands.
>
> So I think it's a two-part problem - we need to solve both (1) the main
> thread accessing guest memory, which is still missing, and (2) BQL
> deadlocks between vcpu threads and the main thread.

I think we need a single solution and cannot treat these as separate.
This is because the same virtio device emulation code may run in 3
contexts:
1. vcpu thread (ioeventfd=off)
2. main loop thread (ioeventfd=on)
3. IOThread (ioeventfd=on, iothread=<id>)

If you try to solve them separately then the code won't work in all 3
contexts anymore.

Stefan
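
[Illustrative sketch, not part of the original message and not QEMU
code.] The point about the three contexts can be pictured with one
device handler that never blocks on guest memory: it either completes
at once or is queued until the page arrives, so the same code could be
called from a vcpu thread, the main loop, or an IOThread. Every name
below (GuestRam, guest_mem_read_async, guest_page_arrived,
device_handle_request) is hypothetical, the single pending slot is a
simplification, and locking is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: invoked once the requested bytes are readable. */
typedef void (*mem_read_cb)(void *opaque, const void *data, size_t len);

typedef struct PendingRead {
    bool in_use;
    size_t addr;
    size_t len;
    mem_read_cb cb;
    void *opaque;
} PendingRead;

/* Toy model of guest RAM: one "page" that may not have arrived yet. */
typedef struct GuestRam {
    unsigned char bytes[64];
    bool page_present;    /* false models a postcopy page still in flight */
    PendingRead pending;  /* single slot, for brevity only */
} GuestRam;

/*
 * Returns 1 if the read completed immediately, 0 if it was queued,
 * and -1 on a bad range or when the single pending slot is taken.
 * It never waits, so the caller's context does not matter.
 */
int guest_mem_read_async(GuestRam *ram, size_t addr, size_t len,
                         mem_read_cb cb, void *opaque)
{
    if (addr >= sizeof(ram->bytes) || len > sizeof(ram->bytes) - addr) {
        return -1;
    }
    if (ram->page_present) {
        cb(opaque, ram->bytes + addr, len);
        return 1;
    }
    if (ram->pending.in_use) {
        return -1;
    }
    ram->pending.in_use = true;
    ram->pending.addr = addr;
    ram->pending.len = len;
    ram->pending.cb = cb;
    ram->pending.opaque = opaque;
    return 0;
}

/* Called when the missing page has been loaded; runs the queued callback. */
void guest_page_arrived(GuestRam *ram)
{
    ram->page_present = true;
    if (ram->pending.in_use) {
        PendingRead p = ram->pending;
        ram->pending.in_use = false;
        p.cb(p.opaque, ram->bytes + p.addr, p.len);
    }
}

/* Hypothetical device state, standing in for a virtio device. */
typedef struct DeviceState {
    unsigned char last_byte;
    int completions;
} DeviceState;

static void device_read_done(void *opaque, const void *data, size_t len)
{
    DeviceState *dev = opaque;
    if (len > 0) {
        dev->last_byte = *(const unsigned char *)data;
    }
    dev->completions++;
}

/* The same handler regardless of which thread context calls it. */
int device_handle_request(GuestRam *ram, DeviceState *dev, size_t addr)
{
    return guest_mem_read_async(ram, addr, 1, device_read_done, dev);
}
```

In this model, a request made before the page is present returns 0 and
completes only when guest_page_arrived() runs; later requests complete
immediately and return 1.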
