qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Pavel Dovgalyuk" <dovgaluk@ispras.ru>
To: 'Kevin Wolf' <kwolf@redhat.com>
Cc: edgar.iglesias@xilinx.com, peter.maydell@linaro.org,
	igor.rubinov@gmail.com, mark.burton@greensocs.com,
	real@ispras.ru, hines@cert.org, qemu-devel@nongnu.org,
	maria.klimushenkova@ispras.ru, stefanha@redhat.com,
	pbonzini@redhat.com, batuzovk@ispras.ru, alex.bennee@linaro.org,
	fred.konrad@greensocs.com
Subject: Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay
Date: Tue, 16 Feb 2016 09:25:08 +0300	[thread overview]
Message-ID: <000601d16882$c9637270$5c2a5750$@ru> (raw)
In-Reply-To: <20160215150110.GG5244@noname.str.redhat.com>

> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Am 15.02.2016 um 15:24 hat Pavel Dovgalyuk geschrieben:
> > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> >
> > > > There could be asynchronous events that occur in non-cpu threads.
> > > > For now these events are shutdown request and block task execution.
> > > > They may "hide" following clock (or another one) events. That is why
> > > > we process them until synchronous event (like clock, instructions
> > > > execution, or checkpoint) is met.
> > > >
> > > >
> > > > > Anyway, what does "can't proceed" mean? The coroutine yields because
> > > > > it's waiting for I/O, but it is never reentered? Or is it hanging while
> > > > > trying to acquire a lock?
> > > >
> > > > I've solved this problem by slightly modifying the queue.
> > > > I haven't yet made BlockDriverState assignment to the request ids.
> > > > Therefore aio_poll was temporarily replaced with usleep.
> > > > Now execution starts and hangs at some random moment of OS loading.
> > > >
> > > > Here is the current version of blkreplay functions:
> > > >
> > > > static int coroutine_fn blkreplay_co_readv(BlockDriverState *bs,
> > > >     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
> > > > {
> > > >     uint32_t reqid = request_id++;
> > > >     Request *req;
> > > >     req = block_request_insert(reqid, bs, qemu_coroutine_self());
> > > >     bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);
> > > >
> > > >     if (replay_mode == REPLAY_MODE_RECORD) {
> > > >         replay_save_block_event(reqid);
> > > >     } else {
> > > >         assert(replay_mode == REPLAY_MODE_PLAY);
> > > >         qemu_coroutine_yield();
> > > >     }
> > > >     block_request_remove(req);
> > > >
> > > >     return 0;
> > > > }
> > > >
> > > > void replay_run_block_event(uint32_t id)
> > > > {
> > > >     Request *req;
> > > >     if (replay_mode == REPLAY_MODE_PLAY) {
> > > >         while (!(req = block_request_find(id))) {
> > > >             //aio_poll(bdrv_get_aio_context(req->bs), true);
> > > >             usleep(1);
> > > >         }
> > >
> > > How is this loop supposed to make any progress?
> >
> > This loop does not supposed to make any progress. It waits until block_request_insert
> > call is added to the queue.
> 
> Yes. And who is supposed to add something to the queue? You are looping
> and doing nothing, so no other code gets to run in this thread. It's
> only aio_poll() that would run event handlers that could eventually add
> something to the list.

blkreplay_co_readv adds request to the queue.

> 
> > > And I still don't understand why aio_poll() doesn't work and where it
> > > hangs.
> >
> > aio_poll hangs if "req = block_request_insert(reqid, bs, qemu_coroutine_self());" line
> > is executed after bdrv_co_readv. When bdrv_co_readv yields, replay_run_block_event has no
> > information about pending request and cannot jump to its coroutine.
> > Maybe I should implement aio_poll execution there to make progress in that case?
> 
> Up in the thread, I posted code that was a bit more complex than what
> you have, and which considered both cases (completion before
> replay/completion after replay). If you implement it like this, it might
> just work. And yes, it involved an aio_poll() loop in the replay
> function.

You are right, but I found kind of fundamental problem here.
There are two possible approaches for replaying coroutine events:

1. Waiting with aio_poll in replay_run_block_event.
   In this case replay cannot be made, because of recursive mutex lock:
   aio_poll -> qemu_clock_get_ns -> <lock replay mutex> -> 
     replay_run_block_event -> aio_poll -> qemu_clock_get_ns -> <lock replay mutex> ->
     <assert failed>
   I.e. we have recursive aio_poll function calls that lead to recursive replay calls
   and to recursive mutex lock.

2. Adding block events to the queue until checkpoint is met.
   This will not work with synchronous requests, because they will wait infinitely
   and checkpoint will never happen. Correct me, if I'm not right for this case.
   If blkreplay_co_readv will yield, can vcpu thread unlock the BQL and wait for 
   the checkpoint in iothread?

Pavel Dovgalyuk

  reply	other threads:[~2016-02-16  6:25 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-09  5:55 [Qemu-devel] [PATCH 0/3] Deterministic replay extensions Pavel Dovgalyuk
2016-02-09  5:55 ` [Qemu-devel] [PATCH 1/3] replay: character devices Pavel Dovgalyuk
2016-02-09  5:55 ` [Qemu-devel] [PATCH 2/3] replay: introduce new checkpoint for icount warp Pavel Dovgalyuk
2016-02-09  5:55 ` [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay Pavel Dovgalyuk
2016-02-09 10:27   ` Kevin Wolf
2016-02-09 11:52     ` Pavel Dovgalyuk
2016-02-10 11:45       ` Kevin Wolf
2016-02-10 12:05         ` Pavel Dovgalyuk
2016-02-10 12:28           ` Kevin Wolf
2016-02-10 12:51             ` Pavel Dovgalyuk
2016-02-10 13:25               ` Kevin Wolf
2016-02-10 13:33                 ` Pavel Dovgalyuk
2016-02-10 13:52                   ` Kevin Wolf
2016-02-11  6:05                 ` Pavel Dovgalyuk
2016-02-11  9:43                   ` Kevin Wolf
2016-02-11 11:00                     ` Pavel Dovgalyuk
2016-02-11 12:18                       ` Kevin Wolf
2016-02-11 12:24                         ` Pavel Dovgalyuk
2016-02-12  8:33                         ` Pavel Dovgalyuk
2016-02-12  9:44                           ` Kevin Wolf
2016-02-12 13:19                 ` Pavel Dovgalyuk
2016-02-12 13:58                   ` Kevin Wolf
2016-02-15  8:38                     ` Pavel Dovgalyuk
2016-02-15  9:10                       ` Kevin Wolf
2016-02-15  9:14                       ` Pavel Dovgalyuk
2016-02-15  9:38                         ` Kevin Wolf
2016-02-15 11:19                           ` Pavel Dovgalyuk
2016-02-15 12:46                             ` Kevin Wolf
2016-02-15 13:54                           ` Pavel Dovgalyuk
2016-02-15 14:06                             ` Kevin Wolf
2016-02-15 14:24                               ` Pavel Dovgalyuk
2016-02-15 15:01                                 ` Kevin Wolf
2016-02-16  6:25                                   ` Pavel Dovgalyuk [this message]
2016-02-16 10:02                                     ` Kevin Wolf
2016-02-16 11:20                                       ` Pavel Dovgalyuk
2016-02-16 12:54                                         ` Kevin Wolf
2016-02-18  9:18                                           ` Pavel Dovgalyuk
2016-02-20  7:11                                           ` Pavel Dovgalyuk
2016-02-22 11:06                                             ` Kevin Wolf
2016-02-24 11:59                                               ` Pavel Dovgalyuk
2016-02-24 13:14                                                 ` Kevin Wolf
2016-02-25  9:06                                                   ` Pavel Dovgalyuk
2016-02-26  9:01                                                     ` Kevin Wolf
2016-02-29  7:03                                                       ` Pavel Dovgalyuk
2016-02-29  7:54                                                         ` Kevin Wolf
2016-02-15 14:50                               ` Pavel Dovgalyuk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='000601d16882$c9637270$5c2a5750$@ru' \
    --to=dovgaluk@ispras.ru \
    --cc=alex.bennee@linaro.org \
    --cc=batuzovk@ispras.ru \
    --cc=edgar.iglesias@xilinx.com \
    --cc=fred.konrad@greensocs.com \
    --cc=hines@cert.org \
    --cc=igor.rubinov@gmail.com \
    --cc=kwolf@redhat.com \
    --cc=maria.klimushenkova@ispras.ru \
    --cc=mark.burton@greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=real@ispras.ru \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).