qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Pavel Dovgalyuk" <Pavel.Dovgaluk@ispras.ru>
To: 'Kevin Wolf' <kwolf@redhat.com>
Cc: edgar.iglesias@xilinx.com, peter.maydell@linaro.org,
	igor.rubinov@gmail.com, mark.burton@greensocs.com,
	real@ispras.ru, hines@cert.org, qemu-devel@nongnu.org,
	maria.klimushenkova@ispras.ru, stefanha@redhat.com,
	pbonzini@redhat.com, batuzovk@ispras.ru, alex.bennee@linaro.org,
	fred.konrad@greensocs.com
Subject: Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay
Date: Thu, 11 Feb 2016 15:24:55 +0300	[thread overview]
Message-ID: <001501d164c7$3745b600$a5d12200$@Dovgaluk@ispras.ru> (raw)
In-Reply-To: <20160211121829.GB4419@noname.redhat.com>

> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Am 11.02.2016 um 12:00 hat Pavel Dovgalyuk geschrieben:
> > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > Am 11.02.2016 um 07:05 hat Pavel Dovgalyuk geschrieben:
> > > > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > > > Am 10.02.2016 um 13:51 hat Pavel Dovgalyuk geschrieben:
> > > > > > However, I don't understand yet which layer do you offer as the candidate
> > > > > > for record/replay? What functions should be changed?
> > > > > > I would like to investigate this way, but I don't got it yet.
> > > > >
> > > > > At the core, I wouldn't change any existing function, but introduce a
> > > > > new block driver. You could copy raw_bsd.c for a start and then tweak
> > > > > it. Leave out functions that you don't want to support, and add the
> > > > > necessary magic to .bdrv_co_readv/writev.
> > > > >
> > > > > Something like this (can probably be generalised for more than just
> > > > > reads as the part after the bdrv_co_reads() call should be the same for
> > > > > reads, writes and any other request types):
> > > > >
> > > > > int blkreplay_co_readv()
> > > > > {
> > > > >     BlockReplayState *s = bs->opaque;
> > > > >     int reqid = s->reqid++;
> > > > >
> > > > >     bdrv_co_readv(bs->file, ...);
> > > > >
> > > > >     if (mode == record) {
> > > > >         log(reqid, time);
> > > > >     } else {
> > > > >         assert(mode == replay);
> > > > >         bool *done = req_replayed_list_get(reqid)
> > > > >         if (done) {
> > > > >             *done = true;
> > > > >         } else {
> > > > >             req_completed_list_insert(reqid, qemu_coroutine_self());
> > > > >             qemu_coroutine_yield();
> > > > >         }
> > > > >     }
> > > > > }
> > > > >
> > > > > /* called by replay.c */
> > > > > int blkreplay_run_event()
> > > > > {
> > > > >     if (mode == replay) {
> > > > >         co = req_completed_list_get(e.reqid);
> > > > >         if (co) {
> > > > >             qemu_coroutine_enter(co);
> > > > >         } else {
> > > > >             bool done = false;
> > > > >             req_replayed_list_insert(reqid, &done);
> > > > >             /* wait synchronously for completion */
> > > > >             while (!done) {
> > > > >                 aio_poll();
> > > > >             }
> > > > >         }
> > > > >     }
> > > > > }
> > > > >
> > > > > Where we could consider changing existing code is that it might be
> > > > > desirable to automatically put an instance of this block driver on top
> > > > > of every block device when record/replay is used. If we don't do that,
> > > > > you need to explicitly specify -drive driver=blkreplay,...
> > > >
> > > > As far, as I understand, all synchronous read/write request are also passed
> > > > through this coroutines layer.
> > >
> > > Yes, all read/write requests go through the same function internally, no
> > > matter which external interface was used.
> > >
> > > > It means that every disk access in replay phase should match the recording phase.
> > >
> > > Right. If I'm not mistaken, this was the fundamental requirement you
> > > have, so I wouldn't have suggested this otherwise.
> > >
> > > > Record/replay is intended to be used for debugging and analysis.
> > > > When execution is replayed, guest machine cannot notice analysis overhead.
> > > > Some analysis methods may include disk image reading. E.g., qemu-based
> > > > analysis framework DECAF uses sleuthkit for disk forensics (
> > > https://github.com/sycurelab/DECAF ).
> > > > If similar framework will be used with replay, forensics disk access operations
> > > > won't work if we will record/replay the coroutines.
> > >
> > > Sorry, I'm not sure if I can follow.
> > >
> > > If such analysis software runs in the guest, it's not a replay any more
> > > and I completely fail to see what you're doing.
> > >
> > > If it's a qemu component independent from the guest, then my method
> > > gives you a clean way to bypass the replay driver that wouldn't be
> > > possible with yours.
> >
> > The second one. qemu may be extended with some components that
> > perform guest introspection.
> >
> > > If your plan was to record/replay only async requests and then use sync
> > > requests to bypass the record/replay, let me clearly state that this is
> > > the wrong approach: There are still guest devices which do synchronous
> > > I/O and need to be considered in the replay log, and you shouldn't
> > > prevent the analysis code from using AIO (in fact, using sync I/O in new
> > > code is very much frowned upon).
> >
> > Why do guest synchronous requests have to be recorded?
> > Aren't they completely deterministic?
> 
> Good point. I think you're right in practice. In theory, with dataplane
> (i.e. when running the request in a separate thread) it could happen,
> but I guess that isn't very compatible with replay anyway - and at the
> first sight I couldn't see it performing synchronous requests either.
> 
> > > I can explain in more detail what the block device structure looks like
> > > and how to access an image with and without record/replay, but first let
> > > me please know whether I guessed right what your problem is. Or if I
> > > missed your point, can you please describe in detail a case that
> > > wouldn't work?
> >
> > You have understood it correctly.
> > And what is the solution for bypassing one of the layers from component that
> > should not affect the replay?
> 
> For this, you need to understand how block drivers are stacked in qemu.
> Each driver in the stack has a separate struct BlockDriverState, which
> can be used to access its data. You could hook up things like this:
> 
>       virtio-blk              NBD server
>     --------------           ------------
>           |                        |
>           v                        |
>     +------------+                 |
>     | blkreplay  |                 |
>     +------------+                 |
>           |                        |
>           v                        |
>     +------------+                 |
>     |   qcow2    | <---------------+
>     +------------+
>           |
>           v
>     +------------+
>     | raw-posix  |
>     +------------+
>           |
>           v
>     --------------
>       filesystem
> 
> As you see, what I've chosen for the external analysis interface is just
> an NBD server as this is the component that we already have today. You
> could hook up any other (new) code there; the important part is that it
> doesn't work on the BDS of the blkreplay driver, but directly on the BDS
> of the qcow2 driver.
> 
> On the command line, it could look like this (this assumes that we don't
> add syntactic sugar that creates the blkreplay part automatically - we
> can always do that):
> 
>     -drive file=image.qcow2,if=none,id=img-direct
>     -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay
>     -device virtio-blk-pci,drive=img-blkreplay
> 
> The NBD server can't be started on the command line, so you'd go to the
> monitor and start it there with the direct access:
> 
>     (qemu) nbd_server_start unix:/tmp/my_socket
>     (qemu) nbd_server_add img-direct
> 
> (Exact syntax is untested, but this should roughly be how it works.)

Thank you!

I'll try this approach and come back either with patches or new questions.

Pavel Dovgalyuk

  reply	other threads:[~2016-02-11 12:25 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-09  5:55 [Qemu-devel] [PATCH 0/3] Deterministic replay extensions Pavel Dovgalyuk
2016-02-09  5:55 ` [Qemu-devel] [PATCH 1/3] replay: character devices Pavel Dovgalyuk
2016-02-09  5:55 ` [Qemu-devel] [PATCH 2/3] replay: introduce new checkpoint for icount warp Pavel Dovgalyuk
2016-02-09  5:55 ` [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay Pavel Dovgalyuk
2016-02-09 10:27   ` Kevin Wolf
2016-02-09 11:52     ` Pavel Dovgalyuk
2016-02-10 11:45       ` Kevin Wolf
2016-02-10 12:05         ` Pavel Dovgalyuk
2016-02-10 12:28           ` Kevin Wolf
2016-02-10 12:51             ` Pavel Dovgalyuk
2016-02-10 13:25               ` Kevin Wolf
2016-02-10 13:33                 ` Pavel Dovgalyuk
2016-02-10 13:52                   ` Kevin Wolf
2016-02-11  6:05                 ` Pavel Dovgalyuk
2016-02-11  9:43                   ` Kevin Wolf
2016-02-11 11:00                     ` Pavel Dovgalyuk
2016-02-11 12:18                       ` Kevin Wolf
2016-02-11 12:24                         ` Pavel Dovgalyuk [this message]
2016-02-12  8:33                         ` Pavel Dovgalyuk
2016-02-12  9:44                           ` Kevin Wolf
2016-02-12 13:19                 ` Pavel Dovgalyuk
2016-02-12 13:58                   ` Kevin Wolf
2016-02-15  8:38                     ` Pavel Dovgalyuk
2016-02-15  9:10                       ` Kevin Wolf
2016-02-15  9:14                       ` Pavel Dovgalyuk
2016-02-15  9:38                         ` Kevin Wolf
2016-02-15 11:19                           ` Pavel Dovgalyuk
2016-02-15 12:46                             ` Kevin Wolf
2016-02-15 13:54                           ` Pavel Dovgalyuk
2016-02-15 14:06                             ` Kevin Wolf
2016-02-15 14:24                               ` Pavel Dovgalyuk
2016-02-15 15:01                                 ` Kevin Wolf
2016-02-16  6:25                                   ` Pavel Dovgalyuk
2016-02-16 10:02                                     ` Kevin Wolf
2016-02-16 11:20                                       ` Pavel Dovgalyuk
2016-02-16 12:54                                         ` Kevin Wolf
2016-02-18  9:18                                           ` Pavel Dovgalyuk
2016-02-20  7:11                                           ` Pavel Dovgalyuk
2016-02-22 11:06                                             ` Kevin Wolf
2016-02-24 11:59                                               ` Pavel Dovgalyuk
2016-02-24 13:14                                                 ` Kevin Wolf
2016-02-25  9:06                                                   ` Pavel Dovgalyuk
2016-02-26  9:01                                                     ` Kevin Wolf
2016-02-29  7:03                                                       ` Pavel Dovgalyuk
2016-02-29  7:54                                                         ` Kevin Wolf
2016-02-15 14:50                               ` Pavel Dovgalyuk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='001501d164c7$3745b600$a5d12200$@Dovgaluk@ispras.ru' \
    --to=pavel.dovgaluk@ispras.ru \
    --cc=alex.bennee@linaro.org \
    --cc=batuzovk@ispras.ru \
    --cc=edgar.iglesias@xilinx.com \
    --cc=fred.konrad@greensocs.com \
    --cc=hines@cert.org \
    --cc=igor.rubinov@gmail.com \
    --cc=kwolf@redhat.com \
    --cc=maria.klimushenkova@ispras.ru \
    --cc=mark.burton@greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=real@ispras.ru \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).