All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp,
	quintela@redhat.com, qemu-devel@nongnu.org,
	lilei@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 15/46] Rework loadvm path for subloops
Date: Wed, 16 Jul 2014 10:25:39 +0100	[thread overview]
Message-ID: <20140716092538.GD2514@work-vm> (raw)
In-Reply-To: <53BAB466.8060609@redhat.com>

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> Il 07/07/2014 16:35, Dr. David Alan Gilbert ha scritto:
> >* Paolo Bonzini (pbonzini@redhat.com) wrote:
> >>Il 04/07/2014 19:41, Dr. David Alan Gilbert (git) ha scritto:
> >>>From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >>>
> >>>Postcopy needs to have two migration streams loading concurrently;
> >>>one from memory (with the device state) and the other from the fd
> >>>with the memory transactions.
> >>
> >>Can you explain this?
> >>
> >>I would have though the order is
> >>
> >>    precopy RAM and everything
> >>    prepare postcopy RAM ("sent && dirty" bitmap)
> >>    finish precopy non-RAM
> >>    finish devices
> >>    postcopy RAM
> >>
> >>Why do you need to have all the packaging stuff and a separate memory-based
> >>migration stream for devices?  I'm sure I'm missing something. :)
> >
> >The thing you're missing is the details of 'finish devices'.
> >The device emulation may access guest memory as part of loading it's
> >state, so you can't successfully complete 'finish devices' without
> >having the 'postcopy RAM' available to provide pages.
> 
> I see.  Can you document the flow (preferrably as a reply to this email
> _and_ in docs/ when you send v2 of the code :))?

I thought I documented enough in the docs/migration.txt stuff in the last
patch (see the 'Postcopy states' section); however lets see if I the following
is better:

----
Postcopy stream

Loading of device data may cause the device emulation to access guest RAM
that may trigger faults that have to be resolved by the source, as such
the migration stream has to be able to respond with page data *during* the
device load, and hence the device data has to be read from the stream completely
before the device load begins to free the stream up.  This is acheived by
'packaging' the device data into a blob that's read in one go.

Source behaviour

Until postcopy is entered the migration stream is identical to normal postcopy,
except for the addition of a 'postcopy advise' command at the beginning to
let the destination know that postcopy might happen.  When postcopy starts
the source sends the page discard data and then forms the 'package' containing:

   Command: 'postcopy ram listen'
   The device state
      A series of sections, identical to the precopy streams device state stream
      containing everything except postcopiable devices (i.e. RAM)
   Command: 'postcopy ram run'

The 'package' is sent as the data part of a Command: 'CMD_PACKAGED', and the
contents are formatted in the same way as the main migration stream.

Destination behaviour

Initially the destination looks the same as precopy, with a single thread
reading the migration stream; the 'postcopy advise' and 'discard' commands
are processed to change the way RAM is managed, but don't affect the stream
processing.

------------------------------------------------------------------------------
                        1      2   3     4 5                      6   7
main -----DISCARD-CMD_PACKAGED ( LISTEN  DEVICE     DEVICE DEVICE RUN )
thread                             |       |
                                   |     (page request)   
                                   |        \___
                                   v            \
listen thread:                     --- page -- page -- page -- page -- page --

                                   a   b        c
------------------------------------------------------------------------------

On receipt of CMD_PACKAGED (1)
   All the data associated with the package - the ( ... ) section in the
diagram - is read into memory (into a QEMUSizedBuffer), and the main thread
recurses into qemu_loadvm_state_main to process the contents of the package (2)
which contains commands (3,6) and devices (4...)

On receipt of 'postcopy ram listen' - 3 -(i.e. the 1st command in the package)
a new thread (a) is started that takes over servicing the migration stream,
while the main thread carries on loading the package.   It loads normal
background page data (b) but if during a device load a fault happens (5) the
returned page (c) is loaded by the listen thread allowing the main threads
device load to carry on.

The last thing in the CMD_PACKAGED is a 'RUN' command (6) letting the destination
CPUs start running.
At the end of the CMD_PACKAGED (7) the main thread returns to normal running behaviour
and is no longer used by migration, while the listen thread carries
on servicing page data until the end of migration.

----

Is that any better?

Dave
P.S. I know of at least one bug in this code at the moment, it happens
on a VM that doesn't have many dirty pages where all the pages are
transmitted, and hence the listen thread finishes, before the main thread
gets to 'run'.

> From my cursory read of the code it is something like this on the source:
> 
>     finish precopy non-RAM
>     start RAM postcopy
>     for each device
>         pack up data
>         send it to destination
> 
> and on the destination:
> 
>     while source sends packet
>         pick up packet atomically
>         pass the packet to device loader
>             (while the loader works, userfaultfd does background magic)
> 
> But something is missing still, either some kind of ack is needed between
> device data sends or userfaultfd needs to be able to process device data
> packets.
> 
> Paolo
> 
> >Thus you need to be able to start up 'postcopy RAM' before 'finish devices'
> >has completed, and you can't do that if 'finish devices' is still stuffing
> >data down the fd.
> >
> >Now, if hypothetically you had:
> >  1) A migration format that let you separate out device state so that you
> >could load all the state of the device off the fd without calling the device
> >IO code.
> >  2) All devices were good and didn't touch guest memory while loading their
> >state.
> >
> >then you could avoid this complexity.  However, if you look at how Stefan's
> >BER code tried to do 1 (which I don't do in my way of doing it), it was by
> >using the same trick of stuffing the device data into a dummy memory file
> >to find out the size of the data.   And I'm not convinced (2) will happen
> >this century.
> >
> >>Paolo
> >
> >Dave
> >--
> >Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  parent reply	other threads:[~2014-07-16  9:26 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-04 17:41 [Qemu-devel] [PATCH 00/46] Postcopy implementation Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 01/46] qemu_ram_foreach_block: pass up error value, and down the ramblock name Dr. David Alan Gilbert (git)
2014-07-07 15:46   ` Eric Blake
2014-07-07 15:48     ` Dr. David Alan Gilbert
2014-07-04 17:41 ` [Qemu-devel] [PATCH 02/46] Move QEMUFile structure to qemu-file.h Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 03/46] QEMUSizedBuffer/QEMUFile Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 04/46] improve DPRINTF macros, add to savevm Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 05/46] Add qemu_get_counted_string to read a string prefixed by a count byte Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 06/46] Create MigrationIncomingState Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 07/46] Return path: Open a return path on QEMUFile for sockets Dr. David Alan Gilbert (git)
2014-07-05 10:06   ` Paolo Bonzini
2014-07-16  9:37     ` Dr. David Alan Gilbert
2014-07-16  9:50       ` Paolo Bonzini
2014-07-16 11:52         ` Dr. David Alan Gilbert
2014-07-16 12:31           ` Paolo Bonzini
2014-07-16 17:10             ` Dr. David Alan Gilbert
2014-07-17  6:25               ` Paolo Bonzini
2014-07-04 17:41 ` [Qemu-devel] [PATCH 08/46] Return path: socket_writev_buffer: Block even on non-blocking fd's Dr. David Alan Gilbert (git)
2014-07-05 10:07   ` Paolo Bonzini
2014-07-04 17:41 ` [Qemu-devel] [PATCH 09/46] Migration commands Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 10/46] Return path: Control commands Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 11/46] Return path: Send responses from destination to source Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 12/46] Return path: Source handling of return path Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 13/46] qemu_loadvm debug Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 14/46] ram_debug_dump_bitmap: Dump a migration bitmap as text Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 15/46] Rework loadvm path for subloops Dr. David Alan Gilbert (git)
2014-07-05 10:26   ` Paolo Bonzini
2014-07-07 14:35     ` Dr. David Alan Gilbert
2014-07-07 14:53       ` Paolo Bonzini
2014-07-07 15:04         ` Dr. David Alan Gilbert
2014-07-16  9:25         ` Dr. David Alan Gilbert [this message]
2014-07-04 17:41 ` [Qemu-devel] [PATCH 16/46] Add migration-capability boolean for postcopy-ram Dr. David Alan Gilbert (git)
2014-07-07 19:41   ` Eric Blake
2014-07-07 20:23     ` Dr. David Alan Gilbert
2014-07-10 16:17       ` Paolo Bonzini
2014-07-10 19:02         ` Dr. David Alan Gilbert
2014-07-04 17:41 ` [Qemu-devel] [PATCH 17/46] Add wrappers and handlers for sending/receiving the postcopy-ram migration messages Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 18/46] QEMU_VM_CMD_PACKAGED: Send a packaged chunk of migration stream Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 19/46] migrate_init: Call from savevm Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 20/46] Allow savevm handlers to state whether they could go into postcopy Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 21/46] postcopy: OS support test Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 22/46] Migration parameters: Add qmp/hmp commands for setting/viewing Dr. David Alan Gilbert (git)
2014-07-07 19:50   ` Eric Blake
2014-07-04 17:41 ` [Qemu-devel] [PATCH 23/46] MIG_STATE_POSTCOPY_ACTIVE: Add new migration state Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 24/46] qemu_savevm_state_complete: Postcopy changes Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 25/46] Postcopy: Maintain sentmap during postcopy pre phase Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 26/46] Postcopy page-map-incoming (PMI) structure Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 27/46] postcopy: Add incoming_init/cleanup functions Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 28/46] postcopy: Incoming initialisation Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 29/46] postcopy: ram_enable_notify to switch on userfault Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 30/46] Postcopy: postcopy_start Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 31/46] Postcopy: Rework migration thread for postcopy mode Dr. David Alan Gilbert (git)
2014-07-05 10:19   ` Paolo Bonzini
2014-08-28 11:04     ` Dr. David Alan Gilbert
2014-08-28 11:23       ` Paolo Bonzini
2014-07-04 17:41 ` [Qemu-devel] [PATCH 32/46] mig fd_connect: open return path Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 33/46] Postcopy: Create a fault handler thread before marking the ram as userfault Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 34/46] Page request: Add MIG_RPCOMM_REQPAGES reverse command Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 35/46] Page request: Process incoming page request Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 36/46] Page request: Consume pages off the post-copy queue Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 37/46] Add assertion to check migration_dirty_pages doesn't go -ve; have seen it happen once but not sure why Dr. David Alan Gilbert (git)
2014-07-11 15:20   ` Eric Blake
2014-07-11 15:41     ` Dr. David Alan Gilbert
2014-07-04 17:41 ` [Qemu-devel] [PATCH 38/46] postcopy_ram.c: place_page and helpers Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 39/46] Postcopy: Use helpers to map pages during migration Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 40/46] qemu_ram_block_from_host Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 41/46] Handle userfault requests (although userfaultfd not done yet) Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 42/46] Start up a postcopy/listener thread ready for incoming page data Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 43/46] postcopy: Wire up loadvm_postcopy_ram_handle_{run, end} commands Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 44/46] postcopy: Use userfaultfd Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 45/46] End of migration for postcopy Dr. David Alan Gilbert (git)
2014-07-04 17:41 ` [Qemu-devel] [PATCH 46/46] Start documenting how postcopy works Dr. David Alan Gilbert (git)
2014-07-05 10:28 ` [Qemu-devel] [PATCH 00/46] Postcopy implementation Paolo Bonzini
2014-07-07 14:02   ` Dr. David Alan Gilbert
2014-07-07 14:35     ` Paolo Bonzini
2014-07-07 14:58       ` Dr. David Alan Gilbert
2014-07-10 11:29       ` Dr. David Alan Gilbert
2014-07-10 12:48         ` Eric Blake
2014-07-10 13:37           ` Dr. David Alan Gilbert
2014-07-10 15:33             ` Andrea Arcangeli
2014-07-10 15:49               ` Dr. David Alan Gilbert
2014-07-11  4:05                 ` Sanidhya Kashyap
2014-08-11 15:31           ` Dr. David Alan Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140716092538.GD2514@work-vm \
    --to=dgilbert@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=lilei@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=yamahata@private.email.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.