From: "Michael S. Tsirkin" <mst@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>,
	aliguori@linux.vnet.ibm.com,
	Anthony Liguori <aliguori@us.ibm.com>,
	qemu-devel@nongnu.org, Stefan Berger <stefanb@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [RFC] New Migration Protocol using Visitor Interface
Date: Mon, 3 Oct 2011 17:45:54 +0200
Message-ID: <20111003154554.GE20141@redhat.com>
In-Reply-To: <4E89CE20.6050706@codemonkey.ws>

On Mon, Oct 03, 2011 at 10:00:48AM -0500, Anthony Liguori wrote:
> On 10/03/2011 09:41 AM, Michael S. Tsirkin wrote:
> >On Mon, Oct 03, 2011 at 08:51:10AM -0500, Anthony Liguori wrote:
> >>On 10/03/2011 08:38 AM, Michael S. Tsirkin wrote:
> >>>On Mon, Oct 03, 2011 at 07:55:48AM -0500, Anthony Liguori wrote:
> >>>>On 10/02/2011 04:08 PM, Michael S. Tsirkin wrote:
> >>>>>On Sun, Oct 02, 2011 at 04:21:47PM -0400, Stefan Berger wrote:
> >>>>>>
> >>>>>>>4) Implement the BERVisitor and make this the default migration protocol.
> >>>>>>>
> >>>>>>>Most of the work will be in 1), though with the implementation in this series we should be able to do it incrementally. I'm not sure if the best approach is doing the mechanical phase 1 conversion, then doing phase 2 sometime after 4), doing phase 1 + 2 as part of 1), or just doing VMState conversions which gives basically the same capabilities as phase 1 + 2.
> >>>>>>>
> >>>>>>>Thoughts?
> >>>>>>Is anyone working on this? If not I may give it a shot (tomorrow++)
> >>>>>>for at least some of the primitives... for enabling vNVRAM metadata
> >>>>>>of course. Indefinite-length encoding of constructed data types, I
> >>>>>>suppose, won't be used; otherwise the visitor interface seems wrong
> >>>>>>for parsing and skipping extra data towards the end of a
> >>>>>>structure. If version n wrote the stream and appended some of its
> >>>>>>version-n data, and version m < n is now trying to read the struct
> >>>>>>and needs to skip the version [m+1, n] data fields, then
> >>>>>>the de-serialization of the stream should probably be stream-driven
> >>>>>>rather than structure-driven.
> >>>>>>
> >>>>>>    Stefan
> >>>>>
> >>>>>Yes I've been struggling with that exactly.
> >>>>>Anthony, any thoughts?
> >>>>
> >>>>It just depends on how you write your visitor.  If you used
> >>>>sequences, you'd probably do something like this:
> >>>>
> >>>>start_struct ->
> >>>>   check for sequence tag, push starting offset and size onto stack
> >>>>   increment offset to next tag
> >>>>
> >>>>type_int (et al) ->
> >>>>   check for explicit type, parse data
> >>>>   increment offset to next tag
> >>>>
> >>>>end_struct ->
> >>>>   pop starting offset and size to temp variables
> >>>>   set offset to starting offset + size
> >>>>
> >>>>This is roughly how the QMP input marshaller works FWIW.
> >>>>
> >>>>Regards,
> >>>>
> >>>>Anthony Liguori
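
A minimal sketch of the start_struct / type_int / end_struct scheme
outlined above, to make the skipping behaviour concrete. It is not
QEMU code: the Reader type, the one-byte tags and one-byte lengths,
and the parse_v1() helper are all assumptions made for illustration,
and integers are treated as unsigned big-endian, which is simpler than
real BER. Instead of pushing the starting offset and size separately,
it pushes their sum (the end offset), which is equivalent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte tags and one-byte lengths; real BER is richer. */
enum { TAG_INT = 0x02, TAG_SEQ = 0x30, MAX_DEPTH = 8 };

typedef struct {
    const uint8_t *buf;
    size_t len;             /* total bytes in buf */
    size_t off;             /* offset of the next tag */
    size_t end[MAX_DEPTH];  /* end offset of each open sequence */
    int depth;
} Reader;

/* Nothing may be read past the innermost open sequence. */
static size_t limit(const Reader *r)
{
    return r->depth ? r->end[r->depth - 1] : r->len;
}

/* Check for the sequence tag, push its end offset, move to the next tag. */
static int start_struct(Reader *r)
{
    size_t lim = limit(r);

    assert(r->off <= lim);
    if (r->depth == MAX_DEPTH || lim - r->off < 2 ||
        r->buf[r->off] != TAG_SEQ) {
        return -1;
    }
    size_t size = r->buf[r->off + 1];
    size_t start = r->off + 2;
    if (size > lim - start) {
        return -1;
    }
    r->end[r->depth++] = start + size;
    r->off = start;
    return 0;
}

/* Check for the explicit type, parse the data, move to the next tag. */
static int type_int(Reader *r, uint32_t *out)
{
    size_t lim = limit(r);

    assert(r->off <= lim);
    if (lim - r->off < 2 || r->buf[r->off] != TAG_INT) {
        return -1;
    }
    size_t size = r->buf[r->off + 1];
    if (size == 0 || size > 4 || size > lim - r->off - 2) {
        return -1;
    }
    uint32_t v = 0;
    for (size_t i = 0; i < size; i++) {
        v = (v << 8) | r->buf[r->off + 2 + i];
    }
    *out = v;
    r->off += 2 + size;
    return 0;
}

/* Pop the end offset and jump there, skipping any fields left unread. */
static int end_struct(Reader *r)
{
    if (r->depth == 0) {
        return -1;
    }
    r->off = r->end[--r->depth];
    return 0;
}

/* An "old" reader: knows one field in the struct, then one int after it. */
int parse_v1(const uint8_t *buf, size_t len, uint32_t *field, uint32_t *next)
{
    Reader r = { .buf = buf, .len = len };

    if (start_struct(&r) || type_int(&r, field) || end_struct(&r)) {
        return -1;
    }
    return type_int(&r, next);
}
```

Given the bytes 30 07 02 01 07 02 02 01 02 02 01 2A, a sequence holding
one known integer plus one integer appended by a newer writer, followed
by a further integer, parse_v1() reads 7 from inside the sequence,
skips the appended field in end_struct, and then reads 42. This only
works because the sequence carries a definite length; with
indefinite-length encoding the reader would have to walk the unknown
fields to find the end.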
> >>>
> >>>One thing I worry about is enabling zero copy for
> >>>large string types (e.g. memory migration).
> >>
> >>Memory shouldn't be done through Visitors.  It should be handled as a special case.
> >
> >OK, that's fine then.
> >
> >>>So we need to be able to see a tag for memory page + address,
> >>>read that from socket directly at the correct virtual address.
> >>>
> >>>Probably, we can avoid using visitors for memory, and hope
> >>>everything else can stand an extra copy since it's small.
> >>>
> >>>But then, why do we worry about the size of
> >>>encoded device state as Anthony seems to do?
> >>
> >>There's a significant difference between the cost of something on
> >>the wire and the cost of doing a memcpy.  The cost of the data on
> >>the wire is directly proportional to downtime.  So if we increase
> >>the size of the device state by a factor of 10, we increase the
> >>minimum downtime by a factor of 10.
> >>
> >>Of course, *if* the size of device state is already negligible with
> >>respect to the minimum downtime, then it doesn't matter.  This is
> >>easy to quantify though.  For a normal migration session today,
> >>what's the total size of the device state in relation to the
> >>calculated bandwidth of the minimum downtime?
> >>
> >>If it's very small, then we can add names and not worry about it.
> >>
> >>Regards,
> >>
> >>Anthony Liguori
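
To put rough numbers on the argument above, here is a small sketch of
the arithmetic. The function names and every figure used with them
(link speed, downtime limit, device state size) are made-up
assumptions, not measurements from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds needed to push `bytes` over a link of `bytes_per_sec`. */
double wire_time(uint64_t bytes, double bytes_per_sec)
{
    assert(bytes_per_sec > 0.0);
    return (double)bytes / bytes_per_sec;
}

/* Bytes that can be sent within the allowed downtime. */
double downtime_budget_bytes(double max_downtime_sec, double bytes_per_sec)
{
    assert(max_downtime_sec > 0.0 && bytes_per_sec > 0.0);
    return max_downtime_sec * bytes_per_sec;
}

/* Fraction of that budget consumed by the device state alone. */
double device_state_share(uint64_t device_bytes, double max_downtime_sec,
                          double bytes_per_sec)
{
    return (double)device_bytes /
           downtime_budget_bytes(max_downtime_sec, bytes_per_sec);
}
```

With an assumed 100 MB/s link and a 30 ms downtime limit, the budget is
about 3 MB. A hypothetical 100 kB of device state uses roughly 3% of
it; growing that state tenfold raises the share to roughly 33%, and the
time spent sending it grows by the same factor of ten. Whether that
matters depends on how small the starting share is, which is the
measurement being asked for.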
> >
> >Yes, it's easy to quantify. I think the following gives us
> >the offset before and after, so the difference is the size
> >we seek, right?
> 
> Yeah, you'll also want:
> 
> diff --git a/arch_init.c b/arch_init.c
> index a6c69c7..0d64200 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -334,6 +334,10 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, voi
> 
>      expected_time = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
> 
> +    if (stage == 2 && expected_time <= migrate_max_downtime()) {
> +        fprintf(stderr, "max bwidth: %lld\n", (long long)(expected_time * bwidth));
> +    }
> +
>      return (stage == 2) && (expected_time <= migrate_max_downtime());
>  }
> 
> You'll want to compare the size to max bwidth.

Well, that depends on how the guest behaves, etc. I'm guessing
the full memory size is a sane thing to compare against.

I don't have a problem sticking this fprintf in as well.

> BTW, putting this info properly into migration stats would probably
> be pretty useful.
> 
> Regards,
> 
> Anthony Liguori

The problem is that adding anything to the monitor makes me worry
about future compatibility so much that I usually just give up.
IMO we really need a namespace for in-development experimental
commands, like "unsupported-XXX"; this would belong there.

-- 
MST
