From: David Gibson <david@gibson.dropbear.id.au>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>,
Juan Quintela <quintela@redhat.com>,
Scott Cheloha <cheloha@linux.vnet.ibm.com>,
Michael Roth <mdroth@linux.vnet.ibm.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v2 2/2] migration: savevm_state_handler_insert: constant-time element insertion
Date: Thu, 5 Dec 2019 09:28:27 +1100
Message-ID: <20191204222827.GE5031@umbus.fritz.box>
In-Reply-To: <20191204164915.GM3325@work-vm>
On Wed, Dec 04, 2019 at 04:49:15PM +0000, Dr. David Alan Gilbert wrote:
> * Scott Cheloha (cheloha@linux.vnet.ibm.com) wrote:
> > On Mon, Oct 21, 2019 at 09:14:44AM +0100, Dr. David Alan Gilbert wrote:
> > > * David Gibson (david@gibson.dropbear.id.au) wrote:
> > > > On Fri, Oct 18, 2019 at 10:43:52AM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Laurent Vivier (lvivier@redhat.com) wrote:
> > > > > > On 18/10/2019 10:16, Dr. David Alan Gilbert wrote:
> > > > > > > * Scott Cheloha (cheloha@linux.vnet.ibm.com) wrote:
> > > > > > >> savevm_state's SaveStateEntry TAILQ is a priority queue. Priority
> > > > > > >> sorting is maintained by searching from head to tail for a suitable
> > > > > > >> insertion spot. Insertion is thus an O(n) operation.
> > > > > > >>
> > > > > > >> If we instead keep track of the head of each priority's subqueue
> > > > > > >> within that larger queue we can reduce this operation to O(1) time.
> > > > > > >>
> > > > > > >> savevm_state_handler_remove() becomes slightly more complex to
> > > > > > > >> accommodate these gains: we need to replace the head of a priority's
> > > > > > >> subqueue when removing it.
> > > > > > >>
> > > > > > >> With O(1) insertion, booting VMs with many SaveStateEntry objects is
> > > > > > >> more plausible. For example, a ppc64 VM with maxmem=8T has 40000 such
> > > > > > >> objects to insert.
> > > > > > >
> > > > > > > Separate from reviewing this patch, I'd like to understand why you've
> > > > > > > got 40000 objects. This feels very very wrong and is likely to cause
> > > > > > > problems to random other bits of qemu as well.
> > > > > >
> > > > > > I think the 40000 objects are the "dr-connectors" that are used to plug
> > > > > > peripherals (memory, pci card, cpus, ...).
> > > > >
> > > > > Yes, Scott confirmed that in the reply to the previous version.
> > > > > IMHO nothing in qemu is designed to deal with that many devices/objects
> > > > > - I'm sure that something other than the migration code is going to
> > > > > get upset.
> > > >
> > > > It kind of did.  Particularly when there was n^2 and n^3
> > > > behaviour in the property stuff we had some ludicrously long startup
> > > > times (hours) with large maxmem values.
> > > >
> > > > Fwiw, the DRCs for PCI slots, DRCs and PHBs aren't really a problem.
> > > > The problem is the memory DRCs, there's one for each LMB - each 256MiB
> > > > chunk of memory (or possible memory).
> > > >
> > > > > Is perhaps the structure wrong somewhere - should there be a single DRC
> > > > > device that knows about all DRCs?
> > > >
> > > > Maybe. The tricky bit is how to get there from here without breaking
> > > > migration or something else along the way.
> > >
> > > Switch on the next machine type version - it doesn't matter if migration
> > > is incompatible then.
> >
> > 1mo bump.
> >
> > Is there anything I need to do with this patch in particular to make it suitable
> > for merging?
>
> Apologies for the delay; hopefully this will go in one of the pulls
> just after the tree opens again.
>
> Please please try and work on reducing the number of objects somehow -
> while this migration fix is a useful short term fix, and not too
> invasive; having that many objects around qemu is a really really bad
> idea so needs fixing properly.
I'm hoping to have a crack at this tomorrow.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
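
For readers following the thread, the technique described in the quoted
commit message (remember the head of each priority's subqueue so that
insertion no longer scans the whole list) can be sketched roughly as
below. This is a minimal illustration, not the code from the patch: the
names Entry, PrioQueue, PRIO_MAX, pq_init(), pq_insert() and pq_remove()
are hypothetical, and it assumes the BSD-style <sys/queue.h> TAILQ macros
rather than QEMU's own queue header. Higher priorities are assumed to
sort first.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical: priorities run 0..PRIO_MAX, higher values sort first. */
#define PRIO_MAX 3

typedef struct Entry {
    int priority;
    int id;
    TAILQ_ENTRY(Entry) link;
} Entry;

typedef struct PrioQueue {
    TAILQ_HEAD(EntryList, Entry) entries;
    /* First entry of each priority's subqueue, or NULL if it is empty. */
    Entry *pri_head[PRIO_MAX + 1];
} PrioQueue;

static void pq_init(PrioQueue *q)
{
    TAILQ_INIT(&q->entries);
    for (int i = 0; i <= PRIO_MAX; i++) {
        q->pri_head[i] = NULL;
    }
}

/*
 * Append ne to the end of its priority's subqueue. The loop visits at
 * most PRIO_MAX slots, so the cost does not grow with the list length.
 */
static void pq_insert(PrioQueue *q, Entry *ne)
{
    Entry *before = NULL;

    assert(ne->priority >= 0 && ne->priority <= PRIO_MAX);

    /* Find the head of the nearest non-empty lower-priority subqueue. */
    for (int i = ne->priority - 1; i >= 0; i--) {
        if (q->pri_head[i] != NULL) {
            before = q->pri_head[i];
            break;
        }
    }

    if (before != NULL) {
        TAILQ_INSERT_BEFORE(before, ne, link);
    } else {
        TAILQ_INSERT_TAIL(&q->entries, ne, link);
    }

    if (q->pri_head[ne->priority] == NULL) {
        q->pri_head[ne->priority] = ne;
    }
}

/*
 * Remove e. If e was the head of its subqueue, its successor becomes
 * the new head when it has the same priority; otherwise the subqueue
 * is now empty.
 */
static void pq_remove(PrioQueue *q, Entry *e)
{
    if (q->pri_head[e->priority] == e) {
        Entry *next = TAILQ_NEXT(e, link);

        if (next != NULL && next->priority == e->priority) {
            q->pri_head[e->priority] = next;
        } else {
            q->pri_head[e->priority] = NULL;
        }
    }
    TAILQ_REMOVE(&q->entries, e, link);
}
```

The extra bookkeeping in pq_remove() corresponds to the "slightly more
complex" removal the commit message mentions. Strictly speaking the
insert is bounded by the number of priority levels rather than being a
single step, which is constant with respect to the number of entries.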
Thread overview: 17+ messages
2019-10-17 20:59 [PATCH v2 0/2] migration: faster savevm_state_handler_insert() Scott Cheloha
2019-10-17 20:59 ` [PATCH v2 1/2] migration: add savevm_state_handler_remove() Scott Cheloha
2019-12-04 16:43 ` Dr. David Alan Gilbert
2020-01-08 19:07 ` Juan Quintela
2019-10-17 20:59 ` [PATCH v2 2/2] migration: savevm_state_handler_insert: constant-time element insertion Scott Cheloha
2019-10-18 8:16 ` Dr. David Alan Gilbert
2019-10-18 8:34 ` Laurent Vivier
2019-10-18 9:43 ` Dr. David Alan Gilbert
2019-10-18 16:38 ` Michael Roth
2019-10-18 17:26 ` Dr. David Alan Gilbert
2019-10-21 7:33 ` David Gibson
2019-10-19 10:12 ` David Gibson
2019-10-21 8:14 ` Dr. David Alan Gilbert
2019-11-20 21:48 ` Scott Cheloha
2019-12-04 16:49 ` Dr. David Alan Gilbert
2019-12-04 22:28 ` David Gibson [this message]
2019-12-04 16:47 ` Dr. David Alan Gilbert