All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: duanj@linux.vnet.ibm.com, quintela@redhat.com,
	qemu-devel@nongnu.org, dgilbert@redhat.com, qemu-ppc@nongnu.org,
	bharata@linux.vnet.ibm.com, amit.shah@redhat.com
Subject: Re: [Qemu-devel] [PATCH for-2.8 1/3] migration: alternative way to set instance_id in SaveStateEntry
Date: Wed, 30 Nov 2016 16:22:42 -0600	[thread overview]
Message-ID: <20161130222242.1725.43900@loki> (raw)
In-Reply-To: <20161122225844.3756.74867@loki>

Quoting Michael Roth (2016-11-22 16:58:44)
> Quoting David Gibson (2016-11-22 00:15:10)
> > On Thu, Nov 17, 2016 at 07:40:25PM -0600, Michael Roth wrote:
> > > From: Jianjun Duan <duanj@linux.vnet.ibm.com>
> > > 
> > > Currently migrated Devices are identified with an idstr which is
> > > calculated automatically using their path in the QOM composition
> > > tree. In some cases these Devices may have a more reliable
> > > identifier that may be preferable in situations where there's a
> > > chance their path in the composition tree might change in the
> > > future as a resulting of changes in how the device is modeled
> > > in QEMU.
> > > 
> > > One such Device is the "spapr-dr-connector" used to handle hotplug
> > > for various resources on pseries machine types, where the PAPR
> > > specification mandates that each DRC have a 32-bit globally unique
> > > identifier associated with it. As such, this identifier is also ideal
> > > as a reliable way to identify a particular DRC in the migration
> > > stream, so we introduce support here for using a caller-side supplied
> > > instance_id for Devices in preparation for that.
> > > 
> > > register_savevm_live() and vmstate_register_with_alias_id() already
> > > provide a means to let the caller supply an instance_id instead of
> > > relying on the migration subsystem to generate one automatically,
> > > but in cases where we're registering SaveVMHandlers or VMSDs
> > > (respectively) that are associated with a Device, the instance_id is
> > > ignored in favor of a string identifier based solely on the QOM path.
> > > 
> > > This patch generalizes this so that an instance_id can also be
> > > supplied by the caller for Devices. Since VMSD registration for
> > > Devices is generally handled automatically by qdev, we also introduce
> > > a DeviceClass->dev_get_instance_id() method that, when set, is called
> > > by qdev to obtain the corresponding instance_id that should be used
> > > for a particular Device. Otherwise we maintain the original behavior
> > > of passing instance_id == -1 and thus falling back to the previous
> > > logic of using the QOM path.
> > > 
> > > Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
> > > * moved usage of DeviceClass->dev_get_instance_id() out of savevm.c
> > >   and into caller (qdev) instead
> > > * clarified usage/intent in comments and commit msg
> > > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> > 
> > I've had to remove this patch and 2/3 from ppc-for-2.8, because I
> > discovered (as I was preparing a pull request) that it causes a weird
> > breakage.
> 
> Sorry for the breakage. I will make sure to run make check on ppc64
> beforehand next time as well. Though in this case we might've gotten
> lucky that it happened to trigger at all...
> 
> > 
> > Specifically, on some, but not all, setups it causes the postcopy-test
> > to fail with the error:
> > 
> > qemu-system-ppc64: RP: Received invalid message 0x0000 length 0x0000
> > FAIL
> > 
> > For me, it fails when running on a ppc64le or ppc64 host (RHEL7.3),
> > and on 32-bit x86 (Fedora container) but not on x86_64 host (Fedora
> > 24).
> > 
> > That's a pretty baffling set of symptoms, and so far I haven't gotten
> > far in figuring out how it could happen.  But I really want to get the
> > rest of the patches in ppc-for-2.8 pulled, so I've dropped these for
> > now.
> > 
> > I'll try to debug this once I've prepared the next pull request, but
> > if you're able to work out what's going on for me, that would be
> > extremely helpful.
> 
> It turns out all these strange connections might just be red herrings.
> I think the problem is here in spapr_pci_post_load:
> 
>     unsigned int bus_no = 0;
> 
>     /* Set detach_cb for the drc unconditionally after migration */
>     if (bus) {
>         pci_for_each_device(bus, pci_bus_num(bus), spapr_pci_set_detach_cb,
>                             &bus_no);
>     }
> 
> I think that was a copy/paste artifact from one of the other callsites in
> spapr_pci.c. &bus_no is an opaque that spapr_pci_set_detach_cb ends up trying
> to use as sPAPRMachineState, which was leading to a NULL pointer dereference
> when trying to access the DRC corresponding the the PCIDevice. The search
> query is something like:
> 
>     return spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_PCI,
>                                     (phb->index << 16) |
>                                     (busnr << 8) |
>                                     devfn);
> 
> My guess is that in some cases phb->index just happened to be 0, which would
> in most cases lead to a successful query. That might explain the different
> results on different architectures.
> 
> Based on the some discussion elsehwere is turns out we don't need this
> detach cb stuff at all for the unplug issue we're trying to fix here, so
> I will strip it out for the next version.
> 
> > 
> > I do have one vague theory..
> 
> I think that needs to be dealt with as well, more comments below.
> 
> > 
> > > ---
> > >  hw/core/qdev.c         | 6 +++++-
> > >  include/hw/qdev-core.h | 9 +++++++++
> > >  migration/savevm.c     | 4 ++--
> > >  3 files changed, 16 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> > > index 5783442..c8c0c44 100644
> > > --- a/hw/core/qdev.c
> > > +++ b/hw/core/qdev.c
> > > @@ -933,7 +933,11 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
> > >          }
> > >  
> > >          if (qdev_get_vmsd(dev)) {
> > > -            vmstate_register_with_alias_id(dev, -1, qdev_get_vmsd(dev), dev,
> > > +            int instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id
> > > +                ? DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev)
> > > +                : -1;
> > > +
> > > +            vmstate_register_with_alias_id(dev, instance_id, qdev_get_vmsd(dev), dev,
> > >                                             dev->instance_id_alias,
> > >                                             dev->alias_required_for_version);
> > >          }
> > > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> > > index 2c97347..8ba82af 100644
> > > --- a/include/hw/qdev-core.h
> > > +++ b/include/hw/qdev-core.h
> > > @@ -139,6 +139,15 @@ typedef struct DeviceClass {
> > >      qdev_initfn init; /* TODO remove, once users are converted to realize */
> > >      qdev_event exit; /* TODO remove, once users are converted to unrealize */
> > >      const char *bus_type;
> > > +
> > > +    /* When this field is set, instead of using the device's QOM path,
> > > +     * SaveStateEntry's for devices will be identified using a combination
> > > +     * of the corresponding VMSD name and an instance_id returned by this
> > > +     * function. This should only be necessary for situations where the
> > > +     * QOM path is anticipated to change and a more stable identifier is
> > > +     * desired to identify a device in the migration stream.
> > > +     */
> > > +    int (*dev_get_instance_id)(DeviceState *dev);
> > >  } DeviceClass;
> > >  
> > >  typedef struct NamedGPIOList NamedGPIOList;
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index 0363372..a95fff9 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -556,7 +556,7 @@ int register_savevm_live(DeviceState *dev,
> > >          se->is_ram = 1;
> > >      }
> > >  
> > > -    if (dev) {
> > > +    if (dev && instance_id == -1) {
> > 
> > .. so, I'm not sure how this could lead to the observed symptoms, but
> > I did note that adding this condition makes another if within the
> > block redundant, because it also checks for instance_id == -1.  That
> > suggests that simply skipping the whole block in this case is probably
> > not the right thing to do.
> 
> I don't think this is the cause (verified the loading orders didn't
> change before/after these patches for the scenario in question), but
> that's a good catch:
> 
> there was previously se->compat->idstr/instance_id handling for
> cases where a we register a VMSD for a Device and provide an explicit
> instance_id. This registers an additional check so that we can map
> the se->compat values from an older QEMU to the current ID. This
> unfortunately doesn't work in the opposite direction so didn't suite
> our purposes for controlling identifiers in the outgoing stream, but
> by accidentally bypassing it completely there's now a chance old->new
> configurations might break for certain devices, so this needs to be
> re-thought a bit...

It turns out spapr_iommu already had a fairly straightforward solution.
Rather than relying on qdev to register it's vmstate, it does it
manually via:

  vmstate_register(DEVICE(tcet), tcet->liobn, &vmstate_spapr_tce_table,
                   tcet);

This avoids the qom-path-based code due to the fact that
qdev_get_dev_path() is NULL for bus-less Devices, so as a fallback the
VMSD gets identified via "spapr_iommu"/liobn. So using this approach,
we don't actually need patch 1 at all.

But it turns out, for this particular memory unplug issue, we don't
actually need DRC migration. The real issue is that the default DRC
state for coldplugged LMBs *should* correspond to the post-hotplug
state. Otherwise, unplug fails for cold-plugged LMBs *even if you
don't migrate beforehand*. The fix for that is much more
straight-forward, and the same approach we currently use for
coldplugged CPUs. Will send a patch shortly.

> 
> > 
> > >          char *id = qdev_get_dev_path(dev);
> > >          if (id) {
> > >              pstrcpy(se->idstr, sizeof(se->idstr), id);
> > > @@ -640,7 +640,7 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
> > >      se->vmsd = vmsd;
> > >      se->alias_id = alias_id;
> > >  
> > > -    if (dev) {
> > > +    if (dev && instance_id == -1) {
> > >          char *id = qdev_get_dev_path(dev);
> > >          if (id) {
> > >              pstrcpy(se->idstr, sizeof(se->idstr), id);
> > 
> > -- 
> > David Gibson                    | I'll have my music baroque, and my code
> > david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
> >                                 | _way_ _around_!
> > http://www.ozlabs.org/~dgibson
> 
> 

  reply	other threads:[~2016-11-30 22:23 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-18  1:40 [Qemu-devel] [PATCH for-2.8 0/3] spapr: fix breakage of memory unplug after migration Michael Roth
2016-11-18  1:40 ` [Qemu-devel] [PATCH for-2.8 1/3] migration: alternative way to set instance_id in SaveStateEntry Michael Roth
2016-11-22  6:15   ` David Gibson
2016-11-22 10:23     ` Dr. David Alan Gilbert
2016-11-22 22:58     ` Michael Roth
2016-11-30 22:22       ` Michael Roth [this message]
2016-11-18  1:40 ` [Qemu-devel] [PATCH for-2.8 2/3] migration: spapr_drc: defined VMStateDescription struct Michael Roth
2016-11-18  6:04   ` David Gibson
2016-11-18 16:32     ` Michael Roth
2016-11-22 16:35   ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2016-11-22 17:24     ` Michael Roth
2016-11-22 17:33       ` Michael Roth
2016-11-22 21:28         ` Greg Kurz
2016-11-22 20:09       ` Greg Kurz
2016-11-18  1:40 ` [Qemu-devel] [PATCH for-2.8 3/3] spapr: migration support for CAS-negotiated option vectors Michael Roth
2016-11-18 16:08   ` Michael Roth
2016-11-20 23:57     ` David Gibson
2016-11-18  1:51 ` [Qemu-devel] [PATCH for-2.8 0/3] spapr: fix breakage of memory unplug after migration no-reply
2016-11-18  5:45 ` David Gibson
2016-11-18 16:39   ` Michael Roth
2016-11-20 23:58     ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161130222242.1725.43900@loki \
    --to=mdroth@linux.vnet.ibm.com \
    --cc=amit.shah@redhat.com \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=dgilbert@redhat.com \
    --cc=duanj@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.