All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kurz <groug@kaod.org>
To: Thomas Huth <thuth@redhat.com>
Cc: "Laurent Vivier" <lvivier@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"QEMU Developers" <qemu-devel@nongnu.org>,
	"qemu-ppc@nongnu.org" <qemu-ppc@nongnu.org>,
	"Cédric Le Goater" <clg@kaod.org>,
	"David Gibson" <dgibson@redhat.com>
Subject: Re: qemu-system-ppc64 abort()s with pcie bridges
Date: Thu, 9 Jul 2020 11:28:01 +0200	[thread overview]
Message-ID: <20200709112801.0f4cae40@bahia.lan> (raw)
In-Reply-To: <20200708115703.7926205a@bahia.lan>

On Wed, 8 Jul 2020 11:57:03 +0200
Greg Kurz <groug@kaod.org> wrote:

> On Wed, 8 Jul 2020 10:03:47 +0200
> Thomas Huth <thuth@redhat.com> wrote:
> 
> > 
> >  Hi,
> > 
> > qemu-system-ppc64 currently abort()s when it is started with a pcie
> > bridge device:
> > 
> > $ qemu-system-ppc64 -M pseries-5.1 -device pcie-pci-bridge
> > Unexpected error in object_property_find() at qom/object.c:1240:
> > qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not found
> > Aborted (core dumped)
> > 
> > or:
> > 
> > $ qemu-system-ppc64 -M pseries -device dec-21154-p2p-bridge
> > Unexpected error in object_property_find() at qom/object.c:1240:
> > qemu-system-ppc64: -device dec-21154-p2p-bridge: Property '.chassis_nr'
> > not found
> > Aborted (core dumped)
> > 
> > That's kind of ugly, and it shows up as error when running
> > scripts/device-crash-test. Is there an easy way to avoid the abort() and
> > fail more gracefully here?
> > 
> 
> And even worse, this can tear down a running guest with hotplug :\
> 
> (qemu) device_add pcie-pci-bridge 
> Unexpected error in object_property_find() at /home/greg/Work/qemu/qemu-ppc/qom/object.c:1240:
> Property '.chassis_nr' not found
> Aborted (core dumped)
> 
> This is caused by recent commit:
> 
> commit 7ef1553dac8ef8dbe547b58d7420461a16be0eeb
> Author: Markus Armbruster <armbru@redhat.com>
> Date:   Tue May 5 17:29:25 2020 +0200
> 
>     spapr_pci: Drop some dead error handling
>     
>     chassis_from_bus() uses object_property_get_uint() to get property
>     "chassis_nr" of the bridge device.  Failure would be a programming
>     error.  Pass &error_abort, and simplify its callers.
>     
>     Cc: David Gibson <david@gibson.dropbear.id.au>
>     Cc: qemu-ppc@nongnu.org
>     Signed-off-by: Markus Armbruster <armbru@redhat.com>
>     Acked-by: David Gibson <david@gibson.dropbear.id.au>
>     Reviewed-by: Greg Kurz <groug@kaod.org>
>     Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>     Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
>     Message-Id: <20200505152926.18877-18-armbru@redhat.com>
> 
> Before that, we would simply print the "chassir_nr not found" error,
> and in case of a cold plugged device exit.
> 
> The root cause is that the sPAPR PCI code assumes that a PCI bridge
> has a "chassir_nr" property, ie. it is a standard PCI bridge. Other
> PCI bridge types don't have that. Not sure yet why this information
> is required, I'll check LoPAPR.
> 

More on this side : each slot of a PCI bridge is associated a DRC (a
PAPR thingy to handle hot plug/unplug). Each DRC must have a unique
identifier system-wide. We used to use the bus number to compute
the DRC id but it was broken, so we now _hijack_ "chassis_nr" as an
alternative since this commit:

commit 05929a6c5dfe1028ef66250b7bbf11939f8e77cd
Author: David Gibson <david@gibson.dropbear.id.au>
Date:   Wed Apr 10 11:49:28 2019 +1000

    spapr: Don't use bus number for building DRC ids

This means that we only support the standard pci-bridge device,
and this relies on the availability of "chassis_nr". Failure
to find this property is then not a programming error, but
an expected case where we want to fail gracefully (ie. revert
Markus's commit mentioned above).

While reading code I realized that we have another problem : the
realization of the pci-bridge device does fail if "chassis_nr" is
zero, but I failed to find a uniqueness check. And we get:

$ qemu-system-ppc64 -device pci-bridge,chassis_nr=1 -device pci-bridge,chassis_nr=1
Unexpected error in object_property_try_add() at qom/object.c:1167:
qemu-system-ppc64: -device pci-bridge,chassis_nr=1: attempt to add duplicate property '40000100' to object (type 'container')
Aborted (core dumped)

It is very confusing to see that we state that "chassis_nr" is unique
several times in slotid_cap_init() but it is never enforced anywhere.

    if (!chassis) {
        error_setg(errp, "Bridge chassis not specified. Each bridge is required"
                   " to be assigned a unique chassis id > 0.");
        return -EINVAL;
    }

or

    /* We make each chassis unique, this way each bridge is First in Chassis */


Michael, Marcel or anyone with PCI knowledge,

Can you shed some light on the semantics of "chassis_nr" ?

> In the meantime, since we're in soft freeze, I guess we should
> revert Markus's patch and add a big fat comment to explain
> what's going on and maybe change the error message to something
> more informative, eg. "PCIE-to-PCI bridges are not supported".
> 
> Thoughts ?
> 
> >  Thomas
> > 
> 



      reply	other threads:[~2020-07-09  9:28 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-08  8:03 qemu-system-ppc64 abort()s with pcie bridges Thomas Huth
2020-07-08  9:57 ` Greg Kurz
2020-07-09  9:28   ` Greg Kurz [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200709112801.0f4cae40@bahia.lan \
    --to=groug@kaod.org \
    --cc=armbru@redhat.com \
    --cc=clg@kaod.org \
    --cc=dgibson@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.