* [Qemu-devel] Reproducible crash on PCIe hotplug
@ 2016-12-09 20:39 Eduardo Habkost
  2016-12-12  5:34 ` Cao jin
  2016-12-12 16:48 ` Markus Armbruster
  0 siblings, 2 replies; 11+ messages in thread
From: Eduardo Habkost @ 2016-12-09 20:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Marcel Apfelbaum, Cao jin, Michael S. Tsirkin

Using latest qemu.git master:

  $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
  QEMU 2.7.93 monitor - type 'help' for more information
  (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
  (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
  Segmentation fault (core dumped)

It crashes at:

  #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
      at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
  983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
  (gdb) l
  978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
  979                        bus->devices[devfn]->name);
  980             return NULL;
  981         } else if (dev->hotplugged &&
  982                    pci_get_function_0(pci_dev)) {
  983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
  984                        " new func %s cannot be exposed to guest.",
  985                        PCI_SLOT(devfn),
  986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
  987                        name);

-- 
Eduardo


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-09 20:39 [Qemu-devel] Reproducible crash on PCIe hotplug Eduardo Habkost
@ 2016-12-12  5:34 ` Cao jin
  2016-12-12 17:29   ` Stefan Hajnoczi
  2016-12-12 16:48 ` Markus Armbruster
  1 sibling, 1 reply; 11+ messages in thread
From: Cao jin @ 2016-12-12  5:34 UTC (permalink / raw)
  To: Eduardo Habkost, qemu-devel; +Cc: Marcel Apfelbaum, Michael S. Tsirkin



On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
> Using latest qemu.git master:
> 
>   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>   QEMU 2.7.93 monitor - type 'help' for more information
>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>   Segmentation fault (core dumped)
> 
> It crashes at:
> 
>   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
>       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
>   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>   (gdb) l
>   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
>   979                        bus->devices[devfn]->name);
>   980             return NULL;
>   981         } else if (dev->hotplugged &&
>   982                    pci_get_function_0(pci_dev)) {
>   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>   984                        " new func %s cannot be exposed to guest.",
>   985                        PCI_SLOT(devfn),
>   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
>   987                        name);
> 

Thanks for informing me. I am kind of busy right now, so I expect to
investigate it after the 2.8 release.
-- 
Sincerely,
Cao jin


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-09 20:39 [Qemu-devel] Reproducible crash on PCIe hotplug Eduardo Habkost
  2016-12-12  5:34 ` Cao jin
@ 2016-12-12 16:48 ` Markus Armbruster
  1 sibling, 0 replies; 11+ messages in thread
From: Markus Armbruster @ 2016-12-12 16:48 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: qemu-devel, Marcel Apfelbaum, Cao jin, Michael S. Tsirkin

Eduardo Habkost <ehabkost@redhat.com> writes:

> Using latest qemu.git master:
>
>   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>   QEMU 2.7.93 monitor - type 'help' for more information
>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>   Segmentation fault (core dumped)

Bisected to

commit 3f1e1478db2d67098d98f2c3acf5a4946b7fb643
Author: Cao jin <caoj.fnst@cn.fujitsu.com>
Date:   Wed Oct 28 14:20:31 2015 +0800

    enable multi-function hot-add
    
    Enable PCIe device multi-function hot-add, just ensure function 0 is added
    last, then driver will get the notification to scan the slot.
    
    Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

It's in v2.5.0, probably no need to hold the release for a fix.


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12  5:34 ` Cao jin
@ 2016-12-12 17:29   ` Stefan Hajnoczi
  2016-12-12 17:32     ` Eduardo Habkost
  2016-12-12 18:41     ` Michael S. Tsirkin
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2016-12-12 17:29 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: qemu-devel, Marcel Apfelbaum, Michael S. Tsirkin, Cao jin


On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
> 
> 
> On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
> > Using latest qemu.git master:
> > 
> >   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> >   QEMU 2.7.93 monitor - type 'help' for more information
> >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> >   Segmentation fault (core dumped)
> > 
> > It crashes at:
> > 
> >   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
> >       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
> >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> >   (gdb) l
> >   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
> >   979                        bus->devices[devfn]->name);
> >   980             return NULL;
> >   981         } else if (dev->hotplugged &&
> >   982                    pci_get_function_0(pci_dev)) {
> >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> >   984                        " new func %s cannot be exposed to guest.",
> >   985                        PCI_SLOT(devfn),
> >   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> >   987                        name);
> > 
> 
> Thanks for informing me. I am kind of busy for now, so I suppose I will
> investigate it after 2.8 release.

Please let me know if this should be considered a release blocker.

The proposed QEMU 2.8 release date is tomorrow (December 13th)!

Stefan



* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12 17:29   ` Stefan Hajnoczi
@ 2016-12-12 17:32     ` Eduardo Habkost
  2016-12-12 18:27       ` Stefan Hajnoczi
  2016-12-12 18:41     ` Michael S. Tsirkin
  1 sibling, 1 reply; 11+ messages in thread
From: Eduardo Habkost @ 2016-12-12 17:32 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Marcel Apfelbaum, Michael S. Tsirkin, Cao jin

On Mon, Dec 12, 2016 at 05:29:15PM +0000, Stefan Hajnoczi wrote:
> On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
> > 
> > 
> > On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
> > > Using latest qemu.git master:
> > > 
> > >   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> > >   QEMU 2.7.93 monitor - type 'help' for more information
> > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> > >   Segmentation fault (core dumped)
> > > 
> > > It crashes at:
> > > 
> > >   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
> > >       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
> > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > >   (gdb) l
> > >   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
> > >   979                        bus->devices[devfn]->name);
> > >   980             return NULL;
> > >   981         } else if (dev->hotplugged &&
> > >   982                    pci_get_function_0(pci_dev)) {
> > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > >   984                        " new func %s cannot be exposed to guest.",
> > >   985                        PCI_SLOT(devfn),
> > >   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> > >   987                        name);
> > > 
> > 
> > Thanks for informing me. I am kind of busy for now, so I suppose I will
> > investigate it after 2.8 release.
> 
> Please let me know if this should be considered a release blocker.
> 
> The proposed QEMU 2.8 release date is tomorrow (December 13th)!

The bug has gone undetected since QEMU 2.5, and the crash happens
only in cases where hotplug was already going to return an error.
I don't think it should be a release blocker.

-- 
Eduardo


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12 17:32     ` Eduardo Habkost
@ 2016-12-12 18:27       ` Stefan Hajnoczi
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2016-12-12 18:27 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: qemu-devel, Marcel Apfelbaum, Michael S. Tsirkin, Cao jin

On Mon, Dec 12, 2016 at 5:32 PM, Eduardo Habkost <ehabkost@redhat.com> wrote:
> On Mon, Dec 12, 2016 at 05:29:15PM +0000, Stefan Hajnoczi wrote:
>> On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
>> >
>> >
>> > On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
>> > > Using latest qemu.git master:
>> > >
>> > >   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>> > >   QEMU 2.7.93 monitor - type 'help' for more information
>> > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>> > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>> > >   Segmentation fault (core dumped)
>> > >
>> > > It crashes at:
>> > >
>> > >   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
>> > >       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
>> > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>> > >   (gdb) l
>> > >   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
>> > >   979                        bus->devices[devfn]->name);
>> > >   980             return NULL;
>> > >   981         } else if (dev->hotplugged &&
>> > >   982                    pci_get_function_0(pci_dev)) {
>> > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>> > >   984                        " new func %s cannot be exposed to guest.",
>> > >   985                        PCI_SLOT(devfn),
>> > >   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
>> > >   987                        name);
>> > >
>> >
>> > Thanks for informing me. I am kind of busy for now, so I suppose I will
>> > investigate it after 2.8 release.
>>
>> Please let me know if this should be considered a release blocker.
>>
>> The proposed QEMU 2.8 release date is tomorrow (December 13th)!
>
> The bug went undetected since QEMU 2.5, and the crash happens
> only on cases where hotplug was already going to return an error.
> I don't think it should be a release blocker.

Excellent, thanks for clarifying.

Stefan


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12 17:29   ` Stefan Hajnoczi
  2016-12-12 17:32     ` Eduardo Habkost
@ 2016-12-12 18:41     ` Michael S. Tsirkin
  2016-12-12 18:57       ` Eduardo Habkost
  1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2016-12-12 18:41 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Eduardo Habkost, qemu-devel, Marcel Apfelbaum, Cao jin

On Mon, Dec 12, 2016 at 05:29:15PM +0000, Stefan Hajnoczi wrote:
> On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
> > 
> > 
> > On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
> > > Using latest qemu.git master:
> > > 
> > >   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> > >   QEMU 2.7.93 monitor - type 'help' for more information
> > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> > >   Segmentation fault (core dumped)
> > > 
> > > It crashes at:
> > > 
> > >   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
> > >       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
> > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > >   (gdb) l
> > >   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
> > >   979                        bus->devices[devfn]->name);
> > >   980             return NULL;
> > >   981         } else if (dev->hotplugged &&
> > >   982                    pci_get_function_0(pci_dev)) {
> > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > >   984                        " new func %s cannot be exposed to guest.",
> > >   985                        PCI_SLOT(devfn),
> > >   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> > >   987                        name);
> > > 
> > 
> > Thanks for informing me. I am kind of busy for now, so I suppose I will
> > investigate it after 2.8 release.
> 
> Please let me know if this should be considered a release blocker.
> 
> The proposed QEMU 2.8 release date is tomorrow (December 13th)!
> 
> Stefan

I don't see how it's a blocker, it's an illegal configuration.
Here's the fix. It's a rather obvious one.
I'll target the fix for 2.9.
Eduardo, I'd appreciate a tested-by tag.

-->

pci: fix error message for express slots

PCI Express downstream slot has a single PCI slot
behind it, using PCI_DEVFN(PCI_SLOT(devfn), 0)
does not give you function 0 in cases such as ARI
as well as some error cases.

This is exactly what we are hitting:
   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
   Segmentation fault (core dumped)

The fix is to use the pci_get_function_0 API.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
---

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 24fae16..339c531 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -983,7 +983,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
         error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
                    " new func %s cannot be exposed to guest.",
                    PCI_SLOT(devfn),
-                   bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
+                   pci_get_function_0(pci_dev)->name,
                    name);
 
        return NULL;

-- 
MST
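
To make the failure concrete, here is a minimal standalone sketch of the
index arithmetic. It is not QEMU code. The three macros follow the usual
devfn layout (slot in bits 7:3, function in bits 2:0); ToyBus,
naive_function0_index, downstream_function0_index and toy_lookup are
hypothetical names invented for illustration, and the assumption that
function 0 behind a PCIe downstream port sits at devfn 0 is taken from
the commit message above.

```c
#include <assert.h>
#include <stddef.h>

/* devfn layout: slot in bits 7:3, function in bits 2:0. */
#define PCI_SLOT(devfn)        (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)        ((devfn) & 0x07)
#define PCI_DEVFN(slot, func)  ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical stand-in for a bus: one device name per devfn. */
typedef struct ToyBus {
    const char *devices[256];
} ToyBus;

/* Index used by the old error path: function 0 of the *requested* slot. */
int naive_function0_index(int devfn)
{
    return PCI_DEVFN(PCI_SLOT(devfn), 0);
}

/*
 * Assumption from the commit message: a PCIe downstream port has a
 * single slot behind it, so function 0 is looked up at devfn 0
 * regardless of the address that was requested.
 */
int downstream_function0_index(int devfn)
{
    (void)devfn;
    return 0;
}

/* Returns the occupant's name, or NULL if nothing is registered there. */
const char *toy_lookup(const ToyBus *bus, int devfn)
{
    assert(devfn >= 0 && devfn < 256);
    return bus->devices[devfn];
}
```

With the first e1000e registered at devfn 0 and the second requested at
addr=08 (devfn 64, as in the backtrace), naive_function0_index(64) is 64,
an empty entry, so reading ->name through it dereferences NULL.
downstream_function0_index(64) is 0, which is where the existing device
sits.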


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12 18:41     ` Michael S. Tsirkin
@ 2016-12-12 18:57       ` Eduardo Habkost
  2016-12-12 22:09         ` Michael S. Tsirkin
  0 siblings, 1 reply; 11+ messages in thread
From: Eduardo Habkost @ 2016-12-12 18:57 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, Marcel Apfelbaum, Cao jin

On Mon, Dec 12, 2016 at 08:41:41PM +0200, Michael S. Tsirkin wrote:
> On Mon, Dec 12, 2016 at 05:29:15PM +0000, Stefan Hajnoczi wrote:
> > On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
> > > 
> > > 
> > > On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
> > > > Using latest qemu.git master:
> > > > 
> > > >   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> > > >   QEMU 2.7.93 monitor - type 'help' for more information
> > > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> > > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> > > >   Segmentation fault (core dumped)
> > > > 
> > > > It crashes at:
> > > > 
> > > >   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
> > > >       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
> > > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > > >   (gdb) l
> > > >   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
> > > >   979                        bus->devices[devfn]->name);
> > > >   980             return NULL;
> > > >   981         } else if (dev->hotplugged &&
> > > >   982                    pci_get_function_0(pci_dev)) {
> > > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > > >   984                        " new func %s cannot be exposed to guest.",
> > > >   985                        PCI_SLOT(devfn),
> > > >   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> > > >   987                        name);
> > > > 
> > > 
> > > Thanks for informing me. I am kind of busy for now, so I suppose I will
> > > investigate it after 2.8 release.
> > 
> > Please let me know if this should be considered a release blocker.
> > 
> > The proposed QEMU 2.8 release date is tomorrow (December 13th)!
> > 
> > Stefan
> 
> I don't see how it's a blocker, it's an illegal configuration.
> Here's the fix. It's a rather obvious one.
> I'll target the fix for 2.9.
> Eduardo, I'd appreciate a tested-by tag.

I confirm the patch fixes the crash, but the error message seems
incorrect: the existing e1000e device is on slot 0 function 0,
not slot 8.

  $ ./x86-kvm-build/x86_64-softmmu/qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
  QEMU 2.7.93 monitor - type 'help' for more information
  (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
  (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
  PCI: slot 8 function 0 already ocuppied by e1000e, new func e1000e cannot be exposed to guest.
           ^^^
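
The slot number in that message can be traced with a small standalone
sketch (not QEMU code). It assumes the usual devfn layout and uses a
hypothetical format_conflict helper that builds the same message from
whichever devfn the caller passes in; the message wording, including its
spelling, is copied from the output above.

```c
#include <stdio.h>

/* devfn layout: slot in bits 7:3, function in bits 2:0. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)

/*
 * Hypothetical helper: format the conflict message, taking the slot
 * number from reported_devfn. Returns snprintf's result.
 */
int format_conflict(char *buf, size_t len, int reported_devfn,
                    const char *existing, const char *incoming)
{
    return snprintf(buf, len,
                    "PCI: slot %d function 0 already ocuppied by %s,"
                    " new func %s cannot be exposed to guest.",
                    PCI_SLOT(reported_devfn), existing, incoming);
}
```

Passing the new device's devfn (64 for addr=08) yields "slot 8", as seen
above. Passing the devfn of the device already in place (0) yields
"slot 0", which matches where the existing e1000e sits.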


> 
> -->
> 
> pci: fix error message for express slots
> 
> PCI Express downstream slot has a single PCI slot
> behind it, using PCI_DEVFN(PCI_SLOT(devfn), 0)
> does not give you function 0 in cases such as ARI
> as well as some error cases.
> 
> This is exactly what we are hitting:
>    $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>    Segmentation fault (core dumped)
> 
> The fix is to use the pci_get_function_0 API.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Reported-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 24fae16..339c531 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -983,7 +983,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>          error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>                     " new func %s cannot be exposed to guest.",
>                     PCI_SLOT(devfn),
> -                   bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> +                   pci_get_function_0(pci_dev)->name,
>                     name);
>  
>         return NULL;
> 
> -- 
> MST

-- 
Eduardo


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12 18:57       ` Eduardo Habkost
@ 2016-12-12 22:09         ` Michael S. Tsirkin
  2016-12-13  2:41           ` Cao jin
  2016-12-13 12:02           ` Eduardo Habkost
  0 siblings, 2 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2016-12-12 22:09 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Stefan Hajnoczi, qemu-devel, Marcel Apfelbaum, Cao jin

On Mon, Dec 12, 2016 at 04:57:30PM -0200, Eduardo Habkost wrote:
> On Mon, Dec 12, 2016 at 08:41:41PM +0200, Michael S. Tsirkin wrote:
> > On Mon, Dec 12, 2016 at 05:29:15PM +0000, Stefan Hajnoczi wrote:
> > > On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
> > > > 
> > > > 
> > > > On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
> > > > > Using latest qemu.git master:
> > > > > 
> > > > >   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> > > > >   QEMU 2.7.93 monitor - type 'help' for more information
> > > > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> > > > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> > > > >   Segmentation fault (core dumped)
> > > > > 
> > > > > It crashes at:
> > > > > 
> > > > >   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
> > > > >       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
> > > > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > > > >   (gdb) l
> > > > >   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
> > > > >   979                        bus->devices[devfn]->name);
> > > > >   980             return NULL;
> > > > >   981         } else if (dev->hotplugged &&
> > > > >   982                    pci_get_function_0(pci_dev)) {
> > > > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > > > >   984                        " new func %s cannot be exposed to guest.",
> > > > >   985                        PCI_SLOT(devfn),
> > > > >   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> > > > >   987                        name);
> > > > > 
> > > > 
> > > > Thanks for informing me. I am kind of busy for now, so I suppose I will
> > > > investigate it after 2.8 release.
> > > 
> > > Please let me know if this should be considered a release blocker.
> > > 
> > > The proposed QEMU 2.8 release date is tomorrow (December 13th)!
> > > 
> > > Stefan
> > 
> > I don't see how it's a blocker, it's an illegal configuration.
> > Here's the fix. It's a rather obvious one.
> > I'll target the fix for 2.9.
> > Eduardo, I'd appreciate a tested-by tag.
> 
> I confirm the patch fixes the crash, but the error message seems
> incorrect: the existing e1000e device is on slot 0 function 0,
> not slot 8.
> 
>   $ ./x86-kvm-build/x86_64-softmmu/qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>   QEMU 2.7.93 monitor - type 'help' for more information
>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>   PCI: slot 8 function 0 already ocuppied by e1000e, new func e1000e cannot be exposed to guest.
>            ^^^
> 
> 
> > 
> > -->
> > 
> > pci: fix error message for express slots
> > 
> > PCI Express downstream slot has a single PCI slot
> > behind it, using PCI_DEVFN(PCI_SLOT(devfn), 0)
> > does not give you function 0 in cases such as ARI
> > as well as some error cases.
> > 
> > This is exactly what we are hitting:
> >    $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> >    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> >    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> >    Segmentation fault (core dumped)
> > 
> > The fix is to use the pci_get_function_0 API.
> > 
> > Cc: qemu-stable@nongnu.org
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > Reported-by: Eduardo Habkost <ehabkost@redhat.com>
> > ---
> > 
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index 24fae16..339c531 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -983,7 +983,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
> >          error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> >                     " new func %s cannot be exposed to guest.",
> >                     PCI_SLOT(devfn),
> > -                   bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> > +                   pci_get_function_0(pci_dev)->name,
> >                     name);
> >  
> >         return NULL;
> > 
> > -- 
> > MST
> 
> -- 



this then?


diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 339c531..637d545 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -982,7 +982,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
                pci_get_function_0(pci_dev)) {
         error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
                    " new func %s cannot be exposed to guest.",
-                   PCI_SLOT(devfn),
+                   PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
                    pci_get_function_0(pci_dev)->name,
                    name);
 


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12 22:09         ` Michael S. Tsirkin
@ 2016-12-13  2:41           ` Cao jin
  2016-12-13 12:02           ` Eduardo Habkost
  1 sibling, 0 replies; 11+ messages in thread
From: Cao jin @ 2016-12-13  2:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, Eduardo Habkost
  Cc: Stefan Hajnoczi, qemu-devel, Marcel Apfelbaum



On 12/13/2016 06:09 AM, Michael S. Tsirkin wrote:
> On Mon, Dec 12, 2016 at 04:57:30PM -0200, Eduardo Habkost wrote:
>> On Mon, Dec 12, 2016 at 08:41:41PM +0200, Michael S. Tsirkin wrote:
>>> On Mon, Dec 12, 2016 at 05:29:15PM +0000, Stefan Hajnoczi wrote:
>>>> On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
>>>>>
>>>>>
>>>>> On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
>>>>>> Using latest qemu.git master:
>>>>>>
>>>>>>   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>>>>>>   QEMU 2.7.93 monitor - type 'help' for more information
>>>>>>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>>>>>>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>>>>>>   Segmentation fault (core dumped)
>>>>>>
>>>>>> It crashes at:
>>>>>>
>>>>>>   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
>>>>>>       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
>>>>>>   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>>>>>>   (gdb) l
>>>>>>   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
>>>>>>   979                        bus->devices[devfn]->name);
>>>>>>   980             return NULL;
>>>>>>   981         } else if (dev->hotplugged &&
>>>>>>   982                    pci_get_function_0(pci_dev)) {
>>>>>>   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>>>>>>   984                        " new func %s cannot be exposed to guest.",
>>>>>>   985                        PCI_SLOT(devfn),
>>>>>>   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
>>>>>>   987                        name);
>>>>>>
>>>>>
>>>>> Thanks for informing me. I am kind of busy for now, so I suppose I will
>>>>> investigate it after 2.8 release.
>>>>
>>>> Please let me know if this should be considered a release blocker.
>>>>
>>>> The proposed QEMU 2.8 release date is tomorrow (December 13th)!
>>>>
>>>> Stefan
>>>
>>> I don't see how it's a blocker, it's an illegal configuration.
>>> Here's the fix. It's a rather obvious one.
>>> I'll target the fix for 2.9.
>>> Eduardo, I'd appreciate a tested-by tag.
>>
>> I confirm the patch fixes the crash, but the error message seems
>> incorrect: the existing e1000e device is on slot 0 function 0,
>> not slot 8.
>>
>>   $ ./x86-kvm-build/x86_64-softmmu/qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>>   QEMU 2.7.93 monitor - type 'help' for more information
>>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>>   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>>   PCI: slot 8 function 0 already ocuppied by e1000e, new func e1000e cannot be exposed to guest.
>>            ^^^
>>
>>
>>>
>>> -->
>>>
>>> pci: fix error message for express slots
>>>
>>> PCI Express downstream slot has a single PCI slot
>>> behind it, using PCI_DEVFN(PCI_SLOT(devfn), 0)
>>> does not give you function 0 in cases such as ARI
>>> as well as some error cases.
>>>
>>> This is exactly what we are hitting:
>>>    $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
>>>    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
>>>    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
>>>    Segmentation fault (core dumped)
>>>
>>> The fix is to use the pci_get_function_0 API.
>>>
>>> Cc: qemu-stable@nongnu.org
>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>> Reported-by: Eduardo Habkost <ehabkost@redhat.com>
>>> ---
>>>
>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>> index 24fae16..339c531 100644
>>> --- a/hw/pci/pci.c
>>> +++ b/hw/pci/pci.c
>>> @@ -983,7 +983,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>>>          error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>>>                     " new func %s cannot be exposed to guest.",
>>>                     PCI_SLOT(devfn),
>>> -                   bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
>>> +                   pci_get_function_0(pci_dev)->name,
>>>                     name);
>>>  
>>>         return NULL;
>>>
>>> -- 
>>> MST
>>
>> -- 
> 
> 
> 
> this then?
> 
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 339c531..637d545 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -982,7 +982,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>                 pci_get_function_0(pci_dev)) {
>          error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>                     " new func %s cannot be exposed to guest.",
> -                   PCI_SLOT(devfn),
> +                   PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
>                     pci_get_function_0(pci_dev)->name,
>                     name);
>  

Tested-by: Cao jin <caoj.fnst@cn.fujitsu.com>

./qemu-system-x86_64 -machine q35 -readconfig ../docs/q35-chipset.cfg -monitor stdio
QEMU 2.7.91 monitor - type 'help' for more information
(qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
(qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
PCI: slot 0 function 0 already ocuppied by e1000e, new func e1000e cannot be exposed to guest.

-- 
Sincerely,
Cao jin


* Re: [Qemu-devel] Reproducible crash on PCIe hotplug
  2016-12-12 22:09         ` Michael S. Tsirkin
  2016-12-13  2:41           ` Cao jin
@ 2016-12-13 12:02           ` Eduardo Habkost
  1 sibling, 0 replies; 11+ messages in thread
From: Eduardo Habkost @ 2016-12-13 12:02 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, qemu-devel, Marcel Apfelbaum, Cao jin

On Tue, Dec 13, 2016 at 12:09:33AM +0200, Michael S. Tsirkin wrote:
> On Mon, Dec 12, 2016 at 04:57:30PM -0200, Eduardo Habkost wrote:
> > On Mon, Dec 12, 2016 at 08:41:41PM +0200, Michael S. Tsirkin wrote:
> > > On Mon, Dec 12, 2016 at 05:29:15PM +0000, Stefan Hajnoczi wrote:
> > > > On Mon, Dec 12, 2016 at 01:34:05PM +0800, Cao jin wrote:
> > > > > 
> > > > > 
> > > > > On 12/10/2016 04:39 AM, Eduardo Habkost wrote:
> > > > > > Using latest qemu.git master:
> > > > > > 
> > > > > >   $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> > > > > >   QEMU 2.7.93 monitor - type 'help' for more information
> > > > > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> > > > > >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> > > > > >   Segmentation fault (core dumped)
> > > > > > 
> > > > > > It crashes at:
> > > > > > 
> > > > > >   #7  0x000055555598d7dc in do_pci_register_device (errp=0x7fffffffbfd0, devfn=64, name=0x5555565df340 "e1000e", bus=0x555558487380, pci_dev=0x5555589cd000)
> > > > > >       at /home/ehabkost/rh/proj/virt/qemu/hw/pci/pci.c:983
> > > > > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > > > > >   (gdb) l
> > > > > >   978                        PCI_SLOT(devfn), PCI_FUNC(devfn), name,
> > > > > >   979                        bus->devices[devfn]->name);
> > > > > >   980             return NULL;
> > > > > >   981         } else if (dev->hotplugged &&
> > > > > >   982                    pci_get_function_0(pci_dev)) {
> > > > > >   983             error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > > > > >   984                        " new func %s cannot be exposed to guest.",
> > > > > >   985                        PCI_SLOT(devfn),
> > > > > >   986                        bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> > > > > >   987                        name);
> > > > > > 
> > > > > 
> > > > > Thanks for informing me. I am kind of busy for now, so I suppose I will
> > > > > investigate it after 2.8 release.
> > > > 
> > > > Please let me know if this should be considered a release blocker.
> > > > 
> > > > The proposed QEMU 2.8 release date is tomorrow (December 13th)!
> > > > 
> > > > Stefan
> > > 
> > > I don't see how it's a blocker, it's an illegal configuration.
> > > Here's the fix. It's a rather obvious one.
> > > I'll target the fix for 2.9.
> > > Eduardo, I'd appreciate a tested-by tag.
> > 
> > I confirm the patch fixes the crash, but the error message seems
> > incorrect: the existing e1000e device is on slot 0 function 0,
> > not slot 8.
> > 
> >   $ ./x86-kvm-build/x86_64-softmmu/qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> >   QEMU 2.7.93 monitor - type 'help' for more information
> >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> >   (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> >   PCI: slot 8 function 0 already ocuppied by e1000e, new func e1000e cannot be exposed to guest.
> >            ^^^
> > 
> > 
> > > 
> > > -->
> > > 
> > > pci: fix error message for express slots
> > > 
> > > PCI Express downstream slot has a single PCI slot
> > > behind it, using PCI_DEVFN(PCI_SLOT(devfn), 0)
> > > does not give you function 0 in cases such as ARI
> > > as well as some error cases.
> > > 
> > > This is exactly what we are hitting:
> > >    $ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg -monitor stdio
> > >    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
> > >    (qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
> > >    Segmentation fault (core dumped)
> > > 
> > > The fix is to use the pci_get_function_0 API.
> > > 
> > > Cc: qemu-stable@nongnu.org
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > Reported-by: Eduardo Habkost <ehabkost@redhat.com>
> > > ---
> > > 
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index 24fae16..339c531 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -983,7 +983,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
> > >          error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
> > >                     " new func %s cannot be exposed to guest.",
> > >                     PCI_SLOT(devfn),
> > > -                   bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name,
> > > +                   pci_get_function_0(pci_dev)->name,
> > >                     name);
> > >  
> > >         return NULL;
> > > 
> > > -- 
> > > MST
> > 
> > -- 
> 
> 
> 
> this then?
> 
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 339c531..637d545 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -982,7 +982,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>                 pci_get_function_0(pci_dev)) {
>          error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s,"
>                     " new func %s cannot be exposed to guest.",
> -                   PCI_SLOT(devfn),
> +                   PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
>                     pci_get_function_0(pci_dev)->name,
>                     name);

Works for me. Thanks!

Tested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
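
For reference, a toy model of why the old array index was NULL while the
pci_get_function_0() lookup is not. This is only a sketch under the
assumption stated in the commit message (an Express downstream slot has a
single PCI slot behind it); toy_dev, toy_bus, lookup_by_slot() and
toy_get_function_0() are hypothetical stand-ins, not QEMU's real types or
functions.

```c
#include <stdbool.h>
#include <stddef.h>

#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)

/* Toy stand-ins for illustration only; not QEMU's PCIDevice/PCIBus. */
struct toy_dev {
    const char *name;
    int devfn;
};

struct toy_bus {
    bool express_downstream;       /* single slot behind a PCIe port */
    struct toy_dev *devices[256];  /* indexed by devfn */
};

/* Old lookup: assumes function 0 lives in the same slot as devfn. */
static struct toy_dev *lookup_by_slot(struct toy_bus *bus, int devfn)
{
    return bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)];
}

/* Sketch of the idea behind pci_get_function_0(): behind an Express
 * downstream port the only device sits at devfn 0, whatever devfn the
 * new device asked for. */
static struct toy_dev *toy_get_function_0(struct toy_bus *bus, int devfn)
{
    if (bus->express_downstream) {
        return bus->devices[0];
    }
    return lookup_by_slot(bus, devfn);
}
```

With one e1000e at devfn 0 on an Express downstream bus and a second one
requested at devfn 64, lookup_by_slot() returns NULL (the pointer the old
error path dereferenced), while toy_get_function_0() returns the existing
device, whose devfn gives slot 0.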

-- 
Eduardo


Thread overview: 11+ messages
2016-12-09 20:39 [Qemu-devel] Reproducible crash on PCIe hotplug Eduardo Habkost
2016-12-12  5:34 ` Cao jin
2016-12-12 17:29   ` Stefan Hajnoczi
2016-12-12 17:32     ` Eduardo Habkost
2016-12-12 18:27       ` Stefan Hajnoczi
2016-12-12 18:41     ` Michael S. Tsirkin
2016-12-12 18:57       ` Eduardo Habkost
2016-12-12 22:09         ` Michael S. Tsirkin
2016-12-13  2:41           ` Cao jin
2016-12-13 12:02           ` Eduardo Habkost
2016-12-12 16:48 ` Markus Armbruster
