All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 0/3] pcie: add check for ari capability of pcie devices
@ 2014-09-30 10:11 arei.gonglei
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 1/3] qdev: Introduce a function to get qbus's parent arei.gonglei
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: arei.gonglei @ 2014-09-30 10:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.crosthwaite, weidong.huang, mst, knut.omang, marcel.a,
	luonengjun, peter.huangpeng, armbru, Gonglei, imammedo, pbonzini,
	afaerber

From: Gonglei <arei.gonglei@huawei.com>

Changes since v3:
 - fix some typos in comments and commit message. (Michael)
 - rename pcie_cap_ari_check to pcie_cap_slot_check.

Changes since v2:
 - make patch 1/3 more simpler and safer.(Hu Tao)
 - change check logic from pci.c to pcie.c and change function's name
 - judge devices' ARI capability instead of PCIe ports' ARI Forwarding
   (Michael)
 - add trivial patch 3/3
 - update patch's commit messages and code comments.

Changes since v1:
 - using object_dynamic_cast() instead of simple string comparing (Paolo)
 - add ARI Forwarding enable bit check
 - using pcie_cap_get_type() instead of simple string comparing (Marcel)
 - fix some other comments.

Root ports and downstream ports of switches are the hot
pluggable ports in a PCI Express hierarchy.
PCI Express supports chip-to-chip interconnect, a PCIe link can
only connect one PCI device/Switch/EndPoint or PCI-bridge.

7.3. Configuration Transaction Rules (PCI Express specification 3.0)
7.3.1. Device Number

Downstream Ports that do not have ARI Forwarding enabled must
associate only Device 0 with the device attached to the Logical Bus
representing the Link from the Port.

In QEMU, ARI Forwarding is enabled default at emulation of PCIe
ports. ARI Forwarding enable setting at firmware/OS Control handoff.
If the bit is Set when a non-ARI Device is present, the non-ARI
Device can respond to Configuration Space accesses under what it
interprets as being different Device Numbers, and its Functions can
be aliased under multiple Device Numbers, generally leading to
undesired behavior.

So, for PCI devices attached in PCIe root ports or downstream pots,
we should assure that its slot is not non-zero. For PCIe devices, which
ARP capability is not enabled, we also should assure that its slot
is not non-zero.

Gonglei (3):
  qdev: Introduce a function to get qbus's parent
  pcie: add check for ari capability of pcie devices
  pcie: remove confused comments

 hw/core/qdev.c         |  9 ++++++++
 hw/pci/pci.c           |  4 ++++
 hw/pci/pcie.c          | 59 +++++++++++++++++++++++++++++++++++++++++++-------
 include/hw/pci/pcie.h  |  1 +
 include/hw/qdev-core.h |  1 +
 5 files changed, 66 insertions(+), 8 deletions(-)

-- 
1.7.12.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v4 1/3] qdev: Introduce a function to get qbus's parent
  2014-09-30 10:11 [Qemu-devel] [PATCH v4 0/3] pcie: add check for ari capability of pcie devices arei.gonglei
@ 2014-09-30 10:11 ` arei.gonglei
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices arei.gonglei
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments arei.gonglei
  2 siblings, 0 replies; 16+ messages in thread
From: arei.gonglei @ 2014-09-30 10:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.crosthwaite, weidong.huang, mst, knut.omang, marcel.a,
	luonengjun, peter.huangpeng, armbru, Gonglei, imammedo, pbonzini,
	afaerber

From: Gonglei <arei.gonglei@huawei.com>

We may use it check type of qbus's parent.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 hw/core/qdev.c         | 9 +++++++++
 include/hw/qdev-core.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index fcb1638..c85bf99 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -616,6 +616,15 @@ BusState *qbus_create(const char *typename, DeviceState *parent, const char *nam
     return bus;
 }
 
+DeviceState *qbus_get_parent(BusState *bus)
+{
+    if (bus) {
+        return bus->parent;
+    }
+
+    return NULL;
+}
+
 static char *bus_get_fw_dev_path(BusState *bus, DeviceState *dev)
 {
     BusClass *bc = BUS_GET_CLASS(bus);
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 178fee2..4bd561d 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -301,6 +301,7 @@ typedef int (qdev_walkerfn)(DeviceState *dev, void *opaque);
 void qbus_create_inplace(void *bus, size_t size, const char *typename,
                          DeviceState *parent, const char *name);
 BusState *qbus_create(const char *typename, DeviceState *parent, const char *name);
+DeviceState *qbus_get_parent(BusState *bus);
 /* Returns > 0 if either devfn or busfn skip walk somewhere in cursion,
  *         < 0 if either devfn or busfn terminate walk somewhere in cursion,
  *           0 otherwise. */
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-09-30 10:11 [Qemu-devel] [PATCH v4 0/3] pcie: add check for ari capability of pcie devices arei.gonglei
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 1/3] qdev: Introduce a function to get qbus's parent arei.gonglei
@ 2014-09-30 10:11 ` arei.gonglei
  2014-09-30 11:59   ` Michael S. Tsirkin
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments arei.gonglei
  2 siblings, 1 reply; 16+ messages in thread
From: arei.gonglei @ 2014-09-30 10:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.crosthwaite, weidong.huang, mst, knut.omang, marcel.a,
	luonengjun, peter.huangpeng, armbru, Gonglei, imammedo, pbonzini,
	afaerber

From: Gonglei <arei.gonglei@huawei.com>

In QEMU, ARI Forwarding is enabled default at emulation of PCIe
ports. ARI Forwarding enable setting at firmware/OS Control handoff.
If the bit is Set when a non-ARI Device is present, the non-ARI
Device can respond to Configuration Space accesses under what it
interprets as being different Device Numbers, and its Functions can
be aliased under multiple Device Numbers, generally leading to
undesired behavior.

So, for PCI devices attached in PCIe root ports or downstream pots,
we should assure that its slot is not non-zero. For PCIe devices, which
ARP capability is not enabled, we also should assure that its slot
is not non-zero.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 hw/pci/pci.c          |  4 ++++
 hw/pci/pcie.c         | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/pci/pcie.h |  1 +
 3 files changed, 56 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 6ce75aa..22b7ca0 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
         }
     }
 
+    if (pcie_cap_slot_check(bus, pci_dev)) {
+        return -1;
+    }
+
     /* rom loading */
     is_default_rom = false;
     if (pci_dev->romfile == NULL && pc->romfile != NULL) {
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 1babddf..b82211a 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn)
                         offset, PCI_ARI_SIZEOF);
     pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
 }
+
+int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
+{
+    Object *obj = OBJECT(bus);
+
+    if (pci_bus_is_root(bus)) {
+        return 0;
+    }
+
+    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
+        DeviceState *parent = qbus_get_parent(BUS(obj));
+        PCIDevice *pci_dev = PCI_DEVICE(parent);
+        uint8_t port_type;
+        /*
+         * Root ports and downstream ports of switches are the hot
+         * pluggable ports in a PCI Express hierarchy.
+         * PCI Express supports chip-to-chip interconnect, a PCIe link can
+         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
+         *
+         * 7.3. Configuration Transaction Rules (PCI Express specification 3.0)
+         * 7.3.1. Device Number
+         *
+         * Downstream Ports that do not have ARI Forwarding enabled must
+         * associate only Device 0 with the device attached to the Logical Bus
+         * representing the Link from the Port.
+         *
+         * In QEMU, ARI Forwarding is enabled default at emulation of PCIe
+         * ports. ARI Forwarding enable setting at firmware/OS Control handoff.
+         * If the bit is Set when a non-ARI Device is present, the non-ARI
+           Device can respond to Configuration Space accesses under what it
+         * interprets as being different Device Numbers, and its Functions can
+         * be aliased under multiple Device Numbers, generally leading to
+         * undesired behavior.
+         */
+        port_type = pcie_cap_get_type(pci_dev);
+        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
+            port_type == PCI_EXP_TYPE_ROOT_PORT) {
+            if (!pci_is_express(dev) ||
+                (pci_is_express(dev) &&
+                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
+                if (PCI_SLOT(dev->devfn) != 0) {
+                    error_report("PCIe: non-ARI device can't be populated"
+                                 " in slot %d", PCI_SLOT(dev->devfn));
+                    return -1;
+                }
+            }
+        }
+    }
+
+    return 0;
+}
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index d139d58..1d8f3f4 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
                          uint16_t offset, uint16_t size);
 
 void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
+int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
 
 extern const VMStateDescription vmstate_pcie_device;
 
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments
  2014-09-30 10:11 [Qemu-devel] [PATCH v4 0/3] pcie: add check for ari capability of pcie devices arei.gonglei
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 1/3] qdev: Introduce a function to get qbus's parent arei.gonglei
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices arei.gonglei
@ 2014-09-30 10:11 ` arei.gonglei
  2014-09-30 12:46   ` Michael S. Tsirkin
  2 siblings, 1 reply; 16+ messages in thread
From: arei.gonglei @ 2014-09-30 10:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.crosthwaite, weidong.huang, mst, knut.omang, marcel.a,
	luonengjun, peter.huangpeng, armbru, Gonglei, imammedo, pbonzini,
	afaerber

From: Gonglei <arei.gonglei@huawei.com>

The below functions is not allocation functions, but
find helper function. The only allocation function is
pcie_add_capability(), but its argument names is not the
same with the comments. So remove this comments which
make people confused.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 hw/pci/pcie.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index b82211a..f2b9d5d 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -528,14 +528,6 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
         PCI_EXP_DEVCTL2_ARI;
 }
 
-/**************************************************************************
- * pci express extended capability allocation functions
- * uint16_t ext_cap_id (16 bit)
- * uint8_t cap_ver (4 bit)
- * uint16_t cap_offset (12 bit)
- * uint16_t ext_cap_size
- */
-
 static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
                                           uint16_t *prev_p)
 {
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices arei.gonglei
@ 2014-09-30 11:59   ` Michael S. Tsirkin
  2014-09-30 13:38     ` Gonglei
  0 siblings, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2014-09-30 11:59 UTC (permalink / raw)
  To: arei.gonglei
  Cc: peter.crosthwaite, knut.omang, weidong.huang, marcel.a, armbru,
	luonengjun, qemu-devel, peter.huangpeng, imammedo, pbonzini,
	afaerber

On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com wrote:
> From: Gonglei <arei.gonglei@huawei.com>
> 
> In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> If the bit is Set when a non-ARI Device is present, the non-ARI
> Device can respond to Configuration Space accesses under what it
> interprets as being different Device Numbers, and its Functions can
> be aliased under multiple Device Numbers, generally leading to
> undesired behavior.

So what is the undesired behaviour?
Did you observe such?
Please make this explicit in the commit log.




> 
> So, for PCI devices attached in PCIe root ports or downstream pots,
> we should assure that its slot is not non-zero. For PCIe devices, which
> ARP capability is not enabled, we also should assure that its slot
> is not non-zero.

not non zero => zero

> 
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  hw/pci/pci.c          |  4 ++++
>  hw/pci/pcie.c         | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/pci/pcie.h |  1 +
>  3 files changed, 56 insertions(+)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 6ce75aa..22b7ca0 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
>          }
>      }
>  
> +    if (pcie_cap_slot_check(bus, pci_dev)) {
> +        return -1;
> +    }
> +
>      /* rom loading */
>      is_default_rom = false;
>      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index 1babddf..b82211a 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn)
>                          offset, PCI_ARI_SIZEOF);
>      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
>  }
> +
> +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> +{
> +    Object *obj = OBJECT(bus);
> +
> +    if (pci_bus_is_root(bus)) {
> +        return 0;
> +    }
> +
> +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> +        DeviceState *parent = qbus_get_parent(BUS(obj));
> +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> +        uint8_t port_type;
> +        /*
> +         * Root ports and downstream ports of switches are the hot
> +         * pluggable ports in a PCI Express hierarchy.
> +         * PCI Express supports chip-to-chip interconnect, a PCIe link can
> +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> +         *
> +         * 7.3. Configuration Transaction Rules (PCI Express specification 3.0)
> +         * 7.3.1. Device Number
> +         *
> +         * Downstream Ports that do not have ARI Forwarding enabled must
> +         * associate only Device 0 with the device attached to the Logical Bus
> +         * representing the Link from the Port.
> +         *
> +         * In QEMU, ARI Forwarding is enabled default at emulation of PCIe

s/enabled default/enabled by default/

> +         * ports. ARI Forwarding enable setting at firmware/OS Control handoff.


can parse last sentence. drop it?

> +         * If the bit is Set when a non-ARI Device is present, the non-ARI
> +           Device can respond to Configuration Space accesses under what it
> +         * interprets as being different Device Numbers, and its Functions can
> +         * be aliased under multiple Device Numbers, generally leading to
> +         * undesired behavior.

I don't think any badness really happens.
Did you observe any?



> +         */
> +        port_type = pcie_cap_get_type(pci_dev);
> +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> +            if (!pci_is_express(dev) ||
> +                (pci_is_express(dev) &&
> +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {

I would just skip the test for a non express function.

> +                if (PCI_SLOT(dev->devfn) != 0) {
> +                    error_report("PCIe: non-ARI device can't be populated"
> +                                 " in slot %d", PCI_SLOT(dev->devfn));
> +                    return -1;
> +                }
> +            }
> +        }
> +    }
> +
> +    return 0;
> +}
> diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> index d139d58..1d8f3f4 100644
> --- a/include/hw/pci/pcie.h
> +++ b/include/hw/pci/pcie.h
> @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
>                           uint16_t offset, uint16_t size);
>  
>  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
>  
>  extern const VMStateDescription vmstate_pcie_device;
>  
> -- 
> 1.7.12.4
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments
  2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments arei.gonglei
@ 2014-09-30 12:46   ` Michael S. Tsirkin
  2014-09-30 12:58     ` Gonglei
  0 siblings, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2014-09-30 12:46 UTC (permalink / raw)
  To: arei.gonglei
  Cc: peter.crosthwaite, knut.omang, weidong.huang, marcel.a, armbru,
	luonengjun, qemu-devel, peter.huangpeng, imammedo, pbonzini,
	afaerber

On Tue, Sep 30, 2014 at 06:11:26PM +0800, arei.gonglei@huawei.com wrote:
> From: Gonglei <arei.gonglei@huawei.com>
> 
> The below functions is not allocation functions, but
> find helper function. The only allocation function is
> pcie_add_capability(), but its argument names is not the
> same with the comments. So remove this comments which
> make people confused.
> 
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>

The comment applies to all functions below it.
Maybe simply replace
 s/capability allocation/capability list management/?

Also pls make this a separate patch, does not belong in
this series.

> ---
>  hw/pci/pcie.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index b82211a..f2b9d5d 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -528,14 +528,6 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
>          PCI_EXP_DEVCTL2_ARI;
>  }
>  
> -/**************************************************************************
> - * pci express extended capability allocation functions
> - * uint16_t ext_cap_id (16 bit)
> - * uint8_t cap_ver (4 bit)
> - * uint16_t cap_offset (12 bit)
> - * uint16_t ext_cap_size
> - */
> -
>  static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
>                                            uint16_t *prev_p)
>  {
> -- 
> 1.7.12.4
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments
  2014-09-30 12:46   ` Michael S. Tsirkin
@ 2014-09-30 12:58     ` Gonglei
  0 siblings, 0 replies; 16+ messages in thread
From: Gonglei @ 2014-09-30 12:58 UTC (permalink / raw)
  To: 'Michael S. Tsirkin', arei.gonglei
  Cc: peter.crosthwaite, weidong.huang, marcel.a, knut.omang,
	luonengjun, armbru, qemu-devel, pbonzini, imammedo,
	peter.huangpeng, afaerber

> Subject: Re: [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments
> 
> On Tue, Sep 30, 2014 at 06:11:26PM +0800, arei.gonglei@huawei.com wrote:
> > From: Gonglei <arei.gonglei@huawei.com>
> >
> > The below functions is not allocation functions, but
> > find helper function. The only allocation function is
> > pcie_add_capability(), but its argument names is not the
> > same with the comments. So remove this comments which
> > make people confused.
> >
> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> 
> The comment applies to all functions below it.

This explanation is fine.

> Maybe simply replace
>  s/capability allocation/capability list management/?
> 

Agree.

> Also pls make this a separate patch, does not belong in
> this series.
> 
OK. Will do. Thanks!

Best regards,
-Gonglei

> > ---
> >  hw/pci/pcie.c | 8 --------
> >  1 file changed, 8 deletions(-)
> >
> > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > index b82211a..f2b9d5d 100644
> > --- a/hw/pci/pcie.c
> > +++ b/hw/pci/pcie.c
> > @@ -528,14 +528,6 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice
> *dev)
> >          PCI_EXP_DEVCTL2_ARI;
> >  }
> >
> >
> -/**************************************************************
> ************
> > - * pci express extended capability allocation functions
> > - * uint16_t ext_cap_id (16 bit)
> > - * uint8_t cap_ver (4 bit)
> > - * uint16_t cap_offset (12 bit)
> > - * uint16_t ext_cap_size
> > - */
> > -
> >  static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
> >                                            uint16_t *prev_p)
> >  {
> > --
> > 1.7.12.4
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-09-30 11:59   ` Michael S. Tsirkin
@ 2014-09-30 13:38     ` Gonglei
  2014-09-30 13:58       ` Michael S. Tsirkin
  2014-10-01  5:26       ` Knut Omang
  0 siblings, 2 replies; 16+ messages in thread
From: Gonglei @ 2014-09-30 13:38 UTC (permalink / raw)
  To: 'Michael S. Tsirkin', arei.gonglei
  Cc: peter.crosthwaite, weidong.huang, marcel.a, knut.omang,
	luonengjun, armbru, qemu-devel, pbonzini, imammedo,
	peter.huangpeng, afaerber

> Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> pcie devices
> 
> On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com wrote:
> > From: Gonglei <arei.gonglei@huawei.com>
> >
> > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > If the bit is Set when a non-ARI Device is present, the non-ARI
> > Device can respond to Configuration Space accesses under what it
> > interprets as being different Device Numbers, and its Functions can
> > be aliased under multiple Device Numbers, generally leading to
> > undesired behavior.
> 
> So what is the undesired behaviour?
> Did you observe such?

I just observe the PCI device don't work, and the dmesg show me:

[ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
[ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
[ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due to button press.
[ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
[ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
[ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)

> Please make this explicit in the commit log.
> 
Sorry, I copied the description from PCIe spec. :(

IMPLEMENTATION NOTE at Page 19:

https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-interpretation-070604.pdf


> 
> >
> > So, for PCI devices attached in PCIe root ports or downstream pots,
> > we should assure that its slot is not non-zero. For PCIe devices, which
> > ARP capability is not enabled, we also should assure that its slot
> > is not non-zero.
> 
> not non zero => zero
> 
OK.

> >
> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > ---
> >  hw/pci/pci.c          |  4 ++++
> >  hw/pci/pcie.c         | 51
> +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/pci/pcie.h |  1 +
> >  3 files changed, 56 insertions(+)
> >
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index 6ce75aa..22b7ca0 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> >          }
> >      }
> >
> > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > +        return -1;
> > +    }
> > +
> >      /* rom loading */
> >      is_default_rom = false;
> >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > index 1babddf..b82211a 100644
> > --- a/hw/pci/pcie.c
> > +++ b/hw/pci/pcie.c
> > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset,
> uint16_t nextfn)
> >                          offset, PCI_ARI_SIZEOF);
> >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
> >  }
> > +
> > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > +{
> > +    Object *obj = OBJECT(bus);
> > +
> > +    if (pci_bus_is_root(bus)) {
> > +        return 0;
> > +    }
> > +
> > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > +        uint8_t port_type;
> > +        /*
> > +         * Root ports and downstream ports of switches are the hot
> > +         * pluggable ports in a PCI Express hierarchy.
> > +         * PCI Express supports chip-to-chip interconnect, a PCIe link can
> > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > +         *
> > +         * 7.3. Configuration Transaction Rules (PCI Express specification
> 3.0)
> > +         * 7.3.1. Device Number
> > +         *
> > +         * Downstream Ports that do not have ARI Forwarding enabled
> must
> > +         * associate only Device 0 with the device attached to the Logical
> Bus
> > +         * representing the Link from the Port.
> > +         *
> > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> PCIe
> 
> s/enabled default/enabled by default/
> 
OK.

> > +         * ports. ARI Forwarding enable setting at firmware/OS Control
> handoff.
> 
> 
> can parse last sentence. drop it?
> 
OK.

> > +         * If the bit is Set when a non-ARI Device is present, the non-ARI
> > +           Device can respond to Configuration Space accesses under
> what it
> > +         * interprets as being different Device Numbers, and its Functions
> can
> > +         * be aliased under multiple Device Numbers, generally leading to
> > +         * undesired behavior.
> 
> I don't think any badness really happens.
> Did you observe any?
> 
Same with above explanation.

> 
> 
> > +         */
> > +        port_type = pcie_cap_get_type(pci_dev);
> > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > +            if (!pci_is_express(dev) ||
> > +                (pci_is_express(dev) &&
> > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> 
> I would just skip the test for a non express function.
> 
Sorry?

Best regards,
-Gonglei

> > +                if (PCI_SLOT(dev->devfn) != 0) {
> > +                    error_report("PCIe: non-ARI device can't be
> populated"
> > +                                 " in slot %d",
> PCI_SLOT(dev->devfn));
> > +                    return -1;
> > +                }
> > +            }
> > +        }
> > +    }
> > +
> > +    return 0;
> > +}
> > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > index d139d58..1d8f3f4 100644
> > --- a/include/hw/pci/pcie.h
> > +++ b/include/hw/pci/pcie.h
> > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> >                           uint16_t offset, uint16_t size);
> >
> >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> >
> >  extern const VMStateDescription vmstate_pcie_device;
> >
> > --
> > 1.7.12.4
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-09-30 13:38     ` Gonglei
@ 2014-09-30 13:58       ` Michael S. Tsirkin
  2014-10-01  4:04         ` Gonglei
  2014-10-01  5:26       ` Knut Omang
  1 sibling, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2014-09-30 13:58 UTC (permalink / raw)
  To: Gonglei
  Cc: peter.crosthwaite, weidong.huang, marcel.a, luonengjun,
	knut.omang, armbru, qemu-devel, arei.gonglei, pbonzini, imammedo,
	peter.huangpeng, afaerber

On Tue, Sep 30, 2014 at 09:38:51PM +0800, Gonglei wrote:
> > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> > pcie devices
> > 
> > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com wrote:
> > > From: Gonglei <arei.gonglei@huawei.com>
> > >
> > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > Device can respond to Configuration Space accesses under what it
> > > interprets as being different Device Numbers, and its Functions can
> > > be aliased under multiple Device Numbers, generally leading to
> > > undesired behavior.
> > 
> > So what is the undesired behaviour?
> > Did you observe such?
> 
> I just observe the PCI device don't work,

Sigh.
What did you do to make it not work?
Which device did you try?
By comparison, what happens after your patch is applied?



> and the dmesg show me:
> 
> [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due to button press.
> [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
> [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)
> 
> > Please make this explicit in the commit log.
> > 
> Sorry, I copied the description from PCIe spec. :(
> 
> IMPLEMENTATION NOTE at Page 19:
> 
> https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-interpretation-070604.pdf
> 
> 
> > 
> > >
> > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > we should assure that its slot is not non-zero. For PCIe devices, which
> > > ARP capability is not enabled, we also should assure that its slot
> > > is not non-zero.
> > 
> > not non zero => zero
> > 
> OK.
> 
> > >
> > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > ---
> > >  hw/pci/pci.c          |  4 ++++
> > >  hw/pci/pcie.c         | 51
> > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  include/hw/pci/pcie.h |  1 +
> > >  3 files changed, 56 insertions(+)
> > >
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index 6ce75aa..22b7ca0 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> > >          }
> > >      }
> > >
> > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > +        return -1;
> > > +    }
> > > +
> > >      /* rom loading */
> > >      is_default_rom = false;
> > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > index 1babddf..b82211a 100644
> > > --- a/hw/pci/pcie.c
> > > +++ b/hw/pci/pcie.c
> > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset,
> > uint16_t nextfn)
> > >                          offset, PCI_ARI_SIZEOF);
> > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
> > >  }
> > > +
> > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > +{
> > > +    Object *obj = OBJECT(bus);
> > > +
> > > +    if (pci_bus_is_root(bus)) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > +        uint8_t port_type;
> > > +        /*
> > > +         * Root ports and downstream ports of switches are the hot
> > > +         * pluggable ports in a PCI Express hierarchy.
> > > +         * PCI Express supports chip-to-chip interconnect, a PCIe link can
> > > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > > +         *
> > > +         * 7.3. Configuration Transaction Rules (PCI Express specification
> > 3.0)
> > > +         * 7.3.1. Device Number
> > > +         *
> > > +         * Downstream Ports that do not have ARI Forwarding enabled
> > must
> > > +         * associate only Device 0 with the device attached to the Logical
> > Bus
> > > +         * representing the Link from the Port.
> > > +         *
> > > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> > PCIe
> > 
> > s/enabled default/enabled by default/
> > 
> OK.
> 
> > > +         * ports. ARI Forwarding enable setting at firmware/OS Control
> > handoff.
> > 
> > 
> > can parse last sentence. drop it?
> > 
> OK.
> 
> > > +         * If the bit is Set when a non-ARI Device is present, the non-ARI
> > > +           Device can respond to Configuration Space accesses under
> > what it
> > > +         * interprets as being different Device Numbers, and its Functions
> > can
> > > +         * be aliased under multiple Device Numbers, generally leading to
> > > +         * undesired behavior.
> > 
> > I don't think any badness really happens.
> > Did you observe any?
> > 
> Same with above explanation.
> 
> > 
> > 
> > > +         */
> > > +        port_type = pcie_cap_get_type(pci_dev);
> > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > +            if (!pci_is_express(dev) ||
> > > +                (pci_is_express(dev) &&
> > > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> > 
> > I would just skip the test for a non express function.
> > 
> Sorry?
> 
> Best regards,
> -Gonglei

First this is equivalent, and clearer:
           if (!pci_is_express(dev) ||
               !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {

Second do we need to do the check for non express devices?



> > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > +                    error_report("PCIe: non-ARI device can't be
> > populated"
> > > +                                 " in slot %d",
> > PCI_SLOT(dev->devfn));
> > > +                    return -1;
> > > +                }
> > > +            }
> > > +        }
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > index d139d58..1d8f3f4 100644
> > > --- a/include/hw/pci/pcie.h
> > > +++ b/include/hw/pci/pcie.h
> > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > >                           uint16_t offset, uint16_t size);
> > >
> > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > >
> > >  extern const VMStateDescription vmstate_pcie_device;
> > >
> > > --
> > > 1.7.12.4
> > >
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-09-30 13:58       ` Michael S. Tsirkin
@ 2014-10-01  4:04         ` Gonglei
  0 siblings, 0 replies; 16+ messages in thread
From: Gonglei @ 2014-10-01  4:04 UTC (permalink / raw)
  To: 'Michael S. Tsirkin'
  Cc: peter.crosthwaite, weidong.huang, marcel.a, knut.omang,
	luonengjun, armbru, qemu-devel, arei.gonglei, imammedo, pbonzini,
	peter.huangpeng, afaerber

> Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> pcie devices
> 
> On Tue, Sep 30, 2014 at 09:38:51PM +0800, Gonglei wrote:
> > > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> > > pcie devices
> > >
> > > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com
> wrote:
> > > > From: Gonglei <arei.gonglei@huawei.com>
> > > >
> > > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > Device can respond to Configuration Space accesses under what it
> > > > interprets as being different Device Numbers, and its Functions can
> > > > be aliased under multiple Device Numbers, generally leading to
> > > > undesired behavior.
> > >
> > > So what is the undesired behaviour?
> > > Did you observe such?
> >
> > I just observe the PCI device don't work,
> 
> Sigh.
> What did you do to make it not work?

I configured a device with 'slot > 0'.

> Which device did you try?

virtio-net-pci device.

> By comparison, what happens after your patch is applied?
> 
Please see my previous email answering you :
http://lists.gnu.org/archive/html/qemu-devel/2014-09/msg05781.html

> 
> 
> > and the dmesg show me:
> >
> > [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> > [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> > [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due
> to button press.
> > [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> > [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
> > [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)
> >
> > > Please make this explicit in the commit log.
> > >
> > Sorry, I copied the description from PCIe spec. :(
> >
> > IMPLEMENTATION NOTE at Page 19:
> >
> >
> https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-in
> terpretation-070604.pdf
> >
> >
> > >
> > > >
> > > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > > we should assure that its slot is not non-zero. For PCIe devices, which
> > > > ARP capability is not enabled, we also should assure that its slot
> > > > is not non-zero.
> > >
> > > not non zero => zero
> > >
> > OK.
> >
> > > >
> > > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > > ---
> > > >  hw/pci/pci.c          |  4 ++++
> > > >  hw/pci/pcie.c         | 51
> > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  include/hw/pci/pcie.h |  1 +
> > > >  3 files changed, 56 insertions(+)
> > > >
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index 6ce75aa..22b7ca0 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> > > >          }
> > > >      }
> > > >
> > > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > > +        return -1;
> > > > +    }
> > > > +
> > > >      /* rom loading */
> > > >      is_default_rom = false;
> > > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > > index 1babddf..b82211a 100644
> > > > --- a/hw/pci/pcie.c
> > > > +++ b/hw/pci/pcie.c
> > > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t
> offset,
> > > uint16_t nextfn)
> > > >                          offset, PCI_ARI_SIZEOF);
> > > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) <<
> 8);
> > > >  }
> > > > +
> > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > > +{
> > > > +    Object *obj = OBJECT(bus);
> > > > +
> > > > +    if (pci_bus_is_root(bus)) {
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > > +        uint8_t port_type;
> > > > +        /*
> > > > +         * Root ports and downstream ports of switches are the hot
> > > > +         * pluggable ports in a PCI Express hierarchy.
> > > > +         * PCI Express supports chip-to-chip interconnect, a PCIe link
> can
> > > > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > > > +         *
> > > > +         * 7.3. Configuration Transaction Rules (PCI Express
> specification
> > > 3.0)
> > > > +         * 7.3.1. Device Number
> > > > +         *
> > > > +         * Downstream Ports that do not have ARI Forwarding
> enabled
> > > must
> > > > +         * associate only Device 0 with the device attached to the
> Logical
> > > Bus
> > > > +         * representing the Link from the Port.
> > > > +         *
> > > > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> > > PCIe
> > >
> > > s/enabled default/enabled by default/
> > >
> > OK.
> >
> > > > +         * ports. ARI Forwarding enable setting at firmware/OS
> Control
> > > handoff.
> > >
> > >
> > > can parse last sentence. drop it?
> > >
> > OK.
> >
> > > > +         * If the bit is Set when a non-ARI Device is present, the
> non-ARI
> > > > +           Device can respond to Configuration Space accesses under
> > > what it
> > > > +         * interprets as being different Device Numbers, and its
> Functions
> > > can
> > > > +         * be aliased under multiple Device Numbers, generally
> leading to
> > > > +         * undesired behavior.
> > >
> > > I don't think any badness really happens.
> > > Did you observe any?
> > >
> > Same with above explanation.
> >
> > >
> > >
> > > > +         */
> > > > +        port_type = pcie_cap_get_type(pci_dev);
> > > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > > +            if (!pci_is_express(dev) ||
> > > > +                (pci_is_express(dev) &&
> > > > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> > >
> > > I would just skip the test for a non express function.
> > >
> > Sorry?
> >
> > Best regards,
> > -Gonglei
> 
> First this is equivalent, and clearer:
>            if (!pci_is_express(dev) ||
>                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> 
> Second do we need to do the check for non express devices?
> 
Good catch. Yes! Thanks.

Best regards,
-Gonglei

> 
> 
> > > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > > +                    error_report("PCIe: non-ARI device can't be
> > > populated"
> > > > +                                 " in slot %d",
> > > PCI_SLOT(dev->devfn));
> > > > +                    return -1;
> > > > +                }
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return 0;
> > > > +}
> > > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > > index d139d58..1d8f3f4 100644
> > > > --- a/include/hw/pci/pcie.h
> > > > +++ b/include/hw/pci/pcie.h
> > > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > > >                           uint16_t offset, uint16_t size);
> > > >
> > > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > > >
> > > >  extern const VMStateDescription vmstate_pcie_device;
> > > >
> > > > --
> > > > 1.7.12.4
> > > >
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-09-30 13:38     ` Gonglei
  2014-09-30 13:58       ` Michael S. Tsirkin
@ 2014-10-01  5:26       ` Knut Omang
  2014-10-01  7:15         ` Gonglei
  2014-10-01 14:08         ` Marcel Apfelbaum
  1 sibling, 2 replies; 16+ messages in thread
From: Knut Omang @ 2014-10-01  5:26 UTC (permalink / raw)
  To: Gonglei
  Cc: peter.crosthwaite, weidong.huang, marcel.a,
	'Michael S. Tsirkin',
	luonengjun, armbru, qemu-devel, arei.gonglei, pbonzini, imammedo,
	peter.huangpeng, afaerber

On Tue, 2014-09-30 at 21:38 +0800, Gonglei wrote:
> > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> > pcie devices
> > 
> > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com wrote:
> > > From: Gonglei <arei.gonglei@huawei.com>
> > >
> > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > Device can respond to Configuration Space accesses under what it
> > > interprets as being different Device Numbers, and its Functions can
> > > be aliased under multiple Device Numbers, generally leading to
> > > undesired behavior.
> > 
> > So what is the undesired behaviour?
> > Did you observe such?
> 
> I just observe the PCI device don't work, and the dmesg show me:
> 
> [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due to button press.
> [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
> [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)

This seems very much like the symptoms I see when I use hotplug after
the latest changes to the hotplug code - do you have something that
confirms this has something to do with wrong interpretation of device
IDs? My suspicion is that something has gone wrong or is missing in the
hotplug logic but I havent had time to dig deeper into it yet.

I am just concerned that this might alleviate the symptoms you see but
not fix the problem itself,

Knut

> > Please make this explicit in the commit log.
> > 
> Sorry, I copied the description from PCIe spec. :(
> 
> IMPLEMENTATION NOTE at Page 19:
> 
> https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-interpretation-070604.pdf
> 
> 
> > 
> > >
> > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > we should assure that its slot is not non-zero. For PCIe devices, which
> > > ARP capability is not enabled, we also should assure that its slot
> > > is not non-zero.
> > 
> > not non zero => zero
> > 
> OK.
> 
> > >
> > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > ---
> > >  hw/pci/pci.c          |  4 ++++
> > >  hw/pci/pcie.c         | 51
> > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  include/hw/pci/pcie.h |  1 +
> > >  3 files changed, 56 insertions(+)
> > >
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index 6ce75aa..22b7ca0 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> > >          }
> > >      }
> > >
> > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > +        return -1;
> > > +    }
> > > +
> > >      /* rom loading */
> > >      is_default_rom = false;
> > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > index 1babddf..b82211a 100644
> > > --- a/hw/pci/pcie.c
> > > +++ b/hw/pci/pcie.c
> > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset,
> > uint16_t nextfn)
> > >                          offset, PCI_ARI_SIZEOF);
> > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
> > >  }
> > > +
> > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > +{
> > > +    Object *obj = OBJECT(bus);
> > > +
> > > +    if (pci_bus_is_root(bus)) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > +        uint8_t port_type;
> > > +        /*
> > > +         * Root ports and downstream ports of switches are the hot
> > > +         * pluggable ports in a PCI Express hierarchy.
> > > +         * PCI Express supports chip-to-chip interconnect, a PCIe link can
> > > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > > +         *
> > > +         * 7.3. Configuration Transaction Rules (PCI Express specification
> > 3.0)
> > > +         * 7.3.1. Device Number
> > > +         *
> > > +         * Downstream Ports that do not have ARI Forwarding enabled
> > must
> > > +         * associate only Device 0 with the device attached to the Logical
> > Bus
> > > +         * representing the Link from the Port.
> > > +         *
> > > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> > PCIe
> > 
> > s/enabled default/enabled by default/
> > 
> OK.
> 
> > > +         * ports. ARI Forwarding enable setting at firmware/OS Control
> > handoff.
> > 
> > 
> > can parse last sentence. drop it?
> > 
> OK.
> 
> > > +         * If the bit is Set when a non-ARI Device is present, the non-ARI
> > > +           Device can respond to Configuration Space accesses under
> > what it
> > > +         * interprets as being different Device Numbers, and its Functions
> > can
> > > +         * be aliased under multiple Device Numbers, generally leading to
> > > +         * undesired behavior.
> > 
> > I don't think any badness really happens.
> > Did you observe any?
> > 
> Same with above explanation.
> 
> > 
> > 
> > > +         */
> > > +        port_type = pcie_cap_get_type(pci_dev);
> > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > +            if (!pci_is_express(dev) ||
> > > +                (pci_is_express(dev) &&
> > > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> > 
> > I would just skip the test for a non express function.
> > 
> Sorry?
> 
> Best regards,
> -Gonglei
> 
> > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > +                    error_report("PCIe: non-ARI device can't be
> > populated"
> > > +                                 " in slot %d",
> > PCI_SLOT(dev->devfn));
> > > +                    return -1;
> > > +                }
> > > +            }
> > > +        }
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > index d139d58..1d8f3f4 100644
> > > --- a/include/hw/pci/pcie.h
> > > +++ b/include/hw/pci/pcie.h
> > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > >                           uint16_t offset, uint16_t size);
> > >
> > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > >
> > >  extern const VMStateDescription vmstate_pcie_device;
> > >
> > > --
> > > 1.7.12.4
> > >
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-10-01  5:26       ` Knut Omang
@ 2014-10-01  7:15         ` Gonglei
  2014-10-01 14:08         ` Marcel Apfelbaum
  1 sibling, 0 replies; 16+ messages in thread
From: Gonglei @ 2014-10-01  7:15 UTC (permalink / raw)
  To: 'Knut Omang'
  Cc: peter.crosthwaite, weidong.huang, 'Michael S. Tsirkin',
	marcel.a, luonengjun, armbru, qemu-devel, arei.gonglei, imammedo,
	pbonzini, peter.huangpeng, afaerber

> Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> pcie devices
> 
> On Tue, 2014-09-30 at 21:38 +0800, Gonglei wrote:
> > > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> > > pcie devices
> > >
> > > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com
> wrote:
> > > > From: Gonglei <arei.gonglei@huawei.com>
> > > >
> > > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > Device can respond to Configuration Space accesses under what it
> > > > interprets as being different Device Numbers, and its Functions can
> > > > be aliased under multiple Device Numbers, generally leading to
> > > > undesired behavior.
> > >
> > > So what is the undesired behaviour?
> > > Did you observe such?
> >
> > I just observe the PCI device don't work, and the dmesg show me:
> >
> > [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> > [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> > [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due
> to button press.
> > [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> > [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
> > [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)
> 
> This seems very much like the symptoms I see when I use hotplug after
> the latest changes to the hotplug code - do you have something that
> confirms this has something to do with wrong interpretation of device
> IDs?

Sorry, I don't at present. 
I just made a comparison between 'slot=0' and 'slot>0', and got
the evidence from PCIe spec. :)

> My suspicion is that something has gone wrong or is missing in the
> hotplug logic but I havent had time to dig deeper into it yet.
> 
Not only hotplugging, but clodplugging. 
So, I don't think this a issue of hotplugging.

> I am just concerned that this might alleviate the symptoms you see but
> not fix the problem itself,
> 

IMHO, it is always right that we comply with PCIe specification. Right?

Best regards,
-Gonglei

> Knut
> 
> > > Please make this explicit in the commit log.
> > >
> > Sorry, I copied the description from PCIe spec. :(
> >
> > IMPLEMENTATION NOTE at Page 19:
> >
> >
> https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-in
> terpretation-070604.pdf
> >
> >
> > >
> > > >
> > > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > > we should assure that its slot is not non-zero. For PCIe devices, which
> > > > ARP capability is not enabled, we also should assure that its slot
> > > > is not non-zero.
> > >
> > > not non zero => zero
> > >
> > OK.
> >
> > > >
> > > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > > ---
> > > >  hw/pci/pci.c          |  4 ++++
> > > >  hw/pci/pcie.c         | 51
> > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  include/hw/pci/pcie.h |  1 +
> > > >  3 files changed, 56 insertions(+)
> > > >
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index 6ce75aa..22b7ca0 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> > > >          }
> > > >      }
> > > >
> > > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > > +        return -1;
> > > > +    }
> > > > +
> > > >      /* rom loading */
> > > >      is_default_rom = false;
> > > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > > index 1babddf..b82211a 100644
> > > > --- a/hw/pci/pcie.c
> > > > +++ b/hw/pci/pcie.c
> > > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t
> offset,
> > > uint16_t nextfn)
> > > >                          offset, PCI_ARI_SIZEOF);
> > > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) <<
> 8);
> > > >  }
> > > > +
> > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > > +{
> > > > +    Object *obj = OBJECT(bus);
> > > > +
> > > > +    if (pci_bus_is_root(bus)) {
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > > +        uint8_t port_type;
> > > > +        /*
> > > > +         * Root ports and downstream ports of switches are the hot
> > > > +         * pluggable ports in a PCI Express hierarchy.
> > > > +         * PCI Express supports chip-to-chip interconnect, a PCIe link
> can
> > > > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > > > +         *
> > > > +         * 7.3. Configuration Transaction Rules (PCI Express
> specification
> > > 3.0)
> > > > +         * 7.3.1. Device Number
> > > > +         *
> > > > +         * Downstream Ports that do not have ARI Forwarding
> enabled
> > > must
> > > > +         * associate only Device 0 with the device attached to the
> Logical
> > > Bus
> > > > +         * representing the Link from the Port.
> > > > +         *
> > > > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> > > PCIe
> > >
> > > s/enabled default/enabled by default/
> > >
> > OK.
> >
> > > > +         * ports. ARI Forwarding enable setting at firmware/OS
> Control
> > > handoff.
> > >
> > >
> > > can parse last sentence. drop it?
> > >
> > OK.
> >
> > > > +         * If the bit is Set when a non-ARI Device is present, the
> non-ARI
> > > > +           Device can respond to Configuration Space accesses under
> > > what it
> > > > +         * interprets as being different Device Numbers, and its
> Functions
> > > can
> > > > +         * be aliased under multiple Device Numbers, generally
> leading to
> > > > +         * undesired behavior.
> > >
> > > I don't think any badness really happens.
> > > Did you observe any?
> > >
> > Same with above explanation.
> >
> > >
> > >
> > > > +         */
> > > > +        port_type = pcie_cap_get_type(pci_dev);
> > > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > > +            if (!pci_is_express(dev) ||
> > > > +                (pci_is_express(dev) &&
> > > > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> > >
> > > I would just skip the test for a non express function.
> > >
> > Sorry?
> >
> > Best regards,
> > -Gonglei
> >
> > > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > > +                    error_report("PCIe: non-ARI device can't be
> > > populated"
> > > > +                                 " in slot %d",
> > > PCI_SLOT(dev->devfn));
> > > > +                    return -1;
> > > > +                }
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return 0;
> > > > +}
> > > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > > index d139d58..1d8f3f4 100644
> > > > --- a/include/hw/pci/pcie.h
> > > > +++ b/include/hw/pci/pcie.h
> > > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > > >                           uint16_t offset, uint16_t size);
> > > >
> > > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > > >
> > > >  extern const VMStateDescription vmstate_pcie_device;
> > > >
> > > > --
> > > > 1.7.12.4
> > > >
> >
> >
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-10-01  5:26       ` Knut Omang
  2014-10-01  7:15         ` Gonglei
@ 2014-10-01 14:08         ` Marcel Apfelbaum
  2014-10-03 11:22           ` Knut Omang
  1 sibling, 1 reply; 16+ messages in thread
From: Marcel Apfelbaum @ 2014-10-01 14:08 UTC (permalink / raw)
  To: Knut Omang
  Cc: peter.crosthwaite, weidong.huang, 'Michael S. Tsirkin',
	luonengjun, armbru, qemu-devel, arei.gonglei, imammedo, pbonzini,
	peter.huangpeng, afaerber, Gonglei

On Wed, 2014-10-01 at 07:26 +0200, Knut Omang wrote:
> On Tue, 2014-09-30 at 21:38 +0800, Gonglei wrote:
> > > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> > > pcie devices
> > > 
> > > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com wrote:
> > > > From: Gonglei <arei.gonglei@huawei.com>
> > > >
> > > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > Device can respond to Configuration Space accesses under what it
> > > > interprets as being different Device Numbers, and its Functions can
> > > > be aliased under multiple Device Numbers, generally leading to
> > > > undesired behavior.
> > > 
> > > So what is the undesired behaviour?
> > > Did you observe such?
> > 
> > I just observe the PCI device don't work, and the dmesg show me:
> > 
> > [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> > [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> > [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due to button press.
> > [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> > [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
> > [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)
> 
> This seems very much like the symptoms I see when I use hotplug after
> the latest changes to the hotplug code - do you have something that
> confirms this has something to do with wrong interpretation of device
> IDs? My suspicion is that something has gone wrong or is missing in the
> hotplug logic but I havent had time to dig deeper into it yet.
Can you please describe me the steps to reproduce the issue?
It should work correctly by now, maybe a "surprise removal/addition"
warning and nothing more.

Thank you,
Marcel

> 
> I am just concerned that this might alleviate the symptoms you see but
> not fix the problem itself,
> 
> Knut
> 
> > > Please make this explicit in the commit log.
> > > 
> > Sorry, I copied the description from PCIe spec. :(
> > 
> > IMPLEMENTATION NOTE at Page 19:
> > 
> > https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-interpretation-070604.pdf
> > 
> > 
> > > 
> > > >
> > > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > > we should assure that its slot is not non-zero. For PCIe devices, which
> > > > ARP capability is not enabled, we also should assure that its slot
> > > > is not non-zero.
> > > 
> > > not non zero => zero
> > > 
> > OK.
> > 
> > > >
> > > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > > ---
> > > >  hw/pci/pci.c          |  4 ++++
> > > >  hw/pci/pcie.c         | 51
> > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  include/hw/pci/pcie.h |  1 +
> > > >  3 files changed, 56 insertions(+)
> > > >
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index 6ce75aa..22b7ca0 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> > > >          }
> > > >      }
> > > >
> > > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > > +        return -1;
> > > > +    }
> > > > +
> > > >      /* rom loading */
> > > >      is_default_rom = false;
> > > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > > index 1babddf..b82211a 100644
> > > > --- a/hw/pci/pcie.c
> > > > +++ b/hw/pci/pcie.c
> > > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset,
> > > uint16_t nextfn)
> > > >                          offset, PCI_ARI_SIZEOF);
> > > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
> > > >  }
> > > > +
> > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > > +{
> > > > +    Object *obj = OBJECT(bus);
> > > > +
> > > > +    if (pci_bus_is_root(bus)) {
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > > +        uint8_t port_type;
> > > > +        /*
> > > > +         * Root ports and downstream ports of switches are the hot
> > > > +         * pluggable ports in a PCI Express hierarchy.
> > > > +         * PCI Express supports chip-to-chip interconnect, a PCIe link can
> > > > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > > > +         *
> > > > +         * 7.3. Configuration Transaction Rules (PCI Express specification
> > > 3.0)
> > > > +         * 7.3.1. Device Number
> > > > +         *
> > > > +         * Downstream Ports that do not have ARI Forwarding enabled
> > > must
> > > > +         * associate only Device 0 with the device attached to the Logical
> > > Bus
> > > > +         * representing the Link from the Port.
> > > > +         *
> > > > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> > > PCIe
> > > 
> > > s/enabled default/enabled by default/
> > > 
> > OK.
> > 
> > > > +         * ports. ARI Forwarding enable setting at firmware/OS Control
> > > handoff.
> > > 
> > > 
> > > can parse last sentence. drop it?
> > > 
> > OK.
> > 
> > > > +         * If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > +           Device can respond to Configuration Space accesses under
> > > what it
> > > > +         * interprets as being different Device Numbers, and its Functions
> > > can
> > > > +         * be aliased under multiple Device Numbers, generally leading to
> > > > +         * undesired behavior.
> > > 
> > > I don't think any badness really happens.
> > > Did you observe any?
> > > 
> > Same with above explanation.
> > 
> > > 
> > > 
> > > > +         */
> > > > +        port_type = pcie_cap_get_type(pci_dev);
> > > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > > +            if (!pci_is_express(dev) ||
> > > > +                (pci_is_express(dev) &&
> > > > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> > > 
> > > I would just skip the test for a non express function.
> > > 
> > Sorry?
> > 
> > Best regards,
> > -Gonglei
> > 
> > > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > > +                    error_report("PCIe: non-ARI device can't be
> > > populated"
> > > > +                                 " in slot %d",
> > > PCI_SLOT(dev->devfn));
> > > > +                    return -1;
> > > > +                }
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return 0;
> > > > +}
> > > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > > index d139d58..1d8f3f4 100644
> > > > --- a/include/hw/pci/pcie.h
> > > > +++ b/include/hw/pci/pcie.h
> > > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > > >                           uint16_t offset, uint16_t size);
> > > >
> > > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > > >
> > > >  extern const VMStateDescription vmstate_pcie_device;
> > > >
> > > > --
> > > > 1.7.12.4
> > > >
> > 
> > 
> 
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-10-01 14:08         ` Marcel Apfelbaum
@ 2014-10-03 11:22           ` Knut Omang
  2014-10-03 14:30             ` Knut Omang
  0 siblings, 1 reply; 16+ messages in thread
From: Knut Omang @ 2014-10-03 11:22 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: peter.crosthwaite, weidong.huang, 'Michael S. Tsirkin',
	luonengjun, armbru, qemu-devel, arei.gonglei, imammedo, pbonzini,
	peter.huangpeng, afaerber, Gonglei

On Wed, 2014-10-01 at 17:08 +0300, Marcel Apfelbaum wrote:
> On Wed, 2014-10-01 at 07:26 +0200, Knut Omang wrote:
> > On Tue, 2014-09-30 at 21:38 +0800, Gonglei wrote:
> > > > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> > > > pcie devices
> > > > 
> > > > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com wrote:
> > > > > From: Gonglei <arei.gonglei@huawei.com>
> > > > >
> > > > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > > > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > > > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > > Device can respond to Configuration Space accesses under what it
> > > > > interprets as being different Device Numbers, and its Functions can
> > > > > be aliased under multiple Device Numbers, generally leading to
> > > > > undesired behavior.
> > > > 
> > > > So what is the undesired behaviour?
> > > > Did you observe such?
> > > 
> > > I just observe the PCI device don't work, and the dmesg show me:
> > > 
> > > [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> > > [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> > > [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due to button press.
> > > [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> > > [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
> > > [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)
> > 
> > This seems very much like the symptoms I see when I use hotplug after
> > the latest changes to the hotplug code - do you have something that
> > confirms this has something to do with wrong interpretation of device
> > IDs? My suspicion is that something has gone wrong or is missing in the
> > hotplug logic but I havent had time to dig deeper into it yet.
> Can you please describe me the steps to reproduce the issue?

Hmm, while trying to reproduce again I realize my hotplug issues are not
the same as the one Gonglei reports, let me come back to that in a
separate mail later,

My main point here is that I don't see how this particular fix would
alleviate Gonglei's issue, as it does not seem to get triggered unless
there's a bug in the emulated port/switch?

Gonglei, I assume you are use the TI x3130 in your example:
IMHO any PCIe x 1 device should fit into any of the (3) PCIe slots
provided by the TI x3130, and according to the spec:

http://www.ti.com/lit/gpn/xio3130

these slots appear as separate buses, which means that all devices will
have devfn 0.0 but on different buses, eg. if you have all three switch
slots (not to be confused with the slot number given by the PCI_SLOT()
macro) populated and no other secondary buses before it, you should see
the downstream switch ports as 01:xx:x and devices in 02:00.0, 03:00.0
and 04:00.0 for slots 0, 1 and 2. This seems to correspond well with the
current Qemu model.

So unless your device itself exposes function# > 8 and is not ARI
capable (which would be a non-compliant device as far as I read the
standard) you should never be able to see any device in any of the
downstream ports have PCI_SLOT(devfn) != 0 ?

With qemu master + my SR/IOV patch set + the igb patch (just to have an
ARI capable device to play with) here:

https://github.com/knuto/qemu, branch sriov_patches_v3

and a guest running F20 I am able to boot with a device (such as for
instance an e1000) inserted in a root port, I am also able to hot plug
it from the qemu monitor, eg.

qemu-kvm  ... \
  -device ioh3420,slot=0,id=pcie_port.0

...

(qemu) device_add e1000,vlan=1,bus=pcie_port.0,id=ne
(qemu) device_del ne

In both cases the ARIfwd bit is disabled by default and not enabled by
the port driver (just as I would expect) so at least your comment 
is wrong as far as I can see.

Booting with a two-port downstream switch with no devices plugged, I see
the same, ARIfwd is not enabled on the downstream ports as long as no
device is in any slots, which is as expected, isn't it?

qemu-kvm  ... \
  -device x3130-upstream,id=upsw \
  -device xio3130-downstream,bus=upsw,addr=0.0,chassis=5,id=ds_port.0 \
  -device xio3130-downstream,bus=upsw,addr=1.0,chassis=6,id=ds_port.1
...

The upstream port does not support ARIfwd in the QEMU model, which I
suppose is correct as it will only contain individual downstream
devices, which expose new buses but never have more than one function.

However, either booting or hotplugging an e1000 into the downstream
switch, eg.

(qemu) device_add e1000,vlan=1,bus=ds_port.0,id=ne

with my current guest image kernel 3.13.10-200.fc20.x86_64+debug I get a
NULL pointer Oops in the guest,

[    0.520009] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[    0.521000] IP: [<ffffffff813d942d>] pcie_aspm_init_link_state+0x6ed/0x7c0
[    0.521000] PGD 0 
[    0.521000] Oops: 0000 [#1] SMP 

The exact same happens if I use the ARI capable igb device instead.

I will upgrade the guest to a newer image (what kernel are you running with, Gonglei?) 

lspci reports this on the guest (before I try to add any device in the downstream port(s)):

00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
00:01.0 VGA compatible controller: Cirrus Logic GD 5446
00:02.0 Ethernet controller: Red Hat, Inc Virtio network device
00:03.0 Unclassified device [0002]: Red Hat, Inc Virtio filesystem
00:04.0 PCI bridge: Intel Corporation 7500/5520/5500/X58 I/O Hub PCI Express Root Port 0 (rev 02)
00:05.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Upstream) (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
02:00.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Downstream) (rev 01)
02:01.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Downstream) (rev 01)

# lspci -t
-[0000:00]-+-00.0
           +-01.0
           +-02.0
           +-03.0
           +-04.0-[01]--
           +-05.0-[02-04]--+-00.0-[03]--
           |               \-01.0-[04]--
           +-1f.0
           +-1f.2
           \-1f.3

eg. the devices plugged into the downstream ports will end up at 03:00.0 and 04:00.0,
the info qtree command verifies that the e1000 insert ends up in 03:00.0 as expected.

Similar if I instead use 

(qemu) device_add e1000,vlan=1,bus=ds_port.1,id=ne2

I see this with info qtree:

  class Ethernet controller, addr 04:00.0, pci id 8086:100e (sub 1af4:1100) 

Anyway I suppose this indicates that something is not right/is missing
with the downstream switch emulation, and not with the (generic) way the
ARIfwd cap/ctrl bit works

> It should work correctly by now, maybe a "surprise removal/addition"
> warning and nothing more.
> 
> Thank you,
> Marcel
> 
> > 
> > I am just concerned that this might alleviate the symptoms you see but
> > not fix the problem itself,
> > 
> > Knut
> > 
> > > > Please make this explicit in the commit log.
> > > > 
> > > Sorry, I copied the description from PCIe spec. :(
> > > 
> > > IMPLEMENTATION NOTE at Page 19:
> > > 
> > > https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-interpretation-070604.pdf
> > > 
> > > 
> > > > 
> > > > >
> > > > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > > > we should assure that its slot is not non-zero. 



> > > > > For PCIe devices, which
> > > > > ARP capability is not enabled, we also should assure that its slot
> > > > > is not non-zero.
> > > > 
> > > > not non zero => zero
> > > > 
> > > OK.
> > > 
> > > > >
> > > > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > > > ---
> > > > >  hw/pci/pci.c          |  4 ++++
> > > > >  hw/pci/pcie.c         | 51
> > > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >  include/hw/pci/pcie.h |  1 +
> > > > >  3 files changed, 56 insertions(+)
> > > > >
> > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > index 6ce75aa..22b7ca0 100644
> > > > > --- a/hw/pci/pci.c
> > > > > +++ b/hw/pci/pci.c
> > > > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> > > > >          }
> > > > >      }
> > > > >
> > > > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > >      /* rom loading */
> > > > >      is_default_rom = false;
> > > > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > > > index 1babddf..b82211a 100644
> > > > > --- a/hw/pci/pcie.c
> > > > > +++ b/hw/pci/pcie.c
> > > > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset,
> > > > uint16_t nextfn)
> > > > >                          offset, PCI_ARI_SIZEOF);
> > > > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
> > > > >  }
> > > > > +
> > > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > > > +{
> > > > > +    Object *obj = OBJECT(bus);
> > > > > +
> > > > > +    if (pci_bus_is_root(bus)) {
> > > > > +        return 0;
> > > > > +    }
> > > > > +
> > > > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > > > +        uint8_t port_type;
> > > > > +        /*
> > > > > +         * Root ports and downstream ports of switches are the hot
> > > > > +         * pluggable ports in a PCI Express hierarchy.
> > > > > +         * PCI Express supports chip-to-chip interconnect, a PCIe link can
> > > > > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > > > > +         *
> > > > > +         * 7.3. Configuration Transaction Rules (PCI Express specification
> > > > 3.0)
> > > > > +         * 7.3.1. Device Number
> > > > > +         *
> > > > > +         * Downstream Ports that do not have ARI Forwarding enabled
> > > > must
> > > > > +         * associate only Device 0 with the device attached to the Logical
> > > > Bus
> > > > > +         * representing the Link from the Port.
> > > > > +         *
> > > > > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> > > > PCIe
> > > > 
> > > > s/enabled default/enabled by default/

It is not when I try this (as discussed above)
The capability is there but it is disabled by default.

> > > > 
> > > OK.
> > > 
> > > > > +         * ports. ARI Forwarding enable setting at firmware/OS Control
> > > > handoff.
> > > > 
> > > > 
> > > > can parse last sentence. drop it?
> > > > 
> > > OK.
> > > 
> > > > > +         * If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > > +           Device can respond to Configuration Space accesses under
> > > > what it
> > > > > +         * interprets as being different Device Numbers, and its Functions
> > > > can
> > > > > +         * be aliased under multiple Device Numbers, generally leading to
> > > > > +         * undesired behavior.

I don't understand how this is possible to achieve.

Knut

> > > > I don't think any badness really happens.
> > > > Did you observe any?
> > > > 
> > > Same with above explanation.
> > > 
> > > > 
> > > > 
> > > > > +         */
> > > > > +        port_type = pcie_cap_get_type(pci_dev);
> > > > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > > > +            if (!pci_is_express(dev) ||
> > > > > +                (pci_is_express(dev) &&
> > > > > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> > > > 
> > > > I would just skip the test for a non express function.
> > > > 
> > > Sorry?
> > > 
> > > Best regards,
> > > -Gonglei
> > > 
> > > > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > > > +                    error_report("PCIe: non-ARI device can't be
> > > > populated"
> > > > > +                                 " in slot %d",
> > > > PCI_SLOT(dev->devfn));
> > > > > +                    return -1;
> > > > > +                }
> > > > > +            }
> > > > > +        }
> > > > > +    }
> > > > > +
> > > > > +    return 0;
> > > > > +}
> > > > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > > > index d139d58..1d8f3f4 100644
> > > > > --- a/include/hw/pci/pcie.h
> > > > > +++ b/include/hw/pci/pcie.h
> > > > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > > > >                           uint16_t offset, uint16_t size);
> > > > >
> > > > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > > > >
> > > > >  extern const VMStateDescription vmstate_pcie_device;
> > > > >
> > > > > --
> > > > > 1.7.12.4
> > > > >
> > > 
> > > 
> > 
> > 
> > 
> 
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-10-03 11:22           ` Knut Omang
@ 2014-10-03 14:30             ` Knut Omang
  2014-10-08  3:23               ` Gonglei (Arei)
  0 siblings, 1 reply; 16+ messages in thread
From: Knut Omang @ 2014-10-03 14:30 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: peter.crosthwaite, weidong.huang, 'Michael S. Tsirkin',
	luonengjun, armbru, qemu-devel, arei.gonglei, imammedo, pbonzini,
	peter.huangpeng, afaerber, Gonglei

On Fri, 2014-10-03 at 13:22 +0200, Knut Omang wrote:
> On Wed, 2014-10-01 at 17:08 +0300, Marcel Apfelbaum wrote:
> > On Wed, 2014-10-01 at 07:26 +0200, Knut Omang wrote:
> > > On Tue, 2014-09-30 at 21:38 +0800, Gonglei wrote:
> > > > > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> > > > > pcie devices
> > > > > 
> > > > > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com wrote:
> > > > > > From: Gonglei <arei.gonglei@huawei.com>
> > > > > >
> > > > > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > > > > ports. ARI Forwarding enable setting at firmware/OS Control handoff.
> > > > > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > > > Device can respond to Configuration Space accesses under what it
> > > > > > interprets as being different Device Numbers, and its Functions can
> > > > > > be aliased under multiple Device Numbers, generally leading to
> > > > > > undesired behavior.
> > > > > 
> > > > > So what is the undesired behaviour?
> > > > > Did you observe such?
> > > > 
> > > > I just observe the PCI device don't work, and the dmesg show me:
> > > > 
> > > > [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> > > > [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> > > > [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on due to button press.
> > > > [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> > > > [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 - 4)
> > > > [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 - 4)
> > > 
> > > This seems very much like the symptoms I see when I use hotplug after
> > > the latest changes to the hotplug code - do you have something that
> > > confirms this has something to do with wrong interpretation of device
> > > IDs? My suspicion is that something has gone wrong or is missing in the
> > > hotplug logic but I havent had time to dig deeper into it yet.
> > Can you please describe me the steps to reproduce the issue?
> 
> Hmm, while trying to reproduce again I realize my hotplug issues are not
> the same as the one Gonglei reports, let me come back to that in a
> separate mail later,
> 
> My main point here is that I don't see how this particular fix would
> alleviate Gonglei's issue, as it does not seem to get triggered unless
> there's a bug in the emulated port/switch?
> 
> Gonglei, I assume you are use the TI x3130 in your example:
> IMHO any PCIe x 1 device should fit into any of the (3) PCIe slots
> provided by the TI x3130, and according to the spec:
> 
> http://www.ti.com/lit/gpn/xio3130
> 
> these slots appear as separate buses, which means that all devices will
> have devfn 0.0 but on different buses, eg. if you have all three switch
> slots (not to be confused with the slot number given by the PCI_SLOT()
> macro) populated and no other secondary buses before it, you should see
> the downstream switch ports as 01:xx:x and devices in 02:00.0, 03:00.0
> and 04:00.0 for slots 0, 1 and 2. This seems to correspond well with the
> current Qemu model.
> 
> So unless your device itself exposes function# > 8 and is not ARI
> capable (which would be a non-compliant device as far as I read the
> standard) you should never be able to see any device in any of the
> downstream ports have PCI_SLOT(devfn) != 0 ?
> 
> With qemu master + my SR/IOV patch set + the igb patch (just to have an
> ARI capable device to play with) here:
> 
> https://github.com/knuto/qemu, branch sriov_patches_v3
> 
> and a guest running F20 I am able to boot with a device (such as for
> instance an e1000) inserted in a root port, I am also able to hot plug
> it from the qemu monitor, eg.
> 
> qemu-kvm  ... \
>   -device ioh3420,slot=0,id=pcie_port.0
> 
> ...
> 
> (qemu) device_add e1000,vlan=1,bus=pcie_port.0,id=ne
> (qemu) device_del ne
> 
> In both cases the ARIfwd bit is disabled by default and not enabled by
> the port driver (just as I would expect) so at least your comment 
> is wrong as far as I can see.
> 
> Booting with a two-port downstream switch with no devices plugged, I see
> the same, ARIfwd is not enabled on the downstream ports as long as no
> device is in any slots, which is as expected, isn't it?
> 
> qemu-kvm  ... \
>   -device x3130-upstream,id=upsw \
>   -device xio3130-downstream,bus=upsw,addr=0.0,chassis=5,id=ds_port.0 \
>   -device xio3130-downstream,bus=upsw,addr=1.0,chassis=6,id=ds_port.1
> ...
> 
> The upstream port does not support ARIfwd in the QEMU model, which I
> suppose is correct as it will only contain individual downstream
> devices, which expose new buses but never have more than one function.
> 
> However, either booting or hotplugging an e1000 into the downstream
> switch, eg.
> 
> (qemu) device_add e1000,vlan=1,bus=ds_port.0,id=ne
> 
> with my current guest image kernel 3.13.10-200.fc20.x86_64+debug I get a
> NULL pointer Oops in the guest,
> 
> [    0.520009] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> [    0.521000] IP: [<ffffffff813d942d>] pcie_aspm_init_link_state+0x6ed/0x7c0
> [    0.521000] PGD 0 
> [    0.521000] Oops: 0000 [#1] SMP 

here is the gdb version of the stack trace:

0xffffffff813d9431 in alloc_pcie_link_state (pdev=0xffff880178cec520) at drivers/pci/pcie/aspm.c:530
530                     parent = pdev->bus->parent->self->link_state;
(gdb) where
#0  0xffffffff813d9431 in alloc_pcie_link_state (pdev=0xffff880178cec520) at drivers/pci/pcie/aspm.c:530
#1  pcie_aspm_init_link_state (pdev=0xffff880178cec520) at drivers/pci/pcie/aspm.c:578
#2  0xffffffff813c6bbd in pci_scan_slot (bus=bus@entry=0xffff880178cad388, devfn=devfn@entry=0)
    at drivers/pci/probe.c:1486
#3  0xffffffff813e0d54 in pciehp_configure_device (p_slot=p_slot@entry=0xffff880173982760)
    at drivers/pci/hotplug/pciehp_pci.c:54
#4  0xffffffff813e05cb in board_added (p_slot=0xffff880173982760) at drivers/pci/hotplug/pciehp_ctrl.c:223
#5  pciehp_enable_slot (p_slot=p_slot@entry=0xffff880173982760) at drivers/pci/hotplug/pciehp_ctrl.c:510
#6  0xffffffff813e0ac0 in pciehp_power_thread (work=<optimized out>) at drivers/pci/hotplug/pciehp_ctrl.c:308
#7  0xffffffff8109bc31 in process_one_work (worker=worker@entry=0xffff8801730fd728, work=0xffff8801731bbc30)
    at kernel/workqueue.c:2214
#8  0xffffffff8109c1eb in worker_thread (__worker=0xffff8801730fd728) at kernel/workqueue.c:2340
#9  0xffffffff810a43ef in kthread (_create=0xffff88017336edc8) at kernel/kthread.c:207
#10 <signal handler called>
#11 0x0000000000000000 in irq_stack_union ()
#12 0x0000000000000000 in ?? ()
(gdb) p pdev->bus->parent->self
$2 = (struct pci_dev *) 0x0 <irq_stack_union>                                                                         
(gdb)                                        

I observe the same behaviour with guest kernel 3.16.3-200.fc20.x86_64.

Knut

> The exact same happens if I use the ARI capable igb device instead.
> 
> I will upgrade the guest to a newer image (what kernel are you running with, Gonglei?) 
> 
> lspci reports this on the guest (before I try to add any device in the downstream port(s)):
> 
> 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> 00:01.0 VGA compatible controller: Cirrus Logic GD 5446
> 00:02.0 Ethernet controller: Red Hat, Inc Virtio network device
> 00:03.0 Unclassified device [0002]: Red Hat, Inc Virtio filesystem
> 00:04.0 PCI bridge: Intel Corporation 7500/5520/5500/X58 I/O Hub PCI Express Root Port 0 (rev 02)
> 00:05.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Upstream) (rev 02)
> 00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
> 00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
> 00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
> 02:00.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Downstream) (rev 01)
> 02:01.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch (Downstream) (rev 01)
> 
> # lspci -t
> -[0000:00]-+-00.0
>            +-01.0
>            +-02.0
>            +-03.0
>            +-04.0-[01]--
>            +-05.0-[02-04]--+-00.0-[03]--
>            |               \-01.0-[04]--
>            +-1f.0
>            +-1f.2
>            \-1f.3
> 
> eg. the devices plugged into the downstream ports will end up at 03:00.0 and 04:00.0,
> the info qtree command verifies that the e1000 insert ends up in 03:00.0 as expected.
> 
> Similar if I instead use 
> 
> (qemu) device_add e1000,vlan=1,bus=ds_port.1,id=ne2
> 
> I see this with info qtree:
> 
>   class Ethernet controller, addr 04:00.0, pci id 8086:100e (sub 1af4:1100) 
> 
> Anyway I suppose this indicates that something is not right/is missing
> with the downstream switch emulation, and not with the (generic) way the
> ARIfwd cap/ctrl bit works
> 
> > It should work correctly by now, maybe a "surprise removal/addition"
> > warning and nothing more.
> > 
> > Thank you,
> > Marcel
> > 
> > > 
> > > I am just concerned that this might alleviate the symptoms you see but
> > > not fix the problem itself,
> > > 
> > > Knut
> > > 
> > > > > Please make this explicit in the commit log.
> > > > > 
> > > > Sorry, I copied the description from PCIe spec. :(
> > > > 
> > > > IMPLEMENTATION NOTE at Page 19:
> > > > 
> > > > https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-interpretation-070604.pdf
> > > > 
> > > > 
> > > > > 
> > > > > >
> > > > > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > > > > we should assure that its slot is not non-zero. 
> 
> 
> 
> > > > > > For PCIe devices, which
> > > > > > ARP capability is not enabled, we also should assure that its slot
> > > > > > is not non-zero.
> > > > > 
> > > > > not non zero => zero
> > > > > 
> > > > OK.
> > > > 
> > > > > >
> > > > > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > > > > ---
> > > > > >  hw/pci/pci.c          |  4 ++++
> > > > > >  hw/pci/pcie.c         | 51
> > > > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > > >  include/hw/pci/pcie.h |  1 +
> > > > > >  3 files changed, 56 insertions(+)
> > > > > >
> > > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > > index 6ce75aa..22b7ca0 100644
> > > > > > --- a/hw/pci/pci.c
> > > > > > +++ b/hw/pci/pci.c
> > > > > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState *qdev)
> > > > > >          }
> > > > > >      }
> > > > > >
> > > > > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > > > > +        return -1;
> > > > > > +    }
> > > > > > +
> > > > > >      /* rom loading */
> > > > > >      is_default_rom = false;
> > > > > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > > > > index 1babddf..b82211a 100644
> > > > > > --- a/hw/pci/pcie.c
> > > > > > +++ b/hw/pci/pcie.c
> > > > > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t offset,
> > > > > uint16_t nextfn)
> > > > > >                          offset, PCI_ARI_SIZEOF);
> > > > > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn & 0xff) << 8);
> > > > > >  }
> > > > > > +
> > > > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > > > > +{
> > > > > > +    Object *obj = OBJECT(bus);
> > > > > > +
> > > > > > +    if (pci_bus_is_root(bus)) {
> > > > > > +        return 0;
> > > > > > +    }
> > > > > > +
> > > > > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > > > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > > > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > > > > +        uint8_t port_type;
> > > > > > +        /*
> > > > > > +         * Root ports and downstream ports of switches are the hot
> > > > > > +         * pluggable ports in a PCI Express hierarchy.
> > > > > > +         * PCI Express supports chip-to-chip interconnect, a PCIe link can
> > > > > > +         * only connect one PCI device/Switch/EndPoint or PCI-bridge.
> > > > > > +         *
> > > > > > +         * 7.3. Configuration Transaction Rules (PCI Express specification
> > > > > 3.0)
> > > > > > +         * 7.3.1. Device Number
> > > > > > +         *
> > > > > > +         * Downstream Ports that do not have ARI Forwarding enabled
> > > > > must
> > > > > > +         * associate only Device 0 with the device attached to the Logical
> > > > > Bus
> > > > > > +         * representing the Link from the Port.
> > > > > > +         *
> > > > > > +         * In QEMU, ARI Forwarding is enabled default at emulation of
> > > > > PCIe
> > > > > 
> > > > > s/enabled default/enabled by default/
> 
> It is not when I try this (as discussed above)
> The capability is there but it is disabled by default.
> 
> > > > > 
> > > > OK.
> > > > 
> > > > > > +         * ports. ARI Forwarding enable setting at firmware/OS Control
> > > > > handoff.
> > > > > 
> > > > > 
> > > > > can parse last sentence. drop it?
> > > > > 
> > > > OK.
> > > > 
> > > > > > +         * If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > > > +           Device can respond to Configuration Space accesses under
> > > > > what it
> > > > > > +         * interprets as being different Device Numbers, and its Functions
> > > > > can
> > > > > > +         * be aliased under multiple Device Numbers, generally leading to
> > > > > > +         * undesired behavior.
> 
> I don't understand how this is possible to achieve.
> 
> Knut
> 
> > > > > I don't think any badness really happens.
> > > > > Did you observe any?
> > > > > 
> > > > Same with above explanation.
> > > > 
> > > > > 
> > > > > 
> > > > > > +         */
> > > > > > +        port_type = pcie_cap_get_type(pci_dev);
> > > > > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > > > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > > > > +            if (!pci_is_express(dev) ||
> > > > > > +                (pci_is_express(dev) &&
> > > > > > +                !pcie_find_capability(dev, PCI_EXT_CAP_ID_ARI))) {
> > > > > 
> > > > > I would just skip the test for a non express function.
> > > > > 
> > > > Sorry?
> > > > 
> > > > Best regards,
> > > > -Gonglei
> > > > 
> > > > > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > > > > +                    error_report("PCIe: non-ARI device can't be
> > > > > populated"
> > > > > > +                                 " in slot %d",
> > > > > PCI_SLOT(dev->devfn));
> > > > > > +                    return -1;
> > > > > > +                }
> > > > > > +            }
> > > > > > +        }
> > > > > > +    }
> > > > > > +
> > > > > > +    return 0;
> > > > > > +}
> > > > > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > > > > index d139d58..1d8f3f4 100644
> > > > > > --- a/include/hw/pci/pcie.h
> > > > > > +++ b/include/hw/pci/pcie.h
> > > > > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > > > > >                           uint16_t offset, uint16_t size);
> > > > > >
> > > > > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t nextfn);
> > > > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > > > > >
> > > > > >  extern const VMStateDescription vmstate_pcie_device;
> > > > > >
> > > > > > --
> > > > > > 1.7.12.4
> > > > > >
> > > > 
> > > > 
> > > 
> > > 
> > > 
> > 
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices
  2014-10-03 14:30             ` Knut Omang
@ 2014-10-08  3:23               ` Gonglei (Arei)
  0 siblings, 0 replies; 16+ messages in thread
From: Gonglei (Arei) @ 2014-10-08  3:23 UTC (permalink / raw)
  To: Knut Omang, Marcel Apfelbaum
  Cc: peter.crosthwaite, Huangweidong (C), 'Michael S. Tsirkin',
	Luonengjun, armbru, qemu-devel, imammedo, pbonzini,
	Huangpeng (Peter),
	afaerber, Gonglei

> Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of
> pcie devices
> 
> On Fri, 2014-10-03 at 13:22 +0200, Knut Omang wrote:
> > On Wed, 2014-10-01 at 17:08 +0300, Marcel Apfelbaum wrote:
> > > On Wed, 2014-10-01 at 07:26 +0200, Knut Omang wrote:
> > > > On Tue, 2014-09-30 at 21:38 +0800, Gonglei wrote:
> > > > > > Subject: Re: [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari
> capability of
> > > > > > pcie devices
> > > > > >
> > > > > > On Tue, Sep 30, 2014 at 06:11:25PM +0800, arei.gonglei@huawei.com
> wrote:
> > > > > > > From: Gonglei <arei.gonglei@huawei.com>
> > > > > > >
> > > > > > > In QEMU, ARI Forwarding is enabled default at emulation of PCIe
> > > > > > > ports. ARI Forwarding enable setting at firmware/OS Control
> handoff.
> > > > > > > If the bit is Set when a non-ARI Device is present, the non-ARI
> > > > > > > Device can respond to Configuration Space accesses under what it
> > > > > > > interprets as being different Device Numbers, and its Functions can
> > > > > > > be aliased under multiple Device Numbers, generally leading to
> > > > > > > undesired behavior.
> > > > > >
> > > > > > So what is the undesired behaviour?
> > > > > > Did you observe such?
> > > > >
> > > > > I just observe the PCI device don't work, and the dmesg show me:
> > > > >
> > > > > [ 159.035250] pciehp 0000:05:00.0:pcie24: Button pressed on Slot (0 - 4)
> > > > > [ 159.035274] pciehp 0000:05:00.0:pcie24: Card present on Slot (0 - 4)
> > > > > [ 159.036517] pciehp 0000:05:00.0:pcie24: PCI slot #0 - 4 - powering on
> due to button press.
> > > > > [ 159.188049] pciehp 0000:05:00.0:pcie24: Failed to check link status
> > > > > [ 159.201968] pciehp 0000:05:00.0:pcie24: Card not present on Slot (0 -
> 4)
> > > > > [ 159.202529] pciehp 0000:05:00.0:pcie24: Already disabled on Slot (0 -
> 4)
> > > >
> > > > This seems very much like the symptoms I see when I use hotplug after
> > > > the latest changes to the hotplug code - do you have something that
> > > > confirms this has something to do with wrong interpretation of device
> > > > IDs? My suspicion is that something has gone wrong or is missing in the
> > > > hotplug logic but I havent had time to dig deeper into it yet.
> > > Can you please describe me the steps to reproduce the issue?
> >
> > Hmm, while trying to reproduce again I realize my hotplug issues are not
> > the same as the one Gonglei reports, let me come back to that in a
> > separate mail later,
> >
> > My main point here is that I don't see how this particular fix would
> > alleviate Gonglei's issue, as it does not seem to get triggered unless
> > there's a bug in the emulated port/switch?
> >
> > Gonglei, I assume you are use the TI x3130 in your example:
> > IMHO any PCIe x 1 device should fit into any of the (3) PCIe slots
> > provided by the TI x3130, and according to the spec:
> >
> > http://www.ti.com/lit/gpn/xio3130
> >
> > these slots appear as separate buses, which means that all devices will
> > have devfn 0.0 but on different buses, eg. if you have all three switch
> > slots (not to be confused with the slot number given by the PCI_SLOT()
> > macro) populated and no other secondary buses before it, you should see
> > the downstream switch ports as 01:xx:x and devices in 02:00.0, 03:00.0
> > and 04:00.0 for slots 0, 1 and 2. This seems to correspond well with the
> > current Qemu model.
> >
> > So unless your device itself exposes function# > 8 and is not ARI
> > capable (which would be a non-compliant device as far as I read the
> > standard) you should never be able to see any device in any of the
> > downstream ports have PCI_SLOT(devfn) != 0 ?
> >
> > With qemu master + my SR/IOV patch set + the igb patch (just to have an
> > ARI capable device to play with) here:
> >
> > https://github.com/knuto/qemu, branch sriov_patches_v3
> >
> > and a guest running F20 I am able to boot with a device (such as for
> > instance an e1000) inserted in a root port, I am also able to hot plug
> > it from the qemu monitor, eg.
> >
> > qemu-kvm  ... \
> >   -device ioh3420,slot=0,id=pcie_port.0
> >
> > ...
> >
> > (qemu) device_add e1000,vlan=1,bus=pcie_port.0,id=ne
> > (qemu) device_del ne
> >
> > In both cases the ARIfwd bit is disabled by default and not enabled by
> > the port driver (just as I would expect) so at least your comment
> > is wrong as far as I can see.
> >
> > Booting with a two-port downstream switch with no devices plugged, I see
> > the same, ARIfwd is not enabled on the downstream ports as long as no
> > device is in any slots, which is as expected, isn't it?
> >
> > qemu-kvm  ... \
> >   -device x3130-upstream,id=upsw \
> >   -device xio3130-downstream,bus=upsw,addr=0.0,chassis=5,id=ds_port.0 \
> >   -device xio3130-downstream,bus=upsw,addr=1.0,chassis=6,id=ds_port.1
> > ...
> >
> > The upstream port does not support ARIfwd in the QEMU model, which I
> > suppose is correct as it will only contain individual downstream
> > devices, which expose new buses but never have more than one function.
> >
> > However, either booting or hotplugging an e1000 into the downstream
> > switch, eg.
> >
> > (qemu) device_add e1000,vlan=1,bus=ds_port.0,id=ne
> >
> > with my current guest image kernel 3.13.10-200.fc20.x86_64+debug I get a
> > NULL pointer Oops in the guest,
> >
> > [    0.520009] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000038
> > [    0.521000] IP: [<ffffffff813d942d>]
> pcie_aspm_init_link_state+0x6ed/0x7c0
> > [    0.521000] PGD 0
> > [    0.521000] Oops: 0000 [#1] SMP
> 
> here is the gdb version of the stack trace:
> 
> 0xffffffff813d9431 in alloc_pcie_link_state (pdev=0xffff880178cec520) at
> drivers/pci/pcie/aspm.c:530
> 530                     parent = pdev->bus->parent->self->link_state;
> (gdb) where
> #0  0xffffffff813d9431 in alloc_pcie_link_state (pdev=0xffff880178cec520) at
> drivers/pci/pcie/aspm.c:530
> #1  pcie_aspm_init_link_state (pdev=0xffff880178cec520) at
> drivers/pci/pcie/aspm.c:578
> #2  0xffffffff813c6bbd in pci_scan_slot (bus=bus@entry=0xffff880178cad388,
> devfn=devfn@entry=0)
>     at drivers/pci/probe.c:1486
> #3  0xffffffff813e0d54 in pciehp_configure_device
> (p_slot=p_slot@entry=0xffff880173982760)
>     at drivers/pci/hotplug/pciehp_pci.c:54
> #4  0xffffffff813e05cb in board_added (p_slot=0xffff880173982760) at
> drivers/pci/hotplug/pciehp_ctrl.c:223
> #5  pciehp_enable_slot (p_slot=p_slot@entry=0xffff880173982760) at
> drivers/pci/hotplug/pciehp_ctrl.c:510
> #6  0xffffffff813e0ac0 in pciehp_power_thread (work=<optimized out>) at
> drivers/pci/hotplug/pciehp_ctrl.c:308
> #7  0xffffffff8109bc31 in process_one_work
> (worker=worker@entry=0xffff8801730fd728, work=0xffff8801731bbc30)
>     at kernel/workqueue.c:2214
> #8  0xffffffff8109c1eb in worker_thread (__worker=0xffff8801730fd728) at
> kernel/workqueue.c:2340
> #9  0xffffffff810a43ef in kthread (_create=0xffff88017336edc8) at
> kernel/kthread.c:207
> #10 <signal handler called>
> #11 0x0000000000000000 in irq_stack_union ()
> #12 0x0000000000000000 in ?? ()
> (gdb) p pdev->bus->parent->self
> $2 = (struct pci_dev *) 0x0 <irq_stack_union>
> (gdb)
> 
> I observe the same behaviour with guest kernel 3.16.3-200.fc20.x86_64.
> 
> Knut
> 
Hi, Knut
 I haven't encounter this issue you described with guest kernel RLEH 7.0 (Linux 3.10.0-121.e17.x85_64)

The command line:
# ./qemu-system-x86_64 -enable-kvm -m 2048 -machine q35 -device ide-drive,bus=ide.2,drive=MacHDD \
 -drive id=MacHDD,if=none,file=/home/redhat_q35.img -monitor stdio -vnc :10 \
 -readconfig ../docs/q35-chipset.cfg.ori 
QEMU 2.1.50 monitor - type 'help' for more information
(qemu) device_add e1000,vlan=1,bus=pcie-switch-downstream-port-1-1,id=ne       -->   (it works)
(qemu) device_del ne
(qemu) device_add e1000,vlan=1,bus=pcie-switch-downstream-port-1-1,id=ne,addr=1.0     -->   (it doesn't work)
(qemu) device_del ne
Device 'ne' not found
(qemu)

# cat ../docs/q35-chipset.cfg.ori
[device "ich9-ehci-2"]
  driver = "ich9-usb-ehci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.7"

[device "ich9-uhci-4"]
  driver = "ich9-usb-uhci4"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.0"
  masterbus = "ich9-ehci-2.0"
  firstport = "0"

[device "ich9-uhci-5"]
  driver = "ich9-usb-uhci5"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.1"
  masterbus = "ich9-ehci-2.0"
  firstport = "2"

[device "ich9-uhci-6"]
  driver = "ich9-usb-uhci6"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.2"
  masterbus = "ich9-ehci-2.0"
  firstport = "4"


[device "ich9-hda-audio"]
  driver = "ich9-intel-hda"
  bus = "pcie.0"
  addr = "1b.0"


[device "ich9-pcie-port-1"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.0"
  port = "1"
  chassis = "1"

[device "ich9-pcie-port-2"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.1"
  port = "2"
  chassis = "2"

[device "ich9-pcie-port-3"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.2"
  port = "3"
  chassis = "3"

[device "ich9-pcie-port-4"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.3"
  port = "4"
  chassis = "4"

##
# Example PCIe switch with two downstream ports
#
[device "pcie-switch-upstream-port-1"]
  driver = "x3130-upstream"
  bus = "ich9-pcie-port-4"
  addr = "00.0"

[device "pcie-switch-downstream-port-1-1"]
  driver = "xio3130-downstream"
  multifunction = "on"
  bus = "pcie-switch-upstream-port-1"
  addr = "00.0"
  port = "1"
  chassis = "5"

[device "pcie-switch-downstream-port-1-2"]
  driver = "xio3130-downstream"
  multifunction = "on"
  bus = "pcie-switch-upstream-port-1"
  addr = "00.1"
  port = "1"
  chassis = "6"

[device "ich9-ehci-1"]
  driver = "ich9-usb-ehci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.7"

[device "ich9-uhci-1"]
  driver = "ich9-usb-uhci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.0"
  masterbus = "ich9-ehci-1.0"
  firstport = "0"

[device "ich9-uhci-2"]
  driver = "ich9-usb-uhci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.1"
  masterbus = "ich9-ehci-1.0"
  firstport = "2"

[device "ich9-uhci-3"]
  driver = "ich9-usb-uhci3"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.2"
  masterbus = "ich9-ehci-1.0"
  firstport = "4"


[device "ich9-pci-bridge"]
  driver = "i82801b11-bridge"
  bus = "pcie.0"
  addr = "1e.0"

[device "pci.0"]
driver = "pci-bridge" 
bus = "ich9-pci-bridge" 
addr = "1.0" 
chassis_nr = "1" 

[device "pci.1"]
driver = "pci-bridge" 
bus = "ich9-pci-bridge" 
addr = "2.0" 
chassis_nr = "2"


Best regards,
-Gonglei

> > The exact same happens if I use the ARI capable igb device instead.
> >
> > I will upgrade the guest to a newer image (what kernel are you running with,
> Gonglei?)
> >
> > lspci reports this on the guest (before I try to add any device in the
> downstream port(s)):
> >
> > 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM
> Controller
> > 00:01.0 VGA compatible controller: Cirrus Logic GD 5446
> > 00:02.0 Ethernet controller: Red Hat, Inc Virtio network device
> > 00:03.0 Unclassified device [0002]: Red Hat, Inc Virtio filesystem
> > 00:04.0 PCI bridge: Intel Corporation 7500/5520/5500/X58 I/O Hub PCI
> Express Root Port 0 (rev 02)
> > 00:05.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch
> (Upstream) (rev 02)
> > 00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller
> (rev 02)
> > 00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6
> port SATA Controller [AHCI mode] (rev 02)
> > 00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev
> 02)
> > 02:00.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch
> (Downstream) (rev 01)
> > 02:01.0 PCI bridge: Texas Instruments XIO3130 PCI Express Switch
> (Downstream) (rev 01)
> >
> > # lspci -t
> > -[0000:00]-+-00.0
> >            +-01.0
> >            +-02.0
> >            +-03.0
> >            +-04.0-[01]--
> >            +-05.0-[02-04]--+-00.0-[03]--
> >            |               \-01.0-[04]--
> >            +-1f.0
> >            +-1f.2
> >            \-1f.3
> >
> > eg. the devices plugged into the downstream ports will end up at 03:00.0 and
> 04:00.0,
> > the info qtree command verifies that the e1000 insert ends up in 03:00.0 as
> expected.
> >
> > Similar if I instead use
> >
> > (qemu) device_add e1000,vlan=1,bus=ds_port.1,id=ne2
> >
> > I see this with info qtree:
> >
> >   class Ethernet controller, addr 04:00.0, pci id 8086:100e (sub 1af4:1100)
> >
> > Anyway I suppose this indicates that something is not right/is missing
> > with the downstream switch emulation, and not with the (generic) way the
> > ARIfwd cap/ctrl bit works
> >
> > > It should work correctly by now, maybe a "surprise removal/addition"
> > > warning and nothing more.
> > >
> > > Thank you,
> > > Marcel
> > >
> > > >
> > > > I am just concerned that this might alleviate the symptoms you see but
> > > > not fix the problem itself,
> > > >
> > > > Knut
> > > >
> > > > > > Please make this explicit in the commit log.
> > > > > >
> > > > > Sorry, I copied the description from PCIe spec. :(
> > > > >
> > > > > IMPLEMENTATION NOTE at Page 19:
> > > > >
> > > > >
> https://www.pcisig.com/specifications/pciexpress/specifications/ECN-alt-rid-in
> terpretation-070604.pdf
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > So, for PCI devices attached in PCIe root ports or downstream pots,
> > > > > > > we should assure that its slot is not non-zero.
> >
> >
> >
> > > > > > > For PCIe devices, which
> > > > > > > ARP capability is not enabled, we also should assure that its slot
> > > > > > > is not non-zero.
> > > > > >
> > > > > > not non zero => zero
> > > > > >
> > > > > OK.
> > > > >
> > > > > > >
> > > > > > > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > > > > > > ---
> > > > > > >  hw/pci/pci.c          |  4 ++++
> > > > > > >  hw/pci/pcie.c         | 51
> > > > > > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > > > >  include/hw/pci/pcie.h |  1 +
> > > > > > >  3 files changed, 56 insertions(+)
> > > > > > >
> > > > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > > > index 6ce75aa..22b7ca0 100644
> > > > > > > --- a/hw/pci/pci.c
> > > > > > > +++ b/hw/pci/pci.c
> > > > > > > @@ -1770,6 +1770,10 @@ static int pci_qdev_init(DeviceState
> *qdev)
> > > > > > >          }
> > > > > > >      }
> > > > > > >
> > > > > > > +    if (pcie_cap_slot_check(bus, pci_dev)) {
> > > > > > > +        return -1;
> > > > > > > +    }
> > > > > > > +
> > > > > > >      /* rom loading */
> > > > > > >      is_default_rom = false;
> > > > > > >      if (pci_dev->romfile == NULL && pc->romfile != NULL) {
> > > > > > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > > > > > index 1babddf..b82211a 100644
> > > > > > > --- a/hw/pci/pcie.c
> > > > > > > +++ b/hw/pci/pcie.c
> > > > > > > @@ -633,3 +633,54 @@ void pcie_ari_init(PCIDevice *dev, uint16_t
> offset,
> > > > > > uint16_t nextfn)
> > > > > > >                          offset, PCI_ARI_SIZEOF);
> > > > > > >      pci_set_long(dev->config + offset + PCI_ARI_CAP, (nextfn &
> 0xff) << 8);
> > > > > > >  }
> > > > > > > +
> > > > > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev)
> > > > > > > +{
> > > > > > > +    Object *obj = OBJECT(bus);
> > > > > > > +
> > > > > > > +    if (pci_bus_is_root(bus)) {
> > > > > > > +        return 0;
> > > > > > > +    }
> > > > > > > +
> > > > > > > +    if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > > > > > > +        DeviceState *parent = qbus_get_parent(BUS(obj));
> > > > > > > +        PCIDevice *pci_dev = PCI_DEVICE(parent);
> > > > > > > +        uint8_t port_type;
> > > > > > > +        /*
> > > > > > > +         * Root ports and downstream ports of switches are the
> hot
> > > > > > > +         * pluggable ports in a PCI Express hierarchy.
> > > > > > > +         * PCI Express supports chip-to-chip interconnect, a PCIe
> link can
> > > > > > > +         * only connect one PCI device/Switch/EndPoint or
> PCI-bridge.
> > > > > > > +         *
> > > > > > > +         * 7.3. Configuration Transaction Rules (PCI Express
> specification
> > > > > > 3.0)
> > > > > > > +         * 7.3.1. Device Number
> > > > > > > +         *
> > > > > > > +         * Downstream Ports that do not have ARI Forwarding
> enabled
> > > > > > must
> > > > > > > +         * associate only Device 0 with the device attached to
> the Logical
> > > > > > Bus
> > > > > > > +         * representing the Link from the Port.
> > > > > > > +         *
> > > > > > > +         * In QEMU, ARI Forwarding is enabled default at
> emulation of
> > > > > > PCIe
> > > > > >
> > > > > > s/enabled default/enabled by default/
> >
> > It is not when I try this (as discussed above)
> > The capability is there but it is disabled by default.
> >
> > > > > >
> > > > > OK.
> > > > >
> > > > > > > +         * ports. ARI Forwarding enable setting at firmware/OS
> Control
> > > > > > handoff.
> > > > > >
> > > > > >
> > > > > > can parse last sentence. drop it?
> > > > > >
> > > > > OK.
> > > > >
> > > > > > > +         * If the bit is Set when a non-ARI Device is present, the
> non-ARI
> > > > > > > +           Device can respond to Configuration Space accesses
> under
> > > > > > what it
> > > > > > > +         * interprets as being different Device Numbers, and its
> Functions
> > > > > > can
> > > > > > > +         * be aliased under multiple Device Numbers, generally
> leading to
> > > > > > > +         * undesired behavior.
> >
> > I don't understand how this is possible to achieve.
> >
> > Knut
> >
> > > > > > I don't think any badness really happens.
> > > > > > Did you observe any?
> > > > > >
> > > > > Same with above explanation.
> > > > >
> > > > > >
> > > > > >
> > > > > > > +         */
> > > > > > > +        port_type = pcie_cap_get_type(pci_dev);
> > > > > > > +        if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > > > > > > +            port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > > > > > > +            if (!pci_is_express(dev) ||
> > > > > > > +                (pci_is_express(dev) &&
> > > > > > > +                !pcie_find_capability(dev,
> PCI_EXT_CAP_ID_ARI))) {
> > > > > >
> > > > > > I would just skip the test for a non express function.
> > > > > >
> > > > > Sorry?
> > > > >
> > > > > Best regards,
> > > > > -Gonglei
> > > > >
> > > > > > > +                if (PCI_SLOT(dev->devfn) != 0) {
> > > > > > > +                    error_report("PCIe: non-ARI device can't
> be
> > > > > > populated"
> > > > > > > +                                 " in slot %d",
> > > > > > PCI_SLOT(dev->devfn));
> > > > > > > +                    return -1;
> > > > > > > +                }
> > > > > > > +            }
> > > > > > > +        }
> > > > > > > +    }
> > > > > > > +
> > > > > > > +    return 0;
> > > > > > > +}
> > > > > > > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > > > > > > index d139d58..1d8f3f4 100644
> > > > > > > --- a/include/hw/pci/pcie.h
> > > > > > > +++ b/include/hw/pci/pcie.h
> > > > > > > @@ -115,6 +115,7 @@ void pcie_add_capability(PCIDevice *dev,
> > > > > > >                           uint16_t offset, uint16_t size);
> > > > > > >
> > > > > > >  void pcie_ari_init(PCIDevice *dev, uint16_t offset, uint16_t
> nextfn);
> > > > > > > +int pcie_cap_slot_check(PCIBus *bus, PCIDevice *dev);
> > > > > > >
> > > > > > >  extern const VMStateDescription vmstate_pcie_device;
> > > > > > >
> > > > > > > --
> > > > > > > 1.7.12.4
> > > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-10-08  3:25 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-30 10:11 [Qemu-devel] [PATCH v4 0/3] pcie: add check for ari capability of pcie devices arei.gonglei
2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 1/3] qdev: Introduce a function to get qbus's parent arei.gonglei
2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 2/3] pcie: add check for ari capability of pcie devices arei.gonglei
2014-09-30 11:59   ` Michael S. Tsirkin
2014-09-30 13:38     ` Gonglei
2014-09-30 13:58       ` Michael S. Tsirkin
2014-10-01  4:04         ` Gonglei
2014-10-01  5:26       ` Knut Omang
2014-10-01  7:15         ` Gonglei
2014-10-01 14:08         ` Marcel Apfelbaum
2014-10-03 11:22           ` Knut Omang
2014-10-03 14:30             ` Knut Omang
2014-10-08  3:23               ` Gonglei (Arei)
2014-09-30 10:11 ` [Qemu-devel] [PATCH v4 3/3] pcie: remove confused comments arei.gonglei
2014-09-30 12:46   ` Michael S. Tsirkin
2014-09-30 12:58     ` Gonglei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.