linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] PCI: fix sriov enabling with virtual bus
@ 2014-10-29 22:26 Yinghai Lu
  2014-10-30 17:09 ` Bjorn Helgaas
  0 siblings, 1 reply; 9+ messages in thread
From: Yinghai Lu @ 2014-10-29 22:26 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, Yinghai Lu

While enabling sriov for intel igb device got:

sca05-0a81fda5:~ # echo 7 > /sys/bus/pci/devices/0000\:09\:00.0/sriov_numvfs 
[  729.612191] pci 0000:0a:10.0: [8086:1520] type 00 class 0x020000
[  729.619002] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
[  729.627767] IP: [<ffffffff815901cb>] pci_get_hp_params+0x5b/0x640
[  729.634593] PGD 0 
[  729.636844] Oops: 0000 [#1] SMP 
[  729.640466] Modules linked in:
...
[  729.795268] Call Trace:
[  729.798007]  [<ffffffff81ea0843>] ? pci_mmcfg_read+0x123/0x140
[  729.804519]  [<ffffffff81ea0770>] ? pci_mmcfg_read+0x50/0x140
[  729.810942]  [<ffffffff815692a3>] pci_configure_device+0x33/0x350
[  729.817744]  [<ffffffff8156aba4>] pci_device_add+0x24/0x160
[  729.823965]  [<ffffffff8158f3bb>] pci_enable_sriov+0x4db/0x7d0
[  729.830486]  [<ffffffff81b63f54>] ? igb_pci_enable_sriov+0xe4/0x200
[  729.837481]  [<ffffffff81b63f7f>] igb_pci_enable_sriov+0x10f/0x200
[  729.844386]  [<ffffffff8155129c>] ? _kstrtoull+0x2c/0x80
[  729.850315]  [<ffffffff81b640a5>] igb_pci_sriov_configure+0x35/0x40
[  729.857318]  [<ffffffff815752b5>] sriov_numvfs_store+0xe5/0x140
[  729.863934]  [<ffffffff817e1f88>] dev_attr_store+0x18/0x30
[  729.870063]  [<ffffffff812624d8>] sysfs_kf_write+0x48/0x60
[  729.876186]  [<ffffffff812617ef>] ? kernfs_fop_write+0xaf/0x170
[  729.882797]  [<ffffffff81261827>] kernfs_fop_write+0xe7/0x170
[  729.889222]  [<ffffffff811ef66b>] vfs_write+0xcb/0x1c0
[  729.894958]  [<ffffffff811f0019>] SyS_write+0x49/0xb0
...
[  729.943531] ---[ end trace 7cf0cdb66637665a ]---

and pci_get_hp_params+0x5b point to
0xffffffff815901cb is in pci_get_hp_params (include/linux/device.h:815).
810	#include <linux/pm_wakeup.h>
811	
812	static inline const char *dev_name(const struct device *dev)
813	{
814		/* Use the init name until the kobject becomes available */
815		if (dev->init_name)
816			return dev->init_name;
817	
818		return kobject_name(&dev->kobj);

The root cause:
Now pci_configure_device/pci_get_hp_params will be called for every pci_dev,
including VF that is under virtual bus. But virtual bus does not have bridge
set. So we can not refer pbus->self->dev directly.

Add checking with pbus->slef and bail out early.

Fixing: commit 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 include/linux/pci-acpi.h |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6/include/linux/pci-acpi.h
===================================================================
--- linux-2.6.orig/include/linux/pci-acpi.h
+++ linux-2.6/include/linux/pci-acpi.h
@@ -44,8 +44,12 @@ static inline acpi_handle acpi_pci_get_b
 
 	if (pci_is_root_bus(pbus))
 		dev = pbus->bridge;
-	else
+	else {
+		if (!pbus->self)
+			return NULL;
+
 		dev = &pbus->self->dev;
+	}
 
 	dev_printk(KERN_DEBUG, &pbus->dev, "  bridge ACPI_HANDLE %p : %s\n",
 			 ACPI_HANDLE(dev), dev_name(dev));

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-10-29 22:26 [PATCH] PCI: fix sriov enabling with virtual bus Yinghai Lu
@ 2014-10-30 17:09 ` Bjorn Helgaas
  2014-10-30 18:57   ` Yinghai Lu
  2014-11-05 20:22   ` Bjorn Helgaas
  0 siblings, 2 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2014-10-30 17:09 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi

[+cc Rafael, linux-acpi]

On Wed, Oct 29, 2014 at 03:26:10PM -0700, Yinghai Lu wrote:
> While enabling sriov for intel igb device got:
> 
> sca05-0a81fda5:~ # echo 7 > /sys/bus/pci/devices/0000\:09\:00.0/sriov_numvfs 
> [  729.612191] pci 0000:0a:10.0: [8086:1520] type 00 class 0x020000
> [  729.619002] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
> [  729.627767] IP: [<ffffffff815901cb>] pci_get_hp_params+0x5b/0x640
> [  729.634593] PGD 0 
> [  729.636844] Oops: 0000 [#1] SMP 
> [  729.640466] Modules linked in:
> ...
> [  729.795268] Call Trace:
> [  729.798007]  [<ffffffff81ea0843>] ? pci_mmcfg_read+0x123/0x140
> [  729.804519]  [<ffffffff81ea0770>] ? pci_mmcfg_read+0x50/0x140
> [  729.810942]  [<ffffffff815692a3>] pci_configure_device+0x33/0x350
> [  729.817744]  [<ffffffff8156aba4>] pci_device_add+0x24/0x160
> [  729.823965]  [<ffffffff8158f3bb>] pci_enable_sriov+0x4db/0x7d0
> [  729.830486]  [<ffffffff81b63f54>] ? igb_pci_enable_sriov+0xe4/0x200
> [  729.837481]  [<ffffffff81b63f7f>] igb_pci_enable_sriov+0x10f/0x200
> [  729.844386]  [<ffffffff8155129c>] ? _kstrtoull+0x2c/0x80
> [  729.850315]  [<ffffffff81b640a5>] igb_pci_sriov_configure+0x35/0x40
> [  729.857318]  [<ffffffff815752b5>] sriov_numvfs_store+0xe5/0x140
> [  729.863934]  [<ffffffff817e1f88>] dev_attr_store+0x18/0x30
> [  729.870063]  [<ffffffff812624d8>] sysfs_kf_write+0x48/0x60
> [  729.876186]  [<ffffffff812617ef>] ? kernfs_fop_write+0xaf/0x170
> [  729.882797]  [<ffffffff81261827>] kernfs_fop_write+0xe7/0x170
> [  729.889222]  [<ffffffff811ef66b>] vfs_write+0xcb/0x1c0
> [  729.894958]  [<ffffffff811f0019>] SyS_write+0x49/0xb0
> ...
> [  729.943531] ---[ end trace 7cf0cdb66637665a ]---
> 
> and pci_get_hp_params+0x5b point to
> 0xffffffff815901cb is in pci_get_hp_params (include/linux/device.h:815).
> 810	#include <linux/pm_wakeup.h>
> 811	
> 812	static inline const char *dev_name(const struct device *dev)
> 813	{
> 814		/* Use the init name until the kobject becomes available */
> 815		if (dev->init_name)
> 816			return dev->init_name;
> 817	
> 818		return kobject_name(&dev->kobj);
> 
> The root cause:
> Now pci_configure_device/pci_get_hp_params will be called for every pci_dev,
> including VF that is under virtual bus. But virtual bus does not have bridge
> set. So we can not refer pbus->self->dev directly.

This raises the question of what the correct behavior should be.  Your
patch certainly avoids the NULL pointer dereference.  It does so by making
acpi_pci_get_bridge_handle() fail gracefully, which means we will not look
for _HPP/_HPX for VF devices.  Is that what we want?

Most of the fields included in _HPX are read-only or not applicable for
VFs, but we need to at least ask the question of whether we want to
completely ignore _HPX for VFs.  If we do, I think maybe we should make
that more explicit in the code, e.g., by adding an explicit test of
dev->is_virtfn, instead of relying on this special case behavior of
acpi_pci_get_bridge_handle() that in turn depends on the obscure property
of a VF not having a bridge device.

Personally, I think that since the _HPX spec doesn't mention VFs at all, we
might want to assume that _HPX should apply to VFs, just like it applies to
PFs.  I can imagine future _HPX record formats, or non-ACPI firmware
configuration hints, that *would* apply to VFs, so it seems like it would
be pretty arbitrary to say "we won't configure VF devices at all."

> Add checking with pbus->slef and bail out early.
> 
> Fixing: commit 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")

Thanks for including this, but why not use the same format everybody else
does:

Fixes: 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")

> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> ---
>  include/linux/pci-acpi.h |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6/include/linux/pci-acpi.h
> ===================================================================
> --- linux-2.6.orig/include/linux/pci-acpi.h
> +++ linux-2.6/include/linux/pci-acpi.h
> @@ -44,8 +44,12 @@ static inline acpi_handle acpi_pci_get_b
>  
>  	if (pci_is_root_bus(pbus))
>  		dev = pbus->bridge;
> -	else
> +	else {
> +		if (!pbus->self)
> +			return NULL;
> +
>  		dev = &pbus->self->dev;
> +	}
>  
>  	dev_printk(KERN_DEBUG, &pbus->dev, "  bridge ACPI_HANDLE %p : %s\n",
>  			 ACPI_HANDLE(dev), dev_name(dev));

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-10-30 17:09 ` Bjorn Helgaas
@ 2014-10-30 18:57   ` Yinghai Lu
  2014-10-30 19:27     ` Bjorn Helgaas
  2014-11-05 20:22   ` Bjorn Helgaas
  1 sibling, 1 reply; 9+ messages in thread
From: Yinghai Lu @ 2014-10-30 18:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Linux Kernel Mailing List, Rafael J. Wysocki,
	ACPI Devel Maling List

On Thu, Oct 30, 2014 at 10:09 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> [+cc Rafael, linux-acpi]

> This raises the question of what the correct behavior should be.  Your
> patch certainly avoids the NULL pointer dereference.  It does so by making
> acpi_pci_get_bridge_handle() fail gracefully, which means we will not look
> for _HPP/_HPX for VF devices.  Is that what we want?
>
> Most of the fields included in _HPX are read-only or not applicable for
> VFs, but we need to at least ask the question of whether we want to
> completely ignore _HPX for VFs.  If we do, I think maybe we should make
> that more explicit in the code, e.g., by adding an explicit test of
> dev->is_virtfn, instead of relying on this special case behavior of
> acpi_pci_get_bridge_handle() that in turn depends on the obscure property
> of a VF not having a bridge device.
>
> Personally, I think that since the _HPX spec doesn't mention VFs at all, we
> might want to assume that _HPX should apply to VFs, just like it applies to
> PFs.  I can imagine future _HPX record formats, or non-ACPI firmware
> configuration hints, that *would* apply to VFs, so it seems like it would
> be pretty arbitrary to say "we won't configure VF devices at all."

Yes, VF should be treated as PF if possible.

>
>> Add checking with pbus->slef and bail out early.
>>
>> Fixing: commit 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")
>
> Thanks for including this, but why not use the same format everybody else
> does:
>
> Fixes: 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")

will do that later. Is that formalized ?
checkpatch.pl only need 12 commit code, and (" ..") format.

Yinghai

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-10-30 18:57   ` Yinghai Lu
@ 2014-10-30 19:27     ` Bjorn Helgaas
  0 siblings, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2014-10-30 19:27 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: linux-pci, Linux Kernel Mailing List, Rafael J. Wysocki,
	ACPI Devel Maling List

On Thu, Oct 30, 2014 at 12:57 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Thu, Oct 30, 2014 at 10:09 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:

>>> Fixing: commit 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")
>>
>> Thanks for including this, but why not use the same format everybody else
>> does:
>>
>> Fixes: 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")
>
> will do that later. Is that formalized ?

I don't know that it's formalized, but it seems to be a common
convention, and using "Fixing: commit" instead of "Fixes:" doesn't
seem to add value.  I guess I must spend more time reading git
logs than most people :)

> checkpatch.pl only need 12 commit code, and (" ..") format.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-10-30 17:09 ` Bjorn Helgaas
  2014-10-30 18:57   ` Yinghai Lu
@ 2014-11-05 20:22   ` Bjorn Helgaas
  2014-11-05 21:44     ` Rafael J. Wysocki
  1 sibling, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2014-11-05 20:22 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: linux-pci, linux-kernel, Rafael J. Wysocki, linux-acpi,
	Chao Zhou, Joerg Roedel

[+cc Chao, Joerg]

On Thu, Oct 30, 2014 at 11:09:13AM -0600, Bjorn Helgaas wrote:
> [+cc Rafael, linux-acpi]
> 
> On Wed, Oct 29, 2014 at 03:26:10PM -0700, Yinghai Lu wrote:
> > While enabling sriov for intel igb device got:
> > 
> > sca05-0a81fda5:~ # echo 7 > /sys/bus/pci/devices/0000\:09\:00.0/sriov_numvfs 
> > [  729.612191] pci 0000:0a:10.0: [8086:1520] type 00 class 0x020000
> > [  729.619002] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
> > [  729.627767] IP: [<ffffffff815901cb>] pci_get_hp_params+0x5b/0x640
> > [  729.634593] PGD 0 
> > [  729.636844] Oops: 0000 [#1] SMP 
> > [  729.640466] Modules linked in:
> > ...
> > [  729.795268] Call Trace:
> > [  729.798007]  [<ffffffff81ea0843>] ? pci_mmcfg_read+0x123/0x140
> > [  729.804519]  [<ffffffff81ea0770>] ? pci_mmcfg_read+0x50/0x140
> > [  729.810942]  [<ffffffff815692a3>] pci_configure_device+0x33/0x350
> > [  729.817744]  [<ffffffff8156aba4>] pci_device_add+0x24/0x160
> > [  729.823965]  [<ffffffff8158f3bb>] pci_enable_sriov+0x4db/0x7d0
> > [  729.830486]  [<ffffffff81b63f54>] ? igb_pci_enable_sriov+0xe4/0x200
> > [  729.837481]  [<ffffffff81b63f7f>] igb_pci_enable_sriov+0x10f/0x200
> > [  729.844386]  [<ffffffff8155129c>] ? _kstrtoull+0x2c/0x80
> > [  729.850315]  [<ffffffff81b640a5>] igb_pci_sriov_configure+0x35/0x40
> > [  729.857318]  [<ffffffff815752b5>] sriov_numvfs_store+0xe5/0x140
> > [  729.863934]  [<ffffffff817e1f88>] dev_attr_store+0x18/0x30
> > [  729.870063]  [<ffffffff812624d8>] sysfs_kf_write+0x48/0x60
> > [  729.876186]  [<ffffffff812617ef>] ? kernfs_fop_write+0xaf/0x170
> > [  729.882797]  [<ffffffff81261827>] kernfs_fop_write+0xe7/0x170
> > [  729.889222]  [<ffffffff811ef66b>] vfs_write+0xcb/0x1c0
> > [  729.894958]  [<ffffffff811f0019>] SyS_write+0x49/0xb0
> > ...
> > [  729.943531] ---[ end trace 7cf0cdb66637665a ]---
> > 
> > and pci_get_hp_params+0x5b point to
> > 0xffffffff815901cb is in pci_get_hp_params (include/linux/device.h:815).
> > 810	#include <linux/pm_wakeup.h>
> > 811	
> > 812	static inline const char *dev_name(const struct device *dev)
> > 813	{
> > 814		/* Use the init name until the kobject becomes available */
> > 815		if (dev->init_name)
> > 816			return dev->init_name;
> > 817	
> > 818		return kobject_name(&dev->kobj);
> > 
> > The root cause:
> > Now pci_configure_device/pci_get_hp_params will be called for every pci_dev,
> > including VF that is under virtual bus. But virtual bus does not have bridge
> > set. So we can not refer pbus->self->dev directly.
> 
> This raises the question of what the correct behavior should be.  Your
> patch certainly avoids the NULL pointer dereference.  It does so by making
> acpi_pci_get_bridge_handle() fail gracefully, which means we will not look
> for _HPP/_HPX for VF devices.

I think I was mistaken about this.  A VF device might be on a virtual bus.
 And a virtual bus never has a bridge leading to it, i.e., its bus->self
pointer is NULL.  But I think a virtual bus always has a *parent* bus,
i.e., for a VF, dev->bus->parent is always valid.  This is because when
virtfn_add_bus() creates the virtual bus with "pci_add_new_bus(bus, NULL,
busnr)", the "bus" parameter (which becomes the parent bus of the virtual
bus) is a valid bus.

So with your patch, I think we *will* actually look for _HPP and _HPX for
VF devices, because we'll look for the handle of the bridge leading to the
virtual bus (which will return NULL), then for the handle of the bridge
leading to the virtual bus' parent bus, etc.

If you agree, Yinghai, I'll apply the patch below (same as what you posted,
with different changelog and a comment in the code).

The acpi_pci_get_bridge_handle(struct pci_bus *) interface niggles at me a
little because I don't think there's any concept of an ACPI device for a
PCI *bus*, so it doesn't seem like a very good fit to say "find the handle
for this bus".  But that's for later.

Bjorn



commit 32f638fc11db0526c706454d9ab4339d55ac89f3
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Thu Oct 30 10:17:25 2014 -0600

    PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle()
    
    acpi_pci_get_bridge_handle() returns the ACPI handle for the bridge device
    (either a host bridge or a PCI-to-PCI bridge) leading to a PCI bus.  But
    SR-IOV virtual functions can be on a virtual bus with no bridge leading to
    it.  Return a NULL acpi_handle in this case instead of trying to
    dereference the NULL pointer to the bridge.
    
    This fixes a NULL pointer dereference oops in pci_get_hp_params() when
    adding SR-IOV VF devices on virtual buses.
    
    [bhelgaas: changelog, add comment in code]
    Fixes: 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=87591
    Reported-by: Chao Zhou <chao.zhou@intel.com>
    Reported-by: Joerg Roedel <joro@8bytes.org>
    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
index 64dacb7288a6..24c7728ca681 100644
--- a/include/linux/pci-acpi.h
+++ b/include/linux/pci-acpi.h
@@ -41,8 +41,13 @@ static inline acpi_handle acpi_pci_get_bridge_handle(struct pci_bus *pbus)
 
 	if (pci_is_root_bus(pbus))
 		dev = pbus->bridge;
-	else
+	else {
+		/* If pbus is a virtual bus, there is no bridge to it */
+		if (!pbus->self)
+			return NULL;
+
 		dev = &pbus->self->dev;
+	}
 
 	return ACPI_HANDLE(dev);
 }

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-11-05 20:22   ` Bjorn Helgaas
@ 2014-11-05 21:44     ` Rafael J. Wysocki
  2014-11-05 21:57       ` Bjorn Helgaas
  0 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2014-11-05 21:44 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Yinghai Lu, linux-pci, linux-kernel, Rafael J. Wysocki,
	linux-acpi, Chao Zhou, Joerg Roedel

On Wednesday, November 05, 2014 01:22:52 PM Bjorn Helgaas wrote:
> [+cc Chao, Joerg]
> 
> On Thu, Oct 30, 2014 at 11:09:13AM -0600, Bjorn Helgaas wrote:
> > [+cc Rafael, linux-acpi]
> > 
> > On Wed, Oct 29, 2014 at 03:26:10PM -0700, Yinghai Lu wrote:
> > > While enabling sriov for intel igb device got:
> > > 
> > > sca05-0a81fda5:~ # echo 7 > /sys/bus/pci/devices/0000\:09\:00.0/sriov_numvfs 
> > > [  729.612191] pci 0000:0a:10.0: [8086:1520] type 00 class 0x020000
> > > [  729.619002] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
> > > [  729.627767] IP: [<ffffffff815901cb>] pci_get_hp_params+0x5b/0x640
> > > [  729.634593] PGD 0 
> > > [  729.636844] Oops: 0000 [#1] SMP 
> > > [  729.640466] Modules linked in:
> > > ...
> > > [  729.795268] Call Trace:
> > > [  729.798007]  [<ffffffff81ea0843>] ? pci_mmcfg_read+0x123/0x140
> > > [  729.804519]  [<ffffffff81ea0770>] ? pci_mmcfg_read+0x50/0x140
> > > [  729.810942]  [<ffffffff815692a3>] pci_configure_device+0x33/0x350
> > > [  729.817744]  [<ffffffff8156aba4>] pci_device_add+0x24/0x160
> > > [  729.823965]  [<ffffffff8158f3bb>] pci_enable_sriov+0x4db/0x7d0
> > > [  729.830486]  [<ffffffff81b63f54>] ? igb_pci_enable_sriov+0xe4/0x200
> > > [  729.837481]  [<ffffffff81b63f7f>] igb_pci_enable_sriov+0x10f/0x200
> > > [  729.844386]  [<ffffffff8155129c>] ? _kstrtoull+0x2c/0x80
> > > [  729.850315]  [<ffffffff81b640a5>] igb_pci_sriov_configure+0x35/0x40
> > > [  729.857318]  [<ffffffff815752b5>] sriov_numvfs_store+0xe5/0x140
> > > [  729.863934]  [<ffffffff817e1f88>] dev_attr_store+0x18/0x30
> > > [  729.870063]  [<ffffffff812624d8>] sysfs_kf_write+0x48/0x60
> > > [  729.876186]  [<ffffffff812617ef>] ? kernfs_fop_write+0xaf/0x170
> > > [  729.882797]  [<ffffffff81261827>] kernfs_fop_write+0xe7/0x170
> > > [  729.889222]  [<ffffffff811ef66b>] vfs_write+0xcb/0x1c0
> > > [  729.894958]  [<ffffffff811f0019>] SyS_write+0x49/0xb0
> > > ...
> > > [  729.943531] ---[ end trace 7cf0cdb66637665a ]---
> > > 
> > > and pci_get_hp_params+0x5b point to
> > > 0xffffffff815901cb is in pci_get_hp_params (include/linux/device.h:815).
> > > 810	#include <linux/pm_wakeup.h>
> > > 811	
> > > 812	static inline const char *dev_name(const struct device *dev)
> > > 813	{
> > > 814		/* Use the init name until the kobject becomes available */
> > > 815		if (dev->init_name)
> > > 816			return dev->init_name;
> > > 817	
> > > 818		return kobject_name(&dev->kobj);
> > > 
> > > The root cause:
> > > Now pci_configure_device/pci_get_hp_params will be called for every pci_dev,
> > > including VF that is under virtual bus. But virtual bus does not have bridge
> > > set. So we can not refer pbus->self->dev directly.
> > 
> > This raises the question of what the correct behavior should be.  Your
> > patch certainly avoids the NULL pointer dereference.  It does so by making
> > acpi_pci_get_bridge_handle() fail gracefully, which means we will not look
> > for _HPP/_HPX for VF devices.
> 
> I think I was mistaken about this.  A VF device might be on a virtual bus.
>  And a virtual bus never has a bridge leading to it, i.e., its bus->self
> pointer is NULL.  But I think a virtual bus always has a *parent* bus,
> i.e., for a VF, dev->bus->parent is always valid.  This is because when
> virtfn_add_bus() creates the virtual bus with "pci_add_new_bus(bus, NULL,
> busnr)", the "bus" parameter (which becomes the parent bus of the virtual
> bus) is a valid bus.
> 
> So with your patch, I think we *will* actually look for _HPP and _HPX for
> VF devices, because we'll look for the handle of the bridge leading to the
> virtual bus (which will return NULL), then for the handle of the bridge
> leading to the virtual bus' parent bus, etc.
> 
> If you agree, Yinghai, I'll apply the patch below (same as what you posted,
> with different changelog and a comment in the code).
> 
> The acpi_pci_get_bridge_handle(struct pci_bus *) interface niggles at me a
> little because I don't think there's any concept of an ACPI device for a
> PCI *bus*, so it doesn't seem like a very good fit to say "find the handle
> for this bus".  But that's for later.

To me it does what it says: Get me the handle of the bridge leading to this bus.

> commit 32f638fc11db0526c706454d9ab4339d55ac89f3
> Author: Yinghai Lu <yinghai@kernel.org>
> Date:   Thu Oct 30 10:17:25 2014 -0600
> 
>     PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle()
>     
>     acpi_pci_get_bridge_handle() returns the ACPI handle for the bridge device
>     (either a host bridge or a PCI-to-PCI bridge) leading to a PCI bus.  But
>     SR-IOV virtual functions can be on a virtual bus with no bridge leading to
>     it.  Return a NULL acpi_handle in this case instead of trying to
>     dereference the NULL pointer to the bridge.
>     
>     This fixes a NULL pointer dereference oops in pci_get_hp_params() when
>     adding SR-IOV VF devices on virtual buses.
>     
>     [bhelgaas: changelog, add comment in code]
>     Fixes: 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")
>     Link: https://bugzilla.kernel.org/show_bug.cgi?id=87591
>     Reported-by: Chao Zhou <chao.zhou@intel.com>
>     Reported-by: Joerg Roedel <joro@8bytes.org>
>     Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
> index 64dacb7288a6..24c7728ca681 100644
> --- a/include/linux/pci-acpi.h
> +++ b/include/linux/pci-acpi.h
> @@ -41,8 +41,13 @@ static inline acpi_handle acpi_pci_get_bridge_handle(struct pci_bus *pbus)
>  
>  	if (pci_is_root_bus(pbus))
>  		dev = pbus->bridge;
> -	else
> +	else {
> +		/* If pbus is a virtual bus, there is no bridge to it */
> +		if (!pbus->self)
> +			return NULL;
> +
>  		dev = &pbus->self->dev;
> +	}
>  
>  	return ACPI_HANDLE(dev);
>  }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-11-05 21:44     ` Rafael J. Wysocki
@ 2014-11-05 21:57       ` Bjorn Helgaas
  2014-11-05 22:42         ` Rafael J. Wysocki
  2014-11-06  7:11         ` Yinghai Lu
  0 siblings, 2 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2014-11-05 21:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Yinghai Lu, linux-pci, linux-kernel, Rafael J. Wysocki,
	linux-acpi, Chao Zhou, Joerg Roedel

On Wed, Nov 05, 2014 at 10:44:25PM +0100, Rafael J. Wysocki wrote:
> On Wednesday, November 05, 2014 01:22:52 PM Bjorn Helgaas wrote:
> > ...
> > The acpi_pci_get_bridge_handle(struct pci_bus *) interface niggles at me a
> > little because I don't think there's any concept of an ACPI device for a
> > PCI *bus*, so it doesn't seem like a very good fit to say "find the handle
> > for this bus".  But that's for later.
> 
> To me it does what it says: Get me the handle of the bridge leading to
> this bus.

Yeah, I know, it's a great name because it does *exactly* what the name
says.  I'm just wondering if there's a nicer way to express what the caller
needs.

Two of the three callers start with a pci_dev, look up a pci_bus to pass
in, and get back a handle corresponding to a pci_host_bridge or a pci_dev
(or a NULL).  It seems a little cluttered because the pci_bus is only
incidental and the caller doesn't care about it at all except for passing
it to acpi_pci_get_bridge_handle().

It's relatively common to start with a pci_dev and look for an ACPI handle
that corresponds to that device or the closest enclosing scope, so maybe
there should be a way to do that directly.

For pci_get_hp_params(), I think the current code is actually slightly
buggy because we don't look for _HPP/_HPX on the device itself; we only
look at the bridges upstream from it.

What I had in mind was something like the following (untested and not for
application).

Bjorn


diff --git a/drivers/pci/hotplug/acpi_pcihp.c b/drivers/pci/hotplug/acpi_pcihp.c
index 876ccc620440..5e95df56b8ae 100644
--- a/drivers/pci/hotplug/acpi_pcihp.c
+++ b/drivers/pci/hotplug/acpi_pcihp.c
@@ -116,20 +116,11 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev *pdev, u32 flags)
 		string = (struct acpi_buffer){ ACPI_ALLOCATE_BUFFER, NULL };
 	}
 
-	handle = ACPI_HANDLE(&pdev->dev);
-	if (!handle) {
-		/*
-		 * This hotplug controller was not listed in the ACPI name
-		 * space at all. Try to get acpi handle of parent pci bus.
-		 */
-		struct pci_bus *pbus;
-		for (pbus = pdev->bus; pbus; pbus = pbus->parent) {
-			handle = acpi_pci_get_bridge_handle(pbus);
-			if (handle)
-				break;
-		}
-	}
-
+	/*
+	 * We did not find _OSC on the host bridge, so look for any
+	 * enclosing device with an OSHP method.
+	 */
+	handle = pci_find_acpi_handle(pdev);
 	while (handle) {
 		acpi_get_name(handle, ACPI_FULL_PATHNAME, &string);
 		dbg("Trying to get hotplug control for %s \n",
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 6ebf8edc5f3c..3b3f0720fff0 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -246,14 +246,8 @@ int pci_get_hp_params(struct pci_dev *dev, struct hotplug_params *hpp)
 {
 	acpi_status status;
 	acpi_handle handle, phandle;
-	struct pci_bus *pbus;
 
-	handle = NULL;
-	for (pbus = dev->bus; pbus; pbus = pbus->parent) {
-		handle = acpi_pci_get_bridge_handle(pbus);
-		if (handle)
-			break;
-	}
+	handle = pci_find_acpi_handle(dev);
 
 	/*
 	 * _HPP settings apply to all child buses, until another _HPP is
@@ -279,6 +273,33 @@ int pci_get_hp_params(struct pci_dev *dev, struct hotplug_params *hpp)
 }
 EXPORT_SYMBOL_GPL(pci_get_hp_params);
 
+/*
+ * Search for an ACPI handle.  The PCI device itself may have one, or an
+ * upstream device (either a PCI-to-PCI bridge or a PCI host bridge) may
+ * have one.
+ */
+acpi_handle pci_find_acpi_handle(struct pci_dev *pdev)
+{
+	struct device *dev;
+	acpi_handle handle;
+	struct pci_bus *bus;
+
+	dev = &pdev->dev;
+	handle = ACPI_HANDLE(dev);
+	while (!handle) {
+		pdev = pci_physfn(pdev);
+		bus = pdev->bus;
+		if (pci_is_root_bus(bus))
+			dev = bus->bridge;
+		else {
+			pdev = bus->self;
+			dev = &pdev->dev;
+		}
+		handle = ACPI_HANDLE(dev);
+	}
+	return handle;
+}
+
 /**
  * pci_acpi_wake_bus - Root bus wakeup notification fork function.
  * @work: Work item to handle.

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-11-05 21:57       ` Bjorn Helgaas
@ 2014-11-05 22:42         ` Rafael J. Wysocki
  2014-11-06  7:11         ` Yinghai Lu
  1 sibling, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2014-11-05 22:42 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Yinghai Lu, linux-pci, linux-kernel, Rafael J. Wysocki,
	linux-acpi, Chao Zhou, Joerg Roedel

On Wednesday, November 05, 2014 02:57:13 PM Bjorn Helgaas wrote:
> On Wed, Nov 05, 2014 at 10:44:25PM +0100, Rafael J. Wysocki wrote:
> > On Wednesday, November 05, 2014 01:22:52 PM Bjorn Helgaas wrote:
> > > ...
> > > The acpi_pci_get_bridge_handle(struct pci_bus *) interface niggles at me a
> > > little because I don't think there's any concept of an ACPI device for a
> > > PCI *bus*, so it doesn't seem like a very good fit to say "find the handle
> > > for this bus".  But that's for later.
> > 
> > To me it does what it says: Get me the handle of the bridge leading to
> > this bus.
> 
> Yeah, I know, it's a great name because it does *exactly* what the name
> says.  I'm just wondering if there's a nicer way to express what the caller
> needs.
> 
> Two of the three callers start with a pci_dev, look up a pci_bus to pass
> in, and get back a handle corresponding to a pci_host_bridge or a pci_dev
> (or a NULL).  It seems a little cluttered because the pci_bus is only
> incidental and the caller doesn't care about it at all except for passing
> it to acpi_pci_get_bridge_handle().
> 
> It's relatively common to start with a pci_dev and look for an ACPI handle
> that corresponds to that device or the closest enclosing scope, so maybe
> there should be a way to do that directly.

Well, the bridge in question should be the parent of the device if I'm
not mistaken.  At least that's the assumption made by acpi_pci_find_companion().

> For pci_get_hp_params(), I think the current code is actually slightly
> buggy because we don't look for _HPP/_HPX on the device itself; we only
> look at the bridges upstream from it.
> 
> What I had in mind was something like the following (untested and not for
> application).

So perhaps pci_find_acpi_handle() below may be somewhat simpler?

> diff --git a/drivers/pci/hotplug/acpi_pcihp.c b/drivers/pci/hotplug/acpi_pcihp.c
> index 876ccc620440..5e95df56b8ae 100644
> --- a/drivers/pci/hotplug/acpi_pcihp.c
> +++ b/drivers/pci/hotplug/acpi_pcihp.c
> @@ -116,20 +116,11 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev *pdev, u32 flags)
>  		string = (struct acpi_buffer){ ACPI_ALLOCATE_BUFFER, NULL };
>  	}
>  
> -	handle = ACPI_HANDLE(&pdev->dev);
> -	if (!handle) {
> -		/*
> -		 * This hotplug controller was not listed in the ACPI name
> -		 * space at all. Try to get acpi handle of parent pci bus.
> -		 */
> -		struct pci_bus *pbus;
> -		for (pbus = pdev->bus; pbus; pbus = pbus->parent) {
> -			handle = acpi_pci_get_bridge_handle(pbus);
> -			if (handle)
> -				break;
> -		}
> -	}
> -
> +	/*
> +	 * We did not find _OSC on the host bridge, so look for any
> +	 * enclosing device with an OSHP method.
> +	 */
> +	handle = pci_find_acpi_handle(pdev);
>  	while (handle) {
>  		acpi_get_name(handle, ACPI_FULL_PATHNAME, &string);
>  		dbg("Trying to get hotplug control for %s \n",
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 6ebf8edc5f3c..3b3f0720fff0 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -246,14 +246,8 @@ int pci_get_hp_params(struct pci_dev *dev, struct hotplug_params *hpp)
>  {
>  	acpi_status status;
>  	acpi_handle handle, phandle;
> -	struct pci_bus *pbus;
>  
> -	handle = NULL;
> -	for (pbus = dev->bus; pbus; pbus = pbus->parent) {
> -		handle = acpi_pci_get_bridge_handle(pbus);
> -		if (handle)
> -			break;
> -	}
> +	handle = pci_find_acpi_handle(dev);
>  
>  	/*
>  	 * _HPP settings apply to all child buses, until another _HPP is
> @@ -279,6 +273,33 @@ int pci_get_hp_params(struct pci_dev *dev, struct hotplug_params *hpp)
>  }
>  EXPORT_SYMBOL_GPL(pci_get_hp_params);
>  
> +/*
> + * Search for an ACPI handle.  The PCI device itself may have one, or an
> + * upstream device (either a PCI-to-PCI bridge or a PCI host bridge) may
> + * have one.
> + */
> +acpi_handle pci_find_acpi_handle(struct pci_dev *pdev)
> +{
> +	struct device *dev;
> +	acpi_handle handle;
> +	struct pci_bus *bus;
> +
> +	dev = &pdev->dev;
> +	handle = ACPI_HANDLE(dev);
> +	while (!handle) {
> +		pdev = pci_physfn(pdev);
> +		bus = pdev->bus;
> +		if (pci_is_root_bus(bus))
> +			dev = bus->bridge;
> +		else {
> +			pdev = bus->self;
> +			dev = &pdev->dev;
> +		}
> +		handle = ACPI_HANDLE(dev);
> +	}
> +	return handle;
> +}
> +
>  /**
>   * pci_acpi_wake_bus - Root bus wakeup notification fork function.
>   * @work: Work item to handle.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: fix sriov enabling with virtual bus
  2014-11-05 21:57       ` Bjorn Helgaas
  2014-11-05 22:42         ` Rafael J. Wysocki
@ 2014-11-06  7:11         ` Yinghai Lu
  1 sibling, 0 replies; 9+ messages in thread
From: Yinghai Lu @ 2014-11-06  7:11 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Rafael J. Wysocki, linux-pci, Linux Kernel Mailing List,
	Rafael J. Wysocki, ACPI Devel Maling List, Chao Zhou,
	Joerg Roedel

On Wed, Nov 5, 2014 at 1:57 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> It's relatively common to start with a pci_dev and look for an ACPI handle
> that corresponds to that device or the closest enclosing scope, so maybe
> there should be a way to do that directly.
>
> For pci_get_hp_params(), I think the current code is actually slightly
> buggy because we don't look for _HPP/_HPX on the device itself; we only
> look at the bridges upstream from it.
>
> What I had in mind was something like the following (untested and not for
> application).



> + * Search for an ACPI handle.  The PCI device itself may have one, or an
> + * upstream device (either a PCI-to-PCI bridge or a PCI host bridge) may
> + * have one.
> + */
> +acpi_handle pci_find_acpi_handle(struct pci_dev *pdev)
> +{
> +       struct device *dev;
> +       acpi_handle handle;
> +       struct pci_bus *bus;
> +
> +       dev = &pdev->dev;
> +       handle = ACPI_HANDLE(dev);
> +       while (!handle) {
> +               pdev = pci_physfn(pdev);

I'm a little worried about to apply hpx value for PFs to VFs.

> +               bus = pdev->bus;
> +               if (pci_is_root_bus(bus))
> +                       dev = bus->bridge;
> +               else {
> +                       pdev = bus->self;
> +                       dev = &pdev->dev;
> +               }
> +               handle = ACPI_HANDLE(dev);
> +       }
> +       return handle;
> +}


Also found another problem,
pci_device_add==>pci_configure_device==>program_hpp_type2() path.
and program_hpp_type2() will check dev->subordinate().

but at the time child bridge is not scanned yet.

Maybe we should move pci_configure_device later ?

Yinghai Lu

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-11-06  7:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-29 22:26 [PATCH] PCI: fix sriov enabling with virtual bus Yinghai Lu
2014-10-30 17:09 ` Bjorn Helgaas
2014-10-30 18:57   ` Yinghai Lu
2014-10-30 19:27     ` Bjorn Helgaas
2014-11-05 20:22   ` Bjorn Helgaas
2014-11-05 21:44     ` Rafael J. Wysocki
2014-11-05 21:57       ` Bjorn Helgaas
2014-11-05 22:42         ` Rafael J. Wysocki
2014-11-06  7:11         ` Yinghai Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).