* [PATCH] CXL: Fix device_node reference counting
@ 2015-01-29  2:16 Ryan Grimm
  2015-01-29  2:50 ` Ian Munsie
  0 siblings, 1 reply; 9+ messages in thread
From: Ryan Grimm @ 2015-01-29  2:16 UTC (permalink / raw)
  To: imunsie, mikey; +Cc: Ryan Grimm, linuxppc-dev

When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:

ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98

We are missing a call to of_node_get.  pnv_pci_to_phb_node should call
of_node_get; otherwise np's reference count isn't incremented and the node
might go away while the caller still holds the pointer.  Rename
pnv_pci_to_phb_node to pnv_pci_get_phb_node so it's clear that it calls
of_node_get.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
---
Please consider this patch for stable.  Without this fix, node reference
counting is broken. 

 arch/powerpc/include/asm/pnv-pci.h        | 2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c | 6 +++---
 drivers/misc/cxl/pci.c                    | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
index 3c00d64..f9b4982 100644
--- a/arch/powerpc/include/asm/pnv-pci.h
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -19,7 +19,7 @@ int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
 int pnv_cxl_alloc_hwirqs(struct pci_dev *dev, int num);
 void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num);
 int pnv_cxl_get_irq_count(struct pci_dev *dev);
-struct device_node *pnv_pci_to_phb_node(struct pci_dev *dev);
+struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev);
 
 #ifdef CONFIG_CXL_BASE
 int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs,
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5d52d6f..8be1d4f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1460,13 +1460,13 @@ static void set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq)
 
 #ifdef CONFIG_CXL_BASE
 
-struct device_node *pnv_pci_to_phb_node(struct pci_dev *dev)
+struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev)
 {
 	struct pci_controller *hose = pci_bus_to_host(dev->bus);
 
-	return hose->dn;
+	return of_node_get(hose->dn);
 }
-EXPORT_SYMBOL(pnv_pci_to_phb_node);
+EXPORT_SYMBOL(pnv_pci_get_phb_node);
 
 int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode)
 {
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 428ea8ba..cb25067 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -317,7 +317,7 @@ static int init_implementation_adapter_regs(struct cxl *adapter, struct pci_dev
 	u64 psl_dsnctl;
 	u64 chipid;
 
-	if (!(np = pnv_pci_to_phb_node(dev)))
+	if (!(np = pnv_pci_get_phb_node(dev)))
 		return -ENODEV;
 
 	while (np && !(prop = of_get_property(np, "ibm,chip-id", NULL)))
-- 
1.9.1
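
The ownership rule this patch establishes can be sketched in user space.
What follows is a minimal model, not kernel code: struct node, struct
controller, node_get(), node_put(), get_next_parent(), get_phb_node() and
find_chip_node() are hypothetical stand-ins for struct device_node,
struct pci_controller, of_node_get(), of_node_put(), of_get_next_parent(),
pnv_pci_get_phb_node() and the walk in init_implementation_adapter_regs().
The has_chip_id flag is an assumption that replaces the "ibm,chip-id"
property lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	struct node *parent;
	int has_chip_id;	/* replaces the "ibm,chip-id" property lookup */
	int refcount;
};

/* Hypothetical stand-in for struct pci_controller. */
struct controller {
	struct node *dn;
};

/* Model of of_node_get(): take a reference and return the node. */
static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Model of of_node_put(): drop a reference. */
static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Model of of_get_next_parent(): reference the parent, drop the child. */
static struct node *get_next_parent(struct node *n)
{
	struct node *parent;

	assert(n != NULL);
	parent = node_get(n->parent);
	node_put(n);
	return parent;
}

/* Model of the patched getter: the caller receives a counted reference. */
static struct node *get_phb_node(struct controller *hose)
{
	return node_get(hose->dn);
}

/* Model of the caller: walk up until a node with a chip id is found. */
static int find_chip_node(struct controller *hose)
{
	struct node *np = get_phb_node(hose);

	while (np && !np->has_chip_id)
		np = get_next_parent(np);
	if (!np)
		return -1;	/* every reference was already dropped */
	node_put(np);		/* balances the get on the node we stopped at */
	return 0;
}
```

In this model, calling find_chip_node() repeatedly on the same tree, as
repeated unbind/rebind cycles would, leaves every refcount where it
started, because each pointer the caller holds came with its own reference.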


* Re: [PATCH] CXL: Fix device_node reference counting
  2015-01-29  2:16 [PATCH] CXL: Fix device_node reference counting Ryan Grimm
@ 2015-01-29  2:50 ` Ian Munsie
  0 siblings, 0 replies; 9+ messages in thread
From: Ian Munsie @ 2015-01-29  2:50 UTC (permalink / raw)
  To: Ryan Grimm; +Cc: mikey, linuxppc-dev

Acked-by: Ian Munsie <imunsie@au1.ibm.com>


* Re: [PATCH] CXL: Fix device_node reference counting
  2015-01-28  5:53     ` Ian Munsie
  2015-01-28  6:07       ` Michael Ellerman
@ 2015-01-29  2:15       ` Ryan Grimm
  1 sibling, 0 replies; 9+ messages in thread
From: Ryan Grimm @ 2015-01-29  2:15 UTC (permalink / raw)
  To: Ian Munsie, Michael Ellerman; +Cc: linuxppc-dev

On 01/28/2015 12:53 AM, Ian Munsie wrote:
> Excerpts from Michael Ellerman's message of 2015-01-28 16:04:40 +1100:
>>> I just wanted to check the status of this one? I can't see it in your
>>> tree and wanted to make sure you didn't simply miss it.
>>
>> It looked fishy, but I never got around to replying.
>>
>> The second sentence in the explanation should never be true:
>
> Right, that was the point of the fix ;)
>
>> You shouldn't have np unless you did an of_node_get() to get it, otherwise it's
>> pointing at something you don't have a reference for and it might go away at
>> any time.
>>
>> So the patch may fix the bug but I don't think it's correct.
>>
>> I think pnv_pci_to_phb_node() should be doing a get for you, before returning
>> the pointer.
>
> Agreed - we should probably also rename it to have 'get' in the name,
> like pnv_pci_get_phb_node().

Yeah, that's way better than the current patch.

>
>> See as a comparison pcibios_get_phb_of_node().
>
> We could almost use that instead, except it's not exported for modules
> and I'm not sure if that even works with __weak functions?
>
>
> Ryan - do you want to respin this, or would you rather I take it?

Sure, I'll respin and resend as a bug fix.

-Ryan

>
> Cheers,
> -Ian
>


* Re: [PATCH] CXL: Fix device_node reference counting
  2015-01-28  5:53     ` Ian Munsie
@ 2015-01-28  6:07       ` Michael Ellerman
  2015-01-29  2:15       ` Ryan Grimm
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2015-01-28  6:07 UTC (permalink / raw)
  To: Ian Munsie; +Cc: Ryan Grimm, linuxppc-dev

On Wed, 2015-01-28 at 16:53 +1100, Ian Munsie wrote:
> Excerpts from Michael Ellerman's message of 2015-01-28 16:04:40 +1100:
> > > I just wanted to check the status of this one? I can't see it in your
> > > tree and wanted to make sure you didn't simply miss it.
> > 
> > It looked fishy, but I never got around to replying.
> > 
> > The second sentence in the explanation should never be true:
> 
> Right, that was the point of the fix ;)

Sure, but bodging of_node_get()s all over the place is not a path to success.

> > You shouldn't have np unless you did an of_node_get() to get it, otherwise it's
> > pointing at something you don't have a reference for and it might go away at
> > any time.
> > 
> > So the patch may fix the bug but I don't think it's correct.
> > 
> > I think pnv_pci_to_phb_node() should be doing a get for you, before returning
> > the pointer.
> 
> Agreed - we should probably also rename it to have 'get' in the name,
> like pnv_pci_get_phb_node().

Yep.

> > See as a comparison pcibios_get_phb_of_node().
> 
> We could almost use that instead, except it's not exported for modules
> and I'm not sure if that even works with __weak functions?

It should. It's only weak until the final link and then you get a non-weak
version AIUI.

Try it.

And a follow up patch to have it use pci_bus_to_host() would be nice too.

cheers


* Re: [PATCH] CXL: Fix device_node reference counting
  2015-01-28  5:04   ` Michael Ellerman
@ 2015-01-28  5:53     ` Ian Munsie
  2015-01-28  6:07       ` Michael Ellerman
  2015-01-29  2:15       ` Ryan Grimm
  0 siblings, 2 replies; 9+ messages in thread
From: Ian Munsie @ 2015-01-28  5:53 UTC (permalink / raw)
  To: Michael Ellerman, Ryan Grimm; +Cc: linuxppc-dev

Excerpts from Michael Ellerman's message of 2015-01-28 16:04:40 +1100:
> > I just wanted to check the status of this one? I can't see it in your
> > tree and wanted to make sure you didn't simply miss it.
> 
> It looked fishy, but I never got around to replying.
> 
> The second sentence in the explanation should never be true:

Right, that was the point of the fix ;)

> You shouldn't have np unless you did an of_node_get() to get it, otherwise it's
> pointing at something you don't have a reference for and it might go away at
> any time.
> 
> So the patch may fix the bug but I don't think it's correct.
> 
> I think pnv_pci_to_phb_node() should be doing a get for you, before returning
> the pointer.

Agreed - we should probably also rename it to have 'get' in the name,
like pnv_pci_get_phb_node().

> See as a comparison pcibios_get_phb_of_node().

We could almost use that instead, except it's not exported for modules
and I'm not sure if that even works with __weak functions?


Ryan - do you want to respin this, or would you rather I take it?

Cheers,
-Ian


* Re: [PATCH] CXL: Fix device_node reference counting
  2015-01-28  4:02 ` Ian Munsie
@ 2015-01-28  5:04   ` Michael Ellerman
  2015-01-28  5:53     ` Ian Munsie
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2015-01-28  5:04 UTC (permalink / raw)
  To: Ian Munsie; +Cc: Ryan Grimm, linuxppc-dev

On Wed, 2015-01-28 at 15:02 +1100, Ian Munsie wrote:
> Excerpts from Ian Munsie's message of 2015-01-07 16:41:18 +1100:
> > From: Ryan Grimm <grimm@linux.vnet.ibm.com>
> > 
> > When unbinding and rebinding the driver on a system with a card in PHB0, this
> > error condition is reached after a few attempts:
> 
> Hey mpe,
> 
> I just wanted to check the status of this one? I can't see it in your
> tree and wanted to make sure you didn't simply miss it.

It looked fishy, but I never got around to replying.

The second sentence in the explanation should never be true:

  But, if the while loop is not entered, of_node_put gets called
  on np without an of_node_get.

You shouldn't have np unless you did an of_node_get() to get it, otherwise it's
pointing at something you don't have a reference for and it might go away at
any time.

So the patch may fix the bug but I don't think it's correct.

I think pnv_pci_to_phb_node() should be doing a get for you, before returning
the pointer.

See as a comparison pcibios_get_phb_of_node().

cheers


* Re: [PATCH] CXL: Fix device_node reference counting
  2015-01-07  5:41 ` Ian Munsie
  (?)
@ 2015-01-28  4:02 ` Ian Munsie
  2015-01-28  5:04   ` Michael Ellerman
  -1 siblings, 1 reply; 9+ messages in thread
From: Ian Munsie @ 2015-01-28  4:02 UTC (permalink / raw)
  To: mpe; +Cc: Ryan Grimm, linuxppc-dev

Excerpts from Ian Munsie's message of 2015-01-07 16:41:18 +1100:
> From: Ryan Grimm <grimm@linux.vnet.ibm.com>
> 
> When unbinding and rebinding the driver on a system with a card in PHB0, this
> error condition is reached after a few attempts:

Hey mpe,

I just wanted to check the status of this one? I can't see it in your
tree and wanted to make sure you didn't simply miss it.

Cheers,
-Ian


* [PATCH] CXL: Fix device_node reference counting
@ 2015-01-07  5:41 ` Ian Munsie
  0 siblings, 0 replies; 9+ messages in thread
From: Ian Munsie @ 2015-01-07  5:41 UTC (permalink / raw)
  To: mpe
  Cc: benh, mikey, anton, linux-kernel, linuxppc-dev, jk, imunsie,
	cbe-oss-dev, Aneesh Kumar K.V, Ryan Grimm

From: Ryan Grimm <grimm@linux.vnet.ibm.com>

When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:

ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98

of_get_next_parent drops the reference on the node passed to it and returns
the parent with its refcount incremented, so we need to call of_node_put after
the iteration.  But, if the while loop is not entered, of_node_put gets called
on np without an of_node_get.  So, call of_node_get before the while loop.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 drivers/misc/cxl/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 2ccd0a9..f801c28 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -319,6 +319,7 @@ static int init_implementation_adapter_regs(struct cxl *adapter, struct pci_dev
 	if (!(np = pnv_pci_to_phb_node(dev)))
 		return -ENODEV;
 
+	of_node_get(np);
 	while (np && !(prop = of_get_property(np, "ibm,chip-id", NULL)))
 		np = of_get_next_parent(np);
 	if (!np)
-- 
2.1.4
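
The imbalance this patch addresses can be reproduced in a small user-space
model. This is a sketch under stated assumptions, not the driver code:
struct node, node_get(), node_put(), get_next_parent() and probe_walk() are
hypothetical stand-ins for struct device_node, of_node_get(), of_node_put(),
of_get_next_parent() and the walk in init_implementation_adapter_regs().
The hops parameter is an assumption that replaces the "ibm,chip-id" lookup:
the property is taken to be found hops levels above the starting node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	struct node *parent;
	int refcount;
};

/* Model of of_node_get(): take a reference and return the node. */
static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Model of of_node_put(): drop a reference. */
static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Model of of_get_next_parent(): reference the parent, drop the child. */
static struct node *get_next_parent(struct node *n)
{
	struct node *parent;

	assert(n != NULL);
	parent = node_get(n->parent);
	node_put(n);
	return parent;
}

/*
 * Model of the walk. 'start' is a borrowed pointer, like the hose->dn
 * returned by the unpatched pnv_pci_to_phb_node(). 'take_ref' selects
 * whether the of_node_get() added by this patch is present.
 */
static void probe_walk(struct node *start, int hops, int take_ref)
{
	struct node *np = start;

	if (take_ref)
		node_get(np);
	while (np && hops-- > 0)
		np = get_next_parent(np);
	node_put(np);	/* final put, as in the driver */
}
```

In this model, probe_walk(&phb, 0, 0) and probe_walk(&phb, 1, 0) each leave
the starting node one reference short: either the final put or the put
inside get_next_parent() lands on a node that was never referenced. With
take_ref set, the counts return to their starting values.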



