* [PATCH] powerpc/powernv/npu: Fix reference leak
@ 2019-04-19 15:34 Greg Kurz
2019-04-29 6:01 ` Alexey Kardashevskiy
2019-06-03 12:32 ` Michael Ellerman
0 siblings, 2 replies; 7+ messages in thread
From: Greg Kurz @ 2019-04-19 15:34 UTC (permalink / raw)
To: linux-kernel; +Cc: Alexey Kardashevskiy, linuxppc-dev, Alistair Popple
Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
has the effect of incrementing the reference count of the PCI device, as
explained in drivers/pci/search.c:
* Given a PCI domain, bus, and slot/function number, the desired PCI
* device is located in the list of PCI devices. If the device is
* found, its reference count is increased and this function returns a
* pointer to its data structure. The caller must decrement the
* reference count by calling pci_dev_put(). If no device is found,
* %NULL is returned.
Nothing was done to call pci_dev_put() and the reference count of GPU and
NPU PCI devices rockets up.
A natural way to fix this would be to teach the callers about the change,
so that they call pci_dev_put() when done with the pointer. This turns
out to be quite intrusive, as it affects many paths in npu-dma.c,
pci-ioda.c and vfio_pci_nvlink2.c. Also, the issue appeared in 4.16 and
some affected code got moved around since then: it would be problematic
to backport the fix to stable releases.
All that code never cared for reference counting anyway. Call pci_dev_put()
from get_pci_dev() to revert to the previous behavior.
Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn")
Cc: stable@vger.kernel.org # v4.16
Signed-off-by: Greg Kurz <groug@kaod.org>
---
arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index e713ade30087..d8f3647e8fb2 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
static struct pci_dev *get_pci_dev(struct device_node *dn)
{
struct pci_dn *pdn = PCI_DN(dn);
+ struct pci_dev *pdev;
- return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
+ pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
pdn->busno, pdn->devfn);
+
+ /*
+ * pci_get_domain_bus_and_slot() increased the reference count of
+ * the PCI device, but callers don't need that actually as the PE
+ * already holds a reference to the device. Since callers aren't
+ * aware of the reference count change, call pci_dev_put() now to
+ * avoid leaks.
+ */
+ if (pdev)
+ pci_dev_put(pdev);
+
+ return pdev;
}
/* Given a NPU device get the associated PCI device. */
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] powerpc/powernv/npu: Fix reference leak
2019-04-19 15:34 [PATCH] powerpc/powernv/npu: Fix reference leak Greg Kurz
@ 2019-04-29 6:01 ` Alexey Kardashevskiy
2019-04-29 10:36 ` Greg Kurz
2019-06-03 12:32 ` Michael Ellerman
1 sibling, 1 reply; 7+ messages in thread
From: Alexey Kardashevskiy @ 2019-04-29 6:01 UTC (permalink / raw)
To: Greg Kurz, linux-kernel; +Cc: linuxppc-dev, Alistair Popple
On 20/04/2019 01:34, Greg Kurz wrote:
> Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
> has the effect of incrementing the reference count of the PCI device, as
> explained in drivers/pci/search.c:
>
> * Given a PCI domain, bus, and slot/function number, the desired PCI
> * device is located in the list of PCI devices. If the device is
> * found, its reference count is increased and this function returns a
> * pointer to its data structure. The caller must decrement the
> * reference count by calling pci_dev_put(). If no device is found,
> * %NULL is returned.
>
> Nothing was done to call pci_dev_put() and the reference count of GPU and
> NPU PCI devices rockets up.
>
> A natural way to fix this would be to teach the callers about the change,
> so that they call pci_dev_put() when done with the pointer. This turns
> out to be quite intrusive, as it affects many paths in npu-dma.c,
> pci-ioda.c and vfio_pci_nvlink2.c.
afaict this referencing is only done to protect the current traverser
and what you've done is actually a natural way (and the generic
pci_get_dev_by_id() does exactly the same), although this looks a bit weird.
> Also, the issue appeared in 4.16 and
> some affected code got moved around since then: it would be problematic
> to backport the fix to stable releases.
>
> All that code never cared for reference counting anyway. Call pci_dev_put()
> from get_pci_dev() to revert to the previous behavior.
>> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev
from pci_dn")
> Cc: stable@vger.kernel.org # v4.16
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index e713ade30087..d8f3647e8fb2 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
> static struct pci_dev *get_pci_dev(struct device_node *dn)
> {
> struct pci_dn *pdn = PCI_DN(dn);
> + struct pci_dev *pdev;
>
> - return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> pdn->busno, pdn->devfn);
> +
> + /*
> + * pci_get_domain_bus_and_slot() increased the reference count of
> + * the PCI device, but callers don't need that actually as the PE
> + * already holds a reference to the device.
Imho this would be just enough.
Anyway,
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
How did you find it? :)
> Since callers aren't
> + * aware of the reference count change, call pci_dev_put() now to
> + * avoid leaks.
> + */
> + if (pdev)
> + pci_dev_put(pdev);
> +
> + return pdev;
> }
>
> /* Given a NPU device get the associated PCI device. */
>
--
Alexey
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] powerpc/powernv/npu: Fix reference leak
2019-04-29 6:01 ` Alexey Kardashevskiy
@ 2019-04-29 10:36 ` Greg Kurz
2019-05-13 11:56 ` Greg Kurz
0 siblings, 1 reply; 7+ messages in thread
From: Greg Kurz @ 2019-04-29 10:36 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: linuxppc-dev, linux-kernel, stable, Alistair Popple
On Mon, 29 Apr 2019 16:01:29 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> On 20/04/2019 01:34, Greg Kurz wrote:
> > Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
> > has the effect of incrementing the reference count of the PCI device, as
> > explained in drivers/pci/search.c:
> >
> > * Given a PCI domain, bus, and slot/function number, the desired PCI
> > * device is located in the list of PCI devices. If the device is
> > * found, its reference count is increased and this function returns a
> > * pointer to its data structure. The caller must decrement the
> > * reference count by calling pci_dev_put(). If no device is found,
> > * %NULL is returned.
> >
> > Nothing was done to call pci_dev_put() and the reference count of GPU and
> > NPU PCI devices rockets up.
> >
> > A natural way to fix this would be to teach the callers about the change,
> > so that they call pci_dev_put() when done with the pointer. This turns
> > out to be quite intrusive, as it affects many paths in npu-dma.c,
> > pci-ioda.c and vfio_pci_nvlink2.c.
>
>
> afaict this referencing is only done to protect the current traverser
> and what you've done is actually a natural way (and the generic
> pci_get_dev_by_id() does exactly the same), although this looks a bit weird.
>
Not exactly the same: pci_get_dev_by_id() always increment the refcount
of the returned PCI device. The refcount is only decremented when this
device is passed to pci_get_dev_by_id() to continue searching.
That means that the users of the PCI device pointer returned by
pci_get_dev_by_id() or its exported variants pci_get_subsys(),
pci_get_device() and pci_get_class() do handle the refcount. They
all pass the pointer to pci_dev_put() or continue the search,
which calls pci_dev_put() internally.
Direct and indirect callers of get_pci_dev() don't care for the
refcount at all unless I'm missing something.
>
> > Also, the issue appeared in 4.16 and
> > some affected code got moved around since then: it would be problematic
> > to backport the fix to stable releases.
> >
> > All that code never cared for reference counting anyway. Call pci_dev_put()
> > from get_pci_dev() to revert to the previous behavior.
> >> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev
> from pci_dn")
> > Cc: stable@vger.kernel.org # v4.16
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> > arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++-
> > 1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> > index e713ade30087..d8f3647e8fb2 100644
> > --- a/arch/powerpc/platforms/powernv/npu-dma.c
> > +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> > @@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
> > static struct pci_dev *get_pci_dev(struct device_node *dn)
> > {
> > struct pci_dn *pdn = PCI_DN(dn);
> > + struct pci_dev *pdev;
> >
> > - return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> > + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> > pdn->busno, pdn->devfn);
> > +
> > + /*
> > + * pci_get_domain_bus_and_slot() increased the reference count of
> > + * the PCI device, but callers don't need that actually as the PE
> > + * already holds a reference to the device.
>
> Imho this would be just enough.
>
> Anyway,
>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
Thanks !
I now realize that I forgot to add the --cc option for stable on my stgit
command line :-\.
Cc'ing now.
>
> How did you find it? :)
>
While reading code to find some inspiration for OpenCAPI passthrough. :)
I saw the following in vfio_pci_ibm_npu2_init():
if (!pnv_pci_get_gpu_dev(vdev->pdev))
return -ENODEV;
and simply followed the function calls.
>
> > Since callers aren't
> > + * aware of the reference count change, call pci_dev_put() now to
> > + * avoid leaks.
> > + */
> > + if (pdev)
> > + pci_dev_put(pdev);
> > +
> > + return pdev;
> > }
> >
> > /* Given a NPU device get the associated PCI device. */
> >
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] powerpc/powernv/npu: Fix reference leak
2019-04-29 10:36 ` Greg Kurz
@ 2019-05-13 11:56 ` Greg Kurz
2019-05-14 11:13 ` Michael Ellerman
0 siblings, 1 reply; 7+ messages in thread
From: Greg Kurz @ 2019-05-13 11:56 UTC (permalink / raw)
To: Michael Ellerman
Cc: Alexey Kardashevskiy, Alistair Popple, linuxppc-dev,
linux-kernel, stable
Michael,
Any comments on this patch ? Should I repost with a shorter comment
as suggested by Alexey ?
Cheers,
--
Greg
On Mon, 29 Apr 2019 12:36:59 +0200
Greg Kurz <groug@kaod.org> wrote:
> On Mon, 29 Apr 2019 16:01:29 +1000
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
> > On 20/04/2019 01:34, Greg Kurz wrote:
> > > Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
> > > has the effect of incrementing the reference count of the PCI device, as
> > > explained in drivers/pci/search.c:
> > >
> > > * Given a PCI domain, bus, and slot/function number, the desired PCI
> > > * device is located in the list of PCI devices. If the device is
> > > * found, its reference count is increased and this function returns a
> > > * pointer to its data structure. The caller must decrement the
> > > * reference count by calling pci_dev_put(). If no device is found,
> > > * %NULL is returned.
> > >
> > > Nothing was done to call pci_dev_put() and the reference count of GPU and
> > > NPU PCI devices rockets up.
> > >
> > > A natural way to fix this would be to teach the callers about the change,
> > > so that they call pci_dev_put() when done with the pointer. This turns
> > > out to be quite intrusive, as it affects many paths in npu-dma.c,
> > > pci-ioda.c and vfio_pci_nvlink2.c.
> >
> >
> > afaict this referencing is only done to protect the current traverser
> > and what you've done is actually a natural way (and the generic
> > pci_get_dev_by_id() does exactly the same), although this looks a bit weird.
> >
>
> Not exactly the same: pci_get_dev_by_id() always increment the refcount
> of the returned PCI device. The refcount is only decremented when this
> device is passed to pci_get_dev_by_id() to continue searching.
>
> That means that the users of the PCI device pointer returned by
> pci_get_dev_by_id() or its exported variants pci_get_subsys(),
> pci_get_device() and pci_get_class() do handle the refcount. They
> all pass the pointer to pci_dev_put() or continue the search,
> which calls pci_dev_put() internally.
>
> Direct and indirect callers of get_pci_dev() don't care for the
> refcount at all unless I'm missing something.
>
> >
> > > Also, the issue appeared in 4.16 and
> > > some affected code got moved around since then: it would be problematic
> > > to backport the fix to stable releases.
> > >
> > > All that code never cared for reference counting anyway. Call pci_dev_put()
> > > from get_pci_dev() to revert to the previous behavior.
> > >> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev
> > from pci_dn")
> > > Cc: stable@vger.kernel.org # v4.16
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > > arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++-
> > > 1 file changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> > > index e713ade30087..d8f3647e8fb2 100644
> > > --- a/arch/powerpc/platforms/powernv/npu-dma.c
> > > +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> > > @@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
> > > static struct pci_dev *get_pci_dev(struct device_node *dn)
> > > {
> > > struct pci_dn *pdn = PCI_DN(dn);
> > > + struct pci_dev *pdev;
> > >
> > > - return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> > > + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> > > pdn->busno, pdn->devfn);
> > > +
> > > + /*
> > > + * pci_get_domain_bus_and_slot() increased the reference count of
> > > + * the PCI device, but callers don't need that actually as the PE
> > > + * already holds a reference to the device.
> >
> > Imho this would be just enough.
> >
> > Anyway,
> >
> > Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >
>
> Thanks !
>
> I now realize that I forgot to add the --cc option for stable on my stgit
> command line :-\.
>
> Cc'ing now.
>
> >
> > How did you find it? :)
> >
>
> While reading code to find some inspiration for OpenCAPI passthrough. :)
>
> I saw the following in vfio_pci_ibm_npu2_init():
>
> if (!pnv_pci_get_gpu_dev(vdev->pdev))
> return -ENODEV;
>
> and simply followed the function calls.
>
> >
> > > Since callers aren't
> > > + * aware of the reference count change, call pci_dev_put() now to
> > > + * avoid leaks.
> > > + */
> > > + if (pdev)
> > > + pci_dev_put(pdev);
> > > +
> > > + return pdev;
> > > }
> > >
> > > /* Given a NPU device get the associated PCI device. */
> > >
> >
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] powerpc/powernv/npu: Fix reference leak
2019-05-13 11:56 ` Greg Kurz
@ 2019-05-14 11:13 ` Michael Ellerman
2019-05-14 12:25 ` Greg Kurz
0 siblings, 1 reply; 7+ messages in thread
From: Michael Ellerman @ 2019-05-14 11:13 UTC (permalink / raw)
To: Greg Kurz
Cc: Alexey Kardashevskiy, Alistair Popple, linuxppc-dev,
linux-kernel, stable
Greg Kurz <groug@kaod.org> writes:
> Michael,
>
> Any comments on this patch ? Should I repost with a shorter comment
> as suggested by Alexey ?
No the longer comment seems fine to me.
I'm not a big fan of the patch, it's basically a hack :)
But for a backportable fix I guess it is OK.
I would be happier though if we eventually fix up the code to do the
refcounting properly.
cheers
> On Mon, 29 Apr 2019 12:36:59 +0200
> Greg Kurz <groug@kaod.org> wrote:
>> On Mon, 29 Apr 2019 16:01:29 +1000
>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>
>> > On 20/04/2019 01:34, Greg Kurz wrote:
>> > > Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
>> > > has the effect of incrementing the reference count of the PCI device, as
>> > > explained in drivers/pci/search.c:
>> > >
>> > > * Given a PCI domain, bus, and slot/function number, the desired PCI
>> > > * device is located in the list of PCI devices. If the device is
>> > > * found, its reference count is increased and this function returns a
>> > > * pointer to its data structure. The caller must decrement the
>> > > * reference count by calling pci_dev_put(). If no device is found,
>> > > * %NULL is returned.
>> > >
>> > > Nothing was done to call pci_dev_put() and the reference count of GPU and
>> > > NPU PCI devices rockets up.
>> > >
>> > > A natural way to fix this would be to teach the callers about the change,
>> > > so that they call pci_dev_put() when done with the pointer. This turns
>> > > out to be quite intrusive, as it affects many paths in npu-dma.c,
>> > > pci-ioda.c and vfio_pci_nvlink2.c.
>> >
>> >
>> > afaict this referencing is only done to protect the current traverser
>> > and what you've done is actually a natural way (and the generic
>> > pci_get_dev_by_id() does exactly the same), although this looks a bit weird.
>> >
>>
>> Not exactly the same: pci_get_dev_by_id() always increment the refcount
>> of the returned PCI device. The refcount is only decremented when this
>> device is passed to pci_get_dev_by_id() to continue searching.
>>
>> That means that the users of the PCI device pointer returned by
>> pci_get_dev_by_id() or its exported variants pci_get_subsys(),
>> pci_get_device() and pci_get_class() do handle the refcount. They
>> all pass the pointer to pci_dev_put() or continue the search,
>> which calls pci_dev_put() internally.
>>
>> Direct and indirect callers of get_pci_dev() don't care for the
>> refcount at all unless I'm missing something.
>>
>> >
>> > > Also, the issue appeared in 4.16 and
>> > > some affected code got moved around since then: it would be problematic
>> > > to backport the fix to stable releases.
>> > >
>> > > All that code never cared for reference counting anyway. Call pci_dev_put()
>> > > from get_pci_dev() to revert to the previous behavior.
>> > >> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev
>> > from pci_dn")
>> > > Cc: stable@vger.kernel.org # v4.16
>> > > Signed-off-by: Greg Kurz <groug@kaod.org>
>> > > ---
>> > > arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++-
>> > > 1 file changed, 14 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
>> > > index e713ade30087..d8f3647e8fb2 100644
>> > > --- a/arch/powerpc/platforms/powernv/npu-dma.c
>> > > +++ b/arch/powerpc/platforms/powernv/npu-dma.c
>> > > @@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
>> > > static struct pci_dev *get_pci_dev(struct device_node *dn)
>> > > {
>> > > struct pci_dn *pdn = PCI_DN(dn);
>> > > + struct pci_dev *pdev;
>> > >
>> > > - return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
>> > > + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
>> > > pdn->busno, pdn->devfn);
>> > > +
>> > > + /*
>> > > + * pci_get_domain_bus_and_slot() increased the reference count of
>> > > + * the PCI device, but callers don't need that actually as the PE
>> > > + * already holds a reference to the device.
>> >
>> > Imho this would be just enough.
>> >
>> > Anyway,
>> >
>> > Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> >
>>
>> Thanks !
>>
>> I now realize that I forgot to add the --cc option for stable on my stgit
>> command line :-\.
>>
>> Cc'ing now.
>>
>> >
>> > How did you find it? :)
>> >
>>
>> While reading code to find some inspiration for OpenCAPI passthrough. :)
>>
>> I saw the following in vfio_pci_ibm_npu2_init():
>>
>> if (!pnv_pci_get_gpu_dev(vdev->pdev))
>> return -ENODEV;
>>
>> and simply followed the function calls.
>>
>> >
>> > > Since callers aren't
>> > > + * aware of the reference count change, call pci_dev_put() now to
>> > > + * avoid leaks.
>> > > + */
>> > > + if (pdev)
>> > > + pci_dev_put(pdev);
>> > > +
>> > > + return pdev;
>> > > }
>> > >
>> > > /* Given a NPU device get the associated PCI device. */
>> > >
>> >
>>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] powerpc/powernv/npu: Fix reference leak
2019-05-14 11:13 ` Michael Ellerman
@ 2019-05-14 12:25 ` Greg Kurz
0 siblings, 0 replies; 7+ messages in thread
From: Greg Kurz @ 2019-05-14 12:25 UTC (permalink / raw)
To: Michael Ellerman
Cc: Alexey Kardashevskiy, Alistair Popple, linuxppc-dev,
linux-kernel, stable
On Tue, 14 May 2019 21:13:40 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:
> Greg Kurz <groug@kaod.org> writes:
> > Michael,
> >
> > Any comments on this patch ? Should I repost with a shorter comment
> > as suggested by Alexey ?
>
> No the longer comment seems fine to me.
>
> I'm not a big fan of the patch, it's basically a hack :)
>
Yeah :)
> But for a backportable fix I guess it is OK.
>
> I would be happier though if we eventually fix up the code to do the
> refcounting properly.
>
I had started to do just that before deciding to go for the backportable
hack. Should I rebase the other patches I have on top of this patch and
repost the whole thing, so that we have both the ugly fix for stable and
the pretty one for 5.2 ?
Cheers,
--
Greg
> cheers
>
> > On Mon, 29 Apr 2019 12:36:59 +0200
> > Greg Kurz <groug@kaod.org> wrote:
> >> On Mon, 29 Apr 2019 16:01:29 +1000
> >> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >>
> >> > On 20/04/2019 01:34, Greg Kurz wrote:
> >> > > Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
> >> > > has the effect of incrementing the reference count of the PCI device, as
> >> > > explained in drivers/pci/search.c:
> >> > >
> >> > > * Given a PCI domain, bus, and slot/function number, the desired PCI
> >> > > * device is located in the list of PCI devices. If the device is
> >> > > * found, its reference count is increased and this function returns a
> >> > > * pointer to its data structure. The caller must decrement the
> >> > > * reference count by calling pci_dev_put(). If no device is found,
> >> > > * %NULL is returned.
> >> > >
> >> > > Nothing was done to call pci_dev_put() and the reference count of GPU and
> >> > > NPU PCI devices rockets up.
> >> > >
> >> > > A natural way to fix this would be to teach the callers about the change,
> >> > > so that they call pci_dev_put() when done with the pointer. This turns
> >> > > out to be quite intrusive, as it affects many paths in npu-dma.c,
> >> > > pci-ioda.c and vfio_pci_nvlink2.c.
> >> >
> >> >
> >> > afaict this referencing is only done to protect the current traverser
> >> > and what you've done is actually a natural way (and the generic
> >> > pci_get_dev_by_id() does exactly the same), although this looks a bit weird.
> >> >
> >>
> >> Not exactly the same: pci_get_dev_by_id() always increment the refcount
> >> of the returned PCI device. The refcount is only decremented when this
> >> device is passed to pci_get_dev_by_id() to continue searching.
> >>
> >> That means that the users of the PCI device pointer returned by
> >> pci_get_dev_by_id() or its exported variants pci_get_subsys(),
> >> pci_get_device() and pci_get_class() do handle the refcount. They
> >> all pass the pointer to pci_dev_put() or continue the search,
> >> which calls pci_dev_put() internally.
> >>
> >> Direct and indirect callers of get_pci_dev() don't care for the
> >> refcount at all unless I'm missing something.
> >>
> >> >
> >> > > Also, the issue appeared in 4.16 and
> >> > > some affected code got moved around since then: it would be problematic
> >> > > to backport the fix to stable releases.
> >> > >
> >> > > All that code never cared for reference counting anyway. Call pci_dev_put()
> >> > > from get_pci_dev() to revert to the previous behavior.
> >> > >> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev
> >> > from pci_dn")
> >> > > Cc: stable@vger.kernel.org # v4.16
> >> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> >> > > ---
> >> > > arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++-
> >> > > 1 file changed, 14 insertions(+), 1 deletion(-)
> >> > >
> >> > > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> >> > > index e713ade30087..d8f3647e8fb2 100644
> >> > > --- a/arch/powerpc/platforms/powernv/npu-dma.c
> >> > > +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> >> > > @@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
> >> > > static struct pci_dev *get_pci_dev(struct device_node *dn)
> >> > > {
> >> > > struct pci_dn *pdn = PCI_DN(dn);
> >> > > + struct pci_dev *pdev;
> >> > >
> >> > > - return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> >> > > + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> >> > > pdn->busno, pdn->devfn);
> >> > > +
> >> > > + /*
> >> > > + * pci_get_domain_bus_and_slot() increased the reference count of
> >> > > + * the PCI device, but callers don't need that actually as the PE
> >> > > + * already holds a reference to the device.
> >> >
> >> > Imho this would be just enough.
> >> >
> >> > Anyway,
> >> >
> >> > Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> >
> >>
> >> Thanks !
> >>
> >> I now realize that I forgot to add the --cc option for stable on my stgit
> >> command line :-\.
> >>
> >> Cc'ing now.
> >>
> >> >
> >> > How did you find it? :)
> >> >
> >>
> >> While reading code to find some inspiration for OpenCAPI passthrough. :)
> >>
> >> I saw the following in vfio_pci_ibm_npu2_init():
> >>
> >> if (!pnv_pci_get_gpu_dev(vdev->pdev))
> >> return -ENODEV;
> >>
> >> and simply followed the function calls.
> >>
> >> >
> >> > > Since callers aren't
> >> > > + * aware of the reference count change, call pci_dev_put() now to
> >> > > + * avoid leaks.
> >> > > + */
> >> > > + if (pdev)
> >> > > + pci_dev_put(pdev);
> >> > > +
> >> > > + return pdev;
> >> > > }
> >> > >
> >> > > /* Given a NPU device get the associated PCI device. */
> >> > >
> >> >
> >>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] powerpc/powernv/npu: Fix reference leak
2019-04-19 15:34 [PATCH] powerpc/powernv/npu: Fix reference leak Greg Kurz
2019-04-29 6:01 ` Alexey Kardashevskiy
@ 2019-06-03 12:32 ` Michael Ellerman
1 sibling, 0 replies; 7+ messages in thread
From: Michael Ellerman @ 2019-06-03 12:32 UTC (permalink / raw)
To: Greg Kurz, linux-kernel
Cc: Alexey Kardashevskiy, Alistair Popple, linuxppc-dev
On Fri, 2019-04-19 at 15:34:13 UTC, Greg Kurz wrote:
> Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
> has the effect of incrementing the reference count of the PCI device, as
> explained in drivers/pci/search.c:
>
> * Given a PCI domain, bus, and slot/function number, the desired PCI
> * device is located in the list of PCI devices. If the device is
> * found, its reference count is increased and this function returns a
> * pointer to its data structure. The caller must decrement the
> * reference count by calling pci_dev_put(). If no device is found,
> * %NULL is returned.
>
> Nothing was done to call pci_dev_put() and the reference count of GPU and
> NPU PCI devices rockets up.
>
> A natural way to fix this would be to teach the callers about the change,
> so that they call pci_dev_put() when done with the pointer. This turns
> out to be quite intrusive, as it affects many paths in npu-dma.c,
> pci-ioda.c and vfio_pci_nvlink2.c. Also, the issue appeared in 4.16 and
> some affected code got moved around since then: it would be problematic
> to backport the fix to stable releases.
>
> All that code never cared for reference counting anyway. Call pci_dev_put()
> from get_pci_dev() to revert to the previous behavior.
>
> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn")
> Cc: stable@vger.kernel.org # v4.16
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/02c5f5394918b9b47ff4357b1b183357
cheers
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-06-03 12:34 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-19 15:34 [PATCH] powerpc/powernv/npu: Fix reference leak Greg Kurz
2019-04-29 6:01 ` Alexey Kardashevskiy
2019-04-29 10:36 ` Greg Kurz
2019-05-13 11:56 ` Greg Kurz
2019-05-14 11:13 ` Michael Ellerman
2019-05-14 12:25 ` Greg Kurz
2019-06-03 12:32 ` Michael Ellerman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).