linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kurz <groug@kaod.org>
To: Frederic Barrat <fbarrat@linux.ibm.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	linuxppc-dev@lists.ozlabs.org,
	Oliver O'Halloran <oohall@gmail.com>,
	s.miroshnichenko@yadro.com, alistair@popple.id.au
Subject: Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
Date: Wed, 27 Nov 2019 13:03:25 +0100	[thread overview]
Message-ID: <20191127130325.7496523b@bahia.w3ibm.bluemix.net> (raw)
In-Reply-To: <e5b52a32-f53e-7a34-01e4-90f8bae9a44c@linux.ibm.com>

On Wed, 27 Nov 2019 10:47:45 +0100
Frederic Barrat <fbarrat@linux.ibm.com> wrote:

> 
> 
> Le 27/11/2019 à 10:33, Greg Kurz a écrit :
> > On Wed, 27 Nov 2019 10:10:13 +0100
> > Frederic Barrat <fbarrat@linux.ibm.com> wrote:
> > 
> >>
> >>
> >> Le 27/11/2019 à 09:24, Greg Kurz a écrit :
> >>> On Wed, 27 Nov 2019 18:09:40 +1100
> >>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> >>>>> The comment here implies that we don't need to take a ref to the pci_dev
> >>>>> because the ioda_pe will always have one. This implies that the current
> >>>>> expection is that the pci_dev for an NPU device will *never* be torn
> >>>>> down since the ioda_pe having a ref to the device will prevent the
> >>>>> release function from being called.
> >>>>>
> >>>>> In other words, the desired behaviour here appears to be leaking a ref.
> >>>>>
> >>>>> Nice!
> >>>>
> >>>>
> >>>> There is a history: https://patchwork.ozlabs.org/patch/1088078/
> >>>>
> >>>> We did not fix anything in particular then, we do not seem to be fixing
> >>>> anything now (in other words - we cannot test it in a normal natural
> >>>> way). I'd drop this one.
> >>>>
> >>>
> >>> Yeah, I didn't fix anything at the time. Just reverted to the ref
> >>> count behavior we had before:
> >>>
> >>> https://patchwork.ozlabs.org/patch/829172/
> >>>
> >>> Frederic recently posted his take on the same topic from the OpenCAPI
> >>> point of view:
> >>>
> >>> http://patchwork.ozlabs.org/patch/1198947/
> >>>
> >>> He seems to indicate the NPU devices as the real culprit because
> >>> nobody ever cared for them to be removable. Fixing that seems be
> >>> a chore nobody really wants to address obviously... :-\
> >>
> >>
> >> I had taken a stab at not leaking a ref for the nvlink devices and do
> >> the proper thing regarding ref counting (i.e. fixing all the callers of
> >> get_pci_dev() to drop the reference when they were done). With that, I
> >> could see that the ref count of the nvlink devices could drop to 0
> >> (calling remove for the device in /sys) and that the devices could go away.
> >>
> >> But then, I realized it's not necessarily desirable at this point. There
> >> are several comments in the code saying the npu devices (for nvlink)
> >> don't go away, there's no device release callback defined when it seems
> >> there should be, at least to handle releasing PEs.... All in all, it
> >> seems that some work would be needed. And if it hasn't been required by
> >> now...
> >>
> > 
> > If everyone is ok with leaking a reference in the NPU case, I guess
> > this isn't a problem. But if we move forward with Oliver's patch, a
> > pci_dev_put() would be needed for OpenCAPI, correct ?
> 
> 
> No, these code paths are nvlink-only.
> 

Oh yes indeed. Then this patch and yours fit well together :)

>    Fred
> 
> 
> 
> >>     Fred
> >>
> >>
> >>>>
> >>>>
> >>>>>
> >>>>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> >>>>> ---
> >>>>>    arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
> >>>>>    1 file changed, 3 insertions(+), 8 deletions(-)
> >>>>>
> >>>>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> index 72d3749da02c..2eb6e6d45a98 100644
> >>>>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
> >>>>>    			break;
> >>>>>    
> >>>>>    	/*
> >>>>> -	 * pci_get_domain_bus_and_slot() increased the reference count of
> >>>>> -	 * the PCI device, but callers don't need that actually as the PE
> >>>>> -	 * already holds a reference to the device. Since callers aren't
> >>>>> -	 * aware of the reference count change, call pci_dev_put() now to
> >>>>> -	 * avoid leaks.
> >>>>> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> >>>>> +	 * Caller is responsible for dropping the ref when it's
> >>>>> +	 * finished with it.
> >>>>>    	 */
> >>>>> -	if (pdev)
> >>>>> -		pci_dev_put(pdev);
> >>>>> -
> >>>>>    	return pdev;
> >>>>>    }
> >>>>>    
> >>>>>
> >>>>
> >>>
> >>
> > 
> 


  reply	other threads:[~2019-11-27 13:24 UTC|newest]

Thread overview: 107+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20  1:28 PCIPOCALYPSE Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 01/46] powerpc/eeh: Don't attempt to restore VF config space after reset Oliver O'Halloran
2019-11-21  3:38   ` Alexey Kardashevskiy
2019-11-21  4:34     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 02/46] powernv/pci: Add helper to find ioda_pe from BDFN Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 03/46] powernv/pci: Remove dma_dev_setup() for NPU PHBs Oliver O'Halloran
2019-11-21  3:57   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 04/46] powernv/pci: Move dma_{dev|bus}_setup into pci-ioda.c Oliver O'Halloran
2019-11-21  4:02   ` Alexey Kardashevskiy
2019-11-21  4:33     ` Oliver O'Halloran
2019-11-21  7:46   ` Christoph Hellwig
2019-11-20  1:28 ` [Very RFC 05/46] powernv/pci: Remove the pnv_phb dma_dev_setup callback Oliver O'Halloran
2019-11-21  4:03   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 06/46] powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() Oliver O'Halloran
2019-11-21  4:34   ` Alexey Kardashevskiy
2019-11-25  4:41     ` Oliver O'Halloran
2019-11-21  7:48   ` Christoph Hellwig
2019-11-25  4:39     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 07/46] powernv/pci: Rework IODA PE device accounting Oliver O'Halloran
2019-11-21  5:48   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 08/46] powerpc/eeh: Calculate VF index rather than looking it up in pci_dn Oliver O'Halloran
2019-11-22  4:43   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 09/46] powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config() Oliver O'Halloran
2019-11-22  4:52   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 10/46] powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config() Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 11/46] powerpc/eeh: Convert various printfs to use edev, not pci_dn Oliver O'Halloran
2019-11-22  4:55   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 12/46] powerpc/eeh: Split eeh_probe into probe_pdn and probe_pdev Oliver O'Halloran
2019-11-22  5:45   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 13/46] powerpc/eeh: Rework how pdev_probe() is used Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 14/46] powernv/eeh: Remove un-necessary call to eeh_add_device_early() Oliver O'Halloran
2019-11-22  6:01   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 15/46] powernv/eeh: Use pnv_eeh_*_config() for internal config ops Oliver O'Halloran
2019-11-22  6:15   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 16/46] powernv/eeh: Use eeh_edev_warn() rather than open-coding a BDFN print Oliver O'Halloran
2019-11-22  6:17   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 17/46] powernv/eeh: add pnv_eeh_find_edev() Oliver O'Halloran
2019-11-25  0:30   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 18/46] powernv/pci: Add pci_bus_to_pnvhb() helper Oliver O'Halloran
2019-11-25  0:42   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 19/46] powernv/eeh: Use standard PCI capability lookup functions Oliver O'Halloran
2019-11-25  1:02   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 20/46] powernv/eeh: Look up device info from pci_dev Oliver O'Halloran
2019-11-25  1:26   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 21/46] powernv/eeh: Rework finding an existing edev in probe_pdev() Oliver O'Halloran
2019-11-25  3:20   ` Alexey Kardashevskiy
2019-11-25  4:17     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 22/46] powernv/eeh: Allocate eeh_dev's when needed Oliver O'Halloran
2019-11-25  3:27   ` Alexey Kardashevskiy
2019-11-25  4:26     ` Oliver O'Halloran
2019-11-27  1:50       ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 23/46] powerpc/eeh: Moving finding the parent PE into the platform Oliver O'Halloran
2019-11-25  5:00   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 24/46] powernv/pci: Make the pre-cfg EEH freeze check use eeh_dev rather than pci_dn Oliver O'Halloran
2019-11-27  0:21   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 25/46] powernv/pci: Remove pdn from pnv_pci_config_check_eeh() Oliver O'Halloran
2019-11-27  1:05   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 26/46] powernv/pci: Remove pdn from pnv_pci_cfg_{read|write} Oliver O'Halloran
2019-11-27  2:16   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 27/46] powernv/pci: Clear reserved PE freezes Oliver O'Halloran
2019-11-27  3:00   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 28/46] powernv/iov: Move SR-IOV PF state out of pci_dn Oliver O'Halloran
2019-11-27  4:09   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 29/46] powernv/pci: Remove open-coded PE lookup in PELT-V setup Oliver O'Halloran
2019-11-27  4:26   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 30/46] powernv/pci: Remove open-coded PE lookup in PELT-V teardown Oliver O'Halloran
2019-11-27  4:50   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 31/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_ioda_dma_dev_setup() Oliver O'Halloran
2019-11-21  7:52   ` Christoph Hellwig
2019-11-27  4:53   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 32/46] powernv/pci: Remove open-coded PE lookup in iommu_bypass_supported() Oliver O'Halloran
2019-11-27  5:09   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 33/46] powernv/pci: Remove open-coded PE lookup in iommu notifier Oliver O'Halloran
2019-11-27  5:09   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 34/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_enable_device_hook() Oliver O'Halloran
2019-11-27  5:14   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device Oliver O'Halloran
2019-11-27  5:24   ` Alexey Kardashevskiy
2019-11-27  9:51     ` Oliver O'Halloran
2019-11-20  1:28 ` [Very RFC 36/46] powernv/npu: Remove open-coded PE lookup for GPU device Oliver O'Halloran
2019-11-27  5:45   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 37/46] powernv/pci: Use the PHB's rmap for pnv_ioda_to_pe() Oliver O'Halloran
2019-11-21  3:50   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 38/46] powerpc/pci-hotplug: Scan the whole bus when using PCI_PROBE_NORMAL Oliver O'Halloran
2019-11-27  6:27   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 39/46] powernv/npu: Avoid pci_dn when mapping device_node to a pci_dev Oliver O'Halloran
2019-11-27  6:58   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs Oliver O'Halloran
2019-11-27  7:09   ` Alexey Kardashevskiy
2019-11-27  8:24     ` Greg Kurz
2019-11-27  9:10       ` Frederic Barrat
2019-11-27  9:33         ` Greg Kurz
2019-11-27  9:40           ` Oliver O'Halloran
2019-11-27 12:00             ` Greg Kurz
2019-11-27  9:47           ` Frederic Barrat
2019-11-27 12:03             ` Greg Kurz [this message]
2019-11-20  1:28 ` [Very RFC 41/46] powernv/eeh: Remove pdn setup for SR-IOV VFs Oliver O'Halloran
2019-11-27  7:14   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 42/46] powernv/pci: Don't clear pdn->pe_number in pnv_pci_release_device Oliver O'Halloran
2019-11-27  7:30   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices Oliver O'Halloran
2019-11-27 22:49   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack Oliver O'Halloran
2019-11-27 22:54   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors Oliver O'Halloran
2019-11-27 23:00   ` Alexey Kardashevskiy
2019-11-20  1:28 ` [Very RFC 46/46] HACK: prevent pdn's from being created Oliver O'Halloran

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191127130325.7496523b@bahia.w3ibm.bluemix.net \
    --to=groug@kaod.org \
    --cc=aik@ozlabs.ru \
    --cc=alistair@popple.id.au \
    --cc=fbarrat@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=oohall@gmail.com \
    --cc=s.miroshnichenko@yadro.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).