LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
From: Frederic Barrat <fbarrat@linux.ibm.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, andrew.donnellan@au1.ibm.com,
	clombard@linux.ibm.com, Alistair Popple <alistair@popple.id.au>
Cc: Reza Arbab <arbab@linux.ibm.com>, groug@kaod.org, alastair@au1.ibm.com
Subject: Re: [PATCH 01/11] powerpc/powernv/ioda: Fix ref count for devices with their own PE
Date: Tue, 12 Nov 2019 18:38:21 +0100
Message-ID: <e8cb8ac1-b54e-ce15-e1ea-6131b7386342@linux.ibm.com> (raw)
In-Reply-To: <010252f2-7f94-4a8a-90ae-82ff49b622d6@ozlabs.ru>



Le 27/09/2019 à 08:54, Alexey Kardashevskiy a écrit :
> 
> 
> On 27/09/2019 03:15, Frederic Barrat wrote:
>>
>>
>> Le 26/09/2019 à 18:44, Andrew Donnellan a écrit :
>>> On 9/9/19 5:45 pm, Frederic Barrat wrote:
>>>> Taking a reference on the pci_dev structure was required with initial
>>>> commit 184cd4a3b962 ("powerpc/powernv: PCI support for p7IOC under
>>>> OPAL v2"), where we were storing the pci_dev in the pci_dn structure.
>>>>
>>>> However, the pci_dev was later removed from the pci_dn structure, but
>>>> the reference was kept. See commit 902bdc57451c ("powerpc/powernv/idoa:
>>>> Remove unnecessary pcidev from pci_dn").
>>>>
>>>> The pnv_ioda_pe structure life cycle is the same as the pci_dev
>>>> structure, the PE is freed when the device is released. So we don't
>>>> need a reference for the pci_dev stored in the PE, otherwise the
>>>> pci_dev will never be released. Which is not really a surprise as the
>>>> comment (removed here as no longer needed) was stating as much.
>>>>
>>>> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn")
>>>> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
>>>> ---
>>>>    arch/powerpc/platforms/powernv/pci-ioda.c | 11 +----------
>>>>    1 file changed, 1 insertion(+), 10 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>> index d8080558d020..92767f006f20 100644
>>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>> @@ -1062,14 +1062,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>>>>            return NULL;
>>>>        }
>>>> -    /* NOTE: We get only one ref to the pci_dev for the pdn, not for the
>>>> -     * pointer in the PE data structure, both should be destroyed at the
>>>> -     * same time. However, this needs to be looked at more closely again
>>>> -     * once we actually start removing things (Hotplug, SR-IOV, ...)
>>>> -     *
>>>> -     * At some point we want to remove the PDN completely anyways
>>>> -     */
>>>> -    pci_dev_get(dev);
>>>>        pdn->pe_number = pe->pe_number;
>>>>        pe->flags = PNV_IODA_PE_DEV;
>>>>        pe->pdev = dev;
>>>> @@ -1084,7 +1076,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>>>>            pnv_ioda_free_pe(pe);
>>>>            pdn->pe_number = IODA_INVALID_PE;
>>>>            pe->pdev = NULL;
>>>> -        pci_dev_put(dev);
>>>>            return NULL;
>>>>        }
>>>> @@ -1228,7 +1219,7 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>>>>                 */
>>>>                dev_info(&npu_pdev->dev,
>>>>                    "Associating to existing PE %x\n", pe_num);
>>>> -            pci_dev_get(npu_pdev);
>>>> +            pci_dev_get(npu_pdev); // still needed after 902bdc57451c ?
>>>
>>> Did you mean to leave that comment in?
>>
>> Yes, I assumed the series wouldn't get in on the first try and a nvlink-minded developer would weigh in :)
> 
> I am looking at it and I am still not so sure what exactly guarantees that lifetime(@dev) == lifetime(@pe). For example,
> sriov_disable() removes VFs first, and only then releases PEs so there is a window when we may have a stale pdev. Not sure.


Indeed, for SRIOV, PE life-cycle is handled differently. And hopefully 
correctly, but I don’t think this patch alters it.

I was discussing with Greg about it, as he had to look in that area in 
the past. My understanding is that this patch is ok as it follows the 
initial comment in pnv_ioda_setup_dev_PE() that we don’t need to get a 
reference for the PE, it makes sense and I could trace the origin of the 
pci_dev_get() and it seemed like it should have been removed in a 
previous patch (902bdc57451c).

However, one question is whether this patch breaks nvlink and if nvlink 
assumes the devices won’t go away because we explicitly take a reference 
forever. In npu_dma.c, there are 2 functions which allow to find the GPU 
associated to a npu device, and vice-versa. Both functions return a 
pointer to a struct pci_dev, but they don’t take a reference on the 
device being returned. So that seems dangerous. I’m probably missing 
something.

Alexey, Alistair: what, if anything, guarantees, that the npu or gpu 
devices stay valid. Is it because we simply don’t provide any means to 
get rid of them ? Otherwise, don’t we need the callers of 
pnv_pci_get_gpu_dev() and pnv_pci_get_npu_dev() to worry about reference 
counting ? I’ve started looking into it and the changes are scary, which 
explains Greg’s related commit 02c5f5394918b.

   Fred



> 
>>
>>    Fred
>>
>>
>>>>                npu_pdn = pci_get_pdn(npu_pdev);
>>>>                rid = npu_pdev->bus->number << 8 | npu_pdn->devfn;
>>>>                npu_pdn->pe_number = pe_num;
>>>>
>>>
>>
> 


  reply index

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-09 15:45 [PATCH 00/11] opencapi: enable card reset and link retraining Frederic Barrat
2019-09-09 15:45 ` [PATCH 01/11] powerpc/powernv/ioda: Fix ref count for devices with their own PE Frederic Barrat
2019-09-10  0:20   ` Alastair D'Silva
2019-09-26 16:44   ` Andrew Donnellan
2019-09-26 17:15     ` Frederic Barrat
2019-09-27  6:54       ` Alexey Kardashevskiy
2019-11-12 17:38         ` Frederic Barrat [this message]
2019-11-18  1:04           ` Alistair Popple
2019-11-18  1:24             ` Oliver O'Halloran
2019-11-18  2:36               ` Alistair Popple
2019-11-18 18:21                 ` Frederic Barrat
2019-09-09 15:45 ` [PATCH 02/11] powerpc/powernv/ioda: Protect PE list Frederic Barrat
2019-09-10  0:34   ` Alastair D'Silva
2019-11-19 12:55     ` Frederic Barrat
2019-11-19 13:22       ` Oliver O'Halloran
2019-11-19 14:36         ` Frederic Barrat
2019-11-19  5:55   ` Andrew Donnellan
2019-09-09 15:45 ` [PATCH 03/11] powerpc/powernv/ioda: set up PE on opencapi device when enabling Frederic Barrat
2019-09-10  0:38   ` Alastair D'Silva
2019-09-27 16:43   ` Andrew Donnellan
2019-09-09 15:45 ` [PATCH 04/11] powerpc/powernv/ioda: Release opencapi device Frederic Barrat
2019-09-10  0:56   ` Alastair D'Silva
2019-11-19 17:32     ` Frederic Barrat
2019-11-19  7:08   ` Andrew Donnellan
2019-09-09 15:45 ` [PATCH 05/11] powerpc/powernv/ioda: Find opencapi slot for a device node Frederic Barrat
2019-09-10  0:57   ` Alastair D'Silva
2019-11-19  1:26   ` Andrew Donnellan
2019-11-19 14:33     ` Frederic Barrat
2019-09-09 15:45 ` [PATCH 06/11] pci/hotplug/pnv-php: Remove erroneous warning Frederic Barrat
2019-09-10  0:58   ` Alastair D'Silva
2019-11-19  5:00   ` Andrew Donnellan
2019-09-09 15:45 ` [PATCH 07/11] pci/hotplug/pnv-php: Improve error msg on power state change failure Frederic Barrat
2019-09-10  0:59   ` Alastair D'Silva
2019-11-19  2:23   ` Andrew Donnellan
2019-09-09 15:45 ` [PATCH 08/11] pci/hotplug/pnv-php: Register opencapi slots Frederic Barrat
2019-09-10  1:00   ` Alastair D'Silva
2019-11-19  5:18   ` Andrew Donnellan
2019-11-19 15:15     ` Frederic Barrat
2019-09-09 15:45 ` [PATCH 09/11] pci/hotplug/pnv-php: Relax check when disabling slot Frederic Barrat
2019-09-10  1:00   ` Alastair D'Silva
2019-09-27 15:56   ` Andrew Donnellan
2019-09-09 15:45 ` [PATCH 10/11] pci/hotplug/pnv-php: Wrap warnings in macro Frederic Barrat
2019-09-10  1:03   ` Alastair D'Silva
2019-09-26 17:18   ` Andrew Donnellan
2019-09-09 15:46 ` [PATCH 11/11] ocxl: Add PCI hotplug dependency to Kconfig Frederic Barrat
2019-09-10  1:03   ` Alastair D'Silva
2019-09-26 17:11   ` Andrew Donnellan
2019-09-24  4:24 ` [PATCH 00/11] opencapi: enable card reset and link retraining Alexey Kardashevskiy
2019-09-24  6:39   ` Frederic Barrat
2019-09-25  0:20     ` Alexey Kardashevskiy

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e8cb8ac1-b54e-ce15-e1ea-6131b7386342@linux.ibm.com \
    --to=fbarrat@linux.ibm.com \
    --cc=aik@ozlabs.ru \
    --cc=ajd@linux.ibm.com \
    --cc=alastair@au1.ibm.com \
    --cc=alistair@popple.id.au \
    --cc=andrew.donnellan@au1.ibm.com \
    --cc=arbab@linux.ibm.com \
    --cc=clombard@linux.ibm.com \
    --cc=groug@kaod.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git