All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Oliver O'Halloran <oohall@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org, aik@ozlabs.ru, shawn@anastas.io,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH 1/3] powernv/iov: Ensure the pdn for VFs always contains a valid PE number
Date: Mon, 30 Sep 2019 12:09:48 -0500	[thread overview]
Message-ID: <20190930170948.GA154567@google.com> (raw)
In-Reply-To: <20190930020848.25767-2-oohall@gmail.com>

On Mon, Sep 30, 2019 at 12:08:46PM +1000, Oliver O'Halloran wrote:

This is all powerpc, so I assume Michael will handle this.  Just
random things I noticed; ignore if they don't make sense:

> On PowerNV we use the pcibios_sriov_enable() hook to do two things:
> 
> 1. Create a pci_dn structure for each of the VFs, and
> 2. Configure the PHB's internal BARs that map MMIO ranges to PEs
>    so that each VF has it's own PE. Note that the PE also determines

s/it's/its/

>    the IOMMU table the HW uses for the device.
> 
> Currently we do not set the pe_number field of the pci_dn immediately after
> assigning the PE number for the VF that it represents. Instead, we do that
> in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
> pcibios_add_device() hook which is run prior to adding the device to the
> bus.
> 
> On PowerNV we add the device to it's IOMMU group using a bus notifier and

s/it's/its/

> in order for this to work the PE number needs to be known when the bus
> notifier is run. This works today since the PE number is set in the fixup
> which runs before adding the device to the bus. However, if we want to move
> the fixup to a later stage this will break.
> 
> We can fix this by setting the pdn->pe_number inside of
> pcibios_sriov_enable(). There's no good to avoid this since we already have

s/no good/no good reason/ ?

Not quite sure what "this" refers to ... "no good reason to avoid
setting pdn->pe_number in pcibios_sriov_enable()"?  The double
negative makes it a little hard to parse.

> all the required information at that point, so... do that. Moving this
> earlier does cause two problems:
> 
> 1. We trip the WARN_ON() in the fixup code, and
> 2. The EEH core will clear pdn->pe_number while recovering VFs.
> 
> The only justification for either of these is a comment in eeh_rmv_device()
> suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order
> for the VF to be scanned. However, this comment appears to have no basis in
> reality so just delete it.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> Can't get rid of the fixup entirely since we need it to set the
> ioda_pe->pdev back-pointer. I'll look at killing that another time.
> ---
>  arch/powerpc/kernel/eeh_driver.c          |  6 ------
>  arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
>  arch/powerpc/platforms/powernv/pci.c      |  4 ----
>  3 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index d9279d0..7955fba 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -541,12 +541,6 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
>  
>  		pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
>  		edev->pdev = NULL;
> -
> -		/*
> -		 * We have to set the VF PE number to invalid one, which is
> -		 * required to plug the VF successfully.
> -		 */
> -		pdn->pe_number = IODA_INVALID_PE;
>  #endif
>  		if (rmv_data)
>  			list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5e3172d..70508b3 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1558,6 +1558,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  
>  	/* Reserve PE for each VF */
>  	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
> +		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
> +		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
> +		struct pci_dn *vf_pdn;
> +
>  		if (pdn->m64_single_mode)
>  			pe_num = pdn->pe_num_map[vf_index];
>  		else
> @@ -1570,13 +1574,11 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		pe->pbus = NULL;
>  		pe->parent_dev = pdev;
>  		pe->mve_number = -1;
> -		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
> -			   pci_iov_virtfn_devfn(pdev, vf_index);
> +		pe->rid = (vf_bus << 8) | vf_devfn;
>  
>  		pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",

Not related to *this* patch, but this looks like maybe it's supposed
to match the pci_name(), e.g., "%04x:%02x:%02x.%d" from
pci_setup_device()?  If so, the "%04d:%02d:%02d" here will be
confusing since the decimal & hex won't always match.

>  			hose->global_number, pdev->bus->number,

Consider pci_domain_nr(bus) instead of hose->global_number?  It would
be nice if you had the pci_dev * for each VF so you could just use
pci_name(vf) instead of all this domain/bus/PCI_SLOT/FUNC.

> -			PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)),
> -			PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num);
> +			PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
>  
>  		if (pnv_ioda_configure_pe(phb, pe)) {
>  			/* XXX What do we do here ? */
> @@ -1590,6 +1592,15 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		list_add_tail(&pe->list, &phb->ioda.pe_list);
>  		mutex_unlock(&phb->ioda.pe_list_mutex);
>  
> +		/* associate this pe to it's pdn */
> +		list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
> +			if (vf_pdn->busno == vf_bus &&
> +			    vf_pdn->devfn == vf_devfn) {
> +				vf_pdn->pe_number = pe_num;
> +				break;
> +			}
> +		}
> +
>  		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>  #ifdef CONFIG_IOMMU_API
>  		iommu_register_group(&pe->table_group,
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 2825d00..b7761e2 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -816,16 +816,12 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  	struct pnv_phb *phb = hose->private_data;
>  #ifdef CONFIG_PCI_IOV
>  	struct pnv_ioda_pe *pe;
> -	struct pci_dn *pdn;
>  
>  	/* Fix the VF pdn PE number */
>  	if (pdev->is_virtfn) {
> -		pdn = pci_get_pdn(pdev);
> -		WARN_ON(pdn->pe_number != IODA_INVALID_PE);
>  		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
>  			if (pe->rid == ((pdev->bus->number << 8) |
>  			    (pdev->devfn & 0xff))) {
> -				pdn->pe_number = pe->pe_number;
>  				pe->pdev = pdev;
>  				break;
>  			}
> -- 
> 2.9.5
> 

WARNING: multiple messages have this Message-ID (diff)
From: Bjorn Helgaas <helgaas@kernel.org>
To: Oliver O'Halloran <oohall@gmail.com>
Cc: aik@ozlabs.ru, shawn@anastas.io, linuxppc-dev@lists.ozlabs.org,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH 1/3] powernv/iov: Ensure the pdn for VFs always contains a valid PE number
Date: Mon, 30 Sep 2019 12:09:48 -0500	[thread overview]
Message-ID: <20190930170948.GA154567@google.com> (raw)
In-Reply-To: <20190930020848.25767-2-oohall@gmail.com>

On Mon, Sep 30, 2019 at 12:08:46PM +1000, Oliver O'Halloran wrote:

This is all powerpc, so I assume Michael will handle this.  Just
random things I noticed; ignore if they don't make sense:

> On PowerNV we use the pcibios_sriov_enable() hook to do two things:
> 
> 1. Create a pci_dn structure for each of the VFs, and
> 2. Configure the PHB's internal BARs that map MMIO ranges to PEs
>    so that each VF has it's own PE. Note that the PE also determines

s/it's/its/

>    the IOMMU table the HW uses for the device.
> 
> Currently we do not set the pe_number field of the pci_dn immediately after
> assigning the PE number for the VF that it represents. Instead, we do that
> in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
> pcibios_add_device() hook which is run prior to adding the device to the
> bus.
> 
> On PowerNV we add the device to it's IOMMU group using a bus notifier and

s/it's/its/

> in order for this to work the PE number needs to be known when the bus
> notifier is run. This works today since the PE number is set in the fixup
> which runs before adding the device to the bus. However, if we want to move
> the fixup to a later stage this will break.
> 
> We can fix this by setting the pdn->pe_number inside of
> pcibios_sriov_enable(). There's no good to avoid this since we already have

s/no good/no good reason/ ?

Not quite sure what "this" refers to ... "no good reason to avoid
setting pdn->pe_number in pcibios_sriov_enable()"?  The double
negative makes it a little hard to parse.

> all the required information at that point, so... do that. Moving this
> earlier does cause two problems:
> 
> 1. We trip the WARN_ON() in the fixup code, and
> 2. The EEH core will clear pdn->pe_number while recovering VFs.
> 
> The only justification for either of these is a comment in eeh_rmv_device()
> suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order
> for the VF to be scanned. However, this comment appears to have no basis in
> reality so just delete it.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> Can't get rid of the fixup entirely since we need it to set the
> ioda_pe->pdev back-pointer. I'll look at killing that another time.
> ---
>  arch/powerpc/kernel/eeh_driver.c          |  6 ------
>  arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
>  arch/powerpc/platforms/powernv/pci.c      |  4 ----
>  3 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index d9279d0..7955fba 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -541,12 +541,6 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
>  
>  		pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
>  		edev->pdev = NULL;
> -
> -		/*
> -		 * We have to set the VF PE number to invalid one, which is
> -		 * required to plug the VF successfully.
> -		 */
> -		pdn->pe_number = IODA_INVALID_PE;
>  #endif
>  		if (rmv_data)
>  			list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5e3172d..70508b3 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1558,6 +1558,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  
>  	/* Reserve PE for each VF */
>  	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
> +		int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
> +		int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
> +		struct pci_dn *vf_pdn;
> +
>  		if (pdn->m64_single_mode)
>  			pe_num = pdn->pe_num_map[vf_index];
>  		else
> @@ -1570,13 +1574,11 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		pe->pbus = NULL;
>  		pe->parent_dev = pdev;
>  		pe->mve_number = -1;
> -		pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
> -			   pci_iov_virtfn_devfn(pdev, vf_index);
> +		pe->rid = (vf_bus << 8) | vf_devfn;
>  
>  		pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",

Not related to *this* patch, but this looks like maybe it's supposed
to match the pci_name(), e.g., "%04x:%02x:%02x.%d" from
pci_setup_device()?  If so, the "%04d:%02d:%02d" here will be
confusing since the decimal & hex won't always match.

>  			hose->global_number, pdev->bus->number,

Consider pci_domain_nr(bus) instead of hose->global_number?  It would
be nice if you had the pci_dev * for each VF so you could just use
pci_name(vf) instead of all this domain/bus/PCI_SLOT/FUNC.

> -			PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)),
> -			PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num);
> +			PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
>  
>  		if (pnv_ioda_configure_pe(phb, pe)) {
>  			/* XXX What do we do here ? */
> @@ -1590,6 +1592,15 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>  		list_add_tail(&pe->list, &phb->ioda.pe_list);
>  		mutex_unlock(&phb->ioda.pe_list_mutex);
>  
> +		/* associate this pe to it's pdn */
> +		list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
> +			if (vf_pdn->busno == vf_bus &&
> +			    vf_pdn->devfn == vf_devfn) {
> +				vf_pdn->pe_number = pe_num;
> +				break;
> +			}
> +		}
> +
>  		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>  #ifdef CONFIG_IOMMU_API
>  		iommu_register_group(&pe->table_group,
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 2825d00..b7761e2 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -816,16 +816,12 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>  	struct pnv_phb *phb = hose->private_data;
>  #ifdef CONFIG_PCI_IOV
>  	struct pnv_ioda_pe *pe;
> -	struct pci_dn *pdn;
>  
>  	/* Fix the VF pdn PE number */
>  	if (pdev->is_virtfn) {
> -		pdn = pci_get_pdn(pdev);
> -		WARN_ON(pdn->pe_number != IODA_INVALID_PE);
>  		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
>  			if (pe->rid == ((pdev->bus->number << 8) |
>  			    (pdev->devfn & 0xff))) {
> -				pdn->pe_number = pe->pe_number;
>  				pe->pdev = pdev;
>  				break;
>  			}
> -- 
> 2.9.5
> 

  reply	other threads:[~2019-09-30 17:09 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-30  2:08 IOMMU group creation for pseries hotplug, and powernv VFs Oliver O'Halloran
2019-09-30  2:08 ` Oliver O'Halloran
2019-09-30  2:08 ` [PATCH 1/3] powernv/iov: Ensure the pdn for VFs always contains a valid PE number Oliver O'Halloran
2019-09-30  2:08   ` Oliver O'Halloran
2019-09-30 17:09   ` Bjorn Helgaas [this message]
2019-09-30 17:09     ` Bjorn Helgaas
2019-10-01  1:33     ` Oliver O'Halloran
2019-10-01  1:33       ` Oliver O'Halloran
2019-10-11  8:27       ` Michael Ellerman
2019-10-11  8:27         ` Michael Ellerman
2019-10-01  5:39   ` Alexey Kardashevskiy
2019-10-01  5:39     ` Alexey Kardashevskiy
2019-09-30  2:08 ` [PATCH 2/3] powerpc/pci: Fix pcibios_setup_device() ordering Oliver O'Halloran
2019-09-30  2:08   ` Oliver O'Halloran
2019-10-01  5:39   ` Alexey Kardashevskiy
2019-10-01  5:39     ` Alexey Kardashevskiy
2019-09-30  2:08 ` [PATCH 3/3] powerpc/pci: Remove pcibios_setup_bus_devices() Oliver O'Halloran
2019-09-30  2:08   ` Oliver O'Halloran
2019-10-01  5:40   ` Alexey Kardashevskiy
2019-10-01  5:40     ` Alexey Kardashevskiy
2019-10-01  7:39 ` IOMMU group creation for pseries hotplug, and powernv VFs Shawn Anastasio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190930170948.GA154567@google.com \
    --to=helgaas@kernel.org \
    --cc=aik@ozlabs.ru \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=oohall@gmail.com \
    --cc=shawn@anastas.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.