linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Aneela Devarasetty <aneela@mellanox.com>
To: Oliver O'Halloran <oohall@gmail.com>, Max Gurtovoy <maxg@mellanox.com>
Cc: Zhi-wei Dai <zdai@us.ibm.com>,
	Vladimir Koushnir <vladimirk@mellanox.com>,
	Carol Soto <clsoto@us.ibm.com>,
	linux-pci <linux-pci@vger.kernel.org>,
	Shlomi Nimrodi <shlomin@mellanox.com>,
	Israel Rukshin <israelr@mellanox.com>,
	Frederic Barrat <fbarrat@linux.ibm.com>,
	Idan Werpoler <Idanw@mellanox.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Christoph Hellwig <hch@lst.de>
Subject: RE: [PATCH 2/2 v2] powerpc/powernv: Enable and setup PCI P2P
Date: Mon, 10 Aug 2020 20:24:11 +0000	[thread overview]
Message-ID: <AM0PR05MB4642F95FB871C8CE741A0F1BA7440@AM0PR05MB4642.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <CAOSf1CGv=0bwShzzK5zP3dtKg=RxeTFvq52j-Vi4GDfZ4UpBJA@mail.gmail.com>

+ David from IBM.

-----Original Message-----
From: Oliver O'Halloran <oohall@gmail.com> 
Sent: Monday, August 3, 2020 2:35 AM
To: Max Gurtovoy <maxg@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>; linux-pci <linux-pci@vger.kernel.org>; linuxppc-dev <linuxppc-dev@lists.ozlabs.org>; Israel Rukshin <israelr@mellanox.com>; Idan Werpoler <Idanw@mellanox.com>; Vladimir Koushnir <vladimirk@mellanox.com>; Shlomi Nimrodi <shlomin@mellanox.com>; Frederic Barrat <fbarrat@linux.ibm.com>; Carol Soto <clsoto@us.ibm.com>; Aneela Devarasetty <aneela@mellanox.com>
Subject: Re: [PATCH 2/2 v2] powerpc/powernv: Enable and setup PCI P2P

On Thu, Apr 30, 2020 at 11:15 PM Max Gurtovoy <maxg@mellanox.com> wrote:
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 57d3a6a..9ecc576 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3706,18 +3706,208 @@ static void pnv_pci_ioda_dma_bus_setup(struct pci_bus *bus)
>         }
>  }
>
> +#ifdef CONFIG_PCI_P2PDMA
> +static DEFINE_MUTEX(p2p_mutex);
> +
> +static bool pnv_pci_controller_owns_addr(struct pci_controller *hose,
> +                                        phys_addr_t addr, size_t 
> +size) {
> +       int i;
> +
> +       /*
> +        * It seems safe to assume the full range is under the same PHB, so we
> +        * can ignore the size.
> +        */
> +       for (i = 0; i < ARRAY_SIZE(hose->mem_resources); i++) {
> +               struct resource *res = &hose->mem_resources[i];
> +
> +               if (res->flags && addr >= res->start && addr < res->end)
> +                       return true;
> +       }
> +       return false;
> +}
> +
> +/*
> + * find the phb owning a mmio address if not owned locally  */ static 
> +struct pnv_phb *pnv_pci_find_owning_phb(struct pci_dev *pdev,
> +                                              phys_addr_t addr, 
> +size_t size) {
> +       struct pci_controller *hose;
> +
> +       /* fast path */
> +       if (pnv_pci_controller_owns_addr(pdev->bus->sysdata, addr, size))
> +               return NULL;

Do we actually need this fast path? It's going to be slow either way.
Also if a device is doing p2p to another device under the same PHB then it should not be happening via the root complex. Is this a case you've tested?

> +       list_for_each_entry(hose, &hose_list, list_node) {
> +               struct pnv_phb *phb = hose->private_data;
> +
> +               if (phb->type != PNV_PHB_NPU_NVLINK &&
> +                   phb->type != PNV_PHB_NPU_OCAPI) {
> +                       if (pnv_pci_controller_owns_addr(hose, addr, size))
> +                               return phb;
> +               }
> +       }
> +       return NULL;
> +}
> +
> +static u64 pnv_pci_dma_dir_to_opal_p2p(enum dma_data_direction dir) {
> +       if (dir == DMA_TO_DEVICE)
> +               return OPAL_PCI_P2P_STORE;
> +       else if (dir == DMA_FROM_DEVICE)
> +               return OPAL_PCI_P2P_LOAD;
> +       else if (dir == DMA_BIDIRECTIONAL)
> +               return OPAL_PCI_P2P_LOAD | OPAL_PCI_P2P_STORE;
> +       else
> +               return 0;
> +}
> +
> +static int pnv_pci_ioda_enable_p2p(struct pci_dev *initiator,
> +                                  struct pnv_phb *phb_target,
> +                                  enum dma_data_direction dir) {
> +       struct pci_controller *hose;
> +       struct pnv_phb *phb_init;
> +       struct pnv_ioda_pe *pe_init;
> +       u64 desc;
> +       int rc;
> +
> +       if (!opal_check_token(OPAL_PCI_SET_P2P))
> +               return -ENXIO;
> +

> +       hose = pci_bus_to_host(initiator->bus);
> +       phb_init = hose->private_data;

You can use the pci_bus_to_pnvhb() helper

> +
> +       pe_init = pnv_ioda_get_pe(initiator);
> +       if (!pe_init)
> +               return -ENODEV;
> +
> +       if (!pe_init->tce_bypass_enabled)
> +               return -EINVAL;
> +
> +       /*
> +        * Configuring the initiator's PHB requires to adjust its TVE#1
> +        * setting. Since the same device can be an initiator several times for
> +        * different target devices, we need to keep a reference count to know
> +        * when we can restore the default bypass setting on its TVE#1 when
> +        * disabling. Opal is not tracking PE states, so we add a reference
> +        * count on the PE in linux.
> +        *
> +        * For the target, the configuration is per PHB, so we keep a
> +        * target reference count on the PHB.
> +        */

This irks me a bit because configuring the DMA address limits for the TVE is the kernel's job. What we really should be doing is using
opal_pci_map_pe_dma_window_real() to set the bypass-mode address limit for the TVE to something large enough to hit the MMIO ranges rather than having set_p2p do it as a side effect. Unfortunately, for some reason skiboot doesn't implement support for enabling 56bit addressing using opal_pci_map_pe_dma_window_real() and we do need to support older kernel's which used this stuff so I guess we're stuck with it for now. It'd be nice if we could fix this in the longer term though...

> +       mutex_lock(&p2p_mutex);
> +
> +       desc = OPAL_PCI_P2P_ENABLE | pnv_pci_dma_dir_to_opal_p2p(dir);
> +       /* always go to opal to validate the configuration */
> +       rc = opal_pci_set_p2p(phb_init->opal_id, phb_target->opal_id, desc,
> +                             pe_init->pe_number);
> +       if (rc != OPAL_SUCCESS) {
> +               rc = -EIO;
> +               goto out;
> +       }
> +
> +       pe_init->p2p_initiator_count++;
> +       phb_target->p2p_target_count++;
> +
> +       rc = 0;
> +out:
> +       mutex_unlock(&p2p_mutex);
> +       return rc;
> +}
> +
> +static int pnv_pci_dma_map_resource(struct pci_dev *pdev,
> +                                   phys_addr_t phys_addr, size_t size,
> +                                   enum dma_data_direction dir) {
> +       struct pnv_phb *target_phb;
> +
> +       target_phb = pnv_pci_find_owning_phb(pdev, phys_addr, size);
> +       if (!target_phb)
> +               return 0;
> +
> +       return pnv_pci_ioda_enable_p2p(pdev, target_phb, dir); }
> +
> +static int pnv_pci_ioda_disable_p2p(struct pci_dev *initiator,
> +               struct pnv_phb *phb_target) {
> +       struct pci_controller *hose;
> +       struct pnv_phb *phb_init;
> +       struct pnv_ioda_pe *pe_init;
> +       int rc;
> +
> +       if (!opal_check_token(OPAL_PCI_SET_P2P))
> +               return -ENXIO;

This should probably have a WARN_ON() since we can't hit this path unless the initial map succeeds.

> +       hose = pci_bus_to_host(initiator->bus);
> +       phb_init = hose->private_data;

pci_bus_to_pnvhb()

> +       pe_init = pnv_ioda_get_pe(initiator);
> +       if (!pe_init)
> +               return -ENODEV;
> +
> +       mutex_lock(&p2p_mutex);
> +
> +       if (!pe_init->p2p_initiator_count || !phb_target->p2p_target_count) {
> +               rc = -EINVAL;
> +               goto out;
> +       }
> +
> +       if (--pe_init->p2p_initiator_count == 0)
> +               pnv_pci_ioda2_set_bypass(pe_init, true);
> +
> +       if (--phb_target->p2p_target_count == 0) {
> +               rc = opal_pci_set_p2p(phb_init->opal_id, phb_target->opal_id,
> +                                     0, pe_init->pe_number);
> +               if (rc != OPAL_SUCCESS) {
> +                       rc = -EIO;
> +                       goto out;
> +               }
> +       }
> +
> +       rc = 0;
> +out:
> +       mutex_unlock(&p2p_mutex);
> +       return rc;
> +}
> +
> +static void pnv_pci_dma_unmap_resource(struct pci_dev *pdev,
> +                                      dma_addr_t addr, size_t size,
> +                                      enum dma_data_direction dir) {
> +       struct pnv_phb *target_phb;
> +       int rc;
> +
> +       target_phb = pnv_pci_find_owning_phb(pdev, addr, size);
> +       if (!target_phb)
> +               return;
> +
> +       rc = pnv_pci_ioda_disable_p2p(pdev, target_phb);
> +       if (rc)
> +               dev_err(&pdev->dev, "Failed to undo PCI peer-to-peer setup for address %llx: %d\n",
> +                       addr, rc);

Use pci_err() or pe_err().

  reply	other threads:[~2020-08-10 23:57 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-30 13:15 [PATCH 1/2 v2] powerpc/dma: Define map/unmap mmio resource callbacks Max Gurtovoy
2020-04-30 13:15 ` [PATCH 2/2 v2] powerpc/powernv: Enable and setup PCI P2P Max Gurtovoy
2020-08-03  7:35   ` Oliver O'Halloran
2020-08-10 20:24     ` Aneela Devarasetty [this message]
2020-08-11 17:25     ` Frederic Barrat
2020-07-31  3:30 ` [PATCH 1/2 v2] powerpc/dma: Define map/unmap mmio resource callbacks Oliver O'Halloran

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM0PR05MB4642F95FB871C8CE741A0F1BA7440@AM0PR05MB4642.eurprd05.prod.outlook.com \
    --to=aneela@mellanox.com \
    --cc=Idanw@mellanox.com \
    --cc=clsoto@us.ibm.com \
    --cc=fbarrat@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=israelr@mellanox.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maxg@mellanox.com \
    --cc=oohall@gmail.com \
    --cc=shlomin@mellanox.com \
    --cc=vladimirk@mellanox.com \
    --cc=zdai@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).