* [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers
@ 2020-09-04 22:41 Jason Gunthorpe
From: Jason Gunthorpe @ 2020-09-04 22:41 UTC (permalink / raw)
  To: Adit Ranadive, Ariel Elior, Potnuri Bharat Teja, David S. Miller,
	Devesh Sharma, Doug Ledford, Faisal Latif, Gal Pressman,
	GR-everest-linux-l2, Wei Hu(Xavier),
	Jakub Kicinski, Leon Romanovsky, linux-rdma, Weihang Li,
	Michal Kalderon, Naresh Kumar PBS, netdev, Lijun Ou,
	VMware PV-Drivers, Selvin Xavier, Yossi Leybovich, Somnath Kotur,
	Sriharsha Basavapatna, Yishai Hadas
  Cc: Firas JahJah, Henry Orosco, Leon Romanovsky, Michael J. Ruhl,
	Michal Kalderon, Miguel Ojeda, Shiraz Saleem

Most RDMA drivers rely on a linear table of DMA addresses organized in
some device-specific page size.

For a while now the core code has had the rdma_for_each_block() SG
iterator to help break a umem into DMA blocks for use in the device lists.

Improve on this by adding rdma_umem_for_each_dma_block(),
ib_umem_dma_offset() and ib_umem_num_dma_blocks().

Replace open-coded versions, and calls to fixed PAGE_SIZE APIs, in most of
the drivers with one of the above APIs.

Get rid of the really weird and duplicative ib_umem_page_count().

Fix two problems with ib_umem_find_best_pgsz(), and several problems
related to computing the wrong DMA list length if IOVA != umem->address.

At this point many of the drivers have a clear path to call
ib_umem_find_best_pgsz() and replace hardcoded PAGE_SIZE or PAGE_SHIFT
values when constructing their DMA lists.

This is the first series in an effort to modernize the umem usage in all
the DMA drivers.

v1: https://lore.kernel.org/r/0-v1-00f59ce24f1f+19f50-umem_1_jgg@nvidia.com
v2:
 - Fix ib_umem_find_best_pgsz() to use IOVA not umem->addr
 - Fix ib_umem_num_dma_blocks() to use IOVA not umem->addr
 - Two new patches to remove wrong open coded versions of
   ib_umem_num_dma_blocks() from EFA and i40iw
 - Redo the mlx4 ib_umem_num_dma_blocks() to do less and be safer
   until the whole thing can be moved to ib_umem_find_best_pgsz()
 - Two new patches to delete calls to ib_umem_offset() in qedr and
   ocrdma

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Jason Gunthorpe (17):
  RDMA/umem: Fix ib_umem_find_best_pgsz() for mappings that cross a page
    boundary
  RDMA/umem: Prevent small pages from being returned by
    ib_umem_find_best_pgsz()
  RDMA/umem: Use simpler logic for ib_umem_find_best_pgsz()
  RDMA/umem: Add rdma_umem_for_each_dma_block()
  RDMA/umem: Replace for_each_sg_dma_page with
    rdma_umem_for_each_dma_block
  RDMA/umem: Split ib_umem_num_pages() into ib_umem_num_dma_blocks()
  RDMA/efa: Use ib_umem_num_dma_blocks()
  RDMA/i40iw: Use ib_umem_num_dma_blocks()
  RDMA/qedr: Use rdma_umem_for_each_dma_block() instead of open-coding
  RDMA/qedr: Use ib_umem_num_dma_blocks() instead of
    ib_umem_page_count()
  RDMA/bnxt: Do not use ib_umem_page_count() or ib_umem_num_pages()
  RDMA/hns: Use ib_umem_num_dma_blocks() instead of opencoding
  RDMA/ocrdma: Use ib_umem_num_dma_blocks() instead of
    ib_umem_page_count()
  RDMA/pvrdma: Use ib_umem_num_dma_blocks() instead of
    ib_umem_page_count()
  RDMA/mlx4: Use ib_umem_num_dma_blocks()
  RDMA/qedr: Remove fbo and zbva from the MR
  RDMA/ocrdma: Remove fbo from MR

 .clang-format                                 |  1 +
 drivers/infiniband/core/umem.c                | 45 +++++++-----
 drivers/infiniband/hw/bnxt_re/ib_verbs.c      | 72 +++++++------------
 drivers/infiniband/hw/cxgb4/mem.c             |  8 +--
 drivers/infiniband/hw/efa/efa_verbs.c         |  9 ++-
 drivers/infiniband/hw/hns/hns_roce_alloc.c    |  3 +-
 drivers/infiniband/hw/hns/hns_roce_mr.c       | 49 +++++--------
 drivers/infiniband/hw/i40iw/i40iw_verbs.c     | 13 +---
 drivers/infiniband/hw/mlx4/cq.c               |  1 -
 drivers/infiniband/hw/mlx4/mr.c               |  5 +-
 drivers/infiniband/hw/mlx4/qp.c               |  2 -
 drivers/infiniband/hw/mlx4/srq.c              |  5 +-
 drivers/infiniband/hw/mlx5/mem.c              |  4 +-
 drivers/infiniband/hw/mthca/mthca_provider.c  |  8 +--
 drivers/infiniband/hw/ocrdma/ocrdma.h         |  1 -
 drivers/infiniband/hw/ocrdma/ocrdma_hw.c      |  5 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c   | 25 +++----
 drivers/infiniband/hw/qedr/verbs.c            | 52 +++++---------
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c  |  2 +-
 .../infiniband/hw/vmw_pvrdma/pvrdma_misc.c    |  9 ++-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c  |  2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c  |  6 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_rdma.c    | 12 +---
 include/linux/qed/qed_rdma_if.h               |  2 -
 include/rdma/ib_umem.h                        | 37 ++++++++--
 include/rdma/ib_verbs.h                       | 24 -------
 27 files changed, 170 insertions(+), 234 deletions(-)

-- 
2.28.0



* [PATCH v2 16/17] RDMA/qedr: Remove fbo and zbva from the MR
From: Jason Gunthorpe @ 2020-09-04 22:41 UTC (permalink / raw)
  To: Ariel Elior, David S. Miller, Doug Ledford, GR-everest-linux-l2,
	Jakub Kicinski, linux-rdma, Michal Kalderon, netdev

zbva is always false, so fbo is never read.

A 'zero-based-virtual-address' is simply IOVA == 0, and the driver already
supports this.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/qedr/verbs.c         |  4 ----
 drivers/net/ethernet/qlogic/qed/qed_rdma.c | 12 ++----------
 include/linux/qed/qed_rdma_if.h            |  2 --
 3 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 278b48443aedba..cca69b4ed354ea 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -2878,10 +2878,8 @@ struct ib_mr *qedr_reg_user_mr(struct ib_pd *ibpd, u64 start, u64 len,
 	mr->hw_mr.pbl_two_level = mr->info.pbl_info.two_layered;
 	mr->hw_mr.pbl_page_size_log = ilog2(mr->info.pbl_info.pbl_size);
 	mr->hw_mr.page_size_log = PAGE_SHIFT;
-	mr->hw_mr.fbo = ib_umem_offset(mr->umem);
 	mr->hw_mr.length = len;
 	mr->hw_mr.vaddr = usr_addr;
-	mr->hw_mr.zbva = false;
 	mr->hw_mr.phy_mr = false;
 	mr->hw_mr.dma_mr = false;
 
@@ -2974,10 +2972,8 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
 	mr->hw_mr.pbl_ptr = 0;
 	mr->hw_mr.pbl_two_level = mr->info.pbl_info.two_layered;
 	mr->hw_mr.pbl_page_size_log = ilog2(mr->info.pbl_info.pbl_size);
-	mr->hw_mr.fbo = 0;
 	mr->hw_mr.length = 0;
 	mr->hw_mr.vaddr = 0;
-	mr->hw_mr.zbva = false;
 	mr->hw_mr.phy_mr = true;
 	mr->hw_mr.dma_mr = false;
 
diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index a4bcde522cdf9d..baa4c36608ea91 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -1520,7 +1520,7 @@ qed_rdma_register_tid(void *rdma_cxt,
 		  params->pbl_two_level);
 
 	SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_ZERO_BASED,
-		  params->zbva);
+		  false);
 
 	SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_PHY_MR, params->phy_mr);
 
@@ -1582,15 +1582,7 @@ qed_rdma_register_tid(void *rdma_cxt,
 	p_ramrod->pd = cpu_to_le16(params->pd);
 	p_ramrod->length_hi = (u8)(params->length >> 32);
 	p_ramrod->length_lo = DMA_LO_LE(params->length);
-	if (params->zbva) {
-		/* Lower 32 bits of the registered MR address.
-		 * In case of zero based MR, will hold FBO
-		 */
-		p_ramrod->va.hi = 0;
-		p_ramrod->va.lo = cpu_to_le32(params->fbo);
-	} else {
-		DMA_REGPAIR_LE(p_ramrod->va, params->vaddr);
-	}
+	DMA_REGPAIR_LE(p_ramrod->va, params->vaddr);
 	DMA_REGPAIR_LE(p_ramrod->pbl_base, params->pbl_ptr);
 
 	/* DIF */
diff --git a/include/linux/qed/qed_rdma_if.h b/include/linux/qed/qed_rdma_if.h
index f464d85e88a410..aeb242cefebfa8 100644
--- a/include/linux/qed/qed_rdma_if.h
+++ b/include/linux/qed/qed_rdma_if.h
@@ -242,10 +242,8 @@ struct qed_rdma_register_tid_in_params {
 	bool pbl_two_level;
 	u8 pbl_page_size_log;
 	u8 page_size_log;
-	u32 fbo;
 	u64 length;
 	u64 vaddr;
-	bool zbva;
 	bool phy_mr;
 	bool dma_mr;
 
-- 
2.28.0



* RE: [EXT] [PATCH v2 16/17] RDMA/qedr: Remove fbo and zbva from the MR
From: Michal Kalderon @ 2020-09-06  8:01 UTC (permalink / raw)
  To: Jason Gunthorpe, Ariel Elior, David S. Miller, Doug Ledford,
	GR-everest-linux-l2, Jakub Kicinski, linux-rdma, netdev

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, September 5, 2020 1:42 AM
> zbva is always false, so fbo is never read.
> 
> A 'zero-based-virtual-address' is simply IOVA == 0, and the driver already
> supports this.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> [full diff trimmed]

Thanks, 

Acked-by: Michal Kalderon <michal.kalderon@marvell.com>




* Re: [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers
From: Jason Gunthorpe @ 2020-09-09 18:38 UTC (permalink / raw)
  To: Adit Ranadive, Ariel Elior, Potnuri Bharat Teja, David S. Miller,
	Devesh Sharma, Doug Ledford, Faisal Latif, Gal Pressman,
	GR-everest-linux-l2, Wei Hu(Xavier),
	Jakub Kicinski, Leon Romanovsky, linux-rdma, Weihang Li,
	Michal Kalderon, Naresh Kumar PBS, netdev, Lijun Ou,
	VMware PV-Drivers, Selvin Xavier, Yossi Leybovich, Somnath Kotur,
	Sriharsha Basavapatna, Yishai Hadas
  Cc: Firas JahJah, Henry Orosco, Leon Romanovsky, Michael J. Ruhl,
	Michal Kalderon, Miguel Ojeda, Shiraz Saleem

On Fri, Sep 04, 2020 at 07:41:41PM -0300, Jason Gunthorpe wrote:
> Most RDMA drivers rely on a linear table of DMA addresses organized in
> some device specific page size.
> 
> [rest of cover letter trimmed]

Applied to for-next with Leon's note. Thanks everyone

Jason
