linux-rdma.vger.kernel.org archive mirror
* [PATCH rdma-next 00/12] Improvements for ODP
@ 2019-08-19 11:16 Leon Romanovsky
  2019-08-19 11:16 ` [PATCH rdma-next 01/12] RDMA/odp: Use the common interval tree library instead of generic Leon Romanovsky
                   ` (12 more replies)
  0 siblings, 13 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:16 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Leon Romanovsky <leonro@mellanox.com>

Hi,

This series from Jason is a collection of general cleanups for
ODP that clarify some of the flows around umem creation and the use
of the interval tree.

It is based on the patch "RDMA/mlx5: Fix MR npages calculation for
IB_ACCESS_HUGETLB":
https://lore.kernel.org/linux-rdma/20190815083834.9245-5-leon@kernel.org

Thanks

Jason Gunthorpe (11):
  RDMA/odp: Use the common interval tree library instead of generic
  RDMA/odp: Iterate over the whole rbtree directly
  RDMA/odp: Make it clearer when a umem is an implicit ODP umem
  RDMA/odp: Consolidate umem_odp initialization
  RDMA/odp: Make the three ways to create a umem_odp clear
  RDMA/odp: Split creating a umem_odp from ib_umem_get
  RDMA/odp: Provide ib_umem_odp_release() to undo the allocs
  RDMA/odp: Check for overflow when computing the umem_odp end
  RDMA/odp: Use kvcalloc for the dma_list and page_list
  RDMA/mlx5: Use ib_umem_start instead of umem.address
  RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr

Moni Shoua (1):
  RDMA/core: Make invalidate_range a device operation

 drivers/infiniband/Kconfig           |   1 +
 drivers/infiniband/core/device.c     |   1 +
 drivers/infiniband/core/umem.c       |  50 +--
 drivers/infiniband/core/umem_odp.c   | 448 +++++++++++++++------------
 drivers/infiniband/core/uverbs_cmd.c |   2 -
 drivers/infiniband/hw/mlx5/main.c    |   4 -
 drivers/infiniband/hw/mlx5/mem.c     |  13 -
 drivers/infiniband/hw/mlx5/mr.c      |  38 ++-
 drivers/infiniband/hw/mlx5/odp.c     |  88 +++---
 include/rdma/ib_umem_odp.h           |  48 ++-
 include/rdma/ib_verbs.h              |   4 +-
 11 files changed, 370 insertions(+), 327 deletions(-)

--
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 01/12] RDMA/odp: Use the common interval tree library instead of generic
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
@ 2019-08-19 11:16 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly Leon Romanovsky
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:16 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

ODP works with userspace VAs in the interval tree, which always fit
into an unsigned long, so we can use the common code.

This comes at the cost of a 16 byte increase in the ib_umem_odp struct
size due to storing the interval tree start/last in addition to the umem
addr/length. However, these values were already being computed and are
performance critical for the interval lookup, so this seems like a
worthwhile trade off.

Removes 2k of .text from the kernel.
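
For illustration (not part of the patch), a minimal sketch of the common
interval tree API this converts to; umem_odp, per_mm, addr and length are
taken from the surrounding code in the diff:

    /* The common tree stores closed intervals, so last = end - 1 */
    umem_odp->interval_tree.start = ib_umem_start(umem_odp);
    umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1;
    interval_tree_insert(&umem_odp->interval_tree, &per_mm->umem_tree);

    /* First umem intersecting the closed range [addr, addr + length - 1] */
    node = interval_tree_iter_first(&per_mm->umem_tree, addr,
                                    addr + length - 1);
    if (node)
        return container_of(node, struct ib_umem_odp, interval_tree);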

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/Kconfig         |  1 +
 drivers/infiniband/core/umem_odp.c | 72 ++++++++----------------------
 include/rdma/ib_umem_odp.h         | 20 +++++----
 3 files changed, 31 insertions(+), 62 deletions(-)

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 85e103b147cc..b44b1c322ec8 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -55,6 +55,7 @@ config INFINIBAND_ON_DEMAND_PAGING
 	bool "InfiniBand on-demand paging support"
 	depends on INFINIBAND_USER_MEM
 	select MMU_NOTIFIER
+	select INTERVAL_TREE
 	default y
 	---help---
 	  On demand paging support for the InfiniBand subsystem.
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 2a75c6f8d827..8358eb8e3a26 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -39,45 +39,13 @@
 #include <linux/export.h>
 #include <linux/vmalloc.h>
 #include <linux/hugetlb.h>
-#include <linux/interval_tree_generic.h>
+#include <linux/interval_tree.h>
 #include <linux/pagemap.h>
 
 #include <rdma/ib_verbs.h>
 #include <rdma/ib_umem.h>
 #include <rdma/ib_umem_odp.h>
 
-/*
- * The ib_umem list keeps track of memory regions for which the HW
- * device request to receive notification when the related memory
- * mapping is changed.
- *
- * ib_umem_lock protects the list.
- */
-
-static u64 node_start(struct umem_odp_node *n)
-{
-	struct ib_umem_odp *umem_odp =
-			container_of(n, struct ib_umem_odp, interval_tree);
-
-	return ib_umem_start(umem_odp);
-}
-
-/* Note that the representation of the intervals in the interval tree
- * considers the ending point as contained in the interval, while the
- * function ib_umem_end returns the first address which is not contained
- * in the umem.
- */
-static u64 node_last(struct umem_odp_node *n)
-{
-	struct ib_umem_odp *umem_odp =
-			container_of(n, struct ib_umem_odp, interval_tree);
-
-	return ib_umem_end(umem_odp) - 1;
-}
-
-INTERVAL_TREE_DEFINE(struct umem_odp_node, rb, u64, __subtree_last,
-		     node_start, node_last, static, rbt_ib_umem)
-
 static void ib_umem_notifier_start_account(struct ib_umem_odp *umem_odp)
 {
 	mutex_lock(&umem_odp->umem_mutex);
@@ -209,9 +177,18 @@ static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp)
 	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
 
 	down_write(&per_mm->umem_rwsem);
-	if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp)))
-		rbt_ib_umem_insert(&umem_odp->interval_tree,
-				   &per_mm->umem_tree);
+	if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp))) {
+		/*
+		 * Note that the representation of the intervals in the
+		 * interval tree considers the ending point as contained in
+		 * the interval, while the function ib_umem_end returns the
+		 * first address which is not contained in the umem.
+		 */
+		umem_odp->interval_tree.start = ib_umem_start(umem_odp);
+		umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1;
+		interval_tree_insert(&umem_odp->interval_tree,
+				     &per_mm->umem_tree);
+	}
 	up_write(&per_mm->umem_rwsem);
 }
 
@@ -221,8 +198,8 @@ static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp)
 
 	down_write(&per_mm->umem_rwsem);
 	if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp)))
-		rbt_ib_umem_remove(&umem_odp->interval_tree,
-				   &per_mm->umem_tree);
+		interval_tree_remove(&umem_odp->interval_tree,
+				     &per_mm->umem_tree);
 	complete_all(&umem_odp->notifier_completion);
 
 	up_write(&per_mm->umem_rwsem);
@@ -765,18 +742,18 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root,
 				  void *cookie)
 {
 	int ret_val = 0;
-	struct umem_odp_node *node, *next;
+	struct interval_tree_node *node, *next;
 	struct ib_umem_odp *umem;
 
 	if (unlikely(start == last))
 		return ret_val;
 
-	for (node = rbt_ib_umem_iter_first(root, start, last - 1);
+	for (node = interval_tree_iter_first(root, start, last - 1);
 			node; node = next) {
 		/* TODO move the blockable decision up to the callback */
 		if (!blockable)
 			return -EAGAIN;
-		next = rbt_ib_umem_iter_next(node, start, last - 1);
+		next = interval_tree_iter_next(node, start, last - 1);
 		umem = container_of(node, struct ib_umem_odp, interval_tree);
 		ret_val = cb(umem, start, last, cookie) || ret_val;
 	}
@@ -784,16 +761,3 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root,
 	return ret_val;
 }
 EXPORT_SYMBOL(rbt_ib_umem_for_each_in_range);
-
-struct ib_umem_odp *rbt_ib_umem_lookup(struct rb_root_cached *root,
-				       u64 addr, u64 length)
-{
-	struct umem_odp_node *node;
-
-	node = rbt_ib_umem_iter_first(root, addr, addr + length - 1);
-	if (node)
-		return container_of(node, struct ib_umem_odp, interval_tree);
-	return NULL;
-
-}
-EXPORT_SYMBOL(rbt_ib_umem_lookup);
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 479db5c98ff6..030d5cbad02c 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -37,11 +37,6 @@
 #include <rdma/ib_verbs.h>
 #include <linux/interval_tree.h>
 
-struct umem_odp_node {
-	u64 __subtree_last;
-	struct rb_node rb;
-};
-
 struct ib_umem_odp {
 	struct ib_umem umem;
 	struct ib_ucontext_per_mm *per_mm;
@@ -72,7 +67,7 @@ struct ib_umem_odp {
 	int npages;
 
 	/* Tree tracking */
-	struct umem_odp_node	interval_tree;
+	struct interval_tree_node interval_tree;
 
 	struct completion	notifier_completion;
 	int			dying;
@@ -163,8 +158,17 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root,
  * Find first region intersecting with address range.
  * Return NULL if not found
  */
-struct ib_umem_odp *rbt_ib_umem_lookup(struct rb_root_cached *root,
-				       u64 addr, u64 length);
+static inline struct ib_umem_odp *
+rbt_ib_umem_lookup(struct rb_root_cached *root, u64 addr, u64 length)
+{
+	struct interval_tree_node *node;
+
+	node = interval_tree_iter_first(root, addr, addr + length - 1);
+	if (!node)
+		return NULL;
+	return container_of(node, struct ib_umem_odp, interval_tree);
+
+}
 
 static inline int ib_umem_mmu_notifier_retry(struct ib_umem_odp *umem_odp,
 					     unsigned long mmu_seq)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
  2019-08-19 11:16 ` [PATCH rdma-next 01/12] RDMA/odp: Use the common interval tree library instead of generic Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-21 17:15   ` Jason Gunthorpe
  2019-08-19 11:17 ` [PATCH rdma-next 03/12] RDMA/odp: Make it clearer when a umem is an implicit ODP umem Leon Romanovsky
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

Instead of intersecting the interval tree with the widest possible range,
just iterate over every element directly. This is faster and clearer.
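
As a sketch (not part of the patch; callback/cookie stand in for the old
trampoline arguments), the query-everything idiom is replaced by a plain
in-order walk of the underlying rbtree:

    /* Before: intersect with the widest possible interval */
    rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, 0, ULLONG_MAX,
                                  callback, true, cookie);

    /* After: visit every node directly */
    struct rb_node *node;

    for (node = rb_first_cached(&per_mm->umem_tree); node;
         node = rb_next(node)) {
        struct ib_umem_odp *umem_odp =
            rb_entry(node, struct ib_umem_odp, interval_tree.rb);
        /* ... operate on umem_odp ... */
    }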

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem_odp.c | 51 ++++++++++++++++--------------
 drivers/infiniband/hw/mlx5/odp.c   | 41 +++++++++++-------------
 2 files changed, 47 insertions(+), 45 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 8358eb8e3a26..b9bebef00a33 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -72,35 +72,41 @@ static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp)
 	mutex_unlock(&umem_odp->umem_mutex);
 }
 
-static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
-					       u64 start, u64 end, void *cookie)
-{
-	/*
-	 * Increase the number of notifiers running, to
-	 * prevent any further fault handling on this MR.
-	 */
-	ib_umem_notifier_start_account(umem_odp);
-	umem_odp->dying = 1;
-	/* Make sure that the fact the umem is dying is out before we release
-	 * all pending page faults. */
-	smp_wmb();
-	complete_all(&umem_odp->notifier_completion);
-	umem_odp->umem.context->invalidate_range(
-		umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp));
-	return 0;
-}
-
 static void ib_umem_notifier_release(struct mmu_notifier *mn,
 				     struct mm_struct *mm)
 {
 	struct ib_ucontext_per_mm *per_mm =
 		container_of(mn, struct ib_ucontext_per_mm, mn);
+	struct rb_node *node;
 
 	down_read(&per_mm->umem_rwsem);
-	if (per_mm->active)
-		rbt_ib_umem_for_each_in_range(
-			&per_mm->umem_tree, 0, ULLONG_MAX,
-			ib_umem_notifier_release_trampoline, true, NULL);
+	if (!per_mm->active)
+		goto out;
+
+	for (node = rb_first_cached(&per_mm->umem_tree); node;
+	     node = rb_next(node)) {
+		struct ib_umem_odp *umem_odp =
+			rb_entry(node, struct ib_umem_odp, interval_tree.rb);
+
+		/*
+		 * Increase the number of notifiers running, to prevent any
+		 * further fault handling on this MR.
+		 */
+		ib_umem_notifier_start_account(umem_odp);
+
+		umem_odp->dying = 1;
+		/*
+		 * Make sure that the fact the umem is dying is out before we
+		 * release all pending page faults.
+		 */
+		smp_wmb();
+		complete_all(&umem_odp->notifier_completion);
+		umem_odp->umem.context->invalidate_range(
+			umem_odp, ib_umem_start(umem_odp),
+			ib_umem_end(umem_odp));
+	}
+
+out:
 	up_read(&per_mm->umem_rwsem);
 }
 
@@ -760,4 +766,3 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root,
 
 	return ret_val;
 }
-EXPORT_SYMBOL(rbt_ib_umem_for_each_in_range);
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index b0c5de39d186..3922fced41ec 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -539,34 +539,31 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd,
 	return imr;
 }
 
-static int mr_leaf_free(struct ib_umem_odp *umem_odp, u64 start, u64 end,
-			void *cookie)
+void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *imr)
 {
-	struct mlx5_ib_mr *mr = umem_odp->private, *imr = cookie;
-
-	if (mr->parent != imr)
-		return 0;
-
-	ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
-				    ib_umem_end(umem_odp));
+	struct ib_ucontext_per_mm *per_mm = mr_to_per_mm(imr);
+	struct rb_node *node;
 
-	if (umem_odp->dying)
-		return 0;
+	down_read(&per_mm->umem_rwsem);
+	for (node = rb_first_cached(&per_mm->umem_tree); node;
+	     node = rb_next(node)) {
+		struct ib_umem_odp *umem_odp =
+			rb_entry(node, struct ib_umem_odp, interval_tree.rb);
+		struct mlx5_ib_mr *mr = umem_odp->private;
 
-	WRITE_ONCE(umem_odp->dying, 1);
-	atomic_inc(&imr->num_leaf_free);
-	schedule_work(&umem_odp->work);
+		if (mr->parent != imr)
+			continue;
 
-	return 0;
-}
+		ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
+					    ib_umem_end(umem_odp));
 
-void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *imr)
-{
-	struct ib_ucontext_per_mm *per_mm = mr_to_per_mm(imr);
+		if (umem_odp->dying)
+			continue;
 
-	down_read(&per_mm->umem_rwsem);
-	rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, 0, ULLONG_MAX,
-				      mr_leaf_free, true, imr);
+		WRITE_ONCE(umem_odp->dying, 1);
+		atomic_inc(&imr->num_leaf_free);
+		schedule_work(&umem_odp->work);
+	}
 	up_read(&per_mm->umem_rwsem);
 
 	wait_event(imr->q_leaf_free, !atomic_read(&imr->num_leaf_free));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 03/12] RDMA/odp: Make it clearer when a umem is an implicit ODP umem
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
  2019-08-19 11:16 ` [PATCH rdma-next 01/12] RDMA/odp: Use the common interval tree library instead of generic Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 04/12] RDMA/odp: Consolidate umem_odp initialization Leon Romanovsky
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

Implicit ODP umems are special: they don't have any page lists, they don't
exist in the interval tree, and they are never DMA mapped.

Instead of trying to guess this based on a zero length, use an explicit
flag.

Further, do not allow non-implicit umems to have zero size.
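
A sketch of the resulting test (not part of the patch); previously this
was inferred from a NULL page_list or a zero length:

    /* e.g. in dereg_mr(): only real umems have mappings to tear down */
    if (!umem_odp->is_implicit_odp)
        mlx5_ib_invalidate_range(umem_odp, ib_umem_start(umem_odp),
                                 ib_umem_end(umem_odp));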

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem_odp.c | 54 +++++++++++++++++-------------
 drivers/infiniband/hw/mlx5/mr.c    |  2 +-
 drivers/infiniband/hw/mlx5/odp.c   |  2 +-
 include/rdma/ib_umem_odp.h         |  8 +++++
 4 files changed, 40 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index b9bebef00a33..2eb184a5374a 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -183,18 +183,15 @@ static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp)
 	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
 
 	down_write(&per_mm->umem_rwsem);
-	if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp))) {
-		/*
-		 * Note that the representation of the intervals in the
-		 * interval tree considers the ending point as contained in
-		 * the interval, while the function ib_umem_end returns the
-		 * first address which is not contained in the umem.
-		 */
-		umem_odp->interval_tree.start = ib_umem_start(umem_odp);
-		umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1;
-		interval_tree_insert(&umem_odp->interval_tree,
-				     &per_mm->umem_tree);
-	}
+	/*
+	 * Note that the representation of the intervals in the interval tree
+	 * considers the ending point as contained in the interval, while the
+	 * function ib_umem_end returns the first address which is not
+	 * contained in the umem.
+	 */
+	umem_odp->interval_tree.start = ib_umem_start(umem_odp);
+	umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1;
+	interval_tree_insert(&umem_odp->interval_tree, &per_mm->umem_tree);
 	up_write(&per_mm->umem_rwsem);
 }
 
@@ -203,11 +200,8 @@ static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp)
 	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
 
 	down_write(&per_mm->umem_rwsem);
-	if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp)))
-		interval_tree_remove(&umem_odp->interval_tree,
-				     &per_mm->umem_tree);
+	interval_tree_remove(&umem_odp->interval_tree, &per_mm->umem_tree);
 	complete_all(&umem_odp->notifier_completion);
-
 	up_write(&per_mm->umem_rwsem);
 }
 
@@ -327,6 +321,9 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root,
 	int pages = size >> PAGE_SHIFT;
 	int ret;
 
+	if (!size)
+		return ERR_PTR(-EINVAL);
+
 	odp_data = kzalloc(sizeof(*odp_data), GFP_KERNEL);
 	if (!odp_data)
 		return ERR_PTR(-ENOMEM);
@@ -388,6 +385,9 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 	struct mm_struct *mm = umem->owning_mm;
 	int ret_val;
 
+	if (umem_odp->umem.address == 0 && umem_odp->umem.length == 0)
+		umem_odp->is_implicit_odp = 1;
+
 	umem_odp->page_shift = PAGE_SHIFT;
 	if (access & IB_ACCESS_HUGETLB) {
 		struct vm_area_struct *vma;
@@ -408,7 +408,10 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 
 	init_completion(&umem_odp->notifier_completion);
 
-	if (ib_umem_odp_num_pages(umem_odp)) {
+	if (!umem_odp->is_implicit_odp) {
+		if (!ib_umem_odp_num_pages(umem_odp))
+			return -EINVAL;
+
 		umem_odp->page_list =
 			vzalloc(array_size(sizeof(*umem_odp->page_list),
 					   ib_umem_odp_num_pages(umem_odp)));
@@ -427,7 +430,9 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 	ret_val = get_per_mm(umem_odp);
 	if (ret_val)
 		goto out_dma_list;
-	add_umem_to_per_mm(umem_odp);
+
+	if (!umem_odp->is_implicit_odp)
+		add_umem_to_per_mm(umem_odp);
 
 	return 0;
 
@@ -446,13 +451,14 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
 	 * It is the driver's responsibility to ensure, before calling us,
 	 * that the hardware will not attempt to access the MR any more.
 	 */
-	ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
-				    ib_umem_end(umem_odp));
-
-	remove_umem_from_per_mm(umem_odp);
+	if (!umem_odp->is_implicit_odp) {
+		ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
+					    ib_umem_end(umem_odp));
+		remove_umem_from_per_mm(umem_odp);
+		vfree(umem_odp->dma_list);
+		vfree(umem_odp->page_list);
+	}
 	put_per_mm(umem_odp);
-	vfree(umem_odp->dma_list);
-	vfree(umem_odp->page_list);
 }
 
 /*
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 2c77456f359f..e0015b612ffd 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1609,7 +1609,7 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
 		/* Wait for all running page-fault handlers to finish. */
 		synchronize_srcu(&dev->mr_srcu);
 		/* Destroy all page mappings */
-		if (umem_odp->page_list)
+		if (!umem_odp->is_implicit_odp)
 			mlx5_ib_invalidate_range(umem_odp,
 						 ib_umem_start(umem_odp),
 						 ib_umem_end(umem_odp));
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 3922fced41ec..5b6b2afa26a6 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -585,7 +585,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 	struct ib_umem_odp *odp;
 	size_t size;
 
-	if (!odp_mr->page_list) {
+	if (odp_mr->is_implicit_odp) {
 		odp = implicit_mr_get_data(mr, io_virt, bcnt);
 
 		if (IS_ERR(odp))
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 030d5cbad02c..14b38b4459c5 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -69,6 +69,14 @@ struct ib_umem_odp {
 	/* Tree tracking */
 	struct interval_tree_node interval_tree;
 
+	/*
+	 * An implicit odp umem cannot be DMA mapped, has 0 length, and serves
+	 * only as an anchor for the driver to hold onto the per_mm. FIXME:
+	 * This should be removed and drivers should work with the per_mm
+	 * directly.
+	 */
+	bool is_implicit_odp;
+
 	struct completion	notifier_completion;
 	int			dying;
 	unsigned int		page_shift;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 04/12] RDMA/odp: Consolidate umem_odp initialization
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (2 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 03/12] RDMA/odp: Make it clearer when a umem is an implicit ODP umem Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 05/12] RDMA/odp: Make the three ways to create a umem_odp clear Leon Romanovsky
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

This is done in two different places; consolidate all the post-allocation
initialization into a single function.
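
Roughly (a sketch, not part of the patch), both allocation paths now
funnel through one initializer; passing a NULL per_mm asks it to look up
or create the per-mm tracking structure, while the child path passes
root->per_mm:

    odp_data = kzalloc(sizeof(*odp_data), GFP_KERNEL);
    if (!odp_data)
        return ERR_PTR(-ENOMEM);
    /* ... fill in umem.context, address, length, page_shift ... */

    ret = ib_init_umem_odp(odp_data, root ? root->per_mm : NULL);
    if (ret) {
        kfree(odp_data);
        return ERR_PTR(ret);
    }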

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem_odp.c | 200 +++++++++++++----------------
 1 file changed, 86 insertions(+), 114 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 2eb184a5374a..487a6371a053 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -178,23 +178,6 @@ static const struct mmu_notifier_ops ib_umem_notifiers = {
 	.invalidate_range_end       = ib_umem_notifier_invalidate_range_end,
 };
 
-static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp)
-{
-	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
-
-	down_write(&per_mm->umem_rwsem);
-	/*
-	 * Note that the representation of the intervals in the interval tree
-	 * considers the ending point as contained in the interval, while the
-	 * function ib_umem_end returns the first address which is not
-	 * contained in the umem.
-	 */
-	umem_odp->interval_tree.start = ib_umem_start(umem_odp);
-	umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1;
-	interval_tree_insert(&umem_odp->interval_tree, &per_mm->umem_tree);
-	up_write(&per_mm->umem_rwsem);
-}
-
 static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp)
 {
 	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
@@ -244,33 +227,23 @@ static struct ib_ucontext_per_mm *alloc_per_mm(struct ib_ucontext *ctx,
 	return ERR_PTR(ret);
 }
 
-static int get_per_mm(struct ib_umem_odp *umem_odp)
+static struct ib_ucontext_per_mm *get_per_mm(struct ib_umem_odp *umem_odp)
 {
 	struct ib_ucontext *ctx = umem_odp->umem.context;
 	struct ib_ucontext_per_mm *per_mm;
 
+	lockdep_assert_held(&ctx->per_mm_list_lock);
+
 	/*
 	 * Generally speaking we expect only one or two per_mm in this list,
 	 * so no reason to optimize this search today.
 	 */
-	mutex_lock(&ctx->per_mm_list_lock);
 	list_for_each_entry(per_mm, &ctx->per_mm_list, ucontext_list) {
 		if (per_mm->mm == umem_odp->umem.owning_mm)
-			goto found;
-	}
-
-	per_mm = alloc_per_mm(ctx, umem_odp->umem.owning_mm);
-	if (IS_ERR(per_mm)) {
-		mutex_unlock(&ctx->per_mm_list_lock);
-		return PTR_ERR(per_mm);
+			return per_mm;
 	}
 
-found:
-	umem_odp->per_mm = per_mm;
-	per_mm->odp_mrs_count++;
-	mutex_unlock(&ctx->per_mm_list_lock);
-
-	return 0;
+	return alloc_per_mm(ctx, umem_odp->umem.owning_mm);
 }
 
 static void free_per_mm(struct rcu_head *rcu)
@@ -311,79 +284,114 @@ static void put_per_mm(struct ib_umem_odp *umem_odp)
 	mmu_notifier_call_srcu(&per_mm->rcu, free_per_mm);
 }
 
+static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
+				   struct ib_ucontext_per_mm *per_mm)
+{
+	struct ib_ucontext *ctx = umem_odp->umem.context;
+	int ret;
+
+	umem_odp->umem.is_odp = 1;
+	if (!umem_odp->is_implicit_odp) {
+		size_t pages = ib_umem_odp_num_pages(umem_odp);
+
+		if (!pages)
+			return -EINVAL;
+
+		/*
+		 * Note that the representation of the intervals in the
+		 * interval tree considers the ending point as contained in
+		 * the interval, while the function ib_umem_end returns the
+		 * first address which is not contained in the umem.
+		 */
+		umem_odp->interval_tree.start = ib_umem_start(umem_odp);
+		umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1;
+
+		umem_odp->page_list = vzalloc(
+			array_size(sizeof(*umem_odp->page_list), pages));
+		if (!umem_odp->page_list)
+			return -ENOMEM;
+
+		umem_odp->dma_list =
+			vzalloc(array_size(sizeof(*umem_odp->dma_list), pages));
+		if (!umem_odp->dma_list) {
+			ret = -ENOMEM;
+			goto out_page_list;
+		}
+	}
+
+	mutex_lock(&ctx->per_mm_list_lock);
+	if (!per_mm) {
+		per_mm = get_per_mm(umem_odp);
+		if (IS_ERR(per_mm)) {
+			ret = PTR_ERR(per_mm);
+			goto out_unlock;
+		}
+	}
+	umem_odp->per_mm = per_mm;
+	per_mm->odp_mrs_count++;
+	mutex_unlock(&ctx->per_mm_list_lock);
+
+	mutex_init(&umem_odp->umem_mutex);
+	init_completion(&umem_odp->notifier_completion);
+
+	if (!umem_odp->is_implicit_odp) {
+		down_write(&per_mm->umem_rwsem);
+		interval_tree_insert(&umem_odp->interval_tree,
+				     &per_mm->umem_tree);
+		up_write(&per_mm->umem_rwsem);
+	}
+
+	return 0;
+
+out_unlock:
+	mutex_unlock(&ctx->per_mm_list_lock);
+	vfree(umem_odp->dma_list);
+out_page_list:
+	vfree(umem_odp->page_list);
+	return ret;
+}
+
 struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root,
 				      unsigned long addr, size_t size)
 {
-	struct ib_ucontext_per_mm *per_mm = root->per_mm;
-	struct ib_ucontext *ctx = per_mm->context;
+	/*
+	 * Caller must ensure that root cannot be freed during the call to
+	 * ib_alloc_odp_umem.
+	 */
 	struct ib_umem_odp *odp_data;
 	struct ib_umem *umem;
-	int pages = size >> PAGE_SHIFT;
 	int ret;
 
-	if (!size)
-		return ERR_PTR(-EINVAL);
-
 	odp_data = kzalloc(sizeof(*odp_data), GFP_KERNEL);
 	if (!odp_data)
 		return ERR_PTR(-ENOMEM);
 	umem = &odp_data->umem;
-	umem->context    = ctx;
+	umem->context    = root->umem.context;
 	umem->length     = size;
 	umem->address    = addr;
-	odp_data->page_shift = PAGE_SHIFT;
 	umem->writable   = root->umem.writable;
-	umem->is_odp = 1;
-	odp_data->per_mm = per_mm;
-	umem->owning_mm  = per_mm->mm;
-	mmgrab(umem->owning_mm);
-
-	mutex_init(&odp_data->umem_mutex);
-	init_completion(&odp_data->notifier_completion);
-
-	odp_data->page_list =
-		vzalloc(array_size(pages, sizeof(*odp_data->page_list)));
-	if (!odp_data->page_list) {
-		ret = -ENOMEM;
-		goto out_odp_data;
-	}
+	umem->owning_mm  = root->umem.owning_mm;
+	odp_data->page_shift = PAGE_SHIFT;
 
-	odp_data->dma_list =
-		vzalloc(array_size(pages, sizeof(*odp_data->dma_list)));
-	if (!odp_data->dma_list) {
-		ret = -ENOMEM;
-		goto out_page_list;
+	ret = ib_init_umem_odp(odp_data, root->per_mm);
+	if (ret) {
+		kfree(odp_data);
+		return ERR_PTR(ret);
 	}
 
-	/*
-	 * Caller must ensure that the umem_odp that the per_mm came from
-	 * cannot be freed during the call to ib_alloc_odp_umem.
-	 */
-	mutex_lock(&ctx->per_mm_list_lock);
-	per_mm->odp_mrs_count++;
-	mutex_unlock(&ctx->per_mm_list_lock);
-	add_umem_to_per_mm(odp_data);
+	mmgrab(umem->owning_mm);
 
 	return odp_data;
-
-out_page_list:
-	vfree(odp_data->page_list);
-out_odp_data:
-	mmdrop(umem->owning_mm);
-	kfree(odp_data);
-	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL(ib_alloc_odp_umem);
 
 int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 {
-	struct ib_umem *umem = &umem_odp->umem;
 	/*
 	 * NOTE: This must called in a process context where umem->owning_mm
 	 * == current->mm
 	 */
-	struct mm_struct *mm = umem->owning_mm;
-	int ret_val;
+	struct mm_struct *mm = umem_odp->umem.owning_mm;
 
 	if (umem_odp->umem.address == 0 && umem_odp->umem.length == 0)
 		umem_odp->is_implicit_odp = 1;
@@ -404,43 +412,7 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 		up_read(&mm->mmap_sem);
 	}
 
-	mutex_init(&umem_odp->umem_mutex);
-
-	init_completion(&umem_odp->notifier_completion);
-
-	if (!umem_odp->is_implicit_odp) {
-		if (!ib_umem_odp_num_pages(umem_odp))
-			return -EINVAL;
-
-		umem_odp->page_list =
-			vzalloc(array_size(sizeof(*umem_odp->page_list),
-					   ib_umem_odp_num_pages(umem_odp)));
-		if (!umem_odp->page_list)
-			return -ENOMEM;
-
-		umem_odp->dma_list =
-			vzalloc(array_size(sizeof(*umem_odp->dma_list),
-					   ib_umem_odp_num_pages(umem_odp)));
-		if (!umem_odp->dma_list) {
-			ret_val = -ENOMEM;
-			goto out_page_list;
-		}
-	}
-
-	ret_val = get_per_mm(umem_odp);
-	if (ret_val)
-		goto out_dma_list;
-
-	if (!umem_odp->is_implicit_odp)
-		add_umem_to_per_mm(umem_odp);
-
-	return 0;
-
-out_dma_list:
-	vfree(umem_odp->dma_list);
-out_page_list:
-	vfree(umem_odp->page_list);
-	return ret_val;
+	return ib_init_umem_odp(umem_odp, NULL);
 }
 
 void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 05/12] RDMA/odp: Make the three ways to create a umem_odp clear
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (3 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 04/12] RDMA/odp: Consolidate umem_odp initialization Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 06/12] RDMA/odp: Split creating a umem_odp from ib_umem_get Leon Romanovsky
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

The three paths to build a umem_odp are somewhat muddled. They are:
- As a normal ib_mr umem
- As a child in an implicit ODP umem tree
- As the root of an implicit ODP umem tree

Only the first two are actually umems; the last is an abuse.

The implicit case can only be triggered by an explicit driver request; it
should never be commingled with the normal case. While we are here, pick
sensible function names and add some comments to make this clearer.
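
In mlx5 terms (a sketch using the calls introduced in the diff below),
the implicit tree is now built only through the dedicated allocators:

    /* Root of an implicit tree: no VA range, anchors the per_mm */
    root = ib_umem_odp_alloc_implicit(udata, access_flags);

    /* Child leaf under that root, covering one MTT-sized VA chunk */
    child = ib_umem_odp_alloc_child(root, addr, MLX5_IMR_MTT_SIZE);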

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem_odp.c | 80 +++++++++++++++++++++++++++---
 drivers/infiniband/hw/mlx5/odp.c   | 23 ++++-----
 include/rdma/ib_umem_odp.h         |  6 ++-
 3 files changed, 89 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 487a6371a053..9b1f779493e9 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -46,6 +46,8 @@
 #include <rdma/ib_umem.h>
 #include <rdma/ib_umem_odp.h>
 
+#include "uverbs.h"
+
 static void ib_umem_notifier_start_account(struct ib_umem_odp *umem_odp)
 {
 	mutex_lock(&umem_odp->umem_mutex);
@@ -351,8 +353,67 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 	return ret;
 }
 
-struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root,
-				      unsigned long addr, size_t size)
+/**
+ * ib_umem_odp_alloc_implicit - Allocate a parent implicit ODP umem
+ *
+ * Implicit ODP umems do not have a VA range and do not have any page lists.
+ * They exist only to hold the per_mm reference to help the driver create
+ * children umems.
+ *
+ * @udata: udata from the syscall being used to create the umem
+ * @access: ib_reg_mr access flags
+ */
+struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata,
+					       int access)
+{
+	struct ib_ucontext *context =
+		container_of(udata, struct uverbs_attr_bundle, driver_udata)
+			->context;
+	struct ib_umem *umem;
+	struct ib_umem_odp *umem_odp;
+	int ret;
+
+	if (access & IB_ACCESS_HUGETLB)
+		return ERR_PTR(-EINVAL);
+
+	if (!context)
+		return ERR_PTR(-EIO);
+	if (WARN_ON_ONCE(!context->invalidate_range))
+		return ERR_PTR(-EINVAL);
+
+	umem_odp = kzalloc(sizeof(*umem_odp), GFP_KERNEL);
+	if (!umem_odp)
+		return ERR_PTR(-ENOMEM);
+	umem = &umem_odp->umem;
+	umem->context = context;
+	umem->writable = ib_access_writable(access);
+	umem->owning_mm = current->mm;
+	umem_odp->is_implicit_odp = 1;
+	umem_odp->page_shift = PAGE_SHIFT;
+
+	ret = ib_init_umem_odp(umem_odp, NULL);
+	if (ret) {
+		kfree(umem_odp);
+		return ERR_PTR(ret);
+	}
+
+	mmgrab(umem->owning_mm);
+
+	return umem_odp;
+}
+EXPORT_SYMBOL(ib_umem_odp_alloc_implicit);
+
+/**
+ * ib_umem_odp_alloc_child - Allocate a child ODP umem under an implicit
+ *                           parent ODP umem
+ *
+ * @root: The parent umem enclosing the child. This must be allocated using
+ *        ib_umem_odp_alloc_implicit()
+ * @addr: The starting userspace VA
+ * @size: The length of the userspace VA
+ */
+struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root,
+					    unsigned long addr, size_t size)
 {
 	/*
 	 * Caller must ensure that root cannot be freed during the call to
@@ -362,6 +423,9 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root,
 	struct ib_umem *umem;
 	int ret;
 
+	if (WARN_ON(!root->is_implicit_odp))
+		return ERR_PTR(-EINVAL);
+
 	odp_data = kzalloc(sizeof(*odp_data), GFP_KERNEL);
 	if (!odp_data)
 		return ERR_PTR(-ENOMEM);
@@ -383,8 +447,15 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root,
 
 	return odp_data;
 }
-EXPORT_SYMBOL(ib_alloc_odp_umem);
+EXPORT_SYMBOL(ib_umem_odp_alloc_child);
 
+/**
+ * ib_umem_odp_get - Complete ib_umem_get()
+ *
+ * @umem_odp: The partially configured umem from ib_umem_get()
+ * @addr: The starting userspace VA
+ * @access: ib_reg_mr access flags
+ */
 int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 {
 	/*
@@ -393,9 +464,6 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 	 */
 	struct mm_struct *mm = umem_odp->umem.owning_mm;
 
-	if (umem_odp->umem.address == 0 && umem_odp->umem.length == 0)
-		umem_odp->is_implicit_odp = 1;
-
 	umem_odp->page_shift = PAGE_SHIFT;
 	if (access & IB_ACCESS_HUGETLB) {
 		struct vm_area_struct *vma;
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 5b6b2afa26a6..4371fc759c23 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -384,7 +384,7 @@ static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev,
 }
 
 static struct mlx5_ib_mr *implicit_mr_alloc(struct ib_pd *pd,
-					    struct ib_umem *umem,
+					    struct ib_umem_odp *umem_odp,
 					    bool ksm, int access_flags)
 {
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
@@ -402,7 +402,7 @@ static struct mlx5_ib_mr *implicit_mr_alloc(struct ib_pd *pd,
 	mr->dev = dev;
 	mr->access_flags = access_flags;
 	mr->mmkey.iova = 0;
-	mr->umem = umem;
+	mr->umem = &umem_odp->umem;
 
 	if (ksm) {
 		err = mlx5_ib_update_xlt(mr, 0,
@@ -462,14 +462,13 @@ static struct ib_umem_odp *implicit_mr_get_data(struct mlx5_ib_mr *mr,
 		if (nentries)
 			nentries++;
 	} else {
-		odp = ib_alloc_odp_umem(odp_mr, addr,
-					MLX5_IMR_MTT_SIZE);
+		odp = ib_umem_odp_alloc_child(odp_mr, addr, MLX5_IMR_MTT_SIZE);
 		if (IS_ERR(odp)) {
 			mutex_unlock(&odp_mr->umem_mutex);
 			return ERR_CAST(odp);
 		}
 
-		mtt = implicit_mr_alloc(mr->ibmr.pd, &odp->umem, 0,
+		mtt = implicit_mr_alloc(mr->ibmr.pd, odp, 0,
 					mr->access_flags);
 		if (IS_ERR(mtt)) {
 			mutex_unlock(&odp_mr->umem_mutex);
@@ -519,19 +518,19 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd,
 					     int access_flags)
 {
 	struct mlx5_ib_mr *imr;
-	struct ib_umem *umem;
+	struct ib_umem_odp *umem_odp;
 
-	umem = ib_umem_get(udata, 0, 0, access_flags, 0);
-	if (IS_ERR(umem))
-		return ERR_CAST(umem);
+	umem_odp = ib_umem_odp_alloc_implicit(udata, access_flags);
+	if (IS_ERR(umem_odp))
+		return ERR_CAST(umem_odp);
 
-	imr = implicit_mr_alloc(&pd->ibpd, umem, 1, access_flags);
+	imr = implicit_mr_alloc(&pd->ibpd, umem_odp, 1, access_flags);
 	if (IS_ERR(imr)) {
-		ib_umem_release(umem);
+		ib_umem_release(&umem_odp->umem);
 		return ERR_CAST(imr);
 	}
 
-	imr->umem = umem;
+	imr->umem = &umem_odp->umem;
 	init_waitqueue_head(&imr->q_leaf_free);
 	atomic_set(&imr->num_leaf_free, 0);
 	atomic_set(&imr->num_pending_prefetch, 0);
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 14b38b4459c5..219fe7015e7d 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -140,8 +140,10 @@ struct ib_ucontext_per_mm {
 };
 
 int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access);
-struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root_umem,
-				      unsigned long addr, size_t size);
+struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata,
+					       int access);
+struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root_umem,
+					    unsigned long addr, size_t size);
 void ib_umem_odp_release(struct ib_umem_odp *umem_odp);
 
 int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 start_offset,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 06/12] RDMA/odp: Split creating a umem_odp from ib_umem_get
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (4 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 05/12] RDMA/odp: Make the three ways to create a umem_odp clear Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 07/12] RDMA/odp: Provide ib_umem_odp_release() to undo the allocs Leon Romanovsky
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

This is the last creation API that is overloaded for both cases. There is
very little code sharing, and a driver has to be specifically ready for a
umem_odp to be created in order to use the ODP version.
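
After the split, a driver dispatches explicitly (a sketch mirroring the
mlx5 mr_umem_get() hunk below):

    if (access_flags & IB_ACCESS_ON_DEMAND) {
        struct ib_umem_odp *odp;

        odp = ib_umem_odp_get(udata, start, length, access_flags);
        if (IS_ERR(odp))
            return PTR_ERR(odp);
        u = &odp->umem;
    } else {
        u = ib_umem_get(udata, start, length, access_flags, 0);
        if (IS_ERR(u))
            return PTR_ERR(u);
    }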

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem.c     | 30 +++----------
 drivers/infiniband/core/umem_odp.c | 67 ++++++++++++++++++++++--------
 drivers/infiniband/hw/mlx5/mem.c   | 13 ------
 drivers/infiniband/hw/mlx5/mr.c    | 34 +++++++++++----
 include/rdma/ib_umem_odp.h         |  9 ++--
 5 files changed, 86 insertions(+), 67 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index f3bedbb7c4ab..ac7376401965 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -184,9 +184,6 @@ EXPORT_SYMBOL(ib_umem_find_best_pgsz);
 /**
  * ib_umem_get - Pin and DMA map userspace memory.
  *
- * If access flags indicate ODP memory, avoid pinning. Instead, stores
- * the mm for future page fault handling in conjunction with MMU notifiers.
- *
  * @udata: userspace context to pin memory for
  * @addr: userspace virtual address to start at
  * @size: length of region to pin
@@ -231,17 +228,12 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
 	if (!can_do_mlock())
 		return ERR_PTR(-EPERM);
 
-	if (access & IB_ACCESS_ON_DEMAND) {
-		umem = kzalloc(sizeof(struct ib_umem_odp), GFP_KERNEL);
-		if (!umem)
-			return ERR_PTR(-ENOMEM);
-		umem->is_odp = 1;
-	} else {
-		umem = kzalloc(sizeof(*umem), GFP_KERNEL);
-		if (!umem)
-			return ERR_PTR(-ENOMEM);
-	}
+	if (access & IB_ACCESS_ON_DEMAND)
+		return ERR_PTR(-EOPNOTSUPP);
 
+	umem = kzalloc(sizeof(*umem), GFP_KERNEL);
+	if (!umem)
+		return ERR_PTR(-ENOMEM);
 	umem->context    = context;
 	umem->length     = size;
 	umem->address    = addr;
@@ -249,18 +241,6 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
 	umem->owning_mm = mm = current->mm;
 	mmgrab(mm);
 
-	if (access & IB_ACCESS_ON_DEMAND) {
-		if (WARN_ON_ONCE(!context->invalidate_range)) {
-			ret = -EINVAL;
-			goto umem_kfree;
-		}
-
-		ret = ib_umem_odp_get(to_ib_umem_odp(umem), access);
-		if (ret)
-			goto umem_kfree;
-		return umem;
-	}
-
 	page_list = (struct page **) __get_free_page(GFP_KERNEL);
 	if (!page_list) {
 		ret = -ENOMEM;
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 9b1f779493e9..79995766316a 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -342,6 +342,7 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 				     &per_mm->umem_tree);
 		up_write(&per_mm->umem_rwsem);
 	}
+	mmgrab(umem_odp->umem.owning_mm);
 
 	return 0;
 
@@ -396,9 +397,6 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata,
 		kfree(umem_odp);
 		return ERR_PTR(ret);
 	}
-
-	mmgrab(umem->owning_mm);
-
 	return umem_odp;
 }
 EXPORT_SYMBOL(ib_umem_odp_alloc_implicit);
@@ -442,27 +440,51 @@ struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root,
 		kfree(odp_data);
 		return ERR_PTR(ret);
 	}
-
-	mmgrab(umem->owning_mm);
-
 	return odp_data;
 }
 EXPORT_SYMBOL(ib_umem_odp_alloc_child);
 
 /**
- * ib_umem_odp_get - Complete ib_umem_get()
+ * ib_umem_odp_get - Create a umem_odp for a userspace va
  *
- * @umem_odp: The partially configured umem from ib_umem_get()
- * @addr: The starting userspace VA
- * @access: ib_reg_mr access flags
+ * @udata: userspace context to pin memory for
+ * @addr: userspace virtual address to start at
+ * @size: length of region to pin
+ * @access: IB_ACCESS_xxx flags for memory being pinned
+ *
+ * The driver should use this when the access flags indicate ODP memory. It
+ * avoids pinning; instead, it stores the mm for future page fault handling
+ * in conjunction with MMU notifiers.
  */
-int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
+struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata, unsigned long addr,
+				    size_t size, int access)
 {
-	/*
-	 * NOTE: This must called in a process context where umem->owning_mm
-	 * == current->mm
-	 */
-	struct mm_struct *mm = umem_odp->umem.owning_mm;
+	struct ib_umem_odp *umem_odp;
+	struct ib_ucontext *context;
+	struct mm_struct *mm;
+	int ret;
+
+	if (!udata)
+		return ERR_PTR(-EIO);
+
+	context = container_of(udata, struct uverbs_attr_bundle, driver_udata)
+			  ->context;
+	if (!context)
+		return ERR_PTR(-EIO);
+
+	if (WARN_ON_ONCE(!(access & IB_ACCESS_ON_DEMAND)) ||
+	    WARN_ON_ONCE(!context->invalidate_range))
+		return ERR_PTR(-EINVAL);
+
+	umem_odp = kzalloc(sizeof(struct ib_umem_odp), GFP_KERNEL);
+	if (!umem_odp)
+		return ERR_PTR(-ENOMEM);
+
+	umem_odp->umem.context = context;
+	umem_odp->umem.length = size;
+	umem_odp->umem.address = addr;
+	umem_odp->umem.writable = ib_access_writable(access);
+	umem_odp->umem.owning_mm = mm = current->mm;
 
 	umem_odp->page_shift = PAGE_SHIFT;
 	if (access & IB_ACCESS_HUGETLB) {
@@ -473,15 +495,24 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 		vma = find_vma(mm, ib_umem_start(umem_odp));
 		if (!vma || !is_vm_hugetlb_page(vma)) {
 			up_read(&mm->mmap_sem);
-			return -EINVAL;
+			ret = -EINVAL;
+			goto err_free;
 		}
 		h = hstate_vma(vma);
 		umem_odp->page_shift = huge_page_shift(h);
 		up_read(&mm->mmap_sem);
 	}
 
-	return ib_init_umem_odp(umem_odp, NULL);
+	ret = ib_init_umem_odp(umem_odp, NULL);
+	if (ret)
+		goto err_free;
+	return umem_odp;
+
+err_free:
+	kfree(umem_odp);
+	return ERR_PTR(ret);
 }
+EXPORT_SYMBOL(ib_umem_odp_get);
 
 void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
 {
diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
index a40e0abf2338..b5aece786b36 100644
--- a/drivers/infiniband/hw/mlx5/mem.c
+++ b/drivers/infiniband/hw/mlx5/mem.c
@@ -56,19 +56,6 @@ void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr,
 	struct scatterlist *sg;
 	int entry;
 
-	if (umem->is_odp) {
-		struct ib_umem_odp *odp = to_ib_umem_odp(umem);
-		unsigned int page_shift = odp->page_shift;
-
-		*ncont = ib_umem_odp_num_pages(odp);
-		*count = *ncont << (page_shift - PAGE_SHIFT);
-		*shift = page_shift;
-		if (order)
-			*order = ilog2(roundup_pow_of_two(*ncont));
-
-		return;
-	}
-
 	addr = addr >> PAGE_SHIFT;
 	tmp = (unsigned long)addr;
 	m = find_first_bit(&tmp, BITS_PER_LONG);
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index e0015b612ffd..c9690d3cfb5c 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -794,19 +794,37 @@ static int mr_umem_get(struct mlx5_ib_dev *dev, struct ib_udata *udata,
 		       int *ncont, int *order)
 {
 	struct ib_umem *u;
-	int err;
 
 	*umem = NULL;
 
-	u = ib_umem_get(udata, start, length, access_flags, 0);
-	err = PTR_ERR_OR_ZERO(u);
-	if (err) {
-		mlx5_ib_dbg(dev, "umem get failed (%d)\n", err);
-		return err;
+	if (access_flags & IB_ACCESS_ON_DEMAND) {
+		struct ib_umem_odp *odp;
+
+		odp = ib_umem_odp_get(udata, start, length, access_flags);
+		if (IS_ERR(odp)) {
+			mlx5_ib_dbg(dev, "umem get failed (%ld)\n",
+				    PTR_ERR(odp));
+			return PTR_ERR(odp);
+		}
+
+		u = &odp->umem;
+
+		*page_shift = odp->page_shift;
+		*ncont = ib_umem_odp_num_pages(odp);
+		*npages = *ncont << (*page_shift - PAGE_SHIFT);
+		if (order)
+			*order = ilog2(roundup_pow_of_two(*ncont));
+	} else {
+		u = ib_umem_get(udata, start, length, access_flags, 0);
+		if (IS_ERR(u)) {
+			mlx5_ib_dbg(dev, "umem get failed (%ld)\n", PTR_ERR(u));
+			return PTR_ERR(u);
+		}
+
+		mlx5_ib_cont_pages(u, start, MLX5_MKEY_PAGE_SHIFT_MASK, npages,
+				   page_shift, ncont, order);
 	}
 
-	mlx5_ib_cont_pages(u, start, MLX5_MKEY_PAGE_SHIFT_MASK, npages,
-			   page_shift, ncont, order);
 	if (!*npages) {
 		mlx5_ib_warn(dev, "avoid zero region\n");
 		ib_umem_release(u);
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 219fe7015e7d..5efb67f97b0a 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -139,7 +139,8 @@ struct ib_ucontext_per_mm {
 	struct rcu_head rcu;
 };
 
-int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access);
+struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata, unsigned long addr,
+				    size_t size, int access);
 struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata,
 					       int access);
 struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root_umem,
@@ -199,9 +200,11 @@ static inline int ib_umem_mmu_notifier_retry(struct ib_umem_odp *umem_odp,
 
 #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 
-static inline int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
+static inline struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata,
+						  unsigned long addr,
+						  size_t size, int access)
 {
-	return -EINVAL;
+	return ERR_PTR(-EINVAL);
 }
 
 static inline void ib_umem_odp_release(struct ib_umem_odp *umem_odp) {}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 07/12] RDMA/odp: Provide ib_umem_odp_release() to undo the allocs
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (5 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 06/12] RDMA/odp: Split creating a umem_odp from ib_umem_get Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end Leon Romanovsky
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

Now that there are allocator APIs that return the ib_umem_odp directly,
it should be freed through a matching umem_odp release function as well.
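
The creation and destruction calls now pair up (a sketch, not part of
the patch):

    odp = ib_umem_odp_get(udata, start, length, access_flags);
    /* ... use the umem ... */
    ib_umem_odp_release(odp);   /* instead of ib_umem_release(&odp->umem) */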

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem.c     | 20 ++++----------------
 drivers/infiniband/core/umem_odp.c |  3 +++
 drivers/infiniband/hw/mlx5/mr.c    |  2 +-
 drivers/infiniband/hw/mlx5/odp.c   |  6 +++---
 4 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index ac7376401965..312289f84987 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -326,15 +326,6 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
 }
 EXPORT_SYMBOL(ib_umem_get);
 
-static void __ib_umem_release_tail(struct ib_umem *umem)
-{
-	mmdrop(umem->owning_mm);
-	if (umem->is_odp)
-		kfree(to_ib_umem_odp(umem));
-	else
-		kfree(umem);
-}
-
 /**
  * ib_umem_release - release memory pinned with ib_umem_get
  * @umem: umem struct to release
@@ -343,17 +334,14 @@ void ib_umem_release(struct ib_umem *umem)
 {
 	if (!umem)
 		return;
-
-	if (umem->is_odp) {
-		ib_umem_odp_release(to_ib_umem_odp(umem));
-		__ib_umem_release_tail(umem);
-		return;
-	}
+	if (umem->is_odp)
+		return ib_umem_odp_release(to_ib_umem_odp(umem));
 
 	__ib_umem_release(umem->context->device, umem, 1);
 
 	atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm);
-	__ib_umem_release_tail(umem);
+	mmdrop(umem->owning_mm);
+	kfree(umem);
 }
 EXPORT_SYMBOL(ib_umem_release);
 
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 79995766316a..2575dd783196 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -530,7 +530,10 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
 		vfree(umem_odp->page_list);
 	}
 	put_per_mm(umem_odp);
+	mmdrop(umem_odp->umem.owning_mm);
+	kfree(umem_odp);
 }
+EXPORT_SYMBOL(ib_umem_odp_release);
 
 /*
  * Map for DMA and insert a single page into the on-demand paging page tables.
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index c9690d3cfb5c..aa0299662c05 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1638,7 +1638,7 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
 		 * so that there will not be any invalidations in
 		 * flight, looking at the *mr struct.
 		 */
-		ib_umem_release(umem);
+		ib_umem_odp_release(umem_odp);
 		atomic_sub(npages, &dev->mdev->priv.reg_pages);
 
 		/* Avoid double-freeing the umem. */
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 4371fc759c23..ad5d5f2c8509 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -206,7 +206,7 @@ static void mr_leaf_free_action(struct work_struct *work)
 	mr->parent = NULL;
 	synchronize_srcu(&mr->dev->mr_srcu);
 
-	ib_umem_release(&odp->umem);
+	ib_umem_odp_release(odp);
 	if (imr->live)
 		mlx5_ib_update_xlt(imr, idx, 1, 0,
 				   MLX5_IB_UPD_XLT_INDIRECT |
@@ -472,7 +472,7 @@ static struct ib_umem_odp *implicit_mr_get_data(struct mlx5_ib_mr *mr,
 					mr->access_flags);
 		if (IS_ERR(mtt)) {
 			mutex_unlock(&odp_mr->umem_mutex);
-			ib_umem_release(&odp->umem);
+			ib_umem_odp_release(odp);
 			return ERR_CAST(mtt);
 		}
 
@@ -526,7 +526,7 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd,
 
 	imr = implicit_mr_alloc(&pd->ibpd, umem_odp, 1, access_flags);
 	if (IS_ERR(imr)) {
-		ib_umem_release(&umem_odp->umem);
+		ib_umem_odp_release(umem_odp);
 		return ERR_CAST(imr);
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (6 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 07/12] RDMA/odp: Provide ib_umem_odp_release() to undo the allocs Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-26 16:42   ` Nathan Chancellor
  2019-08-19 11:17 ` [PATCH rdma-next 09/12] RDMA/odp: Use kvcalloc for the dma_list and page_list Leon Romanovsky
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

Since the page size can be extended in the ODP case by IB_ACCESS_HUGETLB,
the existing overflow checks done by ib_umem_get() are not sufficient.
Check for overflow again.

Further, remove the unchecked math from the inlines and just use the
precomputed value stored in the interval_tree_node.
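
A sketch of the failure mode being guarded against (hypothetical values;
64-bit kernel, 2MiB huge pages selected after ib_umem_get()'s checks):

    unsigned long page_size = 1UL << 21;        /* page_shift = 21 */
    unsigned long addr = 0xffffffffffd00000UL;
    size_t length = 0x200001;                   /* addr + length fits */
    unsigned long last;

    if (check_add_overflow(addr, length, &last))
        return -EOVERFLOW;                      /* not taken here */
    last = ALIGN(last, page_size);              /* wraps around to 0 */
    if (last < page_size)
        return -EOVERFLOW;                      /* caught by this check */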

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem_odp.c | 25 +++++++++++++++++++------
 include/rdma/ib_umem_odp.h         |  5 ++---
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 2575dd783196..46ae9962fae3 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -294,19 +294,32 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 
 	umem_odp->umem.is_odp = 1;
 	if (!umem_odp->is_implicit_odp) {
-		size_t pages = ib_umem_odp_num_pages(umem_odp);
-
+		size_t page_size = 1UL << umem_odp->page_shift;
+		size_t pages;
+
+		umem_odp->interval_tree.start =
+			ALIGN_DOWN(umem_odp->umem.address, page_size);
+		if (check_add_overflow(umem_odp->umem.address,
+				       umem_odp->umem.length,
+				       &umem_odp->interval_tree.last))
+			return -EOVERFLOW;
+		umem_odp->interval_tree.last =
+			ALIGN(umem_odp->interval_tree.last, page_size);
+		if (unlikely(umem_odp->interval_tree.last < page_size))
+			return -EOVERFLOW;
+
+		pages = (umem_odp->interval_tree.last -
+			 umem_odp->interval_tree.start) >>
+			umem_odp->page_shift;
 		if (!pages)
 			return -EINVAL;
 
 		/*
 		 * Note that the representation of the intervals in the
 		 * interval tree considers the ending point as contained in
-		 * the interval, while the function ib_umem_end returns the
-		 * first address which is not contained in the umem.
+		 * the interval.
 		 */
-		umem_odp->interval_tree.start = ib_umem_start(umem_odp);
-		umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1;
+		umem_odp->interval_tree.last--;
 
 		umem_odp->page_list = vzalloc(
 			array_size(sizeof(*umem_odp->page_list), pages));
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 5efb67f97b0a..b37c674b7fe6 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -91,14 +91,13 @@ static inline struct ib_umem_odp *to_ib_umem_odp(struct ib_umem *umem)
 /* Returns the first page of an ODP umem. */
 static inline unsigned long ib_umem_start(struct ib_umem_odp *umem_odp)
 {
-	return ALIGN_DOWN(umem_odp->umem.address, 1UL << umem_odp->page_shift);
+	return umem_odp->interval_tree.start;
 }
 
 /* Returns the address of the page after the last one of an ODP umem. */
 static inline unsigned long ib_umem_end(struct ib_umem_odp *umem_odp)
 {
-	return ALIGN(umem_odp->umem.address + umem_odp->umem.length,
-		     1UL << umem_odp->page_shift);
+	return umem_odp->interval_tree.last + 1;
 }
 
 static inline size_t ib_umem_odp_num_pages(struct ib_umem_odp *umem_odp)
-- 
2.20.1



* [PATCH rdma-next 09/12] RDMA/odp: Use kvcalloc for the dma_list and page_list
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (7 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 10/12] RDMA/core: Make invalidate_range a device operation Leon Romanovsky
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

There is no specific need for these to be in the vmalloc space; let the
system decide automatically how to do the allocation.
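
A minimal sketch of the kvcalloc()/kvfree() pattern (alloc_dma_list()
and free_dma_list() are illustrative names, not functions added by this
patch):

#include <linux/mm.h>	/* kvcalloc(), kvfree() */

static u64 *alloc_dma_list(size_t pages)
{
	/* kmalloc for small arrays, transparent vmalloc fallback for large */
	return kvcalloc(pages, sizeof(u64), GFP_KERNEL);
}

static void free_dma_list(u64 *dma_list)
{
	kvfree(dma_list);	/* correct for either allocation path */
}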

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/umem_odp.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 46ae9962fae3..f1b298575b4c 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -321,13 +321,13 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 		 */
 		umem_odp->interval_tree.last--;
 
-		umem_odp->page_list = vzalloc(
-			array_size(sizeof(*umem_odp->page_list), pages));
+		umem_odp->page_list = kvcalloc(
+			pages, sizeof(*umem_odp->page_list), GFP_KERNEL);
 		if (!umem_odp->page_list)
 			return -ENOMEM;
 
-		umem_odp->dma_list =
-			vzalloc(array_size(sizeof(*umem_odp->dma_list), pages));
+		umem_odp->dma_list = kvcalloc(
+			pages, sizeof(*umem_odp->dma_list), GFP_KERNEL);
 		if (!umem_odp->dma_list) {
 			ret = -ENOMEM;
 			goto out_page_list;
@@ -361,9 +361,9 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 
 out_unlock:
 	mutex_unlock(&ctx->per_mm_list_lock);
-	vfree(umem_odp->dma_list);
+	kvfree(umem_odp->dma_list);
 out_page_list:
-	vfree(umem_odp->page_list);
+	kvfree(umem_odp->page_list);
 	return ret;
 }
 
@@ -539,8 +539,8 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
 		ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
 					    ib_umem_end(umem_odp));
 		remove_umem_from_per_mm(umem_odp);
-		vfree(umem_odp->dma_list);
-		vfree(umem_odp->page_list);
+		kvfree(umem_odp->dma_list);
+		kvfree(umem_odp->page_list);
 	}
 	put_per_mm(umem_odp);
 	mmdrop(umem_odp->umem.owning_mm);
-- 
2.20.1



* [PATCH rdma-next 10/12] RDMA/core: Make invalidate_range a device operation
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (8 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 09/12] RDMA/odp: Use kvcalloc for the dma_list and page_list Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 11/12] RDMA/mlx5: Use ib_umem_start instead of umem.address Leon Romanovsky
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Moni Shoua <monis@mellanox.com>

The callback function 'invalidate_range' is implemented in a driver, so
the place for it is in the ib_device_ops structure and not in ib_ucontext.
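
For a driver this follows the usual ib_device_ops wiring; a sketch for
a hypothetical driver (my_invalidate_range(), my_odp_ops and
my_setup_odp() are illustrative names):

#include <rdma/ib_verbs.h>
#include <rdma/ib_umem_odp.h>

static void my_invalidate_range(struct ib_umem_odp *umem_odp,
				unsigned long start, unsigned long end)
{
	/* Tear down HW translations covering [start, end) of this umem. */
}

static const struct ib_device_ops my_odp_ops = {
	.invalidate_range = my_invalidate_range,
};

/* At registration time, merged in only when the device supports ODP: */
static void my_setup_odp(struct ib_device *ibdev)
{
	ib_set_device_ops(ibdev, &my_odp_ops);
}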

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/device.c     |  1 +
 drivers/infiniband/core/umem_odp.c   | 10 +++++-----
 drivers/infiniband/core/uverbs_cmd.c |  2 --
 drivers/infiniband/hw/mlx5/main.c    |  4 ----
 drivers/infiniband/hw/mlx5/odp.c     |  1 +
 include/rdma/ib_verbs.h              |  4 ++--
 6 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 8892862fb759..6e284963741e 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2582,6 +2582,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, get_vf_config);
 	SET_DEVICE_OP(dev_ops, get_vf_stats);
 	SET_DEVICE_OP(dev_ops, init_port);
+	SET_DEVICE_OP(dev_ops, invalidate_range);
 	SET_DEVICE_OP(dev_ops, iw_accept);
 	SET_DEVICE_OP(dev_ops, iw_add_ref);
 	SET_DEVICE_OP(dev_ops, iw_connect);
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index f1b298575b4c..09c0c585b2e7 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -103,7 +103,7 @@ static void ib_umem_notifier_release(struct mmu_notifier *mn,
 		 */
 		smp_wmb();
 		complete_all(&umem_odp->notifier_completion);
-		umem_odp->umem.context->invalidate_range(
+		umem_odp->umem.context->device->ops.invalidate_range(
 			umem_odp, ib_umem_start(umem_odp),
 			ib_umem_end(umem_odp));
 	}
@@ -116,7 +116,7 @@ static int invalidate_range_start_trampoline(struct ib_umem_odp *item,
 					     u64 start, u64 end, void *cookie)
 {
 	ib_umem_notifier_start_account(item);
-	item->umem.context->invalidate_range(item, start, end);
+	item->umem.context->device->ops.invalidate_range(item, start, end);
 	return 0;
 }
 
@@ -392,7 +392,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata,
 
 	if (!context)
 		return ERR_PTR(-EIO);
-	if (WARN_ON_ONCE(!context->invalidate_range))
+	if (WARN_ON_ONCE(!context->device->ops.invalidate_range))
 		return ERR_PTR(-EINVAL);
 
 	umem_odp = kzalloc(sizeof(*umem_odp), GFP_KERNEL);
@@ -486,7 +486,7 @@ struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata, unsigned long addr,
 		return ERR_PTR(-EIO);
 
 	if (WARN_ON_ONCE(!(access & IB_ACCESS_ON_DEMAND)) ||
-	    WARN_ON_ONCE(!context->invalidate_range))
+	    WARN_ON_ONCE(!context->device->ops.invalidate_range))
 		return ERR_PTR(-EINVAL);
 
 	umem_odp = kzalloc(sizeof(struct ib_umem_odp), GFP_KERNEL);
@@ -614,7 +614,7 @@ static int ib_umem_odp_map_dma_single_page(
 
 	if (remove_existing_mapping) {
 		ib_umem_notifier_start_account(umem_odp);
-		context->invalidate_range(
+		dev->ops.invalidate_range(
 			umem_odp,
 			ib_umem_start(umem_odp) +
 				(page_index << umem_odp->page_shift),
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 7ddd0e5bc6b3..8f4fd4fac159 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -275,8 +275,6 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
 	ret = ib_dev->ops.alloc_ucontext(ucontext, &attrs->driver_udata);
 	if (ret)
 		goto err_file;
-	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
-		ucontext->invalidate_range = NULL;
 
 	rdma_restrack_uadd(&ucontext->res);
 
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 98e566acb746..08020affdc17 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1867,10 +1867,6 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx,
 	if (err)
 		goto out_sys_pages;
 
-	if (ibdev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING)
-		context->ibucontext.invalidate_range =
-			&mlx5_ib_invalidate_range;
-
 	if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) {
 		err = mlx5_ib_devx_create(dev, true);
 		if (err < 0)
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index ad5d5f2c8509..c755c76729bc 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -1594,6 +1594,7 @@ void mlx5_odp_init_mr_cache_entry(struct mlx5_cache_ent *ent)
 
 static const struct ib_device_ops mlx5_ib_dev_odp_ops = {
 	.advise_mr = mlx5_ib_advise_mr,
+	.invalidate_range = mlx5_ib_invalidate_range,
 };
 
 int mlx5_ib_odp_init_one(struct mlx5_ib_dev *dev)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 391499008a22..18a34888bbca 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1469,8 +1469,6 @@ struct ib_ucontext {
 
 	bool cleanup_retryable;
 
-	void (*invalidate_range)(struct ib_umem_odp *umem_odp,
-				 unsigned long start, unsigned long end);
 	struct mutex per_mm_list_lock;
 	struct list_head per_mm_list;
 
@@ -2430,6 +2428,8 @@ struct ib_device_ops {
 			    u64 iova);
 	int (*unmap_fmr)(struct list_head *fmr_list);
 	int (*dealloc_fmr)(struct ib_fmr *fmr);
+	void (*invalidate_range)(struct ib_umem_odp *umem_odp,
+				 unsigned long start, unsigned long end);
 	int (*attach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
 	int (*detach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
 	struct ib_xrcd *(*alloc_xrcd)(struct ib_device *device,
-- 
2.20.1



* [PATCH rdma-next 11/12] RDMA/mlx5: Use ib_umem_start instead of umem.address
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (9 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 10/12] RDMA/core: Make invalidate_range a device operation Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-19 11:17 ` [PATCH rdma-next 12/12] RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr Leon Romanovsky
  2019-08-21 16:42 ` [PATCH rdma-next 00/12] Improvements for ODP Jason Gunthorpe
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

These are subtly different: the address is the original VA requested
during umem_get, while ib_umem_start() is the version that is rounded to
the proper page size, i.e. the true start of the umem's DMA map.
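
For example (illustrative numbers, page_shift == 12): a umem created
with address 0x40001200 and length 0x100 keeps umem.address ==
0x40001200, while ib_umem_start() returns 0x40001000 and ib_umem_end()
returns 0x40002000. The implicit-ODP children being matched here live
at MLX5_IMR_MTT_SIZE-aligned VAs, so the rounded start is the value to
compare against.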

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/odp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index c755c76729bc..70e0a3555f11 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -184,7 +184,7 @@ void mlx5_odp_populate_klm(struct mlx5_klm *pklm, size_t offset,
 	for (i = 0; i < nentries; i++, pklm++) {
 		pklm->bcount = cpu_to_be32(MLX5_IMR_MTT_SIZE);
 		va = (offset + i) * MLX5_IMR_MTT_SIZE;
-		if (odp && odp->umem.address == va) {
+		if (odp && ib_umem_start(odp) == va) {
 			struct mlx5_ib_mr *mtt = odp->private;
 
 			pklm->key = cpu_to_be32(mtt->ibmr.lkey);
@@ -494,7 +494,7 @@ static struct ib_umem_odp *implicit_mr_get_data(struct mlx5_ib_mr *mr,
 	addr += MLX5_IMR_MTT_SIZE;
 	if (unlikely(addr < io_virt + bcnt)) {
 		odp = odp_next(odp);
-		if (odp && odp->umem.address != addr)
+		if (odp && ib_umem_start(odp) != addr)
 			odp = NULL;
 		goto next_mr;
 	}
@@ -664,7 +664,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 
 		io_virt += size;
 		next = odp_next(odp);
-		if (unlikely(!next || next->umem.address != io_virt)) {
+		if (unlikely(!next || ib_umem_start(next) != io_virt)) {
 			mlx5_ib_dbg(dev, "next implicit leaf removed at 0x%llx. got %p\n",
 				    io_virt, next);
 			return -EAGAIN;
-- 
2.20.1



* [PATCH rdma-next 12/12] RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (10 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 11/12] RDMA/mlx5: Use ib_umem_start instead of umem.address Leon Romanovsky
@ 2019-08-19 11:17 ` Leon Romanovsky
  2019-08-21 16:42 ` [PATCH rdma-next 00/12] Improvements for ODP Jason Gunthorpe
  12 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-19 11:17 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

From: Jason Gunthorpe <jgg@mellanox.com>

These are the same thing since mr always comes from odp->private. It is
confusing to reference the same memory via two names.
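
The aliasing is visible in the helper itself; to_ib_umem_odp() simply
recovers the containing structure from the embedded umem:

static inline struct ib_umem_odp *to_ib_umem_odp(struct ib_umem *umem)
{
	return container_of(umem, struct ib_umem_odp, umem);
}

So with odp = to_ib_umem_odp(mr->umem) and mr = odp->private, mr->umem
and &odp->umem name the same object, and pagefault_mr() can use odp
throughout.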

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/odp.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 70e0a3555f11..8b155a1f0b38 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -603,7 +603,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 	start_idx = (io_virt - (mr->mmkey.iova & page_mask)) >> page_shift;
 	access_mask = ODP_READ_ALLOWED_BIT;
 
-	if (prefetch && !downgrade && !mr->umem->writable) {
+	if (prefetch && !downgrade && !odp->umem.writable) {
 		/* prefetch with write-access must
 		 * be supported by the MR
 		 */
@@ -611,7 +611,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 		goto out;
 	}
 
-	if (mr->umem->writable && !downgrade)
+	if (odp->umem.writable && !downgrade)
 		access_mask |= ODP_WRITE_ALLOWED_BIT;
 
 	current_seq = READ_ONCE(odp->notifiers_seq);
@@ -621,8 +621,8 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 	 */
 	smp_rmb();
 
-	ret = ib_umem_odp_map_dma_pages(to_ib_umem_odp(mr->umem), io_virt, size,
-					access_mask, current_seq);
+	ret = ib_umem_odp_map_dma_pages(odp, io_virt, size, access_mask,
+					current_seq);
 
 	if (ret < 0)
 		goto out;
@@ -630,8 +630,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
 	np = ret;
 
 	mutex_lock(&odp->umem_mutex);
-	if (!ib_umem_mmu_notifier_retry(to_ib_umem_odp(mr->umem),
-					current_seq)) {
+	if (!ib_umem_mmu_notifier_retry(odp, current_seq)) {
 		/*
 		 * No need to check whether the MTTs really belong to
 		 * this MR, since ib_umem_odp_map_dma_pages already
-- 
2.20.1



* Re: [PATCH rdma-next 00/12] Improvements for ODP
  2019-08-19 11:16 [PATCH rdma-next 00/12] Improvements for ODP Leon Romanovsky
                   ` (11 preceding siblings ...)
  2019-08-19 11:17 ` [PATCH rdma-next 12/12] RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr Leon Romanovsky
@ 2019-08-21 16:42 ` Jason Gunthorpe
  12 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2019-08-21 16:42 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

On Mon, Aug 19, 2019 at 02:16:58PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Hi,
> 
> This series from Jason is a collection of general cleanups for
> ODP to clarify some of the flows around umem creation and use
> of the interval tree.
> 
> It is based on patch "RDMA/mlx5: Fix MR npages calculation for
> IB_ACCESS_HUGETLB"
> https://lore.kernel.org/linux-rdma/20190815083834.9245-5-leon@kernel.org
> 
> Thanks
> 
> Jason Gunthorpe (11):
>   RDMA/odp: Use the common interval tree library instead of generic
>   RDMA/odp: Iterate over the whole rbtree directly
>   RDMA/odp: Make it clearer when a umem is an implicit ODP umem
>   RMDA/odp: Consolidate umem_odp initialization
>   RDMA/odp: Make the three ways to create a umem_odp clear
>   RDMA/odp: Split creating a umem_odp from ib_umem_get
>   RDMA/odp: Provide ib_umem_odp_release() to undo the allocs
>   RDMA/odp: Check for overflow when computing the umem_odp end
>   RDMA/odp: Use kvcalloc for the dma_list and page_list
>   RDMA/mlx5: Use ib_umem_start instead of umem.address
>   RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
> 
> Moni Shoua (1):
>   RDMA/core: Make invalidate_range a device operation

Applied to for-next, thanks

Jason


* Re: [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly
  2019-08-19 11:17 ` [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly Leon Romanovsky
@ 2019-08-21 17:15   ` Jason Gunthorpe
  2019-08-21 17:27     ` Leon Romanovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2019-08-21 17:15 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua

On Mon, Aug 19, 2019 at 02:17:00PM +0300, Leon Romanovsky wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> Instead of intersecting a full interval, just iterate over every element
> directly. This is faster and clearer.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>  drivers/infiniband/core/umem_odp.c | 51 ++++++++++++++++--------------
>  drivers/infiniband/hw/mlx5/odp.c   | 41 +++++++++++-------------
>  2 files changed, 47 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index 8358eb8e3a26..b9bebef00a33 100644
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -72,35 +72,41 @@ static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp)
>  	mutex_unlock(&umem_odp->umem_mutex);
>  }
>  
> -static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
> -					       u64 start, u64 end, void *cookie)
> -{
> -	/*
> -	 * Increase the number of notifiers running, to
> -	 * prevent any further fault handling on this MR.
> -	 */
> -	ib_umem_notifier_start_account(umem_odp);
> -	umem_odp->dying = 1;

This patch was not applied on top of the commit noted in the cover
letter

> -	/* Make sure that the fact the umem is dying is out before we release
> -	 * all pending page faults. */
> -	smp_wmb();
> -	complete_all(&umem_odp->notifier_completion);
> -	umem_odp->umem.context->invalidate_range(
> -		umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp));
> -	return 0;
> -}
> -
>  static void ib_umem_notifier_release(struct mmu_notifier *mn,
>  				     struct mm_struct *mm)
>  {
>  	struct ib_ucontext_per_mm *per_mm =
>  		container_of(mn, struct ib_ucontext_per_mm, mn);
> +	struct rb_node *node;
>  
>  	down_read(&per_mm->umem_rwsem);
> -	if (per_mm->active)
> -		rbt_ib_umem_for_each_in_range(
> -			&per_mm->umem_tree, 0, ULLONG_MAX,
> -			ib_umem_notifier_release_trampoline, true, NULL);
> +	if (!per_mm->active)
> +		goto out;
> +
> +	for (node = rb_first_cached(&per_mm->umem_tree); node;
> +	     node = rb_next(node)) {
> +		struct ib_umem_odp *umem_odp =
> +			rb_entry(node, struct ib_umem_odp, interval_tree.rb);
> +
> +		/*
> +		 * Increase the number of notifiers running, to prevent any
> +		 * further fault handling on this MR.
> +		 */
> +		ib_umem_notifier_start_account(umem_odp);
> +
> +		umem_odp->dying = 1;

So this ends up as a 'rebasing error'

I fixed it

Jason


* Re: [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly
  2019-08-21 17:15   ` Jason Gunthorpe
@ 2019-08-21 17:27     ` Leon Romanovsky
  2019-08-21 17:35       ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-21 17:27 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Guy Levi, Moni Shoua

On Wed, Aug 21, 2019 at 02:15:02PM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2019 at 02:17:00PM +0300, Leon Romanovsky wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> >
> > Instead of intersecting a full interval, just iterate over every element
> > directly. This is faster and clearer.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/core/umem_odp.c | 51 ++++++++++++++++--------------
> >  drivers/infiniband/hw/mlx5/odp.c   | 41 +++++++++++-------------
> >  2 files changed, 47 insertions(+), 45 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> > index 8358eb8e3a26..b9bebef00a33 100644
> > +++ b/drivers/infiniband/core/umem_odp.c
> > @@ -72,35 +72,41 @@ static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp)
> >  	mutex_unlock(&umem_odp->umem_mutex);
> >  }
> >
> > -static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
> > -					       u64 start, u64 end, void *cookie)
> > -{
> > -	/*
> > -	 * Increase the number of notifiers running, to
> > -	 * prevent any further fault handling on this MR.
> > -	 */
> > -	ib_umem_notifier_start_account(umem_odp);
> > -	umem_odp->dying = 1;
>
> This patch was not applied on top of the commit noted in the cover
> letter

Strange: git log --oneline on my submission queue.
....
39c10977a728 RDMA/odp: Iterate over the whole rbtree directly
779c1205d0e0 RDMA/odp: Use the common interval tree library instead of generic
25705cc22617 RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB
---


>
> > -	/* Make sure that the fact the umem is dying is out before we release
> > -	 * all pending page faults. */
> > -	smp_wmb();
> > -	complete_all(&umem_odp->notifier_completion);
> > -	umem_odp->umem.context->invalidate_range(
> > -		umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp));
> > -	return 0;
> > -}
> > -
> >  static void ib_umem_notifier_release(struct mmu_notifier *mn,
> >  				     struct mm_struct *mm)
> >  {
> >  	struct ib_ucontext_per_mm *per_mm =
> >  		container_of(mn, struct ib_ucontext_per_mm, mn);
> > +	struct rb_node *node;
> >
> >  	down_read(&per_mm->umem_rwsem);
> > -	if (per_mm->active)
> > -		rbt_ib_umem_for_each_in_range(
> > -			&per_mm->umem_tree, 0, ULLONG_MAX,
> > -			ib_umem_notifier_release_trampoline, true, NULL);
> > +	if (!per_mm->active)
> > +		goto out;
> > +
> > +	for (node = rb_first_cached(&per_mm->umem_tree); node;
> > +	     node = rb_next(node)) {
> > +		struct ib_umem_odp *umem_odp =
> > +			rb_entry(node, struct ib_umem_odp, interval_tree.rb);
> > +
> > +		/*
> > +		 * Increase the number of notifiers running, to prevent any
> > +		 * further fault handling on this MR.
> > +		 */
> > +		ib_umem_notifier_start_account(umem_odp);
> > +
> > +		umem_odp->dying = 1;
>
> So this ends up as a 'rebasing error'
>
> I fixed it
>
> Jason


* Re: [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly
  2019-08-21 17:27     ` Leon Romanovsky
@ 2019-08-21 17:35       ` Jason Gunthorpe
  2019-08-21 17:47         ` Leon Romanovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2019-08-21 17:35 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, RDMA mailing list, Guy Levi, Moni Shoua

On Wed, Aug 21, 2019 at 08:27:35PM +0300, Leon Romanovsky wrote:
> On Wed, Aug 21, 2019 at 02:15:02PM -0300, Jason Gunthorpe wrote:
> > On Mon, Aug 19, 2019 at 02:17:00PM +0300, Leon Romanovsky wrote:
> > > From: Jason Gunthorpe <jgg@mellanox.com>
> > >
> > > Instead of intersecting a full interval, just iterate over every element
> > > directly. This is faster and clearer.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > >  drivers/infiniband/core/umem_odp.c | 51 ++++++++++++++++--------------
> > >  drivers/infiniband/hw/mlx5/odp.c   | 41 +++++++++++-------------
> > >  2 files changed, 47 insertions(+), 45 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> > > index 8358eb8e3a26..b9bebef00a33 100644
> > > +++ b/drivers/infiniband/core/umem_odp.c
> > > @@ -72,35 +72,41 @@ static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp)
> > >  	mutex_unlock(&umem_odp->umem_mutex);
> > >  }
> > >
> > > -static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
> > > -					       u64 start, u64 end, void *cookie)
> > > -{
> > > -	/*
> > > -	 * Increase the number of notifiers running, to
> > > -	 * prevent any further fault handling on this MR.
> > > -	 */
> > > -	ib_umem_notifier_start_account(umem_odp);
> > > -	umem_odp->dying = 1;
> >
> > This patch was not applied on top of the commit noted in the cover
> > letter
> 
> Strange: git log --oneline on my submission queue.
> ....
> 39c10977a728 RDMA/odp: Iterate over the whole rbtree directly
> 779c1205d0e0 RDMA/odp: Use the common interval tree library instead of generic
> 25705cc22617 RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB

But that patch has to apply on top of the -rc base, which has the other
commit that deleted 'dying'.

Jason


* Re: [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly
  2019-08-21 17:35       ` Jason Gunthorpe
@ 2019-08-21 17:47         ` Leon Romanovsky
  0 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2019-08-21 17:47 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, RDMA mailing list, Guy Levi, Moni Shoua

On Wed, Aug 21, 2019 at 02:35:52PM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 21, 2019 at 08:27:35PM +0300, Leon Romanovsky wrote:
> > On Wed, Aug 21, 2019 at 02:15:02PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Aug 19, 2019 at 02:17:00PM +0300, Leon Romanovsky wrote:
> > > > From: Jason Gunthorpe <jgg@mellanox.com>
> > > >
> > > > Instead of intersecting a full interval, just iterate over every element
> > > > directly. This is faster and clearer.
> > > >
> > > > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> > > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > > >  drivers/infiniband/core/umem_odp.c | 51 ++++++++++++++++--------------
> > > >  drivers/infiniband/hw/mlx5/odp.c   | 41 +++++++++++-------------
> > > >  2 files changed, 47 insertions(+), 45 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> > > > index 8358eb8e3a26..b9bebef00a33 100644
> > > > +++ b/drivers/infiniband/core/umem_odp.c
> > > > @@ -72,35 +72,41 @@ static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp)
> > > >  	mutex_unlock(&umem_odp->umem_mutex);
> > > >  }
> > > >
> > > > -static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp,
> > > > -					       u64 start, u64 end, void *cookie)
> > > > -{
> > > > -	/*
> > > > -	 * Increase the number of notifiers running, to
> > > > -	 * prevent any further fault handling on this MR.
> > > > -	 */
> > > > -	ib_umem_notifier_start_account(umem_odp);
> > > > -	umem_odp->dying = 1;
> > >
> > > This patch was not applied on top of the commit noted in the cover
> > > letter
> >
> > Strange: git log --oneline on my submission queue.
> > ....
> > 39c10977a728 RDMA/odp: Iterate over the whole rbtree directly
> > 779c1205d0e0 RDMA/odp: Use the common interval tree library instead of generic
> > 25705cc22617 RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB
>
> But that patch has to apply on top of rc, which has the other commit
> that deleted dying

Interesting

>
> Jason


* Re: [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end
  2019-08-19 11:17 ` [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end Leon Romanovsky
@ 2019-08-26 16:42   ` Nathan Chancellor
  2019-08-26 16:55     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Nathan Chancellor @ 2019-08-26 16:42 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Jason Gunthorpe, Leon Romanovsky,
	RDMA mailing list, Guy Levi, Moni Shoua

On Mon, Aug 19, 2019 at 02:17:06PM +0300, Leon Romanovsky wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> Since the page size can be extended in the ODP case by IB_ACCESS_HUGETLB
> the existing overflow checks done by ib_umem_get() are not
> sufficient. Check for overflow again.
> 
> Further, remove the unchecked math from the inlines and just use the
> precomputed value stored in the interval_tree_node.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/core/umem_odp.c | 25 +++++++++++++++++++------
>  include/rdma/ib_umem_odp.h         |  5 ++---
>  2 files changed, 21 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index 2575dd783196..46ae9962fae3 100644
> --- a/drivers/infiniband/core/umem_odp.c
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -294,19 +294,32 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
>  
>  	umem_odp->umem.is_odp = 1;
>  	if (!umem_odp->is_implicit_odp) {
> -		size_t pages = ib_umem_odp_num_pages(umem_odp);
> -
> +		size_t page_size = 1UL << umem_odp->page_shift;
> +		size_t pages;
> +
> +		umem_odp->interval_tree.start =
> +			ALIGN_DOWN(umem_odp->umem.address, page_size);
> +		if (check_add_overflow(umem_odp->umem.address,
> +				       umem_odp->umem.length,
> +				       &umem_odp->interval_tree.last))
> +			return -EOVERFLOW;

This if statement causes a warning on 32-bit ARM:

drivers/infiniband/core/umem_odp.c:295:7: warning: comparison of distinct
pointer types ('typeof (umem_odp->umem.address) *' (aka 'unsigned long *')
and 'typeof (umem_odp->umem.length) *' (aka 'unsigned int *'))
[-Wcompare-distinct-pointer-types]
                if (check_add_overflow(umem_odp->umem.address,
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/overflow.h:59:15: note: expanded from macro 'check_add_overflow'
        (void) (&__a == &__b);                  \
                ~~~~ ^  ~~~~
1 warning generated.

Cheers,
Nathan
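
The flagged line in the macro is a deliberate compile-time type check:
check_add_overflow() wants all three arguments to share one type, and
umem.address is an unsigned long while umem.length is a size_t, which
is unsigned int on 32-bit ARM. A sketch of one possible fix
(illustrative only, not necessarily what was merged):

#include <linux/errno.h>
#include <linux/overflow.h>

static int bounds_add(unsigned long addr, size_t length,
		      unsigned long *last)
{
	/* Cast to a common type so the macro's &__a == &__b agrees. */
	if (check_add_overflow(addr, (unsigned long)length, last))
		return -EOVERFLOW;
	return 0;
}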


* Re: [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end
  2019-08-26 16:42   ` Nathan Chancellor
@ 2019-08-26 16:55     ` Jason Gunthorpe
  2019-08-27 19:25       ` Nathan Chancellor
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2019-08-26 16:55 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky,
	RDMA mailing list, Guy Levi(SW),
	Moni Shoua

On Mon, Aug 26, 2019 at 09:42:23AM -0700, Nathan Chancellor wrote:
> On Mon, Aug 19, 2019 at 02:17:06PM +0300, Leon Romanovsky wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> > 
> > Since the page size can be extended in the ODP case by IB_ACCESS_HUGETLB
> > the existing overflow checks done by ib_umem_get() are not
> > sufficient. Check for overflow again.
> > 
> > Further, remove the unchecked math from the inlines and just use the
> > precomputed value stored in the interval_tree_node.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/core/umem_odp.c | 25 +++++++++++++++++++------
> >  include/rdma/ib_umem_odp.h         |  5 ++---
> >  2 files changed, 21 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> > index 2575dd783196..46ae9962fae3 100644
> > +++ b/drivers/infiniband/core/umem_odp.c
> > @@ -294,19 +294,32 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
> >  
> >  	umem_odp->umem.is_odp = 1;
> >  	if (!umem_odp->is_implicit_odp) {
> > -		size_t pages = ib_umem_odp_num_pages(umem_odp);
> > -
> > +		size_t page_size = 1UL << umem_odp->page_shift;
> > +		size_t pages;
> > +
> > +		umem_odp->interval_tree.start =
> > +			ALIGN_DOWN(umem_odp->umem.address, page_size);
> > +		if (check_add_overflow(umem_odp->umem.address,
> > +				       umem_odp->umem.length,
> > +				       &umem_odp->interval_tree.last))
> > +			return -EOVERFLOW;
> 
> This if statement causes a warning on 32-bit ARM:
> 
> drivers/infiniband/core/umem_odp.c:295:7: warning: comparison of distinct
> pointer types ('typeof (umem_odp->umem.address) *' (aka 'unsigned long *')
> and 'typeof (umem_odp->umem.length) *' (aka 'unsigned int *'))
> [-Wcompare-distinct-pointer-types]
>                 if (check_add_overflow(umem_odp->umem.address,
>                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> include/linux/overflow.h:59:15: note: expanded from macro 'check_add_overflow'
>         (void) (&__a == &__b);                  \
>                 ~~~~ ^  ~~~~
> 1 warning generated.

Hum, I'm pretty sure 0-day has stopped running 32 bit builds or
something :\

Jason


* Re: [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end
  2019-08-26 16:55     ` Jason Gunthorpe
@ 2019-08-27 19:25       ` Nathan Chancellor
  0 siblings, 0 replies; 21+ messages in thread
From: Nathan Chancellor @ 2019-08-27 19:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky,
	RDMA mailing list, Guy Levi(SW),
	Moni Shoua, Philip Li, Rong Chen

On Mon, Aug 26, 2019 at 04:55:45PM +0000, Jason Gunthorpe wrote:
> On Mon, Aug 26, 2019 at 09:42:23AM -0700, Nathan Chancellor wrote:
> > On Mon, Aug 19, 2019 at 02:17:06PM +0300, Leon Romanovsky wrote:
> > > From: Jason Gunthorpe <jgg@mellanox.com>
> > > 
> > > Since the page size can be extended in the ODP case by IB_ACCESS_HUGETLB
> > > the existing overflow checks done by ib_umem_get() are not
> > > sufficient. Check for overflow again.
> > > 
> > > Further, remove the unchecked math from the inlines and just use the
> > > precomputed value stored in the interval_tree_node.
> > > 
> > > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > >  drivers/infiniband/core/umem_odp.c | 25 +++++++++++++++++++------
> > >  include/rdma/ib_umem_odp.h         |  5 ++---
> > >  2 files changed, 21 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> > > index 2575dd783196..46ae9962fae3 100644
> > > +++ b/drivers/infiniband/core/umem_odp.c
> > > @@ -294,19 +294,32 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
> > >  
> > >  	umem_odp->umem.is_odp = 1;
> > >  	if (!umem_odp->is_implicit_odp) {
> > > -		size_t pages = ib_umem_odp_num_pages(umem_odp);
> > > -
> > > +		size_t page_size = 1UL << umem_odp->page_shift;
> > > +		size_t pages;
> > > +
> > > +		umem_odp->interval_tree.start =
> > > +			ALIGN_DOWN(umem_odp->umem.address, page_size);
> > > +		if (check_add_overflow(umem_odp->umem.address,
> > > +				       umem_odp->umem.length,
> > > +				       &umem_odp->interval_tree.last))
> > > +			return -EOVERFLOW;
> > 
> > This if statement causes a warning on 32-bit ARM:
> > 
> > drivers/infiniband/core/umem_odp.c:295:7: warning: comparison of distinct
> > pointer types ('typeof (umem_odp->umem.address) *' (aka 'unsigned long *')
> > and 'typeof (umem_odp->umem.length) *' (aka 'unsigned int *'))
> > [-Wcompare-distinct-pointer-types]
> >                 if (check_add_overflow(umem_odp->umem.address,
> >                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > include/linux/overflow.h:59:15: note: expanded from macro 'check_add_overflow'
> >         (void) (&__a == &__b);                  \
> >                 ~~~~ ^  ~~~~
> > 1 warning generated.
> 
> Hum, I'm pretty sure 0-day has stopped running 32 bit builds or
> something :\
> 
> Jason

My report was with clang but GCC reports the same type of warning:

In file included from ../include/linux/slab.h:16,
                 from ../drivers/infiniband/core/umem_odp.c:38:
../drivers/infiniband/core/umem_odp.c: In function 'ib_init_umem_odp':
../include/linux/overflow.h:59:15: warning: comparison of distinct pointer types lacks a cast
   59 |  (void) (&__a == &__b);   \
      |               ^~
../drivers/infiniband/core/umem_odp.c:220:7: note: in expansion of macro 'check_add_overflow'
  220 |   if (check_add_overflow(umem_odp->umem.address,
      |       ^~~~~~~~~~~~~~~~~~

Adding Philip and Rong as I believe that they are the current 0-day
maintainers.

Cheers,
Nathan


