linux-rdma.vger.kernel.org archive mirror
* [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage
@ 2018-12-20  9:23 Leon Romanovsky
  2018-12-20  9:23 ` [PATCH rdma-next 1/5] RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING Leon Romanovsky
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20  9:23 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

Hi,

As a followup to Jason's request to rethink CONFIG_INFINIBAND_ON_DEMAND_PAGING
usage, this series cleans up the mlx5_ib and RDMA/core code. It is based on the
already sent, but not yet accepted, patch https://patchwork.kernel.org/patch/10735547/

The series is under extensive testing now, but I wanted to raise awareness as
soon as possible of the patch "RDMA/core: Don't depend device ODP capabilities
on kconfig option", which changes behavior for mlx5 devices built with
CONFIG_INFINIBAND_ON_DEMAND_PAGING set to no.
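
To make the behavior change concrete, here is the relevant query hunk
from that patch, condensed (the full diff is in patch 2). Previously the
whole block was under #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING, so an
mlx5 device never advertised the capability when the option was off:

	/* After patch 2: the flag follows the HW "pg" capability alone,
	 * regardless of the kconfig option.
	 */
	if (MLX5_CAP_GEN(mdev, pg))
		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
	props->odp_caps = dev->odp_caps;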

Thanks

Leon Romanovsky (5):
  RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING
  RDMA/core: Don't depend device ODP capabilities on kconfig option
  RDMA/mlx5: Introduce and reuse helper to identify ODP MR
  RDMA/mlx5: Embed into the code flow the ODP config option
  RDMA/mlx5: Delete declaration of already removed function

 drivers/infiniband/core/uverbs_cmd.c |  8 ++---
 drivers/infiniband/hw/mlx5/main.c    | 37 ++++++++++-------------
 drivers/infiniband/hw/mlx5/mem.c     |  5 +---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 12 ++++++--
 drivers/infiniband/hw/mlx5/mr.c      | 45 ++++++++--------------------
 drivers/infiniband/hw/mlx5/odp.c     |  6 ++--
 include/linux/mlx5/driver.h          |  4 ---
 include/rdma/ib_umem_odp.h           | 26 ++++++++--------
 include/rdma/ib_verbs.h              |  2 --
 9 files changed, 58 insertions(+), 87 deletions(-)

* [PATCH rdma-next 1/5] RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING
  2018-12-20  9:23 [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Leon Romanovsky
@ 2018-12-20  9:23 ` Leon Romanovsky
  2018-12-20  9:23 ` [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option Leon Romanovsky
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20  9:23 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

CONFIG_INFINIBAND_ON_DEMAND_PAGING is used in general structures only to
micro-optimize the memory footprint. Remove it, which will allow us to
simplify various ODP device flows.
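
To give a sense of the trade-off: the ODP members (odp_caps, mr_srcu and
friends) now always occupy a few words in the structures, but the code
touching them is still discarded at compile time once it is gated with
IS_ENABLED(), as later patches in this series do. A condensed example
from patch 4:

	/* mr_srcu is now an unconditional member of struct mlx5_ib_dev;
	 * with CONFIG_INFINIBAND_ON_DEMAND_PAGING=n this branch is
	 * statically dead and the compiler drops it.
	 */
	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
		cleanup_srcu_struct(&dev->mr_srcu);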

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 --
 include/rdma/ib_verbs.h              | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d531f25a1105..58efcbd596fd 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -911,7 +911,6 @@ struct mlx5_ib_dev {
 	/* Prevents soft lock on massive reg MRs */
 	struct mutex			slow_path_mutex;
 	int				fill_delay;
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	struct ib_odp_caps	odp_caps;
 	u64			odp_max_size;
 	struct mlx5_ib_pf_eq	odp_pf_eq;
@@ -923,7 +922,6 @@ struct mlx5_ib_dev {
 	struct srcu_struct      mr_srcu;
 	u32			null_mkey;
 	struct workqueue_struct *advise_mr_wq;
-#endif
 	struct mlx5_ib_flow_db	*flow_db;
 	/* protect resources needed as part of reset flow */
 	spinlock_t		reset_flow_resource_lock;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a6e57f05410c..67339da6c72d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1504,12 +1504,10 @@ struct ib_ucontext {
 
 	bool cleanup_retryable;
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	void (*invalidate_range)(struct ib_umem_odp *umem_odp,
 				 unsigned long start, unsigned long end);
 	struct mutex per_mm_list_lock;
 	struct list_head per_mm_list;
-#endif
 
 	struct ib_rdmacg_object	cg_obj;
 	/*
-- 
2.19.1

* [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option
  2018-12-20  9:23 [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Leon Romanovsky
  2018-12-20  9:23 ` [PATCH rdma-next 1/5] RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING Leon Romanovsky
@ 2018-12-20  9:23 ` Leon Romanovsky
  2018-12-20 17:20   ` Jason Gunthorpe
  2018-12-20  9:23 ` [PATCH rdma-next 3/5] RDMA/mlx5: Introduce and reuse helper to identify ODP MR Leon Romanovsky
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20  9:23 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

Device capability bits expose what a specific device supports from the HW
perspective. Those bits do not depend on kernel configuration, and RDMA/core
should ensure that the relevant user-facing interfaces are disabled if
CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set.

Fixes: f4056bfd8ccf ("IB/core: Add on demand paging caps to ib_uverbs_ex_query_device")
Fixes: 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/uverbs_cmd.c | 2 --
 drivers/infiniband/hw/mlx5/main.c    | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 6b12cc5f97b2..549d9eedf62e 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3609,7 +3609,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
 
 	copy_query_dev_fields(ucontext, &resp.base, &attr);
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
 	resp.odp_caps.per_transport_caps.rc_odp_caps =
 		attr.odp_caps.per_transport_caps.rc_odp_caps;
@@ -3617,7 +3616,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
 		attr.odp_caps.per_transport_caps.uc_odp_caps;
 	resp.odp_caps.per_transport_caps.ud_odp_caps =
 		attr.odp_caps.per_transport_caps.ud_odp_caps;
-#endif
 
 	resp.timestamp_mask = attr.timestamp_mask;
 	resp.hca_core_clock = attr.hca_core_clock;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 6584e638387a..d7e5ba5034aa 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -923,11 +923,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
 	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	if (MLX5_CAP_GEN(mdev, pg))
 		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
 	props->odp_caps = dev->odp_caps;
-#endif
 
 	if (MLX5_CAP_GEN(mdev, cd))
 		props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
-- 
2.19.1

* [PATCH rdma-next 3/5] RDMA/mlx5: Introduce and reuse helper to identify ODP MR
  2018-12-20  9:23 [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Leon Romanovsky
  2018-12-20  9:23 ` [PATCH rdma-next 1/5] RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING Leon Romanovsky
  2018-12-20  9:23 ` [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option Leon Romanovsky
@ 2018-12-20  9:23 ` Leon Romanovsky
  2018-12-20  9:23 ` [PATCH rdma-next 4/5] RDMA/mlx5: Embed into the code flow the ODP config option Leon Romanovsky
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20  9:23 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

Consolidate the various checks for whether an MR is ODP backed into one
simple helper and update the call sites to use it.
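
The point of folding the IS_ENABLED() check into the helper is that the
call sites need no #ifdef at all; a typical converted call site from this
patch:

	/* With CONFIG_INFINIBAND_ON_DEMAND_PAGING=n, is_odp_mr() is
	 * constant false, so the branch and the synchronize_srcu() call
	 * are eliminated by the compiler.
	 */
	if (is_odp_mr(mr))
		synchronize_srcu(&dev->mr_srcu);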

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 10 ++++++++++
 drivers/infiniband/hw/mlx5/mr.c      | 29 ++++++----------------------
 drivers/infiniband/hw/mlx5/odp.c     |  6 +++---
 3 files changed, 19 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 58efcbd596fd..6563e531fdc2 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -36,6 +36,7 @@
 #include <linux/kernel.h>
 #include <linux/sched.h>
 #include <rdma/ib_verbs.h>
+#include <rdma/ib_umem.h>
 #include <rdma/ib_smi.h>
 #include <linux/mlx5/driver.h>
 #include <linux/mlx5/cq.h>
@@ -589,6 +590,12 @@ struct mlx5_ib_mr {
 	wait_queue_head_t       q_leaf_free;
 };
 
+static inline bool is_odp_mr(struct mlx5_ib_mr *mr)
+{
+	return IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && mr->umem &&
+	       mr->umem->is_odp;
+}
+
 struct mlx5_ib_mw {
 	struct ib_mw		ibmw;
 	struct mlx5_core_mkey	mmkey;
@@ -1212,6 +1219,9 @@ mlx5_ib_advise_mr_prefetch(struct ib_pd *pd,
 {
 	return -EOPNOTSUPP;
 }
+static inline void mlx5_ib_invalidate_range(struct ib_umem_odp *umem_odp,
+					    unsigned long start,
+					    unsigned long end){};
 #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 
 /* Needed for rep profile */
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 1bd8c1b1dba1..b861b4a5b0e0 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -71,11 +71,9 @@ static int destroy_mkey(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
 {
 	int err = mlx5_core_destroy_mkey(dev->mdev, &mr->mmkey);
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	/* Wait until all page fault handlers using the mr complete. */
-	if (mr->umem && mr->umem->is_odp)
+	if (is_odp_mr(mr))
 		synchronize_srcu(&dev->mr_srcu);
-#endif
 
 	return err;
 }
@@ -96,10 +94,9 @@ static bool use_umr_mtt_update(struct mlx5_ib_mr *mr, u64 start, u64 length)
 		length + (start & (MLX5_ADAPTER_PAGE_SIZE - 1));
 }
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 static void update_odp_mr(struct mlx5_ib_mr *mr)
 {
-	if (mr->umem->is_odp) {
+	if (is_odp_mr(mr)) {
 		/*
 		 * This barrier prevents the compiler from moving the
 		 * setting of umem->odp_data->private to point to our
@@ -122,7 +119,6 @@ static void update_odp_mr(struct mlx5_ib_mr *mr)
 		smp_wmb();
 	}
 }
-#endif
 
 static void reg_mr_callback(int status, void *context)
 {
@@ -238,9 +234,7 @@ static void remove_keys(struct mlx5_ib_dev *dev, int c, int num)
 {
 	struct mlx5_mr_cache *cache = &dev->cache;
 	struct mlx5_cache_ent *ent = &cache->ent[c];
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	bool odp_mkey_exist = false;
-#endif
 	struct mlx5_ib_mr *tmp_mr;
 	struct mlx5_ib_mr *mr;
 	LIST_HEAD(del_list);
@@ -253,10 +247,8 @@ static void remove_keys(struct mlx5_ib_dev *dev, int c, int num)
 			break;
 		}
 		mr = list_first_entry(&ent->head, struct mlx5_ib_mr, list);
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-		if (mr->umem && mr->umem->is_odp)
+		if (is_odp_mr(mr))
 			odp_mkey_exist = true;
-#endif
 		list_move(&mr->list, &del_list);
 		ent->cur--;
 		ent->size--;
@@ -264,10 +256,8 @@ static void remove_keys(struct mlx5_ib_dev *dev, int c, int num)
 		mlx5_core_destroy_mkey(dev->mdev, &mr->mmkey);
 	}
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	if (odp_mkey_exist)
 		synchronize_srcu(&dev->mr_srcu);
-#endif
 
 	list_for_each_entry_safe(mr, tmp_mr, &del_list, list) {
 		list_del(&mr->list);
@@ -594,7 +584,7 @@ static void clean_keys(struct mlx5_ib_dev *dev, int c)
 			break;
 		}
 		mr = list_first_entry(&ent->head, struct mlx5_ib_mr, list);
-		if (mr->umem && mr->umem->is_odp)
+		if (is_odp_mr(mr))
 			odp_mkey_exist = true;
 		list_move(&mr->list, &del_list);
 		ent->cur--;
@@ -603,10 +593,8 @@ static void clean_keys(struct mlx5_ib_dev *dev, int c)
 		mlx5_core_destroy_mkey(dev->mdev, &mr->mmkey);
 	}
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	if (odp_mkey_exist)
 		synchronize_srcu(&dev->mr_srcu);
-#endif
 
 	list_for_each_entry_safe(mr, tmp_mr, &del_list, list) {
 		list_del(&mr->list);
@@ -1399,9 +1387,7 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	mr->umem = umem;
 	set_mr_fields(dev, mr, npages, length, access_flags);
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	update_odp_mr(mr);
-#endif
 
 	if (!populate_mtts) {
 		int update_xlt_flags = MLX5_IB_UPD_XLT_ENABLE;
@@ -1566,9 +1552,7 @@ int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
 
 	set_mr_fields(dev, mr, npages, len, access_flags);
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	update_odp_mr(mr);
-#endif
 	return 0;
 
 err:
@@ -1654,8 +1638,7 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
 	int npages = mr->npages;
 	struct ib_umem *umem = mr->umem;
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	if (umem && umem->is_odp) {
+	if (is_odp_mr(mr)) {
 		struct ib_umem_odp *umem_odp = to_ib_umem_odp(umem);
 
 		/* Prevent new page faults from succeeding */
@@ -1679,7 +1662,7 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
 		/* Avoid double-freeing the umem. */
 		umem = NULL;
 	}
-#endif
+
 	clean_mr(dev, mr);
 
 	/*
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 80fa2438db8f..86c64c3468df 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -103,7 +103,7 @@ static int check_parent(struct ib_umem_odp *odp,
 
 struct ib_ucontext_per_mm *mr_to_per_mm(struct mlx5_ib_mr *mr)
 {
-	if (WARN_ON(!mr || !mr->umem || !mr->umem->is_odp))
+	if (WARN_ON(!mr || !is_odp_mr(mr)))
 		return NULL;
 
 	return to_ib_umem_odp(mr->umem)->per_mm;
@@ -740,12 +740,12 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, u32 key,
 			goto srcu_unlock;
 		}
 
-		if (prefetch && !mr->umem->is_odp) {
+		if (prefetch && !is_odp_mr(mr)) {
 			ret = -EINVAL;
 			goto srcu_unlock;
 		}
 
-		if (!mr->umem->is_odp) {
+		if (!is_odp_mr(mr)) {
 			mlx5_ib_dbg(dev, "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n",
 				    key);
 			if (bytes_mapped)
-- 
2.19.1

* [PATCH rdma-next 4/5] RDMA/mlx5: Embed into the code flow the ODP config option
  2018-12-20  9:23 [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Leon Romanovsky
                   ` (2 preceding siblings ...)
  2018-12-20  9:23 ` [PATCH rdma-next 3/5] RDMA/mlx5: Introduce and reuse helper to identify ODP MR Leon Romanovsky
@ 2018-12-20  9:23 ` Leon Romanovsky
  2018-12-20  9:23 ` [PATCH mlx5-next 5/5] RDMA/mlx5: Delete declaration of already removed function Leon Romanovsky
  2018-12-21  3:32 ` [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Jason Gunthorpe
  5 siblings, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20  9:23 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

Convert various places to more readable code that embeds the
CONFIG_INFINIBAND_ON_DEMAND_PAGING check into the code flow.
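
The conversion pattern, shown on the simplest hunk in this patch:

/* Before */
#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
	mr->live = 1;
#endif

/* After: identical object code when the option is off, but the block is
 * always parsed and type-checked, and the flow reads linearly.
 */
	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
		mr->live = 1;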

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/uverbs_cmd.c |  6 ++---
 drivers/infiniband/hw/mlx5/main.c    | 35 +++++++++++++---------------
 drivers/infiniband/hw/mlx5/mem.c     |  5 +---
 drivers/infiniband/hw/mlx5/mr.c      | 16 ++++++-------
 include/rdma/ib_umem_odp.h           | 26 ++++++++++-----------
 5 files changed, 39 insertions(+), 49 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 549d9eedf62e..4d28db23f539 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -234,14 +234,12 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
 	ucontext->closing = false;
 	ucontext->cleanup_retryable = false;
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	mutex_init(&ucontext->per_mm_list_lock);
 	INIT_LIST_HEAD(&ucontext->per_mm_list);
-	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
+	if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) ||
+	    !(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
 		ucontext->invalidate_range = NULL;
 
-#endif
-
 	resp.num_comp_vectors = file->device->num_comp_vectors;
 
 	ret = get_unused_fd_flags(O_CLOEXEC);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index d7e5ba5034aa..0b68795fb04e 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1761,9 +1761,7 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 	if (err)
 		goto out_sys_pages;
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
-#endif
 
 	if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) {
 		err = mlx5_ib_devx_create(dev, true);
@@ -1895,12 +1893,10 @@ static int mlx5_ib_dealloc_ucontext(struct ib_ucontext *ibcontext)
 	struct mlx5_ib_dev *dev = to_mdev(ibcontext->device);
 	struct mlx5_bfreg_info *bfregi;
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	/* All umem's must be destroyed before destroying the ucontext. */
 	mutex_lock(&ibcontext->per_mm_list_lock);
 	WARN_ON(!list_empty(&ibcontext->per_mm_list));
 	mutex_unlock(&ibcontext->per_mm_list_lock);
-#endif
 
 	bfregi = &context->bfregi;
 	mlx5_ib_dealloc_transport_domain(dev, context->tdn, context->devx_uid);
@@ -5720,11 +5716,11 @@ static struct ib_counters *mlx5_ib_create_counters(struct ib_device *device,
 void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev)
 {
 	mlx5_ib_cleanup_multiport_master(dev);
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	cleanup_srcu_struct(&dev->mr_srcu);
-	drain_workqueue(dev->advise_mr_wq);
-	destroy_workqueue(dev->advise_mr_wq);
-#endif
+	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
+		cleanup_srcu_struct(&dev->mr_srcu);
+		drain_workqueue(dev->advise_mr_wq);
+		destroy_workqueue(dev->advise_mr_wq);
+	}
 	kfree(dev->port);
 }
 
@@ -5777,17 +5773,18 @@ int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
 	spin_lock_init(&dev->memic.memic_lock);
 	dev->memic.dev = mdev;
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	dev->advise_mr_wq = alloc_ordered_workqueue("mlx5_ib_advise_mr_wq", 0);
-	if (!dev->advise_mr_wq) {
-		err = -ENOMEM;
-		goto err_free_port;
-	}
+	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
+		dev->advise_mr_wq =
+			alloc_ordered_workqueue("mlx5_ib_advise_mr_wq", 0);
+		if (!dev->advise_mr_wq) {
+			err = -ENOMEM;
+			goto err_free_port;
+		}
 
-	err = init_srcu_struct(&dev->mr_srcu);
-	if (err)
-		goto err_free_port;
-#endif
+		err = init_srcu_struct(&dev->mr_srcu);
+		if (err)
+			goto err_free_port;
+	}
 
 	return 0;
 err_mp:
diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
index 549234988bb4..9f90be296ee0 100644
--- a/drivers/infiniband/hw/mlx5/mem.c
+++ b/drivers/infiniband/hw/mlx5/mem.c
@@ -111,7 +111,6 @@ void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr,
 	*count = i;
 }
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 static u64 umem_dma_to_mtt(dma_addr_t umem_dma)
 {
 	u64 mtt_entry = umem_dma & ODP_DMA_ADDR_MASK;
@@ -123,7 +122,6 @@ static u64 umem_dma_to_mtt(dma_addr_t umem_dma)
 
 	return mtt_entry;
 }
-#endif
 
 /*
  * Populate the given array with bus addresses from the umem.
@@ -151,7 +149,7 @@ void __mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
 	int len;
 	struct scatterlist *sg;
 	int entry;
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+
 	if (umem->is_odp) {
 		WARN_ON(shift != 0);
 		WARN_ON(access_flags != (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE));
@@ -164,7 +162,6 @@ void __mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
 		}
 		return;
 	}
-#endif
 
 	i = 0;
 	for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index b861b4a5b0e0..65d07c111d42 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1332,8 +1332,8 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	mlx5_ib_dbg(dev, "start 0x%llx, virt_addr 0x%llx, length 0x%llx, access_flags 0x%x\n",
 		    start, virt_addr, length, access_flags);
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	if (!start && length == U64_MAX) {
+	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && !start &&
+	    length == U64_MAX) {
 		if (!(access_flags & IB_ACCESS_ON_DEMAND) ||
 		    !(dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
 			return ERR_PTR(-EINVAL);
@@ -1343,7 +1343,6 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 			return ERR_CAST(mr);
 		return &mr->ibmr;
 	}
-#endif
 
 	err = mr_umem_get(pd, start, length, access_flags, &umem, &npages,
 			   &page_shift, &ncont, &order);
@@ -1404,9 +1403,9 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 		}
 	}
 
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	mr->live = 1;
-#endif
+	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
+		mr->live = 1;
+
 	return &mr->ibmr;
 error:
 	ib_umem_release(umem);
@@ -1521,9 +1520,8 @@ int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
 		}
 
 		mr->allocated_from_cache = 0;
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-		mr->live = 1;
-#endif
+		if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
+			mr->live = 1;
 	} else {
 		/*
 		 * Send a UMR WQE
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 0b1446fe2fab..d3725cf13ecd 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -83,6 +83,19 @@ static inline struct ib_umem_odp *to_ib_umem_odp(struct ib_umem *umem)
 	return container_of(umem, struct ib_umem_odp, umem);
 }
 
+/*
+ * The lower 2 bits of the DMA address signal the R/W permissions for
+ * the entry. To upgrade the permissions, provide the appropriate
+ * bitmask to the map_dma_pages function.
+ *
+ * Be aware that upgrading a mapped address might result in change of
+ * the DMA address for the page.
+ */
+#define ODP_READ_ALLOWED_BIT  (1<<0ULL)
+#define ODP_WRITE_ALLOWED_BIT (1<<1ULL)
+
+#define ODP_DMA_ADDR_MASK (~(ODP_READ_ALLOWED_BIT | ODP_WRITE_ALLOWED_BIT))
+
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 
 struct ib_ucontext_per_mm {
@@ -107,19 +120,6 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_ucontext_per_mm *per_mm,
 				      unsigned long addr, size_t size);
 void ib_umem_odp_release(struct ib_umem_odp *umem_odp);
 
-/*
- * The lower 2 bits of the DMA address signal the R/W permissions for
- * the entry. To upgrade the permissions, provide the appropriate
- * bitmask to the map_dma_pages function.
- *
- * Be aware that upgrading a mapped address might result in change of
- * the DMA address for the page.
- */
-#define ODP_READ_ALLOWED_BIT  (1<<0ULL)
-#define ODP_WRITE_ALLOWED_BIT (1<<1ULL)
-
-#define ODP_DMA_ADDR_MASK (~(ODP_READ_ALLOWED_BIT | ODP_WRITE_ALLOWED_BIT))
-
 int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 start_offset,
 			      u64 bcnt, u64 access_mask,
 			      unsigned long current_seq);
-- 
2.19.1

* [PATCH mlx5-next 5/5] RDMA/mlx5: Delete declaration of already removed function
  2018-12-20  9:23 [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Leon Romanovsky
                   ` (3 preceding siblings ...)
  2018-12-20  9:23 ` [PATCH rdma-next 4/5] RDMA/mlx5: Embed into the code flow the ODP config option Leon Romanovsky
@ 2018-12-20  9:23 ` Leon Romanovsky
  2018-12-21  3:32 ` [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Jason Gunthorpe
  5 siblings, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20  9:23 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

The implementation of mlx5_core_page_fault_resume() was removed
in commit d5d284b829a6 ("{net,IB}/mlx5: Move Page fault EQ and
ODP logic to RDMA"). Remove the stale declaration too.

Fixes: d5d284b829a6 ("{net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 include/linux/mlx5/driver.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index cb1c5b10279c..709446bea72c 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -937,10 +937,6 @@ int mlx5_query_odp_caps(struct mlx5_core_dev *dev,
 			struct mlx5_odp_caps *odp_caps);
 int mlx5_core_query_ib_ppcnt(struct mlx5_core_dev *dev,
 			     u8 port_num, void *out, size_t sz);
-#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-int mlx5_core_page_fault_resume(struct mlx5_core_dev *dev, u32 token,
-				u32 wq_num, u8 type, int error);
-#endif
 
 int mlx5_init_rl_table(struct mlx5_core_dev *dev);
 void mlx5_cleanup_rl_table(struct mlx5_core_dev *dev);
-- 
2.19.1

* Re: [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option
  2018-12-20  9:23 ` [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option Leon Romanovsky
@ 2018-12-20 17:20   ` Jason Gunthorpe
  2018-12-20 17:29     ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2018-12-20 17:20 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Haggai Eran,
	Saeed Mahameed, linux-netdev

On Thu, Dec 20, 2018 at 11:23:15AM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Device capability bits are exposing what specific device supports from
> HW perspective. Those bits are not dependent on kernel configurations
> and RDMA/core should ensure that proper interfaces to users will be
> disabled if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set.
> 
> Fixes: f4056bfd8ccf ("IB/core: Add on demand paging caps to ib_uverbs_ex_query_device")
> Fixes: 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb")
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>  drivers/infiniband/core/uverbs_cmd.c | 2 --
>  drivers/infiniband/hw/mlx5/main.c    | 2 --
>  2 files changed, 4 deletions(-)
> 
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 6b12cc5f97b2..549d9eedf62e 100644
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -3609,7 +3609,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
>  
>  	copy_query_dev_fields(ucontext, &resp.base, &attr);
>  
> -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
>  	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
>  	resp.odp_caps.per_transport_caps.rc_odp_caps =
>  		attr.odp_caps.per_transport_caps.rc_odp_caps;
> @@ -3617,7 +3616,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
>  		attr.odp_caps.per_transport_caps.uc_odp_caps;
>  	resp.odp_caps.per_transport_caps.ud_odp_caps =
>  		attr.odp_caps.per_transport_caps.ud_odp_caps;
> -#endif

This one is maybe OK, assuming the driver fills in 0..

>  	resp.timestamp_mask = attr.timestamp_mask;
>  	resp.hca_core_clock = attr.hca_core_clock;
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index 6584e638387a..d7e5ba5034aa 100644
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -923,11 +923,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
>  	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
>  	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
>  
> -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
>  	if (MLX5_CAP_GEN(mdev, pg))
>  		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
>  	props->odp_caps = dev->odp_caps;
> -#endif

But shouldn't this be protected? If the driver has compiled out ODP it
shouldn't set the cap flag...

Jason

* Re: [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option
  2018-12-20 17:20   ` Jason Gunthorpe
@ 2018-12-20 17:29     ` Leon Romanovsky
  2018-12-20 17:33       ` Jason Gunthorpe
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20 17:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

On Thu, Dec 20, 2018 at 05:20:51PM +0000, Jason Gunthorpe wrote:
> On Thu, Dec 20, 2018 at 11:23:15AM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > Device capability bits are exposing what specific device supports from
> > HW perspective. Those bits are not dependent on kernel configurations
> > and RDMA/core should ensure that proper interfaces to users will be
> > disabled if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set.
> >
> > Fixes: f4056bfd8ccf ("IB/core: Add on demand paging caps to ib_uverbs_ex_query_device")
> > Fixes: 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb")
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/core/uverbs_cmd.c | 2 --
> >  drivers/infiniband/hw/mlx5/main.c    | 2 --
> >  2 files changed, 4 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > index 6b12cc5f97b2..549d9eedf62e 100644
> > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > @@ -3609,7 +3609,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> >
> >  	copy_query_dev_fields(ucontext, &resp.base, &attr);
> >
> > -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> >  	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> >  	resp.odp_caps.per_transport_caps.rc_odp_caps =
> >  		attr.odp_caps.per_transport_caps.rc_odp_caps;
> > @@ -3617,7 +3616,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> >  		attr.odp_caps.per_transport_caps.uc_odp_caps;
> >  	resp.odp_caps.per_transport_caps.ud_odp_caps =
> >  		attr.odp_caps.per_transport_caps.ud_odp_caps;
> > -#endif
>
> This one is maybe OK, assuming the driver fills in 0..
>
> >  	resp.timestamp_mask = attr.timestamp_mask;
> >  	resp.hca_core_clock = attr.hca_core_clock;
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index 6584e638387a..d7e5ba5034aa 100644
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -923,11 +923,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
> >  	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
> >  	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
> >
> > -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> >  	if (MLX5_CAP_GEN(mdev, pg))
> >  		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> >  	props->odp_caps = dev->odp_caps;
> > -#endif
>
> But shouldn't this be protected? If the driver has compiled out ODP it
> shouldn't set the cap flag...

I see those capabilities as device properties and not as kernel ones.

The current situation looks bad to me: the same device reports
different information depending on a compilation flag.

If the driver was compiled without CONFIG_INFINIBAND_ON_DEMAND_PAGING,
the core kernel code is compiled without this config too, so the core
code won't pass any ODP-related verbs/flags to the driver.

Thanks

>
> Jason

* Re: [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option
  2018-12-20 17:29     ` Leon Romanovsky
@ 2018-12-20 17:33       ` Jason Gunthorpe
  2018-12-20 17:50         ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2018-12-20 17:33 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

On Thu, Dec 20, 2018 at 07:29:56PM +0200, Leon Romanovsky wrote:
> On Thu, Dec 20, 2018 at 05:20:51PM +0000, Jason Gunthorpe wrote:
> > On Thu, Dec 20, 2018 at 11:23:15AM +0200, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro@mellanox.com>
> > >
> > > Device capability bits are exposing what specific device supports from
> > > HW perspective. Those bits are not dependent on kernel configurations
> > > and RDMA/core should ensure that proper interfaces to users will be
> > > disabled if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set.
> > >
> > > Fixes: f4056bfd8ccf ("IB/core: Add on demand paging caps to ib_uverbs_ex_query_device")
> > > Fixes: 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb")
> > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > >  drivers/infiniband/core/uverbs_cmd.c | 2 --
> > >  drivers/infiniband/hw/mlx5/main.c    | 2 --
> > >  2 files changed, 4 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > > index 6b12cc5f97b2..549d9eedf62e 100644
> > > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > > @@ -3609,7 +3609,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> > >
> > >  	copy_query_dev_fields(ucontext, &resp.base, &attr);
> > >
> > > -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > >  	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > >  	resp.odp_caps.per_transport_caps.rc_odp_caps =
> > >  		attr.odp_caps.per_transport_caps.rc_odp_caps;
> > > @@ -3617,7 +3616,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> > >  		attr.odp_caps.per_transport_caps.uc_odp_caps;
> > >  	resp.odp_caps.per_transport_caps.ud_odp_caps =
> > >  		attr.odp_caps.per_transport_caps.ud_odp_caps;
> > > -#endif
> >
> > This one is maybe OK, assuming the driver fills in 0..
> >
> > >  	resp.timestamp_mask = attr.timestamp_mask;
> > >  	resp.hca_core_clock = attr.hca_core_clock;
> > > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > > index 6584e638387a..d7e5ba5034aa 100644
> > > +++ b/drivers/infiniband/hw/mlx5/main.c
> > > @@ -923,11 +923,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
> > >  	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
> > >  	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
> > >
> > > -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > >  	if (MLX5_CAP_GEN(mdev, pg))
> > >  		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> > >  	props->odp_caps = dev->odp_caps;
> > > -#endif
> >
> > But shouldn't this be protected? If the driver has compiled out ODP it
> > shouldn't set the cap flag...
> 
> I see those capabilities as device properties and not as kernel ones.
> 
> Current situation looks bad for me, when I have same device which
> reports differently information depends on some compilation flag.

It is not a device capability, it is a kernel API capability; if the
API is not available the bit should not be set.

The field is badly named.

Jason

* Re: [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option
  2018-12-20 17:33       ` Jason Gunthorpe
@ 2018-12-20 17:50         ` Leon Romanovsky
  2018-12-20 23:22           ` Jason Gunthorpe
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-20 17:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

On Thu, Dec 20, 2018 at 05:33:11PM +0000, Jason Gunthorpe wrote:
> On Thu, Dec 20, 2018 at 07:29:56PM +0200, Leon Romanovsky wrote:
> > On Thu, Dec 20, 2018 at 05:20:51PM +0000, Jason Gunthorpe wrote:
> > > On Thu, Dec 20, 2018 at 11:23:15AM +0200, Leon Romanovsky wrote:
> > > > From: Leon Romanovsky <leonro@mellanox.com>
> > > >
> > > > Device capability bits are exposing what specific device supports from
> > > > HW perspective. Those bits are not dependent on kernel configurations
> > > > and RDMA/core should ensure that proper interfaces to users will be
> > > > disabled if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set.
> > > >
> > > > Fixes: f4056bfd8ccf ("IB/core: Add on demand paging caps to ib_uverbs_ex_query_device")
> > > > Fixes: 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb")
> > > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > > >  drivers/infiniband/core/uverbs_cmd.c | 2 --
> > > >  drivers/infiniband/hw/mlx5/main.c    | 2 --
> > > >  2 files changed, 4 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > > > index 6b12cc5f97b2..549d9eedf62e 100644
> > > > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > > > @@ -3609,7 +3609,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> > > >
> > > >  	copy_query_dev_fields(ucontext, &resp.base, &attr);
> > > >
> > > > -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > > >  	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > > >  	resp.odp_caps.per_transport_caps.rc_odp_caps =
> > > >  		attr.odp_caps.per_transport_caps.rc_odp_caps;
> > > > @@ -3617,7 +3616,6 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> > > >  		attr.odp_caps.per_transport_caps.uc_odp_caps;
> > > >  	resp.odp_caps.per_transport_caps.ud_odp_caps =
> > > >  		attr.odp_caps.per_transport_caps.ud_odp_caps;
> > > > -#endif
> > >
> > > This one is maybe OK, assuming the driver fills in 0..
> > >
> > > >  	resp.timestamp_mask = attr.timestamp_mask;
> > > >  	resp.hca_core_clock = attr.hca_core_clock;
> > > > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > > > index 6584e638387a..d7e5ba5034aa 100644
> > > > +++ b/drivers/infiniband/hw/mlx5/main.c
> > > > @@ -923,11 +923,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
> > > >  	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
> > > >  	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
> > > >
> > > > -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > > >  	if (MLX5_CAP_GEN(mdev, pg))
> > > >  		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> > > >  	props->odp_caps = dev->odp_caps;
> > > > -#endif
> > >
> > > But shouldn't this be protected? If the driver has compiled out ODP it
> > > shouldn't set the cap flag...
> >
> > I see those capabilities as device properties and not as kernel ones.
> >
> > Current situation looks bad for me, when I have same device which
> > reports differently information depends on some compilation flag.
>
> It is not device capabilities, it is a kernel API capability if the
> API is not available the bit should not be set.

Is this better? Should I resend the series?

diff --git a/drivers/infiniband/hw/mlx5/main.c
b/drivers/infiniband/hw/mlx5/main.c
index 0b68795fb04e..c95c2f545b8f 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -925,7 +925,8 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,

        if (MLX5_CAP_GEN(mdev, pg))
                props->device_cap_flags |=  IB_DEVICE_ON_DEMAND_PAGING;
-       props->odp_caps = dev->odp_caps;
+       if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
+               props->odp_caps =  dev->odp_caps;

        if (MLX5_CAP_GEN(mdev, cd))
                props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;


>
> The field is badly named.
>
> Jason

* Re: [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option
  2018-12-20 17:50         ` Leon Romanovsky
@ 2018-12-20 23:22           ` Jason Gunthorpe
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2018-12-20 23:22 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

On Thu, Dec 20, 2018 at 07:50:20PM +0200, Leon Romanovsky wrote:
> > > > > -#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > > > >  	if (MLX5_CAP_GEN(mdev, pg))
> > > > >  		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> > > > >  	props->odp_caps = dev->odp_caps;
> > > > > -#endif
> > > >
> > > > But shouldn't this be protected? If the driver has compiled out ODP it
> > > > shouldn't set the cap flag...
> > >
> > > I see those capabilities as device properties and not as kernel ones.
> > >
> > > Current situation looks bad for me, when I have same device which
> > > reports differently information depends on some compilation flag.
> >
> > It is not device capabilities, it is a kernel API capability if the
> > API is not available the bit should not be set.
> 
> Is it better? Should I resend the series?

Hrm.. I think I'm inclined to keep the if() in the same place as
the ifdefs. We can revise it later.

I noticed other busted up stuff:
 - IB_DEVICE_ON_DEMAND_PAGING is apparently UAPI but not in a uapi
   header (grr)
 - ucontext shouldn't have a driver callback, that should be moved to
   ops (ie to mlx5_ib_dev_odp_ops)
 - IB_DEVICE_ON_DEMAND_PAGING should be set in core code based
   on ops.invalidate_range being present, drivers shouldn't set it..
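
Roughly what I have in mind for the last point (hand-wavy sketch, the
invalidate_range op doesn't exist in ib_device_ops yet, so
device->ops.invalidate_range here is made up):

	/* in core device registration, instead of each driver setting the bit: */
	if (device->ops.invalidate_range)
		device->attrs.device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
	else
		device->attrs.device_cap_flags &= ~IB_DEVICE_ON_DEMAND_PAGING;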

Jason

* Re: [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage
  2018-12-20  9:23 [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Leon Romanovsky
                   ` (4 preceding siblings ...)
  2018-12-20  9:23 ` [PATCH mlx5-next 5/5] RDMA/mlx5: Delete declaration of already removed function Leon Romanovsky
@ 2018-12-21  3:32 ` Jason Gunthorpe
  2018-12-21 13:59   ` Leon Romanovsky
  5 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2018-12-21  3:32 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Haggai Eran,
	Saeed Mahameed, linux-netdev

On Thu, Dec 20, 2018 at 11:23:13AM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Hi,
> 
> As a followup to Jason's request to rethink CONFIG_INFINIBAND_ON_DEMAND_PAGING
> usage, this series cleans mlx5_ib and RDMA/core code and it is based on already
> sent but not yet accepted patch https://patchwork.kernel.org/patch/10735547/
> 
> It is under extensive testing now, but I wanted to raise awareness as soon
> as possible for the patch "RDMA/core: Don't depend device ODP capabilities
> on kconfig option", which changes behavior for mlx5 devices with
> CONFIG_INFINIBAND_ON_DEMAND_PAGING set to no.
> 
> Thanks
> 
> Leon Romanovsky (5):
>   RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING
>   RDMA/core: Don't depend device ODP capabilities on kconfig option
>   RDMA/mlx5: Introduce and reuse helper to identify ODP MR
>   RDMA/mlx5: Embed into the code flow the ODP config option
>   RDMA/mlx5: Delete declaration of already removed function

I'm imagining something like this integrated into these patches, what
do you think?

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index c6144df47ea47e..c2615b6bb68841 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -95,6 +95,9 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	struct scatterlist *sg, *sg_list_start;
 	unsigned int gup_flags = FOLL_WRITE;
 
+	if ((access & IB_ACCESS_ON_DEMAND) && !context->invalidate_range)
+		return ERR_PTR(-EOPNOTSUPP);
+
 	if (dmasync)
 		dma_attrs |= DMA_ATTR_WRITE_BARRIER;
 
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 4d28db23f53955..241376bae09540 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -236,8 +236,7 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
 
 	mutex_init(&ucontext->per_mm_list_lock);
 	INIT_LIST_HEAD(&ucontext->per_mm_list);
-	if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) ||
-	    !(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
+	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
 		ucontext->invalidate_range = NULL;
 
 	resp.num_comp_vectors = file->device->num_comp_vectors;
@@ -3607,13 +3606,15 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
 
 	copy_query_dev_fields(ucontext, &resp.base, &attr);
 
-	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
-	resp.odp_caps.per_transport_caps.rc_odp_caps =
-		attr.odp_caps.per_transport_caps.rc_odp_caps;
-	resp.odp_caps.per_transport_caps.uc_odp_caps =
-		attr.odp_caps.per_transport_caps.uc_odp_caps;
-	resp.odp_caps.per_transport_caps.ud_odp_caps =
-		attr.odp_caps.per_transport_caps.ud_odp_caps;
+	if (ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING) {
+		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
+		resp.odp_caps.per_transport_caps.rc_odp_caps =
+			attr.odp_caps.per_transport_caps.rc_odp_caps;
+		resp.odp_caps.per_transport_caps.uc_odp_caps =
+			attr.odp_caps.per_transport_caps.uc_odp_caps;
+		resp.odp_caps.per_transport_caps.ud_odp_caps =
+			attr.odp_caps.per_transport_caps.ud_odp_caps;
+	}
 
 	resp.timestamp_mask = attr.timestamp_mask;
 	resp.hca_core_clock = attr.hca_core_clock;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index ff131e4c874ec5..df8366fb0142d6 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -923,9 +923,11 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
 	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
 
-	if (MLX5_CAP_GEN(mdev, pg))
-		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
-	props->odp_caps = dev->odp_caps;
+	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
+		if (MLX5_CAP_GEN(mdev, pg))
+			props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
+		props->odp_caps = dev->odp_caps;
+	}
 
 	if (MLX5_CAP_GEN(mdev, cd))
 		props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
@@ -1761,7 +1763,8 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 	if (err)
 		goto out_sys_pages;
 
-	context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
+	if (ibdev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING)
+		context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
 
 	if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) {
 		err = mlx5_ib_devx_create(dev, true);
diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
index 9f90be296ee0f7..22827ba4b6d8eb 100644
--- a/drivers/infiniband/hw/mlx5/mem.c
+++ b/drivers/infiniband/hw/mlx5/mem.c
@@ -150,7 +150,7 @@ void __mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
 	struct scatterlist *sg;
 	int entry;
 
-	if (umem->is_odp) {
+	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && umem->is_odp) {
 		WARN_ON(shift != 0);
 		WARN_ON(access_flags != (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE));
 
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 65d07c111d42a7..8183e94da5a1ea 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1332,12 +1332,14 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	mlx5_ib_dbg(dev, "start 0x%llx, virt_addr 0x%llx, length 0x%llx, access_flags 0x%x\n",
 		    start, virt_addr, length, access_flags);
 
-	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && !start &&
-	    length == U64_MAX) {
+	if (!start && length == U64_MAX) {
 		if (!(access_flags & IB_ACCESS_ON_DEMAND) ||
 		    !(dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
 			return ERR_PTR(-EINVAL);
 
+		if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
+			return ERR_PTR(-EOPNOTSUPP);
+
 		mr = mlx5_ib_alloc_implicit_mr(to_mpd(pd), access_flags);
 		if (IS_ERR(mr))
 			return ERR_CAST(mr);

* Re: [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage
  2018-12-21  3:32 ` [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Jason Gunthorpe
@ 2018-12-21 13:59   ` Leon Romanovsky
  2018-12-21 16:59     ` Jason Gunthorpe
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-21 13:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

On Thu, Dec 20, 2018 at 08:32:35PM -0700, Jason Gunthorpe wrote:
> On Thu, Dec 20, 2018 at 11:23:13AM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > Hi,
> >
> > As a followup to Jason's request to rethink CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > usage, this series cleans mlx5_ib and RDMA/core code and it is based on already
> > sent but not yet accepted patch https://patchwork.kernel.org/patch/10735547/
> >
> > It is under extensive testing now, but I wanted to raise awareness as soon
> > as possible for the patch "RDMA/core: Don't depend device ODP capabilities
> > on kconfig option", which changes behavior for mlx5 devices with
> > CONFIG_INFINIBAND_ON_DEMAND_PAGING set to no.
> >
> > Thanks
> >
> > Leon Romanovsky (5):
> >   RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING
> >   RDMA/core: Don't depend device ODP capabilities on kconfig option
> >   RDMA/mlx5: Introduce and reuse helper to identify ODP MR
> >   RDMA/mlx5: Embed into the code flow the ODP config option
> >   RDMA/mlx5: Delete declaration of already removed function
>
> I'm imagining something like this integrated into these patches, what
> do you think?

See my comments below.

>
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index c6144df47ea47e..c2615b6bb68841 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -95,6 +95,9 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
>  	struct scatterlist *sg, *sg_list_start;
>  	unsigned int gup_flags = FOLL_WRITE;
>
> +	if ((access & IB_ACCESS_ON_DEMAND) && !context->invalidate_range)
> +		return ERR_PTR(-EOPNOTSUPP);
> +

My expectation is that we won't get into this state, because ib_umem_get()
is too far away from the entry point where we could check for and prevent
unsupported access.

uverbs entry point -> driver code -> ib_umem_get
  ^^^^ this is the better place to check the flags (sketch below).
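
Something like this at the uverbs entry, before any driver or umem code
runs (untested sketch, error unwinding omitted):

	/* e.g. in ib_uverbs_reg_mr(): */
	if ((cmd.access_flags & IB_ACCESS_ON_DEMAND) &&
	    !(pd->device->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
		return -EOPNOTSUPP;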


>  	if (dmasync)
>  		dma_attrs |= DMA_ATTR_WRITE_BARRIER;
>
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 4d28db23f53955..241376bae09540 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -236,8 +236,7 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
>
>  	mutex_init(&ucontext->per_mm_list_lock);
>  	INIT_LIST_HEAD(&ucontext->per_mm_list);
> -	if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) ||
> -	    !(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
> +	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
>  		ucontext->invalidate_range = NULL;

No problem

>
>  	resp.num_comp_vectors = file->device->num_comp_vectors;
> @@ -3607,13 +3606,15 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
>
>  	copy_query_dev_fields(ucontext, &resp.base, &attr);
>
> -	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> -	resp.odp_caps.per_transport_caps.rc_odp_caps =
> -		attr.odp_caps.per_transport_caps.rc_odp_caps;
> -	resp.odp_caps.per_transport_caps.uc_odp_caps =
> -		attr.odp_caps.per_transport_caps.uc_odp_caps;
> -	resp.odp_caps.per_transport_caps.ud_odp_caps =
> -		attr.odp_caps.per_transport_caps.ud_odp_caps;
> +	if (ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING) {
> +		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> +		resp.odp_caps.per_transport_caps.rc_odp_caps =
> +			attr.odp_caps.per_transport_caps.rc_odp_caps;
> +		resp.odp_caps.per_transport_caps.uc_odp_caps =
> +			attr.odp_caps.per_transport_caps.uc_odp_caps;
> +		resp.odp_caps.per_transport_caps.ud_odp_caps =
> +			attr.odp_caps.per_transport_caps.ud_odp_caps;
> +	}

"attr" is initialized to zero, there is no need to place those odp_caps under "if",

>
>  	resp.timestamp_mask = attr.timestamp_mask;
>  	resp.hca_core_clock = attr.hca_core_clock;
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index ff131e4c874ec5..df8366fb0142d6 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -923,9 +923,11 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
>  	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
>  	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
>
> -	if (MLX5_CAP_GEN(mdev, pg))
> -		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> -	props->odp_caps = dev->odp_caps;
> +	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
> +		if (MLX5_CAP_GEN(mdev, pg))
> +			props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> +		props->odp_caps = dev->odp_caps;
> +	}

I accept your claim about odp_caps being SW properties, but why did
you place device_cap_flags under CONFIG_INFINIBAND_ON_DEMAND_PAGING,
especially when it is set based on a HW capability?

>
>  	if (MLX5_CAP_GEN(mdev, cd))
>  		props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
> @@ -1761,7 +1763,8 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
>  	if (err)
>  		goto out_sys_pages;
>
> -	context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
> +	if (ibdev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING)
> +		context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;

We are not supposed to call invalidate_range() if the umem is not ODP,
which means that the "if" below is redundant.

>
>  	if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) {
>  		err = mlx5_ib_devx_create(dev, true);
> diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
> index 9f90be296ee0f7..22827ba4b6d8eb 100644
> --- a/drivers/infiniband/hw/mlx5/mem.c
> +++ b/drivers/infiniband/hw/mlx5/mem.c
> @@ -150,7 +150,7 @@ void __mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
>  	struct scatterlist *sg;
>  	int entry;
>
> -	if (umem->is_odp) {
> +	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && umem->is_odp) {

How can we have is_odp == true with CONFIG_INFINIBAND_ON_DEMAND_PAGING=n?
The mlx5 code expects that if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not
set, all occurrences of is_odp are false.

>  		WARN_ON(shift != 0);
>  		WARN_ON(access_flags != (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE));
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 65d07c111d42a7..8183e94da5a1ea 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1332,12 +1332,14 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
>  	mlx5_ib_dbg(dev, "start 0x%llx, virt_addr 0x%llx, length 0x%llx, access_flags 0x%x\n",
>  		    start, virt_addr, length, access_flags);
>
> -	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && !start &&
> -	    length == U64_MAX) {
> +	if (!start && length == U64_MAX) {
>  		if (!(access_flags & IB_ACCESS_ON_DEMAND) ||
>  		    !(dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
>  			return ERR_PTR(-EINVAL);
>
> +		if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
> +			return ERR_PTR(-EOPNOTSUPP);
> +

I tried to preserve the previous behavior: that piece of code was simply
skipped if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set. Your new code
returns -EOPNOTSUPP instead. That may be right or wrong, but such a
change should be a standalone patch.

>  		mr = mlx5_ib_alloc_implicit_mr(to_mpd(pd), access_flags);
>  		if (IS_ERR(mr))
>  			return ERR_CAST(mr);

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage
  2018-12-21 13:59   ` Leon Romanovsky
@ 2018-12-21 16:59     ` Jason Gunthorpe
  2018-12-22  9:18       ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2018-12-21 16:59 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

On Fri, Dec 21, 2018 at 03:59:54PM +0200, Leon Romanovsky wrote:

> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index c6144df47ea47e..c2615b6bb68841 100644
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -95,6 +95,9 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
> >  	struct scatterlist *sg, *sg_list_start;
> >  	unsigned int gup_flags = FOLL_WRITE;
> >
> > +	if ((access & IB_ACCESS_ON_DEMAND) && !context->invalidate_range)
> > +		return ERR_PTR(-EOPNOTSUPP);
> > +
> 
> My expectation that we won't be in this state because it is too far away
> from entry where we could check and prevent unsupported access.
> 
> uverbs entry point -> driver code -> ib_umem_get
>   ^^^^ this is better place to check right flags.

I view umem as the proper core entry point here. We should never
allow an ODP umem to be created without driver support.

> > @@ -3607,13 +3606,15 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> >
> >  	copy_query_dev_fields(ucontext, &resp.base, &attr);
> >
> > -	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > -	resp.odp_caps.per_transport_caps.rc_odp_caps =
> > -		attr.odp_caps.per_transport_caps.rc_odp_caps;
> > -	resp.odp_caps.per_transport_caps.uc_odp_caps =
> > -		attr.odp_caps.per_transport_caps.uc_odp_caps;
> > -	resp.odp_caps.per_transport_caps.ud_odp_caps =
> > -		attr.odp_caps.per_transport_caps.ud_odp_caps;
> > +	if (ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING) {
> > +		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > +		resp.odp_caps.per_transport_caps.rc_odp_caps =
> > +			attr.odp_caps.per_transport_caps.rc_odp_caps;
> > +		resp.odp_caps.per_transport_caps.uc_odp_caps =
> > +			attr.odp_caps.per_transport_caps.uc_odp_caps;
> > +		resp.odp_caps.per_transport_caps.ud_odp_caps =
> > +			attr.odp_caps.per_transport_caps.ud_odp_caps;
> > +	}
> 
> "attr" is initialized to zero, there is no need to place those odp_caps under "if",

It is not zero, it is the result of query_device - and broken drivers
should not be allowed to put non-zero values here..

> >  	resp.timestamp_mask = attr.timestamp_mask;
> >  	resp.hca_core_clock = attr.hca_core_clock;
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index ff131e4c874ec5..df8366fb0142d6 100644
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -923,9 +923,11 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
> >  	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
> >  	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
> >
> > -	if (MLX5_CAP_GEN(mdev, pg))
> > -		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> > -	props->odp_caps = dev->odp_caps;
> > +	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
> > +		if (MLX5_CAP_GEN(mdev, pg))
> > +			props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> > +		props->odp_caps = dev->odp_caps;
> > +	}
> 
> I accepted your claim about odp_caps being SW properties, but why did
> you place device_cap_flags under CONFIG_INFINIBAND_ON_DEMAND_PAGING?
> Especially when it is set based on HW capability.

My claim was about device_cap_flags, it indicates the SW capability
for ODP, it has nothing to do with HW.

> >
> >  	if (MLX5_CAP_GEN(mdev, cd))
> >  		props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
> > @@ -1761,7 +1763,8 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
> >  	if (err)
> >  		goto out_sys_pages;
> >
> > -	context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
> > +	if (ibdev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING)
> > +		context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
> 
> We are not supposed to call to invalidate_range() if umem is not ODP.
> It means that the below "if" is redundant.

I want to see all function pointers the driver does not implement set
to NULL... I suppose this needs to have an IS_ENABLED too since
mlx5_ib_invalidate_range is in odp.c
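Something like this (untested) is what I have in mind:

	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) &&
	    (ibdev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
		context->ibucontext.invalidate_range =
			&mlx5_ib_invalidate_range;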

> > diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
> > index 9f90be296ee0f7..22827ba4b6d8eb 100644
> > +++ b/drivers/infiniband/hw/mlx5/mem.c
> > @@ -150,7 +150,7 @@ void __mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
> >  	struct scatterlist *sg;
> >  	int entry;
> >
> > -	if (umem->is_odp) {
> > +	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && umem->is_odp) {
> 
> How can we have is_odp == True and CONFIG_INFINIBAND_ON_DEMAND_PAGING = n?
> mlx5 code expects that if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set,
> all occurrences of is_odp are false.

Whenever you see the IS_ENABLED it is just size-optimizing code. The
compiler can't know that is_odp == FALSE, so the extra check is
needed.
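
To spell it out, with CONFIG_INFINIBAND_ON_DEMAND_PAGING=n the
IS_ENABLED() test is a compile-time 0, so the condition folds to
'if (0)' and the compiler discards the block together with the
odp-only code it references:

	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && umem->is_odp) {
		/* Never emitted when ODP is compiled out; a bare
		 * 'if (umem->is_odp)' cannot be folded because is_odp is
		 * only known at run time.
		 */
		WARN_ON(shift != 0);
		WARN_ON(access_flags != (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE));
	}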

> > diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> > index 65d07c111d42a7..8183e94da5a1ea 100644
> > +++ b/drivers/infiniband/hw/mlx5/mr.c
> > @@ -1332,12 +1332,14 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
> >  	mlx5_ib_dbg(dev, "start 0x%llx, virt_addr 0x%llx, length 0x%llx, access_flags 0x%x\n",
> >  		    start, virt_addr, length, access_flags);
> >
> > -	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && !start &&
> > -	    length == U64_MAX) {
> > +	if (!start && length == U64_MAX) {
> >  		if (!(access_flags & IB_ACCESS_ON_DEMAND) ||
> >  		    !(dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
> >  			return ERR_PTR(-EINVAL);
> >
> > +		if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
> > +			return ERR_PTR(-EOPNOTSUPP);
> > +
> 
> I tried to preserve previous behavior and that piece of code was simply
> skipped if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set. You will
> return -EOPNOTSUPP in new code. It can be right, it can be wrong, but
> that change should be standalone.

Sure - it is a bug that it checks arguments differently depending on
compile options.

Jason

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage
  2018-12-21 16:59     ` Jason Gunthorpe
@ 2018-12-22  9:18       ` Leon Romanovsky
  0 siblings, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2018-12-22  9:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Haggai Eran, Saeed Mahameed,
	linux-netdev

[-- Attachment #1: Type: text/plain, Size: 7132 bytes --]

On Fri, Dec 21, 2018 at 09:59:12AM -0700, Jason Gunthorpe wrote:
> On Fri, Dec 21, 2018 at 03:59:54PM +0200, Leon Romanovsky wrote:
>
> > > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > > index c6144df47ea47e..c2615b6bb68841 100644
> > > +++ b/drivers/infiniband/core/umem.c
> > > @@ -95,6 +95,9 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
> > >  	struct scatterlist *sg, *sg_list_start;
> > >  	unsigned int gup_flags = FOLL_WRITE;
> > >
> > > +	if ((access & IB_ACCESS_ON_DEMAND) && !context->invalidate_range)
> > > +		return ERR_PTR(-EOPNOTSUPP);
> > > +
> >
> > My expectation that we won't be in this state because it is too far away
> > from entry where we could check and prevent unsupported access.
> >
> > uverbs entry point -> driver code -> ib_umem_get
> >   ^^^^ this is better place to check right flags.
>
> I view umem as the proper core entry point here. We should never
> allow an ODP umem to be created without driver support.

And drivers should never ask for an ODP umem if they don't support it.
You are asking to add code that is, by definition, unreachable.
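An unsupported IB_ACCESS_ON_DEMAND request should be rejected before
ib_umem_get() is ever reached, e.g. at the driver entry. Untested
sketch, and the exact cap bit to test (IB_ODP_SUPPORT here) is only
for illustration:

	/* e.g. in the driver's reg_user_mr(), before ib_umem_get() */
	if ((access_flags & IB_ACCESS_ON_DEMAND) &&
	    !(dev->odp_caps.general_caps & IB_ODP_SUPPORT))
		return ERR_PTR(-EOPNOTSUPP);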

>
> > > @@ -3607,13 +3606,15 @@ static int ib_uverbs_ex_query_device(struct uverbs_attr_bundle *attrs)
> > >
> > >  	copy_query_dev_fields(ucontext, &resp.base, &attr);
> > >
> > > -	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > > -	resp.odp_caps.per_transport_caps.rc_odp_caps =
> > > -		attr.odp_caps.per_transport_caps.rc_odp_caps;
> > > -	resp.odp_caps.per_transport_caps.uc_odp_caps =
> > > -		attr.odp_caps.per_transport_caps.uc_odp_caps;
> > > -	resp.odp_caps.per_transport_caps.ud_odp_caps =
> > > -		attr.odp_caps.per_transport_caps.ud_odp_caps;
> > > +	if (ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING) {
> > > +		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > > +		resp.odp_caps.per_transport_caps.rc_odp_caps =
> > > +			attr.odp_caps.per_transport_caps.rc_odp_caps;
> > > +		resp.odp_caps.per_transport_caps.uc_odp_caps =
> > > +			attr.odp_caps.per_transport_caps.uc_odp_caps;
> > > +		resp.odp_caps.per_transport_caps.ud_odp_caps =
> > > +			attr.odp_caps.per_transport_caps.ud_odp_caps;
> > > +	}
> >
> > "attr" is initialized to zero, there is no need to place those odp_caps under "if",
>
> It is not zero, it is the result of query_device - and broken drivers
> should not be allowed to put non-zero values here..

Do you have a real example in mind? If we had such broken drivers in the
subsystem, we would have seen it already, since the ODP config option is
enabled by default in many distributions.

>
> > >  	resp.timestamp_mask = attr.timestamp_mask;
> > >  	resp.hca_core_clock = attr.hca_core_clock;
> > > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > > index ff131e4c874ec5..df8366fb0142d6 100644
> > > +++ b/drivers/infiniband/hw/mlx5/main.c
> > > @@ -923,9 +923,11 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
> > >  	props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
> > >  	props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
> > >
> > > -	if (MLX5_CAP_GEN(mdev, pg))
> > > -		props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> > > -	props->odp_caps = dev->odp_caps;
> > > +	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
> > > +		if (MLX5_CAP_GEN(mdev, pg))
> > > +			props->device_cap_flags |= IB_DEVICE_ON_DEMAND_PAGING;
> > > +		props->odp_caps = dev->odp_caps;
> > > +	}
> >
> > I accepted your claim about odp_caps being SW properties, but why did
> > you place device_cap_flags under CONFIG_INFINIBAND_ON_DEMAND_PAGING?
> > Especially when it is set based on HW capability.
>
> My claim was about device_cap_flags, it indicates the SW capability
> for ODP, it has nothing to do with HW.

ODP doesn't exist without HW support.

>
> > >
> > >  	if (MLX5_CAP_GEN(mdev, cd))
> > >  		props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
> > > @@ -1761,7 +1763,8 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
> > >  	if (err)
> > >  		goto out_sys_pages;
> > >
> > > -	context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
> > > +	if (ibdev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING)
> > > +		context->ibucontext.invalidate_range = &mlx5_ib_invalidate_range;
> >
> > We are not supposed to call to invalidate_range() if umem is not ODP.
> > It means that the below "if" is redundant.
>
> I want to see all function pointers the driver does not implement set
> to NULL... I suppose this needs to have an IS_ENABLED too since
> mlx5_ib_invalidate_range is in odp.c

No problem

>
> > > diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
> > > index 9f90be296ee0f7..22827ba4b6d8eb 100644
> > > +++ b/drivers/infiniband/hw/mlx5/mem.c
> > > @@ -150,7 +150,7 @@ void __mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
> > >  	struct scatterlist *sg;
> > >  	int entry;
> > >
> > > -	if (umem->is_odp) {
> > > +	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && umem->is_odp) {
> >
> > How can we have is_odp == True and CONFIG_INFINIBAND_ON_DEMAND_PAGING = n?
> > mlx5 code expects that if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set,
> > all occurrences of is_odp are false.
>
> Whenever you see the IS_ENABLED it is just size-optimizing code. The
> compiler can't know that is_odp == FALSE, so the extra check is
> needed.

It is always correct, but why is it important in this path, and why
wouldn't is_odp being in the processor cache be enough? IMHO, you are
only saving the first cache miss.

>
> > > diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> > > index 65d07c111d42a7..8183e94da5a1ea 100644
> > > +++ b/drivers/infiniband/hw/mlx5/mr.c
> > > @@ -1332,12 +1332,14 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
> > >  	mlx5_ib_dbg(dev, "start 0x%llx, virt_addr 0x%llx, length 0x%llx, access_flags 0x%x\n",
> > >  		    start, virt_addr, length, access_flags);
> > >
> > > -	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && !start &&
> > > -	    length == U64_MAX) {
> > > +	if (!start && length == U64_MAX) {
> > >  		if (!(access_flags & IB_ACCESS_ON_DEMAND) ||
> > >  		    !(dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
> > >  			return ERR_PTR(-EINVAL);
> > >
> > > +		if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
> > > +			return ERR_PTR(-EOPNOTSUPP);
> > > +
> >
> > I tried to preserve previous behavior and that piece of code was simply
> > skipped if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set. You will
> > return -EOPNOTSUPP in new code. It can be right, it can be wrong, but
> > that change should be standalone.
>
> Sure - it is a bug that it checks arguments differently depending on
> compile options.

I'll take a look at it later and will send a followup if needed.

Regarding the rest of your comments, how will we converge?

I disagree with most of your points and believe that you are proposing
to fix non-existent problems with dead code. If you insist, I'll add
your code as long as it helps us make progress here; should I?

Thanks

>
> Jason

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-12-22  9:18 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-20  9:23 [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Leon Romanovsky
2018-12-20  9:23 ` [PATCH rdma-next 1/5] RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGING Leon Romanovsky
2018-12-20  9:23 ` [PATCH rdma-next 2/5] RDMA/core: Don't depend device ODP capabilities on kconfig option Leon Romanovsky
2018-12-20 17:20   ` Jason Gunthorpe
2018-12-20 17:29     ` Leon Romanovsky
2018-12-20 17:33       ` Jason Gunthorpe
2018-12-20 17:50         ` Leon Romanovsky
2018-12-20 23:22           ` Jason Gunthorpe
2018-12-20  9:23 ` [PATCH rdma-next 3/5] RDMA/mlx5: Introduce and reuse helper to identify ODP MR Leon Romanovsky
2018-12-20  9:23 ` [PATCH rdma-next 4/5] RDMA/mlx5: Embed into the code flow the ODP config option Leon Romanovsky
2018-12-20  9:23 ` [PATCH mlx5-next 5/5] RDMA/mlx5: Delete declaration of already removed function Leon Romanovsky
2018-12-21  3:32 ` [PATCH rdma-next 0/5] Cleanup of CONFIG_INFINIBAND_ON_DEMAND_PAGING usage Jason Gunthorpe
2018-12-21 13:59   ` Leon Romanovsky
2018-12-21 16:59     ` Jason Gunthorpe
2018-12-22  9:18       ` Leon Romanovsky
