linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] Simplify query_device() in libibverbs
@ 2020-11-16 20:23 Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 1/9] verbs: Simplify query_device_ex Jason Gunthorpe
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson; +Cc: Gal Pressman

Now that that kernel implements query_device_ex for every driver we should
do the same in userspace. Have providers implement only query_device_ex()
and feed query_device() down the same path. This eliminates some
duplicated code in the providers and allows every provider to fully
support query_device_ex with any new kernel features added.

This is a PR:

https://github.com/linux-rdma/rdma-core/pull/890

Jason Gunthorpe (9):
  verbs: Simplify query_device_ex
  verbs: Add ibv_cmd_query_device_any()
  mlx5: Move context initialization out of mlx5_query_device_ex()
  efa: Move the context intialization out of efa_query_device_ex()
  mlx4: Move the context intialization out of mlx4_query_device_ex()
  providers: Remove normal query_device() from providers that have _ex
  providers: Convert all providers to implement query_device_ex
  verbs: Remove dead code
  verbs: Delete query_device() internal support

 libibverbs/cmd.c                   | 212 -----------------------------
 libibverbs/cmd_device.c            | 155 +++++++++++++++++++++
 libibverbs/device.c                |   1 +
 libibverbs/driver.h                |  19 +--
 libibverbs/dummy_ops.c             |  21 +--
 libibverbs/libibverbs.map.in       |   2 +-
 libibverbs/verbs.c                 |   5 +-
 libibverbs/verbs.h                 |   3 +-
 providers/bnxt_re/main.c           |   2 +-
 providers/bnxt_re/verbs.c          |  30 ++--
 providers/bnxt_re/verbs.h          |   5 +-
 providers/cxgb4/dev.c              |  10 +-
 providers/cxgb4/libcxgb4.h         |   3 +-
 providers/cxgb4/verbs.c            |  14 +-
 providers/efa/efa.c                |  15 +-
 providers/efa/verbs.c              |  82 ++++++-----
 providers/efa/verbs.h              |   2 +-
 providers/hfi1verbs/hfiverbs.c     |   2 +-
 providers/hfi1verbs/hfiverbs.h     |   5 +-
 providers/hfi1verbs/verbs.c        |  13 +-
 providers/hns/hns_roce_u.c         |   8 +-
 providers/hns/hns_roce_u.h         |   3 +-
 providers/hns/hns_roce_u_verbs.c   |  15 +-
 providers/i40iw/i40iw_umain.c      |   2 +-
 providers/i40iw/i40iw_umain.h      |   4 +-
 providers/i40iw/i40iw_uverbs.c     |  18 ++-
 providers/ipathverbs/ipathverbs.c  |   2 +-
 providers/ipathverbs/ipathverbs.h  |   5 +-
 providers/ipathverbs/verbs.c       |  13 +-
 providers/mlx4/mlx4.c              |  34 +----
 providers/mlx4/mlx4.h              |   3 +-
 providers/mlx4/verbs.c             |  86 +++++++-----
 providers/mlx5/mlx5.c              |  11 +-
 providers/mlx5/mlx5.h              |   3 +-
 providers/mlx5/verbs.c             | 129 +++++++++---------
 providers/mthca/mthca.c            |   2 +-
 providers/mthca/mthca.h            |   3 +-
 providers/mthca/verbs.c            |  13 +-
 providers/ocrdma/ocrdma_main.c     |   2 +-
 providers/ocrdma/ocrdma_main.h     |   4 +-
 providers/ocrdma/ocrdma_verbs.c    |  20 +--
 providers/qedr/qelr_main.c         |   2 +-
 providers/qedr/qelr_verbs.c        |  21 +--
 providers/qedr/qelr_verbs.h        |   3 +-
 providers/rxe/rxe.c                |  15 +-
 providers/siw/siw.c                |  20 +--
 providers/vmw_pvrdma/pvrdma.h      |   3 +-
 providers/vmw_pvrdma/pvrdma_main.c |   2 +-
 providers/vmw_pvrdma/verbs.c       |  13 +-
 util/util.h                        |   3 +
 50 files changed, 511 insertions(+), 552 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/9] verbs: Simplify query_device_ex
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 2/9] verbs: Add ibv_cmd_query_device_any() Jason Gunthorpe
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

The obtuse logic here is hard to read, simplify it with a small macro and
add offsetofend()

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 libibverbs/cmd.c | 146 ++++++++++++++++-------------------------------
 util/util.h      |   3 +
 2 files changed, 52 insertions(+), 97 deletions(-)

diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index 25c8a971540c63..a439f8c06481dd 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -44,6 +44,7 @@
 #include <infiniband/cmd_write.h>
 #include "ibverbs.h"
 #include <ccan/minmax.h>
+#include <util/util.h>
 
 bool verbs_allow_disassociate_destroy;
 
@@ -144,117 +145,68 @@ int ibv_cmd_query_device_ex(struct ibv_context *context,
 	/* Report back supported comp_mask bits. For now no comp_mask bit is
 	 * defined */
 	attr->comp_mask = resp->comp_mask & 0;
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, odp_caps) +
-			 sizeof(attr->odp_caps)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, odp_caps) +
-		    sizeof(resp->odp_caps)) {
-			attr->odp_caps.general_caps = resp->odp_caps.general_caps;
-			attr->odp_caps.per_transport_caps.rc_odp_caps =
-				resp->odp_caps.per_transport_caps.rc_odp_caps;
-			attr->odp_caps.per_transport_caps.uc_odp_caps =
-				resp->odp_caps.per_transport_caps.uc_odp_caps;
-			attr->odp_caps.per_transport_caps.ud_odp_caps =
-				resp->odp_caps.per_transport_caps.ud_odp_caps;
-		}
-	}
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex,
-				  completion_timestamp_mask) +
-			 sizeof(attr->completion_timestamp_mask)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, timestamp_mask) +
-		    sizeof(resp->timestamp_mask))
-			attr->completion_timestamp_mask = resp->timestamp_mask;
+#define CAN_COPY(_ibv_attr, _uverbs_attr)                                      \
+	(attr_size >= offsetofend(struct ibv_device_attr_ex, _ibv_attr) &&     \
+	 resp->response_length >=                                              \
+		 offsetofend(struct ib_uverbs_ex_query_device_resp,            \
+			     _uverbs_attr))
+
+	if (CAN_COPY(odp_caps, odp_caps)) {
+		attr->odp_caps.general_caps = resp->odp_caps.general_caps;
+		attr->odp_caps.per_transport_caps.rc_odp_caps =
+			resp->odp_caps.per_transport_caps.rc_odp_caps;
+		attr->odp_caps.per_transport_caps.uc_odp_caps =
+			resp->odp_caps.per_transport_caps.uc_odp_caps;
+		attr->odp_caps.per_transport_caps.ud_odp_caps =
+			resp->odp_caps.per_transport_caps.ud_odp_caps;
 	}
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, hca_core_clock) +
-			 sizeof(attr->hca_core_clock)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, hca_core_clock) +
-		    sizeof(resp->hca_core_clock))
-			attr->hca_core_clock = resp->hca_core_clock;
-	}
+	if (CAN_COPY(completion_timestamp_mask, timestamp_mask))
+		attr->completion_timestamp_mask = resp->timestamp_mask;
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, device_cap_flags_ex) +
-			 sizeof(attr->device_cap_flags_ex)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, device_cap_flags_ex) +
-		    sizeof(resp->device_cap_flags_ex))
-			attr->device_cap_flags_ex = resp->device_cap_flags_ex;
-	}
+	if (CAN_COPY(hca_core_clock, hca_core_clock))
+		attr->hca_core_clock = resp->hca_core_clock;
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, rss_caps) +
-			 sizeof(attr->rss_caps)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, rss_caps) +
-		    sizeof(resp->rss_caps)) {
-			attr->rss_caps.supported_qpts = resp->rss_caps.supported_qpts;
-			attr->rss_caps.max_rwq_indirection_tables = resp->rss_caps.max_rwq_indirection_tables;
-			attr->rss_caps.max_rwq_indirection_table_size = resp->rss_caps.max_rwq_indirection_table_size;
-		}
-	}
+	if (CAN_COPY(device_cap_flags_ex, device_cap_flags_ex))
+		attr->device_cap_flags_ex = resp->device_cap_flags_ex;
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, max_wq_type_rq) +
-			 sizeof(attr->max_wq_type_rq)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, max_wq_type_rq) +
-		    sizeof(resp->max_wq_type_rq))
-			attr->max_wq_type_rq = resp->max_wq_type_rq;
+	if (CAN_COPY(rss_caps, rss_caps)) {
+		attr->rss_caps.supported_qpts = resp->rss_caps.supported_qpts;
+		attr->rss_caps.max_rwq_indirection_tables =
+			resp->rss_caps.max_rwq_indirection_tables;
+		attr->rss_caps.max_rwq_indirection_table_size =
+			resp->rss_caps.max_rwq_indirection_table_size;
 	}
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, raw_packet_caps) +
-			 sizeof(attr->raw_packet_caps)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, raw_packet_caps) +
-		    sizeof(resp->raw_packet_caps))
-			attr->raw_packet_caps = resp->raw_packet_caps;
-	}
+	if (CAN_COPY(max_wq_type_rq, max_wq_type_rq))
+		attr->max_wq_type_rq = resp->max_wq_type_rq;
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, tm_caps) +
-			 sizeof(attr->tm_caps)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, tm_caps) +
-		    sizeof(resp->tm_caps)) {
-			attr->tm_caps.max_rndv_hdr_size =
-				resp->tm_caps.max_rndv_hdr_size;
-			attr->tm_caps.max_num_tags =
-				resp->tm_caps.max_num_tags;
-			attr->tm_caps.flags = resp->tm_caps.flags;
-			attr->tm_caps.max_ops =
-				resp->tm_caps.max_ops;
-			attr->tm_caps.max_sge =
-				resp->tm_caps.max_sge;
-		}
-	}
+	if (CAN_COPY(raw_packet_caps, raw_packet_caps))
+		attr->raw_packet_caps = resp->raw_packet_caps;
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, cq_mod_caps) +
-			 sizeof(attr->cq_mod_caps)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, cq_moderation_caps) +
-		    sizeof(resp->cq_moderation_caps)) {
-			attr->cq_mod_caps.max_cq_count = resp->cq_moderation_caps.max_cq_moderation_count;
-			attr->cq_mod_caps.max_cq_period = resp->cq_moderation_caps.max_cq_moderation_period;
-		}
+	if (CAN_COPY(tm_caps, tm_caps)) {
+		attr->tm_caps.max_rndv_hdr_size =
+			resp->tm_caps.max_rndv_hdr_size;
+		attr->tm_caps.max_num_tags = resp->tm_caps.max_num_tags;
+		attr->tm_caps.flags = resp->tm_caps.flags;
+		attr->tm_caps.max_ops = resp->tm_caps.max_ops;
+		attr->tm_caps.max_sge = resp->tm_caps.max_sge;
 	}
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, max_dm_size) +
-			sizeof(attr->max_dm_size)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, max_dm_size) +
-		    sizeof(resp->max_dm_size)) {
-			attr->max_dm_size = resp->max_dm_size;
-		}
+	if (CAN_COPY(cq_mod_caps, cq_moderation_caps)) {
+		attr->cq_mod_caps.max_cq_count =
+			resp->cq_moderation_caps.max_cq_moderation_count;
+		attr->cq_mod_caps.max_cq_period =
+			resp->cq_moderation_caps.max_cq_moderation_period;
 	}
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, xrc_odp_caps) +
-			sizeof(attr->xrc_odp_caps)) {
-		if (resp->response_length >=
-		    offsetof(struct ib_uverbs_ex_query_device_resp, xrc_odp_caps) +
-		    sizeof(resp->xrc_odp_caps)) {
-			attr->xrc_odp_caps = resp->xrc_odp_caps;
-		}
-	}
+	if (CAN_COPY(max_dm_size, max_dm_size))
+		attr->max_dm_size = resp->max_dm_size;
+
+	if (CAN_COPY(xrc_odp_caps, xrc_odp_caps))
+		attr->xrc_odp_caps = resp->xrc_odp_caps;
+#undef CAN_COPY
 
 	return 0;
 }
diff --git a/util/util.h b/util/util.h
index 0f2c35cd0647ce..47346ca1bf5841 100644
--- a/util/util.h
+++ b/util/util.h
@@ -23,6 +23,9 @@ static inline bool __good_snprintf(size_t len, int rc)
 	 ((a)->tv_nsec CMP (b)->tv_nsec) :	\
 	 ((a)->tv_sec CMP (b)->tv_sec))
 
+#define offsetofend(_type, _member)                                            \
+	(offsetof(_type, _member) + sizeof(((_type *)0)->_member))
+
 static inline unsigned long align(unsigned long val, unsigned long align)
 {
 	return (val + align - 1) & ~(align - 1);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/9] verbs: Add ibv_cmd_query_device_any()
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 1/9] verbs: Simplify query_device_ex Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-18 12:43   ` Gal Pressman
  2020-11-16 20:23 ` [PATCH 3/9] mlx5: Move context initialization out of mlx5_query_device_ex() Jason Gunthorpe
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

This implements all the query_device command flows under a single call.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 libibverbs/cmd_device.c      | 155 +++++++++++++++++++++++++++++++++++
 libibverbs/driver.h          |   5 ++
 libibverbs/libibverbs.map.in |   1 +
 3 files changed, 161 insertions(+)

diff --git a/libibverbs/cmd_device.c b/libibverbs/cmd_device.c
index 6c8e01ec9866a9..0019784ee779c1 100644
--- a/libibverbs/cmd_device.c
+++ b/libibverbs/cmd_device.c
@@ -35,6 +35,7 @@
 #include <stdlib.h>
 #include <dirent.h>
 #include <infiniband/cmd_write.h>
+#include <util/util.h>
 
 #include <net/if.h>
 
@@ -516,3 +517,157 @@ ssize_t _ibv_query_gid_table(struct ibv_context *context,
 
 	return num_entries;
 }
+
+int ibv_cmd_query_device_any(struct ibv_context *context,
+			     const struct ibv_query_device_ex_input *input,
+			     struct ibv_device_attr_ex *attr, size_t attr_size,
+			     struct ib_uverbs_ex_query_device_resp *resp,
+			     size_t *resp_size)
+{
+	struct ib_uverbs_ex_query_device_resp internal_resp;
+	size_t internal_resp_size;
+	int err;
+
+	if (input && input->comp_mask)
+		return EINVAL;
+	if (attr_size < sizeof(attr->orig_attr))
+		return EINVAL;
+
+	if (!resp) {
+		resp = &internal_resp;
+		internal_resp_size = sizeof(internal_resp);
+		resp_size = &internal_resp_size;
+	}
+	memset(attr, 0, attr_size);
+	memset(resp, 0, *resp_size);
+
+	if (attr_size > sizeof(attr->orig_attr)) {
+		struct ibv_query_device_ex cmd = {};
+
+		err = execute_cmd_write_ex(context,
+					   IB_USER_VERBS_EX_CMD_QUERY_DEVICE,
+					   &cmd, sizeof(cmd), resp, *resp_size);
+		if (err) {
+			if (err != EOPNOTSUPP)
+				return err;
+			attr_size = sizeof(attr->orig_attr);
+		}
+	}
+
+	if (attr_size == sizeof(attr->orig_attr)) {
+		struct ibv_query_device cmd = {};
+
+		err = execute_cmd_write(context, IB_USER_VERBS_CMD_QUERY_DEVICE,
+					&cmd, sizeof(cmd), &resp->base,
+					sizeof(resp->base));
+		if (err)
+			return err;
+		resp->response_length = sizeof(resp->base);
+	}
+
+	*resp_size = resp->response_length;
+	attr->orig_attr.node_guid = resp->base.node_guid;
+	attr->orig_attr.sys_image_guid = resp->base.sys_image_guid;
+	attr->orig_attr.max_mr_size = resp->base.max_mr_size;
+	attr->orig_attr.page_size_cap = resp->base.page_size_cap;
+	attr->orig_attr.vendor_id = resp->base.vendor_id;
+	attr->orig_attr.vendor_part_id = resp->base.vendor_part_id;
+	attr->orig_attr.hw_ver = resp->base.hw_ver;
+	attr->orig_attr.max_qp = resp->base.max_qp;
+	attr->orig_attr.max_qp_wr = resp->base.max_qp_wr;
+	attr->orig_attr.device_cap_flags = resp->base.device_cap_flags;
+	attr->orig_attr.max_sge = resp->base.max_sge;
+	attr->orig_attr.max_sge_rd = resp->base.max_sge_rd;
+	attr->orig_attr.max_cq = resp->base.max_cq;
+	attr->orig_attr.max_cqe = resp->base.max_cqe;
+	attr->orig_attr.max_mr = resp->base.max_mr;
+	attr->orig_attr.max_pd = resp->base.max_pd;
+	attr->orig_attr.max_qp_rd_atom = resp->base.max_qp_rd_atom;
+	attr->orig_attr.max_ee_rd_atom = resp->base.max_ee_rd_atom;
+	attr->orig_attr.max_res_rd_atom = resp->base.max_res_rd_atom;
+	attr->orig_attr.max_qp_init_rd_atom = resp->base.max_qp_init_rd_atom;
+	attr->orig_attr.max_ee_init_rd_atom = resp->base.max_ee_init_rd_atom;
+	attr->orig_attr.atomic_cap = resp->base.atomic_cap;
+	attr->orig_attr.max_ee = resp->base.max_ee;
+	attr->orig_attr.max_rdd = resp->base.max_rdd;
+	attr->orig_attr.max_mw = resp->base.max_mw;
+	attr->orig_attr.max_raw_ipv6_qp = resp->base.max_raw_ipv6_qp;
+	attr->orig_attr.max_raw_ethy_qp = resp->base.max_raw_ethy_qp;
+	attr->orig_attr.max_mcast_grp = resp->base.max_mcast_grp;
+	attr->orig_attr.max_mcast_qp_attach = resp->base.max_mcast_qp_attach;
+	attr->orig_attr.max_total_mcast_qp_attach =
+		resp->base.max_total_mcast_qp_attach;
+	attr->orig_attr.max_ah = resp->base.max_ah;
+	attr->orig_attr.max_fmr = resp->base.max_fmr;
+	attr->orig_attr.max_map_per_fmr = resp->base.max_map_per_fmr;
+	attr->orig_attr.max_srq = resp->base.max_srq;
+	attr->orig_attr.max_srq_wr = resp->base.max_srq_wr;
+	attr->orig_attr.max_srq_sge = resp->base.max_srq_sge;
+	attr->orig_attr.max_pkeys = resp->base.max_pkeys;
+	attr->orig_attr.local_ca_ack_delay = resp->base.local_ca_ack_delay;
+	attr->orig_attr.phys_port_cnt = resp->base.phys_port_cnt;
+
+#define CAN_COPY(_ibv_attr, _uverbs_attr)                                      \
+	(attr_size >= offsetofend(struct ibv_device_attr_ex, _ibv_attr) &&     \
+	 resp->response_length >=                                              \
+		 offsetofend(struct ib_uverbs_ex_query_device_resp,            \
+			     _uverbs_attr))
+
+	if (CAN_COPY(odp_caps, odp_caps)) {
+		attr->odp_caps.general_caps = resp->odp_caps.general_caps;
+		attr->odp_caps.per_transport_caps.rc_odp_caps =
+			resp->odp_caps.per_transport_caps.rc_odp_caps;
+		attr->odp_caps.per_transport_caps.uc_odp_caps =
+			resp->odp_caps.per_transport_caps.uc_odp_caps;
+		attr->odp_caps.per_transport_caps.ud_odp_caps =
+			resp->odp_caps.per_transport_caps.ud_odp_caps;
+	}
+
+	if (CAN_COPY(completion_timestamp_mask, timestamp_mask))
+		attr->completion_timestamp_mask = resp->timestamp_mask;
+
+	if (CAN_COPY(hca_core_clock, hca_core_clock))
+		attr->hca_core_clock = resp->hca_core_clock;
+
+	if (CAN_COPY(device_cap_flags_ex, device_cap_flags_ex))
+		attr->device_cap_flags_ex = resp->device_cap_flags_ex;
+
+	if (CAN_COPY(rss_caps, rss_caps)) {
+		attr->rss_caps.supported_qpts = resp->rss_caps.supported_qpts;
+		attr->rss_caps.max_rwq_indirection_tables =
+			resp->rss_caps.max_rwq_indirection_tables;
+		attr->rss_caps.max_rwq_indirection_table_size =
+			resp->rss_caps.max_rwq_indirection_table_size;
+	}
+
+	if (CAN_COPY(max_wq_type_rq, max_wq_type_rq))
+		attr->max_wq_type_rq = resp->max_wq_type_rq;
+
+	if (CAN_COPY(raw_packet_caps, raw_packet_caps))
+		attr->raw_packet_caps = resp->raw_packet_caps;
+
+	if (CAN_COPY(tm_caps, tm_caps)) {
+		attr->tm_caps.max_rndv_hdr_size =
+			resp->tm_caps.max_rndv_hdr_size;
+		attr->tm_caps.max_num_tags = resp->tm_caps.max_num_tags;
+		attr->tm_caps.flags = resp->tm_caps.flags;
+		attr->tm_caps.max_ops = resp->tm_caps.max_ops;
+		attr->tm_caps.max_sge = resp->tm_caps.max_sge;
+	}
+
+	if (CAN_COPY(cq_mod_caps, cq_moderation_caps)) {
+		attr->cq_mod_caps.max_cq_count =
+			resp->cq_moderation_caps.max_cq_moderation_count;
+		attr->cq_mod_caps.max_cq_period =
+			resp->cq_moderation_caps.max_cq_moderation_period;
+	}
+
+	if (CAN_COPY(max_dm_size, max_dm_size))
+		attr->max_dm_size = resp->max_dm_size;
+
+	if (CAN_COPY(xrc_odp_caps, xrc_odp_caps))
+		attr->xrc_odp_caps = resp->xrc_odp_caps;
+#undef CAN_COPY
+
+	return 0;
+}
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 87d1a030a39c2d..e54db0ea6413e8 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -460,6 +460,11 @@ int ibv_cmd_create_flow_action_esp(struct ibv_context *ctx,
 int ibv_cmd_modify_flow_action_esp(struct verbs_flow_action *flow_action,
 				   struct ibv_flow_action_esp_attr *attr,
 				   struct ibv_command_buffer *driver);
+int ibv_cmd_query_device_any(struct ibv_context *context,
+			     const struct ibv_query_device_ex_input *input,
+			     struct ibv_device_attr_ex *attr, size_t attr_size,
+			     struct ib_uverbs_ex_query_device_resp *resp,
+			     size_t *resp_size);
 int ibv_cmd_query_device_ex(struct ibv_context *context,
 			    const struct ibv_query_device_ex_input *input,
 			    struct ibv_device_attr_ex *attr, size_t attr_size,
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index 7429016aae0f02..c1f7e09b240ab0 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -204,6 +204,7 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
 		ibv_cmd_post_srq_recv;
 		ibv_cmd_query_context;
 		ibv_cmd_query_device;
+		ibv_cmd_query_device_any;
 		ibv_cmd_query_device_ex;
 		ibv_cmd_query_mr;
 		ibv_cmd_query_port;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/9] mlx5: Move context initialization out of mlx5_query_device_ex()
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 1/9] verbs: Simplify query_device_ex Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 2/9] verbs: Add ibv_cmd_query_device_any() Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-17 17:24   ` Yishai Hadas
  2020-11-16 20:23 ` [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex() Jason Gunthorpe
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

When the user calls mlx5_query_device_ex() it should not cause the context
values to be mutated, only the attribute should be returned.

Move this code to a dedicated function that is only called during context
setup.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 providers/mlx5/mlx5.c  | 10 +------
 providers/mlx5/mlx5.h  |  1 +
 providers/mlx5/verbs.c | 62 ++++++++++++++++++++++++++++--------------
 3 files changed, 44 insertions(+), 29 deletions(-)

diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 1378acf2e2f3af..06b9a52ebb3019 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -1373,7 +1373,6 @@ static int mlx5_set_context(struct mlx5_context *context,
 {
 	struct verbs_context *v_ctx = &context->ibv_ctx;
 	struct ibv_port_attr port_attr = {};
-	struct ibv_device_attr_ex device_attr = {};
 	int cmd_fd = v_ctx->context.cmd_fd;
 	struct mlx5_device *mdev = to_mdev(v_ctx->context.device);
 	struct ibv_device *ibdev = v_ctx->context.device;
@@ -1518,14 +1517,7 @@ bf_done:
 			goto err_free;
 	}
 
-	if (!mlx5_query_device_ex(&v_ctx->context, NULL, &device_attr,
-				  sizeof(struct ibv_device_attr_ex))) {
-		context->cached_device_cap_flags =
-			device_attr.orig_attr.device_cap_flags;
-		context->atomic_cap = device_attr.orig_attr.atomic_cap;
-		context->cached_tso_caps = device_attr.tso_caps;
-		context->max_dm_size = device_attr.max_dm_size;
-	}
+	mlx5_query_device_ctx(context);
 
 	for (j = 0; j < min(MLX5_MAX_PORTS_NUM, context->num_ports); ++j) {
 		memset(&port_attr, 0, sizeof(port_attr));
diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
index 782d29bf757e0b..72e710b7b5e4aa 100644
--- a/providers/mlx5/mlx5.h
+++ b/providers/mlx5/mlx5.h
@@ -878,6 +878,7 @@ __be32 *mlx5_alloc_dbrec(struct mlx5_context *context, struct ibv_pd *pd,
 void mlx5_free_db(struct mlx5_context *context, __be32 *db, struct ibv_pd *pd,
 		  bool custom_alloc);
 
+void mlx5_query_device_ctx(struct mlx5_context *mctx);
 int mlx5_query_device(struct ibv_context *context,
 		       struct ibv_device_attr *attr);
 int mlx5_query_device_ex(struct ibv_context *context,
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index 3622cae1df5017..42c984033d8eaa 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -3450,19 +3450,19 @@ static void get_pci_atomic_caps(struct ibv_context *context,
 	}
 }
 
-static void get_lag_caps(struct ibv_context *ctx)
+static void get_lag_caps(struct mlx5_context *mctx)
 {
 	uint16_t opmod = MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE |
 		HCA_CAP_OPMOD_GET_CUR;
 	uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {};
 	uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {};
-	struct mlx5_context *mctx = to_mctx(ctx);
 	int ret;
 
 	DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
 	DEVX_SET(query_hca_cap_in, in, op_mod, opmod);
 
-	ret = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
+	ret = mlx5dv_devx_general_cmd(&mctx->ibv_ctx.context, in, sizeof(in),
+				      out, sizeof(out));
 	if (ret)
 		return;
 
@@ -3512,6 +3512,41 @@ int mlx5_query_device_ex(struct ibv_context *context,
 	attr->packet_pacing_caps.supported_qpts =
 		resp.packet_pacing_caps.supported_qpts;
 
+	major     = (raw_fw_ver >> 32) & 0xffff;
+	minor     = (raw_fw_ver >> 16) & 0xffff;
+	sub_minor = raw_fw_ver & 0xffff;
+	a = &attr->orig_attr;
+	snprintf(a->fw_ver, sizeof(a->fw_ver), "%d.%d.%04d",
+		 major, minor, sub_minor);
+
+	if (attr_size >= offsetof(struct ibv_device_attr_ex, pci_atomic_caps) +
+			sizeof(attr->pci_atomic_caps))
+		get_pci_atomic_caps(context, attr);
+
+	return 0;
+}
+
+void mlx5_query_device_ctx(struct mlx5_context *mctx)
+{
+	struct ibv_device_attr_ex device_attr;
+	struct mlx5_query_device_ex_resp resp;
+	size_t resp_size = sizeof(resp);
+
+	get_lag_caps(mctx);
+
+	if (!(mctx->cmds_supp_uhw & MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE))
+		return;
+
+	if (ibv_cmd_query_device_any(&mctx->ibv_ctx.context, NULL, &device_attr,
+				     sizeof(device_attr), &resp.ibv_resp,
+				     &resp_size))
+		return;
+
+	mctx->cached_device_cap_flags = device_attr.orig_attr.device_cap_flags;
+	mctx->atomic_cap = device_attr.orig_attr.atomic_cap;
+	mctx->cached_tso_caps = device_attr.tso_caps;
+	mctx->max_dm_size = device_attr.max_dm_size;
+
 	if (resp.mlx5_ib_support_multi_pkt_send_wqes & MLX5_IB_ALLOW_MPW)
 		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_MPW_ALLOWED;
 
@@ -3519,7 +3554,8 @@ int mlx5_query_device_ex(struct ibv_context *context,
 		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_ENHANCED_MPW;
 
 	mctx->cqe_comp_caps.max_num = resp.cqe_comp_caps.max_num;
-	mctx->cqe_comp_caps.supported_format = resp.cqe_comp_caps.supported_format;
+	mctx->cqe_comp_caps.supported_format =
+		resp.cqe_comp_caps.supported_format;
 	mctx->sw_parsing_caps.sw_parsing_offloads =
 		resp.sw_parsing_caps.sw_parsing_offloads;
 	mctx->sw_parsing_caps.supported_qpts =
@@ -3544,25 +3580,11 @@ int mlx5_query_device_ex(struct ibv_context *context,
 		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_CQE_128B_PAD;
 
 	if (resp.flags & MLX5_IB_QUERY_DEV_RESP_PACKET_BASED_CREDIT_MODE)
-		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_PACKET_BASED_CREDIT_MODE;
+		mctx->vendor_cap_flags |=
+			MLX5_VENDOR_CAP_FLAGS_PACKET_BASED_CREDIT_MODE;
 
 	if (resp.flags & MLX5_IB_QUERY_DEV_RESP_FLAGS_SCAT2CQE_DCT)
 		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_SCAT2CQE_DCT;
-
-	major     = (raw_fw_ver >> 32) & 0xffff;
-	minor     = (raw_fw_ver >> 16) & 0xffff;
-	sub_minor = raw_fw_ver & 0xffff;
-	a = &attr->orig_attr;
-	snprintf(a->fw_ver, sizeof(a->fw_ver), "%d.%d.%04d",
-		 major, minor, sub_minor);
-
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, pci_atomic_caps) +
-			sizeof(attr->pci_atomic_caps))
-		get_pci_atomic_caps(context, attr);
-
-	get_lag_caps(context);
-
-	return 0;
 }
 
 static int rwq_sig_enabled(struct ibv_context *context)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex()
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2020-11-16 20:23 ` [PATCH 3/9] mlx5: Move context initialization out of mlx5_query_device_ex() Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-18  7:39   ` Gal Pressman
                     ` (2 more replies)
  2020-11-16 20:23 ` [PATCH 5/9] mlx4: Move the context intialization out of mlx4_query_device_ex() Jason Gunthorpe
                   ` (4 subsequent siblings)
  8 siblings, 3 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson; +Cc: Gal Pressman

When the user calls efa_query_device_ex() it should not cause the context
values to be mutated, only the attribute shuld be returned.

Move this code to a dedicated function that is only called during context
setup.

Cc: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 providers/efa/efa.c   | 14 +------------
 providers/efa/verbs.c | 46 +++++++++++++++++++++++++++++++++++--------
 providers/efa/verbs.h |  1 +
 3 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/providers/efa/efa.c b/providers/efa/efa.c
index 35f9b246a711ec..b24c14f7fa1fe1 100644
--- a/providers/efa/efa.c
+++ b/providers/efa/efa.c
@@ -54,10 +54,7 @@ static struct verbs_context *efa_alloc_context(struct ibv_device *vdev,
 {
 	struct efa_alloc_ucontext_resp resp = {};
 	struct efa_alloc_ucontext cmd = {};
-	struct ibv_device_attr_ex attr;
-	unsigned int qp_table_sz;
 	struct efa_context *ctx;
-	int err;
 
 	cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_TX_BATCH;
 	cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_MIN_SQ_WR;
@@ -86,17 +83,8 @@ static struct verbs_context *efa_alloc_context(struct ibv_device *vdev,
 
 	verbs_set_ops(&ctx->ibvctx, &efa_ctx_ops);
 
-	err = efa_query_device_ex(&ctx->ibvctx.context, NULL, &attr,
-				  sizeof(attr));
-	if (err)
+	if (!efa_query_device_ctx(ctx))
 		goto err_free_spinlock;
-
-	qp_table_sz = roundup_pow_of_two(attr.orig_attr.max_qp);
-	ctx->qp_table_sz_m1 = qp_table_sz - 1;
-	ctx->qp_table = calloc(qp_table_sz, sizeof(*ctx->qp_table));
-	if (!ctx->qp_table)
-		goto err_free_spinlock;
-
 	return &ctx->ibvctx;
 
 err_free_spinlock:
diff --git a/providers/efa/verbs.c b/providers/efa/verbs.c
index 1a9633155c62f8..52d6285f1f409c 100644
--- a/providers/efa/verbs.c
+++ b/providers/efa/verbs.c
@@ -106,14 +106,6 @@ int efa_query_device_ex(struct ibv_context *context,
 	if (err)
 		return err;
 
-	ctx->device_caps = resp.device_caps;
-	ctx->max_sq_wr = resp.max_sq_wr;
-	ctx->max_rq_wr = resp.max_rq_wr;
-	ctx->max_sq_sge = resp.max_sq_sge;
-	ctx->max_rq_sge = resp.max_rq_sge;
-	ctx->max_rdma_size = resp.max_rdma_size;
-	ctx->max_wr_rdma_sge = a->max_sge_rd;
-
 	a->max_qp_wr = min_t(int, a->max_qp_wr,
 			     ctx->max_llq_size / sizeof(struct efa_io_tx_wqe));
 	snprintf(a->fw_ver, sizeof(a->fw_ver), "%u.%u.%u.%u",
@@ -122,6 +114,44 @@ int efa_query_device_ex(struct ibv_context *context,
 	return 0;
 }
 
+int efa_query_device_ctx(struct efa_context *ctx)
+{
+	struct ibv_device_attr_ex attr;
+	struct efa_query_device_ex_resp resp;
+	size_t resp_size = sizeof(resp);
+	unsigned int qp_table_sz;
+	int err;
+
+	if (ctx->cmds_supp_udata_mask & EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE) {
+		err = ibv_cmd_query_device_any(&ctx->ibvctx.context, NULL,
+					       &attr, sizeof(attr),
+					       &resp.ibv_resp, &resp_size);
+		if (err)
+			return err;
+
+		ctx->device_caps = resp.device_caps;
+		ctx->max_sq_wr = resp.max_sq_wr;
+		ctx->max_rq_wr = resp.max_rq_wr;
+		ctx->max_sq_sge = resp.max_sq_sge;
+		ctx->max_rq_sge = resp.max_rq_sge;
+		ctx->max_rdma_size = resp.max_rdma_size;
+		ctx->max_wr_rdma_sge = attr.orig_attr.max_sge_rd;
+	} else {
+		err = ibv_cmd_query_device_any(&ctx->ibvctx.context, NULL,
+					       &attr, sizeof(attr.orig_attr),
+					       NULL, NULL);
+		if (err)
+			return err;
+	}
+
+	qp_table_sz = roundup_pow_of_two(attr.orig_attr.max_qp);
+	ctx->qp_table_sz_m1 = qp_table_sz - 1;
+	ctx->qp_table = calloc(qp_table_sz, sizeof(*ctx->qp_table));
+	if (!ctx->qp_table)
+		return ENOMEM;
+	return 0;
+}
+
 int efadv_query_device(struct ibv_context *ibvctx,
 		       struct efadv_device_attr *attr,
 		       uint32_t inlen)
diff --git a/providers/efa/verbs.h b/providers/efa/verbs.h
index da022e615af064..3b0e4e0d498761 100644
--- a/providers/efa/verbs.h
+++ b/providers/efa/verbs.h
@@ -9,6 +9,7 @@
 #include <infiniband/driver.h>
 #include <infiniband/verbs.h>
 
+int efa_query_device_ctx(struct efa_context *ctx);
 int efa_query_device(struct ibv_context *uctx, struct ibv_device_attr *attr);
 int efa_query_port(struct ibv_context *uctx, uint8_t port,
 		   struct ibv_port_attr *attr);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/9] mlx4: Move the context intialization out of mlx4_query_device_ex()
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2020-11-16 20:23 ` [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex() Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 6/9] providers: Remove normal query_device() from providers that have _ex Jason Gunthorpe
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

When the user calls mlx4_query_device_ex() it should not cause the context
values to be mutated, only the attribute shuld be returned.

Move this code to a dedicated function that is only called during context
setup.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 providers/mlx4/mlx4.c  | 33 +-------------------------------
 providers/mlx4/mlx4.h  |  1 +
 providers/mlx4/verbs.c | 43 +++++++++++++++++++++++++++++++++++-------
 3 files changed, 38 insertions(+), 39 deletions(-)

diff --git a/providers/mlx4/mlx4.c b/providers/mlx4/mlx4.c
index c4a3c557fea426..619b841d788cb2 100644
--- a/providers/mlx4/mlx4.c
+++ b/providers/mlx4/mlx4.c
@@ -136,28 +136,6 @@ static const struct verbs_context_ops mlx4_ctx_ops = {
 	.free_context = mlx4_free_context,
 };
 
-static int mlx4_map_internal_clock(struct mlx4_device *mdev,
-				   struct ibv_context *ibv_ctx)
-{
-	struct mlx4_context *context = to_mctx(ibv_ctx);
-	void *hca_clock_page;
-
-	hca_clock_page = mmap(NULL, mdev->page_size,
-			      PROT_READ, MAP_SHARED, ibv_ctx->cmd_fd,
-			      mdev->page_size * 3);
-
-	if (hca_clock_page == MAP_FAILED) {
-		fprintf(stderr, PFX
-			"Warning: Timestamp available,\n"
-			"but failed to mmap() hca core clock page.\n");
-		return -1;
-	}
-
-	context->hca_core_clock = hca_clock_page +
-		(context->core_clock.offset & (mdev->page_size - 1));
-	return 0;
-}
-
 static struct verbs_context *mlx4_alloc_context(struct ibv_device *ibdev,
 						  int cmd_fd,
 						  void *private_data)
@@ -170,7 +148,6 @@ static struct verbs_context *mlx4_alloc_context(struct ibv_device *ibdev,
 	__u16				bf_reg_size;
 	struct mlx4_device              *dev = to_mdev(ibdev);
 	struct verbs_context		*verbs_ctx;
-	struct ibv_device_attr_ex	dev_attrs;
 
 	context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx,
 					       RDMA_DRIVER_MLX4);
@@ -242,15 +219,7 @@ static struct verbs_context *mlx4_alloc_context(struct ibv_device *ibdev,
 
 	verbs_set_ops(verbs_ctx, &mlx4_ctx_ops);
 
-	context->hca_core_clock = NULL;
-	memset(&dev_attrs, 0, sizeof(dev_attrs));
-	if (!mlx4_query_device_ex(&verbs_ctx->context, NULL, &dev_attrs,
-				  sizeof(struct ibv_device_attr_ex))) {
-		context->max_qp_wr = dev_attrs.orig_attr.max_qp_wr;
-		context->max_sge = dev_attrs.orig_attr.max_sge;
-		if (context->core_clock.offset_valid)
-			mlx4_map_internal_clock(dev, &verbs_ctx->context);
-	}
+	mlx4_query_device_ctx(dev, context);
 
 	return verbs_ctx;
 
diff --git a/providers/mlx4/mlx4.h b/providers/mlx4/mlx4.h
index 479c39d0a69fc4..3c0787144e7e51 100644
--- a/providers/mlx4/mlx4.h
+++ b/providers/mlx4/mlx4.h
@@ -304,6 +304,7 @@ __be32 *mlx4_alloc_db(struct mlx4_context *context, enum mlx4_db_type type);
 void mlx4_free_db(struct mlx4_context *context, enum mlx4_db_type type,
 		  __be32 *db);
 
+void mlx4_query_device_ctx(struct mlx4_device *mdev, struct mlx4_context *mctx);
 int mlx4_query_device(struct ibv_context *context,
 		       struct ibv_device_attr *attr);
 int mlx4_query_device_ex(struct ibv_context *context,
diff --git a/providers/mlx4/verbs.c b/providers/mlx4/verbs.c
index 512297f2eebac0..4fe5c1d87d6d91 100644
--- a/providers/mlx4/verbs.c
+++ b/providers/mlx4/verbs.c
@@ -38,6 +38,7 @@
 #include <string.h>
 #include <pthread.h>
 #include <errno.h>
+#include <sys/mman.h>
 
 #include <util/mmio.h>
 
@@ -70,7 +71,6 @@ int mlx4_query_device_ex(struct ibv_context *context,
 			 struct ibv_device_attr_ex *attr,
 			 size_t attr_size)
 {
-	struct mlx4_context *mctx = to_mctx(context);
 	struct mlx4_query_device_ex_resp resp = {};
 	struct mlx4_query_device_ex cmd = {};
 	uint64_t raw_fw_ver;
@@ -90,12 +90,6 @@ int mlx4_query_device_ex(struct ibv_context *context,
 	attr->tso_caps.max_tso = resp.tso_caps.max_tso;
 	attr->tso_caps.supported_qpts = resp.tso_caps.supported_qpts;
 
-	if (resp.comp_mask & MLX4_IB_QUERY_DEV_RESP_MASK_CORE_CLOCK_OFFSET) {
-		mctx->core_clock.offset = resp.hca_core_clock_offset;
-		mctx->core_clock.offset_valid = 1;
-	}
-	mctx->max_inl_recv_sz = resp.max_inl_recv_sz;
-
 	major     = (raw_fw_ver >> 32) & 0xffff;
 	minor     = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
@@ -106,6 +100,41 @@ int mlx4_query_device_ex(struct ibv_context *context,
 	return 0;
 }
 
+void mlx4_query_device_ctx(struct mlx4_device *mdev, struct mlx4_context *mctx)
+{
+	struct ibv_device_attr_ex device_attr;
+	struct mlx4_query_device_ex_resp resp;
+	size_t resp_size = sizeof(resp);
+
+	if (ibv_cmd_query_device_any(&mctx->ibv_ctx.context, NULL,
+				       &device_attr, sizeof(device_attr),
+				       &resp.ibv_resp, &resp_size))
+		return;
+
+	mctx->max_qp_wr = device_attr.orig_attr.max_qp_wr;
+	mctx->max_sge = device_attr.orig_attr.max_sge;
+	mctx->max_inl_recv_sz = resp.max_inl_recv_sz;
+
+	if (resp.comp_mask & MLX4_IB_QUERY_DEV_RESP_MASK_CORE_CLOCK_OFFSET) {
+		void *hca_clock_page;
+
+		mctx->core_clock.offset = resp.hca_core_clock_offset;
+		mctx->core_clock.offset_valid = 1;
+
+		hca_clock_page =
+			mmap(NULL, mdev->page_size, PROT_READ, MAP_SHARED,
+			     mctx->ibv_ctx.context.cmd_fd, mdev->page_size * 3);
+		if (hca_clock_page != MAP_FAILED)
+			mctx->hca_core_clock =
+				hca_clock_page + (mctx->core_clock.offset &
+						  (mdev->page_size - 1));
+		else
+			fprintf(stderr, PFX
+				"Warning: Timestamp available,\n"
+				"but failed to mmap() hca core clock page.\n");
+	}
+}
+
 static int mlx4_read_clock(struct ibv_context *context, uint64_t *cycles)
 {
 	uint32_t clockhi, clocklo, clockhi1;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 6/9] providers: Remove normal query_device() from providers that have _ex
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2020-11-16 20:23 ` [PATCH 5/9] mlx4: Move the context intialization out of mlx4_query_device_ex() Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-18 12:47   ` Gal Pressman
  2020-11-16 20:23 ` [PATCH 7/9] providers: Convert all providers to implement query_device_ex Jason Gunthorpe
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

The ex callback can implement both versions, no reason for duplicate
code in two paths. Have the core code call the _ex version instead.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 libibverbs/dummy_ops.c |  9 ++++-
 providers/efa/efa.c    |  1 -
 providers/efa/verbs.c  | 36 +++++---------------
 providers/efa/verbs.h  |  1 -
 providers/mlx4/mlx4.c  |  1 -
 providers/mlx4/mlx4.h  |  2 --
 providers/mlx4/verbs.c | 45 +++++++++----------------
 providers/mlx5/mlx5.c  |  1 -
 providers/mlx5/mlx5.h  |  2 --
 providers/mlx5/verbs.c | 75 +++++++++++++++++-------------------------
 10 files changed, 61 insertions(+), 112 deletions(-)

diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
index e5af9e4eac8e34..711dfafb5caed5 100644
--- a/libibverbs/dummy_ops.c
+++ b/libibverbs/dummy_ops.c
@@ -380,7 +380,14 @@ static int post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr,
 static int query_device(struct ibv_context *context,
 			struct ibv_device_attr *device_attr)
 {
-	return EOPNOTSUPP;
+	const struct verbs_context_ops *ops = get_ops(context);
+
+	if (!ops->query_device_ex)
+		return EOPNOTSUPP;
+	return ops->query_device_ex(
+		context, NULL,
+		container_of(device_attr, struct ibv_device_attr_ex, orig_attr),
+		sizeof(*device_attr));
 }
 
 /* Provide a generic implementation for all providers that don't implement
diff --git a/providers/efa/efa.c b/providers/efa/efa.c
index b24c14f7fa1fe1..f6d314dad51e58 100644
--- a/providers/efa/efa.c
+++ b/providers/efa/efa.c
@@ -40,7 +40,6 @@ static const struct verbs_context_ops efa_ctx_ops = {
 	.poll_cq = efa_poll_cq,
 	.post_recv = efa_post_recv,
 	.post_send = efa_post_send,
-	.query_device = efa_query_device,
 	.query_device_ex = efa_query_device_ex,
 	.query_port = efa_query_port,
 	.query_qp = efa_query_qp,
diff --git a/providers/efa/verbs.c b/providers/efa/verbs.c
index 52d6285f1f409c..d50206c13d4295 100644
--- a/providers/efa/verbs.c
+++ b/providers/efa/verbs.c
@@ -56,27 +56,6 @@ struct efa_wq_init_attr {
 	uint16_t sub_cq_idx;
 };
 
-int efa_query_device(struct ibv_context *ibvctx,
-		     struct ibv_device_attr *dev_attr)
-{
-	struct efa_context *ctx = to_efa_context(ibvctx);
-	struct ibv_query_device cmd;
-	uint8_t fw_ver[8];
-	int err;
-
-	err = ibv_cmd_query_device(ibvctx, dev_attr, (uint64_t *)&fw_ver,
-				   &cmd, sizeof(cmd));
-	if (err)
-		return err;
-
-	dev_attr->max_qp_wr = min_t(int, dev_attr->max_qp_wr,
-				    ctx->max_llq_size / sizeof(struct efa_io_tx_wqe));
-	snprintf(dev_attr->fw_ver, sizeof(dev_attr->fw_ver), "%u.%u.%u.%u",
-		 fw_ver[0], fw_ver[1], fw_ver[2], fw_ver[3]);
-
-	return 0;
-}
-
 int efa_query_port(struct ibv_context *ibvctx, uint8_t port,
 		   struct ibv_port_attr *port_attr)
 {
@@ -91,23 +70,24 @@ int efa_query_device_ex(struct ibv_context *context,
 			size_t attr_size)
 {
 	struct efa_context *ctx = to_efa_context(context);
-	int cmd_supp_uhw = ctx->cmds_supp_udata_mask &
-			   EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE;
 	struct ibv_device_attr *a = &attr->orig_attr;
 	struct efa_query_device_ex_resp resp = {};
-	struct ibv_query_device_ex cmd = {};
+	size_t resp_size = (ctx->cmds_supp_udata_mask &
+			    EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE) ?
+				   sizeof(resp) :
+				   sizeof(resp.ibv_resp);
 	uint8_t fw_ver[8];
 	int err;
 
-	err = ibv_cmd_query_device_ex(
-		context, input, attr, attr_size, (uint64_t *)&fw_ver, &cmd,
-		sizeof(cmd), &resp.ibv_resp,
-		cmd_supp_uhw ? sizeof(resp) : sizeof(resp.ibv_resp));
+	err = ibv_cmd_query_device_any(context, input, attr, attr_size,
+				       &resp.ibv_resp, &resp_size);
 	if (err)
 		return err;
 
 	a->max_qp_wr = min_t(int, a->max_qp_wr,
 			     ctx->max_llq_size / sizeof(struct efa_io_tx_wqe));
+	memcpy(fw_ver, &resp.ibv_resp.base.fw_ver,
+	       sizeof(resp.ibv_resp.base.fw_ver));
 	snprintf(a->fw_ver, sizeof(a->fw_ver), "%u.%u.%u.%u",
 		 fw_ver[0], fw_ver[1], fw_ver[2], fw_ver[3]);
 
diff --git a/providers/efa/verbs.h b/providers/efa/verbs.h
index 3b0e4e0d498761..b7ae3f0a15c00c 100644
--- a/providers/efa/verbs.h
+++ b/providers/efa/verbs.h
@@ -10,7 +10,6 @@
 #include <infiniband/verbs.h>
 
 int efa_query_device_ctx(struct efa_context *ctx);
-int efa_query_device(struct ibv_context *uctx, struct ibv_device_attr *attr);
 int efa_query_port(struct ibv_context *uctx, uint8_t port,
 		   struct ibv_port_attr *attr);
 int efa_query_device_ex(struct ibv_context *context,
diff --git a/providers/mlx4/mlx4.c b/providers/mlx4/mlx4.c
index 619b841d788cb2..1e71cde4a1f9dc 100644
--- a/providers/mlx4/mlx4.c
+++ b/providers/mlx4/mlx4.c
@@ -84,7 +84,6 @@ static const struct verbs_match_ent hca_table[] = {
 };
 
 static const struct verbs_context_ops mlx4_ctx_ops = {
-	.query_device  = mlx4_query_device,
 	.query_port    = mlx4_query_port,
 	.alloc_pd      = mlx4_alloc_pd,
 	.dealloc_pd    = mlx4_free_pd,
diff --git a/providers/mlx4/mlx4.h b/providers/mlx4/mlx4.h
index 3c0787144e7e51..6c6ffc77657463 100644
--- a/providers/mlx4/mlx4.h
+++ b/providers/mlx4/mlx4.h
@@ -305,8 +305,6 @@ void mlx4_free_db(struct mlx4_context *context, enum mlx4_db_type type,
 		  __be32 *db);
 
 void mlx4_query_device_ctx(struct mlx4_device *mdev, struct mlx4_context *mctx);
-int mlx4_query_device(struct ibv_context *context,
-		       struct ibv_device_attr *attr);
 int mlx4_query_device_ex(struct ibv_context *context,
 			 const struct ibv_query_device_ex_input *input,
 			 struct ibv_device_attr_ex *attr,
diff --git a/providers/mlx4/verbs.c b/providers/mlx4/verbs.c
index 4fe5c1d87d6d91..ea8e882bb363ba 100644
--- a/providers/mlx4/verbs.c
+++ b/providers/mlx4/verbs.c
@@ -45,51 +45,36 @@
 #include "mlx4.h"
 #include "mlx4-abi.h"
 
-int mlx4_query_device(struct ibv_context *context, struct ibv_device_attr *attr)
-{
-	struct ibv_query_device cmd;
-	uint64_t raw_fw_ver;
-	unsigned major, minor, sub_minor;
-	int ret;
-
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver, &cmd, sizeof cmd);
-	if (ret)
-		return ret;
-
-	major     = (raw_fw_ver >> 32) & 0xffff;
-	minor     = (raw_fw_ver >> 16) & 0xffff;
-	sub_minor = raw_fw_ver & 0xffff;
-
-	snprintf(attr->fw_ver, sizeof attr->fw_ver,
-		 "%d.%d.%03d", major, minor, sub_minor);
-
-	return 0;
-}
-
 int mlx4_query_device_ex(struct ibv_context *context,
 			 const struct ibv_query_device_ex_input *input,
 			 struct ibv_device_attr_ex *attr,
 			 size_t attr_size)
 {
-	struct mlx4_query_device_ex_resp resp = {};
-	struct mlx4_query_device_ex cmd = {};
+	struct mlx4_query_device_ex_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	unsigned sub_minor;
 	unsigned major;
 	unsigned minor;
 	int err;
 
-	err = ibv_cmd_query_device_ex(context, input, attr, attr_size,
-				      &raw_fw_ver, &cmd.ibv_cmd, sizeof(cmd),
-				      &resp.ibv_resp, sizeof(resp));
+	err = ibv_cmd_query_device_any(context, input, attr, attr_size,
+				       &resp.ibv_resp, &resp_size);
 	if (err)
 		return err;
 
-	attr->rss_caps.rx_hash_fields_mask = resp.rss_caps.rx_hash_fields_mask;
-	attr->rss_caps.rx_hash_function = resp.rss_caps.rx_hash_function;
-	attr->tso_caps.max_tso = resp.tso_caps.max_tso;
-	attr->tso_caps.supported_qpts = resp.tso_caps.supported_qpts;
+	if (attr_size >= offsetofend(struct ibv_device_attr_ex, rss_caps)) {
+		attr->rss_caps.rx_hash_fields_mask =
+			resp.rss_caps.rx_hash_fields_mask;
+		attr->rss_caps.rx_hash_function =
+			resp.rss_caps.rx_hash_function;
+	}
+	if (attr_size >= offsetofend(struct ibv_device_attr_ex, tso_caps)) {
+		attr->tso_caps.max_tso = resp.tso_caps.max_tso;
+		attr->tso_caps.supported_qpts = resp.tso_caps.supported_qpts;
+	}
 
+	raw_fw_ver = resp.ibv_resp.base.fw_ver;
 	major     = (raw_fw_ver >> 32) & 0xffff;
 	minor     = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 06b9a52ebb3019..cf0a62928705bc 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -90,7 +90,6 @@ uint32_t mlx5_debug_mask = 0;
 int mlx5_freeze_on_error_cqe;
 
 static const struct verbs_context_ops mlx5_ctx_common_ops = {
-	.query_device  = mlx5_query_device,
 	.query_port    = mlx5_query_port,
 	.alloc_pd      = mlx5_alloc_pd,
 	.async_event   = mlx5_async_event,
diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
index 72e710b7b5e4aa..8821015c6d503e 100644
--- a/providers/mlx5/mlx5.h
+++ b/providers/mlx5/mlx5.h
@@ -879,8 +879,6 @@ void mlx5_free_db(struct mlx5_context *context, __be32 *db, struct ibv_pd *pd,
 		  bool custom_alloc);
 
 void mlx5_query_device_ctx(struct mlx5_context *mctx);
-int mlx5_query_device(struct ibv_context *context,
-		       struct ibv_device_attr *attr);
 int mlx5_query_device_ex(struct ibv_context *context,
 			 const struct ibv_query_device_ex_input *input,
 			 struct ibv_device_attr_ex *attr,
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index 42c984033d8eaa..5882e209b06b54 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -65,27 +65,6 @@ static inline int is_xrc_tgt(int type)
 	return type == IBV_QPT_XRC_RECV;
 }
 
-int mlx5_query_device(struct ibv_context *context, struct ibv_device_attr *attr)
-{
-	struct ibv_query_device cmd;
-	uint64_t raw_fw_ver;
-	unsigned major, minor, sub_minor;
-	int ret;
-
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver, &cmd, sizeof cmd);
-	if (ret)
-		return ret;
-
-	major     = (raw_fw_ver >> 32) & 0xffff;
-	minor     = (raw_fw_ver >> 16) & 0xffff;
-	sub_minor = raw_fw_ver & 0xffff;
-
-	snprintf(attr->fw_ver, sizeof attr->fw_ver,
-		 "%d.%d.%04d", major, minor, sub_minor);
-
-	return 0;
-}
-
 static int mlx5_read_clock(struct ibv_context *context, uint64_t *cycles)
 {
 	unsigned int clockhi, clocklo, clockhi1;
@@ -3481,37 +3460,47 @@ int mlx5_query_device_ex(struct ibv_context *context,
 			 size_t attr_size)
 {
 	struct mlx5_context *mctx = to_mctx(context);
-	struct mlx5_query_device_ex_resp resp;
-	struct mlx5_query_device_ex cmd;
+	struct mlx5_query_device_ex_resp resp = {};
+	size_t resp_size =
+		(mctx->cmds_supp_uhw & MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE) ?
+			sizeof(resp) :
+			sizeof(resp.ibv_resp);
 	struct ibv_device_attr *a;
 	uint64_t raw_fw_ver;
 	unsigned sub_minor;
 	unsigned major;
 	unsigned minor;
 	int err;
-	int cmd_supp_uhw = mctx->cmds_supp_uhw &
-		MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE;
 
-	memset(&cmd, 0, sizeof(cmd));
-	memset(&resp, 0, sizeof(resp));
-	err = ibv_cmd_query_device_ex(
-		context, input, attr, attr_size, &raw_fw_ver, &cmd.ibv_cmd,
-		sizeof(cmd), &resp.ibv_resp,
-		cmd_supp_uhw ? sizeof(resp) : sizeof(resp.ibv_resp));
+	err = ibv_cmd_query_device_any(context, input, attr, attr_size,
+				       &resp.ibv_resp, &resp_size);
 	if (err)
 		return err;
 
-	attr->tso_caps.max_tso = resp.tso_caps.max_tso;
-	attr->tso_caps.supported_qpts = resp.tso_caps.supported_qpts;
-	attr->rss_caps.rx_hash_fields_mask = resp.rss_caps.rx_hash_fields_mask;
-	attr->rss_caps.rx_hash_function = resp.rss_caps.rx_hash_function;
-	attr->packet_pacing_caps.qp_rate_limit_min =
-		resp.packet_pacing_caps.qp_rate_limit_min;
-	attr->packet_pacing_caps.qp_rate_limit_max =
-		resp.packet_pacing_caps.qp_rate_limit_max;
-	attr->packet_pacing_caps.supported_qpts =
-		resp.packet_pacing_caps.supported_qpts;
+	if (attr_size >= offsetofend(struct ibv_device_attr_ex, tso_caps)) {
+		attr->tso_caps.max_tso = resp.tso_caps.max_tso;
+		attr->tso_caps.supported_qpts = resp.tso_caps.supported_qpts;
+	}
+	if (attr_size >= offsetofend(struct ibv_device_attr_ex, rss_caps)) {
+		attr->rss_caps.rx_hash_fields_mask =
+			resp.rss_caps.rx_hash_fields_mask;
+		attr->rss_caps.rx_hash_function =
+			resp.rss_caps.rx_hash_function;
+	}
+	if (attr_size >=
+	    offsetofend(struct ibv_device_attr_ex, packet_pacing_caps)) {
+		attr->packet_pacing_caps.qp_rate_limit_min =
+			resp.packet_pacing_caps.qp_rate_limit_min;
+		attr->packet_pacing_caps.qp_rate_limit_max =
+			resp.packet_pacing_caps.qp_rate_limit_max;
+		attr->packet_pacing_caps.supported_qpts =
+			resp.packet_pacing_caps.supported_qpts;
+	}
+
+	if (attr_size >= offsetofend(struct ibv_device_attr_ex, pci_atomic_caps))
+		get_pci_atomic_caps(context, attr);
 
+	raw_fw_ver = resp.ibv_resp.base.fw_ver;
 	major     = (raw_fw_ver >> 32) & 0xffff;
 	minor     = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
@@ -3519,10 +3508,6 @@ int mlx5_query_device_ex(struct ibv_context *context,
 	snprintf(a->fw_ver, sizeof(a->fw_ver), "%d.%d.%04d",
 		 major, minor, sub_minor);
 
-	if (attr_size >= offsetof(struct ibv_device_attr_ex, pci_atomic_caps) +
-			sizeof(attr->pci_atomic_caps))
-		get_pci_atomic_caps(context, attr);
-
 	return 0;
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 7/9] providers: Convert all providers to implement query_device_ex
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2020-11-16 20:23 ` [PATCH 6/9] providers: Remove normal query_device() from providers that have _ex Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 8/9] verbs: Remove dead code Jason Gunthorpe
  2020-11-16 20:23 ` [PATCH 9/9] verbs: Delete query_device() internal support Jason Gunthorpe
  8 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

The kernel now supports query_device_ex for all drivers, there is no
reason to have a weird split where providers do different things.

All providers implement only query_device_ex and call
ibv_cmd_query_device_any() to get as much of the device_attr's as the
kernel can return.

The user facing ibv_query_device is emulated by requesting only the first
portion of the ibv_device_attr_ex structure using a shorter size.

Nearly all providers have a fairly simple pattern where they just call
ibv_cmd_query_device_any() and the manipulate the fw_version into a
string. A few return a device udata and process that as well.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 providers/bnxt_re/main.c           |  2 +-
 providers/bnxt_re/verbs.c          | 30 +++++++++++++++++++-----------
 providers/bnxt_re/verbs.h          |  5 +++--
 providers/cxgb4/dev.c              |  2 +-
 providers/cxgb4/libcxgb4.h         |  3 ++-
 providers/cxgb4/verbs.c            | 14 +++++++++-----
 providers/hfi1verbs/hfiverbs.c     |  2 +-
 providers/hfi1verbs/hfiverbs.h     |  5 +++--
 providers/hfi1verbs/verbs.c        | 13 ++++++++-----
 providers/hns/hns_roce_u.c         |  8 ++++++--
 providers/hns/hns_roce_u.h         |  3 ++-
 providers/hns/hns_roce_u_verbs.c   | 15 +++++++++------
 providers/i40iw/i40iw_umain.c      |  2 +-
 providers/i40iw/i40iw_umain.h      |  4 +++-
 providers/i40iw/i40iw_uverbs.c     | 18 +++++++++++-------
 providers/ipathverbs/ipathverbs.c  |  2 +-
 providers/ipathverbs/ipathverbs.h  |  5 +++--
 providers/ipathverbs/verbs.c       | 13 ++++++++-----
 providers/mthca/mthca.c            |  2 +-
 providers/mthca/mthca.h            |  3 ++-
 providers/mthca/verbs.c            | 13 +++++++++----
 providers/ocrdma/ocrdma_main.c     |  2 +-
 providers/ocrdma/ocrdma_main.h     |  4 +++-
 providers/ocrdma/ocrdma_verbs.c    | 20 ++++++++++++--------
 providers/qedr/qelr_main.c         |  2 +-
 providers/qedr/qelr_verbs.c        | 21 ++++++++++++---------
 providers/qedr/qelr_verbs.h        |  3 ++-
 providers/rxe/rxe.c                | 15 +++++++++------
 providers/siw/siw.c                | 20 +++++++++++---------
 providers/vmw_pvrdma/pvrdma.h      |  3 ++-
 providers/vmw_pvrdma/pvrdma_main.c |  2 +-
 providers/vmw_pvrdma/verbs.c       | 13 ++++++++-----
 32 files changed, 165 insertions(+), 104 deletions(-)

diff --git a/providers/bnxt_re/main.c b/providers/bnxt_re/main.c
index baeee733fdfbab..a78e6b98815dee 100644
--- a/providers/bnxt_re/main.c
+++ b/providers/bnxt_re/main.c
@@ -91,7 +91,7 @@ static const struct verbs_match_ent cna_table[] = {
 };
 
 static const struct verbs_context_ops bnxt_re_cntx_ops = {
-	.query_device  = bnxt_re_query_device,
+	.query_device_ex = bnxt_re_query_device,
 	.query_port    = bnxt_re_query_port,
 	.alloc_pd      = bnxt_re_alloc_pd,
 	.dealloc_pd    = bnxt_re_free_pd,
diff --git a/providers/bnxt_re/verbs.c b/providers/bnxt_re/verbs.c
index 03237e7f810321..20f4e7dd08b37c 100644
--- a/providers/bnxt_re/verbs.c
+++ b/providers/bnxt_re/verbs.c
@@ -53,19 +53,24 @@
 #include "main.h"
 #include "verbs.h"
 
-int bnxt_re_query_device(struct ibv_context *ibvctx,
-			 struct ibv_device_attr *dev_attr)
+int bnxt_re_query_device(struct ibv_context *context,
+			 const struct ibv_query_device_ex_input *input,
+			 struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint8_t fw_ver[8];
-	int status;
+	int err;
 
-	memset(dev_attr, 0, sizeof(struct ibv_device_attr));
-	status = ibv_cmd_query_device(ibvctx, dev_attr, (uint64_t *)&fw_ver,
-				      &cmd, sizeof(cmd));
-	snprintf(dev_attr->fw_ver, 64, "%d.%d.%d.%d",
-		 fw_ver[0], fw_ver[1], fw_ver[2], fw_ver[3]);
-	return status;
+	err = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
+	if (err)
+		return err;
+
+	memcpy(fw_ver, &resp.base.fw_ver, sizeof(resp.base.fw_ver));
+	snprintf(attr->orig_attr.fw_ver, 64, "%d.%d.%d.%d", fw_ver[0],
+		 fw_ver[1], fw_ver[2], fw_ver[3]);
+	return 0;
 }
 
 int bnxt_re_query_port(struct ibv_context *ibvctx, uint8_t port,
@@ -773,7 +778,10 @@ static int bnxt_re_check_qp_limits(struct bnxt_re_context *cntx,
 	struct ibv_device_attr devattr;
 	int ret;
 
-	ret = bnxt_re_query_device(&cntx->ibvctx.context, &devattr);
+	ret = bnxt_re_query_device(
+		&cntx->ibvctx.context, NULL,
+		container_of(&devattr, struct ibv_device_attr_ex, orig_attr),
+		sizeof(devattr));
 	if (ret)
 		return ret;
 	if (attr->cap.max_send_sge > devattr.max_sge)
diff --git a/providers/bnxt_re/verbs.h b/providers/bnxt_re/verbs.h
index b9fd84bdbac9a8..1566709f096093 100644
--- a/providers/bnxt_re/verbs.h
+++ b/providers/bnxt_re/verbs.h
@@ -54,8 +54,9 @@
 #include <infiniband/driver.h>
 #include <infiniband/verbs.h>
 
-int bnxt_re_query_device(struct ibv_context *uctx,
-			 struct ibv_device_attr *attr);
+int bnxt_re_query_device(struct ibv_context *context,
+			 const struct ibv_query_device_ex_input *input,
+			 struct ibv_device_attr_ex *attr, size_t attr_size);
 int bnxt_re_query_port(struct ibv_context *uctx, uint8_t port,
 		       struct ibv_port_attr *attr);
 struct ibv_pd *bnxt_re_alloc_pd(struct ibv_context *uctx);
diff --git a/providers/cxgb4/dev.c b/providers/cxgb4/dev.c
index 06948efb3f0105..76b78d9b29a71c 100644
--- a/providers/cxgb4/dev.c
+++ b/providers/cxgb4/dev.c
@@ -75,7 +75,7 @@ int t5_en_wc = 1;
 static LIST_HEAD(devices);
 
 static const struct verbs_context_ops  c4iw_ctx_common_ops = {
-	.query_device = c4iw_query_device,
+	.query_device_ex = c4iw_query_device,
 	.query_port = c4iw_query_port,
 	.alloc_pd = c4iw_alloc_pd,
 	.dealloc_pd = c4iw_free_pd,
diff --git a/providers/cxgb4/libcxgb4.h b/providers/cxgb4/libcxgb4.h
index c5036d0b83c21e..f0658ab89f4aa5 100644
--- a/providers/cxgb4/libcxgb4.h
+++ b/providers/cxgb4/libcxgb4.h
@@ -191,7 +191,8 @@ static inline unsigned long_log2(unsigned long x)
 }
 
 int c4iw_query_device(struct ibv_context *context,
-			     struct ibv_device_attr *attr);
+		      const struct ibv_query_device_ex_input *input,
+		      struct ibv_device_attr_ex *attr, size_t attr_size);
 int c4iw_query_port(struct ibv_context *context, uint8_t port,
 			   struct ibv_port_attr *attr);
 
diff --git a/providers/cxgb4/verbs.c b/providers/cxgb4/verbs.c
index 32bae6906a1595..a28152fa32761c 100644
--- a/providers/cxgb4/verbs.c
+++ b/providers/cxgb4/verbs.c
@@ -47,24 +47,28 @@ bool is_64b_cqe;
 
 #define MASKED(x) (void *)((unsigned long)(x) & c4iw_page_mask)
 
-int c4iw_query_device(struct ibv_context *context, struct ibv_device_attr *attr)
+int c4iw_query_device(struct ibv_context *context,
+		      const struct ibv_query_device_ex_input *input,
+		      struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	u8 major, minor, sub_minor, build;
 	int ret;
 
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver, &cmd,
-				   sizeof cmd);
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
 	if (ret)
 		return ret;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major = (raw_fw_ver >> 24) & 0xff;
 	minor = (raw_fw_ver >> 16) & 0xff;
 	sub_minor = (raw_fw_ver >> 8) & 0xff;
 	build = raw_fw_ver & 0xff;
 
-	snprintf(attr->fw_ver, sizeof attr->fw_ver,
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
 		 "%d.%d.%d.%d", major, minor, sub_minor, build);
 
 	return 0;
diff --git a/providers/hfi1verbs/hfiverbs.c b/providers/hfi1verbs/hfiverbs.c
index 9bfb967c09791e..514a7e6b602cb7 100644
--- a/providers/hfi1verbs/hfiverbs.c
+++ b/providers/hfi1verbs/hfiverbs.c
@@ -90,7 +90,7 @@ static const struct verbs_match_ent hca_table[] = {
 
 static const struct verbs_context_ops hfi1_ctx_common_ops = {
 	.free_context	= hfi1_free_context,
-	.query_device	= hfi1_query_device,
+	.query_device_ex = hfi1_query_device,
 	.query_port	= hfi1_query_port,
 
 	.alloc_pd	= hfi1_alloc_pd,
diff --git a/providers/hfi1verbs/hfiverbs.h b/providers/hfi1verbs/hfiverbs.h
index b9e91d8072acf3..34977fc0bdfca2 100644
--- a/providers/hfi1verbs/hfiverbs.h
+++ b/providers/hfi1verbs/hfiverbs.h
@@ -194,8 +194,9 @@ static inline struct hfi1_rwqe *get_rwqe_ptr(struct hfi1_rq *rq,
 		  rq->max_sge * sizeof(struct ibv_sge)) * n);
 }
 
-extern int hfi1_query_device(struct ibv_context *context,
-			      struct ibv_device_attr *attr);
+int hfi1_query_device(struct ibv_context *context,
+		      const struct ibv_query_device_ex_input *input,
+		      struct ibv_device_attr_ex *attr, size_t attr_size);
 
 extern int hfi1_query_port(struct ibv_context *context, uint8_t port,
 			    struct ibv_port_attr *attr);
diff --git a/providers/hfi1verbs/verbs.c b/providers/hfi1verbs/verbs.c
index 275f8d511392a7..028552a23718cd 100644
--- a/providers/hfi1verbs/verbs.c
+++ b/providers/hfi1verbs/verbs.c
@@ -68,23 +68,26 @@
 #include "hfi-abi.h"
 
 int hfi1_query_device(struct ibv_context *context,
-		       struct ibv_device_attr *attr)
+		      const struct ibv_query_device_ex_input *input,
+		      struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	unsigned major, minor, sub_minor;
 	int ret;
 
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver,
-				   &cmd, sizeof cmd);
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
 	if (ret)
 		return ret;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major     = (raw_fw_ver >> 32) & 0xffff;
 	minor     = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof attr->fw_ver,
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
 		 "%d.%d.%d", major, minor, sub_minor);
 
 	return 0;
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index c4370411d8aa33..23b317cc9b5fb8 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -74,7 +74,7 @@ static const struct verbs_context_ops hns_common_ops = {
 	.dereg_mr = hns_roce_u_dereg_mr,
 	.destroy_cq = hns_roce_u_destroy_cq,
 	.modify_cq = hns_roce_u_modify_cq,
-	.query_device = hns_roce_u_query_device,
+	.query_device_ex = hns_roce_u_query_device,
 	.query_port = hns_roce_u_query_port,
 	.query_qp = hns_roce_u_query_qp,
 	.reg_mr = hns_roce_u_reg_mr,
@@ -147,7 +147,11 @@ static struct verbs_context *hns_roce_alloc_context(struct ibv_device *ibdev,
 	verbs_set_ops(&context->ibv_ctx, &hns_common_ops);
 	verbs_set_ops(&context->ibv_ctx, &hr_dev->u_hw->hw_ops);
 
-	if (hns_roce_u_query_device(&context->ibv_ctx.context, &dev_attrs))
+	if (hns_roce_u_query_device(&context->ibv_ctx.context, NULL,
+				    container_of(&dev_attrs,
+						 struct ibv_device_attr_ex,
+						 orig_attr),
+				    sizeof(dev_attrs)))
 		goto tptr_free;
 
 	context->max_qp_wr = dev_attrs.max_qp_wr;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index b0308d14820b06..0c2dc1e3441550 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -316,7 +316,8 @@ static inline struct  hns_roce_qp *to_hr_qp(struct ibv_qp *ibv_qp)
 }
 
 int hns_roce_u_query_device(struct ibv_context *context,
-			    struct ibv_device_attr *attr);
+			    const struct ibv_query_device_ex_input *input,
+			    struct ibv_device_attr_ex *attr, size_t attr_size);
 int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
 			  struct ibv_port_attr *attr);
 
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index 06fdbfac5e898d..bffd8df4f3f7a4 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -54,24 +54,27 @@ void hns_roce_init_qp_indices(struct hns_roce_qp *qp)
 }
 
 int hns_roce_u_query_device(struct ibv_context *context,
-			    struct ibv_device_attr *attr)
+			    const struct ibv_query_device_ex_input *input,
+			    struct ibv_device_attr_ex *attr, size_t attr_size)
 {
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	int ret;
-	struct ibv_query_device cmd;
 	uint64_t raw_fw_ver;
 	unsigned int major, minor, sub_minor;
 
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver, &cmd,
-				   sizeof(cmd));
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
 	if (ret)
 		return ret;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major = (raw_fw_ver >> 32) & 0xffff;
 	minor = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof(attr->fw_ver), "%d.%d.%03d", major, minor,
-		 sub_minor);
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
+		 "%d.%d.%03d", major, minor, sub_minor);
 
 	return 0;
 }
diff --git a/providers/i40iw/i40iw_umain.c b/providers/i40iw/i40iw_umain.c
index eef8cd50cabf43..c5ac0792ed6f83 100644
--- a/providers/i40iw/i40iw_umain.c
+++ b/providers/i40iw/i40iw_umain.c
@@ -95,7 +95,7 @@ static const struct verbs_match_ent hca_table[] = {
 };
 
 static const struct verbs_context_ops i40iw_uctx_ops = {
-	.query_device	= i40iw_uquery_device,
+	.query_device_ex = i40iw_uquery_device,
 	.query_port	= i40iw_uquery_port,
 	.alloc_pd	= i40iw_ualloc_pd,
 	.dealloc_pd	= i40iw_ufree_pd,
diff --git a/providers/i40iw/i40iw_umain.h b/providers/i40iw/i40iw_umain.h
index 10385dfc8304e9..fe643dd1a04e06 100644
--- a/providers/i40iw/i40iw_umain.h
+++ b/providers/i40iw/i40iw_umain.h
@@ -151,7 +151,9 @@ static inline struct i40iw_uqp *to_i40iw_uqp(struct ibv_qp *ibqp)
 }
 
 /* i40iw_uverbs.c */
-int i40iw_uquery_device(struct ibv_context *, struct ibv_device_attr *);
+int i40iw_uquery_device(struct ibv_context *context,
+			const struct ibv_query_device_ex_input *input,
+			struct ibv_device_attr_ex *attr, size_t attr_size);
 int i40iw_uquery_port(struct ibv_context *, uint8_t, struct ibv_port_attr *);
 struct ibv_pd *i40iw_ualloc_pd(struct ibv_context *);
 int i40iw_ufree_pd(struct ibv_pd *);
diff --git a/providers/i40iw/i40iw_uverbs.c b/providers/i40iw/i40iw_uverbs.c
index 71b59a7a812c17..c170bb33ef8b6e 100644
--- a/providers/i40iw/i40iw_uverbs.c
+++ b/providers/i40iw/i40iw_uverbs.c
@@ -55,23 +55,27 @@
  * @context: user context for the device
  * @attr: where to save all the mx resources from the driver
  **/
-int i40iw_uquery_device(struct ibv_context *context, struct ibv_device_attr *attr)
+int i40iw_uquery_device(struct ibv_context *context,
+			const struct ibv_query_device_ex_input *input,
+			struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t i40iw_fw_ver;
 	int ret;
 	unsigned int minor, major;
 
-	ret = ibv_cmd_query_device(context, attr, &i40iw_fw_ver, &cmd, sizeof(cmd));
-	if (ret) {
-		fprintf(stderr, PFX "%s: query device failed and returned status code: %d\n", __func__, ret);
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
+	if (ret)
 		return ret;
-	}
 
+	i40iw_fw_ver = resp.base.fw_ver;
 	major = (i40iw_fw_ver >> 16) & 0xffff;
 	minor = i40iw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof(attr->fw_ver), "%d.%d", major, minor);
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
+		 "%d.%d", major, minor);
 
 	return 0;
 }
diff --git a/providers/ipathverbs/ipathverbs.c b/providers/ipathverbs/ipathverbs.c
index 0e1a58433d34b2..975f52d011c4c0 100644
--- a/providers/ipathverbs/ipathverbs.c
+++ b/providers/ipathverbs/ipathverbs.c
@@ -89,7 +89,7 @@ static const struct verbs_match_ent hca_table[] = {
 
 static const struct verbs_context_ops ipath_ctx_common_ops = {
 	.free_context	= ipath_free_context,
-	.query_device	= ipath_query_device,
+	.query_device_ex = ipath_query_device,
 	.query_port	= ipath_query_port,
 
 	.alloc_pd	= ipath_alloc_pd,
diff --git a/providers/ipathverbs/ipathverbs.h b/providers/ipathverbs/ipathverbs.h
index 694f1f44a48315..c5fa761f794567 100644
--- a/providers/ipathverbs/ipathverbs.h
+++ b/providers/ipathverbs/ipathverbs.h
@@ -173,8 +173,9 @@ static inline struct ipath_rwqe *get_rwqe_ptr(struct ipath_rq *rq,
 		  rq->max_sge * sizeof(struct ibv_sge)) * n);
 }
 
-extern int ipath_query_device(struct ibv_context *context,
-			      struct ibv_device_attr *attr);
+int ipath_query_device(struct ibv_context *context,
+		       const struct ibv_query_device_ex_input *input,
+		       struct ibv_device_attr_ex *attr, size_t attr_size);
 
 extern int ipath_query_port(struct ibv_context *context, uint8_t port,
 			    struct ibv_port_attr *attr);
diff --git a/providers/ipathverbs/verbs.c b/providers/ipathverbs/verbs.c
index 505ea584e878de..e1b098a078584a 100644
--- a/providers/ipathverbs/verbs.c
+++ b/providers/ipathverbs/verbs.c
@@ -48,23 +48,26 @@
 #include "ipath-abi.h"
 
 int ipath_query_device(struct ibv_context *context,
-		       struct ibv_device_attr *attr)
+		       const struct ibv_query_device_ex_input *input,
+		       struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	unsigned major, minor, sub_minor;
 	int ret;
 
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver,
-				   &cmd, sizeof cmd);
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
 	if (ret)
 		return ret;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major     = (raw_fw_ver >> 32) & 0xffff;
 	minor     = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof attr->fw_ver,
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
 		 "%d.%d.%d", major, minor, sub_minor);
 
 	return 0;
diff --git a/providers/mthca/mthca.c b/providers/mthca/mthca.c
index abce4866883d8b..809aae00ef26f4 100644
--- a/providers/mthca/mthca.c
+++ b/providers/mthca/mthca.c
@@ -92,7 +92,7 @@ static const struct verbs_match_ent hca_table[] = {
 };
 
 static const struct verbs_context_ops mthca_ctx_common_ops = {
-	.query_device  = mthca_query_device,
+	.query_device_ex = mthca_query_device,
 	.query_port    = mthca_query_port,
 	.alloc_pd      = mthca_alloc_pd,
 	.dealloc_pd    = mthca_free_pd,
diff --git a/providers/mthca/mthca.h b/providers/mthca/mthca.h
index b7df2f734686c8..43c58fa44d1a07 100644
--- a/providers/mthca/mthca.h
+++ b/providers/mthca/mthca.h
@@ -273,7 +273,8 @@ struct mthca_db_table *mthca_alloc_db_tab(int uarc_size);
 void mthca_free_db_tab(struct mthca_db_table *db_tab);
 
 int mthca_query_device(struct ibv_context *context,
-		       struct ibv_device_attr *attr);
+		       const struct ibv_query_device_ex_input *input,
+		       struct ibv_device_attr_ex *attr, size_t attr_size);
 int mthca_query_port(struct ibv_context *context, uint8_t port,
 		     struct ibv_port_attr *attr);
 
diff --git a/providers/mthca/verbs.c b/providers/mthca/verbs.c
index 99e5ec661265a6..7ba5a4177b485c 100644
--- a/providers/mthca/verbs.c
+++ b/providers/mthca/verbs.c
@@ -42,22 +42,27 @@
 #include "mthca.h"
 #include "mthca-abi.h"
 
-int mthca_query_device(struct ibv_context *context, struct ibv_device_attr *attr)
+int mthca_query_device(struct ibv_context *context,
+		       const struct ibv_query_device_ex_input *input,
+		       struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	unsigned major, minor, sub_minor;
 	int ret;
 
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver, &cmd, sizeof cmd);
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
 	if (ret)
 		return ret;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major     = (raw_fw_ver >> 32) & 0xffff;
 	minor     = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof attr->fw_ver,
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
 		 "%d.%d.%d", major, minor, sub_minor);
 
 	return 0;
diff --git a/providers/ocrdma/ocrdma_main.c b/providers/ocrdma/ocrdma_main.c
index f7ed629de8b4cf..c955dd1ba6642f 100644
--- a/providers/ocrdma/ocrdma_main.c
+++ b/providers/ocrdma/ocrdma_main.c
@@ -68,7 +68,7 @@ static const struct verbs_match_ent ucna_table[] = {
 };
 
 static const struct verbs_context_ops ocrdma_ctx_ops = {
-	.query_device = ocrdma_query_device,
+	.query_device_ex = ocrdma_query_device,
 	.query_port = ocrdma_query_port,
 	.alloc_pd = ocrdma_alloc_pd,
 	.dealloc_pd = ocrdma_free_pd,
diff --git a/providers/ocrdma/ocrdma_main.h b/providers/ocrdma/ocrdma_main.h
index aadefd9649ac90..33ea20e0c6066b 100644
--- a/providers/ocrdma/ocrdma_main.h
+++ b/providers/ocrdma/ocrdma_main.h
@@ -265,7 +265,9 @@ static inline struct ocrdma_ah *get_ocrdma_ah(struct ibv_ah *ibah)
 }
 
 void ocrdma_init_ahid_tbl(struct ocrdma_devctx *ctx);
-int ocrdma_query_device(struct ibv_context *, struct ibv_device_attr *);
+int ocrdma_query_device(struct ibv_context *context,
+			const struct ibv_query_device_ex_input *input,
+			struct ibv_device_attr_ex *attr, size_t attr_size);
 int ocrdma_query_port(struct ibv_context *, uint8_t, struct ibv_port_attr *);
 struct ibv_pd *ocrdma_alloc_pd(struct ibv_context *);
 int ocrdma_free_pd(struct ibv_pd *);
diff --git a/providers/ocrdma/ocrdma_verbs.c b/providers/ocrdma/ocrdma_verbs.c
index 4ae35be9d2d9ee..688ff7d4f24043 100644
--- a/providers/ocrdma/ocrdma_verbs.c
+++ b/providers/ocrdma/ocrdma_verbs.c
@@ -68,17 +68,21 @@ static inline void ocrdma_swap_cpu_to_le(void *dst, uint32_t len)
  * ocrdma_query_device
  */
 int ocrdma_query_device(struct ibv_context *context,
-			struct ibv_device_attr *attr)
+			const struct ibv_query_device_ex_input *input,
+			struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
-	uint64_t fw_ver;
 	struct ocrdma_device *dev = get_ocrdma_dev(context->device);
-	int status;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
+	int ret;
 
-	bzero(attr, sizeof *attr);
-	status = ibv_cmd_query_device(context, attr, &fw_ver, &cmd, sizeof cmd);
-	memcpy(attr->fw_ver, dev->fw_ver, sizeof(dev->fw_ver));
-	return status;
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
+	if (ret)
+		return ret;
+
+	memcpy(attr->orig_attr.fw_ver, dev->fw_ver, sizeof(dev->fw_ver));
+	return 0;
 }
 
 /*
diff --git a/providers/qedr/qelr_main.c b/providers/qedr/qelr_main.c
index bdfaa930f0c601..334972ae043cc4 100644
--- a/providers/qedr/qelr_main.c
+++ b/providers/qedr/qelr_main.c
@@ -87,7 +87,7 @@ static const struct verbs_match_ent hca_table[] = {
 };
 
 static const struct verbs_context_ops qelr_ctx_ops = {
-	.query_device = qelr_query_device,
+	.query_device_ex = qelr_query_device,
 	.query_port = qelr_query_port,
 	.alloc_pd = qelr_alloc_pd,
 	.dealloc_pd = qelr_dealloc_pd,
diff --git a/providers/qedr/qelr_verbs.c b/providers/qedr/qelr_verbs.c
index 4e77a1976a9154..dab9cf67539704 100644
--- a/providers/qedr/qelr_verbs.c
+++ b/providers/qedr/qelr_verbs.c
@@ -75,26 +75,29 @@ static inline int qelr_wq_is_full(struct qelr_qp_hwq_info *info)
 }
 
 int qelr_query_device(struct ibv_context *context,
-		      struct ibv_device_attr *attr)
+		      const struct ibv_query_device_ex_input *input,
+		      struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t fw_ver;
 	unsigned int major, minor, revision, eng;
-	int status;
+	int ret;
 
-	bzero(attr, sizeof(*attr));
-	status = ibv_cmd_query_device(context, attr, &fw_ver, &cmd,
-				      sizeof(cmd));
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
+	if (ret)
+		return ret;
 
+	fw_ver = resp.base.fw_ver;
 	major = (fw_ver >> 24) & 0xff;
 	minor = (fw_ver >> 16) & 0xff;
 	revision = (fw_ver >> 8) & 0xff;
 	eng = fw_ver & 0xff;
 
-	snprintf(attr->fw_ver, sizeof(attr->fw_ver),
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
 		 "%d.%d.%d.%d", major, minor, revision, eng);
-
-	return status;
+	return 0;
 }
 
 int qelr_query_port(struct ibv_context *context, uint8_t port,
diff --git a/providers/qedr/qelr_verbs.h b/providers/qedr/qelr_verbs.h
index bbfd4906b082e6..b5b43b19241904 100644
--- a/providers/qedr/qelr_verbs.h
+++ b/providers/qedr/qelr_verbs.h
@@ -41,7 +41,8 @@
 #include <util/udma_barrier.h>
 
 int qelr_query_device(struct ibv_context *context,
-		      struct ibv_device_attr *attr);
+		      const struct ibv_query_device_ex_input *input,
+		      struct ibv_device_attr_ex *attr, size_t attr_size);
 int qelr_query_port(struct ibv_context *context, uint8_t port,
 		    struct ibv_port_attr *attr);
 
diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c
index 18e8c53dcd253a..d4357bac9d4d40 100644
--- a/providers/rxe/rxe.c
+++ b/providers/rxe/rxe.c
@@ -65,23 +65,26 @@ static const struct verbs_match_ent hca_table[] = {
 };
 
 static int rxe_query_device(struct ibv_context *context,
-			    struct ibv_device_attr *attr)
+			    const struct ibv_query_device_ex_input *input,
+			    struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	unsigned int major, minor, sub_minor;
 	int ret;
 
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver,
-				   &cmd, sizeof(cmd));
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
 	if (ret)
 		return ret;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major = (raw_fw_ver >> 32) & 0xffff;
 	minor = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof(attr->fw_ver),
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
 		 "%d.%d.%d", major, minor, sub_minor);
 
 	return 0;
@@ -831,7 +834,7 @@ static int rxe_destroy_ah(struct ibv_ah *ibah)
 }
 
 static const struct verbs_context_ops rxe_ctx_ops = {
-	.query_device = rxe_query_device,
+	.query_device_ex = rxe_query_device,
 	.query_port = rxe_query_port,
 	.alloc_pd = rxe_alloc_pd,
 	.dealloc_pd = rxe_dealloc_pd,
diff --git a/providers/siw/siw.c b/providers/siw/siw.c
index 0f94e614d16876..8f6dee4e58af56 100644
--- a/providers/siw/siw.c
+++ b/providers/siw/siw.c
@@ -20,26 +20,28 @@
 static const int siw_debug;
 static void siw_free_context(struct ibv_context *ibv_ctx);
 
-static int siw_query_device(struct ibv_context *ctx,
-			    struct ibv_device_attr *attr)
+static int siw_query_device(struct ibv_context *context,
+			 const struct ibv_query_device_ex_input *input,
+			 struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	unsigned int major, minor, sub_minor;
 	int rv;
 
-	memset(&cmd, 0, sizeof(cmd));
-
-	rv = ibv_cmd_query_device(ctx, attr, &raw_fw_ver, &cmd, sizeof(cmd));
+	rv = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				      &resp_size);
 	if (rv)
 		return rv;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major = (raw_fw_ver >> 32) & 0xffff;
 	minor = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof(attr->fw_ver), "%d.%d.%d", major, minor,
-		 sub_minor);
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
+		 "%d.%d.%d", major, minor, sub_minor);
 
 	return 0;
 }
@@ -832,7 +834,7 @@ static const struct verbs_context_ops siw_context_ops = {
 	.post_recv = siw_post_recv,
 	.post_send = siw_post_send,
 	.post_srq_recv = siw_post_srq_recv,
-	.query_device = siw_query_device,
+	.query_device_ex = siw_query_device,
 	.query_port = siw_query_port,
 	.query_qp = siw_query_qp,
 	.reg_mr = siw_reg_mr,
diff --git a/providers/vmw_pvrdma/pvrdma.h b/providers/vmw_pvrdma/pvrdma.h
index 0db65773f5d003..bb6ba729d0e4bc 100644
--- a/providers/vmw_pvrdma/pvrdma.h
+++ b/providers/vmw_pvrdma/pvrdma.h
@@ -275,7 +275,8 @@ int pvrdma_alloc_buf(struct pvrdma_buf *buf, size_t size, int page_size);
 void pvrdma_free_buf(struct pvrdma_buf *buf);
 
 int pvrdma_query_device(struct ibv_context *context,
-			struct ibv_device_attr *attr);
+			const struct ibv_query_device_ex_input *input,
+			struct ibv_device_attr_ex *attr, size_t attr_size);
 int pvrdma_query_port(struct ibv_context *context, uint8_t port,
 		      struct ibv_port_attr *attr);
 
diff --git a/providers/vmw_pvrdma/pvrdma_main.c b/providers/vmw_pvrdma/pvrdma_main.c
index 14a67c1cee3150..4f93b519dfa6b3 100644
--- a/providers/vmw_pvrdma/pvrdma_main.c
+++ b/providers/vmw_pvrdma/pvrdma_main.c
@@ -55,7 +55,7 @@ static void pvrdma_free_context(struct ibv_context *ibctx);
 
 static const struct verbs_context_ops pvrdma_ctx_ops = {
 	.free_context = pvrdma_free_context,
-	.query_device = pvrdma_query_device,
+	.query_device_ex = pvrdma_query_device,
 	.query_port = pvrdma_query_port,
 	.alloc_pd = pvrdma_alloc_pd,
 	.dealloc_pd = pvrdma_free_pd,
diff --git a/providers/vmw_pvrdma/verbs.c b/providers/vmw_pvrdma/verbs.c
index e8423c01365e7b..815333691c3831 100644
--- a/providers/vmw_pvrdma/verbs.c
+++ b/providers/vmw_pvrdma/verbs.c
@@ -47,23 +47,26 @@
 #include "pvrdma.h"
 
 int pvrdma_query_device(struct ibv_context *context,
-			struct ibv_device_attr *attr)
+			const struct ibv_query_device_ex_input *input,
+			struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	struct ibv_query_device cmd;
+	struct ib_uverbs_ex_query_device_resp resp;
+	size_t resp_size = sizeof(resp);
 	uint64_t raw_fw_ver;
 	unsigned major, minor, sub_minor;
 	int ret;
 
-	ret = ibv_cmd_query_device(context, attr, &raw_fw_ver,
-				   &cmd, sizeof(cmd));
+	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
+				       &resp_size);
 	if (ret)
 		return ret;
 
+	raw_fw_ver = resp.base.fw_ver;
 	major = (raw_fw_ver >> 32) & 0xffff;
 	minor = (raw_fw_ver >> 16) & 0xffff;
 	sub_minor = raw_fw_ver & 0xffff;
 
-	snprintf(attr->fw_ver, sizeof(attr->fw_ver),
+	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
 		 "%d.%d.%03d", major, minor, sub_minor);
 
 	return 0;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 8/9] verbs: Remove dead code
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2020-11-16 20:23 ` [PATCH 7/9] providers: Convert all providers to implement query_device_ex Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  2020-11-18 12:46   ` Gal Pressman
  2020-11-16 20:23 ` [PATCH 9/9] verbs: Delete query_device() internal support Jason Gunthorpe
  8 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

Remove the old query_device support code, it is now replaced by
ibv_cmd_query_device_any()

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 libibverbs/cmd.c             | 99 ------------------------------------
 libibverbs/driver.h          |  8 ---
 libibverbs/libibverbs.map.in |  1 -
 3 files changed, 108 deletions(-)

diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index a439f8c06481dd..ec9750e7c04eb4 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -44,7 +44,6 @@
 #include <infiniband/cmd_write.h>
 #include "ibverbs.h"
 #include <ccan/minmax.h>
-#include <util/util.h>
 
 bool verbs_allow_disassociate_destroy;
 
@@ -113,104 +112,6 @@ int ibv_cmd_query_device(struct ibv_context *context,
 	return 0;
 }
 
-int ibv_cmd_query_device_ex(struct ibv_context *context,
-			    const struct ibv_query_device_ex_input *input,
-			    struct ibv_device_attr_ex *attr, size_t attr_size,
-			    uint64_t *raw_fw_ver,
-			    struct ibv_query_device_ex *cmd,
-			    size_t cmd_size,
-			    struct ib_uverbs_ex_query_device_resp *resp,
-			    size_t resp_size)
-{
-	int err;
-
-	if (input && input->comp_mask)
-		return EINVAL;
-
-	if (attr_size < offsetof(struct ibv_device_attr_ex, comp_mask) +
-			sizeof(attr->comp_mask))
-		return EINVAL;
-
-	cmd->comp_mask = 0;
-	cmd->reserved = 0;
-	memset(attr->orig_attr.fw_ver, 0, sizeof(attr->orig_attr.fw_ver));
-	memset(&attr->comp_mask, 0, attr_size - sizeof(attr->orig_attr));
-
-	err = execute_cmd_write_ex(context, IB_USER_VERBS_EX_CMD_QUERY_DEVICE,
-				   cmd, cmd_size, resp, resp_size);
-	if (err)
-		return err;
-
-	copy_query_dev_fields(&attr->orig_attr, &resp->base, raw_fw_ver);
-	/* Report back supported comp_mask bits. For now no comp_mask bit is
-	 * defined */
-	attr->comp_mask = resp->comp_mask & 0;
-
-#define CAN_COPY(_ibv_attr, _uverbs_attr)                                      \
-	(attr_size >= offsetofend(struct ibv_device_attr_ex, _ibv_attr) &&     \
-	 resp->response_length >=                                              \
-		 offsetofend(struct ib_uverbs_ex_query_device_resp,            \
-			     _uverbs_attr))
-
-	if (CAN_COPY(odp_caps, odp_caps)) {
-		attr->odp_caps.general_caps = resp->odp_caps.general_caps;
-		attr->odp_caps.per_transport_caps.rc_odp_caps =
-			resp->odp_caps.per_transport_caps.rc_odp_caps;
-		attr->odp_caps.per_transport_caps.uc_odp_caps =
-			resp->odp_caps.per_transport_caps.uc_odp_caps;
-		attr->odp_caps.per_transport_caps.ud_odp_caps =
-			resp->odp_caps.per_transport_caps.ud_odp_caps;
-	}
-
-	if (CAN_COPY(completion_timestamp_mask, timestamp_mask))
-		attr->completion_timestamp_mask = resp->timestamp_mask;
-
-	if (CAN_COPY(hca_core_clock, hca_core_clock))
-		attr->hca_core_clock = resp->hca_core_clock;
-
-	if (CAN_COPY(device_cap_flags_ex, device_cap_flags_ex))
-		attr->device_cap_flags_ex = resp->device_cap_flags_ex;
-
-	if (CAN_COPY(rss_caps, rss_caps)) {
-		attr->rss_caps.supported_qpts = resp->rss_caps.supported_qpts;
-		attr->rss_caps.max_rwq_indirection_tables =
-			resp->rss_caps.max_rwq_indirection_tables;
-		attr->rss_caps.max_rwq_indirection_table_size =
-			resp->rss_caps.max_rwq_indirection_table_size;
-	}
-
-	if (CAN_COPY(max_wq_type_rq, max_wq_type_rq))
-		attr->max_wq_type_rq = resp->max_wq_type_rq;
-
-	if (CAN_COPY(raw_packet_caps, raw_packet_caps))
-		attr->raw_packet_caps = resp->raw_packet_caps;
-
-	if (CAN_COPY(tm_caps, tm_caps)) {
-		attr->tm_caps.max_rndv_hdr_size =
-			resp->tm_caps.max_rndv_hdr_size;
-		attr->tm_caps.max_num_tags = resp->tm_caps.max_num_tags;
-		attr->tm_caps.flags = resp->tm_caps.flags;
-		attr->tm_caps.max_ops = resp->tm_caps.max_ops;
-		attr->tm_caps.max_sge = resp->tm_caps.max_sge;
-	}
-
-	if (CAN_COPY(cq_mod_caps, cq_moderation_caps)) {
-		attr->cq_mod_caps.max_cq_count =
-			resp->cq_moderation_caps.max_cq_moderation_count;
-		attr->cq_mod_caps.max_cq_period =
-			resp->cq_moderation_caps.max_cq_moderation_period;
-	}
-
-	if (CAN_COPY(max_dm_size, max_dm_size))
-		attr->max_dm_size = resp->max_dm_size;
-
-	if (CAN_COPY(xrc_odp_caps, xrc_odp_caps))
-		attr->xrc_odp_caps = resp->xrc_odp_caps;
-#undef CAN_COPY
-
-	return 0;
-}
-
 int ibv_cmd_alloc_pd(struct ibv_context *context, struct ibv_pd *pd,
 		     struct ibv_alloc_pd *cmd, size_t cmd_size,
 		     struct ib_uverbs_alloc_pd_resp *resp, size_t resp_size)
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index e54db0ea6413e8..33998e227c98ec 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -465,14 +465,6 @@ int ibv_cmd_query_device_any(struct ibv_context *context,
 			     struct ibv_device_attr_ex *attr, size_t attr_size,
 			     struct ib_uverbs_ex_query_device_resp *resp,
 			     size_t *resp_size);
-int ibv_cmd_query_device_ex(struct ibv_context *context,
-			    const struct ibv_query_device_ex_input *input,
-			    struct ibv_device_attr_ex *attr, size_t attr_size,
-			    uint64_t *raw_fw_ver,
-			    struct ibv_query_device_ex *cmd,
-			    size_t cmd_size,
-			    struct ib_uverbs_ex_query_device_resp *resp,
-			    size_t resp_size);
 int ibv_cmd_query_port(struct ibv_context *context, uint8_t port_num,
 		       struct ibv_port_attr *port_attr,
 		       struct ibv_query_port *cmd, size_t cmd_size);
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index c1f7e09b240ab0..672717a6fa551e 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -205,7 +205,6 @@ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ {
 		ibv_cmd_query_context;
 		ibv_cmd_query_device;
 		ibv_cmd_query_device_any;
-		ibv_cmd_query_device_ex;
 		ibv_cmd_query_mr;
 		ibv_cmd_query_port;
 		ibv_cmd_query_qp;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 9/9] verbs: Delete query_device() internal support
  2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2020-11-16 20:23 ` [PATCH 8/9] verbs: Remove dead code Jason Gunthorpe
@ 2020-11-16 20:23 ` Jason Gunthorpe
  8 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-16 20:23 UTC (permalink / raw)
  To: linux-rdma, Bob Pearson

Now that all providers only implement the _ex API have the external API
call query_device_ex() directly and remove everything to do with
the internal query_device op.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 libibverbs/cmd.c       | 65 ------------------------------------------
 libibverbs/device.c    |  1 +
 libibverbs/driver.h    |  6 ----
 libibverbs/dummy_ops.c | 28 +-----------------
 libibverbs/verbs.c     |  5 +++-
 libibverbs/verbs.h     |  3 +-
 providers/cxgb4/dev.c  |  8 +++---
 7 files changed, 12 insertions(+), 104 deletions(-)

diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index ec9750e7c04eb4..d7078823989bd2 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -47,71 +47,6 @@
 
 bool verbs_allow_disassociate_destroy;
 
-static void copy_query_dev_fields(struct ibv_device_attr *device_attr,
-				  struct ib_uverbs_query_device_resp *resp,
-				  uint64_t *raw_fw_ver)
-{
-	*raw_fw_ver				= resp->fw_ver;
-	device_attr->node_guid			= resp->node_guid;
-	device_attr->sys_image_guid		= resp->sys_image_guid;
-	device_attr->max_mr_size		= resp->max_mr_size;
-	device_attr->page_size_cap		= resp->page_size_cap;
-	device_attr->vendor_id			= resp->vendor_id;
-	device_attr->vendor_part_id		= resp->vendor_part_id;
-	device_attr->hw_ver			= resp->hw_ver;
-	device_attr->max_qp			= resp->max_qp;
-	device_attr->max_qp_wr			= resp->max_qp_wr;
-	device_attr->device_cap_flags		= resp->device_cap_flags;
-	device_attr->max_sge			= resp->max_sge;
-	device_attr->max_sge_rd			= resp->max_sge_rd;
-	device_attr->max_cq			= resp->max_cq;
-	device_attr->max_cqe			= resp->max_cqe;
-	device_attr->max_mr			= resp->max_mr;
-	device_attr->max_pd			= resp->max_pd;
-	device_attr->max_qp_rd_atom		= resp->max_qp_rd_atom;
-	device_attr->max_ee_rd_atom		= resp->max_ee_rd_atom;
-	device_attr->max_res_rd_atom		= resp->max_res_rd_atom;
-	device_attr->max_qp_init_rd_atom	= resp->max_qp_init_rd_atom;
-	device_attr->max_ee_init_rd_atom	= resp->max_ee_init_rd_atom;
-	device_attr->atomic_cap			= resp->atomic_cap;
-	device_attr->max_ee			= resp->max_ee;
-	device_attr->max_rdd			= resp->max_rdd;
-	device_attr->max_mw			= resp->max_mw;
-	device_attr->max_raw_ipv6_qp		= resp->max_raw_ipv6_qp;
-	device_attr->max_raw_ethy_qp		= resp->max_raw_ethy_qp;
-	device_attr->max_mcast_grp		= resp->max_mcast_grp;
-	device_attr->max_mcast_qp_attach	= resp->max_mcast_qp_attach;
-	device_attr->max_total_mcast_qp_attach	= resp->max_total_mcast_qp_attach;
-	device_attr->max_ah			= resp->max_ah;
-	device_attr->max_fmr			= resp->max_fmr;
-	device_attr->max_map_per_fmr		= resp->max_map_per_fmr;
-	device_attr->max_srq			= resp->max_srq;
-	device_attr->max_srq_wr			= resp->max_srq_wr;
-	device_attr->max_srq_sge		= resp->max_srq_sge;
-	device_attr->max_pkeys			= resp->max_pkeys;
-	device_attr->local_ca_ack_delay		= resp->local_ca_ack_delay;
-	device_attr->phys_port_cnt		= resp->phys_port_cnt;
-}
-
-int ibv_cmd_query_device(struct ibv_context *context,
-			 struct ibv_device_attr *device_attr,
-			 uint64_t *raw_fw_ver,
-			 struct ibv_query_device *cmd, size_t cmd_size)
-{
-	struct ib_uverbs_query_device_resp resp;
-	int ret;
-
-	ret = execute_cmd_write(context, IB_USER_VERBS_CMD_QUERY_DEVICE, cmd,
-				cmd_size, &resp, sizeof(resp));
-	if (ret)
-		return ret;
-
-	memset(device_attr->fw_ver, 0, sizeof device_attr->fw_ver);
-	copy_query_dev_fields(device_attr, &resp, raw_fw_ver);
-
-	return 0;
-}
-
 int ibv_cmd_alloc_pd(struct ibv_context *context, struct ibv_pd *pd,
 		     struct ibv_alloc_pd *cmd, size_t cmd_size,
 		     struct ib_uverbs_alloc_pd_resp *resp, size_t resp_size)
diff --git a/libibverbs/device.c b/libibverbs/device.c
index e9c429a51c7dd1..331a0871b4a77b 100644
--- a/libibverbs/device.c
+++ b/libibverbs/device.c
@@ -320,6 +320,7 @@ static void set_lib_ops(struct verbs_context *vctx)
 #undef ibv_query_port
 	vctx->context.ops._compat_query_port = ibv_query_port;
 	vctx->query_port = __lib_query_port;
+	vctx->context.ops._compat_query_device = ibv_query_device;
 
 	/*
 	 * In order to maintain backward/forward binary compatibility
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 33998e227c98ec..359b17302fab5d 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -354,8 +354,6 @@ struct verbs_context_ops {
 			    struct ibv_ops_wr **bad_op);
 	int (*post_srq_recv)(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr,
 			     struct ibv_recv_wr **bad_recv_wr);
-	int (*query_device)(struct ibv_context *context,
-			    struct ibv_device_attr *device_attr);
 	int (*query_device_ex)(struct ibv_context *context,
 			       const struct ibv_query_device_ex_input *input,
 			       struct ibv_device_attr_ex *attr,
@@ -449,10 +447,6 @@ int ibv_cmd_get_context(struct verbs_context *context,
 			struct ib_uverbs_get_context_resp *resp, size_t resp_size);
 int ibv_cmd_query_context(struct ibv_context *ctx,
 			  struct ibv_command_buffer *driver);
-int ibv_cmd_query_device(struct ibv_context *context,
-			 struct ibv_device_attr *device_attr,
-			 uint64_t *raw_fw_ver,
-			 struct ibv_query_device *cmd, size_t cmd_size);
 int ibv_cmd_create_flow_action_esp(struct ibv_context *ctx,
 				   struct ibv_flow_action_esp_attr *attr,
 				   struct verbs_flow_action *flow_action,
diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
index 711dfafb5caed5..b6f272dbd8f6de 100644
--- a/libibverbs/dummy_ops.c
+++ b/libibverbs/dummy_ops.c
@@ -377,35 +377,11 @@ static int post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr,
 	return EOPNOTSUPP;
 }
 
-static int query_device(struct ibv_context *context,
-			struct ibv_device_attr *device_attr)
-{
-	const struct verbs_context_ops *ops = get_ops(context);
-
-	if (!ops->query_device_ex)
-		return EOPNOTSUPP;
-	return ops->query_device_ex(
-		context, NULL,
-		container_of(device_attr, struct ibv_device_attr_ex, orig_attr),
-		sizeof(*device_attr));
-}
-
-/* Provide a generic implementation for all providers that don't implement
- * query_device_ex.
- */
 static int query_device_ex(struct ibv_context *context,
 			   const struct ibv_query_device_ex_input *input,
 			   struct ibv_device_attr_ex *attr, size_t attr_size)
 {
-	if (input && input->comp_mask)
-		return EINVAL;
-
-	if (attr_size < sizeof(attr->orig_attr))
-		return EOPNOTSUPP;
-
-	memset(attr, 0, attr_size);
-
-	return ibv_query_device(context, &attr->orig_attr);
+	return EOPNOTSUPP;
 }
 
 static int query_ece(struct ibv_qp *qp, struct ibv_ece *ece)
@@ -558,7 +534,6 @@ const struct verbs_context_ops verbs_dummy_ops = {
 	post_send,
 	post_srq_ops,
 	post_srq_recv,
-	query_device,
 	query_device_ex,
 	query_ece,
 	query_port,
@@ -680,7 +655,6 @@ void verbs_set_ops(struct verbs_context *vctx,
 	SET_OP(ctx, post_send);
 	SET_OP(vctx, post_srq_ops);
 	SET_OP(ctx, post_srq_recv);
-	SET_PRIV_OP(ctx, query_device);
 	SET_OP(vctx, query_device_ex);
 	SET_PRIV_OP_IC(vctx, query_ece);
 	SET_PRIV_OP_IC(ctx, query_port);
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index 7fc10240cf9def..18f5cba8c49525 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -156,7 +156,10 @@ LATEST_SYMVER_FUNC(ibv_query_device, 1_1, "IBVERBS_1.1",
 		   struct ibv_context *context,
 		   struct ibv_device_attr *device_attr)
 {
-	return get_ops(context)->query_device(context, device_attr);
+	return get_ops(context)->query_device_ex(
+		context, NULL,
+		container_of(device_attr, struct ibv_device_attr_ex, orig_attr),
+		sizeof(*device_attr));
 }
 
 int __lib_query_port(struct ibv_context *context, uint8_t port_num,
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index ee57e0526d65b4..aafab2ab5547bd 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -1922,7 +1922,8 @@ struct ibv_device {
 
 struct _compat_ibv_port_attr;
 struct ibv_context_ops {
-	void *(*_compat_query_device)(void);
+	int (*_compat_query_device)(struct ibv_context *context,
+				    struct ibv_device_attr *device_attr);
 	int (*_compat_query_port)(struct ibv_context *context,
 				  uint8_t port_num,
 				  struct _compat_ibv_port_attr *port_attr);
diff --git a/providers/cxgb4/dev.c b/providers/cxgb4/dev.c
index 76b78d9b29a71c..c42c2300f1751f 100644
--- a/providers/cxgb4/dev.c
+++ b/providers/cxgb4/dev.c
@@ -114,8 +114,6 @@ static struct verbs_context *c4iw_alloc_context(struct ibv_device *ibdev,
 	struct ibv_get_context cmd;
 	struct uc4iw_alloc_ucontext_resp resp;
 	struct c4iw_dev *rhp = to_c4iw_dev(ibdev);
-	struct ibv_query_device qcmd;
-	uint64_t raw_fw_ver;
 	struct ibv_device_attr attr;
 
 	context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx,
@@ -143,8 +141,10 @@ static struct verbs_context *c4iw_alloc_context(struct ibv_device *ibdev,
 	} 
 
 	verbs_set_ops(&context->ibv_ctx, &c4iw_ctx_common_ops);
-	if (ibv_cmd_query_device(&context->ibv_ctx.context, &attr,
-				 &raw_fw_ver, &qcmd, sizeof(qcmd)))
+	if (c4iw_query_device(&context->ibv_ctx.context, NULL,
+			      container_of(&attr, struct ibv_device_attr_ex,
+					   orig_attr),
+			      sizeof(attr)))
 		goto err_unmap;
 
 	if (!rhp->mmid2ptr) {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/9] mlx5: Move context initialization out of mlx5_query_device_ex()
  2020-11-16 20:23 ` [PATCH 3/9] mlx5: Move context initialization out of mlx5_query_device_ex() Jason Gunthorpe
@ 2020-11-17 17:24   ` Yishai Hadas
  2020-11-17 19:08     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Yishai Hadas @ 2020-11-17 17:24 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma, Bob Pearson, Yishai Hadas, maorg

On 11/16/2020 10:23 PM, Jason Gunthorpe wrote:
> When the user calls mlx5_query_device_ex() it should not cause the context
> values to be mutated, only the attribute should be returned.
>
> Move this code to a dedicated function that is only called during context
> setup.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   providers/mlx5/mlx5.c  | 10 +------
>   providers/mlx5/mlx5.h  |  1 +
>   providers/mlx5/verbs.c | 62 ++++++++++++++++++++++++++++--------------
>   3 files changed, 44 insertions(+), 29 deletions(-)
>
> diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
> index 1378acf2e2f3af..06b9a52ebb3019 100644
> --- a/providers/mlx5/mlx5.c
> +++ b/providers/mlx5/mlx5.c
> @@ -1373,7 +1373,6 @@ static int mlx5_set_context(struct mlx5_context *context,
>   {
>   	struct verbs_context *v_ctx = &context->ibv_ctx;
>   	struct ibv_port_attr port_attr = {};
> -	struct ibv_device_attr_ex device_attr = {};
>   	int cmd_fd = v_ctx->context.cmd_fd;
>   	struct mlx5_device *mdev = to_mdev(v_ctx->context.device);
>   	struct ibv_device *ibdev = v_ctx->context.device;
> @@ -1518,14 +1517,7 @@ bf_done:
>   			goto err_free;
>   	}
>   
> -	if (!mlx5_query_device_ex(&v_ctx->context, NULL, &device_attr,
> -				  sizeof(struct ibv_device_attr_ex))) {
> -		context->cached_device_cap_flags =
> -			device_attr.orig_attr.device_cap_flags;
> -		context->atomic_cap = device_attr.orig_attr.atomic_cap;
> -		context->cached_tso_caps = device_attr.tso_caps;
> -		context->max_dm_size = device_attr.max_dm_size;
> -	}
> +	mlx5_query_device_ctx(context);
>   
>   	for (j = 0; j < min(MLX5_MAX_PORTS_NUM, context->num_ports); ++j) {
>   		memset(&port_attr, 0, sizeof(port_attr));
> diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h
> index 782d29bf757e0b..72e710b7b5e4aa 100644
> --- a/providers/mlx5/mlx5.h
> +++ b/providers/mlx5/mlx5.h
> @@ -878,6 +878,7 @@ __be32 *mlx5_alloc_dbrec(struct mlx5_context *context, struct ibv_pd *pd,
>   void mlx5_free_db(struct mlx5_context *context, __be32 *db, struct ibv_pd *pd,
>   		  bool custom_alloc);
>   
> +void mlx5_query_device_ctx(struct mlx5_context *mctx);
>   int mlx5_query_device(struct ibv_context *context,
>   		       struct ibv_device_attr *attr);
>   int mlx5_query_device_ex(struct ibv_context *context,
> diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
> index 3622cae1df5017..42c984033d8eaa 100644
> --- a/providers/mlx5/verbs.c
> +++ b/providers/mlx5/verbs.c
> @@ -3450,19 +3450,19 @@ static void get_pci_atomic_caps(struct ibv_context *context,
>   	}
>   }
>   
> -static void get_lag_caps(struct ibv_context *ctx)
> +static void get_lag_caps(struct mlx5_context *mctx)
>   {
>   	uint16_t opmod = MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE |
>   		HCA_CAP_OPMOD_GET_CUR;
>   	uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {};
>   	uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {};
> -	struct mlx5_context *mctx = to_mctx(ctx);
>   	int ret;
>   
>   	DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
>   	DEVX_SET(query_hca_cap_in, in, op_mod, opmod);
>   
> -	ret = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
> +	ret = mlx5dv_devx_general_cmd(&mctx->ibv_ctx.context, in, sizeof(in),
> +				      out, sizeof(out));
>   	if (ret)
>   		return;
>   
> @@ -3512,6 +3512,41 @@ int mlx5_query_device_ex(struct ibv_context *context,
>   	attr->packet_pacing_caps.supported_qpts =
>   		resp.packet_pacing_caps.supported_qpts;
>   
> +	major     = (raw_fw_ver >> 32) & 0xffff;
> +	minor     = (raw_fw_ver >> 16) & 0xffff;
> +	sub_minor = raw_fw_ver & 0xffff;
> +	a = &attr->orig_attr;
> +	snprintf(a->fw_ver, sizeof(a->fw_ver), "%d.%d.%04d",
> +		 major, minor, sub_minor);
> +
> +	if (attr_size >= offsetof(struct ibv_device_attr_ex, pci_atomic_caps) +
> +			sizeof(attr->pci_atomic_caps))
> +		get_pci_atomic_caps(context, attr);
> +
> +	return 0;
> +}
> +
> +void mlx5_query_device_ctx(struct mlx5_context *mctx)
> +{
> +	struct ibv_device_attr_ex device_attr;
> +	struct mlx5_query_device_ex_resp resp;
> +	size_t resp_size = sizeof(resp);
> +
> +	get_lag_caps(mctx);
> +
> +	if (!(mctx->cmds_supp_uhw & MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE))
> +		return;
> +
Any reason to not read some applicable context fields (e.g. max_dm_size) 
over uverbs by reducing the resp_size to the ib part ?
> +	if (ibv_cmd_query_device_any(&mctx->ibv_ctx.context, NULL, &device_attr,
> +				     sizeof(device_attr), &resp.ibv_resp,
> +				     &resp_size))
> +		return;
> +
> +	mctx->cached_device_cap_flags = device_attr.orig_attr.device_cap_flags;
> +	mctx->atomic_cap = device_attr.orig_attr.atomic_cap;
> +	mctx->cached_tso_caps = device_attr.tso_caps;

The device_attr.tso_caps is not set over uverbs / cmd_device.c, it comes 
as UHW.
The below should be done instead.

mctx->cached_tso_caps.max_tso = resp.tso_caps.max_tso;
mctx->cached_tso_caps.supported_qpts = resp.tso_caps.supported_qpts;

> +	mctx->max_dm_size = device_attr.max_dm_size;
> +
>   	if (resp.mlx5_ib_support_multi_pkt_send_wqes & MLX5_IB_ALLOW_MPW)
>   		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_MPW_ALLOWED;
>   
> @@ -3519,7 +3554,8 @@ int mlx5_query_device_ex(struct ibv_context *context,
>   		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_ENHANCED_MPW;
>   
>   	mctx->cqe_comp_caps.max_num = resp.cqe_comp_caps.max_num;
> -	mctx->cqe_comp_caps.supported_format = resp.cqe_comp_caps.supported_format;
> +	mctx->cqe_comp_caps.supported_format =
> +		resp.cqe_comp_caps.supported_format;
>   	mctx->sw_parsing_caps.sw_parsing_offloads =
>   		resp.sw_parsing_caps.sw_parsing_offloads;
>   	mctx->sw_parsing_caps.supported_qpts =
> @@ -3544,25 +3580,11 @@ int mlx5_query_device_ex(struct ibv_context *context,
>   		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_CQE_128B_PAD;
>   
>   	if (resp.flags & MLX5_IB_QUERY_DEV_RESP_PACKET_BASED_CREDIT_MODE)
> -		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_PACKET_BASED_CREDIT_MODE;
> +		mctx->vendor_cap_flags |=
> +			MLX5_VENDOR_CAP_FLAGS_PACKET_BASED_CREDIT_MODE;
>   
>   	if (resp.flags & MLX5_IB_QUERY_DEV_RESP_FLAGS_SCAT2CQE_DCT)
>   		mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_SCAT2CQE_DCT;
> -
> -	major     = (raw_fw_ver >> 32) & 0xffff;
> -	minor     = (raw_fw_ver >> 16) & 0xffff;
> -	sub_minor = raw_fw_ver & 0xffff;
> -	a = &attr->orig_attr;
> -	snprintf(a->fw_ver, sizeof(a->fw_ver), "%d.%d.%04d",
> -		 major, minor, sub_minor);
> -
> -	if (attr_size >= offsetof(struct ibv_device_attr_ex, pci_atomic_caps) +
> -			sizeof(attr->pci_atomic_caps))
> -		get_pci_atomic_caps(context, attr);
> -
> -	get_lag_caps(context);
> -
> -	return 0;
>   }
>   
>   static int rwq_sig_enabled(struct ibv_context *context)



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/9] mlx5: Move context initialization out of mlx5_query_device_ex()
  2020-11-17 17:24   ` Yishai Hadas
@ 2020-11-17 19:08     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-17 19:08 UTC (permalink / raw)
  To: Yishai Hadas; +Cc: linux-rdma, Bob Pearson, maorg

On Tue, Nov 17, 2020 at 07:24:41PM +0200, Yishai Hadas wrote:

> > +void mlx5_query_device_ctx(struct mlx5_context *mctx)
> > +{
> > +	struct ibv_device_attr_ex device_attr;
> > +	struct mlx5_query_device_ex_resp resp;
> > +	size_t resp_size = sizeof(resp);
> > +
> > +	get_lag_caps(mctx);
> > +
> > +	if (!(mctx->cmds_supp_uhw & MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE))
> > +		return;
> > +

> Any reason to not read some applicable context fields (e.g. max_dm_size)
> over uverbs by reducing the resp_size to the ib part ?

No, I missed the device_attr detail

> > +	if (ibv_cmd_query_device_any(&mctx->ibv_ctx.context, NULL, &device_attr,
> > +				     sizeof(device_attr), &resp.ibv_resp,
> > +				     &resp_size))
> > +		return;
> > +
> > +	mctx->cached_device_cap_flags = device_attr.orig_attr.device_cap_flags;
> > +	mctx->atomic_cap = device_attr.orig_attr.atomic_cap;
> > +	mctx->cached_tso_caps = device_attr.tso_caps;
> 
> The device_attr.tso_caps is not set over uverbs / cmd_device.c, it
> comes as UHW.  The below should be done instead.

Got it

I updated the github

Thanks,
Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex()
  2020-11-16 20:23 ` [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex() Jason Gunthorpe
@ 2020-11-18  7:39   ` Gal Pressman
  2020-11-18  8:07   ` Gal Pressman
  2020-11-18 12:45   ` Gal Pressman
  2 siblings, 0 replies; 21+ messages in thread
From: Gal Pressman @ 2020-11-18  7:39 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-rdma, Bob Pearson

On 16/11/2020 22:23, Jason Gunthorpe wrote:
> When the user calls efa_query_device_ex() it should not cause the context
> values to be mutated, only the attribute shuld be returned.
> 
> Move this code to a dedicated function that is only called during context
> setup.
> 
> Cc: Gal Pressman <galpress@amazon.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Didn't get a chance to review yet, but this one breaks EFA.
I'll try to provide more info today/tomorrow.

(BTW: typo in the subject line "intialization", and "shuld" in the commit message)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex()
  2020-11-16 20:23 ` [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex() Jason Gunthorpe
  2020-11-18  7:39   ` Gal Pressman
@ 2020-11-18  8:07   ` Gal Pressman
  2020-11-18 12:45   ` Gal Pressman
  2 siblings, 0 replies; 21+ messages in thread
From: Gal Pressman @ 2020-11-18  8:07 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-rdma, Bob Pearson

On 16/11/2020 22:23, Jason Gunthorpe wrote:
> When the user calls efa_query_device_ex() it should not cause the context
> values to be mutated, only the attribute shuld be returned.
> 
> Move this code to a dedicated function that is only called during context
> setup.
> 
> Cc: Gal Pressman <galpress@amazon.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  providers/efa/efa.c   | 14 +------------
>  providers/efa/verbs.c | 46 +++++++++++++++++++++++++++++++++++--------
>  providers/efa/verbs.h |  1 +
>  3 files changed, 40 insertions(+), 21 deletions(-)
> 
> diff --git a/providers/efa/efa.c b/providers/efa/efa.c
> index 35f9b246a711ec..b24c14f7fa1fe1 100644
> --- a/providers/efa/efa.c
> +++ b/providers/efa/efa.c
> @@ -54,10 +54,7 @@ static struct verbs_context *efa_alloc_context(struct ibv_device *vdev,
>  {
>  	struct efa_alloc_ucontext_resp resp = {};
>  	struct efa_alloc_ucontext cmd = {};
> -	struct ibv_device_attr_ex attr;
> -	unsigned int qp_table_sz;
>  	struct efa_context *ctx;
> -	int err;
>  
>  	cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_TX_BATCH;
>  	cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_MIN_SQ_WR;
> @@ -86,17 +83,8 @@ static struct verbs_context *efa_alloc_context(struct ibv_device *vdev,
>  
>  	verbs_set_ops(&ctx->ibvctx, &efa_ctx_ops);
>  
> -	err = efa_query_device_ex(&ctx->ibvctx.context, NULL, &attr,
> -				  sizeof(attr));
> -	if (err)
> +	if (!efa_query_device_ctx(ctx))
>  		goto err_free_spinlock;

This return error on success seems to be the issue, will verify.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/9] verbs: Add ibv_cmd_query_device_any()
  2020-11-16 20:23 ` [PATCH 2/9] verbs: Add ibv_cmd_query_device_any() Jason Gunthorpe
@ 2020-11-18 12:43   ` Gal Pressman
  2020-11-18 12:45     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Gal Pressman @ 2020-11-18 12:43 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-rdma, Bob Pearson

On 16/11/2020 22:23, Jason Gunthorpe wrote:
> This implements all the query_device command flows under a single call.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  libibverbs/cmd_device.c      | 155 +++++++++++++++++++++++++++++++++++
>  libibverbs/driver.h          |   5 ++
>  libibverbs/libibverbs.map.in |   1 +
>  3 files changed, 161 insertions(+)
> 
> diff --git a/libibverbs/cmd_device.c b/libibverbs/cmd_device.c
> index 6c8e01ec9866a9..0019784ee779c1 100644
> --- a/libibverbs/cmd_device.c
> +++ b/libibverbs/cmd_device.c
> @@ -35,6 +35,7 @@
>  #include <stdlib.h>
>  #include <dirent.h>
>  #include <infiniband/cmd_write.h>
> +#include <util/util.h>
>  
>  #include <net/if.h>
>  
> @@ -516,3 +517,157 @@ ssize_t _ibv_query_gid_table(struct ibv_context *context,
>  
>  	return num_entries;
>  }
> +
> +int ibv_cmd_query_device_any(struct ibv_context *context,
> +			     const struct ibv_query_device_ex_input *input,
> +			     struct ibv_device_attr_ex *attr, size_t attr_size,
> +			     struct ib_uverbs_ex_query_device_resp *resp,
> +			     size_t *resp_size)
> +{
> +	struct ib_uverbs_ex_query_device_resp internal_resp;
> +	size_t internal_resp_size;
> +	int err;
> +
> +	if (input && input->comp_mask)
> +		return EINVAL;
> +	if (attr_size < sizeof(attr->orig_attr))
> +		return EINVAL;
> +
> +	if (!resp) {
> +		resp = &internal_resp;
> +		internal_resp_size = sizeof(internal_resp);
> +		resp_size = &internal_resp_size;
> +	}
> +	memset(attr, 0, attr_size);
> +	memset(resp, 0, *resp_size);
> +
> +	if (attr_size > sizeof(attr->orig_attr)) {
> +		struct ibv_query_device_ex cmd = {};
> +
> +		err = execute_cmd_write_ex(context,
> +					   IB_USER_VERBS_EX_CMD_QUERY_DEVICE,
> +					   &cmd, sizeof(cmd), resp, *resp_size);
> +		if (err) {
> +			if (err != EOPNOTSUPP)

Are you sure about that?
I think older kernels return ENOSYS.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex()
  2020-11-16 20:23 ` [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex() Jason Gunthorpe
  2020-11-18  7:39   ` Gal Pressman
  2020-11-18  8:07   ` Gal Pressman
@ 2020-11-18 12:45   ` Gal Pressman
  2020-11-18 20:59     ` Jason Gunthorpe
  2 siblings, 1 reply; 21+ messages in thread
From: Gal Pressman @ 2020-11-18 12:45 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-rdma, Bob Pearson

On 16/11/2020 22:23, Jason Gunthorpe wrote:
> When the user calls efa_query_device_ex() it should not cause the context
> values to be mutated, only the attribute shuld be returned.
> 
> Move this code to a dedicated function that is only called during context
> setup.
> 
> Cc: Gal Pressman <galpress@amazon.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  providers/efa/efa.c   | 14 +------------
>  providers/efa/verbs.c | 46 +++++++++++++++++++++++++++++++++++--------
>  providers/efa/verbs.h |  1 +
>  3 files changed, 40 insertions(+), 21 deletions(-)
> 
> diff --git a/providers/efa/efa.c b/providers/efa/efa.c
> index 35f9b246a711ec..b24c14f7fa1fe1 100644
> --- a/providers/efa/efa.c
> +++ b/providers/efa/efa.c
> @@ -54,10 +54,7 @@ static struct verbs_context *efa_alloc_context(struct ibv_device *vdev,
>  {
>  	struct efa_alloc_ucontext_resp resp = {};
>  	struct efa_alloc_ucontext cmd = {};
> -	struct ibv_device_attr_ex attr;
> -	unsigned int qp_table_sz;
>  	struct efa_context *ctx;
> -	int err;
>  
>  	cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_TX_BATCH;
>  	cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_MIN_SQ_WR;
> @@ -86,17 +83,8 @@ static struct verbs_context *efa_alloc_context(struct ibv_device *vdev,
>  
>  	verbs_set_ops(&ctx->ibvctx, &efa_ctx_ops);
>  
> -	err = efa_query_device_ex(&ctx->ibvctx.context, NULL, &attr,
> -				  sizeof(attr));
> -	if (err)
> +	if (!efa_query_device_ctx(ctx))

Remove the not.

>  		goto err_free_spinlock;
> -
> -	qp_table_sz = roundup_pow_of_two(attr.orig_attr.max_qp);
> -	ctx->qp_table_sz_m1 = qp_table_sz - 1;
> -	ctx->qp_table = calloc(qp_table_sz, sizeof(*ctx->qp_table));
> -	if (!ctx->qp_table)
> -		goto err_free_spinlock;
> -
>  	return &ctx->ibvctx;
>  
>  err_free_spinlock:
> diff --git a/providers/efa/verbs.c b/providers/efa/verbs.c
> index 1a9633155c62f8..52d6285f1f409c 100644
> --- a/providers/efa/verbs.c
> +++ b/providers/efa/verbs.c
> @@ -106,14 +106,6 @@ int efa_query_device_ex(struct ibv_context *context,
>  	if (err)
>  		return err;
>  
> -	ctx->device_caps = resp.device_caps;
> -	ctx->max_sq_wr = resp.max_sq_wr;
> -	ctx->max_rq_wr = resp.max_rq_wr;
> -	ctx->max_sq_sge = resp.max_sq_sge;
> -	ctx->max_rq_sge = resp.max_rq_sge;
> -	ctx->max_rdma_size = resp.max_rdma_size;
> -	ctx->max_wr_rdma_sge = a->max_sge_rd;
> -
>  	a->max_qp_wr = min_t(int, a->max_qp_wr,
>  			     ctx->max_llq_size / sizeof(struct efa_io_tx_wqe));
>  	snprintf(a->fw_ver, sizeof(a->fw_ver), "%u.%u.%u.%u",
> @@ -122,6 +114,44 @@ int efa_query_device_ex(struct ibv_context *context,
>  	return 0;
>  }
>  
> +int efa_query_device_ctx(struct efa_context *ctx)
> +{
> +	struct ibv_device_attr_ex attr;
> +	struct efa_query_device_ex_resp resp;

Preferably I would put this first.

> +	size_t resp_size = sizeof(resp);
> +	unsigned int qp_table_sz;
> +	int err;
> +
> +	if (ctx->cmds_supp_udata_mask & EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE) {
> +		err = ibv_cmd_query_device_any(&ctx->ibvctx.context, NULL,
> +					       &attr, sizeof(attr),
> +					       &resp.ibv_resp, &resp_size);
> +		if (err)
> +			return err;
> +
> +		ctx->device_caps = resp.device_caps;
> +		ctx->max_sq_wr = resp.max_sq_wr;
> +		ctx->max_rq_wr = resp.max_rq_wr;
> +		ctx->max_sq_sge = resp.max_sq_sge;
> +		ctx->max_rq_sge = resp.max_rq_sge;
> +		ctx->max_rdma_size = resp.max_rdma_size;
> +		ctx->max_wr_rdma_sge = attr.orig_attr.max_sge_rd;

max_wr_rdma_sge assignment can be done in the else clause as well.

> +	} else {
> +		err = ibv_cmd_query_device_any(&ctx->ibvctx.context, NULL,
> +					       &attr, sizeof(attr.orig_attr),
> +					       NULL, NULL);
> +		if (err)
> +			return err;
> +	}
> +
> +	qp_table_sz = roundup_pow_of_two(attr.orig_attr.max_qp);
> +	ctx->qp_table_sz_m1 = qp_table_sz - 1;
> +	ctx->qp_table = calloc(qp_table_sz, sizeof(*ctx->qp_table));
> +	if (!ctx->qp_table)
> +		return ENOMEM;
> +	return 0;
> +}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/9] verbs: Add ibv_cmd_query_device_any()
  2020-11-18 12:43   ` Gal Pressman
@ 2020-11-18 12:45     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-18 12:45 UTC (permalink / raw)
  To: Gal Pressman; +Cc: linux-rdma, Bob Pearson

On Wed, Nov 18, 2020 at 02:43:57PM +0200, Gal Pressman wrote:
> On 16/11/2020 22:23, Jason Gunthorpe wrote:

> > +int ibv_cmd_query_device_any(struct ibv_context *context,
> > +			     const struct ibv_query_device_ex_input *input,
> > +			     struct ibv_device_attr_ex *attr, size_t attr_size,
> > +			     struct ib_uverbs_ex_query_device_resp *resp,
> > +			     size_t *resp_size)
> > +{
> > +	struct ib_uverbs_ex_query_device_resp internal_resp;
> > +	size_t internal_resp_size;
> > +	int err;
> > +
> > +	if (input && input->comp_mask)
> > +		return EINVAL;
> > +	if (attr_size < sizeof(attr->orig_attr))
> > +		return EINVAL;
> > +
> > +	if (!resp) {
> > +		resp = &internal_resp;
> > +		internal_resp_size = sizeof(internal_resp);
> > +		resp_size = &internal_resp_size;
> > +	}
> > +	memset(attr, 0, attr_size);
> > +	memset(resp, 0, *resp_size);
> > +
> > +	if (attr_size > sizeof(attr->orig_attr)) {
> > +		struct ibv_query_device_ex cmd = {};
> > +
> > +		err = execute_cmd_write_ex(context,
> > +					   IB_USER_VERBS_EX_CMD_QUERY_DEVICE,
> > +					   &cmd, sizeof(cmd), resp, *resp_size);
> > +		if (err) {
> > +			if (err != EOPNOTSUPP)
> 
> Are you sure about that?
> I think older kernels return ENOSYS.

Oh, that is possibly true, we changed those at one point, I didn't
test that far back.

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 8/9] verbs: Remove dead code
  2020-11-16 20:23 ` [PATCH 8/9] verbs: Remove dead code Jason Gunthorpe
@ 2020-11-18 12:46   ` Gal Pressman
  2020-11-18 12:53     ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Gal Pressman @ 2020-11-18 12:46 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-rdma, Bob Pearson

On 16/11/2020 22:23, Jason Gunthorpe wrote:
> Remove the old query_device support code, it is now replaced by
> ibv_cmd_query_device_any()
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Shouldn't the legacy fallback in ibv_query_device_ex() be removed as well?

/**
 * ibv_query_device_ex - Get extended device properties
 */
static inline int
ibv_query_device_ex(struct ibv_context *context,
		    const struct ibv_query_device_ex_input *input,
		    struct ibv_device_attr_ex *attr)
{
	struct verbs_context *vctx;
	int ret;

	if (input && input->comp_mask)
		return EINVAL;

	vctx = verbs_get_ctx_op(context, query_device_ex);
	if (!vctx)
		goto legacy;

	ret = vctx->query_device_ex(context, input, attr, sizeof(*attr));
	if (ret == EOPNOTSUPP || ret == ENOSYS)
		goto legacy;

	return ret;

legacy:
	memset(attr, 0, sizeof(*attr));
	ret = ibv_query_device(context, &attr->orig_attr);

	return ret;
}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/9] providers: Remove normal query_device() from providers that have _ex
  2020-11-16 20:23 ` [PATCH 6/9] providers: Remove normal query_device() from providers that have _ex Jason Gunthorpe
@ 2020-11-18 12:47   ` Gal Pressman
  0 siblings, 0 replies; 21+ messages in thread
From: Gal Pressman @ 2020-11-18 12:47 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-rdma, Bob Pearson

On 16/11/2020 22:23, Jason Gunthorpe wrote:
> The ex callback can implement both versions, no reason for duplicate
> code in two paths. Have the core code call the _ex version instead.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Gal Pressman <galpress@amazon.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 8/9] verbs: Remove dead code
  2020-11-18 12:46   ` Gal Pressman
@ 2020-11-18 12:53     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-18 12:53 UTC (permalink / raw)
  To: Gal Pressman; +Cc: linux-rdma, Bob Pearson

On Wed, Nov 18, 2020 at 02:46:05PM +0200, Gal Pressman wrote:
> On 16/11/2020 22:23, Jason Gunthorpe wrote:
> > Remove the old query_device support code, it is now replaced by
> > ibv_cmd_query_device_any()
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> Shouldn't the legacy fallback in ibv_query_device_ex() be removed as well?

I thought about it, and yes we could convert that into a real function
call and remove all the junk

But if we do that it causes a symbol ver dependencies and if we are
going that far then we should really just do all of the
verbs_get_ctx_op() inlines and be done with that mess entirely.

This would mean anything linked with a new rdma-core would not be able
to load on any of the older ones..

I've been saving that gem for some future adventure since it seems to
cause some general pain. It would be good to couple it with some other
verbs.h clean up as well

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex()
  2020-11-18 12:45   ` Gal Pressman
@ 2020-11-18 20:59     ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2020-11-18 20:59 UTC (permalink / raw)
  To: Gal Pressman; +Cc: linux-rdma, Bob Pearson

On Wed, Nov 18, 2020 at 02:45:42PM +0200, Gal Pressman wrote:

> > +	size_t resp_size = sizeof(resp);
> > +	unsigned int qp_table_sz;
> > +	int err;
> > +
> > +	if (ctx->cmds_supp_udata_mask & EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE) {
> > +		err = ibv_cmd_query_device_any(&ctx->ibvctx.context, NULL,
> > +					       &attr, sizeof(attr),
> > +					       &resp.ibv_resp, &resp_size);
> > +		if (err)
> > +			return err;
> > +
> > +		ctx->device_caps = resp.device_caps;
> > +		ctx->max_sq_wr = resp.max_sq_wr;
> > +		ctx->max_rq_wr = resp.max_rq_wr;
> > +		ctx->max_sq_sge = resp.max_sq_sge;
> > +		ctx->max_rq_sge = resp.max_rq_sge;
> > +		ctx->max_rdma_size = resp.max_rdma_size;
> > +		ctx->max_wr_rdma_sge = attr.orig_attr.max_sge_rd;
> 
> max_wr_rdma_sge assignment can be done in the else clause as well.

Yes, it is the same mistake as I did in mlx5

I updated everything, thanks

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-11-18 21:00 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16 20:23 [PATCH 0/9] Simplify query_device() in libibverbs Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 1/9] verbs: Simplify query_device_ex Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 2/9] verbs: Add ibv_cmd_query_device_any() Jason Gunthorpe
2020-11-18 12:43   ` Gal Pressman
2020-11-18 12:45     ` Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 3/9] mlx5: Move context initialization out of mlx5_query_device_ex() Jason Gunthorpe
2020-11-17 17:24   ` Yishai Hadas
2020-11-17 19:08     ` Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 4/9] efa: Move the context intialization out of efa_query_device_ex() Jason Gunthorpe
2020-11-18  7:39   ` Gal Pressman
2020-11-18  8:07   ` Gal Pressman
2020-11-18 12:45   ` Gal Pressman
2020-11-18 20:59     ` Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 5/9] mlx4: Move the context intialization out of mlx4_query_device_ex() Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 6/9] providers: Remove normal query_device() from providers that have _ex Jason Gunthorpe
2020-11-18 12:47   ` Gal Pressman
2020-11-16 20:23 ` [PATCH 7/9] providers: Convert all providers to implement query_device_ex Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 8/9] verbs: Remove dead code Jason Gunthorpe
2020-11-18 12:46   ` Gal Pressman
2020-11-18 12:53     ` Jason Gunthorpe
2020-11-16 20:23 ` [PATCH 9/9] verbs: Delete query_device() internal support Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).