linux-rdma.vger.kernel.org archive mirror
* [PATCH rdma-core 0/8] verbs: Query GID table API
@ 2020-09-14  6:34 Yishai Hadas
  2020-09-14  6:34 ` [PATCH rdma-core 1/8] Update kernel headers Yishai Hadas
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:34 UTC
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih

When an application does not use RDMA CM and uses multiple RDMA devices with
one or more RoCE ports, finding the right GID table entry is a long process.

For example, in a system with two dual-port RoCE devices where IP failover is
used between two RoCE ports, finding a suitable GID entry for a given source
IP, with a matching netdevice and the given RoCEv1/v2 type, requires iterating
over the 256-entry GID tables of all 4 ports.

Even when the first matching GID entry for the given criteria is used, if the
matching entry is on the 4th port, finding it requires reading 3 ports * 256
entries * 3 files (GID, netdev, type) = 2304 files.  The GID table has to be
consulted on every QP creation during IP failover on another netdevice of an
RDMA device.
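
The arithmetic above can be written out as a small C sketch. It is only an
illustration: the function names are hypothetical and not part of this series,
and it assumes every port before the matching one has its table scanned in
full, with one open, one read and one close per sysfs file.

```c
#include <assert.h>
#include <stddef.h>

/* sysfs attribute files read per GID entry: gid, netdev, type. */
enum { SYSFS_FILES_PER_GID = 3 };

/*
 * Files read before reaching the port that holds the match.
 * match_port is 1-based and counted across all devices; every earlier
 * port is assumed to be scanned in full (entries_per_port entries).
 */
static size_t sysfs_files_before_match(size_t match_port,
				       size_t entries_per_port)
{
	assert(match_port >= 1);
	return (match_port - 1) * entries_per_port * SYSFS_FILES_PER_GID;
}

/* Each file costs one open, one read and one close system call. */
static size_t sysfs_syscalls_before_match(size_t match_port,
					  size_t entries_per_port)
{
	return 3 * sysfs_files_before_match(match_port, entries_per_port);
}
```

With the match on the 4th port and 256 entries per port, the first function
yields the 2304 files quoted above.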

This series introduces an API to query the complete GID table of an RDMA
device; it returns all valid GID table entries.

This is done through a single ioctl per device, reducing 2304 read, 2304 open
and 2304 close system calls to a total of 2 calls (one for each device).

While at it, we also introduce an API to query an individual GID entry over
the ioctl interface, which provides all the attributes of that GID.
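
Once the valid entries are available in memory, the search described at the
top of this letter becomes a plain array scan. The sketch below is an
illustration under stated assumptions: struct gid_entry and enum gid_type are
local mirrors of the types proposed in patch 3/8, and find_gid_entry() is a
hypothetical helper, not a function added by this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the proposed types, so the sketch builds on its own. */
enum gid_type { GID_TYPE_IB, GID_TYPE_ROCE_V1, GID_TYPE_ROCE_V2 };

struct gid_entry {
	uint8_t gid[16];
	uint32_t gid_index;
	uint32_t port_num;
	uint32_t gid_type;	/* enum gid_type */
	uint32_t ndev_ifindex;	/* 0 if no netdev is associated */
};

/*
 * Return the first entry whose GID, type and netdev all match,
 * or NULL if the table holds no such entry.
 */
static const struct gid_entry *
find_gid_entry(const struct gid_entry *tbl, size_t n, const uint8_t gid[16],
	       uint32_t gid_type, uint32_t ndev_ifindex)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (tbl[i].gid_type == gid_type &&
		    tbl[i].ndev_ifindex == ndev_ifindex &&
		    !memcmp(tbl[i].gid, gid, sizeof(tbl[i].gid)))
			return &tbl[i];
	}
	return NULL;
}
```

The returned entry carries both port_num and gid_index, which is what a
caller needs when creating a QP after failover.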

The APIs are based on the RFC below [1]; the matching kernel part was sent to
rdma-next.

A PR was sent as well [2].

[1] https://www.spinics.net/lists/linux-rdma/msg91825.html
[2] https://github.com/linux-rdma/rdma-core/pull/828

Avihai Horon (7):
  verbs: Change the name of enum ibv_gid_type
  verbs: Introduce a new query GID entry API
  verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl
  verbs: Optimize ibv_query_gid and ibv_query_gid_type
  verbs: Introduce a new query GID table API
  pyverbs: Add query_gid_table and query_gid_ex methods
  tests: Add tests for ibv_query_gid_table and ibv_query_gid_ex

Yishai Hadas (1):
  Update kernel headers

 debian/libibverbs1.symbols                |   3 +
 kernel-headers/rdma/ib_user_ioctl_cmds.h  |  16 +++
 kernel-headers/rdma/ib_user_ioctl_verbs.h |  14 ++
 kernel-headers/rdma/ib_user_verbs.h       |  11 ++
 kernel-headers/rdma/rdma_user_rxe.h       |   6 +-
 libibverbs/CMakeLists.txt                 |   2 +-
 libibverbs/cmd_device.c                   | 215 ++++++++++++++++++++++++++++++
 libibverbs/driver.h                       |  29 +++-
 libibverbs/examples/devinfo.c             |  14 +-
 libibverbs/libibverbs.map.in              |   6 +
 libibverbs/man/CMakeLists.txt             |   2 +
 libibverbs/man/ibv_query_gid_ex.3.md      |  93 +++++++++++++
 libibverbs/man/ibv_query_gid_table.3.md   |  73 ++++++++++
 libibverbs/verbs.c                        |  95 ++++++++++---
 libibverbs/verbs.h                        |  45 +++++++
 providers/mlx5/verbs.c                    |   2 +-
 pyverbs/device.pxd                        |   3 +
 pyverbs/device.pyx                        | 108 ++++++++++++++-
 pyverbs/libibverbs.pxd                    |  15 ++-
 pyverbs/libibverbs_enums.pxd              |  11 +-
 tests/base.py                             |   3 +-
 tests/test_device.py                      |  32 +++++
 22 files changed, 761 insertions(+), 37 deletions(-)
 create mode 100644 libibverbs/man/ibv_query_gid_ex.3.md
 create mode 100644 libibverbs/man/ibv_query_gid_table.3.md

-- 
1.8.3.1



* [PATCH rdma-core 1/8] Update kernel headers
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
@ 2020-09-14  6:34 ` Yishai Hadas
  2020-09-14  6:34 ` [PATCH rdma-core 2/8] verbs: Change the name of enum ibv_gid_type Yishai Hadas
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:34 UTC
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih

To commit 320a6a2fef0b ("RDMA/uverbs: Expose the new GID query API to
user space")

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 kernel-headers/rdma/ib_user_ioctl_cmds.h  | 16 ++++++++++++++++
 kernel-headers/rdma/ib_user_ioctl_verbs.h | 14 ++++++++++++++
 kernel-headers/rdma/ib_user_verbs.h       | 11 +++++++++++
 kernel-headers/rdma/rdma_user_rxe.h       |  6 +++---
 4 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/kernel-headers/rdma/ib_user_ioctl_cmds.h b/kernel-headers/rdma/ib_user_ioctl_cmds.h
index 99dcabf..7968a18 100644
--- a/kernel-headers/rdma/ib_user_ioctl_cmds.h
+++ b/kernel-headers/rdma/ib_user_ioctl_cmds.h
@@ -70,6 +70,8 @@ enum uverbs_methods_device {
 	UVERBS_METHOD_QUERY_PORT,
 	UVERBS_METHOD_GET_CONTEXT,
 	UVERBS_METHOD_QUERY_CONTEXT,
+	UVERBS_METHOD_QUERY_GID_TABLE,
+	UVERBS_METHOD_QUERY_GID_ENTRY,
 };
 
 enum uverbs_attrs_invoke_write_cmd_attr_ids {
@@ -352,4 +354,18 @@ enum uverbs_attrs_async_event_create {
 	UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE,
 };
 
+enum uverbs_attrs_query_gid_table_cmd_attr_ids {
+	UVERBS_ATTR_QUERY_GID_TABLE_ENTRY_SIZE,
+	UVERBS_ATTR_QUERY_GID_TABLE_FLAGS,
+	UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES,
+	UVERBS_ATTR_QUERY_GID_TABLE_RESP_NUM_ENTRIES,
+};
+
+enum uverbs_attrs_query_gid_entry_cmd_attr_ids {
+	UVERBS_ATTR_QUERY_GID_ENTRY_PORT,
+	UVERBS_ATTR_QUERY_GID_ENTRY_GID_INDEX,
+	UVERBS_ATTR_QUERY_GID_ENTRY_FLAGS,
+	UVERBS_ATTR_QUERY_GID_ENTRY_RESP_ENTRY,
+};
+
 #endif
diff --git a/kernel-headers/rdma/ib_user_ioctl_verbs.h b/kernel-headers/rdma/ib_user_ioctl_verbs.h
index 5debab4..cfea82a 100644
--- a/kernel-headers/rdma/ib_user_ioctl_verbs.h
+++ b/kernel-headers/rdma/ib_user_ioctl_verbs.h
@@ -250,4 +250,18 @@ enum rdma_driver_id {
 	RDMA_DRIVER_SIW,
 };
 
+enum ib_uverbs_gid_type {
+	IB_UVERBS_GID_TYPE_IB,
+	IB_UVERBS_GID_TYPE_ROCE_V1,
+	IB_UVERBS_GID_TYPE_ROCE_V2,
+};
+
+struct ib_uverbs_gid_entry {
+	__aligned_u64 gid[2];
+	__u32 gid_index;
+	__u32 port_num;
+	__u32 gid_type;
+	__u32 netdev_ifindex; /* It is 0 if there is no netdev associated with it */
+};
+
 #endif
diff --git a/kernel-headers/rdma/ib_user_verbs.h b/kernel-headers/rdma/ib_user_verbs.h
index 0474c74..456438c 100644
--- a/kernel-headers/rdma/ib_user_verbs.h
+++ b/kernel-headers/rdma/ib_user_verbs.h
@@ -457,6 +457,17 @@ struct ib_uverbs_poll_cq {
 	__u32 ne;
 };
 
+enum ib_uverbs_wc_opcode {
+	IB_UVERBS_WC_SEND = 0,
+	IB_UVERBS_WC_RDMA_WRITE = 1,
+	IB_UVERBS_WC_RDMA_READ = 2,
+	IB_UVERBS_WC_COMP_SWAP = 3,
+	IB_UVERBS_WC_FETCH_ADD = 4,
+	IB_UVERBS_WC_BIND_MW = 5,
+	IB_UVERBS_WC_LOCAL_INV = 6,
+	IB_UVERBS_WC_TSO = 7,
+};
+
 struct ib_uverbs_wc {
 	__aligned_u64 wr_id;
 	__u32 status;
diff --git a/kernel-headers/rdma/rdma_user_rxe.h b/kernel-headers/rdma/rdma_user_rxe.h
index aae2e69..d8f2e0e 100644
--- a/kernel-headers/rdma/rdma_user_rxe.h
+++ b/kernel-headers/rdma/rdma_user_rxe.h
@@ -99,8 +99,8 @@ struct rxe_send_wr {
 				struct ib_mr *mr;
 				__aligned_u64 reserved;
 			};
-			__u32        key;
-			__u32        access;
+			__u32	     key;
+			__u32	     access;
 		} reg;
 	} wr;
 };
@@ -112,7 +112,7 @@ struct rxe_sge {
 };
 
 struct mminfo {
-	__aligned_u64  		offset;
+	__aligned_u64		offset;
 	__u32			size;
 	__u32			pad;
 };
-- 
1.8.3.1



* [PATCH rdma-core 2/8] verbs: Change the name of enum ibv_gid_type
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
  2020-09-14  6:34 ` [PATCH rdma-core 1/8] Update kernel headers Yishai Hadas
@ 2020-09-14  6:34 ` Yishai Hadas
  2020-09-14  6:34 ` [PATCH rdma-core 3/8] verbs: Introduce a new query GID entry API Yishai Hadas
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:34 UTC
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih

From: Avihai Horon <avihaih@nvidia.com>

Rename enum ibv_gid_type to enum ibv_gid_type_sysfs so that a new enum
ibv_gid_type can be introduced in verbs.h in the next commits. The new enum
provides more accurate GID type information by separating the IB and RoCEv1
types.

This is a preliminary step before introducing a new query GID API.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
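
Note (illustration only, not part of the patch): the split between IB and
RoCEv1 mentioned above can be sketched as below. All identifiers are local
stand-ins rather than library names, and the sketch assumes, as the sysfs
fallback in patch 3/8 does, that the port link layer decides between IB and
RoCEv1.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the two-valued sysfs type and the new three-valued type. */
enum gid_type_sysfs { GID_TYPE_SYSFS_IB_ROCE_V1, GID_TYPE_SYSFS_ROCE_V2 };
enum gid_type { GID_TYPE_IB, GID_TYPE_ROCE_V1, GID_TYPE_ROCE_V2 };
enum link_layer { LINK_LAYER_INFINIBAND, LINK_LAYER_ETHERNET };

/*
 * Translate the legacy type into the finer-grained one.
 * Returns 0 on success, -1 if the legacy value is not recognized.
 */
static int split_sysfs_gid_type(enum gid_type_sysfs legacy,
				enum link_layer ll, enum gid_type *out)
{
	assert(out != NULL);

	switch (legacy) {
	case GID_TYPE_SYSFS_ROCE_V2:
		*out = GID_TYPE_ROCE_V2;
		return 0;
	case GID_TYPE_SYSFS_IB_ROCE_V1:
		/* The legacy value is ambiguous; the link layer resolves it. */
		*out = (ll == LINK_LAYER_INFINIBAND) ? GID_TYPE_IB
						     : GID_TYPE_ROCE_V1;
		return 0;
	}
	return -1;
}
```
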
 libibverbs/driver.h           |  8 ++++----
 libibverbs/examples/devinfo.c | 14 +++++++-------
 libibverbs/verbs.c            | 21 +++++++++++----------
 providers/mlx5/verbs.c        |  2 +-
 pyverbs/device.pyx            |  2 +-
 pyverbs/libibverbs.pxd        |  2 +-
 pyverbs/libibverbs_enums.pxd  |  6 +++---
 tests/base.py                 |  3 ++-
 8 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 4436363..046c07d 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -72,9 +72,9 @@ enum verbs_qp_mask {
 	VERBS_QP_EX		= 1 << 1,
 };
 
-enum ibv_gid_type {
-	IBV_GID_TYPE_IB_ROCE_V1,
-	IBV_GID_TYPE_ROCE_V2,
+enum ibv_gid_type_sysfs {
+	IBV_GID_TYPE_SYSFS_IB_ROCE_V1,
+	IBV_GID_TYPE_SYSFS_ROCE_V2,
 };
 
 enum ibv_mr_type {
@@ -653,7 +653,7 @@ static inline bool check_comp_mask(uint64_t input, uint64_t supported)
 }
 
 int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
-		       unsigned int index, enum ibv_gid_type *type);
+		       unsigned int index, enum ibv_gid_type_sysfs *type);
 
 static inline int
 ibv_check_alloc_parent_domain(struct ibv_parent_domain_init_attr *attr)
diff --git a/libibverbs/examples/devinfo.c b/libibverbs/examples/devinfo.c
index 00ed3f9..c245217 100644
--- a/libibverbs/examples/devinfo.c
+++ b/libibverbs/examples/devinfo.c
@@ -164,17 +164,17 @@ static const char *vl_str(uint8_t vl_num)
 }
 
 #define DEVINFO_INVALID_GID_TYPE	2
-static const char *gid_type_str(enum ibv_gid_type type)
+static const char *gid_type_str(enum ibv_gid_type_sysfs type)
 {
 	switch (type) {
-	case IBV_GID_TYPE_IB_ROCE_V1: return "RoCE v1";
-	case IBV_GID_TYPE_ROCE_V2: return "RoCE v2";
+	case IBV_GID_TYPE_SYSFS_IB_ROCE_V1: return "RoCE v1";
+	case IBV_GID_TYPE_SYSFS_ROCE_V2: return "RoCE v2";
 	default: return "Invalid gid type";
 	}
 }
 
 static void print_formated_gid(union ibv_gid *gid, int i,
-			       enum ibv_gid_type type, int ll)
+			       enum ibv_gid_type_sysfs type, int ll)
 {
 	char gid_str[INET6_ADDRSTRLEN] = {};
 	char str[20] = {};
@@ -182,7 +182,7 @@ static void print_formated_gid(union ibv_gid *gid, int i,
 	if (ll == IBV_LINK_LAYER_ETHERNET)
 		sprintf(str, ", %s", gid_type_str(type));
 
-	if (type == IBV_GID_TYPE_IB_ROCE_V1)
+	if (type == IBV_GID_TYPE_SYSFS_IB_ROCE_V1)
 		printf("\t\t\tGID[%3d]:\t\t%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x%s\n",
 		       i, gid->raw[0], gid->raw[1], gid->raw[2],
 		       gid->raw[3], gid->raw[4], gid->raw[5], gid->raw[6],
@@ -190,7 +190,7 @@ static void print_formated_gid(union ibv_gid *gid, int i,
 		       gid->raw[11], gid->raw[12], gid->raw[13], gid->raw[14],
 		       gid->raw[15], str);
 
-	if (type == IBV_GID_TYPE_ROCE_V2) {
+	if (type == IBV_GID_TYPE_SYSFS_ROCE_V2) {
 		inet_ntop(AF_INET6, gid->raw, gid_str, sizeof(gid_str));
 		printf("\t\t\tGID[%3d]:\t\t%s%s\n", i, gid_str, str);
 	}
@@ -200,7 +200,7 @@ static int print_all_port_gids(struct ibv_context *ctx,
 			       struct ibv_port_attr *port_attr,
 			       uint8_t port_num)
 {
-	enum ibv_gid_type type;
+	enum ibv_gid_type_sysfs type;
 	union ibv_gid gid;
 	int tbl_len;
 	int rc = 0;
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index b54e2b8..9507ffd 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -704,7 +704,7 @@ LATEST_SYMVER_FUNC(ibv_create_ah, 1_1, "IBVERBS_1.1",
 #define V1_TYPE "IB/RoCE v1"
 #define V2_TYPE "RoCE v2"
 int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
-		       unsigned int index, enum ibv_gid_type *type)
+		       unsigned int index, enum ibv_gid_type_sysfs *type)
 {
 	struct verbs_device *verbs_device = verbs_get_device(context->device);
 	char buff[11];
@@ -723,7 +723,7 @@ int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
 			/* In IB, this file doesn't exist and the kernel sets
 			 * errno to -EINVAL.
 			 */
-			*type = IBV_GID_TYPE_IB_ROCE_V1;
+			*type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1;
 			return 0;
 		}
 		if (asprintf(&dir_path, "%s/%s/%d/%s/",
@@ -738,7 +738,7 @@ int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
 				 * we have an old kernel and all GIDs are
 				 * IB/RoCE v1
 				 */
-				*type = IBV_GID_TYPE_IB_ROCE_V1;
+				*type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1;
 			else
 				return -1;
 		} else {
@@ -748,9 +748,9 @@ int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
 		}
 	} else {
 		if (!strcmp(buff, V1_TYPE)) {
-			*type = IBV_GID_TYPE_IB_ROCE_V1;
+			*type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1;
 		} else if (!strcmp(buff, V2_TYPE)) {
-			*type = IBV_GID_TYPE_ROCE_V2;
+			*type = IBV_GID_TYPE_SYSFS_ROCE_V2;
 		} else {
 			errno = ENOTSUP;
 			return -1;
@@ -761,9 +761,10 @@ int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
 }
 
 static int ibv_find_gid_index(struct ibv_context *context, uint8_t port_num,
-			      union ibv_gid *gid, enum ibv_gid_type gid_type)
+			      union ibv_gid *gid,
+			      enum ibv_gid_type_sysfs gid_type)
 {
-	enum ibv_gid_type sgid_type = 0;
+	enum ibv_gid_type_sysfs sgid_type = 0;
 	union ibv_gid sgid;
 	int i = 0, ret;
 
@@ -863,7 +864,7 @@ static inline int set_ah_attr_by_ipv4(struct ibv_context *context,
 
 	map_ipv4_addr_to_ipv6(ip4h->daddr, (struct in6_addr *)&sgid);
 	ret = ibv_find_gid_index(context, port_num, &sgid,
-				 IBV_GID_TYPE_ROCE_V2);
+				 IBV_GID_TYPE_SYSFS_ROCE_V2);
 	if (ret < 0)
 		return ret;
 
@@ -893,9 +894,9 @@ static inline int set_ah_attr_by_ipv6(struct ibv_context *context,
 
 	ah_attr->grh.dgid = grh->sgid;
 	if (grh->next_hdr == IPPROTO_UDP) {
-		sgid_type = IBV_GID_TYPE_ROCE_V2;
+		sgid_type = IBV_GID_TYPE_SYSFS_ROCE_V2;
 	} else if (grh->next_hdr == IB_NEXT_HDR) {
-		sgid_type = IBV_GID_TYPE_IB_ROCE_V1;
+		sgid_type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1;
 	} else {
 		errno = EPROTONOSUPPORT;
 		return -1;
diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c
index 4650250..917d057 100644
--- a/providers/mlx5/verbs.c
+++ b/providers/mlx5/verbs.c
@@ -2955,7 +2955,7 @@ struct ibv_ah *mlx5_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr)
 				       attr->grh.sgid_index, &gid_type))
 			goto err;
 
-		if (gid_type == IBV_GID_TYPE_ROCE_V2)
+		if (gid_type == IBV_GID_TYPE_SYSFS_ROCE_V2)
 			mlx5_ah_set_udp_sport(ah, attr);
 
 		/* Since RoCE packets must contain GRH, this bit is reserved
diff --git a/pyverbs/device.pyx b/pyverbs/device.pyx
index b75fcd0..c1323cd 100755
--- a/pyverbs/device.pyx
+++ b/pyverbs/device.pyx
@@ -220,7 +220,7 @@ cdef class Context(PyverbsCM):
         return gid
 
     def query_gid_type(self, unsigned int port_num, unsigned int index):
-        cdef v.ibv_gid_type gid_type
+        cdef v.ibv_gid_type_sysfs gid_type
         rc = v.ibv_query_gid_type(self.context, port_num, index, &gid_type)
         if rc != 0:
             raise PyverbsRDMAErrno('Failed to query gid type of port {p} and gid index {g}'
diff --git a/pyverbs/libibverbs.pxd b/pyverbs/libibverbs.pxd
index dda33e7..c84b9fc 100755
--- a/pyverbs/libibverbs.pxd
+++ b/pyverbs/libibverbs.pxd
@@ -617,6 +617,6 @@ cdef extern from 'infiniband/verbs.h':
 
 cdef extern from 'infiniband/driver.h':
     int ibv_query_gid_type(ibv_context *context, uint8_t port_num,
-                           unsigned int index, ibv_gid_type *type)
+                           unsigned int index, ibv_gid_type_sysfs *type)
     int ibv_set_ece(ibv_qp *qp, ibv_ece *ece)
     int ibv_query_ece(ibv_qp *qp, ibv_ece *ece)
diff --git a/pyverbs/libibverbs_enums.pxd b/pyverbs/libibverbs_enums.pxd
index 7c1a120..83ca516 100755
--- a/pyverbs/libibverbs_enums.pxd
+++ b/pyverbs/libibverbs_enums.pxd
@@ -443,6 +443,6 @@ _IBV_ADVISE_MR_FLAG_FLUSH = IBV_ADVISE_MR_FLAG_FLUSH
 
 
 cdef extern from '<infiniband/driver.h>':
-    cpdef enum ibv_gid_type:
-        IBV_GID_TYPE_IB_ROCE_V1
-        IBV_GID_TYPE_ROCE_V2
+    cpdef enum ibv_gid_type_sysfs:
+        IBV_GID_TYPE_SYSFS_IB_ROCE_V1
+        IBV_GID_TYPE_SYSFS_ROCE_V2
diff --git a/tests/base.py b/tests/base.py
index b6c389d..0ebd728 100755
--- a/tests/base.py
+++ b/tests/base.py
@@ -176,7 +176,8 @@ class RDMATestCase(unittest.TestCase):
                 continue
             # Avoid RoCEv2 GIDs on unsupported devices
             if port_attrs.link_layer == e.IBV_LINK_LAYER_ETHERNET and \
-                    ctx.query_gid_type(port, idx) == e.IBV_GID_TYPE_ROCE_V2 and \
+                    ctx.query_gid_type(port, idx) == \
+                    e.IBV_GID_TYPE_SYSFS_ROCE_V2 and \
                     has_roce_hw_bug(vendor_id, vendor_pid):
                 continue
             if not os.path.exists('/sys/class/infiniband/{}/device/net/'.format(dev)):
-- 
1.8.3.1



* [PATCH rdma-core 3/8] verbs: Introduce a new query GID entry API
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
  2020-09-14  6:34 ` [PATCH rdma-core 1/8] Update kernel headers Yishai Hadas
  2020-09-14  6:34 ` [PATCH rdma-core 2/8] verbs: Change the name of enum ibv_gid_type Yishai Hadas
@ 2020-09-14  6:34 ` Yishai Hadas
  2020-09-21 16:49   ` Jason Gunthorpe
  2020-09-14  6:34 ` [PATCH rdma-core 4/8] verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl Yishai Hadas
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:34 UTC
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih

From: Avihai Horon <avihaih@nvidia.com>

Introduce the ibv_query_gid_ex verb which queries a specific index in
the GID table of the given port of the given device.
The queried data is stored in a buffer provided by the user.

If the kernel doesn't support ioctl or the needed uverbs method, the API
will try to query the GID entry via sysfs.

This API provides a faster way to query a GID table entry using a single
call over ioctl, instead of multiple calls to open, close and read
multiple sysfs files for a single GID table entry.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
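
Note (illustration only, not part of the patch): a caller walking a port's
GID table one entry at a time would treat ENODATA as an empty slot rather
than as a failure. In the sketch below, struct gid_entry is a local mirror of
the proposed struct, query_entry_fn is a hypothetical callback shaped like
ibv_query_gid_ex() without the context and flags arguments, and fake_query()
is a stand-in backend so the example can run without RDMA hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct gid_entry {
	uint8_t gid[16];
	uint32_t gid_index;
	uint32_t port_num;
	uint32_t gid_type;
	uint32_t ndev_ifindex;
};

/* Returns 0 on success or an errno value, like the proposed verb. */
typedef int (*query_entry_fn)(uint32_t port_num, uint32_t gid_index,
			      struct gid_entry *entry);

/* Count populated slots; ENODATA marks a valid but empty index. */
static int count_populated_gids(query_entry_fn query, uint32_t port_num,
				uint32_t tbl_len, uint32_t *count)
{
	struct gid_entry entry;
	uint32_t i;
	int ret;

	assert(query != NULL && count != NULL);
	*count = 0;
	for (i = 0; i < tbl_len; i++) {
		ret = query(port_num, i, &entry);
		if (ret == ENODATA)
			continue;
		if (ret)
			return ret;	/* any other error aborts the walk */
		(*count)++;
	}
	return 0;
}

/* Stand-in backend: 8 slots, only indexes 0 and 2 hold a GID. */
static int fake_query(uint32_t port_num, uint32_t gid_index,
		      struct gid_entry *entry)
{
	if (gid_index >= 8)
		return EINVAL;
	if (gid_index >= 4 || gid_index % 2)
		return ENODATA;
	*entry = (struct gid_entry){ .gid = { 0xfe, 0x80 },
				     .gid_index = gid_index,
				     .port_num = port_num };
	return 0;
}
```
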
 debian/libibverbs1.symbols           |   2 +
 libibverbs/CMakeLists.txt            |   2 +-
 libibverbs/cmd_device.c              | 105 +++++++++++++++++++++++++++++++++++
 libibverbs/driver.h                  |   4 ++
 libibverbs/libibverbs.map.in         |   5 ++
 libibverbs/man/CMakeLists.txt        |   1 +
 libibverbs/man/ibv_query_gid_ex.3.md |  93 +++++++++++++++++++++++++++++++
 libibverbs/verbs.c                   |   8 +++
 libibverbs/verbs.h                   |  29 ++++++++++
 9 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100644 libibverbs/man/ibv_query_gid_ex.3.md

diff --git a/debian/libibverbs1.symbols b/debian/libibverbs1.symbols
index 2efbe89..536d543 100644
--- a/debian/libibverbs1.symbols
+++ b/debian/libibverbs1.symbols
@@ -8,7 +8,9 @@ libibverbs.so.1 libibverbs1 #MINVER#
  IBVERBS_1.8@IBVERBS_1.8 28
  IBVERBS_1.9@IBVERBS_1.9 30
  IBVERBS_1.10@IBVERBS_1.10 31
+ IBVERBS_1.11@IBVERBS_1.11 32
  (symver)IBVERBS_PRIVATE_25 25
+ _ibv_query_gid_ex@IBVERBS_1.11 32
  ibv_ack_async_event@IBVERBS_1.0 1.1.6
  ibv_ack_async_event@IBVERBS_1.1 1.1.6
  ibv_ack_cq_events@IBVERBS_1.0 1.1.6
diff --git a/libibverbs/CMakeLists.txt b/libibverbs/CMakeLists.txt
index 06a590f..0fe4256 100644
--- a/libibverbs/CMakeLists.txt
+++ b/libibverbs/CMakeLists.txt
@@ -21,7 +21,7 @@ configure_file("libibverbs.map.in"
 
 rdma_library(ibverbs "${CMAKE_CURRENT_BINARY_DIR}/libibverbs.map"
   # See Documentation/versioning.md
-  1 1.10.${PACKAGE_VERSION}
+  1 1.11.${PACKAGE_VERSION}
   all_providers.c
   cmd.c
   cmd_ah.c
diff --git a/libibverbs/cmd_device.c b/libibverbs/cmd_device.c
index a55fb10..06e6c5a 100644
--- a/libibverbs/cmd_device.c
+++ b/libibverbs/cmd_device.c
@@ -32,6 +32,8 @@
 
 #include <infiniband/cmd_write.h>
 
+#include <net/if.h>
+
 static void copy_query_port_resp_to_port_attr(struct ibv_port_attr *port_attr,
 				       struct ib_uverbs_query_port_resp *resp)
 {
@@ -202,3 +204,106 @@ int ibv_cmd_query_context(struct ibv_context *context,
 
 	return 0;
 }
+
+static int is_zero_gid(union ibv_gid *gid)
+{
+	const union ibv_gid zgid = {};
+
+	return !memcmp(gid, &zgid, sizeof(*gid));
+}
+
+static int query_sysfs_gid_ndev_ifindex(struct ibv_context *context,
+					uint8_t port_num, uint32_t gid_index,
+					uint32_t *ndev_ifindex)
+{
+	struct verbs_device *verbs_device = verbs_get_device(context->device);
+	char buff[IF_NAMESIZE] = {};
+
+	if (ibv_read_ibdev_sysfs_file(buff, sizeof(buff), verbs_device->sysfs,
+				      "ports/%d/gid_attrs/ndevs/%d", port_num,
+				      gid_index) <= 0) {
+		*ndev_ifindex = 0;
+		return 0;
+	}
+
+	*ndev_ifindex = if_nametoindex(buff);
+	return *ndev_ifindex ? 0 : errno;
+}
+
+static int query_sysfs_gid_entry(struct ibv_context *context, uint32_t port_num,
+				 uint32_t gid_index,
+				 struct ibv_gid_entry *entry)
+{
+	enum ibv_gid_type_sysfs gid_type;
+	struct ibv_port_attr port_attr = {};
+	int ret;
+
+	entry->gid_index = gid_index;
+	entry->port_num = port_num;
+	ret = ibv_query_gid(context, port_num, gid_index, &entry->gid);
+	if (ret)
+		return EINVAL;
+
+	ret = ibv_query_gid_type(context, port_num, gid_index, &gid_type);
+	if (ret)
+		return EINVAL;
+
+	if (gid_type == IBV_GID_TYPE_SYSFS_IB_ROCE_V1) {
+		ret = ibv_query_port(context, port_num, &port_attr);
+		if (ret)
+			goto out;
+
+		if (port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND) {
+			entry->gid_type = IBV_GID_TYPE_IB;
+		} else if (port_attr.link_layer == IBV_LINK_LAYER_ETHERNET) {
+			entry->gid_type = IBV_GID_TYPE_ROCE_V1;
+		} else {
+			ret = EINVAL;
+			goto out;
+		}
+	} else {
+		entry->gid_type = IBV_GID_TYPE_ROCE_V2;
+	}
+
+	ret = query_sysfs_gid_ndev_ifindex(context, port_num, gid_index,
+					   &entry->ndev_ifindex);
+
+out:
+	return ret;
+}
+
+/* Using async_event cmd_name because query_gid_ex is not in
+ * verbs_context_ops while async_event is and doesn't use ioctl.
+ */
+#define query_gid_kernel_cap async_event
+int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
+			    uint32_t gid_index, struct ibv_gid_entry *entry,
+			    uint32_t flags, size_t entry_size)
+{
+	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE,
+			       UVERBS_METHOD_QUERY_GID_ENTRY, 4);
+	int ret;
+
+	fill_attr_const_in(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_PORT, port_num);
+	fill_attr_const_in(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_GID_INDEX,
+			   gid_index);
+	fill_attr_in_uint32(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_FLAGS, flags);
+	fill_attr_out(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_RESP_ENTRY, entry,
+		      entry_size);
+
+	switch (execute_ioctl_fallback(context, query_gid_kernel_cap, cmdb,
+				       &ret)) {
+	case TRY_WRITE:
+		if (flags)
+			return EOPNOTSUPP;
+
+		ret = query_sysfs_gid_entry(context, port_num, gid_index,
+					    entry);
+		if (ret)
+			return ret;
+
+		return is_zero_gid(&entry->gid) ? ENODATA : 0;
+	default:
+		return ret;
+	}
+}
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 046c07d..13b5219 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -633,6 +633,10 @@ int ibv_cmd_reg_dm_mr(struct ibv_pd *pd, struct verbs_dm *dm,
 		      unsigned int access, struct verbs_mr *vmr,
 		      struct ibv_command_buffer *link);
 
+int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
+			    uint32_t gid_index, struct ibv_gid_entry *entry,
+			    uint32_t flags, size_t entry_size);
+
 /*
  * sysfs helper functions
  */
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index 3240e00..dae4963 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -142,6 +142,11 @@ IBVERBS_1.10 {
 		ibv_unimport_pd;
 } IBVERBS_1.9;
 
+IBVERBS_1.11 {
+	global:
+		_ibv_query_gid_ex;
+} IBVERBS_1.10;
+
 /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
    version. See the top level CMakeLists.txt for this setting. */
 
diff --git a/libibverbs/man/CMakeLists.txt b/libibverbs/man/CMakeLists.txt
index b925004..2dea4ff 100644
--- a/libibverbs/man/CMakeLists.txt
+++ b/libibverbs/man/CMakeLists.txt
@@ -57,6 +57,7 @@ rdma_man_pages(
   ibv_query_device_ex.3
   ibv_query_ece.3.md
   ibv_query_gid.3.md
+  ibv_query_gid_ex.3.md
   ibv_query_pkey.3.md
   ibv_query_port.3
   ibv_query_qp.3
diff --git a/libibverbs/man/ibv_query_gid_ex.3.md b/libibverbs/man/ibv_query_gid_ex.3.md
new file mode 100644
index 0000000..9e14f01
--- /dev/null
+++ b/libibverbs/man/ibv_query_gid_ex.3.md
@@ -0,0 +1,93 @@
+---
+date: 2020-04-24
+footer: libibverbs
+header: "Libibverbs Programmer's Manual"
+layout: page
+license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
+section: 3
+title: IBV_QUERY_GID_EX
+---
+
+# NAME
+
+ibv_query_gid_ex - Query an InfiniBand port's GID table entry
+
+# SYNOPSIS
+
+```c
+#include <infiniband/verbs.h>
+
+int ibv_query_gid_ex(struct ibv_context *context,
+                     uint32_t port_num,
+                     uint32_t gid_index,
+                     struct ibv_gid_entry *entry,
+                     uint32_t flags);
+```
+
+# DESCRIPTION
+
+**ibv_query_gid_ex()** returns the GID entry at *entry* for
+*gid_index* of port *port_num* for device context *context*.
+
+# ARGUMENTS
+
+*context*
+:	The context of the device to query.
+
+*port_num*
+:	The number of port to query its GID table.
+
+*gid_index*
+:	The index of the GID table entry to query.
+
+## *entry* Argument
+:	An ibv_gid_entry struct, as defined in <infiniband/verbs.h>.
+```c
+struct ibv_gid_entry {
+		union ibv_gid gid;
+		uint32_t gid_index;
+		uint32_t port_num;
+		uint32_t gid_type;
+		uint32_t ndev_ifindex;
+};
+```
+
+	*gid*
+:			The GID entry.
+
+	*gid_index*
+:			The GID table index of this entry.
+
+	*port_num*
+:			The port number that this GID belongs to.
+
+	*gid_type*
+:			enum ibv_gid_type, can be one of IBV_GID_TYPE_IB, IBV_GID_TYPE_ROCE_V1 or IBV_GID_TYPE_ROCE_V2.
+
+	*ndev_ifindex*
+:			The interface index of the net device associated with this GID.
+			It is 0 if there is no net device associated with it.
+
+*flags*
+:	Extra fields to query post *ndev_ifindex*, for now must be 0.
+
+# RETURN VALUE
+
+**ibv_query_gid_ex()** returns 0 on success or errno value on error.
+
+# ERRORS
+
+ENODATA
+:	*gid_index* is within the GID table size of port *port_num* but there is no data in this index.
+
+# SEE ALSO
+
+**ibv_open_device**(3),
+**ibv_query_device**(3),
+**ibv_query_pkey**(3),
+**ibv_query_port**(3),
+**ibv_query_gid_table**(3)
+
+# AUTHOR
+
+Parav Pandit <parav@nvidia.com>
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index 9507ffd..9427aba 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -240,6 +240,14 @@ LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
 	return 0;
 }
 
+int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num,
+		      uint32_t gid_index, struct ibv_gid_entry *entry,
+		      uint32_t flags, size_t entry_size)
+{
+	return ibv_cmd_query_gid_entry(context, port_num, gid_index, entry,
+				       flags, entry_size);
+}
+
 LATEST_SYMVER_FUNC(ibv_query_pkey, 1_1, "IBVERBS_1.1",
 		   int,
 		   struct ibv_context *context, uint8_t port_num,
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index 2e785aa..e5bf900 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -68,6 +68,20 @@ union ibv_gid {
 	} global;
 };
 
+enum ibv_gid_type {
+	IBV_GID_TYPE_IB,
+	IBV_GID_TYPE_ROCE_V1,
+	IBV_GID_TYPE_ROCE_V2,
+};
+
+struct ibv_gid_entry {
+	union ibv_gid gid;
+	uint32_t gid_index;
+	uint32_t port_num;
+	uint32_t gid_type; /* enum ibv_gid_type */
+	uint32_t ndev_ifindex;
+};
+
 #define vext_field_avail(type, fld, sz) (offsetof(type, fld) < (sz))
 
 #ifdef __cplusplus
@@ -2330,6 +2344,21 @@ static inline int ___ibv_query_port(struct ibv_context *context,
 int ibv_query_gid(struct ibv_context *context, uint8_t port_num,
 		  int index, union ibv_gid *gid);
 
+int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num,
+		     uint32_t gid_index, struct ibv_gid_entry *entry,
+		     uint32_t flags, size_t entry_size);
+
+/**
+ * ibv_query_gid_ex - Read a GID table entry
+ */
+static inline int ibv_query_gid_ex(struct ibv_context *context,
+				   uint32_t port_num, uint32_t gid_index,
+				   struct ibv_gid_entry *entry, uint32_t flags)
+{
+	return _ibv_query_gid_ex(context, port_num, gid_index, entry, flags,
+				 sizeof(*entry));
+}
+
 /**
  * ibv_query_pkey - Get a P_Key table entry
  */
-- 
1.8.3.1



* [PATCH rdma-core 4/8] verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
                   ` (2 preceding siblings ...)
  2020-09-14  6:34 ` [PATCH rdma-core 3/8] verbs: Introduce a new query GID entry API Yishai Hadas
@ 2020-09-14  6:34 ` Yishai Hadas
  2020-09-21 16:53   ` Jason Gunthorpe
  2020-09-14  6:35 ` [PATCH rdma-core 5/8] verbs: Optimize ibv_query_gid and ibv_query_gid_type Yishai Hadas
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:34 UTC
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih

From: Avihai Horon <avihaih@nvidia.com>

Currently ibv_query_gid and ibv_query_gid_type are implemented over
sysfs. To improve their performance, implement them using the new query
GID entry API, so that they use ioctl and fall back to sysfs.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 libibverbs/cmd_device.c |  4 ++--
 libibverbs/driver.h     |  6 +++++
 libibverbs/verbs.c      | 58 ++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/libibverbs/cmd_device.c b/libibverbs/cmd_device.c
index 06e6c5a..fb166bb 100644
--- a/libibverbs/cmd_device.c
+++ b/libibverbs/cmd_device.c
@@ -240,11 +240,11 @@ static int query_sysfs_gid_entry(struct ibv_context *context, uint32_t port_num,
 
 	entry->gid_index = gid_index;
 	entry->port_num = port_num;
-	ret = ibv_query_gid(context, port_num, gid_index, &entry->gid);
+	ret = _ibv_query_gid(context, port_num, gid_index, &entry->gid);
 	if (ret)
 		return EINVAL;
 
-	ret = ibv_query_gid_type(context, port_num, gid_index, &gid_type);
+	ret = _ibv_query_gid_type(context, port_num, gid_index, &gid_type);
 	if (ret)
 		return EINVAL;
 
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 13b5219..2ab0a89 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -659,6 +659,12 @@ static inline bool check_comp_mask(uint64_t input, uint64_t supported)
 int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
 		       unsigned int index, enum ibv_gid_type_sysfs *type);
 
+int _ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
+			unsigned int index, enum ibv_gid_type_sysfs *type);
+
+int _ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index,
+		   union ibv_gid *gid);
+
 static inline int
 ibv_check_alloc_parent_domain(struct ibv_parent_domain_init_attr *attr)
 {
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index 9427aba..9dec4e6 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -216,10 +216,8 @@ LATEST_SYMVER_FUNC(ibv_query_port, 1_1, "IBVERBS_1.1",
 				sizeof(*port_attr));
 }
 
-LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
-		   int,
-		   struct ibv_context *context, uint8_t port_num,
-		   int index, union ibv_gid *gid)
+int _ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index,
+		   union ibv_gid *gid)
 {
 	struct verbs_device *verbs_device = verbs_get_device(context->device);
 	char attr[41];
@@ -240,6 +238,29 @@ LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
 	return 0;
 }
 
+LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
+		   int,
+		   struct ibv_context *context, uint8_t port_num,
+		   int index, union ibv_gid *gid)
+{
+	struct ibv_gid_entry entry = {};
+	int ret;
+
+	ret = ibv_cmd_query_gid_entry(context, port_num, index, &entry, 0,
+				      sizeof(entry));
+	/* Preserve API behavior for empty GID */
+	if (ret == ENODATA) {
+		memset(gid, 0, sizeof(*gid));
+		return 0;
+	}
+	if (ret)
+		return -1;
+
+	memcpy(gid, &entry.gid, sizeof(entry.gid));
+
+	return 0;
+}
+
 int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num,
 		      uint32_t gid_index, struct ibv_gid_entry *entry,
 		      uint32_t flags, size_t entry_size)
@@ -711,8 +732,8 @@ LATEST_SYMVER_FUNC(ibv_create_ah, 1_1, "IBVERBS_1.1",
  */
 #define V1_TYPE "IB/RoCE v1"
 #define V2_TYPE "RoCE v2"
-int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
-		       unsigned int index, enum ibv_gid_type_sysfs *type)
+int _ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
+			unsigned int index, enum ibv_gid_type_sysfs *type)
 {
 	struct verbs_device *verbs_device = verbs_get_device(context->device);
 	char buff[11];
@@ -768,6 +789,31 @@ int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
 	return 0;
 }
 
+int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
+		       unsigned int index, enum ibv_gid_type_sysfs *type)
+{
+	struct ibv_gid_entry entry = {};
+	int ret;
+
+	ret = ibv_cmd_query_gid_entry(context, port_num, index, &entry, 0,
+				      sizeof(entry));
+	/* Preserve API behavior for empty GID */
+	if (ret == ENODATA) {
+		*type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1;
+		return 0;
+	}
+	if (ret)
+		return -1;
+
+	if (entry.gid_type == IBV_GID_TYPE_IB ||
+	    entry.gid_type == IBV_GID_TYPE_ROCE_V1)
+		*type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1;
+	else
+		*type = IBV_GID_TYPE_SYSFS_ROCE_V2;
+
+	return 0;
+}
+
 static int ibv_find_gid_index(struct ibv_context *context, uint8_t port_num,
 			      union ibv_gid *gid,
 			      enum ibv_gid_type_sysfs gid_type)
-- 
1.8.3.1



* [PATCH rdma-core 5/8] verbs: Optimize ibv_query_gid and ibv_query_gid_type
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
                   ` (3 preceding siblings ...)
  2020-09-14  6:34 ` [PATCH rdma-core 4/8] verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl Yishai Hadas
@ 2020-09-14  6:35 ` Yishai Hadas
  2020-09-14  6:35 ` [PATCH rdma-core 6/8] verbs: Introduce a new query GID table API Yishai Hadas
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:35 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih

From: Avihai Horon <avihaih@nvidia.com>

ibv_query_gid and ibv_query_gid_type try ioctl first and fall back to
sysfs. Currently, if the fallback path is taken, all of the gid entry
attributes are retrieved over sysfs.

For example, if ibv_query_gid is called and the fallback path is taken,
the gid type and the gid ndev ifindex will also be read over sysfs, even
though we only need the gid.

In order to eliminate these unnecessary sysfs reads, we add an attribute
mask to ibv_cmd_query_gid_entry that will allow us to mark the specific
gid entry attributes that we would like to query in fallback.
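
The effect of the mask can be illustrated with a small self-contained
model. This is only a sketch: the demo_* names, the read counter and the
returned values are made up for illustration and are not part of
libibverbs; only the three mask bits follow the enum added by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same bit layout as enum verbs_query_gid_attr_mask in this patch. */
enum demo_gid_attr_mask {
	DEMO_GID_ATTR_GID		= 1 << 0,
	DEMO_GID_ATTR_TYPE		= 1 << 1,
	DEMO_GID_ATTR_NDEV_IFINDEX	= 1 << 2,
};

/* Hypothetical stand-in for struct ibv_gid_entry. */
struct demo_gid_entry {
	uint8_t gid[16];
	uint32_t gid_type;
	uint32_t ndev_ifindex;
};

/* Counts simulated sysfs reads so the saving is observable. */
unsigned int demo_sysfs_reads;

/* Each helper stands in for one sysfs file read; the values are made up. */
static void demo_read_gid(uint8_t gid[16])
{
	demo_sysfs_reads++;
	memset(gid, 0, 16);
	gid[0] = 0xfe;
	gid[1] = 0x80;
}

static uint32_t demo_read_type(void)
{
	demo_sysfs_reads++;
	return 2;
}

static uint32_t demo_read_ndev_ifindex(void)
{
	demo_sysfs_reads++;
	return 7;
}

/* Read only the attributes selected by attr_mask; leave the rest untouched. */
int demo_query_sysfs_gid_entry(struct demo_gid_entry *entry, uint32_t attr_mask)
{
	assert(entry != NULL);

	if (attr_mask & DEMO_GID_ATTR_GID)
		demo_read_gid(entry->gid);

	if (attr_mask & DEMO_GID_ATTR_TYPE)
		entry->gid_type = demo_read_type();

	if (attr_mask & DEMO_GID_ATTR_NDEV_IFINDEX)
		entry->ndev_ifindex = demo_read_ndev_ifindex();

	return 0;
}
```

In this model a caller that passes only DEMO_GID_ATTR_GID triggers one
simulated read, while a caller that ORs all three bits triggers three,
which is the behaviour the mask is meant to give ibv_query_gid versus
ibv_query_gid_ex in the fallback path.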

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 libibverbs/cmd_device.c | 65 +++++++++++++++++++++++++++++--------------------
 libibverbs/driver.h     |  9 ++++++-
 libibverbs/verbs.c      | 10 +++++---
 3 files changed, 53 insertions(+), 31 deletions(-)

diff --git a/libibverbs/cmd_device.c b/libibverbs/cmd_device.c
index fb166bb..9aa9dff 100644
--- a/libibverbs/cmd_device.c
+++ b/libibverbs/cmd_device.c
@@ -232,41 +232,49 @@ static int query_sysfs_gid_ndev_ifindex(struct ibv_context *context,
 
 static int query_sysfs_gid_entry(struct ibv_context *context, uint32_t port_num,
 				 uint32_t gid_index,
-				 struct ibv_gid_entry *entry)
+				 struct ibv_gid_entry *entry,
+				 uint32_t attr_mask)
 {
 	enum ibv_gid_type_sysfs gid_type;
 	struct ibv_port_attr port_attr = {};
-	int ret;
+	int ret = 0;
 
 	entry->gid_index = gid_index;
 	entry->port_num = port_num;
-	ret = _ibv_query_gid(context, port_num, gid_index, &entry->gid);
-	if (ret)
-		return EINVAL;
-
-	ret = _ibv_query_gid_type(context, port_num, gid_index, &gid_type);
-	if (ret)
-		return EINVAL;
-
-	if (gid_type == IBV_GID_TYPE_SYSFS_IB_ROCE_V1) {
-		ret = ibv_query_port(context, port_num, &port_attr);
+	if (attr_mask & VERBS_QUERY_GID_ATTR_GID) {
+		ret = _ibv_query_gid(context, port_num, gid_index, &entry->gid);
 		if (ret)
-			goto out;
+			return EINVAL;
+	}
 
-		if (port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND) {
-			entry->gid_type = IBV_GID_TYPE_IB;
-		} else if (port_attr.link_layer == IBV_LINK_LAYER_ETHERNET) {
-			entry->gid_type = IBV_GID_TYPE_ROCE_V1;
+	if (attr_mask & VERBS_QUERY_GID_ATTR_TYPE) {
+		ret = _ibv_query_gid_type(context, port_num, gid_index,
+					  &gid_type);
+		if (ret)
+			return EINVAL;
+
+		if (gid_type == IBV_GID_TYPE_SYSFS_IB_ROCE_V1) {
+			ret = ibv_query_port(context, port_num, &port_attr);
+			if (ret)
+				goto out;
+
+			if (port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND) {
+				entry->gid_type = IBV_GID_TYPE_IB;
+			} else if (port_attr.link_layer ==
+				   IBV_LINK_LAYER_ETHERNET) {
+				entry->gid_type = IBV_GID_TYPE_ROCE_V1;
+			} else {
+				ret = EINVAL;
+				goto out;
+			}
 		} else {
-			ret = EINVAL;
-			goto out;
+			entry->gid_type = IBV_GID_TYPE_ROCE_V2;
 		}
-	} else {
-		entry->gid_type = IBV_GID_TYPE_ROCE_V2;
 	}
 
-	ret = query_sysfs_gid_ndev_ifindex(context, port_num, gid_index,
-					   &entry->ndev_ifindex);
+	if (attr_mask & VERBS_QUERY_GID_ATTR_NDEV_IFINDEX)
+		ret = query_sysfs_gid_ndev_ifindex(context, port_num, gid_index,
+						   &entry->ndev_ifindex);
 
 out:
 	return ret;
@@ -278,7 +286,8 @@ out:
 #define query_gid_kernel_cap async_event
 int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
 			    uint32_t gid_index, struct ibv_gid_entry *entry,
-			    uint32_t flags, size_t entry_size)
+			    uint32_t flags, size_t entry_size,
+			    uint32_t fallback_attr_mask)
 {
 	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE,
 			       UVERBS_METHOD_QUERY_GID_ENTRY, 4);
@@ -298,11 +307,15 @@ int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
 			return EOPNOTSUPP;
 
 		ret = query_sysfs_gid_entry(context, port_num, gid_index,
-					    entry);
+					    entry, fallback_attr_mask);
 		if (ret)
 			return ret;
 
-		return is_zero_gid(&entry->gid) ? ENODATA : 0;
+		if (fallback_attr_mask & VERBS_QUERY_GID_ATTR_GID &&
+		    is_zero_gid(&entry->gid))
+			return ENODATA;
+
+		return 0;
 	default:
 		return ret;
 	}
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index 2ab0a89..c998b5b 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -77,6 +77,12 @@ enum ibv_gid_type_sysfs {
 	IBV_GID_TYPE_SYSFS_ROCE_V2,
 };
 
+enum verbs_query_gid_attr_mask {
+	VERBS_QUERY_GID_ATTR_GID		= 1 << 0,
+	VERBS_QUERY_GID_ATTR_TYPE		= 1 << 1,
+	VERBS_QUERY_GID_ATTR_NDEV_IFINDEX	= 1 << 2,
+};
+
 enum ibv_mr_type {
 	IBV_MR_TYPE_MR,
 	IBV_MR_TYPE_NULL_MR,
@@ -635,7 +641,8 @@ int ibv_cmd_reg_dm_mr(struct ibv_pd *pd, struct verbs_dm *dm,
 
 int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
 			    uint32_t gid_index, struct ibv_gid_entry *entry,
-			    uint32_t flags, size_t entry_size);
+			    uint32_t flags, size_t entry_size,
+			    uint32_t fallback_attr_mask);
 
 /*
  * sysfs helper functions
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index 9dec4e6..237c56b 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -247,7 +247,7 @@ LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
 	int ret;
 
 	ret = ibv_cmd_query_gid_entry(context, port_num, index, &entry, 0,
-				      sizeof(entry));
+				      sizeof(entry), VERBS_QUERY_GID_ATTR_GID);
 	/* Preserve API behavior for empty GID */
 	if (ret == ENODATA) {
 		memset(gid, 0, sizeof(*gid));
@@ -265,8 +265,10 @@ int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num,
 		      uint32_t gid_index, struct ibv_gid_entry *entry,
 		      uint32_t flags, size_t entry_size)
 {
-	return ibv_cmd_query_gid_entry(context, port_num, gid_index, entry,
-				       flags, entry_size);
+	return ibv_cmd_query_gid_entry(
+		context, port_num, gid_index, entry, flags, entry_size,
+		VERBS_QUERY_GID_ATTR_GID | VERBS_QUERY_GID_ATTR_TYPE |
+			VERBS_QUERY_GID_ATTR_NDEV_IFINDEX);
 }
 
 LATEST_SYMVER_FUNC(ibv_query_pkey, 1_1, "IBVERBS_1.1",
@@ -796,7 +798,7 @@ int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num,
 	int ret;
 
 	ret = ibv_cmd_query_gid_entry(context, port_num, index, &entry, 0,
-				      sizeof(entry));
+				      sizeof(entry), VERBS_QUERY_GID_ATTR_TYPE);
 	/* Preserve API behavior for empty GID */
 	if (ret == ENODATA) {
 		*type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1;
-- 
1.8.3.1



* [PATCH rdma-core 6/8] verbs: Introduce a new query GID table API
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
                   ` (4 preceding siblings ...)
  2020-09-14  6:35 ` [PATCH rdma-core 5/8] verbs: Optimize ibv_query_gid and ibv_query_gid_type Yishai Hadas
@ 2020-09-14  6:35 ` Yishai Hadas
  2020-09-14  6:35 ` [PATCH rdma-core 7/8] pyverbs: Add query_gid_table and query_gid_ex methods Yishai Hadas
  2020-09-14  6:35 ` [PATCH rdma-core 8/8] tests: Add tests for ibv_query_gid_table and ibv_query_gid_ex Yishai Hadas
  7 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:35 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih

From: Avihai Horon <avihaih@nvidia.com>

Introduce the ibv_query_gid_table verb which queries the GID tables of
the given device and stores the queried data in a buffer provided by the
user.

If the kernel doesn't support ioctl or the needed uverbs method, the API
will try to query the GID tables via sysfs.

This API provides a faster way to query the GID tables of a device using
a single call over ioctl, instead of multiple calls to open, close and
read multiple sysfs files for a single GID table entry.
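
The contract of the sysfs fallback (skip zero GIDs, fail when more valid
entries exist than the caller has room for) can be sketched with a
self-contained model. The demo_* names and the flat table layout are
assumptions made for illustration; this is not the library code, which
walks the ports and reads each entry from sysfs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, reduced stand-in for struct ibv_gid_entry. */
struct demo_gid_entry {
	uint8_t gid[16];
	uint32_t gid_index;
	uint32_t port_num;
};

static bool demo_is_zero_gid(const uint8_t gid[16])
{
	static const uint8_t zero[16];

	return memcmp(gid, zero, sizeof(zero)) == 0;
}

/*
 * Copy the valid (non-zero) entries of a flat table into entries.
 * Returns the number of entries copied, or -EINVAL if the table holds
 * more valid entries than max_entries.
 */
ssize_t demo_query_gid_table(const struct demo_gid_entry *table,
			     size_t table_len,
			     struct demo_gid_entry *entries,
			     size_t max_entries)
{
	size_t num_entries = 0;
	size_t i;

	assert(entries != NULL);

	for (i = 0; i < table_len; i++) {
		/* Empty slots are not reported to the caller. */
		if (demo_is_zero_gid(table[i].gid))
			continue;

		/* A further valid entry does not fit: the call fails. */
		if (num_entries == max_entries)
			return -EINVAL;

		entries[num_entries++] = table[i];
	}

	return (ssize_t)num_entries;
}
```

As in the real fallback, trailing zero entries after the output array is
full do not cause a failure; only a further valid entry does.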

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 debian/libibverbs1.symbols              |   1 +
 libibverbs/cmd_device.c                 | 117 +++++++++++++++++++++++++++++---
 libibverbs/driver.h                     |   4 ++
 libibverbs/libibverbs.map.in            |   1 +
 libibverbs/man/CMakeLists.txt           |   1 +
 libibverbs/man/ibv_query_gid_table.3.md |  73 ++++++++++++++++++++
 libibverbs/verbs.c                      |   8 +++
 libibverbs/verbs.h                      |  16 +++++
 8 files changed, 211 insertions(+), 10 deletions(-)
 create mode 100644 libibverbs/man/ibv_query_gid_table.3.md

diff --git a/debian/libibverbs1.symbols b/debian/libibverbs1.symbols
index 536d543..99257de 100644
--- a/debian/libibverbs1.symbols
+++ b/debian/libibverbs1.symbols
@@ -11,6 +11,7 @@ libibverbs.so.1 libibverbs1 #MINVER#
  IBVERBS_1.11@IBVERBS_1.11 32
  (symver)IBVERBS_PRIVATE_25 25
  _ibv_query_gid_ex@IBVERBS_1.11 32
+ _ibv_query_gid_table@IBVERBS_1.11 32
  ibv_ack_async_event@IBVERBS_1.0 1.1.6
  ibv_ack_async_event@IBVERBS_1.1 1.1.6
  ibv_ack_cq_events@IBVERBS_1.0 1.1.6
diff --git a/libibverbs/cmd_device.c b/libibverbs/cmd_device.c
index 9aa9dff..cbdb01d 100644
--- a/libibverbs/cmd_device.c
+++ b/libibverbs/cmd_device.c
@@ -233,7 +233,7 @@ static int query_sysfs_gid_ndev_ifindex(struct ibv_context *context,
 static int query_sysfs_gid_entry(struct ibv_context *context, uint32_t port_num,
 				 uint32_t gid_index,
 				 struct ibv_gid_entry *entry,
-				 uint32_t attr_mask)
+				 uint32_t attr_mask, int link_layer)
 {
 	enum ibv_gid_type_sysfs gid_type;
 	struct ibv_port_attr port_attr = {};
@@ -254,14 +254,18 @@ static int query_sysfs_gid_entry(struct ibv_context *context, uint32_t port_num,
 			return EINVAL;
 
 		if (gid_type == IBV_GID_TYPE_SYSFS_IB_ROCE_V1) {
-			ret = ibv_query_port(context, port_num, &port_attr);
-			if (ret)
-				goto out;
+			if (link_layer < 0) {
+				ret = ibv_query_port(context, port_num,
+						     &port_attr);
+				if (ret)
+					goto out;
+
+				link_layer = port_attr.link_layer;
+			}
 
-			if (port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND) {
+			if (link_layer == IBV_LINK_LAYER_INFINIBAND) {
 				entry->gid_type = IBV_GID_TYPE_IB;
-			} else if (port_attr.link_layer ==
-				   IBV_LINK_LAYER_ETHERNET) {
+			} else if (link_layer == IBV_LINK_LAYER_ETHERNET) {
 				entry->gid_type = IBV_GID_TYPE_ROCE_V1;
 			} else {
 				ret = EINVAL;
@@ -280,8 +284,64 @@ out:
 	return ret;
 }
 
-/* Using async_event cmd_name because query_gid_ex is not in
- * verbs_context_ops while async_event is and doesn't use ioctl.
+static int query_gid_table_fb(struct ibv_context *context,
+			      struct ibv_gid_entry *entries, size_t max_entries,
+			      uint64_t *num_entries, size_t entry_size)
+{
+	struct ibv_device_attr dev_attr = {};
+	struct ibv_port_attr port_attr = {};
+	struct ibv_gid_entry entry = {};
+	int attr_mask;
+	void *tmp;
+	int i, j;
+	int ret;
+
+	ret = ibv_query_device(context, &dev_attr);
+	if (ret)
+		goto out;
+
+	tmp = entries;
+	*num_entries = 0;
+	attr_mask = VERBS_QUERY_GID_ATTR_GID | VERBS_QUERY_GID_ATTR_TYPE |
+		    VERBS_QUERY_GID_ATTR_NDEV_IFINDEX;
+	for (i = 0; i < dev_attr.phys_port_cnt; i++) {
+		ret = ibv_query_port(context, i + 1, &port_attr);
+		if (ret)
+			goto out;
+
+		for (j = 0; j < port_attr.gid_tbl_len; j++) {
+			/* In case we already reached max_entries, query to some
+			 * temp entry. In case all other entries are zeros, the
+			 * API should succeed.
+			 */
+			if (*num_entries == max_entries)
+				tmp = &entry;
+			ret = query_sysfs_gid_entry(context, i + 1, j,
+						    tmp,
+						    attr_mask,
+						    port_attr.link_layer);
+			if (ret)
+				goto out;
+			if (is_zero_gid(&((struct ibv_gid_entry *)tmp)->gid))
+				continue;
+			if (*num_entries == max_entries) {
+				ret = EINVAL;
+				goto out;
+			}
+
+			(*num_entries)++;
+			tmp += entry_size;
+		}
+	}
+
+out:
+	return ret;
+}
+
+/* Using async_event cmd_name because query_gid_ex and query_gid_table are not
+ * in verbs_context_ops while async_event is and doesn't use ioctl.
+ * If one of them is not supported, neither is the other. Hence, we can use a
+ * single cmd_name for both of them.
  */
 #define query_gid_kernel_cap async_event
 int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
@@ -307,7 +367,7 @@ int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
 			return EOPNOTSUPP;
 
 		ret = query_sysfs_gid_entry(context, port_num, gid_index,
-					    entry, fallback_attr_mask);
+					    entry, fallback_attr_mask, -1);
 		if (ret)
 			return ret;
 
@@ -320,3 +380,40 @@ int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
 		return ret;
 	}
 }
+
+ssize_t ibv_cmd_query_gid_table(struct ibv_context *context,
+				struct ibv_gid_entry *entries,
+				size_t max_entries, uint32_t flags,
+				size_t entry_size)
+{
+	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE,
+			       UVERBS_METHOD_QUERY_GID_TABLE, 4);
+	uint64_t num_entries;
+	int ret;
+
+	fill_attr_const_in(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_ENTRY_SIZE,
+			   entry_size);
+	fill_attr_in_uint32(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_FLAGS, flags);
+	fill_attr_out(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, entries,
+		      _array_len(entry_size, max_entries));
+	fill_attr_out_ptr(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_RESP_NUM_ENTRIES,
+			  &num_entries);
+
+	switch (execute_ioctl_fallback(context, query_gid_kernel_cap, cmdb,
+				       &ret)) {
+	case TRY_WRITE:
+		if (flags)
+			return -EOPNOTSUPP;
+
+		ret = query_gid_table_fb(context, entries, max_entries,
+					 &num_entries, entry_size);
+		break;
+	default:
+		break;
+	}
+
+	if (ret)
+		return -ret;
+
+	return num_entries;
+}
diff --git a/libibverbs/driver.h b/libibverbs/driver.h
index c998b5b..b7b74df 100644
--- a/libibverbs/driver.h
+++ b/libibverbs/driver.h
@@ -643,6 +643,10 @@ int ibv_cmd_query_gid_entry(struct ibv_context *context, uint32_t port_num,
 			    uint32_t gid_index, struct ibv_gid_entry *entry,
 			    uint32_t flags, size_t entry_size,
 			    uint32_t fallback_attr_mask);
+ssize_t ibv_cmd_query_gid_table(struct ibv_context *context,
+				struct ibv_gid_entry *entries,
+				size_t max_entries, uint32_t flags,
+				size_t entry_size);
 
 /*
  * sysfs helper functions
diff --git a/libibverbs/libibverbs.map.in b/libibverbs/libibverbs.map.in
index dae4963..7429016 100644
--- a/libibverbs/libibverbs.map.in
+++ b/libibverbs/libibverbs.map.in
@@ -145,6 +145,7 @@ IBVERBS_1.10 {
 IBVERBS_1.11 {
 	global:
 		_ibv_query_gid_ex;
+		_ibv_query_gid_table;
 } IBVERBS_1.10;
 
 /* If any symbols in this stanza change ABI then the entire staza gets a new symbol
diff --git a/libibverbs/man/CMakeLists.txt b/libibverbs/man/CMakeLists.txt
index 2dea4ff..1fb5ac1 100644
--- a/libibverbs/man/CMakeLists.txt
+++ b/libibverbs/man/CMakeLists.txt
@@ -58,6 +58,7 @@ rdma_man_pages(
   ibv_query_ece.3.md
   ibv_query_gid.3.md
   ibv_query_gid_ex.3.md
+  ibv_query_gid_table.3.md
   ibv_query_pkey.3.md
   ibv_query_port.3
   ibv_query_qp.3
diff --git a/libibverbs/man/ibv_query_gid_table.3.md b/libibverbs/man/ibv_query_gid_table.3.md
new file mode 100644
index 0000000..e10f51c
--- /dev/null
+++ b/libibverbs/man/ibv_query_gid_table.3.md
@@ -0,0 +1,73 @@
+---
+date: 2020-04-24
+footer: libibverbs
+header: "Libibverbs Programmer's Manual"
+layout: page
+license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
+section: 3
+title: IBV_QUERY_GID_TABLE
+---
+
+# NAME
+
+ibv_query_gid_table - query an InfiniBand device's GID table
+
+# SYNOPSIS
+
+```c
+#include <infiniband/verbs.h>
+
+ssize_t ibv_query_gid_table(struct ibv_context *context,
+                            struct ibv_gid_entry *entries,
+                            size_t max_entries,
+                            uint32_t flags);
+```
+
+# DESCRIPTION
+
+**ibv_query_gid_table()** returns the valid GID table entries of the RDMA
+device context *context* at the pointer *entries*.
+
+A caller must allocate the *entries* array for the GID table entries it
+desires to query. This API returns only valid GID table entries.
+
+A caller must pass a non-zero number of entries in *max_entries* that corresponds
+to the size of the *entries* array.
+
+The *entries* array must be allocated such that it can contain all the valid
+GID table entries of the device. If there are more valid GID entries than
+the provided value of *max_entries* and the *entries* array can hold, the call
+will fail. For example, if an RDMA device *context* has a total of 10 valid
+GID entries, *entries* should be allocated for at least 10 entries, and
+*max_entries* should be set appropriately.
+
+# ARGUMENTS
+
+*context*
+:	The context of the device to query.
+
+*entries*
+:	Array of ibv_gid_entry structs where the GID entries are returned.
+	Please see **ibv_query_gid_ex**(3) man page for *ibv_gid_entry*.
+
+*max_entries*
+:	Maximum number of entries that can be returned.
+
+*flags*
+:	Extra fields to query post *entries->ndev_ifindex*, for now must be 0.
+
+# RETURN VALUE
+
+**ibv_query_gid_table()** returns the number of entries that were read on success or a negative errno value on error.
+The number of entries returned is <= *max_entries*.
+
+# SEE ALSO
+
+**ibv_open_device**(3),
+**ibv_query_device**(3),
+**ibv_query_port**(3),
+**ibv_query_gid_ex**(3)
+
+# AUTHOR
+
+Parav Pandit <parav@nvidia.com>
diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
index 237c56b..e16c91a 100644
--- a/libibverbs/verbs.c
+++ b/libibverbs/verbs.c
@@ -271,6 +271,14 @@ int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num,
 			VERBS_QUERY_GID_ATTR_NDEV_IFINDEX);
 }
 
+ssize_t _ibv_query_gid_table(struct ibv_context *context,
+			     struct ibv_gid_entry *entries, size_t max_entries,
+			     uint32_t flags, size_t entry_size)
+{
+	return ibv_cmd_query_gid_table(context, entries, max_entries, flags,
+				       entry_size);
+}
+
 LATEST_SYMVER_FUNC(ibv_query_pkey, 1_1, "IBVERBS_1.1",
 		   int,
 		   struct ibv_context *context, uint8_t port_num,
diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
index e5bf900..caf626c 100644
--- a/libibverbs/verbs.h
+++ b/libibverbs/verbs.h
@@ -43,6 +43,7 @@
 #include <string.h>
 #include <linux/types.h>
 #include <stdint.h>
+#include <sys/types.h>
 #include <infiniband/verbs_api.h>
 
 #ifdef __cplusplus
@@ -2359,6 +2360,21 @@ static inline int ibv_query_gid_ex(struct ibv_context *context,
 				 sizeof(*entry));
 }
 
+ssize_t _ibv_query_gid_table(struct ibv_context *context,
+			     struct ibv_gid_entry *entries, size_t max_entries,
+			     uint32_t flags, size_t entry_size);
+
+/*
+ * ibv_query_gid_table - Get all valid GID table entries
+ */
+static inline ssize_t ibv_query_gid_table(struct ibv_context *context,
+					  struct ibv_gid_entry *entries,
+					  size_t max_entries, uint32_t flags)
+{
+	return _ibv_query_gid_table(context, entries, max_entries, flags,
+				    sizeof(*entries));
+}
+
 /**
  * ibv_query_pkey - Get a P_Key table entry
  */
-- 
1.8.3.1



* [PATCH rdma-core 7/8] pyverbs: Add query_gid_table and query_gid_ex methods
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
                   ` (5 preceding siblings ...)
  2020-09-14  6:35 ` [PATCH rdma-core 6/8] verbs: Introduce a new query GID table API Yishai Hadas
@ 2020-09-14  6:35 ` Yishai Hadas
  2020-09-14  6:35 ` [PATCH rdma-core 8/8] tests: Add tests for ibv_query_gid_table and ibv_query_gid_ex Yishai Hadas
  7 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:35 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih, Edward Srouji

From: Avihai Horon <avihaih@nvidia.com>

Add two new methods to Context class: query_gid_table and query_gid_ex.

query_gid_table queries all GID tables of the device and returns
a list of GIDEntry objects containing all valid GID entries.

query_gid_ex queries the GID table of the given port at the given index
and returns a GIDEntry object.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
---
 pyverbs/device.pxd           |   3 ++
 pyverbs/device.pyx           | 106 +++++++++++++++++++++++++++++++++++++++++++
 pyverbs/libibverbs.pxd       |  13 ++++++
 pyverbs/libibverbs_enums.pxd |   5 ++
 4 files changed, 127 insertions(+)

diff --git a/pyverbs/device.pxd b/pyverbs/device.pxd
index 73328d3..0519c4b 100755
--- a/pyverbs/device.pxd
+++ b/pyverbs/device.pxd
@@ -64,3 +64,6 @@ cdef class DM(PyverbsCM):
 
 cdef class PortAttr(PyverbsObject):
     cdef v.ibv_port_attr attr
+
+cdef class GIDEntry(PyverbsObject):
+    cdef v.ibv_gid_entry entry
diff --git a/pyverbs/device.pyx b/pyverbs/device.pyx
index c1323cd..b16d6d0 100755
--- a/pyverbs/device.pyx
+++ b/pyverbs/device.pyx
@@ -26,6 +26,8 @@ from libc.stdlib cimport free, malloc
 from libc.string cimport memset
 from libc.stdint cimport uint64_t
 from libc.stdint cimport uint16_t
+from libc.stdint cimport uint32_t
+from pyverbs.utils import gid_str
 
 cdef extern from 'endian.h':
     unsigned long be64toh(unsigned long host_64bits);
@@ -240,6 +242,53 @@ cdef class Context(PyverbsCM):
                                    format(p=port_num), rc)
         return port_attrs
 
+    def query_gid_table(self, size_t max_entries, uint32_t flags=0):
+        """
+        Queries the GID tables of the device for at most <max_entries> entries
+        and returns them.
+        :param max_entries: Maximum number of GID entries to retrieve
+        :param flags: Specifies new extra members of struct ibv_gid_entry to
+                      query
+        :return: List of GIDEntry objects on success
+        """
+        cdef v.ibv_gid_entry *entries
+        cdef v.ibv_gid_entry entry
+
+        entries = <v.ibv_gid_entry *>malloc(max_entries *
+                                            sizeof(v.ibv_gid_entry))
+        rc = v.ibv_query_gid_table(self.context, entries, max_entries, flags)
+        if rc < 0:
+            raise PyverbsRDMAError('Failed to query gid tables of the device',
+                                   rc)
+        gid_entries = []
+        for i in range(rc):
+            entry = entries[i]
+            gid_entries.append(GIDEntry(entry.gid._global.subnet_prefix,
+                               entry.gid._global.interface_id, entry.gid_index,
+                               entry.port_num, entry.gid_type,
+                               entry.ndev_ifindex))
+        free(entries)
+        return gid_entries
+
+    def query_gid_ex(self, uint32_t port_num, uint32_t gid_index,
+                     uint32_t flags=0):
+        """
+        Queries the GID table of port <port_num> in index <gid_index>, and
+        returns the GID entry.
+        :param port_num: The port number to query
+        :param gid_index: The index in the GID table to query
+        :param flags: Specifies new extra members of struct ibv_gid_entry to
+                      query
+        :return: GIDEntry object on success
+        """
+        entry = GIDEntry()
+        rc = v.ibv_query_gid_ex(self.context, port_num, gid_index,
+                                &entry.entry, flags)
+        if rc != 0:
+            raise PyverbsRDMAError(f'Failed to query gid table of port '\
+                                   f'{port_num} in index {gid_index}', rc)
+        return entry
+
     cdef add_ref(self, obj):
         if isinstance(obj, PD):
             self.pds.add(obj)
@@ -816,6 +865,63 @@ cdef class PortAttr(PyverbsObject):
             print_format.format('Flags', self.attr.flags)
 
 
+cdef class GIDEntry(PyverbsObject):
+    def __init__(self, subnet_prefix=0, interface_id=0, gid_index=0,
+                 port_num=0, gid_type=0, ndev_ifindex=0):
+        super().__init__()
+        self.entry.gid._global.subnet_prefix = subnet_prefix
+        self.entry.gid._global.interface_id = interface_id
+        self.entry.gid_index = gid_index
+        self.entry.port_num = port_num
+        self.entry.gid_type = gid_type
+        self.entry.ndev_ifindex = ndev_ifindex
+
+    @property
+    def gid_subnet_prefix(self):
+        return self.entry.gid._global.subnet_prefix
+
+    @property
+    def gid_interface_id(self):
+        return self.entry.gid._global.interface_id
+
+    @property
+    def gid_index(self):
+        return self.entry.gid_index
+
+    @property
+    def port_num(self):
+        return self.entry.port_num
+
+    @property
+    def gid_type(self):
+        return self.entry.gid_type
+
+    @property
+    def ndev_ifindex(self):
+        return self.entry.ndev_ifindex
+
+    def gid_str(self):
+        return gid_str(self.gid_subnet_prefix, self.gid_interface_id)
+
+    def __str__(self):
+        print_format = '{:<24}: {:<20}\n'
+        return print_format.format('GID', self.gid_str()) +\
+            print_format.format('GID Index', self.gid_index) +\
+            print_format.format('Port number', self.port_num) +\
+            print_format.format('GID type', translate_gid_type(
+                                self.gid_type)) +\
+            print_format.format('Ndev ifindex', self.ndev_ifindex)
+
+
+def translate_gid_type(gid_type):
+    types = {e.IBV_GID_TYPE_IB: 'IB', e.IBV_GID_TYPE_ROCE_V1: 'RoCEv1',
+             e.IBV_GID_TYPE_ROCE_V2: 'RoCEv2'}
+    try:
+        return types[gid_type]
+    except KeyError:
+        return f'Unknown gid_type ({gid_type})'
+
+
 def guid_format(num):
     """
     Get GUID representation of the given number, including change of endianness.
diff --git a/pyverbs/libibverbs.pxd b/pyverbs/libibverbs.pxd
index c84b9fc..6fbba54 100755
--- a/pyverbs/libibverbs.pxd
+++ b/pyverbs/libibverbs.pxd
@@ -483,6 +483,13 @@ cdef extern from 'infiniband/verbs.h':
         uint32_t options
         uint32_t comp_mask
 
+    cdef struct ibv_gid_entry:
+        ibv_gid gid
+        uint32_t gid_index
+        uint32_t port_num
+        uint32_t gid_type
+        uint32_t ndev_ifindex
+
     ibv_device **ibv_get_device_list(int *n)
     int ibv_get_device_index(ibv_device *device);
     void ibv_free_device_list(ibv_device **list)
@@ -613,6 +620,12 @@ cdef extern from 'infiniband/verbs.h':
     void ibv_unimport_mr(ibv_mr *mr)
     ibv_pd *ibv_import_pd(ibv_context *context, uint32_t handle)
     void ibv_unimport_pd(ibv_pd *pd)
+    int ibv_query_gid_ex(ibv_context *context, uint32_t port_num,
+                         uint32_t gid_index, ibv_gid_entry *entry,
+                         uint32_t flags)
+    ssize_t ibv_query_gid_table(ibv_context *context,
+                                ibv_gid_entry *entries, size_t max_entries,
+                                uint32_t flags)
 
 
 cdef extern from 'infiniband/driver.h':
diff --git a/pyverbs/libibverbs_enums.pxd b/pyverbs/libibverbs_enums.pxd
index 83ca516..a5c07b3 100755
--- a/pyverbs/libibverbs_enums.pxd
+++ b/pyverbs/libibverbs_enums.pxd
@@ -427,6 +427,11 @@ cdef extern from '<infiniband/verbs.h>':
 
     cdef void *IBV_ALLOCATOR_USE_DEFAULT
 
+    cpdef enum ibv_gid_type:
+        IBV_GID_TYPE_IB
+        IBV_GID_TYPE_ROCE_V1
+        IBV_GID_TYPE_ROCE_V2
+
 
 cdef extern from "<infiniband/verbs_api.h>":
     cdef unsigned long long IBV_ADVISE_MR_ADVICE_PREFETCH
-- 
1.8.3.1



* [PATCH rdma-core 8/8] tests: Add tests for ibv_query_gid_table and ibv_query_gid_ex
  2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
                   ` (6 preceding siblings ...)
  2020-09-14  6:35 ` [PATCH rdma-core 7/8] pyverbs: Add query_gid_table and query_gid_ex methods Yishai Hadas
@ 2020-09-14  6:35 ` Yishai Hadas
  7 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-14  6:35 UTC (permalink / raw)
  To: linux-rdma; +Cc: jgg, yishaih, maorg, avihaih, Edward Srouji

From: Avihai Horon <avihaih@nvidia.com>

Add a test for ibv_query_gid_table and another one for
ibv_query_gid_ex.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
---
 tests/test_device.py | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/tests/test_device.py b/tests/test_device.py
index 1eb1a81..94c0e40 100644
--- a/tests/test_device.py
+++ b/tests/test_device.py
@@ -6,6 +6,7 @@ Test module for pyverbs' device module.
 import unittest
 import resource
 import random
+import errno
 
 from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError
 from tests.base import PyverbsAPITestCase
@@ -80,6 +81,37 @@ class DeviceTest(PyverbsAPITestCase):
             with d.Context(name=dev.name.decode()) as ctx:
                 ctx.query_gid(port_num=1, index=0)
 
+    def test_query_gid_table(self):
+        """
+        Test ibv_query_gid_table()
+        """
+        devs = self.get_device_list()
+        with d.Context(name=devs[0].name.decode()) as ctx:
+            device_attr = ctx.query_device()
+            port_attr = ctx.query_port(1)
+            max_entries = device_attr.phys_port_cnt * port_attr.gid_tbl_len
+            try:
+                ctx.query_gid_table(max_entries)
+            except PyverbsRDMAError as ex:
+                if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]:
+                    raise unittest.SkipTest('ibv_query_gid_table is not'\
+                                            ' supported on this device')
+                raise ex
+
+    def test_query_gid_ex(self):
+        """
+        Test ibv_query_gid_ex()
+        """
+        devs = self.get_device_list()
+        with d.Context(name=devs[0].name.decode()) as ctx:
+            try:
+                ctx.query_gid_ex(port_num=1, gid_index=0)
+            except PyverbsRDMAError as ex:
+                if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]:
+                    raise unittest.SkipTest('ibv_query_gid_ex is not'\
+                                            ' supported on this device')
+                raise ex
+
     @staticmethod
     def verify_device_attr(attr, device):
         """
-- 
1.8.3.1
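
Both tests in the patch above share one pattern: call the new verb, and if the provider or kernel reports EOPNOTSUPP or EPROTONOSUPPORT, skip the test instead of failing it. A minimal self-contained sketch of that pattern follows. It does not import pyverbs: FakeRDMAError is a hypothetical stand-in for PyverbsRDMAError (only the error_code attribute is modeled), and call_or_skip is an illustrative helper name that is not part of the patch.

```python
import errno
import unittest


class FakeRDMAError(Exception):
    """Hypothetical stand-in for PyverbsRDMAError; only error_code is modeled."""

    def __init__(self, msg, error_code):
        super().__init__(msg)
        self.error_code = error_code


# errno values the tests in the patch treat as "verb not supported".
UNSUPPORTED = (errno.EOPNOTSUPP, errno.EPROTONOSUPPORT)


def call_or_skip(func, what, *args, **kwargs):
    """Call func; convert 'not supported' errors into a skip, re-raise the rest.

    Illustrative helper, not part of pyverbs or of the patch.
    """
    try:
        return func(*args, **kwargs)
    except FakeRDMAError as ex:
        if ex.error_code in UNSUPPORTED:
            raise unittest.SkipTest(f'{what} is not supported on this device')
        # Any other error is a real failure and must propagate unchanged.
        raise
```

With a real context, the call would presumably look like call_or_skip(ctx.query_gid_ex, 'ibv_query_gid_ex', port_num=1, gid_index=0), with PyverbsRDMAError caught in place of the stand-in class.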



* Re: [PATCH rdma-core 3/8] verbs: Introduce a new query GID entry API
  2020-09-14  6:34 ` [PATCH rdma-core 3/8] verbs: Introduce a new query GID entry API Yishai Hadas
@ 2020-09-21 16:49   ` Jason Gunthorpe
  2020-09-22 12:47     ` Yishai Hadas
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Gunthorpe @ 2020-09-21 16:49 UTC (permalink / raw)
  To: Yishai Hadas; +Cc: linux-rdma, maorg, avihaih

On Mon, Sep 14, 2020 at 09:34:58AM +0300, Yishai Hadas wrote:
> +static int query_sysfs_gid_ndev_ifindex(struct ibv_context *context,
> +					uint8_t port_num, uint32_t gid_index,
> +					uint32_t *ndev_ifindex)
> +{
> +	struct verbs_device *verbs_device = verbs_get_device(context->device);
> +	char buff[IF_NAMESIZE] = {};

This init is not necessary

> diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
> index 9507ffd..9427aba 100644
> +++ b/libibverbs/verbs.c
> @@ -240,6 +240,14 @@ LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
>  	return 0;
>  }
>  
> +int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num,
> +		      uint32_t gid_index, struct ibv_gid_entry *entry,
> +		      uint32_t flags, size_t entry_size)
> +{
> +	return ibv_cmd_query_gid_entry(context, port_num, gid_index, entry,
> +				       flags, entry_size);
> +}

This extra function seems unnecessary.

We've been creating C files for each object type; the gid stuff could
go in device.c

Jason


* Re: [PATCH rdma-core 4/8] verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl
  2020-09-14  6:34 ` [PATCH rdma-core 4/8] verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl Yishai Hadas
@ 2020-09-21 16:53   ` Jason Gunthorpe
  2020-09-22 12:59     ` Yishai Hadas
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Gunthorpe @ 2020-09-21 16:53 UTC (permalink / raw)
  To: Yishai Hadas; +Cc: linux-rdma, maorg, avihaih

On Mon, Sep 14, 2020 at 09:34:59AM +0300, Yishai Hadas wrote:

> diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
> index 9427aba..9dec4e6 100644
> +++ b/libibverbs/verbs.c
> @@ -216,10 +216,8 @@ LATEST_SYMVER_FUNC(ibv_query_port, 1_1, "IBVERBS_1.1",
>  				sizeof(*port_attr));
>  }
>  
> -LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
> -		   int,
> -		   struct ibv_context *context, uint8_t port_num,
> -		   int index, union ibv_gid *gid)
> +int _ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index,
> +		   union ibv_gid *gid)
>  {
>  	struct verbs_device *verbs_device = verbs_get_device(context->device);
>  	char attr[41];
> @@ -240,6 +238,29 @@ LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
>  	return 0;
>  }

This should be moved to be near query_sysfs_gid_entry() and given a
better name

Jason


* Re: [PATCH rdma-core 3/8] verbs: Introduce a new query GID entry API
  2020-09-21 16:49   ` Jason Gunthorpe
@ 2020-09-22 12:47     ` Yishai Hadas
  0 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-22 12:47 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma, maorg, avihaih

On 9/21/2020 7:49 PM, Jason Gunthorpe wrote:
> On Mon, Sep 14, 2020 at 09:34:58AM +0300, Yishai Hadas wrote:
>> +static int query_sysfs_gid_ndev_ifindex(struct ibv_context *context,
>> +					uint8_t port_num, uint32_t gid_index,
>> +					uint32_t *ndev_ifindex)
>> +{
>> +	struct verbs_device *verbs_device = verbs_get_device(context->device);
>> +	char buff[IF_NAMESIZE] = {};
> This init is not necessary

OK

>
>> diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
>> index 9507ffd..9427aba 100644
>> +++ b/libibverbs/verbs.c
>> @@ -240,6 +240,14 @@ LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
>>   	return 0;
>>   }
>>   
>> +int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num,
>> +		      uint32_t gid_index, struct ibv_gid_entry *entry,
>> +		      uint32_t flags, size_t entry_size)
>> +{
>> +	return ibv_cmd_query_gid_entry(context, port_num, gid_index, entry,
>> +				       flags, entry_size);
>> +}
> This extra function seems unnecessary.
>
> We've been creating C files for each object type; the gid stuff could
> go in device.c
>
> Jason

Two patches ahead we introduce a mask to optimize ibv_query_gid() and
ibv_query_gid_type() when they fall back to sysfs, so we may need to
differentiate between this external verb API and the internal command.
But yes, all of this can be done under cmd_device.c; it will be part of V1.

Thanks,
Yishai



* Re: [PATCH rdma-core 4/8] verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl
  2020-09-21 16:53   ` Jason Gunthorpe
@ 2020-09-22 12:59     ` Yishai Hadas
  0 siblings, 0 replies; 13+ messages in thread
From: Yishai Hadas @ 2020-09-22 12:59 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma, maorg, avihaih

On 9/21/2020 7:53 PM, Jason Gunthorpe wrote:
> On Mon, Sep 14, 2020 at 09:34:59AM +0300, Yishai Hadas wrote:
>
>> diff --git a/libibverbs/verbs.c b/libibverbs/verbs.c
>> index 9427aba..9dec4e6 100644
>> +++ b/libibverbs/verbs.c
>> @@ -216,10 +216,8 @@ LATEST_SYMVER_FUNC(ibv_query_port, 1_1, "IBVERBS_1.1",
>>   				sizeof(*port_attr));
>>   }
>>   
>> -LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
>> -		   int,
>> -		   struct ibv_context *context, uint8_t port_num,
>> -		   int index, union ibv_gid *gid)
>> +int _ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index,
>> +		   union ibv_gid *gid)
>>   {
>>   	struct verbs_device *verbs_device = verbs_get_device(context->device);
>>   	char attr[41];
>> @@ -240,6 +238,29 @@ LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1",
>>   	return 0;
>>   }
> This should be moved to be near query_sysfs_gid_entry() and given a
> better name
>
> Jason

OK, will rename it to query_sysfs_gid() and move it near the function
above, which reads the full GID entry information.
In addition, will do the same for _ibv_query_gid_type(): move it to
cmd_device.c and rename it to query_sysfs_gid_type().

Yishai



Thread overview: 13+ messages
2020-09-14  6:34 [PATCH rdma-core 0/8] verbs: Query GID table API Yishai Hadas
2020-09-14  6:34 ` [PATCH rdma-core 1/8] Update kernel headers Yishai Hadas
2020-09-14  6:34 ` [PATCH rdma-core 2/8] verbs: Change the name of enum ibv_gid_type Yishai Hadas
2020-09-14  6:34 ` [PATCH rdma-core 3/8] verbs: Introduce a new query GID entry API Yishai Hadas
2020-09-21 16:49   ` Jason Gunthorpe
2020-09-22 12:47     ` Yishai Hadas
2020-09-14  6:34 ` [PATCH rdma-core 4/8] verbs: Implement ibv_query_gid and ibv_query_gid_type over ioctl Yishai Hadas
2020-09-21 16:53   ` Jason Gunthorpe
2020-09-22 12:59     ` Yishai Hadas
2020-09-14  6:35 ` [PATCH rdma-core 5/8] verbs: Optimize ibv_query_gid and ibv_query_gid_type Yishai Hadas
2020-09-14  6:35 ` [PATCH rdma-core 6/8] verbs: Introduce a new query GID table API Yishai Hadas
2020-09-14  6:35 ` [PATCH rdma-core 7/8] pyverbs: Add query_gid_table and query_gid_ex methods Yishai Hadas
2020-09-14  6:35 ` [PATCH rdma-core 8/8] tests: Add tests for ibv_query_gid_table and ibv_query_gid_ex Yishai Hadas
