* [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding
@ 2021-09-28  8:50 Rongwei Liu
  2021-09-28  8:50 ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query Rongwei Liu
  2021-09-28  8:50 ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
  0 siblings, 2 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-09-28  8:50 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it's possible to bind any two (maybe four
in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share
the same PCIe domain/bus/device ID.

No backport to DPDK 20.11 is needed.

v2: fix ci warnings.

Rongwei Liu (2):
  common/mlx5: support pcie device guid query
  net/mlx5: support socket direct mode bonding

 drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 drivers/net/mlx5/linux/mlx5_os.c           | 43 +++++++++++++++++-----
 3 files changed, 94 insertions(+), 9 deletions(-)

-- 
2.27.0



* [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query
  2021-09-28  8:50 [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding Rongwei Liu
@ 2021-09-28  8:50 ` Rongwei Liu
  2021-09-28  8:50 ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-09-28  8:50 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

The sysfs entry "phys_switch_id" holds each PCIe device's
GUID.

Devices that reside in the same physical NIC should
have the same GUID.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 2 files changed, 60 insertions(+)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 9e0c823c97..8b3ee2baea 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -2,6 +2,7 @@
  * Copyright 2020 Mellanox Technologies, Ltd
  */
 
+#include <sys/types.h>
 #include <unistd.h>
 #include <string.h>
 #include <stdio.h>
@@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr)
 	mlx5_glue->free_device_list(ibv_list);
 	return ibv_match;
 }
+
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len)
+{
+	char tmp[512];
+	char cur_ifname[IF_NAMESIZE + 1];
+	FILE *id_file;
+	DIR *dir;
+	struct dirent *ptr;
+	int ret;
+
+	if (guid == NULL || len < sizeof(u_int64_t) * 2 + 1)
+		return -1;
+	memset(guid, 0, len);
+	snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net",
+			dev->domain, dev->bus, dev->devid, dev->function);
+	dir = opendir(tmp);
+	if (dir == NULL)
+		return -1;
+	/* Traverse to identify PF interface */
+	do {
+		ptr = readdir(dir);
+		if (ptr == NULL || ptr->d_type != DT_DIR) {
+			closedir(dir);
+			return -1;
+		}
+	} while (strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') ||
+		 strchr(ptr->d_name, 'v'));
+	snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name);
+	closedir(dir);
+	snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp),
+			"/%s/phys_switch_id", cur_ifname);
+	/* Older OFED like 5.3 doesn't support read */
+	id_file = fopen(tmp, "r");
+	if (!id_file)
+		return 0;
+	ret = fscanf(id_file, "%16s", guid);
+	fclose(id_file);
+	return ret;
+}
diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h
index c3202b6786..3cdea75373 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.h
+++ b/drivers/common/mlx5/linux/mlx5_common_os.h
@@ -296,4 +296,23 @@ __rte_internal
 struct ibv_device *
 mlx5_os_get_ibv_dev(const struct rte_device *dev);
 
+/**
+ * Query the system_image_guid as described in the PRM.
+ *
+ * @param dev[in]
+ *  Pointer to a device instance as PCIe id.
+ * @param guid[out]
+ *  Pointer to the buffer to hold device guid.
+ *  The guid is a uint64_t, which corresponds to a 17-byte string.
+ * @param len[in]
+ *  Guid buffer length, 17 bytes at least.
+ *
+ * @return
+ *  -1 if internal failure.
+ *  0 if OFED doesn't support it.
+ *  >0 if success.
+ */
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len);
+
 #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */
-- 
2.27.0
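
For readers following the buffer-size discussion in the patch above, here is a
minimal standalone sketch of reading a 16-hex-digit GUID string from a
sysfs-style file. It is not the DPDK implementation: sketch_read_guid() and
GUID_STR_LEN are hypothetical names, and unlike the patch it normalizes a
failed parse to -1 instead of returning the raw fscanf() result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* 16 hex digits for a 64-bit GUID plus the terminating NUL byte. */
#define GUID_STR_LEN 17

/*
 * Read a GUID string from a sysfs-style attribute file.
 * Hypothetical helper for illustration only, not a DPDK API.
 *
 * Returns -1 on bad arguments or an unparsable file,
 *          0 if the file cannot be opened (attribute not exposed),
 *          1 if a GUID string was stored in guid.
 */
static int
sketch_read_guid(const char *path, char *guid, size_t len)
{
	FILE *f;
	int ret;

	/* The buffer must hold the full string, not just 8 raw bytes. */
	if (path == NULL || guid == NULL || len < GUID_STR_LEN)
		return -1;
	memset(guid, 0, len);
	f = fopen(path, "r");
	if (f == NULL)
		return 0;
	/* "%16s" stores at most 16 characters plus the NUL byte. */
	ret = fscanf(f, "%16s", guid);
	fclose(f);
	return ret == 1 ? 1 : -1;
}
```

A caller would pass a path such as the phys_switch_id attribute of a PF
netdev and treat a return value of 0 as "fall back to PCI address matching".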



* [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding
  2021-09-28  8:50 [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding Rongwei Liu
  2021-09-28  8:50 ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query Rongwei Liu
@ 2021-09-28  8:50 ` Rongwei Liu
  2021-09-29 21:58   ` Thomas Monjalon
  1 sibling, 1 reply; 11+ messages in thread
From: Rongwei Liu @ 2021-09-28  8:50 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it's possible to bind any two (maybe four
in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share
the same PCIe domain/bus/device ID.

The kernel driver uses "system_image_guid" to decide whether devices
can be bound together. The sysfs entry "phys_switch_id" is used to get
the "system_image_guid" of each network interface.

OFED 5.4+ is required to support "phys_switch_id".
CentOS 8.1 needs switch_dev mode enabled first.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..1d57b934fc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	FILE *bond_file = NULL, *file;
 	int pf = -1;
 	int ret;
+	uint8_t cur_guid[32] = {0};
+	uint8_t guid[32] = {0};
 
 	/*
 	 * Try to get master device name. If something goes
@@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	np = mlx5_nl_portnum(nl_rdma, ibv_dev->name);
 	if (!np)
 		return -1;
+	if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0)
+		return -1;
 	/*
 	 * The Master device might not be on the predefined
 	 * port (not on port index 1, it is not garanted),
@@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info	info;
+		int ret;
 
 		/* Process slave interface names in the loop. */
 		snprintf(tmp_str, sizeof(tmp_str),
@@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 				tmp_str);
 			break;
 		}
-		/* Match PCI address, allows BDF0+pfx or BDFx+pfx. */
-		if (pci_dev->domain == pci_addr.domain &&
-		    pci_dev->bus == pci_addr.bus &&
-		    pci_dev->devid == pci_addr.devid &&
-		    ((pci_dev->function == 0 &&
-		      pci_dev->function + owner == pci_addr.function) ||
-		     (pci_dev->function == owner &&
-		      pci_addr.function == owner)))
-			pf = info.port_name;
 		/* Get ifindex. */
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/ifindex", ifname);
@@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		bond_info->ports[info.port_name].pci_addr = pci_addr;
 		bond_info->ports[info.port_name].ifindex = ifindex;
 		bond_info->n_port++;
+		/*
+		 * Under socket direct mode, bonding will use
+		 * system_image_guid as identification.
+		 * With OFED 5.4+, guid is readable (ret > 0) under sysfs.
+		 * All bonding members should have the same guid even if driver
+		 * is using PCIe BDF.
+		 */
+		ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid));
+		if (ret < 0)
+			break;
+		else if (ret > 0) {
+			if (!memcmp(guid, cur_guid, sizeof(guid)) &&
+			    owner == info.port_name &&
+			    (owner != 0 || (owner == 0 &&
+			    !rte_pci_addr_cmp(pci_dev, &pci_addr))))
+				pf = info.port_name;
+		} else if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    ((pci_dev->function == 0 &&
+		      pci_dev->function + owner == pci_addr.function) ||
+		     (pci_dev->function == owner &&
+		      pci_addr.function == owner)))
+			pf = info.port_name;
 	}
 	if (pf >= 0) {
 		/* Get bond interface info */
@@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
 				ifindex, bond_info->ifindex, bond_info->ifname);
 	}
+	if (owner == 0 && pf != 0) {
+		DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner",
+				pci_dev->domain, pci_dev->bus, pci_dev->devid,
+				pci_dev->function);
+	}
 	return pf;
 }
 
-- 
2.27.0
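
The owner-matching rule introduced by the patch above can be sketched as a
pure function, which makes the two paths (GUID comparison versus the legacy
BDF comparison) easier to follow. This is only an illustration under stated
assumptions: struct sketch_pci_addr stands in for struct rte_pci_addr, and
sketch_same_addr() and sketch_is_owner() are hypothetical names that do not
exist in DPDK. guid_ret follows the return convention documented in patch 1/2.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_pci_addr. */
struct sketch_pci_addr {
	unsigned int domain;
	unsigned int bus;
	unsigned int devid;
	unsigned int function;
};

static bool
sketch_same_addr(const struct sketch_pci_addr *a,
		 const struct sketch_pci_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->devid == b->devid && a->function == b->function;
}

/*
 * Decide whether a bonding member is the PF owned by the probed device.
 * guid_ret > 0: both GUID strings are valid, compare them.
 * guid_ret == 0: GUID not available, fall back to the BDF rule.
 * guid_ret < 0: internal failure, never a match.
 */
static bool
sketch_is_owner(const struct sketch_pci_addr *probed, const char *probed_guid,
		const struct sketch_pci_addr *member, const char *member_guid,
		int guid_ret, unsigned int owner, unsigned int port_name)
{
	if (guid_ret < 0)
		return false;
	if (guid_ret > 0) {
		/* Same physical NIC and the member is the requested PF. */
		if (strcmp(probed_guid, member_guid) != 0 ||
		    owner != port_name)
			return false;
		/* PF0 additionally requires the exact PCI address. */
		return owner != 0 || sketch_same_addr(probed, member);
	}
	/* Legacy rule: same domain/bus/device, BDF0+pfx or BDFx+pfx. */
	if (probed->domain != member->domain || probed->bus != member->bus ||
	    probed->devid != member->devid)
		return false;
	return (probed->function == 0 && owner == member->function) ||
	       (probed->function == owner && member->function == owner);
}
```

With matching GUIDs, two members on different buses can be matched for a
non-zero owner, which is the socket direct case; without a GUID the result is
the same as before the patch.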



* Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding
  2021-09-28  8:50 ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
@ 2021-09-29 21:58   ` Thomas Monjalon
  2021-10-04  6:45     ` Slava Ovsiienko
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Thomas Monjalon @ 2021-09-29 21:58 UTC (permalink / raw)
  To: matan, viacheslavo, Rongwei Liu; +Cc: orika, dev, rasland

28/09/2021 10:50, Rongwei Liu:
> In socket direct mode, it's possible to bind any two (maybe four
> in future) PCIe devices with IDs like xxxx:xx:xx.x and
> yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have
> the same PCIe domain/bus/device ID anymore,
> 
> Kernel driver uses "system_image_guid" to identify if devices can
> be bound together or not. Sysfs "phys_switch_id" is used to get
> "system_image_guid" of each network interface.
> 
> OFED 5.4+ is required to support "phys_switch_id".
> Centos 8.1 needs to enable switch_dev mode first.
> 
> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++++++++++-------
>  1 file changed, 34 insertions(+), 9 deletions(-)

Does it deserve a line in the release notes?





* Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding
  2021-09-29 21:58   ` Thomas Monjalon
@ 2021-10-04  6:45     ` Slava Ovsiienko
  2021-10-08 10:05     ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu
  2021-10-14  2:57     ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu
  2 siblings, 0 replies; 11+ messages in thread
From: Slava Ovsiienko @ 2021-10-04  6:45 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon, Matan Azrad, Rongwei Liu
  Cc: Ori Kam, dev, Raslan Darawsheh

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, September 30, 2021 0:58
> To: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Rongwei Liu <rongweil@nvidia.com>
> Cc: Ori Kam <orika@nvidia.com>; dev@dpdk.org; Raslan Darawsheh
> <rasland@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct
> mode bonding
> 
> 28/09/2021 10:50, Rongwei Liu:
> > In socket direct mode, it's possible to bind any two (maybe four in
> > future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y.
> > Bonding member interfaces are unnecessary to have the same PCIe
> > domain/bus/device ID anymore,
> >
> > Kernel driver uses "system_image_guid" to identify if devices can be
> > bound together or not. Sysfs "phys_switch_id" is used to get
> > "system_image_guid" of each network interface.
> >
> > OFED 5.4+ is required to support "phys_switch_id".
> > Centos 8.1 needs to enable switch_dev mode first.
> >
> > Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> > Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> > ---
> >  drivers/net/mlx5/linux/mlx5_os.c | 43
> > +++++++++++++++++++++++++-------
> >  1 file changed, 34 insertions(+), 9 deletions(-)
> 
> Does it deserve a line in the release notes?
Not sure, it is a minor update.


* [dpdk-dev] [PATCH v3 0/2] support socket direct mode bonding
  2021-09-29 21:58   ` Thomas Monjalon
  2021-10-04  6:45     ` Slava Ovsiienko
@ 2021-10-08 10:05     ` Rongwei Liu
  2021-10-08 10:05       ` [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query Rongwei Liu
  2021-10-08 10:05       ` [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
  2021-10-14  2:57     ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu
  2 siblings, 2 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-10-08 10:05 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it's possible to bind any two (maybe four
in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share
the same PCIe domain/bus/device ID.

No backport to DPDK 20.11 is needed.

v2: fix ci warnings.
v3: add description in release_21_11.

Rongwei Liu (2):
  common/mlx5: support pcie device guid query
  net/mlx5: support socket direct mode bonding

 doc/guides/rel_notes/release_21_11.rst     |  4 ++
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 drivers/net/mlx5/linux/mlx5_os.c           | 43 +++++++++++++++++-----
 4 files changed, 98 insertions(+), 9 deletions(-)

-- 
2.27.0



* [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query
  2021-10-08 10:05     ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu
@ 2021-10-08 10:05       ` Rongwei Liu
  2021-10-08 10:05       ` [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-10-08 10:05 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

The sysfs entry "phys_switch_id" holds each PCIe device's
GUID.

Devices that reside in the same physical NIC should
have the same GUID.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 2 files changed, 60 insertions(+)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 9e0c823c97..8b3ee2baea 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -2,6 +2,7 @@
  * Copyright 2020 Mellanox Technologies, Ltd
  */
 
+#include <sys/types.h>
 #include <unistd.h>
 #include <string.h>
 #include <stdio.h>
@@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr)
 	mlx5_glue->free_device_list(ibv_list);
 	return ibv_match;
 }
+
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len)
+{
+	char tmp[512];
+	char cur_ifname[IF_NAMESIZE + 1];
+	FILE *id_file;
+	DIR *dir;
+	struct dirent *ptr;
+	int ret;
+
+	if (guid == NULL || len < sizeof(u_int64_t) * 2 + 1)
+		return -1;
+	memset(guid, 0, len);
+	snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net",
+			dev->domain, dev->bus, dev->devid, dev->function);
+	dir = opendir(tmp);
+	if (dir == NULL)
+		return -1;
+	/* Traverse to identify PF interface */
+	do {
+		ptr = readdir(dir);
+		if (ptr == NULL || ptr->d_type != DT_DIR) {
+			closedir(dir);
+			return -1;
+		}
+	} while (strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') ||
+		 strchr(ptr->d_name, 'v'));
+	snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name);
+	closedir(dir);
+	snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp),
+			"/%s/phys_switch_id", cur_ifname);
+	/* Older OFED like 5.3 doesn't support read */
+	id_file = fopen(tmp, "r");
+	if (!id_file)
+		return 0;
+	ret = fscanf(id_file, "%16s", guid);
+	fclose(id_file);
+	return ret;
+}
diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h
index c3202b6786..3cdea75373 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.h
+++ b/drivers/common/mlx5/linux/mlx5_common_os.h
@@ -296,4 +296,23 @@ __rte_internal
 struct ibv_device *
 mlx5_os_get_ibv_dev(const struct rte_device *dev);
 
+/**
+ * Query the system_image_guid as described in the PRM.
+ *
+ * @param dev[in]
+ *  Pointer to a device instance as PCIe id.
+ * @param guid[out]
+ *  Pointer to the buffer to hold device guid.
+ *  The guid is a uint64_t, which corresponds to a 17-byte string.
+ * @param len[in]
+ *  Guid buffer length, 17 bytes at least.
+ *
+ * @return
+ *  -1 if internal failure.
+ *  0 if OFED doesn't support it.
+ *  >0 if success.
+ */
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len);
+
 #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */
-- 
2.27.0
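
The interface-name filter and the sysfs path construction used by
mlx5_get_device_guid() in the patch above can be sketched in isolation as
follows. Both helpers are hypothetical and exist only for illustration. The
filter mirrors the patch's heuristic of skipping any name that contains '.',
'_' or 'v', so it assumes PF netdev names never contain those characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if a directory entry under .../net looks like a PF netdev.
 * Entries such as "." and "..", or names containing '_' or 'v', are skipped.
 */
static bool
sketch_is_pf_ifname(const char *name)
{
	return name != NULL && name[0] != '\0' &&
	       strpbrk(name, "._v") == NULL;
}

/*
 * Build the phys_switch_id path for a PCI address and interface name.
 * Returns 0 on success, -1 if the path did not fit in buf.
 */
static int
sketch_switch_id_path(char *buf, size_t len, unsigned int domain,
		      unsigned int bus, unsigned int devid,
		      unsigned int function, const char *ifname)
{
	int n = snprintf(buf, len,
			 "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net/%s/phys_switch_id",
			 domain, bus, devid, function, ifname);

	/* snprintf() reports the length it wanted; detect truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the snprintf() result is a design choice of this sketch; the patch
itself relies on the 512-byte buffer being large enough.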



* [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding
  2021-10-08 10:05     ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu
  2021-10-08 10:05       ` [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query Rongwei Liu
@ 2021-10-08 10:05       ` Rongwei Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-10-08 10:05 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it's possible to bind any two (maybe four
in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share
the same PCIe domain/bus/device ID.

The kernel driver uses "system_image_guid" to decide whether devices
can be bound together. The sysfs entry "phys_switch_id" is used to get
the "system_image_guid" of each network interface.

OFED 5.4+ is required to support "phys_switch_id".
CentOS 8.1 needs switch_dev mode enabled first.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/rel_notes/release_21_11.rst |  4 +++
 drivers/net/mlx5/linux/mlx5_os.c       | 43 ++++++++++++++++++++------
 2 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index dfc2cbdeed..54a7bd230f 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -106,6 +106,10 @@ New Features
   * Added DES-CBC, AES-XCBC-MAC, AES-CMAC and non-HMAC algo support.
   * Added PDCP short MAC-I support.
 
+* **Updated Mellanox mlx5 driver.**
+
+  * Added socket direct mode bonding support which needs OFED 5.4+.
+
 * **Updated NXP dpaa2_sec crypto PMD.**
 
   * Added PDCP short MAC-I support.
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..1d57b934fc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	FILE *bond_file = NULL, *file;
 	int pf = -1;
 	int ret;
+	uint8_t cur_guid[32] = {0};
+	uint8_t guid[32] = {0};
 
 	/*
 	 * Try to get master device name. If something goes
@@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	np = mlx5_nl_portnum(nl_rdma, ibv_dev->name);
 	if (!np)
 		return -1;
+	if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0)
+		return -1;
 	/*
 	 * The Master device might not be on the predefined
 	 * port (not on port index 1, it is not garanted),
@@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info	info;
+		int ret;
 
 		/* Process slave interface names in the loop. */
 		snprintf(tmp_str, sizeof(tmp_str),
@@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 				tmp_str);
 			break;
 		}
-		/* Match PCI address, allows BDF0+pfx or BDFx+pfx. */
-		if (pci_dev->domain == pci_addr.domain &&
-		    pci_dev->bus == pci_addr.bus &&
-		    pci_dev->devid == pci_addr.devid &&
-		    ((pci_dev->function == 0 &&
-		      pci_dev->function + owner == pci_addr.function) ||
-		     (pci_dev->function == owner &&
-		      pci_addr.function == owner)))
-			pf = info.port_name;
 		/* Get ifindex. */
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/ifindex", ifname);
@@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		bond_info->ports[info.port_name].pci_addr = pci_addr;
 		bond_info->ports[info.port_name].ifindex = ifindex;
 		bond_info->n_port++;
+		/*
+		 * Under socket direct mode, bonding will use
+		 * system_image_guid as identification.
+		 * With OFED 5.4+, guid is readable (ret > 0) under sysfs.
+		 * All bonding members should have the same guid even if driver
+		 * is using PCIe BDF.
+		 */
+		ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid));
+		if (ret < 0)
+			break;
+		else if (ret > 0) {
+			if (!memcmp(guid, cur_guid, sizeof(guid)) &&
+			    owner == info.port_name &&
+			    (owner != 0 || (owner == 0 &&
+			    !rte_pci_addr_cmp(pci_dev, &pci_addr))))
+				pf = info.port_name;
+		} else if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    ((pci_dev->function == 0 &&
+		      pci_dev->function + owner == pci_addr.function) ||
+		     (pci_dev->function == owner &&
+		      pci_addr.function == owner)))
+			pf = info.port_name;
 	}
 	if (pf >= 0) {
 		/* Get bond interface info */
@@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
 				ifindex, bond_info->ifindex, bond_info->ifname);
 	}
+	if (owner == 0 && pf != 0) {
+		DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner",
+				pci_dev->domain, pci_dev->bus, pci_dev->devid,
+				pci_dev->function);
+	}
 	return pf;
 }
 
-- 
2.27.0



* [dpdk-dev] [PATCH v4 0/2] support socket direct mode bonding
  2021-09-29 21:58   ` Thomas Monjalon
  2021-10-04  6:45     ` Slava Ovsiienko
  2021-10-08 10:05     ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu
@ 2021-10-14  2:57     ` Rongwei Liu
  2021-10-14  2:58       ` [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query Rongwei Liu
  2021-10-14  2:58       ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
  2 siblings, 2 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-10-14  2:57 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it's possible to bind any two (maybe four
in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share
the same PCIe domain/bus/device ID.

No backport to DPDK 20.11 is needed.

v2: fix ci warnings.
v3: add description in release_21_11.rst.
v4: add description in mlx5.rst.

Rongwei Liu (2):
  common/mlx5: support pcie device guid query
  net/mlx5: support socket direct mode bonding

 doc/guides/nics/mlx5.rst                   |  4 ++
 doc/guides/rel_notes/release_21_11.rst     |  4 ++
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 drivers/net/mlx5/linux/mlx5_os.c           | 43 +++++++++++++++++-----
 5 files changed, 102 insertions(+), 9 deletions(-)

-- 
2.27.0



* [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query
  2021-10-14  2:57     ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu
@ 2021-10-14  2:58       ` Rongwei Liu
  2021-10-14  2:58       ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-10-14  2:58 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

The sysfs entry "phys_switch_id" holds each PCIe device's
GUID.

Devices that reside in the same physical NIC should
have the same GUID.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 2 files changed, 60 insertions(+)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 9e0c823c97..8b3ee2baea 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -2,6 +2,7 @@
  * Copyright 2020 Mellanox Technologies, Ltd
  */
 
+#include <sys/types.h>
 #include <unistd.h>
 #include <string.h>
 #include <stdio.h>
@@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr)
 	mlx5_glue->free_device_list(ibv_list);
 	return ibv_match;
 }
+
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len)
+{
+	char tmp[512];
+	char cur_ifname[IF_NAMESIZE + 1];
+	FILE *id_file;
+	DIR *dir;
+	struct dirent *ptr;
+	int ret;
+
+	if (guid == NULL || len < sizeof(u_int64_t) * 2 + 1)
+		return -1;
+	memset(guid, 0, len);
+	snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net",
+			dev->domain, dev->bus, dev->devid, dev->function);
+	dir = opendir(tmp);
+	if (dir == NULL)
+		return -1;
+	/* Traverse to identify PF interface */
+	do {
+		ptr = readdir(dir);
+		if (ptr == NULL || ptr->d_type != DT_DIR) {
+			closedir(dir);
+			return -1;
+		}
+	} while (strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') ||
+		 strchr(ptr->d_name, 'v'));
+	snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name);
+	closedir(dir);
+	snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp),
+			"/%s/phys_switch_id", cur_ifname);
+	/* Older OFED like 5.3 doesn't support read */
+	id_file = fopen(tmp, "r");
+	if (!id_file)
+		return 0;
+	ret = fscanf(id_file, "%16s", guid);
+	fclose(id_file);
+	return ret;
+}
diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h
index c3202b6786..3cdea75373 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.h
+++ b/drivers/common/mlx5/linux/mlx5_common_os.h
@@ -296,4 +296,23 @@ __rte_internal
 struct ibv_device *
 mlx5_os_get_ibv_dev(const struct rte_device *dev);
 
+/**
+ * Query the system_image_guid as described in the PRM.
+ *
+ * @param dev[in]
+ *  Pointer to a device instance as PCIe id.
+ * @param guid[out]
+ *  Pointer to the buffer to hold device guid.
+ *  The guid is a uint64_t, which corresponds to a 17-byte string.
+ * @param len[in]
+ *  Guid buffer length, 17 bytes at least.
+ *
+ * @return
+ *  -1 if internal failure.
+ *  0 if OFED doesn't support it.
+ *  >0 if success.
+ */
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len);
+
 #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */
-- 
2.27.0



* [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding
  2021-10-14  2:57     ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu
  2021-10-14  2:58       ` [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query Rongwei Liu
@ 2021-10-14  2:58       ` Rongwei Liu
  1 sibling, 0 replies; 11+ messages in thread
From: Rongwei Liu @ 2021-10-14  2:58 UTC (permalink / raw)
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it's possible to bind any two (maybe four
in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share
the same PCIe domain/bus/device ID.

The kernel driver uses "system_image_guid" to decide whether devices
can be bound together. The sysfs entry "phys_switch_id" is used to get
the "system_image_guid" of each network interface.

OFED 5.4+ is required to support "phys_switch_id".

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst               |  4 +++
 doc/guides/rel_notes/release_21_11.rst |  4 +++
 drivers/net/mlx5/linux/mlx5_os.c       | 43 ++++++++++++++++++++------
 3 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bae73f42d8..b58236e00a 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -464,6 +464,10 @@ Limitations
   - In order to achieve best insertion rate, application should manage the flows per lcore.
   - Better to disable memory reclaim by setting ``reclaim_mem_mode`` to 0 to accelerate the flow object allocation and release with cache.
 
+- Bonding under socket direct mode
+
+  - Needs OFED 5.4+.
+
 Statistics
 ----------
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index dfc2cbdeed..2a6cc765c2 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -106,6 +106,10 @@ New Features
   * Added DES-CBC, AES-XCBC-MAC, AES-CMAC and non-HMAC algo support.
   * Added PDCP short MAC-I support.
 
+* **Updated Mellanox mlx5 driver.**
+
+  * Added socket direct mode bonding support.
+
 * **Updated NXP dpaa2_sec crypto PMD.**
 
   * Added PDCP short MAC-I support.
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..1d57b934fc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	FILE *bond_file = NULL, *file;
 	int pf = -1;
 	int ret;
+	uint8_t cur_guid[32] = {0};
+	uint8_t guid[32] = {0};
 
 	/*
 	 * Try to get master device name. If something goes
@@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	np = mlx5_nl_portnum(nl_rdma, ibv_dev->name);
 	if (!np)
 		return -1;
+	if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0)
+		return -1;
 	/*
 	 * The Master device might not be on the predefined
 	 * port (not on port index 1, it is not garanted),
@@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info	info;
+		int ret;
 
 		/* Process slave interface names in the loop. */
 		snprintf(tmp_str, sizeof(tmp_str),
@@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 				tmp_str);
 			break;
 		}
-		/* Match PCI address, allows BDF0+pfx or BDFx+pfx. */
-		if (pci_dev->domain == pci_addr.domain &&
-		    pci_dev->bus == pci_addr.bus &&
-		    pci_dev->devid == pci_addr.devid &&
-		    ((pci_dev->function == 0 &&
-		      pci_dev->function + owner == pci_addr.function) ||
-		     (pci_dev->function == owner &&
-		      pci_addr.function == owner)))
-			pf = info.port_name;
 		/* Get ifindex. */
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/ifindex", ifname);
@@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		bond_info->ports[info.port_name].pci_addr = pci_addr;
 		bond_info->ports[info.port_name].ifindex = ifindex;
 		bond_info->n_port++;
+		/*
+		 * Under socket direct mode, bonding will use
+		 * system_image_guid as identification.
+		 * With OFED 5.4+, guid is readable (ret > 0) under sysfs.
+		 * All bonding members should have the same guid even if driver
+		 * is using PCIe BDF.
+		 */
+		ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid));
+		if (ret < 0)
+			break;
+		else if (ret > 0) {
+			if (!memcmp(guid, cur_guid, sizeof(guid)) &&
+			    owner == info.port_name &&
+			    (owner != 0 || (owner == 0 &&
+			    !rte_pci_addr_cmp(pci_dev, &pci_addr))))
+				pf = info.port_name;
+		} else if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    ((pci_dev->function == 0 &&
+		      pci_dev->function + owner == pci_addr.function) ||
+		     (pci_dev->function == owner &&
+		      pci_addr.function == owner)))
+			pf = info.port_name;
 	}
 	if (pf >= 0) {
 		/* Get bond interface info */
@@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
 				ifindex, bond_info->ifindex, bond_info->ifname);
 	}
+	if (owner == 0 && pf != 0) {
+		DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner",
+				pci_dev->domain, pci_dev->bus, pci_dev->devid,
+				pci_dev->function);
+	}
 	return pf;
 }
 
-- 
2.27.0

