* [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding
From: Rongwei Liu @ 2021-09-28  8:50 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it is possible to bind together any two (maybe
four in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share the
same PCIe domain/bus/device ID.

No backport to DPDK 20.11 is needed.

v2: fix CI warnings.

Rongwei Liu (2):
  common/mlx5: support pcie device guid query
  net/mlx5: support socket direct mode bonding

 drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 drivers/net/mlx5/linux/mlx5_os.c           | 43 +++++++++++++++++-----
 3 files changed, 94 insertions(+), 9 deletions(-)

-- 
2.27.0
* [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query
From: Rongwei Liu @ 2021-09-28  8:50 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

The sysfs entry "phys_switch_id" holds each PCIe device's GUID.
Devices that reside on the same physical NIC should report the
same GUID.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 2 files changed, 60 insertions(+)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 9e0c823c97..8b3ee2baea 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -2,6 +2,7 @@
  * Copyright 2020 Mellanox Technologies, Ltd
  */
 
+#include <sys/types.h>
 #include <unistd.h>
 #include <string.h>
 #include <stdio.h>
@@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr)
 	mlx5_glue->free_device_list(ibv_list);
 	return ibv_match;
 }
+
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len)
+{
+	char tmp[512];
+	char cur_ifname[IF_NAMESIZE + 1];
+	FILE *id_file;
+	DIR *dir;
+	struct dirent *ptr;
+	int ret;
+
+	if (guid == NULL || len < sizeof(u_int64_t) + 1)
+		return -1;
+	memset(guid, 0, len);
+	snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net",
+		 dev->domain, dev->bus, dev->devid, dev->function);
+	dir = opendir(tmp);
+	if (dir == NULL)
+		return -1;
+	/* Traverse to identify PF interface */
+	do {
+		ptr = readdir(dir);
+		if (ptr == NULL || ptr->d_type != DT_DIR) {
+			closedir(dir);
+			return -1;
+		}
+	} while (strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') ||
+		 strchr(ptr->d_name, 'v'));
+	snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name);
+	closedir(dir);
+	snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp),
+		 "/%s/phys_switch_id", cur_ifname);
+	/* Older OFED versions such as 5.3 don't support reading it */
+	id_file = fopen(tmp, "r");
+	if (!id_file)
+		return 0;
+	ret = fscanf(id_file, "%16s", guid);
+	fclose(id_file);
+	return ret;
+}
diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h
index c3202b6786..3cdea75373 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.h
+++ b/drivers/common/mlx5/linux/mlx5_common_os.h
@@ -296,4 +296,23 @@ __rte_internal
 struct ibv_device *
 mlx5_os_get_ibv_dev(const struct rte_device *dev);
 
+/**
+ * This is used to query system_image_guid as described in the PRM.
+ *
+ * @param dev[in]
+ *   Pointer to a device instance as PCIe id.
+ * @param guid[out]
+ *   Pointer to the buffer to hold device guid.
+ *   Guid is a uint64_t, corresponding to a 17-byte string.
+ * @param len[in]
+ *   Guid buffer length, 17 bytes at least.
+ *
+ * @return
+ *  -1 if internal failure.
+ *  0 if OFED doesn't support it.
+ *  >0 if success.
+ */
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len);
+
 #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */
-- 
2.27.0
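The sysfs lookup described in this patch can be sketched on its own, outside the driver. The following is a minimal illustration, not the code from the patch: `parse_switch_id`, `read_phys_switch_id`, `GUID_STR_LEN`, and the `sysfs_net` root parameter are hypothetical names introduced here, and the sketch assumes the interface name is already known rather than discovered by walking the PCI device directory. It follows the same return convention as the patch (1 on success, 0 when the entry cannot be opened, -1 on failure) and checks for the full 17-byte buffer that "%16s" can fill.

```c
#include <stdio.h>
#include <string.h>

/* 16 hex digits of a 64-bit GUID plus the terminating NUL (assumption). */
#define GUID_STR_LEN 17

/*
 * Hypothetical helper: read one whitespace-delimited token (the GUID)
 * from an already opened phys_switch_id stream.
 * Returns 1 on success, -1 if the buffer is too small or nothing is read.
 */
int
parse_switch_id(FILE *f, char *out, size_t len)
{
	if (f == NULL || out == NULL || len < GUID_STR_LEN)
		return -1;
	memset(out, 0, len);
	/* "%16s" stores at most 16 characters plus the NUL: 17 bytes. */
	return fscanf(f, "%16s", out) == 1 ? 1 : -1;
}

/*
 * Hypothetical helper: look up <sysfs_net>/<ifname>/phys_switch_id.
 * Returns 1 on success, 0 if the entry cannot be opened (for example a
 * driver stack that does not expose it), -1 on any other failure.
 */
int
read_phys_switch_id(const char *sysfs_net, const char *ifname,
		    char *out, size_t len)
{
	char path[512];
	FILE *f;
	int n, ret;

	n = snprintf(path, sizeof(path), "%s/%s/phys_switch_id",
		     sysfs_net, ifname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (f == NULL)
		return 0;
	ret = parse_switch_id(f, out, len);
	fclose(f);
	return ret;
}
```

On a live system the root would typically be "/sys/class/net"; keeping it as a parameter makes the helper easy to exercise against a scratch directory.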
* [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding
From: Rongwei Liu @ 2021-09-28  8:50 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it is possible to bind together any two (maybe
four in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share the
same PCIe domain/bus/device ID.

The kernel driver uses "system_image_guid" to decide whether devices
can be bound together. The sysfs entry "phys_switch_id" is used to
read the "system_image_guid" of each network interface.

OFED 5.4+ is required for "phys_switch_id" support.
On CentOS 8.1, switch_dev mode needs to be enabled first.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..1d57b934fc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	FILE *bond_file = NULL, *file;
 	int pf = -1;
 	int ret;
+	uint8_t cur_guid[32] = {0};
+	uint8_t guid[32] = {0};
 
 	/*
 	 * Try to get master device name. If something goes
@@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	np = mlx5_nl_portnum(nl_rdma, ibv_dev->name);
 	if (!np)
 		return -1;
+	if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0)
+		return -1;
 	/*
 	 * The Master device might not be on the predefined
 	 * port (not on port index 1, it is not garanted),
@@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info info;
+		int ret;
 
 		/* Process slave interface names in the loop. */
 		snprintf(tmp_str, sizeof(tmp_str),
@@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 				tmp_str);
 			break;
 		}
-		/* Match PCI address, allows BDF0+pfx or BDFx+pfx. */
-		if (pci_dev->domain == pci_addr.domain &&
-		    pci_dev->bus == pci_addr.bus &&
-		    pci_dev->devid == pci_addr.devid &&
-		    ((pci_dev->function == 0 &&
-		      pci_dev->function + owner == pci_addr.function) ||
-		     (pci_dev->function == owner &&
-		      pci_addr.function == owner)))
-			pf = info.port_name;
 		/* Get ifindex. */
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/ifindex", ifname);
@@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		bond_info->ports[info.port_name].pci_addr = pci_addr;
 		bond_info->ports[info.port_name].ifindex = ifindex;
 		bond_info->n_port++;
+		/*
+		 * Under socket direct mode, bonding will use
+		 * system_image_guid as identification.
+		 * Since OFED 5.4, the guid is readable (ret > 0) under sysfs.
+		 * All bonding members should have the same guid even if driver
+		 * is using PCIe BDF.
+		 */
+		ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid));
+		if (ret < 0)
+			break;
+		else if (ret > 0) {
+			if (!memcmp(guid, cur_guid, sizeof(guid)) &&
+			    owner == info.port_name &&
+			    (owner != 0 || (owner == 0 &&
+			    !rte_pci_addr_cmp(pci_dev, &pci_addr))))
+				pf = info.port_name;
+		} else if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    ((pci_dev->function == 0 &&
+		      pci_dev->function + owner == pci_addr.function) ||
+		     (pci_dev->function == owner &&
+		      pci_addr.function == owner)))
+			pf = info.port_name;
 	}
 	if (pf >= 0) {
 		/* Get bond interface info */
@@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
 			ifindex, bond_info->ifindex, bond_info->ifname);
 	}
+	if (owner == 0 && pf != 0) {
+		DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner",
+			pci_dev->domain, pci_dev->bus, pci_dev->devid,
+			pci_dev->function);
+	}
 	return pf;
 }
 
-- 
2.27.0
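The per-member decision made in the bonding loop of this patch can be read as a small pure function. The sketch below is an illustration under stated assumptions, not the driver code: `struct pci_addr` is a hypothetical stand-in for `struct rte_pci_addr`, `pci_addr_equal` replaces `rte_pci_addr_cmp() == 0`, GUIDs are compared as NUL-terminated strings instead of with `memcmp` over fixed buffers, and `guid_ret` mirrors the query result (greater than 0 when the GUID was read, 0 when sysfs does not expose it).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_pci_addr (same four fields). */
struct pci_addr {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

static bool
pci_addr_equal(const struct pci_addr *a, const struct pci_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->devid == b->devid && a->function == b->function;
}

/*
 * Decide whether a bonding member identifies the probed device `dev`
 * as the PF with index `owner`.
 */
bool
bond_member_matches(const struct pci_addr *dev, const char *dev_guid,
		    const struct pci_addr *member, const char *member_guid,
		    int guid_ret, int owner, int port_name)
{
	if (guid_ret > 0) {
		/* GUID path: same NIC, and the member is the owner port. */
		if (strcmp(dev_guid, member_guid) != 0 || owner != port_name)
			return false;
		/* Owner 0 must additionally be the probed device itself. */
		return owner != 0 || pci_addr_equal(dev, member);
	}
	/* Legacy path: same domain/bus/device, BDF0+pfx or BDFx+pfx. */
	return dev->domain == member->domain &&
	       dev->bus == member->bus &&
	       dev->devid == member->devid &&
	       ((dev->function == 0 &&
		 dev->function + owner == member->function) ||
		(dev->function == owner && member->function == owner));
}
```

With this shape, two members on different buses match through the GUID path when the GUID is readable, while the legacy branch still requires a shared domain/bus/device.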
* Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding
From: Thomas Monjalon @ 2021-09-29 21:58 UTC
To: matan, viacheslavo, Rongwei Liu; +Cc: orika, dev, rasland

28/09/2021 10:50, Rongwei Liu:
> In socket direct mode, it's possible to bind any two (maybe four
> in future) PCIe devices with IDs like xxxx:xx:xx.x and
> yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have
> the same PCIe domain/bus/device ID anymore,
>
> Kernel driver uses "system_image_guid" to identify if devices can
> be bound together or not. Sysfs "phys_switch_id" is used to get
> "system_image_guid" of each network interface.
>
> OFED 5.4+ is required to support "phys_switch_id".
> Centos 8.1 needs to enable switch_dev mode first.
>
> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++++++++++-------
>  1 file changed, 34 insertions(+), 9 deletions(-)

Does it deserve a line in the release notes?
* Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding
From: Slava Ovsiienko @ 2021-10-04  6:45 UTC
To: NBU-Contact-Thomas Monjalon, Matan Azrad, Rongwei Liu
Cc: Ori Kam, dev, Raslan Darawsheh

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, September 30, 2021 0:58
> To: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Rongwei Liu <rongweil@nvidia.com>
> Cc: Ori Kam <orika@nvidia.com>; dev@dpdk.org; Raslan Darawsheh
> <rasland@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct
> mode bonding
>
> 28/09/2021 10:50, Rongwei Liu:
> > In socket direct mode, it's possible to bind any two (maybe four in
> > future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y.
> > Bonding member interfaces are unnecessary to have the same PCIe
> > domain/bus/device ID anymore,
> >
> > Kernel driver uses "system_image_guid" to identify if devices can be
> > bound together or not. Sysfs "phys_switch_id" is used to get
> > "system_image_guid" of each network interface.
> >
> > OFED 5.4+ is required to support "phys_switch_id".
> > Centos 8.1 needs to enable switch_dev mode first.
> >
> > Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> > Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> > ---
> >  drivers/net/mlx5/linux/mlx5_os.c | 43
> > +++++++++++++++++++++++++-------
> >  1 file changed, 34 insertions(+), 9 deletions(-)
>
> Does it deserve a line in the release notes?

Not sure, it is a minor update.
* [dpdk-dev] [PATCH v3 0/2] support socket direct mode bonding
From: Rongwei Liu @ 2021-10-08 10:05 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it is possible to bind together any two (maybe
four in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share the
same PCIe domain/bus/device ID.

No backport to DPDK 20.11 is needed.

v2: fix CI warnings.
v3: add description in release_21_11.

Rongwei Liu (2):
  common/mlx5: support pcie device guid query
  net/mlx5: support socket direct mode bonding

 doc/guides/rel_notes/release_21_11.rst     |  4 ++
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 drivers/net/mlx5/linux/mlx5_os.c           | 43 +++++++++++++++++-----
 4 files changed, 98 insertions(+), 9 deletions(-)

-- 
2.27.0
* [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query
From: Rongwei Liu @ 2021-10-08 10:05 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

The sysfs entry "phys_switch_id" holds each PCIe device's GUID.
Devices that reside on the same physical NIC should report the
same GUID.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 2 files changed, 60 insertions(+)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 9e0c823c97..8b3ee2baea 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -2,6 +2,7 @@
  * Copyright 2020 Mellanox Technologies, Ltd
  */
 
+#include <sys/types.h>
 #include <unistd.h>
 #include <string.h>
 #include <stdio.h>
@@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr)
 	mlx5_glue->free_device_list(ibv_list);
 	return ibv_match;
 }
+
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len)
+{
+	char tmp[512];
+	char cur_ifname[IF_NAMESIZE + 1];
+	FILE *id_file;
+	DIR *dir;
+	struct dirent *ptr;
+	int ret;
+
+	if (guid == NULL || len < sizeof(u_int64_t) + 1)
+		return -1;
+	memset(guid, 0, len);
+	snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net",
+		 dev->domain, dev->bus, dev->devid, dev->function);
+	dir = opendir(tmp);
+	if (dir == NULL)
+		return -1;
+	/* Traverse to identify PF interface */
+	do {
+		ptr = readdir(dir);
+		if (ptr == NULL || ptr->d_type != DT_DIR) {
+			closedir(dir);
+			return -1;
+		}
+	} while (strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') ||
+		 strchr(ptr->d_name, 'v'));
+	snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name);
+	closedir(dir);
+	snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp),
+		 "/%s/phys_switch_id", cur_ifname);
+	/* Older OFED versions such as 5.3 don't support reading it */
+	id_file = fopen(tmp, "r");
+	if (!id_file)
+		return 0;
+	ret = fscanf(id_file, "%16s", guid);
+	fclose(id_file);
+	return ret;
+}
diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h
index c3202b6786..3cdea75373 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.h
+++ b/drivers/common/mlx5/linux/mlx5_common_os.h
@@ -296,4 +296,23 @@ __rte_internal
 struct ibv_device *
 mlx5_os_get_ibv_dev(const struct rte_device *dev);
 
+/**
+ * This is used to query system_image_guid as described in the PRM.
+ *
+ * @param dev[in]
+ *   Pointer to a device instance as PCIe id.
+ * @param guid[out]
+ *   Pointer to the buffer to hold device guid.
+ *   Guid is a uint64_t, corresponding to a 17-byte string.
+ * @param len[in]
+ *   Guid buffer length, 17 bytes at least.
+ *
+ * @return
+ *  -1 if internal failure.
+ *  0 if OFED doesn't support it.
+ *  >0 if success.
+ */
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len);
+
 #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */
-- 
2.27.0
* [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding
From: Rongwei Liu @ 2021-10-08 10:05 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it is possible to bind together any two (maybe
four in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share the
same PCIe domain/bus/device ID.

The kernel driver uses "system_image_guid" to decide whether devices
can be bound together. The sysfs entry "phys_switch_id" is used to
read the "system_image_guid" of each network interface.

OFED 5.4+ is required for "phys_switch_id" support.
On CentOS 8.1, switch_dev mode needs to be enabled first.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/rel_notes/release_21_11.rst |  4 +++
 drivers/net/mlx5/linux/mlx5_os.c       | 43 ++++++++++++++++++++------
 2 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index dfc2cbdeed..54a7bd230f 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -106,6 +106,10 @@ New Features
   * Added DES-CBC, AES-XCBC-MAC, AES-CMAC and non-HMAC algo support.
   * Added PDCP short MAC-I support.
 
+* **Updated Mellanox mlx5 driver.**
+
+  * Added socket direct mode bonding support which needs OFED 5.4+.
+
 * **Updated NXP dpaa2_sec crypto PMD.**
 
   * Added PDCP short MAC-I support.
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..1d57b934fc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	FILE *bond_file = NULL, *file;
 	int pf = -1;
 	int ret;
+	uint8_t cur_guid[32] = {0};
+	uint8_t guid[32] = {0};
 
 	/*
 	 * Try to get master device name. If something goes
@@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	np = mlx5_nl_portnum(nl_rdma, ibv_dev->name);
 	if (!np)
 		return -1;
+	if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0)
+		return -1;
 	/*
 	 * The Master device might not be on the predefined
 	 * port (not on port index 1, it is not garanted),
@@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info info;
+		int ret;
 
 		/* Process slave interface names in the loop. */
 		snprintf(tmp_str, sizeof(tmp_str),
@@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 				tmp_str);
 			break;
 		}
-		/* Match PCI address, allows BDF0+pfx or BDFx+pfx. */
-		if (pci_dev->domain == pci_addr.domain &&
-		    pci_dev->bus == pci_addr.bus &&
-		    pci_dev->devid == pci_addr.devid &&
-		    ((pci_dev->function == 0 &&
-		      pci_dev->function + owner == pci_addr.function) ||
-		     (pci_dev->function == owner &&
-		      pci_addr.function == owner)))
-			pf = info.port_name;
 		/* Get ifindex. */
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/ifindex", ifname);
@@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		bond_info->ports[info.port_name].pci_addr = pci_addr;
 		bond_info->ports[info.port_name].ifindex = ifindex;
 		bond_info->n_port++;
+		/*
+		 * Under socket direct mode, bonding will use
+		 * system_image_guid as identification.
+		 * Since OFED 5.4, the guid is readable (ret > 0) under sysfs.
+		 * All bonding members should have the same guid even if driver
+		 * is using PCIe BDF.
+		 */
+		ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid));
+		if (ret < 0)
+			break;
+		else if (ret > 0) {
+			if (!memcmp(guid, cur_guid, sizeof(guid)) &&
+			    owner == info.port_name &&
+			    (owner != 0 || (owner == 0 &&
+			    !rte_pci_addr_cmp(pci_dev, &pci_addr))))
+				pf = info.port_name;
+		} else if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    ((pci_dev->function == 0 &&
+		      pci_dev->function + owner == pci_addr.function) ||
+		     (pci_dev->function == owner &&
+		      pci_addr.function == owner)))
+			pf = info.port_name;
 	}
 	if (pf >= 0) {
 		/* Get bond interface info */
@@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
 			ifindex, bond_info->ifindex, bond_info->ifname);
 	}
+	if (owner == 0 && pf != 0) {
+		DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner",
+			pci_dev->domain, pci_dev->bus, pci_dev->devid,
+			pci_dev->function);
+	}
 	return pf;
 }
 
-- 
2.27.0
* [dpdk-dev] [PATCH v4 0/2] support socket direct mode bonding
From: Rongwei Liu @ 2021-10-14  2:57 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it is possible to bind together any two (maybe
four in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share the
same PCIe domain/bus/device ID.

No backport to DPDK 20.11 is needed.

v2: fix CI warnings.
v3: add description in release_21_11.rst.
v4: add description in mlx5.rst.

Rongwei Liu (2):
  common/mlx5: support pcie device guid query
  net/mlx5: support socket direct mode bonding

 doc/guides/nics/mlx5.rst                   |  4 ++
 doc/guides/rel_notes/release_21_11.rst     |  4 ++
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 drivers/net/mlx5/linux/mlx5_os.c           | 43 +++++++++++++++++-----
 5 files changed, 102 insertions(+), 9 deletions(-)

-- 
2.27.0
* [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query
From: Rongwei Liu @ 2021-10-14  2:58 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

The sysfs entry "phys_switch_id" holds each PCIe device's GUID.
Devices that reside on the same physical NIC should report the
same GUID.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++
 drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++
 2 files changed, 60 insertions(+)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 9e0c823c97..8b3ee2baea 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -2,6 +2,7 @@
  * Copyright 2020 Mellanox Technologies, Ltd
  */
 
+#include <sys/types.h>
 #include <unistd.h>
 #include <string.h>
 #include <stdio.h>
@@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr)
 	mlx5_glue->free_device_list(ibv_list);
 	return ibv_match;
 }
+
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len)
+{
+	char tmp[512];
+	char cur_ifname[IF_NAMESIZE + 1];
+	FILE *id_file;
+	DIR *dir;
+	struct dirent *ptr;
+	int ret;
+
+	if (guid == NULL || len < sizeof(u_int64_t) + 1)
+		return -1;
+	memset(guid, 0, len);
+	snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net",
+		 dev->domain, dev->bus, dev->devid, dev->function);
+	dir = opendir(tmp);
+	if (dir == NULL)
+		return -1;
+	/* Traverse to identify PF interface */
+	do {
+		ptr = readdir(dir);
+		if (ptr == NULL || ptr->d_type != DT_DIR) {
+			closedir(dir);
+			return -1;
+		}
+	} while (strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') ||
+		 strchr(ptr->d_name, 'v'));
+	snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name);
+	closedir(dir);
+	snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp),
+		 "/%s/phys_switch_id", cur_ifname);
+	/* Older OFED versions such as 5.3 don't support reading it */
+	id_file = fopen(tmp, "r");
+	if (!id_file)
+		return 0;
+	ret = fscanf(id_file, "%16s", guid);
+	fclose(id_file);
+	return ret;
+}
diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h
index c3202b6786..3cdea75373 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.h
+++ b/drivers/common/mlx5/linux/mlx5_common_os.h
@@ -296,4 +296,23 @@ __rte_internal
 struct ibv_device *
 mlx5_os_get_ibv_dev(const struct rte_device *dev);
 
+/**
+ * This is used to query system_image_guid as described in the PRM.
+ *
+ * @param dev[in]
+ *   Pointer to a device instance as PCIe id.
+ * @param guid[out]
+ *   Pointer to the buffer to hold device guid.
+ *   Guid is a uint64_t, corresponding to a 17-byte string.
+ * @param len[in]
+ *   Guid buffer length, 17 bytes at least.
+ *
+ * @return
+ *  -1 if internal failure.
+ *  0 if OFED doesn't support it.
+ *  >0 if success.
+ */
+int
+mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len);
+
 #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */
-- 
2.27.0
* [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding
From: Rongwei Liu @ 2021-10-14  2:58 UTC
To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

In socket direct mode, it is possible to bind together any two (maybe
four in the future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces no longer need to share the
same PCIe domain/bus/device ID.

The kernel driver uses "system_image_guid" to decide whether devices
can be bound together. The sysfs entry "phys_switch_id" is used to
read the "system_image_guid" of each network interface.

OFED 5.4+ is required for "phys_switch_id" support.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst               |  4 +++
 doc/guides/rel_notes/release_21_11.rst |  4 +++
 drivers/net/mlx5/linux/mlx5_os.c       | 43 ++++++++++++++++++++------
 3 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bae73f42d8..b58236e00a 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -464,6 +464,10 @@ Limitations
   - In order to achieve best insertion rate, application should manage the flows per lcore.
   - Better to disable memory reclaim by setting ``reclaim_mem_mode`` to 0 to accelerate the flow object allocation and release with cache.
 
+- Bonding under socket direct mode
+
+  - Needs OFED 5.4+.
+
 Statistics
 ----------
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index dfc2cbdeed..2a6cc765c2 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -106,6 +106,10 @@ New Features
   * Added DES-CBC, AES-XCBC-MAC, AES-CMAC and non-HMAC algo support.
   * Added PDCP short MAC-I support.
 
+* **Updated Mellanox mlx5 driver.**
+
+  * Added socket direct mode bonding support.
+
 * **Updated NXP dpaa2_sec crypto PMD.**
 
   * Added PDCP short MAC-I support.
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..1d57b934fc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	FILE *bond_file = NULL, *file;
 	int pf = -1;
 	int ret;
+	uint8_t cur_guid[32] = {0};
+	uint8_t guid[32] = {0};
 
 	/*
 	 * Try to get master device name. If something goes
@@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 	np = mlx5_nl_portnum(nl_rdma, ibv_dev->name);
 	if (!np)
 		return -1;
+	if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0)
+		return -1;
 	/*
 	 * The Master device might not be on the predefined
 	 * port (not on port index 1, it is not garanted),
@@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info info;
+		int ret;
 
 		/* Process slave interface names in the loop. */
 		snprintf(tmp_str, sizeof(tmp_str),
@@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 				tmp_str);
 			break;
 		}
-		/* Match PCI address, allows BDF0+pfx or BDFx+pfx. */
-		if (pci_dev->domain == pci_addr.domain &&
-		    pci_dev->bus == pci_addr.bus &&
-		    pci_dev->devid == pci_addr.devid &&
-		    ((pci_dev->function == 0 &&
-		      pci_dev->function + owner == pci_addr.function) ||
-		     (pci_dev->function == owner &&
-		      pci_addr.function == owner)))
-			pf = info.port_name;
 		/* Get ifindex. */
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/ifindex", ifname);
@@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		bond_info->ports[info.port_name].pci_addr = pci_addr;
 		bond_info->ports[info.port_name].ifindex = ifindex;
 		bond_info->n_port++;
+		/*
+		 * Under socket direct mode, bonding will use
+		 * system_image_guid as identification.
+		 * Since OFED 5.4, the guid is readable (ret > 0) under sysfs.
+		 * All bonding members should have the same guid even if driver
+		 * is using PCIe BDF.
+		 */
+		ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid));
+		if (ret < 0)
+			break;
+		else if (ret > 0) {
+			if (!memcmp(guid, cur_guid, sizeof(guid)) &&
+			    owner == info.port_name &&
+			    (owner != 0 || (owner == 0 &&
+			    !rte_pci_addr_cmp(pci_dev, &pci_addr))))
+				pf = info.port_name;
+		} else if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    ((pci_dev->function == 0 &&
+		      pci_dev->function + owner == pci_addr.function) ||
+		     (pci_dev->function == owner &&
+		      pci_addr.function == owner)))
+			pf = info.port_name;
 	}
 	if (pf >= 0) {
 		/* Get bond interface info */
@@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
 			ifindex, bond_info->ifindex, bond_info->ifname);
 	}
+	if (owner == 0 && pf != 0) {
+		DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner",
+			pci_dev->domain, pci_dev->bus, pci_dev->devid,
+			pci_dev->function);
+	}
 	return pf;
 }
 
-- 
2.27.0
Thread overview: 11+ messages (newest: 2021-10-14  2:58 UTC)
2021-09-28  8:50 [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding Rongwei Liu
2021-09-28  8:50 ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query Rongwei Liu
2021-09-28  8:50 ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
2021-09-29 21:58   ` Thomas Monjalon
2021-10-04  6:45     ` Slava Ovsiienko
2021-10-08 10:05     ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu
2021-10-08 10:05       ` [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query Rongwei Liu
2021-10-08 10:05       ` [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu
2021-10-14  2:57     ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu
2021-10-14  2:58       ` [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query Rongwei Liu
2021-10-14  2:58       ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu