* [PATCH rdma-next 00/10] Support RAW Ethernet when RoCE is disabled
@ 2016-11-27 14:51 Leon Romanovsky
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua

Hi Doug,

Please find below the patch set from Or. I didn't add a version number to this
patch set, because it has grown extensively since the initial version [1], from
a single mlx5-only patch to native IB core support.

It was tested against v4.9-rc6.

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

In some environments, such as certain SRIOV VF configurations, RoCE is
not supported for mlx5 Ethernet ports. Currently, the driver does not
open an IB device on such a port.

This is problematic, since we do want user-space RAW Ethernet (RAW_PACKET QPs)
functionality to remain in place. To that end, we change the relevant driver
flows such that an IB device instance is created in that case as well.

Following the previous post [1], Doug asked us to provide a way for
applications to query which QP types are actually supported on a device.
This series adds that functionality at device/port granularity, since
some drivers (mlx4) can support different protocols per port and
hence different QP types per port.

QP types are basically derived from the protocol(s) supported on the port. We
added two protocols (raw packet and usnic) so that the protocol set is
consistent with what is currently upstream.
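
As an illustration of deriving QP types from per-port protocol bits, here is a
minimal user-space sketch. The protocol bit values are copied from the
ib_verbs.h hunks in patches 01 and 04; every `DEMO_*`/`demo_*` name and the
protocol-to-QP-type mapping itself are assumptions made for this example and
are not taken from the series.

```c
#include <stdint.h>

/* Protocol bits as they appear in the ib_verbs.h hunks of this series. */
#define RDMA_CORE_CAP_PROT_ROCE           0x00200000u
#define RDMA_CORE_CAP_PROT_IWARP          0x00400000u
#define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000u
#define RDMA_CORE_CAP_PROT_RAW_PACKET     0x01000000u
#define RDMA_CORE_CAP_PROT_USNIC          0x02000000u

/* Hypothetical QP type bits, for illustration only. */
#define DEMO_QPT_RC         (1u << 0)
#define DEMO_QPT_UD         (1u << 1)
#define DEMO_QPT_RAW_PACKET (1u << 2)

/*
 * Translate a port's protocol bits into a mask of QP types.
 * The mapping is a simplified assumption, not the one used by the patches.
 */
static inline uint32_t demo_qp_types_from_caps(uint32_t core_cap_flags)
{
	uint32_t qp_types = 0;

	if (core_cap_flags & (RDMA_CORE_CAP_PROT_ROCE |
			      RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP))
		qp_types |= DEMO_QPT_RC | DEMO_QPT_UD;
	if (core_cap_flags & RDMA_CORE_CAP_PROT_IWARP)
		qp_types |= DEMO_QPT_RC;
	if (core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET)
		qp_types |= DEMO_QPT_RAW_PACKET;
	if (core_cap_flags & RDMA_CORE_CAP_PROT_USNIC)
		qp_types |= DEMO_QPT_UD;

	return qp_types;
}
```

With this mapping, a port that reports only the raw packet protocol (the
RoCE-disabled case this series targets) yields only the raw packet QP type.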

Patches 1-7 deal with the new query and patches 8-10 with the mlx5-specific
changes.

To be clear, the upstream kernel IB core is fully functional without the
new port attribute, in the sense that no errors are seen from components that,
e.g., use SMI and GSI services. The query is mainly targeted at user space.

Thanks, Or.

[1] https://patchwork.kernel.org/patch/9096351/

---------------------------------------------------------------------------------------
Available in the "topic/raw-ibdev" topic branch of this git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git
Or for browsing:
https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=topic/raw-ibdev
---------------------------------------------------------------------------------------

CC: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
CC: Steve Wise <swise-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
CC: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
CC: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
CC: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
CC: "Wei Hu(Xavier)" <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
CC: Faisal Latif <faisal.latif-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
CC: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
CC: Selvin Xavier <selvin.xavier-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org>
CC: Devesh Sharma <devesh.sharma-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org>
CC: Mitesh Ahuja <mitesh.ahuja-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org>
CC: Christian Benvenuti <benve-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
CC: Dave Goodell <dgoodell-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
CC: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Or Gerlitz (10):
  IB/core: Add raw packet protocol
  IB/mlx5: Support raw packet protocol
  IB/mlx4: Support raw packet protocol
  IB: Add protocol for USNIC
  IB: Query port through the core instead of directly calling the driver
    handler
  IB/core: Enable to query QP types supported by IB device on a port
  IB/uverbs: Propagate supported QP types to user-space
  IB/mlx5: Refactor registration to netdev notifier
  IB/mlx5: Rename RoCE related helpers to reflect being Eth ones
  IB/mlx5: Support RAW Ethernet when RoCE is disabled

 drivers/infiniband/core/device.c             | 28 +++++++++
 drivers/infiniband/core/uverbs_cmd.c         |  2 +
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |  7 ++-
 drivers/infiniband/hw/cxgb4/provider.c       |  8 +--
 drivers/infiniband/hw/hfi1/verbs.c           |  1 +
 drivers/infiniband/hw/hns/hns_roce_main.c    | 10 ++-
 drivers/infiniband/hw/i40iw/i40iw_verbs.c    |  8 +--
 drivers/infiniband/hw/mlx4/alias_GUID.c      |  1 +
 drivers/infiniband/hw/mlx4/main.c            | 23 ++++---
 drivers/infiniband/hw/mlx4/sysfs.c           |  1 +
 drivers/infiniband/hw/mlx5/mad.c             |  2 +-
 drivers/infiniband/hw/mlx5/main.c            | 91 +++++++++++++++++-----------
 drivers/infiniband/hw/mthca/mthca_provider.c |  9 +--
 drivers/infiniband/hw/nes/nes_verbs.c        |  5 +-
 drivers/infiniband/hw/ocrdma/ocrdma_main.c   |  9 +--
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  |  1 +
 drivers/infiniband/hw/qedr/verbs.c           | 11 ++--
 drivers/infiniband/hw/qib/qib_verbs.c        |  1 +
 drivers/infiniband/hw/usnic/usnic_ib_main.c  |  4 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c |  2 +-
 drivers/infiniband/sw/rdmavt/vt.c            |  7 ++-
 drivers/infiniband/sw/rxe/rxe_verbs.c        |  6 +-
 include/rdma/ib_verbs.h                      | 17 ++++++
 include/uapi/rdma/ib_user_verbs.h            |  2 +-
 24 files changed, 173 insertions(+), 83 deletions(-)

--
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2016-11-27 14:51   ` Leon Romanovsky
       [not found]     ` <1480258296-27032-2-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-11-27 14:51   ` [PATCH rdma-next 02/10] IB/mlx5: Support " Leon Romanovsky
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Define the raw packet protocol, which denotes that a port supports
the RAW_PACKET QP type. It is used by later patches in this series, where
the IB core serves a query on the supported QP types for a device/port.
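
For readers who want to try the per-port check outside the kernel, the
following is a minimal user-space sketch of the helper added by this patch.
The `demo_*` structures are simplified stand-ins (an assumption of this
example) that keep only the `core_cap_flags` field; the flag values come from
the hunk below.

```c
#include <stdbool.h>
#include <stdint.h>

#define RDMA_CORE_CAP_PROT_ROCE       0x00200000u
#define RDMA_CORE_CAP_PROT_RAW_PACKET 0x01000000u

/* Simplified stand-in for the per-port immutable data. */
struct demo_port_immutable {
	uint32_t core_cap_flags;
};

/* Simplified stand-in for the device; slot 0 is unused, ports start at 1. */
struct demo_ib_device {
	struct demo_port_immutable port_immutable[3];
};

/* Mirrors the shape of rdma_protocol_raw_packet() from the patch. */
static inline bool demo_protocol_raw_packet(const struct demo_ib_device *device,
					    uint8_t port_num)
{
	return device->port_immutable[port_num].core_cap_flags &
	       RDMA_CORE_CAP_PROT_RAW_PACKET;
}
```

Because the flag is stored per port, two ports of the same device can give
different answers, which is the per-port granularity the cover letter asks for.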

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 include/rdma/ib_verbs.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ad43a4..0cb3194 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -485,6 +485,7 @@ static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 #define RDMA_CORE_CAP_PROT_ROCE         0x00200000
 #define RDMA_CORE_CAP_PROT_IWARP        0x00400000
 #define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000
+#define RDMA_CORE_CAP_PROT_RAW_PACKET   0x01000000
 
 #define RDMA_CORE_PORT_IBA_IB          (RDMA_CORE_CAP_PROT_IB  \
 					| RDMA_CORE_CAP_IB_MAD \
@@ -508,6 +509,8 @@ static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 #define RDMA_CORE_PORT_INTEL_OPA       (RDMA_CORE_PORT_IBA_IB  \
 					| RDMA_CORE_CAP_OPA_MAD)
 
+#define RDMA_CORE_PORT_RAW_PACKET	(RDMA_CORE_CAP_PROT_RAW_PACKET)
+
 struct ib_port_attr {
 	u64			subnet_prefix;
 	enum ib_port_state	state;
@@ -2286,6 +2289,11 @@ static inline bool rdma_ib_or_roce(const struct ib_device *device, u8 port_num)
 		rdma_protocol_roce(device, port_num);
 }
 
+static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
+{
+	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
+}
+
 /**
  * rdma_cap_ib_mad - Check if the port of a device supports Infiniband
  * Management Datagrams.
-- 
2.7.4


* [PATCH rdma-next 02/10] IB/mlx5: Support raw packet protocol
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-11-27 14:51   ` [PATCH rdma-next 01/10] IB/core: Add raw packet protocol Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 03/10] IB/mlx4: " Leon Romanovsky
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Mark support for the new raw packet protocol on Eth ports.

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 1e47999..019a7a4 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2702,11 +2702,13 @@ static u32 get_core_cap_flags(struct ib_device *ibdev)
 	if (ll == IB_LINK_LAYER_INFINIBAND)
 		return RDMA_CORE_PORT_IBA_IB;
 
+	ret = RDMA_CORE_PORT_RAW_PACKET;
+
 	if (!(l3_type_cap & MLX5_ROCE_L3_TYPE_IPV4_CAP))
-		return 0;
+		return ret;
 
 	if (!(l3_type_cap & MLX5_ROCE_L3_TYPE_IPV6_CAP))
-		return 0;
+		return ret;
 
 	if (roce_version_cap & MLX5_ROCE_VERSION_1_CAP)
 		ret |= RDMA_CORE_PORT_IBA_ROCE;
-- 
2.7.4


* [PATCH rdma-next 03/10] IB/mlx4: Support raw packet protocol
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-11-27 14:51   ` [PATCH rdma-next 01/10] IB/core: Add raw packet protocol Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 02/10] IB/mlx5: Support " Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 04/10] IB: Add protocol for USNIC Leon Romanovsky
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Mark support for the new raw packet protocol on Eth ports.

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx4/main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index b597e82..3902832 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2532,16 +2532,19 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
 
 	if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND) {
 		immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
+		immutable->max_mad_size = IB_MGMT_MAD_SIZE;
 	} else {
 		if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE)
 			immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
 		if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
 			immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE |
 				RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
+		immutable->core_cap_flags |= RDMA_CORE_PORT_RAW_PACKET;
+		if (immutable->core_cap_flags & (RDMA_CORE_PORT_IBA_ROCE |
+		    RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP))
+			immutable->max_mad_size = IB_MGMT_MAD_SIZE;
 	}
 
-	immutable->max_mad_size = IB_MGMT_MAD_SIZE;
-
 	return 0;
 }
 
-- 
2.7.4


* [PATCH rdma-next 04/10] IB: Add protocol for USNIC
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 03/10] IB/mlx4: " Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 05/10] IB: Query port through the core instead of directly calling the driver handler Leon Romanovsky
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add a protocol definition for the proprietary USNIC driver, to be
used in later patches that query supported QP types.

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 +
 include/rdma/ib_verbs.h                     | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
index 0a89a95..dde0b23 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
@@ -325,6 +325,7 @@ static int usnic_port_immutable(struct ib_device *ibdev, u8 port_num,
 	if (err)
 		return err;
 
+	immutable->core_cap_flags = RDMA_CORE_PORT_USNIC;
 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 0cb3194..485b725 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -486,6 +486,7 @@ static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 #define RDMA_CORE_CAP_PROT_IWARP        0x00400000
 #define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000
 #define RDMA_CORE_CAP_PROT_RAW_PACKET   0x01000000
+#define RDMA_CORE_CAP_PROT_USNIC        0x02000000
 
 #define RDMA_CORE_PORT_IBA_IB          (RDMA_CORE_CAP_PROT_IB  \
 					| RDMA_CORE_CAP_IB_MAD \
@@ -511,6 +512,8 @@ static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 
 #define RDMA_CORE_PORT_RAW_PACKET	(RDMA_CORE_CAP_PROT_RAW_PACKET)
 
+#define RDMA_CORE_PORT_USNIC		(RDMA_CORE_CAP_PROT_USNIC)
+
 struct ib_port_attr {
 	u64			subnet_prefix;
 	enum ib_port_state	state;
@@ -2294,6 +2297,11 @@ static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 p
 	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
 }
 
+static inline bool rdma_protocol_usnic(const struct ib_device *device, u8 port_num)
+{
+	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_USNIC;
+}
+
 /**
  * rdma_cap_ib_mad - Check if the port of a device supports Infiniband
  * Management Datagrams.
-- 
2.7.4


* [PATCH rdma-next 05/10] IB: Query port through the core instead of directly calling the driver handler
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 04/10] IB: Add protocol for USNIC Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 06/10] IB/core: Enable to query QP types supported by IB device on a port Leon Romanovsky
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Change the drivers to call ib_query_port in their get port
immutable handler instead of their own query port handler.

Doing this requires setting the core cap flags of the device
before the ib_query_port call is made, since the IB core might
need these caps to serve the port query.

The IB core guarantees to drivers that the port attributes passed
to the port query verb implementation are zeroed, and hence we
remove the zeroing from the drivers.

This is a preparatory step for allowing drivers to modify some port
attributes (in their query port handler) which are set by the IB core
before calling them.

This patch doesn't add any new functionality.
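
The call ordering described above can be modeled in user space roughly as
follows. This is a sketch under stated assumptions, not the kernel code: all
`demo_*` names and the table sizes are invented for the example, and the
zeroing in the core wrapper follows the description in this commit message.

```c
#include <stdint.h>
#include <string.h>

#define RDMA_CORE_CAP_PROT_IWARP 0x00400000u

struct demo_port_attr {
	int max_mtu;
	int gid_tbl_len;
	int pkey_tbl_len;
};

struct demo_port_immutable {
	uint32_t core_cap_flags;
	int pkey_tbl_len;
	int gid_tbl_len;
};

struct demo_device;
typedef int (*demo_query_port_fn)(struct demo_device *dev, uint8_t port,
				  struct demo_port_attr *props);

struct demo_device {
	demo_query_port_fn query_port;	/* driver's query port handler */
};

/* Core entry point: zero the attributes, then call the driver handler. */
static inline int demo_core_query_port(struct demo_device *dev, uint8_t port,
				       struct demo_port_attr *props)
{
	memset(props, 0, sizeof(*props));
	return dev->query_port(dev, port, props);
}

/* Driver handler: relies on the caller having zeroed props. */
static inline int demo_drv_query_port(struct demo_device *dev, uint8_t port,
				      struct demo_port_attr *props)
{
	(void)dev;
	(void)port;
	props->gid_tbl_len = 16;	/* made-up values */
	props->pkey_tbl_len = 1;
	return 0;
}

/* Immutable handler: set the cap flags first, then query through the core. */
static inline int demo_drv_port_immutable(struct demo_device *dev, uint8_t port,
					  struct demo_port_immutable *immutable)
{
	struct demo_port_attr attr;
	int err;

	immutable->core_cap_flags = RDMA_CORE_CAP_PROT_IWARP;

	err = demo_core_query_port(dev, port, &attr);
	if (err)
		return err;

	immutable->pkey_tbl_len = attr.pkey_tbl_len;
	immutable->gid_tbl_len = attr.gid_tbl_len;
	return 0;
}
```

Note that a caller that bypasses the core wrapper and invokes the driver
handler directly has to zero the attributes itself, which is why a few hunks
below add a memset at such call sites.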

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |  7 ++++---
 drivers/infiniband/hw/cxgb4/provider.c       |  8 ++++----
 drivers/infiniband/hw/hfi1/verbs.c           |  1 +
 drivers/infiniband/hw/hns/hns_roce_main.c    |  7 ++++---
 drivers/infiniband/hw/i40iw/i40iw_verbs.c    |  8 ++++----
 drivers/infiniband/hw/mlx4/alias_GUID.c      |  1 +
 drivers/infiniband/hw/mlx4/main.c            | 18 +++++++++---------
 drivers/infiniband/hw/mlx4/sysfs.c           |  1 +
 drivers/infiniband/hw/mlx5/mad.c             |  2 +-
 drivers/infiniband/hw/mlx5/main.c            | 12 +++++++-----
 drivers/infiniband/hw/mthca/mthca_provider.c |  9 +++++----
 drivers/infiniband/hw/nes/nes_verbs.c        |  5 +++--
 drivers/infiniband/hw/ocrdma/ocrdma_main.c   |  9 +++++----
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  |  1 +
 drivers/infiniband/hw/qedr/verbs.c           |  9 +++++----
 drivers/infiniband/hw/qib/qib_verbs.c        |  1 +
 drivers/infiniband/hw/usnic/usnic_ib_main.c  |  5 +++--
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c |  2 +-
 drivers/infiniband/sw/rdmavt/vt.c            |  7 ++++---
 drivers/infiniband/sw/rxe/rxe_verbs.c        |  6 ++++--
 20 files changed, 68 insertions(+), 51 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index cba57bb..fcaca76 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1132,7 +1132,7 @@ static int iwch_query_port(struct ib_device *ibdev,
 	dev = to_iwch_dev(ibdev);
 	netdev = dev->rdev.port_info.lldevs[port-1];

-	memset(props, 0, sizeof(struct ib_port_attr));
+	/* props being zeroed by the caller, avoid zeroing it here */
 	props->max_mtu = IB_MTU_4096;
 	if (netdev->mtu >= 4096)
 		props->active_mtu = IB_MTU_4096;
@@ -1337,13 +1337,14 @@ static int iwch_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

-	err = iwch_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;

 	return 0;
 }
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 645e606..84f0885 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -356,8 +356,7 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,

 	dev = to_c4iw_dev(ibdev);
 	netdev = dev->rdev.lldi.ports[port-1];
-
-	memset(props, 0, sizeof(struct ib_port_attr));
+	/* props being zeroed by the caller, avoid zeroing it here */
 	props->max_mtu = IB_MTU_4096;
 	if (netdev->mtu >= 4096)
 		props->active_mtu = IB_MTU_4096;
@@ -503,13 +502,14 @@ static int c4iw_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

-	err = c4iw_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;

 	return 0;
 }
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 4b7a16c..699ee54 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1399,6 +1399,7 @@ static int query_port(struct rvt_dev_info *rdi, u8 port_num,
 	struct hfi1_pportdata *ppd = &dd->pport[port_num - 1];
 	u16 lid = ppd->lid;

+	/* props being zeroed by the caller, avoid zeroing it here */
 	props->lid = lid ? lid : 0;
 	props->lmc = ppd->lmc;
 	/* OPA logical states match IB logical states */
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 764e35a..c6b5779 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -403,7 +403,7 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u8 port_num,
 	assert(port_num > 0);
 	port = port_num - 1;

-	memset(props, 0, sizeof(*props));
+	/* props being zeroed by the caller, avoid zeroing it here */

 	props->max_mtu = hr_dev->caps.max_mtu;
 	props->gid_tbl_len = hr_dev->caps.gid_table_len[port];
@@ -572,14 +572,15 @@ static int hns_roce_port_immutable(struct ib_device *ib_dev, u8 port_num,
 	struct ib_port_attr attr;
 	int ret;

-	ret = hns_roce_query_port(ib_dev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+
+	ret = ib_query_port(ib_dev, port_num, &attr);
 	if (ret)
 		return ret;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;

-	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
 	immutable->max_mad_size = IB_MGMT_MAD_SIZE;

 	return 0;
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
index 6329c97..84ae990 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
@@ -96,8 +96,7 @@ static int i40iw_query_port(struct ib_device *ibdev,
 	struct i40iw_device *iwdev = to_iwdev(ibdev);
 	struct net_device *netdev = iwdev->netdev;

-	memset(props, 0, sizeof(*props));
-
+	/* props being zeroed by the caller, avoid zeroing it here */
 	props->max_mtu = IB_MTU_4096;
 	if (netdev->mtu >= 4096)
 		props->active_mtu = IB_MTU_4096;
@@ -2359,14 +2358,15 @@ static int i40iw_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

-	err = i40iw_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;
+
+	err = ib_query_port(ibdev, port_num, &attr);

 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;

 	return 0;
 }
diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index 5e99390..0678da5 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -499,6 +499,7 @@ static int set_guid_rec(struct ib_device *ibdev,
 	struct list_head *head =
 		&dev->sriov.alias_guid.ports_guid[port - 1].cb_list;

+	memset(&attr, 0, sizeof(struct ib_port_attr));
 	err = __mlx4_ib_query_port(ibdev, port, &attr, 1);
 	if (err) {
 		pr_debug("mlx4_ib_query_port failed (err: %d), port: %d\n",
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 3902832..c0b8789f 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -737,7 +737,7 @@ int __mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
 {
 	int err;

-	memset(props, 0, sizeof *props);
+	/* props being zeroed by the caller, avoid zeroing it here */

 	err = mlx4_ib_port_link_layer(ibdev, port) == IB_LINK_LAYER_INFINIBAND ?
 		ib_link_query_port(ibdev, port, props, netw_view) :
@@ -1010,7 +1010,7 @@ static int mlx4_ib_modify_port(struct ib_device *ibdev, u8 port, int mask,

 	mutex_lock(&mdev->cap_mask_mutex);

-	err = mlx4_ib_query_port(ibdev, port, &attr);
+	err = ib_query_port(ibdev, port, &attr);
 	if (err)
 		goto out;

@@ -2523,13 +2523,6 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct mlx4_ib_dev *mdev = to_mdev(ibdev);
 	int err;

-	err = mlx4_ib_query_port(ibdev, port_num, &attr);
-	if (err)
-		return err;
-
-	immutable->pkey_tbl_len = attr.pkey_tbl_len;
-	immutable->gid_tbl_len = attr.gid_tbl_len;
-
 	if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND) {
 		immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
 		immutable->max_mad_size = IB_MGMT_MAD_SIZE;
@@ -2545,6 +2538,13 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
 			immutable->max_mad_size = IB_MGMT_MAD_SIZE;
 	}

+	err = ib_query_port(ibdev, port_num, &attr);
+	if (err)
+		return err;
+
+	immutable->pkey_tbl_len = attr.pkey_tbl_len;
+	immutable->gid_tbl_len = attr.gid_tbl_len;
+
 	return 0;
 }

diff --git a/drivers/infiniband/hw/mlx4/sysfs.c b/drivers/infiniband/hw/mlx4/sysfs.c
index 69fb5ba..5835165 100644
--- a/drivers/infiniband/hw/mlx4/sysfs.c
+++ b/drivers/infiniband/hw/mlx4/sysfs.c
@@ -226,6 +226,7 @@ static int add_port_entries(struct mlx4_ib_dev *device, int port_num)
 	int ret = 0 ;
 	struct ib_port_attr attr;

+	memset(&attr, 0, sizeof(struct ib_port_attr));
 	/* get the physical gid and pkey table sizes.*/
 	ret = __mlx4_ib_query_port(&device->ib_dev, port_num, &attr, 1);
 	if (ret)
diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c
index 39e5848..4e58b8f 100644
--- a/drivers/infiniband/hw/mlx5/mad.c
+++ b/drivers/infiniband/hw/mlx5/mad.c
@@ -515,7 +515,7 @@ int mlx5_query_mad_ifc_port(struct ib_device *ibdev, u8 port,
 	if (!in_mad || !out_mad)
 		goto out;

-	memset(props, 0, sizeof(*props));
+	/* props being zeroed by the caller, avoid zeroing it here */

 	init_query_mad(in_mad);
 	in_mad->attr_id  = IB_SMP_ATTR_PORT_INFO;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 019a7a4..a137254 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -174,7 +174,7 @@ static int mlx5_query_port_roce(struct ib_device *device, u8 port_num,
 	enum ib_mtu ndev_ib_mtu;
 	u16 qkey_viol_cntr;

-	memset(props, 0, sizeof(*props));
+	/* props being zeroed by the caller, avoid zeroing it here */

 	props->port_cap_flags  |= IB_PORT_CM_SUP;
 	props->port_cap_flags  |= IB_PORT_IP_BASED_GIDS;
@@ -793,7 +793,7 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u8 port,
 		goto out;
 	}

-	memset(props, 0, sizeof(*props));
+	/* props being zeroed by the caller, avoid zeroing it here */

 	err = mlx5_query_hca_vport_context(mdev, 0, port, 0, rep);
 	if (err)
@@ -941,7 +941,7 @@ static int mlx5_ib_modify_port(struct ib_device *ibdev, u8 port, int mask,

 	mutex_lock(&dev->cap_mask_mutex);

-	err = mlx5_ib_query_port(ibdev, port, &attr);
+	err = ib_query_port(ibdev, port, &attr);
 	if (err)
 		goto out;

@@ -2408,6 +2408,7 @@ static int get_port_caps(struct mlx5_ib_dev *dev)
 	}

 	for (port = 1; port <= MLX5_CAP_GEN(dev->mdev, num_ports); port++) {
+		memset(pprops, 0, sizeof(struct ib_port_attr));
 		err = mlx5_ib_query_port(&dev->ib_dev, port, pprops);
 		if (err) {
 			mlx5_ib_warn(dev, "query_port %d failed %d\n",
@@ -2725,13 +2726,14 @@ static int mlx5_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

-	err = mlx5_ib_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = get_core_cap_flags(ibdev);
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = get_core_cap_flags(ibdev);
 	immutable->max_mad_size = IB_MGMT_MAD_SIZE;

 	return 0;
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 358930a4..2b464fb 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -146,7 +146,7 @@ static int mthca_query_port(struct ib_device *ibdev,
 	if (!in_mad || !out_mad)
 		goto out;

-	memset(props, 0, sizeof *props);
+	/* props being zeroed by the caller, avoid zeroing it here */

 	init_query_mad(in_mad);
 	in_mad->attr_id  = IB_SMP_ATTR_PORT_INFO;
@@ -212,7 +212,7 @@ static int mthca_modify_port(struct ib_device *ibdev,
 	if (mutex_lock_interruptible(&to_mdev(ibdev)->cap_mask_mutex))
 		return -ERESTARTSYS;

-	err = mthca_query_port(ibdev, port, &attr);
+	err = ib_query_port(ibdev, port, &attr);
 	if (err)
 		goto out;

@@ -1164,13 +1164,14 @@ static int mthca_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

-	err = mthca_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
 	immutable->max_mad_size = IB_MGMT_MAD_SIZE;

 	return 0;
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index bd69125..5256b07 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -475,7 +475,7 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
 	struct nes_vnic *nesvnic = to_nesvnic(ibdev);
 	struct net_device *netdev = nesvnic->netdev;

-	memset(props, 0, sizeof(*props));
+	/* props being zeroed by the caller, avoid zeroing it here */

 	props->max_mtu = IB_MTU_4096;

@@ -3673,13 +3673,14 @@ static int nes_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

+	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;
+
 	err = nes_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IWARP;

 	return 0;
 }
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 8960715..3e43bdc 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -93,15 +93,16 @@ static int ocrdma_port_immutable(struct ib_device *ibdev, u8 port_num,
 	int err;

 	dev = get_ocrdma_dev(ibdev);
-	err = ocrdma_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+	if (ocrdma_is_udp_encap_supported(dev))
+		immutable->core_cap_flags |= RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP;
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
-	if (ocrdma_is_udp_encap_supported(dev))
-		immutable->core_cap_flags |= RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP;
 	immutable->max_mad_size = IB_MGMT_MAD_SIZE;

 	return 0;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 6af44f8..013d15c 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -210,6 +210,7 @@ int ocrdma_query_port(struct ib_device *ibdev,
 	struct ocrdma_dev *dev;
 	struct net_device *netdev;

+	/* props being zeroed by the caller, avoid zeroing it here */
 	dev = get_ocrdma_dev(ibdev);
 	if (port > 1) {
 		pr_err("%s(%d) invalid_port=0x%x\n", __func__,
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index a615142..53207ff 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -238,8 +238,8 @@ int qedr_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *attr)
 	}

 	rdma_port = dev->ops->rdma_query_port(dev->rdma_ctx);
-	memset(attr, 0, sizeof(*attr));

+	/* *attr being zeroed by the caller, avoid zeroing it here */
 	if (rdma_port->port_state == QED_RDMA_PORT_UP) {
 		attr->state = IB_PORT_ACTIVE;
 		attr->phys_state = 5;
@@ -3533,14 +3533,15 @@ int qedr_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

-	err = qedr_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE |
+				    RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE |
-				    RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
 	immutable->max_mad_size = IB_MGMT_MAD_SIZE;

 	return 0;
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 954f150..2204578 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1320,6 +1320,7 @@ static int qib_query_port(struct rvt_dev_info *rdi, u8 port_num,
 	enum ib_mtu mtu;
 	u16 lid = ppd->lid;

+	/* props being zeroed by the caller, avoid zeroing it here */
 	props->lid = lid ? lid : be16_to_cpu(IB_LID_PERMISSIVE);
 	props->lmc = ppd->lmc;
 	props->state = dd->f_iblink_state(ppd->lastibcstat);
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
index dde0b23..4f5a45d 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
@@ -321,11 +321,12 @@ static int usnic_port_immutable(struct ib_device *ibdev, u8 port_num,
 	struct ib_port_attr attr;
 	int err;

-	err = usnic_ib_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_USNIC;
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

-	immutable->core_cap_flags = RDMA_CORE_PORT_USNIC;
 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index a5bfbba..797bfe4 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -330,7 +330,7 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,

 	mutex_lock(&us_ibdev->usdev_lock);
 	__ethtool_get_link_ksettings(us_ibdev->netdev, &cmd);
-	memset(props, 0, sizeof(*props));
+	/* props being zeroed by the caller, avoid zeroing it here */

 	props->lid = 0;
 	props->lmc = 1;
diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c
index d430c2f..1165639 100644
--- a/drivers/infiniband/sw/rdmavt/vt.c
+++ b/drivers/infiniband/sw/rdmavt/vt.c
@@ -165,7 +165,7 @@ static int rvt_query_port(struct ib_device *ibdev, u8 port_num,
 		return -EINVAL;

 	rvp = rdi->ports[port_index];
-	memset(props, 0, sizeof(*props));
+	/* props being zeroed by the caller, avoid zeroing it here */
 	props->sm_lid = rvp->sm_lid;
 	props->sm_sl = rvp->sm_sl;
 	props->port_cap_flags = rvp->port_cap_flags;
@@ -326,13 +326,14 @@ static int rvt_get_port_immutable(struct ib_device *ibdev, u8 port_num,
 	if (port_index < 0)
 		return -EINVAL;

-	err = rvt_query_port(ibdev, port_num, &attr);
+	immutable->core_cap_flags = rdi->dparms.core_cap_flags;
+
+	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = rdi->dparms.core_cap_flags;
 	immutable->max_mad_size = rdi->dparms.max_mad_size;

 	return 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 19841c8..e0f4a53 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -86,6 +86,7 @@ static int rxe_query_port(struct ib_device *dev,

 	port = &rxe->port;

+	/* *attr being zeroed by the caller, avoid zeroing it here */
 	*attr = port->attr;

 	mutex_lock(&rxe->usdev_lock);
@@ -261,13 +262,14 @@ static int rxe_port_immutable(struct ib_device *dev, u8 port_num,
 	int err;
 	struct ib_port_attr attr;

-	err = rxe_query_port(dev, port_num, &attr);
+	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
+
+	err = ib_query_port(dev, port_num, &attr);
 	if (err)
 		return err;

 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
 	immutable->max_mad_size = IB_MGMT_MAD_SIZE;

 	return 0;
--
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH rdma-next 06/10] IB/core: Enable to query QP types supported by IB device on a port
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 05/10] IB: Query port through the core instead of directly calling the driver handler Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
       [not found]     ` <1480258296-27032-7-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2016-11-27 14:51   ` [PATCH rdma-next 07/10] IB/uverbs: Propagate supported QP types to user-space Leon Romanovsky
                     ` (4 subsequent siblings)
  10 siblings, 1 reply; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add a qp_type_cap port attribute, which is a bit-field representation
of the ib_qp_type enum. This allows applications to query which QP
types are supported by an IB device instance on a specific port.

The qp_type_cap port attribute is set by the core according to the
protocols supported on the device port. This holds for all the
providers, with the exception of two RoCE drivers that don't implement
UD and UC. To handle that, they (hns and qedr) are patched to remove
these QP types from what the core has set for them as supported.
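
The flow described above can be sketched in plain user-space C. This is
only an illustration, not the kernel code: the sketch_* names, the
struct sketch_port layout and the enum values are assumptions made up
for this example. What it mirrors from the patch is the order of
operations: the core fills the bit field from the port protocols first,
and a driver that lacks UD and UC clears those bits afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for ib_qp_type; the values are illustrative only. */
enum sketch_qp_type {
	SKETCH_QPT_SMI,
	SKETCH_QPT_GSI,
	SKETCH_QPT_RC,
	SKETCH_QPT_UC,
	SKETCH_QPT_UD,
	SKETCH_QPT_RAW_PACKET,
};

#define SKETCH_BIT(n) ((uint16_t)(1u << (n)))

/* Hypothetical per-port protocol description (not a kernel structure). */
struct sketch_port {
	bool has_smi;
	bool has_cm;
	bool ib_or_roce;
	bool iwarp;
	bool raw_packet;
	bool usnic;
};

/* Core side: derive the supported QP types from the port protocols. */
static uint16_t sketch_core_qp_types(const struct sketch_port *p)
{
	uint16_t cap = 0;

	if (p->has_smi)
		cap |= SKETCH_BIT(SKETCH_QPT_SMI);
	if (p->has_cm)
		cap |= SKETCH_BIT(SKETCH_QPT_GSI);
	if (p->ib_or_roce)
		cap |= SKETCH_BIT(SKETCH_QPT_RC) | SKETCH_BIT(SKETCH_QPT_UD) |
		       SKETCH_BIT(SKETCH_QPT_UC);
	if (p->iwarp)
		cap |= SKETCH_BIT(SKETCH_QPT_RC);
	if (p->raw_packet)
		cap |= SKETCH_BIT(SKETCH_QPT_RAW_PACKET);
	if (p->usnic)
		cap |= SKETCH_BIT(SKETCH_QPT_UD);

	return cap;
}

/* Driver side: a RoCE driver without UD and UC clears those two bits. */
static uint16_t sketch_driver_fixup(uint16_t cap)
{
	return cap & (uint16_t)~(SKETCH_BIT(SKETCH_QPT_UD) |
				 SKETCH_BIT(SKETCH_QPT_UC));
}
```

For a RoCE port with CM support, sketch_core_qp_types() yields GSI, RC,
UC and UD, and sketch_driver_fixup() reduces that to GSI and RC.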

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/core/device.c          | 28 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/hns/hns_roce_main.c |  3 +++
 drivers/infiniband/hw/qedr/verbs.c        |  2 ++
 include/rdma/ib_verbs.h                   |  1 +
 4 files changed, 34 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 760ef60..f7abde2 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -646,6 +646,31 @@ void ib_dispatch_event(struct ib_event *event)
 }
 EXPORT_SYMBOL(ib_dispatch_event);
 
+static void get_port_qp_types(const struct ib_device *device, u8 port_num,
+			      struct ib_port_attr *port_attr)
+{
+	if (rdma_cap_ib_smi(device, port_num))
+		port_attr->qp_type_cap |= BIT(IB_QPT_SMI);
+
+	if (rdma_cap_ib_cm(device, port_num))
+		port_attr->qp_type_cap |= BIT(IB_QPT_GSI);
+
+	if (rdma_ib_or_roce(device, port_num)) {
+		port_attr->qp_type_cap |= (BIT(IB_QPT_RC) | BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
+		if (device->attrs.device_cap_flags & IB_DEVICE_XRC)
+			port_attr->qp_type_cap |= (BIT(IB_QPT_XRC_INI) | BIT(IB_QPT_XRC_TGT));
+	}
+
+	if (rdma_protocol_iwarp(device, port_num))
+		port_attr->qp_type_cap |= BIT(IB_QPT_RC);
+
+	if (rdma_protocol_raw_packet(device, port_num))
+		port_attr->qp_type_cap |= BIT(IB_QPT_RAW_PACKET);
+
+	if (rdma_protocol_usnic(device, port_num))
+		port_attr->qp_type_cap |= BIT(IB_QPT_UD);
+}
+
 /**
  * ib_query_port - Query IB port attributes
  * @device:Device to query
@@ -666,6 +691,9 @@ int ib_query_port(struct ib_device *device,
 		return -EINVAL;
 
 	memset(port_attr, 0, sizeof(*port_attr));
+
+	get_port_qp_types(device, port_num, port_attr);
+
 	err = device->query_port(device, port_num, port_attr);
 	if (err || port_attr->subnet_prefix)
 		return err;
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index c6b5779..22de534 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -430,6 +430,9 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u8 port_num,
 			IB_PORT_ACTIVE : IB_PORT_DOWN;
 	props->phys_state = (props->state == IB_PORT_ACTIVE) ? 5 : 3;
 
+	/* mark that UD and UC aren't supported */
+	props->qp_type_cap &= ~(BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
+
 	spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
 
 	return 0;
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 53207ff..33d0219 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -263,6 +263,8 @@ int qedr_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *attr)
 	attr->max_msg_sz = rdma_port->max_msg_size;
 	attr->max_vl_num = 4;
 
+	/* mark that UD and UC aren't supported */
+	attr->qp_type_cap &= ~(BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
 	return 0;
 }
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 485b725..0b839e4 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -536,6 +536,7 @@ struct ib_port_attr {
 	u8			active_speed;
 	u8                      phys_state;
 	bool			grh_required;
+	u16			qp_type_cap;
 };
 
 enum ib_device_modify_flags {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH rdma-next 07/10] IB/uverbs: Propagate supported QP types to user-space
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (5 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 06/10] IB/core: Enable to query QP types supported by IB device on a port Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 08/10] IB/mlx5: Refactor registration to netdev notifier Leon Romanovsky
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Propagate the supported QP types to user-space applications when they
issue a query port call. From the user-space point of view, zero means
that the query is not supported, on the reasoning that it doesn't make
sense to have user-space devices that don't support any QP type. Note
that this will only happen when applications run over older kernels.

Make sure to filter out QP types which are not supported for
user-space (SMI, GSI, etc).
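
A user-space consumer could interpret the new field roughly as follows.
This is a sketch under stated assumptions: the demo_* names and enum
values are invented for the example and are not part of the uverbs ABI
or of any library. It shows the two rules from the text: the kernel
masks out SMI and GSI before copying the value out, and a zero value is
treated as "kernel too old to report", not as "nothing supported".

```c
#include <stdint.h>

/* Hypothetical stand-ins for ib_qp_type; the values are illustrative only. */
enum demo_qp_type {
	DEMO_QPT_SMI,
	DEMO_QPT_GSI,
	DEMO_QPT_RC,
	DEMO_QPT_UC,
	DEMO_QPT_UD,
	DEMO_QPT_RAW_PACKET,
};

#define DEMO_BIT(n) ((uint16_t)(1u << (n)))

/* Kernel side of the sketch: hide QP types user space cannot create. */
static uint16_t demo_export_qp_types(uint16_t kernel_cap)
{
	return kernel_cap & (uint16_t)~(DEMO_BIT(DEMO_QPT_SMI) |
					DEMO_BIT(DEMO_QPT_GSI));
}

enum demo_answer {
	DEMO_UNKNOWN,		/* field is zero: older kernel, no information */
	DEMO_SUPPORTED,
	DEMO_UNSUPPORTED,
};

/* User side: decide whether a QP type is usable on the queried port. */
static enum demo_answer demo_qp_supported(uint16_t resp_cap,
					  enum demo_qp_type type)
{
	if (resp_cap == 0)
		return DEMO_UNKNOWN;

	return (resp_cap & DEMO_BIT(type)) ? DEMO_SUPPORTED : DEMO_UNSUPPORTED;
}
```

An application that gets DEMO_UNKNOWN would have to fall back to
whatever it did before the field existed, for example simply trying to
create the QP.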

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 2 ++
 include/uapi/rdma/ib_user_verbs.h    | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index cb3f515a..1c386bc 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -526,6 +526,8 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
 	resp.phys_state      = attr.phys_state;
 	resp.link_layer      = rdma_port_get_link_layer(ib_dev,
 							cmd.port_num);
+	/* don't expose to user-space QPTs they don't know */
+	resp.qp_type_cap = attr.qp_type_cap & ~(BIT(IB_QPT_SMI) | BIT(IB_QPT_GSI));
 
 	if (copy_to_user((void __user *) (unsigned long) cmd.response,
 			 &resp, sizeof resp))
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 25225eb..a50604f 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -276,7 +276,7 @@ struct ib_uverbs_query_port_resp {
 	__u8  active_speed;
 	__u8  phys_state;
 	__u8  link_layer;
-	__u8  reserved[2];
+	__u16 qp_type_cap;
 };
 
 struct ib_uverbs_alloc_pd {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH rdma-next 08/10] IB/mlx5: Refactor registration to netdev notifier
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (6 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 07/10] IB/uverbs: Propagate supported QP types to user-space Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 09/10] IB/mlx5: Rename RoCE related helpers to reflect being Eth ones Leon Romanovsky
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Refactor the netdev notifier registration into a small helper function.

This is a pre-step towards having an mlx5 IB device over an Ethernet port
which doesn't support RoCE. Also, name the new helper and rename the
de-registration helper after the netdev notifier rather than RoCE, to
make it clear that they are not used only with RoCE.

This patch doesn't change any functionality.
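
The add/remove pairing can be sketched outside the kernel as below. All
mock_* names are hypothetical, and mock_register() merely stands in for
register_netdevice_notifier(). The point of the pattern is that the
callback pointer doubles as the "registered" flag, so the remove helper
is safe to call on any path, including after a failed add. Clearing the
pointer again in the remove helper is an assumption of this sketch.

```c
#include <stddef.h>

typedef int (*mock_notifier_fn)(void *ctx, unsigned long event);

struct mock_notifier {
	mock_notifier_fn notifier_call;
};

/* Hypothetical device; fail_register is a knob to simulate errors. */
struct mock_dev {
	struct mock_notifier nb;
	int fail_register;
	int registered;		/* number of live registrations */
};

static int mock_event(void *ctx, unsigned long event)
{
	(void)ctx;
	(void)event;
	return 0;
}

/* Stand-in for register_netdevice_notifier(). */
static int mock_register(struct mock_dev *dev)
{
	if (dev->fail_register)
		return -1;
	dev->registered++;
	return 0;
}

/* Stand-in for unregister_netdevice_notifier(). */
static void mock_unregister(struct mock_dev *dev)
{
	dev->registered--;
}

static int mock_add_netdev_notifier(struct mock_dev *dev)
{
	int err;

	dev->nb.notifier_call = mock_event;
	err = mock_register(dev);
	if (err) {
		/* Leave the pointer NULL so a later remove is a no-op. */
		dev->nb.notifier_call = NULL;
		return err;
	}

	return 0;
}

static void mock_remove_netdev_notifier(struct mock_dev *dev)
{
	if (dev->nb.notifier_call) {
		mock_unregister(dev);
		dev->nb.notifier_call = NULL;
	}
}
```

With this shape, error paths and the device removal path can both call
the remove helper unconditionally.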

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index a137254..91921f1 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2789,7 +2789,21 @@ static void mlx5_roce_lag_cleanup(struct mlx5_ib_dev *dev)
 	}
 }
 
-static void mlx5_remove_roce_notifier(struct mlx5_ib_dev *dev)
+static int mlx5_add_netdev_notifier(struct mlx5_ib_dev *dev)
+{
+	int err;
+
+	dev->roce.nb.notifier_call = mlx5_netdev_event;
+	err = register_netdevice_notifier(&dev->roce.nb);
+	if (err) {
+		dev->roce.nb.notifier_call = NULL;
+		return err;
+	}
+
+	return 0;
+}
+
+static void mlx5_remove_netdev_notifier(struct mlx5_ib_dev *dev)
 {
 	if (dev->roce.nb.notifier_call) {
 		unregister_netdevice_notifier(&dev->roce.nb);
@@ -2801,12 +2815,9 @@ static int mlx5_enable_roce(struct mlx5_ib_dev *dev)
 {
 	int err;
 
-	dev->roce.nb.notifier_call = mlx5_netdev_event;
-	err = register_netdevice_notifier(&dev->roce.nb);
-	if (err) {
-		dev->roce.nb.notifier_call = NULL;
+	err = mlx5_add_netdev_notifier(dev);
+	if (err)
 		return err;
-	}
 
 	err = mlx5_nic_vport_enable_roce(dev->mdev);
 	if (err)
@@ -2822,7 +2833,7 @@ err_disable_roce:
 	mlx5_nic_vport_disable_roce(dev->mdev);
 
 err_unregister_netdevice_notifier:
-	mlx5_remove_roce_notifier(dev);
+	mlx5_remove_netdev_notifier(dev);
 	return err;
 }
 
@@ -3186,7 +3197,7 @@ err_rsrc:
 err_disable_roce:
 	if (ll == IB_LINK_LAYER_ETHERNET) {
 		mlx5_disable_roce(dev);
-		mlx5_remove_roce_notifier(dev);
+		mlx5_remove_netdev_notifier(dev);
 	}
 
 err_free_port:
@@ -3203,7 +3214,7 @@ static void mlx5_ib_remove(struct mlx5_core_dev *mdev, void *context)
 	struct mlx5_ib_dev *dev = context;
 	enum rdma_link_layer ll = mlx5_ib_port_link_layer(&dev->ib_dev, 1);
 
-	mlx5_remove_roce_notifier(dev);
+	mlx5_remove_netdev_notifier(dev);
 	ib_unregister_device(&dev->ib_dev);
 	mlx5_ib_dealloc_q_counters(dev);
 	destroy_umrc_res(dev);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH rdma-next 09/10] IB/mlx5: Rename RoCE related helpers to reflect being Eth ones
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (7 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 08/10] IB/mlx5: Refactor registration to netdev notifier Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-11-27 14:51   ` [PATCH rdma-next 10/10] IB/mlx5: Support RAW Ethernet when RoCE is disabled Leon Romanovsky
  2016-12-14 19:06   ` [PATCH rdma-next 00/10] " Doug Ledford
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

This is a pre-step towards having an mlx5 IB device also over Eth ports
where RoCE is not supported. We change the roce enable/disable and roce_lag
init/cleanup function names to have _eth instead of _roce.

This patch doesn't change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 91921f1..b811c70 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2748,7 +2748,7 @@ static void get_dev_fw_str(struct ib_device *ibdev, char *str,
 		       fw_rev_min(dev->mdev), fw_rev_sub(dev->mdev));
 }
 
-static int mlx5_roce_lag_init(struct mlx5_ib_dev *dev)
+static int mlx5_eth_lag_init(struct mlx5_ib_dev *dev)
 {
 	struct mlx5_core_dev *mdev = dev->mdev;
 	struct mlx5_flow_namespace *ns = mlx5_get_flow_namespace(mdev,
@@ -2777,7 +2777,7 @@ err_destroy_vport_lag:
 	return err;
 }
 
-static void mlx5_roce_lag_cleanup(struct mlx5_ib_dev *dev)
+static void mlx5_eth_lag_cleanup(struct mlx5_ib_dev *dev)
 {
 	struct mlx5_core_dev *mdev = dev->mdev;
 
@@ -2811,7 +2811,7 @@ static void mlx5_remove_netdev_notifier(struct mlx5_ib_dev *dev)
 	}
 }
 
-static int mlx5_enable_roce(struct mlx5_ib_dev *dev)
+static int mlx5_enable_eth(struct mlx5_ib_dev *dev)
 {
 	int err;
 
@@ -2823,7 +2823,7 @@ static int mlx5_enable_roce(struct mlx5_ib_dev *dev)
 	if (err)
 		goto err_unregister_netdevice_notifier;
 
-	err = mlx5_roce_lag_init(dev);
+	err = mlx5_eth_lag_init(dev);
 	if (err)
 		goto err_disable_roce;
 
@@ -2837,9 +2837,9 @@ err_unregister_netdevice_notifier:
 	return err;
 }
 
-static void mlx5_disable_roce(struct mlx5_ib_dev *dev)
+static void mlx5_disable_eth(struct mlx5_ib_dev *dev)
 {
-	mlx5_roce_lag_cleanup(dev);
+	mlx5_eth_lag_cleanup(dev);
 	mlx5_nic_vport_disable_roce(dev->mdev);
 }
 
@@ -3143,14 +3143,14 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	spin_lock_init(&dev->reset_flow_resource_lock);
 
 	if (ll == IB_LINK_LAYER_ETHERNET) {
-		err = mlx5_enable_roce(dev);
+		err = mlx5_enable_eth(dev);
 		if (err)
 			goto err_free_port;
 	}
 
 	err = create_dev_resources(&dev->devr);
 	if (err)
-		goto err_disable_roce;
+		goto err_disable_eth;
 
 	err = mlx5_ib_odp_init_one(dev);
 	if (err)
@@ -3194,9 +3194,9 @@ err_odp:
 err_rsrc:
 	destroy_dev_resources(&dev->devr);
 
-err_disable_roce:
+err_disable_eth:
 	if (ll == IB_LINK_LAYER_ETHERNET) {
-		mlx5_disable_roce(dev);
+		mlx5_disable_eth(dev);
 		mlx5_remove_netdev_notifier(dev);
 	}
 
@@ -3221,7 +3221,7 @@ static void mlx5_ib_remove(struct mlx5_core_dev *mdev, void *context)
 	mlx5_ib_odp_remove_one(dev);
 	destroy_dev_resources(&dev->devr);
 	if (ll == IB_LINK_LAYER_ETHERNET)
-		mlx5_disable_roce(dev);
+		mlx5_disable_eth(dev);
 	kfree(dev->port);
 	ib_dealloc_device(&dev->ib_dev);
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH rdma-next 10/10] IB/mlx5: Support RAW Ethernet when RoCE is disabled
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (8 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 09/10] IB/mlx5: Rename RoCE related helpers to reflect being Eth ones Leon Romanovsky
@ 2016-11-27 14:51   ` Leon Romanovsky
  2016-12-14 19:06   ` [PATCH rdma-next 00/10] " Doug Ledford
  10 siblings, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-11-27 14:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

In some environments, such as certain SRIOV VF configurations, RoCE is
not supported for mlx5 Ethernet ports. Currently, the driver will not
open an IB device on that port.

This is problematic, since we do want user-space RAW Ethernet (RAW_PACKET
QPs) functionality to remain in place. To that end, enhance the relevant
driver flows such that we do create a device instance in that case.
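
The shape of the change to the enable/disable flows can be sketched as
follows. The toy_* names and the struct are hypothetical, and roce_cap
stands in for the MLX5_CAP_GEN(mdev, roce) check. The Ethernet setup
always runs, while the RoCE-specific step and its unwind are both gated
on the same capability, so a port without RoCE still ends up with a
working device.

```c
#include <stdbool.h>

/* Hypothetical device state; lag_fails is a knob to simulate an error. */
struct toy_dev {
	bool roce_cap;		/* stands in for MLX5_CAP_GEN(mdev, roce) */
	bool lag_fails;
	bool notifier_on;
	bool roce_on;
	bool lag_on;
};

static int toy_enable_eth(struct toy_dev *dev)
{
	/* Needed for any Ethernet port, with or without RoCE. */
	dev->notifier_on = true;

	/* RoCE is enabled only when the port reports the capability. */
	if (dev->roce_cap)
		dev->roce_on = true;

	if (dev->lag_fails)
		goto err_disable_roce;
	dev->lag_on = true;

	return 0;

err_disable_roce:
	/* Unwind only what was actually set up. */
	if (dev->roce_cap)
		dev->roce_on = false;
	dev->notifier_on = false;
	return -1;
}

static void toy_disable_eth(struct toy_dev *dev)
{
	dev->lag_on = false;
	if (dev->roce_cap)
		dev->roce_on = false;
}
```

Gating the unwind on the same condition as the setup keeps the error
path from disabling something that was never enabled.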

Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index b811c70..c0462d1 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2724,6 +2724,8 @@ static int mlx5_port_immutable(struct ib_device *ibdev, u8 port_num,
 			       struct ib_port_immutable *immutable)
 {
 	struct ib_port_attr attr;
+	struct mlx5_ib_dev *dev = to_mdev(ibdev);
+	enum rdma_link_layer ll = mlx5_ib_port_link_layer(ibdev, port_num);
 	int err;
 
 	immutable->core_cap_flags = get_core_cap_flags(ibdev);
@@ -2734,7 +2736,8 @@ static int mlx5_port_immutable(struct ib_device *ibdev, u8 port_num,
 
 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
-	immutable->max_mad_size = IB_MGMT_MAD_SIZE;
+	if ((ll == IB_LINK_LAYER_INFINIBAND) || MLX5_CAP_GEN(dev->mdev, roce))
+		immutable->max_mad_size = IB_MGMT_MAD_SIZE;
 
 	return 0;
 }
@@ -2819,9 +2822,11 @@ static int mlx5_enable_eth(struct mlx5_ib_dev *dev)
 	if (err)
 		return err;
 
-	err = mlx5_nic_vport_enable_roce(dev->mdev);
-	if (err)
-		goto err_unregister_netdevice_notifier;
+	if (MLX5_CAP_GEN(dev->mdev, roce)) {
+		err = mlx5_nic_vport_enable_roce(dev->mdev);
+		if (err)
+			goto err_unregister_netdevice_notifier;
+	}
 
 	err = mlx5_eth_lag_init(dev);
 	if (err)
@@ -2830,7 +2835,8 @@ static int mlx5_enable_eth(struct mlx5_ib_dev *dev)
 	return 0;
 
 err_disable_roce:
-	mlx5_nic_vport_disable_roce(dev->mdev);
+	if (MLX5_CAP_GEN(dev->mdev, roce))
+		mlx5_nic_vport_disable_roce(dev->mdev);
 
 err_unregister_netdevice_notifier:
 	mlx5_remove_netdev_notifier(dev);
@@ -2840,7 +2846,8 @@ err_unregister_netdevice_notifier:
 static void mlx5_disable_eth(struct mlx5_ib_dev *dev)
 {
 	mlx5_eth_lag_cleanup(dev);
-	mlx5_nic_vport_disable_roce(dev->mdev);
+	if (MLX5_CAP_GEN(dev->mdev, roce))
+		mlx5_nic_vport_disable_roce(dev->mdev);
 }
 
 static void mlx5_ib_dealloc_q_counters(struct mlx5_ib_dev *dev)
@@ -2962,9 +2969,6 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	port_type_cap = MLX5_CAP_GEN(mdev, port_type);
 	ll = mlx5_port_type_cap_to_rdma_ll(port_type_cap);
 
-	if ((ll == IB_LINK_LAYER_ETHERNET) && !MLX5_CAP_GEN(mdev, roce))
-		return NULL;
-
 	printk_once(KERN_INFO "%s", mlx5_version);
 
 	dev = (struct mlx5_ib_dev *)ib_alloc_device(sizeof(*dev));
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]     ` <1480258296-27032-2-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2016-11-28 17:00       ` Jason Gunthorpe
       [not found]         ` <20161128170056.GC28381-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-28 17:00 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:

> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
> +{
> +	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
> +}

Do the mlx drivers really register ports with different capabilities
as the same ib_device? I'm not sure that should be allowed.

I keep talking about how we need to get rid of the port_num in these
sorts of places because it makes no sense...

Jason

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]         ` <20161128170056.GC28381-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-11-28 17:08           ` Steve Wise
  2016-11-30  2:07             ` Doug Ledford
  2016-11-28 20:57           ` Or Gerlitz
  1 sibling, 1 reply; 53+ messages in thread
From: Steve Wise @ 2016-11-28 17:08 UTC (permalink / raw)
  To: 'Jason Gunthorpe', 'Leon Romanovsky'
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

> 
> On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
> 
> > +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
> > +{
> > +	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
> > +}
> 
> Does the mlx drivers really register ports with different capabilities
> as the same ib_device? I'm not sure that should be allowed.
> 
> I keep talking about how we need to get rid of the port_num in these
> sorts of places because it makes no sense...
> 

I agree.   Requiring the port number has implications that ripple up into the
rdma-rw api as well...
 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]         ` <20161128170056.GC28381-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-11-28 17:08           ` Steve Wise
@ 2016-11-28 20:57           ` Or Gerlitz
       [not found]             ` <CAJ3xEMiv6HCu-9fi12XtafxYWu-+gNPMbnfb-A4-+FrgR6KZNA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 53+ messages in thread
From: Or Gerlitz @ 2016-11-28 20:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Steve Wise, Mike Marciniszyn, Dennis Dalessandro, Lijun Ou,
	Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

On Mon, Nov 28, 2016 at 7:00 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
>
>> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
>> +{
>> +     return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
>> +}
>
> Does the mlx drivers really register ports with different capabilities
> as the same ib_device? I'm not sure that should be allowed.

For mlx4, yes (practically for the last ~10 years); for instance, Eth ports
don't support SMI. This goes back to the fact that mlx4 devices are a
single PCI function with potentially two ports, and each port can be
set to a different link layer. But this no longer holds for mlx5: these
devices are function-per-port and hence IB device per port. I guess we
have to swallow that pill and move on, as newer devices don't have this
behavior, okay?

Or.

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]             ` <CAJ3xEMiv6HCu-9fi12XtafxYWu-+gNPMbnfb-A4-+FrgR6KZNA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-11-28 22:25               ` Jason Gunthorpe
       [not found]                 ` <20161128222559.GB744-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-28 22:25 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Leon Romanovsky, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Steve Wise, Mike Marciniszyn, Dennis Dalessandro, Lijun Ou,
	Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

On Mon, Nov 28, 2016 at 10:57:00PM +0200, Or Gerlitz wrote:
> On Mon, Nov 28, 2016 at 7:00 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
> >
> >> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
> >> +{
> >> +     return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
> >> +}
> >
> > Does the mlx drivers really register ports with different capabilities
> > as the same ib_device? I'm not sure that should be allowed.
> 
> mlx4 yeah (practically for the last ~10 years) for instance Eth ports
> don't support SMI -- this goes back to the fact that mlx4 devices are
> single PCI function with potentially two ports and each port can be

struct ib_device is not linked to a PCI device. AFAIK mlx4 created one
ib_device for RoCE ports and one for IB, or at least it should, or
things are already broken.

> set to different link layer. But this no more holds for mlx5, these
> devices are function-per-port and hence IB device per port.

Since it has nothing to do with pci devices, please do this properly.

Jason

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                 ` <20161128222559.GB744-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-11-29  6:35                   ` Or Gerlitz
       [not found]                     ` <CAJ3xEMiC3UDujgSL5fwP7ee1=OjhorZ2aeB1k+ptGb9GWaUVkg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Or Gerlitz @ 2016-11-29  6:35 UTC (permalink / raw)
  To: Jason Gunthorpe, Yishai Hadas, Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Moni Shoua

On Tue, Nov 29, 2016 at 12:25 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Mon, Nov 28, 2016 at 10:57:00PM +0200, Or Gerlitz wrote:
>> On Mon, Nov 28, 2016 at 7:00 PM, Jason Gunthorpe
>> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:

>> > Does the mlx drivers really register ports with different capabilities
>> > as the same ib_device? I'm not sure that should be allowed.

>> mlx4 yeah (practically for the last ~10 years) for instance Eth ports
>> don't support SMI -- this goes back to the fact that mlx4 devices are
>> single PCI function with potentially two ports and each port can be

> struct ib_device is not linked to a PCI device. AFAIK mlx4 created one
> ib_device for rocee ports and one for IB,

no, it doesn't

> or at least it should or things are already broken.

Can you elaborate on what is broken with mlx4? I suspect we
even have some functionality down there which depends on that,
but again, I would first like to hear if/what is broken - I copied the maintainer.

>> set to different link layer. But this no more holds for mlx5, these
>> devices are function-per-port and hence IB device per port.

> Since it has nothing to do with pci devices, please do this properly.

again, mlx5 does this already.

Jason, patches 8-10 which carry the functional change I want to introduce
(allow mlx5 IB devices to be created when RoCE is not supported) stand
for themselves.

As I wrote, the stack is fully functional (i.e. no errors in the IB core
etc.) when only these patches are applied. E.g. things behave in a similar
manner to all the upstream iWARP drivers, which refuse to create any QP
that is not RC.

I am okay with putting aside patches 1-7, which I added per Doug's request
so that user-space applications can query which QP types are supported on a
device, or with getting patches 1-7 in, whatever works better for Doug and
people. I don't think it's fair to ask for a rewrite of the 10-year-old way
mlx4 sets up its IB devices just to be able to introduce this reduced
functionality (raw packet QP only) of mlx5 devices.

Or.

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                     ` <CAJ3xEMiC3UDujgSL5fwP7ee1=OjhorZ2aeB1k+ptGb9GWaUVkg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-11-29 16:19                       ` Jason Gunthorpe
  0 siblings, 0 replies; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-29 16:19 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Yishai Hadas, Matan Barak, Doug Ledford,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Moni Shoua

On Tue, Nov 29, 2016 at 08:35:52AM +0200, Or Gerlitz wrote:
> > or at least it should or things are already broken.
> 
> Can you elaborate what it broken with mlx4? I suspect we
> even have down there some functionality which depends on that,
> but again, I 1st would like to hear if/what is broken - I copied the maintainer.

The semantic we require is that everything under a struct ib_device is
compatible: you can use an AH on any PD with any QP on any port. As an
example, we absolutely do not allow iWARP and IB on the same
ib_device (rdma_cm will break horribly, at least).

The fact that RoCE and IB sort of work together as-is represents a fluke,
not a design goal :(
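
To make the compatibility rule concrete, here is a minimal user-space
sketch. It is not taken from the series: the mock_* types, the
MOCK_CAP_PROT_* bit values and the 1-based port indexing are assumptions
for illustration only, and only the idea of a per-port core_cap_flags mask
comes from the quoted patch. It shows how a device-level query (no
port_num) could be answered, and how a device with mixed ports would be
detected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protocol bits; the kernel's RDMA_CORE_CAP_PROT_* values differ. */
#define MOCK_CAP_PROT_IB         (1u << 0)
#define MOCK_CAP_PROT_ROCE       (1u << 1)
#define MOCK_CAP_PROT_IWARP      (1u << 2)
#define MOCK_CAP_PROT_RAW_PACKET (1u << 3)

#define MOCK_MAX_PORTS 2

struct mock_port_immutable {
	uint32_t core_cap_flags;
};

struct mock_ib_device {
	unsigned int num_ports;
	/* Assumption: ports are numbered from 1, so slot 0 is unused. */
	struct mock_port_immutable port_immutable[MOCK_MAX_PORTS + 1];
};

/* True when every port advertises exactly the same protocol set. */
bool mock_device_is_homogeneous(const struct mock_ib_device *dev)
{
	for (unsigned int p = 2; p <= dev->num_ports; p++) {
		if (dev->port_immutable[p].core_cap_flags !=
		    dev->port_immutable[1].core_cap_flags)
			return false;
	}
	return true;
}

/* Device-level query without port_num: true only if all ports set the bit. */
bool mock_device_has_protocol(const struct mock_ib_device *dev, uint32_t prot)
{
	if (dev->num_ports == 0)
		return false;
	for (unsigned int p = 1; p <= dev->num_ports; p++) {
		if (!(dev->port_immutable[p].core_cap_flags & prot))
			return false;
	}
	return true;
}
```

With a two-port device where port 1 is IB and port 2 is RoCE plus raw
packet (the mlx4-style case discussed here), the first helper reports the
device as not homogeneous, and a device-wide raw packet query is false even
though port 2 supports it. That gap is what the port_num argument papers
over.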

> Jason, patches 8-10 which carry the functional change I want to introduce
> (allow mlx5 IB devices to be created when RoCE is not supported) stand
> for themselves.

I asked for the port_num to be removed, and the uapi for it to be
device not port specific. Don't see why that is such a big deal..

> a re-write of the 10y old IB device-ing of things done by mlx4 just
> to be able and introduce this reduced functionally (raw packet qp
> only) of mlx5 devices.

Seems totally fair to me for patches adding a new uapi. Why do you
guys keep thinking uapis should be easy?

Who else is going to fix this?

Jason

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
  2016-11-28 17:08           ` Steve Wise
@ 2016-11-30  2:07             ` Doug Ledford
       [not found]               ` <cfdf28c6-4715-28d4-7da6-453fb6794c29-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Doug Ledford @ 2016-11-30  2:07 UTC (permalink / raw)
  To: Steve Wise, 'Jason Gunthorpe', 'Leon Romanovsky'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'



On 11/28/2016 12:08 PM, Steve Wise wrote:
>>
>> On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
>>
>>> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
>>> +{
>>> +	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
>>> +}
>>
>> Does the mlx drivers really register ports with different capabilities
>> as the same ib_device? I'm not sure that should be allowed.
>>
>> I keep talking about how we need to get rid of the port_num in these
>> sorts of places because it makes no sense...
>>
> 
> I agree.   Requiring the port number has implications that ripple up into the
> rdma-rw api as well...
>  
> 

In all fairness, there is no requirement that any two ports on the same
device be the same link layer, or if the link layer is Ethernet, there
is no requirement that they can't support both iWARP and RoCE.  The idea
that the parent device defined the supported protocols for all ports of
a device became wrong with the first mlx4 device that could do both IB
and Ethernet.  And I think I've heard rumblings of a combined RoCE/iWARP
device possibly in the future from someone else.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD



* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]               ` <cfdf28c6-4715-28d4-7da6-453fb6794c29-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-11-30  2:33                 ` Tom Talpey
       [not found]                   ` <5927e04b-42ec-52c1-88a3-456cc4409334-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
  2016-11-30 16:29                 ` Hefty, Sean
  2016-11-30 16:36                 ` Jason Gunthorpe
  2 siblings, 1 reply; 53+ messages in thread
From: Tom Talpey @ 2016-11-30  2:33 UTC (permalink / raw)
  To: Doug Ledford, Steve Wise, 'Jason Gunthorpe',
	'Leon Romanovsky'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

On 11/29/2016 9:07 PM, Doug Ledford wrote:
> On 11/28/2016 12:08 PM, Steve Wise wrote:
>>>
>>> On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
>>>
>>>> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
>>>> +{
>>>> +	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
>>>> +}
>>>
>>> Does the mlx drivers really register ports with different capabilities
>>> as the same ib_device? I'm not sure that should be allowed.
>>>
>>> I keep talking about how we need to get rid of the port_num in these
>>> sorts of places because it makes no sense...
>>>
>>
>> I agree.   Requiring the port number has implications that ripple up into the
>> rdma-rw api as well...
>>
>>
>
> In all fairness, there is no requirement that any two ports on the same
> device be the same link layer, or if the link layer is Ethernet, there
> is no requirement that they can't support both iWARP and RoCE.  The idea
> that the parent device defined the supported protocols for all ports of
> a device became wrong with the first mlx4 device that could do both IB
> and Ethernet.  And I think I've heard rumblings of a combined RoCE/iWARP
> device possibly in the future from someone else.

This one for instance?

http://www.qlogic.com/Resources/Documents/DataSheets/Adapters/DataSheet_QLE45211HL_QLE45212HL.pdf

I'd love to see any such device support protocol choice per
connection, not just per port. That of course would have
implications on the rdma connection manager api.


* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                   ` <5927e04b-42ec-52c1-88a3-456cc4409334-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
@ 2016-11-30 16:18                     ` Doug Ledford
  2016-11-30 16:30                     ` Liran Liss
  1 sibling, 0 replies; 53+ messages in thread
From: Doug Ledford @ 2016-11-30 16:18 UTC (permalink / raw)
  To: Tom Talpey, Steve Wise, 'Jason Gunthorpe',
	'Leon Romanovsky'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'



On 11/29/2016 9:33 PM, Tom Talpey wrote:
> On 11/29/2016 9:07 PM, Doug Ledford wrote:
>> On 11/28/2016 12:08 PM, Steve Wise wrote:
>>>>
>>>> On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
>>>>
>>>>> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
>>>>> +{
>>>>> +    return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
>>>>> +}
>>>>
>>>> Does the mlx drivers really register ports with different capabilities
>>>> as the same ib_device? I'm not sure that should be allowed.
>>>>
>>>> I keep talking about how we need to get rid of the port_num in these
>>>> sorts of places because it makes no sense...
>>>>
>>>
>>> I agree.   Requiring the port number has implications that ripple up
>>> into the
>>> rdma-rw api as well...
>>>
>>>
>>
>> In all fairness, there is no requirement that any two ports on the same
>> device be the same link layer, or if the link layer is Ethernet, there
>> is no requirement that they can't support both iWARP and RoCE.  The idea
>> that the parent device defined the supported protocols for all ports of
>> a device became wrong with the first mlx4 device that could do both IB
>> and Ethernet.  And I think I've heard rumblings of a combined RoCE/iWARP
>> device possibly in the future from someone else.
> 
> This one for instance?
> 
> http://www.qlogic.com/Resources/Documents/DataSheets/Adapters/DataSheet_QLE45211HL_QLE45212HL.pdf
> 
> 
> I'd love to see any such device support protocol choice per
> connection, not just per port. That of course would have
> implications on the rdma commection manager api.
> 

That's certainly a prime example, thanks Tom ;-)

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD



* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]               ` <cfdf28c6-4715-28d4-7da6-453fb6794c29-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-11-30  2:33                 ` Tom Talpey
@ 2016-11-30 16:29                 ` Hefty, Sean
  2016-11-30 16:36                 ` Jason Gunthorpe
  2 siblings, 0 replies; 53+ messages in thread
From: Hefty, Sean @ 2016-11-30 16:29 UTC (permalink / raw)
  To: Doug Ledford, Steve Wise, 'Jason Gunthorpe',
	'Leon Romanovsky'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)', Latif, Faisal, 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

> In all fairness, there is no requirement that any two ports on the same
> device be the same link layer, or if the link layer is Ethernet, there
> is no requirement that they can't support both iWARP and RoCE.  The
> idea
> that the parent device defined the supported protocols for all ports of
> a device became wrong with the first mlx4 device that could do both IB
> and Ethernet.  And I think I've heard rumblings of a combined
> RoCE/iWARP
> device possibly in the future from someone else.

It would help if the community didn't continually redefine terms based on the latest set of patches or whims or random hardware feature.  At one time an ib_device meant an actual IB device - go figure.  Now it's not even a device, but some weird abstract collection of ports that all support the same transport, or was it link layer, or ... I really have no idea now.  The RDMA subsystem really needs to figure out what it wants to be, because the term RDMA doesn't even apply to all of the devices that it supports.  And now we're at the point of arguing over where drivers should go because no one even knows that anymore.

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                   ` <5927e04b-42ec-52c1-88a3-456cc4409334-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
  2016-11-30 16:18                     ` Doug Ledford
@ 2016-11-30 16:30                     ` Liran Liss
       [not found]                       ` <HE1PR0501MB28124286F8D902C49596EF85B18C0-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  1 sibling, 1 reply; 53+ messages in thread
From: Liran Liss @ 2016-11-30 16:30 UTC (permalink / raw)
  To: Tom Talpey, Doug Ledford, Steve Wise, 'Jason Gunthorpe',
	'Leon Romanovsky'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni Shoua, Or Gerlitz

> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Tom Talpey

> >
> > In all fairness, there is no requirement that any two ports on the
> > same device be the same link layer, or if the link layer is Ethernet,
> > there is no requirement that they can't support both iWARP and RoCE.
> > The idea that the parent device defined the supported protocols for
> > all ports of a device became wrong with the first mlx4 device that
> > could do both IB and Ethernet.  And I think I've heard rumblings of a
> > combined RoCE/iWARP device possibly in the future from someone else.
> 
> This one for instance?
> 
> http://www.qlogic.com/Resources/Documents/DataSheets/Adapters/DataSheet_QLE45211HL_QLE45212HL.pdf
> 
> I'd love to see any such device support protocol choice per connection, not just
> per port. That of course would have implications on the rdma commection
> manager api.
> 

Exactly. If/when such devices appear, we would need to extend connection management to specify the protocol, rather than infer it from the port space.
It would be perfectly sensible to use both RoCE and iWARP over the same physical Ethernet port and the same source IP address.

Rethinking the uAPI, maybe we should report a protocol bit-mask similar to the kernel's, instead of QP types?
This would provide all the required information (e.g., any combination of RoCEv1/v2, iWARP, and Raw Ethernet for Ethernet links) for today's use-cases as well as tomorrow's combined RoCE/iWARP devices.
--Liran
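
As a rough illustration of why a protocol bit-mask could replace a QP-type
report: the cover letter notes that QP types are basically derived from the
protocols supported on the port, so a consumer could compute them. The
sketch below assumes hypothetical MOCK_* bit values and a simplified
mapping (IB and RoCE offering RC/UC/UD, iWARP offering RC only, as
mentioned elsewhere in this thread); it is not the mapping used by the
series.

```c
#include <stdint.h>

/* Hypothetical protocol bits (not the kernel's values). */
#define MOCK_PROT_IB         (1u << 0)
#define MOCK_PROT_ROCE       (1u << 1)
#define MOCK_PROT_IWARP      (1u << 2)
#define MOCK_PROT_RAW_PACKET (1u << 3)

/* Hypothetical QP-type bits. */
#define MOCK_QPT_RC          (1u << 0)
#define MOCK_QPT_UC          (1u << 1)
#define MOCK_QPT_UD          (1u << 2)
#define MOCK_QPT_RAW_PACKET  (1u << 3)

/* Derive the set of creatable QP types from a port's protocol mask. */
uint32_t mock_qp_types_from_protocols(uint32_t prot_mask)
{
	uint32_t qpt = 0;

	/* Simplification: treat IB and RoCE as offering the same QP types. */
	if (prot_mask & (MOCK_PROT_IB | MOCK_PROT_ROCE))
		qpt |= MOCK_QPT_RC | MOCK_QPT_UC | MOCK_QPT_UD;
	/* iWARP drivers are described in this thread as RC-only. */
	if (prot_mask & MOCK_PROT_IWARP)
		qpt |= MOCK_QPT_RC;
	if (prot_mask & MOCK_PROT_RAW_PACKET)
		qpt |= MOCK_QPT_RAW_PACKET;

	return qpt;
}
```

In this model, an mlx5 Ethernet port with RoCE disabled would report only
the raw packet bit and map to a single QP type, while a combined
RoCE/iWARP port would carry both protocol bits without needing a new
field.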


* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]               ` <cfdf28c6-4715-28d4-7da6-453fb6794c29-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-11-30  2:33                 ` Tom Talpey
  2016-11-30 16:29                 ` Hefty, Sean
@ 2016-11-30 16:36                 ` Jason Gunthorpe
       [not found]                   ` <20161130163621.GB24639-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-30 16:36 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Steve Wise, 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

On Tue, Nov 29, 2016 at 09:07:52PM -0500, Doug Ledford wrote:
> On 11/28/2016 12:08 PM, Steve Wise wrote:
> >>
> >> On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
> >>
> >>> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
> >>> +{
> >>> +	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
> >>> +}
> >>
> >> Does the mlx drivers really register ports with different capabilities
> >> as the same ib_device? I'm not sure that should be allowed.
> >>
> >> I keep talking about how we need to get rid of the port_num in these
> >> sorts of places because it makes no sense...
> >>
> > 
> > I agree.   Requiring the port number has implications that ripple up into the
> > rdma-rw api as well...
> >  
> > 
> 
> In all fairness, there is no requirement that any two ports on the same
> device be the same link layer, or if the link layer is Ethernet, there
> is no requirement that they can't support both iWARP and RoCE.

There actually is a requirement. The RDMA CM hard requires all ports
be iWARP or !iWARP at least. I'm sure there are other subtle things
floating around.

There are also things that become very confusing for user space, and
we don't have the infrastructure to support, if ports can switch
configurations on the fly.

The simplest approach, most in line with how verbs was designed, is
to require each ib_device to have a single kind of AH.

> The idea that the parent device defined the supported protocols for
> all ports of a device became wrong with the first mlx4 device that

Arguably it was sort of OK for RoCEv1, is less OK for v2, but
shouldn't have been done anyhow.

The uapi question here is do we want to double down and try and make
this work (and what does that even *mean*) or admit mlx4 was an error
and stop doing that going forward..

Or do something else? eg Specifying the AH type when creating the PD
could potentially solve some of the problems...

> could do both IB and Ethernet.  And I think I've heard rumblings of
> a combined RoCE/iWARP device possibly in the future from someone
> else.

Two struct ib_devices for the same port then... Certainly ugly.

Jason

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                       ` <HE1PR0501MB28124286F8D902C49596EF85B18C0-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2016-11-30 16:39                         ` Jason Gunthorpe
       [not found]                           ` <20161130163949.GC24639-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-30 16:39 UTC (permalink / raw)
  To: Liran Liss
  Cc: Tom Talpey, Doug Ledford, Steve Wise, 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni Shoua, Or Gerlitz

On Wed, Nov 30, 2016 at 04:30:09PM +0000, Liran Liss wrote:
> > I'd love to see any such device support protocol choice per connection, not just
> > per port. That of course would have implications on the rdma commection
> > manager api.
 
> Exactly. If/when such devices appear, we would need to extend
> connection management to specify the protocol, rather than infer it
> from the port space.

We support that perfectly today as long as the port creates two 'struct
ib_device's. Anything else will require some kind of change to the
libibverbs API to specify the AH style.

> Rethinking about the uAPI, maybe we should report a protocol
> bit-mask similar to the kernel's, instead of QP types?  This would
> provide all the required information (e.g., any combination of
> RoCEv1/v2, iWARP, and Raw Ethernet for Ethernet links) for today's
> use-cases as well as tomorrow's combined RoCE/iWARP devices.

Maybe we should dump this uapi stuff until Matan's patches are
done. The introspection possible with Matan's work is flexible enough
to cope with more cases..

Jason

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                           ` <20161130163949.GC24639-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-11-30 16:59                             ` Or Gerlitz
  2016-11-30 17:01                             ` Liran Liss
  1 sibling, 0 replies; 53+ messages in thread
From: Or Gerlitz @ 2016-11-30 16:59 UTC (permalink / raw)
  To: Liran Liss, Doug Ledford
  Cc: Jason Gunthorpe, Tom Talpey, Steve Wise,
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni Shoua

On 11/30/2016 6:39 PM, Jason Gunthorpe wrote:
> Maybe we should dump this uapi stuff until Matan's patches are
> done. The introspection possible with Matan's work is flexible enough
> to cope with more cases..

Basically I am OK with that approach too.

Doug, if you are willing to take the mlx5 patches that enable an mlx5 IB
device over an Eth port that doesn't support RoCE (8, 9, 10; I will have to
do some rebasing) and to discuss the query when the new ABI code gets
closer to upstream, that's fine too.

Or.


* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                           ` <20161130163949.GC24639-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-11-30 16:59                             ` Or Gerlitz
@ 2016-11-30 17:01                             ` Liran Liss
       [not found]                               ` <HE1PR0501MB28128CDB112558C5980CB638B18C0-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  1 sibling, 1 reply; 53+ messages in thread
From: Liran Liss @ 2016-11-30 17:01 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Tom Talpey, Doug Ledford, Steve Wise, 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni Shoua, Or

> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]

> 
> > Exactly. If/when such devices appear, we would need to extend
> > connection management to specify the protocol, rather than infer it
> > from the port space.
> 
> We support that perfectly today as long as the port creates two 'struct
> ib_devices'. Anything else will require some kind of changes to libibverb's API to
> specify the AH style.
> 

rdmacm would still have to choose between these ib_devices somehow - this doesn't change anything.
Also, AHs are already port-specific. So, I don't see any issue in this regard.

In any case, we have millions of multi-port devices that can use different link types deployed.
This is the specification, and more such devices could appear in the future.
We cannot change the device model.

> > Rethinking about the uAPI, maybe we should report a protocol bit-mask
> > similar to the kernel's, instead of QP types?  This would provide all
> > the required information (e.g., any combination of RoCEv1/v2, iWARP,
> > and Raw Ethernet for Ethernet links) for today's use-cases as well as
> > tomorrow's combined RoCE/iWARP devices.
> 
> Maybe we should dump this uapi stuff until Matan's patches are done. The
> introspection possible with Matan's work is flexible enough to cope with more
> cases..

No doubt that the new ABI would be a lot more flexible and self-describing.
But it would take a while until we port everything to use it.
So, generally, I don't see any problem using the current extensibility capabilities to support useful semantics.

> 
> Jason

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                               ` <HE1PR0501MB28128CDB112558C5980CB638B18C0-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2016-11-30 17:08                                 ` Jason Gunthorpe
       [not found]                                   ` <20161130170830.GA17512-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-30 17:08 UTC (permalink / raw)
  To: Liran Liss
  Cc: Tom Talpey, Doug Ledford, Steve Wise, 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni Shoua, Or Gerlitz

On Wed, Nov 30, 2016 at 05:01:32PM +0000, Liran Liss wrote:
> > From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> 
> > 
> > > Exactly. If/when such devices appear, we would need to extend
> > > connection management to specify the protocol, rather than infer it
> > > from the port space.
> > 
> > We support that perfectly today as long as the port creates two 'struct
> > ib_devices'. Anything else will require some kind of changes to the libibverbs API to
> > specify the AH style.
> > 
> 
> rdmacm would still have to choose between these ib_devices somehow

Each ib_device is either iWARP or RoCE; the rdma cm would route iWARP
stuff to the iWARP one and RoCE stuff to the RoCE one. Not really a
problem with today's architecture.

> - this doesn't change anything.  Also, AHs are already
> port-specific. So, I don't see any issue in this regard.

The current scheme infers the protocol of the AH from the current
configuration of the port, which is a crazy API when the port's
protocol can change on the fly.

> In any case, we have millions of multi-port devices that can use
> different link types deployed.  This is the specification, and more
> such devices could appear in the future.  We cannot change the
> device model.

Of course we can change how they are modeled in Linux, it is just
software.

> No doubt that the new ABI would be a lot more flexible and
> self-describing.  But it would take a while until we port everything
> to use it.  So, generally, I don't see any problem using the current
> extensibility capabilities to support useful semantics.

Perhaps a moratorium on some changes to the current uAPI will
encourage the new one to get finished :P

Jason

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                   ` <20161130170830.GA17512-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-11-30 17:25                                     ` Hefty, Sean
       [not found]                                       ` <1828884A29C6694DAF28B7E6B8A82373AB0BA190-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Hefty, Sean @ 2016-11-30 17:25 UTC (permalink / raw)
  To: Jason Gunthorpe, Liran Liss
  Cc: Tom Talpey, Doug Ledford, Steve Wise, 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni Shoua, Or

> > - this doesn't change anything.  Also, AHs are already
> > port-specific. So, I don't see any issue in this regard.
> 
> The current scheme infers the protocol of the AH from the current
> configuration of the port, which is a crazy API when the port's
> protocol can change on the fly.

Maybe the solution is to make the protocol selection explicit throughout the APIs and associate it with a QP, rather than attempting to list all transport protocols that a port can support.

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                       ` <1828884A29C6694DAF28B7E6B8A82373AB0BA190-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-11-30 17:27                                         ` Steve Wise
  2016-11-30 17:30                                           ` Hefty, Sean
  2016-11-30 17:32                                         ` Jason Gunthorpe
  1 sibling, 1 reply; 53+ messages in thread
From: Steve Wise @ 2016-11-30 17:27 UTC (permalink / raw)
  To: 'Hefty, Sean', 'Jason Gunthorpe', 'Liran Liss'
  Cc: 'Tom Talpey', 'Doug Ledford',
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Marciniszyn, Mike', 'Dalessandro, Dennis',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Latif, Faisal', 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

> 
> > > - this doesn't change anything.  Also, AHs are already
> > > port-specific. So, I don't see any issue in this regard.
> >
> > The current scheme infers the protocol of the AH from the current
> > configuration of the port, which is a crazy API when the port's
> > protocol can change on the fly.
> 
> Maybe the solution is to make the protocol selection explicit throughout the
> APIs and associate it with a QP, rather than attempting to list all transport
> protocols that a port can support.

Do you mean requiring the application to pick the protocol?


* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
  2016-11-30 17:27                                         ` Steve Wise
@ 2016-11-30 17:30                                           ` Hefty, Sean
       [not found]                                             ` <1828884A29C6694DAF28B7E6B8A82373AB0BA1B7-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Hefty, Sean @ 2016-11-30 17:30 UTC (permalink / raw)
  To: Steve Wise, 'Jason Gunthorpe', 'Liran Liss'
  Cc: 'Tom Talpey', 'Doug Ledford',
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)', Latif, Faisal, 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

> > > > - this doesn't change anything.  Also, AHs are already
> > > > port-specific. So, I don't see any issue in this regard.
> > >
> > > The current scheme infers the protocol of the AH from the current
> > > configuration of the port, which is a crazy API when the port's
> > > protocol can change on the fly.
> >
> > Maybe the solution is to make the protocol selection explicit
> > throughout the APIs and associate it with a QP, rather than
> > attempting to list all transport protocols that a port can support.
> 
> Do you mean requiring the application to pick the protocol?

Yes - it seems necessary to support devices with RoCE and iWARP running on the same port.

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                       ` <1828884A29C6694DAF28B7E6B8A82373AB0BA190-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-11-30 17:27                                         ` Steve Wise
@ 2016-11-30 17:32                                         ` Jason Gunthorpe
       [not found]                                           ` <20161130173211.GA9067-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  1 sibling, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-30 17:32 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Liran Liss, Tom Talpey, Doug Ledford, Steve Wise,
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni

On Wed, Nov 30, 2016 at 05:25:18PM +0000, Hefty, Sean wrote:
> > > - this doesn't change anything.  Also, AHs are already
> > > port-specific. So, I don't see any issue in this regard.
> > 
> > The current scheme infers the protocol of the AH from the current
> > configuration of the port, which is a crazy API when the port's
> > protocol can change on the fly.
> 
> Maybe the solution is to make the protocol selection explicit
> throughout the APIs and associate it with a QP, rather than
> attempting to list all transport protocols that a port can support.

AHs are linked to a PD, not a QP.

If we had to do it again, a PD centric approach would be more
sensible:

  // Create a PD on 'port' using ah format 'protocol'
  pd = ibv_pd_create(port, enum ah_type protocol);

  // Enable APM or resource sharing on the PD across two ports
  ibv_pd_add_port(pd, alt_port);

And get rid of the multi-port ib_device concept entirely.

Jason

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                           ` <20161130173211.GA9067-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-11-30 18:17                                             ` Hefty, Sean
       [not found]                                               ` <1828884A29C6694DAF28B7E6B8A82373AB0BA20F-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Hefty, Sean @ 2016-11-30 18:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Liran Liss, Tom Talpey, Doug Ledford, Steve Wise,
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni

> > Maybe the solution is to make the protocol selection explicit
> > throughout the APIs and associate it with a QP, rather than
> > attempting to list all transport protocols that a port can support.
> 
> AH's are linked to a PD, not a QP..

But protocols are associated with QPs.  I'm not sure *why* an AH is linked to a PD.  Conceptually the association seems unnecessary.

> If we had to do it again, a PD centric approach would be more
> sensible:
> 
>   // Create a PD on 'port' using ah format 'protocol'
>   pd = ibv_pd_create(port, enum ah_type protocol);
> 
>   // Enable APM or resource sharing on the PD across two ports
>   ibv_pd_add_port(pd, alt_port);
> 
> And get rid of the multi-port ib_device concept entirely.

I think no matter what option is chosen, some limitation may end up being placed on how a device is used that may not map to hardware limits.  I'm personally fine with that, but we need to define the right level of abstraction (and pick the right terms to describe them).

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                               ` <1828884A29C6694DAF28B7E6B8A82373AB0BA20F-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-11-30 18:34                                                 ` Jason Gunthorpe
       [not found]                                                   ` <20161130183436.GA10057-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-30 18:34 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Liran Liss, Tom Talpey, Doug Ledford, Steve Wise,
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni

On Wed, Nov 30, 2016 at 06:17:34PM +0000, Hefty, Sean wrote:
> > > Maybe the solution is to make the protocol selection explicit
> > > throughout the APIs and associate it with a QP, rather than
> > > attempting to list all transport protocols that a port can support.
> > 
> > AH's are linked to a PD, not a QP..
> 
> But protocols are associated with QPs.  I'm not sure *why* an AH is
> linked to a PD.  Conceptually the association seems unnecessary.

The AH has to be linked to the PD because the PD specifies the
hardware target and the AH is a hardware object. All objects must be
traced back to a PD. It is linked to a PD, not a QP, because the AH can
be shared across all QPs, which is useful for UD applications.

The AH is really similar to the ethernet layer in an IP stack, and the QP
is akin to the TCP layer.

Jason

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                   ` <20161130183436.GA10057-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-11-30 18:46                                                     ` Hefty, Sean
       [not found]                                                       ` <1828884A29C6694DAF28B7E6B8A82373AB0BA24D-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Hefty, Sean @ 2016-11-30 18:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Liran Liss, Tom Talpey, Doug Ledford, Steve Wise,
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni

> > But protocols are associated with QPs.  I'm not sure *why* an AH is
> > linked to a PD.  Conceptually the association seems unnecessary.
> 
> The AH has to be linked to the PD because the PD specifies the
> hardware target and the AH is a hardware object. All objects must be
> traced back to a PD. It is linked to a PD not a QP because the AH can
> be shared across all QPs, which is useful for UD applications..
> 
> The AH is really similar to the ethernet layer in an IP stack, and the
> QP is akin to the TCP layer.

I always viewed the AH as specifying the route to the destination port.  Locally, I still don't see why it couldn't work across the device, like the CQ does.  Associating it with a PD just seems to limit which QPs, and by association the memory buffers, it can be used with.  So, I'm still not seeing the reason for the linkage...

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                       ` <1828884A29C6694DAF28B7E6B8A82373AB0BA24D-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-11-30 18:58                                                         ` Jason Gunthorpe
  0 siblings, 0 replies; 53+ messages in thread
From: Jason Gunthorpe @ 2016-11-30 18:58 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Liran Liss, Tom Talpey, Doug Ledford, Steve Wise,
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni

On Wed, Nov 30, 2016 at 06:46:13PM +0000, Hefty, Sean wrote:
> > > But protocols are associated with QPs.  I'm not sure *why* an AH is
> > > linked to a PD.  Conceptually the association seems unnecessary.
> > 
> > The AH has to be linked to the PD because the PD specifies the
> > hardware target and the AH is a hardware object. All objects must be
> > traced back to a PD. It is linked to a PD not a QP because the AH can
> > be shared across all QPs, which is useful for UD applications..
> > 
> > The AH is really similar to the ethernet layer in an IP stack, and the
> > QP is akin to the TCP layer.
> 
> I always viewed the AH as specifying the route to the destination
> port.  Locally, I still don't see why it couldn't work across the
> device, like the CQ does.  Associating it with a PD just seems to
> limit which QPs, and by association the memory buffers, it can be
> used with.  So, I'm still not seeing the reason for the linkage...

Oh, I see what you mean, I forgot about that - yes, I don't see any
reason why the AH is linked to the PD and not the ibv_context.

Perhaps it is an artifact of the thinking when applying this to the
kernel, where each ULP would allocate resources to the PD and then
destroy the PD when the connection is done? So per-ULP resources were
charged to the PD.

Jason

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                   ` <20161130163621.GB24639-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-11-30 19:35                     ` Doug Ledford
       [not found]                       ` <0f905c52-b167-4b7d-4fd0-091056997e47-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Doug Ledford @ 2016-11-30 19:35 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Steve Wise, 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	'Mike Marciniszyn', 'Dennis Dalessandro',
	'Lijun Ou', 'Wei Hu(Xavier)',
	'Faisal Latif', 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'



On 11/30/2016 11:36 AM, Jason Gunthorpe wrote:
> On Tue, Nov 29, 2016 at 09:07:52PM -0500, Doug Ledford wrote:
>> On 11/28/2016 12:08 PM, Steve Wise wrote:
>>>>
>>>> On Sun, Nov 27, 2016 at 04:51:27PM +0200, Leon Romanovsky wrote:
>>>>
>>>>> +static inline bool rdma_protocol_raw_packet(const struct ib_device *device, u8 port_num)
>>>>> +{
>>>>> +	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_RAW_PACKET;
>>>>> +}
>>>>
>>>> Do the mlx drivers really register ports with different capabilities
>>>> as the same ib_device? I'm not sure that should be allowed.
>>>>
>>>> I keep talking about how we need to get rid of the port_num in these
>>>> sorts of places because it makes no sense...
>>>>
>>>
>>> I agree.   Requiring the port number has implications that ripple up into the
>>> rdma-rw api as well...
>>>  
>>>
>>
>> In all fairness, there is no requirement that any two ports on the same
>> device be the same link layer, or if the link layer is Ethernet, there
>> is no requirement that they can't support both iWARP and RoCE.
> 
> There actually is a requirement. The RDMA CM hard requires all ports
> be iWARP or !iWARP at least. I'm sure there are other subtle things
> floating around.
> 
> There are also things that become very confusing for user space, and
> we don't have the infrastructure to support, if ports can switch
> configurations on the fly.
> 
> The simplest approach, most in line with how verbs was designed, is
> to require each ib_device to have a single kind of AH.

Sorry, we're conflating two separate things here.  I was merely
referring to the hardware.  Not our implementation of a stack.  I was
attempting to point out that our stack implementation is placing
artificial restrictions on the hardware that the hardware does not share.

>> The idea that the parent device defined the supported protocols for
>> all ports of a device became wrong with the first mlx4 device that
> 
> Arguably it was sort of OK for RoCEv1, is less OK for v2, but
> shouldn't have been done anyhow.
> 
> The uapi question here is do we want to double down and try and make
> this work (and what does that even *mean*) or admit mlx4 was an error
> and stop doing that going forward..
> 
> Or do something else? eg Specifying the AH type when creating the PD
> could potentially solve some of the problems...
> 
>> could do both IB and Ethernet.  And I think I've heard rumblings of
>> a combined RoCE/iWARP device possibly in the future from someone
>> else.
> 
> Two struct ib_devices for the same port then... Certainly ugly.

I'm not entirely sure I agree... I would have to think about it more,
but the underlying problem I'm concerned with is exactly what I point
out above: what restrictions would we be placing on the hardware that
are artificial and a result of our stack that the hardware itself does
not share?  Could the hardware support automatic migration from iWARP to
RoCE or vice versa if both endpoints supported it?  Would that work if
we required two separate ib_devices?  Things like that.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD



* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                       ` <0f905c52-b167-4b7d-4fd0-091056997e47-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-11-30 21:12                         ` Hefty, Sean
  0 siblings, 0 replies; 53+ messages in thread
From: Hefty, Sean @ 2016-11-30 21:12 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Steve Wise, 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)', Latif, Faisal, 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

> I'm not entirely sure I agree....I would have to think about it more,
> but the underlying problem I'm concerned with is exactly what I point
> out above: what restrictions would we be placing on the hardware that
> are artificial and a result of our stack that the hardware itself does
> not share?  Could the hardware support automatic migration from iWARP
> to RoCE or vice versa if both endpoints supported it?  Would that work
> if we required two separate ib_devices?  Things like that.

IMO, in order to create a usable software interface, we will end up defining limitations.  The alternative seems to be an endless series of capability bits that expands to millions of combinations, and in the end still won't capture everything.  Consider that the existing interfaces already limit what upstream drivers can expose.

The primary reason for exposing devices is for apps to associate hardware resources (e.g. CQ and QP) with each other.  With software-based implementations of RoCE/iWARP, a device is basically the system.  If you connect NICs directly to the CPU, then even separate hardware devices could share state, which could complicate things even more.

I'm not sure about Jason's idea of re-defining the PD as a 'protocol domain' (my term), but I think we should seriously consider something similar as an addition to the <device, port> model.  I'm still of the opinion that protocol is a property of a QP, not a port.

- Sean

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                             ` <1828884A29C6694DAF28B7E6B8A82373AB0BA1B7-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-12-01 13:06                                               ` Tom Talpey
       [not found]                                                 ` <d4bab4a9-aa32-282f-b501-ecfc00b0be0f-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Tom Talpey @ 2016-12-01 13:06 UTC (permalink / raw)
  To: Hefty, Sean, Steve Wise, 'Jason Gunthorpe', 'Liran Liss'
  Cc: 'Doug Ledford', 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)', Latif, Faisal, 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

On 11/30/2016 12:30 PM, Hefty, Sean wrote:
>>>>> - this doesn't change anything.  Also, AHs are already
>>>>> port-specific. So, I don't see any issue in this regard.
>>>>
>>>> The current scheme infers the protocol of the AH from the current
>>>> configuration of the port, which is a crazy API when the port's
>>>> protocol can change on the fly.
>>>
>>> Maybe the solution is to make the protocol selection explicit
>>> throughout the APIs and associate it with a QP, rather than
>>> attempting to list all transport protocols that a port can support.
>>
>> Do you mean requiring the application to pick the protocol?
>
> Yes - it seems necessary to support devices with RoCE and iWARP running on the same port.

Sockets require this: the API takes a protocol family (PF_xxx)
and socket type (SOCK_xxx) that direct the endpoint to become TCP,
UDP, etc.

On the other hand, RDMA APIs to date have considered RDMA to be the
transport, and therefore they have hidden the underlying protocol.
You just request an "RDMA" endpoint from a specific adapter, and get
what it returns.

I'd agree that the application picks the protocol, but I'd also
argue that the API should have a "wildcard" mode which allows the
current behavior. It could be a simple extension to just add a
selector parameter.

The tricky part is what to do with passive endpoints. If a wildcard
is specified, should a multiprotocol adapter create listens on all
protocols? Also, what of protocols which auto-negotiate? It's my
understanding that some RoCE adapters will attempt to autodetect
whether their peer is RoCEv1 or RoCEv2 capable, and adjust their
protocol to suit.

Tom.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                 ` <d4bab4a9-aa32-282f-b501-ecfc00b0be0f-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
@ 2016-12-01 19:07                                                   ` Hefty, Sean
       [not found]                                                     ` <1828884A29C6694DAF28B7E6B8A82373AB0BA68E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Hefty, Sean @ 2016-12-01 19:07 UTC (permalink / raw)
  To: Tom Talpey, Steve Wise, 'Jason Gunthorpe', 'Liran Liss'
  Cc: 'Doug Ledford', 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)', Latif, Faisal, 'Yishai Hadas',
	'Selvin Xavier', 'Devesh Sharma',
	'Mitesh Ahuja', 'Christian Benvenuti',
	'Dave Goodell', 'Moni Shoua',
	'Or Gerlitz'

> I'd agree that the application picks the protocol, but I'd also
> argue that the API should have a "wildcard" mode which allows the
> current behavior. It could be a simple extension to just add a
> selector parameter.

I agree.  The QP type can be viewed as the socket type; we only need to add a protocol field to go along with it.  I.e., the protocol should be a QP attribute provided on input to QP create.
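
To make the idea concrete, here is a rough sketch of what a protocol field on QP creation might look like, including a wildcard value that preserves the current behavior. Every name below (sketch_proto, sketch_qp_init_attr, sketch_resolve_proto) is hypothetical and does not exist in libibverbs or the kernel; the per-port support mask and the port default are likewise assumptions made only for illustration.

```c
#include <stdint.h>

/* Hypothetical protocol selectors; not part of any existing API. */
enum sketch_proto {
	SKETCH_PROTO_ANY = 0,	/* wildcard: keep today's implicit choice */
	SKETCH_PROTO_IB,
	SKETCH_PROTO_ROCE_V1,
	SKETCH_PROTO_ROCE_V2,
	SKETCH_PROTO_IWARP,
};

/*
 * Hypothetical QP creation attributes: qp_type plays the role of
 * SOCK_xxx, proto plays the role of the socket protocol argument.
 */
struct sketch_qp_init_attr {
	int qp_type;
	enum sketch_proto proto;
	uint8_t port_num;
};

/*
 * Resolve the requested protocol against what the port supports.
 * port_proto_mask has bit n set when enum value n is supported.
 * A wildcard request falls back to port_default.
 * Returns the resolved protocol, or -1 if it cannot be satisfied.
 */
static int sketch_resolve_proto(const struct sketch_qp_init_attr *attr,
				uint32_t port_proto_mask,
				enum sketch_proto port_default)
{
	enum sketch_proto want = attr->proto;

	if (want == SKETCH_PROTO_ANY)
		want = port_default;
	if (want == SKETCH_PROTO_ANY)
		return -1;	/* no default configured for this port */
	if (!(port_proto_mask & (1u << want)))
		return -1;	/* explicit request the port cannot serve */
	return want;
}
```

Existing applications would leave proto at zero and see no change, while an application on a combined RoCE/iWARP port could name the protocol it wants and get a clean failure if the port cannot provide it.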

> The tricky part is what to do with passive endpoints. If a wildcard
> is specified, should a multiprotocol adapter create listens on all
> protocols? Also, what of protocols which auto-negotiate? It's my
> understanding that some RoCE adapters will attempt to autodetect
> whether their peer is RoCEv1 or RoCEv2 capable, and adjust their
> protocol to suit.

We could include the protocol selection, also with wildcard support, here as well.  The rdma cm may be able to handle this, so that drivers won't need to deal with a wildcard endpoint.  That assumes we don't end up with some devices that are required to implement wildcard support themselves, while others can't...

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                     ` <1828884A29C6694DAF28B7E6B8A82373AB0BA68E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-12-04 20:38                                                       ` Liran Liss
       [not found]                                                         ` <HE1PR0501MB2812393A0A690DEC1E03DE3FB1800-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Liran Liss @ 2016-12-04 20:38 UTC (permalink / raw)
  To: Hefty, Sean, Tom Talpey, Steve Wise, 'Jason Gunthorpe'
  Cc: 'Doug Ledford', 'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni Shoua, Or Gerlitz

> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Hefty, Sean

> 
> > I'd agree that the application picks the protocol, but I'd also argue
> > that the API should have a "wildcard" mode which allows the current
> > behavior. It could be a simple extension to just add a selector
> > parameter.
> 
> I agree.  The QP type can be viewed as the socket type, we only need to add a
> protocol field to go along with it.  I.e. the protocol should be a qp attribute
> provided on input to qp create.
> 

Right.

> > The tricky part is what to do with passive endpoints. If a wildcard is
> > specified, should a multiprotocol adapter create listens on all
> > protocols? Also, what of protocols which auto-negotiate? It's my
> > understanding that some RoCE adapters will attempt to autodetect
> > whether their peer is RoCEv1 or RoCEv2 capable, and adjust their
> > protocol to suit.
> 
> We could include the protocol selection, also with wildcard support, here as
> well.  The rdma cm may be able to handle this, so that drivers won't need to deal
> with a wildcard endpoint.  Assuming that we don't end up with some devices
> that require they implement wildcard support, while others can't...

Typically, the application will request the protocol that it wants or leave it unspecified.
In that case, I think that the rdmacm would select the device default.

Anyway, returning to the initial matter at hand: I would like to start with each port reporting a bit mask of the supported protocols on that link (RoCE v1/v2, Raw Ethernet, iWARP, etc.)
It will be used for reporting device capabilities in general for tools, as well as by applications that don't use rdmacm.
--Liran
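
A per-port protocol bit mask of the kind proposed above could be sketched roughly as follows. This is only an illustration in plain userspace C: the enum port_proto names and values, struct port_caps, and both helpers are hypothetical and are not taken from this patch set or from any existing kernel or verbs API. The "unspecified" case follows the suggestion that the default is picked when the application does not request a protocol.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protocol bits; names and values are illustrative only. */
enum port_proto {
	PORT_PROTO_IB      = 1u << 0,
	PORT_PROTO_ROCE_V1 = 1u << 1,
	PORT_PROTO_ROCE_V2 = 1u << 2,
	PORT_PROTO_IWARP   = 1u << 3,
	PORT_PROTO_RAW_ETH = 1u << 4,
};

/* Hypothetical per-port capability record. */
struct port_caps {
	uint32_t proto_mask;    /* OR of enum port_proto bits supported on this port */
	uint32_t default_proto; /* single bit used when the app leaves the protocol unspecified */
};

/* True when every bit in 'proto' is supported on the port. */
static bool port_supports(const struct port_caps *caps, uint32_t proto)
{
	return (caps->proto_mask & proto) == proto;
}

/*
 * Resolve a request: 0 means "unspecified" and yields the port default;
 * otherwise return the requested protocol if supported, or 0 if not.
 */
static uint32_t port_select_proto(const struct port_caps *caps, uint32_t requested)
{
	if (requested == 0)
		return caps->default_proto;
	return port_supports(caps, requested) ? requested : 0;
}
```

For a VF port where RoCE is disabled, proto_mask would carry only the raw Ethernet bit, so a tool could report that and an application not using rdmacm could check it before creating a QP.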


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                         ` <HE1PR0501MB2812393A0A690DEC1E03DE3FB1800-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2016-12-05 17:10                                                           ` Jason Gunthorpe
       [not found]                                                             ` <20161205171013.GA27784-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-12-05 17:10 UTC (permalink / raw)
  To: Liran Liss
  Cc: Hefty, Sean, Tom Talpey, Steve Wise, 'Doug Ledford',
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell',
	Moni

On Sun, Dec 04, 2016 at 08:38:14PM +0000, Liran Liss wrote:

> Anyway, returning to the initial matter at hand: I would like to
> start with each port reporting a bit mask of the supported protocols
> on that link (RoCE v1/v2, Raw Ethernet, iWARP, etc.)  It will be
> used for reporting device capabilities in general for tools, as well
> as by applications that don't use rdmacm.

Why don't we start by defining how it is supposed to even work and how
to fix the RDMA CM before adding even more random capability bits?

Jason

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                             ` <20161205171013.GA27784-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-12-05 20:14                                                               ` Liran Liss
       [not found]                                                                 ` <HE1PR0501MB2812A66403A8C19AF83742EDB1830-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Liran Liss @ 2016-12-05 20:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Hefty, Sean, Tom Talpey, Steve Wise, 'Doug Ledford',
	'Leon Romanovsky',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Steve Wise',
	Marciniszyn, Mike, Dalessandro, Dennis, 'Lijun Ou',
	'Wei Hu(Xavier)',
	Latif, Faisal, Yishai Hadas, 'Selvin Xavier',
	'Devesh Sharma', 'Mitesh Ahuja',
	'Christian Benvenuti', 'Dave Goodell'

> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]

> 
> > Anyway, returning to the initial matter at hand: I would like to start
> > with each port reporting a bit mask of the supported protocols on that
> > link (RoCE v1/v2, Raw Ethernet, iWARP, etc.)  It will be used for
> > reporting device capabilities in general for tools, as well as by
> > applications that don't use rdmacm.
> 
> Why don't we start by defining how it is supposed to even work and how to fix
> the RDMA CM before adding even more random capability bits?
> 
> Jason

Extensions to RDMA CM are an important topic, which we can continue to discuss as devices that require them are introduced.

Protocol capabilities are needed for information purposes as well as for applications that do not use RDMA CM.
They are not "random", but depict what a device can do.
--Liran

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                                 ` <HE1PR0501MB2812A66403A8C19AF83742EDB1830-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2016-12-06 21:26                                                                   ` Or Gerlitz
       [not found]                                                                     ` <CAJ3xEMhYs0jtXYDqAXEw17Fk4padG903J3enL+uNPg_fNk-9Uw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Or Gerlitz @ 2016-12-06 21:26 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Jason Gunthorpe, Hefty, Sean, Steve Wise,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Matan Barak,
	Leon Romanovsky

On Mon, Dec 5, 2016 at 10:14 PM, Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]

>>> Anyway, returning to the initial matter at hand: I would like to start
>>> with each port reporting a bit mask of the supported protocols on that
>>> link (RoCE v1/v2, Raw Ethernet, iWARP, etc.)  It will be used for
>>> reporting device capabilities in general for tools, as well as by
>>> applications that don't use rdmacm.

>> Why don't we start by defining how it is supposed to even work and how to fix
>> the RDMA CM before adding even more random capability bits?

> Extensions to RDMA CM are an important topic, which we can continue to discuss as devices that require them are introduced.

> Protocol capabilities are needed for information purposes as well as for applications that do not use RDMA CM.
> They are not "random", but depict what a device can do.

Doug,

Can we get a maintainer's say here? I didn't see complaints about the mlx5
IB driver patch that opens an IB device even when RoCE support is lacking.
We have a valid real-life use case which needs this driver patch.

Per your request, I also made a core patch to expose QP types to
user-space, and we went into this long discussion.

Do you want the change along the lines of what Liran proposed (allow
user-space to query the supported protocols per port), or are you
okay with taking only the mlx5 patches for now and discussing the rest
later, or something else?

Please let us know

Or.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                                     ` <CAJ3xEMhYs0jtXYDqAXEw17Fk4padG903J3enL+uNPg_fNk-9Uw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-06 21:39                                                                       ` Jason Gunthorpe
       [not found]                                                                         ` <20161206213938.GC647-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-12-06 21:39 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Doug Ledford, Hefty, Sean, Steve Wise,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Matan Barak,
	Leon Romanovsky

On Tue, Dec 06, 2016 at 11:26:51PM +0200, Or Gerlitz wrote:

> Do you want the change along the lines of what Liran proposed (allow
> user-space to query the supported protocols per port), or are you
> okay

Why doesn't mellanox come up with a plan to actually make all of this
work in some sensible way? So far only Mellanox needs this.

I've already proposed disallowing multiprotocol struct ib_devices.

Jason

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                                         ` <20161206213938.GC647-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-12-06 22:13                                                                           ` Hefty, Sean
       [not found]                                                                             ` <1828884A29C6694DAF28B7E6B8A82373AB0BBEC7-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Hefty, Sean @ 2016-12-06 22:13 UTC (permalink / raw)
  To: Jason Gunthorpe, Or Gerlitz
  Cc: Doug Ledford, Steve Wise, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Matan Barak, Leon Romanovsky

> I've already proposed disallowing multiprotocol struct ib_devices.

My preference is to discontinue attempts at associating a protocol with the device.  A device could implement a dozen protocols in software.  Transports belong to QPs or cm ids, not devices.  Each rdma_cm_id should be associated with a specific cm/transport directly, rather than indirectly selecting one based on the bound <device, port>.

If an app wants a specific transport type for a QP, why doesn't it just try to open one and see if the call fails?
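
The "just try to open one" approach can be sketched as a small probing helper. This is a minimal illustration, not an existing API: try_create_qp_fn, probe_qp_types() and the fake_only_type() stand-in are all hypothetical names. In a real application the callback would presumably wrap a create call such as ibv_create_qp() followed by ibv_destroy_qp(), which also means a PD and CQs must already exist before probing.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical callback: try to create (and immediately destroy) a QP of
 * the given type. Returns 0 on success, a negative value on failure.
 */
typedef int (*try_create_qp_fn)(void *ctx, int qp_type);

/*
 * Probe each candidate type in turn. Bit i of the result is set when
 * types[i] could be created. At most 32 candidates are examined.
 */
static uint32_t probe_qp_types(try_create_qp_fn try_create, void *ctx,
			       const int *types, size_t n)
{
	uint32_t ok = 0;

	for (size_t i = 0; i < n && i < 32; i++)
		if (try_create(ctx, types[i]) == 0)
			ok |= UINT32_C(1) << i;
	return ok;
}

/* Stand-in for a device that accepts exactly one QP type (passed via ctx). */
static int fake_only_type(void *ctx, int qp_type)
{
	return qp_type == *(const int *)ctx ? 0 : -1;
}
```

The trade-off against a capability query is that probing needs live resources and costs one create/destroy round trip per candidate type, but it never reports a type that the device then refuses to create.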

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                                             ` <1828884A29C6694DAF28B7E6B8A82373AB0BBEC7-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-12-07 22:06                                                                               ` Doug Ledford
       [not found]                                                                                 ` <d7bca935-32cb-51ee-beea-724e1a5b748a-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Doug Ledford @ 2016-12-07 22:06 UTC (permalink / raw)
  To: Hefty, Sean, Jason Gunthorpe, Or Gerlitz
  Cc: Steve Wise, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss,
	Matan Barak, Leon Romanovsky


[-- Attachment #1.1: Type: text/plain, Size: 936 bytes --]

On 12/6/2016 5:13 PM, Hefty, Sean wrote:
>> I've already proposed disallowing multiprotocol struct ib_devices.
> 
> My preference is to discontinue attempts at associating a protocol with the device.  A device could implement a dozen protocols in software.  Transports belong to QPs or cm ids, not devices.  Each rdma_cm_id should be associated with a specific cm/transport directly, rather than indirectly selecting one based on the bound <device, port>.
> 
> If an app wants a specific transport type for a QP, why doesn't it just try to open one and see if the call fails?

I tend to agree with Sean on this.  And to answer Or's question from a
couple emails back, I'm inclined to just take the last three patches for
now while we work out a better idea of how everything here should be
done on the first seven patches.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                                                 ` <d7bca935-32cb-51ee-beea-724e1a5b748a-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-12-07 22:40                                                                                   ` Jason Gunthorpe
       [not found]                                                                                     ` <20161207224044.GA23093-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Jason Gunthorpe @ 2016-12-07 22:40 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Hefty, Sean, Or Gerlitz, Steve Wise,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss, Matan Barak,
	Leon Romanovsky

On Wed, Dec 07, 2016 at 05:06:24PM -0500, Doug Ledford wrote:
> On 12/6/2016 5:13 PM, Hefty, Sean wrote:
> >> I've already proposed disallowing multiprotocol struct ib_devices.
> > 
> > My preference is to discontinue attempts at associating a protocol with the device.  A device could implement a dozen protocols in software.  Transports belong to QPs or cm ids, not devices.  Each rdma_cm_id should be associated with a specific cm/transport directly, rather than indirectly selecting one based on the bound <device, port>.
> > 
> > If an app wants a specific transport type for a QP, why doesn't it just try to open one and see if the call fails?
> 
> I tend to agree with Sean on this.  And to answer Or's question from a
> couple emails back, I'm inclined to just take the last three patches for
> now while we work out a better idea of how everything here should be
> done on the first seven patches.

Well in that case, we should work toward getting rid of 'struct
ib_device' - it is the cause of so much of this trouble. A
port-focused model like the netstack uses is overall saner..

We can figure out some other way to model what ports can be shared
within a PD, ideally something that works better for rxe..

Jason

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                                                     ` <20161207224044.GA23093-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-12-08  9:29                                                                                       ` Liran Liss
       [not found]                                                                                         ` <HE1PR0501MB28127FEFC67D9F71BF14BB6CB1840-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: Liran Liss @ 2016-12-08  9:29 UTC (permalink / raw)
  To: Jason Gunthorpe, Doug Ledford
  Cc: Hefty, Sean, Or Gerlitz, Steve Wise,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Matan Barak, Leon Romanovsky

> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> Sent: Thursday, December 08, 2016 12:41 AM

> > >
> > > My preference is to discontinue attempts at associating a protocol with the
> device.  A device could implement a dozen protocols in software.  Transports
> belong to QPs or cm ids, not devices.  Each rdma_cm_id should be associated
> with a specific cm/transport directly, rather than indirectly selecting one based
> on the bound <device, port>.
> > >
> > > If an app wants a specific transport type for a QP, why doesn't it just try to
> open one and see if the call fails?
> >
> > I tend to agree with Sean on this.  And to answer Or's question from a
> > couple emails back, I'm inclined to just take the last three patches
> > for now while we work out a better idea of how everything here should
> > be done on the first seven patches.
> 

Yep, it's a good start.

> Well in that case, we should work toward getting rid of 'struct ib_device' - it is
> the cause of so much of this trouble. A port-focused model like the netstack uses
> is overall saner..
> 

We are discussing 2 separate issues:
(1) Whether a device can support multiple protocols
(2) Multi-port devices, optionally with different link types.

I think that we all agree that (1) will be needed for future devices, and that different QPs might be running different protocols.
This means that supported protocols need to be device capabilities, as we have today in the kernel.
There are several devices that implement (2), which follows directly from the specification. The model for these devices is not going to change.

There is a question on where to report protocol capabilities - at the device or port level. This is regardless of how the kernel CMA will resolve which device and port will service a connection - it's a matter of transparency and management.
Doing so at the device level will require apps to know which protocols run on which link type. Doing so at the port level will be more obvious to apps, but won't be a common use case (assuming most apps will use rdmacm, which will do the protocol resolution job for them in the kernel). Doing so at the device level seems to be more aligned with how most future devices would be implemented.
However, I won't rule out multi-port devices so soon. When the transport is offloaded to HW, it makes sense to conduct HA between ports within the domain of a single device.

In any case, this has nothing to do with 'struct ib_device', which represents a device that handles packet processing internally and exposes transport endpoints to applications.
--Liran
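
The difference between reporting at the device level and at the port level can be illustrated with a short sketch. It assumes each port carries a protocol bit mask; the helper names and the mask layout are hypothetical and do not come from the kernel. Port numbers are returned 1-based, matching how verbs number ports.

```c
#include <stddef.h>
#include <stdint.h>

/* Device-level view: the union of all per-port protocol masks. */
static uint32_t device_proto_mask(const uint32_t *port_masks, size_t nports)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < nports; i++)
		mask |= port_masks[i];
	return mask;
}

/*
 * Port-level view: return the 1-based number of the first port that
 * supports every bit in 'proto', or 0 when no port does.
 */
static unsigned int first_port_with(const uint32_t *port_masks, size_t nports,
				    uint32_t proto)
{
	for (size_t i = 0; i < nports; i++)
		if ((port_masks[i] & proto) == proto)
			return (unsigned int)(i + 1);
	return 0;
}
```

With only the device-level mask, an application on a mixed-link device still has to work out which port actually runs a given protocol; the per-port masks answer that directly.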

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
       [not found]                                                                                         ` <HE1PR0501MB28127FEFC67D9F71BF14BB6CB1840-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2016-12-08 17:41                                                                                           ` Jason Gunthorpe
  0 siblings, 0 replies; 53+ messages in thread
From: Jason Gunthorpe @ 2016-12-08 17:41 UTC (permalink / raw)
  To: Liran Liss
  Cc: Doug Ledford, Hefty, Sean, Or Gerlitz, Steve Wise,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Matan Barak, Leon Romanovsky

On Thu, Dec 08, 2016 at 09:29:32AM +0000, Liran Liss wrote:

> > Well in that case, we should work toward getting rid of 'struct ib_device' - it is
> > the cause of so much of this trouble. A port-focused model like the netstack uses
> > is overall saner..
> > 
> 
> We are discussing 2 separate issues:
> (1) Whether a device can support multiple protocols
> (2) Multi-port devices, optionally with different link types.
> 
> I think that we all agree that (1) will be needed for future
> devices,

We all agree future hardware devices will support multiple protocols,
but I don't think there is any agreement so far on how to model that,
or what 'struct ib_device' means in such a world. As I said, I'd like
to get rid of it if we use a QP centric model to solve this problem.

> and that different QPs might be running different protocols.  This
> means that supported protocols need to be device capabilities,

No, they are QP capabilities, add new APIs to work with and query QP
related things and stop using the device for anything.

> have today in the kernel.  There are several devices that implement
> (2), which follows directly from the specification. The model for
> these devices is not going to change.

Again, no, the specification for multiport devices does not
contemplate different addressing formats, so the one driver that did
this went off spec in a badly thought out manner and left us this
mess.

> kernel). Doing so at the device level seems to be more aligned with
> how most future devices would be implemented.  However, I won't rule
> out multi-port devices so soon. When the transport is offloaded to
> HW, it makes sense to conduct HA between ports within the domain of
> a single device.

It is a QP thing if we go down Sean's suggestion, so add QP related
queries to get the information.

I have no idea how to model HA in this world - presumably when you add
an AH to a QP it will refuse to do it if the implied HA model isn't
supported.

But, I'm not sure how well the QP model works when talking about
HA/APM..

> In any case, this has nothing to do with 'struct ib_device', which
> represents a device that handles packet processing internally and
> exposes transport endpoints to applications.

We need to make sense of what 'struct ib_device' is to figure out how
to solve these problems because we have multiple different definitions
running around concurrently in the stack - that is why it doesn't work
today to have more than one protocol on a struct ib_device.

So, yes, it actually has everything to do with 'struct ib_device'..

Jason

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 00/10] Support RAW Ethernet when RoCE is disabled
       [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (9 preceding siblings ...)
  2016-11-27 14:51   ` [PATCH rdma-next 10/10] IB/mlx5: Support RAW Ethernet when RoCE is disabled Leon Romanovsky
@ 2016-12-14 19:06   ` Doug Ledford
  10 siblings, 0 replies; 53+ messages in thread
From: Doug Ledford @ 2016-12-14 19:06 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua


[-- Attachment #1.1: Type: text/plain, Size: 1630 bytes --]

On 11/27/2016 9:51 AM, Leon Romanovsky wrote:
> Hi Doug,
> 
> Please find below the patch set from Or. I didn't add version notation to this
> patch set, because it was enriched extensively over initial version [1], from
> one mlx5-only patch to native IB core support.
> 
> It is tested against v4.9-rc6.
> 
> From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> On some environments, such as certain SRIOV VF configurations, RoCE is
> not supported for mlx5 Ethernet ports. Currently, the driver will not
> open IB device on that port.
> 
> This is problematic, since we do want user-space RAW Ethernet (RAW_PACKET QPs)
> functionality to remain in place. For that end, we change the relevant driver
> flows such that an IB device instance is created in that case as well.
> 
> Following the previous post [1], Doug wanted us to enable a way for
> applications to query what QP types are actually supported on a device.
> This series adds that functionality on device/port granularity, since
> some drivers (mlx4) could support different protocols per port and
> hence different QP types per port.
> 
> QP types are basically derived from the protocol/s supported on the port. We
> added two protocols (raw packet and usnic) to have a protocol set which is
> consistent with what is currently upstream.
> 
> Patches 1-7 deal with the new query and patches 8-10 with the mlx5 specific
> changes.

I took only the last three patches until we settle on the issues brought up.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 06/10] IB/core: Enable to query QP types supported by IB device on a port
       [not found]     ` <1480258296-27032-7-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2016-12-16 20:11       ` ira.weiny
       [not found]         ` <20161216201158.GE12582-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  0 siblings, 1 reply; 53+ messages in thread
From: ira.weiny @ 2016-12-16 20:11 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

On Sun, Nov 27, 2016 at 04:51:32PM +0200, Leon Romanovsky wrote:
> From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> Add qp_type_cap port attribute which is a bit field representation
> of the ib_qp_type enum. This will allow applications to query what QP
> types are supported by an IB device instance on a specific port.
> 
> The qp_type_cap port attribute is set by the core according to the
> protocol supported for the device port. This holds for all the
> providers with the exception of two RoCE drivers that don't implement
> UD and UC. To handle that, they (hns and qedr) are patched to remove
> these QP types from what the core has set for them as supported.
> 
> Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---
>  drivers/infiniband/core/device.c          | 28 ++++++++++++++++++++++++++++
>  drivers/infiniband/hw/hns/hns_roce_main.c |  3 +++
>  drivers/infiniband/hw/qedr/verbs.c        |  2 ++
>  include/rdma/ib_verbs.h                   |  1 +
>  4 files changed, 34 insertions(+)
> 
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 760ef60..f7abde2 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -646,6 +646,31 @@ void ib_dispatch_event(struct ib_event *event)
>  }
>  EXPORT_SYMBOL(ib_dispatch_event);
>  
> +static void get_port_qp_types(const struct ib_device *device, u8 port_num,
> +			      struct ib_port_attr *port_attr)
> +{
> +	if (rdma_cap_ib_smi(device, port_num))
> +		port_attr->qp_type_cap |= BIT(IB_QPT_SMI);
> +
> +	if (rdma_cap_ib_cm(device, port_num))
> +		port_attr->qp_type_cap |= BIT(IB_QPT_GSI);

This is not accurate.  The IB CM is not the same as having QP1 supported.

Ira

> +
> +	if (rdma_ib_or_roce(device, port_num)) {
> +		port_attr->qp_type_cap |= (BIT(IB_QPT_RC) | BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
> +		if (device->attrs.device_cap_flags & IB_DEVICE_XRC)
> +			port_attr->qp_type_cap |= (BIT(IB_QPT_XRC_INI) | BIT(IB_QPT_XRC_TGT));
> +	}
> +
> +	if (rdma_protocol_iwarp(device, port_num))
> +		port_attr->qp_type_cap |= BIT(IB_QPT_RC);
> +
> +	if (rdma_protocol_raw_packet(device, port_num))
> +		port_attr->qp_type_cap |= BIT(IB_QPT_RAW_PACKET);
> +
> +	if (rdma_protocol_usnic(device, port_num))
> +		port_attr->qp_type_cap |= BIT(IB_QPT_UD);
> +}
> +
>  /**
>   * ib_query_port - Query IB port attributes
>   * @device:Device to query
> @@ -666,6 +691,9 @@ int ib_query_port(struct ib_device *device,
>  		return -EINVAL;
>  
>  	memset(port_attr, 0, sizeof(*port_attr));
> +
> +	get_port_qp_types(device, port_num, port_attr);
> +
>  	err = device->query_port(device, port_num, port_attr);
>  	if (err || port_attr->subnet_prefix)
>  		return err;
> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> index c6b5779..22de534 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> @@ -430,6 +430,9 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u8 port_num,
>  			IB_PORT_ACTIVE : IB_PORT_DOWN;
>  	props->phys_state = (props->state == IB_PORT_ACTIVE) ? 5 : 3;
>  
> +	/* mark that UD and UC aren't supported */
> +	props->qp_type_cap &= ~(BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
> +
>  	spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
>  
>  	return 0;
> diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
> index 53207ff..33d0219 100644
> --- a/drivers/infiniband/hw/qedr/verbs.c
> +++ b/drivers/infiniband/hw/qedr/verbs.c
> @@ -263,6 +263,8 @@ int qedr_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *attr)
>  	attr->max_msg_sz = rdma_port->max_msg_size;
>  	attr->max_vl_num = 4;
>  
> +	/* mark that UD and UC aren't supported */
> +	attr->qp_type_cap &= ~(BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
>  	return 0;
>  }
>  
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 485b725..0b839e4 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -536,6 +536,7 @@ struct ib_port_attr {
>  	u8			active_speed;
>  	u8                      phys_state;
>  	bool			grh_required;
> +	u16			qp_type_cap;
>  };
>  
>  enum ib_device_modify_flags {
> -- 
> 2.7.4
> 
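
The flow in the quoted patch (the core pre-fills qp_type_cap before calling the driver's query_port, and a driver then clears the bits it cannot honor) can be mimicked in a userspace sketch. Everything here is a stand-in: QP_BIT(), enum qp_type and its numeric values, and the three helpers are hypothetical and only mirror the shape of the kernel code, not its actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per QP type, packed into a 16-bit field like qp_type_cap. */
#define QP_BIT(t) ((uint16_t)(1u << (t)))

/* Illustrative subset of QP types; values are placeholders for this sketch. */
enum qp_type { QPT_SMI, QPT_GSI, QPT_RC, QPT_UC, QPT_UD, QPT_RAW_PACKET };

/* What the core would pre-set for an IB or RoCE port (RC, UD and UC). */
static uint16_t core_default_ib_or_roce(void)
{
	return QP_BIT(QPT_RC) | QP_BIT(QPT_UD) | QP_BIT(QPT_UC);
}

/* What a driver without UD/UC support would do in its query_port hook. */
static uint16_t driver_drop_ud_uc(uint16_t cap)
{
	return cap & (uint16_t)~(QP_BIT(QPT_UD) | QP_BIT(QPT_UC));
}

/* Consumer-side check of a single QP type against the reported mask. */
static bool qp_type_supported(uint16_t cap, enum qp_type t)
{
	return (cap & QP_BIT(t)) != 0;
}
```

Note that this scheme only works if the driver hook modifies the pre-filled field rather than overwriting it, since the core's defaults are written before the driver runs.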

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 06/10] IB/core: Enable to query QP types supported by IB device on a port
       [not found]         ` <20161216201158.GE12582-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-12-18  7:37           ` Leon Romanovsky
  2016-12-18 21:07           ` Or Gerlitz
  1 sibling, 0 replies; 53+ messages in thread
From: Leon Romanovsky @ 2016-12-18  7:37 UTC (permalink / raw)
  To: ira.weiny
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Steve Wise, Mike Marciniszyn,
	Dennis Dalessandro, Lijun Ou, Wei Hu(Xavier),
	Faisal Latif, Yishai Hadas, Selvin Xavier, Devesh Sharma,
	Mitesh Ahuja, Christian Benvenuti, Dave Goodell, Moni Shoua,
	Or Gerlitz

[-- Attachment #1: Type: text/plain, Size: 5253 bytes --]

On Fri, Dec 16, 2016 at 03:11:59PM -0500, ira.weiny wrote:
> On Sun, Nov 27, 2016 at 04:51:32PM +0200, Leon Romanovsky wrote:
> > From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >
> > Add qp_type_cap port attribute which is a bit field representation
> > of the ib_qp_type enum. This will allow applications to query what QP
> > types are supported by an IB device instance on a specific port.
> >
> > The qp_type_cap port attribute is set by the core according to the
> > protocol supported for the device port. This holds for all the
> > providers with the exception of two RoCE drivers that don't implement
> > UD and UC. To handle that, they (hns and qedr) are patched to remove
> > these QPs from what the core has set for them as supported.
> >
> > Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> > ---
> >  drivers/infiniband/core/device.c          | 28 ++++++++++++++++++++++++++++
> >  drivers/infiniband/hw/hns/hns_roce_main.c |  3 +++
> >  drivers/infiniband/hw/qedr/verbs.c        |  2 ++
> >  include/rdma/ib_verbs.h                   |  1 +
> >  4 files changed, 34 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> > index 760ef60..f7abde2 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -646,6 +646,31 @@ void ib_dispatch_event(struct ib_event *event)
> >  }
> >  EXPORT_SYMBOL(ib_dispatch_event);
> >
> > +static void get_port_qp_types(const struct ib_device *device, u8 port_num,
> > +			      struct ib_port_attr *port_attr)
> > +{
> > +	if (rdma_cap_ib_smi(device, port_num))
> > +		port_attr->qp_type_cap |= BIT(IB_QPT_SMI);
> > +
> > +	if (rdma_cap_ib_cm(device, port_num))
> > +		port_attr->qp_type_cap |= BIT(IB_QPT_GSI);
>
> This is not accurate.  The IB CM is not the same as having QP1 supported.

Thanks Ira,
Luckily enough, we dropped the first 7 patches from this series, so this
patch is no longer relevant.
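
For readers following the thread, the qp_type_cap idea from the quoted
patch (the core sets default bits per protocol, then a driver clears the
bits it does not implement) can be sketched in plain userspace C. This is
only an illustration: the enum, the QP_BIT macro and the function names
below are hypothetical stand-ins, not the kernel's ib_qp_type, BIT() or any
real helper, and the enum values are not claimed to match the kernel's.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's enum ib_qp_type. */
enum qp_type { QPT_SMI, QPT_GSI, QPT_RC, QPT_UC, QPT_UD, QPT_RAW_PACKET };

/* Hypothetical stand-in for BIT(); one bit per QP type in a 16-bit field. */
#define QP_BIT(t) ((uint16_t)(1u << (t)))

/* Core-side default for an IB/RoCE-like port, as in the quoted
 * rdma_ib_or_roce() branch: RC, UD and UC are assumed supported. */
uint16_t core_default_roce(void)
{
	return (uint16_t)(QP_BIT(QPT_RC) | QP_BIT(QPT_UD) | QP_BIT(QPT_UC));
}

/* Driver-side fixup, as in the quoted hns and qedr hunks: clear the
 * UD and UC bits from whatever the core has already set. */
uint16_t driver_mask_ud_uc(uint16_t cap)
{
	return (uint16_t)(cap & ~(QP_BIT(QPT_UD) | QP_BIT(QPT_UC)));
}

/* What a consumer of the attribute would do: test a single bit. */
int qp_type_supported(uint16_t cap, enum qp_type t)
{
	return (cap & QP_BIT(t)) != 0;
}
```

Under these assumptions, driver_mask_ud_uc(core_default_roce()) leaves only
the RC bit set, which is the outcome the commit message describes for the
two RoCE drivers that do not implement UD and UC.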

>
> Ira
>
> > +
> > +	if (rdma_ib_or_roce(device, port_num)) {
> > +		port_attr->qp_type_cap |= (BIT(IB_QPT_RC) | BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
> > +		if (device->attrs.device_cap_flags & IB_DEVICE_XRC)
> > +			port_attr->qp_type_cap |= (BIT(IB_QPT_XRC_INI) | BIT(IB_QPT_XRC_TGT));
> > +	}
> > +
> > +	if (rdma_protocol_iwarp(device, port_num))
> > +		port_attr->qp_type_cap |= BIT(IB_QPT_RC);
> > +
> > +	if (rdma_protocol_raw_packet(device, port_num))
> > +		port_attr->qp_type_cap |= BIT(IB_QPT_RAW_PACKET);
> > +
> > +	if (rdma_protocol_usnic(device, port_num))
> > +		port_attr->qp_type_cap |= BIT(IB_QPT_UD);
> > +}
> > +
> >  /**
> >   * ib_query_port - Query IB port attributes
> >   * @device:Device to query
> > @@ -666,6 +691,9 @@ int ib_query_port(struct ib_device *device,
> >  		return -EINVAL;
> >
> >  	memset(port_attr, 0, sizeof(*port_attr));
> > +
> > +	get_port_qp_types(device, port_num, port_attr);
> > +
> >  	err = device->query_port(device, port_num, port_attr);
> >  	if (err || port_attr->subnet_prefix)
> >  		return err;
> > diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> > index c6b5779..22de534 100644
> > --- a/drivers/infiniband/hw/hns/hns_roce_main.c
> > +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> > @@ -430,6 +430,9 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u8 port_num,
> >  			IB_PORT_ACTIVE : IB_PORT_DOWN;
> >  	props->phys_state = (props->state == IB_PORT_ACTIVE) ? 5 : 3;
> >
> > +	/* mark that UD and UC aren't supported */
> > +	props->qp_type_cap &= ~(BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
> > +
> >  	spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
> >
> >  	return 0;
> > diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
> > index 53207ff..33d0219 100644
> > --- a/drivers/infiniband/hw/qedr/verbs.c
> > +++ b/drivers/infiniband/hw/qedr/verbs.c
> > @@ -263,6 +263,8 @@ int qedr_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *attr)
> >  	attr->max_msg_sz = rdma_port->max_msg_size;
> >  	attr->max_vl_num = 4;
> >
> > +	/* mark that UD and UC aren't supported */
> > +	attr->qp_type_cap &= ~(BIT(IB_QPT_UD) | BIT(IB_QPT_UC));
> >  	return 0;
> >  }
> >
> > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> > index 485b725..0b839e4 100644
> > --- a/include/rdma/ib_verbs.h
> > +++ b/include/rdma/ib_verbs.h
> > @@ -536,6 +536,7 @@ struct ib_port_attr {
> >  	u8			active_speed;
> >  	u8                      phys_state;
> >  	bool			grh_required;
> > +	u16			qp_type_cap;
> >  };
> >
> >  enum ib_device_modify_flags {
> > --
> > 2.7.4
> >


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH rdma-next 06/10] IB/core: Enable to query QP types supported by IB device on a port
       [not found]         ` <20161216201158.GE12582-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  2016-12-18  7:37           ` Leon Romanovsky
@ 2016-12-18 21:07           ` Or Gerlitz
  1 sibling, 0 replies; 53+ messages in thread
From: Or Gerlitz @ 2016-12-18 21:07 UTC (permalink / raw)
  To: ira.weiny; +Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Matan Barak

On Fri, Dec 16, 2016 at 10:11 PM, ira.weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> On Sun, Nov 27, 2016 at 04:51:32PM +0200, Leon Romanovsky wrote:
> > From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >
> > Add qp_type_cap port attribute which is a bit field representation
> > of the ib_qp_type enum. This will allow applications to query what QP
> > types are supported by an IB device instance on a specific port.
> >
> > The qp_type_cap port attribute is set by the core according to the
> > protocol supported for the device port. This holds for all the
> > providers with the exception of two RoCE drivers that don't implement
> > UD and UC. To handle that, they (hns and qedr) are patched to remove
> > these QPs from what the core has set for them as supported.
> >
> > Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Reviewed-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> > ---
> >  drivers/infiniband/core/device.c          | 28 ++++++++++++++++++++++++++++
> >  drivers/infiniband/hw/hns/hns_roce_main.c |  3 +++
> >  drivers/infiniband/hw/qedr/verbs.c        |  2 ++
> >  include/rdma/ib_verbs.h                   |  1 +
> >  4 files changed, 34 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> > index 760ef60..f7abde2 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -646,6 +646,31 @@ void ib_dispatch_event(struct ib_event *event)
> >  }
> >  EXPORT_SYMBOL(ib_dispatch_event);
> >
> > +static void get_port_qp_types(const struct ib_device *device, u8 port_num,
> > +                           struct ib_port_attr *port_attr)
> > +{
> > +     if (rdma_cap_ib_smi(device, port_num))
> > +             port_attr->qp_type_cap |= BIT(IB_QPT_SMI);
> > +
> > +     if (rdma_cap_ib_cm(device, port_num))
> > +             port_attr->qp_type_cap |= BIT(IB_QPT_GSI);
>
> This is not accurate.  The IB CM is not the same as having QP1 supported.
>

Ira,

Looking at the kernel MAD module code, I see that it (your code...) first
checks that rdma_cap_ib_mad(device, port) is true and then just blindly
attempts to open a GSI QP (it calls create_mad_qp([...], IB_QPT_GSI)) on
the device/port.

I guess this is correct by elimination only if the SMI cap is false,
because if the MAD cap is set and the SMI cap is true, it's possible that
GSI isn't supported for the port under some proprietary protocol. So,
per your design, what is a clear and robust way to determine whether GSI
is supported on the device/port?
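
To make the concern concrete, here is a rough userspace model of the flow
as described above. It is not the real MAD code: struct mock_port,
mock_create_gsi_qp() and mock_mad_open_port() are hypothetical names, and
the hw_has_gsi field stands for a fact the caller has no capability bit to
query.

```c
#include <stdbool.h>

struct mock_port {
	bool cap_mad;    /* stands in for rdma_cap_ib_mad(device, port) */
	bool cap_smi;    /* stands in for rdma_cap_ib_smi(device, port);
			  * not consulted before the GSI attempt below */
	bool hw_has_gsi; /* ground truth, not visible through any cap */
};

enum open_result { OPEN_SKIPPED, OPEN_OK, OPEN_FAILED };

/* Hypothetical stand-in for create_mad_qp([...], IB_QPT_GSI). */
int mock_create_gsi_qp(const struct mock_port *p)
{
	return p->hw_has_gsi ? 0 : -1;
}

/* The flow as described: check the MAD cap, then try GSI unconditionally. */
enum open_result mock_mad_open_port(const struct mock_port *p)
{
	if (!p->cap_mad)
		return OPEN_SKIPPED;
	return mock_create_gsi_qp(p) == 0 ? OPEN_OK : OPEN_FAILED;
}
```

In this model a port with cap_mad set but no GSI support is only
discovered when the creation attempt fails, rather than being skipped up
front by a capability check.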

Or.

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2016-12-18 21:07 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-27 14:51 [PATCH rdma-next 00/10] Support RAW Ethernet when RoCE is disabled Leon Romanovsky
     [not found] ` <1480258296-27032-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2016-11-27 14:51   ` [PATCH rdma-next 01/10] IB/core: Add raw packet protocol Leon Romanovsky
     [not found]     ` <1480258296-27032-2-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2016-11-28 17:00       ` Jason Gunthorpe
     [not found]         ` <20161128170056.GC28381-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-11-28 17:08           ` Steve Wise
2016-11-30  2:07             ` Doug Ledford
     [not found]               ` <cfdf28c6-4715-28d4-7da6-453fb6794c29-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-11-30  2:33                 ` Tom Talpey
     [not found]                   ` <5927e04b-42ec-52c1-88a3-456cc4409334-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
2016-11-30 16:18                     ` Doug Ledford
2016-11-30 16:30                     ` Liran Liss
     [not found]                       ` <HE1PR0501MB28124286F8D902C49596EF85B18C0-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2016-11-30 16:39                         ` Jason Gunthorpe
     [not found]                           ` <20161130163949.GC24639-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-11-30 16:59                             ` Or Gerlitz
2016-11-30 17:01                             ` Liran Liss
     [not found]                               ` <HE1PR0501MB28128CDB112558C5980CB638B18C0-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2016-11-30 17:08                                 ` Jason Gunthorpe
     [not found]                                   ` <20161130170830.GA17512-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-11-30 17:25                                     ` Hefty, Sean
     [not found]                                       ` <1828884A29C6694DAF28B7E6B8A82373AB0BA190-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-11-30 17:27                                         ` Steve Wise
2016-11-30 17:30                                           ` Hefty, Sean
     [not found]                                             ` <1828884A29C6694DAF28B7E6B8A82373AB0BA1B7-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-12-01 13:06                                               ` Tom Talpey
     [not found]                                                 ` <d4bab4a9-aa32-282f-b501-ecfc00b0be0f-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
2016-12-01 19:07                                                   ` Hefty, Sean
     [not found]                                                     ` <1828884A29C6694DAF28B7E6B8A82373AB0BA68E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-12-04 20:38                                                       ` Liran Liss
     [not found]                                                         ` <HE1PR0501MB2812393A0A690DEC1E03DE3FB1800-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2016-12-05 17:10                                                           ` Jason Gunthorpe
     [not found]                                                             ` <20161205171013.GA27784-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-12-05 20:14                                                               ` Liran Liss
     [not found]                                                                 ` <HE1PR0501MB2812A66403A8C19AF83742EDB1830-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2016-12-06 21:26                                                                   ` Or Gerlitz
     [not found]                                                                     ` <CAJ3xEMhYs0jtXYDqAXEw17Fk4padG903J3enL+uNPg_fNk-9Uw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-12-06 21:39                                                                       ` Jason Gunthorpe
     [not found]                                                                         ` <20161206213938.GC647-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-12-06 22:13                                                                           ` Hefty, Sean
     [not found]                                                                             ` <1828884A29C6694DAF28B7E6B8A82373AB0BBEC7-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-12-07 22:06                                                                               ` Doug Ledford
     [not found]                                                                                 ` <d7bca935-32cb-51ee-beea-724e1a5b748a-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-12-07 22:40                                                                                   ` Jason Gunthorpe
     [not found]                                                                                     ` <20161207224044.GA23093-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-12-08  9:29                                                                                       ` Liran Liss
     [not found]                                                                                         ` <HE1PR0501MB28127FEFC67D9F71BF14BB6CB1840-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2016-12-08 17:41                                                                                           ` Jason Gunthorpe
2016-11-30 17:32                                         ` Jason Gunthorpe
     [not found]                                           ` <20161130173211.GA9067-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-11-30 18:17                                             ` Hefty, Sean
     [not found]                                               ` <1828884A29C6694DAF28B7E6B8A82373AB0BA20F-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-11-30 18:34                                                 ` Jason Gunthorpe
     [not found]                                                   ` <20161130183436.GA10057-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-11-30 18:46                                                     ` Hefty, Sean
     [not found]                                                       ` <1828884A29C6694DAF28B7E6B8A82373AB0BA24D-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-11-30 18:58                                                         ` Jason Gunthorpe
2016-11-30 16:29                 ` Hefty, Sean
2016-11-30 16:36                 ` Jason Gunthorpe
     [not found]                   ` <20161130163621.GB24639-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-11-30 19:35                     ` Doug Ledford
     [not found]                       ` <0f905c52-b167-4b7d-4fd0-091056997e47-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-11-30 21:12                         ` Hefty, Sean
2016-11-28 20:57           ` Or Gerlitz
     [not found]             ` <CAJ3xEMiv6HCu-9fi12XtafxYWu-+gNPMbnfb-A4-+FrgR6KZNA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-11-28 22:25               ` Jason Gunthorpe
     [not found]                 ` <20161128222559.GB744-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-11-29  6:35                   ` Or Gerlitz
     [not found]                     ` <CAJ3xEMiC3UDujgSL5fwP7ee1=OjhorZ2aeB1k+ptGb9GWaUVkg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-11-29 16:19                       ` Jason Gunthorpe
2016-11-27 14:51   ` [PATCH rdma-next 02/10] IB/mlx5: Support " Leon Romanovsky
2016-11-27 14:51   ` [PATCH rdma-next 03/10] IB/mlx4: " Leon Romanovsky
2016-11-27 14:51   ` [PATCH rdma-next 04/10] IB: Add protocol for USNIC Leon Romanovsky
2016-11-27 14:51   ` [PATCH rdma-next 05/10] IB: Query port through the core instead of directly calling the driver handler Leon Romanovsky
2016-11-27 14:51   ` [PATCH rdma-next 06/10] IB/core: Enable to query QP types supported by IB device on a port Leon Romanovsky
     [not found]     ` <1480258296-27032-7-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2016-12-16 20:11       ` ira.weiny
     [not found]         ` <20161216201158.GE12582-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-12-18  7:37           ` Leon Romanovsky
2016-12-18 21:07           ` Or Gerlitz
2016-11-27 14:51   ` [PATCH rdma-next 07/10] IB/uverbs: Propagate supported QP types to user-space Leon Romanovsky
2016-11-27 14:51   ` [PATCH rdma-next 08/10] IB/mlx5: Refactor registration to netdev notifier Leon Romanovsky
2016-11-27 14:51   ` [PATCH rdma-next 09/10] IB/mlx5: Rename RoCE related helpers to reflect being Eth ones Leon Romanovsky
2016-11-27 14:51   ` [PATCH rdma-next 10/10] IB/mlx5: Support RAW Ethernet when RoCE is disabled Leon Romanovsky
2016-12-14 19:06   ` [PATCH rdma-next 00/10] " Doug Ledford
