* [PATCH for-next 0/7] Add RoCE v2 support for mlx4 driver
From: Matan Barak @ 2015-12-29 13:24 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz, Matan Barak

Hi Doug,

This series adds RoCE v2 support to the mlx4 driver.
It implements the required bits of the new RoCE v2 API and adds
the necessary firmware commands and handling.

Patch 0001 queries the firmware to determine whether RoCE v2 is
supported.
Patch 0002 introduces a new firmware command that sets the GID table,
so that the GID type is stored alongside the GID itself in the table.
Patch 0003 configures the device to work in mixed RoCE v1 and RoCE v2
mode.
Patch 0004 adds support for creating steering rules for IPv4-based
packets. This is necessary in order to support RoCE multicast.
Patch 0005 adds support for sending RoCE v2 packets from QP1.
Patch 0006 creates another QP in order to receive QP1 RoCE v2 traffic.
Patch 0007 advertises RoCE v2 support to the upper layers. From this
point on, the GID table is populated with RoCE v2 GIDs (if the
hardware supports them).

Regards,
Moni and Matan

Maor Gottlieb (1):
  net/mlx4_core: Add handling of RoCE v2 over IPV4 in attach_flow

Matan Barak (2):
  IB/mlx4: Add RoCE per GID support for add_gid and del_gid
  IB/mlx4: Advertise RoCE support

Moni Shoua (4):
  IB/mlx4: Query RoCE support
  IB/mlx4: Configure device to work in RoCEv2
  IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
  IB/mlx4: Create and use another QP1 for RoCEv2

 drivers/infiniband/hw/mlx4/main.c         | 100 +++++++++--
 drivers/infiniband/hw/mlx4/mlx4_ib.h      |   8 +
 drivers/infiniband/hw/mlx4/qp.c           | 283 ++++++++++++++++++++++++------
 drivers/net/ethernet/mellanox/mlx4/fw.c   |  19 +-
 drivers/net/ethernet/mellanox/mlx4/main.c |   6 +-
 drivers/net/ethernet/mellanox/mlx4/mcg.c  |  14 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |   7 +-
 drivers/net/ethernet/mellanox/mlx4/port.c |   8 +
 drivers/net/ethernet/mellanox/mlx4/qp.c   |  28 +++
 include/linux/mlx4/cmd.h                  |   3 +-
 include/linux/mlx4/device.h               |  18 +-
 include/linux/mlx4/qp.h                   |  15 +-
 include/rdma/ib_verbs.h                   |   2 +
 13 files changed, 434 insertions(+), 77 deletions(-)

-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH for-next 1/7] IB/mlx4: Query RoCE support
From: Matan Barak @ 2015-12-29 13:24 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz

From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

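Query the firmware for RoCE v2 support using the QUERY_DEV_CAP
command and record the result as a device capability flag. The
capability is cleared on multifunction (SR-IOV) devices. Later patches
in this series read this capability and act accordingly.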
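
As a rough illustration of the capability translation described above, the following standalone sketch folds the firmware BMME bit into a 64-bit flags word. The bit positions (19 and 33) are taken from the diff below; the helper name `caps_from_bmme` and its parameters are hypothetical and do not exist in the driver.

```c
#include <stdint.h>

/* Bit positions copied from the diff; the names are shortened for the sketch. */
#define BMME_FLAG_ROCE_V1_V2     (UINT32_C(1) << 19)
#define DEV_CAP_FLAG2_ROCE_V1_V2 (UINT64_C(1) << 33)

/*
 * Hypothetical helper: translate the firmware BMME bit into the 64-bit
 * flags2 capability word, then drop it again on multifunction devices.
 */
static uint64_t caps_from_bmme(uint32_t bmme_flags, uint64_t flags2,
			       int is_mfunc)
{
	if (bmme_flags & BMME_FLAG_ROCE_V1_V2)
		flags2 |= DEV_CAP_FLAG2_ROCE_V1_V2;

	/* Mirrors the patch: RoCE v2 is not enabled with SR-IOV. */
	if (is_mfunc)
		flags2 &= ~DEV_CAP_FLAG2_ROCE_V1_V2;

	return flags2;
}
```

Bit 33 does not fit in a 32-bit integer, which is why the sketch uses a 64-bit constant for the flags2 bit.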

Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx4/fw.c   |  3 +++
 drivers/net/ethernet/mellanox/mlx4/main.c |  6 +++++-
 include/linux/mlx4/device.h               | 11 +++++++++--
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 90db94e..bdd6822 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -157,6 +157,7 @@ static void dump_dev_cap_flags2(struct mlx4_dev *dev, u64 flags)
 		[29] = "802.1ad offload support",
 		[31] = "Modifying loopback source checks using UPDATE_QP support",
 		[32] = "Loopback source checks support",
+		[33] = "RoCEv2 support"
 	};
 	int i;
 
@@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
 	MLX4_GET(dev_cap->bmme_flags, outbox,
 		 QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+	if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
+		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
 	if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
 		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 31c491e..fb4968f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -424,8 +424,12 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 	if (mlx4_priv(dev)->pci_dev_data & MLX4_PCI_DEV_FORCE_SENSE_PORT)
 		dev->caps.flags |= MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
 	/* Don't do sense port on multifunction devices (for now at least) */
-	if (mlx4_is_mfunc(dev))
+	/* Don't enable RoCE V2 on multifunction devices */
+	if (mlx4_is_mfunc(dev)) {
 		dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
+		dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
+		mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
+	}
 
 	if (mlx4_low_memory_profile()) {
 		dev->caps.log_num_macs  = MLX4_MIN_LOG_NUM_MAC;
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index d3133be..dbf39ab 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -216,6 +216,7 @@ enum {
 	MLX4_DEV_CAP_FLAG2_SKIP_OUTER_VLAN	= 1LL <<  30,
 	MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB = 1ULL << 31,
 	MLX4_DEV_CAP_FLAG2_LB_SRC_CHK           = 1ULL << 32,
+	MLX4_DEV_CAP_FLAG2_ROCE_V1_V2		= 1LL <<  33,
 };
 
 enum {
@@ -267,6 +268,7 @@ enum {
 	MLX4_BMME_FLAG_TYPE_2_WIN	= 1 <<  9,
 	MLX4_BMME_FLAG_RESERVED_LKEY	= 1 << 10,
 	MLX4_BMME_FLAG_FAST_REG_WR	= 1 << 11,
+	MLX4_BMME_FLAG_ROCE_V1_V2	= 1 << 19,
 	MLX4_BMME_FLAG_PORT_REMAP	= 1 << 24,
 	MLX4_BMME_FLAG_VSD_INIT2RTR	= 1 << 28,
 };
@@ -275,6 +277,10 @@ enum {
 	MLX4_FLAG_PORT_REMAP		= MLX4_BMME_FLAG_PORT_REMAP
 };
 
+enum {
+	MLX4_FLAG_ROCE_V1_V2		= MLX4_BMME_FLAG_ROCE_V1_V2
+};
+
 enum mlx4_event {
 	MLX4_EVENT_TYPE_COMP		   = 0x00,
 	MLX4_EVENT_TYPE_PATH_MIG	   = 0x01,
@@ -984,9 +990,10 @@ struct mlx4_mad_ifc {
 		if (((dev)->caps.port_mask[port] != MLX4_PORT_TYPE_IB))
 
 #define mlx4_foreach_ib_transport_port(port, dev)                         \
-	for ((port) = 1; (port) <= (dev)->caps.num_ports; (port)++)	  \
+	for ((port) = 1; (port) <= (dev)->caps.num_ports; (port)++)       \
 		if (((dev)->caps.port_mask[port] == MLX4_PORT_TYPE_IB) || \
-			((dev)->caps.flags & MLX4_DEV_CAP_FLAG_IBOE))
+			((dev)->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) || \
+			((dev)->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2))
 
 #define MLX4_INVALID_SLAVE_ID	0xFF
 #define MLX4_SINK_COUNTER_INDEX(dev)	(dev->caps.max_counters - 1)
-- 
2.1.0


* [PATCH for-next 2/7] IB/mlx4: Add RoCE per GID support for add_gid and del_gid
From: Matan Barak @ 2015-12-29 13:24 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz, Matan Barak

In RoCE, the GID table is managed by the IB core. The role of the
mlx4 driver is to synchronize the HW with the entries in that table.
Because the same GID value can appear more than once in the GID table
(with different attributes), the mlx4 driver must maintain a
reference count per entry and populate the HW with a single copy of
each (GID, type) pair. A new firmware command is used to populate the
GID table and store the type along with the GID value.
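
The reference-counting idea can be sketched in standalone C as below. This is a simplified model, not the driver code: the table size, the `gid_slot` layout, and the `gid_add`/`gid_del` helpers are assumptions made for illustration, and locking and the firmware command are left out.

```c
#include <stdint.h>
#include <string.h>

#define MAX_GIDS 4 /* hypothetical size; the driver uses MLX4_MAX_PORT_GIDS */

struct gid_slot {
	uint8_t gid[16];
	int     gid_type; /* illustrative: 0 = RoCE v1, 1 = RoCE v2 */
	int     refcount; /* 0 means the slot is free */
};

/*
 * Returns 1 if a new slot was taken (hardware must be updated),
 * 0 if an existing (gid, type) entry was reused, -1 if the table is full.
 */
static int gid_add(struct gid_slot *tbl, const uint8_t gid[16], int gid_type)
{
	int i, free_idx = -1;

	for (i = 0; i < MAX_GIDS; i++) {
		if (tbl[i].refcount == 0) {
			if (free_idx < 0)
				free_idx = i;
			continue;
		}
		/* An entry matches only if both the GID and its type match. */
		if (!memcmp(tbl[i].gid, gid, 16) &&
		    tbl[i].gid_type == gid_type) {
			tbl[i].refcount++;
			return 0;
		}
	}
	if (free_idx < 0)
		return -1;

	memcpy(tbl[free_idx].gid, gid, 16);
	tbl[free_idx].gid_type = gid_type;
	tbl[free_idx].refcount = 1;
	return 1;
}

/*
 * Returns 1 if the last reference was dropped (hardware must be updated),
 * 0 if references remain, -1 if no matching entry exists.
 */
static int gid_del(struct gid_slot *tbl, const uint8_t gid[16], int gid_type)
{
	int i;

	for (i = 0; i < MAX_GIDS; i++) {
		if (tbl[i].refcount > 0 &&
		    !memcmp(tbl[i].gid, gid, 16) &&
		    tbl[i].gid_type == gid_type) {
			if (--tbl[i].refcount > 0)
				return 0;
			memset(&tbl[i], 0, sizeof(tbl[i]));
			return 1;
		}
	}
	return -1;
}
```

In this model, adding the same GID twice with the same type touches the hardware once, while adding it with a different type takes a second slot.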

Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx4/main.c    | 69 +++++++++++++++++++++++++++++++++---
 drivers/infiniband/hw/mlx4/mlx4_ib.h |  1 +
 include/linux/mlx4/cmd.h             |  3 +-
 3 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 627267f..988fa33 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -154,9 +154,9 @@ static struct net_device *mlx4_ib_get_netdev(struct ib_device *device, u8 port_n
 	return dev;
 }
 
-static int mlx4_ib_update_gids(struct gid_entry *gids,
-			       struct mlx4_ib_dev *ibdev,
-			       u8 port_num)
+static int mlx4_ib_update_gids_v1(struct gid_entry *gids,
+				  struct mlx4_ib_dev *ibdev,
+				  u8 port_num)
 {
 	struct mlx4_cmd_mailbox *mailbox;
 	int err;
@@ -187,6 +187,61 @@ static int mlx4_ib_update_gids(struct gid_entry *gids,
 	return err;
 }
 
+static int mlx4_ib_update_gids_v1_v2(struct gid_entry *gids,
+				     struct mlx4_ib_dev *ibdev,
+				     u8 port_num)
+{
+	struct mlx4_cmd_mailbox *mailbox;
+	int err;
+	struct mlx4_dev *dev = ibdev->dev;
+	int i;
+	struct {
+		union ib_gid	gid;
+		__be32		rsrvd1[2];
+		__be16		rsrvd2;
+		u8		type;
+		u8		version;
+		__be32		rsrvd3;
+	} *gid_tbl;
+
+	mailbox = mlx4_alloc_cmd_mailbox(dev);
+	if (IS_ERR(mailbox))
+		return -ENOMEM;
+
+	gid_tbl = mailbox->buf;
+	for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i) {
+		memcpy(&gid_tbl[i].gid, &gids[i].gid, sizeof(union ib_gid));
+		if (gids[i].gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) {
+			gid_tbl[i].version = 2;
+			if (!ipv6_addr_v4mapped((struct in6_addr *)&gids[i].gid))
+				gid_tbl[i].type = 1;
+		}
+	}
+
+	err = mlx4_cmd(dev, mailbox->dma,
+		       MLX4_SET_PORT_ROCE_ADDR << 8 | port_num,
+		       1, MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+		       MLX4_CMD_WRAPPED);
+	if (mlx4_is_bonded(dev))
+		err += mlx4_cmd(dev, mailbox->dma,
+				MLX4_SET_PORT_ROCE_ADDR << 8 | 2,
+				1, MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+				MLX4_CMD_WRAPPED);
+
+	mlx4_free_cmd_mailbox(dev, mailbox);
+	return err;
+}
+
+static int mlx4_ib_update_gids(struct gid_entry *gids,
+			       struct mlx4_ib_dev *ibdev,
+			       u8 port_num)
+{
+	if (ibdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
+		return mlx4_ib_update_gids_v1_v2(gids, ibdev, port_num);
+
+	return mlx4_ib_update_gids_v1(gids, ibdev, port_num);
+}
+
 static int mlx4_ib_add_gid(struct ib_device *device,
 			   u8 port_num,
 			   unsigned int index,
@@ -215,7 +270,8 @@ static int mlx4_ib_add_gid(struct ib_device *device,
 	port_gid_table = &iboe->gids[port_num - 1];
 	spin_lock_bh(&iboe->lock);
 	for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i) {
-		if (!memcmp(&port_gid_table->gids[i].gid, gid, sizeof(*gid))) {
+		if (!memcmp(&port_gid_table->gids[i].gid, gid, sizeof(*gid)) &&
+		    (port_gid_table->gids[i].gid_type == attr->gid_type))  {
 			found = i;
 			break;
 		}
@@ -233,6 +289,7 @@ static int mlx4_ib_add_gid(struct ib_device *device,
 			} else {
 				*context = port_gid_table->gids[free].ctx;
 				memcpy(&port_gid_table->gids[free].gid, gid, sizeof(*gid));
+				port_gid_table->gids[free].gid_type = attr->gid_type;
 				port_gid_table->gids[free].ctx->real_index = free;
 				port_gid_table->gids[free].ctx->refcount = 1;
 				hw_update = 1;
@@ -248,8 +305,10 @@ static int mlx4_ib_add_gid(struct ib_device *device,
 		if (!gids) {
 			ret = -ENOMEM;
 		} else {
-			for (i = 0; i < MLX4_MAX_PORT_GIDS; i++)
+			for (i = 0; i < MLX4_MAX_PORT_GIDS; i++) {
 				memcpy(&gids[i].gid, &port_gid_table->gids[i].gid, sizeof(union ib_gid));
+				gids[i].gid_type = port_gid_table->gids[i].gid_type;
+			}
 		}
 	}
 	spin_unlock_bh(&iboe->lock);
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 8916e9b..7179fb1 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -478,6 +478,7 @@ struct gid_cache_context {
 
 struct gid_entry {
 	union ib_gid	gid;
+	enum ib_gid_type gid_type;
 	struct gid_cache_context *ctx;
 };
 
diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h
index 58391f2..116b284 100644
--- a/include/linux/mlx4/cmd.h
+++ b/include/linux/mlx4/cmd.h
@@ -206,7 +206,8 @@ enum {
 	MLX4_SET_PORT_GID_TABLE = 0x5,
 	MLX4_SET_PORT_PRIO2TC	= 0x8,
 	MLX4_SET_PORT_SCHEDULER = 0x9,
-	MLX4_SET_PORT_VXLAN	= 0xB
+	MLX4_SET_PORT_VXLAN	= 0xB,
+	MLX4_SET_PORT_ROCE_ADDR	= 0xD
 };
 
 enum {
-- 
2.1.0


* [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2
From: Matan Barak @ 2015-12-29 13:24 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz

From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Some mlx4 adapters are RoCEv2 capable. Enabling this feature requires
the following hardware configuration:

1. Set the port general parameters
2. Configure the outgoing UDP destination port
3. Configure the QPs that work with RoCEv2
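
As part of step 3, the diff below fills a per-QP entropy value when a QP moves from RTR to RTS. The following standalone sketch reproduces that arithmetic under one assumption: the real function reads the destination QP number from the firmware QP context, whereas this hypothetical `roce_entropy` helper takes both QP numbers as parameters.

```c
#include <stdint.h>

/* Fold a 24-bit QP number into 16 bits, as folded_qp() does in the diff. */
static uint16_t folded_qp(uint32_t q)
{
	return (uint16_t)(((q & 0xff) ^ ((q & 0xff0000) >> 16)) |
			  (q & 0xff00));
}

/*
 * Hypothetical wrapper: combine source and destination QP numbers into
 * the entropy value. The top two bits are always set (0xC000).
 */
static uint16_t roce_entropy(uint32_t src_qpn, uint32_t dest_qpn)
{
	uint16_t folded_src = folded_qp(src_qpn);
	uint16_t folded_dst = folded_qp(dest_qpn & 0xffffff);

	if (dest_qpn != src_qpn)
		return (uint16_t)((folded_dst ^ folded_src) | 0xC000);

	return (uint16_t)(folded_src | 0xC000);
}
```

For example, QP 0x48 talking to QP 0x123456 yields 0xF40C, and a QP talking to itself yields its own folded number with the top two bits set.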

Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx4/main.c         | 19 ++++++++++++++---
 drivers/infiniband/hw/mlx4/qp.c           | 35 ++++++++++++++++++++++++++++---
 drivers/net/ethernet/mellanox/mlx4/fw.c   | 16 +++++++++++++-
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |  7 +++++--
 drivers/net/ethernet/mellanox/mlx4/port.c |  8 +++++++
 drivers/net/ethernet/mellanox/mlx4/qp.c   | 28 +++++++++++++++++++++++++
 include/linux/mlx4/device.h               |  1 +
 include/linux/mlx4/qp.h                   | 15 +++++++++++--
 include/rdma/ib_verbs.h                   |  2 ++
 9 files changed, 120 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 988fa33..44e5699 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -384,6 +384,7 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
 	int i;
 	int ret;
 	unsigned long flags;
+	struct ib_gid_attr attr;
 
 	if (port_num > MLX4_MAX_PORTS)
 		return -EINVAL;
@@ -394,10 +395,13 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
 	if (!rdma_cap_roce_gid_table(&ibdev->ib_dev, port_num))
 		return index;
 
-	ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, NULL);
+	ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, &attr);
 	if (ret)
 		return ret;
 
+	if (attr.ndev)
+		dev_put(attr.ndev);
+
 	if (!memcmp(&gid, &zgid, sizeof(gid)))
 		return -EINVAL;
 
@@ -405,7 +409,8 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
 	port_gid_table = &iboe->gids[port_num - 1];
 
 	for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i)
-		if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid))) {
+		if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid)) &&
+		    attr.gid_type == port_gid_table->gids[i].gid_type) {
 			ctx = port_gid_table->gids[i].ctx;
 			break;
 		}
@@ -2481,7 +2486,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 	if (mlx4_ib_init_sriov(ibdev))
 		goto err_mad;
 
-	if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) {
+	if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE ||
+	    dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
 		if (!iboe->nb.notifier_call) {
 			iboe->nb.notifier_call = mlx4_ib_netdev_event;
 			err = register_netdevice_notifier(&iboe->nb);
@@ -2490,6 +2496,13 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 				goto err_notif;
 			}
 		}
+		if (!mlx4_is_slave(dev) &&
+		    dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
+			err = mlx4_config_roce_v2_port(dev, ROCE_V2_UDP_DPORT);
+			if (err) {
+				goto err_notif;
+			}
+		}
 	}
 
 	for (j = 0; j < ARRAY_SIZE(mlx4_class_attributes); ++j) {
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 8d28059..c0dee79 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1508,6 +1508,24 @@ static int create_qp_lb_counter(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp)
 	return 0;
 }
 
+enum {
+	MLX4_QPC_ROCE_MODE_1 = 0,
+	MLX4_QPC_ROCE_MODE_2 = 2,
+	MLX4_QPC_ROCE_MODE_MAX = 0xff
+};
+
+static u8 gid_type_to_qpc(enum ib_gid_type gid_type)
+{
+	switch (gid_type) {
+	case IB_GID_TYPE_ROCE:
+		return MLX4_QPC_ROCE_MODE_1;
+	case IB_GID_TYPE_ROCE_UDP_ENCAP:
+		return MLX4_QPC_ROCE_MODE_2;
+	default:
+		return MLX4_QPC_ROCE_MODE_MAX;
+	}
+}
+
 static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 			       const struct ib_qp_attr *attr, int attr_mask,
 			       enum ib_qp_state cur_state, enum ib_qp_state new_state)
@@ -1651,9 +1669,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		u16 vlan = 0xffff;
 		u8 smac[ETH_ALEN];
 		int status = 0;
+		int is_eth = rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
+			attr->ah_attr.ah_flags & IB_AH_GRH;
 
-		if (rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
-		    attr->ah_attr.ah_flags & IB_AH_GRH) {
+		if (is_eth && attr->ah_attr.ah_flags & IB_AH_GRH) {
 			int index = attr->ah_attr.grh.sgid_index;
 
 			status = ib_get_cached_gid(ibqp->device, port_num,
@@ -1675,6 +1694,16 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 
 		optpar |= (MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH |
 			   MLX4_QP_OPTPAR_SCHED_QUEUE);
+
+		if (is_eth &&
+		    (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR)) {
+			u8 qpc_roce_mode = gid_type_to_qpc(gid_attr.gid_type);
+
+			if (qpc_roce_mode == MLX4_QPC_ROCE_MODE_MAX)
+				goto out;
+			context->rlkey_roce_mode |= (qpc_roce_mode << 6);
+		}
+
 	}
 
 	if (attr_mask & IB_QP_TIMEOUT) {
@@ -1846,7 +1875,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		sqd_event = 0;
 
 	if (!ibqp->uobject && cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT)
-		context->rlkey |= (1 << 4);
+		context->rlkey_roce_mode |= (1 << 4);
 
 	/*
 	 * Before passing a kernel QP to the HW, make sure that the
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index bdd6822..c8a0c3f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -2232,7 +2232,8 @@ struct mlx4_config_dev {
 	__be32	rsvd1[3];
 	__be16	vxlan_udp_dport;
 	__be16	rsvd2;
-	__be32	rsvd3;
+	__be16  roce_v2_entropy;
+	__be16  roce_v2_udp_dport;
 	__be32	roce_flags;
 	__be32	rsvd4[25];
 	__be16	rsvd5;
@@ -2241,6 +2242,7 @@ struct mlx4_config_dev {
 };
 
 #define MLX4_VXLAN_UDP_DPORT (1 << 0)
+#define MLX4_ROCE_V2_UDP_DPORT BIT(3)
 #define MLX4_DISABLE_RX_PORT BIT(18)
 
 static int mlx4_CONFIG_DEV_set(struct mlx4_dev *dev, struct mlx4_config_dev *config_dev)
@@ -2358,6 +2360,18 @@ int mlx4_disable_rx_port_check(struct mlx4_dev *dev, bool dis)
 	return mlx4_CONFIG_DEV_set(dev, &config_dev);
 }
 
+int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port)
+{
+	struct mlx4_config_dev config_dev;
+
+	memset(&config_dev, 0, sizeof(config_dev));
+	config_dev.update_flags    = cpu_to_be32(MLX4_ROCE_V2_UDP_DPORT);
+	config_dev.roce_v2_udp_dport = cpu_to_be16(udp_port);
+
+	return mlx4_CONFIG_DEV_set(dev, &config_dev);
+}
+EXPORT_SYMBOL_GPL(mlx4_config_roce_v2_port);
+
 int mlx4_virt2phy_port_map(struct mlx4_dev *dev, u32 port1, u32 port2)
 {
 	struct mlx4_cmd_mailbox *mailbox;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index e1cf903..6a54502 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -778,8 +778,11 @@ struct mlx4_set_port_general_context {
 	u16 reserved1;
 	u8 v_ignore_fcs;
 	u8 flags;
-	u8 ignore_fcs;
-	u8 reserved2;
+	union {
+		u8 ignore_fcs;
+		u8 roce_mode;
+	};
+	u8 rr_proto;
 	__be16 mtu;
 	u8 pptx;
 	u8 pfctx;
diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index c2b2131..31db708 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -968,6 +968,8 @@ int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port, int pkey_tbl_sz)
 	return err;
 }
 
+#define SET_PORT_ROCE_2_FLAGS          0x10
+#define MLX4_SET_PORT_ROCE_V1_V2       0x2
 int mlx4_SET_PORT_general(struct mlx4_dev *dev, u8 port, int mtu,
 			  u8 pptx, u8 pfctx, u8 pprx, u8 pfcrx)
 {
@@ -987,6 +989,12 @@ int mlx4_SET_PORT_general(struct mlx4_dev *dev, u8 port, int mtu,
 	context->pprx = (pprx * (!pfcrx)) << 7;
 	context->pfcrx = pfcrx;
 
+	if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
+		context->flags |= SET_PORT_ROCE_2_FLAGS;
+		context->roce_mode |=
+			(MLX4_SET_PORT_ROCE_V1_V2 & 7)
+			<< 4;
+	}
 	in_mod = MLX4_SET_PORT_GENERAL << 8 | port;
 	err = mlx4_cmd(dev, mailbox->dma, in_mod, MLX4_SET_PORT_ETH_OPCODE,
 		       MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
diff --git a/drivers/net/ethernet/mellanox/mlx4/qp.c b/drivers/net/ethernet/mellanox/mlx4/qp.c
index 168823d..d818186 100644
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -167,6 +167,13 @@ static int __mlx4_qp_modify(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
 		context->log_page_size   = mtt->page_shift - MLX4_ICM_PAGE_SHIFT;
 	}
 
+	if ((cur_state == MLX4_QP_STATE_RTR) &&
+	    (new_state == MLX4_QP_STATE_RTS) &&
+	    dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2 &&
+	    !mlx4_is_mfunc(dev))
+		context->roce_entropy =
+			cpu_to_be16(mlx4_qp_roce_entropy(dev, qp->qpn));
+
 	*(__be32 *) mailbox->buf = cpu_to_be32(optpar);
 	memcpy(mailbox->buf + 8, context, sizeof *context);
 
@@ -921,3 +928,24 @@ int mlx4_qp_to_ready(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
 	return 0;
 }
 EXPORT_SYMBOL_GPL(mlx4_qp_to_ready);
+
+u32 mlx4_qp_roce_entropy(struct mlx4_dev *dev, u32 qpn)
+{
+	struct mlx4_qp_context context;
+	struct mlx4_qp qp;
+	int err;
+
+	qp.qpn = qpn;
+	err = mlx4_qp_query(dev, &qp, &context);
+	if (!err) {
+		u32 dest_qpn = be32_to_cpu(context.remote_qpn) & 0xffffff;
+		u16 folded_dst = folded_qp(dest_qpn);
+		u16 folded_src = folded_qp(qpn);
+
+		return (dest_qpn != qpn) ?
+			((folded_dst ^ folded_src) | 0xC000) :
+			folded_src | 0xC000;
+	}
+	return 0xdead;
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_roce_entropy);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index dbf39ab..0d873f1ae 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1464,6 +1464,7 @@ int mlx4_get_base_gid_ix(struct mlx4_dev *dev, int slave, int port);
 
 int mlx4_config_vxlan_port(struct mlx4_dev *dev, __be16 udp_port);
 int mlx4_disable_rx_port_check(struct mlx4_dev *dev, bool dis);
+int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port);
 int mlx4_virt2phy_port_map(struct mlx4_dev *dev, u32 port1, u32 port2);
 int mlx4_vf_smi_enabled(struct mlx4_dev *dev, int slave, int port);
 int mlx4_vf_get_enable_smi_admin(struct mlx4_dev *dev, int slave, int port);
diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h
index fe052e2..631c9b8 100644
--- a/include/linux/mlx4/qp.h
+++ b/include/linux/mlx4/qp.h
@@ -194,7 +194,7 @@ struct mlx4_qp_context {
 	u8			mtu_msgmax;
 	u8			rq_size_stride;
 	u8			sq_size_stride;
-	u8			rlkey;
+	u8			rlkey_roce_mode;
 	__be32			usr_page;
 	__be32			local_qpn;
 	__be32			remote_qpn;
@@ -204,7 +204,8 @@ struct mlx4_qp_context {
 	u32			reserved1;
 	__be32			next_send_psn;
 	__be32			cqn_send;
-	u32			reserved2[2];
+	__be16                  roce_entropy;
+	__be16                  reserved2[3];
 	__be32			last_acked_psn;
 	__be32			ssn;
 	__be32			params2;
@@ -487,4 +488,14 @@ static inline struct mlx4_qp *__mlx4_qp_lookup(struct mlx4_dev *dev, u32 qpn)
 
 void mlx4_qp_remove(struct mlx4_dev *dev, struct mlx4_qp *qp);
 
+static inline u16 folded_qp(u32 q)
+{
+	u16 res;
+
+	res = ((q & 0xff) ^ ((q & 0xff0000) >> 16)) | (q & 0xff00);
+	return res;
+}
+
+u32 mlx4_qp_roce_entropy(struct mlx4_dev *dev, u32 qpn);
+
 #endif /* MLX4_QP_H */
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 75fcc97..9efaa9b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -81,6 +81,8 @@ enum ib_gid_type {
 	IB_GID_TYPE_SIZE
 };
 
+#define ROCE_V2_UDP_DPORT      4791
+
 struct ib_gid_attr {
 	enum ib_gid_type	gid_type;
 	struct net_device	*ndev;
-- 
2.1.0


* [PATCH for-next 4/7] net/mlx4_core: Add handling of RoCE v2 over IPV4 in attach_flow
From: Matan Barak @ 2015-12-29 13:24 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz, Maor Gottlieb

From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

When attaching a multicast group for RoCE v2, the driver must be able
to steer packets to the QPs. Hence, add IPv4 support to the IB
steering rules.
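
The difference between the IPv4 and IPv6 cases in the diff below is which bytes of the destination GID take part in the match. A minimal standalone sketch, assuming a simplified `ib_spec` structure and a hypothetical `build_ib_spec` helper (neither is the driver's own type or function):

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the IB part of the steering spec. */
struct ib_spec {
	uint8_t dst_gid[16];
	uint8_t dst_gid_msk[16];
	int     is_ipv4;
};

/*
 * For IPv4, only the last four bytes of the GID (the IPv4 address) are
 * copied and masked; for IPv6, all 16 bytes are matched.
 */
static void build_ib_spec(struct ib_spec *spec, const uint8_t gid[16],
			  int is_ipv4)
{
	memset(spec, 0, sizeof(*spec));
	spec->is_ipv4 = is_ipv4;

	if (is_ipv4) {
		memcpy(spec->dst_gid + 12, gid + 12, 4);
		memset(spec->dst_gid_msk + 12, 0xff, 4);
	} else {
		memcpy(spec->dst_gid, gid, 16);
		memset(spec->dst_gid_msk, 0xff, 16);
	}
}
```

With an IPv4 rule the first 12 mask bytes stay zero, so the hardware rule ignores the mapped-address prefix and compares the IPv4 address only.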

Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx4/mcg.c | 14 ++++++++++++--
 include/linux/mlx4/device.h              |  6 ++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mcg.c b/drivers/net/ethernet/mellanox/mlx4/mcg.c
index 1d4e2e0..834e60e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mcg.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mcg.c
@@ -858,7 +858,9 @@ static int parse_trans_rule(struct mlx4_dev *dev, struct mlx4_spec_list *spec,
 		break;
 
 	case MLX4_NET_TRANS_RULE_ID_IB:
-		rule_hw->ib.l3_qpn = spec->ib.l3_qpn;
+		rule_hw->ib.l3_qpn = spec->ib.l3_qpn |
+			(spec->ib.roce_type == MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4 ?
+			 (__force __be32)0x80 : (__force __be32)0);
 		rule_hw->ib.qpn_mask = spec->ib.qpn_msk;
 		memcpy(&rule_hw->ib.dst_gid, &spec->ib.dst_gid, 16);
 		memcpy(&rule_hw->ib.dst_gid_msk, &spec->ib.dst_gid_msk, 16);
@@ -1384,10 +1386,18 @@ int mlx4_trans_to_dmfs_attach(struct mlx4_dev *dev, struct mlx4_qp *qp,
 			memcpy(spec.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
 			break;
 
+		case MLX4_PROT_IB_IPV4:
+			spec.id = MLX4_NET_TRANS_RULE_ID_IB;
+			memcpy(spec.ib.dst_gid + 12, gid + 12, 4);
+			memset(spec.ib.dst_gid_msk + 12, 0xff, 4);
+			spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4;
+			break;
+
 		case MLX4_PROT_IB_IPV6:
 			spec.id = MLX4_NET_TRANS_RULE_ID_IB;
 			memcpy(spec.ib.dst_gid, gid, 16);
-			memset(&spec.ib.dst_gid_msk, 0xff, 16);
+			memset(spec.ib.dst_gid_msk, 0xff, 16);
+			spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6;
 			break;
 		default:
 			return -EINVAL;
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 0d873f1ae..cdc75b2 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -391,6 +391,11 @@ enum mlx4_protocol {
 	MLX4_PROT_FCOE
 };
 
+enum mlx4_flow_roce_type {
+	MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6 = 0,
+	MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4
+};
+
 enum {
 	MLX4_MTT_FLAG_PRESENT		= 1
 };
@@ -1197,6 +1202,7 @@ struct mlx4_spec_ipv4 {
 struct mlx4_spec_ib {
 	__be32  l3_qpn;
 	__be32	qpn_msk;
+	enum    mlx4_flow_roce_type roce_type;
 	u8	dst_gid[16];
 	u8	dst_gid_msk[16];
 };
-- 
2.1.0


* [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
From: Matan Barak @ 2015-12-29 13:24 UTC
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz

From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

RoCEv2 packets are carried over IP/UDP.
The mlx4 driver uses a type of RAW QP to send packets for QP1 and
therefore needs to build the network headers below BTH in software.

This patch adds an option to build QP1 packets with IP and UDP headers
if RoCEv2 is requested.
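
The header choice in the diff below depends on the source GID: a RoCE v2 GID in IPv4-mapped form gets an IPv4 header, any other RoCE v2 GID gets an IPv6 header. A standalone sketch of that decision follows; `gid_is_v4mapped` stands in for the kernel's ipv6_addr_v4mapped(), and `qp1_ip_version` is a hypothetical helper that does not exist in the driver.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the 16-byte GID has the IPv4-mapped form ::ffff:a.b.c.d. */
static int gid_is_v4mapped(const uint8_t gid[16])
{
	static const uint8_t prefix[12] = {
		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
	};

	return memcmp(gid, prefix, sizeof(prefix)) == 0;
}

/*
 * Returns the IP version of the header to build for QP1:
 * 0 when the GID is not a RoCE v2 GID (no IP/UDP headers),
 * 4 for an IPv4-mapped source GID, 6 otherwise.
 */
static int qp1_ip_version(const uint8_t sgid[16], int is_roce_v2)
{
	if (!is_roce_v2)
		return 0;

	return gid_is_v4mapped(sgid) ? 4 : 6;
}
```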

Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx4/qp.c | 86 ++++++++++++++++++++++++++---------------
 1 file changed, 54 insertions(+), 32 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index c0dee79..8485602 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -32,6 +32,8 @@
  */
 
 #include <linux/log2.h>
+#include <linux/if_ether.h>
+#include <net/ip.h>
 #include <linux/slab.h>
 #include <linux/netdevice.h>
 #include <linux/vmalloc.h>
@@ -2282,16 +2284,7 @@ static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp,
 	return 0;
 }
 
-static void mlx4_u64_to_smac(u8 *dst_mac, u64 src_mac)
-{
-	int i;
-
-	for (i = ETH_ALEN; i; i--) {
-		dst_mac[i - 1] = src_mac & 0xff;
-		src_mac >>= 8;
-	}
-}
-
+#define MLX4_ROCEV2_QP1_SPORT 0xC000
 static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 			    void *wqe, unsigned *mlx_seg_len)
 {
@@ -2311,6 +2304,8 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 	bool is_eth;
 	bool is_vlan = false;
 	bool is_grh;
+	bool is_udp = false;
+	int ip_version = 0;
 
 	send_size = 0;
 	for (i = 0; i < wr->wr.num_sge; ++i)
@@ -2319,6 +2314,8 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 	is_eth = rdma_port_get_link_layer(sqp->qp.ibqp.device, sqp->qp.port) == IB_LINK_LAYER_ETHERNET;
 	is_grh = mlx4_ib_ah_grh_present(ah);
 	if (is_eth) {
+		struct ib_gid_attr gid_attr;
+
 		if (mlx4_is_mfunc(to_mdev(ib_dev)->dev)) {
 			/* When multi-function is enabled, the ib_core gid
 			 * indexes don't necessarily match the hw ones, so
@@ -2329,23 +2326,36 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 			if (err)
 				return err;
 		} else  {
-			err = ib_get_cached_gid(ib_dev,
+			err = ib_get_cached_gid(sqp->qp.ibqp.device,
 						be32_to_cpu(ah->av.ib.port_pd) >> 24,
 						ah->av.ib.gid_index, &sgid,
-						NULL);
-			if (!err && !memcmp(&sgid, &zgid, sizeof(sgid)))
-				err = -ENOENT;
-			if (err)
+						&gid_attr);
+			if (!err) {
+				if (gid_attr.ndev)
+					dev_put(gid_attr.ndev);
+				if (!memcmp(&sgid, &zgid, sizeof(sgid)))
+					err = -ENOENT;
+			}
+			if (!err) {
+				is_udp = gid_attr.gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP;
+				if (is_udp) {
+					if (ipv6_addr_v4mapped((struct in6_addr *)&sgid))
+						ip_version = 4;
+					else
+						ip_version = 6;
+					is_grh = false;
+				}
+			} else {
 				return err;
+			}
 		}
-
 		if (ah->av.eth.vlan != cpu_to_be16(0xffff)) {
 			vlan = be16_to_cpu(ah->av.eth.vlan) & 0x0fff;
 			is_vlan = 1;
 		}
 	}
 	err = ib_ud_header_init(send_size, !is_eth, is_eth, is_vlan, is_grh,
-				0, 0, 0, &sqp->ud_header);
+			  ip_version, is_udp, 0, &sqp->ud_header);
 	if (err)
 		return err;
 
@@ -2356,7 +2366,7 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 		sqp->ud_header.lrh.source_lid = cpu_to_be16(ah->av.ib.g_slid & 0x7f);
 	}
 
-	if (is_grh) {
+	if (is_grh || (ip_version == 6)) {
 		sqp->ud_header.grh.traffic_class =
 			(be32_to_cpu(ah->av.ib.sl_tclass_flowlabel) >> 20) & 0xff;
 		sqp->ud_header.grh.flow_label    =
@@ -2385,6 +2395,25 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 		       ah->av.ib.dgid, 16);
 	}
 
+	if (ip_version == 4) {
+		sqp->ud_header.ip4.tos =
+			(be32_to_cpu(ah->av.ib.sl_tclass_flowlabel) >> 20) & 0xff;
+		sqp->ud_header.ip4.id = 0;
+		sqp->ud_header.ip4.frag_off = htons(IP_DF);
+		sqp->ud_header.ip4.ttl = ah->av.eth.hop_limit;
+
+		memcpy(&sqp->ud_header.ip4.saddr,
+		       sgid.raw + 12, 4);
+		memcpy(&sqp->ud_header.ip4.daddr, ah->av.ib.dgid + 12, 4);
+		sqp->ud_header.ip4.check = ib_ud_ip4_csum(&sqp->ud_header);
+	}
+
+	if (is_udp) {
+		sqp->ud_header.udp.dport = htons(ROCE_V2_UDP_DPORT);
+		sqp->ud_header.udp.sport = htons(MLX4_ROCEV2_QP1_SPORT);
+		sqp->ud_header.udp.csum = 0;
+	}
+
 	mlx->flags &= cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE);
 
 	if (!is_eth) {
@@ -2413,34 +2442,27 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 
 	if (is_eth) {
 		struct in6_addr in6;
-
+		u16 ether_type;
 		u16 pcp = (be32_to_cpu(ah->av.ib.sl_tclass_flowlabel) >> 29) << 13;
 
+		ether_type = (!is_udp) ? MLX4_IB_IBOE_ETHERTYPE :
+			(ip_version == 4 ? ETH_P_IP : ETH_P_IPV6);
+
 		mlx->sched_prio = cpu_to_be16(pcp);
 
+		ether_addr_copy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac);
 		memcpy(sqp->ud_header.eth.dmac_h, ah->av.eth.mac, 6);
-		/* FIXME: cache smac value? */
 		memcpy(&ctrl->srcrb_flags16[0], ah->av.eth.mac, 2);
 		memcpy(&ctrl->imm, ah->av.eth.mac + 2, 4);
 		memcpy(&in6, sgid.raw, sizeof(in6));
 
-		if (!mlx4_is_mfunc(to_mdev(ib_dev)->dev)) {
-			u64 mac = atomic64_read(&to_mdev(ib_dev)->iboe.mac[sqp->qp.port - 1]);
-			u8 smac[ETH_ALEN];
-
-			mlx4_u64_to_smac(smac, mac);
-			memcpy(sqp->ud_header.eth.smac_h, smac, ETH_ALEN);
-		} else {
-			/* use the src mac of the tunnel */
-			memcpy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac, ETH_ALEN);
-		}
 
 		if (!memcmp(sqp->ud_header.eth.smac_h, sqp->ud_header.eth.dmac_h, 6))
 			mlx->flags |= cpu_to_be32(MLX4_WQE_CTRL_FORCE_LOOPBACK);
 		if (!is_vlan) {
-			sqp->ud_header.eth.type = cpu_to_be16(MLX4_IB_IBOE_ETHERTYPE);
+			sqp->ud_header.eth.type = cpu_to_be16(ether_type);
 		} else {
-			sqp->ud_header.vlan.type = cpu_to_be16(MLX4_IB_IBOE_ETHERTYPE);
+			sqp->ud_header.vlan.type = cpu_to_be16(ether_type);
 			sqp->ud_header.vlan.tag = cpu_to_be16(vlan | pcp);
 		}
 	} else {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH for-next 6/7] IB/mlx4: Create and use another QP1 for RoCEv2
       [not found] ` <1451395447-5198-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2015-12-29 13:24   ` [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers Matan Barak
@ 2015-12-29 13:24   ` Matan Barak
       [not found]     ` <1451395447-5198-7-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-12-29 13:24   ` [PATCH for-next 7/7] IB/mlx4: Advertise RoCE support Matan Barak
  6 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2015-12-29 13:24 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz

From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

The mlx4 driver uses a special QP to implement the GSI QP. This kind
of QP allows building the InfiniBand headers in SW and placing them
before the payload that comes in with the WR. The mlx4 HW builds the
packet, calculates the ICRC and puts it at the end of the payload.
However, this ICRC calculation depends on the QP configuration, which
is determined when the QP is modified (roce_mode during INIT->RTR).
On the other hand, ICRC verification when a packet is received does
not depend on this configuration.
Therefore, 2 GSI QPs are required for send (one for each RoCE version)
and 1 GSI QP for receive.

Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
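
Reader's note (not part of the commit): the send-side selection described
above can be sketched as follows. This is a simplified user-space model
with hypothetical sketch_* types, not the driver's structures. It assumes
the second QP exists only when the device supports RoCE v2, and that the
original QP handles RoCE v1 sends.

```c
#include <stddef.h>

enum sketch_gid_type {
	SKETCH_GID_ROCE_V1,
	SKETCH_GID_ROCE_V2
};

struct sketch_qp {
	int qpn;
};

struct sketch_gsi {
	struct sketch_qp v1_qp;       /* original GSI QP; also the receive QP */
	struct sketch_qp *roce_v2_qp; /* NULL when RoCE v2 is not supported */
};

/* Pick the QP whose ICRC configuration matches the packet being sent. */
static struct sketch_qp *sketch_pick_send_qp(struct sketch_gsi *gsi,
					     enum sketch_gid_type type)
{
	if (gsi->roce_v2_qp && type == SKETCH_GID_ROCE_V2)
		return gsi->roce_v2_qp;

	return &gsi->v1_qp;
}
```

Falling back to the original QP when no second QP exists mirrors what the
post_send hunk below does when roce_v2_gsi is NULL.
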
 drivers/infiniband/hw/mlx4/mlx4_ib.h |   7 ++
 drivers/infiniband/hw/mlx4/qp.c      | 162 ++++++++++++++++++++++++++++++-----
 2 files changed, 149 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 7179fb1..52ce7b0 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -177,11 +177,18 @@ struct mlx4_ib_wq {
 	unsigned		tail;
 };
 
+enum {
+	MLX4_IB_QP_CREATE_ROCE_V2_GSI = IB_QP_CREATE_RESERVED_START
+};
+
 enum mlx4_ib_qp_flags {
 	MLX4_IB_QP_LSO = IB_QP_CREATE_IPOIB_UD_LSO,
 	MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK = IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK,
 	MLX4_IB_QP_NETIF = IB_QP_CREATE_NETIF_QP,
 	MLX4_IB_QP_CREATE_USE_GFP_NOIO = IB_QP_CREATE_USE_GFP_NOIO,
+
+	/* Mellanox specific flags start from IB_QP_CREATE_RESERVED_START */
+	MLX4_IB_ROCE_V2_GSI_QP = MLX4_IB_QP_CREATE_ROCE_V2_GSI,
 	MLX4_IB_SRIOV_TUNNEL_QP = 1 << 30,
 	MLX4_IB_SRIOV_SQP = 1 << 31,
 };
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 8485602..a154d51 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -87,6 +87,7 @@ struct mlx4_ib_sqp {
 	u32			send_psn;
 	struct ib_ud_header	ud_header;
 	u8			header_buf[MLX4_IB_UD_HEADER_SIZE];
+	struct ib_qp		*roce_v2_gsi;
 };
 
 enum {
@@ -155,7 +156,10 @@ static int is_sqp(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp)
 			}
 		}
 	}
-	return proxy_sqp;
+	if (proxy_sqp)
+		return 1;
+
+	return !!(qp->flags & MLX4_IB_ROCE_V2_GSI_QP);
 }
 
 /* used for INIT/CLOSE port logic */
@@ -695,6 +699,7 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 			qp = &sqp->qp;
 			qp->pri.vid = 0xFFFF;
 			qp->alt.vid = 0xFFFF;
+			sqp->roce_v2_gsi = NULL;
 		} else {
 			qp = kzalloc(sizeof (struct mlx4_ib_qp), gfp);
 			if (!qp)
@@ -1085,9 +1090,17 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
 	del_gid_entries(qp);
 }
 
-static u32 get_sqp_num(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *attr)
+static int get_sqp_num(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *attr)
 {
 	/* Native or PPF */
+	if ((!mlx4_is_mfunc(dev->dev) || mlx4_is_master(dev->dev)) &&
+	    attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI) {
+		int sqpn;
+		int res = mlx4_qp_reserve_range(dev->dev, 1, 1, &sqpn, 0);
+
+		return res ? -abs(res) : sqpn;
+	}
+
 	if (!mlx4_is_mfunc(dev->dev) ||
 	    (mlx4_is_master(dev->dev) &&
 	     attr->create_flags & MLX4_IB_SRIOV_SQP)) {
@@ -1102,9 +1115,9 @@ static u32 get_sqp_num(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *attr)
 		return dev->dev->caps.qp1_proxy[attr->port_num - 1];
 }
 
-struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
-				struct ib_qp_init_attr *init_attr,
-				struct ib_udata *udata)
+static struct ib_qp *_mlx4_ib_create_qp(struct ib_pd *pd,
+					struct ib_qp_init_attr *init_attr,
+					struct ib_udata *udata)
 {
 	struct mlx4_ib_qp *qp = NULL;
 	int err;
@@ -1123,6 +1136,7 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
 					MLX4_IB_SRIOV_TUNNEL_QP |
 					MLX4_IB_SRIOV_SQP |
 					MLX4_IB_QP_NETIF |
+					MLX4_IB_QP_CREATE_ROCE_V2_GSI |
 					MLX4_IB_QP_CREATE_USE_GFP_NOIO))
 		return ERR_PTR(-EINVAL);
 
@@ -1131,15 +1145,21 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
 			return ERR_PTR(-EINVAL);
 	}
 
-	if (init_attr->create_flags &&
-	    ((udata && init_attr->create_flags & ~(sup_u_create_flags)) ||
-	     ((init_attr->create_flags & ~(MLX4_IB_SRIOV_SQP |
-					   MLX4_IB_QP_CREATE_USE_GFP_NOIO |
-					   MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK)) &&
-	      init_attr->qp_type != IB_QPT_UD) ||
-	     ((init_attr->create_flags & MLX4_IB_SRIOV_SQP) &&
-	      init_attr->qp_type > IB_QPT_GSI)))
-		return ERR_PTR(-EINVAL);
+	if (init_attr->create_flags) {
+		/* userspace is not allowed to set create flags */
+		if (udata && init_attr->create_flags & ~(sup_u_create_flags))
+			return ERR_PTR(-EINVAL);
+
+		if ((init_attr->create_flags & ~(MLX4_IB_SRIOV_SQP |
+						 MLX4_IB_QP_CREATE_USE_GFP_NOIO |
+						 MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK) &&
+		     init_attr->qp_type != IB_QPT_UD) &&
+		    (init_attr->create_flags & MLX4_IB_SRIOV_SQP &&
+		     init_attr->qp_type > IB_QPT_GSI) &&
+		    (init_attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI &&
+		     init_attr->qp_type != IB_QPT_GSI))
+			return ERR_PTR(-EINVAL);
+	}
 
 	switch (init_attr->qp_type) {
 	case IB_QPT_XRC_TGT:
@@ -1176,19 +1196,25 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
 	case IB_QPT_SMI:
 	case IB_QPT_GSI:
 	{
+		int sqpn;
+
 		/* Userspace is not allowed to create special QPs: */
 		if (udata)
 			return ERR_PTR(-EINVAL);
+		sqpn = get_sqp_num(to_mdev(pd->device), init_attr);
+
+		if (sqpn < 0)
+			return ERR_PTR(sqpn);
 
 		err = create_qp_common(to_mdev(pd->device), pd, init_attr, udata,
-				       get_sqp_num(to_mdev(pd->device), init_attr),
+				       sqpn,
 				       &qp, gfp);
 		if (err)
 			return ERR_PTR(err);
 
 		qp->port	= init_attr->port_num;
-		qp->ibqp.qp_num = init_attr->qp_type == IB_QPT_SMI ? 0 : 1;
-
+		qp->ibqp.qp_num = init_attr->qp_type == IB_QPT_SMI ? 0 :
+			init_attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI ? sqpn : 1;
 		break;
 	}
 	default:
@@ -1199,7 +1225,41 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
 	return &qp->ibqp;
 }
 
-int mlx4_ib_destroy_qp(struct ib_qp *qp)
+struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd,
+				struct ib_qp_init_attr *init_attr,
+				struct ib_udata *udata) {
+	struct ib_device *device = pd ? pd->device : init_attr->xrcd->device;
+	struct ib_qp *ibqp;
+	struct mlx4_ib_dev *dev = to_mdev(device);
+
+	ibqp = _mlx4_ib_create_qp(pd, init_attr, udata);
+
+	if (!IS_ERR_OR_NULL(ibqp) &&
+	    (init_attr->qp_type == IB_QPT_GSI) &&
+	    !(init_attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI)) {
+		struct mlx4_ib_sqp *sqp = to_msqp((to_mqp(ibqp)));
+		int is_eth = rdma_cap_eth_ah(&dev->ib_dev, init_attr->port_num);
+
+		if (is_eth &&
+		    dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
+			init_attr->create_flags |= MLX4_IB_QP_CREATE_ROCE_V2_GSI;
+			sqp->roce_v2_gsi = ib_create_qp(pd, init_attr);
+
+			if (IS_ERR_OR_NULL(sqp->roce_v2_gsi)) {
+				pr_err("Failed to create GSI QP for RoCEv2 (%ld)\n", PTR_ERR(sqp->roce_v2_gsi));
+				sqp->roce_v2_gsi = NULL;
+			} else {
+				sqp = to_msqp(to_mqp(sqp->roce_v2_gsi));
+				sqp->qp.flags |= MLX4_IB_ROCE_V2_GSI_QP;
+			}
+
+			init_attr->create_flags &= ~MLX4_IB_QP_CREATE_ROCE_V2_GSI;
+		}
+	}
+	return ibqp;
+}
+
+static int _mlx4_ib_destroy_qp(struct ib_qp *qp)
 {
 	struct mlx4_ib_dev *dev = to_mdev(qp->device);
 	struct mlx4_ib_qp *mqp = to_mqp(qp);
@@ -1228,6 +1288,20 @@ int mlx4_ib_destroy_qp(struct ib_qp *qp)
 	return 0;
 }
 
+int mlx4_ib_destroy_qp(struct ib_qp *qp)
+{
+	struct mlx4_ib_qp *mqp = to_mqp(qp);
+
+	if (mqp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI) {
+		struct mlx4_ib_sqp *sqp = to_msqp(mqp);
+
+		if (sqp->roce_v2_gsi)
+			ib_destroy_qp(sqp->roce_v2_gsi);
+	}
+
+	return _mlx4_ib_destroy_qp(qp);
+}
+
 static int to_mlx4_st(struct mlx4_ib_dev *dev, enum mlx4_ib_qp_type type)
 {
 	switch (type) {
@@ -1654,6 +1728,14 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 			mlx4_ib_steer_qp_reg(dev, qp, 1);
 			steer_qp = 1;
 		}
+
+		if (ibqp->qp_type == IB_QPT_GSI) {
+			enum ib_gid_type gid_type = qp->flags & MLX4_IB_ROCE_V2_GSI_QP ?
+				IB_GID_TYPE_ROCE_UDP_ENCAP : IB_GID_TYPE_ROCE;
+			u8 qpc_roce_mode = gid_type_to_qpc(gid_type);
+
+			context->rlkey_roce_mode |= (qpc_roce_mode << 6);
+		}
 	}
 
 	if (attr_mask & IB_QP_PKEY_INDEX) {
@@ -2054,8 +2136,8 @@ out:
 	return err;
 }
 
-int mlx4_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
-		      int attr_mask, struct ib_udata *udata)
+static int _mlx4_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+			      int attr_mask, struct ib_udata *udata)
 {
 	struct mlx4_ib_dev *dev = to_mdev(ibqp->device);
 	struct mlx4_ib_qp *qp = to_mqp(ibqp);
@@ -2158,6 +2240,26 @@ out:
 	return err;
 }
 
+int mlx4_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+		      int attr_mask, struct ib_udata *udata)
+{
+	struct mlx4_ib_qp *mqp = to_mqp(ibqp);
+	int ret;
+
+	ret = _mlx4_ib_modify_qp(ibqp, attr, attr_mask, udata);
+
+	if (mqp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI) {
+		struct mlx4_ib_sqp *sqp = to_msqp(mqp);
+
+		if (sqp->roce_v2_gsi)
+			ret = ib_modify_qp(sqp->roce_v2_gsi, attr, attr_mask);
+		if (ret)
+			pr_err("Failed to modify GSI QP for RoCEv2 (%d)\n",
+			       ret);
+	}
+	return ret;
+}
+
 static int vf_get_qp0_qkey(struct mlx4_dev *dev, int qpn, u32 *qkey)
 {
 	int i;
@@ -2802,6 +2904,26 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
 	int i;
 	struct mlx4_ib_dev *mdev = to_mdev(ibqp->device);
 
+	if (qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI) {
+		struct mlx4_ib_sqp *sqp = to_msqp(qp);
+
+		if (sqp->roce_v2_gsi) {
+			struct mlx4_ib_ah *ah = to_mah(ud_wr(wr)->ah);
+			struct ib_gid_attr gid_attr;
+			union ib_gid gid;
+
+			if (!ib_get_cached_gid(ibqp->device,
+					       be32_to_cpu(ah->av.ib.port_pd) >> 24,
+					       ah->av.ib.gid_index, &gid,
+					       &gid_attr)) {
+				if (gid_attr.ndev)
+					dev_put(gid_attr.ndev);
+				qp = (gid_attr.gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) ?
+					to_mqp(sqp->roce_v2_gsi) : qp;
+			}
+		}
+	}
+
 	spin_lock_irqsave(&qp->sq.lock, flags);
 	if (mdev->dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR) {
 		err = -EIO;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH for-next 7/7] IB/mlx4: Advertise RoCE support
       [not found] ` <1451395447-5198-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2015-12-29 13:24   ` [PATCH for-next 6/7] IB/mlx4: Create and use another QP1 for RoCEv2 Matan Barak
@ 2015-12-29 13:24   ` Matan Barak
       [not found]     ` <1451395447-5198-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  6 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2015-12-29 13:24 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny, Or Gerlitz, Matan Barak

Advertise RoCE support in port_immutable according to the hardware
capabilities. This enables the verbs stack to use RoCE v2 mode.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
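
Reader's note (not part of the commit): the capability mapping for an
Ethernet port can be sketched as below. The SKETCH_* flag values and the
function name are hypothetical and chosen only for illustration. The sketch
returns 0 when neither hardware capability is present, which is an
assumption; the diff below does not assign core_cap_flags in that case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the RDMA core port capability flags. */
#define SKETCH_PORT_ROCE_V1 (1u << 0)
#define SKETCH_PORT_ROCE_V2 (1u << 1)

/* Map hardware capability bits to the flags advertised for the port. */
static uint32_t sketch_eth_port_caps(bool hw_iboe, bool hw_roce_v1_v2)
{
	uint32_t caps = 0;

	if (hw_iboe)
		caps |= SKETCH_PORT_ROCE_V1;
	if (hw_roce_v1_v2)
		caps |= SKETCH_PORT_ROCE_V1 | SKETCH_PORT_ROCE_V2;

	return caps;
}
```
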
 drivers/infiniband/hw/mlx4/main.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 44e5699..8cf2575 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2183,6 +2183,7 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
 			       struct ib_port_immutable *immutable)
 {
 	struct ib_port_attr attr;
+	struct mlx4_ib_dev *mdev = to_mdev(ibdev);
 	int err;
 
 	err = mlx4_ib_query_port(ibdev, port_num, &attr);
@@ -2192,10 +2193,15 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
 
-	if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND)
+	if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND) {
 		immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
-	else
-		immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+	} else {
+		if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE)
+			immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+		if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
+			immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE |
+				RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
+	}
 
 	immutable->max_mad_size = IB_MGMT_MAD_SIZE;
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 4/7] net/mlx4_core: Add handlning of RoCE v2 over IPV4 in attach_flow
       [not found]     ` <1451395447-5198-5-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-29 14:28       ` Or Gerlitz
  0 siblings, 0 replies; 25+ messages in thread
From: Or Gerlitz @ 2015-12-29 14:28 UTC (permalink / raw)
  To: Matan Barak, Maor Gottlieb
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Moni Shoua, Majd Dibbiny

On 12/29/2015 3:24 PM, Matan Barak wrote:
> From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
s/handlning/handling/

>
> When attaching multicast for RoCE v2, we need to be able to steer
> packets to the QPs. Hence, we add support for IPV4 over IB steering.

I'm not sure I follow the change-log; can you clarify it a little further?

>
> Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>   drivers/net/ethernet/mellanox/mlx4/mcg.c | 14 ++++++++++++--
>   include/linux/mlx4/device.h              |  6 ++++++
>   2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mcg.c b/drivers/net/ethernet/mellanox/mlx4/mcg.c
> index 1d4e2e0..834e60e 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mcg.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/mcg.c
> @@ -858,7 +858,9 @@ static int parse_trans_rule(struct mlx4_dev *dev, struct mlx4_spec_list *spec,
>   		break;
>   
>   	case MLX4_NET_TRANS_RULE_ID_IB:
> -		rule_hw->ib.l3_qpn = spec->ib.l3_qpn;
> +		rule_hw->ib.l3_qpn = spec->ib.l3_qpn |
> +			(spec->ib.roce_type == MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4 ?
> +			 (__force __be32)0x80 : (__force __be32)0);

Maybe avoid hard-coded constants and give them meaningful names?
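
As a rough illustration of what a named constant could look like, here is
a user-space sketch. The name SKETCH_IB_L3_IPV4_FLAG and the helper are
hypothetical. It assumes the intent is to set the most significant bit of
the big-endian l3_qpn field, which is what the quoted cast produces on a
little-endian host; htonl() stands in for the kernel's cpu_to_be32().

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical name; bit 31 of l3_qpn in CPU byte order. */
#define SKETCH_IB_L3_IPV4_FLAG (1u << 31)

enum sketch_roce_type {
	SKETCH_ROCE_TYPE_IPV6 = 0,
	SKETCH_ROCE_TYPE_IPV4
};

/* qpn_be is in network byte order; the result is in network byte order. */
static uint32_t sketch_l3_qpn(uint32_t qpn_be, enum sketch_roce_type type)
{
	uint32_t flag = 0;

	if (type == SKETCH_ROCE_TYPE_IPV4)
		flag = SKETCH_IB_L3_IPV4_FLAG;

	return qpn_be | htonl(flag);
}
```

Converting the flag explicitly also gives the same result on big-endian
hosts, which a forced cast of a bare literal would not.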

>   		rule_hw->ib.qpn_mask = spec->ib.qpn_msk;
>   		memcpy(&rule_hw->ib.dst_gid, &spec->ib.dst_gid, 16);
>   		memcpy(&rule_hw->ib.dst_gid_msk, &spec->ib.dst_gid_msk, 16);
> @@ -1384,10 +1386,18 @@ int mlx4_trans_to_dmfs_attach(struct mlx4_dev *dev, struct mlx4_qp *qp,
>   			memcpy(spec.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
>   			break;
>   
> +		case MLX4_PROT_IB_IPV4:
> +			spec.id = MLX4_NET_TRANS_RULE_ID_IB;
> +			memcpy(spec.ib.dst_gid + 12, gid + 12, 4);
> +			memset(spec.ib.dst_gid_msk + 12, 0xff, 4);
> +			spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4;
> +			break;
> +
>   		case MLX4_PROT_IB_IPV6:
>   			spec.id = MLX4_NET_TRANS_RULE_ID_IB;
>   			memcpy(spec.ib.dst_gid, gid, 16);
> -			memset(&spec.ib.dst_gid_msk, 0xff, 16);
> +			memset(spec.ib.dst_gid_msk, 0xff, 16);
> +			spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6;
>   			break;
>   		default:
>   			return -EINVAL;
> diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
> index 0d873f1ae..cdc75b2 100644
> --- a/include/linux/mlx4/device.h
> +++ b/include/linux/mlx4/device.h
> @@ -391,6 +391,11 @@ enum mlx4_protocol {
>   	MLX4_PROT_FCOE
>   };
>   
> +enum mlx4_flow_roce_type {
> +	MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6 = 0,
> +	MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4
> +};
> +
>   enum {
>   	MLX4_MTT_FLAG_PRESENT		= 1
>   };
> @@ -1197,6 +1202,7 @@ struct mlx4_spec_ipv4 {
>   struct mlx4_spec_ib {
>   	__be32  l3_qpn;
>   	__be32	qpn_msk;
> +	enum    mlx4_flow_roce_type roce_type;
>   	u8	dst_gid[16];
>   	u8	dst_gid_msk[16];
>   };

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2
       [not found]     ` <1451395447-5198-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-29 14:37       ` Or Gerlitz
       [not found]         ` <56829AB0.3080805-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Or Gerlitz @ 2015-12-29 14:37 UTC (permalink / raw)
  To: Matan Barak, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 12/29/2015 3:24 PM, Matan Barak wrote:
> From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> Some mlx4 adapters are RoCEv2 capable. To enable this feature some
> hardware configuration is required. This is
>
> 1. Set port general parameters
> 2. Configure the outgoing UDP destination port
> 3. Configure the QP that work with RoCEv2
>
> Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>   drivers/infiniband/hw/mlx4/main.c         | 19 ++++++++++++++---
>   drivers/infiniband/hw/mlx4/qp.c           | 35 ++++++++++++++++++++++++++++---
>   drivers/net/ethernet/mellanox/mlx4/fw.c   | 16 +++++++++++++-
>   drivers/net/ethernet/mellanox/mlx4/mlx4.h |  7 +++++--
>   drivers/net/ethernet/mellanox/mlx4/port.c |  8 +++++++
>   drivers/net/ethernet/mellanox/mlx4/qp.c   | 28 +++++++++++++++++++++++++
>   include/linux/mlx4/device.h               |  1 +
>   include/linux/mlx4/qp.h                   | 15 +++++++++++--
>   include/rdma/ib_verbs.h                   |  2 ++
>   9 files changed, 120 insertions(+), 11 deletions(-)

Please put functionality that belongs purely to mlx4_core (such as
new or modified FW commands, new SW/FW struct fields, and the like)
into a separate mlx4_core patch.

>
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index 988fa33..44e5699 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -384,6 +384,7 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
>   	int i;
>   	int ret;
>   	unsigned long flags;
> +	struct ib_gid_attr attr;
>   
>   	if (port_num > MLX4_MAX_PORTS)
>   		return -EINVAL;
> @@ -394,10 +395,13 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
>   	if (!rdma_cap_roce_gid_table(&ibdev->ib_dev, port_num))
>   		return index;
>   
> -	ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, NULL);
> +	ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, &attr);
>   	if (ret)
>   		return ret;
>   
> +	if (attr.ndev)
> +		dev_put(attr.ndev);
> +
>   	if (!memcmp(&gid, &zgid, sizeof(gid)))
>   		return -EINVAL;
>   
> @@ -405,7 +409,8 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
>   	port_gid_table = &iboe->gids[port_num - 1];
>   
>   	for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i)
> -		if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid))) {
> +		if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid)) &&
> +		    attr.gid_type == port_gid_table->gids[i].gid_type) {
>   			ctx = port_gid_table->gids[i].ctx;
>   			break;
>   		}
> @@ -2481,7 +2486,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>   	if (mlx4_ib_init_sriov(ibdev))
>   		goto err_mad;
>   
> -	if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) {
> +	if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE ||
> +	    dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
>   		if (!iboe->nb.notifier_call) {
>   			iboe->nb.notifier_call = mlx4_ib_netdev_event;
>   			err = register_netdevice_notifier(&iboe->nb);
> @@ -2490,6 +2496,13 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>   				goto err_notif;
>   			}
>   		}
> +		if (!mlx4_is_slave(dev) &&
> +		    dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
> +			err = mlx4_config_roce_v2_port(dev, ROCE_V2_UDP_DPORT);
> +			if (err) {
> +				goto err_notif;
> +			}
> +		}
>   	}
>   
>   	for (j = 0; j < ARRAY_SIZE(mlx4_class_attributes); ++j) {
> diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
> index 8d28059..c0dee79 100644
> --- a/drivers/infiniband/hw/mlx4/qp.c
> +++ b/drivers/infiniband/hw/mlx4/qp.c
> @@ -1508,6 +1508,24 @@ static int create_qp_lb_counter(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp)
>   	return 0;
>   }
>   
> +enum {
> +	MLX4_QPC_ROCE_MODE_1 = 0,
> +	MLX4_QPC_ROCE_MODE_2 = 2,
> +	MLX4_QPC_ROCE_MODE_MAX = 0xff
> +};
> +
> +static u8 gid_type_to_qpc(enum ib_gid_type gid_type)
> +{
> +	switch (gid_type) {
> +	case IB_GID_TYPE_ROCE:
> +		return MLX4_QPC_ROCE_MODE_1;
> +	case IB_GID_TYPE_ROCE_UDP_ENCAP:
> +		return MLX4_QPC_ROCE_MODE_2;
> +	default:
> +		return MLX4_QPC_ROCE_MODE_MAX;
> +	}
> +}
> +
>   static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>   			       const struct ib_qp_attr *attr, int attr_mask,
>   			       enum ib_qp_state cur_state, enum ib_qp_state new_state)
> @@ -1651,9 +1669,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>   		u16 vlan = 0xffff;
>   		u8 smac[ETH_ALEN];
>   		int status = 0;
> +		int is_eth = rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
> +			attr->ah_attr.ah_flags & IB_AH_GRH;
>   
> -		if (rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
> -		    attr->ah_attr.ah_flags & IB_AH_GRH) {
> +		if (is_eth && attr->ah_attr.ah_flags & IB_AH_GRH) {
>   			int index = attr->ah_attr.grh.sgid_index;
>   
>   			status = ib_get_cached_gid(ibqp->device, port_num,
> @@ -1675,6 +1694,16 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>   
>   		optpar |= (MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH |
>   			   MLX4_QP_OPTPAR_SCHED_QUEUE);
> +
> +		if (is_eth &&
> +		    (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR)) {
> +			u8 qpc_roce_mode = gid_type_to_qpc(gid_attr.gid_type);
> +
> +			if (qpc_roce_mode == MLX4_QPC_ROCE_MODE_MAX)
> +				goto out;
> +			context->rlkey_roce_mode |= (qpc_roce_mode << 6);
> +		}
> +
>   	}
>   
>   	if (attr_mask & IB_QP_TIMEOUT) {
> @@ -1846,7 +1875,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>   		sqd_event = 0;
>   
>   	if (!ibqp->uobject && cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT)
> -		context->rlkey |= (1 << 4);
> +		context->rlkey_roce_mode |= (1 << 4);
>   
>   	/*
>   	 * Before passing a kernel QP to the HW, make sure that the
> diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
> index bdd6822..c8a0c3f 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/fw.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
> @@ -2232,7 +2232,8 @@ struct mlx4_config_dev {
>   	__be32	rsvd1[3];
>   	__be16	vxlan_udp_dport;
>   	__be16	rsvd2;
> -	__be32	rsvd3;
> +	__be16  roce_v2_entropy;
> +	__be16  roce_v2_udp_dport;
>   	__be32	roce_flags;
>   	__be32	rsvd4[25];
>   	__be16	rsvd5;
> @@ -2241,6 +2242,7 @@ struct mlx4_config_dev {
>   };
>   
>   #define MLX4_VXLAN_UDP_DPORT (1 << 0)
> +#define MLX4_ROCE_V2_UDP_DPORT BIT(3)
>   #define MLX4_DISABLE_RX_PORT BIT(18)
>   
>   static int mlx4_CONFIG_DEV_set(struct mlx4_dev *dev, struct mlx4_config_dev *config_dev)
> @@ -2358,6 +2360,18 @@ int mlx4_disable_rx_port_check(struct mlx4_dev *dev, bool dis)
>   	return mlx4_CONFIG_DEV_set(dev, &config_dev);
>   }
>   
> +int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port)
> +{
> +	struct mlx4_config_dev config_dev;
> +
> +	memset(&config_dev, 0, sizeof(config_dev));
> +	config_dev.update_flags    = cpu_to_be32(MLX4_ROCE_V2_UDP_DPORT);
> +	config_dev.roce_v2_udp_dport = cpu_to_be16(udp_port);
> +
> +	return mlx4_CONFIG_DEV_set(dev, &config_dev);
> +}
> +EXPORT_SYMBOL_GPL(mlx4_config_roce_v2_port);

I didn't see a patch to the resource tracker. Did you make sure that VFs
can't attempt to configure the UDP port?

Or.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 6/7] IB/mlx4: Create and use another QP1 for RoCEv2
       [not found]     ` <1451395447-5198-7-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-29 14:42       ` Or Gerlitz
       [not found]         ` <56829BC4.2070709-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Or Gerlitz @ 2015-12-29 14:42 UTC (permalink / raw)
  To: Matan Barak, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 12/29/2015 3:24 PM, Matan Barak wrote:
> The mlx4 driver uses a special QP to implement the GSI QP. This kind
> of QP allows to build the InfiniBand headers in SW to be put before
> the payload that comes in with the WR. The mlx4 HW builds the packet,
> calculates the ICRC and puts it at the end of the payload. This ICRC
> calculation however depends on the QP configuration which is
> determined when QP is modified (roce_mode during INIT->RTR).
> On the other hand, ICRC verification when packet is received does to
> depend on this configuration.

I don't understand the part of the sentence saying "when packet is 
received does to depend on this configuration"
maybe some typo/s there?

> Therefore, using 2 GSI QPs for send (one for each RoCE version) and 1
> GSI QP for receive are required.

s/2/two/ and s/1/one/ please

Or.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 7/7] IB/mlx4: Advertise RoCE support
       [not found]     ` <1451395447-5198-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-29 14:44       ` Or Gerlitz
       [not found]         ` <56829C3D.5050009-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Or Gerlitz @ 2015-12-29 14:44 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny

On 12/29/2015 3:24 PM, Matan Barak wrote:
> Advertise RoCE support in port_immutable according to the hardware
> capabilities. This enables the verbs stack to use RoCE v2 mode.

Advertise RoCE V2 support

>
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

I guess you wanted "IB/mlx4: Advertise RoCE V2 support" for the patch
title, since we already advertised RDMA_CORE_PORT_IBA_ROCE prior to
this patch?

Or.
> ---
>   drivers/infiniband/hw/mlx4/main.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index 44e5699..8cf2575 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -2183,6 +2183,7 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
>   			       struct ib_port_immutable *immutable)
>   {
>   	struct ib_port_attr attr;
> +	struct mlx4_ib_dev *mdev = to_mdev(ibdev);
>   	int err;
>   
>   	err = mlx4_ib_query_port(ibdev, port_num, &attr);
> @@ -2192,10 +2193,15 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
>   	immutable->pkey_tbl_len = attr.pkey_tbl_len;
>   	immutable->gid_tbl_len = attr.gid_tbl_len;
>   
> -	if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND)
> +	if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND) {
>   		immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
> -	else
> -		immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
> +	} else {
> +		if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE)
> +			immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
> +		if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
> +			immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE |
> +				RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
> +	}
>   
>   	immutable->max_mad_size = IB_MGMT_MAD_SIZE;
>   


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support
       [not found]     ` <1451395447-5198-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-29 15:19       ` Or Gerlitz
       [not found]         ` <5682A499.9040701-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Or Gerlitz @ 2015-12-29 15:19 UTC (permalink / raw)
  To: Matan Barak, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 12/29/2015 3:24 PM, Matan Barak wrote:
> @@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
>   		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
>   	MLX4_GET(dev_cap->bmme_flags, outbox,
>   		 QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
> +	if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
> +		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;

Did you make sure that the query dev cap wrapper unsets this bit when
proxying VF queries?

>   	if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
>   		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
>   	MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 2/7] IB/mlx4: Add RoCE per GID support for add_gid and del_gid
       [not found]     ` <1451395447-5198-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-29 15:24       ` Or Gerlitz
       [not found]         ` <5682A5AA.9040709-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Or Gerlitz @ 2015-12-29 15:24 UTC (permalink / raw)
  To: Matan Barak, Doug Ledford, Moni Shoua
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Majd Dibbiny

On 12/29/2015 3:24 PM, Matan Barak wrote:
> [...] We use a new firmware command in order to populate the GID table and store the type along with the GID value.

It's a new value for an existing command, so better to say that we use a
new value of the SET_PORT firmware command to do X.

Also here, break out the new mlx4_core functionality (e.g. the changes to
include/linux/mlx4/cmd.h) into an mlx4_core-only patch. Not every change
to mlx4_core needs its own patch; I guess one to three mlx4_core patches
would be OK.

Did you make sure (at the resource tracker) that VFs can't do this new 
set port command flavor?

Also, find some spot to put a blank line in the change-log; it's hard to
read this way.

Or.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
       [not found]     ` <1451395447-5198-6-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-29 19:01       ` Or Gerlitz
       [not found]         ` <CAJ3xEMgVvpj5S9gc_3onzCU5zjkXayOEZCCk_DofAwz194s8KQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Or Gerlitz @ 2015-12-29 19:01 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Moni Shoua, Majd Dibbiny, Or Gerlitz

On Tue, Dec 29, 2015 at 3:24 PM, Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> @@ -2413,34 +2442,27 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
>
>         if (is_eth) {
>                 struct in6_addr in6;
> -
> +               u16 ether_type;
>                 u16 pcp = (be32_to_cpu(ah->av.ib.sl_tclass_flowlabel) >> 29) << 13;
>
> +               ether_type = (!is_udp) ? MLX4_IB_IBOE_ETHERTYPE :
> +                       (ip_version == 4 ? ETH_P_IP : ETH_P_IPV6);
> +
>                 mlx->sched_prio = cpu_to_be16(pcp);
>
> +               ether_addr_copy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac);
>                 memcpy(sqp->ud_header.eth.dmac_h, ah->av.eth.mac, 6);
> -               /* FIXME: cache smac value? */
>                 memcpy(&ctrl->srcrb_flags16[0], ah->av.eth.mac, 2);
>                 memcpy(&ctrl->imm, ah->av.eth.mac + 2, 4);
>                 memcpy(&in6, sgid.raw, sizeof(in6));
>
> -               if (!mlx4_is_mfunc(to_mdev(ib_dev)->dev)) {
> -                       u64 mac = atomic64_read(&to_mdev(ib_dev)->iboe.mac[sqp->qp.port - 1]);
> -                       u8 smac[ETH_ALEN];
> -
> -                       mlx4_u64_to_smac(smac, mac);
> -                       memcpy(sqp->ud_header.eth.smac_h, smac, ETH_ALEN);
> -               } else {
> -                       /* use the src mac of the tunnel */
> -                       memcpy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac, ETH_ALEN);
> -               }
>

The last hunk that you removed had a role and was by no means dead
code, right? So... (1) why is it correct to remove it? (2) if you want
to introduce a different way to implement what was done here, why in
this patch? Maybe add a pre-patch for that.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2
       [not found]         ` <56829AB0.3080805-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30  8:23           ` Matan Barak
       [not found]             ` <56839469.3070508-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2015-12-30  8:23 UTC (permalink / raw)
  To: Or Gerlitz, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny



On 12/29/2015 4:37 PM, Or Gerlitz wrote:
> On 12/29/2015 3:24 PM, Matan Barak wrote:
>> From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>
>> Some mlx4 adapters are RoCEv2 capable. To enable this feature some
>> hardware configuration is required. This is
>>
>> 1. Set port general parameters
>> 2. Configure the outgoing UDP destination port
>> 3. Configure the QP that work with RoCEv2
>>
>> Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>   drivers/infiniband/hw/mlx4/main.c         | 19 ++++++++++++++---
>>   drivers/infiniband/hw/mlx4/qp.c           | 35
>> ++++++++++++++++++++++++++++---
>>   drivers/net/ethernet/mellanox/mlx4/fw.c   | 16 +++++++++++++-
>>   drivers/net/ethernet/mellanox/mlx4/mlx4.h |  7 +++++--
>>   drivers/net/ethernet/mellanox/mlx4/port.c |  8 +++++++
>>   drivers/net/ethernet/mellanox/mlx4/qp.c   | 28
>> +++++++++++++++++++++++++
>>   include/linux/mlx4/device.h               |  1 +
>>   include/linux/mlx4/qp.h                   | 15 +++++++++++--
>>   include/rdma/ib_verbs.h                   |  2 ++
>>   9 files changed, 120 insertions(+), 11 deletions(-)
>
> Better put (please do...) functionality which is plain mlx4 corish (such
> as new/modified FW commands, new SW/FW fields of structs and such) into
> mlx4_core patch.
>
>>
>> diff --git a/drivers/infiniband/hw/mlx4/main.c
>> b/drivers/infiniband/hw/mlx4/main.c
>> index 988fa33..44e5699 100644
>> --- a/drivers/infiniband/hw/mlx4/main.c
>> +++ b/drivers/infiniband/hw/mlx4/main.c
>> @@ -384,6 +384,7 @@ int mlx4_ib_gid_index_to_real_index(struct
>> mlx4_ib_dev *ibdev,
>>       int i;
>>       int ret;
>>       unsigned long flags;
>> +    struct ib_gid_attr attr;
>>       if (port_num > MLX4_MAX_PORTS)
>>           return -EINVAL;
>> @@ -394,10 +395,13 @@ int mlx4_ib_gid_index_to_real_index(struct
>> mlx4_ib_dev *ibdev,
>>       if (!rdma_cap_roce_gid_table(&ibdev->ib_dev, port_num))
>>           return index;
>> -    ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid,
>> NULL);
>> +    ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid,
>> &attr);
>>       if (ret)
>>           return ret;
>> +    if (attr.ndev)
>> +        dev_put(attr.ndev);
>> +
>>       if (!memcmp(&gid, &zgid, sizeof(gid)))
>>           return -EINVAL;
>> @@ -405,7 +409,8 @@ int mlx4_ib_gid_index_to_real_index(struct
>> mlx4_ib_dev *ibdev,
>>       port_gid_table = &iboe->gids[port_num - 1];
>>       for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i)
>> -        if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid))) {
>> +        if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid)) &&
>> +            attr.gid_type == port_gid_table->gids[i].gid_type) {
>>               ctx = port_gid_table->gids[i].ctx;
>>               break;
>>           }
>> @@ -2481,7 +2486,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>>       if (mlx4_ib_init_sriov(ibdev))
>>           goto err_mad;
>> -    if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) {
>> +    if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE ||
>> +        dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
>>           if (!iboe->nb.notifier_call) {
>>               iboe->nb.notifier_call = mlx4_ib_netdev_event;
>>               err = register_netdevice_notifier(&iboe->nb);
>> @@ -2490,6 +2496,13 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>>                   goto err_notif;
>>               }
>>           }
>> +        if (!mlx4_is_slave(dev) &&
>> +            dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
>> +            err = mlx4_config_roce_v2_port(dev, ROCE_V2_UDP_DPORT);
>> +            if (err) {
>> +                goto err_notif;
>> +            }
>> +        }
>>       }
>>       for (j = 0; j < ARRAY_SIZE(mlx4_class_attributes); ++j) {
>> diff --git a/drivers/infiniband/hw/mlx4/qp.c
>> b/drivers/infiniband/hw/mlx4/qp.c
>> index 8d28059..c0dee79 100644
>> --- a/drivers/infiniband/hw/mlx4/qp.c
>> +++ b/drivers/infiniband/hw/mlx4/qp.c
>> @@ -1508,6 +1508,24 @@ static int create_qp_lb_counter(struct
>> mlx4_ib_dev *dev, struct mlx4_ib_qp *qp)
>>       return 0;
>>   }
>> +enum {
>> +    MLX4_QPC_ROCE_MODE_1 = 0,
>> +    MLX4_QPC_ROCE_MODE_2 = 2,
>> +    MLX4_QPC_ROCE_MODE_MAX = 0xff
>> +};
>> +
>> +static u8 gid_type_to_qpc(enum ib_gid_type gid_type)
>> +{
>> +    switch (gid_type) {
>> +    case IB_GID_TYPE_ROCE:
>> +        return MLX4_QPC_ROCE_MODE_1;
>> +    case IB_GID_TYPE_ROCE_UDP_ENCAP:
>> +        return MLX4_QPC_ROCE_MODE_2;
>> +    default:
>> +        return MLX4_QPC_ROCE_MODE_MAX;
>> +    }
>> +}
>> +
>>   static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>>                      const struct ib_qp_attr *attr, int attr_mask,
>>                      enum ib_qp_state cur_state, enum ib_qp_state
>> new_state)
>> @@ -1651,9 +1669,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>>           u16 vlan = 0xffff;
>>           u8 smac[ETH_ALEN];
>>           int status = 0;
>> +        int is_eth = rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
>> +            attr->ah_attr.ah_flags & IB_AH_GRH;
>> -        if (rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
>> -            attr->ah_attr.ah_flags & IB_AH_GRH) {
>> +        if (is_eth && attr->ah_attr.ah_flags & IB_AH_GRH) {
>>               int index = attr->ah_attr.grh.sgid_index;
>>               status = ib_get_cached_gid(ibqp->device, port_num,
>> @@ -1675,6 +1694,16 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>>           optpar |= (MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH |
>>                  MLX4_QP_OPTPAR_SCHED_QUEUE);
>> +
>> +        if (is_eth &&
>> +            (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR)) {
>> +            u8 qpc_roce_mode = gid_type_to_qpc(gid_attr.gid_type);
>> +
>> +            if (qpc_roce_mode == MLX4_QPC_ROCE_MODE_MAX)
>> +                goto out;
>> +            context->rlkey_roce_mode |= (qpc_roce_mode << 6);
>> +        }
>> +
>>       }
>>       if (attr_mask & IB_QP_TIMEOUT) {
>> @@ -1846,7 +1875,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
>>           sqd_event = 0;
>>       if (!ibqp->uobject && cur_state == IB_QPS_RESET && new_state ==
>> IB_QPS_INIT)
>> -        context->rlkey |= (1 << 4);
>> +        context->rlkey_roce_mode |= (1 << 4);
>>       /*
>>        * Before passing a kernel QP to the HW, make sure that the
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c
>> b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> index bdd6822..c8a0c3f 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/fw.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> @@ -2232,7 +2232,8 @@ struct mlx4_config_dev {
>>       __be32    rsvd1[3];
>>       __be16    vxlan_udp_dport;
>>       __be16    rsvd2;
>> -    __be32    rsvd3;
>> +    __be16  roce_v2_entropy;
>> +    __be16  roce_v2_udp_dport;
>>       __be32    roce_flags;
>>       __be32    rsvd4[25];
>>       __be16    rsvd5;
>> @@ -2241,6 +2242,7 @@ struct mlx4_config_dev {
>>   };
>>   #define MLX4_VXLAN_UDP_DPORT (1 << 0)
>> +#define MLX4_ROCE_V2_UDP_DPORT BIT(3)
>>   #define MLX4_DISABLE_RX_PORT BIT(18)
>>   static int mlx4_CONFIG_DEV_set(struct mlx4_dev *dev, struct
>> mlx4_config_dev *config_dev)
>> @@ -2358,6 +2360,18 @@ int mlx4_disable_rx_port_check(struct mlx4_dev
>> *dev, bool dis)
>>       return mlx4_CONFIG_DEV_set(dev, &config_dev);
>>   }
>> +int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port)
>> +{
>> +    struct mlx4_config_dev config_dev;
>> +
>> +    memset(&config_dev, 0, sizeof(config_dev));
>> +    config_dev.update_flags    = cpu_to_be32(MLX4_ROCE_V2_UDP_DPORT);
>> +    config_dev.roce_v2_udp_dport = cpu_to_be16(udp_port);
>> +
>> +    return mlx4_CONFIG_DEV_set(dev, &config_dev);
>> +}
>> +EXPORT_SYMBOL_GPL(mlx4_config_roce_v2_port);
>
> I didn't see a patch to the resource tracker, did you make sure that VFs
> can't attempt to configure the UDP port?
>

int mlx4_CONFIG_DEV_wrapper(struct mlx4_dev *dev, int slave,
                             struct mlx4_vhcr *vhcr,
                             struct mlx4_cmd_mailbox *inbox,
                             struct mlx4_cmd_mailbox *outbox,
                             struct mlx4_cmd_info *cmd)
{
         int err;
         u8 get = vhcr->op_modifier;

         if (get != 1)
                 return -EPERM;

         err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);

         return err;
}

Only "get" is permitted in multi-function setups.

Anyway, mlx4_config_roce_v2_port is not called for these setups because 
of this condition:
if (mlx4_is_mfunc(dev)) {
	dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
	dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
	mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
}


> Or.
>

Matan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 6/7] IB/mlx4: Create and use another QP1 for RoCEv2
       [not found]         ` <56829BC4.2070709-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30  8:24           ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2015-12-30  8:24 UTC (permalink / raw)
  To: Or Gerlitz, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny



On 12/29/2015 4:42 PM, Or Gerlitz wrote:
> On 12/29/2015 3:24 PM, Matan Barak wrote:
>> The mlx4 driver uses a special QP to implement the GSI QP. This kind
>> of QP allows to build the InfiniBand headers in SW to be put before
>> the payload that comes in with the WR. The mlx4 HW builds the packet,
>> calculates the ICRC and puts it at the end of the payload. This ICRC
>> calculation however depends on the QP configuration which is
>> determined when QP is modified (roce_mode during INIT->RTR).
>> On the other hand, ICRC verification when packet is received does to
>> depend on this configuration.
>
> I don't understand the part of the sentence saying "when packet is
> received does to depend on this configuration"
> maybe some typo/s there?
>

I'll rephrase Moni's commit message for V2:

The mlx4 driver uses a special QP to implement the GSI QP. This kind of
QP allows building the InfiniBand headers in software.
When the mlx4 hardware builds the packet, it calculates the ICRC and puts
it at the end of the payload. However, this ICRC calculation depends
on the QP configuration, which is determined when the QP is modified
(roce_mode during INIT->RTR).
When receiving a packet, the ICRC verification doesn't depend on this
configuration.
Therefore, two GSI QPs for send (one for each RoCE version) and one GSI
QP for receive are required.
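
To illustrate the arrangement described above, here is a minimal
self-contained model. It is only a sketch: the struct, the enum, the QP
numbers and all demo_* names are hypothetical and are not the ones used
in the patch.

```c
#include <assert.h>

/* Hypothetical names, for illustration only (not the mlx4 ones). */
enum demo_gid_type {
	DEMO_GID_TYPE_ROCE_V1,
	DEMO_GID_TYPE_ROCE_V2,
};

struct demo_gsi {
	int send_qpn[2];	/* one send QP per RoCE version */
	int recv_qpn;		/* single receive QP for both versions */
};

/*
 * The send QP depends on the GID type, because the roce_mode fixed at
 * INIT->RTR determines how the ICRC is calculated on transmit.
 */
static int demo_gsi_send_qpn(const struct demo_gsi *gsi,
			     enum demo_gid_type type)
{
	return gsi->send_qpn[type == DEMO_GID_TYPE_ROCE_V2];
}

/* ICRC verification on receive does not depend on roce_mode. */
static int demo_gsi_recv_qpn(const struct demo_gsi *gsi)
{
	return gsi->recv_qpn;
}

/* Quick self-check of the model with arbitrary QP numbers. */
static void demo_gsi_self_check(void)
{
	const struct demo_gsi gsi = { .send_qpn = { 1, 2 }, .recv_qpn = 3 };

	assert(demo_gsi_send_qpn(&gsi, DEMO_GID_TYPE_ROCE_V1) == 1);
	assert(demo_gsi_send_qpn(&gsi, DEMO_GID_TYPE_ROCE_V2) == 2);
	assert(demo_gsi_recv_qpn(&gsi) == 3);
}
```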

>> Therefore, using 2 GSI QPs for send (one for each RoCE version) and 1
>> GSI QP for receive are required.
>
> s/2/two/ and s/1/one/ please
>

No problem

> Or.
>

Matan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 7/7] IB/mlx4: Advertise RoCE support
       [not found]         ` <56829C3D.5050009-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30  8:25           ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2015-12-30  8:25 UTC (permalink / raw)
  To: Or Gerlitz, Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Moni Shoua,
	Majd Dibbiny



On 12/29/2015 4:44 PM, Or Gerlitz wrote:
> On 12/29/2015 3:24 PM, Matan Barak wrote:
>> Advertise RoCE support in port_immutable according to the hardware
>> capabilities. This enables the verbs stack to use RoCE v2 mode.
>
> Advertise RoCE V2 support
>
>>
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> I guess you wanted  "IB/mlx4: Advertise RoCE V2 support" for the patch
> title? since we did
> advertise RDMA_CORE_PORT_IBA_ROCE prior to this patch.
>

Correct, thanks!

> Or.
>> ---
>>   drivers/infiniband/hw/mlx4/main.c | 12 +++++++++---
>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/mlx4/main.c
>> b/drivers/infiniband/hw/mlx4/main.c
>> index 44e5699..8cf2575 100644
>> --- a/drivers/infiniband/hw/mlx4/main.c
>> +++ b/drivers/infiniband/hw/mlx4/main.c
>> @@ -2183,6 +2183,7 @@ static int mlx4_port_immutable(struct ib_device
>> *ibdev, u8 port_num,
>>                      struct ib_port_immutable *immutable)
>>   {
>>       struct ib_port_attr attr;
>> +    struct mlx4_ib_dev *mdev = to_mdev(ibdev);
>>       int err;
>>       err = mlx4_ib_query_port(ibdev, port_num, &attr);
>> @@ -2192,10 +2193,15 @@ static int mlx4_port_immutable(struct
>> ib_device *ibdev, u8 port_num,
>>       immutable->pkey_tbl_len = attr.pkey_tbl_len;
>>       immutable->gid_tbl_len = attr.gid_tbl_len;
>> -    if (mlx4_ib_port_link_layer(ibdev, port_num) ==
>> IB_LINK_LAYER_INFINIBAND)
>> +    if (mlx4_ib_port_link_layer(ibdev, port_num) ==
>> IB_LINK_LAYER_INFINIBAND) {
>>           immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
>> -    else
>> -        immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
>> +    } else {
>> +        if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE)
>> +            immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
>> +        if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
>> +            immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE |
>> +                RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
>> +    }
>>       immutable->max_mad_size = IB_MGMT_MAD_SIZE;
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support
       [not found]         ` <5682A499.9040701-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30  8:27           ` Matan Barak
       [not found]             ` <5683957B.1070401-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Matan Barak @ 2015-12-30  8:27 UTC (permalink / raw)
  To: Or Gerlitz, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny



On 12/29/2015 5:19 PM, Or Gerlitz wrote:
> On 12/29/2015 3:24 PM, Matan Barak wrote:
>> @@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev,
>> struct mlx4_dev_cap *dev_cap)
>>           dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
>>       MLX4_GET(dev_cap->bmme_flags, outbox,
>>            QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
>> +    if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
>> +        dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
>
> Did you make sure that the query dev cap wrapper unsets this bit when
> proxing VF queries?

In mlx4_dev_cap:
if (mlx4_is_mfunc(dev)) {
	dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
	dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
	mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
}

mlx4_slave_cap calls mlx4_dev_cap and uses the dev_caps it queried, so 
we should be safe here.

>
>>       if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
>>           dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
>>       MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);
>
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support
       [not found]             ` <5683957B.1070401-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30  8:44               ` Or Gerlitz
       [not found]                 ` <5683997E.9090307-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Or Gerlitz @ 2015-12-30  8:44 UTC (permalink / raw)
  To: Matan Barak, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 12/30/2015 10:27 AM, Matan Barak wrote:
>
>
> On 12/29/2015 5:19 PM, Or Gerlitz wrote:
>> On 12/29/2015 3:24 PM, Matan Barak wrote:
>>> @@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev,
>>> struct mlx4_dev_cap *dev_cap)
>>>           dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
>>>       MLX4_GET(dev_cap->bmme_flags, outbox,
>>>            QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
>>> +    if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
>>> +        dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
>>
>> Did you make sure that the query dev cap wrapper unsets this bit when
>> proxing VF queries?
>
> In mlx4_dev_cap:
> if (mlx4_is_mfunc(dev)) {
>     dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
>     dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
>     mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
> }
>
> mlx4_slave_cap calls mlx4_dev_cap and uses the dev_caps it queried, so 
> we should be safe here.

mlx4_slave_cap is part of the Linux VF driver flow, right?

So...  NO, this is the Linux implementation.

You should make things robust against any guest driver.

The only way to do that is to patch the command wrapper used by the PF
to filter out unwanted cap bits; see the other filtering we do in
mlx4_QUERY_DEV_CAP_wrapper.
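
A rough sketch of such PF-side filtering follows. It is a self-contained
model, not the real mlx4 code: the mailbox offset, the bit position and
all demo_* names are made up for illustration. The point is only that
the PF clears the bit in the firmware reply before the guest sees it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: NOT the real mlx4 mailbox offset or bit. */
#define DEMO_BMME_FLAGS_OFFSET	0x94
#define DEMO_FLAG_ROCE_V1_V2	(UINT32_C(1) << 19)
#define DEMO_MAILBOX_SIZE	256

/* Read a big-endian 32-bit field from the mailbox buffer. */
static uint32_t demo_get_be32(const uint8_t *buf, size_t off)
{
	return ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
	       ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
}

/* Write a big-endian 32-bit field into the mailbox buffer. */
static void demo_put_be32(uint8_t *buf, size_t off, uint32_t val)
{
	buf[off]     = (uint8_t)(val >> 24);
	buf[off + 1] = (uint8_t)(val >> 16);
	buf[off + 2] = (uint8_t)(val >> 8);
	buf[off + 3] = (uint8_t)val;
}

/* PF-side filter applied to the firmware reply before the VF sees it. */
static void demo_filter_caps_for_vf(uint8_t *outbox)
{
	uint32_t flags = demo_get_be32(outbox, DEMO_BMME_FLAGS_OFFSET);

	flags &= ~DEMO_FLAG_ROCE_V1_V2;
	demo_put_be32(outbox, DEMO_BMME_FLAGS_OFFSET, flags);
}

/* Model one proxied query: firmware reports fw_flags, VF gets the result. */
static uint32_t demo_flags_seen_by_vf(uint32_t fw_flags)
{
	uint8_t outbox[DEMO_MAILBOX_SIZE];

	memset(outbox, 0, sizeof(outbox));
	demo_put_be32(outbox, DEMO_BMME_FLAGS_OFFSET, fw_flags);
	demo_filter_caps_for_vf(outbox);
	return demo_get_be32(outbox, DEMO_BMME_FLAGS_OFFSET);
}

/* Quick self-check: the bit never survives the filter. */
static void demo_caps_self_check(void)
{
	assert(!(demo_flags_seen_by_vf(DEMO_FLAG_ROCE_V1_V2) &
		 DEMO_FLAG_ROCE_V1_V2));
}
```

Because the masking runs in the PF, it holds whichever driver the guest
runs; the other capability bits pass through unchanged.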

Or.


>
>>
>>>       if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
>>>           dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
>>>       MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);
>>
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2
       [not found]             ` <56839469.3070508-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30  8:46               ` Or Gerlitz
  0 siblings, 0 replies; 25+ messages in thread
From: Or Gerlitz @ 2015-12-30  8:46 UTC (permalink / raw)
  To: Matan Barak, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny

On 12/30/2015 10:23 AM, Matan Barak wrote:
>
> int mlx4_CONFIG_DEV_wrapper(struct mlx4_dev *dev, int slave,
>                             struct mlx4_vhcr *vhcr,
>                             struct mlx4_cmd_mailbox *inbox,
>                             struct mlx4_cmd_mailbox *outbox,
>                             struct mlx4_cmd_info *cmd)
> {
>         int err;
>         u8 get = vhcr->op_modifier;
>
>         if (get != 1)
>                 return -EPERM;
>
>         err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
>
>         return err;
> }
>
> Only "get" is permitted in multi-function setups.

Good, thanks for clarifying this.

>
> Anyway, mlx4_config_roce_v2_port is not called for these setups 
> because of this condition:
> if (mlx4_is_mfunc(dev)) {
>     dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
>     dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
>     mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
> } 

Wrong again: you assume your Linux VF driver, but the VM can run another
driver.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 2/7] IB/mlx4: Add RoCE per GID support for add_gid and del_gid
       [not found]         ` <5682A5AA.9040709-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30 10:21           ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2015-12-30 10:21 UTC (permalink / raw)
  To: Or Gerlitz, Doug Ledford, Moni Shoua
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas, Majd Dibbiny



On 12/29/2015 5:24 PM, Or Gerlitz wrote:
> On 12/29/2015 3:24 PM, Matan Barak wrote:
>> [...] We use a new firmware command in order to populate the GID table
>> and store the type along with the GID value.
>
> Its a new value to existing command.. so better say we use a new value
> to the SET_PORT firmware command to do X
>

Ok

> Also here, break out mlx4_core new functionality e.g the changes to
> include/linux/mlx4/cmd.h into mlx4_core only patch. You don't need any
> change to mlx4_core to have it's own patch, I guess one up to three mlx4
> core patches would be OK.
>

I'll split mlx4_core logically.

> Did you make sure (at the resource tracker) that VFs can't do this new
> set port command flavor?
>

In mlx4_common_set_port:
if (slave != dev->caps.function &&
    in_modifier != MLX4_SET_PORT_GENERAL &&
    in_modifier != MLX4_SET_PORT_GID_TABLE) {
	mlx4_warn(dev, "denying SET_PORT for slave:%d\n", slave);
	return -EINVAL;
}
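
Read as an allow-list, the check above lets a slave issue only the
GENERAL and GID_TABLE flavors, so any other in_modifier is denied. A
minimal self-contained model of that logic (the enum values and demo_*
names are hypothetical, and DEMO_SET_PORT_NEW_FLAVOR merely stands in
for the new flavor):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical modifier values, not the real MLX4_SET_PORT_* encoding. */
enum demo_set_port_mod {
	DEMO_SET_PORT_GENERAL,
	DEMO_SET_PORT_GID_TABLE,
	DEMO_SET_PORT_NEW_FLAVOR,	/* stands in for the new flavor */
};

/* Returns 0 if the command may proceed, -EINVAL if it is denied. */
static int demo_check_set_port(int slave, int pf_function,
			       enum demo_set_port_mod in_modifier)
{
	if (slave != pf_function &&
	    in_modifier != DEMO_SET_PORT_GENERAL &&
	    in_modifier != DEMO_SET_PORT_GID_TABLE)
		return -EINVAL;

	return 0;
}

/* Quick self-check: function 0 is the PF, function 1 is a VF. */
static void demo_set_port_self_check(void)
{
	assert(demo_check_set_port(1, 0, DEMO_SET_PORT_NEW_FLAVOR) == -EINVAL);
	assert(demo_check_set_port(0, 0, DEMO_SET_PORT_NEW_FLAVOR) == 0);
}
```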



> Also find some spot to put blank line in the change-log, it's hard to
> read this way.
>

No problem

> Or.

Matan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
       [not found]         ` <CAJ3xEMgVvpj5S9gc_3onzCU5zjkXayOEZCCk_DofAwz194s8KQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-12-30 11:04           ` Moni Shoua
       [not found]             ` <CAG9sBKMOE1ULAnL=OWiSYc76vgi8jiuhnQxY3Li9NwGy3+r7oQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Moni Shoua @ 2015-12-30 11:04 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny, Or Gerlitz

>
> The last hunk that you removed had a role and was by no means
> dead-code, right? so... (1) why it's correct to remove it? (2) if you
> want to introduce different way to implement what was done here, why
> in this patch? maybe add pre-patch for that

In a way you are right. This hunk does not introduce a bug and even
improves correctness, but it actually belongs to an earlier patch
(dbf727de7440f73c4b92be4b958cbc24977e8ca2 IB/core: Use GID table in AH
creation and dmac resolution).

* Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support
       [not found]                 ` <5683997E.9090307-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-30 11:11                   ` Matan Barak
  0 siblings, 0 replies; 25+ messages in thread
From: Matan Barak @ 2015-12-30 11:11 UTC (permalink / raw)
  To: Or Gerlitz, Moni Shoua
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Yishai Hadas,
	Majd Dibbiny



On 12/30/2015 10:44 AM, Or Gerlitz wrote:
> On 12/30/2015 10:27 AM, Matan Barak wrote:
>>
>>
>> On 12/29/2015 5:19 PM, Or Gerlitz wrote:
>>> On 12/29/2015 3:24 PM, Matan Barak wrote:
>>>> @@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev,
>>>> struct mlx4_dev_cap *dev_cap)
>>>>           dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
>>>>       MLX4_GET(dev_cap->bmme_flags, outbox,
>>>>            QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
>>>> +    if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
>>>> +        dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
>>>
>>> Did you make sure that the query dev cap wrapper unsets this bit when
>>> proxying VF queries?
>>
>> In mlx4_dev_cap:
>> if (mlx4_is_mfunc(dev)) {
>>     dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
>>     dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
>>     mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
>> }
>>
>> mlx4_slave_cap calls mlx4_dev_cap and uses the dev_caps it queried, so
>> we should be safe here.
>
> mlx4_slave_cap is part of the Linux VF driver flow, right?
>
> So...  NO, this is the Linux implementation.
>
> You should make things robust against any guest driver.
>
> The only way to do that is to patch the command wrapper used by the PF
> to filter out unwanted cap bits; see the other filtering we do in
> mlx4_QUERY_DEV_CAP_wrapper
>

I agree, thanks

> Or.

Matan

>
>
>>
>>>
>>>>       if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
>>>>           dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
>>>>       MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);
>>>
>>>
>
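
A minimal standalone sketch of the filtering idea discussed above: the PF runs the real query, then clears the unwanted capability bit in the reply before handing it to the VF, so the result does not depend on what the guest driver does. Everything here is an assumption made for illustration (the sketch_* names, the buffer size, the offset and the bit position); it is not the mlx4 code and does not use the driver's real layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_OUTBOX_SIZE       16		/* hypothetical reply size */
#define SKETCH_BMME_FLAGS_OFFSET 4		/* hypothetical field offset */
#define SKETCH_FLAG_ROCE_V1_V2   (1u << 3)	/* hypothetical capability bit */

/* Read a big-endian 32-bit field from the reply buffer. */
static uint32_t sketch_get_be32(const uint8_t *buf, size_t off)
{
	return ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
	       ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
}

/* Write a big-endian 32-bit field into the reply buffer. */
static void sketch_put_be32(uint8_t *buf, size_t off, uint32_t val)
{
	buf[off]     = (uint8_t)(val >> 24);
	buf[off + 1] = (uint8_t)(val >> 16);
	buf[off + 2] = (uint8_t)(val >> 8);
	buf[off + 3] = (uint8_t)val;
}

/* Stand-in for the firmware query: reports the RoCE bit plus one other bit. */
static void sketch_query_dev_cap(uint8_t *outbox)
{
	memset(outbox, 0, SKETCH_OUTBOX_SIZE);
	sketch_put_be32(outbox, SKETCH_BMME_FLAGS_OFFSET,
			SKETCH_FLAG_ROCE_V1_V2 | 1u);
}

/* Wrapper run by the PF on behalf of a VF: query, then filter the reply. */
static void sketch_query_dev_cap_wrapper(uint8_t *outbox)
{
	uint32_t flags;

	sketch_query_dev_cap(outbox);

	flags = sketch_get_be32(outbox, SKETCH_BMME_FLAGS_OFFSET);
	flags &= ~SKETCH_FLAG_ROCE_V1_V2;	/* hide the capability from VFs */
	sketch_put_be32(outbox, SKETCH_BMME_FLAGS_OFFSET, flags);
}
```

The point of doing this in the wrapper rather than in the VF driver's capability code is that the bit never reaches the guest at all, whichever driver it runs.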

* Re: [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
       [not found]             ` <CAG9sBKMOE1ULAnL=OWiSYc76vgi8jiuhnQxY3Li9NwGy3+r7oQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-12-31  7:04               ` Or Gerlitz
  0 siblings, 0 replies; 25+ messages in thread
From: Or Gerlitz @ 2015-12-31  7:04 UTC (permalink / raw)
  To: Moni Shoua, Or Gerlitz
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Yishai Hadas, Majd Dibbiny

On 12/30/2015 1:04 PM, Moni Shoua wrote:
>> The last hunk that you removed had a role and was by no means
>> dead-code, right? so... (1) why it's correct to remove it? (2) if you
>> want to introduce different way to implement what was done here, why
>> in this patch? maybe add pre-patch for that
> In a way you are right. This hunk does not introduce a bug and even
> improves correctness, but it actually belongs to an earlier patch
> (dbf727de7440f73c4b92be4b958cbc24977e8ca2 IB/core: Use GID table in AH
> creation and dmac resolution)

So what's the plan here? Avoid deleting it?

Or.

Thread overview: 25+ messages
2015-12-29 13:24 [PATCH for-next 0/7] Add RoCE v2 support for mlx4 driver Matan Barak
     [not found] ` <1451395447-5198-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 13:24   ` [PATCH for-next 1/7] IB/mlx4: Query RoCE support Matan Barak
     [not found]     ` <1451395447-5198-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 15:19       ` Or Gerlitz
     [not found]         ` <5682A499.9040701-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30  8:27           ` Matan Barak
     [not found]             ` <5683957B.1070401-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30  8:44               ` Or Gerlitz
     [not found]                 ` <5683997E.9090307-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30 11:11                   ` Matan Barak
2015-12-29 13:24   ` [PATCH for-next 2/7] IB/mlx4: Add RoCE per GID support for add_gid and del_gid Matan Barak
     [not found]     ` <1451395447-5198-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 15:24       ` Or Gerlitz
     [not found]         ` <5682A5AA.9040709-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30 10:21           ` Matan Barak
2015-12-29 13:24   ` [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2 Matan Barak
     [not found]     ` <1451395447-5198-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 14:37       ` Or Gerlitz
     [not found]         ` <56829AB0.3080805-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30  8:23           ` Matan Barak
     [not found]             ` <56839469.3070508-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30  8:46               ` Or Gerlitz
2015-12-29 13:24   ` [PATCH for-next 4/7] net/mlx4_core: Add handlning of RoCE v2 over IPV4 in attach_flow Matan Barak
     [not found]     ` <1451395447-5198-5-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 14:28       ` Or Gerlitz
2015-12-29 13:24   ` [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers Matan Barak
     [not found]     ` <1451395447-5198-6-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 19:01       ` Or Gerlitz
     [not found]         ` <CAJ3xEMgVvpj5S9gc_3onzCU5zjkXayOEZCCk_DofAwz194s8KQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-12-30 11:04           ` Moni Shoua
     [not found]             ` <CAG9sBKMOE1ULAnL=OWiSYc76vgi8jiuhnQxY3Li9NwGy3+r7oQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-12-31  7:04               ` Or Gerlitz
2015-12-29 13:24   ` [PATCH for-next 6/7] IB/mlx4: Create and use another QP1 for RoCEv2 Matan Barak
     [not found]     ` <1451395447-5198-7-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 14:42       ` Or Gerlitz
     [not found]         ` <56829BC4.2070709-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30  8:24           ` Matan Barak
2015-12-29 13:24   ` [PATCH for-next 7/7] IB/mlx4: Advertise RoCE support Matan Barak
     [not found]     ` <1451395447-5198-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-29 14:44       ` Or Gerlitz
     [not found]         ` <56829C3D.5050009-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-30  8:25           ` Matan Barak
