* [PATCH for-next V1 0/9] Add RoCE v2 support
@ 2015-10-15 16:07 Matan Barak
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

Hi Doug,

This series adds support for RoCE v2. In order to support RoCE v2,
we add a gid_type attribute to every GID. When the RoCE GID management
populates the GID table, it duplicates each GID with all supported types.
This gives the user the ability to communicate over each supported
type.
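
As a rough illustration of the duplication described above, here is a
minimal user-space sketch. It is not code from this series: the demo_*
types and the demo_populate() helper are hypothetical stand-ins, and the
second GID type is assumed only for the sake of the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum demo_gid_type {
	DEMO_GID_TYPE_ROCE_V1 = 0,
	DEMO_GID_TYPE_ROCE_V2 = 1,
	DEMO_GID_TYPE_SIZE
};

struct demo_gid_entry {
	unsigned char raw[16];		/* the 128-bit GID value */
	enum demo_gid_type gid_type;	/* type attached to this entry */
};

/*
 * Write one table entry per GID type whose bit is set in type_mask.
 * Returns the number of entries written (never more than cap).
 */
static size_t demo_populate(struct demo_gid_entry *table, size_t cap,
			    const unsigned char raw[16],
			    unsigned long type_mask)
{
	size_t n = 0;
	unsigned int t;

	for (t = 0; t < DEMO_GID_TYPE_SIZE && n < cap; t++) {
		if (!(type_mask & (1UL << t)))
			continue;	/* type not supported on this port */
		memcpy(table[n].raw, raw, 16);
		table[n].gid_type = (enum demo_gid_type)t;
		n++;
	}
	return n;
}
```

With a mask that has both bits set, the same GID value ends up in two
entries that differ only in gid_type, so a user can pick the entry
(and thus the wire protocol) by GID index.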

Patches 0001, 0002 and 0003 add support for multiple GID types to the
cache and related APIs. The third patch exposes the GID attributes
information in sysfs.

Patch 0004 adds the RoCE v2 GID type and the capabilities required
from the vendor in order to implement RoCE v2. These capabilities
are grouped together as RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP.

RoCE v2 can run over both IPv4 and IPv6 networks. When receiving an ib_wc,
this information should come from the vendor's driver. If the vendor
doesn't supply this information, we parse the packet headers and resolve
the network type. Patch 0005 adds this information and the required utilities.
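
To give a feel for the header-based fallback, here is a simplified
user-space sketch. It is an assumption-laden illustration, not the
logic of patch 0005: demo_classify() and the demo_* names are
hypothetical, the caller is assumed to pass a pointer to the start of
the L3 header, and the IPv4 checksum and header-length validation a
real implementation would want are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; not the kernel's rdma_network_type. */
enum demo_net_type {
	DEMO_NET_IB,		/* IB GRH (also used by RoCE v1) */
	DEMO_NET_IPV4,
	DEMO_NET_IPV6,
	DEMO_NET_UNKNOWN
};

#define DEMO_IPPROTO_UDP 17	/* IP protocol number for UDP */

/*
 * Classify an L3 header by its version nibble. The IB GRH shares the
 * IPv6 layout (version 6, next header at byte 6), so for version 6 the
 * next-header field decides: UDP means RoCE v2 over IPv6, anything
 * else is treated as a GRH here.
 */
static enum demo_net_type demo_classify(const uint8_t *hdr, size_t len)
{
	unsigned int version;

	if (!hdr || len < 20)
		return DEMO_NET_UNKNOWN;

	version = hdr[0] >> 4;
	if (version == 4)
		return DEMO_NET_IPV4;
	if (version == 6 && len >= 40)
		return hdr[6] == DEMO_IPPROTO_UDP ? DEMO_NET_IPV6
						  : DEMO_NET_IB;
	return DEMO_NET_UNKNOWN;
}
```

A driver that already knows the network type would skip this step
entirely and report the type directly.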

Patches 0006 and 0007 add configfs support (and the required
infrastructure) for CMA. The administrator should be able to set the
default RoCE type. This is done through a new per-port
default_roce_mode configfs file.

Patch 0008 formats a QP1 packet in order to support RoCE v2 CM
packets. This is required for vendors which implement their
QP1 as a Raw QP.

Patch 0009 adds support for IPv4 multicast as an IPv4 network
requires IGMP to be sent in order to join multicast groups.

Vendor code isn't part of this patch set. Soft-RoCE will be
sent soon and depends on these patches. Other vendors' drivers, like
mlx4, ocrdma and mlx5, will follow.

This series applies on top of "Add RoCE GID cache usage in verbs/cma",
which was sent to the mailing list.

Thanks,
Matan

Changes from V0:
 - Rebased patches against Doug's latest k.o/for-4.4 tree.
 - Fixed a bug in configfs (rmdir caused an incorrect free).

Matan Barak (6):
  IB/core: Add gid_type to gid attribute
  IB/cm: Use the source GID index type
  IB/core: Add gid attributes to sysfs
  IB/core: Add ROCE_UDP_ENCAP (RoCE V2) type
  IB/rdma_cm: Add wrapper for cma reference count
  IB/cma: Add configfs for rdma_cm

Moni Shoua (2):
  IB/core: Initialize UD header structure with IP and UDP headers
  IB/cma: Join and leave multicast groups with IGMP

Somnath Kotur (1):
  IB/core: Add rdma_network_type to wc

 drivers/infiniband/Kconfig                |   9 +
 drivers/infiniband/core/Makefile          |   2 +
 drivers/infiniband/core/addr.c            |  14 ++
 drivers/infiniband/core/cache.c           | 156 +++++++++----
 drivers/infiniband/core/cm.c              |  25 ++-
 drivers/infiniband/core/cma.c             | 216 ++++++++++++++++--
 drivers/infiniband/core/cma_configfs.c    | 353 ++++++++++++++++++++++++++++++
 drivers/infiniband/core/core_priv.h       |  32 +++
 drivers/infiniband/core/device.c          |  10 +-
 drivers/infiniband/core/multicast.c       |  20 +-
 drivers/infiniband/core/roce_gid_mgmt.c   |  61 +++++-
 drivers/infiniband/core/sa_query.c        |   5 +-
 drivers/infiniband/core/sysfs.c           | 184 +++++++++++++++-
 drivers/infiniband/core/ud_header.c       | 155 ++++++++++++-
 drivers/infiniband/core/uverbs_marshall.c |   1 +
 drivers/infiniband/core/verbs.c           | 124 ++++++++++-
 drivers/infiniband/hw/mlx4/qp.c           |   7 +-
 drivers/infiniband/hw/mthca/mthca_qp.c    |   2 +-
 include/rdma/ib_addr.h                    |   1 +
 include/rdma/ib_cache.h                   |   4 +
 include/rdma/ib_pack.h                    |  45 +++-
 include/rdma/ib_sa.h                      |   4 +
 include/rdma/ib_verbs.h                   |  78 ++++++-
 23 files changed, 1402 insertions(+), 106 deletions(-)
 create mode 100644 drivers/infiniband/core/cma_configfs.c

-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH for-next V1 1/9] IB/core: Add gid_type to gid attribute
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-10-15 16:07   ` Matan Barak
  2015-10-15 16:07   ` [PATCH for-next V1 2/9] IB/cm: Use the source GID index type Matan Barak
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

In order to support multiple GID types, we need to store the gid_type
with each GID. This is also aligned with the RoCE v2 annex: "RoCEv2 PORT
GID table entries shall have a "GID type" attribute that denotes the L3
Address type". The only GID type currently supported is IB_GID_TYPE_IB,
which is also the RoCE v1 GID type.

This implies that gid_type should be added to roce_gid_table meta-data.
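
The effect on lookups can be sketched in user space as follows. This is
only an illustration of a mask-driven search in the spirit of find_gid();
the demo_* names are hypothetical and the real function also handles
netdev, default and invalid entries.

```c
#include <string.h>

/* Hypothetical mask bits, loosely mirroring GID_ATTR_FIND_MASK_*. */
#define DEMO_FIND_MASK_GID	(1UL << 0)
#define DEMO_FIND_MASK_GID_TYPE	(1UL << 3)

struct demo_entry {
	unsigned char raw[16];	/* GID value */
	int gid_type;		/* type stored alongside the GID */
};

/*
 * Return the index of the first entry matching every field selected
 * by mask, or -1 if there is no such entry.
 */
static int demo_find(const struct demo_entry *tbl, int n,
		     const struct demo_entry *val, unsigned long mask)
{
	int i;

	for (i = 0; i < n; i++) {
		if ((mask & DEMO_FIND_MASK_GID_TYPE) &&
		    tbl[i].gid_type != val->gid_type)
			continue;
		if ((mask & DEMO_FIND_MASK_GID) &&
		    memcmp(tbl[i].raw, val->raw, sizeof(val->raw)))
			continue;
		return i;
	}
	return -1;
}
```

Two entries with the same GID value but different types are then
distinct as far as add, delete and find are concerned.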

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cache.c           | 131 +++++++++++++++++++++---------
 drivers/infiniband/core/cm.c              |   2 +-
 drivers/infiniband/core/cma.c             |   3 +-
 drivers/infiniband/core/core_priv.h       |   4 +
 drivers/infiniband/core/device.c          |  10 ++-
 drivers/infiniband/core/multicast.c       |   2 +-
 drivers/infiniband/core/roce_gid_mgmt.c   |  60 ++++++++++++--
 drivers/infiniband/core/sa_query.c        |   5 +-
 drivers/infiniband/core/uverbs_marshall.c |   1 +
 drivers/infiniband/core/verbs.c           |   1 +
 include/rdma/ib_cache.h                   |   4 +
 include/rdma/ib_sa.h                      |   1 +
 include/rdma/ib_verbs.h                   |  11 ++-
 13 files changed, 180 insertions(+), 55 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 89bebea..d9ca6c3 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -64,6 +64,7 @@ enum gid_attr_find_mask {
 	GID_ATTR_FIND_MASK_GID          = 1UL << 0,
 	GID_ATTR_FIND_MASK_NETDEV	= 1UL << 1,
 	GID_ATTR_FIND_MASK_DEFAULT	= 1UL << 2,
+	GID_ATTR_FIND_MASK_GID_TYPE	= 1UL << 3,
 };
 
 enum gid_table_entry_props {
@@ -112,6 +113,19 @@ struct ib_gid_table {
 	struct ib_gid_table_entry *data_vec;
 };
 
+static const char * const gid_type_str[] = {
+	[IB_GID_TYPE_IB]	= "IB/RoCE v1",
+};
+
+const char *ib_cache_gid_type_str(enum ib_gid_type gid_type)
+{
+	if (gid_type < ARRAY_SIZE(gid_type_str) && gid_type_str[gid_type])
+		return gid_type_str[gid_type];
+
+	return "Invalid GID type";
+}
+EXPORT_SYMBOL(ib_cache_gid_type_str);
+
 static int write_gid(struct ib_device *ib_dev, u8 port,
 		     struct ib_gid_table *table, int ix,
 		     const union ib_gid *gid,
@@ -216,6 +230,10 @@ static int find_gid(struct ib_gid_table *table, const union ib_gid *gid,
 		if (table->data_vec[i].props & GID_TABLE_ENTRY_INVALID)
 			goto next;
 
+		if (mask & GID_ATTR_FIND_MASK_GID_TYPE &&
+		    attr->gid_type != val->gid_type)
+			goto next;
+
 		if (mask & GID_ATTR_FIND_MASK_GID &&
 		    memcmp(gid, &table->data_vec[i].gid, sizeof(*gid)))
 			goto next;
@@ -277,6 +295,7 @@ int ib_cache_gid_add(struct ib_device *ib_dev, u8 port,
 	mutex_lock(&table->lock);
 
 	ix = find_gid(table, gid, attr, false, GID_ATTR_FIND_MASK_GID |
+		      GID_ATTR_FIND_MASK_GID_TYPE |
 		      GID_ATTR_FIND_MASK_NETDEV);
 	if (ix >= 0)
 		goto out_unlock;
@@ -308,6 +327,7 @@ int ib_cache_gid_del(struct ib_device *ib_dev, u8 port,
 
 	ix = find_gid(table, gid, attr, false,
 		      GID_ATTR_FIND_MASK_GID	  |
+		      GID_ATTR_FIND_MASK_GID_TYPE |
 		      GID_ATTR_FIND_MASK_NETDEV	  |
 		      GID_ATTR_FIND_MASK_DEFAULT);
 	if (ix < 0)
@@ -396,11 +416,13 @@ static int _ib_cache_gid_table_find(struct ib_device *ib_dev,
 
 static int ib_cache_gid_find(struct ib_device *ib_dev,
 			     const union ib_gid *gid,
+			     enum ib_gid_type gid_type,
 			     struct net_device *ndev, u8 *port,
 			     u16 *index)
 {
-	unsigned long mask = GID_ATTR_FIND_MASK_GID;
-	struct ib_gid_attr gid_attr_val = {.ndev = ndev};
+	unsigned long mask = GID_ATTR_FIND_MASK_GID |
+			     GID_ATTR_FIND_MASK_GID_TYPE;
+	struct ib_gid_attr gid_attr_val = {.ndev = ndev, .gid_type = gid_type};
 
 	if (ndev)
 		mask |= GID_ATTR_FIND_MASK_NETDEV;
@@ -411,14 +433,16 @@ static int ib_cache_gid_find(struct ib_device *ib_dev,
 
 int ib_find_cached_gid_by_port(struct ib_device *ib_dev,
 			       const union ib_gid *gid,
+			       enum ib_gid_type gid_type,
 			       u8 port, struct net_device *ndev,
 			       u16 *index)
 {
 	int local_index;
 	struct ib_gid_table **ports_table = ib_dev->cache.gid_cache;
 	struct ib_gid_table *table;
-	unsigned long mask = GID_ATTR_FIND_MASK_GID;
-	struct ib_gid_attr val = {.ndev = ndev};
+	unsigned long mask = GID_ATTR_FIND_MASK_GID |
+			     GID_ATTR_FIND_MASK_GID_TYPE;
+	struct ib_gid_attr val = {.ndev = ndev, .gid_type = gid_type};
 
 	if (port < rdma_start_port(ib_dev) ||
 	    port > rdma_end_port(ib_dev))
@@ -568,15 +592,15 @@ static void cleanup_gid_table_port(struct ib_device *ib_dev, u8 port,
 
 void ib_cache_gid_set_default_gid(struct ib_device *ib_dev, u8 port,
 				  struct net_device *ndev,
+				  unsigned long gid_type_mask,
 				  enum ib_cache_gid_default_mode mode)
 {
 	struct ib_gid_table **ports_table = ib_dev->cache.gid_cache;
 	union ib_gid gid;
 	struct ib_gid_attr gid_attr;
+	struct ib_gid_attr zattr_type = zattr;
 	struct ib_gid_table *table;
-	int ix;
-	union ib_gid current_gid;
-	struct ib_gid_attr current_gid_attr = {};
+	unsigned int gid_type;
 
 	table  = ports_table[port - rdma_start_port(ib_dev)];
 
@@ -584,46 +608,74 @@ void ib_cache_gid_set_default_gid(struct ib_device *ib_dev, u8 port,
 	memset(&gid_attr, 0, sizeof(gid_attr));
 	gid_attr.ndev = ndev;
 
-	mutex_lock(&table->lock);
-	ix = find_gid(table, NULL, NULL, true, GID_ATTR_FIND_MASK_DEFAULT);
-
-	/* Coudn't find default GID location */
-	WARN_ON(ix < 0);
-
-	if (!__ib_cache_gid_get(ib_dev, port, ix,
-				&current_gid, &current_gid_attr) &&
-	    mode == IB_CACHE_GID_DEFAULT_MODE_SET &&
-	    !memcmp(&gid, &current_gid, sizeof(gid)) &&
-	    !memcmp(&gid_attr, &current_gid_attr, sizeof(gid_attr)))
-		goto unlock;
-
-	if ((memcmp(&current_gid, &zgid, sizeof(current_gid)) ||
-	     memcmp(&current_gid_attr, &zattr,
-		    sizeof(current_gid_attr))) &&
-	    del_gid(ib_dev, port, table, ix, true)) {
-		pr_warn("ib_cache_gid: can't delete index %d for default gid %pI6\n",
-			ix, gid.raw);
-		goto unlock;
-	}
+	for (gid_type = 0; gid_type < IB_GID_TYPE_SIZE; ++gid_type) {
+		int ix;
+		union ib_gid current_gid;
+		struct ib_gid_attr current_gid_attr = {};
 
-	if (mode == IB_CACHE_GID_DEFAULT_MODE_SET)
-		if (add_gid(ib_dev, port, table, ix, &gid, &gid_attr, true))
-			pr_warn("ib_cache_gid: unable to add default gid %pI6\n",
-				gid.raw);
+		if (1UL << gid_type & ~gid_type_mask)
+			continue;
 
-unlock:
-	if (current_gid_attr.ndev)
-		dev_put(current_gid_attr.ndev);
-	mutex_unlock(&table->lock);
+		gid_attr.gid_type = gid_type;
+
+		mutex_lock(&table->lock);
+		ix = find_gid(table, &gid, &gid_attr, true,
+			      GID_ATTR_FIND_MASK_GID_TYPE |
+			      GID_ATTR_FIND_MASK_DEFAULT);
+
+		/* Couldn't find default GID location */
+		WARN_ON(ix < 0);
+
+		zattr_type.gid_type = gid_type;
+
+		if (!__ib_cache_gid_get(ib_dev, port, ix,
+					&current_gid, &current_gid_attr) &&
+		    mode == IB_CACHE_GID_DEFAULT_MODE_SET &&
+		    !memcmp(&gid, &current_gid, sizeof(gid)) &&
+		    !memcmp(&gid_attr, &current_gid_attr, sizeof(gid_attr)))
+			goto release;
+
+		if ((memcmp(&current_gid, &zgid, sizeof(current_gid)) ||
+		     memcmp(&current_gid_attr, &zattr_type,
+			    sizeof(current_gid_attr))) &&
+		    del_gid(ib_dev, port, table, ix, true)) {
+			pr_warn("roce_gid_table: can't delete index %d for default gid %pI6\n",
+				ix, gid.raw);
+			goto release;
+		}
+
+		if (mode == IB_CACHE_GID_DEFAULT_MODE_SET)
+			if (add_gid(ib_dev, port, table, ix, &gid, &gid_attr,
+				    true))
+				pr_warn("roce_gid_table: unable to add default gid %pI6\n",
+					gid.raw);
+
+release:
+		if (current_gid_attr.ndev)
+			dev_put(current_gid_attr.ndev);
+		mutex_unlock(&table->lock);
+	}
 }
 
 static int gid_table_reserve_default(struct ib_device *ib_dev, u8 port,
 				     struct ib_gid_table *table)
 {
-	if (rdma_protocol_roce(ib_dev, port)) {
-		struct ib_gid_table_entry *entry = &table->data_vec[0];
+	unsigned int i;
+	unsigned long roce_gid_type_mask;
+	unsigned int num_default_gids;
+	unsigned int current_gid = 0;
+
+	roce_gid_type_mask = roce_gid_type_mask_support(ib_dev, port);
+	num_default_gids = hweight_long(roce_gid_type_mask);
+	for (i = 0; i < num_default_gids && i < table->sz; i++) {
+		struct ib_gid_table_entry *entry =
+			&table->data_vec[i];
 
 		entry->props |= GID_TABLE_ENTRY_DEFAULT;
+		current_gid = find_next_bit(&roce_gid_type_mask,
+					    BITS_PER_LONG,
+					    current_gid);
+		entry->attr.gid_type = current_gid++;
 	}
 
 	return 0;
@@ -737,11 +789,12 @@ EXPORT_SYMBOL(ib_get_cached_gid);
 
 int ib_find_cached_gid(struct ib_device *device,
 		       const union ib_gid *gid,
+		       enum ib_gid_type gid_type,
 		       struct net_device *ndev,
 		       u8               *port_num,
 		       u16              *index)
 {
-	return ib_cache_gid_find(device, gid, ndev, port_num, index);
+	return ib_cache_gid_find(device, gid, gid_type, ndev, port_num, index);
 }
 EXPORT_SYMBOL(ib_find_cached_gid);
 
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 6dd24a5..a8e14be 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -364,7 +364,7 @@ static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
 	read_lock_irqsave(&cm.device_lock, flags);
 	list_for_each_entry(cm_dev, &cm.device_list, list) {
 		if (!ib_find_cached_gid(cm_dev->ib_device, &path->sgid,
-					ndev, &p, NULL)) {
+					IB_GID_TYPE_IB, ndev, &p, NULL)) {
 			port = cm_dev->port[p-1];
 			break;
 		}
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 2914460..a56e957 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -442,7 +442,8 @@ static inline int cma_validate_port(struct ib_device *device, u8 port,
 	if (dev_type == ARPHRD_ETHER)
 		ndev = dev_get_by_index(&init_net, bound_if_index);
 
-	ret = ib_find_cached_gid_by_port(device, gid, port, ndev, NULL);
+	ret = ib_find_cached_gid_by_port(device, gid, IB_GID_TYPE_IB, port,
+					 ndev, NULL);
 
 	if (ndev)
 		dev_put(ndev);
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 5cf6eb7..d531f91 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -70,8 +70,11 @@ enum ib_cache_gid_default_mode {
 	IB_CACHE_GID_DEFAULT_MODE_DELETE
 };
 
+const char *ib_cache_gid_type_str(enum ib_gid_type gid_type);
+
 void ib_cache_gid_set_default_gid(struct ib_device *ib_dev, u8 port,
 				  struct net_device *ndev,
+				  unsigned long gid_type_mask,
 				  enum ib_cache_gid_default_mode mode);
 
 int ib_cache_gid_add(struct ib_device *ib_dev, u8 port,
@@ -87,6 +90,7 @@ int roce_gid_mgmt_init(void);
 void roce_gid_mgmt_cleanup(void);
 
 int roce_rescan_device(struct ib_device *ib_dev);
+unsigned long roce_gid_type_mask_support(struct ib_device *ib_dev, u8 port);
 
 int ib_cache_setup_one(struct ib_device *device);
 void ib_cache_cleanup_one(struct ib_device *device);
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 179e813..bb6c47d 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -825,26 +825,32 @@ EXPORT_SYMBOL(ib_modify_port);
  *   a specified GID value occurs.
  * @device: The device to query.
  * @gid: The GID value to search for.
+ * @gid_type: Type of GID.
  * @ndev: The ndev related to the GID to search for.
  * @port_num: The port number of the device where the GID value was found.
  * @index: The index into the GID table where the GID was found.  This
  *   parameter may be NULL.
  */
 int ib_find_gid(struct ib_device *device, union ib_gid *gid,
-		struct net_device *ndev, u8 *port_num, u16 *index)
+		enum ib_gid_type gid_type, struct net_device *ndev,
+		u8 *port_num, u16 *index)
 {
 	union ib_gid tmp_gid;
 	int ret, port, i;
 
 	for (port = rdma_start_port(device); port <= rdma_end_port(device); ++port) {
 		if (rdma_cap_roce_gid_table(device, port)) {
-			if (!ib_find_cached_gid_by_port(device, gid, port,
+			if (!ib_find_cached_gid_by_port(device, gid, gid_type, port,
 							ndev, index)) {
+
 				*port_num = port;
 				return 0;
 			}
 		}
 
+		if (gid_type != IB_GID_TYPE_IB)
+			continue;
+
 		for (i = 0; i < device->port_immutable[port].gid_tbl_len; ++i) {
 			ret = ib_query_gid(device, port, i, &tmp_gid, NULL);
 			if (ret)
diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index bb6685f..6911ae6 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -729,7 +729,7 @@ int ib_init_ah_from_mcmember(struct ib_device *device, u8 port_num,
 	u16 gid_index;
 	u8 p;
 
-	ret = ib_find_cached_gid(device, &rec->port_gid,
+	ret = ib_find_cached_gid(device, &rec->port_gid, IB_GID_TYPE_IB,
 				 NULL, &p, &gid_index);
 	if (ret)
 		return ret;
diff --git a/drivers/infiniband/core/roce_gid_mgmt.c b/drivers/infiniband/core/roce_gid_mgmt.c
index 178f984..61c27a7 100644
--- a/drivers/infiniband/core/roce_gid_mgmt.c
+++ b/drivers/infiniband/core/roce_gid_mgmt.c
@@ -67,17 +67,52 @@ struct netdev_event_work {
 	struct netdev_event_work_cmd	cmds[ROCE_NETDEV_CALLBACK_SZ];
 };
 
+static const struct {
+	bool (*is_supported)(const struct ib_device *device, u8 port_num);
+	enum ib_gid_type gid_type;
+} PORT_CAP_TO_GID_TYPE[] = {
+	{rdma_protocol_roce,   IB_GID_TYPE_ROCE},
+};
+
+#define CAP_TO_GID_TABLE_SIZE	ARRAY_SIZE(PORT_CAP_TO_GID_TYPE)
+
+unsigned long roce_gid_type_mask_support(struct ib_device *ib_dev, u8 port)
+{
+	int i;
+	unsigned int ret_flags = 0;
+
+	if (!rdma_protocol_roce(ib_dev, port))
+		return 1UL << IB_GID_TYPE_IB;
+
+	for (i = 0; i < CAP_TO_GID_TABLE_SIZE; i++)
+		if (PORT_CAP_TO_GID_TYPE[i].is_supported(ib_dev, port))
+			ret_flags |= 1UL << PORT_CAP_TO_GID_TYPE[i].gid_type;
+
+	return ret_flags;
+}
+EXPORT_SYMBOL(roce_gid_type_mask_support);
+
 static void update_gid(enum gid_op_type gid_op, struct ib_device *ib_dev,
 		       u8 port, union ib_gid *gid,
 		       struct ib_gid_attr *gid_attr)
 {
-	switch (gid_op) {
-	case GID_ADD:
-		ib_cache_gid_add(ib_dev, port, gid, gid_attr);
-		break;
-	case GID_DEL:
-		ib_cache_gid_del(ib_dev, port, gid, gid_attr);
-		break;
+	int i;
+	unsigned long gid_type_mask = roce_gid_type_mask_support(ib_dev, port);
+
+	for (i = 0; i < IB_GID_TYPE_SIZE; i++) {
+		if ((1UL << i) & gid_type_mask) {
+			gid_attr->gid_type = i;
+			switch (gid_op) {
+			case GID_ADD:
+				ib_cache_gid_add(ib_dev, port,
+						 gid, gid_attr);
+				break;
+			case GID_DEL:
+				ib_cache_gid_del(ib_dev, port,
+						 gid, gid_attr);
+				break;
+			}
+		}
 	}
 }
 
@@ -203,6 +238,8 @@ static void enum_netdev_default_gids(struct ib_device *ib_dev,
 				     u8 port, struct net_device *event_ndev,
 				     struct net_device *rdma_ndev)
 {
+	unsigned long gid_type_mask;
+
 	rcu_read_lock();
 	if (!rdma_ndev ||
 	    ((rdma_ndev != event_ndev &&
@@ -215,7 +252,9 @@ static void enum_netdev_default_gids(struct ib_device *ib_dev,
 	}
 	rcu_read_unlock();
 
-	ib_cache_gid_set_default_gid(ib_dev, port, rdma_ndev,
+	gid_type_mask = roce_gid_type_mask_support(ib_dev, port);
+
+	ib_cache_gid_set_default_gid(ib_dev, port, rdma_ndev, gid_type_mask,
 				     IB_CACHE_GID_DEFAULT_MODE_SET);
 }
 
@@ -237,9 +276,14 @@ static void bond_delete_netdev_default_gids(struct ib_device *ib_dev,
 	if (is_upper_dev_rcu(rdma_ndev, event_ndev) &&
 	    is_eth_active_slave_of_bonding_rcu(rdma_ndev, real_dev) ==
 	    BONDING_SLAVE_STATE_INACTIVE) {
+		unsigned long gid_type_mask;
+
 		rcu_read_unlock();
 
+		gid_type_mask = roce_gid_type_mask_support(ib_dev, port);
+
 		ib_cache_gid_set_default_gid(ib_dev, port, rdma_ndev,
+					     gid_type_mask,
 					     IB_CACHE_GID_DEFAULT_MODE_DELETE);
 	} else {
 		rcu_read_unlock();
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index dcdaa79..11eef25 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -1012,8 +1012,8 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
 		ah_attr->ah_flags = IB_AH_GRH;
 		ah_attr->grh.dgid = rec->dgid;
 
-		ret = ib_find_cached_gid(device, &rec->sgid, ndev, &port_num,
-					 &gid_index);
+		ret = ib_find_cached_gid(device, &rec->sgid, rec->gid_type, ndev,
+					 &port_num, &gid_index);
 		if (ret) {
 			if (ndev)
 				dev_put(ndev);
@@ -1155,6 +1155,7 @@ static void ib_sa_path_rec_callback(struct ib_sa_query *sa_query,
 			  mad->data, &rec);
 		rec.net = NULL;
 		rec.ifindex = 0;
+		rec.gid_type = IB_GID_TYPE_IB;
 		memset(rec.dmac, 0, ETH_ALEN);
 		query->callback(status, &rec, query->context);
 	} else
diff --git a/drivers/infiniband/core/uverbs_marshall.c b/drivers/infiniband/core/uverbs_marshall.c
index 7d2f14c..af020f8 100644
--- a/drivers/infiniband/core/uverbs_marshall.c
+++ b/drivers/infiniband/core/uverbs_marshall.c
@@ -144,5 +144,6 @@ void ib_copy_path_rec_from_user(struct ib_sa_path_rec *dst,
 	memset(dst->dmac, 0, sizeof(dst->dmac));
 	dst->net = NULL;
 	dst->ifindex = 0;
+	dst->gid_type = IB_GID_TYPE_IB;
 }
 EXPORT_SYMBOL(ib_copy_path_rec_from_user);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 46d97f0..8b4ade6 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -387,6 +387,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
 
 		if (!rdma_cap_eth_ah(device, port_num)) {
 			ret = ib_find_cached_gid_by_port(device, &grh->dgid,
+							 IB_GID_TYPE_IB,
 							 port_num, NULL,
 							 &gid_index);
 			if (ret)
diff --git a/include/rdma/ib_cache.h b/include/rdma/ib_cache.h
index 269a27cf..e30f19b 100644
--- a/include/rdma/ib_cache.h
+++ b/include/rdma/ib_cache.h
@@ -60,6 +60,7 @@ int ib_get_cached_gid(struct ib_device    *device,
  *   a specified GID value occurs.
  * @device: The device to query.
  * @gid: The GID value to search for.
+ * @gid_type: The GID type to search for.
  * @ndev: In RoCE, the net device of the device. NULL means ignore.
  * @port_num: The port number of the device where the GID value was found.
  * @index: The index into the cached GID table where the GID was found.  This
@@ -70,6 +71,7 @@ int ib_get_cached_gid(struct ib_device    *device,
  */
 int ib_find_cached_gid(struct ib_device *device,
 		       const union ib_gid *gid,
+		       enum ib_gid_type gid_type,
 		       struct net_device *ndev,
 		       u8               *port_num,
 		       u16              *index);
@@ -79,6 +81,7 @@ int ib_find_cached_gid(struct ib_device *device,
  * GID value occurs
  * @device: The device to query.
  * @gid: The GID value to search for.
+ * @gid_type: The GID type to search for.
  * @port_num: The port number of the device where the GID value sould be
  *   searched.
  * @ndev: In RoCE, the net device of the device. Null means ignore.
@@ -90,6 +93,7 @@ int ib_find_cached_gid(struct ib_device *device,
  */
 int ib_find_cached_gid_by_port(struct ib_device *device,
 			       const union ib_gid *gid,
+			       enum ib_gid_type gid_type,
 			       u8               port_num,
 			       struct net_device *ndev,
 			       u16              *index);
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index 3019695..0a40ed2 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -160,6 +160,7 @@ struct ib_sa_path_rec {
 	int	     ifindex;
 	/* ignored in IB */
 	struct net  *net;
+	enum ib_gid_type gid_type;
 };
 
 static inline struct net_device *ib_get_ndev_from_path(struct ib_sa_path_rec *rec)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 0b20658..7619e22 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -67,7 +67,15 @@ union ib_gid {
 
 extern union ib_gid zgid;
 
+enum ib_gid_type {
+	/* If link layer is Ethernet, this is RoCE V1 */
+	IB_GID_TYPE_IB        = 0,
+	IB_GID_TYPE_ROCE      = 0,
+	IB_GID_TYPE_SIZE
+};
+
 struct ib_gid_attr {
+	enum ib_gid_type	gid_type;
 	struct net_device	*ndev;
 };
 
@@ -2186,7 +2194,8 @@ int ib_modify_port(struct ib_device *device,
 		   struct ib_port_modify *port_modify);
 
 int ib_find_gid(struct ib_device *device, union ib_gid *gid,
-		struct net_device *ndev, u8 *port_num, u16 *index);
+		enum ib_gid_type gid_type, struct net_device *ndev,
+		u8 *port_num, u16 *index);
 
 int ib_find_pkey(struct ib_device *device,
 		 u8 port_num, u16 pkey, u16 *index);
-- 
2.1.0


* [PATCH for-next V1 2/9] IB/cm: Use the source GID index type
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15 16:07   ` [PATCH for-next V1 1/9] IB/core: Add gid_type to gid attribute Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
  2015-10-15 16:07   ` [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs Matan Barak
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

Previously, the cm and cma modules supported only the IB and RoCE v1 GID type.
In order to support multiple GID types, the gid_type is passed to
cm_init_av_by_path and stored in the path record.

The rdma cm client would use a default GID type that will be saved in
rdma_id_private.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cm.c  | 25 ++++++++++++++++++++-----
 drivers/infiniband/core/cma.c |  2 ++
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index a8e14be..985e4cd 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -364,7 +364,7 @@ static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
 	read_lock_irqsave(&cm.device_lock, flags);
 	list_for_each_entry(cm_dev, &cm.device_list, list) {
 		if (!ib_find_cached_gid(cm_dev->ib_device, &path->sgid,
-					IB_GID_TYPE_IB, ndev, &p, NULL)) {
+					path->gid_type, ndev, &p, NULL)) {
 			port = cm_dev->port[p-1];
 			break;
 		}
@@ -1595,6 +1595,8 @@ static int cm_req_handler(struct cm_work *work)
 	struct ib_cm_id *cm_id;
 	struct cm_id_private *cm_id_priv, *listen_cm_id_priv;
 	struct cm_req_msg *req_msg;
+	union ib_gid gid;
+	struct ib_gid_attr gid_attr;
 	int ret;
 
 	req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
@@ -1634,11 +1636,24 @@ static int cm_req_handler(struct cm_work *work)
 	cm_format_paths_from_req(req_msg, &work->path[0], &work->path[1]);
 
 	memcpy(work->path[0].dmac, cm_id_priv->av.ah_attr.dmac, ETH_ALEN);
-	ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av);
+	ret = ib_get_cached_gid(work->port->cm_dev->ib_device,
+				work->port->port_num,
+				cm_id_priv->av.ah_attr.grh.sgid_index,
+				&gid, &gid_attr);
+	if (!ret) {
+		if (gid_attr.ndev)
+			dev_put(gid_attr.ndev);
+		work->path[0].gid_type = gid_attr.gid_type;
+		ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av);
+	}
 	if (ret) {
-		ib_get_cached_gid(work->port->cm_dev->ib_device,
-				  work->port->port_num, 0, &work->path[0].sgid,
-				  NULL);
+		int err = ib_get_cached_gid(work->port->cm_dev->ib_device,
+					    work->port->port_num, 0,
+					    &work->path[0].sgid,
+					    &gid_attr);
+		if (!err && gid_attr.ndev)
+			dev_put(gid_attr.ndev);
+		work->path[0].gid_type = gid_attr.gid_type;
 		ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID,
 			       &work->path[0].sgid, sizeof work->path[0].sgid,
 			       NULL, 0);
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a56e957..2e592e6 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -214,6 +214,7 @@ struct rdma_id_private {
 	u8			tos;
 	u8			reuseaddr;
 	u8			afonly;
+	enum ib_gid_type	gid_type;
 };
 
 struct cma_multicast {
@@ -2276,6 +2277,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
 		ndev = dev_get_by_index(&init_net, addr->dev_addr.bound_dev_if);
 		route->path_rec->net = &init_net;
 		route->path_rec->ifindex = addr->dev_addr.bound_dev_if;
+		route->path_rec->gid_type = id_priv->gid_type;
 	}
 	if (!ndev) {
 		ret = -ENODEV;
-- 
2.1.0


* [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15 16:07   ` [PATCH for-next V1 1/9] IB/core: Add gid_type to gid attribute Matan Barak
  2015-10-15 16:07   ` [PATCH for-next V1 2/9] IB/cm: Use the source GID index type Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
       [not found]     ` <1444925232-13598-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15 16:07   ` [PATCH for-next V1 4/9] IB/core: Add ROCE_UDP_ENCAP (RoCE V2) type Matan Barak
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

This patch exposes the net device and GID type attributes of each GID
in the GID table through sysfs. Users that use verbs directly need to
specify the GID index. Since the same GID could have different types or
associated net devices, users should have the ability to query the
associated GID attributes.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/sysfs.c | 184 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 182 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index b1f37d4..4d5d87a 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -37,12 +37,22 @@
 #include <linux/slab.h>
 #include <linux/stat.h>
 #include <linux/string.h>
+#include <linux/netdevice.h>
 
 #include <rdma/ib_mad.h>
 
+struct ib_port;
+
+struct gid_attr_group {
+	struct ib_port		*port;
+	struct kobject		kobj;
+	struct attribute_group	ndev;
+	struct attribute_group	type;
+};
 struct ib_port {
 	struct kobject         kobj;
 	struct ib_device      *ibdev;
+	struct gid_attr_group *gid_attr_group;
 	struct attribute_group gid_group;
 	struct attribute_group pkey_group;
 	u8                     port_num;
@@ -84,6 +94,24 @@ static const struct sysfs_ops port_sysfs_ops = {
 	.show = port_attr_show
 };
 
+static ssize_t gid_attr_show(struct kobject *kobj,
+			     struct attribute *attr, char *buf)
+{
+	struct port_attribute *port_attr =
+		container_of(attr, struct port_attribute, attr);
+	struct ib_port *p = container_of(kobj, struct gid_attr_group,
+					 kobj)->port;
+
+	if (!port_attr->show)
+		return -EIO;
+
+	return port_attr->show(p, port_attr, buf);
+}
+
+static const struct sysfs_ops gid_attr_sysfs_ops = {
+	.show = gid_attr_show
+};
+
 static ssize_t state_show(struct ib_port *p, struct port_attribute *unused,
 			  char *buf)
 {
@@ -281,6 +309,46 @@ static struct attribute *port_default_attrs[] = {
 	NULL
 };
 
+static ssize_t print_ndev(struct ib_gid_attr *gid_attr, char *buf)
+{
+	if (!gid_attr->ndev)
+		return -EINVAL;
+
+	return sprintf(buf, "%s\n", gid_attr->ndev->name);
+}
+
+static ssize_t print_gid_type(struct ib_gid_attr *gid_attr, char *buf)
+{
+	return sprintf(buf, "%s\n", ib_cache_gid_type_str(gid_attr->gid_type));
+}
+
+static ssize_t _show_port_gid_attr(struct ib_port *p,
+				   struct port_attribute *attr,
+				   char *buf,
+				   ssize_t (*print)(struct ib_gid_attr *gid_attr,
+						    char *buf))
+{
+	struct port_table_attribute *tab_attr =
+		container_of(attr, struct port_table_attribute, attr);
+	union ib_gid gid;
+	struct ib_gid_attr gid_attr = {};
+	ssize_t ret;
+	va_list args;
+
+	ret = ib_query_gid(p->ibdev, p->port_num, tab_attr->index, &gid,
+			   &gid_attr);
+	if (ret)
+		goto err;
+
+	ret = print(&gid_attr, buf);
+
+err:
+	if (gid_attr.ndev)
+		dev_put(gid_attr.ndev);
+	va_end(args);
+	return ret;
+}
+
 static ssize_t show_port_gid(struct ib_port *p, struct port_attribute *attr,
 			     char *buf)
 {
@@ -296,6 +364,19 @@ static ssize_t show_port_gid(struct ib_port *p, struct port_attribute *attr,
 	return sprintf(buf, "%pI6\n", gid.raw);
 }
 
+static ssize_t show_port_gid_attr_ndev(struct ib_port *p,
+				       struct port_attribute *attr, char *buf)
+{
+	return _show_port_gid_attr(p, attr, buf, print_ndev);
+}
+
+static ssize_t show_port_gid_attr_gid_type(struct ib_port *p,
+					   struct port_attribute *attr,
+					   char *buf)
+{
+	return _show_port_gid_attr(p, attr, buf, print_gid_type);
+}
+
 static ssize_t show_port_pkey(struct ib_port *p, struct port_attribute *attr,
 			      char *buf)
 {
@@ -451,12 +532,41 @@ static void ib_port_release(struct kobject *kobj)
 	kfree(p);
 }
 
+static void ib_port_gid_attr_release(struct kobject *kobj)
+{
+	struct gid_attr_group *g = container_of(kobj, struct gid_attr_group,
+						kobj);
+	struct attribute *a;
+	int i;
+
+	if (g->ndev.attrs) {
+		for (i = 0; (a = g->ndev.attrs[i]); ++i)
+			kfree(a);
+
+		kfree(g->ndev.attrs);
+	}
+
+	if (g->type.attrs) {
+		for (i = 0; (a = g->type.attrs[i]); ++i)
+			kfree(a);
+
+		kfree(g->type.attrs);
+	}
+
+	kfree(g);
+}
+
 static struct kobj_type port_type = {
 	.release       = ib_port_release,
 	.sysfs_ops     = &port_sysfs_ops,
 	.default_attrs = port_default_attrs
 };
 
+static struct kobj_type gid_attr_type = {
+	.sysfs_ops      = &gid_attr_sysfs_ops,
+	.release        = ib_port_gid_attr_release
+};
+
 static struct attribute **
 alloc_group_attrs(ssize_t (*show)(struct ib_port *,
 				  struct port_attribute *, char *buf),
@@ -528,9 +638,23 @@ static int add_port(struct ib_device *device, int port_num,
 		return ret;
 	}
 
+	p->gid_attr_group = kzalloc(sizeof(*p->gid_attr_group), GFP_KERNEL);
+	if (!p->gid_attr_group) {
+		ret = -ENOMEM;
+		goto err_put;
+	}
+
+	p->gid_attr_group->port = p;
+	ret = kobject_init_and_add(&p->gid_attr_group->kobj, &gid_attr_type,
+				   &p->kobj, "gid_attrs");
+	if (ret) {
+		kobject_put(&p->gid_attr_group->kobj);
+		goto err_put;
+	}
+
 	ret = sysfs_create_group(&p->kobj, &pma_group);
 	if (ret)
-		goto err_put;
+		goto err_put_gid_attrs;
 
 	p->gid_group.name  = "gids";
 	p->gid_group.attrs = alloc_group_attrs(show_port_gid, attr.gid_tbl_len);
@@ -543,12 +667,38 @@ static int add_port(struct ib_device *device, int port_num,
 	if (ret)
 		goto err_free_gid;
 
+	p->gid_attr_group->ndev.name = "ndevs";
+	p->gid_attr_group->ndev.attrs = alloc_group_attrs(show_port_gid_attr_ndev,
+							  attr.gid_tbl_len);
+	if (!p->gid_attr_group->ndev.attrs) {
+		ret = -ENOMEM;
+		goto err_remove_gid;
+	}
+
+	ret = sysfs_create_group(&p->gid_attr_group->kobj,
+				 &p->gid_attr_group->ndev);
+	if (ret)
+		goto err_free_gid_ndev;
+
+	p->gid_attr_group->type.name = "types";
+	p->gid_attr_group->type.attrs = alloc_group_attrs(show_port_gid_attr_gid_type,
+							  attr.gid_tbl_len);
+	if (!p->gid_attr_group->type.attrs) {
+		ret = -ENOMEM;
+		goto err_remove_gid_ndev;
+	}
+
+	ret = sysfs_create_group(&p->gid_attr_group->kobj,
+				 &p->gid_attr_group->type);
+	if (ret)
+		goto err_free_gid_type;
+
 	p->pkey_group.name  = "pkeys";
 	p->pkey_group.attrs = alloc_group_attrs(show_port_pkey,
 						attr.pkey_tbl_len);
 	if (!p->pkey_group.attrs) {
 		ret = -ENOMEM;
-		goto err_remove_gid;
+		goto err_remove_gid_type;
 	}
 
 	ret = sysfs_create_group(&p->kobj, &p->pkey_group);
@@ -576,6 +726,28 @@ err_free_pkey:
 	kfree(p->pkey_group.attrs);
 	p->pkey_group.attrs = NULL;
 
+err_remove_gid_type:
+	sysfs_remove_group(&p->gid_attr_group->kobj,
+			   &p->gid_attr_group->type);
+
+err_free_gid_type:
+	for (i = 0; i < attr.gid_tbl_len; ++i)
+		kfree(p->gid_attr_group->type.attrs[i]);
+
+	kfree(p->gid_attr_group->type.attrs);
+	p->gid_attr_group->type.attrs = NULL;
+
+err_remove_gid_ndev:
+	sysfs_remove_group(&p->gid_attr_group->kobj,
+			   &p->gid_attr_group->ndev);
+
+err_free_gid_ndev:
+	for (i = 0; i < attr.gid_tbl_len; ++i)
+		kfree(p->gid_attr_group->ndev.attrs[i]);
+
+	kfree(p->gid_attr_group->ndev.attrs);
+	p->gid_attr_group->ndev.attrs = NULL;
+
 err_remove_gid:
 	sysfs_remove_group(&p->kobj, &p->gid_group);
 
@@ -589,6 +761,9 @@ err_free_gid:
 err_remove_pma:
 	sysfs_remove_group(&p->kobj, &pma_group);
 
+err_put_gid_attrs:
+	kobject_put(&p->gid_attr_group->kobj);
+
 err_put:
 	kobject_put(&p->kobj);
 	return ret;
@@ -803,6 +978,11 @@ static void free_port_list_attributes(struct ib_device *device)
 		sysfs_remove_group(p, &pma_group);
 		sysfs_remove_group(p, &port->pkey_group);
 		sysfs_remove_group(p, &port->gid_group);
+		sysfs_remove_group(&port->gid_attr_group->kobj,
+				   &port->gid_attr_group->ndev);
+		sysfs_remove_group(&port->gid_attr_group->kobj,
+				   &port->gid_attr_group->type);
+		kobject_put(&port->gid_attr_group->kobj);
 		kobject_put(p);
 	}
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [PATCH for-next V1 4/9] IB/core: Add ROCE_UDP_ENCAP (RoCE V2) type
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-10-15 16:07   ` [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
  2015-10-15 16:07   ` [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc Matan Barak
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

Add the RoCE v2 GID type and port type. Vendors
that support this type will get their GID table
populated with RoCE v2 GIDs automatically.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cache.c         |  1 +
 drivers/infiniband/core/roce_gid_mgmt.c |  3 ++-
 include/rdma/ib_verbs.h                 | 23 +++++++++++++++++++++--
 3 files changed, 24 insertions(+), 3 deletions(-)
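
To illustrate how the new capability bit feeds GID table population,
here is a stand-alone sketch. The RDMA_CORE_CAP_PROT_* values are copied
from the ib_verbs.h hunk of this patch; struct port_caps and the
port_is_* and gid_type_mask helpers are hypothetical stand-ins for
port_immutable[].core_cap_flags, the rdma_protocol_roce*() helpers and
the PORT_CAP_TO_GID_TYPE table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Protocol bits, copied from the ib_verbs.h hunk in this patch. */
#define RDMA_CORE_CAP_PROT_IB             0x00100000
#define RDMA_CORE_CAP_PROT_ROCE           0x00200000
#define RDMA_CORE_CAP_PROT_IWARP          0x00400000
#define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000

/* GID type values, as in enum ib_gid_type. */
enum gid_type {
	GID_TYPE_ROCE = 0,
	GID_TYPE_ROCE_UDP_ENCAP = 1,
};

/* Hypothetical stand-in for port_immutable[port].core_cap_flags. */
struct port_caps {
	uint32_t core_cap_flags;
};

/* RoCE v1 only, like rdma_protocol_roce_eth_encap(). */
bool port_is_roce_v1(const struct port_caps *p)
{
	return (p->core_cap_flags & RDMA_CORE_CAP_PROT_ROCE) != 0;
}

/* RoCE v2 only, like rdma_protocol_roce_udp_encap(). */
bool port_is_roce_v2(const struct port_caps *p)
{
	return (p->core_cap_flags & RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP) != 0;
}

/* Either RoCE flavour, like rdma_protocol_roce() after this patch. */
bool port_is_roce(const struct port_caps *p)
{
	return port_is_roce_v1(p) || port_is_roce_v2(p);
}

/* Bitmask of GID types that would be populated for this port,
 * one bit per supported type, in the spirit of PORT_CAP_TO_GID_TYPE.
 */
unsigned long gid_type_mask(const struct port_caps *p)
{
	unsigned long mask = 0;

	if (port_is_roce_v1(p))
		mask |= 1UL << GID_TYPE_ROCE;
	if (port_is_roce_v2(p))
		mask |= 1UL << GID_TYPE_ROCE_UDP_ENCAP;
	return mask;
}
```

A port advertising both bits gets both GID types, so each GID would be
duplicated once per type, as described in the cover letter.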

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index d9ca6c3..cb34b06 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -115,6 +115,7 @@ struct ib_gid_table {
 
 static const char * const gid_type_str[] = {
 	[IB_GID_TYPE_IB]	= "IB/RoCE v1",
+	[IB_GID_TYPE_ROCE_UDP_ENCAP]	= "RoCE v2",
 };
 
 const char *ib_cache_gid_type_str(enum ib_gid_type gid_type)
diff --git a/drivers/infiniband/core/roce_gid_mgmt.c b/drivers/infiniband/core/roce_gid_mgmt.c
index 61c27a7..1e3673f 100644
--- a/drivers/infiniband/core/roce_gid_mgmt.c
+++ b/drivers/infiniband/core/roce_gid_mgmt.c
@@ -71,7 +71,8 @@ static const struct {
 	bool (*is_supported)(const struct ib_device *device, u8 port_num);
 	enum ib_gid_type gid_type;
 } PORT_CAP_TO_GID_TYPE[] = {
-	{rdma_protocol_roce,   IB_GID_TYPE_ROCE},
+	{rdma_protocol_roce_eth_encap, IB_GID_TYPE_ROCE},
+	{rdma_protocol_roce_udp_encap, IB_GID_TYPE_ROCE_UDP_ENCAP},
 };
 
 #define CAP_TO_GID_TABLE_SIZE	ARRAY_SIZE(PORT_CAP_TO_GID_TYPE)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 7619e22..77906fe 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -71,6 +71,7 @@ enum ib_gid_type {
 	/* If link layer is Ethernet, this is RoCE V1 */
 	IB_GID_TYPE_IB        = 0,
 	IB_GID_TYPE_ROCE      = 0,
+	IB_GID_TYPE_ROCE_UDP_ENCAP = 1,
 	IB_GID_TYPE_SIZE
 };
 
@@ -399,6 +400,7 @@ union rdma_protocol_stats {
 #define RDMA_CORE_CAP_PROT_IB           0x00100000
 #define RDMA_CORE_CAP_PROT_ROCE         0x00200000
 #define RDMA_CORE_CAP_PROT_IWARP        0x00400000
+#define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000
 
 #define RDMA_CORE_PORT_IBA_IB          (RDMA_CORE_CAP_PROT_IB  \
 					| RDMA_CORE_CAP_IB_MAD \
@@ -411,6 +413,12 @@ union rdma_protocol_stats {
 					| RDMA_CORE_CAP_IB_CM   \
 					| RDMA_CORE_CAP_AF_IB   \
 					| RDMA_CORE_CAP_ETH_AH)
+#define RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP			\
+					(RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP \
+					| RDMA_CORE_CAP_IB_MAD  \
+					| RDMA_CORE_CAP_IB_CM   \
+					| RDMA_CORE_CAP_AF_IB   \
+					| RDMA_CORE_CAP_ETH_AH)
 #define RDMA_CORE_PORT_IWARP           (RDMA_CORE_CAP_PROT_IWARP \
 					| RDMA_CORE_CAP_IW_CM)
 #define RDMA_CORE_PORT_INTEL_OPA       (RDMA_CORE_PORT_IBA_IB  \
@@ -1942,6 +1950,17 @@ static inline bool rdma_protocol_ib(const struct ib_device *device, u8 port_num)
 
 static inline bool rdma_protocol_roce(const struct ib_device *device, u8 port_num)
 {
+	return device->port_immutable[port_num].core_cap_flags &
+		(RDMA_CORE_CAP_PROT_ROCE | RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP);
+}
+
+static inline bool rdma_protocol_roce_udp_encap(const struct ib_device *device, u8 port_num)
+{
+	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP;
+}
+
+static inline bool rdma_protocol_roce_eth_encap(const struct ib_device *device, u8 port_num)
+{
 	return device->port_immutable[port_num].core_cap_flags & RDMA_CORE_CAP_PROT_ROCE;
 }
 
@@ -1952,8 +1971,8 @@ static inline bool rdma_protocol_iwarp(const struct ib_device *device, u8 port_n
 
 static inline bool rdma_ib_or_roce(const struct ib_device *device, u8 port_num)
 {
-	return device->port_immutable[port_num].core_cap_flags &
-		(RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_PROT_ROCE);
+	return rdma_protocol_ib(device, port_num) ||
+		rdma_protocol_roce(device, port_num);
 }
 
 /**
-- 
2.1.0


* [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2015-10-15 16:07   ` [PATCH for-next V1 4/9] IB/core: Add ROCE_UDP_ENCAP (RoCE V2) type Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
       [not found]     ` <1444925232-13598-6-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15 16:07   ` [PATCH for-next V1 6/9] IB/rdma_cm: Add wrapper for cma reference count Matan Barak
                     ` (4 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

From: Somnath Kotur <Somnath.Kotur-idTK6quXuVSLFuii7jzJGg@public.gmane.org>

Providers should tell IB core the wc's network type.
This is used in order to search for the proper GID in the
GID table. When using HCAs that can't provide this info,
IB core inspects the packet headers and determines
the network type (and thus the GID type) by itself.
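
The header inspection can be sketched in user space as follows. This is
only an approximation of ib_get_header_version() from this patch: it
works on a raw 40-byte buffer instead of union rdma_network_hdr, and the
names header_version and ipv4_hdr_csum are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define GRH_LEN      40	/* a GRH and an IPv6 header are both 40 bytes */
#define IPV4_HDR_OFF 20	/* the IPv4 header occupies the last 20 bytes */

/* Compute the IPv4 header checksum over a 20-byte header, treating
 * the checksum field (bytes 10-11) as zero.
 */
static uint16_t ipv4_hdr_csum(const uint8_t *h)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < 20; i += 2) {
		if (i == 10)
			continue;
		sum += ((uint32_t)h[i] << 8) | h[i + 1];
	}
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Return 4 for IPv4, 6 for IPv6/GRH, 0 if neither version matches. */
int header_version(const uint8_t hdr[GRH_LEN])
{
	const uint8_t *ip4 = hdr + IPV4_HDR_OFF;
	uint16_t stored = (uint16_t)(((uint16_t)ip4[10] << 8) | ip4[11]);

	/* Not version 6 at offset 0: the first 20 bytes are garbage,
	 * so only the IPv4 interpretation is left.
	 */
	if ((hdr[0] >> 4) != 6)
		return (ip4[0] >> 4) == 4 ? 4 : 0;

	/* RoCE v2 allows no IPv4 options, so IHL must be 5 words. */
	if ((ip4[0] & 0x0f) != 5)
		return 6;

	/* Believe the IPv4 interpretation only if its checksum holds. */
	return stored == ipv4_hdr_csum(ip4) ? 4 : 6;
}
```

The checksum test matters because the undefined leading 20 bytes of an
IPv4 packet may happen to start with a version nibble of 6.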

In RDMA-CM, we choose the sgid_index and type from all the matching
entries based on a hint from the IP stack, and we set hop_limit for
the IP packet based on the same hint.
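
A rough sketch of that selection, outside the kernel, could look like
this. The enum values mirror the ones in this series; struct path_params
and select_path_params are hypothetical, and DEFAULT_HOPLIMIT is assumed
to equal the kernel's IPV6_DEFAULT_HOPLIMIT (64).

```c
#include <stdint.h>

enum rdma_network_type {
	RDMA_NETWORK_IB,
	RDMA_NETWORK_ROCE_V1 = RDMA_NETWORK_IB,
	RDMA_NETWORK_IPV4,
	RDMA_NETWORK_IPV6
};

enum ib_gid_type {
	IB_GID_TYPE_IB = 0,
	IB_GID_TYPE_ROCE = 0,
	IB_GID_TYPE_ROCE_UDP_ENCAP = 1
};

/* Assumption: same value as the kernel's IPV6_DEFAULT_HOPLIMIT. */
#define DEFAULT_HOPLIMIT 64

/* Hypothetical subset of the path record fields touched here. */
struct path_params {
	enum ib_gid_type gid_type;
	uint8_t hop_limit;
};

/* Same mapping as ib_network_to_gid_type() in this patch. */
enum ib_gid_type network_to_gid_type(enum rdma_network_type network)
{
	if (network == RDMA_NETWORK_IPV4 || network == RDMA_NETWORK_IPV6)
		return IB_GID_TYPE_ROCE_UDP_ENCAP;
	return IB_GID_TYPE_IB;
}

/* A routed (IPv4/IPv6) hint forces RoCE v2 and a routable hop limit.
 * Without a hint the current GID type is kept and hop_limit stays 1,
 * since RoCE v1 never crosses a router.
 */
struct path_params select_path_params(enum rdma_network_type hint,
				      enum ib_gid_type current_gid_type)
{
	struct path_params p;

	if (hint != RDMA_NETWORK_IB) {
		p.gid_type = network_to_gid_type(hint);
		p.hop_limit = DEFAULT_HOPLIMIT;
	} else {
		p.gid_type = current_gid_type;
		p.hop_limit = 1;
	}
	return p;
}
```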

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Somnath Kotur <Somnath.Kotur-idTK6quXuVSLFuii7jzJGg@public.gmane.org>
---
 drivers/infiniband/core/addr.c  |  14 +++++
 drivers/infiniband/core/cma.c   |  11 +++-
 drivers/infiniband/core/verbs.c | 123 ++++++++++++++++++++++++++++++++++++++--
 include/rdma/ib_addr.h          |   1 +
 include/rdma/ib_verbs.h         |  44 ++++++++++++++
 5 files changed, 187 insertions(+), 6 deletions(-)
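
For IPv4 traffic, the GIDs used for the table lookup are IPv4-mapped
IPv6 addresses (the patch builds them with ipv6_addr_set_v4mapped() and
tests for them with ipv6_addr_v4mapped()). A byte-level sketch of that
mapping, with hypothetical helper names gid_from_ipv4 and
gid_is_v4mapped:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Write the IPv4-mapped form ::ffff:a.b.c.d of addr (given in network
 * byte order) into a 16-byte GID.
 */
void gid_from_ipv4(const uint8_t addr[4], uint8_t gid[16])
{
	memset(gid, 0, 10);		/* 80 zero bits */
	gid[10] = 0xff;			/* then 16 one bits */
	gid[11] = 0xff;
	memcpy(&gid[12], addr, 4);	/* then the IPv4 address */
}

/* True if the GID carries an IPv4-mapped address, which is how
 * ib_gid_to_network_type() tells IPv4 from IPv6 for RoCE v2 GIDs.
 */
bool gid_is_v4mapped(const uint8_t gid[16])
{
	static const uint8_t prefix[12] = { [10] = 0xff, [11] = 0xff };

	return memcmp(gid, prefix, sizeof(prefix)) == 0;
}
```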

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index d3c42b3..3e1f93c 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -257,6 +257,12 @@ static int addr4_resolve(struct sockaddr_in *src_in,
 		goto put;
 	}
 
+	/* If there's a gateway, we're definitely in RoCE v2 (as RoCE v1 isn't
+	 * routable) and we could set the network type accordingly.
+	 */
+	if (rt->rt_uses_gateway)
+		addr->network = RDMA_NETWORK_IPV4;
+
 	ret = dst_fetch_ha(&rt->dst, addr, &fl4.daddr);
 put:
 	ip_rt_put(rt);
@@ -271,6 +277,7 @@ static int addr6_resolve(struct sockaddr_in6 *src_in,
 {
 	struct flowi6 fl6;
 	struct dst_entry *dst;
+	struct rt6_info *rt;
 	int ret;
 
 	memset(&fl6, 0, sizeof fl6);
@@ -282,6 +289,7 @@ static int addr6_resolve(struct sockaddr_in6 *src_in,
 	if ((ret = dst->error))
 		goto put;
 
+	rt = (struct rt6_info *)dst;
 	if (ipv6_addr_any(&fl6.saddr)) {
 		ret = ipv6_dev_get_saddr(&init_net, ip6_dst_idev(dst)->dev,
 					 &fl6.daddr, 0, &fl6.saddr);
@@ -305,6 +313,12 @@ static int addr6_resolve(struct sockaddr_in6 *src_in,
 		goto put;
 	}
 
+	/* If there's a gateway, we're definitely in RoCE v2 (as RoCE v1 isn't
+	 * routable) and we could set the network type accordingly.
+	 */
+	if (rt->rt6i_flags & RTF_GATEWAY)
+		addr->network = RDMA_NETWORK_IPV6;
+
 	ret = dst_fetch_ha(dst, addr, &fl6.daddr);
 put:
 	dst_release(dst);
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 2e592e6..c5d1685 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2253,6 +2253,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
 {
 	struct rdma_route *route = &id_priv->id.route;
 	struct rdma_addr *addr = &route->addr;
+	enum ib_gid_type network_gid_type;
 	struct cma_work *work;
 	int ret;
 	struct net_device *ndev = NULL;
@@ -2291,7 +2292,15 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
 	rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.dst_addr,
 		    &route->path_rec->dgid);
 
-	route->path_rec->hop_limit = 1;
+	/* Use the hint from IP Stack to select GID Type */
+	network_gid_type = ib_network_to_gid_type(addr->dev_addr.network);
+	if (addr->dev_addr.network != RDMA_NETWORK_IB) {
+		route->path_rec->gid_type = network_gid_type;
+		/* TODO: get the hoplimit from the inet/inet6 device */
+		route->path_rec->hop_limit = IPV6_DEFAULT_HOPLIMIT;
+	} else {
+		route->path_rec->hop_limit = 1;
+	}
 	route->path_rec->reversible = 1;
 	route->path_rec->pkey = cpu_to_be16(0xffff);
 	route->path_rec->mtu_selector = IB_SA_EQ;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 8b4ade6..2f568ad 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -311,8 +311,61 @@ struct ib_ah *ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr)
 }
 EXPORT_SYMBOL(ib_create_ah);
 
+static int ib_get_header_version(const union rdma_network_hdr *hdr)
+{
+	const struct iphdr *ip4h = (struct iphdr *)&hdr->roce4grh;
+	struct iphdr ip4h_checked;
+	const struct ipv6hdr *ip6h = (struct ipv6hdr *)&hdr->ibgrh;
+
+	/* If it's IPv6, the version must be 6, otherwise, the first
+	 * 20 bytes (before the IPv4 header) are garbled.
+	 */
+	if (ip6h->version != 6)
+		return (ip4h->version == 4) ? 4 : 0;
+	/* version may be 6 or 4 because the first 20 bytes could be garbled */
+
+	/* RoCE v2 requires no options, thus header length
+	   must be 5 words
+	*/
+	if (ip4h->ihl != 5)
+		return 6;
+
+	/* Verify checksum.
+	   We can't write on scattered buffers so we need to copy to
+	   temp buffer.
+	 */
+	memcpy(&ip4h_checked, ip4h, sizeof(ip4h_checked));
+	ip4h_checked.check = 0;
+	ip4h_checked.check = ip_fast_csum((u8 *)&ip4h_checked, 5);
+	/* if IPv4 header checksum is OK, believe it */
+	if (ip4h->check == ip4h_checked.check)
+		return 4;
+	return 6;
+}
+
+static enum rdma_network_type ib_get_net_type_by_grh(struct ib_device *device,
+						     u8 port_num,
+						     const struct ib_grh *grh)
+{
+	int grh_version;
+
+	if (rdma_protocol_ib(device, port_num))
+		return RDMA_NETWORK_IB;
+
+	grh_version = ib_get_header_version((union rdma_network_hdr *)grh);
+
+	if (grh_version == 4)
+		return RDMA_NETWORK_IPV4;
+
+	if (grh->next_hdr == IPPROTO_UDP)
+		return RDMA_NETWORK_IPV6;
+
+	return RDMA_NETWORK_ROCE_V1;
+}
+
 struct find_gid_index_context {
 	u16 vlan_id;
+	enum ib_gid_type gid_type;
 };
 
 static bool find_gid_index(const union ib_gid *gid,
@@ -322,6 +375,9 @@ static bool find_gid_index(const union ib_gid *gid,
 	struct find_gid_index_context *ctx =
 		(struct find_gid_index_context *)context;
 
+	if (ctx->gid_type != gid_attr->gid_type)
+		return false;
+
 	if ((!!(ctx->vlan_id != 0xffff) == !is_vlan_dev(gid_attr->ndev)) ||
 	    (is_vlan_dev(gid_attr->ndev) &&
 	     vlan_dev_vlan_id(gid_attr->ndev) != ctx->vlan_id))
@@ -332,14 +388,49 @@ static bool find_gid_index(const union ib_gid *gid,
 
 static int get_sgid_index_from_eth(struct ib_device *device, u8 port_num,
 				   u16 vlan_id, const union ib_gid *sgid,
+				   enum ib_gid_type gid_type,
 				   u16 *gid_index)
 {
-	struct find_gid_index_context context = {.vlan_id = vlan_id};
+	struct find_gid_index_context context = {.vlan_id = vlan_id,
+						 .gid_type = gid_type};
 
 	return ib_find_gid_by_filter(device, sgid, port_num, find_gid_index,
 				     &context, gid_index);
 }
 
+static int get_gids_from_rdma_hdr(union rdma_network_hdr *hdr,
+				  enum rdma_network_type net_type,
+				  union ib_gid *sgid, union ib_gid *dgid)
+{
+	struct sockaddr_in  src_in;
+	struct sockaddr_in  dst_in;
+	__be32 src_saddr, dst_saddr;
+
+	if (!sgid || !dgid)
+		return -EINVAL;
+
+	if (net_type == RDMA_NETWORK_IPV4) {
+		memcpy(&src_in.sin_addr.s_addr,
+		       &hdr->roce4grh.saddr, 4);
+		memcpy(&dst_in.sin_addr.s_addr,
+		       &hdr->roce4grh.daddr, 4);
+		src_saddr = src_in.sin_addr.s_addr;
+		dst_saddr = dst_in.sin_addr.s_addr;
+		ipv6_addr_set_v4mapped(src_saddr,
+				       (struct in6_addr *)sgid);
+		ipv6_addr_set_v4mapped(dst_saddr,
+				       (struct in6_addr *)dgid);
+		return 0;
+	} else if (net_type == RDMA_NETWORK_IPV6 ||
+		   net_type == RDMA_NETWORK_IB) {
+		*dgid = hdr->ibgrh.dgid;
+		*sgid = hdr->ibgrh.sgid;
+		return 0;
+	} else {
+		return -EINVAL;
+	}
+}
+
 int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
 		       const struct ib_wc *wc, const struct ib_grh *grh,
 		       struct ib_ah_attr *ah_attr)
@@ -347,9 +438,25 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
 	u32 flow_class;
 	u16 gid_index;
 	int ret;
+	enum rdma_network_type net_type = RDMA_NETWORK_IB;
+	enum ib_gid_type gid_type = IB_GID_TYPE_IB;
+	union ib_gid dgid;
+	union ib_gid sgid;
 
 	memset(ah_attr, 0, sizeof *ah_attr);
 	if (rdma_cap_eth_ah(device, port_num)) {
+		if (wc->wc_flags & IB_WC_WITH_NETWORK_HDR_TYPE)
+			net_type = wc->network_hdr_type;
+		else
+			net_type = ib_get_net_type_by_grh(device, port_num, grh);
+		gid_type = ib_network_to_gid_type(net_type);
+	}
+	ret = get_gids_from_rdma_hdr((union rdma_network_hdr *)grh, net_type,
+				     &sgid, &dgid);
+	if (ret)
+		return ret;
+
+	if (rdma_protocol_roce(device, port_num)) {
 		u16 vlan_id = wc->wc_flags & IB_WC_WITH_VLAN ?
 				wc->vlan_id : 0xffff;
 
@@ -358,7 +465,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
 
 		if (!(wc->wc_flags & IB_WC_WITH_SMAC) ||
 		    !(wc->wc_flags & IB_WC_WITH_VLAN)) {
-			ret = rdma_addr_find_dmac_by_grh(&grh->dgid, &grh->sgid,
+			ret = rdma_addr_find_dmac_by_grh(&dgid, &sgid,
 							 ah_attr->dmac,
 							 wc->wc_flags & IB_WC_WITH_VLAN ?
 							 NULL : &vlan_id,
@@ -368,7 +475,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
 		}
 
 		ret = get_sgid_index_from_eth(device, port_num, vlan_id,
-					      &grh->dgid, &gid_index);
+					      &dgid, gid_type, &gid_index);
 		if (ret)
 			return ret;
 
@@ -383,10 +490,10 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
 
 	if (wc->wc_flags & IB_WC_GRH) {
 		ah_attr->ah_flags = IB_AH_GRH;
-		ah_attr->grh.dgid = grh->sgid;
+		ah_attr->grh.dgid = sgid;
 
 		if (!rdma_cap_eth_ah(device, port_num)) {
-			ret = ib_find_cached_gid_by_port(device, &grh->dgid,
+			ret = ib_find_cached_gid_by_port(device, &dgid,
 							 IB_GID_TYPE_IB,
 							 port_num, NULL,
 							 &gid_index);
@@ -1026,6 +1133,12 @@ int ib_resolve_eth_dmac(struct ib_qp *qp,
 					ret = -ENXIO;
 				goto out;
 			}
+			if (sgid_attr.gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP)
+				/* TODO: get the hoplimit from the inet/inet6
+				 * device
+				 */
+				qp_attr->ah_attr.grh.hop_limit =
+							IPV6_DEFAULT_HOPLIMIT;
 
 			ifindex = sgid_attr.ndev->ifindex;
 
diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h
index 17e4a8b..81e19d9 100644
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -71,6 +71,7 @@ struct rdma_dev_addr {
 	unsigned short dev_type;
 	int bound_dev_if;
 	enum rdma_transport_type transport;
+	enum rdma_network_type network;
 };
 
 /**
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 77906fe..dd1d901 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -50,6 +50,8 @@
 #include <linux/workqueue.h>
 #include <linux/socket.h>
 #include <uapi/linux/if_ether.h>
+#include <net/ipv6.h>
+#include <net/ip.h>
 
 #include <linux/atomic.h>
 #include <linux/mmu_notifier.h>
@@ -107,6 +109,35 @@ enum rdma_protocol_type {
 __attribute_const__ enum rdma_transport_type
 rdma_node_get_transport(enum rdma_node_type node_type);
 
+enum rdma_network_type {
+	RDMA_NETWORK_IB,
+	RDMA_NETWORK_ROCE_V1 = RDMA_NETWORK_IB,
+	RDMA_NETWORK_IPV4,
+	RDMA_NETWORK_IPV6
+};
+
+static inline enum ib_gid_type ib_network_to_gid_type(enum rdma_network_type network_type)
+{
+	if (network_type == RDMA_NETWORK_IPV4 ||
+	    network_type == RDMA_NETWORK_IPV6)
+		return IB_GID_TYPE_ROCE_UDP_ENCAP;
+
+	/* IB_GID_TYPE_IB same as RDMA_NETWORK_ROCE_V1 */
+	return IB_GID_TYPE_IB;
+}
+
+static inline enum rdma_network_type ib_gid_to_network_type(enum ib_gid_type gid_type,
+							    union ib_gid *gid)
+{
+	if (gid_type == IB_GID_TYPE_IB)
+		return RDMA_NETWORK_IB;
+
+	if (ipv6_addr_v4mapped((struct in6_addr *)gid))
+		return RDMA_NETWORK_IPV4;
+	else
+		return RDMA_NETWORK_IPV6;
+}
+
 enum rdma_link_layer {
 	IB_LINK_LAYER_UNSPECIFIED,
 	IB_LINK_LAYER_INFINIBAND,
@@ -533,6 +564,17 @@ struct ib_grh {
 	union ib_gid	dgid;
 };
 
+union rdma_network_hdr {
+	struct ib_grh ibgrh;
+	struct {
+		/* The IB spec states that if it's IPv4, the header
+		 * is located in the last 20 bytes of the header.
+		 */
+		u8		reserved[20];
+		struct iphdr	roce4grh;
+	};
+};
+
 enum {
 	IB_MULTICAST_QPN = 0xffffff
 };
@@ -769,6 +811,7 @@ enum ib_wc_flags {
 	IB_WC_IP_CSUM_OK	= (1<<3),
 	IB_WC_WITH_SMAC		= (1<<4),
 	IB_WC_WITH_VLAN		= (1<<5),
+	IB_WC_WITH_NETWORK_HDR_TYPE	= (1<<6),
 };
 
 struct ib_wc {
@@ -791,6 +834,7 @@ struct ib_wc {
 	u8			port_num;	/* valid only for DR SMPs on switches */
 	u8			smac[ETH_ALEN];
 	u16			vlan_id;
+	u8			network_hdr_type;
 };
 
 enum ib_cq_notify_flags {
-- 
2.1.0


* [PATCH for-next V1 6/9] IB/rdma_cm: Add wrapper for cma reference count
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2015-10-15 16:07   ` [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
  2015-10-15 16:07   ` [PATCH for-next V1 7/9] IB/cma: Add configfs for rdma_cm Matan Barak
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

Currently, cma users can't increase or decrease the cma reference
count. This is necessary when setting cma attributes (like the
default GID type) in order to avoid use-after-free errors.
Add the cma_ref_dev and cma_deref_dev APIs.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cma.c       | 11 +++++++++--
 drivers/infiniband/core/core_priv.h |  4 ++++
 2 files changed, 13 insertions(+), 2 deletions(-)
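
The pattern behind these two wrappers can be sketched with C11 atomics.
This is not the kernel code: struct dev_ref is hypothetical, the
released flag stands in for complete(&cma_dev->comp), and the initial
count of 1 is an assumption about how the device is set up.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the refcount/comp pair in struct cma_device. */
struct dev_ref {
	atomic_int refcount;
	bool released;	/* set where the kernel would call complete() */
};

/* Assumption: the owner holds the initial reference. */
void dev_ref_init(struct dev_ref *d)
{
	atomic_init(&d->refcount, 1);
	d->released = false;
}

/* Counterpart of cma_ref_dev(): take one more reference. */
void dev_ref_get(struct dev_ref *d)
{
	atomic_fetch_add(&d->refcount, 1);
}

/* Counterpart of cma_deref_dev(): drop a reference and signal the
 * waiter once the last one is gone. atomic_fetch_sub() returns the
 * value before the decrement, so 1 means the count just hit zero.
 */
void dev_ref_put(struct dev_ref *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1)
		d->released = true;
}
```

A configuration path would call dev_ref_get() before touching the device
and dev_ref_put() afterwards, so removal can't finish in between.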

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c5d1685..0955690 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -58,6 +58,8 @@
 #include <rdma/ib_sa.h>
 #include <rdma/iw_cm.h>
 
+#include "core_priv.h"
+
 MODULE_AUTHOR("Sean Hefty");
 MODULE_DESCRIPTION("Generic RDMA CM Agent");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -171,6 +173,11 @@ enum {
 	CMA_OPTION_AFONLY,
 };
 
+void cma_ref_dev(struct cma_device *cma_dev)
+{
+	atomic_inc(&cma_dev->refcount);
+}
+
 /*
  * Device removal can occur at anytime, so we need extra handling to
  * serialize notifying the user of device removal with other callbacks.
@@ -325,7 +332,7 @@ static inline void cma_set_ip_ver(struct cma_hdr *hdr, u8 ip_ver)
 static void cma_attach_to_dev(struct rdma_id_private *id_priv,
 			      struct cma_device *cma_dev)
 {
-	atomic_inc(&cma_dev->refcount);
+	cma_ref_dev(cma_dev);
 	id_priv->cma_dev = cma_dev;
 	id_priv->id.device = cma_dev->device;
 	id_priv->id.route.addr.dev_addr.transport =
@@ -333,7 +340,7 @@ static void cma_attach_to_dev(struct rdma_id_private *id_priv,
 	list_add_tail(&id_priv->list, &cma_dev->id_list);
 }
 
-static inline void cma_deref_dev(struct cma_device *cma_dev)
+void cma_deref_dev(struct cma_device *cma_dev)
 {
 	if (atomic_dec_and_test(&cma_dev->refcount))
 		complete(&cma_dev->comp);
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index d531f91..aeb107c 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -38,6 +38,10 @@
 
 #include <rdma/ib_verbs.h>
 
+struct cma_device;
+void cma_ref_dev(struct cma_device *cma_dev);
+void cma_deref_dev(struct cma_device *cma_dev);
+
 int  ib_device_register_sysfs(struct ib_device *device,
 			      int (*port_callback)(struct ib_device *,
 						   u8, struct kobject *));
-- 
2.1.0


* [PATCH for-next V1 7/9] IB/cma: Add configfs for rdma_cm
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2015-10-15 16:07   ` [PATCH for-next V1 6/9] IB/rdma_cm: Add wrapper for cma reference count Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
       [not found]     ` <1444925232-13598-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-10-15 16:07   ` [PATCH for-next V1 8/9] IB/core: Initialize UD header structure with IP and UDP headers Matan Barak
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur

Users would like to control the behaviour of rdma_cm.
For example, old applications which don't set the
required RoCE GID type could be executed on RoCE v2
network types. In order to support this configuration,
we add configfs support to rdma_cm.

To use it, mount configfs and mkdir <IB device name>
inside the rdma_cm directory.

The patch adds support for a single configuration file,
default_roce_mode. The mode can either be "IB/RoCE v1" or
"RoCE v2".

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/Kconfig             |   9 +
 drivers/infiniband/core/Makefile       |   2 +
 drivers/infiniband/core/cache.c        |  24 +++
 drivers/infiniband/core/cma.c          |  95 ++++++++-
 drivers/infiniband/core/cma_configfs.c | 353 +++++++++++++++++++++++++++++++++
 drivers/infiniband/core/core_priv.h    |  24 +++
 6 files changed, 503 insertions(+), 4 deletions(-)
 create mode 100644 drivers/infiniband/core/cma_configfs.c
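
The string written to default_roce_mode has to match one of the GID type
names exactly, with an optional trailing newline. A user-space sketch of
that matching, modelled on ib_cache_gid_parse_type_str() from this patch
(the name parse_gid_type is hypothetical and -1 replaces -EINVAL):

```c
#include <stddef.h>
#include <string.h>

/* Names as listed in gid_type_str[] in cache.c; index == GID type. */
static const char * const gid_type_names[] = {
	"IB/RoCE v1",
	"RoCE v2",
};

/* Return the GID type index for buf, or -1 if it matches no name.
 * One trailing newline is ignored, since echo appends one.
 */
int parse_gid_type(const char *buf)
{
	size_t len = strlen(buf);
	size_t i;

	if (len == 0)
		return -1;
	if (buf[len - 1] == '\n')
		len--;

	for (i = 0; i < sizeof(gid_type_names) / sizeof(gid_type_names[0]); i++)
		if (len == strlen(gid_type_names[i]) &&
		    !strncmp(buf, gid_type_names[i], len))
			return (int)i;

	return -1;
}
```

Comparing the lengths as well as the prefix is what rejects partial
inputs such as "RoCE".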

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index aa26f3c..2568af6 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -54,6 +54,15 @@ config INFINIBAND_ADDR_TRANS
 	depends on INFINIBAND
 	default y
 
+config INFINIBAND_ADDR_TRANS_CONFIGFS
+	bool
+	depends on INFINIBAND_ADDR_TRANS && CONFIGFS_FS
+	default y
+	---help---
+	  ConfigFS support for the RDMA communication manager (CM).
+	  This allows the user to configure the default GID type that the CM
+	  uses for each device when initiating new connections.
+
 source "drivers/infiniband/hw/mthca/Kconfig"
 source "drivers/infiniband/hw/qib/Kconfig"
 source "drivers/infiniband/hw/cxgb3/Kconfig"
diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index d43a899..7922fa7 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -24,6 +24,8 @@ iw_cm-y :=			iwcm.o iwpm_util.o iwpm_msg.o
 
 rdma_cm-y :=			cma.o
 
+rdma_cm-$(CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS) += cma_configfs.o
+
 rdma_ucm-y :=			ucma.o
 
 ib_addr-y :=			addr.o
diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index cb34b06..9e02f3c 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -127,6 +127,30 @@ const char *ib_cache_gid_type_str(enum ib_gid_type gid_type)
 }
 EXPORT_SYMBOL(ib_cache_gid_type_str);
 
+int ib_cache_gid_parse_type_str(const char *buf)
+{
+	unsigned int i;
+	size_t len;
+	int err = -EINVAL;
+
+	len = strlen(buf);
+	if (len == 0)
+		return -EINVAL;
+
+	if (buf[len - 1] == '\n')
+		len--;
+
+	for (i = 0; i < ARRAY_SIZE(gid_type_str); ++i)
+		if (gid_type_str[i] && !strncmp(buf, gid_type_str[i], len) &&
+		    len == strlen(gid_type_str[i])) {
+			err = i;
+			break;
+		}
+
+	return err;
+}
+EXPORT_SYMBOL(ib_cache_gid_parse_type_str);
+
 static int write_gid(struct ib_device *ib_dev, u8 port,
 		     struct ib_gid_table *table, int ix,
 		     const union ib_gid *gid,
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 0955690..b03099e 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -139,6 +139,7 @@ struct cma_device {
 	struct completion	comp;
 	atomic_t		refcount;
 	struct list_head	id_list;
+	enum ib_gid_type	*default_gid_type;
 };
 
 struct rdma_bind_list {
@@ -178,6 +179,62 @@ void cma_ref_dev(struct cma_device *cma_dev)
 	atomic_inc(&cma_dev->refcount);
 }
 
+struct cma_device *cma_enum_devices_by_ibdev(cma_device_filter	filter,
+					     void		*cookie)
+{
+	struct cma_device *cma_dev;
+	struct cma_device *found_cma_dev = NULL;
+
+	mutex_lock(&lock);
+
+	list_for_each_entry(cma_dev, &dev_list, list)
+		if (filter(cma_dev->device, cookie)) {
+			found_cma_dev = cma_dev;
+			break;
+		}
+
+	if (found_cma_dev)
+		cma_ref_dev(found_cma_dev);
+	mutex_unlock(&lock);
+	return found_cma_dev;
+}
+
+int cma_get_default_gid_type(struct cma_device *cma_dev,
+			     unsigned int port)
+{
+	if (port < rdma_start_port(cma_dev->device) ||
+	    port > rdma_end_port(cma_dev->device))
+		return -EINVAL;
+
+	return cma_dev->default_gid_type[port - rdma_start_port(cma_dev->device)];
+}
+
+int cma_set_default_gid_type(struct cma_device *cma_dev,
+			     unsigned int port,
+			     enum ib_gid_type default_gid_type)
+{
+	unsigned long supported_gids;
+
+	if (port < rdma_start_port(cma_dev->device) ||
+	    port > rdma_end_port(cma_dev->device))
+		return -EINVAL;
+
+	supported_gids = roce_gid_type_mask_support(cma_dev->device, port);
+
+	if (!(supported_gids & 1 << default_gid_type))
+		return -EINVAL;
+
+	cma_dev->default_gid_type[port - rdma_start_port(cma_dev->device)] =
+		default_gid_type;
+
+	return 0;
+}
+
+struct ib_device *cma_get_ib_dev(struct cma_device *cma_dev)
+{
+	return cma_dev->device;
+}
+
 /*
  * Device removal can occur at anytime, so we need extra handling to
  * serialize notifying the user of device removal with other callbacks.
@@ -334,6 +391,9 @@ static void cma_attach_to_dev(struct rdma_id_private *id_priv,
 {
 	cma_ref_dev(cma_dev);
 	id_priv->cma_dev = cma_dev;
+	id_priv->gid_type =
+		cma_dev->default_gid_type[id_priv->id.port_num -
+					  rdma_start_port(cma_dev->device)];
 	id_priv->id.device = cma_dev->device;
 	id_priv->id.route.addr.dev_addr.transport =
 		rdma_node_get_transport(cma_dev->device->node_type);
@@ -435,6 +495,7 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
 }
 
 static inline int cma_validate_port(struct ib_device *device, u8 port,
+				    enum ib_gid_type gid_type,
 				      union ib_gid *gid, int dev_type,
 				      int bound_if_index)
 {
@@ -449,8 +510,10 @@ static inline int cma_validate_port(struct ib_device *device, u8 port,
 
 	if (dev_type == ARPHRD_ETHER)
 		ndev = dev_get_by_index(&init_net, bound_if_index);
+	else
+		gid_type = IB_GID_TYPE_IB;
 
-	ret = ib_find_cached_gid_by_port(device, gid, IB_GID_TYPE_IB, port,
+	ret = ib_find_cached_gid_by_port(device, gid, gid_type, port,
 					 ndev, NULL);
 
 	if (ndev)
@@ -485,7 +548,10 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
 		gidp = rdma_protocol_roce(cma_dev->device, port) ?
 		       &iboe_gid : &gid;
 
-		ret = cma_validate_port(cma_dev->device, port, gidp,
+		ret = cma_validate_port(cma_dev->device, port,
+					rdma_protocol_ib(cma_dev->device, port) ?
+					IB_GID_TYPE_IB :
+					listen_id_priv->gid_type, gidp,
 					dev_addr->dev_type,
 					dev_addr->bound_dev_if);
 		if (!ret) {
@@ -504,8 +570,11 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
 			gidp = rdma_protocol_roce(cma_dev->device, port) ?
 			       &iboe_gid : &gid;
 
-			ret = cma_validate_port(cma_dev->device, port, gidp,
-						dev_addr->dev_type,
+			ret = cma_validate_port(cma_dev->device, port,
+						rdma_protocol_ib(cma_dev->device, port) ?
+						IB_GID_TYPE_IB :
+						cma_dev->default_gid_type[port - 1],
+						gidp, dev_addr->dev_type,
 						dev_addr->bound_dev_if);
 			if (!ret) {
 				id_priv->id.port_num = port;
@@ -3829,12 +3898,27 @@ static void cma_add_one(struct ib_device *device)
 {
 	struct cma_device *cma_dev;
 	struct rdma_id_private *id_priv;
+	unsigned int i;
+	unsigned long supported_gids = 0;
 
 	cma_dev = kmalloc(sizeof *cma_dev, GFP_KERNEL);
 	if (!cma_dev)
 		return;
 
 	cma_dev->device = device;
+	cma_dev->default_gid_type = kcalloc(device->phys_port_cnt,
+					    sizeof(*cma_dev->default_gid_type),
+					    GFP_KERNEL);
+	if (!cma_dev->default_gid_type) {
+		kfree(cma_dev);
+		return;
+	}
+	for (i = rdma_start_port(device); i <= rdma_end_port(device); i++) {
+		supported_gids = roce_gid_type_mask_support(device, i);
+		WARN_ON(!supported_gids);
+		cma_dev->default_gid_type[i - rdma_start_port(device)] =
+			find_first_bit(&supported_gids, BITS_PER_LONG);
+	}
 
 	init_completion(&cma_dev->comp);
 	atomic_set(&cma_dev->refcount, 1);
@@ -3914,6 +3998,7 @@ static void cma_remove_one(struct ib_device *device, void *client_data)
 	mutex_unlock(&lock);
 
 	cma_process_remove(cma_dev);
+	kfree(cma_dev->default_gid_type);
 	kfree(cma_dev);
 }
 
@@ -4014,6 +4099,7 @@ static int __init cma_init(void)
 
 	if (ibnl_add_client(RDMA_NL_RDMA_CM, RDMA_NL_RDMA_CM_NUM_OPS, cma_cb_table))
 		printk(KERN_WARNING "RDMA CMA: failed to add netlink callback\n");
+	cma_configfs_init();
 
 	return 0;
 
@@ -4027,6 +4113,7 @@ err:
 
 static void __exit cma_cleanup(void)
 {
+	cma_configfs_exit();
 	ibnl_remove_client(RDMA_NL_RDMA_CM);
 	ib_unregister_client(&cma_client);
 	unregister_netdevice_notifier(&cma_nb);
diff --git a/drivers/infiniband/core/cma_configfs.c b/drivers/infiniband/core/cma_configfs.c
new file mode 100644
index 0000000..9911a08
--- /dev/null
+++ b/drivers/infiniband/core/cma_configfs.c
@@ -0,0 +1,353 @@
+/*
+ * Copyright (c) 2015, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/configfs.h>
+#include <rdma/ib_verbs.h>
+#include "core_priv.h"
+
+struct cma_device;
+
+struct cma_dev_group;
+
+struct cma_dev_port_group {
+	unsigned int		port_num;
+	struct cma_dev_group	*cma_dev_group;
+	struct config_group	group;
+};
+
+struct cma_dev_group {
+	char				name[IB_DEVICE_NAME_MAX];
+	struct config_group		device_group;
+	struct config_group		ports_group;
+	struct config_group		*default_dev_group[2];
+	struct config_group		**default_ports_group;
+	struct cma_dev_port_group	*ports;
+};
+
+struct cma_configfs_attr {
+	struct configfs_attribute	attr;
+	ssize_t				(*show)(struct cma_device *cma_dev,
+						struct cma_dev_port_group *group,
+						char *buf);
+	ssize_t				(*store)(struct cma_device *cma_dev,
+						 struct cma_dev_port_group *group,
+						 const char *buf, size_t count);
+};
+
+static struct cma_dev_port_group *to_dev_port_group(struct config_item *item)
+{
+	struct config_group *group;
+
+	if (!item)
+		return NULL;
+
+	group = container_of(item, struct config_group, cg_item);
+	return container_of(group, struct cma_dev_port_group, group);
+}
+
+static ssize_t show_default_roce_mode(struct cma_device *cma_dev,
+				      struct cma_dev_port_group *group,
+				      char *buf)
+{
+	unsigned int port = group->port_num;
+	int gid_type = cma_get_default_gid_type(cma_dev, port);
+
+	if (gid_type < 0)
+		return gid_type;
+
+	return sprintf(buf, "%s\n", ib_cache_gid_type_str(gid_type));
+}
+
+static ssize_t store_default_roce_mode(struct cma_device *cma_dev,
+				       struct cma_dev_port_group *group,
+				       const char *buf, size_t count)
+{
+	int gid_type = ib_cache_gid_parse_type_str(buf);
+	unsigned int port = group->port_num;
+	int ret;
+
+	if (gid_type < 0)
+		return -EINVAL;
+
+	ret = cma_set_default_gid_type(cma_dev, port, gid_type);
+	if (ret)
+		return ret;
+
+	return strnlen(buf, count);
+}
+
+#define CMA_PARAM_PORT_ATTR_RW(_name)				\
+static struct cma_configfs_attr cma_configfs_attr_##_name =	\
+	__CONFIGFS_ATTR(_name, S_IRUGO | S_IWUSR, show_##_name, store_##_name)
+
+CMA_PARAM_PORT_ATTR_RW(default_roce_mode);
+
+static bool filter_by_name(struct ib_device *ib_dev, void *cookie)
+{
+	return !strcmp(ib_dev->name, cookie);
+}
+
+static ssize_t cma_configfs_attr_show(struct config_item *item,
+				      struct configfs_attribute *attr,
+				      char *buf)
+{
+	ssize_t ret = -EINVAL;
+	struct cma_dev_port_group *group = to_dev_port_group(item);
+	struct cma_device *cma_dev;
+	struct cma_configfs_attr *ca =
+		container_of(attr, struct cma_configfs_attr, attr);
+
+	if (!group)
+		return -ENODEV;
+
+	cma_dev = cma_enum_devices_by_ibdev(filter_by_name,
+					    group->cma_dev_group->name);
+	if (!cma_dev)
+		return -ENODEV;
+
+	if (ca->show)
+		ret = ca->show(cma_dev, group, buf);
+
+	cma_deref_dev(cma_dev);
+	return ret;
+}
+
+static ssize_t cma_configfs_attr_store(struct config_item *item,
+				       struct configfs_attribute *attr,
+				       const char *buf, size_t count)
+{
+	ssize_t ret = -EINVAL;
+	struct cma_dev_port_group *group = to_dev_port_group(item);
+	struct cma_device *cma_dev;
+	struct cma_configfs_attr *ca =
+		container_of(attr, struct cma_configfs_attr, attr);
+
+	if (!group)
+		return -ENODEV;
+
+	cma_dev = cma_enum_devices_by_ibdev(filter_by_name,
+					    group->cma_dev_group->name);
+	if (!cma_dev)
+		return -ENODEV;
+
+	if (ca->store)
+		ret = ca->store(cma_dev, group, buf, count);
+
+	cma_deref_dev(cma_dev);
+	return ret;
+}
+
+static struct configfs_attribute *cma_configfs_attributes[] = {
+	&cma_configfs_attr_default_roce_mode.attr,
+	NULL,
+};
+
+static struct configfs_item_operations cma_item_ops = {
+	.show_attribute		= cma_configfs_attr_show,
+	.store_attribute	= cma_configfs_attr_store,
+};
+
+static struct config_item_type cma_port_group_type = {
+	.ct_attrs	= cma_configfs_attributes,
+	.ct_item_ops	= &cma_item_ops,
+	.ct_owner	= THIS_MODULE
+};
+
+static int make_cma_ports(struct cma_dev_group *cma_dev_group,
+			  struct cma_device *cma_dev)
+{
+	struct ib_device *ibdev;
+	unsigned int i;
+	unsigned int ports_num;
+	struct cma_dev_port_group *ports;
+	int err;
+
+	ibdev = cma_get_ib_dev(cma_dev);
+
+	if (!ibdev)
+		return -ENODEV;
+
+	ports_num = ibdev->phys_port_cnt;
+	ports = kcalloc(ports_num, sizeof(*cma_dev_group->ports),
+			GFP_KERNEL);
+
+	cma_dev_group->default_ports_group = kcalloc(ports_num + 1,
+						     sizeof(struct config_group *),
+						     GFP_KERNEL);
+
+	if (!ports || !cma_dev_group->default_ports_group) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	for (i = 0; i < ports_num; i++) {
+		char port_str[10];
+
+		ports[i].port_num = i + 1;
+		snprintf(port_str, sizeof(port_str), "%u", i + 1);
+		ports[i].cma_dev_group = cma_dev_group;
+		config_group_init_type_name(&ports[i].group,
+					    port_str,
+					    &cma_port_group_type);
+		cma_dev_group->default_ports_group[i] = &ports[i].group;
+	}
+	cma_dev_group->default_ports_group[i] = NULL;
+	cma_dev_group->ports = ports;
+
+	return 0;
+free:
+	kfree(ports);
+	kfree(cma_dev_group->default_ports_group);
+	cma_dev_group->ports = NULL;
+	cma_dev_group->default_ports_group = NULL;
+	return err;
+}
+
+static void release_cma_dev(struct config_item  *item)
+{
+	struct config_group *group = container_of(item, struct config_group,
+						  cg_item);
+	struct cma_dev_group *cma_dev_group = container_of(group,
+							   struct cma_dev_group,
+							   device_group);
+
+	kfree(cma_dev_group);
+};
+
+static void release_cma_ports_group(struct config_item  *item)
+{
+	struct config_group *group = container_of(item, struct config_group,
+						  cg_item);
+	struct cma_dev_group *cma_dev_group = container_of(group,
+							   struct cma_dev_group,
+							   ports_group);
+
+	kfree(cma_dev_group->ports);
+	kfree(cma_dev_group->default_ports_group);
+	cma_dev_group->ports = NULL;
+	cma_dev_group->default_ports_group = NULL;
+};
+
+static struct configfs_item_operations cma_ports_item_ops = {
+	.release = release_cma_ports_group
+};
+
+static struct config_item_type cma_ports_group_type = {
+	.ct_item_ops	= &cma_ports_item_ops,
+	.ct_owner	= THIS_MODULE
+};
+
+static struct configfs_item_operations cma_device_item_ops = {
+	.release = release_cma_dev
+};
+
+static struct config_item_type cma_device_group_type = {
+	.ct_item_ops	= &cma_device_item_ops,
+	.ct_owner	= THIS_MODULE
+};
+
+static struct config_group *make_cma_dev(struct config_group *group,
+					 const char *name)
+{
+	int err = -ENODEV;
+	struct cma_device *cma_dev = cma_enum_devices_by_ibdev(filter_by_name,
+							       (void *)name);
+	struct cma_dev_group *cma_dev_group = NULL;
+
+	if (!cma_dev)
+		goto fail;
+
+	cma_dev_group = kzalloc(sizeof(*cma_dev_group), GFP_KERNEL);
+
+	if (!cma_dev_group) {
+		err = -ENOMEM;
+		goto fail;
+	}
+
+	strncpy(cma_dev_group->name, name, sizeof(cma_dev_group->name));
+
+	err = make_cma_ports(cma_dev_group, cma_dev);
+	if (err)
+		goto fail;
+
+	cma_dev_group->ports_group.default_groups =
+		cma_dev_group->default_ports_group;
+	config_group_init_type_name(&cma_dev_group->ports_group, "ports",
+				    &cma_ports_group_type);
+
+	cma_dev_group->device_group.default_groups
+		= cma_dev_group->default_dev_group;
+	cma_dev_group->default_dev_group[0] = &cma_dev_group->ports_group;
+	cma_dev_group->default_dev_group[1] = NULL;
+
+	config_group_init_type_name(&cma_dev_group->device_group, name,
+				    &cma_device_group_type);
+
+	cma_deref_dev(cma_dev);
+	return &cma_dev_group->device_group;
+
+fail:
+	if (cma_dev)
+		cma_deref_dev(cma_dev);
+	kfree(cma_dev_group);
+	return ERR_PTR(err);
+}
+
+static struct configfs_group_operations cma_subsys_group_ops = {
+	.make_group	= make_cma_dev,
+};
+
+static struct config_item_type cma_subsys_type = {
+	.ct_group_ops	= &cma_subsys_group_ops,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct configfs_subsystem cma_subsys = {
+	.su_group	= {
+		.cg_item	= {
+			.ci_namebuf	= "rdma_cm",
+			.ci_type	= &cma_subsys_type,
+		},
+	},
+};
+
+int __init cma_configfs_init(void)
+{
+	config_group_init(&cma_subsys.su_group);
+	mutex_init(&cma_subsys.su_mutex);
+	return configfs_register_subsystem(&cma_subsys);
+}
+
+void __exit cma_configfs_exit(void)
+{
+	configfs_unregister_subsystem(&cma_subsys);
+}
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index aeb107c..09d2615 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -38,9 +38,31 @@
 
 #include <rdma/ib_verbs.h>
 
+#if IS_ENABLED(CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS)
+int cma_configfs_init(void);
+void cma_configfs_exit(void);
+#else
+static inline int cma_configfs_init(void)
+{
+	return 0;
+}
+
+static inline void cma_configfs_exit(void)
+{
+}
+#endif
 struct cma_device;
 void cma_ref_dev(struct cma_device *cma_dev);
 void cma_deref_dev(struct cma_device *cma_dev);
+typedef bool (*cma_device_filter)(struct ib_device *, void *);
+struct cma_device *cma_enum_devices_by_ibdev(cma_device_filter	filter,
+					     void		*cookie);
+int cma_get_default_gid_type(struct cma_device *cma_dev,
+			     unsigned int port);
+int cma_set_default_gid_type(struct cma_device *cma_dev,
+			     unsigned int port,
+			     enum ib_gid_type default_gid_type);
+struct ib_device *cma_get_ib_dev(struct cma_device *cma_dev);
 
 int  ib_device_register_sysfs(struct ib_device *device,
 			      int (*port_callback)(struct ib_device *,
@@ -74,6 +96,8 @@ enum ib_cache_gid_default_mode {
 	IB_CACHE_GID_DEFAULT_MODE_DELETE
 };
 
+int ib_cache_gid_parse_type_str(const char *buf);
+
 const char *ib_cache_gid_type_str(enum ib_gid_type gid_type);
 
 void ib_cache_gid_set_default_gid(struct ib_device *ib_dev, u8 port,
-- 
2.1.0
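
For readers following ib_cache_gid_parse_type_str() above, here is a
minimal user-space sketch of the same matching rule: strip one trailing
newline, then require an exact-length match against a string table. It
is not the kernel code. The table contents and the names
demo_gid_type_str and demo_parse_gid_type are assumptions made for
illustration; the kernel's gid_type_str[] is defined outside the hunks
shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table; the kernel's gid_type_str[] is not shown in this patch. */
static const char *const demo_gid_type_str[] = {
	"IB/RoCE v1",
	"RoCE v2",
};

/*
 * Return the index of the entry that exactly matches buf, ignoring one
 * trailing newline (as written by `echo`), or -EINVAL if nothing matches.
 */
static int demo_parse_gid_type(const char *buf)
{
	size_t len, i;

	assert(buf != NULL);

	len = strlen(buf);
	if (len == 0)
		return -EINVAL;
	if (buf[len - 1] == '\n')
		len--;

	for (i = 0; i < sizeof(demo_gid_type_str) / sizeof(demo_gid_type_str[0]); i++) {
		const char *name = demo_gid_type_str[i];

		/* Compare lengths first so a prefix such as "RoCE" does not match. */
		if (len == strlen(name) && !strncmp(buf, name, len))
			return (int)i;
	}
	return -EINVAL;
}
```

The length check is what makes the match exact: strncmp() alone would
accept any prefix of a table entry.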


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH for-next V1 8/9] IB/core: Initialize UD header structure with IP and UDP headers
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2015-10-15 16:07   ` [PATCH for-next V1 7/9] IB/cma: Add configfs for rdma_cm Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
  2015-10-15 16:07   ` [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP Matan Barak
  2015-11-16 13:23   ` [PATCH for-next V1 0/9] Add RoCE v2 support Matan Barak
  9 siblings, 0 replies; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur, Moni Shoua

From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

ib_ud_header_init() formats InfiniBand headers in a buffer up to
(but not including) the BTH. RoCE UDP ENCAP requires this function
to also build the IP and UDP headers.
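
As a rough illustration of the length fields this patch fills in for
the IPv4 case, the sketch below reproduces the arithmetic in user
space. The header sizes are the ones this patch adds to ib_pack.h; the
DEMO_ names and the two helper functions are hypothetical, and the
values are returned in host byte order (the kernel stores them with
cpu_to_be16()).

```c
#include <assert.h>
#include <stdint.h>

/* Header sizes matching the IB_*_BYTES values used by this patch. */
enum {
	DEMO_IP4_BYTES  = 20,
	DEMO_UDP_BYTES  = 8,
	DEMO_BTH_BYTES  = 12,
	DEMO_DETH_BYTES = 8,
	DEMO_ICRC_BYTES = 4,
};

/* UDP length: UDP header + BTH + DETH + payload + ICRC. */
static uint16_t demo_udp_length(int payload_bytes)
{
	assert(payload_bytes >= 0);
	return (uint16_t)(DEMO_UDP_BYTES + DEMO_BTH_BYTES + DEMO_DETH_BYTES +
			  payload_bytes + DEMO_ICRC_BYTES);
}

/* IPv4 total length: IPv4 header plus everything that follows it. */
static uint16_t demo_ip4_tot_len(int payload_bytes, int udp_present)
{
	int udp_bytes = udp_present ? DEMO_UDP_BYTES : 0;

	assert(payload_bytes >= 0);
	return (uint16_t)(DEMO_IP4_BYTES + udp_bytes + DEMO_BTH_BYTES +
			  DEMO_DETH_BYTES + payload_bytes + DEMO_ICRC_BYTES);
}
```

For a 256-byte payload with UDP present, this gives a UDP length of 288
and an IPv4 total length of 308.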

Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/ud_header.c    | 155 ++++++++++++++++++++++++++++++---
 drivers/infiniband/hw/mlx4/qp.c        |   7 +-
 drivers/infiniband/hw/mthca/mthca_qp.c |   2 +-
 include/rdma/ib_pack.h                 |  45 ++++++++--
 4 files changed, 188 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/core/ud_header.c b/drivers/infiniband/core/ud_header.c
index 72feee6..96697e7 100644
--- a/drivers/infiniband/core/ud_header.c
+++ b/drivers/infiniband/core/ud_header.c
@@ -35,6 +35,7 @@
 #include <linux/string.h>
 #include <linux/export.h>
 #include <linux/if_ether.h>
+#include <linux/ip.h>
 
 #include <rdma/ib_pack.h>
 
@@ -116,6 +117,72 @@ static const struct ib_field vlan_table[]  = {
 	  .size_bits    = 16 }
 };
 
+static const struct ib_field ip4_table[]  = {
+	{ STRUCT_FIELD(ip4, ver),
+	  .offset_words = 0,
+	  .offset_bits  = 0,
+	  .size_bits    = 4 },
+	{ STRUCT_FIELD(ip4, hdr_len),
+	  .offset_words = 0,
+	  .offset_bits  = 4,
+	  .size_bits    = 4 },
+	{ STRUCT_FIELD(ip4, tos),
+	  .offset_words = 0,
+	  .offset_bits  = 8,
+	  .size_bits    = 8 },
+	{ STRUCT_FIELD(ip4, tot_len),
+	  .offset_words = 0,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+	{ STRUCT_FIELD(ip4, id),
+	  .offset_words = 1,
+	  .offset_bits  = 0,
+	  .size_bits    = 16 },
+	{ STRUCT_FIELD(ip4, frag_off),
+	  .offset_words = 1,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+	{ STRUCT_FIELD(ip4, ttl),
+	  .offset_words = 2,
+	  .offset_bits  = 0,
+	  .size_bits    = 8 },
+	{ STRUCT_FIELD(ip4, protocol),
+	  .offset_words = 2,
+	  .offset_bits  = 8,
+	  .size_bits    = 8 },
+	{ STRUCT_FIELD(ip4, check),
+	  .offset_words = 2,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+	{ STRUCT_FIELD(ip4, saddr),
+	  .offset_words = 3,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+	{ STRUCT_FIELD(ip4, daddr),
+	  .offset_words = 4,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 }
+};
+
+static const struct ib_field udp_table[]  = {
+	{ STRUCT_FIELD(udp, sport),
+	  .offset_words = 0,
+	  .offset_bits  = 0,
+	  .size_bits    = 16 },
+	{ STRUCT_FIELD(udp, dport),
+	  .offset_words = 0,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+	{ STRUCT_FIELD(udp, length),
+	  .offset_words = 1,
+	  .offset_bits  = 0,
+	  .size_bits    = 16 },
+	{ STRUCT_FIELD(udp, csum),
+	  .offset_words = 1,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 }
+};
+
 static const struct ib_field grh_table[]  = {
 	{ STRUCT_FIELD(grh, ip_version),
 	  .offset_words = 0,
@@ -213,26 +280,57 @@ static const struct ib_field deth_table[] = {
 	  .size_bits    = 24 }
 };
 
+__be16 ib_ud_ip4_csum(struct ib_ud_header *header)
+{
+	struct iphdr iph;
+
+	iph.ihl		= 5;
+	iph.version	= 4;
+	iph.tos		= header->ip4.tos;
+	iph.tot_len	= header->ip4.tot_len;
+	iph.id		= header->ip4.id;
+	iph.frag_off	= header->ip4.frag_off;
+	iph.ttl		= header->ip4.ttl;
+	iph.protocol	= header->ip4.protocol;
+	iph.check	= 0;
+	iph.saddr	= header->ip4.saddr;
+	iph.daddr	= header->ip4.daddr;
+
+	return ip_fast_csum((u8 *)&iph, iph.ihl);
+}
+EXPORT_SYMBOL(ib_ud_ip4_csum);
+
 /**
  * ib_ud_header_init - Initialize UD header structure
  * @payload_bytes:Length of packet payload
  * @lrh_present: specify if LRH is present
  * @eth_present: specify if Eth header is present
  * @vlan_present: packet is tagged vlan
- * @grh_present:GRH flag (if non-zero, GRH will be included)
+ * @grh_present: GRH flag (if non-zero, GRH will be included)
+ * @ip_version: if non-zero, IP header, V4 or V6, will be included
+ * @udp_present: if non-zero, UDP header will be included
  * @immediate_present: specify if immediate data is present
  * @header:Structure to initialize
  */
-void ib_ud_header_init(int     		    payload_bytes,
-		       int		    lrh_present,
-		       int		    eth_present,
-		       int		    vlan_present,
-		       int    		    grh_present,
-		       int		    immediate_present,
-		       struct ib_ud_header *header)
+int ib_ud_header_init(int     payload_bytes,
+		      int    lrh_present,
+		      int    eth_present,
+		      int    vlan_present,
+		      int    grh_present,
+		      int    ip_version,
+		      int    udp_present,
+		      int    immediate_present,
+		      struct ib_ud_header *header)
 {
+	grh_present = grh_present && !ip_version;
 	memset(header, 0, sizeof *header);
 
+	/*
+	 * UDP header without IP header doesn't make sense
+	 */
+	if (udp_present && ip_version != 4 && ip_version != 6)
+		return -EINVAL;
+
 	if (lrh_present) {
 		u16 packet_length;
 
@@ -252,7 +350,7 @@ void ib_ud_header_init(int     		    payload_bytes,
 	if (vlan_present)
 		header->eth.type = cpu_to_be16(ETH_P_8021Q);
 
-	if (grh_present) {
+	if (ip_version == 6 || grh_present) {
 		header->grh.ip_version      = 6;
 		header->grh.payload_length  =
 			cpu_to_be16((IB_BTH_BYTES     +
@@ -260,8 +358,30 @@ void ib_ud_header_init(int     		    payload_bytes,
 				     payload_bytes    +
 				     4                + /* ICRC     */
 				     3) & ~3);          /* round up */
-		header->grh.next_header     = 0x1b;
+		header->grh.next_header     = udp_present ? IPPROTO_UDP : 0x1b;
+	}
+
+	if (ip_version == 4) {
+		int udp_bytes = udp_present ? IB_UDP_BYTES : 0;
+
+		header->ip4.ver = 4; /* version 4 */
+		header->ip4.hdr_len = 5; /* 5 words */
+		header->ip4.tot_len =
+			cpu_to_be16(IB_IP4_BYTES   +
+				     udp_bytes     +
+				     IB_BTH_BYTES  +
+				     IB_DETH_BYTES +
+				     payload_bytes +
+				     4);     /* ICRC     */
+		header->ip4.protocol = IPPROTO_UDP;
 	}
+	if (udp_present && ip_version)
+		header->udp.length =
+			cpu_to_be16(IB_UDP_BYTES   +
+				     IB_BTH_BYTES  +
+				     IB_DETH_BYTES +
+				     payload_bytes +
+				     4);     /* ICRC     */
 
 	if (immediate_present)
 		header->bth.opcode           = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE;
@@ -273,8 +393,11 @@ void ib_ud_header_init(int     		    payload_bytes,
 	header->lrh_present = lrh_present;
 	header->eth_present = eth_present;
 	header->vlan_present = vlan_present;
-	header->grh_present = grh_present;
+	header->grh_present = grh_present || (ip_version == 6);
+	header->ipv4_present = ip_version == 4;
+	header->udp_present = udp_present;
 	header->immediate_present = immediate_present;
+	return 0;
 }
 EXPORT_SYMBOL(ib_ud_header_init);
 
@@ -311,6 +434,16 @@ int ib_ud_header_pack(struct ib_ud_header *header,
 			&header->grh, buf + len);
 		len += IB_GRH_BYTES;
 	}
+	if (header->ipv4_present) {
+		ib_pack(ip4_table, ARRAY_SIZE(ip4_table),
+			&header->ip4, buf + len);
+		len += IB_IP4_BYTES;
+	}
+	if (header->udp_present) {
+		ib_pack(udp_table, ARRAY_SIZE(udp_table),
+			&header->udp, buf + len);
+		len += IB_UDP_BYTES;
+	}
 
 	ib_pack(bth_table, ARRAY_SIZE(bth_table),
 		&header->bth, buf + len);
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 71a4176..e8737ff 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -2089,7 +2089,7 @@ static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp,
 	if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_PROXY_SMI_OWNER)
 		send_size += sizeof (struct mlx4_ib_tunnel_header);
 
-	ib_ud_header_init(send_size, 1, 0, 0, 0, 0, &sqp->ud_header);
+	ib_ud_header_init(send_size, 1, 0, 0, 0, 0, 0, 0, &sqp->ud_header);
 
 	if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_PROXY_SMI_OWNER) {
 		sqp->ud_header.lrh.service_level =
@@ -2235,7 +2235,10 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_send_wr *wr,
 			is_vlan = 1;
 		}
 	}
-	ib_ud_header_init(send_size, !is_eth, is_eth, is_vlan, is_grh, 0, &sqp->ud_header);
+	err = ib_ud_header_init(send_size, !is_eth, is_eth, is_vlan, is_grh,
+				0, 0, 0, &sqp->ud_header);
+	if (err)
+		return err;
 
 	if (!is_eth) {
 		sqp->ud_header.lrh.service_level =
diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c
index e354b2f..22a04fd 100644
--- a/drivers/infiniband/hw/mthca/mthca_qp.c
+++ b/drivers/infiniband/hw/mthca/mthca_qp.c
@@ -1485,7 +1485,7 @@ static int build_mlx_header(struct mthca_dev *dev, struct mthca_sqp *sqp,
 	u16 pkey;
 
 	ib_ud_header_init(256, /* assume a MAD */ 1, 0, 0,
-			  mthca_ah_grh_present(to_mah(wr->wr.ud.ah)), 0,
+			  mthca_ah_grh_present(to_mah(wr->wr.ud.ah)), 0, 0, 0,
 			  &sqp->ud_header);
 
 	err = mthca_read_ah(dev, to_mah(wr->wr.ud.ah), &sqp->ud_header);
diff --git a/include/rdma/ib_pack.h b/include/rdma/ib_pack.h
index 709a533..0e494d5 100644
--- a/include/rdma/ib_pack.h
+++ b/include/rdma/ib_pack.h
@@ -41,6 +41,8 @@ enum {
 	IB_ETH_BYTES  = 14,
 	IB_VLAN_BYTES = 4,
 	IB_GRH_BYTES  = 40,
+	IB_IP4_BYTES  = 20,
+	IB_UDP_BYTES  = 8,
 	IB_BTH_BYTES  = 12,
 	IB_DETH_BYTES = 8
 };
@@ -223,6 +225,27 @@ struct ib_unpacked_eth {
 	__be16	type;
 };
 
+struct ib_unpacked_ip4 {
+	u8	ver;
+	u8	hdr_len;
+	u8	tos;
+	__be16	tot_len;
+	__be16	id;
+	__be16	frag_off;
+	u8	ttl;
+	u8	protocol;
+	__be16	check;
+	__be32	saddr;
+	__be32	daddr;
+};
+
+struct ib_unpacked_udp {
+	__be16	sport;
+	__be16	dport;
+	__be16	length;
+	__be16	csum;
+};
+
 struct ib_unpacked_vlan {
 	__be16  tag;
 	__be16  type;
@@ -237,6 +260,10 @@ struct ib_ud_header {
 	struct ib_unpacked_vlan vlan;
 	int			grh_present;
 	struct ib_unpacked_grh	grh;
+	int			ipv4_present;
+	struct ib_unpacked_ip4	ip4;
+	int			udp_present;
+	struct ib_unpacked_udp	udp;
 	struct ib_unpacked_bth	bth;
 	struct ib_unpacked_deth deth;
 	int			immediate_present;
@@ -253,13 +280,17 @@ void ib_unpack(const struct ib_field        *desc,
 	       void                         *buf,
 	       void                         *structure);
 
-void ib_ud_header_init(int		    payload_bytes,
-		       int		    lrh_present,
-		       int		    eth_present,
-		       int		    vlan_present,
-		       int		    grh_present,
-		       int		    immediate_present,
-		       struct ib_ud_header *header);
+__be16 ib_ud_ip4_csum(struct ib_ud_header *header);
+
+int ib_ud_header_init(int		    payload_bytes,
+		      int		    lrh_present,
+		      int		    eth_present,
+		      int		    vlan_present,
+		      int		    grh_present,
+		      int		    ip_version,
+		      int		    udp_present,
+		      int		    immediate_present,
+		      struct ib_ud_header *header);
 
 int ib_ud_header_pack(struct ib_ud_header *header,
 		      void                *buf);
-- 
2.1.0
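
ib_ud_ip4_csum() above copies the unpacked fields into a struct iphdr
and hands it to ip_fast_csum(). The sketch below shows the underlying
arithmetic in user space, using a plain RFC 1071 loop instead of the
kernel helper. The function name demo_ip4_csum and the sample header
bytes in demo_hdr are assumptions chosen for illustration, not data
from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sample 20-byte IPv4 header (made up); checksum bytes 10-11 are zero. */
static const uint8_t demo_hdr[20] = {
	0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
	0x40, 0x11, 0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01,
	0xc0, 0xa8, 0x00, 0xc7,
};

/*
 * RFC 1071 ones'-complement checksum over an IPv4 header given as raw
 * bytes in network order. The checksum field must be zero on entry.
 * Returns the checksum as a host-order number; store it big-endian.
 */
static uint16_t demo_ip4_csum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(hdr != NULL && len % 2 == 0);

	/* Add the header as a sequence of big-endian 16-bit words. */
	for (i = 0; i < len; i += 2)
		sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Zeroing the checksum field before summing mirrors the iph.check = 0
assignment in ib_ud_ip4_csum().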


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (7 preceding siblings ...)
  2015-10-15 16:07   ` [PATCH for-next V1 8/9] IB/core: Initialize UD header structure with IP and UDP headers Matan Barak
@ 2015-10-15 16:07   ` Matan Barak
       [not found]     ` <1444925232-13598-10-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-11-16 13:23   ` [PATCH for-next V1 0/9] Add RoCE v2 support Matan Barak
  9 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-10-15 16:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Jason Gunthorpe,
	Matan Barak, Eran Ben Elisha, Somnath Kotur, Moni Shoua

From: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Since RoCE v2 runs over IP, IGMP join and leave requests must be sent
to the network when joining and leaving multicast groups.

Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cma.c       | 96 +++++++++++++++++++++++++++++++++----
 drivers/infiniband/core/multicast.c | 20 +++++++-
 include/rdma/ib_sa.h                |  3 ++
 3 files changed, 107 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index b03099e..423d6ba 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -38,6 +38,7 @@
 #include <linux/in6.h>
 #include <linux/mutex.h>
 #include <linux/random.h>
+#include <linux/igmp.h>
 #include <linux/idr.h>
 #include <linux/inetdevice.h>
 #include <linux/slab.h>
@@ -290,6 +291,7 @@ struct cma_multicast {
 	void			*context;
 	struct sockaddr_storage	addr;
 	struct kref		mcref;
+	bool			igmp_joined;
 };
 
 struct cma_work {
@@ -386,6 +388,26 @@ static inline void cma_set_ip_ver(struct cma_hdr *hdr, u8 ip_ver)
 	hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF);
 }
 
+static int cma_igmp_send(struct net_device *ndev, union ib_gid *mgid, bool join)
+{
+	struct in_device *in_dev = NULL;
+
+	if (ndev) {
+		rtnl_lock();
+		in_dev = __in_dev_get_rtnl(ndev);
+		if (in_dev) {
+			if (join)
+				ip_mc_inc_group(in_dev,
+						*(__be32 *)(mgid->raw + 12));
+			else
+				ip_mc_dec_group(in_dev,
+						*(__be32 *)(mgid->raw + 12));
+		}
+		rtnl_unlock();
+	}
+	return (in_dev) ? 0 : -ENODEV;
+}
+
 static void cma_attach_to_dev(struct rdma_id_private *id_priv,
 			      struct cma_device *cma_dev)
 {
@@ -1476,8 +1498,24 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
 				      id_priv->id.port_num)) {
 			ib_sa_free_multicast(mc->multicast.ib);
 			kfree(mc);
-		} else
+		} else {
+			if (mc->igmp_joined) {
+				struct rdma_dev_addr *dev_addr =
+					&id_priv->id.route.addr.dev_addr;
+				struct net_device *ndev = NULL;
+
+				if (dev_addr->bound_dev_if)
+					ndev = dev_get_by_index(&init_net,
+								dev_addr->bound_dev_if);
+				if (ndev) {
+					cma_igmp_send(ndev,
+						      &mc->multicast.ib->rec.mgid,
+						      false);
+					dev_put(ndev);
+				}
+			}
 			kref_put(&mc->mcref, release_mc);
+		}
 	}
 }
 
@@ -3707,7 +3745,7 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
 {
 	struct iboe_mcast_work *work;
 	struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
-	int err;
+	int err = 0;
 	struct sockaddr *addr = (struct sockaddr *)&mc->addr;
 	struct net_device *ndev = NULL;
 
@@ -3739,13 +3777,35 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
 	mc->multicast.ib->rec.rate = iboe_get_rate(ndev);
 	mc->multicast.ib->rec.hop_limit = 1;
 	mc->multicast.ib->rec.mtu = iboe_get_mtu(ndev->mtu);
+	mc->multicast.ib->rec.ifindex = dev_addr->bound_dev_if;
+	mc->multicast.ib->rec.net = &init_net;
+	rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr,
+		    &mc->multicast.ib->rec.port_gid);
+
+	mc->multicast.ib->rec.gid_type =
+		id_priv->cma_dev->default_gid_type[id_priv->id.port_num -
+		rdma_start_port(id_priv->cma_dev->device)];
+	if (addr->sa_family == AF_INET) {
+		if (mc->multicast.ib->rec.gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP)
+			err = cma_igmp_send(ndev, &mc->multicast.ib->rec.mgid,
+					    true);
+		if (!err) {
+			mc->igmp_joined = true;
+			mc->multicast.ib->rec.hop_limit = IPV6_DEFAULT_HOPLIMIT;
+		}
+	} else {
+		if (mc->multicast.ib->rec.gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP)
+			err = -ENOTSUPP;
+		else
+			mc->multicast.ib->rec.gid_type = IB_GID_TYPE_IB;
+	}
 	dev_put(ndev);
-	if (!mc->multicast.ib->rec.mtu) {
-		err = -EINVAL;
+	if (err || !mc->multicast.ib->rec.mtu) {
+		if (!err)
+			err = -EINVAL;
 		goto out2;
 	}
-	rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr,
-		    &mc->multicast.ib->rec.port_gid);
+
 	work->id = id_priv;
 	work->mc = mc;
 	INIT_WORK(&work->work, iboe_mcast_work_handler);
@@ -3780,7 +3840,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
 	memcpy(&mc->addr, addr, rdma_addr_size(addr));
 	mc->context = context;
 	mc->id_priv = id_priv;
-
+	mc->igmp_joined = false;
 	spin_lock(&id_priv->lock);
 	list_add(&mc->list, &id_priv->mc_list);
 	spin_unlock(&id_priv->lock);
@@ -3825,9 +3885,25 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
 			if (rdma_cap_ib_mcast(id->device, id->port_num)) {
 				ib_sa_free_multicast(mc->multicast.ib);
 				kfree(mc);
-			} else if (rdma_protocol_roce(id->device, id->port_num))
-				kref_put(&mc->mcref, release_mc);
-
+			} else if (rdma_protocol_roce(id->device, id->port_num)) {
+					if (mc->igmp_joined) {
+						struct rdma_dev_addr *dev_addr =
+							&id->route.addr.dev_addr;
+						struct net_device *ndev = NULL;
+
+						if (dev_addr->bound_dev_if)
+							ndev = dev_get_by_index(&init_net,
+										dev_addr->bound_dev_if);
+						if (ndev) {
+							cma_igmp_send(ndev,
+								      &mc->multicast.ib->rec.mgid,
+								      false);
+							dev_put(ndev);
+						}
+						mc->igmp_joined = false;
+					}
+					kref_put(&mc->mcref, release_mc);
+				}
 			return;
 		}
 	}
diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index 6911ae6..f71904e 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -729,8 +729,24 @@ int ib_init_ah_from_mcmember(struct ib_device *device, u8 port_num,
 	u16 gid_index;
 	u8 p;
 
-	ret = ib_find_cached_gid(device, &rec->port_gid, IB_GID_TYPE_IB,
-				 NULL, &p, &gid_index);
+	if (rdma_protocol_roce(device, port_num)) {
+		struct net_device *ndev = rec->net ?
+			dev_get_by_index(rec->net, rec->ifindex) : NULL;
+
+		ret = ib_find_cached_gid_by_port(device, &rec->port_gid,
+						 rec->gid_type, port_num,
+						 ndev,
+						 &gid_index);
+		if (ndev)
+			dev_put(ndev);
+	} else if (rdma_protocol_ib(device, port_num)) {
+		ret = ib_find_cached_gid(device, &rec->port_gid,
+					 IB_GID_TYPE_IB, NULL, &p,
+					 &gid_index);
+	} else {
+		ret = -EINVAL;
+	}
+
 	if (ret)
 		return ret;
 
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index 0a40ed2..5bea0e8 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -206,6 +206,9 @@ struct ib_sa_mcmember_rec {
 	u8           scope;
 	u8           join_state;
 	int          proxy_join;
+	int	     ifindex;
+	struct net  *net;
+	enum ib_gid_type gid_type;
 };
 
 /* Service Record Component Mask Sec 15.2.5.14 Ver 1.1	*/
-- 
2.1.0
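
Note on the mgid->raw + 12 offset used by cma_igmp_send() above: it reads
the IPv4 group address from the last four bytes of the 16-byte MGID. A
minimal user-space sketch of that layout follows. It is not part of the
patch; struct gid16 and the three helpers are hypothetical names used only
for illustration, and memcpy() stands in for the patch's pointer cast.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's union ib_gid (16 raw bytes). */
struct gid16 {
	uint8_t raw[16];
};

/*
 * Store an IPv4 address (four bytes, network order) in the last four
 * bytes of the GID, the same offset the patch reads from.
 */
static void gid_set_ipv4(struct gid16 *gid, const uint8_t ipv4[4])
{
	memcpy(gid->raw + 12, ipv4, 4);
}

/* Copy the IPv4 address back out of bytes 12..15. */
static void gid_get_ipv4(const struct gid16 *gid, uint8_t ipv4[4])
{
	memcpy(ipv4, gid->raw + 12, 4);
}

/* IPv4 multicast addresses are 224.0.0.0/4: the top nibble is 0xE. */
static int ipv4_is_multicast(const uint8_t ipv4[4])
{
	return (ipv4[0] & 0xf0) == 0xe0;
}
```

For example, storing 239.1.2.3 with gid_set_ipv4() and reading it back with
gid_get_ipv4() yields the same four bytes, and ipv4_is_multicast() accepts
them, while a unicast address such as 10.0.0.1 is rejected.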

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 0/9] Add RoCE v2 support
       [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2015-10-15 16:07   ` [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP Matan Barak
@ 2015-11-16 13:23   ` Matan Barak
       [not found]     ` <CAAKD3BBm2WZ8TqSFi7gC82BwBTCc+D-SJSpSSqhEqMjL8-Fq_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  9 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-11-16 13:23 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma, Or Gerlitz, Jason Gunthorpe,
	Eran Ben Elisha, Somnath Kotur

On Thu, Oct 15, 2015 at 6:07 PM, Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Hi Doug,
>
> This series adds the support for RoCE v2. In order to support RoCE v2,
> we add gid_type attribute to every GID. When the RoCE GID management
> populates the GID table, it duplicates each GID with all supported types.
> This gives the user the ability to communicate over each supported
> type.
>
> Patch 0001, 0002 and 0003 add support for multiple GID types to the
> cache and related APIs. The third patch exposes the GID attributes
> information is sysfs.
>
> Patch 0004 adds the RoCE v2 GID type and the capabilities required
> from the vendor in order to implement RoCE v2. These capabilities
> are grouped together as RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP.
>
> RoCE v2 could work at IPv4 and IPv6 networks. When receiving ib_wc, this
> information should come from the vendor's driver. In case the vendor
> doesn't supply this information, we parse the packet headers and resolve
> its network type. Patch 0005 adds this information and required utilities.
>
> Patches 0006 and 0007 add configfs support (and the required
> infrastructure) for CMA. The administrator should be able to set the
> default RoCE type. This is done through a new per-port
> default_roce_mode configfs file.
>
> Patch 0008 formats a QP1 packet in order to support RoCE v2 CM
> packets. This is required for vendors which implement their
> QP1 as a Raw QP.
>
> Patch 0009 adds support for IPv4 multicast as an IPv4 network
> requires IGMP to be sent in order to join multicast groups.
>
> Vendors code aren't part of this patch-set. Soft-Roce will be
> sent soon and depends on these patches. Other vendors, like
> mlx4, ocrdma and mlx5 will follow.
>
> This patch is applied on "Add RoCE GID cache usage in verbs/cma"
> which was sent to the mailing list.
>
> Thanks,
> Matan
>
> Changes from V0:
>  - Rebased patches against Doug's latest k.o/for-4.4 tree.
>  - Fixed a bug in configfs (rmdir caused an incorrect free).
>
> Matan Barak (6):
>   IB/core: Add gid_type to gid attribute
>   IB/cm: Use the source GID index type
>   IB/core: Add gid attributes to sysfs
>   IB/core: Add ROCE_UDP_ENCAP (RoCE V2) type
>   IB/rdma_cm: Add wrapper for cma reference count
>   IB/cma: Add configfs for rdma_cm
>
> Moni Shoua (2):
>   IB/core: Initialize UD header structure with IP and UDP headers
>   IB/cma: Join and leave multicast groups with IGMP
>
> Somnath Kotur (1):
>   IB/core: Add rdma_network_type to wc
>
>  drivers/infiniband/Kconfig                |   9 +
>  drivers/infiniband/core/Makefile          |   2 +
>  drivers/infiniband/core/addr.c            |  14 ++
>  drivers/infiniband/core/cache.c           | 156 +++++++++----
>  drivers/infiniband/core/cm.c              |  25 ++-
>  drivers/infiniband/core/cma.c             | 216 ++++++++++++++++--
>  drivers/infiniband/core/cma_configfs.c    | 353 ++++++++++++++++++++++++++++++
>  drivers/infiniband/core/core_priv.h       |  32 +++
>  drivers/infiniband/core/device.c          |  10 +-
>  drivers/infiniband/core/multicast.c       |  20 +-
>  drivers/infiniband/core/roce_gid_mgmt.c   |  61 +++++-
>  drivers/infiniband/core/sa_query.c        |   5 +-
>  drivers/infiniband/core/sysfs.c           | 184 +++++++++++++++-
>  drivers/infiniband/core/ud_header.c       | 155 ++++++++++++-
>  drivers/infiniband/core/uverbs_marshall.c |   1 +
>  drivers/infiniband/core/verbs.c           | 124 ++++++++++-
>  drivers/infiniband/hw/mlx4/qp.c           |   7 +-
>  drivers/infiniband/hw/mthca/mthca_qp.c    |   2 +-
>  include/rdma/ib_addr.h                    |   1 +
>  include/rdma/ib_cache.h                   |   4 +
>  include/rdma/ib_pack.h                    |  45 +++-
>  include/rdma/ib_sa.h                      |   4 +
>  include/rdma/ib_verbs.h                   |  78 ++++++-
>  23 files changed, 1402 insertions(+), 106 deletions(-)
>  create mode 100644 drivers/infiniband/core/cma_configfs.c
>
> --
> 2.1.0
>

Hi Doug,

This series was posted a month ago.
If you have any comments, I would appreciate it if you could post
them. Otherwise, could we please push this series to v4.5?
If needed, I can rebase it over your for-4.5 branch once you open
such a branch.

Thanks,
Matan

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 0/9] Add RoCE v2 support
       [not found]     ` <CAAKD3BBm2WZ8TqSFi7gC82BwBTCc+D-SJSpSSqhEqMjL8-Fq_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-22 21:28       ` Or Gerlitz
       [not found]         ` <CAJ3xEMiAkz0aouPgHWD31CwrX4SmOQfysJBX2kQOZ91gVP+94g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Or Gerlitz @ 2015-11-22 21:28 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Matan Barak, linux-rdma, Or Gerlitz, Jason Gunthorpe,
	Eran Ben Elisha, Somnath Kotur, Matan Barak, Christoph Lameter,
	talal-VPRAkNaXOzVWk0Htik3J/w

On Mon, Nov 16, 2015, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On Thu, Oct 15, 2015 , Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:

>> Hi Doug,
>> This series adds the support for RoCE v2. In order to support RoCE v2,
>> we add gid_type attribute to every GID. When the RoCE GID management
>> populates the GID table, it duplicates each GID with all supported types.
>> This gives the user the ability to communicate over each supported type.

> Hi Doug,
> This series was posted a month ago.
> If you have any comments, I would appreciate it if you could post them.

Doug,

Correction. The series was posted on Aug 13 -- three months ago, see [1].

I think it's fair enough to ask where it stands w.r.t.
inclusion in 4.5.

Lack of RoCE v2 support in the IB core is slowing down RoCE
support for mlx5 and adoption of Soft-RoCE. It's fair enough to ask
you, as maintainer, to react/respond after the patches have been
waiting for three months.

Or.

[1] http://marc.info/?l=linux-rdma&m=143948192926723&w=2

> Otherwise, could we please push this series to v4.5?
> If needed, I can rebase it over your for-4.5 branch once you open
> such a branch.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 0/9] Add RoCE v2 support
       [not found]         ` <CAJ3xEMiAkz0aouPgHWD31CwrX4SmOQfysJBX2kQOZ91gVP+94g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-23 19:53           ` Doug Ledford
  0 siblings, 0 replies; 33+ messages in thread
From: Doug Ledford @ 2015-11-23 19:53 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Matan Barak, linux-rdma, Or Gerlitz, Jason Gunthorpe,
	Eran Ben Elisha, Somnath Kotur, Matan Barak, Christoph Lameter,
	talal-VPRAkNaXOzVWk0Htik3J/w

On 11/22/2015 04:28 PM, Or Gerlitz wrote:
> On Mon, Nov 16, 2015, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On Thu, Oct 15, 2015 , Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> 
>>> Hi Doug,
>>> This series adds the support for RoCE v2. In order to support RoCE v2,
>>> we add gid_type attribute to every GID. When the RoCE GID management
>>> populates the GID table, it duplicates each GID with all supported types.
>>> This gives the user the ability to communicate over each supported type.
> 
>> Hi Doug,
>> This series was posted a month ago.
>> If you have any comments, I would appreciate it if you could post them.
> 
> Doug,
> 
> Correction. The series was posted on Aug 13 -- three months ago, see [1].
> 
> I think it's fair enough to ask where it stands w.r.t.
> inclusion in 4.5.

It's normally fair as you say.  However, as you know, I've been out for
a while.  This set only missed 4.4 because I didn't have time to review
it prior to the merge window.  Not the fault of anyone who submitted it,
but a reality of my life circumstances nonetheless.  Reviewing this
and getting the blocked code moving forward is at the top of my 4.5
candidate list.

> Lack of RoCE v2 support in the IB core is slowing down RoCE
> support for mlx5 and adoption of Soft-RoCE. It's fair enough to ask
> you, as maintainer, to react/respond after the patches have been
> waiting for three months.
> 
> Or.
> 
> [1] http://marc.info/?l=linux-rdma&m=143948192926723&w=2
> 
>> Otherwise, could we please push this series to v4.5?
>> If needed, I can rebase it over your for-4.5 branch once you open
>> such a branch.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]     ` <1444925232-13598-6-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-11-23 21:19       ` Jason Gunthorpe
       [not found]         ` <20151123211916.GA6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-23 21:19 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

> +	/* Use the hint from IP Stack to select GID Type */
> +	network_gid_type = ib_network_to_gid_type(addr->dev_addr.network);
> +	if (addr->dev_addr.network != RDMA_NETWORK_IB) {
> +		route->path_rec->gid_type = network_gid_type;
> +		/* TODO: get the hoplimit from the inet/inet6 device */
> +		route->path_rec->hop_limit = IPV6_DEFAULT_HOPLIMIT;

Uh, that is more than a TODO, that is showing this is all messed up.

It isn't just the hop limit that has to come from the route entry, all
the source information of the path comes from there. Ie the gid table
should accept the route entry directly and spit out the sgid_index.

The responder side is the same, it also needs to do a route lookup to
figure out what it is doing, and that may not match what the rx says
from the headers. This is important stuff.

I really don't like the API changes that went in with the last series
that added net_dev and gid_attr everywhere, that just seems to be
enabling mistakes like the above. You can't use rocev2 without doing
route lookups, providing APIs that don't force this to happen just
encourages broken flows like this.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs
       [not found]     ` <1444925232-13598-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-11-23 21:20       ` Jason Gunthorpe
       [not found]         ` <20151123212029.GB6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-23 21:20 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Thu, Oct 15, 2015 at 07:07:06PM +0300, Matan Barak wrote:
> This patch set adds attributes of net device and gid type to each GID
> in the GID table. Users that use verbs directly need to specify
> the GID index. Since the same GID could have different types or
> associated net devices, users should have the ability to query the
> associated GID attributes. Adding these attributes to sysfs.

There is no addition to Documentation/ABI/ for this change

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 7/9] IB/cma: Add configfs for rdma_cm
       [not found]     ` <1444925232-13598-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-11-23 21:23       ` Jason Gunthorpe
       [not found]         ` <20151123212359.GC6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-23 21:23 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Thu, Oct 15, 2015 at 07:07:10PM +0300, Matan Barak wrote:
> Users would like to control the behaviour of rdma_cm.
> For example, old applications which don't set the
> required RoCE gid type could be executed on RoCE V2
> network types. In order to support this configuration,
> we implement a configfs for rdma_cm.
> 
> In order to use the configfs, one needs to mount it and
> mkdir <IB device name> inside rdma_cm directory.
> 
> The patch adds support for a single configuration file,
> default_roce_mode. The mode can either be "IB/RoCE v1" or
> "RoCE v2".

Also no Documentation/ABI stuff for this

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP
       [not found]     ` <1444925232-13598-10-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-11-23 21:25       ` Jason Gunthorpe
       [not found]         ` <20151123212526.GD6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-23 21:25 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur, Moni Shoua

On Thu, Oct 15, 2015 at 07:07:12PM +0300, Matan Barak wrote:
> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
> index 0a40ed2..5bea0e8 100644
> +++ b/include/rdma/ib_sa.h
> @@ -206,6 +206,9 @@ struct ib_sa_mcmember_rec {
>  	u8           scope;
>  	u8           join_state;
>  	int          proxy_join;
> +	int	     ifindex;
> +	struct net  *net;
> +	enum ib_gid_type gid_type;
>  };

This is really gross.

Make ib_init_ah_from_mcmember accept a QP and get the above stuff from
the QP.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 7/9] IB/cma: Add configfs for rdma_cm
       [not found]         ` <20151123212359.GC6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-11-24  8:28           ` Matan Barak
  0 siblings, 0 replies; 33+ messages in thread
From: Matan Barak @ 2015-11-24  8:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Mon, Nov 23, 2015 at 11:23 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Thu, Oct 15, 2015 at 07:07:10PM +0300, Matan Barak wrote:
>> Users would like to control the behaviour of rdma_cm.
>> For example, old applications which don't set the
>> required RoCE gid type could be executed on RoCE V2
>> network types. In order to support this configuration,
>> we implement a configfs for rdma_cm.
>>
>> In order to use the configfs, one needs to mount it and
>> mkdir <IB device name> inside rdma_cm directory.
>>
>> The patch adds support for a single configuration file,
>> default_roce_mode. The mode can either be "IB/RoCE v1" or
>> "RoCE v2".
>
> Also no Documentation/ABI stuff for this
>

I'll add documentation for this. Thanks.

> Jason

Matan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs
       [not found]         ` <20151123212029.GB6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-11-24  8:31           ` Matan Barak
       [not found]             ` <CAAKD3BCQzBax6N3+-RhdEvByQu3mz1KKsjQ7yjs-fn2_nSPfOA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-11-24  8:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Mon, Nov 23, 2015 at 11:20 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Thu, Oct 15, 2015 at 07:07:06PM +0300, Matan Barak wrote:
>> This patch set adds attributes of net device and gid type to each GID
>> in the GID table. Users that use verbs directly need to specify
>> the GID index. Since the same GID could have different types or
>> associated net devices, users should have the ability to query the
>> associated GID attributes. Adding these attributes to sysfs.
>
> There is no addition to Documentation/ABI/ for this change
>

Do we have documentation for our sysfs ABI in the Documentation directory?
Grepping for infiniband there only found sysfs-driver-ib_srp, which
isn't the right place to add these things.

> Jason

Matan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP
       [not found]         ` <20151123212526.GD6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-11-24  9:41           ` Moni Shoua
       [not found]             ` <CAG9sBKMUPJ74RLKT54yO-==0gP9nzfrbfWz1Mb_J5VstRQr2OA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Moni Shoua @ 2015-11-24  9:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Mon, Nov 23, 2015 at 11:25 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Thu, Oct 15, 2015 at 07:07:12PM +0300, Matan Barak wrote:
>> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
>> index 0a40ed2..5bea0e8 100644
>> +++ b/include/rdma/ib_sa.h
>> @@ -206,6 +206,9 @@ struct ib_sa_mcmember_rec {
>>       u8           scope;
>>       u8           join_state;
>>       int          proxy_join;
>> +     int          ifindex;
>> +     struct net  *net;
>> +     enum ib_gid_type gid_type;
>>  };
>
> This is really gross.
>
> Make ib_init_ah_from_mcmember accept a QP and get the above stuff from
> the QP.
>
> Jason

Which QP is that? You might not have any existing QP when you want to
create the AH, or you might have 10.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]         ` <20151123211916.GA6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-11-24 13:47           ` Matan Barak
       [not found]             ` <CAAKD3BCWMrd8A+UgjQg+jtfLmyOCaOB4iGCr2ZAbaazRBZeGxw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-11-24 13:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Mon, Nov 23, 2015 at 11:19 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>> +     /* Use the hint from IP Stack to select GID Type */
>> +     network_gid_type = ib_network_to_gid_type(addr->dev_addr.network);
>> +     if (addr->dev_addr.network != RDMA_NETWORK_IB) {
>> +             route->path_rec->gid_type = network_gid_type;
>> +             /* TODO: get the hoplimit from the inet/inet6 device */
>> +             route->path_rec->hop_limit = IPV6_DEFAULT_HOPLIMIT;
>
> Uh, that is more than a TODO, that is showing this is all messed up.
>
> It isn't just the hop limit that has to come from the route entry, all
> the source information of the path comes from there. Ie the gid table
> should accept the route entry directly and spit out the sgid_index.
>
> The responder side is the same, it also needs to do a route lookup to
> figure out what it is doing, and that may not match what the rx says
> from the headers. This is important stuff.
>

The only entity that translates between IPs and GIDs is the RDMACM.
The GID cache is like a database. It allows one to store, retrieve and
query GIDs and their GID attributes.
roce_gid_mgmt is the part that populates this "dumb" database.
IMHO, adding such a "smart" layer to the GID cache is wrong, as this
should be part of the RDMACM, which does the translation. There is no
need to get the GID cache involved.
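
As a rough illustration of the "dumb database" view (a user-space sketch
with hypothetical names such as gid_entry and gid_table_find(), not the
kernel's GID cache code): the table only stores entries and matches them
on the fields the caller supplies, and the same GID can appear once per
supported type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical GID types, standing in for the kernel's enum ib_gid_type. */
enum gid_type { GID_TYPE_ROCE_V1, GID_TYPE_ROCE_V2 };

/* Hypothetical table entry: a GID plus the attributes stored with it. */
struct gid_entry {
	uint8_t raw[16];
	enum gid_type type;
	int ifindex;
};

/*
 * Return the index of the first entry matching GID, type and ifindex,
 * or -1 if none matches. No routing knowledge is involved: the caller
 * must already know which type and interface it wants.
 */
static int gid_table_find(const struct gid_entry *tbl, size_t n,
			  const uint8_t raw[16], enum gid_type type,
			  int ifindex)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].type == type && tbl[i].ifindex == ifindex &&
		    memcmp(tbl[i].raw, raw, 16) == 0)
			return (int)i;
	}
	return -1;
}
```

With the same GID stored once as RoCE v1 and once as RoCE v2, the caller
decides which of the two indices it gets by passing the type; the table
itself makes no routing decision.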


> I really don't like the API changes that went in with the last series
> that added net_dev and gid_attr everywhere, that just seems to be
> enabling mistakes like the above. You can't use rocev2 without doing
> route lookups, providing APIs that don't force this to happen just
> encourages broken flows like this.
>
> Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs
       [not found]             ` <CAAKD3BCQzBax6N3+-RhdEvByQu3mz1KKsjQ7yjs-fn2_nSPfOA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-24 13:49               ` Matan Barak
       [not found]                 ` <CAAKD3BA=h+Mpq9VBnCNpv0UCAkmwCBtOahpOhhWdCvUM=C7JPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-11-24 13:49 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Tue, Nov 24, 2015 at 10:31 AM, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On Mon, Nov 23, 2015 at 11:20 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>> On Thu, Oct 15, 2015 at 07:07:06PM +0300, Matan Barak wrote:
>>> This patch set adds attributes of net device and gid type to each GID
>>> in the GID table. Users that use verbs directly need to specify
>>> the GID index. Since the same GID could have different types or
>>> associated net devices, users should have the ability to query the
>>> associated GID attributes. Adding these attributes to sysfs.
>>
>> There is no addition to Documentation/ABI/ for this change
>>
>
> Do we have documentation for our sysfs ABI in the Documentation directory?
> Grepping for infiniband there only found sysfs-driver-ib_srp, which
> isn't the right place to add these things.
>

Of course, if there is no such documentation, I can add a new file
for the sysfs ABI defined in this patch.

>> Jason
>
> Matan
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs
       [not found]                 ` <CAAKD3BA=h+Mpq9VBnCNpv0UCAkmwCBtOahpOhhWdCvUM=C7JPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-24 17:59                   ` Jason Gunthorpe
  0 siblings, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-24 17:59 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Tue, Nov 24, 2015 at 03:49:28PM +0200, Matan Barak wrote:
> Of course, if there is no such documentation, I can add a new file
> for the sysfs ABI defined in this patch.

That is probably needed, our old stuff predates this new process.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]             ` <CAAKD3BCWMrd8A+UgjQg+jtfLmyOCaOB4iGCr2ZAbaazRBZeGxw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-24 18:14               ` Jason Gunthorpe
       [not found]                 ` <20151124181415.GC10391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-24 18:14 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Tue, Nov 24, 2015 at 03:47:51PM +0200, Matan Barak wrote:
> > It isn't just the hop limit that has to come from the route entry, all
> > the source information of the path comes from there. Ie the gid table
> > should accept the route entry directly and spit out the sgid_index.
> >
> > The responder side is the same, it also needs to do a route lookup to
> > figure out what it is doing, and that may not match what the rx says
> > from the headers. This is important stuff.
> >
> 
> The only entity that translates between IPs and GIDs is the RDMACM.

The rocev2 stuff is using IP, and the gid entry is now overloaded to
specify IP header fields.

Absolutely every determination of IP header fields needs to go through
the route table, so every single lookup that can return a rocev2 SGID
*MUST* use route data.

The places in this series where that isn't done are plain and simply
wrong.

The abstraction at the gid cache is making it too easy to make this
mistake. It is enabling callers to do direct gid lookups without a
route lookup, which is unconditionally wrong. Every call site into the
gid cache I looked at appears to have this problem.

The simplest fix is to have a new gid cache api for rocev2 that
somehow forces/includes the necessary route lookup. The existing API
cannot simply be extended for rocev2.
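
One way to picture the API shape being asked for here is the minimal
sketch below. It is an illustration only: rocev2_sgid_index_from_route(),
struct route_result and struct gid_entry are hypothetical names, not the
gid cache API from this series, and the route lookup itself is assumed to
have happened elsewhere. The point is that the lookup takes the result of
a route lookup as its only key, so a caller cannot ask for a rocev2
sgid_index without one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names here are hypothetical; this is not the kernel's gid cache API. */
enum gid_type { GID_TYPE_ROCE_V1, GID_TYPE_ROCE_V2 };

struct gid_entry {
	uint8_t gid[16];
	enum gid_type type;
	int ifindex;            /* net device that owns this entry */
};

/* Stand-in for what a route lookup would hand back. */
struct route_result {
	uint8_t src_gid[16];    /* source address selected by routing */
	int out_ifindex;        /* egress device selected by routing */
	uint8_t hop_limit;      /* also comes from the route, not the table */
};

/*
 * Return the sgid_index of the rocev2 entry that matches the route,
 * or -1 if the table has no entry for that source address and device.
 */
static int rocev2_sgid_index_from_route(const struct gid_entry *table,
					size_t n,
					const struct route_result *rt)
{
	assert(table != NULL && rt != NULL);

	for (size_t i = 0; i < n; i++) {
		if (table[i].type != GID_TYPE_ROCE_V2)
			continue;
		if (table[i].ifindex != rt->out_ifindex)
			continue;
		if (memcmp(table[i].gid, rt->src_gid, sizeof(rt->src_gid)) == 0)
			return (int)i;
	}
	return -1;
}
```

With this shape, the same GID present under two net devices or two GID
types resolves to exactly one index, chosen by the route rather than by
the caller.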

> roce_gid_mgmt, is the part that populates this "dumb" database.
> IMHO, adding such a "smart" layer to the GID cache is wrong, as this
> should be part of RDMACM which does the translation. No need to get
> the gid cache involved.

OK. Change the gid cache so only a RDMA CM private API can return
rocev2 gids.

Jason

* Re: [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP
       [not found]             ` <CAG9sBKMUPJ74RLKT54yO-==0gP9nzfrbfWz1Mb_J5VstRQr2OA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-24 18:15               ` Jason Gunthorpe
       [not found]                 ` <20151124181500.GD10391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-24 18:15 UTC (permalink / raw)
  To: Moni Shoua
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Tue, Nov 24, 2015 at 11:41:10AM +0200, Moni Shoua wrote:
> On Mon, Nov 23, 2015 at 11:25 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > On Thu, Oct 15, 2015 at 07:07:12PM +0300, Matan Barak wrote:
> >> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
> >> index 0a40ed2..5bea0e8 100644
> >> +++ b/include/rdma/ib_sa.h
> >> @@ -206,6 +206,9 @@ struct ib_sa_mcmember_rec {
> >>       u8           scope;
> >>       u8           join_state;
> >>       int          proxy_join;
> >> +     int          ifindex;
> >> +     struct net  *net;
> >> +     enum ib_gid_type gid_type;
> >>  };
> >
> > This is really gross.
> >
> > Make ib_init_ah_from_mcmember accept a QP and get the above stuff from
> > the QP.
> >
> > Jason
> 
> Which QP is that. You might not have any existing QP when you want to
> create the AH or you might have 10.

roce multicast is only done with the CM and the CM always has a QP.

Jason

* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]                 ` <20151124181415.GC10391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-11-24 19:07                   ` Matan Barak
       [not found]                     ` <CAAKD3BAO6rNn-Br=MZxvkd+rYSsE9G7wK+9YR9uJ3xdP1U+u0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-11-30 20:56                   ` Liran Liss
  1 sibling, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-11-24 19:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Tue, Nov 24, 2015 at 8:14 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Nov 24, 2015 at 03:47:51PM +0200, Matan Barak wrote:
>> > It isn't just the hop limit that has to come from the route entry, all
>> > the source information of the path comes from there. Ie the gid table
>> > should accept the route entry directly and spit out the sgid_index.
>> >
>> > The responder side is the same, it also needs to do a route lookup to
>> > figure out what it is doing, and that may not match what the rx says
>> > from the headers. This is important stuff.
>> >
>>
>> The only entity that translates between IPs and GIDs is the RDMACM.
>
> The rocev2 stuff is using IP, and the gid entry is now overloaded to
> specify IP header fields.
>

The GID entry is now overloaded to expose GID metadata. For example,
ndev (for L2 Ethernet attributes) and GID type.

> Absolutely every determination of IP header fields needs to go through
> the route table, so every single lookup that can return a rocev2 SGID
> *MUST* use route data.
>
> The places in this series where that isn't done are plain and simply
> wrong.
>

IMHO, the user is entitled to choose any valid sgid_index for the
interface. Anything he chooses is guaranteed to be valid (from a security
perspective), but isn't guaranteed to work if both sides don't use
IPs that can be routed successfully to the destination.
Why do we need to block users who use ibv_rc_pingpong and choose the
GID index correctly by hand?

> The abstraction at the gid cache is making it too easy to make this
> mistake. It is enabling callers to do direct gid lookups without a
> route lookup, which is unconditionally wrong. Every call site into the
> gid cache I looked at appears to have this problem.
>

We can and should guarantee that rdma-cm users get the right GID every time.
I don't think we should block users from choosing either a correct GID
or an incorrect GID; that's up to them.
We're only providing a correct database that these users can query and
a correct rdma-cm model.

> The simplest fix is to have a new gid cache api for rocev2 that
> somehow forces/includes the necessary route lookup. The existing API
> cannot simply be extended for rocev2.
>
>> roce_gid_mgmt, is the part that populates this "dumb" database.
>> IMHO, adding such a "smart" layer to the GID cache is wrong, as this
>> should be part of RDMACM which does the translation. No need to get
>> the gid cache involved.
>
> OK. Change the gid cache so only a RDMA CM private API can return
> rocev2 gids.
>

So you propose to block verbs applications from using the RoCE v2 GIDs? Why?

> Jason

Matan

* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]                     ` <CAAKD3BAO6rNn-Br=MZxvkd+rYSsE9G7wK+9YR9uJ3xdP1U+u0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-25  6:55                       ` Jason Gunthorpe
       [not found]                         ` <20151125065542.GC4326-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-25  6:55 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Tue, Nov 24, 2015 at 09:07:41PM +0200, Matan Barak wrote:

> IMHO, the user is entitled to choose any valid sgid_index for the
> interface. Anything he chooses is guaranteed to be valid (from a
> security perspective)

No, the namespace patches will have to limit the sgid_indexes that can
be used with a QP to those that fall within the namespace. This is
another reason I don't like this approach for the kapi.

> Why do we need to block users who use ibv_rc_pingpong and choose the
> GID index correctly by hand?

I'm not really concerned with user space, we are stuck with exporting
the gid index there.

> > OK. Change the gid cache so only a RDMA CM private API can return
> > rocev2 gids.
> 
> So you propose to block verbs applications from using the RoCE v2 GIDs? Why?

Just the kernel consumers, so the in-kernel users are correct.

Jason

* Re: [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP
       [not found]                 ` <20151124181500.GD10391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-11-25  8:31                   ` Moni Shoua
       [not found]                     ` <CAG9sBKORfbJQWxg7nn6OuZydNZQj4f1ZhDTPKoc-YUbzQNybrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Moni Shoua @ 2015-11-25  8:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Tue, Nov 24, 2015 at 8:15 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Nov 24, 2015 at 11:41:10AM +0200, Moni Shoua wrote:
>> On Mon, Nov 23, 2015 at 11:25 PM, Jason Gunthorpe
>> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>> > On Thu, Oct 15, 2015 at 07:07:12PM +0300, Matan Barak wrote:
>> >> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
>> >> index 0a40ed2..5bea0e8 100644
>> >> +++ b/include/rdma/ib_sa.h
>> >> @@ -206,6 +206,9 @@ struct ib_sa_mcmember_rec {
>> >>       u8           scope;
>> >>       u8           join_state;
>> >>       int          proxy_join;
>> >> +     int          ifindex;
>> >> +     struct net  *net;
>> >> +     enum ib_gid_type gid_type;
>> >>  };
>> >
>> > This is really gross.
>> >
>> > Make ib_init_ah_from_mcmember accept a QP and get the above stuff from
>> > the QP.
>> >
>> > Jason
>>
>> Which QP is that. You might not have any existing QP when you want to
>> create the AH or you might have 10.
>
> roce multicast is only done with the CM and the CM always has a QP.
>
> Jason

I don't see why you can't join before having a QP, and anyway,
rdma_create_qp() is optional.

* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]                         ` <20151125065542.GC4326-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-11-25 14:18                           ` Matan Barak
       [not found]                             ` <CAAKD3BAEMD47cScunGNnx2iitL6uFWicDHALJt5w-szoSZwOwg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 33+ messages in thread
From: Matan Barak @ 2015-11-25 14:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Wed, Nov 25, 2015 at 8:55 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Nov 24, 2015 at 09:07:41PM +0200, Matan Barak wrote:
>
>> IMHO, the user is entitled to choose any valid sgid_index for the
>> interface. Anything he chooses is guaranteed to be valid (from a
>> security perspective)
>
> No, the namespace patches will have to limit the sgid_indexes that can
> be used with a QP to those that fall within the namespace. This is
> another reason I don't like this approach for the kapi.
>

By saying namespace, do you mean net namespaces?
If so, the gid cache allows searching by net device (and there's a
"custom" search for which the user can define a filter function that can
filter by net).
Anyway, I don't think this cache should be used as anything other than a
simple database.
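
To make the "custom" search concrete, here is a minimal sketch of a table
lookup driven by a caller-supplied filter. The names
(gid_table_find_by_filter, struct gid_attr, match_netns) and the integer
netns_id standing in for a net namespace are assumptions made for the
example, not the signatures used in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-entry attributes; netns_id stands in for a struct net. */
struct gid_attr {
	int ifindex;
	int netns_id;
};

/* Caller-supplied predicate: return true to accept an entry. */
typedef bool (*gid_filter_fn)(const struct gid_attr *attr, void *context);

/* Return the index of the first entry the filter accepts, or -1. */
static int gid_table_find_by_filter(const struct gid_attr *table, size_t n,
				    gid_filter_fn filter, void *context)
{
	assert(table != NULL && filter != NULL);

	for (size_t i = 0; i < n; i++) {
		if (filter(&table[i], context))
			return (int)i;
	}
	return -1;
}

/* Example filter: accept only entries in the namespace given by context. */
static bool match_netns(const struct gid_attr *attr, void *context)
{
	return attr->netns_id == *(const int *)context;
}
```

The table stays a plain database in this sketch; any policy, such as
namespace membership, lives entirely in the filter the caller passes in.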

>> Why do we need to block users who use ibv_rc_pingpong and choose the
>> GID index correctly by hand?
>
> I'm not really concerned with user space, we are stuck with exporting
> the gid index there.
>

So why do we need to block kernel applications from doing the same
things user-space applications can do?

>> > OK. Change the gid cache so only a RDMA CM private API can return
>> > rocev2 gids.
>>
>> So you propose to block verbs applications from using the RoCE v2 GIDs? Why?
>
> Just the kernel consumers, so the in-kernel users are correct.
>

If there are kernel consumers that want to work with verbs directly,
they should use ib_init_ah_from_wc and ib_resolve_eth_dmac (or we can
rename that for other L2 attributes).
The shared code shouldn't be in the cache.

> Jason

Matan

* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]                             ` <CAAKD3BAEMD47cScunGNnx2iitL6uFWicDHALJt5w-szoSZwOwg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-25 17:29                               ` Jason Gunthorpe
  0 siblings, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-25 17:29 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Wed, Nov 25, 2015 at 04:18:25PM +0200, Matan Barak wrote:
> On Wed, Nov 25, 2015 at 8:55 AM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > On Tue, Nov 24, 2015 at 09:07:41PM +0200, Matan Barak wrote:
> >
> >> IMHO, the user is entitled to choose any valid sgid_index for the
> >> interface. Anything he chooses is guaranteed to be valid (from a
> >> security perspective)
> >
> > No, the namespace patches will have to limit the sgid_indexes that can
> > be used with a QP to those that fall within the namespace. This is
> > another reason I don't like this approach for the kapi.
> 
> By saying namespace, do you mean net namespaces?

Whatever it turns out to be, Haggie was talking about rdma namespaces
for some of this stuff too, but IMHO, rocev2 is pretty clearly
covered under net namespaces.

> If so, the gid cache allows searching by net device (and there's a
> "custom" search for which the user can define a filter function that can
> filter by net).
> Anyway, I don't think this cache should be used as anything other than a
> simple database.

It has nothing to do with the cache, it is everywhere else: you can't
create a qp with a sgid index that is not part of your namespace, for
instance, or receive a packet on a QP outside your namespace,
etc. Lots of details.



> >> Why do we need to block users who use ibv_rc_pingpong and choose the
> >> GID index correctly by hand?
> >
> > I'm not really concerned with user space, we are stuck with exporting
> > the gid index there.
> 
> So why do we need to block kernel applications from doing the same
> things user-space applications can do?

As I explained, it is never correct to use a naked sgid_index with
rocev2. uverbs can't be fixed without a uapi change, but the kernel
can be.

> If there are kernel consumers that want to work with verbs directly,
> they should use ib_init_ah_from_wc and ib_resolve_eth_dmac (or we
> can

As I already said these functions are wrong, they don't have the
routing lookup needed for rocev2. That is my whole point, the
functions that are using the gid cache for rocev2 are *not correct*

I don't really care how you fix it, but every rocev2 sgid-index lookup
in the kernel must be accompanied by a route lookup.

I think the gid cache API design is wrong here because it doesn't
force the above, but whatever, if you choose a different API it
becomes your job to review every patch from now on to make sure other
people use your dangerous API properly.

Jason

* Re: [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP
       [not found]                     ` <CAG9sBKORfbJQWxg7nn6OuZydNZQj4f1ZhDTPKoc-YUbzQNybrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-11-25 17:39                       ` Jason Gunthorpe
  0 siblings, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2015-11-25 17:39 UTC (permalink / raw)
  To: Moni Shoua
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

On Wed, Nov 25, 2015 at 10:31:15AM +0200, Moni Shoua wrote:
> On Tue, Nov 24, 2015 at 8:15 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > On Tue, Nov 24, 2015 at 11:41:10AM +0200, Moni Shoua wrote:
> >> On Mon, Nov 23, 2015 at 11:25 PM, Jason Gunthorpe
> >> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> >> > On Thu, Oct 15, 2015 at 07:07:12PM +0300, Matan Barak wrote:
> >> >> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
> >> >> index 0a40ed2..5bea0e8 100644
> >> >> +++ b/include/rdma/ib_sa.h
> >> >> @@ -206,6 +206,9 @@ struct ib_sa_mcmember_rec {
> >> >>       u8           scope;
> >> >>       u8           join_state;
> >> >>       int          proxy_join;
> >> >> +     int          ifindex;
> >> >> +     struct net  *net;
> >> >> +     enum ib_gid_type gid_type;
> >> >>  };
> >> >
> >> > This is really gross.
> >> >
> >> > Make ib_init_ah_from_mcmember accept a QP and get the above stuff from
> >> > the QP.
> >> >
> >> > Jason
> >>
> >> Which QP is that. You might not have any existing QP when you want to
> >> create the AH or you might have 10.
> >
> > roce multicast is only done with the CM and the CM always has a QP.
> >
> I don't see why you can't join before having a QP and anyway,
> rdma_create_qp() is optional

Ugh, gross, why would anyone want to do that..

Doesn't change my point, the CM id is bound before multicast join can
run, don't pollute ib_sa_mcmember_rec with this.
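
A rough sketch of that split, under stated assumptions: struct bound_id is
a hypothetical stand-in for whatever a bound CM id already knows, netns_id
replaces the struct net pointer, and init_ah_from_mcmember() is an
illustrative helper, not the existing ib_init_ah_from_mcmember(). The
record carries only multicast group data, and the local addressing state
is passed alongside it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum gid_type { GID_TYPE_IB, GID_TYPE_ROCE_V2 };

/* Group record: no ifindex, namespace, or gid_type fields added. */
struct mcmember_rec {
	uint8_t mgid[16];
	uint8_t scope;
	uint8_t join_state;
};

/* Hypothetical view of the state a bound CM id already holds. */
struct bound_id {
	int ifindex;
	int netns_id;           /* stand-in for a struct net pointer */
	enum gid_type gid_type;
};

struct ah_attr {
	uint8_t dgid[16];
	int ifindex;
	int netns_id;
	enum gid_type gid_type;
};

/* Build address-handle attributes from the record plus the bound id. */
static int init_ah_from_mcmember(const struct bound_id *id,
				 const struct mcmember_rec *rec,
				 struct ah_attr *ah)
{
	assert(rec != NULL && ah != NULL);

	if (id == NULL)
		return -1;      /* nothing bound yet, so no device to use */

	memcpy(ah->dgid, rec->mgid, sizeof(ah->dgid));
	ah->ifindex = id->ifindex;
	ah->netns_id = id->netns_id;
	ah->gid_type = id->gid_type;
	return 0;
}
```

Whether the real code should key this on the CM id, the QP, or something
else is exactly what is being argued in this subthread; the sketch only
shows that the record itself need not grow.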

Jason

* RE: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]                 ` <20151124181415.GC10391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-11-24 19:07                   ` Matan Barak
@ 2015-11-30 20:56                   ` Liran Liss
       [not found]                     ` <HE1PR05MB1418F62E731D463F2A63A899B1000-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  1 sibling, 1 reply; 33+ messages in thread
From: Liran Liss @ 2015-11-30 20:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma, Or Gerlitz,
	Eran Ben Elisha, Somnath Kotur

> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-

> 
> The abstraction at the gid cache is making it too easy to make this mistake. It
> is enabling callers to do direct gid lookups without a route lookup, which is
> unconditionally wrong. Every call site into the gid cache I looked at appears to
> have this problem.
> 
> The simplest fix is to have a new gid cache api for rocev2 that somehow
> forces/includes the necessary route lookup. The existing API cannot simply
> be extended for rocev2.
> 

I think that the GID cache should remain just that: a cache.
We shouldn't bloat it.
The CMA is the proper place to handle IP resolution.

> > roce_gid_mgmt, is the part that populates this "dumb" database.
> > IMHO, adding such a "smart" layer to the GID cache is wrong, as this
> > should be part of RDMACM which does the translation. No need to get
> > the gid cache involved.
> 
> OK. Change the gid cache so only a RDMA CM private API can return
> rocev2 gids.
> 

The same cache is also used in IB and thus by other components, so it cannot
be a private CM API.
RoCE ULPs use the CMA for establishing connections, so route lookups should
be done from there.
 --Liran


* Re: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc
       [not found]                     ` <HE1PR05MB1418F62E731D463F2A63A899B1000-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2015-12-01 14:35                       ` Matan Barak
  0 siblings, 0 replies; 33+ messages in thread
From: Matan Barak @ 2015-12-01 14:35 UTC (permalink / raw)
  To: Liran Liss
  Cc: Jason Gunthorpe, Matan Barak, Doug Ledford, linux-rdma,
	Or Gerlitz, Eran Ben Elisha, Somnath Kotur

On Mon, Nov 30, 2015 at 10:56 PM, Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
>
>>
>> The abstraction at the gid cache is making it too easy to make this mistake. It
>> is enabling callers to do direct gid lookups without a route lookup, which is
>> unconditionally wrong. Every call site into the gid cache I looked at appears to
>> have this problem.
>>
>> The simplest fix is to have a new gid cache api for rocev2 that somehow
>> forces/includes the necessary route lookup. The existing API cannot simply
>> be extended for rocev2.
>>
>
> I think that the GID cache should remain just that: a cache.
> We shouldn't bloat it.
> The CMA is the proper place to handle IP resolution.
>

For the sake of validating that a proper route is chosen, we could add
route validation in ib_init_ah_from_wc() and ib_init_ah_from_path().

>> > roce_gid_mgmt, is the part that populates this "dumb" database.
>> > IMHO, adding such a "smart" layer to the GID cache is wrong, as this
>> > should be part of RDMACM which does the translation. No need to get
>> > the gid cache involved.
>>
>> OK. Change the gid cache so only a RDMA CM private API can return
>> rocev2 gids.
>>
>
> The same cache is also used in IB and thus by other components, so it cannot
> be a private CM API.
> RoCE ULPs use the CMA for establishing connections, so route lookups should
> be done from there.
>  --Liran
>

end of thread, other threads:[~2015-12-01 14:35 UTC | newest]

Thread overview: 33+ messages
2015-10-15 16:07 [PATCH for-next V1 0/9] Add RoCE v2 support Matan Barak
     [not found] ` <1444925232-13598-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-15 16:07   ` [PATCH for-next V1 1/9] IB/core: Add gid_type to gid attribute Matan Barak
2015-10-15 16:07   ` [PATCH for-next V1 2/9] IB/cm: Use the source GID index type Matan Barak
2015-10-15 16:07   ` [PATCH for-next V1 3/9] IB/core: Add gid attributes to sysfs Matan Barak
     [not found]     ` <1444925232-13598-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-11-23 21:20       ` Jason Gunthorpe
     [not found]         ` <20151123212029.GB6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-11-24  8:31           ` Matan Barak
     [not found]             ` <CAAKD3BCQzBax6N3+-RhdEvByQu3mz1KKsjQ7yjs-fn2_nSPfOA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-24 13:49               ` Matan Barak
     [not found]                 ` <CAAKD3BA=h+Mpq9VBnCNpv0UCAkmwCBtOahpOhhWdCvUM=C7JPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-24 17:59                   ` Jason Gunthorpe
2015-10-15 16:07   ` [PATCH for-next V1 4/9] IB/core: Add ROCE_UDP_ENCAP (RoCE V2) type Matan Barak
2015-10-15 16:07   ` [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc Matan Barak
     [not found]     ` <1444925232-13598-6-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-11-23 21:19       ` Jason Gunthorpe
     [not found]         ` <20151123211916.GA6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-11-24 13:47           ` Matan Barak
     [not found]             ` <CAAKD3BCWMrd8A+UgjQg+jtfLmyOCaOB4iGCr2ZAbaazRBZeGxw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-24 18:14               ` Jason Gunthorpe
     [not found]                 ` <20151124181415.GC10391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-11-24 19:07                   ` Matan Barak
     [not found]                     ` <CAAKD3BAO6rNn-Br=MZxvkd+rYSsE9G7wK+9YR9uJ3xdP1U+u0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-25  6:55                       ` Jason Gunthorpe
     [not found]                         ` <20151125065542.GC4326-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-11-25 14:18                           ` Matan Barak
     [not found]                             ` <CAAKD3BAEMD47cScunGNnx2iitL6uFWicDHALJt5w-szoSZwOwg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-25 17:29                               ` Jason Gunthorpe
2015-11-30 20:56                   ` Liran Liss
     [not found]                     ` <HE1PR05MB1418F62E731D463F2A63A899B1000-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-12-01 14:35                       ` Matan Barak
2015-10-15 16:07   ` [PATCH for-next V1 6/9] IB/rdma_cm: Add wrapper for cma reference count Matan Barak
2015-10-15 16:07   ` [PATCH for-next V1 7/9] IB/cma: Add configfs for rdma_cm Matan Barak
     [not found]     ` <1444925232-13598-8-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-11-23 21:23       ` Jason Gunthorpe
     [not found]         ` <20151123212359.GC6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-11-24  8:28           ` Matan Barak
2015-10-15 16:07   ` [PATCH for-next V1 8/9] IB/core: Initialize UD header structure with IP and UDP headers Matan Barak
2015-10-15 16:07   ` [PATCH for-next V1 9/9] IB/cma: Join and leave multicast groups with IGMP Matan Barak
     [not found]     ` <1444925232-13598-10-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-11-23 21:25       ` Jason Gunthorpe
     [not found]         ` <20151123212526.GD6062-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-11-24  9:41           ` Moni Shoua
     [not found]             ` <CAG9sBKMUPJ74RLKT54yO-==0gP9nzfrbfWz1Mb_J5VstRQr2OA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-24 18:15               ` Jason Gunthorpe
     [not found]                 ` <20151124181500.GD10391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-11-25  8:31                   ` Moni Shoua
     [not found]                     ` <CAG9sBKORfbJQWxg7nn6OuZydNZQj4f1ZhDTPKoc-YUbzQNybrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-25 17:39                       ` Jason Gunthorpe
2015-11-16 13:23   ` [PATCH for-next V1 0/9] Add RoCE v2 support Matan Barak
     [not found]     ` <CAAKD3BBm2WZ8TqSFi7gC82BwBTCc+D-SJSpSSqhEqMjL8-Fq_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-22 21:28       ` Or Gerlitz
     [not found]         ` <CAJ3xEMiAkz0aouPgHWD31CwrX4SmOQfysJBX2kQOZ91gVP+94g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-11-23 19:53           ` Doug Ledford
