All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-next 0/4] Support SendOnlyFullMember join
@ 2016-04-19 12:51 Erez Shitrit
       [not found] ` <1461070287-13469-1-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-19 12:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Erez Shitrit

There are 4 types of multicast join: FullMember, NonMember,
SendOnlyNonMember and SendOnlyFullMember.

This patch-set introduces the ability to send multicast join requests of
type SendOnlyFullMember (until now only the first three types were
supported).
Host that sends such join request, does not require a multicast
subscription and will still send properly, and in addition will not get
other multicast packets from that group.

It adds a new SA query (for classport_info) to check that the sm
supports sending such requests and it modifies multicast processing in
the IB/core module to handle SendOnlyFullMember multicast join requests.

In addition, the patch set modifies IPoIB so that if the sm does support
SendOnlyFullMember join requests, IPoIB will send such a request when
joining a sendonly multicast group.


Erez Shitrit (4):
  IB/core: Add support for get ClassPortInfo from the SA
  IB/sa: Add support for sa get ClassPortInfo
  IB/core: Support new type of join-state for multicast
  IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join

 drivers/infiniband/core/multicast.c            |  23 +++-
 drivers/infiniband/core/sa_query.c             | 174 +++++++++++++++++++++++++
 drivers/infiniband/hw/qib/qib_mad.c            |   6 +-
 drivers/infiniband/ulp/ipoib/ipoib.h           |   2 +
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |  74 +++++++++++
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |  33 +++--
 drivers/infiniband/ulp/srpt/ib_srpt.c          |   5 +-
 include/rdma/ib_mad.h                          |   4 +-
 include/rdma/ib_sa.h                           |  12 ++
 9 files changed, 310 insertions(+), 23 deletions(-)

-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH for-next 1/4] IB/core: Add support for get ClassPortInfo from the SA
       [not found] ` <1461070287-13469-1-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-04-19 12:51   ` Erez Shitrit
       [not found]     ` <1461070287-13469-2-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-04-19 12:51   ` [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo Erez Shitrit
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-19 12:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Erez Shitrit


Change struct ib_class_port_info to conform to IB Spec 1.3
That in order to get specific capability mask from ClassPortInfo mad.

>From the IB Spec, ClassPortInfo section:
	"CapabilityMask2 Bits 0-26: Additional class-specific capabilities...
	 RespTimeValue the rest 5 bits"

The new struct now has one field that contains both of cap_mask2 (which was
reserved and resp_time which now is 27 bits for cap_mask2 and 5 bits for
resp_time in the new field)

More changes to adjust the new structure:
IB/qib: Change pma_get_classportinfo
IB/srpt: Adjust the use of ib_class_port_info

Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/qib/qib_mad.c   | 6 ++++--
 drivers/infiniband/ulp/srpt/ib_srpt.c | 5 ++++-
 include/rdma/ib_mad.h                 | 4 ++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 0bd1837..c5d029d 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -1158,6 +1158,7 @@ static int pma_get_classportinfo(struct ib_pma_mad *pmp,
 	struct ib_class_port_info *p =
 		(struct ib_class_port_info *)pmp->data;
 	struct qib_devdata *dd = dd_from_ibdev(ibdev);
+	char *p_cap_mask2;
 
 	memset(pmp->data, 0, sizeof(pmp->data));
 
@@ -1172,11 +1173,12 @@ static int pma_get_classportinfo(struct ib_pma_mad *pmp,
 	 * Set the most significant bit of CM2 to indicate support for
 	 * congestion statistics
 	 */
-	p->reserved[0] = dd->psxmitwait_supported << 7;
+	p_cap_mask2 = (char *)&p->cap_mask2_resp_time;
+	p_cap_mask2[0] = dd->psxmitwait_supported << 7;
 	/*
 	 * Expected response time is 4.096 usec. * 2^18 == 1.073741824 sec.
 	 */
-	p->resp_time_value = 18;
+	p_cap_mask2[3] = 18;
 
 	return reply((struct ib_smp *) pmp);
 }
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 0bd3cb2..d12b602 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -249,12 +249,15 @@ static void srpt_set_ioc(u8 *c_list, u32 slot, u8 value)
 static void srpt_get_class_port_info(struct ib_dm_mad *mad)
 {
 	struct ib_class_port_info *cif;
+	char *p_cap_mask2;
 
 	cif = (struct ib_class_port_info *)mad->data;
 	memset(cif, 0, sizeof(*cif));
 	cif->base_version = 1;
 	cif->class_version = 1;
-	cif->resp_time_value = 20;
+
+	p_cap_mask2 = (char *)&cif->cap_mask2_resp_time;
+	p_cap_mask2[3] = 20;
 
 	mad->mad_hdr.status = 0;
 }
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 37dd534c..2aaf1cb 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -243,8 +243,8 @@ struct ib_class_port_info {
 	u8			base_version;
 	u8			class_version;
 	__be16			capability_mask;
-	u8			reserved[3];
-	u8			resp_time_value;
+	  /* 27 bits for cap_mask2, 5 bits for resp_time */
+	__be32			cap_mask2_resp_time;
 	u8			redirect_gid[16];
 	__be32			redirect_tcslfl;
 	__be16			redirect_lid;
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo
       [not found] ` <1461070287-13469-1-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-04-19 12:51   ` [PATCH for-next 1/4] IB/core: Add support for get ClassPortInfo from the SA Erez Shitrit
@ 2016-04-19 12:51   ` Erez Shitrit
       [not found]     ` <1461070287-13469-3-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-04-19 12:51   ` [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast Erez Shitrit
  2016-04-19 12:51   ` [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join Erez Shitrit
  3 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-19 12:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Erez Shitrit


New sa query function to return the ClassPortInfo struct from the sa.
If the SM supports FullMemberSendOnly mode for MCG's, it sets a
capability bit in the capability_mask2 field of the response.

Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/sa_query.c | 174 +++++++++++++++++++++++++++++++++++++
 include/rdma/ib_sa.h               |  12 +++
 2 files changed, 186 insertions(+)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 8a09c0f..66535bf 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -119,6 +119,12 @@ struct ib_sa_guidinfo_query {
 	struct ib_sa_query sa_query;
 };
 
+struct ib_sa_classport_info_query {
+	void (*callback)(int, struct ib_class_port_info *, void *);
+	void *context;
+	struct ib_sa_query sa_query;
+};
+
 struct ib_sa_mcmember_query {
 	void (*callback)(int, struct ib_sa_mcmember_rec *, void *);
 	void *context;
@@ -392,6 +398,83 @@ static const struct ib_field service_rec_table[] = {
 	  .size_bits    = 2*64 },
 };
 
+#define PORTCLASSINFO_REC_FIELD(field) \
+	.struct_offset_bytes = offsetof(struct ib_class_port_info, field),	\
+	.struct_size_bytes   = sizeof((struct ib_class_port_info *)0)->field,	\
+	.field_name          = "ib_class_port_info:" #field
+
+static const struct ib_field classport_info_rec_table[] = {
+	{ PORTCLASSINFO_REC_FIELD(base_version),
+	  .offset_words = 0,
+	  .offset_bits  = 0,
+	  .size_bits    = 8 },
+	{ PORTCLASSINFO_REC_FIELD(class_version),
+	  .offset_words = 0,
+	  .offset_bits  = 8,
+	  .size_bits    = 8 },
+	{ PORTCLASSINFO_REC_FIELD(capability_mask),
+	  .offset_words = 0,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+	{ PORTCLASSINFO_REC_FIELD(cap_mask2_resp_time),
+	  .offset_words = 1,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+	{ PORTCLASSINFO_REC_FIELD(redirect_gid),
+	  .offset_words = 2,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+	{ PORTCLASSINFO_REC_FIELD(redirect_tcslfl),
+	  .offset_words = 6,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+
+	{ PORTCLASSINFO_REC_FIELD(redirect_lid),
+	  .offset_words = 7,
+	  .offset_bits  = 0,
+	  .size_bits    = 16 },
+	{ PORTCLASSINFO_REC_FIELD(redirect_pkey),
+	  .offset_words = 7,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+
+	{ PORTCLASSINFO_REC_FIELD(redirect_qp),
+	  .offset_words = 8,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+	{ PORTCLASSINFO_REC_FIELD(redirect_qkey),
+	  .offset_words = 9,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+
+	{ PORTCLASSINFO_REC_FIELD(trap_gid),
+	  .offset_words = 10,
+	  .offset_bits  = 0,
+	  .size_bits    = 128 },
+	{ PORTCLASSINFO_REC_FIELD(trap_tcslfl),
+	  .offset_words = 14,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+
+	{ PORTCLASSINFO_REC_FIELD(trap_lid),
+	  .offset_words = 15,
+	  .offset_bits  = 0,
+	  .size_bits    = 16 },
+	{ PORTCLASSINFO_REC_FIELD(trap_pkey),
+	  .offset_words = 15,
+	  .offset_bits  = 16,
+	  .size_bits    = 16 },
+
+	{ PORTCLASSINFO_REC_FIELD(trap_hlqp),
+	  .offset_words = 16,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+	{ PORTCLASSINFO_REC_FIELD(trap_qkey),
+	  .offset_words = 17,
+	  .offset_bits  = 0,
+	  .size_bits    = 32 },
+};
+
 #define GUIDINFO_REC_FIELD(field) \
 	.struct_offset_bytes = offsetof(struct ib_sa_guidinfo_rec, field),	\
 	.struct_size_bytes   = sizeof((struct ib_sa_guidinfo_rec *) 0)->field,	\
@@ -1645,6 +1728,97 @@ err1:
 }
 EXPORT_SYMBOL(ib_sa_guid_info_rec_query);
 
+/* Support get SA ClassPortInfo */
+static void ib_sa_classport_info_rec_callback(struct ib_sa_query *sa_query,
+					      int status,
+					      struct ib_sa_mad *mad)
+{
+	struct ib_sa_classport_info_query *query =
+		container_of(sa_query, struct ib_sa_classport_info_query, sa_query);
+
+	if (mad) {
+		struct ib_class_port_info rec;
+
+		ib_unpack(classport_info_rec_table,
+			  ARRAY_SIZE(classport_info_rec_table),
+			  mad->data, &rec);
+		query->callback(status, &rec, query->context);
+	} else {
+		query->callback(status, NULL, query->context);
+	}
+}
+
+static void ib_sa_portclass_info_rec_release(struct ib_sa_query *sa_query)
+{
+	kfree(container_of(sa_query, struct ib_sa_classport_info_query,
+			   sa_query));
+}
+
+int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
+				   struct ib_device *device, u8 port_num,
+				   int timeout_ms, gfp_t gfp_mask,
+				   void (*callback)(int status,
+						    struct ib_class_port_info *resp,
+						    void *context),
+				   void *context,
+				   struct ib_sa_query **sa_query)
+{
+	struct ib_sa_classport_info_query *query;
+	struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
+	struct ib_sa_port *port;
+	struct ib_mad_agent *agent;
+	struct ib_sa_mad *mad;
+	int ret;
+
+	if (!sa_dev)
+		return -ENODEV;
+
+	port  = &sa_dev->port[port_num - sa_dev->start_port];
+	agent = port->agent;
+
+	query = kzalloc(sizeof(*query), gfp_mask);
+	if (!query)
+		return -ENOMEM;
+
+	query->sa_query.port = port;
+	ret = alloc_mad(&query->sa_query, gfp_mask);
+	if (ret)
+		goto err1;
+
+	ib_sa_client_get(client);
+	query->sa_query.client = client;
+	query->callback        = callback;
+	query->context         = context;
+
+	mad = query->sa_query.mad_buf->mad;
+	init_mad(mad, agent);
+
+	query->sa_query.callback = callback ? ib_sa_classport_info_rec_callback : NULL;
+
+	query->sa_query.release  = ib_sa_portclass_info_rec_release;
+	/* support GET only */
+	mad->mad_hdr.method	 = IB_MGMT_METHOD_GET;
+	mad->mad_hdr.attr_id	 = cpu_to_be16(IB_SA_ATTR_CLASS_PORTINFO);
+	mad->sa_hdr.comp_mask	 = 0;
+	*sa_query = &query->sa_query;
+
+	ret = send_mad(&query->sa_query, timeout_ms, gfp_mask);
+	if (ret < 0)
+		goto err2;
+
+	return ret;
+
+err2:
+	*sa_query = NULL;
+	ib_sa_client_put(query->sa_query.client);
+	free_mad(&query->sa_query);
+
+err1:
+	kfree(query);
+	return ret;
+}
+EXPORT_SYMBOL(ib_sa_classport_info_rec_query);
+
 static void send_handler(struct ib_mad_agent *agent,
 			 struct ib_mad_send_wc *mad_send_wc)
 {
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index cdc1c81..3840416 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -94,6 +94,8 @@ enum ib_sa_selector {
 	IB_SA_BEST = 3
 };
 
+#define IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT	BIT(12)
+
 /*
  * Structures for SA records are named "struct ib_sa_xxx_rec."  No
  * attempt is made to pack structures to match the physical layout of
@@ -439,4 +441,14 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
 			      void *context,
 			      struct ib_sa_query **sa_query);
 
+/* Support get SA ClassPortInfo */
+int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
+				   struct ib_device *device, u8 port_num,
+				   int timeout_ms, gfp_t gfp_mask,
+				   void (*callback)(int status,
+						    struct ib_class_port_info *resp,
+						    void *context),
+				   void *context,
+				   struct ib_sa_query **sa_query);
+
 #endif /* IB_SA_H */
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found] ` <1461070287-13469-1-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-04-19 12:51   ` [PATCH for-next 1/4] IB/core: Add support for get ClassPortInfo from the SA Erez Shitrit
  2016-04-19 12:51   ` [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo Erez Shitrit
@ 2016-04-19 12:51   ` Erez Shitrit
       [not found]     ` <1461070287-13469-4-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-04-19 12:51   ` [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join Erez Shitrit
  3 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-19 12:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Erez Shitrit


There are 4 types for MCG, FullMember, NonMember, SendOnlyNonMember,
and the new added type: SendOnlyFullMember.
Add support for the new SendOnlyFullMember join state.

The new type allows host to send join request as sendonly, it will cause the
group to be created but without getting packets from this multicast back to the
host.

Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/multicast.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index 250937c..31d9d41 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -93,6 +93,18 @@ enum {
 
 struct mcast_member;
 
+/*
+* There are 4 types of join states:
+* FullMember, NonMember, SendOnlyNonMember, SendOnlyFullMember.
+*/
+enum {
+	FULLMEMBER_JOIN,
+	NONMEMBER_JOIN,
+	SENDONLY_NONMEBER_JOIN,
+	SENDONLY_FULLMEMBER_JOIN,
+	NUM_JOIN_MEMBERSHIP_TYPES = 4
+};
+
 struct mcast_group {
 	struct ib_sa_mcmember_rec rec;
 	struct rb_node		node;
@@ -102,7 +114,7 @@ struct mcast_group {
 	struct list_head	pending_list;
 	struct list_head	active_list;
 	struct mcast_member	*last_join;
-	int			members[3];
+	int			members[NUM_JOIN_MEMBERSHIP_TYPES];
 	atomic_t		refcount;
 	enum mcast_group_state	state;
 	struct ib_sa_query	*query;
@@ -220,8 +232,9 @@ static void queue_join(struct mcast_member *member)
 }
 
 /*
- * A multicast group has three types of members: full member, non member, and
- * send only member.  We need to keep track of the number of members of each
+ * A multicast group has four types of members: full member, non member,
+ * sendonly non member and sendonly full member.
+ * We need to keep track of the number of members of each
  * type based on their join state.  Adjust the number of members the belong to
  * the specified join states.
  */
@@ -229,7 +242,7 @@ static void adjust_membership(struct mcast_group *group, u8 join_state, int inc)
 {
 	int i;
 
-	for (i = 0; i < 3; i++, join_state >>= 1)
+	for (i = 0; i < NUM_JOIN_MEMBERSHIP_TYPES; i++, join_state >>= 1)
 		if (join_state & 0x1)
 			group->members[i] += inc;
 }
@@ -245,7 +258,7 @@ static u8 get_leave_state(struct mcast_group *group)
 	u8 leave_state = 0;
 	int i;
 
-	for (i = 0; i < 3; i++)
+	for (i = 0; i < NUM_JOIN_MEMBERSHIP_TYPES; i++)
 		if (!group->members[i])
 			leave_state |= (0x1 << i);
 
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found] ` <1461070287-13469-1-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-04-19 12:51   ` [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast Erez Shitrit
@ 2016-04-19 12:51   ` Erez Shitrit
       [not found]     ` <1461070287-13469-5-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  3 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-19 12:51 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Erez Shitrit


Check (via an sa query) if the SM supports the new option for SendOnly
multicast joins.
If the SM supports that option it will use the new join state to create
such multicast group.
If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
join for SendOnly MCG, use the correct state if supported.

This check is performed at every invocation of mcast_restart task, to be
sure that the driver stays in sync with the current state of the SM.

Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/ipoib/ipoib.h           |  2 +
 drivers/infiniband/ulp/ipoib/ipoib_main.c      | 74 ++++++++++++++++++++++++++
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 33 +++++++-----
 3 files changed, 96 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index caec8e9..c51f618 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -392,6 +392,7 @@ struct ipoib_dev_priv {
 	struct ipoib_ethtool_st ethtool;
 	struct timer_list poll_timer;
 	unsigned max_send_sge;
+	bool sm_fullmember_sendonly_support;
 };
 
 struct ipoib_ah {
@@ -476,6 +477,7 @@ void ipoib_reap_ah(struct work_struct *work);
 
 void ipoib_mark_paths_invalid(struct net_device *dev);
 void ipoib_flush_paths(struct net_device *dev);
+int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv);
 struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
 
 int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 3b630e5..eb5927b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -117,6 +117,8 @@ int ipoib_open(struct net_device *dev)
 
 	set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
 
+	priv->sm_fullmember_sendonly_support = false;
+
 	if (ipoib_ib_dev_open(dev)) {
 		if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags))
 			return 0;
@@ -629,6 +631,78 @@ void ipoib_mark_paths_invalid(struct net_device *dev)
 	spin_unlock_irq(&priv->lock);
 }
 
+struct classport_info_context {
+	struct ipoib_dev_priv	*priv;
+	struct completion	done;
+	struct ib_sa_query	*sa_query;
+};
+
+static void classport_info_query_cb(int status, struct ib_class_port_info *rec,
+				    void *context)
+{
+	struct classport_info_context *cb_ctx = context;
+	struct ipoib_dev_priv *priv;
+
+	WARN_ON(!context);
+
+	priv = cb_ctx->priv;
+
+	if (status) {
+		pr_debug("device: %s failed query classport_info status: %d\n",
+			 priv->dev->name, status);
+		/* keeps the default, will try next mcast_restart */
+		priv->sm_fullmember_sendonly_support = false;
+		goto out;
+	}
+
+	/* take out resp_time that located in the last 5 bits */
+	if ((be32_to_cpu(rec->cap_mask2_resp_time) >> 5) &
+	    IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT) {
+		pr_debug("device: %s enabled fullmember-sendonly for sendonly MCG\n",
+			 priv->dev->name);
+		priv->sm_fullmember_sendonly_support = true;
+	} else {
+		pr_debug("device: %s disabled fullmember-sendonly for sendonly MCG\n",
+			 priv->dev->name);
+		priv->sm_fullmember_sendonly_support = false;
+	}
+
+out:
+	complete(&cb_ctx->done);
+}
+
+int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv)
+{
+	struct classport_info_context *callback_context;
+	int ret;
+
+	callback_context = kmalloc(sizeof(*callback_context), GFP_KERNEL);
+	if (!callback_context)
+		return -ENOMEM;
+
+	callback_context->priv = priv;
+	init_completion(&callback_context->done);
+
+	ret = ib_sa_classport_info_rec_query(&ipoib_sa_client,
+					     priv->ca, priv->port, 3000,
+					     GFP_KERNEL,
+					     classport_info_query_cb,
+					     callback_context,
+					     &callback_context->sa_query);
+	if (ret < 0) {
+		pr_info("%s failed to send ib_sa_classport_info query, ret: %d\n",
+			priv->dev->name, ret);
+		kfree(callback_context);
+		return ret;
+	}
+
+	/* waiting for the callback to finish before returnning */
+	wait_for_completion(&callback_context->done);
+	kfree(callback_context);
+
+	return ret;
+}
+
 void ipoib_flush_paths(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 2588931..c1bb69e 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -515,22 +515,20 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
 		rec.hop_limit	  = priv->broadcast->mcmember.hop_limit;
 
 		/*
-		 * Send-only IB Multicast joins do not work at the core
-		 * IB layer yet, so we can't use them here.  However,
-		 * we are emulating an Ethernet multicast send, which
-		 * does not require a multicast subscription and will
-		 * still send properly.  The most appropriate thing to
+		 * Send-only IB Multicast joins work at the core IB layer but
+		 * require specific SM support.
+		 * We can use such joins here only if the current SM supports that feature.
+		 * However, if not, we emulate an Ethernet multicast send,
+		 * which does not require a multicast subscription and will
+		 * still send properly. The most appropriate thing to
 		 * do is to create the group if it doesn't exist as that
 		 * most closely emulates the behavior, from a user space
-		 * application perspecitive, of Ethernet multicast
-		 * operation.  For now, we do a full join, maybe later
-		 * when the core IB layers support send only joins we
-		 * will use them.
+		 * application perspective, of Ethernet multicast operation.
 		 */
-#if 0
-		if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
-			rec.join_state = 4;
-#endif
+		if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
+		    priv->sm_fullmember_sendonly_support)
+			/* SM supports sendonly-fullmember, otherwise fallback to full-member */
+			rec.join_state = 8;
 	}
 	spin_unlock_irq(&priv->lock);
 
@@ -559,6 +557,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
 	struct ib_port_attr port_attr;
 	unsigned long delay_until = 0;
 	struct ipoib_mcast *mcast = NULL;
+	int ret;
 
 	if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
 		return;
@@ -576,6 +575,14 @@ void ipoib_mcast_join_task(struct work_struct *work)
 	else
 		memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid));
 
+	/* Check if can send sendonly MCG's with sendonly-fullmember join state */
+	if (priv->broadcast) {
+		ret = ipoib_check_sm_sendonly_fullmember_support(priv);
+		if (ret < 0)
+			pr_debug("%s failed query sm support for sendonly-fullmember (ret: %d)\n",
+				 priv->dev->name, ret);
+	}
+
 	spin_lock_irq(&priv->lock);
 	if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
 		goto out;
-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo
       [not found]     ` <1461070287-13469-3-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-04-20 12:09       ` Hal Rosenstock
       [not found]         ` <57177181.7090408-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2016-04-20 13:12       ` Or Gerlitz
  1 sibling, 1 reply; 31+ messages in thread
From: Hal Rosenstock @ 2016-04-20 12:09 UTC (permalink / raw)
  To: Erez Shitrit, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/19/2016 8:51 AM, Erez Shitrit wrote:
> New sa query function to return the ClassPortInfo struct from the sa.
> If the SM supports FullMemberSendOnly mode for MCG's, it sets a
> capability bit in the capability_mask2 field of the response.
> 
> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/core/sa_query.c | 174 +++++++++++++++++++++++++++++++++++++
>  include/rdma/ib_sa.h               |  12 +++
>  2 files changed, 186 insertions(+)
> 
> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
> index 8a09c0f..66535bf 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -119,6 +119,12 @@ struct ib_sa_guidinfo_query {
>  	struct ib_sa_query sa_query;
>  };
>  
> +struct ib_sa_classport_info_query {
> +	void (*callback)(int, struct ib_class_port_info *, void *);
> +	void *context;
> +	struct ib_sa_query sa_query;
> +};
> +
>  struct ib_sa_mcmember_query {
>  	void (*callback)(int, struct ib_sa_mcmember_rec *, void *);
>  	void *context;
> @@ -392,6 +398,83 @@ static const struct ib_field service_rec_table[] = {
>  	  .size_bits    = 2*64 },
>  };
>  
> +#define PORTCLASSINFO_REC_FIELD(field) \

Would CLASSPORTINFO_REC_FIELD be more consistent in terms of naming ?

> +	.struct_offset_bytes = offsetof(struct ib_class_port_info, field),	\
> +	.struct_size_bytes   = sizeof((struct ib_class_port_info *)0)->field,	\
> +	.field_name          = "ib_class_port_info:" #field
> +
> +static const struct ib_field classport_info_rec_table[] = {
> +	{ PORTCLASSINFO_REC_FIELD(base_version),
> +	  .offset_words = 0,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 8 },
> +	{ PORTCLASSINFO_REC_FIELD(class_version),
> +	  .offset_words = 0,
> +	  .offset_bits  = 8,
> +	  .size_bits    = 8 },
> +	{ PORTCLASSINFO_REC_FIELD(capability_mask),
> +	  .offset_words = 0,
> +	  .offset_bits  = 16,
> +	  .size_bits    = 16 },
> +	{ PORTCLASSINFO_REC_FIELD(cap_mask2_resp_time),
> +	  .offset_words = 1,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 32 },

Why combine fields ? Syntax is rich enough to deal with bit fields.
This is what was done for other SA attributes in this file.

-- Hal

> +	{ PORTCLASSINFO_REC_FIELD(redirect_gid),
> +	  .offset_words = 2,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 128 },
> +	{ PORTCLASSINFO_REC_FIELD(redirect_tcslfl),
> +	  .offset_words = 6,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 32 },
> +
> +	{ PORTCLASSINFO_REC_FIELD(redirect_lid),
> +	  .offset_words = 7,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 16 },
> +	{ PORTCLASSINFO_REC_FIELD(redirect_pkey),
> +	  .offset_words = 7,
> +	  .offset_bits  = 16,
> +	  .size_bits    = 16 },
> +
> +	{ PORTCLASSINFO_REC_FIELD(redirect_qp),
> +	  .offset_words = 8,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 32 },
> +	{ PORTCLASSINFO_REC_FIELD(redirect_qkey),
> +	  .offset_words = 9,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 32 },
> +
> +	{ PORTCLASSINFO_REC_FIELD(trap_gid),
> +	  .offset_words = 10,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 128 },
> +	{ PORTCLASSINFO_REC_FIELD(trap_tcslfl),
> +	  .offset_words = 14,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 32 },
> +
> +	{ PORTCLASSINFO_REC_FIELD(trap_lid),
> +	  .offset_words = 15,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 16 },
> +	{ PORTCLASSINFO_REC_FIELD(trap_pkey),
> +	  .offset_words = 15,
> +	  .offset_bits  = 16,
> +	  .size_bits    = 16 },
> +
> +	{ PORTCLASSINFO_REC_FIELD(trap_hlqp),
> +	  .offset_words = 16,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 32 },
> +	{ PORTCLASSINFO_REC_FIELD(trap_qkey),
> +	  .offset_words = 17,
> +	  .offset_bits  = 0,
> +	  .size_bits    = 32 },
> +};
> +
>  #define GUIDINFO_REC_FIELD(field) \
>  	.struct_offset_bytes = offsetof(struct ib_sa_guidinfo_rec, field),	\
>  	.struct_size_bytes   = sizeof((struct ib_sa_guidinfo_rec *) 0)->field,	\
> @@ -1645,6 +1728,97 @@ err1:
>  }
>  EXPORT_SYMBOL(ib_sa_guid_info_rec_query);
>  
> +/* Support get SA ClassPortInfo */
> +static void ib_sa_classport_info_rec_callback(struct ib_sa_query *sa_query,
> +					      int status,
> +					      struct ib_sa_mad *mad)
> +{
> +	struct ib_sa_classport_info_query *query =
> +		container_of(sa_query, struct ib_sa_classport_info_query, sa_query);
> +
> +	if (mad) {
> +		struct ib_class_port_info rec;
> +
> +		ib_unpack(classport_info_rec_table,
> +			  ARRAY_SIZE(classport_info_rec_table),
> +			  mad->data, &rec);
> +		query->callback(status, &rec, query->context);
> +	} else {
> +		query->callback(status, NULL, query->context);
> +	}
> +}
> +
> +static void ib_sa_portclass_info_rec_release(struct ib_sa_query *sa_query)
> +{
> +	kfree(container_of(sa_query, struct ib_sa_classport_info_query,
> +			   sa_query));
> +}
> +
> +int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
> +				   struct ib_device *device, u8 port_num,
> +				   int timeout_ms, gfp_t gfp_mask,
> +				   void (*callback)(int status,
> +						    struct ib_class_port_info *resp,
> +						    void *context),
> +				   void *context,
> +				   struct ib_sa_query **sa_query)
> +{
> +	struct ib_sa_classport_info_query *query;
> +	struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
> +	struct ib_sa_port *port;
> +	struct ib_mad_agent *agent;
> +	struct ib_sa_mad *mad;
> +	int ret;
> +
> +	if (!sa_dev)
> +		return -ENODEV;
> +
> +	port  = &sa_dev->port[port_num - sa_dev->start_port];
> +	agent = port->agent;
> +
> +	query = kzalloc(sizeof(*query), gfp_mask);
> +	if (!query)
> +		return -ENOMEM;
> +
> +	query->sa_query.port = port;
> +	ret = alloc_mad(&query->sa_query, gfp_mask);
> +	if (ret)
> +		goto err1;
> +
> +	ib_sa_client_get(client);
> +	query->sa_query.client = client;
> +	query->callback        = callback;
> +	query->context         = context;
> +
> +	mad = query->sa_query.mad_buf->mad;
> +	init_mad(mad, agent);
> +
> +	query->sa_query.callback = callback ? ib_sa_classport_info_rec_callback : NULL;
> +
> +	query->sa_query.release  = ib_sa_portclass_info_rec_release;
> +	/* support GET only */
> +	mad->mad_hdr.method	 = IB_MGMT_METHOD_GET;
> +	mad->mad_hdr.attr_id	 = cpu_to_be16(IB_SA_ATTR_CLASS_PORTINFO);
> +	mad->sa_hdr.comp_mask	 = 0;
> +	*sa_query = &query->sa_query;
> +
> +	ret = send_mad(&query->sa_query, timeout_ms, gfp_mask);
> +	if (ret < 0)
> +		goto err2;
> +
> +	return ret;
> +
> +err2:
> +	*sa_query = NULL;
> +	ib_sa_client_put(query->sa_query.client);
> +	free_mad(&query->sa_query);
> +
> +err1:
> +	kfree(query);
> +	return ret;
> +}
> +EXPORT_SYMBOL(ib_sa_classport_info_rec_query);
> +
>  static void send_handler(struct ib_mad_agent *agent,
>  			 struct ib_mad_send_wc *mad_send_wc)
>  {
> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
> index cdc1c81..3840416 100644
> --- a/include/rdma/ib_sa.h
> +++ b/include/rdma/ib_sa.h
> @@ -94,6 +94,8 @@ enum ib_sa_selector {
>  	IB_SA_BEST = 3
>  };
>  
> +#define IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT	BIT(12)
> +
>  /*
>   * Structures for SA records are named "struct ib_sa_xxx_rec."  No
>   * attempt is made to pack structures to match the physical layout of
> @@ -439,4 +441,14 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
>  			      void *context,
>  			      struct ib_sa_query **sa_query);
>  
> +/* Support get SA ClassPortInfo */
> +int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
> +				   struct ib_device *device, u8 port_num,
> +				   int timeout_ms, gfp_t gfp_mask,
> +				   void (*callback)(int status,
> +						    struct ib_class_port_info *resp,
> +						    void *context),
> +				   void *context,
> +				   struct ib_sa_query **sa_query);
> +
>  #endif /* IB_SA_H */
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]     ` <1461070287-13469-5-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-04-20 12:11       ` Hal Rosenstock
       [not found]         ` <571771FB.90908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Hal Rosenstock @ 2016-04-20 12:11 UTC (permalink / raw)
  To: Erez Shitrit, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: ogerlitz-VPRAkNaXOzVWk0Htik3J/w, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/19/2016 8:51 AM, Erez Shitrit wrote:
> Check (via an sa query) if the SM supports the new option for SendOnly
> multicast joins.
> If the SM supports that option it will use the new join state to create
> such multicast group.
> If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
> join for SendOnly MCG, use the correct state if supported.
> 
> This check is performed at every invocation of mcast_restart task, to be
> sure that the driver stays in sync with the current state of the SM.

Would a better strategy (to minimize gets of SA ClassPortInfo) be to do
this based on local events (like SM change, etc.) ?

-- Hal

> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib.h           |  2 +
>  drivers/infiniband/ulp/ipoib/ipoib_main.c      | 74 ++++++++++++++++++++++++++
>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 33 +++++++-----
>  3 files changed, 96 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
> index caec8e9..c51f618 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib.h
> +++ b/drivers/infiniband/ulp/ipoib/ipoib.h
> @@ -392,6 +392,7 @@ struct ipoib_dev_priv {
>  	struct ipoib_ethtool_st ethtool;
>  	struct timer_list poll_timer;
>  	unsigned max_send_sge;
> +	bool sm_fullmember_sendonly_support;
>  };
>  
>  struct ipoib_ah {
> @@ -476,6 +477,7 @@ void ipoib_reap_ah(struct work_struct *work);
>  
>  void ipoib_mark_paths_invalid(struct net_device *dev);
>  void ipoib_flush_paths(struct net_device *dev);
> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv);
>  struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
>  
>  int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index 3b630e5..eb5927b 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -117,6 +117,8 @@ int ipoib_open(struct net_device *dev)
>  
>  	set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
>  
> +	priv->sm_fullmember_sendonly_support = false;
> +
>  	if (ipoib_ib_dev_open(dev)) {
>  		if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags))
>  			return 0;
> @@ -629,6 +631,78 @@ void ipoib_mark_paths_invalid(struct net_device *dev)
>  	spin_unlock_irq(&priv->lock);
>  }
>  
> +struct classport_info_context {
> +	struct ipoib_dev_priv	*priv;
> +	struct completion	done;
> +	struct ib_sa_query	*sa_query;
> +};
> +
> +static void classport_info_query_cb(int status, struct ib_class_port_info *rec,
> +				    void *context)
> +{
> +	struct classport_info_context *cb_ctx = context;
> +	struct ipoib_dev_priv *priv;
> +
> +	WARN_ON(!context);
> +
> +	priv = cb_ctx->priv;
> +
> +	if (status) {
> +		pr_debug("device: %s failed query classport_info status: %d\n",
> +			 priv->dev->name, status);
> +		/* keeps the default, will try next mcast_restart */
> +		priv->sm_fullmember_sendonly_support = false;
> +		goto out;
> +	}
> +
> +	/* take out resp_time that located in the last 5 bits */
> +	if ((be32_to_cpu(rec->cap_mask2_resp_time) >> 5) &
> +	    IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT) {
> +		pr_debug("device: %s enabled fullmember-sendonly for sendonly MCG\n",
> +			 priv->dev->name);
> +		priv->sm_fullmember_sendonly_support = true;
> +	} else {
> +		pr_debug("device: %s disabled fullmember-sendonly for sendonly MCG\n",
> +			 priv->dev->name);
> +		priv->sm_fullmember_sendonly_support = false;
> +	}
> +
> +out:
> +	complete(&cb_ctx->done);
> +}
> +
> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv)
> +{
> +	struct classport_info_context *callback_context;
> +	int ret;
> +
> +	callback_context = kmalloc(sizeof(*callback_context), GFP_KERNEL);
> +	if (!callback_context)
> +		return -ENOMEM;
> +
> +	callback_context->priv = priv;
> +	init_completion(&callback_context->done);
> +
> +	ret = ib_sa_classport_info_rec_query(&ipoib_sa_client,
> +					     priv->ca, priv->port, 3000,
> +					     GFP_KERNEL,
> +					     classport_info_query_cb,
> +					     callback_context,
> +					     &callback_context->sa_query);
> +	if (ret < 0) {
> +		pr_info("%s failed to send ib_sa_classport_info query, ret: %d\n",
> +			priv->dev->name, ret);
> +		kfree(callback_context);
> +		return ret;
> +	}
> +
> +	/* waiting for the callback to finish before returnning */
> +	wait_for_completion(&callback_context->done);
> +	kfree(callback_context);
> +
> +	return ret;
> +}
> +
>  void ipoib_flush_paths(struct net_device *dev)
>  {
>  	struct ipoib_dev_priv *priv = netdev_priv(dev);
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index 2588931..c1bb69e 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -515,22 +515,20 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
>  		rec.hop_limit	  = priv->broadcast->mcmember.hop_limit;
>  
>  		/*
> -		 * Send-only IB Multicast joins do not work at the core
> -		 * IB layer yet, so we can't use them here.  However,
> -		 * we are emulating an Ethernet multicast send, which
> -		 * does not require a multicast subscription and will
> -		 * still send properly.  The most appropriate thing to
> +		 * Send-only IB Multicast joins work at the core IB layer but
> +		 * require specific SM support.
> +		 * We can use such joins here only if the current SM supports that feature.
> +		 * However, if not, we emulate an Ethernet multicast send,
> +		 * which does not require a multicast subscription and will
> +		 * still send properly. The most appropriate thing to
>  		 * do is to create the group if it doesn't exist as that
>  		 * most closely emulates the behavior, from a user space
> -		 * application perspecitive, of Ethernet multicast
> -		 * operation.  For now, we do a full join, maybe later
> -		 * when the core IB layers support send only joins we
> -		 * will use them.
> +		 * application perspective, of Ethernet multicast operation.
>  		 */
> -#if 0
> -		if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
> -			rec.join_state = 4;
> -#endif
> +		if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
> +		    priv->sm_fullmember_sendonly_support)
> +			/* SM supports sendonly-fullmember, otherwise fallback to full-member */
> +			rec.join_state = 8;
>  	}
>  	spin_unlock_irq(&priv->lock);
>  
> @@ -559,6 +557,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
>  	struct ib_port_attr port_attr;
>  	unsigned long delay_until = 0;
>  	struct ipoib_mcast *mcast = NULL;
> +	int ret;
>  
>  	if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>  		return;
> @@ -576,6 +575,14 @@ void ipoib_mcast_join_task(struct work_struct *work)
>  	else
>  		memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid));
>  
> +	/* Check if can send sendonly MCG's with sendonly-fullmember join state */
> +	if (priv->broadcast) {
> +		ret = ipoib_check_sm_sendonly_fullmember_support(priv);
> +		if (ret < 0)
> +			pr_debug("%s failed query sm support for sendonly-fullmember (ret: %d)\n",
> +				 priv->dev->name, ret);
> +	}
> +
>  	spin_lock_irq(&priv->lock);
>  	if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>  		goto out;
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo
       [not found]     ` <1461070287-13469-3-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-04-20 12:09       ` Hal Rosenstock
@ 2016-04-20 13:12       ` Or Gerlitz
       [not found]         ` <CAJ3xEMihe-54UExbp3ptqH5yqg7dc6a6tKcJw0gNytTECce5TQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 31+ messages in thread
From: Or Gerlitz @ 2016-04-20 13:12 UTC (permalink / raw)
  To: Erez Shitrit; +Cc: Doug Ledford, Or Gerlitz, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 19, 2016 at 3:51 PM, Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>
> New sa query function to return the ClassPortInfo struct from the sa.

sa --> SA in all your change-logs, please

and in the subject it's the SA agent not the SA your are patching
here, reflect that
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]         ` <571771FB.90908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-04-20 13:12           ` Erez Shitrit
       [not found]             ` <CAAk-MO_24actZ4FJqOjk0ztR=Z6toSfB-xuyFKeUvtZ8h0Xv3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-20 13:12 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 20, 2016 at 3:11 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 4/19/2016 8:51 AM, Erez Shitrit wrote:
>> Check (via an sa query) if the SM supports the new option for SendOnly
>> multicast joins.
>> If the SM supports that option it will use the new join state to create
>> such multicast group.
>> If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
>> join for SendOnly MCG, use the correct state if supported.
>>
>> This check is performed at every invocation of mcast_restart task, to be
>> sure that the driver stays in sync with the current state of the SM.
>
> Would a better strategy (to minimize gets of SA ClassPortInfo) be to do
> this based on local events (like SM change, etc.) ?
>
> -- Hal
>

SA query for ClassPortInfo is done prior to the re-join of all
multicasts, because it takes no assumption on the current support of
the SM.
re-join to all multicast is done according to events from the
networking layer ("UP" from the kernel)  and from the IB part ("DOWN")
events (like SM change, RE-REG etc.)
We think that In both cases the driver should get the updated info of
the sm support.

>> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>  drivers/infiniband/ulp/ipoib/ipoib.h           |  2 +
>>  drivers/infiniband/ulp/ipoib/ipoib_main.c      | 74 ++++++++++++++++++++++++++
>>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 33 +++++++-----
>>  3 files changed, 96 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
>> index caec8e9..c51f618 100644
>> --- a/drivers/infiniband/ulp/ipoib/ipoib.h
>> +++ b/drivers/infiniband/ulp/ipoib/ipoib.h
>> @@ -392,6 +392,7 @@ struct ipoib_dev_priv {
>>       struct ipoib_ethtool_st ethtool;
>>       struct timer_list poll_timer;
>>       unsigned max_send_sge;
>> +     bool sm_fullmember_sendonly_support;
>>  };
>>
>>  struct ipoib_ah {
>> @@ -476,6 +477,7 @@ void ipoib_reap_ah(struct work_struct *work);
>>
>>  void ipoib_mark_paths_invalid(struct net_device *dev);
>>  void ipoib_flush_paths(struct net_device *dev);
>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv);
>>  struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
>>
>>  int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>> index 3b630e5..eb5927b 100644
>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>> @@ -117,6 +117,8 @@ int ipoib_open(struct net_device *dev)
>>
>>       set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
>>
>> +     priv->sm_fullmember_sendonly_support = false;
>> +
>>       if (ipoib_ib_dev_open(dev)) {
>>               if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags))
>>                       return 0;
>> @@ -629,6 +631,78 @@ void ipoib_mark_paths_invalid(struct net_device *dev)
>>       spin_unlock_irq(&priv->lock);
>>  }
>>
>> +struct classport_info_context {
>> +     struct ipoib_dev_priv   *priv;
>> +     struct completion       done;
>> +     struct ib_sa_query      *sa_query;
>> +};
>> +
>> +static void classport_info_query_cb(int status, struct ib_class_port_info *rec,
>> +                                 void *context)
>> +{
>> +     struct classport_info_context *cb_ctx = context;
>> +     struct ipoib_dev_priv *priv;
>> +
>> +     WARN_ON(!context);
>> +
>> +     priv = cb_ctx->priv;
>> +
>> +     if (status) {
>> +             pr_debug("device: %s failed query classport_info status: %d\n",
>> +                      priv->dev->name, status);
>> +             /* keeps the default, will try next mcast_restart */
>> +             priv->sm_fullmember_sendonly_support = false;
>> +             goto out;
>> +     }
>> +
>> +     /* take out resp_time that located in the last 5 bits */
>> +     if ((be32_to_cpu(rec->cap_mask2_resp_time) >> 5) &
>> +         IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT) {
>> +             pr_debug("device: %s enabled fullmember-sendonly for sendonly MCG\n",
>> +                      priv->dev->name);
>> +             priv->sm_fullmember_sendonly_support = true;
>> +     } else {
>> +             pr_debug("device: %s disabled fullmember-sendonly for sendonly MCG\n",
>> +                      priv->dev->name);
>> +             priv->sm_fullmember_sendonly_support = false;
>> +     }
>> +
>> +out:
>> +     complete(&cb_ctx->done);
>> +}
>> +
>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv)
>> +{
>> +     struct classport_info_context *callback_context;
>> +     int ret;
>> +
>> +     callback_context = kmalloc(sizeof(*callback_context), GFP_KERNEL);
>> +     if (!callback_context)
>> +             return -ENOMEM;
>> +
>> +     callback_context->priv = priv;
>> +     init_completion(&callback_context->done);
>> +
>> +     ret = ib_sa_classport_info_rec_query(&ipoib_sa_client,
>> +                                          priv->ca, priv->port, 3000,
>> +                                          GFP_KERNEL,
>> +                                          classport_info_query_cb,
>> +                                          callback_context,
>> +                                          &callback_context->sa_query);
>> +     if (ret < 0) {
>> +             pr_info("%s failed to send ib_sa_classport_info query, ret: %d\n",
>> +                     priv->dev->name, ret);
>> +             kfree(callback_context);
>> +             return ret;
>> +     }
>> +
>> +     /* waiting for the callback to finish before returnning */
>> +     wait_for_completion(&callback_context->done);
>> +     kfree(callback_context);
>> +
>> +     return ret;
>> +}
>> +
>>  void ipoib_flush_paths(struct net_device *dev)
>>  {
>>       struct ipoib_dev_priv *priv = netdev_priv(dev);
>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>> index 2588931..c1bb69e 100644
>> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>> @@ -515,22 +515,20 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
>>               rec.hop_limit     = priv->broadcast->mcmember.hop_limit;
>>
>>               /*
>> -              * Send-only IB Multicast joins do not work at the core
>> -              * IB layer yet, so we can't use them here.  However,
>> -              * we are emulating an Ethernet multicast send, which
>> -              * does not require a multicast subscription and will
>> -              * still send properly.  The most appropriate thing to
>> +              * Send-only IB Multicast joins work at the core IB layer but
>> +              * require specific SM support.
>> +              * We can use such joins here only if the current SM supports that feature.
>> +              * However, if not, we emulate an Ethernet multicast send,
>> +              * which does not require a multicast subscription and will
>> +              * still send properly. The most appropriate thing to
>>                * do is to create the group if it doesn't exist as that
>>                * most closely emulates the behavior, from a user space
>> -              * application perspecitive, of Ethernet multicast
>> -              * operation.  For now, we do a full join, maybe later
>> -              * when the core IB layers support send only joins we
>> -              * will use them.
>> +              * application perspective, of Ethernet multicast operation.
>>                */
>> -#if 0
>> -             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
>> -                     rec.join_state = 4;
>> -#endif
>> +             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
>> +                 priv->sm_fullmember_sendonly_support)
>> +                     /* SM supports sendonly-fullmember, otherwise fallback to full-member */
>> +                     rec.join_state = 8;
>>       }
>>       spin_unlock_irq(&priv->lock);
>>
>> @@ -559,6 +557,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>       struct ib_port_attr port_attr;
>>       unsigned long delay_until = 0;
>>       struct ipoib_mcast *mcast = NULL;
>> +     int ret;
>>
>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>               return;
>> @@ -576,6 +575,14 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>       else
>>               memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid));
>>
>> +     /* Check if can send sendonly MCG's with sendonly-fullmember join state */
>> +     if (priv->broadcast) {
>> +             ret = ipoib_check_sm_sendonly_fullmember_support(priv);
>> +             if (ret < 0)
>> +                     pr_debug("%s failed query sm support for sendonly-fullmember (ret: %d)\n",
>> +                              priv->dev->name, ret);
>> +     }
>> +
>>       spin_lock_irq(&priv->lock);
>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>               goto out;
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo
       [not found]         ` <CAJ3xEMihe-54UExbp3ptqH5yqg7dc6a6tKcJw0gNytTECce5TQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-04-20 13:19           ` Erez Shitrit
  0 siblings, 0 replies; 31+ messages in thread
From: Erez Shitrit @ 2016-04-20 13:19 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 20, 2016 at 4:12 PM, Or Gerlitz <gerlitz.or-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Tue, Apr 19, 2016 at 3:51 PM, Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>>
>> New sa query function to return the ClassPortInfo struct from the sa.
>
> sa --> SA in all your change-logs, please
>
> and in the subject it's the SA agent not the SA your are patching
> here, reflect that

Ack

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]             ` <CAAk-MO_24actZ4FJqOjk0ztR=Z6toSfB-xuyFKeUvtZ8h0Xv3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-04-20 13:29               ` Hal Rosenstock
       [not found]                 ` <5717843B.7020501-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Hal Rosenstock @ 2016-04-20 13:29 UTC (permalink / raw)
  To: Erez Shitrit
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/20/2016 9:12 AM, Erez Shitrit wrote:
> On Wed, Apr 20, 2016 at 3:11 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On 4/19/2016 8:51 AM, Erez Shitrit wrote:
>>> Check (via an sa query) if the SM supports the new option for SendOnly
>>> multicast joins.
>>> If the SM supports that option it will use the new join state to create
>>> such multicast group.
>>> If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
>>> join for SendOnly MCG, use the correct state if supported.
>>>
>>> This check is performed at every invocation of mcast_restart task, to be
>>> sure that the driver stays in sync with the current state of the SM.
>>
>> Would a better strategy (to minimize gets of SA ClassPortInfo) be to do
>> this based on local events (like SM change, etc.) ?
>>
>> -- Hal
>>
> 
> SA query for ClassPortInfo is done prior to the re-join of all
> multicasts, because it takes no assumption on the current support of
> the SM.
> re-join to all multicast is done according to events from the
> networking layer ("UP" from the kernel)  and from the IB part ("DOWN")
> events (like SM change, RE-REG etc.)
> We think that In both cases the driver should get the updated info of
> the sm support.

Yes, those are same cases I'm referring to but won't this patch make it
occur once per rejoin rather than once when event occurs ? If so, this
is an unnecessary load on SA.

> 
>>> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>> ---
>>>  drivers/infiniband/ulp/ipoib/ipoib.h           |  2 +
>>>  drivers/infiniband/ulp/ipoib/ipoib_main.c      | 74 ++++++++++++++++++++++++++
>>>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 33 +++++++-----
>>>  3 files changed, 96 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
>>> index caec8e9..c51f618 100644
>>> --- a/drivers/infiniband/ulp/ipoib/ipoib.h
>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib.h
>>> @@ -392,6 +392,7 @@ struct ipoib_dev_priv {
>>>       struct ipoib_ethtool_st ethtool;
>>>       struct timer_list poll_timer;
>>>       unsigned max_send_sge;
>>> +     bool sm_fullmember_sendonly_support;
>>>  };
>>>
>>>  struct ipoib_ah {
>>> @@ -476,6 +477,7 @@ void ipoib_reap_ah(struct work_struct *work);
>>>
>>>  void ipoib_mark_paths_invalid(struct net_device *dev);
>>>  void ipoib_flush_paths(struct net_device *dev);
>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv);
>>>  struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
>>>
>>>  int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>> index 3b630e5..eb5927b 100644
>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>> @@ -117,6 +117,8 @@ int ipoib_open(struct net_device *dev)
>>>
>>>       set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
>>>
>>> +     priv->sm_fullmember_sendonly_support = false;
>>> +
>>>       if (ipoib_ib_dev_open(dev)) {
>>>               if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags))
>>>                       return 0;
>>> @@ -629,6 +631,78 @@ void ipoib_mark_paths_invalid(struct net_device *dev)
>>>       spin_unlock_irq(&priv->lock);
>>>  }
>>>
>>> +struct classport_info_context {
>>> +     struct ipoib_dev_priv   *priv;
>>> +     struct completion       done;
>>> +     struct ib_sa_query      *sa_query;
>>> +};
>>> +
>>> +static void classport_info_query_cb(int status, struct ib_class_port_info *rec,
>>> +                                 void *context)
>>> +{
>>> +     struct classport_info_context *cb_ctx = context;
>>> +     struct ipoib_dev_priv *priv;
>>> +
>>> +     WARN_ON(!context);
>>> +
>>> +     priv = cb_ctx->priv;
>>> +
>>> +     if (status) {
>>> +             pr_debug("device: %s failed query classport_info status: %d\n",
>>> +                      priv->dev->name, status);
>>> +             /* keeps the default, will try next mcast_restart */
>>> +             priv->sm_fullmember_sendonly_support = false;
>>> +             goto out;
>>> +     }
>>> +
>>> +     /* take out resp_time that located in the last 5 bits */
>>> +     if ((be32_to_cpu(rec->cap_mask2_resp_time) >> 5) &
>>> +         IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT) {
>>> +             pr_debug("device: %s enabled fullmember-sendonly for sendonly MCG\n",
>>> +                      priv->dev->name);
>>> +             priv->sm_fullmember_sendonly_support = true;
>>> +     } else {
>>> +             pr_debug("device: %s disabled fullmember-sendonly for sendonly MCG\n",
>>> +                      priv->dev->name);
>>> +             priv->sm_fullmember_sendonly_support = false;
>>> +     }
>>> +
>>> +out:
>>> +     complete(&cb_ctx->done);
>>> +}
>>> +
>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv)
>>> +{
>>> +     struct classport_info_context *callback_context;
>>> +     int ret;
>>> +
>>> +     callback_context = kmalloc(sizeof(*callback_context), GFP_KERNEL);
>>> +     if (!callback_context)
>>> +             return -ENOMEM;
>>> +
>>> +     callback_context->priv = priv;
>>> +     init_completion(&callback_context->done);
>>> +
>>> +     ret = ib_sa_classport_info_rec_query(&ipoib_sa_client,
>>> +                                          priv->ca, priv->port, 3000,
>>> +                                          GFP_KERNEL,
>>> +                                          classport_info_query_cb,
>>> +                                          callback_context,
>>> +                                          &callback_context->sa_query);
>>> +     if (ret < 0) {
>>> +             pr_info("%s failed to send ib_sa_classport_info query, ret: %d\n",
>>> +                     priv->dev->name, ret);
>>> +             kfree(callback_context);
>>> +             return ret;
>>> +     }
>>> +
>>> +     /* waiting for the callback to finish before returnning */
>>> +     wait_for_completion(&callback_context->done);
>>> +     kfree(callback_context);
>>> +
>>> +     return ret;
>>> +}
>>> +
>>>  void ipoib_flush_paths(struct net_device *dev)
>>>  {
>>>       struct ipoib_dev_priv *priv = netdev_priv(dev);
>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>> index 2588931..c1bb69e 100644
>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>> @@ -515,22 +515,20 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
>>>               rec.hop_limit     = priv->broadcast->mcmember.hop_limit;
>>>
>>>               /*
>>> -              * Send-only IB Multicast joins do not work at the core
>>> -              * IB layer yet, so we can't use them here.  However,
>>> -              * we are emulating an Ethernet multicast send, which
>>> -              * does not require a multicast subscription and will
>>> -              * still send properly.  The most appropriate thing to
>>> +              * Send-only IB Multicast joins work at the core IB layer but
>>> +              * require specific SM support.
>>> +              * We can use such joins here only if the current SM supports that feature.
>>> +              * However, if not, we emulate an Ethernet multicast send,
>>> +              * which does not require a multicast subscription and will
>>> +              * still send properly. The most appropriate thing to
>>>                * do is to create the group if it doesn't exist as that
>>>                * most closely emulates the behavior, from a user space
>>> -              * application perspecitive, of Ethernet multicast
>>> -              * operation.  For now, we do a full join, maybe later
>>> -              * when the core IB layers support send only joins we
>>> -              * will use them.
>>> +              * application perspective, of Ethernet multicast operation.
>>>                */
>>> -#if 0
>>> -             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
>>> -                     rec.join_state = 4;
>>> -#endif
>>> +             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
>>> +                 priv->sm_fullmember_sendonly_support)
>>> +                     /* SM supports sendonly-fullmember, otherwise fallback to full-member */
>>> +                     rec.join_state = 8;
>>>       }
>>>       spin_unlock_irq(&priv->lock);
>>>
>>> @@ -559,6 +557,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>       struct ib_port_attr port_attr;
>>>       unsigned long delay_until = 0;
>>>       struct ipoib_mcast *mcast = NULL;
>>> +     int ret;
>>>
>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>               return;
>>> @@ -576,6 +575,14 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>       else
>>>               memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid));
>>>
>>> +     /* Check if can send sendonly MCG's with sendonly-fullmember join state */
>>> +     if (priv->broadcast) {
>>> +             ret = ipoib_check_sm_sendonly_fullmember_support(priv);
>>> +             if (ret < 0)
>>> +                     pr_debug("%s failed query sm support for sendonly-fullmember (ret: %d)\n",
>>> +                              priv->dev->name, ret);
>>> +     }
>>> +
>>>       spin_lock_irq(&priv->lock);
>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>               goto out;
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo
       [not found]         ` <57177181.7090408-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-04-20 14:24           ` Erez Shitrit
  2016-04-21 12:12           ` Erez Shitrit
  1 sibling, 0 replies; 31+ messages in thread
From: Erez Shitrit @ 2016-04-20 14:24 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 20, 2016 at 3:09 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 4/19/2016 8:51 AM, Erez Shitrit wrote:
>> New sa query function to return the ClassPortInfo struct from the sa.
>> If the SM supports FullMemberSendOnly mode for MCG's, it sets a
>> capability bit in the capability_mask2 field of the response.
>>
>> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>  drivers/infiniband/core/sa_query.c | 174 +++++++++++++++++++++++++++++++++++++
>>  include/rdma/ib_sa.h               |  12 +++
>>  2 files changed, 186 insertions(+)
>>
>> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
>> index 8a09c0f..66535bf 100644
>> --- a/drivers/infiniband/core/sa_query.c
>> +++ b/drivers/infiniband/core/sa_query.c
>> @@ -119,6 +119,12 @@ struct ib_sa_guidinfo_query {
>>       struct ib_sa_query sa_query;
>>  };
>>
>> +struct ib_sa_classport_info_query {
>> +     void (*callback)(int, struct ib_class_port_info *, void *);
>> +     void *context;
>> +     struct ib_sa_query sa_query;
>> +};
>> +
>>  struct ib_sa_mcmember_query {
>>       void (*callback)(int, struct ib_sa_mcmember_rec *, void *);
>>       void *context;
>> @@ -392,6 +398,83 @@ static const struct ib_field service_rec_table[] = {
>>         .size_bits    = 2*64 },
>>  };
>>
>> +#define PORTCLASSINFO_REC_FIELD(field) \
>
> Would CLASSPORTINFO_REC_FIELD be more consistent in terms of naming ?

Agree. will fix that. thanks.

>
>> +     .struct_offset_bytes = offsetof(struct ib_class_port_info, field),      \
>> +     .struct_size_bytes   = sizeof((struct ib_class_port_info *)0)->field,   \
>> +     .field_name          = "ib_class_port_info:" #field
>> +
>> +static const struct ib_field classport_info_rec_table[] = {
>> +     { PORTCLASSINFO_REC_FIELD(base_version),
>> +       .offset_words = 0,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 8 },
>> +     { PORTCLASSINFO_REC_FIELD(class_version),
>> +       .offset_words = 0,
>> +       .offset_bits  = 8,
>> +       .size_bits    = 8 },
>> +     { PORTCLASSINFO_REC_FIELD(capability_mask),
>> +       .offset_words = 0,
>> +       .offset_bits  = 16,
>> +       .size_bits    = 16 },
>> +     { PORTCLASSINFO_REC_FIELD(cap_mask2_resp_time),
>> +       .offset_words = 1,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>
> Why combine fields ? Syntax is rich enough to deal with bit fields.
> This is what was done for other SA attributes in this file.
>
> -- Hal
>
>> +     { PORTCLASSINFO_REC_FIELD(redirect_gid),
>> +       .offset_words = 2,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 128 },
>> +     { PORTCLASSINFO_REC_FIELD(redirect_tcslfl),
>> +       .offset_words = 6,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(redirect_lid),
>> +       .offset_words = 7,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 16 },
>> +     { PORTCLASSINFO_REC_FIELD(redirect_pkey),
>> +       .offset_words = 7,
>> +       .offset_bits  = 16,
>> +       .size_bits    = 16 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(redirect_qp),
>> +       .offset_words = 8,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +     { PORTCLASSINFO_REC_FIELD(redirect_qkey),
>> +       .offset_words = 9,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(trap_gid),
>> +       .offset_words = 10,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 128 },
>> +     { PORTCLASSINFO_REC_FIELD(trap_tcslfl),
>> +       .offset_words = 14,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(trap_lid),
>> +       .offset_words = 15,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 16 },
>> +     { PORTCLASSINFO_REC_FIELD(trap_pkey),
>> +       .offset_words = 15,
>> +       .offset_bits  = 16,
>> +       .size_bits    = 16 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(trap_hlqp),
>> +       .offset_words = 16,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +     { PORTCLASSINFO_REC_FIELD(trap_qkey),
>> +       .offset_words = 17,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +};
>> +
>>  #define GUIDINFO_REC_FIELD(field) \
>>       .struct_offset_bytes = offsetof(struct ib_sa_guidinfo_rec, field),      \
>>       .struct_size_bytes   = sizeof((struct ib_sa_guidinfo_rec *) 0)->field,  \
>> @@ -1645,6 +1728,97 @@ err1:
>>  }
>>  EXPORT_SYMBOL(ib_sa_guid_info_rec_query);
>>
>> +/* Support get SA ClassPortInfo */
>> +static void ib_sa_classport_info_rec_callback(struct ib_sa_query *sa_query,
>> +                                           int status,
>> +                                           struct ib_sa_mad *mad)
>> +{
>> +     struct ib_sa_classport_info_query *query =
>> +             container_of(sa_query, struct ib_sa_classport_info_query, sa_query);
>> +
>> +     if (mad) {
>> +             struct ib_class_port_info rec;
>> +
>> +             ib_unpack(classport_info_rec_table,
>> +                       ARRAY_SIZE(classport_info_rec_table),
>> +                       mad->data, &rec);
>> +             query->callback(status, &rec, query->context);
>> +     } else {
>> +             query->callback(status, NULL, query->context);
>> +     }
>> +}
>> +
>> +static void ib_sa_portclass_info_rec_release(struct ib_sa_query *sa_query)
>> +{
>> +     kfree(container_of(sa_query, struct ib_sa_classport_info_query,
>> +                        sa_query));
>> +}
>> +
>> +int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
>> +                                struct ib_device *device, u8 port_num,
>> +                                int timeout_ms, gfp_t gfp_mask,
>> +                                void (*callback)(int status,
>> +                                                 struct ib_class_port_info *resp,
>> +                                                 void *context),
>> +                                void *context,
>> +                                struct ib_sa_query **sa_query)
>> +{
>> +     struct ib_sa_classport_info_query *query;
>> +     struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
>> +     struct ib_sa_port *port;
>> +     struct ib_mad_agent *agent;
>> +     struct ib_sa_mad *mad;
>> +     int ret;
>> +
>> +     if (!sa_dev)
>> +             return -ENODEV;
>> +
>> +     port  = &sa_dev->port[port_num - sa_dev->start_port];
>> +     agent = port->agent;
>> +
>> +     query = kzalloc(sizeof(*query), gfp_mask);
>> +     if (!query)
>> +             return -ENOMEM;
>> +
>> +     query->sa_query.port = port;
>> +     ret = alloc_mad(&query->sa_query, gfp_mask);
>> +     if (ret)
>> +             goto err1;
>> +
>> +     ib_sa_client_get(client);
>> +     query->sa_query.client = client;
>> +     query->callback        = callback;
>> +     query->context         = context;
>> +
>> +     mad = query->sa_query.mad_buf->mad;
>> +     init_mad(mad, agent);
>> +
>> +     query->sa_query.callback = callback ? ib_sa_classport_info_rec_callback : NULL;
>> +
>> +     query->sa_query.release  = ib_sa_portclass_info_rec_release;
>> +     /* support GET only */
>> +     mad->mad_hdr.method      = IB_MGMT_METHOD_GET;
>> +     mad->mad_hdr.attr_id     = cpu_to_be16(IB_SA_ATTR_CLASS_PORTINFO);
>> +     mad->sa_hdr.comp_mask    = 0;
>> +     *sa_query = &query->sa_query;
>> +
>> +     ret = send_mad(&query->sa_query, timeout_ms, gfp_mask);
>> +     if (ret < 0)
>> +             goto err2;
>> +
>> +     return ret;
>> +
>> +err2:
>> +     *sa_query = NULL;
>> +     ib_sa_client_put(query->sa_query.client);
>> +     free_mad(&query->sa_query);
>> +
>> +err1:
>> +     kfree(query);
>> +     return ret;
>> +}
>> +EXPORT_SYMBOL(ib_sa_classport_info_rec_query);
>> +
>>  static void send_handler(struct ib_mad_agent *agent,
>>                        struct ib_mad_send_wc *mad_send_wc)
>>  {
>> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
>> index cdc1c81..3840416 100644
>> --- a/include/rdma/ib_sa.h
>> +++ b/include/rdma/ib_sa.h
>> @@ -94,6 +94,8 @@ enum ib_sa_selector {
>>       IB_SA_BEST = 3
>>  };
>>
>> +#define IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT    BIT(12)
>> +
>>  /*
>>   * Structures for SA records are named "struct ib_sa_xxx_rec."  No
>>   * attempt is made to pack structures to match the physical layout of
>> @@ -439,4 +441,14 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
>>                             void *context,
>>                             struct ib_sa_query **sa_query);
>>
>> +/* Support get SA ClassPortInfo */
>> +int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
>> +                                struct ib_device *device, u8 port_num,
>> +                                int timeout_ms, gfp_t gfp_mask,
>> +                                void (*callback)(int status,
>> +                                                 struct ib_class_port_info *resp,
>> +                                                 void *context),
>> +                                void *context,
>> +                                struct ib_sa_query **sa_query);
>> +
>>  #endif /* IB_SA_H */
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo
       [not found]         ` <57177181.7090408-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2016-04-20 14:24           ` Erez Shitrit
@ 2016-04-21 12:12           ` Erez Shitrit
  1 sibling, 0 replies; 31+ messages in thread
From: Erez Shitrit @ 2016-04-21 12:12 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 20, 2016 at 3:09 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 4/19/2016 8:51 AM, Erez Shitrit wrote:
>> New sa query function to return the ClassPortInfo struct from the sa.
>> If the SM supports FullMemberSendOnly mode for MCG's, it sets a
>> capability bit in the capability_mask2 field of the response.
>>
>> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>  drivers/infiniband/core/sa_query.c | 174 +++++++++++++++++++++++++++++++++++++
>>  include/rdma/ib_sa.h               |  12 +++
>>  2 files changed, 186 insertions(+)
>>
>> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
>> index 8a09c0f..66535bf 100644
>> --- a/drivers/infiniband/core/sa_query.c
>> +++ b/drivers/infiniband/core/sa_query.c
>> @@ -119,6 +119,12 @@ struct ib_sa_guidinfo_query {
>>       struct ib_sa_query sa_query;
>>  };
>>
>> +struct ib_sa_classport_info_query {
>> +     void (*callback)(int, struct ib_class_port_info *, void *);
>> +     void *context;
>> +     struct ib_sa_query sa_query;
>> +};
>> +
>>  struct ib_sa_mcmember_query {
>>       void (*callback)(int, struct ib_sa_mcmember_rec *, void *);
>>       void *context;
>> @@ -392,6 +398,83 @@ static const struct ib_field service_rec_table[] = {
>>         .size_bits    = 2*64 },
>>  };
>>
>> +#define PORTCLASSINFO_REC_FIELD(field) \
>
> Would CLASSPORTINFO_REC_FIELD be more consistent in terms of naming ?
>
>> +     .struct_offset_bytes = offsetof(struct ib_class_port_info, field),      \
>> +     .struct_size_bytes   = sizeof((struct ib_class_port_info *)0)->field,   \
>> +     .field_name          = "ib_class_port_info:" #field
>> +
>> +static const struct ib_field classport_info_rec_table[] = {
>> +     { PORTCLASSINFO_REC_FIELD(base_version),
>> +       .offset_words = 0,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 8 },
>> +     { PORTCLASSINFO_REC_FIELD(class_version),
>> +       .offset_words = 0,
>> +       .offset_bits  = 8,
>> +       .size_bits    = 8 },
>> +     { PORTCLASSINFO_REC_FIELD(capability_mask),
>> +       .offset_words = 0,
>> +       .offset_bits  = 16,
>> +       .size_bits    = 16 },
>> +     { PORTCLASSINFO_REC_FIELD(cap_mask2_resp_time),
>> +       .offset_words = 1,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>
> Why combine fields ? Syntax is rich enough to deal with bit fields.
> This is what was done for other SA attributes in this file.
>
> -- Hal

Tried to follow the sm implementation.
Anyway, will fix that.
Thanks.

>
>> +     { PORTCLASSINFO_REC_FIELD(redirect_gid),
>> +       .offset_words = 2,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 128 },
>> +     { PORTCLASSINFO_REC_FIELD(redirect_tcslfl),
>> +       .offset_words = 6,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(redirect_lid),
>> +       .offset_words = 7,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 16 },
>> +     { PORTCLASSINFO_REC_FIELD(redirect_pkey),
>> +       .offset_words = 7,
>> +       .offset_bits  = 16,
>> +       .size_bits    = 16 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(redirect_qp),
>> +       .offset_words = 8,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +     { PORTCLASSINFO_REC_FIELD(redirect_qkey),
>> +       .offset_words = 9,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(trap_gid),
>> +       .offset_words = 10,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 128 },
>> +     { PORTCLASSINFO_REC_FIELD(trap_tcslfl),
>> +       .offset_words = 14,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(trap_lid),
>> +       .offset_words = 15,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 16 },
>> +     { PORTCLASSINFO_REC_FIELD(trap_pkey),
>> +       .offset_words = 15,
>> +       .offset_bits  = 16,
>> +       .size_bits    = 16 },
>> +
>> +     { PORTCLASSINFO_REC_FIELD(trap_hlqp),
>> +       .offset_words = 16,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +     { PORTCLASSINFO_REC_FIELD(trap_qkey),
>> +       .offset_words = 17,
>> +       .offset_bits  = 0,
>> +       .size_bits    = 32 },
>> +};
>> +
>>  #define GUIDINFO_REC_FIELD(field) \
>>       .struct_offset_bytes = offsetof(struct ib_sa_guidinfo_rec, field),      \
>>       .struct_size_bytes   = sizeof((struct ib_sa_guidinfo_rec *) 0)->field,  \
>> @@ -1645,6 +1728,97 @@ err1:
>>  }
>>  EXPORT_SYMBOL(ib_sa_guid_info_rec_query);
>>
>> +/* Support get SA ClassPortInfo */
>> +static void ib_sa_classport_info_rec_callback(struct ib_sa_query *sa_query,
>> +                                           int status,
>> +                                           struct ib_sa_mad *mad)
>> +{
>> +     struct ib_sa_classport_info_query *query =
>> +             container_of(sa_query, struct ib_sa_classport_info_query, sa_query);
>> +
>> +     if (mad) {
>> +             struct ib_class_port_info rec;
>> +
>> +             ib_unpack(classport_info_rec_table,
>> +                       ARRAY_SIZE(classport_info_rec_table),
>> +                       mad->data, &rec);
>> +             query->callback(status, &rec, query->context);
>> +     } else {
>> +             query->callback(status, NULL, query->context);
>> +     }
>> +}
>> +
>> +static void ib_sa_portclass_info_rec_release(struct ib_sa_query *sa_query)
>> +{
>> +     kfree(container_of(sa_query, struct ib_sa_classport_info_query,
>> +                        sa_query));
>> +}
>> +
>> +int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
>> +                                struct ib_device *device, u8 port_num,
>> +                                int timeout_ms, gfp_t gfp_mask,
>> +                                void (*callback)(int status,
>> +                                                 struct ib_class_port_info *resp,
>> +                                                 void *context),
>> +                                void *context,
>> +                                struct ib_sa_query **sa_query)
>> +{
>> +     struct ib_sa_classport_info_query *query;
>> +     struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
>> +     struct ib_sa_port *port;
>> +     struct ib_mad_agent *agent;
>> +     struct ib_sa_mad *mad;
>> +     int ret;
>> +
>> +     if (!sa_dev)
>> +             return -ENODEV;
>> +
>> +     port  = &sa_dev->port[port_num - sa_dev->start_port];
>> +     agent = port->agent;
>> +
>> +     query = kzalloc(sizeof(*query), gfp_mask);
>> +     if (!query)
>> +             return -ENOMEM;
>> +
>> +     query->sa_query.port = port;
>> +     ret = alloc_mad(&query->sa_query, gfp_mask);
>> +     if (ret)
>> +             goto err1;
>> +
>> +     ib_sa_client_get(client);
>> +     query->sa_query.client = client;
>> +     query->callback        = callback;
>> +     query->context         = context;
>> +
>> +     mad = query->sa_query.mad_buf->mad;
>> +     init_mad(mad, agent);
>> +
>> +     query->sa_query.callback = callback ? ib_sa_classport_info_rec_callback : NULL;
>> +
>> +     query->sa_query.release  = ib_sa_portclass_info_rec_release;
>> +     /* support GET only */
>> +     mad->mad_hdr.method      = IB_MGMT_METHOD_GET;
>> +     mad->mad_hdr.attr_id     = cpu_to_be16(IB_SA_ATTR_CLASS_PORTINFO);
>> +     mad->sa_hdr.comp_mask    = 0;
>> +     *sa_query = &query->sa_query;
>> +
>> +     ret = send_mad(&query->sa_query, timeout_ms, gfp_mask);
>> +     if (ret < 0)
>> +             goto err2;
>> +
>> +     return ret;
>> +
>> +err2:
>> +     *sa_query = NULL;
>> +     ib_sa_client_put(query->sa_query.client);
>> +     free_mad(&query->sa_query);
>> +
>> +err1:
>> +     kfree(query);
>> +     return ret;
>> +}
>> +EXPORT_SYMBOL(ib_sa_classport_info_rec_query);
>> +
>>  static void send_handler(struct ib_mad_agent *agent,
>>                        struct ib_mad_send_wc *mad_send_wc)
>>  {
>> diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
>> index cdc1c81..3840416 100644
>> --- a/include/rdma/ib_sa.h
>> +++ b/include/rdma/ib_sa.h
>> @@ -94,6 +94,8 @@ enum ib_sa_selector {
>>       IB_SA_BEST = 3
>>  };
>>
>> +#define IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT    BIT(12)
>> +
>>  /*
>>   * Structures for SA records are named "struct ib_sa_xxx_rec."  No
>>   * attempt is made to pack structures to match the physical layout of
>> @@ -439,4 +441,14 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
>>                             void *context,
>>                             struct ib_sa_query **sa_query);
>>
>> +/* Support get SA ClassPortInfo */
>> +int ib_sa_classport_info_rec_query(struct ib_sa_client *client,
>> +                                struct ib_device *device, u8 port_num,
>> +                                int timeout_ms, gfp_t gfp_mask,
>> +                                void (*callback)(int status,
>> +                                                 struct ib_class_port_info *resp,
>> +                                                 void *context),
>> +                                void *context,
>> +                                struct ib_sa_query **sa_query);
>> +
>>  #endif /* IB_SA_H */
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]                 ` <5717843B.7020501-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-04-21 12:52                   ` Erez Shitrit
       [not found]                     ` <CAAk-MO9sie_pFH7wSDhhuEd2uApnz5dMF6E9u6Vgkyimc+1xWA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-21 12:52 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 20, 2016 at 4:29 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 4/20/2016 9:12 AM, Erez Shitrit wrote:
>> On Wed, Apr 20, 2016 at 3:11 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>>> On 4/19/2016 8:51 AM, Erez Shitrit wrote:
>>>> Check (via an sa query) if the SM supports the new option for SendOnly
>>>> multicast joins.
>>>> If the SM supports that option it will use the new join state to create
>>>> such multicast group.
>>>> If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
>>>> join for SendOnly MCG, use the correct state if supported.
>>>>
>>>> This check is performed at every invocation of mcast_restart task, to be
>>>> sure that the driver stays in sync with the current state of the SM.
>>>
>>> Would a better strategy (to minimize gets of SA ClassPortInfo) be to do
>>> this based on local events (like SM change, etc.) ?
>>>
>>> -- Hal
>>>
>>
>> SA query for ClassPortInfo is done prior to the re-join of all
>> multicasts, because it takes no assumption on the current support of
>> the SM.
>> re-join to all multicast is done according to events from the
>> networking layer ("UP" from the kernel)  and from the IB part ("DOWN")
>> events (like SM change, RE-REG etc.)
>> We think that In both cases the driver should get the updated info of
>> the sm support.
>
> Yes, those are same cases I'm referring to but won't this patch make it
> occur once per rejoin rather than once when event occurs ? If so, this
> is an unnecessary load on SA.

The assumption was to check if the sm still keeping that option, and
to do that before any new join.
Can I assume that the sm keeps that setting, without the option to change them?

>
>>
>>>> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>> ---
>>>>  drivers/infiniband/ulp/ipoib/ipoib.h           |  2 +
>>>>  drivers/infiniband/ulp/ipoib/ipoib_main.c      | 74 ++++++++++++++++++++++++++
>>>>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 33 +++++++-----
>>>>  3 files changed, 96 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
>>>> index caec8e9..c51f618 100644
>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib.h
>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib.h
>>>> @@ -392,6 +392,7 @@ struct ipoib_dev_priv {
>>>>       struct ipoib_ethtool_st ethtool;
>>>>       struct timer_list poll_timer;
>>>>       unsigned max_send_sge;
>>>> +     bool sm_fullmember_sendonly_support;
>>>>  };
>>>>
>>>>  struct ipoib_ah {
>>>> @@ -476,6 +477,7 @@ void ipoib_reap_ah(struct work_struct *work);
>>>>
>>>>  void ipoib_mark_paths_invalid(struct net_device *dev);
>>>>  void ipoib_flush_paths(struct net_device *dev);
>>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv);
>>>>  struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
>>>>
>>>>  int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>> index 3b630e5..eb5927b 100644
>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>> @@ -117,6 +117,8 @@ int ipoib_open(struct net_device *dev)
>>>>
>>>>       set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
>>>>
>>>> +     priv->sm_fullmember_sendonly_support = false;
>>>> +
>>>>       if (ipoib_ib_dev_open(dev)) {
>>>>               if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags))
>>>>                       return 0;
>>>> @@ -629,6 +631,78 @@ void ipoib_mark_paths_invalid(struct net_device *dev)
>>>>       spin_unlock_irq(&priv->lock);
>>>>  }
>>>>
>>>> +struct classport_info_context {
>>>> +     struct ipoib_dev_priv   *priv;
>>>> +     struct completion       done;
>>>> +     struct ib_sa_query      *sa_query;
>>>> +};
>>>> +
>>>> +static void classport_info_query_cb(int status, struct ib_class_port_info *rec,
>>>> +                                 void *context)
>>>> +{
>>>> +     struct classport_info_context *cb_ctx = context;
>>>> +     struct ipoib_dev_priv *priv;
>>>> +
>>>> +     WARN_ON(!context);
>>>> +
>>>> +     priv = cb_ctx->priv;
>>>> +
>>>> +     if (status) {
>>>> +             pr_debug("device: %s failed query classport_info status: %d\n",
>>>> +                      priv->dev->name, status);
>>>> +             /* keeps the default, will try next mcast_restart */
>>>> +             priv->sm_fullmember_sendonly_support = false;
>>>> +             goto out;
>>>> +     }
>>>> +
>>>> +     /* take out resp_time that located in the last 5 bits */
>>>> +     if ((be32_to_cpu(rec->cap_mask2_resp_time) >> 5) &
>>>> +         IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT) {
>>>> +             pr_debug("device: %s enabled fullmember-sendonly for sendonly MCG\n",
>>>> +                      priv->dev->name);
>>>> +             priv->sm_fullmember_sendonly_support = true;
>>>> +     } else {
>>>> +             pr_debug("device: %s disabled fullmember-sendonly for sendonly MCG\n",
>>>> +                      priv->dev->name);
>>>> +             priv->sm_fullmember_sendonly_support = false;
>>>> +     }
>>>> +
>>>> +out:
>>>> +     complete(&cb_ctx->done);
>>>> +}
>>>> +
>>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv)
>>>> +{
>>>> +     struct classport_info_context *callback_context;
>>>> +     int ret;
>>>> +
>>>> +     callback_context = kmalloc(sizeof(*callback_context), GFP_KERNEL);
>>>> +     if (!callback_context)
>>>> +             return -ENOMEM;
>>>> +
>>>> +     callback_context->priv = priv;
>>>> +     init_completion(&callback_context->done);
>>>> +
>>>> +     ret = ib_sa_classport_info_rec_query(&ipoib_sa_client,
>>>> +                                          priv->ca, priv->port, 3000,
>>>> +                                          GFP_KERNEL,
>>>> +                                          classport_info_query_cb,
>>>> +                                          callback_context,
>>>> +                                          &callback_context->sa_query);
>>>> +     if (ret < 0) {
>>>> +             pr_info("%s failed to send ib_sa_classport_info query, ret: %d\n",
>>>> +                     priv->dev->name, ret);
>>>> +             kfree(callback_context);
>>>> +             return ret;
>>>> +     }
>>>> +
>>>> +     /* waiting for the callback to finish before returnning */
>>>> +     wait_for_completion(&callback_context->done);
>>>> +     kfree(callback_context);
>>>> +
>>>> +     return ret;
>>>> +}
>>>> +
>>>>  void ipoib_flush_paths(struct net_device *dev)
>>>>  {
>>>>       struct ipoib_dev_priv *priv = netdev_priv(dev);
>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>> index 2588931..c1bb69e 100644
>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>> @@ -515,22 +515,20 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
>>>>               rec.hop_limit     = priv->broadcast->mcmember.hop_limit;
>>>>
>>>>               /*
>>>> -              * Send-only IB Multicast joins do not work at the core
>>>> -              * IB layer yet, so we can't use them here.  However,
>>>> -              * we are emulating an Ethernet multicast send, which
>>>> -              * does not require a multicast subscription and will
>>>> -              * still send properly.  The most appropriate thing to
>>>> +              * Send-only IB Multicast joins work at the core IB layer but
>>>> +              * require specific SM support.
>>>> +              * We can use such joins here only if the current SM supports that feature.
>>>> +              * However, if not, we emulate an Ethernet multicast send,
>>>> +              * which does not require a multicast subscription and will
>>>> +              * still send properly. The most appropriate thing to
>>>>                * do is to create the group if it doesn't exist as that
>>>>                * most closely emulates the behavior, from a user space
>>>> -              * application perspecitive, of Ethernet multicast
>>>> -              * operation.  For now, we do a full join, maybe later
>>>> -              * when the core IB layers support send only joins we
>>>> -              * will use them.
>>>> +              * application perspective, of Ethernet multicast operation.
>>>>                */
>>>> -#if 0
>>>> -             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
>>>> -                     rec.join_state = 4;
>>>> -#endif
>>>> +             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
>>>> +                 priv->sm_fullmember_sendonly_support)
>>>> +                     /* SM supports sendonly-fullmember, otherwise fallback to full-member */
>>>> +                     rec.join_state = 8;
>>>>       }
>>>>       spin_unlock_irq(&priv->lock);
>>>>
>>>> @@ -559,6 +557,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>>       struct ib_port_attr port_attr;
>>>>       unsigned long delay_until = 0;
>>>>       struct ipoib_mcast *mcast = NULL;
>>>> +     int ret;
>>>>
>>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>>               return;
>>>> @@ -576,6 +575,14 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>>       else
>>>>               memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid));
>>>>
>>>> +     /* Check if can send sendonly MCG's with sendonly-fullmember join state */
>>>> +     if (priv->broadcast) {
>>>> +             ret = ipoib_check_sm_sendonly_fullmember_support(priv);
>>>> +             if (ret < 0)
>>>> +                     pr_debug("%s failed query sm support for sendonly-fullmember (ret: %d)\n",
>>>> +                              priv->dev->name, ret);
>>>> +     }
>>>> +
>>>>       spin_lock_irq(&priv->lock);
>>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>>               goto out;
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]                     ` <CAAk-MO9sie_pFH7wSDhhuEd2uApnz5dMF6E9u6Vgkyimc+1xWA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-04-21 13:24                       ` Hal Rosenstock
       [not found]                         ` <5718D497.70306-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Hal Rosenstock @ 2016-04-21 13:24 UTC (permalink / raw)
  To: Erez Shitrit
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/21/2016 8:52 AM, Erez Shitrit wrote:
> On Wed, Apr 20, 2016 at 4:29 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On 4/20/2016 9:12 AM, Erez Shitrit wrote:
>>> On Wed, Apr 20, 2016 at 3:11 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>>>> On 4/19/2016 8:51 AM, Erez Shitrit wrote:
>>>>> Check (via an sa query) if the SM supports the new option for SendOnly
>>>>> multicast joins.
>>>>> If the SM supports that option it will use the new join state to create
>>>>> such multicast group.
>>>>> If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
>>>>> join for SendOnly MCG, use the correct state if supported.
>>>>>
>>>>> This check is performed at every invocation of mcast_restart task, to be
>>>>> sure that the driver stays in sync with the current state of the SM.
>>>>
>>>> Would a better strategy (to minimize gets of SA ClassPortInfo) be to do
>>>> this based on local events (like SM change, etc.) ?
>>>>
>>>> -- Hal
>>>>
>>>
>>> SA query for ClassPortInfo is done prior to the re-join of all
>>> multicasts, because it takes no assumption on the current support of
>>> the SM.
>>> re-join to all multicast is done according to events from the
>>> networking layer ("UP" from the kernel)  and from the IB part ("DOWN")
>>> events (like SM change, RE-REG etc.)
>>> We think that In both cases the driver should get the updated info of
>>> the sm support.
>>
>> Yes, those are same cases I'm referring to but won't this patch make it
>> occur once per rejoin rather than once when event occurs ? If so, this
>> is an unnecessary load on SA.
> 
> The assumption was to check if the sm still keeping that option, and
> to do that before any new join.

That will work but it's overkill.

> Can I assume that the sm keeps that setting, without the option to change them?

Yes, the SA associated with the SM either supports the option or not.
Only need to deal with determining this once unless SM fails over
(various local events).

This will minimize storm of joins and PR queries at SA on client reregister.

>>
>>>
>>>>> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>>> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>>> ---
>>>>>  drivers/infiniband/ulp/ipoib/ipoib.h           |  2 +
>>>>>  drivers/infiniband/ulp/ipoib/ipoib_main.c      | 74 ++++++++++++++++++++++++++
>>>>>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 33 +++++++-----
>>>>>  3 files changed, 96 insertions(+), 13 deletions(-)
>>>>>
>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
>>>>> index caec8e9..c51f618 100644
>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib.h
>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib.h
>>>>> @@ -392,6 +392,7 @@ struct ipoib_dev_priv {
>>>>>       struct ipoib_ethtool_st ethtool;
>>>>>       struct timer_list poll_timer;
>>>>>       unsigned max_send_sge;
>>>>> +     bool sm_fullmember_sendonly_support;
>>>>>  };
>>>>>
>>>>>  struct ipoib_ah {
>>>>> @@ -476,6 +477,7 @@ void ipoib_reap_ah(struct work_struct *work);
>>>>>
>>>>>  void ipoib_mark_paths_invalid(struct net_device *dev);
>>>>>  void ipoib_flush_paths(struct net_device *dev);
>>>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv);
>>>>>  struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
>>>>>
>>>>>  int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>> index 3b630e5..eb5927b 100644
>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>> @@ -117,6 +117,8 @@ int ipoib_open(struct net_device *dev)
>>>>>
>>>>>       set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
>>>>>
>>>>> +     priv->sm_fullmember_sendonly_support = false;
>>>>> +
>>>>>       if (ipoib_ib_dev_open(dev)) {
>>>>>               if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags))
>>>>>                       return 0;
>>>>> @@ -629,6 +631,78 @@ void ipoib_mark_paths_invalid(struct net_device *dev)
>>>>>       spin_unlock_irq(&priv->lock);
>>>>>  }
>>>>>
>>>>> +struct classport_info_context {
>>>>> +     struct ipoib_dev_priv   *priv;
>>>>> +     struct completion       done;
>>>>> +     struct ib_sa_query      *sa_query;
>>>>> +};
>>>>> +
>>>>> +static void classport_info_query_cb(int status, struct ib_class_port_info *rec,
>>>>> +                                 void *context)
>>>>> +{
>>>>> +     struct classport_info_context *cb_ctx = context;
>>>>> +     struct ipoib_dev_priv *priv;
>>>>> +
>>>>> +     WARN_ON(!context);
>>>>> +
>>>>> +     priv = cb_ctx->priv;
>>>>> +
>>>>> +     if (status) {
>>>>> +             pr_debug("device: %s failed query classport_info status: %d\n",
>>>>> +                      priv->dev->name, status);
>>>>> +             /* keeps the default, will try next mcast_restart */
>>>>> +             priv->sm_fullmember_sendonly_support = false;
>>>>> +             goto out;
>>>>> +     }
>>>>> +
>>>>> +     /* take out resp_time that located in the last 5 bits */
>>>>> +     if ((be32_to_cpu(rec->cap_mask2_resp_time) >> 5) &
>>>>> +         IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT) {
>>>>> +             pr_debug("device: %s enabled fullmember-sendonly for sendonly MCG\n",
>>>>> +                      priv->dev->name);
>>>>> +             priv->sm_fullmember_sendonly_support = true;
>>>>> +     } else {
>>>>> +             pr_debug("device: %s disabled fullmember-sendonly for sendonly MCG\n",
>>>>> +                      priv->dev->name);
>>>>> +             priv->sm_fullmember_sendonly_support = false;
>>>>> +     }
>>>>> +
>>>>> +out:
>>>>> +     complete(&cb_ctx->done);
>>>>> +}
>>>>> +
>>>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv)
>>>>> +{
>>>>> +     struct classport_info_context *callback_context;
>>>>> +     int ret;
>>>>> +
>>>>> +     callback_context = kmalloc(sizeof(*callback_context), GFP_KERNEL);
>>>>> +     if (!callback_context)
>>>>> +             return -ENOMEM;
>>>>> +
>>>>> +     callback_context->priv = priv;
>>>>> +     init_completion(&callback_context->done);
>>>>> +
>>>>> +     ret = ib_sa_classport_info_rec_query(&ipoib_sa_client,
>>>>> +                                          priv->ca, priv->port, 3000,
>>>>> +                                          GFP_KERNEL,
>>>>> +                                          classport_info_query_cb,
>>>>> +                                          callback_context,
>>>>> +                                          &callback_context->sa_query);
>>>>> +     if (ret < 0) {
>>>>> +             pr_info("%s failed to send ib_sa_classport_info query, ret: %d\n",
>>>>> +                     priv->dev->name, ret);
>>>>> +             kfree(callback_context);
>>>>> +             return ret;
>>>>> +     }
>>>>> +
>>>>> +     /* waiting for the callback to finish before returnning */
>>>>> +     wait_for_completion(&callback_context->done);
>>>>> +     kfree(callback_context);
>>>>> +
>>>>> +     return ret;
>>>>> +}
>>>>> +
>>>>>  void ipoib_flush_paths(struct net_device *dev)
>>>>>  {
>>>>>       struct ipoib_dev_priv *priv = netdev_priv(dev);
>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>>> index 2588931..c1bb69e 100644
>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>>> @@ -515,22 +515,20 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
>>>>>               rec.hop_limit     = priv->broadcast->mcmember.hop_limit;
>>>>>
>>>>>               /*
>>>>> -              * Send-only IB Multicast joins do not work at the core
>>>>> -              * IB layer yet, so we can't use them here.  However,
>>>>> -              * we are emulating an Ethernet multicast send, which
>>>>> -              * does not require a multicast subscription and will
>>>>> -              * still send properly.  The most appropriate thing to
>>>>> +              * Send-only IB Multicast joins work at the core IB layer but
>>>>> +              * require specific SM support.
>>>>> +              * We can use such joins here only if the current SM supports that feature.
>>>>> +              * However, if not, we emulate an Ethernet multicast send,
>>>>> +              * which does not require a multicast subscription and will
>>>>> +              * still send properly. The most appropriate thing to
>>>>>                * do is to create the group if it doesn't exist as that
>>>>>                * most closely emulates the behavior, from a user space
>>>>> -              * application perspecitive, of Ethernet multicast
>>>>> -              * operation.  For now, we do a full join, maybe later
>>>>> -              * when the core IB layers support send only joins we
>>>>> -              * will use them.
>>>>> +              * application perspective, of Ethernet multicast operation.
>>>>>                */
>>>>> -#if 0
>>>>> -             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
>>>>> -                     rec.join_state = 4;
>>>>> -#endif
>>>>> +             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
>>>>> +                 priv->sm_fullmember_sendonly_support)
>>>>> +                     /* SM supports sendonly-fullmember, otherwise fallback to full-member */
>>>>> +                     rec.join_state = 8;
>>>>>       }
>>>>>       spin_unlock_irq(&priv->lock);
>>>>>
>>>>> @@ -559,6 +557,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>>>       struct ib_port_attr port_attr;
>>>>>       unsigned long delay_until = 0;
>>>>>       struct ipoib_mcast *mcast = NULL;
>>>>> +     int ret;
>>>>>
>>>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>>>               return;
>>>>> @@ -576,6 +575,14 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>>>       else
>>>>>               memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid));
>>>>>
>>>>> +     /* Check if can send sendonly MCG's with sendonly-fullmember join state */
>>>>> +     if (priv->broadcast) {
>>>>> +             ret = ipoib_check_sm_sendonly_fullmember_support(priv);
>>>>> +             if (ret < 0)
>>>>> +                     pr_debug("%s failed query sm support for sendonly-fullmember (ret: %d)\n",
>>>>> +                              priv->dev->name, ret);
>>>>> +     }
>>>>> +
>>>>>       spin_lock_irq(&priv->lock);
>>>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>>>               goto out;
>>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]                         ` <5718D497.70306-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-04-21 14:11                           ` Erez Shitrit
       [not found]                             ` <CAAk-MO9W-sdi4D+xT6C_TB7uWNMPV8++RRtyoLmJ23HFq11+iw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-04-21 14:11 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 21, 2016 at 4:24 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On 4/21/2016 8:52 AM, Erez Shitrit wrote:
>> On Wed, Apr 20, 2016 at 4:29 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>>> On 4/20/2016 9:12 AM, Erez Shitrit wrote:
>>>> On Wed, Apr 20, 2016 at 3:11 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>>>>> On 4/19/2016 8:51 AM, Erez Shitrit wrote:
>>>>>> Check (via an sa query) if the SM supports the new option for SendOnly
>>>>>> multicast joins.
>>>>>> If the SM supports that option it will use the new join state to create
>>>>>> such multicast group.
>>>>>> If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
>>>>>> join for SendOnly MCG, use the correct state if supported.
>>>>>>
>>>>>> This check is performed at every invocation of mcast_restart task, to be
>>>>>> sure that the driver stays in sync with the current state of the SM.
>>>>>
>>>>> Would a better strategy (to minimize gets of SA ClassPortInfo) be to do
>>>>> this based on local events (like SM change, etc.) ?
>>>>>
>>>>> -- Hal
>>>>>
>>>>
>>>> SA query for ClassPortInfo is done prior to the re-join of all
>>>> multicasts, because it takes no assumption on the current support of
>>>> the SM.
>>>> re-join to all multicast is done according to events from the
>>>> networking layer ("UP" from the kernel)  and from the IB part ("DOWN")
>>>> events (like SM change, RE-REG etc.)
>>>> We think that In both cases the driver should get the updated info of
>>>> the sm support.
>>>
>>> Yes, those are same cases I'm referring to but won't this patch make it
>>> occur once per rejoin rather than once when event occurs ? If so, this
>>> is an unnecessary load on SA.
>>
>> The assumption was to check if the sm still keeping that option, and
>> to do that before any new join.
>
> That will work but it's overkill.
>
>> Can I assume that the sm keeps that setting, without the option to change them?
>
> Yes, the SA associated with the SM either supports the option or not.
> Only need to deal with determining this once unless SM fails over
> (various local events).
>
> This will minimize storm of joins and PR queries at SA on client reregister.

OK, will change that accordingly.
Thanks,

>
>>>
>>>>
>>>>>> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>>>> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>>>>> ---
>>>>>>  drivers/infiniband/ulp/ipoib/ipoib.h           |  2 +
>>>>>>  drivers/infiniband/ulp/ipoib/ipoib_main.c      | 74 ++++++++++++++++++++++++++
>>>>>>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 33 +++++++-----
>>>>>>  3 files changed, 96 insertions(+), 13 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
>>>>>> index caec8e9..c51f618 100644
>>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib.h
>>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib.h
>>>>>> @@ -392,6 +392,7 @@ struct ipoib_dev_priv {
>>>>>>       struct ipoib_ethtool_st ethtool;
>>>>>>       struct timer_list poll_timer;
>>>>>>       unsigned max_send_sge;
>>>>>> +     bool sm_fullmember_sendonly_support;
>>>>>>  };
>>>>>>
>>>>>>  struct ipoib_ah {
>>>>>> @@ -476,6 +477,7 @@ void ipoib_reap_ah(struct work_struct *work);
>>>>>>
>>>>>>  void ipoib_mark_paths_invalid(struct net_device *dev);
>>>>>>  void ipoib_flush_paths(struct net_device *dev);
>>>>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv);
>>>>>>  struct ipoib_dev_priv *ipoib_intf_alloc(const char *format);
>>>>>>
>>>>>>  int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
>>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>>> index 3b630e5..eb5927b 100644
>>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
>>>>>> @@ -117,6 +117,8 @@ int ipoib_open(struct net_device *dev)
>>>>>>
>>>>>>       set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
>>>>>>
>>>>>> +     priv->sm_fullmember_sendonly_support = false;
>>>>>> +
>>>>>>       if (ipoib_ib_dev_open(dev)) {
>>>>>>               if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags))
>>>>>>                       return 0;
>>>>>> @@ -629,6 +631,78 @@ void ipoib_mark_paths_invalid(struct net_device *dev)
>>>>>>       spin_unlock_irq(&priv->lock);
>>>>>>  }
>>>>>>
>>>>>> +struct classport_info_context {
>>>>>> +     struct ipoib_dev_priv   *priv;
>>>>>> +     struct completion       done;
>>>>>> +     struct ib_sa_query      *sa_query;
>>>>>> +};
>>>>>> +
>>>>>> +static void classport_info_query_cb(int status, struct ib_class_port_info *rec,
>>>>>> +                                 void *context)
>>>>>> +{
>>>>>> +     struct classport_info_context *cb_ctx = context;
>>>>>> +     struct ipoib_dev_priv *priv;
>>>>>> +
>>>>>> +     WARN_ON(!context);
>>>>>> +
>>>>>> +     priv = cb_ctx->priv;
>>>>>> +
>>>>>> +     if (status) {
>>>>>> +             pr_debug("device: %s failed query classport_info status: %d\n",
>>>>>> +                      priv->dev->name, status);
>>>>>> +             /* keeps the default, will try next mcast_restart */
>>>>>> +             priv->sm_fullmember_sendonly_support = false;
>>>>>> +             goto out;
>>>>>> +     }
>>>>>> +
>>>>>> +     /* take out resp_time that located in the last 5 bits */
>>>>>> +     if ((be32_to_cpu(rec->cap_mask2_resp_time) >> 5) &
>>>>>> +         IB_SA_CAP_MASK2_SENDONLY_FULL_MEM_SUPPORT) {
>>>>>> +             pr_debug("device: %s enabled fullmember-sendonly for sendonly MCG\n",
>>>>>> +                      priv->dev->name);
>>>>>> +             priv->sm_fullmember_sendonly_support = true;
>>>>>> +     } else {
>>>>>> +             pr_debug("device: %s disabled fullmember-sendonly for sendonly MCG\n",
>>>>>> +                      priv->dev->name);
>>>>>> +             priv->sm_fullmember_sendonly_support = false;
>>>>>> +     }
>>>>>> +
>>>>>> +out:
>>>>>> +     complete(&cb_ctx->done);
>>>>>> +}
>>>>>> +
>>>>>> +int ipoib_check_sm_sendonly_fullmember_support(struct ipoib_dev_priv *priv)
>>>>>> +{
>>>>>> +     struct classport_info_context *callback_context;
>>>>>> +     int ret;
>>>>>> +
>>>>>> +     callback_context = kmalloc(sizeof(*callback_context), GFP_KERNEL);
>>>>>> +     if (!callback_context)
>>>>>> +             return -ENOMEM;
>>>>>> +
>>>>>> +     callback_context->priv = priv;
>>>>>> +     init_completion(&callback_context->done);
>>>>>> +
>>>>>> +     ret = ib_sa_classport_info_rec_query(&ipoib_sa_client,
>>>>>> +                                          priv->ca, priv->port, 3000,
>>>>>> +                                          GFP_KERNEL,
>>>>>> +                                          classport_info_query_cb,
>>>>>> +                                          callback_context,
>>>>>> +                                          &callback_context->sa_query);
>>>>>> +     if (ret < 0) {
>>>>>> +             pr_info("%s failed to send ib_sa_classport_info query, ret: %d\n",
>>>>>> +                     priv->dev->name, ret);
>>>>>> +             kfree(callback_context);
>>>>>> +             return ret;
>>>>>> +     }
>>>>>> +
>>>>>> +     /* waiting for the callback to finish before returnning */
>>>>>> +     wait_for_completion(&callback_context->done);
>>>>>> +     kfree(callback_context);
>>>>>> +
>>>>>> +     return ret;
>>>>>> +}
>>>>>> +
>>>>>>  void ipoib_flush_paths(struct net_device *dev)
>>>>>>  {
>>>>>>       struct ipoib_dev_priv *priv = netdev_priv(dev);
>>>>>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>>>> index 2588931..c1bb69e 100644
>>>>>> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>>>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>>>>>> @@ -515,22 +515,20 @@ static int ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
>>>>>>               rec.hop_limit     = priv->broadcast->mcmember.hop_limit;
>>>>>>
>>>>>>               /*
>>>>>> -              * Send-only IB Multicast joins do not work at the core
>>>>>> -              * IB layer yet, so we can't use them here.  However,
>>>>>> -              * we are emulating an Ethernet multicast send, which
>>>>>> -              * does not require a multicast subscription and will
>>>>>> -              * still send properly.  The most appropriate thing to
>>>>>> +              * Send-only IB Multicast joins work at the core IB layer but
>>>>>> +              * require specific SM support.
>>>>>> +              * We can use such joins here only if the current SM supports that feature.
>>>>>> +              * However, if not, we emulate an Ethernet multicast send,
>>>>>> +              * which does not require a multicast subscription and will
>>>>>> +              * still send properly. The most appropriate thing to
>>>>>>                * do is to create the group if it doesn't exist as that
>>>>>>                * most closely emulates the behavior, from a user space
>>>>>> -              * application perspecitive, of Ethernet multicast
>>>>>> -              * operation.  For now, we do a full join, maybe later
>>>>>> -              * when the core IB layers support send only joins we
>>>>>> -              * will use them.
>>>>>> +              * application perspective, of Ethernet multicast operation.
>>>>>>                */
>>>>>> -#if 0
>>>>>> -             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
>>>>>> -                     rec.join_state = 4;
>>>>>> -#endif
>>>>>> +             if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
>>>>>> +                 priv->sm_fullmember_sendonly_support)
>>>>>> +                     /* SM supports sendonly-fullmember, otherwise fallback to full-member */
>>>>>> +                     rec.join_state = 8;
>>>>>>       }
>>>>>>       spin_unlock_irq(&priv->lock);
>>>>>>
>>>>>> @@ -559,6 +557,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>>>>       struct ib_port_attr port_attr;
>>>>>>       unsigned long delay_until = 0;
>>>>>>       struct ipoib_mcast *mcast = NULL;
>>>>>> +     int ret;
>>>>>>
>>>>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>>>>               return;
>>>>>> @@ -576,6 +575,14 @@ void ipoib_mcast_join_task(struct work_struct *work)
>>>>>>       else
>>>>>>               memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid));
>>>>>>
>>>>>> +     /* Check if can send sendonly MCG's with sendonly-fullmember join state */
>>>>>> +     if (priv->broadcast) {
>>>>>> +             ret = ipoib_check_sm_sendonly_fullmember_support(priv);
>>>>>> +             if (ret < 0)
>>>>>> +                     pr_debug("%s failed query sm support for sendonly-fullmember (ret: %d)\n",
>>>>>> +                              priv->dev->name, ret);
>>>>>> +     }
>>>>>> +
>>>>>>       spin_lock_irq(&priv->lock);
>>>>>>       if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
>>>>>>               goto out;
>>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]                             ` <CAAk-MO9W-sdi4D+xT6C_TB7uWNMPV8++RRtyoLmJ23HFq11+iw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-04-21 17:32                               ` Jason Gunthorpe
       [not found]                                 ` <20160421173219.GA5814-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2016-04-21 17:32 UTC (permalink / raw)
  To: Erez Shitrit
  Cc: Hal Rosenstock, Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 21, 2016 at 05:11:42PM +0300, Erez Shitrit wrote:

> > This will minimize storm of joins and PR queries at SA on client reregister.
> 
> OK, will change that accordingly.

If you check before joining the broadcast group that should be
enough. The broadcast group must always be joined first and is always
re-joined if the SM changes substantially.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]     ` <1461070287-13469-4-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-04-26 20:16       ` Christoph Lameter
       [not found]         ` <alpine.DEB.2.20.1604261514590.2663-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Christoph Lameter @ 2016-04-26 20:16 UTC (permalink / raw)
  To: Erez Shitrit
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, 19 Apr 2016, Erez Shitrit wrote:

> +/*
> +* There are 4 types of join states:
> +* FullMember, NonMember, SendOnlyNonMember, SendOnlyFullMember.
> +*/
> +enum {
> +	FULLMEMBER_JOIN,
> +	NONMEMBER_JOIN,
> +	SENDONLY_NONMEBER_JOIN,
> +	SENDONLY_FULLMEMBER_JOIN,
> +	NUM_JOIN_MEMBERSHIP_TYPES = 4

It will be 4 because there are 4 definitions before this. No need to
assign a value. The value will be wrong if more definitions are added
later.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]         ` <alpine.DEB.2.20.1604261514590.2663-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
@ 2016-04-26 20:57           ` Hefty, Sean
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045BD0-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Hefty, Sean @ 2016-04-26 20:57 UTC (permalink / raw)
  To: Christoph Lameter, Erez Shitrit
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

> > +* There are 4 types of join states:
> > +* FullMember, NonMember, SendOnlyNonMember, SendOnlyFullMember.
> > +*/
> > +enum {
> > +	FULLMEMBER_JOIN,
> > +	NONMEMBER_JOIN,
> > +	SENDONLY_NONMEBER_JOIN,
> > +	SENDONLY_FULLMEMBER_JOIN,
> > +	NUM_JOIN_MEMBERSHIP_TYPES = 4

These seem better suited as flags -- fullmember (y/n), sendonly (y/n) 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045BD0-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-04-27  6:42               ` Leon Romanovsky
       [not found]                 ` <20160427064239.GO7974-2ukJVAZIZ/Y@public.gmane.org>
  2016-05-02  7:26               ` Erez Shitrit
  1 sibling, 1 reply; 31+ messages in thread
From: Leon Romanovsky @ 2016-04-27  6:42 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Christoph Lameter, Erez Shitrit, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 700 bytes --]

On Tue, Apr 26, 2016 at 08:57:08PM +0000, Hefty, Sean wrote:
> > > +* There are 4 types of join states:
> > > +* FullMember, NonMember, SendOnlyNonMember, SendOnlyFullMember.
> > > +*/
> > > +enum {
> > > +	FULLMEMBER_JOIN,
> > > +	NONMEMBER_JOIN,
> > > +	SENDONLY_NONMEBER_JOIN,
> > > +	SENDONLY_FULLMEMBER_JOIN,
> > > +	NUM_JOIN_MEMBERSHIP_TYPES = 4
> 
> These seem better suited as flags -- fullmember (y/n), sendonly (y/n) 

Thanks, excellent point.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                 ` <20160427064239.GO7974-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-04-29  2:34                   ` Doug Ledford
       [not found]                     ` <5722C839.2070704-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Doug Ledford @ 2016-04-29  2:34 UTC (permalink / raw)
  To: leon-DgEjT+Ai2ygdnm+yROfE0A, Sean
  Cc: Christoph Lameter, Erez Shitrit, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1218 bytes --]

On 04/27/2016 02:42 AM, Leon Romanovsky wrote:
> On Tue, Apr 26, 2016 at 08:57:08PM +0000, Hefty, Sean wrote:
>>>> +* There are 4 types of join states:
>>>> +* FullMember, NonMember, SendOnlyNonMember, SendOnlyFullMember.
>>>> +*/
>>>> +enum {
>>>> +	FULLMEMBER_JOIN,
>>>> +	NONMEMBER_JOIN,
>>>> +	SENDONLY_NONMEBER_JOIN,
>>>> +	SENDONLY_FULLMEMBER_JOIN,
>>>> +	NUM_JOIN_MEMBERSHIP_TYPES = 4
>>
>> These seem better suited as flags -- fullmember (y/n), sendonly (y/n) 
> 
> Thanks, excellent point.

This doesn't seem to make any sense to me.  Without going back and
re-reading this part of the spec, as I recall, there is:

UnJoined
SendOnly Join
Full Join

You can never have a SendOnly_FullMember join.  Once you are FullMember,
you are no longer SendOnly.

Is a NonMember (assuming here that NonMember is referring to what the CA
is listed as according to the SM) even allowed to join, either as
SendOnly or FullMember?  I would have thought if the SM listed that CA
as a NonMember, that any joins would be flat rejected and NonMember join
states wouldn't make any sense.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                     ` <5722C839.2070704-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-04-29 15:50                       ` Hefty, Sean
       [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373AB0475AC-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-04-29 16:27                       ` Hal Rosenstock
  1 sibling, 1 reply; 31+ messages in thread
From: Hefty, Sean @ 2016-04-29 15:50 UTC (permalink / raw)
  To: Doug Ledford, leon-DgEjT+Ai2ygdnm+yROfE0A
  Cc: Christoph Lameter, Erez Shitrit, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

> This doesn't seem to make any sense to me.  Without going back and
> re-reading this part of the spec, as I recall, there is:
> 
> UnJoined
> SendOnly Join
> Full Join

The 1.2.1 version of the spec has:

JoinState - 4 bits

Join/Leave Status requested by the port. See discussion below.
bit 0: FullMember: Include/delete this endport from the multicast group as a
member sender and receiver.
bit 1: NonMember: Include/delete this endport from the multicast group as
a non-member sender and receiver.
bit 2: SendOnlyNonMember: Include/delete this endport from the multicast
group as a non-member sender only.
bit 3: Reserved

• FullMember: Group messages are routed both to and from the port.
The port is considered a member for purposes of group creation and
deletion, i.e.: if no member ports with FullMember=1 remain, the
group may be deleted; otherwise it may not.
• NonMember: Group messages are routed both to and from the port.
The port is not considered a member for purposes of group creation/deletion.
• SendOnlyNonMember: Group messages are only routed from the
port; none are routed to the port. The port is not considered
a member for purposes of group creation/deletion.

Any combination of the MCMemberRecord:JoinState bits may be 1. When
multiple bits are 1, the qualities of the port's type of membership are the
union of the qualities specified by the bits that are 1.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373AB0475AC-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-04-29 16:10                           ` Doug Ledford
  0 siblings, 0 replies; 31+ messages in thread
From: Doug Ledford @ 2016-04-29 16:10 UTC (permalink / raw)
  To: Sean, leon-DgEjT+Ai2ygdnm+yROfE0A
  Cc: Christoph Lameter, Erez Shitrit, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1697 bytes --]

On 04/29/2016 11:50 AM, Hefty, Sean wrote:
>> This doesn't seem to make any sense to me.  Without going back and
>> re-reading this part of the spec, as I recall, there is:
>>
>> UnJoined
>> SendOnly Join
>> Full Join
> 
> The 1.2.1 version of the spec has:
> 
> JoinState - 4 bits
> 
> Join/Leave Status requested by the port. See discussion below.
> bit 0: FullMember: Include/delete this endport from the multicast group as a
> member sender and receiver.
> bit 1: NonMember: Include/delete this endport from the multicast group as
> a non-member sender and receiver.
> bit 2: SendOnlyNonMember: Include/delete this endport from the multicast
> group as a non-member sender only.
> bit 3: Reserved
> 
> • FullMember: Group messages are routed both to and from the port.
> The port is considered a member for purposes of group creation and
> deletion, i.e.: if no member ports with FullMember=1 remain, the
> group may be deleted; otherwise it may not.
> • NonMember: Group messages are routed both to and from the port.
> The port is not considered a member for purposes of group creation/deletion.
> • SendOnlyNonMember: Group messages are only routed from the
> port; none are routed to the port. The port is not considered
> a member for purposes of group creation/deletion.
> 
> Any combination of the MCMemberRecord:JoinState bits may be 1. When
> multiple bits are 1, the qualities of the port's type of membership are the
> union of the qualities specified by the bits that are 1.
> 

OK, that makes more sense.  Thanks for the refresher.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                     ` <5722C839.2070704-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-04-29 15:50                       ` Hefty, Sean
@ 2016-04-29 16:27                       ` Hal Rosenstock
       [not found]                         ` <c99ef08a-77d8-1524-042a-0f269a52028d-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  1 sibling, 1 reply; 31+ messages in thread
From: Hal Rosenstock @ 2016-04-29 16:27 UTC (permalink / raw)
  To: Doug Ledford, leon-DgEjT+Ai2ygdnm+yROfE0A, Sean
  Cc: Christoph Lameter, Erez Shitrit, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/28/2016 10:34 PM, Doug Ledford wrote:
> On 04/27/2016 02:42 AM, Leon Romanovsky wrote:
>> On Tue, Apr 26, 2016 at 08:57:08PM +0000, Hefty, Sean wrote:
>>>>> +* There are 4 types of join states:
>>>>> +* FullMember, NonMember, SendOnlyNonMember, SendOnlyFullMember.
>>>>> +*/
>>>>> +enum {
>>>>> +	FULLMEMBER_JOIN,
>>>>> +	NONMEMBER_JOIN,
>>>>> +	SENDONLY_NONMEBER_JOIN,
>>>>> +	SENDONLY_FULLMEMBER_JOIN,
>>>>> +	NUM_JOIN_MEMBERSHIP_TYPES = 4
>>>
>>> These seem better suited as flags -- fullmember (y/n), sendonly (y/n) 
>>
>> Thanks, excellent point.
> 
> This doesn't seem to make any sense to me.  Without going back and
> re-reading this part of the spec, as I recall, there is:
> 
> UnJoined
> SendOnly Join
> Full Join
> 
> You can never have a SendOnly_FullMember join.  Once you are FullMember,
> you are no longer SendOnly.
> 
> Is a NonMember (assuming here that NonMember is referring to what the CA
> is listed as according to the SM) even allowed to join, either as
> SendOnly or FullMember?  I would have thought if the SM listed that CA
> as a NonMember, that any joins would be flat rejected and NonMember join
> states wouldn't make any sense.

There is now a new SendOnlyFullMember option added by IBTA MgtWG post
IBA 1.3.

-- Hal
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                         ` <c99ef08a-77d8-1524-042a-0f269a52028d-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-04-29 17:12                           ` Jason Gunthorpe
       [not found]                             ` <20160429171246.GA4893-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Gunthorpe @ 2016-04-29 17:12 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Doug Ledford, leon-DgEjT+Ai2ygdnm+yROfE0A, Sean,
	Christoph Lameter, Erez Shitrit, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Fri, Apr 29, 2016 at 12:27:30PM -0400, Hal Rosenstock wrote:
> > You can never have a SendOnly_FullMember join.  Once you are FullMember,
> > you are no longer SendOnly.
> > 
> > Is a NonMember (assuming here that NonMember is referring to what the CA
> > is listed as according to the SM) even allowed to join, either as
> > SendOnly or FullMember?  I would have thought if the SM listed that CA
> > as a NonMember, that any joins would be flat rejected and NonMember join
> > states wouldn't make any sense.
> 
> There is now a new SendOnlyFullMember option added by IBTA MgtWG post
> IBA 1.3.

As I understand it, this is done because the SendOnly Non-Member
essentially has useless reference counting semantics. FullMember
allows the group to be auto-created and continue to exist even if
there is only one SendOnly member.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                             ` <20160429171246.GA4893-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-04-29 17:58                               ` Hal Rosenstock
       [not found]                                 ` <c1bc7c93-53c9-8b71-16eb-2bb92df93b76-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 31+ messages in thread
From: Hal Rosenstock @ 2016-04-29 17:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, leon-DgEjT+Ai2ygdnm+yROfE0A, Sean,
	Christoph Lameter, Erez Shitrit, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/29/2016 1:12 PM, Jason Gunthorpe wrote:
> On Fri, Apr 29, 2016 at 12:27:30PM -0400, Hal Rosenstock wrote:
>>> You can never have a SendOnly_FullMember join.  Once you are FullMember,
>>> you are no longer SendOnly.
>>>
>>> Is a NonMember (assuming here that NonMember is referring to what the CA
>>> is listed as according to the SM) even allowed to join, either as
>>> SendOnly or FullMember?  I would have thought if the SM listed that CA
>>> as a NonMember, that any joins would be flat rejected and NonMember join
>>> states wouldn't make any sense.
>>
>> There is now a new SendOnlyFullMember option added by IBTA MgtWG post
>> IBA 1.3.
> 
> As I understand it, this is done because the SendOnly Non-Member
> essentially has useless reference counting semantics. FullMember
> allows the group to be auto-created and continue to exist even if
> there is only one SendOnly member.

My understanding of this option is similar to the above.

-- Hal

> Jason
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                                 ` <c1bc7c93-53c9-8b71-16eb-2bb92df93b76-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-04-30  1:37                                   ` Christoph Lameter
  0 siblings, 0 replies; 31+ messages in thread
From: Christoph Lameter @ 2016-04-30  1:37 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Jason Gunthorpe, Doug Ledford, leon-DgEjT+Ai2ygdnm+yROfE0A, Sean,
	Erez Shitrit, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Fri, 29 Apr 2016, Hal Rosenstock wrote:

> On 4/29/2016 1:12 PM, Jason Gunthorpe wrote:
> > On Fri, Apr 29, 2016 at 12:27:30PM -0400, Hal Rosenstock wrote:
> >>> You can never have a SendOnly_FullMember join.  Once you are FullMember,
> >>> you are no longer SendOnly.
> >>>
> >>> Is a NonMember (assuming here that NonMember is referring to what the CA
> >>> is listed as according to the SM) even allowed to join, either as
> >>> SendOnly or FullMember?  I would have thought if the SM listed that CA
> >>> as a NonMember, that any joins would be flat rejected and NonMember join
> >>> states wouldn't make any sense.
> >>
> >> There is now a new SendOnlyFullMember option added by IBTA MgtWG post
> >> IBA 1.3.
> >
> > As I understand it, this is done because the SendOnly Non-Member
> > essentially has useless reference counting semantics. FullMember
> > allows the group to be auto-created and continue to exist even if
> > there is only one SendOnly member.
>
> My understanding of this option is similar to the above.

The continued existence of the multicast group is important for Ethernet
gateways etc on the IB fabric. The Ethernet gateways will ignore join
requests from the Ethernet side if there is no IB multicast group.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045BD0-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-04-27  6:42               ` Leon Romanovsky
@ 2016-05-02  7:26               ` Erez Shitrit
       [not found]                 ` <CAAk-MO8Zo2cHo8hiyYdbiSoRG2tgdxcSZPhCDWRUm6KrHCE8EQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 31+ messages in thread
From: Erez Shitrit @ 2016-05-02  7:26 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Christoph Lameter, Erez Shitrit, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 26, 2016 at 11:57 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> > +* There are 4 types of join states:
>> > +* FullMember, NonMember, SendOnlyNonMember, SendOnlyFullMember.
>> > +*/
>> > +enum {
>> > +   FULLMEMBER_JOIN,
>> > +   NONMEMBER_JOIN,
>> > +   SENDONLY_NONMEBER_JOIN,
>> > +   SENDONLY_FULLMEMBER_JOIN,
>> > +   NUM_JOIN_MEMBERSHIP_TYPES = 4
>
> These seem better suited as flags -- fullmember (y/n), sendonly (y/n)

The ib_core tracks over the number of members for each join types, in
order not to double join from specific port to specific join type.
It keeps the numbers of members in special array, entry for each join type.
So, the meaning of the enum is to define the size of the that array
and not to define a bit mask.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
       [not found]                                 ` <20160421173219.GA5814-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-05-02  8:00                                   ` Erez Shitrit
  0 siblings, 0 replies; 31+ messages in thread
From: Erez Shitrit @ 2016-05-02  8:00 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Hal Rosenstock, Erez Shitrit, Doug Ledford, Or Gerlitz,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 21, 2016 at 8:32 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Thu, Apr 21, 2016 at 05:11:42PM +0300, Erez Shitrit wrote:
>
>> > This will minimize storm of joins and PR queries at SA on client reregister.
>>
>> OK, will change that accordingly.
>
> If you check before joining the broadcast group that should be
> enough. The broadcast group must always be joined first and is always
> re-joined if the SM changes substantially.

Thanks, will do that .

>
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast
       [not found]                 ` <CAAk-MO8Zo2cHo8hiyYdbiSoRG2tgdxcSZPhCDWRUm6KrHCE8EQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-05-02 17:02                   ` Hefty, Sean
  0 siblings, 0 replies; 31+ messages in thread
From: Hefty, Sean @ 2016-05-02 17:02 UTC (permalink / raw)
  To: Erez Shitrit
  Cc: Christoph Lameter, Erez Shitrit, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

> The ib_core tracks over the number of members for each join types, in
> order not to double join from specific port to specific join type.
> It keeps the numbers of members in special array, entry for each join type.
> So, the meaning of the enum is to define the size of the that array
> and not to define a bit mask.

They still seem better as flags and are defined as such in the spec.  The number of bits can define the size of an array.  Internal implementation convenience should not drive the API.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH for-next 1/4] IB/core: Add support for get ClassPortInfo from the SA
       [not found]     ` <1461070287-13469-2-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-12 23:17       ` Ira Weiny
  0 siblings, 0 replies; 31+ messages in thread
From: Ira Weiny @ 2016-05-12 23:17 UTC (permalink / raw)
  To: Erez Shitrit
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 19, 2016 at 03:51:24PM +0300, Erez Shitrit wrote:
> 
> Change struct ib_class_port_info to conform to IB Spec 1.3
> That in order to get specific capability mask from ClassPortInfo mad.
> 
> From the IB Spec, ClassPortInfo section:
> 	"CapabilityMask2 Bits 0-26: Additional class-specific capabilities...
> 	 RespTimeValue the rest 5 bits"
> 
> The new struct now has one field that contains both of cap_mask2 (which was
> reserved and resp_time which now is 27 bits for cap_mask2 and 5 bits for
> resp_time in the new field)
> 
> More changes to adjust the new structure:
> IB/qib: Change pma_get_classportinfo
> IB/srpt: Adjust the use of ib_class_port_info
> 
> Signed-off-by: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/hw/qib/qib_mad.c   | 6 ++++--
>  drivers/infiniband/ulp/srpt/ib_srpt.c | 5 ++++-
>  include/rdma/ib_mad.h                 | 4 ++--
>  3 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
> index 0bd1837..c5d029d 100644
> --- a/drivers/infiniband/hw/qib/qib_mad.c
> +++ b/drivers/infiniband/hw/qib/qib_mad.c
> @@ -1158,6 +1158,7 @@ static int pma_get_classportinfo(struct ib_pma_mad *pmp,
>  	struct ib_class_port_info *p =
>  		(struct ib_class_port_info *)pmp->data;
>  	struct qib_devdata *dd = dd_from_ibdev(ibdev);
> +	char *p_cap_mask2;
>  
>  	memset(pmp->data, 0, sizeof(pmp->data));
>  
> @@ -1172,11 +1173,12 @@ static int pma_get_classportinfo(struct ib_pma_mad *pmp,
>  	 * Set the most significant bit of CM2 to indicate support for
>  	 * congestion statistics
>  	 */
> -	p->reserved[0] = dd->psxmitwait_supported << 7;
> +	p_cap_mask2 = (char *)&p->cap_mask2_resp_time;
> +	p_cap_mask2[0] = dd->psxmitwait_supported << 7;
>  	/*
>  	 * Expected response time is 4.096 usec. * 2^18 == 1.073741824 sec.
>  	 */
> -	p->resp_time_value = 18;
> +	p_cap_mask2[3] = 18;

Does this get the endianess right?

To me it would be cleaner to use a temp u32 set the values and then byteswap
into the MAD.

Ira

>  
>  	return reply((struct ib_smp *) pmp);
>  }
> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
> index 0bd3cb2..d12b602 100644
> --- a/drivers/infiniband/ulp/srpt/ib_srpt.c
> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
> @@ -249,12 +249,15 @@ static void srpt_set_ioc(u8 *c_list, u32 slot, u8 value)
>  static void srpt_get_class_port_info(struct ib_dm_mad *mad)
>  {
>  	struct ib_class_port_info *cif;
> +	char *p_cap_mask2;
>  
>  	cif = (struct ib_class_port_info *)mad->data;
>  	memset(cif, 0, sizeof(*cif));
>  	cif->base_version = 1;
>  	cif->class_version = 1;
> -	cif->resp_time_value = 20;
> +
> +	p_cap_mask2 = (char *)&cif->cap_mask2_resp_time;
> +	p_cap_mask2[3] = 20;
>  
>  	mad->mad_hdr.status = 0;
>  }
> diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
> index 37dd534c..2aaf1cb 100644
> --- a/include/rdma/ib_mad.h
> +++ b/include/rdma/ib_mad.h
> @@ -243,8 +243,8 @@ struct ib_class_port_info {
>  	u8			base_version;
>  	u8			class_version;
>  	__be16			capability_mask;
> -	u8			reserved[3];
> -	u8			resp_time_value;
> +	  /* 27 bits for cap_mask2, 5 bits for resp_time */
> +	__be32			cap_mask2_resp_time;
>  	u8			redirect_gid[16];
>  	__be32			redirect_tcslfl;
>  	__be16			redirect_lid;
> -- 
> 1.7.11.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-05-12 23:17 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-19 12:51 [PATCH for-next 0/4] Support SendOnlyFullMember join Erez Shitrit
     [not found] ` <1461070287-13469-1-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-04-19 12:51   ` [PATCH for-next 1/4] IB/core: Add support for get ClassPortInfo from the SA Erez Shitrit
     [not found]     ` <1461070287-13469-2-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-12 23:17       ` Ira Weiny
2016-04-19 12:51   ` [PATCH for-next 2/4] IB/sa: Add support for sa get ClassPortInfo Erez Shitrit
     [not found]     ` <1461070287-13469-3-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-04-20 12:09       ` Hal Rosenstock
     [not found]         ` <57177181.7090408-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-04-20 14:24           ` Erez Shitrit
2016-04-21 12:12           ` Erez Shitrit
2016-04-20 13:12       ` Or Gerlitz
     [not found]         ` <CAJ3xEMihe-54UExbp3ptqH5yqg7dc6a6tKcJw0gNytTECce5TQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-20 13:19           ` Erez Shitrit
2016-04-19 12:51   ` [PATCH for-next 3/4] IB/core: Support new type of join-state for multicast Erez Shitrit
     [not found]     ` <1461070287-13469-4-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-04-26 20:16       ` Christoph Lameter
     [not found]         ` <alpine.DEB.2.20.1604261514590.2663-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
2016-04-26 20:57           ` Hefty, Sean
     [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045BD0-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-04-27  6:42               ` Leon Romanovsky
     [not found]                 ` <20160427064239.GO7974-2ukJVAZIZ/Y@public.gmane.org>
2016-04-29  2:34                   ` Doug Ledford
     [not found]                     ` <5722C839.2070704-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-04-29 15:50                       ` Hefty, Sean
     [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373AB0475AC-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-04-29 16:10                           ` Doug Ledford
2016-04-29 16:27                       ` Hal Rosenstock
     [not found]                         ` <c99ef08a-77d8-1524-042a-0f269a52028d-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-04-29 17:12                           ` Jason Gunthorpe
     [not found]                             ` <20160429171246.GA4893-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-04-29 17:58                               ` Hal Rosenstock
     [not found]                                 ` <c1bc7c93-53c9-8b71-16eb-2bb92df93b76-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-04-30  1:37                                   ` Christoph Lameter
2016-05-02  7:26               ` Erez Shitrit
     [not found]                 ` <CAAk-MO8Zo2cHo8hiyYdbiSoRG2tgdxcSZPhCDWRUm6KrHCE8EQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-05-02 17:02                   ` Hefty, Sean
2016-04-19 12:51   ` [PATCH for-next 4/4] IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join Erez Shitrit
     [not found]     ` <1461070287-13469-5-git-send-email-erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-04-20 12:11       ` Hal Rosenstock
     [not found]         ` <571771FB.90908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-04-20 13:12           ` Erez Shitrit
     [not found]             ` <CAAk-MO_24actZ4FJqOjk0ztR=Z6toSfB-xuyFKeUvtZ8h0Xv3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-20 13:29               ` Hal Rosenstock
     [not found]                 ` <5717843B.7020501-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-04-21 12:52                   ` Erez Shitrit
     [not found]                     ` <CAAk-MO9sie_pFH7wSDhhuEd2uApnz5dMF6E9u6Vgkyimc+1xWA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-21 13:24                       ` Hal Rosenstock
     [not found]                         ` <5718D497.70306-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-04-21 14:11                           ` Erez Shitrit
     [not found]                             ` <CAAk-MO9W-sdi4D+xT6C_TB7uWNMPV8++RRtyoLmJ23HFq11+iw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-21 17:32                               ` Jason Gunthorpe
     [not found]                                 ` <20160421173219.GA5814-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-02  8:00                                   ` Erez Shitrit

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.