* [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC
  To: roland-DgEjT+Ai2ygdnm+yROfE0A; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

The following patch series modifies the kernel MAD processing (ib_mad/ib_umad)
and related interfaces to process Intel Omni-Path Architecture MADs on devices
which support them.

In addition to supporting some IBTA management classes, OPA devices use MADs
with lengths up to 2K.  These "jumbo" MADs increase the performance of
management traffic.
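
(For scale: a 2K OPA MAD is 2048 bytes on the wire.  Patch 06 splits this into
the 24-byte common MAD header plus 2024 data bytes (JUMBO_MGMT_MAD_DATA); with
the additional 12-byte RMPP header the data portion is 2012 bytes
(JUMBO_MGMT_RMPP_DATA).)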

To distinguish OPA MADs from IBTA MADs, a new Base Version is introduced.  The
new format shares the common header with IBTA MADs, which allows us to share
most of the MAD processing code when dealing with the new Base Version.
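
For reference, the common header both formats share is struct ib_mad_hdr from
include/rdma/ib_mad.h; its leading base_version field is what lets the stack
tell the two formats apart on receive:

	struct ib_mad_hdr {
		u8	base_version;	/* IB_MGMT_BASE_VERSION, or the new OPA value */
		u8	mgmt_class;
		u8	class_version;
		u8	method;
		__be16	status;
		__be16	class_specific;
		__be64	tid;
		__be16	attr_id;
		__be16	resv;
		__be32	attr_mod;
	};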


The patch series is broken into 3 main areas.

1) Add the ability for devices to indicate "jumbo" MAD support.  In addition,
   modify the ib_mad module to detect those devices and allocate the resources
   for the QPs on those devices.

2) Enhance the interface to the device agents to support larger and variable
   length MADs.

3) Add support for creating and processing OPA Base Version MADs including 
   a new SMP class version specific to OPA devices.


  [RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad
  [RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device cap
  [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
  [RFC PATCH 04/16] ib/mad: add base version parameter to ib_create_send_mad
  [RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad
  [RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures
  [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
  [RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines
  [RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path
  [RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path
  [RFC PATCH 11/16] ib/mad: create helper function for
  [RFC PATCH 12/16] ib/mad: create helper function for
  [RFC PATCH 13/16] ib/mad: create helper function for
  [RFC PATCH 14/16] ib/mad: Create helper function for SMI processing
  [RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP
  [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General

* [RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

is_rmpp_data_mad is more descriptive for this function.

Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/mad.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..4673262 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1734,7 +1734,7 @@ out:
 	return valid;
 }
 
-static int is_data_mad(struct ib_mad_agent_private *mad_agent_priv,
+static int is_rmpp_data_mad(struct ib_mad_agent_private *mad_agent_priv,
 		       struct ib_mad_hdr *mad_hdr)
 {
 	struct ib_rmpp_mad *rmpp_mad;
@@ -1836,7 +1836,7 @@ ib_find_send_mad(struct ib_mad_agent_private *mad_agent_priv,
 	 * been notified that the send has completed
 	 */
 	list_for_each_entry(wr, &mad_agent_priv->send_list, agent_list) {
-		if (is_data_mad(mad_agent_priv, wr->send_buf.mad) &&
+		if (is_rmpp_data_mad(mad_agent_priv, wr->send_buf.mad) &&
 		    wr->tid == mad->mad_hdr.tid &&
 		    wr->timeout &&
 		    rcv_has_same_class(wr, wc) &&
@@ -2411,7 +2411,8 @@ find_send_wr(struct ib_mad_agent_private *mad_agent_priv,
 
 	list_for_each_entry(mad_send_wr, &mad_agent_priv->send_list,
 			    agent_list) {
-		if (is_data_mad(mad_agent_priv, mad_send_wr->send_buf.mad) &&
+		if (is_rmpp_data_mad(mad_agent_priv,
+				     mad_send_wr->send_buf.mad) &&
 		    &mad_send_wr->send_buf == send_buf)
 			return mad_send_wr;
 	}
-- 
1.8.2


* [RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device cap flag
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add a device capability flag for OPA devices to signal their support of "jumbo"
MADs.
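
A device would advertise this from its query_device handler; a one-line sketch
(no driver sets the flag yet at this point in the series, and "props" is just
the conventional name for the struct ib_device_attr being filled in):

	props->device_cap_flags |= IB_DEVICE_JUMBO_MAD_SUPPORT;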

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 include/rdma/ib_verbs.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 470a011..04be3c8 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -123,7 +123,8 @@ enum ib_device_cap_flags {
 	IB_DEVICE_MEM_WINDOW_TYPE_2A	= (1<<23),
 	IB_DEVICE_MEM_WINDOW_TYPE_2B	= (1<<24),
 	IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29),
-	IB_DEVICE_SIGNATURE_HANDOVER	= (1<<30)
+	IB_DEVICE_SIGNATURE_HANDOVER	= (1<<30),
+	IB_DEVICE_JUMBO_MAD_SUPPORT	= (1<<31)
 };
 
 enum ib_signature_prot_cap {
-- 
1.8.2


* [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Check for IB_DEVICE_JUMBO_MAD_SUPPORT in the device capabilities and, if it is
supported, mark the special QPs created on that device.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/mad.c      | 26 +++++++++++++++++++++++---
 drivers/infiniband/core/mad_priv.h |  1 +
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 4673262..9f5641d 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2855,7 +2855,8 @@ static void init_mad_queue(struct ib_mad_qp_info *qp_info,
 }
 
 static void init_mad_qp(struct ib_mad_port_private *port_priv,
-			struct ib_mad_qp_info *qp_info)
+			struct ib_mad_qp_info *qp_info,
+			int supports_jumbo_mads)
 {
 	qp_info->port_priv = port_priv;
 	init_mad_queue(qp_info, &qp_info->send_queue);
@@ -2865,6 +2866,7 @@ static void init_mad_qp(struct ib_mad_port_private *port_priv,
 	qp_info->snoop_table = NULL;
 	qp_info->snoop_table_size = 0;
 	atomic_set(&qp_info->snoop_count, 0);
+	qp_info->supports_jumbo_mads = supports_jumbo_mads;
 }
 
 static int create_mad_qp(struct ib_mad_qp_info *qp_info,
@@ -2911,6 +2913,17 @@ static void destroy_mad_qp(struct ib_mad_qp_info *qp_info)
 	kfree(qp_info->snoop_table);
 }
 
+static int
+mad_device_supports_jumbo_mads(struct ib_device *device)
+{
+	struct ib_device_attr attr;
+
+	if (!ib_query_device(device, &attr))
+		return (attr.device_cap_flags
+			& IB_DEVICE_JUMBO_MAD_SUPPORT);
+	return 0;
+}
+
 /*
  * Open the port
  * Create the QP, PD, MR, and CQ if needed
@@ -2923,6 +2936,7 @@ static int ib_mad_port_open(struct ib_device *device,
 	unsigned long flags;
 	char name[sizeof "ib_mad123"];
 	int has_smi;
+	int supports_jumbo_mads;
 
 	/* Create new device info */
 	port_priv = kzalloc(sizeof *port_priv, GFP_KERNEL);
@@ -2935,8 +2949,14 @@ static int ib_mad_port_open(struct ib_device *device,
 	port_priv->port_num = port_num;
 	spin_lock_init(&port_priv->reg_lock);
 	INIT_LIST_HEAD(&port_priv->agent_list);
-	init_mad_qp(port_priv, &port_priv->qp_info[0]);
-	init_mad_qp(port_priv, &port_priv->qp_info[1]);
+
+	supports_jumbo_mads = mad_device_supports_jumbo_mads(device);
+	if (supports_jumbo_mads)
+		pr_info("Jumbo MAD support enabled for %s:%d\n",
+				device->name, port_num);
+
+	init_mad_qp(port_priv, &port_priv->qp_info[0], supports_jumbo_mads);
+	init_mad_qp(port_priv, &port_priv->qp_info[1], supports_jumbo_mads);
 
 	cq_size = mad_sendq_size + mad_recvq_size;
 	has_smi = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND;
diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h
index d1a0b0e..4b4110d 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -192,6 +192,7 @@ struct ib_mad_qp_info {
 	struct ib_mad_snoop_private **snoop_table;
 	int snoop_table_size;
 	atomic_t snoop_count;
+	int supports_jumbo_mads;
 };
 
 struct ib_mad_port_private {
-- 
1.8.2


* [RFC PATCH 04/16] ib/mad: add base version parameter to ib_create_send_mad
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

In preparation for supporting the new OPA MAD Base Version, add a base version
parameter to ib_create_send_mad and set it to IB_MGMT_BASE_VERSION for current
users.

The definition of the new base version and its processing will occur in later
patches.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/agent.c         | 3 ++-
 drivers/infiniband/core/cm.c            | 6 ++++--
 drivers/infiniband/core/mad.c           | 3 ++-
 drivers/infiniband/core/mad_rmpp.c      | 6 ++++--
 drivers/infiniband/core/sa_query.c      | 3 ++-
 drivers/infiniband/core/user_mad.c      | 3 ++-
 drivers/infiniband/hw/mlx4/mad.c        | 3 ++-
 drivers/infiniband/hw/mthca/mthca_mad.c | 3 ++-
 drivers/infiniband/hw/qib/qib_iba7322.c | 3 ++-
 drivers/infiniband/hw/qib/qib_mad.c     | 3 ++-
 drivers/infiniband/ulp/srpt/ib_srpt.c   | 3 ++-
 include/rdma/ib_mad.h                   | 4 +++-
 12 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..b6bd305 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -108,7 +108,8 @@ void agent_send_response(struct ib_mad *mad, struct ib_grh *grh,
 
 	send_buf = ib_create_send_mad(agent, wc->src_qp, wc->pkey_index, 0,
 				      IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
-				      GFP_KERNEL);
+				      GFP_KERNEL,
+				      IB_MGMT_BASE_VERSION);
 	if (IS_ERR(send_buf)) {
 		dev_err(&device->dev, "ib_create_send_mad error\n");
 		goto err1;
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..5767781 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -267,7 +267,8 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
 	m = ib_create_send_mad(mad_agent, cm_id_priv->id.remote_cm_qpn,
 			       cm_id_priv->av.pkey_index,
 			       0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
-			       GFP_ATOMIC);
+			       GFP_ATOMIC,
+			       IB_MGMT_BASE_VERSION);
 	if (IS_ERR(m)) {
 		ib_destroy_ah(ah);
 		return PTR_ERR(m);
@@ -297,7 +298,8 @@ static int cm_alloc_response_msg(struct cm_port *port,
 
 	m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
 			       0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
-			       GFP_ATOMIC);
+			       GFP_ATOMIC,
+			       IB_MGMT_BASE_VERSION);
 	if (IS_ERR(m)) {
 		ib_destroy_ah(ah);
 		return PTR_ERR(m);
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 9f5641d..0cb91fc 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -920,7 +920,8 @@ struct ib_mad_send_buf * ib_create_send_mad(struct ib_mad_agent *mad_agent,
 					    u32 remote_qpn, u16 pkey_index,
 					    int rmpp_active,
 					    int hdr_len, int data_len,
-					    gfp_t gfp_mask)
+					    gfp_t gfp_mask,
+					    u8 base_version)
 {
 	struct ib_mad_agent_private *mad_agent_priv;
 	struct ib_mad_send_wr_private *mad_send_wr;
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index f37878c..2379e2d 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -139,7 +139,8 @@ static void ack_recv(struct mad_rmpp_recv *rmpp_recv,
 	hdr_len = ib_get_mad_data_offset(recv_wc->recv_buf.mad->mad_hdr.mgmt_class);
 	msg = ib_create_send_mad(&rmpp_recv->agent->agent, recv_wc->wc->src_qp,
 				 recv_wc->wc->pkey_index, 1, hdr_len,
-				 0, GFP_KERNEL);
+				 0, GFP_KERNEL,
+				 IB_MGMT_BASE_VERSION);
 	if (IS_ERR(msg))
 		return;
 
@@ -165,7 +166,8 @@ static struct ib_mad_send_buf *alloc_response_msg(struct ib_mad_agent *agent,
 	hdr_len = ib_get_mad_data_offset(recv_wc->recv_buf.mad->mad_hdr.mgmt_class);
 	msg = ib_create_send_mad(agent, recv_wc->wc->src_qp,
 				 recv_wc->wc->pkey_index, 1,
-				 hdr_len, 0, GFP_KERNEL);
+				 hdr_len, 0, GFP_KERNEL,
+				 IB_MGMT_BASE_VERSION);
 	if (IS_ERR(msg))
 		ib_destroy_ah(ah);
 	else {
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index c38f030..32c3fe6 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -583,7 +583,8 @@ static int alloc_mad(struct ib_sa_query *query, gfp_t gfp_mask)
 	query->mad_buf = ib_create_send_mad(query->port->agent, 1,
 					    query->sm_ah->pkey_index,
 					    0, IB_MGMT_SA_HDR, IB_MGMT_SA_DATA,
-					    gfp_mask);
+					    gfp_mask,
+					    IB_MGMT_BASE_VERSION);
 	if (IS_ERR(query->mad_buf)) {
 		kref_put(&query->sm_ah->ref, free_sm_ah);
 		return -ENOMEM;
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 928cdd2..66019bd 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -521,7 +521,8 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
 	packet->msg = ib_create_send_mad(agent,
 					 be32_to_cpu(packet->mad.hdr.qpn),
 					 packet->mad.hdr.pkey_index, rmpp_active,
-					 hdr_len, data_len, GFP_KERNEL);
+					 hdr_len, data_len, GFP_KERNEL,
+					 IB_MGMT_BASE_VERSION);
 	if (IS_ERR(packet->msg)) {
 		ret = PTR_ERR(packet->msg);
 		goto err_ah;
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 82a7dd8..9098906 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -358,7 +358,8 @@ static void forward_trap(struct mlx4_ib_dev *dev, u8 port_num, struct ib_mad *ma
 
 	if (agent) {
 		send_buf = ib_create_send_mad(agent, qpn, 0, 0, IB_MGMT_MAD_HDR,
-					      IB_MGMT_MAD_DATA, GFP_ATOMIC);
+					      IB_MGMT_MAD_DATA, GFP_ATOMIC,
+					      IB_MGMT_BASE_VERSION);
 		if (IS_ERR(send_buf))
 			return;
 		/*
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 8881fa3..2a34cd2 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -170,7 +170,8 @@ static void forward_trap(struct mthca_dev *dev,
 
 	if (agent) {
 		send_buf = ib_create_send_mad(agent, qpn, 0, 0, IB_MGMT_MAD_HDR,
-					      IB_MGMT_MAD_DATA, GFP_ATOMIC);
+					      IB_MGMT_MAD_DATA, GFP_ATOMIC,
+					      IB_MGMT_BASE_VERSION);
 		if (IS_ERR(send_buf))
 			return;
 		/*
diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c
index a7eb325..d41a4170 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -5488,7 +5488,8 @@ static void try_7322_ipg(struct qib_pportdata *ppd)
 		goto retry;
 
 	send_buf = ib_create_send_mad(agent, 0, 0, 0, IB_MGMT_MAD_HDR,
-				      IB_MGMT_MAD_DATA, GFP_ATOMIC);
+				      IB_MGMT_MAD_DATA, GFP_ATOMIC,
+				      IB_MGMT_BASE_VERSION);
 	if (IS_ERR(send_buf))
 		goto retry;
 
diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 636be11..2861304 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -83,7 +83,8 @@ static void qib_send_trap(struct qib_ibport *ibp, void *data, unsigned len)
 		return;
 
 	send_buf = ib_create_send_mad(agent, 0, 0, 0, IB_MGMT_MAD_HDR,
-				      IB_MGMT_MAD_DATA, GFP_ATOMIC);
+				      IB_MGMT_MAD_DATA, GFP_ATOMIC,
+				      IB_MGMT_BASE_VERSION);
 	if (IS_ERR(send_buf))
 		return;
 
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index d28a8c2..f3db6d6 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -477,7 +477,8 @@ static void srpt_mad_recv_handler(struct ib_mad_agent *mad_agent,
 	rsp = ib_create_send_mad(mad_agent, mad_wc->wc->src_qp,
 				 mad_wc->wc->pkey_index, 0,
 				 IB_MGMT_DEVICE_HDR, IB_MGMT_DEVICE_DATA,
-				 GFP_KERNEL);
+				 GFP_KERNEL,
+				 IB_MGMT_BASE_VERSION);
 	if (IS_ERR(rsp))
 		goto err_rsp;
 
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 9bb99e9..4149a11 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -618,6 +618,7 @@ int ib_process_mad_wc(struct ib_mad_agent *mad_agent,
  *   automatically adjust the allocated buffer size to account for any
  *   additional padding that may be necessary.
  * @gfp_mask: GFP mask used for the memory allocation.
+ * @base_version: Base Version of this MAD
  *
  * This routine allocates a MAD for sending.  The returned MAD send buffer
  * will reference a data buffer usable for sending a MAD, along
@@ -633,7 +634,8 @@ struct ib_mad_send_buf *ib_create_send_mad(struct ib_mad_agent *mad_agent,
 					   u32 remote_qpn, u16 pkey_index,
 					   int rmpp_active,
 					   int hdr_len, int data_len,
-					   gfp_t gfp_mask);
+					   gfp_t gfp_mask,
+					   u8 base_version);
 
 /**
  * ib_is_mad_class_rmpp - returns whether given management class
-- 
1.8.2


* [RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

In support of variable-length MADs, add in and out MAD size parameters to the
process_mad call.

The out MAD size parameter is passed by reference so that the agent can update
it to indicate the proper response length for the MAD stack to send.

The in and out MAD parameters are made generic by specifying them as
struct ib_mad_hdr pointers.

Finally, all MAD sizes are set to the current IB MAD size, and devices which
use MADs are modified to verify the MAD sizes passed to them.
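
Put together, the contract looks like the following sketch of a hypothetical
driver (the function name is illustrative; the size checks match the pattern
the hunks below add to each existing driver):

	static int foo_process_mad(struct ib_device *ibdev, int mad_flags,
				   u8 port_num, struct ib_wc *in_wc,
				   struct ib_grh *in_grh,
				   struct ib_mad_hdr *in, size_t in_mad_size,
				   struct ib_mad_hdr *out, size_t *out_mad_size)
	{
		/* Reject MAD sizes this device was not built to handle. */
		if (in_mad_size != sizeof(struct ib_mad) ||
		    *out_mad_size != sizeof(struct ib_mad))
			return IB_MAD_RESULT_FAILURE;

		/* ... process 'in' and build the reply in 'out' ... */

		/* Report the actual response length back to the MAD stack. */
		*out_mad_size = sizeof(struct ib_mad);
		return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY;
	}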

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/mad.c                | 14 ++++++++++----
 drivers/infiniband/core/sysfs.c              |  4 +++-
 drivers/infiniband/hw/amso1100/c2_provider.c |  5 ++++-
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |  5 ++++-
 drivers/infiniband/hw/cxgb4/provider.c       |  7 +++++--
 drivers/infiniband/hw/ehca/ehca_sqp.c        |  8 +++++++-
 drivers/infiniband/hw/ipath/ipath_mad.c      |  8 +++++++-
 drivers/infiniband/hw/ipath/ipath_verbs.h    |  3 ++-
 drivers/infiniband/hw/mlx4/mad.c             |  9 ++++++++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h         |  3 ++-
 drivers/infiniband/hw/mlx5/mad.c             |  8 +++++++-
 drivers/infiniband/hw/mthca/mthca_dev.h      |  4 ++--
 drivers/infiniband/hw/mthca/mthca_mad.c      |  9 +++++++--
 drivers/infiniband/hw/nes/nes_verbs.c        |  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c     |  3 ++-
 drivers/infiniband/hw/qib/qib_mad.c          |  8 +++++++-
 drivers/infiniband/hw/qib/qib_verbs.h        |  3 ++-
 include/rdma/ib_verbs.h                      |  8 +++++---
 18 files changed, 86 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 0cb91fc..59ea90d 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -736,6 +736,8 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 	u8 port_num;
 	struct ib_wc mad_wc;
 	struct ib_send_wr *send_wr = &mad_send_wr->send_wr;
+	size_t in_mad_size = sizeof(struct ib_mad);
+	size_t out_mad_size = sizeof(struct ib_mad);
 
 	if (device->node_type == RDMA_NODE_IB_SWITCH &&
 	    smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
@@ -786,8 +788,9 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 
 	/* No GRH for DR SMP */
 	ret = device->process_mad(device, 0, port_num, &mad_wc, NULL,
-				  (struct ib_mad *)smp,
-				  (struct ib_mad *)&mad_priv->mad);
+				  (struct ib_mad_hdr *)smp, in_mad_size,
+				  (struct ib_mad_hdr *)&mad_priv->mad,
+				  &out_mad_size);
 	switch (ret)
 	{
 	case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY:
@@ -2038,11 +2041,14 @@ static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv,
 local:
 	/* Give driver "right of first refusal" on incoming MAD */
 	if (port_priv->device->process_mad) {
+		size_t resp_mad_size = sizeof(struct ib_mad);
 		ret = port_priv->device->process_mad(port_priv->device, 0,
 						     port_priv->port_num,
 						     wc, &recv->grh,
-						     &recv->mad.mad,
-						     &response->mad.mad);
+						     (struct ib_mad_hdr *)&recv->mad.mad,
+						     sizeof(struct ib_mad),
+						     (struct ib_mad_hdr *)&response->mad.mad,
+						     &resp_mad_size);
 		if (ret & IB_MAD_RESULT_SUCCESS) {
 			if (ret & IB_MAD_RESULT_CONSUMED)
 				goto out;
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index cbd0383..fbe2dc5 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -347,7 +347,9 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
 	in_mad->data[41] = p->port_num;	/* PortSelect field */
 
 	if ((p->ibdev->process_mad(p->ibdev, IB_MAD_IGNORE_MKEY,
-		 p->port_num, NULL, NULL, in_mad, out_mad) &
+		 p->port_num, NULL, NULL,
+		 (struct ib_mad_hdr *)in_mad, sizeof(*in_mad),
+		 (struct ib_mad_hdr *)out_mad, NULL) &
 	     (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) !=
 	    (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) {
 		ret = -EINVAL;
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
index 2d5cbf4..e982cb2 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -584,7 +584,10 @@ static int c2_process_mad(struct ib_device *ibdev,
 			  u8 port_num,
 			  struct ib_wc *in_wc,
 			  struct ib_grh *in_grh,
-			  struct ib_mad *in_mad, struct ib_mad *out_mad)
+			  struct ib_mad_hdr *in_mad,
+			  size_t in_mad_size,
+			  struct ib_mad_hdr *out_mad,
+			  size_t *out_mad_size)
 {
 	pr_debug("%s:%u\n", __func__, __LINE__);
 	return -ENOSYS;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 811b24a..30c157a 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -87,7 +87,10 @@ static int iwch_process_mad(struct ib_device *ibdev,
 			    u8 port_num,
 			    struct ib_wc *in_wc,
 			    struct ib_grh *in_grh,
-			    struct ib_mad *in_mad, struct ib_mad *out_mad)
+			    struct ib_mad_hdr *in_mad,
+			    size_t in_mad_size,
+			    struct ib_mad_hdr *out_mad,
+			    size_t *out_mad_size)
 {
 	return -ENOSYS;
 }
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 72e3b69..673d6e178 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -81,8 +81,11 @@ static int c4iw_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
 
 static int c4iw_process_mad(struct ib_device *ibdev, int mad_flags,
 			    u8 port_num, struct ib_wc *in_wc,
-			    struct ib_grh *in_grh, struct ib_mad *in_mad,
-			    struct ib_mad *out_mad)
+			    struct ib_grh *in_grh,
+			    struct ib_mad_hdr *in_mad,
+			    size_t in_mad_size,
+			    struct ib_mad_hdr *out_mad,
+			    size_t *out_mad_size)
 {
 	return -ENOSYS;
 }
diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c
index dba8f9f..d4ed490 100644
--- a/drivers/infiniband/hw/ehca/ehca_sqp.c
+++ b/drivers/infiniband/hw/ehca/ehca_sqp.c
@@ -218,9 +218,15 @@ perf_reply:
 
 int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 		     struct ib_wc *in_wc, struct ib_grh *in_grh,
-		     struct ib_mad *in_mad, struct ib_mad *out_mad)
+		     struct ib_mad_hdr *in, size_t in_mad_size,
+		     struct ib_mad_hdr *out, size_t *out_mad_size)
 {
 	int ret;
+	struct ib_mad *in_mad = (struct ib_mad *)in;
+	struct ib_mad *out_mad = (struct ib_mad *)out;
+
+	if (in_mad_size != sizeof(*in_mad) || *out_mad_size != sizeof(*out_mad))
+		return IB_MAD_RESULT_FAILURE;
 
 	if (!port_num || port_num > ibdev->phys_port_cnt || !in_wc)
 		return IB_MAD_RESULT_FAILURE;
diff --git a/drivers/infiniband/hw/ipath/ipath_mad.c b/drivers/infiniband/hw/ipath/ipath_mad.c
index e890e5b..d554089 100644
--- a/drivers/infiniband/hw/ipath/ipath_mad.c
+++ b/drivers/infiniband/hw/ipath/ipath_mad.c
@@ -1491,9 +1491,15 @@ bail:
  */
 int ipath_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 		      struct ib_wc *in_wc, struct ib_grh *in_grh,
-		      struct ib_mad *in_mad, struct ib_mad *out_mad)
+		      struct ib_mad_hdr *in, size_t in_mad_size,
+		      struct ib_mad_hdr *out, size_t *out_mad_size)
 {
 	int ret;
+	struct ib_mad *in_mad = (struct ib_mad *)in;
+	struct ib_mad *out_mad = (struct ib_mad *)out;
+
+	if (in_mad_size != sizeof(*in_mad) || *out_mad_size != sizeof(*out_mad))
+		return IB_MAD_RESULT_FAILURE;
 
 	switch (in_mad->mad_hdr.mgmt_class) {
 	case IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE:
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.h b/drivers/infiniband/hw/ipath/ipath_verbs.h
index ae6cff4..cd8dd09 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.h
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.h
@@ -703,7 +703,8 @@ int ipath_process_mad(struct ib_device *ibdev,
 		      u8 port_num,
 		      struct ib_wc *in_wc,
 		      struct ib_grh *in_grh,
-		      struct ib_mad *in_mad, struct ib_mad *out_mad);
+		      struct ib_mad_hdr *in, size_t in_mad_size,
+		      struct ib_mad_hdr *out, size_t *out_mad_size);
 
 /*
  * Compare the lower 24 bits of the two values.
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 9098906..cd97722 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -856,8 +856,15 @@ static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 
 int mlx4_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 			struct ib_wc *in_wc, struct ib_grh *in_grh,
-			struct ib_mad *in_mad, struct ib_mad *out_mad)
+			struct ib_mad_hdr *in, size_t in_mad_size,
+			struct ib_mad_hdr *out, size_t *out_mad_size)
 {
+	struct ib_mad *in_mad = (struct ib_mad *)in;
+	struct ib_mad *out_mad = (struct ib_mad *)out;
+
+	if (in_mad_size != sizeof(*in_mad) || *out_mad_size != sizeof(*out_mad))
+		return IB_MAD_RESULT_FAILURE;
+
 	switch (rdma_port_get_link_layer(ibdev, port_num)) {
 	case IB_LINK_LAYER_INFINIBAND:
 		return ib_process_mad(ibdev, mad_flags, port_num, in_wc,
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 6eb743f..c5960fe 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -690,7 +690,8 @@ int mlx4_MAD_IFC(struct mlx4_ib_dev *dev, int mad_ifc_flags,
 		 void *in_mad, void *response_mad);
 int mlx4_ib_process_mad(struct ib_device *ibdev, int mad_flags,	u8 port_num,
 			struct ib_wc *in_wc, struct ib_grh *in_grh,
-			struct ib_mad *in_mad, struct ib_mad *out_mad);
+			struct ib_mad_hdr *in, size_t in_mad_size,
+			struct ib_mad_hdr *out, size_t *out_mad_size);
 int mlx4_ib_mad_init(struct mlx4_ib_dev *dev);
 void mlx4_ib_mad_cleanup(struct mlx4_ib_dev *dev);
 
diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c
index b514bbb..69d1827 100644
--- a/drivers/infiniband/hw/mlx5/mad.c
+++ b/drivers/infiniband/hw/mlx5/mad.c
@@ -59,10 +59,16 @@ int mlx5_MAD_IFC(struct mlx5_ib_dev *dev, int ignore_mkey, int ignore_bkey,
 
 int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 			struct ib_wc *in_wc, struct ib_grh *in_grh,
-			struct ib_mad *in_mad, struct ib_mad *out_mad)
+			struct ib_mad_hdr *in, size_t in_mad_size,
+			struct ib_mad_hdr *out, size_t *out_mad_size)
 {
 	u16 slid;
 	int err;
+	struct ib_mad *in_mad = (struct ib_mad *)in;
+	struct ib_mad *out_mad = (struct ib_mad *)out;
+
+	if (in_mad_size != sizeof(*in_mad) || *out_mad_size != sizeof(*out_mad))
+		return IB_MAD_RESULT_FAILURE;
 
 	slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
 
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 7e6a6d6..60d15f1 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -578,8 +578,8 @@ int mthca_process_mad(struct ib_device *ibdev,
 		      u8 port_num,
 		      struct ib_wc *in_wc,
 		      struct ib_grh *in_grh,
-		      struct ib_mad *in_mad,
-		      struct ib_mad *out_mad);
+		      struct ib_mad_hdr *in, size_t in_mad_size,
+		      struct ib_mad_hdr *out, size_t *out_mad_size);
 int mthca_create_agents(struct mthca_dev *dev);
 void mthca_free_agents(struct mthca_dev *dev);
 
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 2a34cd2..83817da 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -198,13 +198,18 @@ int mthca_process_mad(struct ib_device *ibdev,
 		      u8 port_num,
 		      struct ib_wc *in_wc,
 		      struct ib_grh *in_grh,
-		      struct ib_mad *in_mad,
-		      struct ib_mad *out_mad)
+		      struct ib_mad_hdr *in, size_t in_mad_size,
+		      struct ib_mad_hdr *out, size_t *out_mad_size)
 {
 	int err;
 	u16 slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
 	u16 prev_lid = 0;
 	struct ib_port_attr pattr;
+	struct ib_mad *in_mad = (struct ib_mad *)in;
+	struct ib_mad *out_mad = (struct ib_mad *)out;
+
+	if (in_mad_size != sizeof(*in_mad) || *out_mad_size != sizeof(*out_mad))
+		return IB_MAD_RESULT_FAILURE;
 
 	/* Forward locally generated traps to the SM */
 	if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP &&
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index fef067c..73bc51a 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -3223,7 +3223,8 @@ static int nes_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
  */
 static int nes_process_mad(struct ib_device *ibdev, int mad_flags,
 		u8 port_num, struct ib_wc *in_wc, struct ib_grh *in_grh,
-		struct ib_mad *in_mad, struct ib_mad *out_mad)
+		struct ib_mad_hdr *in, size_t in_mad_size,
+		struct ib_mad_hdr *out, size_t *out_mad_size)
 {
 	nes_debug(NES_DBG_INIT, "\n");
 	return -ENOSYS;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
index ac02ce4..5606fa5 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
@@ -192,7 +192,8 @@ int ocrdma_process_mad(struct ib_device *ibdev,
 		       u8 port_num,
 		       struct ib_wc *in_wc,
 		       struct ib_grh *in_grh,
-		       struct ib_mad *in_mad, struct ib_mad *out_mad)
+		       struct ib_mad_hdr *in, size_t in_mad_size,
+		       struct ib_mad_hdr *out, size_t *out_mad_size)
 {
 	return IB_MAD_RESULT_SUCCESS;
 }
diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 2861304..cfc1be7 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -2402,11 +2402,17 @@ bail:
  */
 int qib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port,
 		    struct ib_wc *in_wc, struct ib_grh *in_grh,
-		    struct ib_mad *in_mad, struct ib_mad *out_mad)
+		    struct ib_mad_hdr *in, size_t in_mad_size,
+		    struct ib_mad_hdr *out, size_t *out_mad_size)
 {
 	int ret;
 	struct qib_ibport *ibp = to_iport(ibdev, port);
 	struct qib_pportdata *ppd = ppd_from_ibp(ibp);
+	struct ib_mad *in_mad = (struct ib_mad *)in;
+	struct ib_mad *out_mad = (struct ib_mad *)out;
+
+	if (in_mad_size != sizeof(*in_mad) || *out_mad_size != sizeof(*out_mad))
+		return IB_MAD_RESULT_FAILURE;
 
 	switch (in_mad->mad_hdr.mgmt_class) {
 	case IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE:
diff --git a/drivers/infiniband/hw/qib/qib_verbs.h b/drivers/infiniband/hw/qib/qib_verbs.h
index bfc8948..77f1d31 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.h
+++ b/drivers/infiniband/hw/qib/qib_verbs.h
@@ -873,7 +873,8 @@ void qib_sys_guid_chg(struct qib_ibport *ibp);
 void qib_node_desc_chg(struct qib_ibport *ibp);
 int qib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 		    struct ib_wc *in_wc, struct ib_grh *in_grh,
-		    struct ib_mad *in_mad, struct ib_mad *out_mad);
+		    struct ib_mad_hdr *in, size_t in_mad_size,
+		    struct ib_mad_hdr *out, size_t *out_mad_size);
 int qib_create_agents(struct qib_ibdev *dev);
 void qib_free_agents(struct qib_ibdev *dev);
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 04be3c8..59de0a6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1363,7 +1363,7 @@ struct ib_flow {
 	struct ib_uobject	*uobject;
 };
 
-struct ib_mad;
+struct ib_mad_hdr;
 struct ib_grh;
 
 enum ib_process_mad_flags {
@@ -1595,8 +1595,10 @@ struct ib_device {
 						  u8 port_num,
 						  struct ib_wc *in_wc,
 						  struct ib_grh *in_grh,
-						  struct ib_mad *in_mad,
-						  struct ib_mad *out_mad);
+						  struct ib_mad_hdr *in_mad,
+						  size_t in_mad_size,
+						  struct ib_mad_hdr *out_mad,
+						  size_t *out_mad_size);
 	struct ib_xrcd *	   (*alloc_xrcd)(struct ib_device *device,
 						 struct ib_ucontext *ucontext,
 						 struct ib_udata *udata);
-- 
1.8.2


* [RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Define jumbo_mad, jumbo_rmpp_mad, and jumbo_mad_private structures.

Create an RMPP base header (struct ib_rmpp_base) that is shared between
ib_rmpp_mad and jumbo_rmpp_mad.

Update code to use the new structures.
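
Both RMPP-capable formats now begin with the same 36-byte struct ib_rmpp_base
(the 24-byte MAD header plus the 12-byte RMPP header); only the trailing data
array differs: 220 bytes (IB_MGMT_RMPP_DATA) in ib_rmpp_mad versus 2012 bytes
(JUMBO_MGMT_RMPP_DATA) in jumbo_rmpp_mad.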

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/mad.c      |  18 +++---
 drivers/infiniband/core/mad_priv.h |  19 +++++-
 drivers/infiniband/core/mad_rmpp.c | 120 ++++++++++++++++++-------------------
 drivers/infiniband/core/user_mad.c |  16 ++---
 include/rdma/ib_mad.h              |  26 +++++++-
 5 files changed, 119 insertions(+), 80 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 59ea90d..aecd54e 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -876,7 +876,7 @@ static int alloc_send_rmpp_list(struct ib_mad_send_wr_private *send_wr,
 				gfp_t gfp_mask)
 {
 	struct ib_mad_send_buf *send_buf = &send_wr->send_buf;
-	struct ib_rmpp_mad *rmpp_mad = send_buf->mad;
+	struct ib_rmpp_base *rmpp_base = send_buf->mad;
 	struct ib_rmpp_segment *seg = NULL;
 	int left, seg_size, pad;
 
@@ -902,10 +902,10 @@ static int alloc_send_rmpp_list(struct ib_mad_send_wr_private *send_wr,
 	if (pad)
 		memset(seg->data + seg_size - pad, 0, pad);
 
-	rmpp_mad->rmpp_hdr.rmpp_version = send_wr->mad_agent_priv->
+	rmpp_base->rmpp_hdr.rmpp_version = send_wr->mad_agent_priv->
 					  agent.rmpp_version;
-	rmpp_mad->rmpp_hdr.rmpp_type = IB_MGMT_RMPP_TYPE_DATA;
-	ib_set_rmpp_flags(&rmpp_mad->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
+	rmpp_base->rmpp_hdr.rmpp_type = IB_MGMT_RMPP_TYPE_DATA;
+	ib_set_rmpp_flags(&rmpp_base->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
 
 	send_wr->cur_seg = container_of(send_wr->rmpp_list.next,
 					struct ib_rmpp_segment, list);
@@ -1741,14 +1741,14 @@ out:
 static int is_rmpp_data_mad(struct ib_mad_agent_private *mad_agent_priv,
 		       struct ib_mad_hdr *mad_hdr)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 
-	rmpp_mad = (struct ib_rmpp_mad *)mad_hdr;
+	rmpp_base = (struct ib_rmpp_base *)mad_hdr;
 	return !mad_agent_priv->agent.rmpp_version ||
 		!ib_mad_kernel_rmpp_agent(&mad_agent_priv->agent) ||
-		!(ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) &
+		!(ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) &
 				    IB_MGMT_RMPP_FLAG_ACTIVE) ||
-		(rmpp_mad->rmpp_hdr.rmpp_type == IB_MGMT_RMPP_TYPE_DATA);
+		(rmpp_base->rmpp_hdr.rmpp_type == IB_MGMT_RMPP_TYPE_DATA);
 }
 
 static inline int rcv_has_same_class(struct ib_mad_send_wr_private *wr,
@@ -1890,7 +1890,7 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
 			spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
 			if (!ib_mad_kernel_rmpp_agent(&mad_agent_priv->agent)
 			   && ib_is_mad_class_rmpp(mad_recv_wc->recv_buf.mad->mad_hdr.mgmt_class)
-			   && (ib_get_rmpp_flags(&((struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad)->rmpp_hdr)
+			   && (ib_get_rmpp_flags(&((struct ib_rmpp_base *)mad_recv_wc->recv_buf.mad)->rmpp_hdr)
 					& IB_MGMT_RMPP_FLAG_ACTIVE)) {
 				/* user rmpp is in effect
 				 * and this is an active RMPP MAD
diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h
index 4b4110d..c1b5f36 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -83,6 +83,23 @@ struct ib_mad_private {
 	} mad;
 } __attribute__ ((packed));
 
+/**
+ * While it might be possible to define this as part of the ib_mad_private by
+ * simply extending the union there, we want to prevent posting > 256B MADs on
+ * RDMA hardware that does not support it.
+ *
+ * Furthermore, this allows us to use the smaller kmem_caches on
+ * non-jumbo-capable devices for less memory usage.
+ */
+struct jumbo_mad_private {
+	struct ib_mad_private_header header;
+	struct ib_grh grh;
+	union {
+		struct jumbo_mad mad;
+		struct jumbo_rmpp_mad rmpp_mad;
+	} mad;
+} __packed;
+
 struct ib_rmpp_segment {
 	struct list_head list;
 	u32 num;
@@ -147,7 +164,7 @@ struct ib_mad_send_wr_private {
 
 struct ib_mad_local_private {
 	struct list_head completion_list;
-	struct ib_mad_private *mad_priv;
+	struct ib_mad_private *mad_priv; /* can be struct jumbo_mad_private */
 	struct ib_mad_agent_private *recv_mad_agent;
 	struct ib_mad_send_wr_private *mad_send_wr;
 };
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 2379e2d..7184530 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -111,10 +111,10 @@ void ib_cancel_rmpp_recvs(struct ib_mad_agent_private *agent)
 }
 
 static void format_ack(struct ib_mad_send_buf *msg,
-		       struct ib_rmpp_mad *data,
+		       struct ib_rmpp_base *data,
 		       struct mad_rmpp_recv *rmpp_recv)
 {
-	struct ib_rmpp_mad *ack = msg->mad;
+	struct ib_rmpp_base *ack = msg->mad;
 	unsigned long flags;
 
 	memcpy(ack, &data->mad_hdr, msg->hdr_len);
@@ -144,7 +144,7 @@ static void ack_recv(struct mad_rmpp_recv *rmpp_recv,
 	if (IS_ERR(msg))
 		return;
 
-	format_ack(msg, (struct ib_rmpp_mad *) recv_wc->recv_buf.mad, rmpp_recv);
+	format_ack(msg, (struct ib_rmpp_base *) recv_wc->recv_buf.mad, rmpp_recv);
 	msg->ah = rmpp_recv->ah;
 	ret = ib_post_send_mad(msg, NULL);
 	if (ret)
@@ -182,20 +182,20 @@ static void ack_ds_ack(struct ib_mad_agent_private *agent,
 		       struct ib_mad_recv_wc *recv_wc)
 {
 	struct ib_mad_send_buf *msg;
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 	int ret;
 
 	msg = alloc_response_msg(&agent->agent, recv_wc);
 	if (IS_ERR(msg))
 		return;
 
-	rmpp_mad = msg->mad;
-	memcpy(rmpp_mad, recv_wc->recv_buf.mad, msg->hdr_len);
+	rmpp_base = msg->mad;
+	memcpy(rmpp_base, recv_wc->recv_buf.mad, msg->hdr_len);
 
-	rmpp_mad->mad_hdr.method ^= IB_MGMT_METHOD_RESP;
-	ib_set_rmpp_flags(&rmpp_mad->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
-	rmpp_mad->rmpp_hdr.seg_num = 0;
-	rmpp_mad->rmpp_hdr.paylen_newwin = cpu_to_be32(1);
+	rmpp_base->mad_hdr.method ^= IB_MGMT_METHOD_RESP;
+	ib_set_rmpp_flags(&rmpp_base->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
+	rmpp_base->rmpp_hdr.seg_num = 0;
+	rmpp_base->rmpp_hdr.paylen_newwin = cpu_to_be32(1);
 
 	ret = ib_post_send_mad(msg, NULL);
 	if (ret) {
@@ -215,23 +215,23 @@ static void nack_recv(struct ib_mad_agent_private *agent,
 		      struct ib_mad_recv_wc *recv_wc, u8 rmpp_status)
 {
 	struct ib_mad_send_buf *msg;
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 	int ret;
 
 	msg = alloc_response_msg(&agent->agent, recv_wc);
 	if (IS_ERR(msg))
 		return;
 
-	rmpp_mad = msg->mad;
-	memcpy(rmpp_mad, recv_wc->recv_buf.mad, msg->hdr_len);
+	rmpp_base = msg->mad;
+	memcpy(rmpp_base, recv_wc->recv_buf.mad, msg->hdr_len);
 
-	rmpp_mad->mad_hdr.method ^= IB_MGMT_METHOD_RESP;
-	rmpp_mad->rmpp_hdr.rmpp_version = IB_MGMT_RMPP_VERSION;
-	rmpp_mad->rmpp_hdr.rmpp_type = IB_MGMT_RMPP_TYPE_ABORT;
-	ib_set_rmpp_flags(&rmpp_mad->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
-	rmpp_mad->rmpp_hdr.rmpp_status = rmpp_status;
-	rmpp_mad->rmpp_hdr.seg_num = 0;
-	rmpp_mad->rmpp_hdr.paylen_newwin = 0;
+	rmpp_base->mad_hdr.method ^= IB_MGMT_METHOD_RESP;
+	rmpp_base->rmpp_hdr.rmpp_version = IB_MGMT_RMPP_VERSION;
+	rmpp_base->rmpp_hdr.rmpp_type = IB_MGMT_RMPP_TYPE_ABORT;
+	ib_set_rmpp_flags(&rmpp_base->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
+	rmpp_base->rmpp_hdr.rmpp_status = rmpp_status;
+	rmpp_base->rmpp_hdr.seg_num = 0;
+	rmpp_base->rmpp_hdr.paylen_newwin = 0;
 
 	ret = ib_post_send_mad(msg, NULL);
 	if (ret) {
@@ -373,18 +373,18 @@ insert_rmpp_recv(struct ib_mad_agent_private *agent,
 
 static inline int get_last_flag(struct ib_mad_recv_buf *seg)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 
-	rmpp_mad = (struct ib_rmpp_mad *) seg->mad;
-	return ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) & IB_MGMT_RMPP_FLAG_LAST;
+	rmpp_base = (struct ib_rmpp_base *) seg->mad;
+	return ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) & IB_MGMT_RMPP_FLAG_LAST;
 }
 
 static inline int get_seg_num(struct ib_mad_recv_buf *seg)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 
-	rmpp_mad = (struct ib_rmpp_mad *) seg->mad;
-	return be32_to_cpu(rmpp_mad->rmpp_hdr.seg_num);
+	rmpp_base = (struct ib_rmpp_base *) seg->mad;
+	return be32_to_cpu(rmpp_base->rmpp_hdr.seg_num);
 }
 
 static inline struct ib_mad_recv_buf * get_next_seg(struct list_head *rmpp_list,
@@ -436,9 +436,9 @@ static inline int get_mad_len(struct mad_rmpp_recv *rmpp_recv)
 
 	rmpp_mad = (struct ib_rmpp_mad *)rmpp_recv->cur_seg_buf->mad;
 
-	hdr_size = ib_get_mad_data_offset(rmpp_mad->mad_hdr.mgmt_class);
+	hdr_size = ib_get_mad_data_offset(rmpp_mad->base.mad_hdr.mgmt_class);
 	data_size = sizeof(struct ib_rmpp_mad) - hdr_size;
-	pad = IB_MGMT_RMPP_DATA - be32_to_cpu(rmpp_mad->rmpp_hdr.paylen_newwin);
+	pad = IB_MGMT_RMPP_DATA - be32_to_cpu(rmpp_mad->base.rmpp_hdr.paylen_newwin);
 	if (pad > IB_MGMT_RMPP_DATA || pad < 0)
 		pad = 0;
 
@@ -567,20 +567,20 @@ static int send_next_seg(struct ib_mad_send_wr_private *mad_send_wr)
 	u32 paylen = 0;
 
 	rmpp_mad = mad_send_wr->send_buf.mad;
-	ib_set_rmpp_flags(&rmpp_mad->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
-	rmpp_mad->rmpp_hdr.seg_num = cpu_to_be32(++mad_send_wr->seg_num);
+	ib_set_rmpp_flags(&rmpp_mad->base.rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
+	rmpp_mad->base.rmpp_hdr.seg_num = cpu_to_be32(++mad_send_wr->seg_num);
 
 	if (mad_send_wr->seg_num == 1) {
-		rmpp_mad->rmpp_hdr.rmpp_rtime_flags |= IB_MGMT_RMPP_FLAG_FIRST;
+		rmpp_mad->base.rmpp_hdr.rmpp_rtime_flags |= IB_MGMT_RMPP_FLAG_FIRST;
 		paylen = mad_send_wr->send_buf.seg_count * IB_MGMT_RMPP_DATA -
 			 mad_send_wr->pad;
 	}
 
 	if (mad_send_wr->seg_num == mad_send_wr->send_buf.seg_count) {
-		rmpp_mad->rmpp_hdr.rmpp_rtime_flags |= IB_MGMT_RMPP_FLAG_LAST;
+		rmpp_mad->base.rmpp_hdr.rmpp_rtime_flags |= IB_MGMT_RMPP_FLAG_LAST;
 		paylen = IB_MGMT_RMPP_DATA - mad_send_wr->pad;
 	}
-	rmpp_mad->rmpp_hdr.paylen_newwin = cpu_to_be32(paylen);
+	rmpp_mad->base.rmpp_hdr.paylen_newwin = cpu_to_be32(paylen);
 
 	/* 2 seconds for an ACK until we can find the packet lifetime */
 	timeout = mad_send_wr->send_buf.timeout_ms;
@@ -644,19 +644,19 @@ static void process_rmpp_ack(struct ib_mad_agent_private *agent,
 			     struct ib_mad_recv_wc *mad_recv_wc)
 {
 	struct ib_mad_send_wr_private *mad_send_wr;
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 	unsigned long flags;
 	int seg_num, newwin, ret;
 
-	rmpp_mad = (struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad;
-	if (rmpp_mad->rmpp_hdr.rmpp_status) {
+	rmpp_base = (struct ib_rmpp_base *)mad_recv_wc->recv_buf.mad;
+	if (rmpp_base->rmpp_hdr.rmpp_status) {
 		abort_send(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
 		nack_recv(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
 		return;
 	}
 
-	seg_num = be32_to_cpu(rmpp_mad->rmpp_hdr.seg_num);
-	newwin = be32_to_cpu(rmpp_mad->rmpp_hdr.paylen_newwin);
+	seg_num = be32_to_cpu(rmpp_base->rmpp_hdr.seg_num);
+	newwin = be32_to_cpu(rmpp_base->rmpp_hdr.paylen_newwin);
 	if (newwin < seg_num) {
 		abort_send(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_W2S);
 		nack_recv(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_W2S);
@@ -741,7 +741,7 @@ process_rmpp_data(struct ib_mad_agent_private *agent,
 	struct ib_rmpp_hdr *rmpp_hdr;
 	u8 rmpp_status;
 
-	rmpp_hdr = &((struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad)->rmpp_hdr;
+	rmpp_hdr = &((struct ib_rmpp_base *)mad_recv_wc->recv_buf.mad)->rmpp_hdr;
 
 	if (rmpp_hdr->rmpp_status) {
 		rmpp_status = IB_MGMT_RMPP_STATUS_BAD_STATUS;
@@ -770,30 +770,30 @@ bad:
 static void process_rmpp_stop(struct ib_mad_agent_private *agent,
 			      struct ib_mad_recv_wc *mad_recv_wc)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 
-	rmpp_mad = (struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad;
+	rmpp_base = (struct ib_rmpp_base *)mad_recv_wc->recv_buf.mad;
 
-	if (rmpp_mad->rmpp_hdr.rmpp_status != IB_MGMT_RMPP_STATUS_RESX) {
+	if (rmpp_base->rmpp_hdr.rmpp_status != IB_MGMT_RMPP_STATUS_RESX) {
 		abort_send(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
 		nack_recv(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
 	} else
-		abort_send(agent, mad_recv_wc, rmpp_mad->rmpp_hdr.rmpp_status);
+		abort_send(agent, mad_recv_wc, rmpp_base->rmpp_hdr.rmpp_status);
 }
 
 static void process_rmpp_abort(struct ib_mad_agent_private *agent,
 			       struct ib_mad_recv_wc *mad_recv_wc)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 
-	rmpp_mad = (struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad;
+	rmpp_base = (struct ib_rmpp_base *)mad_recv_wc->recv_buf.mad;
 
-	if (rmpp_mad->rmpp_hdr.rmpp_status < IB_MGMT_RMPP_STATUS_ABORT_MIN ||
-	    rmpp_mad->rmpp_hdr.rmpp_status > IB_MGMT_RMPP_STATUS_ABORT_MAX) {
+	if (rmpp_base->rmpp_hdr.rmpp_status < IB_MGMT_RMPP_STATUS_ABORT_MIN ||
+	    rmpp_base->rmpp_hdr.rmpp_status > IB_MGMT_RMPP_STATUS_ABORT_MAX) {
 		abort_send(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
 		nack_recv(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
 	} else
-		abort_send(agent, mad_recv_wc, rmpp_mad->rmpp_hdr.rmpp_status);
+		abort_send(agent, mad_recv_wc, rmpp_base->rmpp_hdr.rmpp_status);
 }
 
 struct ib_mad_recv_wc *
@@ -803,16 +803,16 @@ ib_process_rmpp_recv_wc(struct ib_mad_agent_private *agent,
 	struct ib_rmpp_mad *rmpp_mad;
 
 	rmpp_mad = (struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad;
-	if (!(rmpp_mad->rmpp_hdr.rmpp_rtime_flags & IB_MGMT_RMPP_FLAG_ACTIVE))
+	if (!(rmpp_mad->base.rmpp_hdr.rmpp_rtime_flags & IB_MGMT_RMPP_FLAG_ACTIVE))
 		return mad_recv_wc;
 
-	if (rmpp_mad->rmpp_hdr.rmpp_version != IB_MGMT_RMPP_VERSION) {
+	if (rmpp_mad->base.rmpp_hdr.rmpp_version != IB_MGMT_RMPP_VERSION) {
 		abort_send(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_UNV);
 		nack_recv(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_UNV);
 		goto out;
 	}
 
-	switch (rmpp_mad->rmpp_hdr.rmpp_type) {
+	switch (rmpp_mad->base.rmpp_hdr.rmpp_type) {
 	case IB_MGMT_RMPP_TYPE_DATA:
 		return process_rmpp_data(agent, mad_recv_wc);
 	case IB_MGMT_RMPP_TYPE_ACK:
@@ -873,11 +873,11 @@ int ib_send_rmpp_mad(struct ib_mad_send_wr_private *mad_send_wr)
 	int ret;
 
 	rmpp_mad = mad_send_wr->send_buf.mad;
-	if (!(ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) &
+	if (!(ib_get_rmpp_flags(&rmpp_mad->base.rmpp_hdr) &
 	      IB_MGMT_RMPP_FLAG_ACTIVE))
 		return IB_RMPP_RESULT_UNHANDLED;
 
-	if (rmpp_mad->rmpp_hdr.rmpp_type != IB_MGMT_RMPP_TYPE_DATA) {
+	if (rmpp_mad->base.rmpp_hdr.rmpp_type != IB_MGMT_RMPP_TYPE_DATA) {
 		mad_send_wr->seg_num = 1;
 		return IB_RMPP_RESULT_INTERNAL;
 	}
@@ -895,15 +895,15 @@ int ib_send_rmpp_mad(struct ib_mad_send_wr_private *mad_send_wr)
 int ib_process_rmpp_send_wc(struct ib_mad_send_wr_private *mad_send_wr,
 			    struct ib_mad_send_wc *mad_send_wc)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 	int ret;
 
-	rmpp_mad = mad_send_wr->send_buf.mad;
-	if (!(ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) &
+	rmpp_base = mad_send_wr->send_buf.mad;
+	if (!(ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) &
 	      IB_MGMT_RMPP_FLAG_ACTIVE))
 		return IB_RMPP_RESULT_UNHANDLED; /* RMPP not active */
 
-	if (rmpp_mad->rmpp_hdr.rmpp_type != IB_MGMT_RMPP_TYPE_DATA)
+	if (rmpp_base->rmpp_hdr.rmpp_type != IB_MGMT_RMPP_TYPE_DATA)
 		return IB_RMPP_RESULT_INTERNAL;	 /* ACK, STOP, or ABORT */
 
 	if (mad_send_wc->status != IB_WC_SUCCESS ||
@@ -933,11 +933,11 @@ int ib_process_rmpp_send_wc(struct ib_mad_send_wr_private *mad_send_wr,
 
 int ib_retry_rmpp(struct ib_mad_send_wr_private *mad_send_wr)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 	int ret;
 
-	rmpp_mad = mad_send_wr->send_buf.mad;
-	if (!(ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) &
+	rmpp_base = mad_send_wr->send_buf.mad;
+	if (!(ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) &
 	      IB_MGMT_RMPP_FLAG_ACTIVE))
 		return IB_RMPP_RESULT_UNHANDLED; /* RMPP not active */
 
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 66019bd..3b4b614 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -448,7 +448,7 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
 	struct ib_mad_agent *agent;
 	struct ib_ah_attr ah_attr;
 	struct ib_ah *ah;
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 	__be64 *tid;
 	int ret, data_len, hdr_len, copy_offset, rmpp_active;
 
@@ -504,13 +504,13 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
 		goto err_up;
 	}
 
-	rmpp_mad = (struct ib_rmpp_mad *) packet->mad.data;
-	hdr_len = ib_get_mad_data_offset(rmpp_mad->mad_hdr.mgmt_class);
+	rmpp_base = (struct ib_rmpp_base *) packet->mad.data;
+	hdr_len = ib_get_mad_data_offset(rmpp_base->mad_hdr.mgmt_class);
 
-	if (ib_is_mad_class_rmpp(rmpp_mad->mad_hdr.mgmt_class)
+	if (ib_is_mad_class_rmpp(rmpp_base->mad_hdr.mgmt_class)
 	    && ib_mad_kernel_rmpp_agent(agent)) {
 		copy_offset = IB_MGMT_RMPP_HDR;
-		rmpp_active = ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) &
+		rmpp_active = ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) &
 						IB_MGMT_RMPP_FLAG_ACTIVE;
 	} else {
 		copy_offset = IB_MGMT_MAD_HDR;
@@ -558,12 +558,12 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
 		tid = &((struct ib_mad_hdr *) packet->msg->mad)->tid;
 		*tid = cpu_to_be64(((u64) agent->hi_tid) << 32 |
 				   (be64_to_cpup(tid) & 0xffffffff));
-		rmpp_mad->mad_hdr.tid = *tid;
+		rmpp_base->mad_hdr.tid = *tid;
 	}
 
 	if (!ib_mad_kernel_rmpp_agent(agent)
-	   && ib_is_mad_class_rmpp(rmpp_mad->mad_hdr.mgmt_class)
-	   && (ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) & IB_MGMT_RMPP_FLAG_ACTIVE)) {
+	   && ib_is_mad_class_rmpp(rmpp_base->mad_hdr.mgmt_class)
+	   && (ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) & IB_MGMT_RMPP_FLAG_ACTIVE)) {
 		spin_lock_irq(&file->send_lock);
 		list_add_tail(&packet->list, &file->send_list);
 		spin_unlock_irq(&file->send_lock);
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 4149a11..1fdf856 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -135,6 +135,11 @@ enum {
 	IB_MGMT_SA_DATA = 200,
 	IB_MGMT_DEVICE_HDR = 64,
 	IB_MGMT_DEVICE_DATA = 192,
+
+	JUMBO_MGMT_MAD_HDR = IB_MGMT_MAD_HDR,
+	JUMBO_MGMT_MAD_DATA = 2024,
+	JUMBO_MGMT_RMPP_HDR = IB_MGMT_RMPP_HDR,
+	JUMBO_MGMT_RMPP_DATA = 2012,
 };
 
 struct ib_mad_hdr {
@@ -181,12 +186,26 @@ struct ib_mad {
 	u8			data[IB_MGMT_MAD_DATA];
 };
 
-struct ib_rmpp_mad {
+struct jumbo_mad {
+	struct ib_mad_hdr	mad_hdr;
+	u8			data[JUMBO_MGMT_MAD_DATA];
+};
+
+struct ib_rmpp_base {
 	struct ib_mad_hdr	mad_hdr;
 	struct ib_rmpp_hdr	rmpp_hdr;
+} __packed;
+
+struct ib_rmpp_mad {
+	struct ib_rmpp_base	base;
 	u8			data[IB_MGMT_RMPP_DATA];
 };
 
+struct jumbo_rmpp_mad {
+	struct ib_rmpp_base	base;
+	u8			data[JUMBO_MGMT_RMPP_DATA];
+};
+
 struct ib_sa_mad {
 	struct ib_mad_hdr	mad_hdr;
 	struct ib_rmpp_hdr	rmpp_hdr;
@@ -401,7 +420,10 @@ struct ib_mad_send_wc {
 struct ib_mad_recv_buf {
 	struct list_head	list;
 	struct ib_grh		*grh;
-	struct ib_mad		*mad;
+	union {
+		struct ib_mad		*mad;
+		struct jumbo_mad	*jumbo_mad;
+	};
 };
 
 /**
-- 
1.8.2
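
The jumbo data lengths above follow directly from a 2048-byte MAD minus the
shared headers (24-byte ib_mad_hdr, 12-byte ib_rmpp_hdr).  A compile-time
sanity sketch, not part of the patch, assuming those header sizes:

static inline void check_jumbo_mad_sizes(void)
{
	BUILD_BUG_ON(sizeof(struct jumbo_mad) != 2048);
	BUILD_BUG_ON(sizeof(struct jumbo_rmpp_mad) != 2048);
	/* 2048 minus the 24-byte common MAD header */
	BUILD_BUG_ON(JUMBO_MGMT_MAD_DATA != 2048 - sizeof(struct ib_mad_hdr));
	/* 2048 minus the packed 36-byte MAD + RMPP header */
	BUILD_BUG_ON(JUMBO_MGMT_RMPP_DATA != 2048 - sizeof(struct ib_rmpp_base));
}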


* [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Create the jumbo MAD kmem_cache, and flag jumbo MAD private structures in the
cache constructor so they are always freed back to the correct cache.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
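The flagging relies on slab constructor semantics: the constructor runs once
per object when a slab page is populated, so the flag survives every
alloc/free cycle as long as nothing clears header.flags.  A minimal sketch of
the pattern, with hypothetical names:

#include <linux/slab.h>

struct tagged_priv {
	unsigned long flags;	/* persists across alloc/free cycles */
	u8 payload[2048];
};

#define PRIV_FLAG_JUMBO (1UL << 0)

static void tagged_ctor(void *obj)
{
	struct tagged_priv *p = obj;

	p->flags = PRIV_FLAG_JUMBO;	/* set once per object lifetime */
}

static struct kmem_cache *tagged_cache;

static int __init tagged_cache_init(void)
{
	tagged_cache = kmem_cache_create("tagged_jumbo",
					 sizeof(struct tagged_priv), 0,
					 SLAB_HWCACHE_ALIGN, tagged_ctor);
	return tagged_cache ? 0 : -ENOMEM;
}

This is what lets mad_priv_cache_free() pick the right cache from the object
itself instead of threading a size or flag through every caller.
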
 drivers/infiniband/core/mad.c      | 86 +++++++++++++++++++++++++++++++-------
 drivers/infiniband/core/mad_priv.h |  4 ++
 2 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index aecd54e..cde1d5d 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -60,6 +60,7 @@ module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
 MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");
 
 static struct kmem_cache *ib_mad_cache;
+struct kmem_cache *jumbo_mad_cache;
 
 static struct list_head ib_mad_port_list;
 static u32 ib_mad_client_id = 0;
@@ -85,6 +86,14 @@ static int add_nonoui_reg_req(struct ib_mad_reg_req *mad_reg_req,
 static int add_oui_reg_req(struct ib_mad_reg_req *mad_reg_req,
 			   struct ib_mad_agent_private *agent_priv);
 
+static void mad_priv_cache_free(struct ib_mad_private *mad_priv)
+{
+	if (mad_priv->header.flags & IB_MAD_PRIV_FLAG_JUMBO)
+		kmem_cache_free(jumbo_mad_cache, mad_priv);
+	else
+		kmem_cache_free(ib_mad_cache, mad_priv);
+}
+
 /*
  * Returns a ib_mad_port_private structure or NULL for a device/port
  * Assumes ib_mad_port_list_lock is being held
@@ -773,7 +782,12 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 	}
 	local->mad_priv = NULL;
 	local->recv_mad_agent = NULL;
-	mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
+
+	if (mad_agent_priv->qp_info->supports_jumbo_mads)
+		mad_priv = kmem_cache_alloc(jumbo_mad_cache, GFP_ATOMIC);
+	else
+		mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
+
 	if (!mad_priv) {
 		ret = -ENOMEM;
 		dev_err(&device->dev, "No memory for local response MAD\n");
@@ -804,10 +818,10 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 			 */
 			atomic_inc(&mad_agent_priv->refcount);
 		} else
-			kmem_cache_free(ib_mad_cache, mad_priv);
+			mad_priv_cache_free(mad_priv);
 		break;
 	case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED:
-		kmem_cache_free(ib_mad_cache, mad_priv);
+		mad_priv_cache_free(mad_priv);
 		break;
 	case IB_MAD_RESULT_SUCCESS:
 		/* Treat like an incoming receive MAD */
@@ -823,14 +837,14 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 			 * No receiving agent so drop packet and
 			 * generate send completion.
 			 */
-			kmem_cache_free(ib_mad_cache, mad_priv);
+			mad_priv_cache_free(mad_priv);
 			break;
 		}
 		local->mad_priv = mad_priv;
 		local->recv_mad_agent = recv_mad_agent;
 		break;
 	default:
-		kmem_cache_free(ib_mad_cache, mad_priv);
+		mad_priv_cache_free(mad_priv);
 		kfree(local);
 		ret = -EINVAL;
 		goto out;
@@ -1241,7 +1255,7 @@ void ib_free_recv_mad(struct ib_mad_recv_wc *mad_recv_wc)
 					    recv_wc);
 		priv = container_of(mad_priv_hdr, struct ib_mad_private,
 				    header);
-		kmem_cache_free(ib_mad_cache, priv);
+		mad_priv_cache_free(priv);
 	}
 }
 EXPORT_SYMBOL(ib_free_recv_mad);
@@ -2081,8 +2095,10 @@ out:
 	/* Post another receive request for this QP */
 	if (response) {
 		ib_mad_post_receive_mads(qp_info, response);
-		if (recv)
+		if (recv) {
+			BUG_ON(recv->header.flags & IB_MAD_PRIV_FLAG_JUMBO);
 			kmem_cache_free(ib_mad_cache, recv);
+		}
 	} else
 		ib_mad_post_receive_mads(qp_info, recv);
 }
@@ -2542,7 +2558,7 @@ local_send_completion:
 		spin_lock_irqsave(&mad_agent_priv->lock, flags);
 		atomic_dec(&mad_agent_priv->refcount);
 		if (free_mad)
-			kmem_cache_free(ib_mad_cache, local->mad_priv);
+			mad_priv_cache_free(local->mad_priv);
 		kfree(local);
 	}
 	spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
@@ -2709,6 +2725,7 @@ static int ib_mad_post_receive_mads(struct ib_mad_qp_info *qp_info,
 					    sizeof *mad_priv -
 					      sizeof mad_priv->header,
 					    DMA_FROM_DEVICE);
+			BUG_ON(mad_priv->header.flags & IB_MAD_PRIV_FLAG_JUMBO);
 			kmem_cache_free(ib_mad_cache, mad_priv);
 			dev_err(&qp_info->port_priv->device->dev,
 				"ib_post_recv failed: %d\n", ret);
@@ -2744,12 +2761,21 @@ static void cleanup_recv_queue(struct ib_mad_qp_info *qp_info)
 		/* Remove from posted receive MAD list */
 		list_del(&mad_list->list);
 
-		ib_dma_unmap_single(qp_info->port_priv->device,
-				    recv->header.mapping,
-				    sizeof(struct ib_mad_private) -
-				      sizeof(struct ib_mad_private_header),
-				    DMA_FROM_DEVICE);
-		kmem_cache_free(ib_mad_cache, recv);
+		if (recv->header.flags & IB_MAD_PRIV_FLAG_JUMBO) {
+			ib_dma_unmap_single(qp_info->port_priv->device,
+					    recv->header.mapping,
+					    sizeof(struct jumbo_mad_private) -
+					      sizeof(struct ib_mad_private_header),
+					    DMA_FROM_DEVICE);
+			kmem_cache_free(jumbo_mad_cache, recv);
+		} else {
+			ib_dma_unmap_single(qp_info->port_priv->device,
+					    recv->header.mapping,
+					    sizeof(struct ib_mad_private) -
+					      sizeof(struct ib_mad_private_header),
+					    DMA_FROM_DEVICE);
+			kmem_cache_free(ib_mad_cache, recv);
+		}
 	}
 
 	qp_info->recv_queue.count = 0;
@@ -3157,6 +3183,20 @@ static struct ib_client mad_client = {
 	.remove = ib_mad_remove_device
 };
 
+static void init_ib_mad_private(void *obj)
+{
+	struct ib_mad_private *mp = (struct ib_mad_private *)obj;
+
+	mp->header.flags = 0;
+}
+
+static void init_jumbo_mad_private(void *obj)
+{
+	struct jumbo_mad_private *mp = (struct jumbo_mad_private *)obj;
+
+	mp->header.flags = IB_MAD_PRIV_FLAG_JUMBO;
+}
+
 static int __init ib_mad_init_module(void)
 {
 	int ret;
@@ -3171,23 +3211,36 @@ static int __init ib_mad_init_module(void)
 					 sizeof(struct ib_mad_private),
 					 0,
 					 SLAB_HWCACHE_ALIGN,
-					 NULL);
+					 init_ib_mad_private);
 	if (!ib_mad_cache) {
 		pr_err("Couldn't create ib_mad cache\n");
 		ret = -ENOMEM;
 		goto error1;
 	}
 
+	jumbo_mad_cache = kmem_cache_create("ib_mad_jumbo",
+					 sizeof(struct jumbo_mad_private),
+					 0,
+					 SLAB_HWCACHE_ALIGN,
+					 init_jumbo_mad_private);
+	if (!jumbo_mad_cache) {
+		pr_err("Couldn't create ib_mad cache\n");
+		ret = -ENOMEM;
+		goto error2;
+	}
+
 	INIT_LIST_HEAD(&ib_mad_port_list);
 
 	if (ib_register_client(&mad_client)) {
 		pr_err("Couldn't register ib_mad client\n");
 		ret = -EINVAL;
-		goto error2;
+		goto error3;
 	}
 
 	return 0;
 
+error3:
+	kmem_cache_destroy(jumbo_mad_cache);
 error2:
 	kmem_cache_destroy(ib_mad_cache);
 error1:
@@ -3197,6 +3250,7 @@ error1:
 static void __exit ib_mad_cleanup_module(void)
 {
 	ib_unregister_client(&mad_client);
+	kmem_cache_destroy(jumbo_mad_cache);
 	kmem_cache_destroy(ib_mad_cache);
 }
 
diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h
index c1b5f36..206187a 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -66,11 +66,15 @@ struct ib_mad_list_head {
 	struct ib_mad_queue *mad_queue;
 };
 
+enum ib_mad_private_flags {
+	IB_MAD_PRIV_FLAG_JUMBO = (1 << 0)
+};
 struct ib_mad_private_header {
 	struct ib_mad_list_head mad_list;
 	struct ib_mad_recv_wc recv_wc;
 	struct ib_wc wc;
 	u64 mapping;
+	u64 flags;
 } __attribute__ ((packed));
 
 struct ib_mad_private {
-- 
1.8.2


* [RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

OPA_MIN_CLASS_VERSION -- OPA class versions are >= 0x80
OPA_SMP_CLASS_VERSION -- defined as 0x80
OPA_MGMT_BASE_VERSION -- defined as 0x80

Increase the maximum management version to accommodate the OPA class versions.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
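Since the registration table is indexed by class version, raising
MAX_MGMT_VERSION to 0x83 makes room for the OPA versions at and above 0x80
while leaving the IB versions untouched.  A hypothetical helper showing the
intended split:

static inline bool is_opa_mgmt_class_version(u8 class_version)
{
	/* IB class versions stay below 0x80; OPA occupies 0x80..0x82 */
	return class_version >= OPA_MIN_CLASS_VERSION;
}
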
 drivers/infiniband/core/mad_priv.h | 4 +++-
 include/rdma/ib_mad.h              | 5 ++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h
index 206187a..29ed8dd 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -56,11 +56,13 @@
 
 /* Registration table sizes */
 #define MAX_MGMT_CLASS		80
-#define MAX_MGMT_VERSION	8
+#define MAX_MGMT_VERSION	0x83
 #define MAX_MGMT_OUI		8
 #define MAX_MGMT_VENDOR_RANGE2	(IB_MGMT_CLASS_VENDOR_RANGE2_END - \
 				IB_MGMT_CLASS_VENDOR_RANGE2_START + 1)
 
+#define OPA_MIN_CLASS_VERSION	0x80
+
 struct ib_mad_list_head {
 	struct list_head list;
 	struct ib_mad_queue *mad_queue;
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 1fdf856..ec94c01 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -42,8 +42,11 @@
 #include <rdma/ib_verbs.h>
 #include <uapi/rdma/ib_user_mad.h>
 
-/* Management base version */
+/* Management base versions */
 #define IB_MGMT_BASE_VERSION			1
+#define OPA_MGMT_BASE_VERSION			0x80
+
+#define OPA_SMP_CLASS_VERSION			0x80
 
 /* Management classes */
 #define IB_MGMT_CLASS_SUBN_LID_ROUTED		0x01
-- 
1.8.2


* [RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path Architecture base version MADs in ib_create_send_mad
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If the MAD has an OPA base version, verify that the device supports jumbo MADs.
Set the MAD size and sg list lengths as appropriate.
Split RMPP MADs into segments based on the larger MAD size.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
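The padding math in get_pad_size() is unchanged apart from the parameterized
segment size.  A worked sketch for an OPA RMPP send, assuming the 36-byte
MAD + RMPP header:

/*
 * seg_size = 2048 - 36 = 2012 payload bytes per segment
 * data_len = 5000  ->  pad = 2012 - (5000 % 2012) = 1036
 * 5000 + 1036 = 6036 = 3 * 2012, i.e. exactly three RMPP segments
 */
static int example_pad(int hdr_len, int data_len, size_t mad_size)
{
	int seg_size = mad_size - hdr_len;

	if (!data_len || !seg_size)
		return 0;
	return data_len % seg_size ? seg_size - data_len % seg_size : 0;
}
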
 drivers/infiniband/core/mad.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index cde1d5d..a3ba37f 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -864,11 +864,11 @@ out:
 	return ret;
 }
 
-static int get_pad_size(int hdr_len, int data_len)
+static int get_pad_size(int hdr_len, int data_len, size_t mad_size)
 {
 	int seg_size, pad;
 
-	seg_size = sizeof(struct ib_mad) - hdr_len;
+	seg_size = mad_size - hdr_len;
 	if (data_len && seg_size) {
 		pad = seg_size - data_len % seg_size;
 		return pad == seg_size ? 0 : pad;
@@ -887,14 +887,14 @@ static void free_send_rmpp_list(struct ib_mad_send_wr_private *mad_send_wr)
 }
 
 static int alloc_send_rmpp_list(struct ib_mad_send_wr_private *send_wr,
-				gfp_t gfp_mask)
+				size_t mad_size, gfp_t gfp_mask)
 {
 	struct ib_mad_send_buf *send_buf = &send_wr->send_buf;
 	struct ib_rmpp_base *rmpp_base = send_buf->mad;
 	struct ib_rmpp_segment *seg = NULL;
 	int left, seg_size, pad;
 
-	send_buf->seg_size = sizeof (struct ib_mad) - send_buf->hdr_len;
+	send_buf->seg_size = mad_size - send_buf->hdr_len;
 	seg_size = send_buf->seg_size;
 	pad = send_wr->pad;
 
@@ -944,20 +944,29 @@ struct ib_mad_send_buf * ib_create_send_mad(struct ib_mad_agent *mad_agent,
 	struct ib_mad_send_wr_private *mad_send_wr;
 	int pad, message_size, ret, size;
 	void *buf;
+	size_t mad_size;
 
 	mad_agent_priv = container_of(mad_agent, struct ib_mad_agent_private,
 				      agent);
-	pad = get_pad_size(hdr_len, data_len);
+
+	if (base_version == OPA_MGMT_BASE_VERSION) {
+		if (!mad_agent_priv->qp_info->supports_jumbo_mads)
+			return ERR_PTR(-EINVAL);
+		mad_size = sizeof(struct jumbo_mad);
+	} else
+		mad_size = sizeof(struct ib_mad);
+
+	pad = get_pad_size(hdr_len, data_len, mad_size);
 	message_size = hdr_len + data_len + pad;
 
 	if (ib_mad_kernel_rmpp_agent(mad_agent)) {
-		if (!rmpp_active && message_size > sizeof(struct ib_mad))
+		if (!rmpp_active && message_size > mad_size)
 			return ERR_PTR(-EINVAL);
 	} else
-		if (rmpp_active || message_size > sizeof(struct ib_mad))
+		if (rmpp_active || message_size > mad_size)
 			return ERR_PTR(-EINVAL);
 
-	size = rmpp_active ? hdr_len : sizeof(struct ib_mad);
+	size = rmpp_active ? hdr_len : mad_size;
 	buf = kzalloc(sizeof *mad_send_wr + size, gfp_mask);
 	if (!buf)
 		return ERR_PTR(-ENOMEM);
@@ -972,7 +981,15 @@ struct ib_mad_send_buf * ib_create_send_mad(struct ib_mad_agent *mad_agent,
 	mad_send_wr->mad_agent_priv = mad_agent_priv;
 	mad_send_wr->sg_list[0].length = hdr_len;
 	mad_send_wr->sg_list[0].lkey = mad_agent->mr->lkey;
-	mad_send_wr->sg_list[1].length = sizeof(struct ib_mad) - hdr_len;
+
+	/* individual jumbo MADs don't have to be 2048 bytes */
+	if (mad_agent_priv->qp_info->supports_jumbo_mads
+	    && base_version == OPA_MGMT_BASE_VERSION
+	    && data_len < mad_size - hdr_len)
+		mad_send_wr->sg_list[1].length = data_len;
+	else
+		mad_send_wr->sg_list[1].length = mad_size - hdr_len;
+
 	mad_send_wr->sg_list[1].lkey = mad_agent->mr->lkey;
 
 	mad_send_wr->send_wr.wr_id = (unsigned long) mad_send_wr;
@@ -985,7 +1002,7 @@ struct ib_mad_send_buf * ib_create_send_mad(struct ib_mad_agent *mad_agent,
 	mad_send_wr->send_wr.wr.ud.pkey_index = pkey_index;
 
 	if (rmpp_active) {
-		ret = alloc_send_rmpp_list(mad_send_wr, gfp_mask);
+		ret = alloc_send_rmpp_list(mad_send_wr, mad_size, gfp_mask);
 		if (ret) {
 			kfree(buf);
 			return ERR_PTR(ret);
-- 
1.8.2


* [RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path Architecture MADs
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If the registration specifies an OPA MAD class version and the device does not
support jumbo MADs, fail the MAD registration.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
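A hypothetical caller sketch: with the reordered port validation, requesting
an OPA class version on a QP that lacks jumbo MAD support now fails up front
at registration time (handler names below are placeholders):

struct ib_mad_reg_req req = {
	.mgmt_class		= IB_MGMT_CLASS_SUBN_ADM,
	.mgmt_class_version	= 0x80,		/* an OPA class version */
};
struct ib_mad_agent *agent;

agent = ib_register_mad_agent(device, port_num, IB_QPT_GSI, &req, 0,
			      my_send_handler, my_recv_handler,
			      my_context, 0);
if (IS_ERR(agent))
	return PTR_ERR(agent);	/* rejected here on a non-jumbo device */
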
 drivers/infiniband/core/mad.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index a3ba37f..0ddfbd7 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -238,6 +238,14 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 		goto error1;
 	}
 
+	/* Validate device and port */
+	port_priv = ib_get_mad_port(device, port_num);
+	if (!port_priv) {
+		dev_notice(&device->dev, "ib_register_mad_agent: Invalid port\n");
+		ret = ERR_PTR(-ENODEV);
+		goto error1;
+	}
+
 	/* Validate MAD registration request if supplied */
 	if (mad_reg_req) {
 		if (mad_reg_req->mgmt_class_version >= MAX_MGMT_VERSION) {
@@ -246,6 +254,12 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 				   mad_reg_req->mgmt_class_version);
 			goto error1;
 		}
+		if (mad_reg_req->mgmt_class_version >= OPA_MIN_CLASS_VERSION
+		    && !port_priv->qp_info[qpn].supports_jumbo_mads) {
+			dev_notice(&device->dev,
+				   "ib_register_mad_agent: OPA class Version specified on a device which does not support jumbo MAD's\n");
+			goto error1;
+		}
 		if (!recv_handler) {
 			dev_notice(&device->dev,
 				   "ib_register_mad_agent: no recv_handler\n");
@@ -323,14 +337,6 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 			goto error1;
 	}
 
-	/* Validate device and port */
-	port_priv = ib_get_mad_port(device, port_num);
-	if (!port_priv) {
-		dev_notice(&device->dev, "ib_register_mad_agent: Invalid port\n");
-		ret = ERR_PTR(-ENODEV);
-		goto error1;
-	}
-
 	/* Verify the QP requested is supported.  For example, Ethernet devices
 	 * will not have QP0 */
 	if (!port_priv->qp_info[qpn].qp) {
-- 
1.8.2


* [RFC PATCH 11/16] ib/mad: create helper function for smi_handle_dr_smp_send
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This helper function will be used for processing both IB and OPA SMP sends.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
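The helper takes hop_ptr by pointer because the send-side fixup mutates it;
everything format specific (LID width, path array location) is reduced to
scalar parameters.  A sketch of the OPA-side caller this enables (the real
opa_smi_handle_dr_smp_send() arrives later in the series, next to the helper
in smi.c):

enum smi_action opa_dr_smp_send_sketch(struct opa_smp *smp,
				       u8 node_type, int port_num)
{
	return __smi_handle_dr_smp_send(node_type, port_num,
					&smp->hop_ptr, smp->hop_cnt,
					smp->route.dr.initial_path,
					smp->route.dr.return_path,
					opa_get_smp_direction(smp),
					smp->route.dr.dr_dlid ==
					OPA_LID_PERMISSIVE,
					smp->route.dr.dr_slid ==
					OPA_LID_PERMISSIVE);
}
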
 drivers/infiniband/core/smi.c | 81 +++++++++++++++++++++++++------------------
 1 file changed, 47 insertions(+), 34 deletions(-)

diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 5855e44..3bac6e6 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -39,84 +39,81 @@
 #include <rdma/ib_smi.h>
 #include "smi.h"
 
-/*
- * Fixup a directed route SMP for sending
- * Return 0 if the SMP should be discarded
- */
-enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp,
-				       u8 node_type, int port_num)
+static inline
+enum smi_action __smi_handle_dr_smp_send(u8 node_type, int port_num,
+					 u8 *hop_ptr, u8 hop_cnt,
+					 u8 *initial_path,
+					 u8 *return_path,
+					 u8 direction,
+					 int dr_dlid_is_permissive,
+					 int dr_slid_is_permissive)
 {
-	u8 hop_ptr, hop_cnt;
-
-	hop_ptr = smp->hop_ptr;
-	hop_cnt = smp->hop_cnt;
-
 	/* See section 14.2.2.2, Vol 1 IB spec */
 	/* C14-6 -- valid hop_cnt values are from 0 to 63 */
 	if (hop_cnt >= IB_SMP_MAX_PATH_HOPS)
 		return IB_SMI_DISCARD;
 
-	if (!ib_get_smp_direction(smp)) {
+	if (!direction) {
 		/* C14-9:1 */
-		if (hop_cnt && hop_ptr == 0) {
-			smp->hop_ptr++;
-			return (smp->initial_path[smp->hop_ptr] ==
+		if (hop_cnt && *hop_ptr == 0) {
+			(*hop_ptr)++;
+			return (initial_path[*hop_ptr] ==
 				port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-9:2 */
-		if (hop_ptr && hop_ptr < hop_cnt) {
+		if (*hop_ptr && *hop_ptr < hop_cnt) {
 			if (node_type != RDMA_NODE_IB_SWITCH)
 				return IB_SMI_DISCARD;
 
-			/* smp->return_path set when received */
-			smp->hop_ptr++;
-			return (smp->initial_path[smp->hop_ptr] ==
+			/* return_path set when received */
+			(*hop_ptr)++;
+			return (initial_path[*hop_ptr] ==
 				port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-9:3 -- We're at the end of the DR segment of path */
-		if (hop_ptr == hop_cnt) {
-			/* smp->return_path set when received */
-			smp->hop_ptr++;
+		if (*hop_ptr == hop_cnt) {
+			/* return_path set when received */
+			(*hop_ptr)++;
 			return (node_type == RDMA_NODE_IB_SWITCH ||
-				smp->dr_dlid == IB_LID_PERMISSIVE ?
+				dr_dlid_is_permissive ?
 				IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */
 		/* C14-9:5 -- Fail unreasonable hop pointer */
-		return (hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : IB_SMI_DISCARD);
+		return (*hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 
 	} else {
 		/* C14-13:1 */
-		if (hop_cnt && hop_ptr == hop_cnt + 1) {
-			smp->hop_ptr--;
-			return (smp->return_path[smp->hop_ptr] ==
+		if (hop_cnt && *hop_ptr == hop_cnt + 1) {
+			(*hop_ptr)--;
+			return (return_path[*hop_ptr] ==
 				port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-13:2 */
-		if (2 <= hop_ptr && hop_ptr <= hop_cnt) {
+		if (2 <= *hop_ptr && *hop_ptr <= hop_cnt) {
 			if (node_type != RDMA_NODE_IB_SWITCH)
 				return IB_SMI_DISCARD;
 
-			smp->hop_ptr--;
-			return (smp->return_path[smp->hop_ptr] ==
+			(*hop_ptr)--;
+			return (return_path[*hop_ptr] ==
 				port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-13:3 -- at the end of the DR segment of path */
-		if (hop_ptr == 1) {
-			smp->hop_ptr--;
+		if (*hop_ptr == 1) {
+			(*hop_ptr)--;
 			/* C14-13:3 -- SMPs destined for SM shouldn't be here */
 			return (node_type == RDMA_NODE_IB_SWITCH ||
-				smp->dr_slid == IB_LID_PERMISSIVE ?
+				dr_slid_is_permissive ?
 				IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-13:4 -- hop_ptr = 0 -> should have gone to SM */
-		if (hop_ptr == 0)
+		if (*hop_ptr == 0)
 			return IB_SMI_HANDLE;
 
 		/* C14-13:5 -- Check for unreasonable hop pointer */
@@ -125,6 +122,22 @@ enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp,
 }
 
 /*
+ * Fixup a directed route SMP for sending
+ * Return 0 if the SMP should be discarded
+ */
+enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp,
+				       u8 node_type, int port_num)
+{
+	return __smi_handle_dr_smp_send(node_type, port_num,
+					&smp->hop_ptr, smp->hop_cnt,
+					smp->initial_path,
+					smp->return_path,
+					ib_get_smp_direction(smp),
+					smp->dr_dlid == IB_LID_PERMISSIVE,
+					smp->dr_slid == IB_LID_PERMISSIVE);
+}
+
+/*
  * Adjust information for a received SMP
  * Return 0 if the SMP should be dropped
  */
-- 
1.8.2


* [RFC PATCH 12/16] ib/mad: create helper function for smi_handle_dr_smp_recv
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This helper function will be used for processing both IB and OPA SMP recvs.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
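A worked walk of the receive fixup, using hypothetical values: an outgoing
SMP with hop_cnt = 2 arrives at the intermediate switch on port 3, hitting
the C14-9:2 branch, which stamps the arrival port into return_path and
leaves hop_ptr for the send side:

static void example_recv_walk(void)
{
	u8 initial_path[IB_SMP_MAX_PATH_HOPS] = { 0, 1, 2 };
	u8 return_path[IB_SMP_MAX_PATH_HOPS] = { 0 };
	u8 hop_ptr = 1;		/* intermediate hop */
	enum smi_action act;

	act = __smi_handle_dr_smp_recv(RDMA_NODE_IB_SWITCH, /*port_num*/ 3,
				       /*phys_port_cnt*/ 36,
				       &hop_ptr, /*hop_cnt*/ 2,
				       initial_path, return_path,
				       /*direction*/ 0, 0, 0);
	/* act == IB_SMI_HANDLE; return_path[1] == 3; hop_ptr is
	 * advanced when the SMP is sent back out, not here */
}
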
 drivers/infiniband/core/smi.c | 80 +++++++++++++++++++++++++------------------
 1 file changed, 47 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 3bac6e6..24670de 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -137,91 +137,105 @@ enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp,
 					smp->dr_slid == IB_LID_PERMISSIVE);
 }
 
-/*
- * Adjust information for a received SMP
- * Return 0 if the SMP should be dropped
- */
-enum smi_action smi_handle_dr_smp_recv(struct ib_smp *smp, u8 node_type,
-				       int port_num, int phys_port_cnt)
+static inline
+enum smi_action __smi_handle_dr_smp_recv(u8 node_type, int port_num,
+					 int phys_port_cnt,
+					 u8 *hop_ptr, u8 hop_cnt,
+					 u8 *initial_path,
+					 u8 *return_path,
+					 u8 direction,
+					 int dr_dlid_is_permissive,
+					 int dr_slid_is_permissive)
 {
-	u8 hop_ptr, hop_cnt;
-
-	hop_ptr = smp->hop_ptr;
-	hop_cnt = smp->hop_cnt;
-
 	/* See section 14.2.2.2, Vol 1 IB spec */
 	/* C14-6 -- valid hop_cnt values are from 0 to 63 */
 	if (hop_cnt >= IB_SMP_MAX_PATH_HOPS)
 		return IB_SMI_DISCARD;
 
-	if (!ib_get_smp_direction(smp)) {
+	if (!direction) {
 		/* C14-9:1 -- sender should have incremented hop_ptr */
-		if (hop_cnt && hop_ptr == 0)
+		if (hop_cnt && *hop_ptr == 0)
 			return IB_SMI_DISCARD;
 
 		/* C14-9:2 -- intermediate hop */
-		if (hop_ptr && hop_ptr < hop_cnt) {
+		if (*hop_ptr && *hop_ptr < hop_cnt) {
 			if (node_type != RDMA_NODE_IB_SWITCH)
 				return IB_SMI_DISCARD;
 
-			smp->return_path[hop_ptr] = port_num;
-			/* smp->hop_ptr updated when sending */
-			return (smp->initial_path[hop_ptr+1] <= phys_port_cnt ?
+			return_path[*hop_ptr] = port_num;
+			/* hop_ptr updated when sending */
+			return (initial_path[*hop_ptr+1] <= phys_port_cnt ?
 				IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-9:3 -- We're at the end of the DR segment of path */
-		if (hop_ptr == hop_cnt) {
+		if (*hop_ptr == hop_cnt) {
 			if (hop_cnt)
-				smp->return_path[hop_ptr] = port_num;
-			/* smp->hop_ptr updated when sending */
+				return_path[*hop_ptr] = port_num;
+			/* hop_ptr updated when sending */
 
 			return (node_type == RDMA_NODE_IB_SWITCH ||
-				smp->dr_dlid == IB_LID_PERMISSIVE ?
+				dr_dlid_is_permissive ?
 				IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */
 		/* C14-9:5 -- fail unreasonable hop pointer */
-		return (hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : IB_SMI_DISCARD);
+		return (*hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 
 	} else {
 
 		/* C14-13:1 */
-		if (hop_cnt && hop_ptr == hop_cnt + 1) {
-			smp->hop_ptr--;
-			return (smp->return_path[smp->hop_ptr] ==
+		if (hop_cnt && *hop_ptr == hop_cnt + 1) {
+			(*hop_ptr)--;
+			return (return_path[*hop_ptr] ==
 				port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-13:2 */
-		if (2 <= hop_ptr && hop_ptr <= hop_cnt) {
+		if (2 <= *hop_ptr && *hop_ptr <= hop_cnt) {
 			if (node_type != RDMA_NODE_IB_SWITCH)
 				return IB_SMI_DISCARD;
 
-			/* smp->hop_ptr updated when sending */
-			return (smp->return_path[hop_ptr-1] <= phys_port_cnt ?
+			/* hop_ptr updated when sending */
+			return (return_path[*hop_ptr-1] <= phys_port_cnt ?
 				IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-13:3 -- We're at the end of the DR segment of path */
-		if (hop_ptr == 1) {
-			if (smp->dr_slid == IB_LID_PERMISSIVE) {
+		if (*hop_ptr == 1) {
+			if (dr_slid_is_permissive) {
 				/* giving SMP to SM - update hop_ptr */
-				smp->hop_ptr--;
+				(*hop_ptr)--;
 				return IB_SMI_HANDLE;
 			}
-			/* smp->hop_ptr updated when sending */
+			/* hop_ptr updated when sending */
 			return (node_type == RDMA_NODE_IB_SWITCH ?
 				IB_SMI_HANDLE : IB_SMI_DISCARD);
 		}
 
 		/* C14-13:4 -- hop_ptr = 0 -> give to SM */
 		/* C14-13:5 -- Check for unreasonable hop pointer */
-		return (hop_ptr == 0 ? IB_SMI_HANDLE : IB_SMI_DISCARD);
+		return (*hop_ptr == 0 ? IB_SMI_HANDLE : IB_SMI_DISCARD);
 	}
 }
 
+/*
+ * Adjust information for a received SMP
+ * Return 0 if the SMP should be dropped
+ */
+enum smi_action smi_handle_dr_smp_recv(struct ib_smp *smp, u8 node_type,
+				       int port_num, int phys_port_cnt)
+{
+	return __smi_handle_dr_smp_recv(node_type, port_num, phys_port_cnt,
+					&smp->hop_ptr, smp->hop_cnt,
+					smp->initial_path,
+					smp->return_path,
+					ib_get_smp_direction(smp),
+					smp->dr_dlid == IB_LID_PERMISSIVE,
+					smp->dr_slid == IB_LID_PERMISSIVE);
+}
+
 enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
 {
 	u8 hop_ptr, hop_cnt;
-- 
1.8.2


* [RFC PATCH 13/16] ib/mad: create helper function for smi_check_forward_dr_smp
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This helper function will be used for processing both IB and OPA SMPs.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
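One subtlety in the hunk above: the IB wrapper feeds
smp->dr_slid != IB_LID_PERMISSIVE into the parameter named
dr_slid_is_permissive, preserving the original C14-13:3 behavior (LID-route
toward the SM when dr_slid is real) at the cost of an inverted name; the OPA
wrapper added later in the series passes == instead, which is worth
double-checking.  A dispatch sketch built on the helper, where forward_smp(),
send_smp() and handle_local_smp() are hypothetical:

static void dispatch_dr_smp(struct ib_smp *smp)
{
	switch (smi_check_forward_dr_smp(smp)) {
	case IB_SMI_FORWARD:	/* intermediate hop: relay out fwd port */
		forward_smp(smp, smi_get_fwd_port(smp));
		break;
	case IB_SMI_SEND:	/* end of DR segment: LID-route onward */
		send_smp(smp);
		break;
	case IB_SMI_LOCAL:	/* destined for the local SMA/SM */
		handle_local_smp(smp);
		break;
	}
}
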
 drivers/infiniband/core/smi.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 24670de..8a5fb1d 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -236,21 +236,20 @@ enum smi_action smi_handle_dr_smp_recv(struct ib_smp *smp, u8 node_type,
 					smp->dr_slid == IB_LID_PERMISSIVE);
 }
 
-enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
+static inline
+enum smi_forward_action __smi_check_forward_dr_smp(u8 hop_ptr, u8 hop_cnt,
+						   u8 direction,
+						   int dr_dlid_is_permissive,
+						   int dr_slid_is_permissive)
 {
-	u8 hop_ptr, hop_cnt;
-
-	hop_ptr = smp->hop_ptr;
-	hop_cnt = smp->hop_cnt;
-
-	if (!ib_get_smp_direction(smp)) {
+	if (!direction) {
 		/* C14-9:2 -- intermediate hop */
 		if (hop_ptr && hop_ptr < hop_cnt)
 			return IB_SMI_FORWARD;
 
 		/* C14-9:3 -- at the end of the DR segment of path */
 		if (hop_ptr == hop_cnt)
-			return (smp->dr_dlid == IB_LID_PERMISSIVE ?
+			return (dr_dlid_is_permissive ?
 				IB_SMI_SEND : IB_SMI_LOCAL);
 
 		/* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */
@@ -263,10 +262,19 @@ enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
 
 		/* C14-13:3 -- at the end of the DR segment of path */
 		if (hop_ptr == 1)
-			return (smp->dr_slid != IB_LID_PERMISSIVE ?
+			return (dr_slid_is_permissive ?
 				IB_SMI_SEND : IB_SMI_LOCAL);
 	}
 	return IB_SMI_LOCAL;
+
+}
+
+enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
+{
+	return __smi_check_forward_dr_smp(smp->hop_ptr, smp->hop_cnt,
+					  ib_get_smp_direction(smp),
+					  smp->dr_dlid == IB_LID_PERMISSIVE,
+					  smp->dr_slid != IB_LID_PERMISSIVE);
 }
 
 /*
-- 
1.8.2


* [RFC PATCH 14/16] ib/mad: Create helper function for SMI processing
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This helper function will be used for processing both IB and OPA SMPs.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/mad.c | 85 +++++++++++++++++++++++++------------------
 1 file changed, 49 insertions(+), 36 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 0ddfbd7..7bd67e8 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1965,6 +1965,52 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
 	}
 }
 
+enum smi_action handle_ib_smi(struct ib_mad_port_private *port_priv,
+			      struct ib_mad_qp_info *qp_info,
+			      struct ib_wc *wc,
+			      int port_num,
+			      struct ib_mad_private *recv,
+			      struct ib_mad_private *response)
+{
+	enum smi_forward_action retsmi;
+
+	if (smi_handle_dr_smp_recv(&recv->mad.smp,
+				   port_priv->device->node_type,
+				   port_num,
+				   port_priv->device->phys_port_cnt) ==
+				   IB_SMI_DISCARD)
+		return IB_SMI_DISCARD;
+
+	retsmi = smi_check_forward_dr_smp(&recv->mad.smp);
+	if (retsmi == IB_SMI_LOCAL)
+		return IB_SMI_HANDLE;
+
+	if (retsmi == IB_SMI_SEND) { /* don't forward */
+		if (smi_handle_dr_smp_send(&recv->mad.smp,
+					   port_priv->device->node_type,
+					   port_num) == IB_SMI_DISCARD)
+			return IB_SMI_DISCARD;
+
+		if (smi_check_local_smp(&recv->mad.smp, port_priv->device) == IB_SMI_DISCARD)
+			return IB_SMI_DISCARD;
+	} else if (port_priv->device->node_type == RDMA_NODE_IB_SWITCH) {
+		/* forward case for switches */
+		memcpy(response, recv, sizeof(*response));
+		response->header.recv_wc.wc = &response->header.wc;
+		response->header.recv_wc.recv_buf.mad = &response->mad.mad;
+		response->header.recv_wc.recv_buf.grh = &response->grh;
+
+		agent_send_response(&response->mad.mad,
+				    &response->grh, wc,
+				    port_priv->device,
+				    smi_get_fwd_port(&recv->mad.smp),
+				    qp_info->qp->qp_num);
+
+		return IB_SMI_DISCARD;
+	}
+	return IB_SMI_HANDLE;
+}
+
 static bool generate_unmatched_resp(struct ib_mad_private *recv,
 				    struct ib_mad_private *response)
 {
@@ -2037,45 +2083,12 @@ static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv,
 
 	if (recv->mad.mad.mad_hdr.mgmt_class ==
 	    IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) {
-		enum smi_forward_action retsmi;
-
-		if (smi_handle_dr_smp_recv(&recv->mad.smp,
-					   port_priv->device->node_type,
-					   port_num,
-					   port_priv->device->phys_port_cnt) ==
-					   IB_SMI_DISCARD)
+		if (handle_ib_smi(port_priv, qp_info, wc, port_num, recv,
+				  response)
+		    == IB_SMI_DISCARD)
 			goto out;
-
-		retsmi = smi_check_forward_dr_smp(&recv->mad.smp);
-		if (retsmi == IB_SMI_LOCAL)
-			goto local;
-
-		if (retsmi == IB_SMI_SEND) { /* don't forward */
-			if (smi_handle_dr_smp_send(&recv->mad.smp,
-						   port_priv->device->node_type,
-						   port_num) == IB_SMI_DISCARD)
-				goto out;
-
-			if (smi_check_local_smp(&recv->mad.smp, port_priv->device) == IB_SMI_DISCARD)
-				goto out;
-		} else if (port_priv->device->node_type == RDMA_NODE_IB_SWITCH) {
-			/* forward case for switches */
-			memcpy(response, recv, sizeof(*response));
-			response->header.recv_wc.wc = &response->header.wc;
-			response->header.recv_wc.recv_buf.mad = &response->mad.mad;
-			response->header.recv_wc.recv_buf.grh = &response->grh;
-
-			agent_send_response(&response->mad.mad,
-					    &response->grh, wc,
-					    port_priv->device,
-					    smi_get_fwd_port(&recv->mad.smp),
-					    qp_info->qp->qp_num);
-
-			goto out;
-		}
 	}
 
-local:
 	/* Give driver "right of first refusal" on incoming MAD */
 	if (port_priv->device->process_mad) {
 		size_t resp_mad_size = sizeof(struct ib_mad);
-- 
1.8.2


* [RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP processing
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Define the new OPA SMP format, create support functions for this format, and
call the previously defined helper functions as appropriate.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
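The opa_smp layout packs a 32-byte common header plus either route form into
exactly one 2048-byte MAD.  A compile-time sketch of those relationships,
not part of the patch:

static inline void check_opa_smp_layout(void)
{
	BUILD_BUG_ON(sizeof(struct opa_smp) != 2048);
	/* LID routed: 32-byte header + 2016 data bytes */
	BUILD_BUG_ON(OPA_SMP_LID_DATA_SIZE != 2048 - 32);
	/* DR: header + two 32-bit DR LIDs + two paths + 8 reserved bytes */
	BUILD_BUG_ON(OPA_SMP_DR_DATA_SIZE !=
		     2048 - 32 - 2 * sizeof(__be32) -
		     2 * OPA_SMP_MAX_PATH_HOPS - 8);
}
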
 drivers/infiniband/core/mad_priv.h |   2 +
 drivers/infiniband/core/opa_smi.h  |  78 +++++++++++++++++++++++++++
 drivers/infiniband/core/smi.c      |  54 +++++++++++++++++++
 drivers/infiniband/core/smi.h      |   6 +++
 include/rdma/opa_smi.h             | 106 +++++++++++++++++++++++++++++++++++++
 5 files changed, 246 insertions(+)
 create mode 100644 drivers/infiniband/core/opa_smi.h
 create mode 100644 include/rdma/opa_smi.h

diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h
index 29ed8dd..7a82950 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -41,6 +41,7 @@
 #include <linux/workqueue.h>
 #include <rdma/ib_mad.h>
 #include <rdma/ib_smi.h>
+#include <rdma/opa_smi.h>
 
 #define IB_MAD_QPS_CORE		2 /* Always QP0 and QP1 as a minimum */
 
@@ -103,6 +104,7 @@ struct jumbo_mad_private {
 	union {
 		struct jumbo_mad mad;
 		struct jumbo_rmpp_mad rmpp_mad;
+		struct opa_smp smp;
 	} mad;
 } __packed;
 
diff --git a/drivers/infiniband/core/opa_smi.h b/drivers/infiniband/core/opa_smi.h
new file mode 100644
index 0000000..d180179
--- /dev/null
+++ b/drivers/infiniband/core/opa_smi.h
@@ -0,0 +1,78 @@
+/*
+ * Copyright (c) 2014 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef __OPA_SMI_H_
+#define __OPA_SMI_H_
+
+#include <rdma/ib_smi.h>
+#include <rdma/opa_smi.h>
+
+#include "smi.h"
+
+enum smi_action opa_smi_handle_dr_smp_recv(struct opa_smp *smp, u8 node_type,
+				       int port_num, int phys_port_cnt);
+int opa_smi_get_fwd_port(struct opa_smp *smp);
+extern enum smi_forward_action opa_smi_check_forward_dr_smp(struct opa_smp *smp);
+extern enum smi_action opa_smi_handle_dr_smp_send(struct opa_smp *smp,
+					      u8 node_type, int port_num);
+
+/*
+ * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM
+ * via process_mad
+ */
+static inline enum smi_action opa_smi_check_local_smp(struct opa_smp *smp,
+						  struct ib_device *device)
+{
+	/* C14-9:3 -- We're at the end of the DR segment of path */
+	/* C14-9:4 -- Hop Pointer = Hop Count + 1 -> give to SMA/SM */
+	return (device->process_mad &&
+		!opa_get_smp_direction(smp) &&
+		(smp->hop_ptr == smp->hop_cnt + 1)) ?
+		IB_SMI_HANDLE : IB_SMI_DISCARD;
+}
+
+/*
+ * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM
+ * via process_mad
+ */
+static inline enum smi_action opa_smi_check_local_returning_smp(struct opa_smp *smp,
+						   struct ib_device *device)
+{
+	/* C14-13:3 -- We're at the end of the DR segment of path */
+	/* C14-13:4 -- Hop Pointer == 0 -> give to SM */
+	return (device->process_mad &&
+		opa_get_smp_direction(smp) &&
+		!smp->hop_ptr) ? IB_SMI_HANDLE : IB_SMI_DISCARD;
+}
+
+#endif	/* __OPA_SMI_H_ */
diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 8a5fb1d..a38ccb4 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -5,6 +5,7 @@
  * Copyright (c) 2004, 2005 Topspin Corporation.  All rights reserved.
  * Copyright (c) 2004-2007 Voltaire Corporation.  All rights reserved.
  * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2014 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -38,6 +39,7 @@
 
 #include <rdma/ib_smi.h>
 #include "smi.h"
+#include "opa_smi.h"
 
 static inline
 enum smi_action __smi_handle_dr_smp_send(u8 node_type, int port_num,
@@ -137,6 +139,20 @@ enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp,
 					smp->dr_slid == IB_LID_PERMISSIVE);
 }
 
+enum smi_action opa_smi_handle_dr_smp_send(struct opa_smp *smp,
+				       u8 node_type, int port_num)
+{
+	return __smi_handle_dr_smp_send(node_type, port_num,
+					&smp->hop_ptr, smp->hop_cnt,
+					smp->route.dr.initial_path,
+					smp->route.dr.return_path,
+					opa_get_smp_direction(smp),
+					smp->route.dr.dr_dlid ==
+					OPA_LID_PERMISSIVE,
+					smp->route.dr.dr_slid ==
+					OPA_LID_PERMISSIVE);
+}
+
 static inline
 enum smi_action __smi_handle_dr_smp_recv(u8 node_type, int port_num,
 					 int phys_port_cnt,
@@ -236,6 +252,24 @@ enum smi_action smi_handle_dr_smp_recv(struct ib_smp *smp, u8 node_type,
 					smp->dr_slid == IB_LID_PERMISSIVE);
 }
 
+/*
+ * Adjust information for a received SMP
+ * Return 0 if the SMP should be dropped
+ */
+enum smi_action opa_smi_handle_dr_smp_recv(struct opa_smp *smp, u8 node_type,
+					   int port_num, int phys_port_cnt)
+{
+	return __smi_handle_dr_smp_recv(node_type, port_num, phys_port_cnt,
+					&smp->hop_ptr, smp->hop_cnt,
+					smp->route.dr.initial_path,
+					smp->route.dr.return_path,
+					opa_get_smp_direction(smp),
+					smp->route.dr.dr_dlid ==
+					OPA_LID_PERMISSIVE,
+					smp->route.dr.dr_slid ==
+					OPA_LID_PERMISSIVE);
+}
+
 static inline
 enum smi_forward_action __smi_check_forward_dr_smp(u8 hop_ptr, u8 hop_cnt,
 						   u8 direction,
@@ -277,6 +311,16 @@ enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
 					  smp->dr_slid != IB_LID_PERMISSIVE);
 }
 
+enum smi_forward_action opa_smi_check_forward_dr_smp(struct opa_smp *smp)
+{
+	return __smi_check_forward_dr_smp(smp->hop_ptr, smp->hop_cnt,
+					  opa_get_smp_direction(smp),
+					  smp->route.dr.dr_dlid ==
+					  OPA_LID_PERMISSIVE,
+					  smp->route.dr.dr_slid ==
+					  OPA_LID_PERMISSIVE);
+}
+
 /*
  * Return the forwarding port number from initial_path for outgoing SMP and
  * from return_path for returning SMP
@@ -286,3 +330,13 @@ int smi_get_fwd_port(struct ib_smp *smp)
 	return (!ib_get_smp_direction(smp) ? smp->initial_path[smp->hop_ptr+1] :
 		smp->return_path[smp->hop_ptr-1]);
 }
+
+/*
+ * Return the forwarding port number from initial_path for outgoing SMP and
+ * from return_path for returning SMP
+ */
+int opa_smi_get_fwd_port(struct opa_smp *smp)
+{
+	return !opa_get_smp_direction(smp) ? smp->route.dr.initial_path[smp->hop_ptr+1] :
+		smp->route.dr.return_path[smp->hop_ptr-1];
+}
diff --git a/drivers/infiniband/core/smi.h b/drivers/infiniband/core/smi.h
index aff96ba..e95c537 100644
--- a/drivers/infiniband/core/smi.h
+++ b/drivers/infiniband/core/smi.h
@@ -62,6 +62,9 @@ extern enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp,
  * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM
  * via process_mad
  */
+/* NOTE: This is called on opa_smps; do not check fields which are not
+ * common between ib_smp and opa_smp.
+ */
 static inline enum smi_action smi_check_local_smp(struct ib_smp *smp,
 						  struct ib_device *device)
 {
@@ -77,6 +80,9 @@ static inline enum smi_action smi_check_local_smp(struct ib_smp *smp,
  * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM
  * via process_mad
  */
+/* NOTE: This is called on opa_smps; do not check fields which are not
+ * common between ib_smp and opa_smp.
+ */
 static inline enum smi_action smi_check_local_returning_smp(struct ib_smp *smp,
 						   struct ib_device *device)
 {
diff --git a/include/rdma/opa_smi.h b/include/rdma/opa_smi.h
new file mode 100644
index 0000000..29063e8
--- /dev/null
+++ b/include/rdma/opa_smi.h
@@ -0,0 +1,106 @@
+/*
+ * Copyright (c) 2014 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#if !defined(OPA_SMI_H)
+#define OPA_SMI_H
+
+#include <rdma/ib_mad.h>
+#include <rdma/ib_smi.h>
+
+#define OPA_SMP_LID_DATA_SIZE			2016
+#define OPA_SMP_DR_DATA_SIZE			1872
+#define OPA_SMP_MAX_PATH_HOPS			64
+
+#define OPA_SMI_CLASS_VERSION			0x80
+
+#define OPA_LID_PERMISSIVE			cpu_to_be32(0xFFFFFFFF)
+
+struct opa_smp {
+	u8	base_version;
+	u8	mgmt_class;
+	u8	class_version;
+	u8	method;
+	__be16	status;
+	u8	hop_ptr;
+	u8	hop_cnt;
+	__be64	tid;
+	__be16	attr_id;
+	__be16	resv;
+	__be32	attr_mod;
+	__be64	mkey;
+	union {
+		struct {
+			uint8_t data[OPA_SMP_LID_DATA_SIZE];
+		} lid;
+		struct {
+			__be32	dr_slid;
+			__be32	dr_dlid;
+			u8	initial_path[OPA_SMP_MAX_PATH_HOPS];
+			u8	return_path[OPA_SMP_MAX_PATH_HOPS];
+			u8	reserved[8];
+			u8	data[OPA_SMP_DR_DATA_SIZE];
+		} dr;
+	} route;
+} __packed;
+
+
+static inline u8
+opa_get_smp_direction(struct opa_smp *smp)
+{
+	return ib_get_smp_direction((struct ib_smp *)smp);
+}
+
+static inline u8 *opa_get_smp_data(struct opa_smp *smp)
+{
+	if (smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
+		return smp->route.dr.data;
+
+	return smp->route.lid.data;
+}
+
+static inline size_t opa_get_smp_data_size(struct opa_smp *smp)
+{
+	if (smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
+		return sizeof(smp->route.dr.data);
+
+	return sizeof(smp->route.lid.data);
+}
+
+static inline size_t opa_get_smp_header_size(struct opa_smp *smp)
+{
+	if (smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
+		return sizeof(*smp) - sizeof(smp->route.dr.data);
+
+	return sizeof(*smp) - sizeof(smp->route.lid.data);
+}
+
+#endif /* OPA_SMI_H */
-- 
1.8.2


* [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General MAD processing
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2014-11-13 19:54 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

OPA SMP packets must carry a valid pkey;
	process the wc.pkey_index returned by agents when sending responses.

Handle variable-length OPA MADs based on the MAD's base version.
Support is provided by:

	* Adjusting the 'fake' WC for locally routed SMPs to represent the
	  proper incoming byte_len
	* out_mad_size is used from the local HCA agents
		1) when sending agent responses on the wire
		2) when passing responses through the local_completions function

NOTE: wc.byte_len includes the GRH length and therefore is different from the
      in_mad_size specified to the local HCA agents.  out_mad_size should _not_
      include the GRH length as it is added by the verbs layer and is not part
      of MAD processing.

Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
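The GRH accounting in the NOTE is the easy part to get wrong: the 40-byte
ib_grh rides ahead of the MAD in the receive buffer and is counted in
wc.byte_len, but must be excluded from the size handed to process_mad().
A minimal sketch of that bookkeeping, assuming byte_len covers GRH plus MAD
only:

static size_t recv_mad_size(const struct ib_wc *wc)
{
	/* wc->byte_len includes the 40-byte GRH; MAD processing must not */
	return wc->byte_len - sizeof(struct ib_grh);
}

One nit in the agent.c hunk below: after get_agent_ah() is split out, the
unchanged dev_err() context line still passes the ib_ah ** to PTR_ERR();
it presumably wants PTR_ERR(*ah).
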
 drivers/infiniband/core/agent.c    |  57 +++--
 drivers/infiniband/core/agent.h    |   2 +-
 drivers/infiniband/core/mad.c      | 440 +++++++++++++++++++++++++++++++++----
 drivers/infiniband/core/mad_priv.h |   1 +
 drivers/infiniband/core/mad_rmpp.c |  30 ++-
 drivers/infiniband/core/user_mad.c |  39 ++--
 6 files changed, 486 insertions(+), 83 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index b6bd305..d7a2905 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -78,16 +78,11 @@ ib_get_agent_port(struct ib_device *device, int port_num)
 	return entry;
 }
 
-void agent_send_response(struct ib_mad *mad, struct ib_grh *grh,
-			 struct ib_wc *wc, struct ib_device *device,
-			 int port_num, int qpn)
+static int get_agent_ah(struct ib_device *device, int port_num,
+			struct ib_grh *grh, struct ib_wc *wc, int qpn,
+			struct ib_mad_agent **agent, struct ib_ah **ah)
 {
 	struct ib_agent_port_private *port_priv;
-	struct ib_mad_agent *agent;
-	struct ib_mad_send_buf *send_buf;
-	struct ib_ah *ah;
-	struct ib_mad_send_wr_private *mad_send_wr;
-
 	if (device->node_type == RDMA_NODE_IB_SWITCH)
 		port_priv = ib_get_agent_port(device, 0);
 	else
@@ -95,27 +90,57 @@ void agent_send_response(struct ib_mad *mad, struct ib_grh *grh,
 
 	if (!port_priv) {
 		dev_err(&device->dev, "Unable to find port agent\n");
-		return;
+		return 1;
 	}
 
-	agent = port_priv->agent[qpn];
-	ah = ib_create_ah_from_wc(agent->qp->pd, wc, grh, port_num);
-	if (IS_ERR(ah)) {
+	*agent = port_priv->agent[qpn];
+	*ah = ib_create_ah_from_wc((*agent)->qp->pd, wc, grh, port_num);
+	if (IS_ERR(*ah)) {
-		dev_err(&device->dev, "ib_create_ah_from_wc error %ld\n",
-			PTR_ERR(ah));
+		dev_err(&device->dev, "ib_create_ah_from_wc error %ld\n",
+			PTR_ERR(*ah));
+		return 1;
+	}
+	return 0;
+}
+
+void agent_send_response(struct ib_mad *mad, struct ib_grh *grh,
+			 struct ib_wc *wc, struct ib_device *device,
+			 int port_num, int qpn, u32 resp_mad_len)
+{
+	struct ib_mad_agent *agent;
+	struct ib_mad_send_buf *send_buf;
+	struct ib_ah *ah;
+	size_t data_len;
+	size_t hdr_len;
+	struct ib_mad_send_wr_private *mad_send_wr;
+	u8 base_version;
+
+	if (get_agent_ah(device, port_num, grh, wc, qpn, &agent, &ah))
 		return;
+
+	/* base version determines MAD size */
+	base_version = mad->mad_hdr.base_version;
+	if (base_version == OPA_MGMT_BASE_VERSION) {
+		data_len = resp_mad_len - JUMBO_MGMT_MAD_HDR;
+		hdr_len = JUMBO_MGMT_MAD_HDR;
+	} else {
+		data_len = IB_MGMT_MAD_DATA;
+		hdr_len = IB_MGMT_MAD_HDR;
 	}
 
 	send_buf = ib_create_send_mad(agent, wc->src_qp, wc->pkey_index, 0,
-				      IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
-				      GFP_KERNEL,
-				      IB_MGMT_BASE_VERSION);
+				      hdr_len, data_len, GFP_KERNEL,
+				      base_version);
 	if (IS_ERR(send_buf)) {
 		dev_err(&device->dev, "ib_create_send_mad error\n");
 		goto err1;
 	}
 
-	memcpy(send_buf->mad, mad, sizeof *mad);
+	if (base_version == OPA_MGMT_BASE_VERSION)
+		memcpy(send_buf->mad, mad, JUMBO_MGMT_MAD_HDR + data_len);
+	else
+		memcpy(send_buf->mad, mad, sizeof(*mad));
+
 	send_buf->ah = ah;
 
 	if (device->node_type == RDMA_NODE_IB_SWITCH) {
diff --git a/drivers/infiniband/core/agent.h b/drivers/infiniband/core/agent.h
index 6669287..cb4081d 100644
--- a/drivers/infiniband/core/agent.h
+++ b/drivers/infiniband/core/agent.h
@@ -46,6 +46,6 @@ extern int ib_agent_port_close(struct ib_device *device, int port_num);
 
 extern void agent_send_response(struct ib_mad *mad, struct ib_grh *grh,
 				struct ib_wc *wc, struct ib_device *device,
-				int port_num, int qpn);
+				int port_num, int qpn, u32 resp_mad_len);
 
 #endif	/* __AGENT_H_ */
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 7bd67e8..e73a116 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2005 Intel Corporation.  All rights reserved.
  * Copyright (c) 2005 Mellanox Technologies Ltd.  All rights reserved.
  * Copyright (c) 2009 HNR Consulting. All rights reserved.
+ * Copyright (c) 2014 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -44,6 +45,7 @@
 #include "mad_priv.h"
 #include "mad_rmpp.h"
 #include "smi.h"
+#include "opa_smi.h"
 #include "agent.h"
 
 MODULE_LICENSE("Dual BSD/GPL");
@@ -85,6 +87,8 @@ static int add_nonoui_reg_req(struct ib_mad_reg_req *mad_reg_req,
 			      u8 mgmt_class);
 static int add_oui_reg_req(struct ib_mad_reg_req *mad_reg_req,
 			   struct ib_mad_agent_private *agent_priv);
+static int ib_mad_post_jumbo_rcv_mads(struct ib_mad_qp_info *qp_info,
+				      struct jumbo_mad_private *mad);
 
 static void mad_priv_cache_free(struct ib_mad_private *mad_priv)
 {
@@ -742,9 +746,10 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 {
 	int ret = 0;
 	struct ib_smp *smp = mad_send_wr->send_buf.mad;
+	struct opa_smp *opa_smp = (struct opa_smp *)smp;
 	unsigned long flags;
 	struct ib_mad_local_private *local;
-	struct ib_mad_private *mad_priv;
+	struct ib_mad_private *mad_priv; /* or jumbo_mad_priv */
 	struct ib_mad_port_private *port_priv;
 	struct ib_mad_agent_private *recv_mad_agent = NULL;
 	struct ib_device *device = mad_agent_priv->agent.device;
@@ -753,6 +758,7 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 	struct ib_send_wr *send_wr = &mad_send_wr->send_wr;
 	size_t in_mad_size = sizeof(struct ib_mad);
 	size_t out_mad_size = sizeof(struct ib_mad);
+	u32 opa_drslid;
 
 	if (device->node_type == RDMA_NODE_IB_SWITCH &&
 	    smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
@@ -766,13 +772,34 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 	 * If we are at the start of the LID routed part, don't update the
 	 * hop_ptr or hop_cnt.  See section 14.2.2, Vol 1 IB spec.
 	 */
-	if ((ib_get_smp_direction(smp) ? smp->dr_dlid : smp->dr_slid) ==
-	     IB_LID_PERMISSIVE &&
-	     smi_handle_dr_smp_send(smp, device->node_type, port_num) ==
-	     IB_SMI_DISCARD) {
-		ret = -EINVAL;
-		dev_err(&device->dev, "Invalid directed route\n");
-		goto out;
+	if (smp->class_version == OPA_SMP_CLASS_VERSION) {
+		if ((opa_get_smp_direction(opa_smp)
+		     ? opa_smp->route.dr.dr_dlid : opa_smp->route.dr.dr_slid) ==
+		     OPA_LID_PERMISSIVE &&
+		     opa_smi_handle_dr_smp_send(opa_smp, device->node_type,
+						port_num) == IB_SMI_DISCARD) {
+			ret = -EINVAL;
+			dev_err(&device->dev, "OPA Invalid directed route\n");
+			goto out;
+		}
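+		/* build_smp_wc() takes a 16-bit slid, so an OPA dr_slid must fit in 16 bits */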
+		opa_drslid = be32_to_cpu(opa_smp->route.dr.dr_slid);
+		if (opa_drslid != OPA_LID_PERMISSIVE &&
+		    (opa_drslid & 0xffff0000)) {
+			ret = -EINVAL;
+			dev_err(&device->dev, "OPA Invalid dr_slid 0x%x\n",
+			       opa_drslid);
+			goto out;
+		}
+	} else {
+		if ((ib_get_smp_direction(smp) ? smp->dr_dlid : smp->dr_slid) ==
+		     IB_LID_PERMISSIVE &&
+		     smi_handle_dr_smp_send(smp, device->node_type, port_num) ==
+		     IB_SMI_DISCARD) {
+			ret = -EINVAL;
+			dev_err(&device->dev, "Invalid directed route\n");
+			goto out;
+		}
+		opa_drslid = be16_to_cpu(smp->dr_slid);
 	}
 
 	/* Check to post send on QP or process locally */
@@ -789,10 +816,15 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 	local->mad_priv = NULL;
 	local->recv_mad_agent = NULL;
 
-	if (mad_agent_priv->qp_info->supports_jumbo_mads)
+	if (mad_agent_priv->qp_info->supports_jumbo_mads) {
 		mad_priv = kmem_cache_alloc(jumbo_mad_cache, GFP_ATOMIC);
-	else
+		in_mad_size = sizeof(struct jumbo_mad);
+		out_mad_size = sizeof(struct jumbo_mad);
+	} else {
 		mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
+		in_mad_size = sizeof(struct ib_mad);
+		out_mad_size = sizeof(struct ib_mad);
+	}
 
 	if (!mad_priv) {
 		ret = -ENOMEM;
@@ -802,10 +834,16 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 	}
 
 	build_smp_wc(mad_agent_priv->agent.qp,
-		     send_wr->wr_id, be16_to_cpu(smp->dr_slid),
+		     send_wr->wr_id, (u16)(opa_drslid & 0x0000ffff),
 		     send_wr->wr.ud.pkey_index,
 		     send_wr->wr.ud.port_num, &mad_wc);
 
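+	/* Fake the WC byte_len so local handling sees the real incoming MAD size */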
+	if (smp->base_version == OPA_MGMT_BASE_VERSION) {
+		mad_wc.byte_len = mad_send_wr->send_buf.hdr_len
+					+ mad_send_wr->send_buf.data_len
+					+ sizeof(struct ib_grh);
+	}
+
 	/* No GRH for DR SMP */
 	ret = device->process_mad(device, 0, port_num, &mad_wc, NULL,
 				  (struct ib_mad_hdr *)smp, in_mad_size,
@@ -857,6 +895,8 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
 	}
 
 	local->mad_send_wr = mad_send_wr;
+	local->mad_send_wr->send_wr.wr.ud.pkey_index = mad_wc.pkey_index;
+	local->return_wc_byte_len = out_mad_size;
 	/* Reference MAD agent until send side of local completion handled */
 	atomic_inc(&mad_agent_priv->refcount);
 	/* Queue local completion to local list */
@@ -1749,14 +1789,15 @@ out:
 	return mad_agent;
 }
 
-static int validate_mad(struct ib_mad *mad, u32 qp_num)
+int validate_mad(struct ib_mad *mad, u32 qp_num, int jumbo)
 {
 	int valid = 0;
 
 	/* Make sure MAD base version is understood */
-	if (mad->mad_hdr.base_version != IB_MGMT_BASE_VERSION) {
-		pr_err("MAD received with unsupported base version %d\n",
-			mad->mad_hdr.base_version);
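+	/* The OPA base version is only understood on jumbo-capable QPs */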
+	if (mad->mad_hdr.base_version != IB_MGMT_BASE_VERSION
+	    && (!jumbo || mad->mad_hdr.base_version != OPA_MGMT_BASE_VERSION)) {
+		pr_err("MAD received with unsupported base version %d %s\n",
+			mad->mad_hdr.base_version, jumbo ? "(jumbo)" : "");
 		goto out;
 	}
 
@@ -1856,18 +1897,18 @@ ib_find_send_mad(struct ib_mad_agent_private *mad_agent_priv,
 		 struct ib_mad_recv_wc *wc)
 {
 	struct ib_mad_send_wr_private *wr;
-	struct ib_mad *mad;
+	struct ib_mad_hdr *mad_hdr;
 
-	mad = (struct ib_mad *)wc->recv_buf.mad;
+	mad_hdr = (struct ib_mad_hdr *)wc->recv_buf.mad;
 
 	list_for_each_entry(wr, &mad_agent_priv->wait_list, agent_list) {
-		if ((wr->tid == mad->mad_hdr.tid) &&
+		if ((wr->tid == mad_hdr->tid) &&
 		    rcv_has_same_class(wr, wc) &&
 		    /*
 		     * Don't check GID for direct routed MADs.
 		     * These might have permissive LIDs.
 		     */
-		    (is_direct(wc->recv_buf.mad->mad_hdr.mgmt_class) ||
+		    (is_direct(mad_hdr->mgmt_class) ||
 		     rcv_has_same_gid(mad_agent_priv, wr, wc)))
 			return (wr->status == IB_WC_SUCCESS) ? wr : NULL;
 	}
@@ -1878,14 +1919,14 @@ ib_find_send_mad(struct ib_mad_agent_private *mad_agent_priv,
 	 */
 	list_for_each_entry(wr, &mad_agent_priv->send_list, agent_list) {
 		if (is_rmpp_data_mad(mad_agent_priv, wr->send_buf.mad) &&
-		    wr->tid == mad->mad_hdr.tid &&
+		    wr->tid == mad_hdr->tid &&
 		    wr->timeout &&
 		    rcv_has_same_class(wr, wc) &&
 		    /*
 		     * Don't check GID for direct routed MADs.
 		     * These might have permissive LIDs.
 		     */
-		    (is_direct(wc->recv_buf.mad->mad_hdr.mgmt_class) ||
+		    (is_direct(mad_hdr->mgmt_class) ||
 		     rcv_has_same_gid(mad_agent_priv, wr, wc)))
 			/* Verify request has not been canceled */
 			return (wr->status == IB_WC_SUCCESS) ? wr : NULL;
@@ -1901,7 +1942,7 @@ void ib_mark_mad_done(struct ib_mad_send_wr_private *mad_send_wr)
 			      &mad_send_wr->mad_agent_priv->done_list);
 }
 
-static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
+void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
-				 struct ib_mad_recv_wc *mad_recv_wc)
+			  struct ib_mad_recv_wc *mad_recv_wc)
 {
 	struct ib_mad_send_wr_private *mad_send_wr;
@@ -2004,7 +2045,8 @@ enum smi_action handle_ib_smi(struct ib_mad_port_private *port_priv,
 				    &response->grh, wc,
 				    port_priv->device,
 				    smi_get_fwd_port(&recv->mad.smp),
-				    qp_info->qp->qp_num);
+				    qp_info->qp->qp_num,
+				    sizeof(struct ib_mad));
 
 		return IB_SMI_DISCARD;
 	}
@@ -2032,22 +2074,15 @@ static bool generate_unmatched_resp(struct ib_mad_private *recv,
 	}
 }
 static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv,
-				     struct ib_wc *wc)
+				     struct ib_wc *wc,
+				     struct ib_mad_private_header *mad_priv_hdr,
+				     struct ib_mad_qp_info *qp_info)
 {
-	struct ib_mad_qp_info *qp_info;
-	struct ib_mad_private_header *mad_priv_hdr;
 	struct ib_mad_private *recv, *response = NULL;
-	struct ib_mad_list_head *mad_list;
 	struct ib_mad_agent_private *mad_agent;
 	int port_num;
 	int ret = IB_MAD_RESULT_SUCCESS;
 
-	mad_list = (struct ib_mad_list_head *)(unsigned long)wc->wr_id;
-	qp_info = mad_list->mad_queue->qp_info;
-	dequeue_mad(mad_list);
-
-	mad_priv_hdr = container_of(mad_list, struct ib_mad_private_header,
-				    mad_list);
 	recv = container_of(mad_priv_hdr, struct ib_mad_private, header);
 	ib_dma_unmap_single(port_priv->device,
 			    recv->header.mapping,
@@ -2066,7 +2101,7 @@ static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv,
 		snoop_recv(qp_info, &recv->header.recv_wc, IB_MAD_SNOOP_RECVS);
 
 	/* Validate MAD */
-	if (!validate_mad(&recv->mad.mad, qp_info->qp->qp_num))
+	if (!validate_mad(&recv->mad.mad, qp_info->qp->qp_num, 0))
 		goto out;
 
 	response = kmem_cache_alloc(ib_mad_cache, GFP_KERNEL);
@@ -2107,7 +2142,8 @@ static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv,
 						    &recv->grh, wc,
 						    port_priv->device,
 						    port_num,
-						    qp_info->qp->qp_num);
+						    qp_info->qp->qp_num,
+						    sizeof(struct ib_mad));
 				goto out;
 			}
 		}
@@ -2124,7 +2160,9 @@ static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv,
 	} else if ((ret & IB_MAD_RESULT_SUCCESS) &&
 		   generate_unmatched_resp(recv, response)) {
 		agent_send_response(&response->mad.mad, &recv->grh, wc,
-				    port_priv->device, port_num, qp_info->qp->qp_num);
+				    port_priv->device, port_num,
+				    qp_info->qp->qp_num,
+				    sizeof(struct ib_mad));
 	}
 
 out:
@@ -2391,6 +2429,241 @@ static void mad_error_handler(struct ib_mad_port_private *port_priv,
 	}
 }
 
+static enum smi_action
+handle_opa_smi(struct ib_mad_port_private *port_priv,
+	       struct ib_mad_qp_info *qp_info,
+	       struct ib_wc *wc,
+	       int port_num,
+	       struct jumbo_mad_private *recv,
+	       struct jumbo_mad_private *response)
+{
+	enum smi_forward_action retsmi;
+
+	if (opa_smi_handle_dr_smp_recv(&recv->mad.smp,
+				   port_priv->device->node_type,
+				   port_num,
+				   port_priv->device->phys_port_cnt) ==
+				   IB_SMI_DISCARD)
+		return IB_SMI_DISCARD;
+
+	retsmi = opa_smi_check_forward_dr_smp(&recv->mad.smp);
+	if (retsmi == IB_SMI_LOCAL)
+		return IB_SMI_HANDLE;
+
+	if (retsmi == IB_SMI_SEND) { /* don't forward */
+		if (opa_smi_handle_dr_smp_send(&recv->mad.smp,
+					   port_priv->device->node_type,
+					   port_num) == IB_SMI_DISCARD)
+			return IB_SMI_DISCARD;
+
+		if (opa_smi_check_local_smp(&recv->mad.smp, port_priv->device) == IB_SMI_DISCARD)
+			return IB_SMI_DISCARD;
+
+	} else if (port_priv->device->node_type == RDMA_NODE_IB_SWITCH) {
+		/* forward case for switches */
+		memcpy(response, recv, sizeof(*response));
+		response->header.recv_wc.wc = &response->header.wc;
+		response->header.recv_wc.recv_buf.mad = (struct ib_mad *)&response->mad.mad;
+		response->header.recv_wc.recv_buf.grh = &response->grh;
+
+		agent_send_response((struct ib_mad *)&response->mad.mad,
+				    &response->grh, wc,
+				    port_priv->device,
+				    opa_smi_get_fwd_port(&recv->mad.smp),
+				    qp_info->qp->qp_num,
+				    recv->header.wc.byte_len);
+
+		return IB_SMI_DISCARD;
+	}
+
+	return IB_SMI_HANDLE;
+}
+
+static enum smi_action
+jumbo_handle_smi(struct ib_mad_port_private *port_priv,
+		 struct ib_mad_qp_info *qp_info,
+		 struct ib_wc *wc,
+		 int port_num,
+		 struct jumbo_mad_private *recv,
+		 struct jumbo_mad_private *response)
+{
+	if (recv->mad.mad.mad_hdr.base_version == OPA_MGMT_BASE_VERSION) {
+		switch (recv->mad.mad.mad_hdr.class_version) {
+		case OPA_SMI_CLASS_VERSION:
+			return handle_opa_smi(port_priv, qp_info, wc, port_num,
+					      recv, response);
+			/* stub for other Jumbo SMI versions */
+		}
+	}
+
+	return handle_ib_smi(port_priv, qp_info, wc, port_num,
+			     (struct ib_mad_private *)recv,
+			     (struct ib_mad_private *)response);
+}
+
+static bool generate_jumbo_unmatched_resp(struct jumbo_mad_private *recv,
+					  struct jumbo_mad_private *response,
+					  size_t *resp_len)
+{
+	if (recv->mad.mad.mad_hdr.method == IB_MGMT_METHOD_GET ||
+	    recv->mad.mad.mad_hdr.method == IB_MGMT_METHOD_SET) {
+		memcpy(response, recv, sizeof(*response));
+		response->header.recv_wc.wc = &response->header.wc;
+		response->header.recv_wc.recv_buf.mad = (struct ib_mad *)&response->mad.mad;
+		response->header.recv_wc.recv_buf.grh = &response->grh;
+		response->mad.mad.mad_hdr.method = IB_MGMT_METHOD_GET_RESP;
+		response->mad.mad.mad_hdr.status =
+			cpu_to_be16(IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB);
+		if (recv->mad.mad.mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
+			response->mad.mad.mad_hdr.status |= IB_SMP_DIRECTION;
+
+		if (recv->mad.mad.mad_hdr.base_version == OPA_MGMT_BASE_VERSION) {
+			if (recv->mad.mad.mad_hdr.mgmt_class ==
+			    IB_MGMT_CLASS_SUBN_LID_ROUTED ||
+			    recv->mad.mad.mad_hdr.mgmt_class ==
+			    IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
+				*resp_len = opa_get_smp_header_size(
+							(struct opa_smp *)&recv->mad.smp);
+			else
+				*resp_len = sizeof(struct ib_mad_hdr);
+		}
+
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * NOTE: Processing of received jumbo MADs is kept separate for buffer handling.
+ */
+void ib_mad_recv_done_jumbo_handler(struct ib_mad_port_private *port_priv,
+				    struct ib_wc *wc,
+				    struct ib_mad_private_header *mad_priv_hdr,
+				    struct ib_mad_qp_info *qp_info)
+{
+	struct jumbo_mad_private *recv, *response = NULL;
+	struct ib_mad_agent_private *mad_agent;
+	int port_num;
+	int ret = IB_MAD_RESULT_SUCCESS;
+	u8 base_version;
+	size_t resp_len = 0;
+
+	recv = container_of(mad_priv_hdr, struct jumbo_mad_private, header);
+	ib_dma_unmap_single(port_priv->device,
+			    recv->header.mapping,
+			    sizeof(struct jumbo_mad_private) -
+			      sizeof(struct ib_mad_private_header),
+			    DMA_FROM_DEVICE);
+
+	/* Setup MAD receive work completion from "normal" work completion */
+	recv->header.wc = *wc;
+	recv->header.recv_wc.wc = &recv->header.wc;
+	base_version = recv->mad.mad.mad_hdr.base_version;
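+	/* wc->byte_len includes the GRH; mad_len must not */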
+	if (base_version == OPA_MGMT_BASE_VERSION)
+		recv->header.recv_wc.mad_len = wc->byte_len - sizeof(struct ib_grh);
+	else
+		recv->header.recv_wc.mad_len = sizeof(struct ib_mad);
+	recv->header.recv_wc.recv_buf.mad = (struct ib_mad *)&recv->mad.mad;
+	recv->header.recv_wc.recv_buf.grh = &recv->grh;
+
+	if (atomic_read(&qp_info->snoop_count))
+		snoop_recv(qp_info, &recv->header.recv_wc, IB_MAD_SNOOP_RECVS);
+
+	if (!validate_mad((struct ib_mad *)&recv->mad.mad, qp_info->qp->qp_num, 1))
+		goto out;
+
+	response = kmem_cache_alloc(jumbo_mad_cache, GFP_KERNEL);
+	if (!response) {
+		pr_err("ib_mad_recv_done_jumbo_handler no memory for response buffer (jumbo)\n");
+		goto out;
+	}
+
+	if (port_priv->device->node_type == RDMA_NODE_IB_SWITCH)
+		port_num = wc->port_num;
+	else
+		port_num = port_priv->port_num;
+
+	if (recv->mad.mad.mad_hdr.mgmt_class ==
+	    IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) {
+		if (jumbo_handle_smi(port_priv, qp_info, wc, port_num, recv, response)
+		    == IB_SMI_DISCARD)
+			goto out;
+	}
+
+	/* Give driver "right of first refusal" on incoming MAD */
+	if (port_priv->device->process_mad) {
+		resp_len = sizeof(struct jumbo_mad);
+		ret = port_priv->device->process_mad(port_priv->device, 0,
+						     port_priv->port_num,
+						     wc, &recv->grh,
+						     (struct ib_mad_hdr *)&recv->mad.mad,
+						     sizeof(struct jumbo_mad),
+						     (struct ib_mad_hdr *)&response->mad.mad,
+						     &resp_len);
+		if (ret & IB_MAD_RESULT_SUCCESS) {
+			if (ret & IB_MAD_RESULT_CONSUMED)
+				goto out;
+			if (ret & IB_MAD_RESULT_REPLY) {
+				agent_send_response((struct ib_mad *)&response->mad.mad,
+						    &recv->grh, wc,
+						    port_priv->device,
+						    port_num,
+						    qp_info->qp->qp_num,
+						    resp_len);
+				goto out;
+			}
+		}
+	}
+
+	mad_agent = find_mad_agent(port_priv, (struct ib_mad *)&recv->mad.mad);
+	if (mad_agent) {
+		ib_mad_complete_recv(mad_agent, &recv->header.recv_wc);
+		/*
+		 * recv is freed up in error cases in ib_mad_complete_recv
+		 * or via recv_handler in ib_mad_complete_recv()
+		 */
+		recv = NULL;
+	} else if ((ret & IB_MAD_RESULT_SUCCESS) &&
+		   generate_jumbo_unmatched_resp(recv, response, &resp_len)) {
+		agent_send_response((struct ib_mad *)&response->mad.mad, &recv->grh, wc,
+				    port_priv->device, port_num,
+				    qp_info->qp->qp_num,
+				    resp_len);
+	}
+
+out:
+	/* Post another receive request for this QP */
+	if (response) {
+		ib_mad_post_jumbo_rcv_mads(qp_info, response);
+		if (recv) {
+			BUG_ON(!(recv->header.flags & IB_MAD_PRIV_FLAG_JUMBO));
+			kmem_cache_free(jumbo_mad_cache, recv);
+		}
+	} else {
+		ib_mad_post_jumbo_rcv_mads(qp_info, recv);
+	}
+}
+
+static void ib_mad_recv_mad(struct ib_mad_port_private *port_priv,
+			    struct ib_wc *wc)
+{
+	struct ib_mad_qp_info *qp_info;
+	struct ib_mad_list_head *mad_list;
+	struct ib_mad_private_header *mad_priv_hdr;
+
+	mad_list = (struct ib_mad_list_head *)(unsigned long)wc->wr_id;
+	qp_info = mad_list->mad_queue->qp_info;
+	dequeue_mad(mad_list);
+
+	mad_priv_hdr = container_of(mad_list, struct ib_mad_private_header,
+				    mad_list);
+
+	if (qp_info->supports_jumbo_mads)
+		ib_mad_recv_done_jumbo_handler(port_priv, wc, mad_priv_hdr, qp_info);
+	else
+		ib_mad_recv_done_handler(port_priv, wc, mad_priv_hdr, qp_info);
+}
+
 /*
  * IB MAD completion callback
  */
@@ -2409,7 +2682,7 @@ static void ib_mad_completion_handler(struct work_struct *work)
 				ib_mad_send_done_handler(port_priv, &wc);
 				break;
 			case IB_WC_RECV:
-				ib_mad_recv_done_handler(port_priv, &wc);
+				ib_mad_recv_mad(port_priv, &wc);
 				break;
 			default:
 				BUG_ON(1);
@@ -2541,6 +2814,7 @@ static void local_completions(struct work_struct *work)
 		spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
 		free_mad = 0;
 		if (local->mad_priv) {
+			u8 base_version;
 			recv_mad_agent = local->recv_mad_agent;
 			if (!recv_mad_agent) {
 				dev_err(&mad_agent_priv->agent.device->dev,
@@ -2556,11 +2830,17 @@ static void local_completions(struct work_struct *work)
 			build_smp_wc(recv_mad_agent->agent.qp,
 				     (unsigned long) local->mad_send_wr,
 				     be16_to_cpu(IB_LID_PERMISSIVE),
-				     0, recv_mad_agent->agent.port_num, &wc);
+				     local->mad_send_wr->send_wr.wr.ud.pkey_index,
+				     recv_mad_agent->agent.port_num, &wc);
 
 			local->mad_priv->header.recv_wc.wc = &wc;
-			local->mad_priv->header.recv_wc.mad_len =
-						sizeof(struct ib_mad);
+
+			base_version = local->mad_priv->mad.mad.mad_hdr.base_version;
+			if (base_version == OPA_MGMT_BASE_VERSION)
+				local->mad_priv->header.recv_wc.mad_len = local->return_wc_byte_len;
+			else
+				local->mad_priv->header.recv_wc.mad_len = sizeof(struct ib_mad);
+
 			INIT_LIST_HEAD(&local->mad_priv->header.recv_wc.rmpp_list);
 			list_add(&local->mad_priv->header.recv_wc.recv_buf.list,
 				 &local->mad_priv->header.recv_wc.rmpp_list);
@@ -2818,6 +3098,81 @@ static void cleanup_recv_queue(struct ib_mad_qp_info *qp_info)
 }
 
 /*
+ * Allocate jumbo receive MADs and post receive WRs for them
+ */
+static int ib_mad_post_jumbo_rcv_mads(struct ib_mad_qp_info *qp_info,
+				      struct jumbo_mad_private *mad)
+{
+	unsigned long flags;
+	int post, ret;
+	struct jumbo_mad_private *mad_priv;
+	struct ib_sge sg_list;
+	struct ib_recv_wr recv_wr, *bad_recv_wr;
+	struct ib_mad_queue *recv_queue = &qp_info->recv_queue;
+
+	if (unlikely(!qp_info->supports_jumbo_mads)) {
+		pr_err("Attempt to post jumbo MAD on non-jumbo QP\n");
+		return -EINVAL;
+	}
+
+	/* Initialize common scatter list fields */
+	sg_list.length = sizeof(*mad_priv) - sizeof(mad_priv->header);
+	sg_list.lkey = (*qp_info->port_priv->mr).lkey;
+
+	/* Initialize common receive WR fields */
+	recv_wr.next = NULL;
+	recv_wr.sg_list = &sg_list;
+	recv_wr.num_sge = 1;
+
+	do {
+		/* Allocate and map receive buffer */
+		if (mad) {
+			mad_priv = mad;
+			mad = NULL;
+		} else {
+			mad_priv = kmem_cache_alloc(jumbo_mad_cache, GFP_KERNEL);
+			if (!mad_priv) {
+				pr_err("No memory for jumbo receive buffer\n");
+				ret = -ENOMEM;
+				break;
+			}
+		}
+		sg_list.addr = ib_dma_map_single(qp_info->port_priv->device,
+						 &mad_priv->grh,
+						 sizeof(*mad_priv) -
+						   sizeof(mad_priv->header),
+						 DMA_FROM_DEVICE);
+		mad_priv->header.mapping = sg_list.addr;
+		recv_wr.wr_id = (unsigned long)&mad_priv->header.mad_list;
+		mad_priv->header.mad_list.mad_queue = recv_queue;
+
+		/* Post receive WR */
+		spin_lock_irqsave(&recv_queue->lock, flags);
+		post = (++recv_queue->count < recv_queue->max_active);
+		list_add_tail(&mad_priv->header.mad_list.list, &recv_queue->list);
+		spin_unlock_irqrestore(&recv_queue->lock, flags);
+		ret = ib_post_recv(qp_info->qp, &recv_wr, &bad_recv_wr);
+		if (ret) {
+			spin_lock_irqsave(&recv_queue->lock, flags);
+			list_del(&mad_priv->header.mad_list.list);
+			recv_queue->count--;
+			spin_unlock_irqrestore(&recv_queue->lock, flags);
+			ib_dma_unmap_single(qp_info->port_priv->device,
+					    mad_priv->header.mapping,
+					    sizeof(*mad_priv)-
+					      sizeof(mad_priv->header),
+					    DMA_FROM_DEVICE);
+			BUG_ON(!(mad_priv->header.flags & IB_MAD_PRIV_FLAG_JUMBO));
+			kmem_cache_free(jumbo_mad_cache, mad_priv);
+			pr_err("ib_post_recv failed: %d\n", ret);
+			break;
+		}
+	} while (post);
+
+	return ret;
+}
+
+/*
  * Start the port
  */
 static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
@@ -2892,7 +3247,10 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 		if (!port_priv->qp_info[i].qp)
 			continue;
 
-		ret = ib_mad_post_receive_mads(&port_priv->qp_info[i], NULL);
+		if (port_priv->qp_info[i].supports_jumbo_mads)
+			ret = ib_mad_post_jumbo_rcv_mads(&port_priv->qp_info[i], NULL);
+		else
+			ret = ib_mad_post_receive_mads(&port_priv->qp_info[i], NULL);
 		if (ret) {
 			dev_err(&port_priv->device->dev,
 				"Couldn't post receive WRs\n");
diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h
index 7a82950..6c54be8 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -175,6 +175,7 @@ struct ib_mad_local_private {
 	struct ib_mad_private *mad_priv; /* can be struct jumbo_mad_private */
 	struct ib_mad_agent_private *recv_mad_agent;
 	struct ib_mad_send_wr_private *mad_send_wr;
+	size_t return_wc_byte_len;
 };
 
 struct ib_mad_mgmt_method_table {
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 7184530..514f0a1 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -1,6 +1,7 @@
 /*
  * Copyright (c) 2005 Intel Inc. All rights reserved.
  * Copyright (c) 2005-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2014 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -67,6 +68,7 @@ struct mad_rmpp_recv {
 	u8 mgmt_class;
 	u8 class_version;
 	u8 method;
+	u8 base_version;
 };
 
 static inline void deref_rmpp_recv(struct mad_rmpp_recv *rmpp_recv)
@@ -318,6 +320,7 @@ create_rmpp_recv(struct ib_mad_agent_private *agent,
 	rmpp_recv->mgmt_class = mad_hdr->mgmt_class;
 	rmpp_recv->class_version = mad_hdr->class_version;
 	rmpp_recv->method  = mad_hdr->method;
+	rmpp_recv->base_version  = mad_hdr->base_version;
 	return rmpp_recv;
 
 error:	kfree(rmpp_recv);
@@ -431,16 +434,23 @@ static void update_seg_num(struct mad_rmpp_recv *rmpp_recv,
 
 static inline int get_mad_len(struct mad_rmpp_recv *rmpp_recv)
 {
-	struct ib_rmpp_mad *rmpp_mad;
+	struct ib_rmpp_base *rmpp_base;
 	int hdr_size, data_size, pad;
 
-	rmpp_mad = (struct ib_rmpp_mad *)rmpp_recv->cur_seg_buf->mad;
+	rmpp_base = &((struct jumbo_rmpp_mad *)rmpp_recv->cur_seg_buf->mad)->base;
 
-	hdr_size = ib_get_mad_data_offset(rmpp_mad->base.mad_hdr.mgmt_class);
-	data_size = sizeof(struct ib_rmpp_mad) - hdr_size;
-	pad = IB_MGMT_RMPP_DATA - be32_to_cpu(rmpp_mad->base.rmpp_hdr.paylen_newwin);
-	if (pad > IB_MGMT_RMPP_DATA || pad < 0)
-		pad = 0;
+	hdr_size = ib_get_mad_data_offset(rmpp_base->mad_hdr.mgmt_class);
+	if (rmpp_recv->base_version == OPA_MGMT_BASE_VERSION) {
+		data_size = sizeof(struct jumbo_rmpp_mad) - hdr_size;
+		pad = JUMBO_MGMT_RMPP_DATA - be32_to_cpu(rmpp_base->rmpp_hdr.paylen_newwin);
+		if (pad > JUMBO_MGMT_RMPP_DATA || pad < 0)
+			pad = 0;
+	} else {
+		data_size = sizeof(struct ib_rmpp_mad) - hdr_size;
+		pad = IB_MGMT_RMPP_DATA - be32_to_cpu(rmpp_base->rmpp_hdr.paylen_newwin);
+		if (pad > IB_MGMT_RMPP_DATA || pad < 0)
+			pad = 0;
+	}
 
 	return hdr_size + rmpp_recv->seg_num * data_size - pad;
 }
@@ -933,11 +943,11 @@ int ib_process_rmpp_send_wc(struct ib_mad_send_wr_private *mad_send_wr,
 
 int ib_retry_rmpp(struct ib_mad_send_wr_private *mad_send_wr)
 {
-	struct ib_rmpp_base *rmpp_base;
+	struct ib_rmpp_mad *rmpp_mad;
 	int ret;
 
-	rmpp_base = mad_send_wr->send_buf.mad;
-	if (!(ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) &
+	rmpp_mad = mad_send_wr->send_buf.mad;
+	if (!(ib_get_rmpp_flags(&rmpp_mad->base.rmpp_hdr) &
 	      IB_MGMT_RMPP_FLAG_ACTIVE))
 		return IB_RMPP_RESULT_UNHANDLED; /* RMPP not active */
 
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 3b4b614..aca72e4 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -263,20 +263,27 @@ static ssize_t copy_recv_mad(struct ib_umad_file *file, char __user *buf,
 {
 	struct ib_mad_recv_buf *recv_buf;
 	int left, seg_payload, offset, max_seg_payload;
+	int seg_size;
 
-	/* We need enough room to copy the first (or only) MAD segment. */
 	recv_buf = &packet->recv_wc->recv_buf;
-	if ((packet->length <= sizeof (*recv_buf->mad) &&
+
+	if (recv_buf->mad->mad_hdr.base_version == OPA_MGMT_BASE_VERSION)
+		seg_size = sizeof(struct jumbo_mad);
+	else
+		seg_size = sizeof(struct ib_mad);
+
+	/* We need enough room to copy the first (or only) MAD segment. */
+	if ((packet->length <= seg_size &&
 	     count < hdr_size(file) + packet->length) ||
-	    (packet->length > sizeof (*recv_buf->mad) &&
-	     count < hdr_size(file) + sizeof (*recv_buf->mad)))
+	    (packet->length > seg_size &&
+	     count < hdr_size(file) + seg_size))
 		return -EINVAL;
 
 	if (copy_to_user(buf, &packet->mad, hdr_size(file)))
 		return -EFAULT;
 
 	buf += hdr_size(file);
-	seg_payload = min_t(int, packet->length, sizeof (*recv_buf->mad));
+	seg_payload = min_t(int, packet->length, seg_size);
 	if (copy_to_user(buf, recv_buf->mad, seg_payload))
 		return -EFAULT;
 
@@ -293,7 +300,7 @@ static ssize_t copy_recv_mad(struct ib_umad_file *file, char __user *buf,
 			return -ENOSPC;
 		}
 		offset = ib_get_mad_data_offset(recv_buf->mad->mad_hdr.mgmt_class);
-		max_seg_payload = sizeof (struct ib_mad) - offset;
+		max_seg_payload = seg_size - offset;
 
 		for (left = packet->length - seg_payload, buf += seg_payload;
 		     left; left -= seg_payload, buf += seg_payload) {
@@ -448,9 +455,10 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
 	struct ib_mad_agent *agent;
 	struct ib_ah_attr ah_attr;
 	struct ib_ah *ah;
-	struct ib_rmpp_base *rmpp_base;
+	struct ib_rmpp_mad *rmpp_mad;
 	__be64 *tid;
 	int ret, data_len, hdr_len, copy_offset, rmpp_active;
+	u8 base_version;
 
 	if (count < hdr_size(file) + IB_MGMT_RMPP_HDR)
 		return -EINVAL;
@@ -504,25 +512,26 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
 		goto err_up;
 	}
 
-	rmpp_base = (struct ib_rmpp_base *) packet->mad.data;
-	hdr_len = ib_get_mad_data_offset(rmpp_base->mad_hdr.mgmt_class);
+	rmpp_mad = (struct ib_rmpp_mad *) packet->mad.data;
+	hdr_len = ib_get_mad_data_offset(rmpp_mad->base.mad_hdr.mgmt_class);
 
-	if (ib_is_mad_class_rmpp(rmpp_base->mad_hdr.mgmt_class)
+	if (ib_is_mad_class_rmpp(rmpp_mad->base.mad_hdr.mgmt_class)
 	    && ib_mad_kernel_rmpp_agent(agent)) {
 		copy_offset = IB_MGMT_RMPP_HDR;
-		rmpp_active = ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) &
+		rmpp_active = ib_get_rmpp_flags(&rmpp_mad->base.rmpp_hdr) &
 						IB_MGMT_RMPP_FLAG_ACTIVE;
 	} else {
 		copy_offset = IB_MGMT_MAD_HDR;
 		rmpp_active = 0;
 	}
 
+	base_version = ((struct ib_mad_hdr *)&packet->mad.data)->base_version;
 	data_len = count - hdr_size(file) - hdr_len;
 	packet->msg = ib_create_send_mad(agent,
 					 be32_to_cpu(packet->mad.hdr.qpn),
 					 packet->mad.hdr.pkey_index, rmpp_active,
 					 hdr_len, data_len, GFP_KERNEL,
-					 IB_MGMT_BASE_VERSION);
+					 base_version);
 	if (IS_ERR(packet->msg)) {
 		ret = PTR_ERR(packet->msg);
 		goto err_ah;
@@ -558,12 +567,12 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
 		tid = &((struct ib_mad_hdr *) packet->msg->mad)->tid;
 		*tid = cpu_to_be64(((u64) agent->hi_tid) << 32 |
 				   (be64_to_cpup(tid) & 0xffffffff));
-		rmpp_base->mad_hdr.tid = *tid;
+		rmpp_mad->base.mad_hdr.tid = *tid;
 	}
 
 	if (!ib_mad_kernel_rmpp_agent(agent)
-	   && ib_is_mad_class_rmpp(rmpp_base->mad_hdr.mgmt_class)
-	   && (ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) & IB_MGMT_RMPP_FLAG_ACTIVE)) {
+	   && ib_is_mad_class_rmpp(rmpp_mad->base.mad_hdr.mgmt_class)
+	   && (ib_get_rmpp_flags(&rmpp_mad->base.rmpp_hdr) & IB_MGMT_RMPP_FLAG_ACTIVE)) {
 		spin_lock_irq(&file->send_lock);
 		list_add_tail(&packet->list, &file->send_list);
 		spin_unlock_irq(&file->send_lock);
-- 
1.8.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.
       [not found] ` <1415908465-24392-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (15 preceding siblings ...)
  2014-11-13 19:54   ` [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General MAD processing ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2014-11-18 22:16   ` Or Gerlitz
       [not found]     ` <CAJ3xEMhtm99dRdcEvhK9s961mDr7YSU3pkv-WK=sESKe_K4kYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-11-25 23:16   ` Hal Rosenstock
  17 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2014-11-18 22:16 UTC (permalink / raw)
  To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, Nov 13, 2014 at 9:54 PM,  <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> The following patch series modifies the kernel MAD processing (ib_mad/ib_umad)
> and related interfaces to process Intel Omni-Path Architecture MADs on devices
> which support them.

So does this series only allow processing such MADs when they arrive,
or does it go beyond that and allow sending such MADs too?

> In addition to supporting some IBTA management classes, OPA devices use MADs
> with lengths up to 2K.  These "jumbo" MADs increase the performance of
> management traffic.

Can you provide 1-2 use cases where such MADs will be sent, and by what
entity? I recall 2KB MADs being mentioned in our LWG talks 8 years ago
on IB routers...

> To distinguish IBTA MADs from OPA MADs a new Base Version is introduced.  The
> new format shares the same common header with IBTA MADs which allows us to
> share most of the MAD processing code when dealing with the new Base Version.
>
>
> The patch series is broken into 3 main areas.
>
> 1) Add the ability for devices to indicate "jumbo" MAD support.  In addition,
>    modify the ib_mad module to detect those devices and allocate the resources
>    for the QPs on those devices.
>
> 2) Enhance the interface to the device agents to support larger and variable
>    length MADs.
>
> 3) Add support for creating and processing OPA Base Version MADs including
>    a new SMP class version specific to OPA devices.
>
>
>   [RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad
>   [RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device cap
>   [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
>   [RFC PATCH 04/16] ib/mad: add base version parameter to
>   [RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad
>   [RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures
>   [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache

Why not use a single kmem-cache instance with a non-hard-coded element
size: 256B (or whatever we use today) or 2KB?
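
For illustration, a single cache sized at init time might look like this
(sketch only; max_mad_size here stands in for whatever the largest
supported MAD turns out to be):

	mad_cache = kmem_cache_create("ib_mad",
				      max_mad_size +
					sizeof(struct ib_mad_private_header),
				      0, SLAB_HWCACHE_ALIGN, NULL);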

Also (nit), please change the prefix for all patches to be IB/mad: and
not ib/mad: to comply with the existing convention for patch titles in
the IB subsystem

And (another nit), generate patch 0/N using

$ git format-patch --cover-letter

so we have the exact subject line for each patch and the overall
diffstat in the cover-letter

>   [RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines
>   [RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path
>   [RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path
>   [RFC PATCH 11/16] ib/mad: create helper function for
>   [RFC PATCH 12/16] ib/mad: create helper function for
>   [RFC PATCH 13/16] ib/mad: create helper function for
>   [RFC PATCH 14/16] ib/mad: Create helper function for SMI processing
>   [RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP
>   [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.
       [not found]     ` <CAJ3xEMhtm99dRdcEvhK9s961mDr7YSU3pkv-WK=sESKe_K4kYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-11-25 21:52       ` Weiny, Ira
       [not found]         ` <2807E5FD2F6FDA4886F6618EAC48510E0CBC6B23-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Weiny, Ira @ 2014-11-25 21:52 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> 
> On Thu, Nov 13, 2014 at 9:54 PM,  <ira.weiny@intel.com> wrote:
> > The following patch series modifies the kernel MAD processing
> > (ib_mad/ib_umad) and related interfaces to process Intel Omni-Path
> > Architecture MADs on devices which support them.
> 
> So does this series only allow processing such MADs when they arrive, or
> does it go beyond that and allow sending such MADs too?

Both send and receive are supported.  My apologies for not being clear.

> 
> > In addition to supporting some IBTA management classes, OPA devices
> > use MADs with lengths up to 2K.  These "jumbo" MADs increase the
> > performance of management traffic.
> 
> Can you provide 1-2 use cases where such MADs will be sent, and by what
> entity? I recall 2KB MADs being mentioned in our LWG talks 8 years ago on
> IB routers...

The Intel Omni-Path driver and SM will be the entities using these MADs.
The patch series is written such that other devices could use jumbo MADs,
but there is no attempt to predict how other technologies would do so.

[snip]

> >   [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
> 
> Why not use a single kmem-cache instance with a non-hard-coded element
> size: 256B (or whatever we use today) or 2KB?

I wanted to be able to adjust the element count of the caches separately
to better tune overall memory usage.  However, I stopped short of adding
additional module parameters to adjust the 2K cache at this time.
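
For reference, the current layout is roughly the following (sketch; cache
names illustrative), which is what lets the two element counts be tuned
independently:

	ib_mad_cache = kmem_cache_create("ib_mad",
					 sizeof(struct ib_mad_private),
					 0, SLAB_HWCACHE_ALIGN, NULL);
	jumbo_mad_cache = kmem_cache_create("ib_jumbo_mad",
					    sizeof(struct jumbo_mad_private),
					    0, SLAB_HWCACHE_ALIGN, NULL);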

> 
> Also (nit), please change the prefix for all patches to be IB/mad: and not
> ib/mad: to comply with the existing convention for patch titles in the IB
> subsystem

I will thanks.

> 
> And (another nit), generate patch 0/N using
> 
> $ git format-patch --cover-letter
> 
> so we have the exact subject line for each patch and the overall diffstat
> in the cover-letter

I will.  My reason for not doing this previously was that I run the git
send-email command on generated patches from a different machine than the
one on which my repo is located, due to Intel's firewall setup.  I will
clone the repo there and do this in the future.

-- Ira


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.
       [not found] ` <1415908465-24392-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (16 preceding siblings ...)
  2014-11-18 22:16   ` [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) " Or Gerlitz
@ 2014-11-25 23:16   ` Hal Rosenstock
       [not found]     ` <54750DCF.90308-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  17 siblings, 1 reply; 36+ messages in thread
From: Hal Rosenstock @ 2014-11-25 23:16 UTC (permalink / raw)
  To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Eitan Zahavi

On 11/13/2014 2:54 PM, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> The following patch series modifies the kernel MAD processing (ib_mad/ib_umad)
> and related interfaces to process Intel Omni-Path Architecture MADs on devices
> which support them.
> 
> In addition to supporting some IBTA management classes, OPA devices use MADs
> with lengths up to 2K.  These "jumbo" MADs increase the performance of
> management traffic.
> 
> To distinguish IBTA MADs from OPA MADs a new Base Version is introduced.  The
> new format shares the same common header with IBTA MADs which allows us to
> share most of the MAD processing code when dealing with the new Base Version.

Is there any intention of standardizing OPA (jumbo MADs) at the IBTA, or
has that ship already sailed? I'm assuming it is the latter.

In any case, the OPA base version should be claimed at the IBTA so it
won't be used by anything else in the future.

Also, will OPA nodes interoperate with standard IBTA nodes, or does OPA
assume a homogeneous OPA subnet?

> The patch series is broken into 3 main areas.
> 
> 1) Add the ability for devices to indicate "jumbo" MAD support.  In addition,
>    modify the ib_mad module to detect those devices and allocate the resources
>    for the QPs on those devices.
> 
> 2) Enhance the interface to the device agents to support larger and variable
>    length MADs.
> 
> 3) Add support for creating and processing OPA Base Version MADs including 
>    a new SMP class version specific to OPA devices.

At a minimum, the OPA SMP class version should also be claimed at the IBTA.

>   [RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad
>   [RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device cap
>   [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
>   [RFC PATCH 04/16] ib/mad: add base version parameter to
>   [RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad
>   [RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures
>   [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
>   [RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines
>   [RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path
>   [RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path
>   [RFC PATCH 11/16] ib/mad: create helper function for
>   [RFC PATCH 12/16] ib/mad: create helper function for
>   [RFC PATCH 13/16] ib/mad: create helper function for
>   [RFC PATCH 14/16] ib/mad: Create helper function for SMI processing
>   [RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP
>   [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General

Has regression testing been done with these changes on IBTA MAD support,
in terms of agents, SMs, and diagnostics, to be sure that things still
work properly?

-- Hal

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.
       [not found]         ` <2807E5FD2F6FDA4886F6618EAC48510E0CBC6B23-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2014-11-27 10:02           ` Or Gerlitz
       [not found]             ` <5476F6BB.1020200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2014-11-27 10:02 UTC (permalink / raw)
  To: Weiny, Ira; +Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/25/2014 11:52 PM, Weiny, Ira wrote:
>> On Thu, Nov 13, 2014 at 9:54 PM,  <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>>> The following patch series modifies the kernel MAD processing
>>> (ib_mad/ib_umad) and related interfaces to process Intel Omni-Path
>>> Architecture MADs on devices which support them.
>> So does this series only allow processing such MADs when they arrive,
>> or does it go beyond that and allow sending such MADs too?
> Both send and receive are supported.  My apologies for not being clear.
>
>>> In addition to supporting some IBTA management classes, OPA devices
>>> use MADs with lengths up to 2K.  These "jumbo" MADs increase the
>>> performance of management traffic.
>> Can you provide 1-2 use cases where such MADs will be sent, and by what
>> entity? I recall 2KB MADs being mentioned in our LWG talks 8 years ago
>> on IB routers...
> The Intel Omni-Path driver and SM will be the entities using these MADs.
> The patch series is written such that other devices could use jumbo MADs,
> but there is no attempt to predict how other technologies would do so.
>
> [snip]
>
>>>    [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
>> Why not use a single kmem-cache instance with a non-hard-coded element
>> size: 256B (or whatever we use today) or 2KB?
> I wanted to be able to adjust the element count of the caches separately
> to better tune overall memory usage.  However, I stopped short of adding
> additional module parameters to adjust the 2K cache at this time.


I tend to think that the resulting code is too much of a special-purpose
one under a (jumbo == 2K) assumption. See some more comments in the
individual patches and we'll take it from there.



>
>> Also (nit), please change the prefix for all patches to be IB/mad: and not
>> ib/mad: to comply with the existing habit of patch titles for the IB subsystem
> I will thanks.

Good. See below another easy-to-fix nitpicking comment, but before that,
for the sake of easier review and robustness of the code to future
bisections, please re-order the series such that all general refactoring
and preparatory patches come before the OPA patches.

This means re-ordering the current series so that patches 8/9/10 are
located after patch 14, as listed here:

   [RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad
   [RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device cap
   [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
   [RFC PATCH 04/16] ib/mad: add base version parameter to
   [RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad
   [RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures
   [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
   [RFC PATCH 11/16] ib/mad: create helper function for
   [RFC PATCH 12/16] ib/mad: create helper function for
   [RFC PATCH 13/16] ib/mad: create helper function for
   [RFC PATCH 14/16] ib/mad: Create helper function for SMI processing
   [RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines
   [RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path
   [RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path
   [RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP
   [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General

Another easy-to-fix nitpicking comment would be to have all the patches
be consistent w.r.t. the capitalization of the first letter of the first
word after the IB/core: or IB/mad: prefix, e.g.

ib/mad: create helper function for smi_handle_dr_smp_send

becomes

IB/mad: Create helper function for smi_handle_dr_smp_send

BTW, here my personal preference is "Add helper" and not "Create helper"

IB/mad: Add helper function for smi_handle_dr_smp_send




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]     ` <1415908465-24392-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2014-11-27 11:47       ` Or Gerlitz
       [not found]         ` <54770F44.2090909-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2014-11-27 11:47 UTC (permalink / raw)
  To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/13/2014 9:54 PM, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:

> Add a device capability flag for OPA devices to signal their support of "jumbo"
> MADs.
>
> Check for IB_DEVICE_JUMBO_MAD_SUPPORT in the device capabilities and if
> supported mark the special QPs created.

Ira,

A few comments here. The device capability is OK, specifically if it
helps make the MAD layer logic simpler and/or more robust.

You should add a device attribute telling what size of MADs is supported
by the device: I think the IBTA mandated 256B, and in the OPA case the
driver will fill in 2K there.

You can't just state that (jumbo == 2K) and produce a complete
design/changes which use this hard-coded assumption. You need to see how
to re-spin this series without this assumption.
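
For example (sketch; the attribute name is hypothetical):

	/* hypothetical addition to struct ib_device_attr */
	u32	max_mad_size;	/* 256 for IBTA devices, 2048 for OPA */

	/* filled in by the driver's query_device() hook */
	props->max_mad_size = 2048;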

I find it very annoying that upper-level drivers replicate, in different
ways, elements from the IB device attributes returned by ib_query_device.
I have met that in multiple drivers and upcoming designs for which I do
code review. Are you up for coming up with a patch that caches the device
attributes on the device structure? If not, I can do that and have your
code use it.
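
For illustration (sketch; the cached field name is hypothetical):

	/* at device registration time */
	ret = ib_query_device(device, &device->cached_dev_attrs);

	/* consumers later read the cached copy directly */
	max_mad = device->cached_dev_attrs.max_mad_size;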

Also, I would go and unify (squash) this patch and the preceding one.



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
       [not found]     ` <1415908465-24392-8-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2014-11-27 11:50       ` Or Gerlitz
       [not found]         ` <54770FF4.3070807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2014-11-27 11:50 UTC (permalink / raw)
  To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/13/2014 9:54 PM, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> @@ -773,7 +782,12 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
>   	}
>   	local->mad_priv = NULL;
>   	local->recv_mad_agent = NULL;
> -	mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
> +
> +	if (mad_agent_priv->qp_info->supports_jumbo_mads)
> +		mad_priv = kmem_cache_alloc(jumbo_mad_cache, GFP_ATOMIC);
> +	else
> +		mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
> +
At minimum (if you really think that one kmem cache for both jumbo and
non-jumbo MADs isn't the way to go), let's have one pointer directed at
the cache you want to use; that way, all branches such as the one above
can be avoided, right?
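
I.e. something like this (sketch):

	struct kmem_cache *cache = mad_agent_priv->qp_info->supports_jumbo_mads ?
				   jumbo_mad_cache : ib_mad_cache;

	mad_priv = kmem_cache_alloc(cache, GFP_ATOMIC);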

Or.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]         ` <54770F44.2090909-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-11-27 13:51           ` Sagi Grimberg
       [not found]             ` <54772C70.8060602-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2014-12-08  0:23           ` Weiny, Ira
  1 sibling, 1 reply; 36+ messages in thread
From: Sagi Grimberg @ 2014-11-27 13:51 UTC (permalink / raw)
  To: Or Gerlitz, ira.weiny-ral2JQCrhuEAvxtiuMwx3w,
	roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/27/2014 1:47 PM, Or Gerlitz wrote:
> On 11/13/2014 9:54 PM, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
>
>> Add a device capability flag for OPA devices to signal their support
>> of "jumbo"
>> MADs.
>>
>> Check for IB_DEVICE_JUMBO_MAD_SUPPORT in the device capabilities and if
>> supported mark the special QPs created.
>
> Ira,
>
> A few comments here. The device capability is OK, specifically if it
> helps make the MAD layer logic simpler and/or more robust.
>
> You should add a device attribute telling what size of MADs is supported
> by the device: I think the IBTA mandated 256B, and in the OPA case the
> driver will fill in 2K there.
>
> You can't just state that (jumbo == 2K) and produce a complete
> design/changes which use this hard-coded assumption. You need to see how
> to re-spin this series without this assumption.

Why do we need both? Can't we just rely on the size alone?

We're already short on bits in device_cap_flags, so maybe we can do
without?
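
E.g. (sketch, assuming the max_mad_size attribute suggested above):

	/* no cap bit: infer jumbo support from the reported size */
	if (dev_attr->max_mad_size > sizeof(struct ib_mad))
		qp_info->supports_jumbo_mads = 1;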

Sagi.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]             ` <54772C70.8060602-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-11-27 14:59               ` Or Gerlitz
       [not found]                 ` <54773C59.6080505-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2014-11-27 14:59 UTC (permalink / raw)
  To: Sagi Grimberg, ira.weiny-ral2JQCrhuEAvxtiuMwx3w,
	roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11/27/2014 3:51 PM, Sagi Grimberg wrote:
> We're already short on bits in device_cap_flags

No shortage in the kernel... we can add 32 more bits if/when we need it.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.
       [not found]             ` <5476F6BB.1020200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-12-01 22:18               ` Weiny, Ira
  0 siblings, 0 replies; 36+ messages in thread
From: Weiny, Ira @ 2014-12-01 22:18 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> >
> >>>    [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
> >> Why not use a single kmem-cache instance with a non-hard-coded
> >> element size: 256B (or whatever we use today) or 2KB?
> > I wanted to be able to adjust the element count of the caches separately
> > to better tune overall memory usage.  However, I stopped short of adding
> > additional module parameters to adjust the 2K cache at this time.
> 
> 
> I tend to think that the resulting code is too much of a special-purpose
> one under a (jumbo == 2K) assumption. See some more comments in the
> individual patches and we'll take it from there.
> 

Ok, I'll address those comments in the other email threads.

> 
> 
> >
> >> Also (nit), please change the prefix for all patches to be IB/mad:
> >> and not ib/mad: to comply with the existing convention for patch
> >> titles in the IB subsystem
> > I will thanks.
> 
> Good. See below another easy-to-fix nitpicking comment, but before that,
> for the sake of easier review and robustness of the code to future
> bisections, please re-order the series such that all general refactoring
> and preparatory patches come before the OPA patches.
> 
> This means re-ordering the current series so that patches 8/9/10 are
> located after patch 14, as listed here:
> 
>    [RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad
>    [RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device cap
>    [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
>    [RFC PATCH 04/16] ib/mad: add base version parameter to
>    [RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad
>    [RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures
>    [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
>    [RFC PATCH 11/16] ib/mad: create helper function for
>    [RFC PATCH 12/16] ib/mad: create helper function for
>    [RFC PATCH 13/16] ib/mad: create helper function for
>    [RFC PATCH 14/16] ib/mad: Create helper function for SMI processing
>    [RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines
>    [RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path
>    [RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path
>    [RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP
>    [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General
> 

Done.

> Another easy-to-fix nitpicking comment would be to have all the patches be
> consistent w.r.t. the capitalization of the first letter of the first word
> after the IB/core: or IB/mad: prefix, e.g.
> 
> ib/mad: create helper function for smi_handle_dr_smp_send
> 
> becomes
> 
> IB/mad: Create helper function for smi_handle_dr_smp_send

Done.

> 
> BTW, here my personal preference is "Add helper" and not "Create helper"
> 
> IB/mad: Add helper function for smi_handle_dr_smp_send

Done.

Ira

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.
       [not found]     ` <54750DCF.90308-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2014-12-01 22:33       ` Weiny, Ira
  0 siblings, 0 replies; 36+ messages in thread
From: Weiny, Ira @ 2014-12-01 22:33 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Eitan Zahavi

I'll get back to you on the IBTA questions when I can, but I wanted to address this question now.

> 
> Has regression testing been done with these changes, in terms of IBTA MAD
> support across agents, SMs, and diagnostics, to be sure that things still
> work properly?
> 

Yes.  I have tested OpenSM and infiniband-diags running on qib and mlx4 hardware.  I don't have access to mlx5 or the older ehca and mthca cards.

-- Ira


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
       [not found]         ` <54770FF4.3070807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2014-12-05 21:25           ` Weiny, Ira
  0 siblings, 0 replies; 36+ messages in thread
From: Weiny, Ira @ 2014-12-05 21:25 UTC (permalink / raw)
  To: 'Or Gerlitz'
  Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> 
> On 11/13/2014 9:54 PM, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> > @@ -773,7 +782,12 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
> >   	}
> >   	local->mad_priv = NULL;
> >   	local->recv_mad_agent = NULL;
> > -	mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
> > +
> > +	if (mad_agent_priv->qp_info->supports_jumbo_mads)
> > +		mad_priv = kmem_cache_alloc(jumbo_mad_cache, GFP_ATOMIC);
> > +	else
> > +		mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
> > +
> At minimum (if you really think that one kmem cache for both jumbo and
> non-jumbo MADs isn't the way to go), let's have one pointer directed at the
> cache you want to use; this way branches like the one above can be
> avoided, right?

That is a good idea; however, I'm going to address your other comments before changing anything here.

-- Ira
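
A sketch of the single-pointer idea, assuming a hypothetical mad_cache field
on the QP info structure (not code from the posted series):

/* chosen once, when the QP is set up */
qp_info->mad_cache = qp_info->supports_jumbo_mads ?
		     jumbo_mad_cache : ib_mad_cache;

/* every allocation site then becomes branch-free */
mad_priv = kmem_cache_alloc(mad_agent_priv->qp_info->mad_cache, GFP_ATOMIC);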


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]         ` <54770F44.2090909-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2014-11-27 13:51           ` Sagi Grimberg
@ 2014-12-08  0:23           ` Weiny, Ira
       [not found]             ` <2807E5FD2F6FDA4886F6618EAC48510E0CBD4F23-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 36+ messages in thread
From: Weiny, Ira @ 2014-12-08  0:23 UTC (permalink / raw)
  To: 'Or Gerlitz', roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

> 
> On 11/13/2014 9:54 PM, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> 
> > Add a device capability flag for OPA devices to signal their support of "jumbo"
> > MADs.
> >
> > Check for IB_DEVICE_JUMBO_MAD_SUPPORT in the device capabilities and
> > if supported mark the special QPs created.
> 
> Ira,
> 
> A few comments here. The device capability is OK, specifically if it helps
> make the MAD layer logic simpler and/or more robust.
> 
> You should add a device attribute telling what size of MADs the device
> supports; I think the IBTA mandates 256B, and in the OPA case the driver
> will fill in 2K there.

I don't think this is a bad idea; however, the complication is determining the kmem_cache object size in that case.  How about we limit this value to at most 2K until such time as there is a need for larger support?

> 
> You can't just state that (jumbo == 2K) and build the complete design/changes
> on this hard-coded assumption. You need to see how to re-spin this
> series without this assumption.

I'm looking into it.  I believe I will still need to have a capability bit, but it may end up having a different meaning.

> 
> I find it very annoying that upper-level drivers replicate, in different
> ways, elements from the IB device attributes returned by ib_query_device. I
> have met that in multiple drivers and upcoming designs for which I do code
> review. Are you up for coming up with a patch that caches the device
> attributes on the device structure?

I don't follow what you are asking for.  Could you give more details?

> If not,
> I can do that and have your code use it.
> 
> Also, I would go and unify (squash) this patch and the preceding one.
> 

I'll keep this in mind as I rework the patches.

Ira


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]             ` <2807E5FD2F6FDA4886F6618EAC48510E0CBD4F23-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2014-12-08  0:47               ` Roland Dreier
       [not found]                 ` <CAG4TOxNt+0p+i1a6oN1xx+K_OZEuZhPJ5e=44KScnaGVA4E0SA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-12-08 11:29               ` Or Gerlitz
  1 sibling, 1 reply; 36+ messages in thread
From: Roland Dreier @ 2014-12-08  0:47 UTC (permalink / raw)
  To: Weiny, Ira; +Cc: Or Gerlitz, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Sun, Dec 7, 2014 at 4:23 PM, Weiny, Ira <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> I don't think this is a bad idea, however, the complication is determining the kmem_cache object size in that case.  How about we limit this value to < 2K until such time as there is a need for larger support?

Can we just get rid of all the kmem_caches used in the MAD code?  I
don't think any of this is so performance-critical that we couldn't
just use kmalloc...

 - R.
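
A minimal sketch of the kmalloc alternative, assuming the required MAD size
is known at the allocation site (mad_size here is a hypothetical variable):

/*
 * no dedicated cache: the generic slab allocator already maintains
 * kmalloc caches at 256B and 2K granularity, so sizing comes for free
 */
mad_priv = kmalloc(sizeof(struct ib_mad_private_header) + mad_size,
		   GFP_ATOMIC);
if (!mad_priv)
	return -ENOMEM;

/* ... build and post the MAD ... */

kfree(mad_priv);	/* instead of kmem_cache_free() */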

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]             ` <2807E5FD2F6FDA4886F6618EAC48510E0CBD4F23-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2014-12-08  0:47               ` Roland Dreier
@ 2014-12-08 11:29               ` Or Gerlitz
       [not found]                 ` <CAJ3xEMj-0_0F+VoGZDes92ShFRTbt9Et4WWPt=viY5gx_P-oNg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 36+ messages in thread
From: Or Gerlitz @ 2014-12-08 11:29 UTC (permalink / raw)
  To: Weiny, Ira
  Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, Dec 8, 2014 at 2:23 AM, Weiny, Ira <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:

>> I find it very annoying that upper-level drivers replicate, in different
>> ways, elements from the IB device attributes returned by ib_query_device. I
>> have met that in multiple drivers and upcoming designs for which I do code
>> review. Are you up for coming up with a patch that caches the device
>> attributes on the device structure?

> I don't follow what you are asking for.  Could you give more details?

1. add a struct ib_device_attr field to struct ib_device

2. when the device registers itself with the IB core, go and run the
> query_device verb with the param being a pointer to that field

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]                 ` <CAJ3xEMj-0_0F+VoGZDes92ShFRTbt9Et4WWPt=viY5gx_P-oNg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-12-09 22:36                   ` Weiny, Ira
       [not found]                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CBD7A97-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Weiny, Ira @ 2014-12-09 22:36 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

> 
> On Mon, Dec 8, 2014 at 2:23 AM, Weiny, Ira <ira.weiny@intel.com> wrote:
> 
> >> I find it very annoying that upper-level drivers replicate, in
> >> different ways, elements from the IB device attributes returned by
> >> ib_query_device. I have met that in multiple drivers and upcoming
> >> designs for which I do code review. Are you up for coming up with a
> >> patch that caches the device attributes on the device structure?
> 
> > I don't follow what you are asking for.  Could you give more details?
> 
> 1. add a struct ib_device_attr field to struct ib_device
> 
> 2. when the device registers itself with the IB core, go and run the
> query_device verb with the param being a pointer to that field

I see where you are going.  Then the MAD stack does not have to cache a "max_mad_size" value but rather looks in the ib_device structure "on the fly"...

So, something like the diff below?  What are the chances we end up with attributes which are not constant?

If Roland would like to go this way I can rework my series based on the attributes being cached.

-- Ira


17:15:59 > git di
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 18c1ece..db18795 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -322,6 +322,8 @@ int ib_register_device(struct ib_device *device,
                                client->add(device);
        }
 
+       device->query_device(device, &device->attributes);
+
  out:
        mutex_unlock(&device_mutex);
        return ret;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 470a011..241a882 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1630,6 +1630,7 @@ struct ib_device {
        u32                          local_dma_lkey;
        u8                           node_type;
        u8                           phys_port_cnt;
+       struct ib_device_attr        attributes;
 };
 
 struct ib_client {
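
With the attributes cached at registration time, a consumer such as the MAD
stack could read them directly instead of keeping its own copy; a sketch,
assuming a hypothetical max_mad_size attribute (not present in struct
ib_device_attr as of this thread):

/* use the registration-time snapshot; no per-module caching needed */
if (mad_size > device->attributes.max_mad_size)
	return -EINVAL;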

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]                 ` <CAG4TOxNt+0p+i1a6oN1xx+K_OZEuZhPJ5e=44KScnaGVA4E0SA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-12-09 23:23                   ` Weiny, Ira
  0 siblings, 0 replies; 36+ messages in thread
From: Weiny, Ira @ 2014-12-09 23:23 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Or Gerlitz, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> 
> On Sun, Dec 7, 2014 at 4:23 PM, Weiny, Ira <ira.weiny@intel.com> wrote:
> > I don't think this is a bad idea; however, the complication is determining
> > the kmem_cache object size in that case.  How about we limit this value to
> > at most 2K until such time as there is a need for larger support?
> 
> Can we just get rid of all the kmem_caches used in the MAD code?  I don't
> think any of this is so performance-critical that we couldn't just use kmalloc...
> 

On an SM node the MAD code is _very_ performance-critical.  I don't think we should give that up.

Another alternative is to have a cache per device based on the size reported.

-- Ira
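
A sketch of the per-device-cache alternative; the mad_cache field and the
max_mad_size attribute are hypothetical:

/* kasprintf keeps the name string alive for the cache's lifetime */
char *name = kasprintf(GFP_KERNEL, "ib_mad_%s", device->name);

if (!name)
	return -ENOMEM;
/* element size follows whatever MAD size this device reports */
device->mad_cache = kmem_cache_create(name,
				      sizeof(struct ib_mad_private_header) +
				      device->attributes.max_mad_size,
				      0, SLAB_HWCACHE_ALIGN, NULL);
if (!device->mad_cache) {
	kfree(name);
	return -ENOMEM;
}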

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CBD7A97-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2014-12-10  7:52                       ` Or Gerlitz
  0 siblings, 0 replies; 36+ messages in thread
From: Or Gerlitz @ 2014-12-10  7:52 UTC (permalink / raw)
  To: Weiny, Ira
  Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Wed, Dec 10, 2014 at 12:36 AM, Weiny, Ira <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> On Mon, Dec 8, 2014 at 2:23 AM, Weiny, Ira <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:

>> 1. add a struct ib_device_attr field to struct ib_device
>> 2. when the device registers itself with the IB core, go and run the
>> query_device verb with the param being a pointer to that field

> I see where you are going.  Then the MAD stack does not have to cache a "max_mad_size" value but rather looks in the ib_device structure "on the fly"...
> So, something like the diff below?

exactly, thanks.

> What are the chances we end up with attributes which are not constant?

I don't see how this can happen.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]                 ` <54773C59.6080505-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-01-07 16:32                   ` Weiny, Ira
       [not found]                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CBEE86E-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Weiny, Ira @ 2015-01-07 16:32 UTC (permalink / raw)
  To: Or Gerlitz, Sagi Grimberg, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

> 
> On 11/27/2014 3:51 PM, Sagi Grimberg wrote:
> > We're already short on bits in device_cap_flags
> 
> no shortage at the kernel... we can add 32 more bits if/when we need them

Why is there a gap in the bits?

enum ib_device_cap_flags {
...
        IB_DEVICE_MEM_WINDOW_TYPE_2B    = (1<<24), 
        IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29),
...
};

Are bits 25-28 not used?  Is there some legacy software out there which uses them?

Ira


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
       [not found]                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CBEE86E-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-01-08 11:41                       ` Or Gerlitz
  0 siblings, 0 replies; 36+ messages in thread
From: Or Gerlitz @ 2015-01-08 11:41 UTC (permalink / raw)
  To: Weiny, Ira
  Cc: Sagi Grimberg, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 1/7/2015 6:32 PM, Weiny, Ira wrote:
> > On 11/27/2014 3:51 PM, Sagi Grimberg wrote:
> > > We're already short on bits in device_cap_flags
> >
> > no shortage at the kernel... we can add 32 more bits if/when we need them
> Why is there a gap in the bits?
>
> enum ib_device_cap_flags {
> ...
>          IB_DEVICE_MEM_WINDOW_TYPE_2B    = (1<<24),
>          IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29),
> ...
> };
>
> Are bits 25-28 not used?  Is there some legacy software out there which uses them?

Sort of... the verbs RSS design, which wasn't accepted upstream, uses bits
25/26/27, and Mellanox Core-Direct (whose verbs model is called
"Cross-Channel"), which hasn't been submitted upstream yet, uses bit 28.
Since in the kernel we can trivially use 64 bits, I would opt to keep these
entries free for now.

Or.
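
For illustration, a sketch of the 64-bit widening described above (the
flag's bit value is an assumption; a #define is used because plain C enum
constants are int-sized):

/* in struct ib_device_attr: widen the kernel-side flag word */
u64			device_cap_flags;	/* previously 32 bits */

/* new kernel-only capabilities can then live above bit 31 */
#define IB_DEVICE_JUMBO_MAD_SUPPORT	(1ULL << 32)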

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2015-01-08 11:41 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-13 19:54 [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing ira.weiny-ral2JQCrhuEAvxtiuMwx3w
     [not found] ` <1415908465-24392-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2014-11-13 19:54   ` [RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device cap flag ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device ira.weiny-ral2JQCrhuEAvxtiuMwx3w
     [not found]     ` <1415908465-24392-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2014-11-27 11:47       ` Or Gerlitz
     [not found]         ` <54770F44.2090909-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-11-27 13:51           ` Sagi Grimberg
     [not found]             ` <54772C70.8060602-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-11-27 14:59               ` Or Gerlitz
     [not found]                 ` <54773C59.6080505-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-01-07 16:32                   ` Weiny, Ira
     [not found]                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CBEE86E-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-01-08 11:41                       ` Or Gerlitz
2014-12-08  0:23           ` Weiny, Ira
     [not found]             ` <2807E5FD2F6FDA4886F6618EAC48510E0CBD4F23-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-12-08  0:47               ` Roland Dreier
     [not found]                 ` <CAG4TOxNt+0p+i1a6oN1xx+K_OZEuZhPJ5e=44KScnaGVA4E0SA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-12-09 23:23                   ` Weiny, Ira
2014-12-08 11:29               ` Or Gerlitz
     [not found]                 ` <CAJ3xEMj-0_0F+VoGZDes92ShFRTbt9Et4WWPt=viY5gx_P-oNg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-12-09 22:36                   ` Weiny, Ira
     [not found]                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CBD7A97-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-12-10  7:52                       ` Or Gerlitz
2014-11-13 19:54   ` [RFC PATCH 04/16] ib/mad: add base version parameter to ib_create_send_mad ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache ira.weiny-ral2JQCrhuEAvxtiuMwx3w
     [not found]     ` <1415908465-24392-8-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2014-11-27 11:50       ` Or Gerlitz
     [not found]         ` <54770FF4.3070807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-12-05 21:25           ` Weiny, Ira
2014-11-13 19:54   ` [RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path Architecture base version MADs in ib_create_send_mad ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path Architecture MADs ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 11/16] ib/mad: create helper function for smi_handle_dr_smp_send ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 12/16] ib/mad: create helper function for smi_handle_dr_smp_recv ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 13/16] ib/mad: create helper function for smi_check_forward_dr_smp ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 14/16] ib/mad: Create helper function for SMI processing ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP processing ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-13 19:54   ` [RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General MAD processing ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2014-11-18 22:16   ` [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) " Or Gerlitz
     [not found]     ` <CAJ3xEMhtm99dRdcEvhK9s961mDr7YSU3pkv-WK=sESKe_K4kYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-11-25 21:52       ` Weiny, Ira
     [not found]         ` <2807E5FD2F6FDA4886F6618EAC48510E0CBC6B23-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-11-27 10:02           ` Or Gerlitz
     [not found]             ` <5476F6BB.1020200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-12-01 22:18               ` Weiny, Ira
2014-11-25 23:16   ` Hal Rosenstock
     [not found]     ` <54750DCF.90308-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-12-01 22:33       ` Weiny, Ira
