linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] staging/rdma/hfi1: complete fixes for ECN detection
@ 2015-11-05  2:10 ira.weiny-ral2JQCrhuEAvxtiuMwx3w
       [not found] ` <1446689411-22263-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-11-05  2:10 UTC (permalink / raw)
  To: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The following 3 patches fix the ECN detection and add a module parameter to
turn the prescan of the receive queue on and off.

Arthur Kepner (2):
  staging/rdma/hfi1: don't cache "prescan head"
  staging/rdma/hfi1: optionally prescan rx queue for {B,F}ECNs - UC, RC

Vennila Megavannan (1):
  staging/rdma/hfi1: Method to toggle "fast ECN" detection

 drivers/staging/rdma/hfi1/Kconfig  |  14 ++---
 drivers/staging/rdma/hfi1/driver.c | 101 +++++++++++++++++++------------------
 drivers/staging/rdma/hfi1/hfi.h    |  13 -----
 drivers/staging/rdma/hfi1/rc.c     |  10 ++--
 drivers/staging/rdma/hfi1/uc.c     |  15 +++---
 drivers/staging/rdma/hfi1/verbs.c  |  41 ++++++++++++---
 6 files changed, 108 insertions(+), 86 deletions(-)

-- 
1.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] staging/rdma/hfi1: don't cache "prescan head"
       [not found] ` <1446689411-22263-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-11-05  2:10   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2015-11-05  2:10   ` [PATCH 2/3] staging/rdma/hfi1: optionally prescan rx queue for {B,F}ECNs - UC, RC ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2015-11-05  2:10   ` [PATCH 3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2 siblings, 0 replies; 7+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-11-05  2:10 UTC (permalink / raw)
  To: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w, Arthur Kepner,
	Ira Weiny

From: Arthur Kepner <arthur.kepner-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

When HFI1_CAP_DMA_RTAIL is toggled off the "prescan head" can get out of sync
with the receive context's "head". This happens when, after prescan_rxq() newly
arrived packets are then received, and processed by an RX interrupt handler.
This is an unavoidable race, and to avoid getting out of sync we always start
prescanning at the current "rcd->head" entry.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Arthur Kepner <arthur.kepner-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/driver.c | 13 +++----------
 drivers/staging/rdma/hfi1/hfi.h    | 13 -------------
 2 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/staging/rdma/hfi1/driver.c
index ce69141b56cb..cfcbe417fde9 100644
--- a/drivers/staging/rdma/hfi1/driver.c
+++ b/drivers/staging/rdma/hfi1/driver.c
@@ -509,14 +509,10 @@ static inline void init_ps_mdata(struct ps_mdata *mdata,
 	mdata->rsize = packet->rsize;
 	mdata->maxcnt = packet->maxcnt;
 
-	if (rcd->ps_state.initialized == 0) {
-		mdata->ps_head = packet->rhqoff;
-		rcd->ps_state.initialized++;
-	} else
-		mdata->ps_head = rcd->ps_state.ps_head;
+	mdata->ps_head = packet->rhqoff;
 
 	if (HFI1_CAP_IS_KSET(DMA_RTAIL)) {
-		mdata->ps_tail = packet->hdrqtail;
+		mdata->ps_tail = get_rcvhdrtail(rcd);
 		mdata->ps_seq = 0; /* not used with DMA_RTAIL */
 	} else {
 		mdata->ps_tail = 0; /* used only with DMA_RTAIL*/
@@ -533,12 +529,9 @@ static inline int ps_done(struct ps_mdata *mdata, u64 rhf)
 
 static inline void update_ps_mdata(struct ps_mdata *mdata)
 {
-	struct hfi1_ctxtdata *rcd = mdata->rcd;
-
 	mdata->ps_head += mdata->rsize;
-	if (mdata->ps_head > mdata->maxcnt)
+	if (mdata->ps_head >= mdata->maxcnt)
 		mdata->ps_head = 0;
-	rcd->ps_state.ps_head = mdata->ps_head;
 	if (!HFI1_CAP_IS_KSET(DMA_RTAIL)) {
 		if (++mdata->ps_seq > 13)
 			mdata->ps_seq = 1;
diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h
index 190f7a2f6773..96135b813fba 100644
--- a/drivers/staging/rdma/hfi1/hfi.h
+++ b/drivers/staging/rdma/hfi1/hfi.h
@@ -139,15 +139,6 @@ extern const struct pci_error_handlers hfi1_pci_err_handler;
 struct hfi1_opcode_stats_perctx;
 #endif
 
-/*
- * struct ps_state keeps state associated with RX queue "prescanning"
- * (prescanning for FECNs, and BECNs), if prescanning is in use.
- */
-struct ps_state {
-	u32 ps_head;
-	int initialized;
-};
-
 struct ctxt_eager_bufs {
 	ssize_t size;            /* total size of eager buffers */
 	u32 count;               /* size of buffers array */
@@ -302,10 +293,6 @@ struct hfi1_ctxtdata {
 	struct list_head sdma_queues;
 	spinlock_t sdma_qlock;
 
-#ifdef CONFIG_PRESCAN_RXQ
-	struct ps_state ps_state;
-#endif /* CONFIG_PRESCAN_RXQ */
-
 	/*
 	 * The interrupt handler for a particular receive context can vary
 	 * throughout it's lifetime. This is not a lock protected data member so
-- 
1.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] staging/rdma/hfi1: optionally prescan rx queue for {B,F}ECNs - UC, RC
       [not found] ` <1446689411-22263-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-11-05  2:10   ` [PATCH 1/3] staging/rdma/hfi1: don't cache "prescan head" ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2015-11-05  2:10   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2015-11-05  2:10   ` [PATCH 3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2 siblings, 0 replies; 7+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-11-05  2:10 UTC (permalink / raw)
  To: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w, Arthur Kepner,
	Ira Weiny

From: Arthur Kepner <arthur.kepner-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

To more rapidly respond to Explicit Congestion Notifications, prescan the
receive queue, and process FECNs, and BECNs first.  When a UC, or RC packet
containing a FECN, or BECN is found, immediately react to the ECN (either by
returning a CNP, or adjusting the injection rate). Afterward, the packet will
be processed normally.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Arthur Kepner <arthur.kepner-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/driver.c | 64 +++++++++++++++++++-------------------
 drivers/staging/rdma/hfi1/rc.c     | 10 +++---
 drivers/staging/rdma/hfi1/uc.c     | 15 +++++----
 drivers/staging/rdma/hfi1/verbs.c  | 41 ++++++++++++++++++++----
 4 files changed, 81 insertions(+), 49 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/staging/rdma/hfi1/driver.c
index cfcbe417fde9..9a4ec09af020 100644
--- a/drivers/staging/rdma/hfi1/driver.c
+++ b/drivers/staging/rdma/hfi1/driver.c
@@ -436,59 +436,58 @@ static inline void init_packet(struct hfi1_ctxtdata *rcd,
 
 #ifndef CONFIG_PRESCAN_RXQ
 static void prescan_rxq(struct hfi1_packet *packet) {}
-#else /* CONFIG_PRESCAN_RXQ */
+#else /* !CONFIG_PRESCAN_RXQ */
 static int prescan_receive_queue;
 
 static void process_ecn(struct hfi1_qp *qp, struct hfi1_ib_header *hdr,
 			struct hfi1_other_headers *ohdr,
-			u64 rhf, struct ib_grh *grh)
+			u64 rhf, u32 bth1, struct ib_grh *grh)
 {
 	struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
-	u32 bth1;
+	u32 rqpn = 0;
+	u16 rlid;
 	u8 sc5, svc_type;
-	int is_fecn, is_becn;
 
 	switch (qp->ibqp.qp_type) {
+	case IB_QPT_SMI:
+	case IB_QPT_GSI:
 	case IB_QPT_UD:
+		rlid = be16_to_cpu(hdr->lrh[3]);
+		rqpn = be32_to_cpu(ohdr->u.ud.deth[1]) & HFI1_QPN_MASK;
 		svc_type = IB_CC_SVCTYPE_UD;
 		break;
-	case IB_QPT_UC:	/* LATER */
-	case IB_QPT_RC:	/* LATER */
+	case IB_QPT_UC:
+		rlid = qp->remote_ah_attr.dlid;
+		rqpn = qp->remote_qpn;
+		svc_type = IB_CC_SVCTYPE_UC;
+		break;
+	case IB_QPT_RC:
+		rlid = qp->remote_ah_attr.dlid;
+		rqpn = qp->remote_qpn;
+		svc_type = IB_CC_SVCTYPE_RC;
+		break;
 	default:
 		return;
 	}
 
-	is_fecn = (be32_to_cpu(ohdr->bth[1]) >> HFI1_FECN_SHIFT) &
-			HFI1_FECN_MASK;
-	is_becn = (be32_to_cpu(ohdr->bth[1]) >> HFI1_BECN_SHIFT) &
-			HFI1_BECN_MASK;
-
 	sc5 = (be16_to_cpu(hdr->lrh[0]) >> 12) & 0xf;
 	if (rhf_dc_info(rhf))
 		sc5 |= 0x10;
 
-	if (is_fecn) {
-		u32 src_qpn = be32_to_cpu(ohdr->u.ud.deth[1]) & HFI1_QPN_MASK;
+	if (bth1 & HFI1_FECN_SMASK) {
 		u16 pkey = (u16)be32_to_cpu(ohdr->bth[0]);
 		u16 dlid = be16_to_cpu(hdr->lrh[1]);
-		u16 slid = be16_to_cpu(hdr->lrh[3]);
 
-		return_cnp(ibp, qp, src_qpn, pkey, dlid, slid, sc5, grh);
+		return_cnp(ibp, qp, rqpn, pkey, dlid, rlid, sc5, grh);
 	}
 
-	if (is_becn) {
+	if (bth1 & HFI1_BECN_SMASK) {
 		struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
-		u32 lqpn =  be32_to_cpu(ohdr->bth[1]) & HFI1_QPN_MASK;
+		u32 lqpn = bth1 & HFI1_QPN_MASK;
 		u8 sl = ibp->sc_to_sl[sc5];
 
-		process_becn(ppd, sl, 0, lqpn, 0, svc_type);
+		process_becn(ppd, sl, rlid, lqpn, rqpn, svc_type);
 	}
-
-	/* turn off BECN, or FECN */
-	bth1 = be32_to_cpu(ohdr->bth[1]);
-	bth1 &= ~(HFI1_FECN_MASK << HFI1_FECN_SHIFT);
-	bth1 &= ~(HFI1_BECN_MASK << HFI1_BECN_SHIFT);
-	ohdr->bth[1] = cpu_to_be32(bth1);
 }
 
 struct ps_mdata {
@@ -508,7 +507,6 @@ static inline void init_ps_mdata(struct ps_mdata *mdata,
 	mdata->rcd = rcd;
 	mdata->rsize = packet->rsize;
 	mdata->maxcnt = packet->maxcnt;
-
 	mdata->ps_head = packet->rhqoff;
 
 	if (HFI1_CAP_IS_KSET(DMA_RTAIL)) {
@@ -564,7 +562,7 @@ static void prescan_rxq(struct hfi1_packet *packet)
 		struct hfi1_other_headers *ohdr;
 		struct ib_grh *grh = NULL;
 		u64 rhf = rhf_to_cpu(rhf_addr);
-		u32 etype = rhf_rcv_type(rhf), qpn;
+		u32 etype = rhf_rcv_type(rhf), qpn, bth1;
 		int is_ecn = 0;
 		u8 lnh;
 
@@ -586,15 +584,13 @@ static void prescan_rxq(struct hfi1_packet *packet)
 		} else
 			goto next; /* just in case */
 
-		is_ecn |= be32_to_cpu(ohdr->bth[1]) &
-			(HFI1_FECN_MASK << HFI1_FECN_SHIFT);
-		is_ecn |= be32_to_cpu(ohdr->bth[1]) &
-			(HFI1_BECN_MASK << HFI1_BECN_SHIFT);
+		bth1 = be32_to_cpu(ohdr->bth[1]);
+		is_ecn = !!(bth1 & (HFI1_FECN_SMASK | HFI1_BECN_SMASK));
 
 		if (!is_ecn)
 			goto next;
 
-		qpn = be32_to_cpu(ohdr->bth[1]) & HFI1_QPN_MASK;
+		qpn = bth1 & HFI1_QPN_MASK;
 		rcu_read_lock();
 		qp = hfi1_lookup_qpn(ibp, qpn);
 
@@ -603,8 +599,12 @@ static void prescan_rxq(struct hfi1_packet *packet)
 			goto next;
 		}
 
-		process_ecn(qp, hdr, ohdr, rhf, grh);
+		process_ecn(qp, hdr, ohdr, rhf, bth1, grh);
 		rcu_read_unlock();
+
+		/* turn off BECN, FECN */
+		bth1 &= ~(HFI1_FECN_SMASK | HFI1_BECN_SMASK);
+		ohdr->bth[1] = cpu_to_be32(bth1);
 next:
 		update_ps_mdata(&mdata);
 	}
diff --git a/drivers/staging/rdma/hfi1/rc.c b/drivers/staging/rdma/hfi1/rc.c
index 0b19206ff33e..e25c70b9de34 100644
--- a/drivers/staging/rdma/hfi1/rc.c
+++ b/drivers/staging/rdma/hfi1/rc.c
@@ -1978,7 +1978,7 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
 	}
 
 	psn = be32_to_cpu(ohdr->bth[2]);
-	opcode = bth0 >> 24;
+	opcode = (bth0 >> 24) & 0xff;
 
 	/*
 	 * Process responses (ACKs) before anything else.  Note that the
@@ -2391,19 +2391,19 @@ void hfi1_rc_hdrerr(
 	struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
 	int diff;
 	u32 opcode;
-	u32 psn;
+	u32 psn, bth0;
 
 	/* Check for GRH */
 	ohdr = &hdr->u.oth;
 	if (has_grh)
 		ohdr = &hdr->u.l.oth;
 
-	opcode = be32_to_cpu(ohdr->bth[0]);
-	if (hfi1_ruc_check_hdr(ibp, hdr, has_grh, qp, opcode))
+	bth0 = be32_to_cpu(ohdr->bth[0]);
+	if (hfi1_ruc_check_hdr(ibp, hdr, has_grh, qp, bth0))
 		return;
 
 	psn = be32_to_cpu(ohdr->bth[2]);
-	opcode >>= 24;
+	opcode = (bth0 >> 24) & 0xff;
 
 	/* Only deal with RDMA Writes for now */
 	if (opcode < IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST) {
diff --git a/drivers/staging/rdma/hfi1/uc.c b/drivers/staging/rdma/hfi1/uc.c
index b536f397737c..d785878d8b90 100644
--- a/drivers/staging/rdma/hfi1/uc.c
+++ b/drivers/staging/rdma/hfi1/uc.c
@@ -268,7 +268,7 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 	u32 tlen = packet->tlen;
 	struct hfi1_qp *qp = packet->qp;
 	struct hfi1_other_headers *ohdr = packet->ohdr;
-	u32 opcode;
+	u32 bth0, opcode;
 	u32 hdrsize = packet->hlen;
 	u32 psn;
 	u32 pad;
@@ -278,10 +278,9 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 	int has_grh = rcv_flags & HFI1_HAS_GRH;
 	int ret;
 	u32 bth1;
-	struct ib_grh *grh = NULL;
 
-	opcode = be32_to_cpu(ohdr->bth[0]);
-	if (hfi1_ruc_check_hdr(ibp, hdr, has_grh, qp, opcode))
+	bth0 = be32_to_cpu(ohdr->bth[0]);
+	if (hfi1_ruc_check_hdr(ibp, hdr, has_grh, qp, bth0))
 		return;
 
 	bth1 = be32_to_cpu(ohdr->bth[1]);
@@ -303,6 +302,7 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 		}
 
 		if (bth1 & HFI1_FECN_SMASK) {
+			struct ib_grh *grh = NULL;
 			u16 pkey = (u16)be32_to_cpu(ohdr->bth[0]);
 			u16 slid = be16_to_cpu(hdr->lrh[3]);
 			u16 dlid = be16_to_cpu(hdr->lrh[1]);
@@ -310,13 +310,16 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 			u8 sc5;
 
 			sc5 = ibp->sl_to_sc[qp->remote_ah_attr.sl];
+			if (has_grh)
+				grh = &hdr->u.l.grh;
 
-			return_cnp(ibp, qp, src_qp, pkey, dlid, slid, sc5, grh);
+			return_cnp(ibp, qp, src_qp, pkey, dlid, slid, sc5,
+				   grh);
 		}
 	}
 
 	psn = be32_to_cpu(ohdr->bth[2]);
-	opcode >>= 24;
+	opcode = (bth0 >> 24) & 0xff;
 
 	/* Compare the PSN verses the expected PSN. */
 	if (unlikely(cmp_psn(psn, qp->r_psn) != 0)) {
diff --git a/drivers/staging/rdma/hfi1/verbs.c b/drivers/staging/rdma/hfi1/verbs.c
index a13a2b135365..b31588264ca0 100644
--- a/drivers/staging/rdma/hfi1/verbs.c
+++ b/drivers/staging/rdma/hfi1/verbs.c
@@ -2142,11 +2142,40 @@ void hfi1_schedule_send(struct hfi1_qp *qp)
 void hfi1_cnp_rcv(struct hfi1_packet *packet)
 {
 	struct hfi1_ibport *ibp = &packet->rcd->ppd->ibport_data;
-
-	if (packet->qp->ibqp.qp_type == IB_QPT_UC)
-		hfi1_uc_rcv(packet);
-	else if (packet->qp->ibqp.qp_type == IB_QPT_UD)
-		hfi1_ud_rcv(packet);
-	else
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	struct hfi1_ib_header *hdr = packet->hdr;
+	struct hfi1_qp *qp = packet->qp;
+	u32 lqpn, rqpn = 0;
+	u16 rlid = 0;
+	u8 sl, sc5, sc4_bit, svc_type;
+	bool sc4_set = has_sc4_bit(packet);
+
+	switch (packet->qp->ibqp.qp_type) {
+	case IB_QPT_UC:
+		rlid = qp->remote_ah_attr.dlid;
+		rqpn = qp->remote_qpn;
+		svc_type = IB_CC_SVCTYPE_UC;
+		break;
+	case IB_QPT_RC:
+		rlid = qp->remote_ah_attr.dlid;
+		rqpn = qp->remote_qpn;
+		svc_type = IB_CC_SVCTYPE_RC;
+		break;
+	case IB_QPT_SMI:
+	case IB_QPT_GSI:
+	case IB_QPT_UD:
+		svc_type = IB_CC_SVCTYPE_UD;
+		break;
+	default:
 		ibp->n_pkt_drops++;
+		return;
+	}
+
+	sc4_bit = sc4_set << 4;
+	sc5 = (be16_to_cpu(hdr->lrh[0]) >> 12) & 0xf;
+	sc5 |= sc4_bit;
+	sl = ibp->sc_to_sl[sc5];
+	lqpn = qp->ibqp.qp_num;
+
+	process_becn(ppd, sl, rlid, lqpn, rqpn, svc_type);
 }
-- 
1.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection
       [not found] ` <1446689411-22263-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-11-05  2:10   ` [PATCH 1/3] staging/rdma/hfi1: don't cache "prescan head" ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2015-11-05  2:10   ` [PATCH 2/3] staging/rdma/hfi1: optionally prescan rx queue for {B,F}ECNs - UC, RC ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2015-11-05  2:10   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
       [not found]     ` <1446689411-22263-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2 siblings, 1 reply; 7+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-11-05  2:10 UTC (permalink / raw)
  To: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w, Vennila Megavannan,
	Ira Weiny

From: Vennila Megavannan <vennila.megavannan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add a module paramter to toggle prescan/Fast ECN Detection.

In addition change the PRESCAN_RXQ Kconfig default to "yes".

Reviewed-by: Arthur Kepner<arthur.kepner-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn<mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Vennila Megavannan<vennila.megavannan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/Kconfig  | 14 +++++++-------
 drivers/staging/rdma/hfi1/driver.c | 24 +++++++++++++++++-------
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/Kconfig b/drivers/staging/rdma/hfi1/Kconfig
index fd25078ee923..cfc9f9324b8d 100644
--- a/drivers/staging/rdma/hfi1/Kconfig
+++ b/drivers/staging/rdma/hfi1/Kconfig
@@ -26,12 +26,12 @@ config SDMA_VERBOSITY
 	This is a configuration flag to enable verbose
 	SDMA debug
 config PRESCAN_RXQ
-	bool "Enable prescanning of the RX queue for ECNs"
+	bool "Enable optional prescanning of the RX queue for ECNs"
 	depends on INFINIBAND_HFI1
-	default n
+	default y
 	---help---
-	This option toggles the prescanning of the receive queue for
-	Explicit Congestion Notifications. If an ECN is detected, it
-	is processed as quickly as possible, the ECN is toggled off.
-	After the prescanning step, the receive queue is processed as
-	usual.
+	This option enables code for the prescanning of the receive queue for
+	Explicit Congestion Notifications.  Pre-scanning can be controlled via a
+	module option at run time.  If an ECN is detected, it is processed as
+	quickly as possible, the ECN is toggled off.  After the prescanning
+	step, the receive queue is processed as usual.
diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/staging/rdma/hfi1/driver.c
index 9a4ec09af020..4c1dd7130d7a 100644
--- a/drivers/staging/rdma/hfi1/driver.c
+++ b/drivers/staging/rdma/hfi1/driver.c
@@ -83,6 +83,14 @@ unsigned int hfi1_cu = 1;
 module_param_named(cu, hfi1_cu, uint, S_IRUGO);
 MODULE_PARM_DESC(cu, "Credit return units");
 
+#ifdef CONFIG_PRESCAN_RXQ
+static unsigned int prescan_rx_queue;
+module_param_named(prescan_rxq, prescan_rx_queue, uint,
+		   S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(prescan_rxq,
+		 "Used to toggle rx prescan. Set to 1 to enable prescan");
+#endif /* CONFIG_PRESCAN_RXQ */
+
 unsigned long hfi1_cap_mask = HFI1_CAP_MASK_DEFAULT;
 static int hfi1_caps_set(const char *, const struct kernel_param *);
 static int hfi1_caps_get(char *, const struct kernel_param *);
@@ -435,10 +443,8 @@ static inline void init_packet(struct hfi1_ctxtdata *rcd,
 }
 
 #ifndef CONFIG_PRESCAN_RXQ
-static void prescan_rxq(struct hfi1_packet *packet) {}
+#define prescan_rxq(packet)
 #else /* !CONFIG_PRESCAN_RXQ */
-static int prescan_receive_queue;
-
 static void process_ecn(struct hfi1_qp *qp, struct hfi1_ib_header *hdr,
 			struct hfi1_other_headers *ohdr,
 			u64 rhf, u32 bth1, struct ib_grh *grh)
@@ -541,15 +547,19 @@ static inline void update_ps_mdata(struct ps_mdata *mdata)
  * containing Excplicit Congestion Notifications (FECNs, or BECNs).
  * When an ECN is found, process the Congestion Notification, and toggle
  * it off.
+ * This is declared as a macro to allow quick checking of the module param and
+ * avoid the overhead of a function call if not enabled.
  */
-static void prescan_rxq(struct hfi1_packet *packet)
+#define prescan_rxq(packet) \
+	do { \
+		if (prescan_rx_queue) \
+			__prescan_rxq(packet); \
+	} while (0)
+static void __prescan_rxq(struct hfi1_packet *packet)
 {
 	struct hfi1_ctxtdata *rcd = packet->rcd;
 	struct ps_mdata mdata;
 
-	if (!prescan_receive_queue)
-		return;
-
 	init_ps_mdata(&mdata, packet);
 
 	while (1) {
-- 
1.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection
       [not found]     ` <1446689411-22263-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-11-05  7:56       ` Dan Carpenter
  2015-11-05 17:02         ` ira.weiny
  0 siblings, 1 reply; 7+ messages in thread
From: Dan Carpenter @ 2015-11-05  7:56 UTC (permalink / raw)
  To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	dledford-H+wXaHxf7aLQT0dZR+AlfA, Vennila Megavannan

On Wed, Nov 04, 2015 at 09:10:11PM -0500, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> From: Vennila Megavannan <vennila.megavannan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> 
> Add a module paramter to toggle prescan/Fast ECN Detection.
> 
> In addition change the PRESCAN_RXQ Kconfig default to "yes".
> 
> Reviewed-by: Arthur Kepner<arthur.kepner-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Mike Marciniszyn<mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Vennila Megavannan<vennila.megavannan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Hm...  In the original code we had the config entry but the code to
support this was actually disabled.  Now we are turning on the config
entry by default but the module parameter defaults to disabled.  I think
the documentation is a bit tricky because it should say that actually
enabling the config is not enough, it's still disabled.

Do we really need the config to be there?  Distros are going to enable
it.  Who is it who wants to disable the config?  Also is it better to
default to off or on for this code, what are the upsides and downsides
of that choice?

regards,
dan carpenter

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection
  2015-11-05  7:56       ` Dan Carpenter
@ 2015-11-05 17:02         ` ira.weiny
  2015-11-07  1:05           ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: ira.weiny @ 2015-11-05 17:02 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	dledford-H+wXaHxf7aLQT0dZR+AlfA, Vennila Megavannan

On Thu, Nov 05, 2015 at 10:56:27AM +0300, Dan Carpenter wrote:
> On Wed, Nov 04, 2015 at 09:10:11PM -0500, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> > From: Vennila Megavannan <vennila.megavannan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > 
> > Add a module paramter to toggle prescan/Fast ECN Detection.
> > 
> > In addition change the PRESCAN_RXQ Kconfig default to "yes".
> > 
> > Reviewed-by: Arthur Kepner<arthur.kepner-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Reviewed-by: Mike Marciniszyn<mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Signed-off-by: Vennila Megavannan<vennila.megavannan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> 
> Hm...  In the original code we had the config entry but the code to
> support this was actually disabled.  Now we are turning on the config
> entry by default but the module parameter defaults to disabled.  I think
> the documentation is a bit tricky because it should say that actually
> enabling the config is not enough, it's still disabled.

We tried to make it clear but obviously missed the mark.

> 
> Do we really need the config to be there?  Distros are going to enable it.
> Who is it who wants to disable the config?

The config is maintained to be able to fully remove the code to obtain the
maximum receive performance _if_ the customer knows that they will never need
the prescanning.  This is the _ultimate_ in performance on this path.  We
anticipate some HPC customers who build their own kernels may want this option.

>
> Also is it better to
> default to off or on for this code, what are the upsides and downsides
> of that choice?
> 

I'll update the commit message.  But will also explain here.

Enabling the code with the config option introduces a _slight_ performance
impact but allows the user to determine if this congestion control fix should
be active or not at run time.  You are correct that most distros will enable
the config options (default Yes).  Therefore this becomes a system admin
control item for the fabric being configured.  After testing we found that this
was the best compromise for most systems in terms of performance and
flexibility.  This is because prescanning is not required most of the time.
Should a customer find that their topology and/or workload requires prescanning
they can then enable this option at run time.  This will result in a
significant reduction of performance at the node level but may for those
customers result in better overall fabric/cluster performance.

So in conclusion we have 3 performance levels and the combination in this patch
puts us at the "middle one".

Thanks,
Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection
  2015-11-05 17:02         ` ira.weiny
@ 2015-11-07  1:05           ` Greg KH
  0 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2015-11-07  1:05 UTC (permalink / raw)
  To: ira.weiny; +Cc: devel, linux-rdma, dledford, Vennila Megavannan, Dan Carpenter

On Thu, Nov 05, 2015 at 12:02:25PM -0500, ira.weiny wrote:
> On Thu, Nov 05, 2015 at 10:56:27AM +0300, Dan Carpenter wrote:
> > On Wed, Nov 04, 2015 at 09:10:11PM -0500, ira.weiny@intel.com wrote:
> > > From: Vennila Megavannan <vennila.megavannan@intel.com>
> > > 
> > > Add a module paramter to toggle prescan/Fast ECN Detection.
> > > 
> > > In addition change the PRESCAN_RXQ Kconfig default to "yes".
> > > 
> > > Reviewed-by: Arthur Kepner<arthur.kepner@intel.com>
> > > Reviewed-by: Mike Marciniszyn<mike.marciniszyn@intel.com>
> > > Signed-off-by: Vennila Megavannan<vennila.megavannan@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > Hm...  In the original code we had the config entry but the code to
> > support this was actually disabled.  Now we are turning on the config
> > entry by default but the module parameter defaults to disabled.  I think
> > the documentation is a bit tricky because it should say that actually
> > enabling the config is not enough, it's still disabled.
> 
> We tried to make it clear but obviously missed the mark.
> 
> > 
> > Do we really need the config to be there?  Distros are going to enable it.
> > Who is it who wants to disable the config?
> 
> The config is maintained to be able to fully remove the code to obtain the
> maximum receive performance _if_ the customer knows that they will never need
> the prescanning.  This is the _ultimate_ in performance on this path.  We
> anticipate some HPC customers who build their own kernels may want this option.
> 
> >
> > Also is it better to
> > default to off or on for this code, what are the upsides and downsides
> > of that choice?
> > 
> 
> I'll update the commit message.  But will also explain here.
> 
> Enabling the code with the config option introduces a _slight_ performance
> impact but allows the user to determine if this congestion control fix should
> be active or not at run time.  You are correct that most distros will enable
> the config options (default Yes).  Therefore this becomes a system admin
> control item for the fabric being configured.  After testing we found that this
> was the best compromise for most systems in terms of performance and
> flexibility.  This is because prescanning is not required most of the time.
> Should a customer find that their topology and/or workload requires prescanning
> they can then enable this option at run time.  This will result in a
> significant reduction of performance at the node level but may for those
> customers result in better overall fabric/cluster performance.
> 
> So in conclusion we have 3 performance levels and the combination in this patch
> puts us at the "middle one".

Can't you just make this a dynamic thing, not a build-time thing?  No
one will rebuild their kernels (distros forbid this for obvious support
reasons), so making this work "automatically" is the best thing.  Second
best is having the ability to "tune" it by hand, but you really don't
want that.  Worse case is having a build-time option, which you chose :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-11-07  1:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-05  2:10 [PATCH 0/3] staging/rdma/hfi1: complete fixes for ECN detection ira.weiny-ral2JQCrhuEAvxtiuMwx3w
     [not found] ` <1446689411-22263-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-11-05  2:10   ` [PATCH 1/3] staging/rdma/hfi1: don't cache "prescan head" ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2015-11-05  2:10   ` [PATCH 2/3] staging/rdma/hfi1: optionally prescan rx queue for {B,F}ECNs - UC, RC ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2015-11-05  2:10   ` [PATCH 3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection ira.weiny-ral2JQCrhuEAvxtiuMwx3w
     [not found]     ` <1446689411-22263-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-11-05  7:56       ` Dan Carpenter
2015-11-05 17:02         ` ira.weiny
2015-11-07  1:05           ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).