* [PATCH 00/20] IB/hfi1, qib, rdmavt: Another round of patches for 4.11
@ 2017-03-01 18:21 Dennis Dalessandro
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Mike Marciniszyn, Dean Luick, Jakub Byczkowski, Tadeusz Struk,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny, Brian Welty,
	Michael J. Ruhl, Easwar Hariharan, Don Hiatt, Sebastian Sanchez

Doug,

Here is another round of patches for 4.11. Along with the usual bug fixes
and general improvements, of particular interest are new versions of the two
fault injection patches that you didn't take from the first set. We decided
to go ahead and use the already existing config variable for those. The
other interesting thing here is a patch to the IB core for MGID/MLID
checking.

Patches apply on top of Linus' master branch, which includes your most recent
pull request, so they should apply equally well to your tree. Patches can
also be found in my GitHub repo at:
https://github.com/ddalessa/kernel/tree/for-4.11

---

Dean Luick (1):
      IB/hfi1: Force logical link down

Don Hiatt (2):
      IB/hfi1: Add receive fault injection feature
      IB/hfi1: Add transmit fault injection feature

Easwar Hariharan (1):
      IB/hfi1: Check for QSFP presence before attempting reads

Michael J. Ruhl (5):
      IB/hfi1: Race hazard avoidance in user SDMA driver
      IB/hfi1: Cache registers during state change
      IB/hfi1: Add a patch value to the firmware version string
      IB/hfi1: Ensure VL index is within bounds
      IB/core: If the MGID/MLID pair is not on the list return an error

Mike Marciniszyn (7):
      IB/rdmavt,IB/hfi1,IB/qib: Make wc opcode translation driver dependent
      IB/rdmavt: Add additional fields to post send trace
      IB/rdmavt: Add tracing for cq entry and poll
      IB/rdmavt: Add swqe completion trace
      IB/rdmavt: Avoid reseting wqe send_flags in unreserve
      IB/hfi1: Eliminate synchronize_rcu() in mr delete
      IB/rdmavt,IB/qib,IB/hfi1: Make percpu refcount optional for user MRs

Sebastian Sanchez (2):
      IB/hfi1: NULL pointer dereference when freeing rhashtable
      IB/rdmavt,IB/hfi1: Fix timer migration regressions

Tadeusz Struk (2):
      IB/hfi1: Check device id early during init
      IB/hfi1: Protect the global dev_cntr_names and port_cntr_names


 drivers/infiniband/core/uverbs_cmd.c    |   13 +-
 drivers/infiniband/hw/hfi1/chip.c       |  178 ++++++++++++++++++++----
 drivers/infiniband/hw/hfi1/chip.h       |   18 +-
 drivers/infiniband/hw/hfi1/debugfs.c    |  230 +++++++++++++++++++++++++++++++
 drivers/infiniband/hw/hfi1/debugfs.h    |   41 ++++++
 drivers/infiniband/hw/hfi1/driver.c     |   19 +++
 drivers/infiniband/hw/hfi1/firmware.c   |   14 +-
 drivers/infiniband/hw/hfi1/hfi.h        |   11 +
 drivers/infiniband/hw/hfi1/init.c       |   19 +--
 drivers/infiniband/hw/hfi1/rc.c         |   12 +-
 drivers/infiniband/hw/hfi1/ruc.c        |    6 +
 drivers/infiniband/hw/hfi1/sdma.c       |   43 ++++--
 drivers/infiniband/hw/hfi1/trace_misc.h |   48 ++++++
 drivers/infiniband/hw/hfi1/trace_rc.h   |    7 -
 drivers/infiniband/hw/hfi1/trace_tx.h   |   43 ++++++
 drivers/infiniband/hw/hfi1/user_sdma.c  |    3 
 drivers/infiniband/hw/hfi1/verbs.c      |  104 ++++++++++++--
 drivers/infiniband/hw/hfi1/verbs.h      |    5 +
 drivers/infiniband/hw/qib/qib_rc.c      |   10 +
 drivers/infiniband/hw/qib/qib_ruc.c     |    5 +
 drivers/infiniband/hw/qib/qib_verbs.c   |   20 +++
 drivers/infiniband/sw/rdmavt/cq.c       |    3 
 drivers/infiniband/sw/rdmavt/mr.c       |   55 +++++--
 drivers/infiniband/sw/rdmavt/qp.c       |   32 +---
 drivers/infiniband/sw/rdmavt/trace.h    |    4 -
 drivers/infiniband/sw/rdmavt/trace_cq.h |  127 +++++++++++++++++
 drivers/infiniband/sw/rdmavt/trace_rc.h |  109 +++++++++++++++
 drivers/infiniband/sw/rdmavt/trace_tx.h |   34 ++++-
 include/rdma/ib_pack.h                  |    2 
 include/rdma/rdma_vt.h                  |    1 
 include/rdma/rdmavt_qp.h                |    7 -
 31 files changed, 1077 insertions(+), 146 deletions(-)
 create mode 100644 drivers/infiniband/sw/rdmavt/trace_cq.h
 create mode 100644 drivers/infiniband/sw/rdmavt/trace_rc.h

--
-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 01/20] IB/hfi1: Force logical link down
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-03-01 18:21   ` Dennis Dalessandro
  2017-03-01 18:21   ` [PATCH 02/20] IB/hfi1: Race hazard avoidance in user SDMA driver Dennis Dalessandro
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Easwar Hariharan, Dean Luick,
	Jakub Byczkowski

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If the logical link state does not read as down when
the physical link state is offline, force it to down.
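
For illustration only, the bounded polling that the new
wait_link_transfer_active() helper performs can be sketched in user space
as follows. This is a sketch under stated assumptions, not driver code:
now_ms(), wait_until_ready(), ready_fn, ready_after_n() and never_ready()
are hypothetical names, clock_gettime() stands in for jiffies, and the loop
spins without the udelay() the driver uses between reads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for "read a status register and test it". */
typedef bool (*ready_fn)(void *ctx);

/* Monotonic time in milliseconds (user-space substitute for jiffies). */
long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Poll ready(ctx) until it returns true or wait_ms milliseconds elapse. */
int wait_until_ready(ready_fn ready, void *ctx, int wait_ms)
{
	long long deadline;

	assert(ready != NULL);
	deadline = now_ms() + wait_ms;
	for (;;) {
		if (ready(ctx))
			return 0;
		if (now_ms() > deadline)
			return -ETIMEDOUT;
	}
}

/* Example condition: becomes true once the counter reaches zero. */
bool ready_after_n(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0)
		(*remaining)--;
	return *remaining == 0;
}

/* Example condition that never becomes true, to exercise the timeout. */
bool never_ready(void *ctx)
{
	(void)ctx;
	return false;
}
```

A caller that only wants a best effort attempt, as the forced link down
path does, can ignore the return value; a caller that must not proceed on
timeout propagates it, as do_quick_linkup() does.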

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Jakub Byczkowski <jakub.byczkowski-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Easwar Hariharan <easwar.hariharan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c |   86 +++++++++++++++++++++++++++++--------
 1 files changed, 68 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 121a4c9..44322c6 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -1045,6 +1045,7 @@ static int wait_logical_linkstate(struct hfi1_pportdata *ppd, u32 state,
 static int qos_rmt_entries(struct hfi1_devdata *dd, unsigned int *mp,
 			   unsigned int *np);
 static void clear_full_mgmt_pkey(struct hfi1_pportdata *ppd);
+static int wait_link_transfer_active(struct hfi1_devdata *dd, int wait_ms);
 
 /*
  * Error interrupt table entry.  This is used as input to the interrupt
@@ -8891,8 +8892,6 @@ int send_idle_sma(struct hfi1_devdata *dd, u64 message)
  */
 static int do_quick_linkup(struct hfi1_devdata *dd)
 {
-	u64 reg;
-	unsigned long timeout;
 	int ret;
 
 	lcb_shutdown(dd, 0);
@@ -8915,19 +8914,9 @@ static int do_quick_linkup(struct hfi1_devdata *dd)
 		write_csr(dd, DC_LCB_CFG_RUN,
 			  1ull << DC_LCB_CFG_RUN_EN_SHIFT);
 
-		/* watch LCB_STS_LINK_TRANSFER_ACTIVE */
-		timeout = jiffies + msecs_to_jiffies(10);
-		while (1) {
-			reg = read_csr(dd, DC_LCB_STS_LINK_TRANSFER_ACTIVE);
-			if (reg)
-				break;
-			if (time_after(jiffies, timeout)) {
-				dd_dev_err(dd,
-					   "timeout waiting for LINK_TRANSFER_ACTIVE\n");
-				return -ETIMEDOUT;
-			}
-			udelay(2);
-		}
+		ret = wait_link_transfer_active(dd, 10);
+		if (ret)
+			return ret;
 
 		write_csr(dd, DC_LCB_CFG_ALLOW_LINK_UP,
 			  1ull << DC_LCB_CFG_ALLOW_LINK_UP_VAL_SHIFT);
@@ -10082,6 +10071,64 @@ static void check_lni_states(struct hfi1_pportdata *ppd)
 	decode_state_complete(ppd, last_remote_state, "received");
 }
 
+/* wait for wait_ms for LINK_TRANSFER_ACTIVE to go to 1 */
+static int wait_link_transfer_active(struct hfi1_devdata *dd, int wait_ms)
+{
+	u64 reg;
+	unsigned long timeout;
+
+	/* watch LCB_STS_LINK_TRANSFER_ACTIVE */
+	timeout = jiffies + msecs_to_jiffies(wait_ms);
+	while (1) {
+		reg = read_csr(dd, DC_LCB_STS_LINK_TRANSFER_ACTIVE);
+		if (reg)
+			break;
+		if (time_after(jiffies, timeout)) {
+			dd_dev_err(dd,
+				   "timeout waiting for LINK_TRANSFER_ACTIVE\n");
+			return -ETIMEDOUT;
+		}
+		udelay(2);
+	}
+	return 0;
+}
+
+/* called when the logical link state is not down as it should be */
+static void force_logical_link_state_down(struct hfi1_pportdata *ppd)
+{
+	struct hfi1_devdata *dd = ppd->dd;
+
+	/*
+	 * Bring link up in LCB loopback
+	 */
+	write_csr(dd, DC_LCB_CFG_TX_FIFOS_RESET, 1);
+	write_csr(dd, DC_LCB_CFG_IGNORE_LOST_RCLK,
+		  DC_LCB_CFG_IGNORE_LOST_RCLK_EN_SMASK);
+
+	write_csr(dd, DC_LCB_CFG_LANE_WIDTH, 0);
+	write_csr(dd, DC_LCB_CFG_REINIT_AS_SLAVE, 0);
+	write_csr(dd, DC_LCB_CFG_CNT_FOR_SKIP_STALL, 0x110);
+	write_csr(dd, DC_LCB_CFG_LOOPBACK, 0x2);
+
+	write_csr(dd, DC_LCB_CFG_TX_FIFOS_RESET, 0);
+	(void)read_csr(dd, DC_LCB_CFG_TX_FIFOS_RESET);
+	udelay(3);
+	write_csr(dd, DC_LCB_CFG_ALLOW_LINK_UP, 1);
+	write_csr(dd, DC_LCB_CFG_RUN, 1ull << DC_LCB_CFG_RUN_EN_SHIFT);
+
+	wait_link_transfer_active(dd, 100);
+
+	/*
+	 * Bring the link down again.
+	 */
+	write_csr(dd, DC_LCB_CFG_TX_FIFOS_RESET, 1);
+	write_csr(dd, DC_LCB_CFG_ALLOW_LINK_UP, 0);
+	write_csr(dd, DC_LCB_CFG_IGNORE_LOST_RCLK, 0);
+
+	/* call again to adjust ppd->statusp, if needed */
+	get_logical_state(ppd);
+}
+
 /*
  * Helper for set_link_state().  Do not call except from that routine.
  * Expects ppd->hls_mutex to be held.
@@ -10135,15 +10182,18 @@ static int goto_offline(struct hfi1_pportdata *ppd, u8 rem_reason)
 			return ret;
 	}
 
-	/* make sure the logical state is also down */
-	wait_logical_linkstate(ppd, IB_PORT_DOWN, 1000);
-
 	/*
 	 * Now in charge of LCB - must be after the physical state is
 	 * offline.quiet and before host_link_state is changed.
 	 */
 	set_host_lcb_access(dd);
 	write_csr(dd, DC_LCB_ERR_EN, ~0ull); /* watch LCB errors */
+
+	/* make sure the logical state is also down */
+	ret = wait_logical_linkstate(ppd, IB_PORT_DOWN, 1000);
+	if (ret)
+		force_logical_link_state_down(ppd);
+
 	ppd->host_link_state = HLS_LINK_COOLDOWN; /* LCB access allowed */
 
 	if (ppd->port_type == PORT_TYPE_QSFP &&


* [PATCH 02/20] IB/hfi1: Race hazard avoidance in user SDMA driver
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-03-01 18:21   ` [PATCH 01/20] IB/hfi1: Force logical link down Dennis Dalessandro
@ 2017-03-01 18:21   ` Dennis Dalessandro
  2017-03-01 18:21   ` [PATCH 03/20] IB/hfi1: Cache registers during state change Dennis Dalessandro
                     ` (19 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael Ruhl

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Set the errcode before the state and add the smp_wmb() to avoid a
potential race condition with the user.
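
As a rough user-space analogy of this ordering (an assumption made for
illustration; the driver itself uses smp_wmb(), not C11 atomics), the
writer fills in the error code first and only then publishes the status
with a release store. The names comp_entry, publish_completion() and
read_completion() are hypothetical, as are the COMP_* values.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical completion states for this sketch. */
enum comp_state { COMP_FREE = 0, COMP_QUEUED, COMP_COMPLETE, COMP_ERROR };

struct comp_entry {
	int errcode;		/* payload, valid only when status is COMP_ERROR */
	_Atomic int status;	/* flag the reader polls */
};

/* Writer: store the payload first, then publish the status. */
void publish_completion(struct comp_entry *c, enum comp_state state, int ret)
{
	assert(c != NULL);
	if (state == COMP_ERROR)
		c->errcode = -ret;
	/* release: the errcode store cannot be reordered after this store */
	atomic_store_explicit(&c->status, state, memory_order_release);
}

/* Reader: load the status first, then read the payload if it applies. */
int read_completion(struct comp_entry *c, int *errcode)
{
	int state;

	assert(c != NULL && errcode != NULL);
	state = atomic_load_explicit(&c->status, memory_order_acquire);
	if (state == COMP_ERROR)
		*errcode = c->errcode;
	return state;
}
```

With the stores in the opposite order, a reader could observe the error
status and then read an errcode that has not been written yet.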

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index e6811c4..060e374 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -1615,9 +1615,10 @@ static inline void set_comp_state(struct hfi1_user_sdma_pkt_q *pq,
 {
 	hfi1_cdbg(SDMA, "[%u:%u:%u:%u] Setting completion status %u %d",
 		  pq->dd->unit, pq->ctxt, pq->subctxt, idx, state, ret);
-	cq->comps[idx].status = state;
 	if (state == ERROR)
 		cq->comps[idx].errcode = -ret;
+	smp_wmb(); /* make sure errcode is visible first */
+	cq->comps[idx].status = state;
 	trace_hfi1_sdma_user_completion(pq->dd, pq->ctxt, pq->subctxt,
 					idx, state, ret);
 }


* [PATCH 03/20] IB/hfi1: Cache registers during state change
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-03-01 18:21   ` [PATCH 01/20] IB/hfi1: Force logical link down Dennis Dalessandro
  2017-03-01 18:21   ` [PATCH 02/20] IB/hfi1: Race hazard avoidance in user SDMA driver Dennis Dalessandro
@ 2017-03-01 18:21   ` Dennis Dalessandro
  2017-03-01 18:21   ` [PATCH 04/20] IB/hfi1: NULL pointer dereference when freeing rhashtable Dennis Dalessandro
                     ` (18 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

When the LCB is going offline, inopportune port queries can cause
benign error messages to be logged.  To deal with this, cache the
registers just before setting the LCB to offline, allowing queries to
return without eliciting the error.
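
A minimal stand-alone sketch of the caching idea, assuming a small fixed
table keyed by register offset. The offsets REG_A, REG_B and REG_C, the
reader fake_hw_read() and the function names are hypothetical and only
mirror the shape of the lcb_cache code in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets; not real LCB CSR addresses. */
enum { REG_A = 0x10, REG_B = 0x18, REG_C = 0x20 };

struct reg_datum {
	uint32_t off;
	uint64_t val;
};

struct reg_datum reg_cache[] = {
	{ REG_A, 0 },
	{ REG_B, 0 },
	{ REG_C, 0 },
};

#define REG_CACHE_LEN (sizeof(reg_cache) / sizeof(reg_cache[0]))

/* Hypothetical hardware read: returns 0 on success, -EBUSY on no access. */
typedef int (*reg_read_fn)(uint32_t off, uint64_t *val);

/* Refresh every cached register, keeping the old value if a read fails. */
void refresh_reg_cache(reg_read_fn read_hw)
{
	size_t i;

	assert(read_hw != NULL);
	for (i = 0; i < REG_CACHE_LEN; i++) {
		uint64_t val = 0;

		if (read_hw(reg_cache[i].off, &val) == 0)
			reg_cache[i].val = val;
	}
}

/* Look up a cached value; returns 0 on a hit, -1 for an unknown offset. */
int read_reg_cache(uint32_t off, uint64_t *val)
{
	size_t i;

	for (i = 0; i < REG_CACHE_LEN; i++) {
		if (reg_cache[i].off == off) {
			*val = reg_cache[i].val;
			return 0;
		}
	}
	return -1;
}

/* Example reader: REG_C is "busy", the others return twice their offset. */
int fake_hw_read(uint32_t off, uint64_t *val)
{
	if (off == REG_C)
		return -EBUSY;
	*val = (uint64_t)off * 2;
	return 0;
}
```

The refresh has to run while the registers are still readable, which is why
the patch calls update_lcb_cache() before host_link_state is changed.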

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c |   58 +++++++++++++++++++++++++++++++++++--
 1 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 44322c6..8b8840a 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -8345,6 +8345,52 @@ static int read_lcb_via_8051(struct hfi1_devdata *dd, u32 addr, u64 *data)
 }
 
 /*
+ * Provide a cache for some of the LCB registers in case the LCB is
+ * unavailable.
+ * (The LCB is unavailable in certain link states, for example.)
+ */
+struct lcb_datum {
+	u32 off;
+	u64 val;
+};
+
+static struct lcb_datum lcb_cache[] = {
+	{ DC_LCB_ERR_INFO_RX_REPLAY_CNT, 0},
+	{ DC_LCB_ERR_INFO_SEQ_CRC_CNT, 0 },
+	{ DC_LCB_ERR_INFO_REINIT_FROM_PEER_CNT, 0 },
+};
+
+static void update_lcb_cache(struct hfi1_devdata *dd)
+{
+	int i;
+	int ret;
+	u64 val;
+
+	for (i = 0; i < ARRAY_SIZE(lcb_cache); i++) {
+		ret = read_lcb_csr(dd, lcb_cache[i].off, &val);
+
+		/* Update if we get good data */
+		if (likely(ret != -EBUSY))
+			lcb_cache[i].val = val;
+	}
+}
+
+static int read_lcb_cache(u32 off, u64 *val)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(lcb_cache); i++) {
+		if (lcb_cache[i].off == off) {
+			*val = lcb_cache[i].val;
+			return 0;
+		}
+	}
+
+	pr_warn("%s bad offset 0x%x\n", __func__, off);
+	return -1;
+}
+
+/*
  * Read an LCB CSR.  Access may not be in host control, so check.
  * Return 0 on success, -EBUSY on failure.
  */
@@ -8355,9 +8401,13 @@ int read_lcb_csr(struct hfi1_devdata *dd, u32 addr, u64 *data)
 	/* if up, go through the 8051 for the value */
 	if (ppd->host_link_state & HLS_UP)
 		return read_lcb_via_8051(dd, addr, data);
-	/* if going up or down, no access */
-	if (ppd->host_link_state & (HLS_GOING_UP | HLS_GOING_OFFLINE))
-		return -EBUSY;
+	/* if going up or down, check the cache, otherwise, no access */
+	if (ppd->host_link_state & (HLS_GOING_UP | HLS_GOING_OFFLINE)) {
+		if (read_lcb_cache(addr, data))
+			return -EBUSY;
+		return 0;
+	}
+
 	/* otherwise, host has access */
 	*data = read_csr(dd, addr);
 	return 0;
@@ -10145,6 +10195,8 @@ static int goto_offline(struct hfi1_pportdata *ppd, u8 rem_reason)
 	int do_transition;
 	int do_wait;
 
+	update_lcb_cache(dd);
+
 	previous_state = ppd->host_link_state;
 	ppd->host_link_state = HLS_GOING_OFFLINE;
 	pstate = read_physical_state(dd);


* [PATCH 04/20] IB/hfi1: NULL pointer dereference when freeing rhashtable
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-03-01 18:21   ` [PATCH 03/20] IB/hfi1: Cache registers during state change Dennis Dalessandro
@ 2017-03-01 18:21   ` Dennis Dalessandro
  2017-03-01 18:22   ` [PATCH 05/20] IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependent Dennis Dalessandro
                     ` (17 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn, Sebastian Sanchez

From: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

A NULL pointer dereference occurs when the driver is unloaded and
the SDMA rhashtable is freed without rhashtable_init() having
been called. Prevent this by changing sdma_rht to be a pointer
to a dynamically allocated hash table. A non-NULL pointer
indicates that the hash table was initialized and that it
needs to be destroyed.
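
The pattern can be sketched outside the kernel as follows. This is an
illustration under assumptions, not the driver code: struct table stands in
for struct rhashtable, and table_init(), table_destroy(), device_init() and
device_clean() are hypothetical names. The pointer is only published once
initialization has succeeded, so cleanup can test it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a table that needs init and destroy calls. */
struct table {
	int *buckets;
	size_t nbuckets;
};

struct device {
	struct table *tbl;	/* NULL until the table is fully initialized */
};

int table_init(struct table *t, size_t n)
{
	t->buckets = calloc(n, sizeof(*t->buckets));
	if (!t->buckets)
		return -ENOMEM;
	t->nbuckets = n;
	return 0;
}

void table_destroy(struct table *t)
{
	free(t->buckets);
	t->buckets = NULL;
	t->nbuckets = 0;
}

/* Allocate and initialize into a temporary; publish only on success. */
int device_init(struct device *dev, size_t n)
{
	struct table *tmp;
	int ret;

	assert(dev != NULL);
	tmp = calloc(1, sizeof(*tmp));
	if (!tmp)
		return -ENOMEM;
	ret = table_init(tmp, n);
	if (ret < 0) {
		free(tmp);	/* choice made by this sketch */
		return ret;
	}
	dev->tbl = tmp;
	return 0;
}

/* Safe to call whether or not device_init() ever succeeded. */
void device_clean(struct device *dev)
{
	assert(dev != NULL);
	if (dev->tbl) {
		table_destroy(dev->tbl);
		free(dev->tbl);
		dev->tbl = NULL;
	}
}
```

Because the cleanup path resets the pointer to NULL, calling it a second
time is harmless.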

Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/hfi.h  |    2 +-
 drivers/infiniband/hw/hfi1/sdma.c |   38 ++++++++++++++++++++++++++-----------
 2 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 0808e3c..b69ab47 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1167,7 +1167,7 @@ struct hfi1_devdata {
 	bool eprom_available;	/* true if EPROM is available for this device */
 	bool aspm_supported;	/* Does HW support ASPM */
 	bool aspm_enabled;	/* ASPM state: enabled/disabled */
-	struct rhashtable sdma_rht;
+	struct rhashtable *sdma_rht;
 
 	struct kobject kobj;
 };
diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index 1d81cac..9bee28d 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -868,7 +868,7 @@ struct sdma_engine *sdma_select_user_engine(struct hfi1_devdata *dd,
 
 	cpu_id = smp_processor_id();
 	rcu_read_lock();
-	rht_node = rhashtable_lookup_fast(&dd->sdma_rht, &cpu_id,
+	rht_node = rhashtable_lookup_fast(dd->sdma_rht, &cpu_id,
 					  sdma_rht_params);
 
 	if (rht_node && rht_node->map[vl]) {
@@ -962,7 +962,7 @@ ssize_t sdma_set_cpu_to_sde_map(struct sdma_engine *sde, const char *buf,
 			continue;
 		}
 
-		rht_node = rhashtable_lookup_fast(&dd->sdma_rht, &cpu,
+		rht_node = rhashtable_lookup_fast(dd->sdma_rht, &cpu,
 						  sdma_rht_params);
 		if (!rht_node) {
 			rht_node = kzalloc(sizeof(*rht_node), GFP_KERNEL);
@@ -982,7 +982,7 @@ ssize_t sdma_set_cpu_to_sde_map(struct sdma_engine *sde, const char *buf,
 			rht_node->map[vl]->ctr = 1;
 			rht_node->map[vl]->sde[0] = sde;
 
-			ret = rhashtable_insert_fast(&dd->sdma_rht,
+			ret = rhashtable_insert_fast(dd->sdma_rht,
 						     &rht_node->node,
 						     sdma_rht_params);
 			if (ret) {
@@ -1025,7 +1025,7 @@ ssize_t sdma_set_cpu_to_sde_map(struct sdma_engine *sde, const char *buf,
 		if (cpumask_test_cpu(cpu, mask))
 			continue;
 
-		rht_node = rhashtable_lookup_fast(&dd->sdma_rht, &cpu,
+		rht_node = rhashtable_lookup_fast(dd->sdma_rht, &cpu,
 						  sdma_rht_params);
 		if (rht_node) {
 			bool empty = true;
@@ -1049,7 +1049,7 @@ ssize_t sdma_set_cpu_to_sde_map(struct sdma_engine *sde, const char *buf,
 			}
 
 			if (empty) {
-				ret = rhashtable_remove_fast(&dd->sdma_rht,
+				ret = rhashtable_remove_fast(dd->sdma_rht,
 							     &rht_node->node,
 							     sdma_rht_params);
 				WARN_ON(ret);
@@ -1108,7 +1108,7 @@ void sdma_seqfile_dump_cpu_list(struct seq_file *s,
 	struct sdma_rht_node *rht_node;
 	int i, j;
 
-	rht_node = rhashtable_lookup_fast(&dd->sdma_rht, &cpuid,
+	rht_node = rhashtable_lookup_fast(dd->sdma_rht, &cpuid,
 					  sdma_rht_params);
 	if (!rht_node)
 		return;
@@ -1322,6 +1322,12 @@ static void sdma_clean(struct hfi1_devdata *dd, size_t num_engines)
 	synchronize_rcu();
 	kfree(dd->per_sdma);
 	dd->per_sdma = NULL;
+
+	if (dd->sdma_rht) {
+		rhashtable_free_and_destroy(dd->sdma_rht, sdma_rht_free, NULL);
+		kfree(dd->sdma_rht);
+		dd->sdma_rht = NULL;
+	}
 }
 
 /**
@@ -1341,12 +1347,14 @@ int sdma_init(struct hfi1_devdata *dd, u8 port)
 {
 	unsigned this_idx;
 	struct sdma_engine *sde;
+	struct rhashtable *tmp_sdma_rht;
 	u16 descq_cnt;
 	void *curr_head;
 	struct hfi1_pportdata *ppd = dd->pport + port;
 	u32 per_sdma_credits;
 	uint idle_cnt = sdma_idle_cnt;
 	size_t num_engines = dd->chip_sdma_engines;
+	int ret = -ENOMEM;
 
 	if (!HFI1_CAP_IS_KSET(SDMA)) {
 		HFI1_CAP_CLEAR(SDMA_AHG);
@@ -1378,7 +1386,7 @@ int sdma_init(struct hfi1_devdata *dd, u8 port)
 	/* alloc memory for array of send engines */
 	dd->per_sdma = kcalloc(num_engines, sizeof(*dd->per_sdma), GFP_KERNEL);
 	if (!dd->per_sdma)
-		return -ENOMEM;
+		return ret;
 
 	idle_cnt = ns_to_cclock(dd, idle_cnt);
 	if (!sdma_desct_intr)
@@ -1507,18 +1515,27 @@ int sdma_init(struct hfi1_devdata *dd, u8 port)
 	dd->flags |= HFI1_HAS_SEND_DMA;
 	dd->flags |= idle_cnt ? HFI1_HAS_SDMA_TIMEOUT : 0;
 	dd->num_sdma = num_engines;
-	if (sdma_map_init(dd, port, ppd->vls_operational, NULL))
+	ret = sdma_map_init(dd, port, ppd->vls_operational, NULL);
+	if (ret < 0)
+		goto bail;
+
+	tmp_sdma_rht = kzalloc(sizeof(*tmp_sdma_rht), GFP_KERNEL);
+	if (!tmp_sdma_rht) {
+		ret = -ENOMEM;
 		goto bail;
+	}
 
-	if (rhashtable_init(&dd->sdma_rht, &sdma_rht_params))
+	ret = rhashtable_init(tmp_sdma_rht, &sdma_rht_params);
+	if (ret < 0)
 		goto bail;
+	dd->sdma_rht = tmp_sdma_rht;
 
 	dd_dev_info(dd, "SDMA num_sdma: %u\n", dd->num_sdma);
 	return 0;
 
 bail:
 	sdma_clean(dd, num_engines);
-	return -ENOMEM;
+	return ret;
 }
 
 /**
@@ -1604,7 +1621,6 @@ void sdma_exit(struct hfi1_devdata *dd)
 		sdma_finalput(&sde->state);
 	}
 	sdma_clean(dd, dd->num_sdma);
-	rhashtable_free_and_destroy(&dd->sdma_rht, sdma_rht_free, NULL);
 }
 
 /*


* [PATCH 05/20] IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependent
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-03-01 18:21   ` [PATCH 04/20] IB/hfi1: NULL pointer dereference when freeing rhashtable Dennis Dalessandro
@ 2017-03-01 18:22   ` Dennis Dalessandro
  2017-03-01 18:22   ` [PATCH 06/20] IB/rdmavt: Add additional fields to post send trace Dennis Dalessandro
                     ` (16 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:22 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The work to create a completion helper moved the translation of send
wqe operations to completion opcodes to rdmavt.

This precludes having driver dependent operations.  Make the translation
driver dependent by doing the translation in the driver prior to the
rvt_qp_swqe_complete() call using restored translation tables.
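
In outline, the shared helper stops consulting its own table and takes the
already translated opcode from the caller. The sketch below is a simplified
stand-alone illustration with hypothetical names (wr_opcode, wc_opcode,
swqe_complete(), drv_a_wc_opcode, drv_a_send_complete()); it is not the
rdmavt API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced opcode sets for this sketch. */
enum wr_opcode { WR_RDMA_WRITE, WR_SEND, WR_RDMA_READ, WR_LOCAL_INV,
		 WR_OPCODE_MAX };
enum wc_opcode { WC_SEND, WC_RDMA_WRITE, WC_RDMA_READ, WC_LOCAL_INV };

struct completion {
	enum wc_opcode opcode;
	int status;
};

/* Shared helper: records whatever completion opcode the driver passes. */
void swqe_complete(struct completion *wc, enum wc_opcode opcode, int status)
{
	assert(wc != NULL);
	wc->opcode = opcode;
	wc->status = status;
}

/* Driver-owned translation table; each driver may map opcodes its own way. */
const enum wc_opcode drv_a_wc_opcode[WR_OPCODE_MAX] = {
	[WR_RDMA_WRITE] = WC_RDMA_WRITE,
	[WR_SEND] = WC_SEND,
	[WR_RDMA_READ] = WC_RDMA_READ,
	[WR_LOCAL_INV] = WC_LOCAL_INV,
};

/* Driver path: translate first, then call the shared helper. */
void drv_a_send_complete(struct completion *wc, enum wr_opcode op, int status)
{
	assert(op < WR_OPCODE_MAX);
	swqe_complete(wc, drv_a_wc_opcode[op], status);
}
```

A second driver would supply its own table, possibly with fewer or
different entries, without any change to the shared helper.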

Fixes: f2dc9cdce83c ("IB/rdmavt: Add a send completion helper")
Fixes: 0771da5a6e9d ("IB/hfi1,IB/qib: Use new send completion helper")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/rc.c       |   10 ++++++++--
 drivers/infiniband/hw/hfi1/ruc.c      |    5 ++++-
 drivers/infiniband/hw/hfi1/verbs.c    |   16 ++++++++++++++++
 drivers/infiniband/hw/qib/qib_rc.c    |   10 ++++++++--
 drivers/infiniband/hw/qib/qib_ruc.c   |    5 ++++-
 drivers/infiniband/hw/qib/qib_verbs.c |   13 +++++++++++++
 drivers/infiniband/sw/rdmavt/qp.c     |   17 -----------------
 include/rdma/rdmavt_qp.h              |    3 ++-
 8 files changed, 55 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index 7382be1..4649530 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -1034,7 +1034,10 @@ void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr)
 		/* see post_send() */
 		barrier();
 		rvt_put_swqe(wqe);
-		rvt_qp_swqe_complete(qp, wqe, IB_WC_SUCCESS);
+		rvt_qp_swqe_complete(qp,
+				     wqe,
+				     ib_hfi1_wc_opcode[wqe->wr.opcode],
+				     IB_WC_SUCCESS);
 	}
 	/*
 	 * If we were waiting for sends to complete before re-sending,
@@ -1081,7 +1084,10 @@ static inline void update_last_psn(struct rvt_qp *qp, u32 psn)
 		qp->s_last = s_last;
 		/* see post_send() */
 		barrier();
-		rvt_qp_swqe_complete(qp, wqe, IB_WC_SUCCESS);
+		rvt_qp_swqe_complete(qp,
+				     wqe,
+				     ib_hfi1_wc_opcode[wqe->wr.opcode],
+				     IB_WC_SUCCESS);
 	} else {
 		struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
 
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index aa15bcb..d2eb793 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -920,7 +920,10 @@ void hfi1_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
 	    qp->ibqp.qp_type == IB_QPT_GSI)
 		atomic_dec(&ibah_to_rvtah(wqe->ud_wr.ah)->refcount);
 
-	rvt_qp_swqe_complete(qp, wqe, status);
+	rvt_qp_swqe_complete(qp,
+			     wqe,
+			     ib_hfi1_wc_opcode[wqe->wr.opcode],
+			     status);
 
 	if (qp->s_acked == old_last)
 		qp->s_acked = last;
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 222315f..815cb44 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -297,6 +297,22 @@ static inline bool wss_exceeds_threshold(void)
 }
 
 /*
+ * Translate ib_wr_opcode into ib_wc_opcode.
+ */
+const enum ib_wc_opcode ib_hfi1_wc_opcode[] = {
+	[IB_WR_RDMA_WRITE] = IB_WC_RDMA_WRITE,
+	[IB_WR_RDMA_WRITE_WITH_IMM] = IB_WC_RDMA_WRITE,
+	[IB_WR_SEND] = IB_WC_SEND,
+	[IB_WR_SEND_WITH_IMM] = IB_WC_SEND,
+	[IB_WR_RDMA_READ] = IB_WC_RDMA_READ,
+	[IB_WR_ATOMIC_CMP_AND_SWP] = IB_WC_COMP_SWAP,
+	[IB_WR_ATOMIC_FETCH_AND_ADD] = IB_WC_FETCH_ADD,
+	[IB_WR_SEND_WITH_INV] = IB_WC_SEND,
+	[IB_WR_LOCAL_INV] = IB_WC_LOCAL_INV,
+	[IB_WR_REG_MR] = IB_WC_REG_MR
+};
+
+/*
  * Length of header by opcode, 0 --> not supported
  */
 const u8 hdr_len_by_opcode[256] = {
diff --git a/drivers/infiniband/hw/qib/qib_rc.c b/drivers/infiniband/hw/qib/qib_rc.c
index 12658e3..0234987 100644
--- a/drivers/infiniband/hw/qib/qib_rc.c
+++ b/drivers/infiniband/hw/qib/qib_rc.c
@@ -938,7 +938,10 @@ void qib_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr)
 		/* see post_send() */
 		barrier();
 		rvt_put_swqe(wqe);
-		rvt_qp_swqe_complete(qp, wqe, IB_WC_SUCCESS);
+		rvt_qp_swqe_complete(qp,
+				     wqe,
+				     ib_qib_wc_opcode[wqe->wr.opcode],
+				     IB_WC_SUCCESS);
 	}
 	/*
 	 * If we were waiting for sends to complete before resending,
@@ -983,7 +986,10 @@ static inline void update_last_psn(struct rvt_qp *qp, u32 psn)
 		qp->s_last = s_last;
 		/* see post_send() */
 		barrier();
-		rvt_qp_swqe_complete(qp, wqe, IB_WC_SUCCESS);
+		rvt_qp_swqe_complete(qp,
+				     wqe,
+				     ib_qib_wc_opcode[wqe->wr.opcode],
+				     IB_WC_SUCCESS);
 	} else
 		this_cpu_inc(*ibp->rvp.rc_delayed_comp);
 
diff --git a/drivers/infiniband/hw/qib/qib_ruc.c b/drivers/infiniband/hw/qib/qib_ruc.c
index 17655cc..6e1adf7 100644
--- a/drivers/infiniband/hw/qib/qib_ruc.c
+++ b/drivers/infiniband/hw/qib/qib_ruc.c
@@ -769,7 +769,10 @@ void qib_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
 	    qp->ibqp.qp_type == IB_QPT_GSI)
 		atomic_dec(&ibah_to_rvtah(wqe->ud_wr.ah)->refcount);
 
-	rvt_qp_swqe_complete(qp, wqe, status);
+	rvt_qp_swqe_complete(qp,
+			     wqe,
+			     ib_qib_wc_opcode[wqe->wr.opcode],
+			     status);
 
 	if (qp->s_acked == old_last)
 		qp->s_acked = last;
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 83f8b5f..e120efe 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -114,6 +114,19 @@
 MODULE_PARM_DESC(disable_sma, "Disable the SMA");
 
 /*
+ * Translate ib_wr_opcode into ib_wc_opcode.
+ */
+const enum ib_wc_opcode ib_qib_wc_opcode[] = {
+	[IB_WR_RDMA_WRITE] = IB_WC_RDMA_WRITE,
+	[IB_WR_RDMA_WRITE_WITH_IMM] = IB_WC_RDMA_WRITE,
+	[IB_WR_SEND] = IB_WC_SEND,
+	[IB_WR_SEND_WITH_IMM] = IB_WC_SEND,
+	[IB_WR_RDMA_READ] = IB_WC_RDMA_READ,
+	[IB_WR_ATOMIC_CMP_AND_SWP] = IB_WC_COMP_SWAP,
+	[IB_WR_ATOMIC_FETCH_AND_ADD] = IB_WC_FETCH_ADD
+};
+
+/*
  * System image GUID.
  */
 __be64 ib_qib_sys_image_guid;
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index f5ad8d4..28fb724 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -117,23 +117,6 @@
 };
 EXPORT_SYMBOL(ib_rvt_state_ops);
 
-/*
- * Translate ib_wr_opcode into ib_wc_opcode.
- */
-const enum ib_wc_opcode ib_rvt_wc_opcode[] = {
-	[IB_WR_RDMA_WRITE] = IB_WC_RDMA_WRITE,
-	[IB_WR_RDMA_WRITE_WITH_IMM] = IB_WC_RDMA_WRITE,
-	[IB_WR_SEND] = IB_WC_SEND,
-	[IB_WR_SEND_WITH_IMM] = IB_WC_SEND,
-	[IB_WR_RDMA_READ] = IB_WC_RDMA_READ,
-	[IB_WR_ATOMIC_CMP_AND_SWP] = IB_WC_COMP_SWAP,
-	[IB_WR_ATOMIC_FETCH_AND_ADD] = IB_WC_FETCH_ADD,
-	[IB_WR_SEND_WITH_INV] = IB_WC_SEND,
-	[IB_WR_LOCAL_INV] = IB_WC_LOCAL_INV,
-	[IB_WR_REG_MR] = IB_WC_REG_MR
-};
-EXPORT_SYMBOL(ib_rvt_wc_opcode);
-
 static void get_map_page(struct rvt_qpn_table *qpt,
 			 struct rvt_qpn_map *map,
 			 gfp_t gfp)
diff --git a/include/rdma/rdmavt_qp.h b/include/rdma/rdmavt_qp.h
index f381639..3cdd9e2 100644
--- a/include/rdma/rdmavt_qp.h
+++ b/include/rdma/rdmavt_qp.h
@@ -574,6 +574,7 @@ static inline void rvt_qp_wqe_unreserve(
 static inline void rvt_qp_swqe_complete(
 	struct rvt_qp *qp,
 	struct rvt_swqe *wqe,
+	enum ib_wc_opcode opcode,
 	enum ib_wc_status status)
 {
 	if (unlikely(wqe->wr.send_flags & RVT_SEND_RESERVE_USED))
@@ -586,7 +587,7 @@ static inline void rvt_qp_swqe_complete(
 		memset(&wc, 0, sizeof(wc));
 		wc.wr_id = wqe->wr.wr_id;
 		wc.status = status;
-		wc.opcode = ib_rvt_wc_opcode[wqe->wr.opcode];
+		wc.opcode = opcode;
 		wc.qp = &qp->ibqp;
 		wc.byte_len = wqe->length;
 		rvt_cq_enter(ibcq_to_rvtcq(qp->ibqp.send_cq), &wc,

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 06/20] IB/rdmavt: Add additional fields to post send trace
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-03-01 18:22   ` [PATCH 05/20] IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependent Dennis Dalessandro
@ 2017-03-01 18:22   ` Dennis Dalessandro
  2017-03-01 18:22   ` [PATCH 07/20] IB/rdmavt: Add tracing for cq entry and poll Dennis Dalessandro
                     ` (15 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:22 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add fields to the post send trace to provide additional debugging information.

The following fields are added:
- wqe
- qpt
- num_sge
- ssn
- pid
- send_flags

These additional fields provide for more focused filtering
and triggering.

The patch also moves the trace to just before the wqe is posted, so that
it records the most accurate information, and future-proofs the code by
naming all possible reserved opcodes in the trace.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/qp.c       |    2 +-
 drivers/infiniband/sw/rdmavt/trace_tx.h |   34 ++++++++++++++++++++++++++++---
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 28fb724..3c55a8b 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -1772,11 +1772,11 @@ static int rvt_post_one_wr(struct rvt_qp *qp,
 					0);
 		qp->s_next_psn = wqe->lpsn + 1;
 	}
-	trace_rvt_post_one_wr(qp, wqe);
 	if (unlikely(reserved_op))
 		rvt_qp_wqe_reserve(qp, wqe);
 	else
 		qp->s_avail--;
+	trace_rvt_post_one_wr(qp, wqe);
 	smp_wmb(); /* see request builders */
 	qp->s_head = next;
 
diff --git a/drivers/infiniband/sw/rdmavt/trace_tx.h b/drivers/infiniband/sw/rdmavt/trace_tx.h
index 0e03173..a613a22 100644
--- a/drivers/infiniband/sw/rdmavt/trace_tx.h
+++ b/drivers/infiniband/sw/rdmavt/trace_tx.h
@@ -71,10 +71,20 @@
 	wr_opcode_name(RDMA_READ_WITH_INV),                \
 	wr_opcode_name(LOCAL_INV),                         \
 	wr_opcode_name(MASKED_ATOMIC_CMP_AND_SWP),         \
-	wr_opcode_name(MASKED_ATOMIC_FETCH_AND_ADD))
+	wr_opcode_name(MASKED_ATOMIC_FETCH_AND_ADD),       \
+	wr_opcode_name(RESERVED1),                         \
+	wr_opcode_name(RESERVED2),                         \
+	wr_opcode_name(RESERVED3),                         \
+	wr_opcode_name(RESERVED4),                         \
+	wr_opcode_name(RESERVED5),                         \
+	wr_opcode_name(RESERVED6),                         \
+	wr_opcode_name(RESERVED7),                         \
+	wr_opcode_name(RESERVED8),                         \
+	wr_opcode_name(RESERVED9),                         \
+	wr_opcode_name(RESERVED10))
 
 #define POS_PRN \
-"[%s] wr_id %llx qpn %x psn 0x%x lpsn 0x%x length %u opcode 0x%.2x,%s size %u avail %u head %u last %u"
+"[%s] wqe %p wr_id %llx send_flags %x qpn %x qpt %u psn %x lpsn %x ssn %x length %u opcode 0x%.2x,%s size %u avail %u head %u last %u pid %u num_sge %u"
 
 TRACE_EVENT(
 	rvt_post_one_wr,
@@ -83,7 +93,9 @@
 	TP_STRUCT__entry(
 		RDI_DEV_ENTRY(ib_to_rvt(qp->ibqp.device))
 		__field(u64, wr_id)
+		__field(struct rvt_swqe *, wqe)
 		__field(u32, qpn)
+		__field(u32, qpt)
 		__field(u32, psn)
 		__field(u32, lpsn)
 		__field(u32, length)
@@ -92,11 +104,17 @@
 		__field(u32, avail)
 		__field(u32, head)
 		__field(u32, last)
+		__field(u32, ssn)
+		__field(int, send_flags)
+		__field(pid_t, pid)
+		__field(int, num_sge)
 	),
 	TP_fast_assign(
 		RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
+		__entry->wqe = wqe;
 		__entry->wr_id = wqe->wr.wr_id;
 		__entry->qpn = qp->ibqp.qp_num;
+		__entry->qpt = qp->ibqp.qp_type;
 		__entry->psn = wqe->psn;
 		__entry->lpsn = wqe->lpsn;
 		__entry->length = wqe->length;
@@ -105,20 +123,30 @@
 		__entry->avail = qp->s_avail;
 		__entry->head = qp->s_head;
 		__entry->last = qp->s_last;
+		__entry->pid = qp->pid;
+		__entry->ssn = wqe->ssn;
+		__entry->send_flags = wqe->wr.send_flags;
+		__entry->num_sge = wqe->wr.num_sge;
 	),
 	TP_printk(
 		POS_PRN,
 		__get_str(dev),
+		__entry->wqe,
 		__entry->wr_id,
+		__entry->send_flags,
 		__entry->qpn,
+		__entry->qpt,
 		__entry->psn,
 		__entry->lpsn,
+		__entry->ssn,
 		__entry->length,
 		__entry->opcode, show_wr_opcode(__entry->opcode),
 		__entry->size,
 		__entry->avail,
 		__entry->head,
-		__entry->last
+		__entry->last,
+		__entry->pid,
+		__entry->num_sge
 	)
 );
 

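The effect of moving the trace call can be illustrated with a small userspace model. This is a sketch under assumptions, not the rdmavt code: the demo_* types are hypothetical, and the reserved-operation accounting is reduced to a counter. The point is only that a snapshot taken after the accounting step records the queue state as it will be when the wqe is posted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the send queue accounting. */
struct demo_qp {
	unsigned int s_avail;       /* free send queue slots */
	unsigned int reserved_used; /* reserved operations in flight */
};

/* What the trace would record in this model. */
struct demo_trace {
	unsigned int avail;
};

static struct demo_trace demo_post_one_wr(struct demo_qp *qp, bool reserved_op)
{
	struct demo_trace t;

	if (reserved_op) {
		qp->reserved_used++;
	} else {
		assert(qp->s_avail > 0);
		qp->s_avail--;
	}
	/* Snapshot after accounting, mirroring the new trace position. */
	t.avail = qp->s_avail;
	return t;
}
```

With s_avail starting at 4, a normal post is traced with avail 3 in this model; had the snapshot been taken before the accounting, it would still have shown 4.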

* [PATCH 07/20] IB/rdmavt: Add tracing for cq entry and poll
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-03-01 18:22   ` [PATCH 06/20] IB/rdmavt: Add additional fields to post send trace Dennis Dalessandro
@ 2017-03-01 18:22   ` Dennis Dalessandro
  2017-03-01 18:22   ` [PATCH 08/20] IB/rdmavt: Add swqe completion trace Dennis Dalessandro
                     ` (14 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:22 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add trace events for completion queue entry (rvt_cq_enter) and poll
(rvt_cq_poll).

The following fields are defined for filtering and triggering:
- wr_id
- status
- opcode
- qpn
- length
- idx

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/cq.c       |    3 +
 drivers/infiniband/sw/rdmavt/trace.h    |    1 
 drivers/infiniband/sw/rdmavt/trace_cq.h |  127 +++++++++++++++++++++++++++++++
 3 files changed, 131 insertions(+), 0 deletions(-)
 create mode 100644 drivers/infiniband/sw/rdmavt/trace_cq.h

diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 7aa7a4e..0ae2ff8 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -50,6 +50,7 @@
 #include <linux/kthread.h>
 #include "cq.h"
 #include "vt.h"
+#include "trace.h"
 
 /**
  * rvt_cq_enter - add a new entry to the completion queue
@@ -93,6 +94,7 @@ void rvt_cq_enter(struct rvt_cq *cq, struct ib_wc *entry, bool solicited)
 		}
 		return;
 	}
+	trace_rvt_cq_enter(cq, entry, head);
 	if (cq->ip) {
 		wc->uqueue[head].wr_id = entry->wr_id;
 		wc->uqueue[head].status = entry->status;
@@ -482,6 +484,7 @@ int rvt_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry)
 		if (tail == wc->head)
 			break;
 		/* The kernel doesn't need a RMB since it has the lock. */
+		trace_rvt_cq_poll(cq, &wc->kqueue[tail], npolled);
 		*entry = wc->kqueue[tail];
 		if (tail >= cq->ibcq.cqe)
 			tail = 0;
diff --git a/drivers/infiniband/sw/rdmavt/trace.h b/drivers/infiniband/sw/rdmavt/trace.h
index e2d23ac..89554c0 100644
--- a/drivers/infiniband/sw/rdmavt/trace.h
+++ b/drivers/infiniband/sw/rdmavt/trace.h
@@ -52,3 +52,4 @@
 #include "trace_qp.h"
 #include "trace_tx.h"
 #include "trace_mr.h"
+#include "trace_cq.h"
diff --git a/drivers/infiniband/sw/rdmavt/trace_cq.h b/drivers/infiniband/sw/rdmavt/trace_cq.h
new file mode 100644
index 0000000..a315850
--- /dev/null
+++ b/drivers/infiniband/sw/rdmavt/trace_cq.h
@@ -0,0 +1,127 @@
+/*
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *  - Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  - Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *  - Neither the name of Intel Corporation nor the names of its
+ *    contributors may be used to endorse or promote products derived
+ *    from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#if !defined(__RVT_TRACE_CQ_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __RVT_TRACE_CQ_H
+
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#include <rdma/ib_verbs.h>
+#include <rdma/rdmavt_cq.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rvt_cq
+
+#define wc_opcode_name(opcode) { IB_WC_##opcode, #opcode  }
+#define show_wc_opcode(opcode)                                \
+__print_symbolic(opcode,                                      \
+	wc_opcode_name(SEND),                                 \
+	wc_opcode_name(RDMA_WRITE),                           \
+	wc_opcode_name(RDMA_READ),                            \
+	wc_opcode_name(COMP_SWAP),                            \
+	wc_opcode_name(FETCH_ADD),                            \
+	wc_opcode_name(LSO),                                  \
+	wc_opcode_name(LOCAL_INV),                            \
+	wc_opcode_name(REG_MR),                               \
+	wc_opcode_name(MASKED_COMP_SWAP),                     \
+	wc_opcode_name(RECV),                                 \
+	wc_opcode_name(RECV_RDMA_WITH_IMM))
+
+#define CQ_PRN \
+"[%s] idx %u wr_id %llx status %u opcode %u,%s length %u qpn %x"
+
+DECLARE_EVENT_CLASS(
+	rvt_cq_entry_template,
+	TP_PROTO(struct rvt_cq *cq, struct ib_wc *wc, u32 idx),
+	TP_ARGS(cq, wc, idx),
+	TP_STRUCT__entry(
+		RDI_DEV_ENTRY(cq->rdi)
+		__field(u64, wr_id)
+		__field(u32, status)
+		__field(u32, opcode)
+		__field(u32, qpn)
+		__field(u32, length)
+		__field(u32, idx)
+	),
+	TP_fast_assign(
+		RDI_DEV_ASSIGN(cq->rdi)
+		__entry->wr_id = wc->wr_id;
+		__entry->status = wc->status;
+		__entry->opcode = wc->opcode;
+		__entry->length = wc->byte_len;
+		__entry->qpn = wc->qp->qp_num;
+		__entry->idx = idx;
+	),
+	TP_printk(
+		CQ_PRN,
+		__get_str(dev),
+		__entry->idx,
+		__entry->wr_id,
+		__entry->status,
+		__entry->opcode, show_wc_opcode(__entry->opcode),
+		__entry->length,
+		__entry->qpn
+	)
+);
+
+DEFINE_EVENT(
+	rvt_cq_entry_template, rvt_cq_enter,
+	TP_PROTO(struct rvt_cq *cq, struct ib_wc *wc, u32 idx),
+	TP_ARGS(cq, wc, idx));
+
+DEFINE_EVENT(
+	rvt_cq_entry_template, rvt_cq_poll,
+	TP_PROTO(struct rvt_cq *cq, struct ib_wc *wc, u32 idx),
+	TP_ARGS(cq, wc, idx));
+
+#endif /* __RVT_TRACE_CQ_H */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_cq
+#include <trace/define_trace.h>

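The show_wc_opcode() helper above maps a numeric opcode to a name through a table of value/name pairs. A userspace sketch of that lookup follows. It is only a model: the demo_* names and enum values are hypothetical, and the hex fallback for unlisted values is an assumption of this sketch rather than a statement about the tracing core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical opcode values, not the kernel's enum ib_wc_opcode. */
enum demo_wc_opcode {
	DEMO_WC_SEND,
	DEMO_WC_RDMA_WRITE,
	DEMO_WC_RDMA_READ,
	DEMO_WC_RECV = 1 << 7
};

struct demo_sym {
	unsigned long value;
	const char *name;
};

/* Same token-pasting idea as wc_opcode_name() in the patch. */
#define demo_wc_opcode_name(opcode) { DEMO_WC_##opcode, #opcode }

static const struct demo_sym demo_wc_syms[] = {
	demo_wc_opcode_name(SEND),
	demo_wc_opcode_name(RDMA_WRITE),
	demo_wc_opcode_name(RDMA_READ),
	demo_wc_opcode_name(RECV),
};

/* Return the symbolic name, or render the raw value into buf. */
static const char *demo_show_wc_opcode(unsigned long opcode,
				       char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	for (i = 0; i < sizeof(demo_wc_syms) / sizeof(demo_wc_syms[0]); i++)
		if (demo_wc_syms[i].value == opcode)
			return demo_wc_syms[i].name;
	snprintf(buf, len, "0x%lx", opcode);
	return buf;
}
```

An opcode that is missing from the table still prints, only without a readable name, so a table like this is worth keeping in step with the enum it describes.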

* [PATCH 08/20] IB/rdmavt: Add swqe completion trace
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-03-01 18:22   ` [PATCH 07/20] IB/rdmavt: Add tracing for cq entry and poll Dennis Dalessandro
@ 2017-03-01 18:22   ` Dennis Dalessandro
  2017-03-01 18:22   ` [PATCH 09/20] IB/hfi1: Check device id early during init Dennis Dalessandro
                     ` (13 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:22 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add a trace event, hfi1_qp_send_completion, for send wqe completion.

The following fields are available for filtering and tracing:
- wqe
- wr_id
- qpn
- qpt
- length
- idx
- ssn
- (wr)opcode
- (wr)send_flags

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/rc.c       |    2 ++
 drivers/infiniband/hw/hfi1/ruc.c      |    1 +
 drivers/infiniband/hw/hfi1/trace_tx.h |   43 +++++++++++++++++++++++++++++++++
 3 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index 4649530..0e56578 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -1028,6 +1028,7 @@ void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr)
 		    cmp_psn(qp->s_sending_psn, qp->s_sending_hpsn) <= 0)
 			break;
 		s_last = qp->s_last;
+		trace_hfi1_qp_send_completion(qp, wqe, s_last);
 		if (++s_last >= qp->s_size)
 			s_last = 0;
 		qp->s_last = s_last;
@@ -1079,6 +1080,7 @@ static inline void update_last_psn(struct rvt_qp *qp, u32 psn)
 
 		rvt_put_swqe(wqe);
 		s_last = qp->s_last;
+		trace_hfi1_qp_send_completion(qp, wqe, s_last);
 		if (++s_last >= qp->s_size)
 			s_last = 0;
 		qp->s_last = s_last;
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index d2eb793..a1c3c97 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -911,6 +911,7 @@ void hfi1_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
 	old_last = last;
 	if (++last >= qp->s_size)
 		last = 0;
+	trace_hfi1_qp_send_completion(qp, wqe, last);
 	qp->s_last = last;
 	/* See post_send() */
 	barrier();
diff --git a/drivers/infiniband/hw/hfi1/trace_tx.h b/drivers/infiniband/hw/hfi1/trace_tx.h
index 415d6be..2c9ac57 100644
--- a/drivers/infiniband/hw/hfi1/trace_tx.h
+++ b/drivers/infiniband/hw/hfi1/trace_tx.h
@@ -633,6 +633,49 @@
 	     TP_PROTO(struct hfi1_devdata *dd, struct buffer_control *bc),
 	     TP_ARGS(dd, bc));
 
+TRACE_EVENT(
+	hfi1_qp_send_completion,
+	TP_PROTO(struct rvt_qp *qp, struct rvt_swqe *wqe, u32 idx),
+	TP_ARGS(qp, wqe, idx),
+	TP_STRUCT__entry(
+		DD_DEV_ENTRY(dd_from_ibdev(qp->ibqp.device))
+		__field(struct rvt_swqe *, wqe)
+		__field(u64, wr_id)
+		__field(u32, qpn)
+		__field(u32, qpt)
+		__field(u32, length)
+		__field(u32, idx)
+		__field(u32, ssn)
+		__field(enum ib_wr_opcode, opcode)
+		__field(int, send_flags)
+	),
+	TP_fast_assign(
+		DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
+		__entry->wqe = wqe;
+		__entry->wr_id = wqe->wr.wr_id;
+		__entry->qpn = qp->ibqp.qp_num;
+		__entry->qpt = qp->ibqp.qp_type;
+		__entry->length = wqe->length;
+		__entry->idx = idx;
+		__entry->ssn = wqe->ssn;
+		__entry->opcode = wqe->wr.opcode;
+		__entry->send_flags = wqe->wr.send_flags;
+	),
+	TP_printk(
+		"[%s] qpn 0x%x qpt %u wqe %p idx %u wr_id %llx length %u ssn %u opcode %x send_flags %x",
+		__get_str(dev),
+		__entry->qpn,
+		__entry->qpt,
+		__entry->wqe,
+		__entry->idx,
+		__entry->wr_id,
+		__entry->length,
+		__entry->ssn,
+		__entry->opcode,
+		__entry->send_flags
+	)
+);
+
 #endif /* __HFI1_TRACE_TX_H */
 
 #undef TRACE_INCLUDE_PATH


* [PATCH 09/20] IB/hfi1: Check device id early during init
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-03-01 18:22   ` [PATCH 08/20] IB/rdmavt: Add swqe completion trace Dennis Dalessandro
@ 2017-03-01 18:22   ` Dennis Dalessandro
  2017-03-01 18:22   ` [PATCH 10/20] IB/hfi1: Protect the global dev_cntr_names and port_cntr_names Dennis Dalessandro
                     ` (12 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:22 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny, Tadeusz Struk

From: Tadeusz Struk <tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If the wrong device is passed to the driver, init should fail early
rather than partially initializing the device only to discover later
that the device id is invalid.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Tadeusz Struk <tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/init.c |   19 ++++++++++---------
 1 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index f40864e..9bfb8eb 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1425,6 +1425,16 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* First, lock the non-writable module parameters */
 	HFI1_CAP_LOCK();
 
+	/* Validate dev ids */
+	if (!(ent->device == PCI_DEVICE_ID_INTEL0 ||
+	      ent->device == PCI_DEVICE_ID_INTEL1)) {
+		hfi1_early_err(&pdev->dev,
+			       "Failing on unknown Intel deviceid 0x%x\n",
+			       ent->device);
+		ret = -ENODEV;
+		goto bail;
+	}
+
 	/* Validate some global module parameters */
 	ret = init_validate_rcvhdrcnt(&pdev->dev, rcvhdrcnt);
 	if (ret)
@@ -1470,15 +1480,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (ret)
 		goto bail;
 
-	if (!(ent->device == PCI_DEVICE_ID_INTEL0 ||
-	      ent->device == PCI_DEVICE_ID_INTEL1)) {
-		hfi1_early_err(&pdev->dev,
-			       "Failing on unknown Intel deviceid 0x%x\n",
-			       ent->device);
-		ret = -ENODEV;
-		goto clean_bail;
-	}
-
 	/*
 	 * Do device-specific initialization, function table setup, dd
 	 * allocation, etc.

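The reordering above can be sketched as a validate-first probe routine. This is a minimal userspace model, assuming hypothetical device ids and a counter in place of the real setup steps. Because the id check runs before anything is acquired, the failure path can return directly without cleanup, which matches the patch changing the jump target from clean_bail to bail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ids, not the values of PCI_DEVICE_ID_INTEL0/1. */
#define DEMO_DEVICE_ID_A 0x1001
#define DEMO_DEVICE_ID_B 0x1002

struct demo_probe_state {
	int resources_acquired; /* stand-in for the later setup steps */
};

static bool demo_id_supported(unsigned int device)
{
	return device == DEMO_DEVICE_ID_A || device == DEMO_DEVICE_ID_B;
}

static int demo_init_one(struct demo_probe_state *st, unsigned int device)
{
	assert(st != NULL);

	/* Validate the id first: nothing is held yet, so just bail. */
	if (!demo_id_supported(device))
		return -ENODEV;

	st->resources_acquired++;
	return 0;
}
```

Probing an unknown id in this model returns -ENODEV and leaves resources_acquired at zero.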

* [PATCH 10/20] IB/hfi1: Protect the global dev_cntr_names and port_cntr_names
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (8 preceding siblings ...)
  2017-03-01 18:22   ` [PATCH 09/20] IB/hfi1: Check device id early during init Dennis Dalessandro
@ 2017-03-01 18:22   ` Dennis Dalessandro
  2017-03-01 18:22   ` [PATCH 11/20] IB/hfi1: Check for QSFP presence before attempting reads Dennis Dalessandro
                     ` (11 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:22 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Easwar Hariharan, Tadeusz Struk

From: Tadeusz Struk <tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Protect the global dev_cntr_names and port_cntr_names with the global
mutex as they are allocated and freed in a function called per device.
Otherwise there is a danger of double free and memory leaks.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Easwar Hariharan <easwar.hariharan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Tadeusz Struk <tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/verbs.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 815cb44..8d71654 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1540,6 +1540,7 @@ static void hfi1_get_dev_fw_str(struct ib_device *ibdev, char *str,
 	"DRIVER_EgrHdrFull"
 };
 
+static DEFINE_MUTEX(cntr_names_lock); /* protects the *_cntr_names buffers */
 static const char **dev_cntr_names;
 static const char **port_cntr_names;
 static int num_driver_cntrs = ARRAY_SIZE(driver_cntr_names);
@@ -1594,6 +1595,7 @@ static int init_cntr_names(const char *names_in,
 {
 	int i, err;
 
+	mutex_lock(&cntr_names_lock);
 	if (!cntr_names_initialized) {
 		struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 
@@ -1602,8 +1604,10 @@ static int init_cntr_names(const char *names_in,
 				      num_driver_cntrs,
 				      &num_dev_cntrs,
 				      &dev_cntr_names);
-		if (err)
+		if (err) {
+			mutex_unlock(&cntr_names_lock);
 			return NULL;
+		}
 
 		for (i = 0; i < num_driver_cntrs; i++)
 			dev_cntr_names[num_dev_cntrs + i] =
@@ -1617,10 +1621,12 @@ static int init_cntr_names(const char *names_in,
 		if (err) {
 			kfree(dev_cntr_names);
 			dev_cntr_names = NULL;
+			mutex_unlock(&cntr_names_lock);
 			return NULL;
 		}
 		cntr_names_initialized = 1;
 	}
+	mutex_unlock(&cntr_names_lock);
 
 	if (!port_num)
 		return rdma_alloc_hw_stats_struct(
@@ -1839,9 +1845,13 @@ void hfi1_unregister_ib_device(struct hfi1_devdata *dd)
 	del_timer_sync(&dev->mem_timer);
 	verbs_txreq_exit(dev);
 
+	mutex_lock(&cntr_names_lock);
 	kfree(dev_cntr_names);
 	kfree(port_cntr_names);
+	dev_cntr_names = NULL;
+	port_cntr_names = NULL;
 	cntr_names_initialized = 0;
+	mutex_unlock(&cntr_names_lock);
 }
 
 void hfi1_cnp_rcv(struct hfi1_packet *packet)

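The locking pattern above can be modeled in userspace with a pthread mutex. This is a sketch under assumptions, not the driver code: the demo_* names are hypothetical and the name table is reduced to a zeroed array. It shows the two properties the patch is after: initialization happens once even if called per device, and the teardown clears the pointer so a repeated call cannot double free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t demo_names_lock = PTHREAD_MUTEX_INITIALIZER;
static const char **demo_names;   /* shared by all devices */
static int demo_names_initialized;

/* Called once per device; only the first caller allocates. */
static int demo_init_names(size_t count)
{
	int ret = 0;

	assert(count > 0);
	pthread_mutex_lock(&demo_names_lock);
	if (!demo_names_initialized) {
		demo_names = calloc(count, sizeof(*demo_names));
		if (!demo_names)
			ret = -1;
		else
			demo_names_initialized = 1;
	}
	pthread_mutex_unlock(&demo_names_lock);
	return ret;
}

/* Called once per device; clearing the pointer makes repeats harmless. */
static void demo_free_names(void)
{
	pthread_mutex_lock(&demo_names_lock);
	free(demo_names);
	demo_names = NULL;
	demo_names_initialized = 0;
	pthread_mutex_unlock(&demo_names_lock);
}
```

The sketch uses a single unlock at the end of demo_init_names(), whereas the patch unlocks on each early return. Both are valid; the single exit is just shorter to show here.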

* [PATCH 11/20] IB/hfi1: Check for QSFP presence before attempting reads
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (9 preceding siblings ...)
  2017-03-01 18:22   ` [PATCH 10/20] IB/hfi1: Protect the global dev_cntr_names and port_cntr_names Dennis Dalessandro
@ 2017-03-01 18:22   ` Dennis Dalessandro
  2017-03-01 18:23   ` [PATCH 12/20] IB/hfi1: Add a patch value to the firmware version string Dennis Dalessandro
                     ` (10 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:22 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Easwar Hariharan

From: Easwar Hariharan <easwar.hariharan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If the cable is not plugged in, attempting to read the status of a QSFP
cable creates noise in the logs and fails to set an appropriate
Offline/Disabled Reason. Check for cable presence before attempting the
read and its attendant retries.

Fixes: 673b975f1fba ("IB/hfi1: Add QSFP sanity pre-check")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Easwar Hariharan <easwar.hariharan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 8b8840a..f9d0d8c 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -9533,8 +9533,11 @@ static int test_qsfp_read(struct hfi1_pportdata *ppd)
 	int ret;
 	u8 status;
 
-	/* report success if not a QSFP */
-	if (ppd->port_type != PORT_TYPE_QSFP)
+	/*
+	 * Report success if the port is not a QSFP or if it is a QSFP but
+	 * the cable is not present
+	 */
+	if (ppd->port_type != PORT_TYPE_QSFP || !qsfp_mod_present(ppd))
 		return 0;
 
 	/* read byte 2, the status byte */

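The check above short-circuits before any read is attempted. A minimal userspace model, with hypothetical demo_* names and a counter standing in for the status read and its retries:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_port_type { DEMO_PORT_QSFP, DEMO_PORT_OTHER };

struct demo_port {
	enum demo_port_type type;
	bool module_present;  /* stand-in for qsfp_mod_present() */
	int reads_attempted;  /* counts simulated status reads */
	int read_result;      /* result the simulated read returns */
};

static int demo_test_qsfp_read(struct demo_port *p)
{
	assert(p != NULL);

	/* Not a QSFP, or no cable plugged in: nothing to read. */
	if (p->type != DEMO_PORT_QSFP || !p->module_present)
		return 0;

	p->reads_attempted++;
	return p->read_result;
}
```

With module_present false, the model returns 0 and reads_attempted stays at zero, so no read failures are logged for an empty cage.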

* [PATCH 12/20] IB/hfi1: Add a patch value to the firmware version string
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (10 preceding siblings ...)
  2017-03-01 18:22   ` [PATCH 11/20] IB/hfi1: Check for QSFP presence before attempting reads Dennis Dalessandro
@ 2017-03-01 18:23   ` Dennis Dalessandro
  2017-03-01 18:23   ` [PATCH 13/20] IB/rdmavt,IB/hfi1: Fix timer migration regressions Dennis Dalessandro
                     ` (9 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:23 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Easwar Hariharan

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The HFI firmware now includes a patch level in its version.
Update the necessary code to include the patch version in the
firmware version string.
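
The version comparisons touched by this patch depend on the components being packed so that plain integer comparison orders versions correctly. A minimal sketch, assuming one byte per component packed as major, minor, patch from high bits to low bits. That layout is an assumption of this example, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack three one-byte components so that integer order is version order. */
static uint32_t demo_ver(unsigned int major, unsigned int minor,
			 unsigned int patch)
{
	assert(major <= 0xff && minor <= 0xff && patch <= 0xff);
	return ((uint32_t)major << 16) | ((uint32_t)minor << 8) |
	       (uint32_t)patch;
}

static unsigned int demo_ver_major(uint32_t v)
{
	return (v >> 16) & 0xff;
}

static unsigned int demo_ver_minor(uint32_t v)
{
	return (v >> 8) & 0xff;
}

static unsigned int demo_ver_patch(uint32_t v)
{
	return v & 0xff;
}
```

Under this layout, demo_ver(0, 19, 5) compares below demo_ver(0, 20, 0), so a check written against a patch level of 0 keeps its meaning for every patch release of an older minor version.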

Reviewed-by: Easwar Hariharan <easwar.hariharan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c     |   23 +++++++++++++++--------
 drivers/infiniband/hw/hfi1/chip.h     |   18 +++++++++++-------
 drivers/infiniband/hw/hfi1/firmware.c |   14 ++++++++------
 drivers/infiniband/hw/hfi1/hfi.h      |    9 +++++----
 drivers/infiniband/hw/hfi1/verbs.c    |   14 ++++++++------
 5 files changed, 47 insertions(+), 31 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index f9d0d8c..77f4b41 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -7166,7 +7166,7 @@ static void get_link_widths(struct hfi1_devdata *dd, u16 *tx_width,
 	 * set the max_rate field in handle_verify_cap until v0.19.
 	 */
 	if ((dd->icode == ICODE_RTL_SILICON) &&
-	    (dd->dc8051_ver < dc8051_ver(0, 19))) {
+	    (dd->dc8051_ver < dc8051_ver(0, 19, 0))) {
 		/* max_rate: 0 = 12.5G, 1 = 25G */
 		switch (max_rate) {
 		case 0:
@@ -7351,7 +7351,7 @@ void handle_verify_cap(struct work_struct *work)
 	}
 
 	ppd->link_speed_active = 0;	/* invalid value */
-	if (dd->dc8051_ver < dc8051_ver(0, 20)) {
+	if (dd->dc8051_ver < dc8051_ver(0, 20, 0)) {
 		/* remote_tx_rate: 0 = 12.5G, 1 = 25G */
 		switch (remote_tx_rate) {
 		case 0:
@@ -8422,7 +8422,7 @@ static int write_lcb_via_8051(struct hfi1_devdata *dd, u32 addr, u64 data)
 	int ret;
 
 	if (dd->icode == ICODE_FUNCTIONAL_SIMULATOR ||
-	    (dd->dc8051_ver < dc8051_ver(0, 20))) {
+	    (dd->dc8051_ver < dc8051_ver(0, 20, 0))) {
 		if (acquire_lcb_access(dd, 0) == 0) {
 			write_csr(dd, addr, data);
 			release_lcb_access(dd, 0);
@@ -8728,13 +8728,20 @@ static void read_remote_device_id(struct hfi1_devdata *dd, u16 *device_id,
 			& REMOTE_DEVICE_REV_MASK;
 }
 
-void read_misc_status(struct hfi1_devdata *dd, u8 *ver_a, u8 *ver_b)
+void read_misc_status(struct hfi1_devdata *dd, u8 *ver_major, u8 *ver_minor,
+		      u8 *ver_patch)
 {
 	u32 frame;
 
 	read_8051_config(dd, MISC_STATUS, GENERAL_CONFIG, &frame);
-	*ver_a = (frame >> STS_FM_VERSION_A_SHIFT) & STS_FM_VERSION_A_MASK;
-	*ver_b = (frame >> STS_FM_VERSION_B_SHIFT) & STS_FM_VERSION_B_MASK;
+	*ver_major = (frame >> STS_FM_VERSION_MAJOR_SHIFT) &
+		STS_FM_VERSION_MAJOR_MASK;
+	*ver_minor = (frame >> STS_FM_VERSION_MINOR_SHIFT) &
+		STS_FM_VERSION_MINOR_MASK;
+
+	read_8051_config(dd, VERSION_PATCH, GENERAL_CONFIG, &frame);
+	*ver_patch = (frame >> STS_FM_VERSION_PATCH_SHIFT) &
+		STS_FM_VERSION_PATCH_MASK;
 }
 
 static void read_vc_remote_phy(struct hfi1_devdata *dd, u8 *power_management,
@@ -9130,7 +9137,7 @@ static int set_local_link_attributes(struct hfi1_pportdata *ppd)
 	if (ret)
 		goto set_local_link_attributes_fail;
 
-	if (dd->dc8051_ver < dc8051_ver(0, 20)) {
+	if (dd->dc8051_ver < dc8051_ver(0, 20, 0)) {
 		/* set the tx rate to the fastest enabled */
 		if (ppd->link_speed_enabled & OPA_LINK_SPEED_25G)
 			ppd->local_tx_rate = 1;
diff --git a/drivers/infiniband/hw/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h
index 043fd21..24df45f 100644
--- a/drivers/infiniband/hw/hfi1/chip.h
+++ b/drivers/infiniband/hw/hfi1/chip.h
@@ -1,7 +1,7 @@
 #ifndef _CHIP_H
 #define _CHIP_H
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -394,7 +394,8 @@
 #define LAST_REMOTE_STATE_COMPLETE   0x13
 #define LINK_QUALITY_INFO            0x14
 #define REMOTE_DEVICE_ID	     0x15
-#define LINK_DOWN_REASON	     0x16
+#define LINK_DOWN_REASON	     0x16 /* first byte of offset 0x16 */
+#define VERSION_PATCH		     0x16 /* last byte of offset 0x16 */
 
 /* 8051 lane specific register field IDs */
 #define TX_EQ_SETTINGS		0x00
@@ -524,10 +525,12 @@ enum {
 #define SUPPORTED_CRCS (CAP_CRC_14B | CAP_CRC_48B)
 
 /* misc status version fields */
-#define STS_FM_VERSION_A_SHIFT 16
-#define STS_FM_VERSION_A_MASK  0xff
-#define STS_FM_VERSION_B_SHIFT 24
-#define STS_FM_VERSION_B_MASK  0xff
+#define STS_FM_VERSION_MINOR_SHIFT 16
+#define STS_FM_VERSION_MINOR_MASK  0xff
+#define STS_FM_VERSION_MAJOR_SHIFT 24
+#define STS_FM_VERSION_MAJOR_MASK  0xff
+#define STS_FM_VERSION_PATCH_SHIFT 24
+#define STS_FM_VERSION_PATCH_MASK  0xff
 
 /* LCB_CFG_CRC_MODE TX_VAL and RX_VAL CRC mode values */
 #define LCB_CRC_16B			0x0	/* 16b CRC */
@@ -698,7 +701,8 @@ bool check_chip_resource(struct hfi1_devdata *dd, u32 resource,
 int read_8051_data(struct hfi1_devdata *dd, u32 addr, u32 len, u64 *result);
 
 /* chip.c */
-void read_misc_status(struct hfi1_devdata *dd, u8 *ver_a, u8 *ver_b);
+void read_misc_status(struct hfi1_devdata *dd, u8 *ver_major, u8 *ver_minor,
+		      u8 *ver_patch);
 void read_guid(struct hfi1_devdata *dd);
 int wait_fm_ready(struct hfi1_devdata *dd, u32 mstimeout);
 void set_link_down_reason(struct hfi1_pportdata *ppd, u8 lcl_reason,
diff --git a/drivers/infiniband/hw/hfi1/firmware.c b/drivers/infiniband/hw/hfi1/firmware.c
index 0dd50cd..4042c11 100644
--- a/drivers/infiniband/hw/hfi1/firmware.c
+++ b/drivers/infiniband/hw/hfi1/firmware.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -1004,7 +1004,9 @@ static int load_8051_firmware(struct hfi1_devdata *dd,
 {
 	u64 reg;
 	int ret;
-	u8 ver_a, ver_b;
+	u8 ver_major;
+	u8 ver_minor;
+	u8 ver_patch;
 
 	/*
 	 * DC Reset sequence
@@ -1073,10 +1075,10 @@ static int load_8051_firmware(struct hfi1_devdata *dd,
 		return -ETIMEDOUT;
 	}
 
-	read_misc_status(dd, &ver_a, &ver_b);
-	dd_dev_info(dd, "8051 firmware version %d.%d\n",
-		    (int)ver_b, (int)ver_a);
-	dd->dc8051_ver = dc8051_ver(ver_b, ver_a);
+	read_misc_status(dd, &ver_major, &ver_minor, &ver_patch);
+	dd_dev_info(dd, "8051 firmware version %d.%d.%d\n",
+		    (int)ver_major, (int)ver_minor, (int)ver_patch);
+	dd->dc8051_ver = dc8051_ver(ver_major, ver_minor, ver_patch);
 
 	return 0;
 }
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index b69ab47..a31638c 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1020,7 +1020,7 @@ struct hfi1_devdata {
 	u8 qos_shift;
 
 	u16 irev;	/* implementation revision */
-	u16 dc8051_ver; /* 8051 firmware version */
+	u32 dc8051_ver; /* 8051 firmware version */
 
 	spinlock_t hfi1_diag_trans_lock; /* protect diag observer ops */
 	struct platform_config platform_config;
@@ -1173,9 +1173,10 @@ struct hfi1_devdata {
 };
 
 /* 8051 firmware version helper */
-#define dc8051_ver(a, b) ((a) << 8 | (b))
-#define dc8051_ver_maj(a) ((a & 0xff00) >> 8)
-#define dc8051_ver_min(a)  (a & 0x00ff)
+#define dc8051_ver(a, b, c) ((a) << 16 | (b) << 8 | (c))
+#define dc8051_ver_maj(a) (((a) & 0xff0000) >> 16)
+#define dc8051_ver_min(a) (((a) & 0x00ff00) >> 8)
+#define dc8051_ver_patch(a) ((a) & 0x0000ff)
 
 /* f_put_tid types */
 #define PT_EXPECTED 0
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 8d71654..928918c 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1236,12 +1236,14 @@ int hfi1_verbs_send(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 static void hfi1_fill_device_attr(struct hfi1_devdata *dd)
 {
 	struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
-	u16 ver = dd->dc8051_ver;
+	u32 ver = dd->dc8051_ver;
 
 	memset(&rdi->dparms.props, 0, sizeof(rdi->dparms.props));
 
-	rdi->dparms.props.fw_ver = ((u64)(dc8051_ver_maj(ver)) << 16) |
-				    (u64)dc8051_ver_min(ver);
+	rdi->dparms.props.fw_ver = ((u64)(dc8051_ver_maj(ver)) << 32) |
+		((u64)(dc8051_ver_min(ver)) << 16) |
+		(u64)dc8051_ver_patch(ver);
+
 	rdi->dparms.props.device_cap_flags = IB_DEVICE_BAD_PKEY_CNTR |
 			IB_DEVICE_BAD_QKEY_CNTR | IB_DEVICE_SHUTDOWN_PORT |
 			IB_DEVICE_SYS_IMAGE_GUID | IB_DEVICE_RC_RNR_NAK_GEN |
@@ -1520,10 +1522,10 @@ static void hfi1_get_dev_fw_str(struct ib_device *ibdev, char *str,
 {
 	struct rvt_dev_info *rdi = ib_to_rvt(ibdev);
 	struct hfi1_ibdev *dev = dev_from_rdi(rdi);
-	u16 ver = dd_from_dev(dev)->dc8051_ver;
+	u32 ver = dd_from_dev(dev)->dc8051_ver;
 
-	snprintf(str, str_len, "%u.%u", dc8051_ver_maj(ver),
-		 dc8051_ver_min(ver));
+	snprintf(str, str_len, "%u.%u.%u", dc8051_ver_maj(ver),
+		 dc8051_ver_min(ver), dc8051_ver_patch(ver));
 }
 
 static const char * const driver_cntr_names[] = {

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 13/20] IB/rdmavt,IB/hfi1: Fix timer migration regressions
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (11 preceding siblings ...)
  2017-03-01 18:23   ` [PATCH 12/20] IB/hfi1: Add a patch value to the firmware version string Dennis Dalessandro
@ 2017-03-01 18:23   ` Dennis Dalessandro
  2017-03-01 18:23   ` [PATCH 14/20] IB/rdmavt: Avoid reseting wqe send_flags in unreserve Dennis Dalessandro
                     ` (8 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:23 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn, Brian Welty,
	Sebastian Sanchez

From: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The RC timeout counter is not incremented when the retry timer expires.
Increment the counter in rvt_rc_timeout() and add a trace event for it.
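
The accounting added to the timeout handler can be modelled in userspace
roughly as follows. This is only a sketch under stated assumptions: the
tmo_* types, the TMO_S_TIMER bit value, and the helper name are
hypothetical stand-ins, not rdmavt definitions, and locking and the trace
call are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define TMO_S_TIMER 0x1u	/* hypothetical flag bit, not the RVT_S_TIMER value */

struct tmo_port {
	uint64_t n_rc_timeouts;	/* per-port RC timeout counter */
};

struct tmo_qp {
	uint32_t s_flags;
	uint32_t s_last_psn;
	uint8_t port_num;	/* 1-based, so the port index is port_num - 1 */
};

/*
 * Model of the timer callback body: if the timer flag is set, clear it,
 * count the timeout against the QP's port, and report the PSN from which
 * the restart (and the trace event) would begin.
 * Returns false and leaves everything untouched when the flag is clear.
 */
static bool tmo_rc_timeout(struct tmo_qp *qp, struct tmo_port *ports,
			   uint32_t *restart_psn)
{
	if (!(qp->s_flags & TMO_S_TIMER))
		return false;

	qp->s_flags &= ~TMO_S_TIMER;
	ports[qp->port_num - 1].n_rc_timeouts++;
	*restart_psn = qp->s_last_psn + 1;
	return true;
}
```

A second call with the flag already cleared counts nothing, which matches
the counter only moving while the timer is armed.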

Fixes: 87c23b4ab018 ("IB/rdmavt: Adding timer logic to rdmavt")
Reviewed-by: Brian Welty <brian.welty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/trace_rc.h   |    7 --
 drivers/infiniband/sw/rdmavt/qp.c       |    6 +-
 drivers/infiniband/sw/rdmavt/trace.h    |    3 +
 drivers/infiniband/sw/rdmavt/trace_rc.h |  109 +++++++++++++++++++++++++++++++
 4 files changed, 117 insertions(+), 8 deletions(-)
 create mode 100644 drivers/infiniband/sw/rdmavt/trace_rc.h

diff --git a/drivers/infiniband/hw/hfi1/trace_rc.h b/drivers/infiniband/hw/hfi1/trace_rc.h
index 5ea5005..8ce4765 100644
--- a/drivers/infiniband/hw/hfi1/trace_rc.h
+++ b/drivers/infiniband/hw/hfi1/trace_rc.h
@@ -1,5 +1,5 @@
 /*
-* Copyright(c) 2015, 2016 Intel Corporation.
+* Copyright(c) 2015, 2016, 2017 Intel Corporation.
 *
 * This file is provided under a dual BSD/GPLv2 license.  When using or
 * redistributing this file, you may do so under either license.
@@ -104,11 +104,6 @@
 	     TP_ARGS(qp, psn)
 );
 
-DEFINE_EVENT(hfi1_rc_template, hfi1_timeout,
-	     TP_PROTO(struct rvt_qp *qp, u32 psn),
-	     TP_ARGS(qp, psn)
-);
-
 DEFINE_EVENT(hfi1_rc_template, hfi1_rcv_error,
 	     TP_PROTO(struct rvt_qp *qp, u32 psn),
 	     TP_ARGS(qp, psn)
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 3c55a8b..d7dabdf 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016, 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -2052,8 +2052,12 @@ static void rvt_rc_timeout(unsigned long arg)
 	spin_lock_irqsave(&qp->r_lock, flags);
 	spin_lock(&qp->s_lock);
 	if (qp->s_flags & RVT_S_TIMER) {
+		struct rvt_ibport *rvp = rdi->ports[qp->port_num - 1];
+
 		qp->s_flags &= ~RVT_S_TIMER;
+		rvp->n_rc_timeouts++;
 		del_timer(&qp->s_timer);
+		trace_rvt_rc_timeout(qp, qp->s_last_psn + 1);
 		if (rdi->driver_f.notify_restart_rc)
 			rdi->driver_f.notify_restart_rc(qp,
 							qp->s_last_psn + 1,
diff --git a/drivers/infiniband/sw/rdmavt/trace.h b/drivers/infiniband/sw/rdmavt/trace.h
index 89554c0..bb4b1e7 100644
--- a/drivers/infiniband/sw/rdmavt/trace.h
+++ b/drivers/infiniband/sw/rdmavt/trace.h
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016, 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -53,3 +53,4 @@
 #include "trace_tx.h"
 #include "trace_mr.h"
 #include "trace_cq.h"
+#include "trace_rc.h"
diff --git a/drivers/infiniband/sw/rdmavt/trace_rc.h b/drivers/infiniband/sw/rdmavt/trace_rc.h
new file mode 100644
index 0000000..9952769
--- /dev/null
+++ b/drivers/infiniband/sw/rdmavt/trace_rc.h
@@ -0,0 +1,109 @@
+/*
+ * Copyright(c) 2017 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *  - Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  - Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *  - Neither the name of Intel Corporation nor the names of its
+ *    contributors may be used to endorse or promote products derived
+ *    from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#if !defined(__RVT_TRACE_RC_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __RVT_TRACE_RC_H
+
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#include <rdma/ib_verbs.h>
+#include <rdma/rdma_vt.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rvt_rc
+
+DECLARE_EVENT_CLASS(rvt_rc_template,
+		    TP_PROTO(struct rvt_qp *qp, u32 psn),
+		    TP_ARGS(qp, psn),
+		    TP_STRUCT__entry(
+			RDI_DEV_ENTRY(ib_to_rvt(qp->ibqp.device))
+			__field(u32, qpn)
+			__field(u32, s_flags)
+			__field(u32, psn)
+			__field(u32, s_psn)
+			__field(u32, s_next_psn)
+			__field(u32, s_sending_psn)
+			__field(u32, s_sending_hpsn)
+			__field(u32, r_psn)
+			),
+		    TP_fast_assign(
+			RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
+			__entry->qpn = qp->ibqp.qp_num;
+			__entry->s_flags = qp->s_flags;
+			__entry->psn = psn;
+			__entry->s_psn = qp->s_psn;
+			__entry->s_next_psn = qp->s_next_psn;
+			__entry->s_sending_psn = qp->s_sending_psn;
+			__entry->s_sending_hpsn = qp->s_sending_hpsn;
+			__entry->r_psn = qp->r_psn;
+			),
+		    TP_printk(
+			"[%s] qpn 0x%x s_flags 0x%x psn 0x%x s_psn 0x%x s_next_psn 0x%x s_sending_psn 0x%x sending_hpsn 0x%x r_psn 0x%x",
+			__get_str(dev),
+			__entry->qpn,
+			__entry->s_flags,
+			__entry->psn,
+			__entry->s_psn,
+			__entry->s_next_psn,
+			__entry->s_sending_psn,
+			__entry->s_sending_hpsn,
+			__entry->r_psn
+			)
+);
+
+DEFINE_EVENT(rvt_rc_template, rvt_rc_timeout,
+	     TP_PROTO(struct rvt_qp *qp, u32 psn),
+	     TP_ARGS(qp, psn)
+);
+
+#endif /* __RVT_TRACE_RC_H */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_rc
+#include <trace/define_trace.h>


* [PATCH 14/20] IB/rdmavt: Avoid reseting wqe send_flags in unreserve
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (12 preceding siblings ...)
  2017-03-01 18:23   ` [PATCH 13/20] IB/rdmavt,IB/hfi1: Fix timer migration regressions Dennis Dalessandro
@ 2017-03-01 18:23   ` Dennis Dalessandro
  2017-03-01 18:23   ` [PATCH 15/20] IB/hfi1: Ensure VL index is within bounds Dennis Dalessandro
                     ` (7 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:23 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The wqe should be read-only in rvt_qp_wqe_unreserve(). The superfluous
reset of the RVT_SEND_RESERVE_USED flag there causes an issue where
reserved operations elicit a bad completion to the ULP.

The maintenance of the flag is now entirely within rvt_post_one_wr(),
where a reserved operation will set the flag and a non-reserved operation
will ensure that the operation about to be posted has the flag reset.
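
A minimal userspace sketch of the flag handling described above. The
model_* types and helpers and the flag bit are hypothetical stand-ins
rather than the rdmavt definitions, and the atomic counter is replaced by
a plain int. The point is that the post path is the only writer of the
flag, while the unreserve path only reads it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit; the real RVT_SEND_RESERVE_USED value is not assumed. */
#define MODEL_SEND_RESERVE_USED (1u << 31)

struct model_wqe {
	uint32_t send_flags;
};

struct model_qp {
	int s_reserved_used;	/* stands in for the atomic counter */
	int s_avail;
};

/* Post path: the only place that sets or clears the flag. */
static void model_post_one_wr(struct model_qp *qp, struct model_wqe *wqe,
			      bool reserved_op)
{
	if (reserved_op) {
		wqe->send_flags |= MODEL_SEND_RESERVE_USED;
		qp->s_reserved_used++;
	} else {
		wqe->send_flags &= ~MODEL_SEND_RESERVE_USED;
		qp->s_avail--;
	}
}

/* Completion path: reads the flag, never writes the wqe. */
static void model_wqe_unreserve(struct model_qp *qp,
				const struct model_wqe *wqe)
{
	if (wqe->send_flags & MODEL_SEND_RESERVE_USED)
		qp->s_reserved_used--;
}
```

Because the unreserve helper takes a const wqe, any later code that
inspects send_flags still sees the value chosen at post time.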

Fixes: 856cc4c237ad ("IB/hfi1: Add the capability for reserved operations")
Reviewed-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/qp.c |    7 +++++--
 include/rdma/rdmavt_qp.h          |    4 +---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index d7dabdf..728f5f1 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -1772,10 +1772,13 @@ static int rvt_post_one_wr(struct rvt_qp *qp,
 					0);
 		qp->s_next_psn = wqe->lpsn + 1;
 	}
-	if (unlikely(reserved_op))
+	if (unlikely(reserved_op)) {
+		wqe->wr.send_flags |= RVT_SEND_RESERVE_USED;
 		rvt_qp_wqe_reserve(qp, wqe);
-	else
+	} else {
+		wqe->wr.send_flags &= ~RVT_SEND_RESERVE_USED;
 		qp->s_avail--;
+	}
 	trace_rvt_post_one_wr(qp, wqe);
 	smp_wmb(); /* see request builders */
 	qp->s_head = next;
diff --git a/include/rdma/rdmavt_qp.h b/include/rdma/rdmavt_qp.h
index 3cdd9e2..e3bb312 100644
--- a/include/rdma/rdmavt_qp.h
+++ b/include/rdma/rdmavt_qp.h
@@ -2,7 +2,7 @@
 #define DEF_RDMAVT_INCQP_H
 
 /*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016, 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -526,7 +526,6 @@ static inline void rvt_qp_wqe_reserve(
 	struct rvt_qp *qp,
 	struct rvt_swqe *wqe)
 {
-	wqe->wr.send_flags |= RVT_SEND_RESERVE_USED;
 	atomic_inc(&qp->s_reserved_used);
 }
 
@@ -550,7 +549,6 @@ static inline void rvt_qp_wqe_unreserve(
 	struct rvt_swqe *wqe)
 {
 	if (unlikely(wqe->wr.send_flags & RVT_SEND_RESERVE_USED)) {
-		wqe->wr.send_flags &= ~RVT_SEND_RESERVE_USED;
 		atomic_dec(&qp->s_reserved_used);
 		/* insure no compiler re-order up to s_last change */
 		smp_mb__after_atomic();


* [PATCH 15/20] IB/hfi1: Ensure VL index is within bounds
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (13 preceding siblings ...)
  2017-03-01 18:23   ` [PATCH 14/20] IB/rdmavt: Avoid reseting wqe send_flags in unreserve Dennis Dalessandro
@ 2017-03-01 18:23   ` Dennis Dalessandro
  2017-03-01 18:23   ` [PATCH 16/20] IB/hfi1: Add receive fault injection feature Dennis Dalessandro
                     ` (6 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:23 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Improve the safety of the code and ensure the array cannot be indexed
out of bounds when picking the CPU for a given SDMA engine.
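
The check in the diff applies ARRAY_SIZE() to rht_node->map before
rht_node has been looked up. That is safe because sizeof does not
evaluate its operand for a fixed-size array. A small userspace sketch of
the same idea, where the model_* names and the table size of 8 are
hypothetical and chosen only for illustration:

```c
#include <stddef.h>

#define MODEL_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical node with a fixed per-VL table; the size 8 is illustrative. */
struct model_node {
	void *map[8];
};

/*
 * Return 0 when vl is a valid index into node->map, -1 otherwise.
 * node is only named inside sizeof, so it is never dereferenced and
 * may be NULL at this point.
 */
static int model_check_vl(const struct model_node *node, unsigned int vl)
{
	if (vl >= MODEL_ARRAY_SIZE(node->map))
		return -1;
	return 0;
}
```

Rejecting the index before the lookup means no later code path can use
it to index the table.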

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/sdma.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index 9bee28d..1f7bf30 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -962,6 +962,11 @@ ssize_t sdma_set_cpu_to_sde_map(struct sdma_engine *sde, const char *buf,
 			continue;
 		}
 
+		if (vl >= ARRAY_SIZE(rht_node->map)) {
+			ret = -EINVAL;
+			goto out;
+		}
+
 		rht_node = rhashtable_lookup_fast(dd->sdma_rht, &cpu,
 						  sdma_rht_params);
 		if (!rht_node) {


* [PATCH 16/20] IB/hfi1: Add receive fault injection feature
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (14 preceding siblings ...)
  2017-03-01 18:23   ` [PATCH 15/20] IB/hfi1: Ensure VL index is within bounds Dennis Dalessandro
@ 2017-03-01 18:23   ` Dennis Dalessandro
       [not found]     ` <20170301182344.29989.12032.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-03-01 18:23   ` [PATCH 17/20] IB/hfi1: Add transmit " Dennis Dalessandro
                     ` (5 subsequent siblings)
  21 siblings, 1 reply; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:23 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Mike Marciniszyn

From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add fault injection capability:
  - Drop packets unconditionally (fault_by_packet)
  - Drop packets based on opcode (fault_by_opcode)

This feature is built only when the global CONFIG_FAULT_INJECTION
config option is enabled.

The following fault traces have been added:
  - misc/fault_opcode
  - misc/fault_packet

See 'Documentation/fault-injection/fault-injection.txt'
for details.

Examples:
  - Dropping packets by opcode:
    /sys/kernel/debug/hfi1/hfi1_X/fault_opcode
	# Enable fault
	echo Y > fault_by_opcode
	# Set probability of dropping (0-100%)
	echo 25 > probability
	# Set opcode
	echo 0x64 > opcode
	# Number of times to fault
	echo 3 > times
	# An optional mask allows you to fault
	# a range of opcodes
	echo 0xf0 > mask
    /sys/kernel/debug/hfi1/hfi1_X/fault_opcode/fault_stats
    contains a value in parentheses to indicate
    the number of each opcode dropped.

  - Dropping packets unconditionally
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet
	# Enable fault
	echo Y > fault_by_packet
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
    contains the number of packets dropped.
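
The opcode/mask selection can be sketched in isolation as below. The
comparison is copied from hfi1_dbg_fault_opcode() in this patch; the
helper name model_opcode_selected() is hypothetical. Whether a selected
packet is actually dropped still depends on should_fail() and the
probability/times settings, which are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A packet is a fault candidate when
 *     (packet opcode & mask) == configured opcode
 * which is the test hfi1_dbg_fault_opcode() applies before should_fail().
 */
static bool model_opcode_selected(uint8_t configured, uint8_t mask,
				  uint8_t pkt_opcode)
{
	return configured == (uint8_t)(pkt_opcode & mask);
}
```

With the default mask of 0xff this is an exact match. With mask 0xf0, a
configured opcode of 0x60 selects 0x60 through 0x6f, whereas a configured
value with bits set outside the mask (for example 0x64 with mask 0xf0)
selects nothing under this comparison.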

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/debugfs.c    |  222 +++++++++++++++++++++++++++++++
 drivers/infiniband/hw/hfi1/debugfs.h    |   35 +++++
 drivers/infiniband/hw/hfi1/driver.c     |    8 +
 drivers/infiniband/hw/hfi1/trace_misc.h |   48 +++++++
 drivers/infiniband/hw/hfi1/verbs.c      |    6 +
 drivers/infiniband/hw/hfi1/verbs.h      |    4 +
 6 files changed, 323 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 7fe9dd8..763cdb0 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -51,8 +51,12 @@
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/string.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/fault-inject.h>
 
 #include "hfi.h"
+#include "trace.h"
 #include "debugfs.h"
 #include "device.h"
 #include "qp.h"
@@ -1063,6 +1067,217 @@ static int _sdma_cpu_list_seq_show(struct seq_file *s, void *v)
 DEBUGFS_SEQ_FILE_OPEN(sdma_cpu_list)
 DEBUGFS_FILE_OPS(sdma_cpu_list);
 
+#ifdef CONFIG_FAULT_INJECTION
+static void *_fault_stats_seq_start(struct seq_file *s, loff_t *pos)
+{
+	struct hfi1_opcode_stats_perctx *opstats;
+
+	if (*pos >= ARRAY_SIZE(opstats->stats))
+		return NULL;
+	return pos;
+}
+
+static void *_fault_stats_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	struct hfi1_opcode_stats_perctx *opstats;
+
+	++*pos;
+	if (*pos >= ARRAY_SIZE(opstats->stats))
+		return NULL;
+	return pos;
+}
+
+static void _fault_stats_seq_stop(struct seq_file *s, void *v)
+{
+}
+
+static int _fault_stats_seq_show(struct seq_file *s, void *v)
+{
+	loff_t *spos = v;
+	loff_t i = *spos, j;
+	u64 n_packets = 0, n_bytes = 0;
+	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
+	struct hfi1_devdata *dd = dd_from_dev(ibd);
+
+	for (j = 0; j < dd->first_user_ctxt; j++) {
+		if (!dd->rcd[j])
+			continue;
+		n_packets += dd->rcd[j]->opstats->stats[i].n_packets;
+		n_bytes += dd->rcd[j]->opstats->stats[i].n_bytes;
+	}
+	if (!n_packets && !n_bytes)
+		return SEQ_SKIP;
+	if (!ibd->fault_opcode->n_rxfaults[i] &&
+	    !ibd->fault_opcode->n_txfaults[i])
+		return SEQ_SKIP;
+	seq_printf(s, "%02llx %llu/%llu (faults rx:%llu faults tx:%llu)\n", i,
+		   (unsigned long long)n_packets,
+		   (unsigned long long)n_bytes,
+		   (unsigned long long)ibd->fault_opcode->n_rxfaults[i],
+		   (unsigned long long)ibd->fault_opcode->n_txfaults[i]);
+	return 0;
+}
+
+DEBUGFS_SEQ_FILE_OPS(fault_stats);
+DEBUGFS_SEQ_FILE_OPEN(fault_stats);
+DEBUGFS_FILE_OPS(fault_stats);
+
+static void fault_exit_opcode_debugfs(struct hfi1_ibdev *ibd)
+{
+	debugfs_remove_recursive(ibd->fault_opcode->dir);
+	kfree(ibd->fault_opcode);
+	ibd->fault_opcode = NULL;
+}
+
+static int fault_init_opcode_debugfs(struct hfi1_ibdev *ibd)
+{
+	struct dentry *parent = ibd->hfi1_ibdev_dbg;
+
+	ibd->fault_opcode = kzalloc(sizeof(*ibd->fault_opcode), GFP_KERNEL);
+	if (!ibd->fault_opcode)
+		return -ENOMEM;
+
+	ibd->fault_opcode->attr.interval = 1;
+	ibd->fault_opcode->attr.require_end = ULONG_MAX;
+	ibd->fault_opcode->attr.stacktrace_depth = 32;
+	ibd->fault_opcode->attr.dname = NULL;
+	ibd->fault_opcode->attr.verbose = 0;
+	ibd->fault_opcode->fault_by_opcode = false;
+	ibd->fault_opcode->opcode = 0;
+	ibd->fault_opcode->mask = 0xff;
+
+	ibd->fault_opcode->dir =
+		fault_create_debugfs_attr("fault_opcode",
+					  parent,
+					  &ibd->fault_opcode->attr);
+	if (IS_ERR(ibd->fault_opcode->dir)) {
+		kfree(ibd->fault_opcode);
+		return -ENOENT;
+	}
+
+	DEBUGFS_SEQ_FILE_CREATE(fault_stats, ibd->fault_opcode->dir, ibd);
+	if (!debugfs_create_bool("fault_by_opcode", 0600,
+				 ibd->fault_opcode->dir,
+				 &ibd->fault_opcode->fault_by_opcode))
+		goto fail;
+	if (!debugfs_create_x8("opcode", 0600, ibd->fault_opcode->dir,
+			       &ibd->fault_opcode->opcode))
+		goto fail;
+	if (!debugfs_create_x8("mask", 0600, ibd->fault_opcode->dir,
+			       &ibd->fault_opcode->mask))
+		goto fail;
+
+	return 0;
+fail:
+	fault_exit_opcode_debugfs(ibd);
+	return -ENOMEM;
+}
+
+static void fault_exit_packet_debugfs(struct hfi1_ibdev *ibd)
+{
+	debugfs_remove_recursive(ibd->fault_packet->dir);
+	kfree(ibd->fault_packet);
+	ibd->fault_packet = NULL;
+}
+
+static int fault_init_packet_debugfs(struct hfi1_ibdev *ibd)
+{
+	struct dentry *parent = ibd->hfi1_ibdev_dbg;
+
+	ibd->fault_packet = kzalloc(sizeof(*ibd->fault_packet), GFP_KERNEL);
+	if (!ibd->fault_packet)
+		return -ENOMEM;
+
+	ibd->fault_packet->attr.interval = 1;
+	ibd->fault_packet->attr.require_end = ULONG_MAX;
+	ibd->fault_packet->attr.stacktrace_depth = 32;
+	ibd->fault_packet->attr.dname = NULL;
+	ibd->fault_packet->attr.verbose = 0;
+	ibd->fault_packet->fault_by_packet = false;
+
+	ibd->fault_packet->dir =
+		fault_create_debugfs_attr("fault_packet",
+					  parent,
+					  &ibd->fault_packet->attr);
+	if (IS_ERR(ibd->fault_packet->dir)) {
+		kfree(ibd->fault_packet);
+		return -ENOENT;
+	}
+
+	if (!debugfs_create_bool("fault_by_packet", 0600,
+				 ibd->fault_packet->dir,
+				 &ibd->fault_packet->fault_by_packet))
+		goto fail;
+	if (!debugfs_create_u64("fault_stats", 0400,
+				ibd->fault_packet->dir,
+				&ibd->fault_packet->n_faults))
+		goto fail;
+
+	return 0;
+fail:
+	fault_exit_packet_debugfs(ibd);
+	return -ENOMEM;
+}
+
+static void fault_exit_debugfs(struct hfi1_ibdev *ibd)
+{
+	fault_exit_opcode_debugfs(ibd);
+	fault_exit_packet_debugfs(ibd);
+}
+
+static int fault_init_debugfs(struct hfi1_ibdev *ibd)
+{
+	int ret = 0;
+
+	ret = fault_init_opcode_debugfs(ibd);
+	if (ret)
+		return ret;
+
+	ret = fault_init_packet_debugfs(ibd);
+	if (ret)
+		fault_exit_opcode_debugfs(ibd);
+
+	return ret;
+}
+
+bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx)
+{
+	bool ret = false;
+	struct hfi1_ibdev *ibd = to_idev(qp->ibqp.device);
+
+	if (!ibd->fault_opcode || !ibd->fault_opcode->fault_by_opcode)
+		return false;
+	if (ibd->fault_opcode->opcode != (opcode & ibd->fault_opcode->mask))
+		return false;
+	ret = should_fail(&ibd->fault_opcode->attr, 1);
+	if (ret) {
+		trace_hfi1_fault_opcode(qp, opcode);
+		if (rx)
+			ibd->fault_opcode->n_rxfaults[opcode]++;
+		else
+			ibd->fault_opcode->n_txfaults[opcode]++;
+	}
+	return ret;
+}
+
+bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
+{
+	struct rvt_dev_info *rdi = &packet->rcd->ppd->dd->verbs_dev.rdi;
+	struct hfi1_ibdev *ibd = dev_from_rdi(rdi);
+	bool ret = false;
+
+	if (!ibd->fault_packet || !ibd->fault_packet->fault_by_packet)
+		return false;
+
+	ret = should_fail(&ibd->fault_packet->attr, 1);
+	if (ret) {
+		++ibd->fault_packet->n_faults;
+		trace_hfi1_fault_packet(packet);
+	}
+	return ret;
+}
+#endif
+
 void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 {
 	char name[sizeof("port0counters") + 1];
@@ -1112,12 +1327,19 @@ void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 					    !port_cntr_ops[i].ops.write ?
 					    S_IRUGO : S_IRUGO | S_IWUSR);
 		}
+
+#ifdef CONFIG_FAULT_INJECTION
+	fault_init_debugfs(ibd);
+#endif
 }
 
 void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd)
 {
 	if (!hfi1_dbg_root)
 		goto out;
+#ifdef CONFIG_FAULT_INJECTION
+	fault_exit_debugfs(ibd);
+#endif
 	debugfs_remove(ibd->hfi1_ibdev_link);
 	debugfs_remove_recursive(ibd->hfi1_ibdev_dbg);
 out:
diff --git a/drivers/infiniband/hw/hfi1/debugfs.h b/drivers/infiniband/hw/hfi1/debugfs.h
index b6fb681..53dfdae 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.h
+++ b/drivers/infiniband/hw/hfi1/debugfs.h
@@ -53,6 +53,41 @@
 void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd);
 void hfi1_dbg_init(void);
 void hfi1_dbg_exit(void);
+
+#ifdef CONFIG_FAULT_INJECTION
+#include <linux/fault-inject.h>
+struct fault_opcode {
+	struct fault_attr attr;
+	struct dentry *dir;
+	bool fault_by_opcode;
+	u64 n_rxfaults[256];
+	u64 n_txfaults[256];
+	u8 opcode;
+	u8 mask;
+};
+
+struct fault_packet {
+	struct fault_attr attr;
+	struct dentry *dir;
+	bool fault_by_packet;
+	u64 n_faults;
+};
+
+bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx);
+bool hfi1_dbg_fault_packet(struct hfi1_packet *packet);
+#else
+static inline bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
+{
+	return false;
+}
+
+static inline bool hfi1_dbg_fault_opcode(struct rvt_qp *qp,
+					 u32 opcode, bool rx)
+{
+	return false;
+}
+#endif
+
 #else
 static inline void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 {
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 3881c95..c0b012f 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -59,6 +59,7 @@
 #include "trace.h"
 #include "qp.h"
 #include "sdma.h"
+#include "debugfs.h"
 
 #undef pr_fmt
 #define pr_fmt(fmt) DRIVER_NAME ": " fmt
@@ -1354,6 +1355,9 @@ void handle_eflags(struct hfi1_packet *packet)
  */
 int process_receive_ib(struct hfi1_packet *packet)
 {
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
+
 	trace_hfi1_rcvhdr(packet->rcd->ppd->dd,
 			  packet->rcd->ctxt,
 			  rhf_err_flags(packet->rhf),
@@ -1409,6 +1413,8 @@ int process_receive_error(struct hfi1_packet *packet)
 
 int kdeth_process_expected(struct hfi1_packet *packet)
 {
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
 
@@ -1421,6 +1427,8 @@ int kdeth_process_eager(struct hfi1_packet *packet)
 {
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
 
 	dd_dev_err(packet->rcd->dd,
 		   "Unhandled eager packet received. Dropping.\n");
diff --git a/drivers/infiniband/hw/hfi1/trace_misc.h b/drivers/infiniband/hw/hfi1/trace_misc.h
index d308454..deac77d 100644
--- a/drivers/infiniband/hw/hfi1/trace_misc.h
+++ b/drivers/infiniband/hw/hfi1/trace_misc.h
@@ -72,6 +72,54 @@
 		      __entry->src)
 );
 
+#ifdef CONFIG_FAULT_INJECTION
+TRACE_EVENT(hfi1_fault_opcode,
+	    TP_PROTO(struct rvt_qp *qp, u8 opcode),
+	    TP_ARGS(qp, opcode),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(dd_from_ibdev(qp->ibqp.device))
+			     __field(u32, qpn)
+			     __field(u8, opcode)
+			     ),
+	    TP_fast_assign(DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
+			   __entry->qpn = qp->ibqp.qp_num;
+			   __entry->opcode = opcode;
+			   ),
+	    TP_printk("[%s] qpn 0x%x opcode 0x%x",
+		      __get_str(dev), __entry->qpn, __entry->opcode)
+);
+
+TRACE_EVENT(hfi1_fault_packet,
+	    TP_PROTO(struct hfi1_packet *packet),
+	    TP_ARGS(packet),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(packet->rcd->ppd->dd)
+			     __field(u64, eflags)
+			     __field(u32, ctxt)
+			     __field(u32, hlen)
+			     __field(u32, tlen)
+			     __field(u32, updegr)
+			     __field(u32, etail)
+			     ),
+	     TP_fast_assign(DD_DEV_ASSIGN(packet->rcd->ppd->dd);
+			    __entry->eflags = rhf_err_flags(packet->rhf);
+			    __entry->ctxt = packet->rcd->ctxt;
+			    __entry->hlen = packet->hlen;
+			    __entry->tlen = packet->tlen;
+			    __entry->updegr = packet->updegr;
+			    __entry->etail = rhf_egr_index(packet->rhf);
+			    ),
+	     TP_printk(
+		"[%s] ctxt %d eflags 0x%llx hlen %d tlen %d updegr %d etail %d",
+		__get_str(dev),
+		__entry->ctxt,
+		__entry->eflags,
+		__entry->hlen,
+		__entry->tlen,
+		__entry->updegr,
+		__entry->etail
+		)
+);
+#endif
+
 #endif /* __HFI1_TRACE_MISC_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 928918c..9f016da 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -60,6 +60,7 @@
 #include "trace.h"
 #include "qp.h"
 #include "verbs_txreq.h"
+#include "debugfs.h"
 
 static unsigned int hfi1_lkey_table_size = 16;
 module_param_named(lkey_table_size, hfi1_lkey_table_size, uint,
@@ -599,6 +600,11 @@ void hfi1_ib_rcv(struct hfi1_packet *packet)
 			rcu_read_unlock();
 			goto drop;
 		}
+		if (unlikely(hfi1_dbg_fault_opcode(packet->qp, opcode,
+						   true))) {
+			rcu_read_unlock();
+			goto drop;
+		}
 		spin_lock_irqsave(&packet->qp->r_lock, flags);
 		packet_handler = qp_ok(opcode, packet);
 		if (likely(packet_handler))
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 3a0b589..2756ec3 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -195,6 +195,10 @@ struct hfi1_ibdev {
 	struct dentry *hfi1_ibdev_dbg;
 	/* per HFI symlinks to above */
 	struct dentry *hfi1_ibdev_link;
+#ifdef CONFIG_FAULT_INJECTION
+	struct fault_opcode *fault_opcode;
+	struct fault_packet *fault_packet;
+#endif
 #endif
 };
 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 17/20] IB/hfi1: Add transmit fault injection feature
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (15 preceding siblings ...)
  2017-03-01 18:23   ` [PATCH 16/20] IB/hfi1: Add receive fault injection feature Dennis Dalessandro
@ 2017-03-01 18:23   ` Dennis Dalessandro
  2017-03-01 18:24   ` [PATCH 18/20] IB/hfi1: Eliminate synchronize_rcu() in mr delete Dennis Dalessandro
                     ` (4 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:23 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Mike Marciniszyn

From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add ability to fault packets on transmit by opcode.
Dropping by packet can be achieved by setting the mask to 0.

In order to drop non-verbs traffic we set PbcInsertHcrc
to NONE (0x2). The packet will still be delivered to
the receiving node but a KHdrHCRCErr (KDETH packet
with a bad HCRC) will be triggered and the packet will
not be delivered to the correct context.

In order to drop regular verbs traffic we set the
PbcTestEbp flag. The packet will still be delivered
to the receiving node but a 'late ebp error' will
be triggered and the packet will be dropped.

A global toggle (/sys/kernel/debug/hfi1/hfi1_X/fault_suppress_err)
has been added to suppress the error messages on the receiving
node when a packet was faulted on the sending node.
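
The opcode-based selection described above can be sketched in stand-alone
C. This is only an illustration under stated assumptions, not the driver
code: IB_OPCODE_MSP (0xe0) and the NONE encoding (0x2) come from this
patch, while SKETCH_PBC_INSERT_HCRC_SHIFT and SKETCH_PBC_TEST_EBP are
placeholder bit positions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the patch text. */
#define IB_OPCODE_MSP  0xe0u	/* manufacturer-specific opcode class */
#define PBC_IHCRC_NONE 0x2u	/* "do not insert an HCRC" encoding */

/* Placeholder bit positions, assumed for illustration only. */
#define SKETCH_PBC_INSERT_HCRC_SHIFT 26
#define SKETCH_PBC_TEST_EBP          (1ull << 24)

/* Return the PBC with the fault bits for this opcode class applied. */
static uint64_t sketch_fault_tx(uint8_t opcode, uint64_t pbc)
{
	if ((opcode & IB_OPCODE_MSP) == IB_OPCODE_MSP)
		/* non-verbs traffic: ask for no HCRC so the receiver flags it */
		pbc |= (uint64_t)PBC_IHCRC_NONE << SKETCH_PBC_INSERT_HCRC_SHIFT;
	else
		/* regular verbs traffic: force a late EBP error */
		pbc |= SKETCH_PBC_TEST_EBP;
	return pbc;
}
```

Only opcodes whose top three bits are all set take the HCRC path; every
other opcode class falls through to the EBP flag.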

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c    |    4 +++
 drivers/infiniband/hw/hfi1/debugfs.c |    8 ++++++
 drivers/infiniband/hw/hfi1/debugfs.h |    6 ++++
 drivers/infiniband/hw/hfi1/driver.c  |   11 ++++++++
 drivers/infiniband/hw/hfi1/verbs.c   |   49 +++++++++++++++++++++++++++++-----
 drivers/infiniband/hw/hfi1/verbs.h   |    1 +
 include/rdma/ib_pack.h               |    2 +
 7 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 77f4b41..79a316a 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -64,6 +64,7 @@
 #include "platform.h"
 #include "aspm.h"
 #include "affinity.h"
+#include "debugfs.h"
 
 #define NUM_IB_PORTS 1
 
@@ -7898,6 +7899,9 @@ static void handle_dcc_err(struct hfi1_devdata *dd, u32 unused, u64 reg)
 		reg &= ~DCC_ERR_FLG_EN_CSR_ACCESS_BLOCKED_HOST_SMASK;
 	}
 
+	if (unlikely(hfi1_dbg_fault_suppress_err(&dd->verbs_dev)))
+		reg &= ~DCC_ERR_FLG_LATE_EBP_ERR_SMASK;
+
 	/* report any remaining errors */
 	if (reg)
 		dd_dev_info_ratelimited(dd, "DCC Error: %s\n",
diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 763cdb0..91d7376 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -1240,6 +1240,11 @@ static int __init fault_init_debugfs(struct hfi1_ibdev *ibd)
 	return ret;
 }
 
+bool hfi1_dbg_fault_suppress_err(struct hfi1_ibdev *ibd)
+{
+	return ibd->fault_suppress_err;
+}
+
 bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx)
 {
 	bool ret = false;
@@ -1329,6 +1334,9 @@ void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 		}
 
 #ifdef CONFIG_FAULT_INJECTION
+	debugfs_create_bool("fault_suppress_err", 0600,
+			    ibd->hfi1_ibdev_dbg,
+			    &ibd->fault_suppress_err);
 	fault_init_debugfs(ibd);
 #endif
 }
diff --git a/drivers/infiniband/hw/hfi1/debugfs.h b/drivers/infiniband/hw/hfi1/debugfs.h
index 53dfdae..84fa892 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.h
+++ b/drivers/infiniband/hw/hfi1/debugfs.h
@@ -75,6 +75,7 @@ struct fault_packet {
 
 bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx);
 bool hfi1_dbg_fault_packet(struct hfi1_packet *packet);
+bool hfi1_dbg_fault_suppress_err(struct hfi1_ibdev *ibd);
 #else
 static inline bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
 {
@@ -86,6 +87,11 @@ static inline bool hfi1_dbg_fault_opcode(struct rvt_qp *qp,
 {
 	return false;
 }
+
+static inline bool hfi1_dbg_fault_suppress_err(struct hfi1_ibdev *ibd)
+{
+	return false;
+}
 #endif
 
 #else
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index c0b012f..64bdbce 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -1367,6 +1367,11 @@ int process_receive_ib(struct hfi1_packet *packet)
 			  packet->updegr,
 			  rhf_egr_index(packet->rhf));
 
+	if (unlikely(
+		 (hfi1_dbg_fault_suppress_err(&packet->rcd->dd->verbs_dev) &&
+		 (packet->rhf & RHF_DC_ERR))))
+		return RHF_RCV_CONTINUE;
+
 	if (unlikely(rhf_err_flags(packet->rhf))) {
 		handle_eflags(packet);
 		return RHF_RCV_CONTINUE;
@@ -1402,6 +1407,12 @@ int process_receive_bypass(struct hfi1_packet *packet)
 
 int process_receive_error(struct hfi1_packet *packet)
 {
+	/* KHdrHCRCErr -- KDETH packet with a bad HCRC */
+	if (unlikely(
+		 hfi1_dbg_fault_suppress_err(&packet->rcd->dd->verbs_dev) &&
+		 rhf_rcv_type_err(packet->rhf) == 3))
+		return RHF_RCV_CONTINUE;
+
 	handle_eflags(packet);
 
 	if (unlikely(rhf_err_flags(packet->rhf)))
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 9f016da..5e7e577 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -518,6 +518,35 @@ static inline opcode_handler qp_ok(int opcode, struct hfi1_packet *packet)
 	return NULL;
 }
 
+static u64 hfi1_fault_tx(struct rvt_qp *qp, u8 opcode, u64 pbc)
+{
+#ifdef CONFIG_FAULT_INJECTION
+	if ((opcode & IB_OPCODE_MSP) == IB_OPCODE_MSP)
+		/*
+		 * In order to drop non-IB traffic we
+		 * set PbcInsertHcrc to NONE (0x2).
+		 * The packet will still be delivered
+		 * to the receiving node but a
+		 * KHdrHCRCErr (KDETH packet with a bad
+		 * HCRC) will be triggered and the
+		 * packet will not be delivered to the
+		 * correct context.
+		 */
+		pbc |= (u64)PBC_IHCRC_NONE << PBC_INSERT_HCRC_SHIFT;
+	else
+		/*
+		 * In order to drop regular verbs
+		 * traffic we set the PbcTestEbp
+		 * flag. The packet will still be
+		 * delivered to the receiving node but
+		 * a 'late ebp error' will be
+		 * triggered and will be dropped.
+		 */
+		pbc |= PBC_TEST_EBP;
+#endif
+	return pbc;
+}
+
 /**
  * hfi1_ib_rcv - process an incoming packet
  * @packet: data packet information
@@ -803,7 +832,6 @@ static int build_verbs_tx_desc(
 		if (ret)
 			goto bail_txadd;
 	}
-
 	/* add the ulp payload - if any. tx->ss can be NULL for acks */
 	if (tx->ss)
 		ret = build_verbs_ulp_payload(sde, length, tx);
@@ -822,7 +850,6 @@ int hfi1_verbs_send_dma(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 	struct hfi1_ibdev *dev = ps->dev;
 	struct hfi1_pportdata *ppd = ps->ppd;
 	struct verbs_txreq *tx;
-	u64 pbc_flags = 0;
 	u8 sc5 = priv->s_sc;
 
 	int ret;
@@ -831,12 +858,16 @@ int hfi1_verbs_send_dma(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 	if (!sdma_txreq_built(&tx->txreq)) {
 		if (likely(pbc == 0)) {
 			u32 vl = sc_to_vlt(dd_from_ibdev(qp->ibqp.device), sc5);
+			u8 opcode = get_opcode(&tx->phdr.hdr);
+
 			/* No vl15 here */
 			/* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
-			pbc_flags |= (!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT;
+			pbc |= (!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT;
 
+			if (unlikely(hfi1_dbg_fault_opcode(qp, opcode, false)))
+				pbc = hfi1_fault_tx(qp, opcode, pbc);
 			pbc = create_pbc(ppd,
-					 pbc_flags,
+					 pbc,
 					 qp->srate_mbps,
 					 vl,
 					 plen);
@@ -939,7 +970,6 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 	u32 plen = hdrwords + dwords + 2; /* includes pbc */
 	struct hfi1_pportdata *ppd = ps->ppd;
 	u32 *hdr = (u32 *)&ps->s_txreq->phdr.hdr;
-	u64 pbc_flags = 0;
 	u8 sc5;
 	unsigned long flags = 0;
 	struct send_context *sc;
@@ -964,9 +994,14 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 
 	if (likely(pbc == 0)) {
 		u8 vl = sc_to_vlt(dd_from_ibdev(qp->ibqp.device), sc5);
+		struct verbs_txreq *tx = ps->s_txreq;
+		u8 opcode = get_opcode(&tx->phdr.hdr);
+
 		/* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
-		pbc_flags |= (!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT;
-		pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps, vl, plen);
+		pbc |= (!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT;
+		if (unlikely(hfi1_dbg_fault_opcode(qp, opcode, false)))
+			pbc = hfi1_fault_tx(qp, opcode, pbc);
+		pbc = create_pbc(ppd, pbc, qp->srate_mbps, vl, plen);
 	}
 	if (cb)
 		iowait_pio_inc(&priv->s_iowait);
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 2756ec3..6c549e7 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -198,6 +198,7 @@ struct hfi1_ibdev {
 #ifdef CONFIG_FAULT_INJECTION
 	struct fault_opcode *fault_opcode;
 	struct fault_packet *fault_packet;
+	bool fault_suppress_err;
 #endif
 #endif
 };
diff --git a/include/rdma/ib_pack.h b/include/rdma/ib_pack.h
index b13419c..3665589 100644
--- a/include/rdma/ib_pack.h
+++ b/include/rdma/ib_pack.h
@@ -80,6 +80,8 @@ enum {
 	IB_OPCODE_UD                                = 0x60,
 	/* per IBTA 1.3 vol 1 Table 38, A10.3.2 */
 	IB_OPCODE_CNP                               = 0x80,
+	/* Manufacturer specific */
+	IB_OPCODE_MSP                               = 0xe0,
 
 	/* operations -- just used to define real constants */
 	IB_OPCODE_SEND_FIRST                        = 0x00,


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 18/20] IB/hfi1: Eliminate synchronize_rcu() in mr delete
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (16 preceding siblings ...)
  2017-03-01 18:23   ` [PATCH 17/20] IB/hfi1: Add transmit " Dennis Dalessandro
@ 2017-03-01 18:24   ` Dennis Dalessandro
  2017-03-01 18:24   ` [PATCH 19/20] IB/rdmavt, IB/qib, IB/hfi1: Make percpu refcount optional for user MRs Dennis Dalessandro
                     ` (3 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:24 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The synchronize_rcu() call can be eliminated to improve memory deregistration
performance.

There are two key fields involved:
- the rcu pointer itself
- the lkey_published field

To close the window between the rcu read of the mregion pointer and
taking the reference count, the code changes as follows:

1. In lkey/rkey validation (reader)

Read the rcu pointer.  If the pointer is non-NULL, get a reference.

In addition to the current validation tests, use READ_ONCE() on
lkey_published.

Upon any failure, release the reference.

2. In the remove logic (delete)

Ensure lkey_published is zeroed prior to setting the pointer to NULL.
This requires using rcu_assign_pointer() so that lkey_published is
written before the NULL.

3. In the insert logic (add)

Ensure lkey_published is set, then use rcu_assign_pointer() so that the
pointer is written after all MR fields.
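
The three rules above can be sketched as a user-space analogy. This is
not the rdmavt code: it uses C11 atomics in place of the kernel RCU
primitives (a release store stands in for rcu_assign_pointer(), an
acquire load for rcu_dereference()), all sketch_* names are hypothetical,
and object lifetime (what the RCU read section and the percpu refcount
protect in the kernel) is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_mr {
	atomic_int refcount;
	atomic_bool published;
	unsigned int lkey;
};

/* One table slot; a null pointer means "no MR published here". */
static _Atomic(struct sketch_mr *) sketch_slot;

static void sketch_mr_init(struct sketch_mr *mr, unsigned int lkey)
{
	atomic_init(&mr->refcount, 0);
	atomic_init(&mr->published, false);
	mr->lkey = lkey;
}

/* add: set published, then publish the pointer with release ordering */
static void sketch_insert(struct sketch_mr *mr)
{
	atomic_store_explicit(&mr->published, true, memory_order_relaxed);
	atomic_store_explicit(&sketch_slot, mr, memory_order_release);
}

/* delete: clear published before clearing the pointer */
static void sketch_remove(struct sketch_mr *mr)
{
	atomic_store_explicit(&mr->published, false, memory_order_relaxed);
	atomic_store_explicit(&sketch_slot, NULL, memory_order_release);
}

/* reader: returns the MR with a reference held, or NULL */
static struct sketch_mr *sketch_lookup(unsigned int lkey)
{
	struct sketch_mr *mr =
		atomic_load_explicit(&sketch_slot, memory_order_acquire);

	if (!mr)
		return NULL;
	atomic_fetch_add(&mr->refcount, 1);	/* take the reference first */
	if (!atomic_load(&mr->published) || mr->lkey != lkey) {
		atomic_fetch_sub(&mr->refcount, 1);	/* drop on any failure */
		return NULL;
	}
	return mr;
}
```

Taking the reference before the validation tests is what lets the delete
path skip the grace-period wait: a reader that raced with the remove
either sees a NULL pointer, or sees published cleared and backs out.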

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/sw/rdmavt/mr.c |   49 +++++++++++++++++++++++++------------
 1 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index ae30b68..7c86955 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -191,8 +191,9 @@ static int rvt_alloc_lkey(struct rvt_mregion *mr, int dma_region)
 
 		tmr = rcu_access_pointer(dev->dma_mr);
 		if (!tmr) {
-			rcu_assign_pointer(dev->dma_mr, mr);
 			mr->lkey_published = 1;
+			/* Insure published written first */
+			rcu_assign_pointer(dev->dma_mr, mr);
 			rvt_get_mr(mr);
 		}
 		goto success;
@@ -224,8 +225,9 @@ static int rvt_alloc_lkey(struct rvt_mregion *mr, int dma_region)
 		mr->lkey |= 1 << 8;
 		rkt->gen++;
 	}
-	rcu_assign_pointer(rkt->table[r], mr);
 	mr->lkey_published = 1;
+	/* Insure published written first */
+	rcu_assign_pointer(rkt->table[r], mr);
 success:
 	spin_unlock_irqrestore(&rkt->lock, flags);
 out:
@@ -253,23 +255,24 @@ static void rvt_free_lkey(struct rvt_mregion *mr)
 	spin_lock_irqsave(&rkt->lock, flags);
 	if (!lkey) {
 		if (mr->lkey_published) {
-			RCU_INIT_POINTER(dev->dma_mr, NULL);
+			mr->lkey_published = 0;
+			/* insure published is written before pointer */
+			rcu_assign_pointer(dev->dma_mr, NULL);
 			rvt_put_mr(mr);
 		}
 	} else {
 		if (!mr->lkey_published)
 			goto out;
 		r = lkey >> (32 - dev->dparms.lkey_table_size);
-		RCU_INIT_POINTER(rkt->table[r], NULL);
+		mr->lkey_published = 0;
+		/* insure published is written before pointer */
+		rcu_assign_pointer(rkt->table[r], NULL);
 	}
-	mr->lkey_published = 0;
 	freed++;
 out:
 	spin_unlock_irqrestore(&rkt->lock, flags);
-	if (freed) {
-		synchronize_rcu();
+	if (freed)
 		percpu_ref_kill(&mr->refcount);
-	}
 }
 
 static struct rvt_mr *__rvt_alloc_mr(int count, struct ib_pd *pd)
@@ -822,16 +825,21 @@ int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
 		goto ok;
 	}
 	mr = rcu_dereference(rkt->table[sge->lkey >> rkt->shift]);
-	if (unlikely(!mr || atomic_read(&mr->lkey_invalid) ||
-		     mr->lkey != sge->lkey || mr->pd != &pd->ibpd))
+	if (!mr)
 		goto bail;
+	rvt_get_mr(mr);
+	if (!READ_ONCE(mr->lkey_published))
+		goto bail_unref;
+
+	if (unlikely(atomic_read(&mr->lkey_invalid) ||
+		     mr->lkey != sge->lkey || mr->pd != &pd->ibpd))
+		goto bail_unref;
 
 	off = sge->addr - mr->user_base;
 	if (unlikely(sge->addr < mr->user_base ||
 		     off + sge->length > mr->length ||
 		     (mr->access_flags & acc) != acc))
-		goto bail;
-	rvt_get_mr(mr);
+		goto bail_unref;
 	rcu_read_unlock();
 
 	off += mr->offset;
@@ -867,6 +875,8 @@ int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
 	isge->n = n;
 ok:
 	return 1;
+bail_unref:
+	rvt_put_mr(mr);
 bail:
 	rcu_read_unlock();
 	return 0;
@@ -922,15 +932,20 @@ int rvt_rkey_ok(struct rvt_qp *qp, struct rvt_sge *sge,
 	}
 
 	mr = rcu_dereference(rkt->table[rkey >> rkt->shift]);
-	if (unlikely(!mr || atomic_read(&mr->lkey_invalid) ||
-		     mr->lkey != rkey || qp->ibqp.pd != mr->pd))
+	if (!mr)
 		goto bail;
+	rvt_get_mr(mr);
+	/* insure mr read is before test */
+	if (!READ_ONCE(mr->lkey_published))
+		goto bail_unref;
+	if (unlikely(atomic_read(&mr->lkey_invalid) ||
+		     mr->lkey != rkey || qp->ibqp.pd != mr->pd))
+		goto bail_unref;
 
 	off = vaddr - mr->iova;
 	if (unlikely(vaddr < mr->iova || off + len > mr->length ||
 		     (mr->access_flags & acc) == 0))
-		goto bail;
-	rvt_get_mr(mr);
+		goto bail_unref;
 	rcu_read_unlock();
 
 	off += mr->offset;
@@ -966,6 +981,8 @@ int rvt_rkey_ok(struct rvt_qp *qp, struct rvt_sge *sge,
 	sge->n = n;
 ok:
 	return 1;
+bail_unref:
+	rvt_put_mr(mr);
 bail:
 	rcu_read_unlock();
 	return 0;


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 19/20] IB/rdmavt, IB/qib, IB/hfi1: Make percpu refcount optional for user MRs
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (17 preceding siblings ...)
  2017-03-01 18:24   ` [PATCH 18/20] IB/hfi1: Eliminate synchronize_rcu() in mr delete Dennis Dalessandro
@ 2017-03-01 18:24   ` Dennis Dalessandro
  2017-03-01 18:24   ` [PATCH 20/20] IB/core: If the MGID/MLID pair is not on the list return an error Dennis Dalessandro
                     ` (2 subsequent siblings)
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:24 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

In some cases, the cost of user memory deregistration is more
important than the data path benefit of percpu reference counts.

Add a module parameter (off by default) that disables percpu reference
counts for user memory regions.
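
The selection this parameter controls can be sketched as a small helper.
This is an illustration only: SKETCH_REF_INIT_ATOMIC is a stand-in for
the PERCPU_REF_INIT_ATOMIC flag used in the diff, and
sketch_mr_ref_flags() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for PERCPU_REF_INIT_ATOMIC; the value is arbitrary. */
#define SKETCH_REF_INIT_ATOMIC 1u

/*
 * Pick the refcount init flags for a new MR: start in atomic mode only
 * when the PD belongs to user space AND the parameter is set.
 */
static unsigned int sketch_mr_ref_flags(bool user_pd,
					unsigned int no_user_mr_percpu)
{
	return (user_pd && no_user_mr_percpu) ? SKETCH_REF_INIT_ATOMIC : 0;
}
```

Kernel MRs always keep the percpu mode, whatever the parameter says.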

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/verbs.c    |    7 +++++++
 drivers/infiniband/hw/qib/qib_verbs.c |    7 +++++++
 drivers/infiniband/sw/rdmavt/mr.c     |    6 +++++-
 include/rdma/rdma_vt.h                |    1 +
 4 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 5e7e577..552b26d 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -68,6 +68,12 @@
 MODULE_PARM_DESC(lkey_table_size,
 		 "LKEY table size in bits (2^n, 1 <= n <= 23)");
 
+static unsigned int hfi1_no_user_mr_percpu;
+module_param_named(no_user_mr_percpu, hfi1_no_user_mr_percpu, uint,
+		   S_IRUGO);
+MODULE_PARM_DESC(no_user_mr_percpu,
+		 "Avoid percpu refcount for user MRs (default 0)");
+
 static unsigned int hfi1_max_pds = 0xFFFF;
 module_param_named(max_pds, hfi1_max_pds, uint, S_IRUGO);
 MODULE_PARM_DESC(max_pds,
@@ -1841,6 +1847,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
 	/* misc settings */
 	dd->verbs_dev.rdi.flags = 0; /* Let rdmavt handle it all */
 	dd->verbs_dev.rdi.dparms.lkey_table_size = hfi1_lkey_table_size;
+	dd->verbs_dev.rdi.dparms.no_user_mr_percpu = hfi1_no_user_mr_percpu;
 	dd->verbs_dev.rdi.dparms.nports = dd->num_pports;
 	dd->verbs_dev.rdi.dparms.npkeys = hfi1_get_npkeys(dd);
 
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index e120efe..6c718cd 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -56,6 +56,12 @@
 MODULE_PARM_DESC(lkey_table_size,
 		 "LKEY table size in bits (2^n, 1 <= n <= 23)");
 
+static unsigned int qib_no_user_mr_percpu;
+module_param_named(no_user_mr_percpu, qib_no_user_mr_percpu, uint,
+		   S_IRUGO);
+MODULE_PARM_DESC(no_user_mr_percpu,
+		 "Avoid percpu refcount for user MRs (default 0)");
+
 static unsigned int ib_qib_max_pds = 0xFFFF;
 module_param_named(max_pds, ib_qib_max_pds, uint, S_IRUGO);
 MODULE_PARM_DESC(max_pds,
@@ -1606,6 +1612,7 @@ int qib_register_ib_device(struct qib_devdata *dd)
 	dd->verbs_dev.rdi.dparms.max_rdma_atomic = QIB_MAX_RDMA_ATOMIC;
 	dd->verbs_dev.rdi.driver_f.get_guid_be = qib_get_guid_be;
 	dd->verbs_dev.rdi.dparms.lkey_table_size = qib_lkey_table_size;
+	dd->verbs_dev.rdi.dparms.no_user_mr_percpu = qib_no_user_mr_percpu;
 	dd->verbs_dev.rdi.dparms.qp_table_size = ib_qib_qp_table_size;
 	dd->verbs_dev.rdi.dparms.qpn_start = 1;
 	dd->verbs_dev.rdi.dparms.qpn_res_start = QIB_KD_QP;
diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index 7c86955..bbcc31f 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -280,6 +280,7 @@ static void rvt_free_lkey(struct rvt_mregion *mr)
 	struct rvt_mr *mr;
 	int rval = -ENOMEM;
 	int m;
+	struct rvt_dev_info *dev = ib_to_rvt(pd->device);
 
 	/* Allocate struct plus pointers to first level page tables. */
 	m = (count + RVT_SEGSZ - 1) / RVT_SEGSZ;
@@ -287,7 +288,10 @@ static void rvt_free_lkey(struct rvt_mregion *mr)
 	if (!mr)
 		goto bail;
 
-	rval = rvt_init_mregion(&mr->mr, pd, count, 0);
+	rval = rvt_init_mregion(&mr->mr, pd, count,
+				ibpd_to_rvtpd(pd)->user &&
+				dev->dparms.no_user_mr_percpu ?
+					PERCPU_REF_INIT_ATOMIC : 0);
 	if (rval)
 		goto bail;
 	/*
diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index 8fc1ca7..d60a41e 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -142,6 +142,7 @@ struct rvt_driver_params {
 	 * For instance special module parameters. Goes here.
 	 */
 	unsigned int lkey_table_size;
+	unsigned int no_user_mr_percpu;
 	unsigned int qp_table_size;
 	int qpn_start;
 	int qpn_inc;


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 20/20] IB/core: If the MGID/MLID pair is not on the list return an error
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (18 preceding siblings ...)
  2017-03-01 18:24   ` [PATCH 19/20] IB/rdmavt, IB/qib, IB/hfi1: Make percpu refcount optional for user MRs Dennis Dalessandro
@ 2017-03-01 18:24   ` Dennis Dalessandro
       [not found]     ` <20170301182426.29989.77369.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2017-03-02 15:18   ` [PATCH v2 " Dennis Dalessandro
  2017-03-02 18:26   ` [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature Dennis Dalessandro
  21 siblings, 1 reply; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-01 18:24 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Ira Weiny

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

A list of MGID/MLID pairs is built when doing a multicast attach.  When
the multicast detach is called, the list is searched, and regardless of
the search outcome, the driver detach is called.

If an MGID/MLID pair is not on the list, driver detach should not be
called, and an error should be returned.  Failure to do so can leave
the Multicast list out of sync with what the driver has.
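
The intended detach flow can be sketched in user-space C with a plain
singly linked list. This is an illustration under stated assumptions,
not the uverbs code: every sketch_* name is hypothetical, and
sketch_driver_detach() only counts calls where the real code would call
into the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct sketch_mcast {
	struct sketch_mcast *next;
	unsigned char gid[16];
	unsigned short mlid;
};

static int sketch_driver_detach_calls;

/* Stand-in for the driver detach; only records that it was called. */
static int sketch_driver_detach(const unsigned char *gid, unsigned short mlid)
{
	(void)gid;
	(void)mlid;
	sketch_driver_detach_calls++;
	return 0;
}

/* Record an MGID/MLID pair on the list. */
static int sketch_attach(struct sketch_mcast **head,
			 const unsigned char *gid, unsigned short mlid)
{
	struct sketch_mcast *e = malloc(sizeof(*e));

	if (!e)
		return -ENOMEM;
	memcpy(e->gid, gid, sizeof(e->gid));
	e->mlid = mlid;
	e->next = *head;
	*head = e;
	return 0;
}

/* Call the driver only if the pair was found and removed. */
static int sketch_detach(struct sketch_mcast **head,
			 const unsigned char *gid, unsigned short mlid)
{
	struct sketch_mcast **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		struct sketch_mcast *e = *pp;

		if (e->mlid == mlid && !memcmp(e->gid, gid, sizeof(e->gid))) {
			*pp = e->next;
			free(e);
			return sketch_driver_detach(gid, mlid);
		}
	}
	return -EINVAL;	/* not on the list: leave the driver alone */
}
```

A detach for a pair that was never attached returns -EINVAL and never
reaches the driver, so the list and the driver state cannot diverge.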

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 7b7a76e..40cd335 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3186,6 +3186,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 	struct ib_qp                 *qp;
 	struct ib_uverbs_mcast_entry *mcast;
 	int                           ret = -EINVAL;
+	bool                          found = false;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
@@ -3194,10 +3195,6 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 	if (!qp)
 		return -EINVAL;
 
-	ret = ib_detach_mcast(qp, (union ib_gid *) cmd.gid, cmd.mlid);
-	if (ret)
-		goto out_put;
-
 	obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
 
 	list_for_each_entry(mcast, &obj->mcast_list, list)
@@ -3205,9 +3202,17 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 		    !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
 			list_del(&mcast->list);
 			kfree(mcast);
+			found = true;
 			break;
 		}
 
+	if (!found) {
+		ret = -EINVAL;
+		goto out_put;
+	}
+
+	ret = ib_detach_mcast(qp, (union ib_gid *)cmd.gid, cmd.mlid);
+
 out_put:
 	put_qp_write(qp);
 


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 20/20] IB/core: If the MGID/MLID pair is not on the list return an error
       [not found]     ` <20170301182426.29989.77369.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-03-02 14:00       ` Leon Romanovsky
       [not found]         ` <20170302140028.GE9525-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Leon Romanovsky @ 2017-03-02 14:00 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Ira Weiny

On Wed, Mar 01, 2017 at 10:24:32AM -0800, Dennis Dalessandro wrote:
> From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> A list of MGID/MLID pairs is built when doing a multicast attach.  When
> the multicast detach is called, the list is searched, and regardless of
> the search outcome, the driver detach is called.
>
> If an MGID/MLID pair is not on the list, driver detach should not be
> called, and an error should be returned.  Failure to do so can leave
> the Multicast list out of sync with what the driver has.

It took me a while to understand your last sentence.
Can we add Fixes line and submit it to stable too?

Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 20/20] IB/core: If the MGID/MLID pair is not on the list return an error
       [not found]         ` <20170302140028.GE9525-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-03-02 15:01           ` Dennis Dalessandro
       [not found]             ` <4d49f7ab-5bc5-029c-4321-fd349c9dc0f8-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-02 15:01 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Ira Weiny

On 03/02/2017 09:00 AM, Leon Romanovsky wrote:
> On Wed, Mar 01, 2017 at 10:24:32AM -0800, Dennis Dalessandro wrote:
>> From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>
>> A list of MGID/MLID pairs is built when doing a multicast attach.  When
>> the multicast detach is called, the list is searched, and regardless of
>> the search outcome, the driver detach is called.
>>
>> If an MGID/MLID pair is not on the list, driver detach should not be
>> called, and an error should be returned.  Failure to do so can leave
>> the Multicast list out of sync with what the driver has.
>
> It took me a while to understand your last sentence.
> Can we add Fixes line and submit it to stable too?
>
> Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>

We'll reword the last bit there and add the fixes line and Cc stable.

-Denny

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 20/20] IB/core: If the MGID/MLID pair is not on the list return an error
       [not found]             ` <4d49f7ab-5bc5-029c-4321-fd349c9dc0f8-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-03-02 15:13               ` Leon Romanovsky
  0 siblings, 0 replies; 30+ messages in thread
From: Leon Romanovsky @ 2017-03-02 15:13 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Ira Weiny

On Thu, Mar 02, 2017 at 10:01:25AM -0500, Dennis Dalessandro wrote:
> On 03/02/2017 09:00 AM, Leon Romanovsky wrote:
> > On Wed, Mar 01, 2017 at 10:24:32AM -0800, Dennis Dalessandro wrote:
> > > From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > >
> > > A list of MGID/MLID pairs is built when doing a multicast attach.  When
> > > the multicast detach is called, the list is searched, and regardless of
> > > the search outcome, the driver detach is called.
> > >
> > > If an MGID/MLID pair is not on the list, driver detach should not be
> > > called, and an error should be returned.  Failure to do so can leave
> > > the Multicast list out of sync with what the driver has.
> >
> > It took me a while to understand your last sentence.
> > Can we add Fixes line and submit it to stable too?
> >
> > Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >
>
> We'll reword the last bit there and add the fixes line and Cc stable.

Thanks, I appreciate it.

>
> -Denny


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 20/20] IB/core: If the MGID/MLID pair is not on the list return an error
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (19 preceding siblings ...)
  2017-03-01 18:24   ` [PATCH 20/20] IB/core: If the MGID/MLID pair is not on the list return an error Dennis Dalessandro
@ 2017-03-02 15:18   ` Dennis Dalessandro
  2017-03-02 18:26   ` [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature Dennis Dalessandro
  21 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-02 15:18 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

A list of MGID/MLID pairs is built when doing a multicast attach.  When
the multicast detach is called, the list is searched, and regardless of
the search outcome, the driver detach is called.

If an MGID/MLID pair is not on the list, driver detach should not be
called, and an error should be returned.  Calling the driver without
removing an MGID/MLID pair from the list can leave the core and driver
out of sync.
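
The ordering described above can be sketched in userspace C. This is only
an illustration under stated assumptions, not the uverbs code: struct
mcast_entry, attach(), detach() and driver_detach() are hypothetical
stand-ins (driver_detach() plays the role of ib_detach_mcast() and merely
counts calls), and the kernel list helpers are replaced by a hand-rolled
singly linked list.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ib_uverbs_mcast_entry. */
struct mcast_entry {
	struct mcast_entry *next;
	unsigned char gid[16];
	unsigned short lid;
};

/* Number of times the (simulated) driver detach has been invoked. */
static int driver_detach_calls;

/* Hypothetical stand-in for ib_detach_mcast(); it only counts calls. */
static int driver_detach(const unsigned char *gid, unsigned short lid)
{
	(void)gid;
	(void)lid;
	driver_detach_calls++;
	return 0;
}

/* Record an MGID/MLID pair at the head of the list. */
static int attach(struct mcast_entry **head, const unsigned char *gid,
		  unsigned short lid)
{
	struct mcast_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return -ENOMEM;
	memcpy(e->gid, gid, sizeof(e->gid));
	e->lid = lid;
	e->next = *head;
	*head = e;
	return 0;
}

/*
 * Search the list first. Only when the pair is found (and unlinked) is the
 * driver called; otherwise -EINVAL is returned and the driver is left alone.
 */
static int detach(struct mcast_entry **head, const unsigned char *gid,
		  unsigned short lid)
{
	struct mcast_entry **pp;
	bool found = false;

	assert(head != NULL);
	for (pp = head; *pp; pp = &(*pp)->next) {
		struct mcast_entry *e = *pp;

		if (e->lid == lid && !memcmp(e->gid, gid, sizeof(e->gid))) {
			*pp = e->next;
			free(e);
			found = true;
			break;
		}
	}

	if (!found)
		return -EINVAL;

	return driver_detach(gid, lid);
}
```

With this shape, detaching a pair that was never attached returns -EINVAL
and leaves the call counter untouched, so the list and the simulated
driver cannot drift apart through a stray detach.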

Fixes: f4e401562c11 IB/uverbs: track multicast group membership for userspace QPs
Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 7b7a76e..40cd335 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3186,6 +3186,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 	struct ib_qp                 *qp;
 	struct ib_uverbs_mcast_entry *mcast;
 	int                           ret = -EINVAL;
+	bool                          found = false;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
@@ -3194,10 +3195,6 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 	if (!qp)
 		return -EINVAL;
 
-	ret = ib_detach_mcast(qp, (union ib_gid *) cmd.gid, cmd.mlid);
-	if (ret)
-		goto out_put;
-
 	obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
 
 	list_for_each_entry(mcast, &obj->mcast_list, list)
@@ -3205,9 +3202,17 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
 		    !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
 			list_del(&mcast->list);
 			kfree(mcast);
+			found = true;
 			break;
 		}
 
+	if (!found) {
+		ret = -EINVAL;
+		goto out_put;
+	}
+
+	ret = ib_detach_mcast(qp, (union ib_gid *)cmd.gid, cmd.mlid);
+
 out_put:
 	put_qp_write(qp);
 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/20] IB/hfi1: Add receive fault injection feature
       [not found]     ` <20170301182344.29989.12032.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-03-02 17:19       ` Dennis Dalessandro
  0 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-02 17:19 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Mike Marciniszyn

On 03/01/2017 01:23 PM, Dennis Dalessandro wrote:
> +static int __init fault_init_opcode_debugfs(struct hfi1_ibdev *ibd)
> +{

Turns out gcc 4.4 doesn't like the __init annotation. After getting the
all-clear email from the 0-day builds, I got another WARNING email later
that night. It compiles cleanly on more recent gcc, but we might as well
go ahead and fix it, as this seems to be a deprecated practice anyway. I
will follow up with an updated patch shortly.

> +static int __init fault_init_packet_debugfs(struct hfi1_ibdev *ibd)

And this one.

> +static int __init fault_init_debugfs(struct hfi1_ibdev *ibd)

And one more.

-Denny

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature
       [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (20 preceding siblings ...)
  2017-03-02 15:18   ` [PATCH v2 " Dennis Dalessandro
@ 2017-03-02 18:26   ` Dennis Dalessandro
       [not found]     ` <20170302182610.28851.42687.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  21 siblings, 1 reply; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-02 18:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add fault injection capability:
  - Drop packets unconditionally (fault_by_packet)
  - Drop packets based on opcode (fault_by_opcode)

This feature reacts to the global FAULT_INJECTION
config flag.

The faulting traces have been added:
  - misc/fault_opcode
  - misc/fault_packet

See 'Documentation/fault-injection/fault-injection.txt'
for details.

Examples:
  - Dropping packets by opcode:
    /sys/kernel/debug/hfi1/hfi1_X/fault_opcode
	# Enable fault
	echo Y > fault_by_opcode
	# Set probability of dropping (0-100%)
	# echo 25 > probability
	# Set opcode
	echo 0x64 > opcode
	# Number of times to fault
	echo 3 > times
	# An optional mask allows you to fault
	# a range of opcodes
	echo 0xf0 > mask
    /sys/kernel/debug/hfi1/hfi1_X/fault_opcode/fault_stats
    reports, in parentheses, the number of packets
    dropped for each opcode.

  - Dropping packets unconditionally
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet
	# Enable fault
	echo Y > fault_by_packet
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
    contains the number of packets dropped.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

---
changes from v1:
	remove __init annotations from 3 places
---
 drivers/infiniband/hw/hfi1/debugfs.c    |  222 +++++++++++++++++++++++++++++++
 drivers/infiniband/hw/hfi1/debugfs.h    |   35 +++++
 drivers/infiniband/hw/hfi1/driver.c     |    8 +
 drivers/infiniband/hw/hfi1/trace_misc.h |   48 +++++++
 drivers/infiniband/hw/hfi1/verbs.c      |    6 +
 drivers/infiniband/hw/hfi1/verbs.h      |    4 +
 6 files changed, 323 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 7fe9dd8..cac6d52 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -51,8 +51,12 @@
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/string.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/fault-inject.h>
 
 #include "hfi.h"
+#include "trace.h"
 #include "debugfs.h"
 #include "device.h"
 #include "qp.h"
@@ -1063,6 +1067,217 @@ static int _sdma_cpu_list_seq_show(struct seq_file *s, void *v)
 DEBUGFS_SEQ_FILE_OPEN(sdma_cpu_list)
 DEBUGFS_FILE_OPS(sdma_cpu_list);
 
+#ifdef CONFIG_FAULT_INJECTION
+static void *_fault_stats_seq_start(struct seq_file *s, loff_t *pos)
+{
+	struct hfi1_opcode_stats_perctx *opstats;
+
+	if (*pos >= ARRAY_SIZE(opstats->stats))
+		return NULL;
+	return pos;
+}
+
+static void *_fault_stats_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	struct hfi1_opcode_stats_perctx *opstats;
+
+	++*pos;
+	if (*pos >= ARRAY_SIZE(opstats->stats))
+		return NULL;
+	return pos;
+}
+
+static void _fault_stats_seq_stop(struct seq_file *s, void *v)
+{
+}
+
+static int _fault_stats_seq_show(struct seq_file *s, void *v)
+{
+	loff_t *spos = v;
+	loff_t i = *spos, j;
+	u64 n_packets = 0, n_bytes = 0;
+	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
+	struct hfi1_devdata *dd = dd_from_dev(ibd);
+
+	for (j = 0; j < dd->first_user_ctxt; j++) {
+		if (!dd->rcd[j])
+			continue;
+		n_packets += dd->rcd[j]->opstats->stats[i].n_packets;
+		n_bytes += dd->rcd[j]->opstats->stats[i].n_bytes;
+	}
+	if (!n_packets && !n_bytes)
+		return SEQ_SKIP;
+	if (!ibd->fault_opcode->n_rxfaults[i] &&
+	    !ibd->fault_opcode->n_txfaults[i])
+		return SEQ_SKIP;
+	seq_printf(s, "%02llx %llu/%llu (faults rx:%llu faults: tx:%llu)\n", i,
+		   (unsigned long long)n_packets,
+		   (unsigned long long)n_bytes,
+		   (unsigned long long)ibd->fault_opcode->n_rxfaults[i],
+		   (unsigned long long)ibd->fault_opcode->n_txfaults[i]);
+	return 0;
+}
+
+DEBUGFS_SEQ_FILE_OPS(fault_stats);
+DEBUGFS_SEQ_FILE_OPEN(fault_stats);
+DEBUGFS_FILE_OPS(fault_stats);
+
+static void fault_exit_opcode_debugfs(struct hfi1_ibdev *ibd)
+{
+	debugfs_remove_recursive(ibd->fault_opcode->dir);
+	kfree(ibd->fault_opcode);
+	ibd->fault_opcode = NULL;
+}
+
+static int fault_init_opcode_debugfs(struct hfi1_ibdev *ibd)
+{
+	struct dentry *parent = ibd->hfi1_ibdev_dbg;
+
+	ibd->fault_opcode = kzalloc(sizeof(*ibd->fault_opcode), GFP_KERNEL);
+	if (!ibd->fault_opcode)
+		return -ENOMEM;
+
+	ibd->fault_opcode->attr.interval = 1;
+	ibd->fault_opcode->attr.require_end = ULONG_MAX;
+	ibd->fault_opcode->attr.stacktrace_depth = 32;
+	ibd->fault_opcode->attr.dname = NULL;
+	ibd->fault_opcode->attr.verbose = 0;
+	ibd->fault_opcode->fault_by_opcode = false;
+	ibd->fault_opcode->opcode = 0;
+	ibd->fault_opcode->mask = 0xff;
+
+	ibd->fault_opcode->dir =
+		fault_create_debugfs_attr("fault_opcode",
+					  parent,
+					  &ibd->fault_opcode->attr);
+	if (IS_ERR(ibd->fault_opcode->dir)) {
+		kfree(ibd->fault_opcode);
+		return -ENOENT;
+	}
+
+	DEBUGFS_SEQ_FILE_CREATE(fault_stats, ibd->fault_opcode->dir, ibd);
+	if (!debugfs_create_bool("fault_by_opcode", 0600,
+				 ibd->fault_opcode->dir,
+				 &ibd->fault_opcode->fault_by_opcode))
+		goto fail;
+	if (!debugfs_create_x8("opcode", 0600, ibd->fault_opcode->dir,
+			       &ibd->fault_opcode->opcode))
+		goto fail;
+	if (!debugfs_create_x8("mask", 0600, ibd->fault_opcode->dir,
+			       &ibd->fault_opcode->mask))
+		goto fail;
+
+	return 0;
+fail:
+	fault_exit_opcode_debugfs(ibd);
+	return -ENOMEM;
+}
+
+static void fault_exit_packet_debugfs(struct hfi1_ibdev *ibd)
+{
+	debugfs_remove_recursive(ibd->fault_packet->dir);
+	kfree(ibd->fault_packet);
+	ibd->fault_packet = NULL;
+}
+
+static int fault_init_packet_debugfs(struct hfi1_ibdev *ibd)
+{
+	struct dentry *parent = ibd->hfi1_ibdev_dbg;
+
+	ibd->fault_packet = kzalloc(sizeof(*ibd->fault_packet), GFP_KERNEL);
+	if (!ibd->fault_packet)
+		return -ENOMEM;
+
+	ibd->fault_packet->attr.interval = 1;
+	ibd->fault_packet->attr.require_end = ULONG_MAX;
+	ibd->fault_packet->attr.stacktrace_depth = 32;
+	ibd->fault_packet->attr.dname = NULL;
+	ibd->fault_packet->attr.verbose = 0;
+	ibd->fault_packet->fault_by_packet = false;
+
+	ibd->fault_packet->dir =
+		fault_create_debugfs_attr("fault_packet",
+					  parent,
+					  &ibd->fault_packet->attr);
+	if (IS_ERR(ibd->fault_packet->dir)) {
+		kfree(ibd->fault_packet);
+		return -ENOENT;
+	}
+
+	if (!debugfs_create_bool("fault_by_packet", 0600,
+				 ibd->fault_packet->dir,
+				 &ibd->fault_packet->fault_by_packet))
+		goto fail;
+	if (!debugfs_create_u64("fault_stats", 0400,
+				ibd->fault_packet->dir,
+				&ibd->fault_packet->n_faults))
+		goto fail;
+
+	return 0;
+fail:
+	fault_exit_packet_debugfs(ibd);
+	return -ENOMEM;
+}
+
+static void fault_exit_debugfs(struct hfi1_ibdev *ibd)
+{
+	fault_exit_opcode_debugfs(ibd);
+	fault_exit_packet_debugfs(ibd);
+}
+
+static int fault_init_debugfs(struct hfi1_ibdev *ibd)
+{
+	int ret = 0;
+
+	ret = fault_init_opcode_debugfs(ibd);
+	if (ret)
+		return ret;
+
+	ret = fault_init_packet_debugfs(ibd);
+	if (ret)
+		fault_exit_opcode_debugfs(ibd);
+
+	return ret;
+}
+
+bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx)
+{
+	bool ret = false;
+	struct hfi1_ibdev *ibd = to_idev(qp->ibqp.device);
+
+	if (!ibd->fault_opcode || !ibd->fault_opcode->fault_by_opcode)
+		return false;
+	if (ibd->fault_opcode->opcode != (opcode & ibd->fault_opcode->mask))
+		return false;
+	ret = should_fail(&ibd->fault_opcode->attr, 1);
+	if (ret) {
+		trace_hfi1_fault_opcode(qp, opcode);
+		if (rx)
+			ibd->fault_opcode->n_rxfaults[opcode]++;
+		else
+			ibd->fault_opcode->n_txfaults[opcode]++;
+	}
+	return ret;
+}
+
+bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
+{
+	struct rvt_dev_info *rdi = &packet->rcd->ppd->dd->verbs_dev.rdi;
+	struct hfi1_ibdev *ibd = dev_from_rdi(rdi);
+	bool ret = false;
+
+	if (!ibd->fault_packet || !ibd->fault_packet->fault_by_packet)
+		return false;
+
+	ret = should_fail(&ibd->fault_packet->attr, 1);
+	if (ret) {
+		++ibd->fault_packet->n_faults;
+		trace_hfi1_fault_packet(packet);
+	}
+	return ret;
+}
+#endif
+
 void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 {
 	char name[sizeof("port0counters") + 1];
@@ -1112,12 +1327,19 @@ void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 					    !port_cntr_ops[i].ops.write ?
 					    S_IRUGO : S_IRUGO | S_IWUSR);
 		}
+
+#ifdef CONFIG_FAULT_INJECTION
+	fault_init_debugfs(ibd);
+#endif
 }
 
 void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd)
 {
 	if (!hfi1_dbg_root)
 		goto out;
+#ifdef CONFIG_FAULT_INJECTION
+	fault_exit_debugfs(ibd);
+#endif
 	debugfs_remove(ibd->hfi1_ibdev_link);
 	debugfs_remove_recursive(ibd->hfi1_ibdev_dbg);
 out:
diff --git a/drivers/infiniband/hw/hfi1/debugfs.h b/drivers/infiniband/hw/hfi1/debugfs.h
index b6fb681..53dfdae 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.h
+++ b/drivers/infiniband/hw/hfi1/debugfs.h
@@ -53,6 +53,41 @@
 void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd);
 void hfi1_dbg_init(void);
 void hfi1_dbg_exit(void);
+
+#ifdef CONFIG_FAULT_INJECTION
+#include <linux/fault-inject.h>
+struct fault_opcode {
+	struct fault_attr attr;
+	struct dentry *dir;
+	bool fault_by_opcode;
+	u64 n_rxfaults[256];
+	u64 n_txfaults[256];
+	u8 opcode;
+	u8 mask;
+};
+
+struct fault_packet {
+	struct fault_attr attr;
+	struct dentry *dir;
+	bool fault_by_packet;
+	u64 n_faults;
+};
+
+bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx);
+bool hfi1_dbg_fault_packet(struct hfi1_packet *packet);
+#else
+static inline bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
+{
+	return false;
+}
+
+static inline bool hfi1_dbg_fault_opcode(struct rvt_qp *qp,
+					 u32 opcode, bool rx)
+{
+	return false;
+}
+#endif
+
 #else
 static inline void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 {
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 3881c95..c0b012f 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -59,6 +59,7 @@
 #include "trace.h"
 #include "qp.h"
 #include "sdma.h"
+#include "debugfs.h"
 
 #undef pr_fmt
 #define pr_fmt(fmt) DRIVER_NAME ": " fmt
@@ -1354,6 +1355,9 @@ void handle_eflags(struct hfi1_packet *packet)
  */
 int process_receive_ib(struct hfi1_packet *packet)
 {
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
+
 	trace_hfi1_rcvhdr(packet->rcd->ppd->dd,
 			  packet->rcd->ctxt,
 			  rhf_err_flags(packet->rhf),
@@ -1409,6 +1413,8 @@ int process_receive_error(struct hfi1_packet *packet)
 
 int kdeth_process_expected(struct hfi1_packet *packet)
 {
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
 
@@ -1421,6 +1427,8 @@ int kdeth_process_eager(struct hfi1_packet *packet)
 {
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
 
 	dd_dev_err(packet->rcd->dd,
 		   "Unhandled eager packet received. Dropping.\n");
diff --git a/drivers/infiniband/hw/hfi1/trace_misc.h b/drivers/infiniband/hw/hfi1/trace_misc.h
index d308454..deac77d 100644
--- a/drivers/infiniband/hw/hfi1/trace_misc.h
+++ b/drivers/infiniband/hw/hfi1/trace_misc.h
@@ -72,6 +72,54 @@
 		      __entry->src)
 );
 
+#ifdef CONFIG_FAULT_INJECTION
+TRACE_EVENT(hfi1_fault_opcode,
+	    TP_PROTO(struct rvt_qp *qp, u8 opcode),
+	    TP_ARGS(qp, opcode),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(dd_from_ibdev(qp->ibqp.device))
+			     __field(u32, qpn)
+			     __field(u8, opcode)
+			     ),
+	    TP_fast_assign(DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
+			   __entry->qpn = qp->ibqp.qp_num;
+			   __entry->opcode = opcode;
+			   ),
+	    TP_printk("[%s] qpn 0x%x opcode 0x%x",
+		      __get_str(dev), __entry->qpn, __entry->opcode)
+);
+
+TRACE_EVENT(hfi1_fault_packet,
+	    TP_PROTO(struct hfi1_packet *packet),
+	    TP_ARGS(packet),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(packet->rcd->ppd->dd)
+			     __field(u64, eflags)
+			     __field(u32, ctxt)
+			     __field(u32, hlen)
+			     __field(u32, tlen)
+			     __field(u32, updegr)
+			     __field(u32, etail)
+			     ),
+	     TP_fast_assign(DD_DEV_ASSIGN(packet->rcd->ppd->dd);
+			    __entry->eflags = rhf_err_flags(packet->rhf);
+			    __entry->ctxt = packet->rcd->ctxt;
+			    __entry->hlen = packet->hlen;
+			    __entry->tlen = packet->tlen;
+			    __entry->updegr = packet->updegr;
+			    __entry->etail = rhf_egr_index(packet->rhf);
+			    ),
+	     TP_printk(
+		"[%s] ctxt %d eflags 0x%llx hlen %d tlen %d updegr %d etail %d",
+		__get_str(dev),
+		__entry->ctxt,
+		__entry->eflags,
+		__entry->hlen,
+		__entry->tlen,
+		__entry->updegr,
+		__entry->etail
+		)
+);
+#endif
+
 #endif /* __HFI1_TRACE_MISC_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 928918c..9f016da 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -60,6 +60,7 @@
 #include "trace.h"
 #include "qp.h"
 #include "verbs_txreq.h"
+#include "debugfs.h"
 
 static unsigned int hfi1_lkey_table_size = 16;
 module_param_named(lkey_table_size, hfi1_lkey_table_size, uint,
@@ -599,6 +600,11 @@ void hfi1_ib_rcv(struct hfi1_packet *packet)
 			rcu_read_unlock();
 			goto drop;
 		}
+		if (unlikely(hfi1_dbg_fault_opcode(packet->qp, opcode,
+						   true))) {
+			rcu_read_unlock();
+			goto drop;
+		}
 		spin_lock_irqsave(&packet->qp->r_lock, flags);
 		packet_handler = qp_ok(opcode, packet);
 		if (likely(packet_handler))
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 3a0b589..2756ec3 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -195,6 +195,10 @@ struct hfi1_ibdev {
 	struct dentry *hfi1_ibdev_dbg;
 	/* per HFI symlinks to above */
 	struct dentry *hfi1_ibdev_link;
+#ifdef CONFIG_FAULT_INJECTION
+	struct fault_opcode *fault_opcode;
+	struct fault_packet *fault_packet;
+#endif
 #endif
 };
 


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature
       [not found]     ` <20170302182610.28851.42687.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-03-03 16:15       ` Leon Romanovsky
       [not found]         ` <20170303161508.GF14379-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 30+ messages in thread
From: Leon Romanovsky @ 2017-03-03 16:15 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Thu, Mar 02, 2017 at 10:26:17AM -0800, Dennis Dalessandro wrote:
> From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> Add fault injection capability:
>   - Drop packets unconditionally (fault_by_packet)
>   - Drop packets based on opcode (fault_by_opcode)
>
> This feature reacts to the global FAULT_INJECTION
> config flag.
>
> The faulting traces have been added:
>   - misc/fault_opcode
>   - misc/fault_packet
>
> See 'Documentation/fault-injection/fault-injection.txt'
> for details.
>
> Examples:
>   - Dropping packets by opcode:
>     /sys/kernel/debug/hfi1/hfi1_X/fault_opcode
> 	# Enable fault
> 	echo Y > fault_by_opcode
> 	# Set probability of dropping (0-100%)
> 	# echo 25 > probability
> 	# Set opcode
> 	echo 0x64 > opcode
> 	# Number of times to fault
> 	echo 3 > times
> 	# An optional mask allows you to fault
> 	# a range of opcodes
> 	echo 0xf0 > mask
>     /sys/kernel/debug/hfi1/hfi1_X/fault_opcode/fault_stats
>     reports, in parentheses, the number of packets
>     dropped for each opcode.
>
>   - Dropping packets unconditionally
>     /sys/kernel/debug/hfi1/hfi1_X/fault_packet
> 	# Enable fault
> 	echo Y > fault_by_packet
>     /sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
>     contains the number of packets dropped.

It will be so nice to have this functionality extended to objects and
placed in IB/core. For example, VNIC and IPoIB will be able to use
faults by packet, while all other drivers and ULPs will use faults by
objects.

Does it make sense?

Thanks

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature
       [not found]         ` <20170303161508.GF14379-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-03-03 17:02           ` Dennis Dalessandro
  0 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-03 17:02 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 03/03/2017 11:15 AM, Leon Romanovsky wrote:
> On Thu, Mar 02, 2017 at 10:26:17AM -0800, Dennis Dalessandro wrote:
>> From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>
>> Add fault injection capability:
>>   - Drop packets unconditionally (fault_by_packet)
>>   - Drop packets based on opcode (fault_by_opcode)
>>
>> This feature reacts to the global FAULT_INJECTION
>> config flag.
>>
>> The faulting traces have been added:
>>   - misc/fault_opcode
>>   - misc/fault_packet
>>
>> See 'Documentation/fault-injection/fault-injection.txt'
>> for details.
>>
>> Examples:
>>   - Dropping packets by opcode:
>>     /sys/kernel/debug/hfi1/hfi1_X/fault_opcode
>> 	# Enable fault
>> 	echo Y > fault_by_opcode
>> 	# Set probability of dropping (0-100%)
>> 	# echo 25 > probability
>> 	# Set opcode
>> 	echo 0x64 > opcode
>> 	# Number of times to fault
>> 	echo 3 > times
>> 	# An optional mask allows you to fault
>> 	# a range of opcodes
>> 	echo 0xf0 > mask
>>     /sys/kernel/debug/hfi1/hfi1_X/fault_opcode/fault_stats
>>     reports, in parentheses, the number of packets
>>     dropped for each opcode.
>>
>>   - Dropping packets unconditionally
>>     /sys/kernel/debug/hfi1/hfi1_X/fault_packet
>> 	# Enable fault
>> 	echo Y > fault_by_packet
>>     /sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
>>     contains the number of packets dropped.
>
> It will be so nice to have this functionality extended to objects and
> placed in IB/core. For example, VNIC and IPoIB will be able to use
> faults by packet, while all other drivers and ULPs will use faults by
> objects.
>
> Does it make sense?

I agree. We also should think about a subsystem wide approach for things 
like this, and other forms of debugging.

-Denny


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature
       [not found] ` <20170321001900.28538.38175.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-03-21  0:26   ` Dennis Dalessandro
  0 siblings, 0 replies; 30+ messages in thread
From: Dennis Dalessandro @ 2017-03-21  0:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Mike Marciniszyn

From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Add fault injection capability:
  - Drop packets unconditionally (fault_by_packet)
  - Drop packets based on opcode (fault_by_opcode)

This feature reacts to the global FAULT_INJECTION
config flag.

The faulting traces have been added:
  - misc/fault_opcode
  - misc/fault_packet

See 'Documentation/fault-injection/fault-injection.txt'
for details.

Examples:
  - Dropping packets by opcode:
    /sys/kernel/debug/hfi1/hfi1_X/fault_opcode
	# Enable fault
	echo Y > fault_by_opcode
	# Set probability of dropping (0-100%)
	# echo 25 > probability
	# Set opcode
	echo 0x64 > opcode
	# Number of times to fault
	echo 3 > times
	# An optional mask allows you to fault
	# a range of opcodes
	echo 0xf0 > mask
    /sys/kernel/debug/hfi1/hfi1_X/fault_opcode/fault_stats
    reports, in parentheses, the number of packets
    dropped for each opcode.

  - Dropping packets unconditionally
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet
	# Enable fault
	echo Y > fault_by_packet
    /sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats
    contains the number of packets dropped.
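
The per-opcode accounting behind the opcode fault_stats file can be
modelled in userspace C. This is a rough sketch, assuming the row rules
of _fault_stats_seq_show() in the diff below (a row is printed only when
the opcode has seen traffic and has at least one recorded fault);
struct fault_counters, reset_counters(), record_fault() and
format_stats_row() are hypothetical names, and the packet/byte totals
are passed in rather than summed over receive contexts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NUM_OPCODES 256

/* Hypothetical mirror of the rx/tx counters kept in struct fault_opcode. */
struct fault_counters {
	uint64_t n_rxfaults[NUM_OPCODES];
	uint64_t n_txfaults[NUM_OPCODES];
};

/* Clear all counters. */
static void reset_counters(struct fault_counters *c)
{
	memset(c, 0, sizeof(*c));
}

/* Account one injected fault for an opcode on the rx or tx side. */
static void record_fault(struct fault_counters *c, uint8_t opcode, bool rx)
{
	assert(c != NULL);
	if (rx)
		c->n_rxfaults[opcode]++;
	else
		c->n_txfaults[opcode]++;
}

/*
 * Format one stats row into buf. Returns 0 when the row is skipped:
 * opcode out of range, no traffic for the opcode, or no faults recorded.
 * Otherwise returns the snprintf() result.
 */
static int format_stats_row(const struct fault_counters *c,
			    unsigned int opcode,
			    uint64_t n_packets, uint64_t n_bytes,
			    char *buf, size_t len)
{
	if (opcode >= NUM_OPCODES)
		return 0;
	if (!n_packets && !n_bytes)
		return 0;
	if (!c->n_rxfaults[opcode] && !c->n_txfaults[opcode])
		return 0;
	return snprintf(buf, len,
			"%02x %llu/%llu (faults rx:%llu faults: tx:%llu)\n",
			opcode,
			(unsigned long long)n_packets,
			(unsigned long long)n_bytes,
			(unsigned long long)c->n_rxfaults[opcode],
			(unsigned long long)c->n_txfaults[opcode]);
}
```

The row text follows the seq_printf() format string from the patch, so
the fault counts land inside the parentheses after the packet and byte
totals.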

Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/debugfs.c    |  222 +++++++++++++++++++++++++++++++
 drivers/infiniband/hw/hfi1/debugfs.h    |   51 +++++++
 drivers/infiniband/hw/hfi1/driver.c     |    8 +
 drivers/infiniband/hw/hfi1/trace_misc.h |   48 +++++++
 drivers/infiniband/hw/hfi1/verbs.c      |    6 +
 drivers/infiniband/hw/hfi1/verbs.h      |    4 +
 6 files changed, 336 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 7fe9dd8..cac6d52 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -51,8 +51,12 @@
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/string.h>
+#include <linux/types.h>
+#include <linux/ratelimit.h>
+#include <linux/fault-inject.h>
 
 #include "hfi.h"
+#include "trace.h"
 #include "debugfs.h"
 #include "device.h"
 #include "qp.h"
@@ -1063,6 +1067,217 @@ static int _sdma_cpu_list_seq_show(struct seq_file *s, void *v)
 DEBUGFS_SEQ_FILE_OPEN(sdma_cpu_list)
 DEBUGFS_FILE_OPS(sdma_cpu_list);
 
+#ifdef CONFIG_FAULT_INJECTION
+static void *_fault_stats_seq_start(struct seq_file *s, loff_t *pos)
+{
+	struct hfi1_opcode_stats_perctx *opstats;
+
+	if (*pos >= ARRAY_SIZE(opstats->stats))
+		return NULL;
+	return pos;
+}
+
+static void *_fault_stats_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	struct hfi1_opcode_stats_perctx *opstats;
+
+	++*pos;
+	if (*pos >= ARRAY_SIZE(opstats->stats))
+		return NULL;
+	return pos;
+}
+
+static void _fault_stats_seq_stop(struct seq_file *s, void *v)
+{
+}
+
+static int _fault_stats_seq_show(struct seq_file *s, void *v)
+{
+	loff_t *spos = v;
+	loff_t i = *spos, j;
+	u64 n_packets = 0, n_bytes = 0;
+	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
+	struct hfi1_devdata *dd = dd_from_dev(ibd);
+
+	for (j = 0; j < dd->first_user_ctxt; j++) {
+		if (!dd->rcd[j])
+			continue;
+		n_packets += dd->rcd[j]->opstats->stats[i].n_packets;
+		n_bytes += dd->rcd[j]->opstats->stats[i].n_bytes;
+	}
+	if (!n_packets && !n_bytes)
+		return SEQ_SKIP;
+	if (!ibd->fault_opcode->n_rxfaults[i] &&
+	    !ibd->fault_opcode->n_txfaults[i])
+		return SEQ_SKIP;
+	seq_printf(s, "%02llx %llu/%llu (faults rx:%llu faults: tx:%llu)\n", i,
+		   (unsigned long long)n_packets,
+		   (unsigned long long)n_bytes,
+		   (unsigned long long)ibd->fault_opcode->n_rxfaults[i],
+		   (unsigned long long)ibd->fault_opcode->n_txfaults[i]);
+	return 0;
+}
+
+DEBUGFS_SEQ_FILE_OPS(fault_stats);
+DEBUGFS_SEQ_FILE_OPEN(fault_stats);
+DEBUGFS_FILE_OPS(fault_stats);
+
+static void fault_exit_opcode_debugfs(struct hfi1_ibdev *ibd)
+{
+	debugfs_remove_recursive(ibd->fault_opcode->dir);
+	kfree(ibd->fault_opcode);
+	ibd->fault_opcode = NULL;
+}
+
+static int fault_init_opcode_debugfs(struct hfi1_ibdev *ibd)
+{
+	struct dentry *parent = ibd->hfi1_ibdev_dbg;
+
+	ibd->fault_opcode = kzalloc(sizeof(*ibd->fault_opcode), GFP_KERNEL);
+	if (!ibd->fault_opcode)
+		return -ENOMEM;
+
+	ibd->fault_opcode->attr.interval = 1;
+	ibd->fault_opcode->attr.require_end = ULONG_MAX;
+	ibd->fault_opcode->attr.stacktrace_depth = 32;
+	ibd->fault_opcode->attr.dname = NULL;
+	ibd->fault_opcode->attr.verbose = 0;
+	ibd->fault_opcode->fault_by_opcode = false;
+	ibd->fault_opcode->opcode = 0;
+	ibd->fault_opcode->mask = 0xff;
+
+	ibd->fault_opcode->dir =
+		fault_create_debugfs_attr("fault_opcode",
+					  parent,
+					  &ibd->fault_opcode->attr);
+	if (IS_ERR(ibd->fault_opcode->dir)) {
+		kfree(ibd->fault_opcode);
+		return -ENOENT;
+	}
+
+	DEBUGFS_SEQ_FILE_CREATE(fault_stats, ibd->fault_opcode->dir, ibd);
+	if (!debugfs_create_bool("fault_by_opcode", 0600,
+				 ibd->fault_opcode->dir,
+				 &ibd->fault_opcode->fault_by_opcode))
+		goto fail;
+	if (!debugfs_create_x8("opcode", 0600, ibd->fault_opcode->dir,
+			       &ibd->fault_opcode->opcode))
+		goto fail;
+	if (!debugfs_create_x8("mask", 0600, ibd->fault_opcode->dir,
+			       &ibd->fault_opcode->mask))
+		goto fail;
+
+	return 0;
+fail:
+	fault_exit_opcode_debugfs(ibd);
+	return -ENOMEM;
+}
+
+static void fault_exit_packet_debugfs(struct hfi1_ibdev *ibd)
+{
+	debugfs_remove_recursive(ibd->fault_packet->dir);
+	kfree(ibd->fault_packet);
+	ibd->fault_packet = NULL;
+}
+
+static int fault_init_packet_debugfs(struct hfi1_ibdev *ibd)
+{
+	struct dentry *parent = ibd->hfi1_ibdev_dbg;
+
+	ibd->fault_packet = kzalloc(sizeof(*ibd->fault_packet), GFP_KERNEL);
+	if (!ibd->fault_packet)
+		return -ENOMEM;
+
+	ibd->fault_packet->attr.interval = 1;
+	ibd->fault_packet->attr.require_end = ULONG_MAX;
+	ibd->fault_packet->attr.stacktrace_depth = 32;
+	ibd->fault_packet->attr.dname = NULL;
+	ibd->fault_packet->attr.verbose = 0;
+	ibd->fault_packet->fault_by_packet = false;
+
+	ibd->fault_packet->dir =
+		fault_create_debugfs_attr("fault_packet",
+					  parent,
+					  &ibd->fault_packet->attr);
+	if (IS_ERR(ibd->fault_packet->dir)) {
+		kfree(ibd->fault_packet);
+		return -ENOENT;
+	}
+
+	if (!debugfs_create_bool("fault_by_packet", 0600,
+				 ibd->fault_packet->dir,
+				 &ibd->fault_packet->fault_by_packet))
+		goto fail;
+	if (!debugfs_create_u64("fault_stats", 0400,
+				ibd->fault_packet->dir,
+				&ibd->fault_packet->n_faults))
+		goto fail;
+
+	return 0;
+fail:
+	fault_exit_packet_debugfs(ibd);
+	return -ENOMEM;
+}
+
+static void fault_exit_debugfs(struct hfi1_ibdev *ibd)
+{
+	fault_exit_opcode_debugfs(ibd);
+	fault_exit_packet_debugfs(ibd);
+}
+
+static int fault_init_debugfs(struct hfi1_ibdev *ibd)
+{
+	int ret = 0;
+
+	ret = fault_init_opcode_debugfs(ibd);
+	if (ret)
+		return ret;
+
+	ret = fault_init_packet_debugfs(ibd);
+	if (ret)
+		fault_exit_opcode_debugfs(ibd);
+
+	return ret;
+}
+
+bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx)
+{
+	bool ret = false;
+	struct hfi1_ibdev *ibd = to_idev(qp->ibqp.device);
+
+	if (!ibd->fault_opcode || !ibd->fault_opcode->fault_by_opcode)
+		return false;
+	if (ibd->fault_opcode->opcode != (opcode & ibd->fault_opcode->mask))
+		return false;
+	ret = should_fail(&ibd->fault_opcode->attr, 1);
+	if (ret) {
+		trace_hfi1_fault_opcode(qp, opcode);
+		if (rx)
+			ibd->fault_opcode->n_rxfaults[opcode]++;
+		else
+			ibd->fault_opcode->n_txfaults[opcode]++;
+	}
+	return ret;
+}
+
+bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
+{
+	struct rvt_dev_info *rdi = &packet->rcd->ppd->dd->verbs_dev.rdi;
+	struct hfi1_ibdev *ibd = dev_from_rdi(rdi);
+	bool ret = false;
+
+	if (!ibd->fault_packet || !ibd->fault_packet->fault_by_packet)
+		return false;
+
+	ret = should_fail(&ibd->fault_packet->attr, 1);
+	if (ret) {
+		++ibd->fault_packet->n_faults;
+		trace_hfi1_fault_packet(packet);
+	}
+	return ret;
+}
+#endif
+
 void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 {
 	char name[sizeof("port0counters") + 1];
@@ -1112,12 +1327,19 @@ void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 					    !port_cntr_ops[i].ops.write ?
 					    S_IRUGO : S_IRUGO | S_IWUSR);
 		}
+
+#ifdef CONFIG_FAULT_INJECTION
+	fault_init_debugfs(ibd);
+#endif
 }
 
 void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd)
 {
 	if (!hfi1_dbg_root)
 		goto out;
+#ifdef CONFIG_FAULT_INJECTION
+	fault_exit_debugfs(ibd);
+#endif
 	debugfs_remove(ibd->hfi1_ibdev_link);
 	debugfs_remove_recursive(ibd->hfi1_ibdev_dbg);
 out:
diff --git a/drivers/infiniband/hw/hfi1/debugfs.h b/drivers/infiniband/hw/hfi1/debugfs.h
index b6fb681..70be5ca 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.h
+++ b/drivers/infiniband/hw/hfi1/debugfs.h
@@ -53,23 +53,68 @@
 void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd);
 void hfi1_dbg_init(void);
 void hfi1_dbg_exit(void);
+
+#ifdef CONFIG_FAULT_INJECTION
+#include <linux/fault-inject.h>
+struct fault_opcode {
+	struct fault_attr attr;
+	struct dentry *dir;
+	bool fault_by_opcode;
+	u64 n_rxfaults[256];
+	u64 n_txfaults[256];
+	u8 opcode;
+	u8 mask;
+};
+
+struct fault_packet {
+	struct fault_attr attr;
+	struct dentry *dir;
+	bool fault_by_packet;
+	u64 n_faults;
+};
+
+bool hfi1_dbg_fault_opcode(struct rvt_qp *qp, u32 opcode, bool rx);
+bool hfi1_dbg_fault_packet(struct hfi1_packet *packet);
+#else
+static inline bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
+{
+	return false;
+}
+
+static inline bool hfi1_dbg_fault_opcode(struct rvt_qp *qp,
+					 u32 opcode, bool rx)
+{
+	return false;
+}
+#endif
+
 #else
 static inline void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 {
 }
 
-void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd)
+static inline void hfi1_dbg_ibdev_exit(struct hfi1_ibdev *ibd)
+{
+}
+
+static inline void hfi1_dbg_init(void)
 {
 }
 
-void hfi1_dbg_init(void)
+static inline void hfi1_dbg_exit(void)
 {
 }
 
-void hfi1_dbg_exit(void)
+static inline bool hfi1_dbg_fault_packet(struct hfi1_packet *packet)
 {
+	return false;
 }
 
+static inline bool hfi1_dbg_fault_opcode(struct rvt_qp *qp,
+					 u32 opcode, bool rx)
+{
+	return false;
+}
 #endif
 
 #endif                          /* _HFI1_DEBUGFS_H */
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 3881c95..c0b012f 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -59,6 +59,7 @@
 #include "trace.h"
 #include "qp.h"
 #include "sdma.h"
+#include "debugfs.h"
 
 #undef pr_fmt
 #define pr_fmt(fmt) DRIVER_NAME ": " fmt
@@ -1354,6 +1355,9 @@ void handle_eflags(struct hfi1_packet *packet)
  */
 int process_receive_ib(struct hfi1_packet *packet)
 {
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
+
 	trace_hfi1_rcvhdr(packet->rcd->ppd->dd,
 			  packet->rcd->ctxt,
 			  rhf_err_flags(packet->rhf),
@@ -1409,6 +1413,8 @@ int process_receive_error(struct hfi1_packet *packet)
 
 int kdeth_process_expected(struct hfi1_packet *packet)
 {
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
 
@@ -1421,6 +1427,8 @@ int kdeth_process_eager(struct hfi1_packet *packet)
 {
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
+	if (unlikely(hfi1_dbg_fault_packet(packet)))
+		return RHF_RCV_CONTINUE;
 
 	dd_dev_err(packet->rcd->dd,
 		   "Unhandled eager packet received. Dropping.\n");
diff --git a/drivers/infiniband/hw/hfi1/trace_misc.h b/drivers/infiniband/hw/hfi1/trace_misc.h
index d308454..deac77d 100644
--- a/drivers/infiniband/hw/hfi1/trace_misc.h
+++ b/drivers/infiniband/hw/hfi1/trace_misc.h
@@ -72,6 +72,54 @@
 		      __entry->src)
 );
 
+#ifdef CONFIG_FAULT_INJECTION
+TRACE_EVENT(hfi1_fault_opcode,
+	    TP_PROTO(struct rvt_qp *qp, u8 opcode),
+	    TP_ARGS(qp, opcode),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(dd_from_ibdev(qp->ibqp.device))
+			     __field(u32, qpn)
+			     __field(u8, opcode)
+			     ),
+	    TP_fast_assign(DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
+			   __entry->qpn = qp->ibqp.qp_num;
+			   __entry->opcode = opcode;
+			   ),
+	    TP_printk("[%s] qpn 0x%x opcode 0x%x",
+		      __get_str(dev), __entry->qpn, __entry->opcode)
+);
+
+TRACE_EVENT(hfi1_fault_packet,
+	    TP_PROTO(struct hfi1_packet *packet),
+	    TP_ARGS(packet),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(packet->rcd->ppd->dd)
+			     __field(u64, eflags)
+			     __field(u32, ctxt)
+			     __field(u32, hlen)
+			     __field(u32, tlen)
+			     __field(u32, updegr)
+			     __field(u32, etail)
+			     ),
+	     TP_fast_assign(DD_DEV_ASSIGN(packet->rcd->ppd->dd);
+			    __entry->eflags = rhf_err_flags(packet->rhf);
+			    __entry->ctxt = packet->rcd->ctxt;
+			    __entry->hlen = packet->hlen;
+			    __entry->tlen = packet->tlen;
+			    __entry->updegr = packet->updegr;
+			    __entry->etail = rhf_egr_index(packet->rhf);
+			    ),
+	     TP_printk(
+		"[%s] ctxt %d eflags 0x%llx hlen %d tlen %d updegr %d etail %d",
+		__get_str(dev),
+		__entry->ctxt,
+		__entry->eflags,
+		__entry->hlen,
+		__entry->tlen,
+		__entry->updegr,
+		__entry->etail
+		)
+);
+#endif
+
 #endif /* __HFI1_TRACE_MISC_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 928918c..9f016da 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -60,6 +60,7 @@
 #include "trace.h"
 #include "qp.h"
 #include "verbs_txreq.h"
+#include "debugfs.h"
 
 static unsigned int hfi1_lkey_table_size = 16;
 module_param_named(lkey_table_size, hfi1_lkey_table_size, uint,
@@ -599,6 +600,11 @@ void hfi1_ib_rcv(struct hfi1_packet *packet)
 			rcu_read_unlock();
 			goto drop;
 		}
+		if (unlikely(hfi1_dbg_fault_opcode(packet->qp, opcode,
+						   true))) {
+			rcu_read_unlock();
+			goto drop;
+		}
 		spin_lock_irqsave(&packet->qp->r_lock, flags);
 		packet_handler = qp_ok(opcode, packet);
 		if (likely(packet_handler))
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 3a0b589..2756ec3 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -195,6 +195,10 @@ struct hfi1_ibdev {
 	struct dentry *hfi1_ibdev_dbg;
 	/* per HFI symlinks to above */
 	struct dentry *hfi1_ibdev_link;
+#ifdef CONFIG_FAULT_INJECTION
+	struct fault_opcode *fault_opcode;
+	struct fault_packet *fault_packet;
+#endif
 #endif
 };
 


Thread overview: 30+ messages
2017-03-01 18:21 [PATCH 00/20] IB/hfi1, qib, rdmavt: Another round of patches for 4.11 Dennis Dalessandro
     [not found] ` <20170301181719.29989.31238.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-03-01 18:21   ` [PATCH 01/20] IB/hfi1: Force logical link down Dennis Dalessandro
2017-03-01 18:21   ` [PATCH 02/20] IB/hfi1: Race hazard avoidance in user SDMA driver Dennis Dalessandro
2017-03-01 18:21   ` [PATCH 03/20] IB/hfi1: Cache registers during state change Dennis Dalessandro
2017-03-01 18:21   ` [PATCH 04/20] IB/hfi1: NULL pointer dereference when freeing rhashtable Dennis Dalessandro
2017-03-01 18:22   ` [PATCH 05/20] IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependent Dennis Dalessandro
2017-03-01 18:22   ` [PATCH 06/20] IB/rdmavt: Add additional fields to post send trace Dennis Dalessandro
2017-03-01 18:22   ` [PATCH 07/20] IB/rdmavt: Add tracing for cq entry and poll Dennis Dalessandro
2017-03-01 18:22   ` [PATCH 08/20] IB/rdmavt: Add swqe completion trace Dennis Dalessandro
2017-03-01 18:22   ` [PATCH 09/20] IB/hfi1: Check device id early during init Dennis Dalessandro
2017-03-01 18:22   ` [PATCH 10/20] IB/hfi1: Protect the global dev_cntr_names and port_cntr_names Dennis Dalessandro
2017-03-01 18:22   ` [PATCH 11/20] IB/hfi1: Check for QSFP presence before attempting reads Dennis Dalessandro
2017-03-01 18:23   ` [PATCH 12/20] IB/hfi1: Add a patch value to the firmware version string Dennis Dalessandro
2017-03-01 18:23   ` [PATCH 13/20] IB/rdmavt,IB/hfi1: Fix timer migration regressions Dennis Dalessandro
2017-03-01 18:23   ` [PATCH 14/20] IB/rdmavt: Avoid reseting wqe send_flags in unreserve Dennis Dalessandro
2017-03-01 18:23   ` [PATCH 15/20] IB/hfi1: Ensure VL index is within bounds Dennis Dalessandro
2017-03-01 18:23   ` [PATCH 16/20] IB/hfi1: Add receive fault injection feature Dennis Dalessandro
     [not found]     ` <20170301182344.29989.12032.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-03-02 17:19       ` Dennis Dalessandro
2017-03-01 18:23   ` [PATCH 17/20] IB/hfi1: Add transmit " Dennis Dalessandro
2017-03-01 18:24   ` [PATCH 18/20] IB/hfi1: Eliminate synchronize_rcu() in mr delete Dennis Dalessandro
2017-03-01 18:24   ` [PATCH 19/20] IB/rdmavt, IB/qib, IB/hfi1: Make percpu refcount optional for user MRs Dennis Dalessandro
2017-03-01 18:24   ` [PATCH 20/20] IB/core: If the MGID/MLID pair is not on the list return an error Dennis Dalessandro
     [not found]     ` <20170301182426.29989.77369.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-03-02 14:00       ` Leon Romanovsky
     [not found]         ` <20170302140028.GE9525-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-03-02 15:01           ` Dennis Dalessandro
     [not found]             ` <4d49f7ab-5bc5-029c-4321-fd349c9dc0f8-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-03-02 15:13               ` Leon Romanovsky
2017-03-02 15:18   ` [PATCH v2 " Dennis Dalessandro
2017-03-02 18:26   ` [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature Dennis Dalessandro
     [not found]     ` <20170302182610.28851.42687.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-03-03 16:15       ` Leon Romanovsky
     [not found]         ` <20170303161508.GF14379-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-03-03 17:02           ` Dennis Dalessandro
2017-03-21  0:24 [PATCH v2 00/20] IB/hfi1, qib, rdmavt: Another round of patches for 4.11 Dennis Dalessandro
     [not found] ` <20170321001900.28538.38175.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-03-21  0:26   ` [PATCH v2 16/20] IB/hfi1: Add receive fault injection feature Dennis Dalessandro
