linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/19] for-5.3 patches for review
@ 2019-06-11 15:07 Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 01/19] xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA Chuck Lever
                   ` (18 more replies)
  0 siblings, 19 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:07 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

This is a series of fixes and architectural changes that should
improve robustness and result in better scalability of NFS/RDMA.

The fundamental observation is that the RPC work queues are BOUND,
thus rescheduling work in the Receive completion handler to one of
these work queues just forces it to run later on the same CPU. So
try to do more work right in the Receive completion handler to
reduce context switch overhead.

A secondary concern is that the average amount of wall-clock time
it takes to handle a single Receive completion caps the IOPS rate
(both per-xprt and per-NIC). In this patch series I've taken a few
steps to reduce that latency, and I'm looking into a few others.

This series can be fetched from:

 git://git.linux-nfs.org/projects/cel/cel-2.6.git

in topic branch "nfs-for-5.3".


Changes since RFC:
* Rebased on v5.2-rc4
* Clarified some patch descriptions and comments
* Addressed Anna's compiler warning in frwr_unmap_[a]sync
* Cross-ported recent xs_connect changes to xprt_rdma_connect
* Fixed several trace-event bugs

---

Chuck Lever (19):
      xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA
      xprtrdma: Fix use-after-free in rpcrdma_post_recvs
      xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req
      xprtrdma: Fix occasional transport deadlock
      xprtrdma: Remove the RPCRDMA_REQ_F_PENDING flag
      xprtrdma: Remove fr_state
      xprtrdma: Add mechanism to place MRs back on the free list
      xprtrdma: Reduce context switching due to Local Invalidation
      xprtrdma: Wake RPCs directly in rpcrdma_wc_send path
      xprtrdma: Simplify rpcrdma_rep_create
      xprtrdma: Streamline rpcrdma_post_recvs
      xprtrdma: Refactor chunk encoding
      xprtrdma: Remove rpcrdma_req::rl_buffer
      xprtrdma: Modernize ops->connect
      NFS4: Add a trace event to record invalid CB sequence IDs
      NFS: Fix show_nfs_errors macros again
      NFS: Display symbolic status code names in trace log
      NFS: Update symbolic flags displayed by trace events
      NFS: Record task, client ID, and XID in xdr_status trace points


 fs/nfs/callback_proc.c          |   28 ++-
 fs/nfs/nfs2xdr.c                |    2 
 fs/nfs/nfs3xdr.c                |    2 
 fs/nfs/nfs4trace.h              |  165 +++++++++++++-------
 fs/nfs/nfs4xdr.c                |    2 
 fs/nfs/nfstrace.h               |  181 ++++++++++++++++------
 include/linux/sunrpc/xprt.h     |    3 
 include/trace/events/rpcrdma.h  |   90 ++++++++---
 net/sunrpc/sched.c              |    1 
 net/sunrpc/xprt.c               |   32 ++++
 net/sunrpc/xprtrdma/frwr_ops.c  |  327 ++++++++++++++++++++++++++-------------
 net/sunrpc/xprtrdma/rpc_rdma.c  |  148 ++++++++----------
 net/sunrpc/xprtrdma/transport.c |   83 ++++++++--
 net/sunrpc/xprtrdma/verbs.c     |  115 ++++++--------
 net/sunrpc/xprtrdma/xprt_rdma.h |   44 +----
 net/sunrpc/xprtsock.c           |   23 ---
 16 files changed, 776 insertions(+), 470 deletions(-)

--
Chuck Lever

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 01/19] xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 02/19] xprtrdma: Fix use-after-free in rpcrdma_post_recvs Chuck Lever
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

A backchannel reply does not set task->tk_client.

Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/trace/events/rpcrdma.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index df9851cb..f0678e3 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -559,7 +559,8 @@
 		const struct rpc_rqst *rqst = &req->rl_slot;
 
 		__entry->task_id = rqst->rq_task->tk_pid;
-		__entry->client_id = rqst->rq_task->tk_client->cl_clid;
+		__entry->client_id = rqst->rq_task->tk_client ?
+				     rqst->rq_task->tk_client->cl_clid : -1;
 		__entry->req = req;
 		__entry->num_sge = req->rl_sendctx->sc_wr.num_sge;
 		__entry->signaled = req->rl_sendctx->sc_wr.send_flags &


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 02/19] xprtrdma: Fix use-after-free in rpcrdma_post_recvs
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 01/19] xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 03/19] xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req Chuck Lever
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Dereference wr->next /before/ the memory backing wr has been
released. This issue was found by code inspection. It is not
expected to be a significant problem because it is in an error
path that is almost never executed.

Fixes: 7c8d9e7c8863 ("xprtrdma: Move Receive posting to ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 84bb379..e71315e 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1553,10 +1553,11 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb)
 	rc = ib_post_recv(r_xprt->rx_ia.ri_id->qp, wr,
 			  (const struct ib_recv_wr **)&bad_wr);
 	if (rc) {
-		for (wr = bad_wr; wr; wr = wr->next) {
+		for (wr = bad_wr; wr;) {
 			struct rpcrdma_rep *rep;
 
 			rep = container_of(wr, struct rpcrdma_rep, rr_recv_wr);
+			wr = wr->next;
 			rpcrdma_recv_buffer_put(rep);
 			--count;
 		}


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 03/19] xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 01/19] xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 02/19] xprtrdma: Fix use-after-free in rpcrdma_post_recvs Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 04/19] xprtrdma: Fix occasional transport deadlock Chuck Lever
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

This is a latent bug. xdr_stream_pos works by subtracting
xdr_stream::nwords from xdr_buf::len. But xdr_stream::nwords is not
initialized by xdr_init_encode().

It works today only because all fields in rpcrdma_req::rl_stream
are initialized to zero by rpcrdma_req_create, making the
subtraction in xdr_stream_pos always a no-op.

I found this issue via code inspection. It was introduced by commit
39f4cd9e9982 ("xprtrdma: Harden chunk list encoding against send
buffer overflow"), but the code has changed enough since then that
this fix can't be automatically applied to stable.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/trace/events/rpcrdma.h |    9 +++++----
 net/sunrpc/xprtrdma/rpc_rdma.c |    6 +++---
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index f0678e3..59492a93 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -470,13 +470,12 @@
 
 TRACE_EVENT(xprtrdma_marshal,
 	TP_PROTO(
-		const struct rpc_rqst *rqst,
-		unsigned int hdrlen,
+		const struct rpcrdma_req *req,
 		unsigned int rtype,
 		unsigned int wtype
 	),
 
-	TP_ARGS(rqst, hdrlen, rtype, wtype),
+	TP_ARGS(req, rtype, wtype),
 
 	TP_STRUCT__entry(
 		__field(unsigned int, task_id)
@@ -491,10 +490,12 @@
 	),
 
 	TP_fast_assign(
+		const struct rpc_rqst *rqst = &req->rl_slot;
+
 		__entry->task_id = rqst->rq_task->tk_pid;
 		__entry->client_id = rqst->rq_task->tk_client->cl_clid;
 		__entry->xid = be32_to_cpu(rqst->rq_xid);
-		__entry->hdrlen = hdrlen;
+		__entry->hdrlen = req->rl_hdrbuf.len;
 		__entry->headlen = rqst->rq_snd_buf.head[0].iov_len;
 		__entry->pagelen = rqst->rq_snd_buf.page_len;
 		__entry->taillen = rqst->rq_snd_buf.tail[0].iov_len;
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 85115a2..97bfb80 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -867,12 +867,12 @@ static bool rpcrdma_prepare_msg_sges(struct rpcrdma_xprt *r_xprt,
 	if (ret)
 		goto out_err;
 
-	trace_xprtrdma_marshal(rqst, xdr_stream_pos(xdr), rtype, wtype);
-
-	ret = rpcrdma_prepare_send_sges(r_xprt, req, xdr_stream_pos(xdr),
+	ret = rpcrdma_prepare_send_sges(r_xprt, req, req->rl_hdrbuf.len,
 					&rqst->rq_snd_buf, rtype);
 	if (ret)
 		goto out_err;
+
+	trace_xprtrdma_marshal(req, rtype, wtype);
 	return 0;
 
 out_err:


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 04/19] xprtrdma: Fix occasional transport deadlock
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (2 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 03/19] xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 05/19] xprtrdma: Remove the RPCRDMA_REQ_F_PENDING flag Chuck Lever
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Under high I/O workloads, I've noticed that an RPC/RDMA transport
occasionally deadlocks (IOPS goes to zero, and doesn't recover).
Diagnosis shows that the sendctx queue is empty, but when sendctxs
are returned to the queue, the xprt_write_space wake-up never
occurs. The wake-up logic in rpcrdma_sendctx_put_locked is racy.

I noticed that both EMPTY_SCQ and XPRT_WRITE_SPACE are implemented
via an atomic bit. Just one of those is sufficient. Removing
EMPTY_SCQ in favor of the generic bit mechanism makes the deadlock
un-reproducible.

Without EMPTY_SCQ, rpcrdma_buffer::rb_flags is no longer used and
is therefore removed.

Unfortunately this patch does not apply cleanly to stable. If
needed, someone will have to port it and test it.

Fixes: 2fad659209d5 ("xprtrdma: Wait on empty sendctx queue")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/trace/events/rpcrdma.h  |   27 +++++++++++++++++++++++++++
 net/sunrpc/xprtrdma/frwr_ops.c  |    6 +++++-
 net/sunrpc/xprtrdma/rpc_rdma.c  |   26 ++++++++++++--------------
 net/sunrpc/xprtrdma/verbs.c     |   11 +++--------
 net/sunrpc/xprtrdma/xprt_rdma.h |    6 ------
 5 files changed, 47 insertions(+), 29 deletions(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 59492a93..2fb4151 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -539,6 +539,33 @@
 	)
 );
 
+TRACE_EVENT(xprtrdma_prepsend_failed,
+	TP_PROTO(const struct rpc_rqst *rqst,
+		 int ret
+	),
+
+	TP_ARGS(rqst, ret),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, task_id)
+		__field(unsigned int, client_id)
+		__field(u32, xid)
+		__field(int, ret)
+	),
+
+	TP_fast_assign(
+		__entry->task_id = rqst->rq_task->tk_pid;
+		__entry->client_id = rqst->rq_task->tk_client->cl_clid;
+		__entry->xid = be32_to_cpu(rqst->rq_xid);
+		__entry->ret = ret;
+	),
+
+	TP_printk("task:%u@%u xid=0x%08x: ret=%d",
+		__entry->task_id, __entry->client_id, __entry->xid,
+		__entry->ret
+	)
+);
+
 TRACE_EVENT(xprtrdma_post_send,
 	TP_PROTO(
 		const struct rpcrdma_req *req,
diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 794ba4c..ac47314 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -391,7 +391,7 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 			rpcrdma_mr_recycle(mr);
 		mr = rpcrdma_mr_get(r_xprt);
 		if (!mr)
-			return ERR_PTR(-EAGAIN);
+			goto out_getmr_err;
 	} while (mr->frwr.fr_state != FRWR_IS_INVALID);
 	frwr = &mr->frwr;
 	frwr->fr_state = FRWR_IS_VALID;
@@ -448,6 +448,10 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 	*out = mr;
 	return seg;
 
+out_getmr_err:
+	xprt_wait_for_buffer_space(&r_xprt->rx_xprt);
+	return ERR_PTR(-EAGAIN);
+
 out_dmamap_err:
 	mr->mr_dir = DMA_NONE;
 	trace_xprtrdma_frwr_sgerr(mr, i);
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 97bfb80..59b214b 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -699,22 +699,28 @@ static bool rpcrdma_prepare_msg_sges(struct rpcrdma_xprt *r_xprt,
 			  struct rpcrdma_req *req, u32 hdrlen,
 			  struct xdr_buf *xdr, enum rpcrdma_chunktype rtype)
 {
+	int ret;
+
+	ret = -EAGAIN;
 	req->rl_sendctx = rpcrdma_sendctx_get_locked(r_xprt);
 	if (!req->rl_sendctx)
-		return -EAGAIN;
+		goto err;
 	req->rl_sendctx->sc_wr.num_sge = 0;
 	req->rl_sendctx->sc_unmap_count = 0;
 	req->rl_sendctx->sc_req = req;
 	__clear_bit(RPCRDMA_REQ_F_TX_RESOURCES, &req->rl_flags);
 
+	ret = -EIO;
 	if (!rpcrdma_prepare_hdr_sge(r_xprt, req, hdrlen))
-		return -EIO;
-
+		goto err;
 	if (rtype != rpcrdma_areadch)
 		if (!rpcrdma_prepare_msg_sges(r_xprt, req, xdr, rtype))
-			return -EIO;
-
+			goto err;
 	return 0;
+
+err:
+	trace_xprtrdma_prepsend_failed(&req->rl_slot, ret);
+	return ret;
 }
 
 /**
@@ -877,15 +883,7 @@ static bool rpcrdma_prepare_msg_sges(struct rpcrdma_xprt *r_xprt,
 
 out_err:
 	trace_xprtrdma_marshal_failed(rqst, ret);
-	switch (ret) {
-	case -EAGAIN:
-		xprt_wait_for_buffer_space(rqst->rq_xprt);
-		break;
-	case -ENOBUFS:
-		break;
-	default:
-		r_xprt->rx_stats.failed_marshal_count++;
-	}
+	r_xprt->rx_stats.failed_marshal_count++;
 	return ret;
 }
 
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index e71315e..0be5a36 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -901,7 +901,7 @@ struct rpcrdma_sendctx *rpcrdma_sendctx_get_locked(struct rpcrdma_xprt *r_xprt)
 	 * completions recently. This is a sign the Send Queue is
 	 * backing up. Cause the caller to pause and try again.
 	 */
-	set_bit(RPCRDMA_BUF_F_EMPTY_SCQ, &buf->rb_flags);
+	xprt_wait_for_buffer_space(&r_xprt->rx_xprt);
 	r_xprt->rx_stats.empty_sendctx_q++;
 	return NULL;
 }
@@ -936,10 +936,7 @@ struct rpcrdma_sendctx *rpcrdma_sendctx_get_locked(struct rpcrdma_xprt *r_xprt)
 	/* Paired with READ_ONCE */
 	smp_store_release(&buf->rb_sc_tail, next_tail);
 
-	if (test_and_clear_bit(RPCRDMA_BUF_F_EMPTY_SCQ, &buf->rb_flags)) {
-		smp_mb__after_atomic();
-		xprt_write_space(&sc->sc_xprt->rx_xprt);
-	}
+	xprt_write_space(&sc->sc_xprt->rx_xprt);
 }
 
 static void
@@ -977,8 +974,6 @@ struct rpcrdma_sendctx *rpcrdma_sendctx_get_locked(struct rpcrdma_xprt *r_xprt)
 	r_xprt->rx_stats.mrs_allocated += count;
 	spin_unlock(&buf->rb_mrlock);
 	trace_xprtrdma_createmrs(r_xprt, count);
-
-	xprt_write_space(&r_xprt->rx_xprt);
 }
 
 static void
@@ -990,6 +985,7 @@ struct rpcrdma_sendctx *rpcrdma_sendctx_get_locked(struct rpcrdma_xprt *r_xprt)
 						   rx_buf);
 
 	rpcrdma_mrs_create(r_xprt);
+	xprt_write_space(&r_xprt->rx_xprt);
 }
 
 /**
@@ -1089,7 +1085,6 @@ int rpcrdma_buffer_create(struct rpcrdma_xprt *r_xprt)
 	struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
 	int i, rc;
 
-	buf->rb_flags = 0;
 	buf->rb_max_requests = r_xprt->rx_ep.rep_max_requests;
 	buf->rb_bc_srv_max_requests = 0;
 	spin_lock_init(&buf->rb_mrlock);
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index d1e0749..2c6c5d8 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -391,7 +391,6 @@ struct rpcrdma_buffer {
 	struct list_head	rb_recv_bufs;
 	struct list_head	rb_allreqs;
 
-	unsigned long		rb_flags;
 	u32			rb_max_requests;
 	u32			rb_credits;	/* most recent credit grant */
 
@@ -402,11 +401,6 @@ struct rpcrdma_buffer {
 	struct delayed_work	rb_refresh_worker;
 };
 
-/* rb_flags */
-enum {
-	RPCRDMA_BUF_F_EMPTY_SCQ = 0,
-};
-
 /*
  * Statistics for RPCRDMA
  */


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 05/19] xprtrdma: Remove the RPCRDMA_REQ_F_PENDING flag
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (3 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 04/19] xprtrdma: Fix occasional transport deadlock Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 06/19] xprtrdma: Remove fr_state Chuck Lever
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Commit 9590d083c1bb ("xprtrdma: Use xprt_pin_rqst in
rpcrdma_reply_handler") pins incoming RPC/RDMA replies so they
can be left in the pending requests queue while they are being
processed without introducing a race between ->buf_free and the
transport's reply handler. Therefore RPCRDMA_REQ_F_PENDING is no
longer necessary.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/rpc_rdma.c  |    1 -
 net/sunrpc/xprtrdma/transport.c |    4 +---
 net/sunrpc/xprtrdma/xprt_rdma.h |    1 -
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 59b214b..fbc0a9f 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1371,7 +1371,6 @@ void rpcrdma_reply_handler(struct rpcrdma_rep *rep)
 	}
 	req->rl_reply = rep;
 	rep->rr_rqst = rqst;
-	clear_bit(RPCRDMA_REQ_F_PENDING, &req->rl_flags);
 
 	trace_xprtrdma_reply(rqst->rq_task, rep, req, credits);
 	queue_work(buf->rb_completion_wq, &rep->rr_work);
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 1f73a6a..f84375d 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -618,8 +618,7 @@ static bool rpcrdma_check_regbuf(struct rpcrdma_xprt *r_xprt,
 	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(rqst->rq_xprt);
 	struct rpcrdma_req *req = rpcr_to_rdmar(rqst);
 
-	if (test_bit(RPCRDMA_REQ_F_PENDING, &req->rl_flags))
-		rpcrdma_release_rqst(r_xprt, req);
+	rpcrdma_release_rqst(r_xprt, req);
 	trace_xprtrdma_op_free(task, req);
 }
 
@@ -667,7 +666,6 @@ static bool rpcrdma_check_regbuf(struct rpcrdma_xprt *r_xprt,
 		goto drop_connection;
 	rqst->rq_xtime = ktime_get();
 
-	__set_bit(RPCRDMA_REQ_F_PENDING, &req->rl_flags);
 	if (rpcrdma_ep_post(&r_xprt->rx_ia, &r_xprt->rx_ep, req))
 		goto drop_connection;
 
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 2c6c5d8..3c39aa3 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -344,7 +344,6 @@ struct rpcrdma_req {
 
 /* rl_flags */
 enum {
-	RPCRDMA_REQ_F_PENDING = 0,
 	RPCRDMA_REQ_F_TX_RESOURCES,
 };
 


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 06/19] xprtrdma: Remove fr_state
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (4 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 05/19] xprtrdma: Remove the RPCRDMA_REQ_F_PENDING flag Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 07/19] xprtrdma: Add mechanism to place MRs back on the free list Chuck Lever
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Now that both the Send and Receive completions are handled in
process context, it is safe to DMA unmap and return MRs to the
free or recycle lists directly in the completion handlers.

Doing this means rpcrdma_frwr no longer needs to track the state of
each MR, meaning that a VALID or FLUSHED MR can no longer appear on
an xprt's MR free list. Thus there is no longer a need to track the
MR's registration state in rpcrdma_frwr.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/trace/events/rpcrdma.h  |   19 ----
 net/sunrpc/xprtrdma/frwr_ops.c  |  204 ++++++++++++++++++---------------------
 net/sunrpc/xprtrdma/rpc_rdma.c  |    2 
 net/sunrpc/xprtrdma/xprt_rdma.h |   11 --
 4 files changed, 96 insertions(+), 140 deletions(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 2fb4151..1d25470 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -181,18 +181,6 @@
 				),					\
 				TP_ARGS(task, mr, nsegs))
 
-TRACE_DEFINE_ENUM(FRWR_IS_INVALID);
-TRACE_DEFINE_ENUM(FRWR_IS_VALID);
-TRACE_DEFINE_ENUM(FRWR_FLUSHED_FR);
-TRACE_DEFINE_ENUM(FRWR_FLUSHED_LI);
-
-#define xprtrdma_show_frwr_state(x)					\
-		__print_symbolic(x,					\
-				{ FRWR_IS_INVALID, "INVALID" },		\
-				{ FRWR_IS_VALID, "VALID" },		\
-				{ FRWR_FLUSHED_FR, "FLUSHED_FR" },	\
-				{ FRWR_FLUSHED_LI, "FLUSHED_LI" })
-
 DECLARE_EVENT_CLASS(xprtrdma_frwr_done,
 	TP_PROTO(
 		const struct ib_wc *wc,
@@ -203,22 +191,19 @@
 
 	TP_STRUCT__entry(
 		__field(const void *, mr)
-		__field(unsigned int, state)
 		__field(unsigned int, status)
 		__field(unsigned int, vendor_err)
 	),
 
 	TP_fast_assign(
 		__entry->mr = container_of(frwr, struct rpcrdma_mr, frwr);
-		__entry->state = frwr->fr_state;
 		__entry->status = wc->status;
 		__entry->vendor_err = __entry->status ? wc->vendor_err : 0;
 	),
 
 	TP_printk(
-		"mr=%p state=%s: %s (%u/0x%x)",
-		__entry->mr, xprtrdma_show_frwr_state(__entry->state),
-		rdma_show_wc_status(__entry->status),
+		"mr=%p: %s (%u/0x%x)",
+		__entry->mr, rdma_show_wc_status(__entry->status),
 		__entry->status, __entry->vendor_err
 	)
 );
diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index ac47314..dae2caa 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -168,7 +168,6 @@ int frwr_init_mr(struct rpcrdma_ia *ia, struct rpcrdma_mr *mr)
 		goto out_list_err;
 
 	mr->frwr.fr_mr = frmr;
-	mr->frwr.fr_state = FRWR_IS_INVALID;
 	mr->mr_dir = DMA_NONE;
 	INIT_LIST_HEAD(&mr->mr_list);
 	INIT_WORK(&mr->mr_recycle, frwr_mr_recycle_worker);
@@ -298,65 +297,6 @@ size_t frwr_maxpages(struct rpcrdma_xprt *r_xprt)
 }
 
 /**
- * frwr_wc_fastreg - Invoked by RDMA provider for a flushed FastReg WC
- * @cq:	completion queue (ignored)
- * @wc:	completed WR
- *
- */
-static void
-frwr_wc_fastreg(struct ib_cq *cq, struct ib_wc *wc)
-{
-	struct ib_cqe *cqe = wc->wr_cqe;
-	struct rpcrdma_frwr *frwr =
-			container_of(cqe, struct rpcrdma_frwr, fr_cqe);
-
-	/* WARNING: Only wr_cqe and status are reliable at this point */
-	if (wc->status != IB_WC_SUCCESS)
-		frwr->fr_state = FRWR_FLUSHED_FR;
-	trace_xprtrdma_wc_fastreg(wc, frwr);
-}
-
-/**
- * frwr_wc_localinv - Invoked by RDMA provider for a flushed LocalInv WC
- * @cq:	completion queue (ignored)
- * @wc:	completed WR
- *
- */
-static void
-frwr_wc_localinv(struct ib_cq *cq, struct ib_wc *wc)
-{
-	struct ib_cqe *cqe = wc->wr_cqe;
-	struct rpcrdma_frwr *frwr = container_of(cqe, struct rpcrdma_frwr,
-						 fr_cqe);
-
-	/* WARNING: Only wr_cqe and status are reliable at this point */
-	if (wc->status != IB_WC_SUCCESS)
-		frwr->fr_state = FRWR_FLUSHED_LI;
-	trace_xprtrdma_wc_li(wc, frwr);
-}
-
-/**
- * frwr_wc_localinv_wake - Invoked by RDMA provider for a signaled LocalInv WC
- * @cq:	completion queue (ignored)
- * @wc:	completed WR
- *
- * Awaken anyone waiting for an MR to finish being fenced.
- */
-static void
-frwr_wc_localinv_wake(struct ib_cq *cq, struct ib_wc *wc)
-{
-	struct ib_cqe *cqe = wc->wr_cqe;
-	struct rpcrdma_frwr *frwr = container_of(cqe, struct rpcrdma_frwr,
-						 fr_cqe);
-
-	/* WARNING: Only wr_cqe and status are reliable at this point */
-	if (wc->status != IB_WC_SUCCESS)
-		frwr->fr_state = FRWR_FLUSHED_LI;
-	trace_xprtrdma_wc_li_wake(wc, frwr);
-	complete(&frwr->fr_linv_done);
-}
-
-/**
  * frwr_map - Register a memory region
  * @r_xprt: controlling transport
  * @seg: memory region co-ordinates
@@ -378,23 +318,15 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 {
 	struct rpcrdma_ia *ia = &r_xprt->rx_ia;
 	bool holes_ok = ia->ri_mrtype == IB_MR_TYPE_SG_GAPS;
-	struct rpcrdma_frwr *frwr;
 	struct rpcrdma_mr *mr;
 	struct ib_mr *ibmr;
 	struct ib_reg_wr *reg_wr;
 	int i, n;
 	u8 key;
 
-	mr = NULL;
-	do {
-		if (mr)
-			rpcrdma_mr_recycle(mr);
-		mr = rpcrdma_mr_get(r_xprt);
-		if (!mr)
-			goto out_getmr_err;
-	} while (mr->frwr.fr_state != FRWR_IS_INVALID);
-	frwr = &mr->frwr;
-	frwr->fr_state = FRWR_IS_VALID;
+	mr = rpcrdma_mr_get(r_xprt);
+	if (!mr)
+		goto out_getmr_err;
 
 	if (nsegs > ia->ri_max_frwr_depth)
 		nsegs = ia->ri_max_frwr_depth;
@@ -423,7 +355,7 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 	if (!mr->mr_nents)
 		goto out_dmamap_err;
 
-	ibmr = frwr->fr_mr;
+	ibmr = mr->frwr.fr_mr;
 	n = ib_map_mr_sg(ibmr, mr->mr_sg, mr->mr_nents, NULL, PAGE_SIZE);
 	if (unlikely(n != mr->mr_nents))
 		goto out_mapmr_err;
@@ -433,7 +365,7 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 	key = (u8)(ibmr->rkey & 0x000000FF);
 	ib_update_fast_reg_key(ibmr, ++key);
 
-	reg_wr = &frwr->fr_regwr;
+	reg_wr = &mr->frwr.fr_regwr;
 	reg_wr->mr = ibmr;
 	reg_wr->key = ibmr->rkey;
 	reg_wr->access = writing ?
@@ -465,6 +397,23 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 }
 
 /**
+ * frwr_wc_fastreg - Invoked by RDMA provider for a flushed FastReg WC
+ * @cq:	completion queue (ignored)
+ * @wc:	completed WR
+ *
+ */
+static void frwr_wc_fastreg(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct ib_cqe *cqe = wc->wr_cqe;
+	struct rpcrdma_frwr *frwr =
+		container_of(cqe, struct rpcrdma_frwr, fr_cqe);
+
+	/* WARNING: Only wr_cqe and status are reliable at this point */
+	trace_xprtrdma_wc_fastreg(wc, frwr);
+	/* The MR will get recycled when the associated req is retransmitted */
+}
+
+/**
  * frwr_send - post Send WR containing the RPC Call message
  * @ia: interface adapter
  * @req: Prepared RPC Call
@@ -516,31 +465,72 @@ void frwr_reminv(struct rpcrdma_rep *rep, struct list_head *mrs)
 		if (mr->mr_handle == rep->rr_inv_rkey) {
 			list_del_init(&mr->mr_list);
 			trace_xprtrdma_mr_remoteinv(mr);
-			mr->frwr.fr_state = FRWR_IS_INVALID;
 			rpcrdma_mr_unmap_and_put(mr);
 			break;	/* only one invalidated MR per RPC */
 		}
 }
 
+static void __frwr_release_mr(struct ib_wc *wc, struct rpcrdma_mr *mr)
+{
+	if (wc->status != IB_WC_SUCCESS)
+		rpcrdma_mr_recycle(mr);
+	else
+		rpcrdma_mr_unmap_and_put(mr);
+}
+
 /**
- * frwr_unmap_sync - invalidate memory regions that were registered for @req
- * @r_xprt: controlling transport
- * @mrs: list of MRs to process
+ * frwr_wc_localinv - Invoked by RDMA provider for a LOCAL_INV WC
+ * @cq:	completion queue (ignored)
+ * @wc:	completed WR
  *
- * Sleeps until it is safe for the host CPU to access the
- * previously mapped memory regions.
+ */
+static void frwr_wc_localinv(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct ib_cqe *cqe = wc->wr_cqe;
+	struct rpcrdma_frwr *frwr =
+		container_of(cqe, struct rpcrdma_frwr, fr_cqe);
+	struct rpcrdma_mr *mr = container_of(frwr, struct rpcrdma_mr, frwr);
+
+	/* WARNING: Only wr_cqe and status are reliable at this point */
+	__frwr_release_mr(wc, mr);
+	trace_xprtrdma_wc_li(wc, frwr);
+}
+
+/**
+ * frwr_wc_localinv_wake - Invoked by RDMA provider for a LOCAL_INV WC
+ * @cq:	completion queue (ignored)
+ * @wc:	completed WR
  *
- * Caller ensures that @mrs is not empty before the call. This
- * function empties the list.
+ * Awaken anyone waiting for an MR to finish being fenced.
  */
-void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct list_head *mrs)
+static void frwr_wc_localinv_wake(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct ib_cqe *cqe = wc->wr_cqe;
+	struct rpcrdma_frwr *frwr =
+		container_of(cqe, struct rpcrdma_frwr, fr_cqe);
+	struct rpcrdma_mr *mr = container_of(frwr, struct rpcrdma_mr, frwr);
+
+	/* WARNING: Only wr_cqe and status are reliable at this point */
+	__frwr_release_mr(wc, mr);
+	trace_xprtrdma_wc_li_wake(wc, frwr);
+	complete(&frwr->fr_linv_done);
+}
+
+/**
+ * frwr_unmap_sync - invalidate memory regions that were registered for @req
+ * @r_xprt: controlling transport instance
+ * @req: rpcrdma_req with a non-empty list of MRs to process
+ *
+ * Sleeps until it is safe for the host CPU to access the previously mapped
+ * memory regions.
+ */
+void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 {
 	struct ib_send_wr *first, **prev, *last;
 	const struct ib_send_wr *bad_wr;
-	struct rpcrdma_ia *ia = &r_xprt->rx_ia;
 	struct rpcrdma_frwr *frwr;
 	struct rpcrdma_mr *mr;
-	int count, rc;
+	int rc;
 
 	/* ORDER: Invalidate all of the MRs first
 	 *
@@ -548,33 +538,32 @@ void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct list_head *mrs)
 	 * a single ib_post_send() call.
 	 */
 	frwr = NULL;
-	count = 0;
 	prev = &first;
-	list_for_each_entry(mr, mrs, mr_list) {
-		mr->frwr.fr_state = FRWR_IS_INVALID;
+	while (!list_empty(&req->rl_registered)) {
+		mr = rpcrdma_mr_pop(&req->rl_registered);
 
-		frwr = &mr->frwr;
 		trace_xprtrdma_mr_localinv(mr);
+		r_xprt->rx_stats.local_inv_needed++;
 
+		frwr = &mr->frwr;
 		frwr->fr_cqe.done = frwr_wc_localinv;
 		last = &frwr->fr_invwr;
-		memset(last, 0, sizeof(*last));
+		last->next = NULL;
 		last->wr_cqe = &frwr->fr_cqe;
+		last->sg_list = NULL;
+		last->num_sge = 0;
 		last->opcode = IB_WR_LOCAL_INV;
+		last->send_flags = IB_SEND_SIGNALED;
 		last->ex.invalidate_rkey = mr->mr_handle;
-		count++;
 
 		*prev = last;
 		prev = &last->next;
 	}
-	if (!frwr)
-		goto unmap;
 
 	/* Strong send queue ordering guarantees that when the
 	 * last WR in the chain completes, all WRs in the chain
 	 * are complete.
 	 */
-	last->send_flags = IB_SEND_SIGNALED;
 	frwr->fr_cqe.done = frwr_wc_localinv_wake;
 	reinit_completion(&frwr->fr_linv_done);
 
@@ -582,29 +571,20 @@ void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct list_head *mrs)
 	 * replaces the QP. The RPC reply handler won't call us
 	 * unless ri_id->qp is a valid pointer.
 	 */
-	r_xprt->rx_stats.local_inv_needed++;
 	bad_wr = NULL;
-	rc = ib_post_send(ia->ri_id->qp, first, &bad_wr);
-	if (bad_wr != first)
-		wait_for_completion(&frwr->fr_linv_done);
-	if (rc)
-		goto out_release;
+	rc = ib_post_send(r_xprt->rx_ia.ri_id->qp, first, &bad_wr);
+	trace_xprtrdma_post_send(req, rc);
 
-	/* ORDER: Now DMA unmap all of the MRs, and return
-	 * them to the free MR list.
+	/* The final LOCAL_INV WR in the chain is supposed to
+	 * do the wake. If it was never posted, the wake will
+	 * not happen, so don't wait in that case.
 	 */
-unmap:
-	while (!list_empty(mrs)) {
-		mr = rpcrdma_mr_pop(mrs);
-		rpcrdma_mr_unmap_and_put(mr);
-	}
-	return;
-
-out_release:
-	pr_err("rpcrdma: FRWR invalidate ib_post_send returned %i\n", rc);
+	if (bad_wr != first)
+		wait_for_completion(&frwr->fr_linv_done);
+	if (!rc)
+		return;
 
-	/* Unmap and release the MRs in the LOCAL_INV WRs that did not
-	 * get posted.
+	/* Recycle MRs in the LOCAL_INV chain that did not get posted.
 	 */
 	while (bad_wr) {
 		frwr = container_of(bad_wr, struct rpcrdma_frwr,
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index fbc0a9f..f23450b 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1277,7 +1277,7 @@ void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 	 * RPC has relinquished all its Send Queue entries.
 	 */
 	if (!list_empty(&req->rl_registered))
-		frwr_unmap_sync(r_xprt, &req->rl_registered);
+		frwr_unmap_sync(r_xprt, req);
 
 	/* Ensure that any DMA mapped pages associated with
 	 * the Send of the RPC Call have been unmapped before
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 3c39aa3..a9de116 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -240,17 +240,9 @@ struct rpcrdma_sendctx {
  * An external memory region is any buffer or page that is registered
  * on the fly (ie, not pre-registered).
  */
-enum rpcrdma_frwr_state {
-	FRWR_IS_INVALID,	/* ready to be used */
-	FRWR_IS_VALID,		/* in use */
-	FRWR_FLUSHED_FR,	/* flushed FASTREG WR */
-	FRWR_FLUSHED_LI,	/* flushed LOCALINV WR */
-};
-
 struct rpcrdma_frwr {
 	struct ib_mr			*fr_mr;
 	struct ib_cqe			fr_cqe;
-	enum rpcrdma_frwr_state		fr_state;
 	struct completion		fr_linv_done;
 	union {
 		struct ib_reg_wr	fr_regwr;
@@ -567,8 +559,7 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 				struct rpcrdma_mr **mr);
 int frwr_send(struct rpcrdma_ia *ia, struct rpcrdma_req *req);
 void frwr_reminv(struct rpcrdma_rep *rep, struct list_head *mrs);
-void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt,
-		     struct list_head *mrs);
+void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req);
 
 /*
  * RPC/RDMA protocol calls - xprtrdma/rpc_rdma.c


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 07/19] xprtrdma: Add mechanism to place MRs back on the free list
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (5 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 06/19] xprtrdma: Remove fr_state Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 08/19] xprtrdma: Reduce context switching due to Local Invalidation Chuck Lever
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

When a marshal operation fails, any MRs that were already set up for
that request are recycled. Recycling releases MRs and creates new
ones, which is expensive.

Since commit f2877623082b ("xprtrdma: Chain Send to FastReg WRs")
was merged, recycling FRWRs is unnecessary. This is because before
that commit, frwr_map had already posted FAST_REG Work Requests,
so ownership of the MRs had already been passed to the NIC and thus
dealing with them had to be delayed until they completed.

Since that commit, however, FAST_REG WRs are posted at the same time
as the Send WR. This means that if marshaling fails, we are certain
the MRs are safe to simply unmap and place back on the free list
because neither the Send nor the FAST_REG WRs have been posted yet.
The kernel still has ownership of the MRs at this point.

This reduces the total number of MRs that the xprt has to create
under heavy workloads and makes the marshaling logic less brittle.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/frwr_ops.c  |   20 ++++++++++++++++++++
 net/sunrpc/xprtrdma/rpc_rdma.c  |    1 +
 net/sunrpc/xprtrdma/xprt_rdma.h |    1 +
 3 files changed, 22 insertions(+)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index dae2caa..5b7ff75 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -144,6 +144,26 @@ void frwr_release_mr(struct rpcrdma_mr *mr)
 	frwr_release_mr(mr);
 }
 
+/* frwr_reset - Place MRs back on the free list
+ * @req: request to reset
+ *
+ * Used after a failed marshal. For FRWR, this means the MRs
+ * don't have to be fully released and recreated.
+ *
+ * NB: This is safe only as long as none of @req's MRs are
+ * involved with an ongoing asynchronous FAST_REG or LOCAL_INV
+ * Work Request.
+ */
+void frwr_reset(struct rpcrdma_req *req)
+{
+	while (!list_empty(&req->rl_registered)) {
+		struct rpcrdma_mr *mr;
+
+		mr = rpcrdma_mr_pop(&req->rl_registered);
+		rpcrdma_mr_unmap_and_put(mr);
+	}
+}
+
 /**
  * frwr_init_mr - Initialize one MR
  * @ia: interface adapter
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index f23450b..67d72d6 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -884,6 +884,7 @@ static bool rpcrdma_prepare_msg_sges(struct rpcrdma_xprt *r_xprt,
 out_err:
 	trace_xprtrdma_marshal_failed(rqst, ret);
 	r_xprt->rx_stats.failed_marshal_count++;
+	frwr_reset(req);
 	return ret;
 }
 
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index a9de116..a396528 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -549,6 +549,7 @@ static inline bool rpcrdma_regbuf_dma_map(struct rpcrdma_xprt *r_xprt,
 /* Memory registration calls xprtrdma/frwr_ops.c
  */
 bool frwr_is_supported(struct ib_device *device);
+void frwr_reset(struct rpcrdma_req *req);
 int frwr_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep);
 int frwr_init_mr(struct rpcrdma_ia *ia, struct rpcrdma_mr *mr);
 void frwr_release_mr(struct rpcrdma_mr *mr);


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 08/19] xprtrdma: Reduce context switching due to Local Invalidation
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (6 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 07/19] xprtrdma: Add mechanism to place MRs back on the free list Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 09/19] xprtrdma: Wake RPCs directly in rpcrdma_wc_send path Chuck Lever
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Since commit ba69cd122ece ("xprtrdma: Remove support for FMR memory
registration"), FRWR is the only supported memory registration mode.

We can take advantage of the asynchronous nature of FRWR's LOCAL_INV
Work Requests to get rid of the completion wait by having the
LOCAL_INV completion handler take care of DMA unmapping MRs and
waking the upper layer RPC waiter.

This eliminates two context switches when local invalidation is
necessary. As a side benefit, we will no longer need the per-xprt
deferred completion work queue.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/trace/events/rpcrdma.h  |    1 
 net/sunrpc/xprtrdma/frwr_ops.c  |  103 +++++++++++++++++++++++++++++++++++++++
 net/sunrpc/xprtrdma/rpc_rdma.c  |   61 +++++++++++------------
 net/sunrpc/xprtrdma/verbs.c     |   17 ------
 net/sunrpc/xprtrdma/xprt_rdma.h |    8 ++-
 5 files changed, 137 insertions(+), 53 deletions(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 1d25470..98023d9 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -712,6 +712,7 @@
 DEFINE_FRWR_DONE_EVENT(xprtrdma_wc_fastreg);
 DEFINE_FRWR_DONE_EVENT(xprtrdma_wc_li);
 DEFINE_FRWR_DONE_EVENT(xprtrdma_wc_li_wake);
+DEFINE_FRWR_DONE_EVENT(xprtrdma_wc_li_done);
 
 TRACE_EVENT(xprtrdma_frwr_alloc,
 	TP_PROTO(
diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 5b7ff75..d009cb2 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -542,7 +542,10 @@ static void frwr_wc_localinv_wake(struct ib_cq *cq, struct ib_wc *wc)
  * @req: rpcrdma_req with a non-empty list of MRs to process
  *
  * Sleeps until it is safe for the host CPU to access the previously mapped
- * memory regions.
+ * memory regions. This guarantees that registered MRs are properly fenced
+ * from the server before the RPC consumer accesses the data in them. It
+ * also ensures proper Send flow control: waking the next RPC waits until
+ * this RPC has relinquished all its Send Queue entries.
  */
 void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 {
@@ -616,3 +619,101 @@ void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 		rpcrdma_mr_recycle(mr);
 	}
 }
+
+/**
+ * frwr_wc_localinv_done - Invoked by RDMA provider for a signaled LOCAL_INV WC
+ * @cq:	completion queue (ignored)
+ * @wc:	completed WR
+ *
+ */
+static void frwr_wc_localinv_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct ib_cqe *cqe = wc->wr_cqe;
+	struct rpcrdma_frwr *frwr =
+		container_of(cqe, struct rpcrdma_frwr, fr_cqe);
+	struct rpcrdma_mr *mr = container_of(frwr, struct rpcrdma_mr, frwr);
+
+	/* WARNING: Only wr_cqe and status are reliable at this point */
+	__frwr_release_mr(wc, mr);
+	trace_xprtrdma_wc_li_done(wc, frwr);
+	rpcrdma_complete_rqst(frwr->fr_req->rl_reply);
+}
+
+/**
+ * frwr_unmap_async - invalidate memory regions that were registered for @req
+ * @r_xprt: controlling transport instance
+ * @req: rpcrdma_req with a non-empty list of MRs to process
+ *
+ * This guarantees that registered MRs are properly fenced from the
+ * server before the RPC consumer accesses the data in them. It also
+ * ensures proper Send flow control: waking the next RPC waits until
+ * this RPC has relinquished all its Send Queue entries.
+ */
+void frwr_unmap_async(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
+{
+	struct ib_send_wr *first, *last, **prev;
+	const struct ib_send_wr *bad_wr;
+	struct rpcrdma_frwr *frwr;
+	struct rpcrdma_mr *mr;
+	int rc;
+
+	/* Chain the LOCAL_INV Work Requests and post them with
+	 * a single ib_post_send() call.
+	 */
+	frwr = NULL;
+	prev = &first;
+	while (!list_empty(&req->rl_registered)) {
+		mr = rpcrdma_mr_pop(&req->rl_registered);
+
+		trace_xprtrdma_mr_localinv(mr);
+		r_xprt->rx_stats.local_inv_needed++;
+
+		frwr = &mr->frwr;
+		frwr->fr_cqe.done = frwr_wc_localinv;
+		frwr->fr_req = req;
+		last = &frwr->fr_invwr;
+		last->next = NULL;
+		last->wr_cqe = &frwr->fr_cqe;
+		last->sg_list = NULL;
+		last->num_sge = 0;
+		last->opcode = IB_WR_LOCAL_INV;
+		last->send_flags = IB_SEND_SIGNALED;
+		last->ex.invalidate_rkey = mr->mr_handle;
+
+		*prev = last;
+		prev = &last->next;
+	}
+
+	/* Strong send queue ordering guarantees that when the
+	 * last WR in the chain completes, all WRs in the chain
+	 * are complete. The last completion will wake up the
+	 * RPC waiter.
+	 */
+	frwr->fr_cqe.done = frwr_wc_localinv_done;
+
+	/* Transport disconnect drains the receive CQ before it
+	 * replaces the QP. The RPC reply handler won't call us
+	 * unless ri_id->qp is a valid pointer.
+	 */
+	bad_wr = NULL;
+	rc = ib_post_send(r_xprt->rx_ia.ri_id->qp, first, &bad_wr);
+	trace_xprtrdma_post_send(req, rc);
+	if (!rc)
+		return;
+
+	/* Recycle MRs in the LOCAL_INV chain that did not get posted.
+	 */
+	while (bad_wr) {
+		frwr = container_of(bad_wr, struct rpcrdma_frwr, fr_invwr);
+		mr = container_of(frwr, struct rpcrdma_mr, frwr);
+		bad_wr = bad_wr->next;
+
+		rpcrdma_mr_recycle(mr);
+	}
+
+	/* The final LOCAL_INV WR in the chain is supposed to
+	 * do the wake. If it was never posted, the wake will
+	 * not happen, so wake here in that case.
+	 */
+	rpcrdma_complete_rqst(req->rl_reply);
+}
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 67d72d6..33b6e6a 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1268,24 +1268,15 @@ void rpcrdma_complete_rqst(struct rpcrdma_rep *rep)
 	goto out;
 }
 
-void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
+/* Ensure that any DMA mapped pages associated with
+ * the Send of the RPC Call have been unmapped before
+ * allowing the RPC to complete. This protects argument
+ * memory not controlled by the RPC client from being
+ * re-used before we're done with it.
+ */
+static void rpcrdma_release_tx(struct rpcrdma_xprt *r_xprt,
+			       struct rpcrdma_req *req)
 {
-	/* Invalidate and unmap the data payloads before waking
-	 * the waiting application. This guarantees the memory
-	 * regions are properly fenced from the server before the
-	 * application accesses the data. It also ensures proper
-	 * send flow control: waking the next RPC waits until this
-	 * RPC has relinquished all its Send Queue entries.
-	 */
-	if (!list_empty(&req->rl_registered))
-		frwr_unmap_sync(r_xprt, req);
-
-	/* Ensure that any DMA mapped pages associated with
-	 * the Send of the RPC Call have been unmapped before
-	 * allowing the RPC to complete. This protects argument
-	 * memory not controlled by the RPC client from being
-	 * re-used before we're done with it.
-	 */
 	if (test_bit(RPCRDMA_REQ_F_TX_RESOURCES, &req->rl_flags)) {
 		r_xprt->rx_stats.reply_waits_for_send++;
 		out_of_line_wait_on_bit(&req->rl_flags,
@@ -1295,24 +1286,23 @@ void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 	}
 }
 
-/* Reply handling runs in the poll worker thread. Anything that
- * might wait is deferred to a separate workqueue.
+/**
+ * rpcrdma_release_rqst - Release hardware resources
+ * @r_xprt: controlling transport instance
+ * @req: request with resources to release
+ *
  */
-void rpcrdma_deferred_completion(struct work_struct *work)
+void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 {
-	struct rpcrdma_rep *rep =
-			container_of(work, struct rpcrdma_rep, rr_work);
-	struct rpcrdma_req *req = rpcr_to_rdmar(rep->rr_rqst);
-	struct rpcrdma_xprt *r_xprt = rep->rr_rxprt;
+	if (!list_empty(&req->rl_registered))
+		frwr_unmap_sync(r_xprt, req);
 
-	trace_xprtrdma_defer_cmp(rep);
-	if (rep->rr_wc_flags & IB_WC_WITH_INVALIDATE)
-		frwr_reminv(rep, &req->rl_registered);
-	rpcrdma_release_rqst(r_xprt, req);
-	rpcrdma_complete_rqst(rep);
+	rpcrdma_release_tx(r_xprt, req);
 }
 
-/* Process received RPC/RDMA messages.
+/**
+ * rpcrdma_reply_handler - Process received RPC/RDMA messages
+ * @rep: Incoming rpcrdma_rep object to process
  *
  * Errors must result in the RPC task either being awakened, or
  * allowed to timeout, to discover the errors at that time.
@@ -1374,7 +1364,16 @@ void rpcrdma_reply_handler(struct rpcrdma_rep *rep)
 	rep->rr_rqst = rqst;
 
 	trace_xprtrdma_reply(rqst->rq_task, rep, req, credits);
-	queue_work(buf->rb_completion_wq, &rep->rr_work);
+
+	if (rep->rr_wc_flags & IB_WC_WITH_INVALIDATE)
+		frwr_reminv(rep, &req->rl_registered);
+	if (!list_empty(&req->rl_registered)) {
+		frwr_unmap_async(r_xprt, req);
+		/* LocalInv completion will complete the RPC */
+	} else {
+		rpcrdma_release_tx(r_xprt, req);
+		rpcrdma_complete_rqst(rep);
+	}
 	return;
 
 out_badversion:
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 0be5a36..c50a4b2 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -89,14 +89,12 @@
  */
 static void rpcrdma_xprt_drain(struct rpcrdma_xprt *r_xprt)
 {
-	struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
 	struct rpcrdma_ia *ia = &r_xprt->rx_ia;
 
 	/* Flush Receives, then wait for deferred Reply work
 	 * to complete.
 	 */
 	ib_drain_rq(ia->ri_id->qp);
-	drain_workqueue(buf->rb_completion_wq);
 
 	/* Deferred Reply processing might have scheduled
 	 * local invalidations.
@@ -1056,7 +1054,6 @@ static bool rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, bool temp)
 
 	rep->rr_cqe.done = rpcrdma_wc_receive;
 	rep->rr_rxprt = r_xprt;
-	INIT_WORK(&rep->rr_work, rpcrdma_deferred_completion);
 	rep->rr_recv_wr.next = NULL;
 	rep->rr_recv_wr.wr_cqe = &rep->rr_cqe;
 	rep->rr_recv_wr.sg_list = &rep->rr_rdmabuf->rg_iov;
@@ -1117,15 +1114,6 @@ int rpcrdma_buffer_create(struct rpcrdma_xprt *r_xprt)
 	if (rc)
 		goto out;
 
-	buf->rb_completion_wq = alloc_workqueue("rpcrdma-%s",
-						WQ_MEM_RECLAIM | WQ_HIGHPRI,
-						0,
-			r_xprt->rx_xprt.address_strings[RPC_DISPLAY_ADDR]);
-	if (!buf->rb_completion_wq) {
-		rc = -ENOMEM;
-		goto out;
-	}
-
 	return 0;
 out:
 	rpcrdma_buffer_destroy(buf);
@@ -1199,11 +1187,6 @@ static void rpcrdma_rep_destroy(struct rpcrdma_rep *rep)
 {
 	cancel_delayed_work_sync(&buf->rb_refresh_worker);
 
-	if (buf->rb_completion_wq) {
-		destroy_workqueue(buf->rb_completion_wq);
-		buf->rb_completion_wq = NULL;
-	}
-
 	rpcrdma_sendctxs_destroy(buf);
 
 	while (!list_empty(&buf->rb_recv_bufs)) {
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index a396528..e465221 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -202,10 +202,9 @@ struct rpcrdma_rep {
 	bool			rr_temp;
 	struct rpcrdma_regbuf	*rr_rdmabuf;
 	struct rpcrdma_xprt	*rr_rxprt;
-	struct work_struct	rr_work;
+	struct rpc_rqst		*rr_rqst;
 	struct xdr_buf		rr_hdrbuf;
 	struct xdr_stream	rr_stream;
-	struct rpc_rqst		*rr_rqst;
 	struct list_head	rr_list;
 	struct ib_recv_wr	rr_recv_wr;
 };
@@ -240,10 +239,12 @@ struct rpcrdma_sendctx {
  * An external memory region is any buffer or page that is registered
  * on the fly (ie, not pre-registered).
  */
+struct rpcrdma_req;
 struct rpcrdma_frwr {
 	struct ib_mr			*fr_mr;
 	struct ib_cqe			fr_cqe;
 	struct completion		fr_linv_done;
+	struct rpcrdma_req		*fr_req;
 	union {
 		struct ib_reg_wr	fr_regwr;
 		struct ib_send_wr	fr_invwr;
@@ -388,7 +389,6 @@ struct rpcrdma_buffer {
 	u32			rb_bc_srv_max_requests;
 	u32			rb_bc_max_requests;
 
-	struct workqueue_struct *rb_completion_wq;
 	struct delayed_work	rb_refresh_worker;
 };
 
@@ -561,6 +561,7 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 int frwr_send(struct rpcrdma_ia *ia, struct rpcrdma_req *req);
 void frwr_reminv(struct rpcrdma_rep *rep, struct list_head *mrs);
 void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req);
+void frwr_unmap_async(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req);
 
 /*
  * RPC/RDMA protocol calls - xprtrdma/rpc_rdma.c
@@ -585,7 +586,6 @@ int rpcrdma_prepare_send_sges(struct rpcrdma_xprt *r_xprt,
 void rpcrdma_reply_handler(struct rpcrdma_rep *rep);
 void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt,
 			  struct rpcrdma_req *req);
-void rpcrdma_deferred_completion(struct work_struct *work);
 
 static inline void rpcrdma_set_xdrlen(struct xdr_buf *xdr, size_t len)
 {


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 09/19] xprtrdma: Wake RPCs directly in rpcrdma_wc_send path
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (7 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 08/19] xprtrdma: Reduce context switching due to Local Invalidation Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 10/19] xprtrdma: Simplify rpcrdma_rep_create Chuck Lever
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Eliminate a context switch in the path that handles RPC wake-ups
when a Receive completion has to wait for a Send completion.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/rpc_rdma.c  |   61 +++++++++++++++------------------------
 net/sunrpc/xprtrdma/transport.c |   10 ++++++
 net/sunrpc/xprtrdma/verbs.c     |    3 +-
 net/sunrpc/xprtrdma/xprt_rdma.h |   12 ++------
 4 files changed, 36 insertions(+), 50 deletions(-)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 33b6e6a..caf0b19 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -511,6 +511,16 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	return 0;
 }
 
+static void rpcrdma_sendctx_done(struct kref *kref)
+{
+	struct rpcrdma_req *req =
+		container_of(kref, struct rpcrdma_req, rl_kref);
+	struct rpcrdma_rep *rep = req->rl_reply;
+
+	rpcrdma_complete_rqst(rep);
+	rep->rr_rxprt->rx_stats.reply_waits_for_send++;
+}
+
 /**
  * rpcrdma_sendctx_unmap - DMA-unmap Send buffer
  * @sc: sendctx containing SGEs to unmap
@@ -520,6 +530,9 @@ void rpcrdma_sendctx_unmap(struct rpcrdma_sendctx *sc)
 {
 	struct ib_sge *sge;
 
+	if (!sc->sc_unmap_count)
+		return;
+
 	/* The first two SGEs contain the transport header and
 	 * the inline buffer. These are always left mapped so
 	 * they can be cheaply re-used.
@@ -529,9 +542,7 @@ void rpcrdma_sendctx_unmap(struct rpcrdma_sendctx *sc)
 		ib_dma_unmap_page(sc->sc_device, sge->addr, sge->length,
 				  DMA_TO_DEVICE);
 
-	if (test_and_clear_bit(RPCRDMA_REQ_F_TX_RESOURCES,
-			       &sc->sc_req->rl_flags))
-		wake_up_bit(&sc->sc_req->rl_flags, RPCRDMA_REQ_F_TX_RESOURCES);
+	kref_put(&sc->sc_req->rl_kref, rpcrdma_sendctx_done);
 }
 
 /* Prepare an SGE for the RPC-over-RDMA transport header.
@@ -666,7 +677,7 @@ static bool rpcrdma_prepare_msg_sges(struct rpcrdma_xprt *r_xprt,
 out:
 	sc->sc_wr.num_sge += sge_no;
 	if (sc->sc_unmap_count)
-		__set_bit(RPCRDMA_REQ_F_TX_RESOURCES, &req->rl_flags);
+		kref_get(&req->rl_kref);
 	return true;
 
 out_regbuf:
@@ -708,7 +719,7 @@ static bool rpcrdma_prepare_msg_sges(struct rpcrdma_xprt *r_xprt,
 	req->rl_sendctx->sc_wr.num_sge = 0;
 	req->rl_sendctx->sc_unmap_count = 0;
 	req->rl_sendctx->sc_req = req;
-	__clear_bit(RPCRDMA_REQ_F_TX_RESOURCES, &req->rl_flags);
+	kref_init(&req->rl_kref);
 
 	ret = -EIO;
 	if (!rpcrdma_prepare_hdr_sge(r_xprt, req, hdrlen))
@@ -1268,36 +1279,12 @@ void rpcrdma_complete_rqst(struct rpcrdma_rep *rep)
 	goto out;
 }
 
-/* Ensure that any DMA mapped pages associated with
- * the Send of the RPC Call have been unmapped before
- * allowing the RPC to complete. This protects argument
- * memory not controlled by the RPC client from being
- * re-used before we're done with it.
- */
-static void rpcrdma_release_tx(struct rpcrdma_xprt *r_xprt,
-			       struct rpcrdma_req *req)
+static void rpcrdma_reply_done(struct kref *kref)
 {
-	if (test_bit(RPCRDMA_REQ_F_TX_RESOURCES, &req->rl_flags)) {
-		r_xprt->rx_stats.reply_waits_for_send++;
-		out_of_line_wait_on_bit(&req->rl_flags,
-					RPCRDMA_REQ_F_TX_RESOURCES,
-					bit_wait,
-					TASK_UNINTERRUPTIBLE);
-	}
-}
+	struct rpcrdma_req *req =
+		container_of(kref, struct rpcrdma_req, rl_kref);
 
-/**
- * rpcrdma_release_rqst - Release hardware resources
- * @r_xprt: controlling transport instance
- * @req: request with resources to release
- *
- */
-void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
-{
-	if (!list_empty(&req->rl_registered))
-		frwr_unmap_sync(r_xprt, req);
-
-	rpcrdma_release_tx(r_xprt, req);
+	rpcrdma_complete_rqst(req->rl_reply);
 }
 
 /**
@@ -1367,13 +1354,11 @@ void rpcrdma_reply_handler(struct rpcrdma_rep *rep)
 
 	if (rep->rr_wc_flags & IB_WC_WITH_INVALIDATE)
 		frwr_reminv(rep, &req->rl_registered);
-	if (!list_empty(&req->rl_registered)) {
+	if (!list_empty(&req->rl_registered))
 		frwr_unmap_async(r_xprt, req);
 		/* LocalInv completion will complete the RPC */
-	} else {
-		rpcrdma_release_tx(r_xprt, req);
-		rpcrdma_complete_rqst(rep);
-	}
+	else
+		kref_put(&req->rl_kref, rpcrdma_reply_done);
 	return;
 
 out_badversion:
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index f84375d..9575f1d 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -618,8 +618,16 @@ static bool rpcrdma_check_regbuf(struct rpcrdma_xprt *r_xprt,
 	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(rqst->rq_xprt);
 	struct rpcrdma_req *req = rpcr_to_rdmar(rqst);
 
-	rpcrdma_release_rqst(r_xprt, req);
 	trace_xprtrdma_op_free(task, req);
+
+	if (!list_empty(&req->rl_registered))
+		frwr_unmap_sync(r_xprt, req);
+
+	/* XXX: If the RPC is completing because of a signal and
+	 * not because a reply was received, we ought to ensure
+	 * that the Send completion has fired, so that memory
+	 * involved with the Send is not still visible to the NIC.
+	 */
 }
 
 /**
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index c50a4b2..4e22cc2 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1462,8 +1462,7 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb)
 	struct ib_send_wr *send_wr = &req->rl_sendctx->sc_wr;
 	int rc;
 
-	if (!ep->rep_send_count ||
-	    test_bit(RPCRDMA_REQ_F_TX_RESOURCES, &req->rl_flags)) {
+	if (!ep->rep_send_count || kref_read(&req->rl_kref) > 1) {
 		send_wr->send_flags |= IB_SEND_SIGNALED;
 		ep->rep_send_count = ep->rep_send_batch;
 	} else {
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index e465221..5475f0d 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -44,7 +44,8 @@
 
 #include <linux/wait.h> 		/* wait_queue_head_t, etc */
 #include <linux/spinlock.h> 		/* spinlock_t, etc */
-#include <linux/atomic.h>			/* atomic_t, etc */
+#include <linux/atomic.h>		/* atomic_t, etc */
+#include <linux/kref.h>			/* struct kref */
 #include <linux/workqueue.h>		/* struct work_struct */
 
 #include <rdma/rdma_cm.h>		/* RDMA connection api */
@@ -329,17 +330,12 @@ struct rpcrdma_req {
 	struct rpcrdma_regbuf	*rl_recvbuf;	/* rq_rcv_buf */
 
 	struct list_head	rl_all;
-	unsigned long		rl_flags;
+	struct kref		rl_kref;
 
 	struct list_head	rl_registered;	/* registered segments */
 	struct rpcrdma_mr_seg	rl_segments[RPCRDMA_MAX_SEGS];
 };
 
-/* rl_flags */
-enum {
-	RPCRDMA_REQ_F_TX_RESOURCES,
-};
-
 static inline struct rpcrdma_req *
 rpcr_to_rdmar(const struct rpc_rqst *rqst)
 {
@@ -584,8 +580,6 @@ int rpcrdma_prepare_send_sges(struct rpcrdma_xprt *r_xprt,
 void rpcrdma_set_max_header_sizes(struct rpcrdma_xprt *);
 void rpcrdma_complete_rqst(struct rpcrdma_rep *rep);
 void rpcrdma_reply_handler(struct rpcrdma_rep *rep);
-void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt,
-			  struct rpcrdma_req *req);
 
 static inline void rpcrdma_set_xdrlen(struct xdr_buf *xdr, size_t len)
 {


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 10/19] xprtrdma: Simplify rpcrdma_rep_create
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (8 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 09/19] xprtrdma: Wake RPCs directly in rpcrdma_wc_send path Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:08 ` [PATCH v2 11/19] xprtrdma: Streamline rpcrdma_post_recvs Chuck Lever
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Clean up.

Commit 7c8d9e7c8863 ("xprtrdma: Move Receive posting to Receive
handler") reduced the number of rpcrdma_rep_create call sites to
one. After that commit, the backchannel code no longer invokes it.

Therefore the free list logic added by commit d698c4a02ee0
("xprtrdma: Fix backchannel allocation of extra rpcrdma_reps") is
no longer necessary, and in fact adds some extra overhead that we
can do without.

Simply post any newly created reps. They will get added back to
the rb_recv_bufs list when they subsequently complete.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |   22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 4e22cc2..de6be10 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1036,9 +1036,9 @@ struct rpcrdma_req *rpcrdma_req_create(struct rpcrdma_xprt *r_xprt, size_t size,
 	return NULL;
 }
 
-static bool rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, bool temp)
+static struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt,
+					      bool temp)
 {
-	struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
 	struct rpcrdma_rep *rep;
 
 	rep = kzalloc(sizeof(*rep), GFP_KERNEL);
@@ -1049,9 +1049,9 @@ static bool rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, bool temp)
 					       DMA_FROM_DEVICE, GFP_KERNEL);
 	if (!rep->rr_rdmabuf)
 		goto out_free;
+
 	xdr_buf_init(&rep->rr_hdrbuf, rdmab_data(rep->rr_rdmabuf),
 		     rdmab_length(rep->rr_rdmabuf));
-
 	rep->rr_cqe.done = rpcrdma_wc_receive;
 	rep->rr_rxprt = r_xprt;
 	rep->rr_recv_wr.next = NULL;
@@ -1059,16 +1059,12 @@ static bool rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, bool temp)
 	rep->rr_recv_wr.sg_list = &rep->rr_rdmabuf->rg_iov;
 	rep->rr_recv_wr.num_sge = 1;
 	rep->rr_temp = temp;
-
-	spin_lock(&buf->rb_lock);
-	list_add(&rep->rr_list, &buf->rb_recv_bufs);
-	spin_unlock(&buf->rb_lock);
-	return true;
+	return rep;
 
 out_free:
 	kfree(rep);
 out:
-	return false;
+	return NULL;
 }
 
 /**
@@ -1497,7 +1493,6 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb)
 	count = 0;
 	wr = NULL;
 	while (needed) {
-		struct rpcrdma_regbuf *rb;
 		struct rpcrdma_rep *rep;
 
 		spin_lock(&buf->rb_lock);
@@ -1507,13 +1502,12 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb)
 			list_del(&rep->rr_list);
 		spin_unlock(&buf->rb_lock);
 		if (!rep) {
-			if (!rpcrdma_rep_create(r_xprt, temp))
+			rep = rpcrdma_rep_create(r_xprt, temp);
+			if (!rep)
 				break;
-			continue;
 		}
 
-		rb = rep->rr_rdmabuf;
-		if (!rpcrdma_regbuf_dma_map(r_xprt, rb)) {
+		if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf)) {
 			rpcrdma_recv_buffer_put(rep);
 			break;
 		}


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 11/19] xprtrdma: Streamline rpcrdma_post_recvs
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (9 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 10/19] xprtrdma: Simplify rpcrdma_rep_create Chuck Lever
@ 2019-06-11 15:08 ` Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 12/19] xprtrdma: Refactor chunk encoding Chuck Lever
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:08 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

rb_lock is contended between rpcrdma_buffer_create,
rpcrdma_buffer_put, and rpcrdma_post_recvs.

Commit e340c2d6ef2a ("xprtrdma: Reduce the doorbell rate (Receive)")
causes rpcrdma_post_recvs to take the rb_lock repeatedly when it
determines more Receives are needed. Streamline this code path so
it takes the lock just once in most cases to build the Receive
chain that is about to be posted.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |   59 ++++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index de6be10..3270c8a 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1478,11 +1478,13 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb)
 {
 	struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
 	struct rpcrdma_ep *ep = &r_xprt->rx_ep;
-	struct ib_recv_wr *wr, *bad_wr;
+	struct ib_recv_wr *i, *wr, *bad_wr;
+	struct rpcrdma_rep *rep;
 	int needed, count, rc;
 
 	rc = 0;
 	count = 0;
+
 	needed = buf->rb_credits + (buf->rb_bc_srv_max_requests << 1);
 	if (ep->rep_receive_count > needed)
 		goto out;
@@ -1490,39 +1492,48 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb)
 	if (!temp)
 		needed += RPCRDMA_MAX_RECV_BATCH;
 
-	count = 0;
+	/* fast path: all needed reps can be found on the free list */
 	wr = NULL;
+	spin_lock(&buf->rb_lock);
 	while (needed) {
-		struct rpcrdma_rep *rep;
-
-		spin_lock(&buf->rb_lock);
 		rep = list_first_entry_or_null(&buf->rb_recv_bufs,
 					       struct rpcrdma_rep, rr_list);
-		if (likely(rep))
-			list_del(&rep->rr_list);
-		spin_unlock(&buf->rb_lock);
-		if (!rep) {
-			rep = rpcrdma_rep_create(r_xprt, temp);
-			if (!rep)
-				break;
-		}
+		if (!rep)
+			break;
 
-		if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf)) {
-			rpcrdma_recv_buffer_put(rep);
+		list_del(&rep->rr_list);
+		rep->rr_recv_wr.next = wr;
+		wr = &rep->rr_recv_wr;
+		--needed;
+	}
+	spin_unlock(&buf->rb_lock);
+
+	while (needed) {
+		rep = rpcrdma_rep_create(r_xprt, temp);
+		if (!rep)
 			break;
-		}
 
-		trace_xprtrdma_post_recv(rep->rr_recv_wr.wr_cqe);
 		rep->rr_recv_wr.next = wr;
 		wr = &rep->rr_recv_wr;
-		++count;
 		--needed;
 	}
-	if (!count)
+	if (!wr)
 		goto out;
 
+	for (i = wr; i; i = i->next) {
+		rep = container_of(i, struct rpcrdma_rep, rr_recv_wr);
+
+		if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf))
+			goto release_wrs;
+
+		trace_xprtrdma_post_recv(rep->rr_recv_wr.wr_cqe);
+		++count;
+	}
+
 	rc = ib_post_recv(r_xprt->rx_ia.ri_id->qp, wr,
 			  (const struct ib_recv_wr **)&bad_wr);
+out:
+	trace_xprtrdma_post_recvs(r_xprt, count, rc);
 	if (rc) {
 		for (wr = bad_wr; wr;) {
 			struct rpcrdma_rep *rep;
@@ -1534,6 +1545,12 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb)
 		}
 	}
 	ep->rep_receive_count += count;
-out:
-	trace_xprtrdma_post_recvs(r_xprt, count, rc);
+	return;
+
+release_wrs:
+	for (i = wr; i;) {
+		rep = container_of(i, struct rpcrdma_rep, rr_recv_wr);
+		i = i->next;
+		rpcrdma_recv_buffer_put(rep);
+	}
 }


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 12/19] xprtrdma: Refactor chunk encoding
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (10 preceding siblings ...)
  2019-06-11 15:08 ` [PATCH v2 11/19] xprtrdma: Streamline rpcrdma_post_recvs Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 13/19] xprtrdma: Remove rpcrdma_req::rl_buffer Chuck Lever
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Clean up.

Move the "not present" case into the individual chunk encoders. This
improves code organization and readability.

The reason for the original organization was to optimize for the
case where there there are no chunks. The optimization turned out to
be inconsequential, so let's err on the side of code readability.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/rpc_rdma.c |   36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index caf0b19..d3515d3 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -366,6 +366,9 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	unsigned int pos;
 	int nsegs;
 
+	if (rtype == rpcrdma_noch)
+		goto done;
+
 	pos = rqst->rq_snd_buf.head[0].iov_len;
 	if (rtype == rpcrdma_areadch)
 		pos = 0;
@@ -389,7 +392,8 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 		nsegs -= mr->mr_nents;
 	} while (nsegs);
 
-	return 0;
+done:
+	return encode_item_not_present(xdr);
 }
 
 /* Register and XDR encode the Write list. Supports encoding a list
@@ -417,6 +421,9 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	int nsegs, nchunks;
 	__be32 *segcount;
 
+	if (wtype != rpcrdma_writech)
+		goto done;
+
 	seg = req->rl_segments;
 	nsegs = rpcrdma_convert_iovs(r_xprt, &rqst->rq_rcv_buf,
 				     rqst->rq_rcv_buf.head[0].iov_len,
@@ -451,7 +458,8 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	/* Update count of segments in this Write chunk */
 	*segcount = cpu_to_be32(nchunks);
 
-	return 0;
+done:
+	return encode_item_not_present(xdr);
 }
 
 /* Register and XDR encode the Reply chunk. Supports encoding an array
@@ -476,6 +484,9 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
 	int nsegs, nchunks;
 	__be32 *segcount;
 
+	if (wtype != rpcrdma_replych)
+		return encode_item_not_present(xdr);
+
 	seg = req->rl_segments;
 	nsegs = rpcrdma_convert_iovs(r_xprt, &rqst->rq_rcv_buf, 0, wtype, seg);
 	if (nsegs < 0)
@@ -859,28 +870,13 @@ static bool rpcrdma_prepare_msg_sges(struct rpcrdma_xprt *r_xprt,
 	 * send a Call message with a Position Zero Read chunk and a
 	 * regular Read chunk at the same time.
 	 */
-	if (rtype != rpcrdma_noch) {
-		ret = rpcrdma_encode_read_list(r_xprt, req, rqst, rtype);
-		if (ret)
-			goto out_err;
-	}
-	ret = encode_item_not_present(xdr);
+	ret = rpcrdma_encode_read_list(r_xprt, req, rqst, rtype);
 	if (ret)
 		goto out_err;
-
-	if (wtype == rpcrdma_writech) {
-		ret = rpcrdma_encode_write_list(r_xprt, req, rqst, wtype);
-		if (ret)
-			goto out_err;
-	}
-	ret = encode_item_not_present(xdr);
+	ret = rpcrdma_encode_write_list(r_xprt, req, rqst, wtype);
 	if (ret)
 		goto out_err;
-
-	if (wtype != rpcrdma_replych)
-		ret = encode_item_not_present(xdr);
-	else
-		ret = rpcrdma_encode_reply_chunk(r_xprt, req, rqst, wtype);
+	ret = rpcrdma_encode_reply_chunk(r_xprt, req, rqst, wtype);
 	if (ret)
 		goto out_err;
 


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 13/19] xprtrdma: Remove rpcrdma_req::rl_buffer
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (11 preceding siblings ...)
  2019-06-11 15:09 ` [PATCH v2 12/19] xprtrdma: Refactor chunk encoding Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 14/19] xprtrdma: Modernize ops->connect Chuck Lever
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Clean up.

There is only one remaining function, rpcrdma_buffer_put(), that
uses this field. Its caller can supply a pointer to the correct
rpcrdma_buffer, enabling the removal of an 8-byte pointer field
from a frequently-allocated shared data structure.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/transport.c |    5 ++++-
 net/sunrpc/xprtrdma/verbs.c     |    6 ++----
 net/sunrpc/xprtrdma/xprt_rdma.h |    4 ++--
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 9575f1d..3688e078 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -550,8 +550,11 @@ void xprt_rdma_close(struct rpc_xprt *xprt)
 static void
 xprt_rdma_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *rqst)
 {
+	struct rpcrdma_xprt *r_xprt =
+		container_of(xprt, struct rpcrdma_xprt, rx_xprt);
+
 	memset(rqst, 0, sizeof(*rqst));
-	rpcrdma_buffer_put(rpcr_to_rdmar(rqst));
+	rpcrdma_buffer_put(&r_xprt->rx_buf, rpcr_to_rdmar(rqst));
 	rpc_wake_up_next(&xprt->backlog);
 }
 
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 3270c8a..805b1f35 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1019,7 +1019,6 @@ struct rpcrdma_req *rpcrdma_req_create(struct rpcrdma_xprt *r_xprt, size_t size,
 	if (!req->rl_recvbuf)
 		goto out4;
 
-	req->rl_buffer = buffer;
 	INIT_LIST_HEAD(&req->rl_registered);
 	spin_lock(&buffer->rb_lock);
 	list_add(&req->rl_all, &buffer->rb_allreqs);
@@ -1299,13 +1298,12 @@ struct rpcrdma_req *
 
 /**
  * rpcrdma_buffer_put - Put request/reply buffers back into pool
+ * @buffers: buffer pool
  * @req: object to return
  *
  */
-void
-rpcrdma_buffer_put(struct rpcrdma_req *req)
+void rpcrdma_buffer_put(struct rpcrdma_buffer *buffers, struct rpcrdma_req *req)
 {
-	struct rpcrdma_buffer *buffers = req->rl_buffer;
 	struct rpcrdma_rep *rep = req->rl_reply;
 
 	req->rl_reply = NULL;
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 5475f0d..117e328 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -320,7 +320,6 @@ enum {
 struct rpcrdma_req {
 	struct list_head	rl_list;
 	struct rpc_rqst		rl_slot;
-	struct rpcrdma_buffer	*rl_buffer;
 	struct rpcrdma_rep	*rl_reply;
 	struct xdr_stream	rl_stream;
 	struct xdr_buf		rl_hdrbuf;
@@ -499,7 +498,8 @@ struct rpcrdma_req *rpcrdma_req_create(struct rpcrdma_xprt *r_xprt, size_t size,
 }
 
 struct rpcrdma_req *rpcrdma_buffer_get(struct rpcrdma_buffer *);
-void rpcrdma_buffer_put(struct rpcrdma_req *);
+void rpcrdma_buffer_put(struct rpcrdma_buffer *buffers,
+			struct rpcrdma_req *req);
 void rpcrdma_recv_buffer_put(struct rpcrdma_rep *);
 
 bool rpcrdma_regbuf_realloc(struct rpcrdma_regbuf *rb, size_t size,


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 14/19] xprtrdma: Modernize ops->connect
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (12 preceding siblings ...)
  2019-06-11 15:09 ` [PATCH v2 13/19] xprtrdma: Remove rpcrdma_req::rl_buffer Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 15/19] NFS4: Add a trace event to record invalid CB sequence IDs Chuck Lever
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Adapt and apply changes that were made to the TCP socket connect
code. See the following commits for details on the purpose of
these changes:

Commit 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly")
Commit 3851f1cdb2b8 ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout")
Commit 02910177aede ("SUNRPC: Fix reconnection timeouts")

Some common transport code is moved to xprt.c to satisfy the code
duplication police.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/xprt.h     |    3 ++
 include/trace/events/rpcrdma.h  |   31 ++++++++++++++++++
 net/sunrpc/sched.c              |    1 +
 net/sunrpc/xprt.c               |   32 +++++++++++++++++++
 net/sunrpc/xprtrdma/transport.c |   66 ++++++++++++++++++++++++++++++---------
 net/sunrpc/xprtrdma/xprt_rdma.h |    1 +
 net/sunrpc/xprtsock.c           |   23 +-------------
 7 files changed, 121 insertions(+), 36 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index a6d9fce..cc78fd3 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -334,6 +334,9 @@ struct xprt_class {
  */
 struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
 void			xprt_connect(struct rpc_task *task);
+unsigned long		xprt_reconnect_delay(const struct rpc_xprt *xprt);
+void			xprt_reconnect_backoff(struct rpc_xprt *xprt,
+					       unsigned long init_to);
 void			xprt_reserve(struct rpc_task *task);
 void			xprt_retry_reserve(struct rpc_task *task);
 int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 98023d9..f6a4eaa 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -375,6 +375,37 @@
 DEFINE_RXPRT_EVENT(xprtrdma_op_close);
 DEFINE_RXPRT_EVENT(xprtrdma_op_connect);
 
+TRACE_EVENT(xprtrdma_op_set_cto,
+	TP_PROTO(
+		const struct rpcrdma_xprt *r_xprt,
+		unsigned long connect,
+		unsigned long reconnect
+	),
+
+	TP_ARGS(r_xprt, connect, reconnect),
+
+	TP_STRUCT__entry(
+		__field(const void *, r_xprt)
+		__field(unsigned long, connect)
+		__field(unsigned long, reconnect)
+		__string(addr, rpcrdma_addrstr(r_xprt))
+		__string(port, rpcrdma_portstr(r_xprt))
+	),
+
+	TP_fast_assign(
+		__entry->r_xprt = r_xprt;
+		__entry->connect = connect;
+		__entry->reconnect = reconnect;
+		__assign_str(addr, rpcrdma_addrstr(r_xprt));
+		__assign_str(port, rpcrdma_portstr(r_xprt));
+	),
+
+	TP_printk("peer=[%s]:%s r_xprt=%p: connect=%lu reconnect=%lu",
+		__get_str(addr), __get_str(port), __entry->r_xprt,
+		__entry->connect / HZ, __entry->reconnect / HZ
+	)
+);
+
 TRACE_EVENT(xprtrdma_qp_event,
 	TP_PROTO(
 		const struct rpcrdma_xprt *r_xprt,
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index bb04ae5..5ad5dea 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -58,6 +58,7 @@
  */
 struct workqueue_struct *rpciod_workqueue __read_mostly;
 struct workqueue_struct *xprtiod_workqueue __read_mostly;
+EXPORT_SYMBOL_GPL(xprtiod_workqueue);
 
 unsigned long
 rpc_task_timeout(const struct rpc_task *task)
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index ad21880..b1f54b7 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -850,6 +850,38 @@ void xprt_connect(struct rpc_task *task)
 	xprt_release_write(xprt, task);
 }
 
+/**
+ * xprt_reconnect_delay - compute the wait before scheduling a connect
+ * @xprt: transport instance
+ *
+ */
+unsigned long xprt_reconnect_delay(const struct rpc_xprt *xprt)
+{
+	unsigned long start, now = jiffies;
+
+	start = xprt->stat.connect_start + xprt->reestablish_timeout;
+	if (time_after(start, now))
+		return start - now;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xprt_reconnect_delay);
+
+/**
+ * xprt_reconnect_backoff - compute the new re-establish timeout
+ * @xprt: transport instance
+ * @init_to: initial reestablish timeout
+ *
+ */
+void xprt_reconnect_backoff(struct rpc_xprt *xprt, unsigned long init_to)
+{
+	xprt->reestablish_timeout <<= 1;
+	if (xprt->reestablish_timeout > xprt->max_reconnect_timeout)
+		xprt->reestablish_timeout = xprt->max_reconnect_timeout;
+	if (xprt->reestablish_timeout < init_to)
+		xprt->reestablish_timeout = init_to;
+}
+EXPORT_SYMBOL_GPL(xprt_reconnect_backoff);
+
 enum xprt_xid_rb_cmp {
 	XID_RB_EQUAL,
 	XID_RB_LEFT,
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 3688e078..4993aa4 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -298,6 +298,7 @@
 	module_put(THIS_MODULE);
 }
 
+/* 60 second timeout, no retries */
 static const struct rpc_timeout xprt_rdma_default_timeout = {
 	.to_initval = 60 * HZ,
 	.to_maxval = 60 * HZ,
@@ -323,8 +324,9 @@
 	if (!xprt)
 		return ERR_PTR(-ENOMEM);
 
-	/* 60 second timeout, no retries */
 	xprt->timeout = &xprt_rdma_default_timeout;
+	xprt->connect_timeout = xprt->timeout->to_initval;
+	xprt->max_reconnect_timeout = xprt->timeout->to_maxval;
 	xprt->bind_timeout = RPCRDMA_BIND_TO;
 	xprt->reestablish_timeout = RPCRDMA_INIT_REEST_TO;
 	xprt->idle_timeout = RPCRDMA_IDLE_DISC_TO;
@@ -487,31 +489,64 @@ void xprt_rdma_close(struct rpc_xprt *xprt)
 }
 
 /**
- * xprt_rdma_connect - try to establish a transport connection
+ * xprt_rdma_set_connect_timeout - set timeouts for establishing a connection
+ * @xprt: controlling transport instance
+ * @connect_timeout: reconnect timeout after client disconnects
+ * @reconnect_timeout: reconnect timeout after server disconnects
+ *
+ */
+static void xprt_rdma_tcp_set_connect_timeout(struct rpc_xprt *xprt,
+					      unsigned long connect_timeout,
+					      unsigned long reconnect_timeout)
+{
+	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);
+
+	trace_xprtrdma_op_set_cto(r_xprt, connect_timeout, reconnect_timeout);
+
+	spin_lock(&xprt->transport_lock);
+
+	if (connect_timeout < xprt->connect_timeout) {
+		struct rpc_timeout to;
+		unsigned long initval;
+
+		to = *xprt->timeout;
+		initval = connect_timeout;
+		if (initval < RPCRDMA_INIT_REEST_TO << 1)
+			initval = RPCRDMA_INIT_REEST_TO << 1;
+		to.to_initval = initval;
+		to.to_maxval = initval;
+		r_xprt->rx_timeout = to;
+		xprt->timeout = &r_xprt->rx_timeout;
+		xprt->connect_timeout = connect_timeout;
+	}
+
+	if (reconnect_timeout < xprt->max_reconnect_timeout)
+		xprt->max_reconnect_timeout = reconnect_timeout;
+
+	spin_unlock(&xprt->transport_lock);
+}
+
+/**
+ * xprt_rdma_connect - schedule an attempt to reconnect
  * @xprt: transport state
- * @task: RPC scheduler context
+ * @task: RPC scheduler context (unused)
  *
  */
 static void
 xprt_rdma_connect(struct rpc_xprt *xprt, struct rpc_task *task)
 {
 	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);
+	unsigned long delay;
 
 	trace_xprtrdma_op_connect(r_xprt);
+
+	delay = 0;
 	if (r_xprt->rx_ep.rep_connected != 0) {
-		/* Reconnect */
-		schedule_delayed_work(&r_xprt->rx_connect_worker,
-				      xprt->reestablish_timeout);
-		xprt->reestablish_timeout <<= 1;
-		if (xprt->reestablish_timeout > RPCRDMA_MAX_REEST_TO)
-			xprt->reestablish_timeout = RPCRDMA_MAX_REEST_TO;
-		else if (xprt->reestablish_timeout < RPCRDMA_INIT_REEST_TO)
-			xprt->reestablish_timeout = RPCRDMA_INIT_REEST_TO;
-	} else {
-		schedule_delayed_work(&r_xprt->rx_connect_worker, 0);
-		if (!RPC_IS_ASYNC(task))
-			flush_delayed_work(&r_xprt->rx_connect_worker);
+		delay = xprt_reconnect_delay(xprt);
+		xprt_reconnect_backoff(xprt, RPCRDMA_INIT_REEST_TO);
 	}
+	queue_delayed_work(xprtiod_workqueue, &r_xprt->rx_connect_worker,
+			   delay);
 }
 
 /**
@@ -769,6 +804,7 @@ void xprt_rdma_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
 	.send_request		= xprt_rdma_send_request,
 	.close			= xprt_rdma_close,
 	.destroy		= xprt_rdma_destroy,
+	.set_connect_timeout	= xprt_rdma_tcp_set_connect_timeout,
 	.print_stats		= xprt_rdma_print_stats,
 	.enable_swap		= xprt_rdma_enable_swap,
 	.disable_swap		= xprt_rdma_disable_swap,
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 117e328..8378f45 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -432,6 +432,7 @@ struct rpcrdma_xprt {
 	struct rpcrdma_ep	rx_ep;
 	struct rpcrdma_buffer	rx_buf;
 	struct delayed_work	rx_connect_worker;
+	struct rpc_timeout	rx_timeout;
 	struct rpcrdma_stats	rx_stats;
 };
 
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index c69951e..b154600 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2402,25 +2402,6 @@ static void xs_tcp_setup_socket(struct work_struct *work)
 	xprt_wake_pending_tasks(xprt, status);
 }
 
-static unsigned long xs_reconnect_delay(const struct rpc_xprt *xprt)
-{
-	unsigned long start, now = jiffies;
-
-	start = xprt->stat.connect_start + xprt->reestablish_timeout;
-	if (time_after(start, now))
-		return start - now;
-	return 0;
-}
-
-static void xs_reconnect_backoff(struct rpc_xprt *xprt)
-{
-	xprt->reestablish_timeout <<= 1;
-	if (xprt->reestablish_timeout > xprt->max_reconnect_timeout)
-		xprt->reestablish_timeout = xprt->max_reconnect_timeout;
-	if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO)
-		xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
-}
-
 /**
  * xs_connect - connect a socket to a remote endpoint
  * @xprt: pointer to transport structure
@@ -2450,8 +2431,8 @@ static void xs_connect(struct rpc_xprt *xprt, struct rpc_task *task)
 		/* Start by resetting any existing state */
 		xs_reset_transport(transport);
 
-		delay = xs_reconnect_delay(xprt);
-		xs_reconnect_backoff(xprt);
+		delay = xprt_reconnect_delay(xprt);
+		xprt_reconnect_backoff(xprt, XS_TCP_INIT_REEST_TO);
 
 	} else
 		dprintk("RPC:       xs_connect scheduled xprt %p\n", xprt);


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 15/19] NFS4: Add a trace event to record invalid CB sequence IDs
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (13 preceding siblings ...)
  2019-06-11 15:09 ` [PATCH v2 14/19] xprtrdma: Modernize ops->connect Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again Chuck Lever
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Help debug NFSv4 callback failures.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/callback_proc.c |   28 ++++++++++++++++++++--------
 fs/nfs/nfs4trace.h     |   38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 3159673..f39924b 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -414,27 +414,39 @@ __be32 nfs4_callback_devicenotify(void *argp, void *resp,
 validate_seqid(const struct nfs4_slot_table *tbl, const struct nfs4_slot *slot,
 		const struct cb_sequenceargs * args)
 {
+	__be32 ret;
+
+	ret = cpu_to_be32(NFS4ERR_BADSLOT);
 	if (args->csa_slotid > tbl->server_highest_slotid)
-		return htonl(NFS4ERR_BADSLOT);
+		goto out_err;
 
 	/* Replay */
 	if (args->csa_sequenceid == slot->seq_nr) {
+		ret = cpu_to_be32(NFS4ERR_DELAY);
 		if (nfs4_test_locked_slot(tbl, slot->slot_nr))
-			return htonl(NFS4ERR_DELAY);
+			goto out_err;
+
 		/* Signal process_op to set this error on next op */
+		ret = cpu_to_be32(NFS4ERR_RETRY_UNCACHED_REP);
 		if (args->csa_cachethis == 0)
-			return htonl(NFS4ERR_RETRY_UNCACHED_REP);
+			goto out_err;
 
 		/* Liar! We never allowed you to set csa_cachethis != 0 */
-		return htonl(NFS4ERR_SEQ_FALSE_RETRY);
+		ret = cpu_to_be32(NFS4ERR_SEQ_FALSE_RETRY);
+		goto out_err;
 	}
 
 	/* Note: wraparound relies on seq_nr being of type u32 */
-	if (likely(args->csa_sequenceid == slot->seq_nr + 1))
-		return htonl(NFS4_OK);
-
 	/* Misordered request */
-	return htonl(NFS4ERR_SEQ_MISORDERED);
+	ret = cpu_to_be32(NFS4ERR_SEQ_MISORDERED);
+	if (args->csa_sequenceid != slot->seq_nr + 1)
+		goto out_err;
+
+	return cpu_to_be32(NFS4_OK);
+
+out_err:
+	trace_nfs4_cb_seqid_err(args, ret);
+	return ret;
 }
 
 /*
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index cd1a5c0..6beb1f2 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -490,6 +490,44 @@
 			__entry->highest_slotid
 		)
 );
+
+TRACE_EVENT(nfs4_cb_seqid_err,
+		TP_PROTO(
+			const struct cb_sequenceargs *args,
+			__be32 status
+		),
+		TP_ARGS(args, status),
+
+		TP_STRUCT__entry(
+			__field(unsigned int, session)
+			__field(unsigned int, slot_nr)
+			__field(unsigned int, seq_nr)
+			__field(unsigned int, highest_slotid)
+			__field(unsigned int, cachethis)
+			__field(int, error)
+		),
+
+		TP_fast_assign(
+			__entry->session = nfs_session_id_hash(&args->csa_sessionid);
+			__entry->slot_nr = args->csa_slotid;
+			__entry->seq_nr = args->csa_sequenceid;
+			__entry->highest_slotid = args->csa_highestslotid;
+			__entry->cachethis = args->csa_cachethis;
+			__entry->error = -be32_to_cpu(status);
+		),
+
+		TP_printk(
+			"error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
+			"highest_slotid=%u",
+			__entry->error,
+			show_nfsv4_errors(__entry->error),
+			__entry->session,
+			__entry->slot_nr,
+			__entry->seq_nr,
+			__entry->highest_slotid
+		)
+);
+
 #endif /* CONFIG_NFS_V4_1 */
 
 TRACE_EVENT(nfs4_setup_sequence,


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (14 preceding siblings ...)
  2019-06-11 15:09 ` [PATCH v2 15/19] NFS4: Add a trace event to record invalid CB sequence IDs Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  2019-06-11 19:33   ` Olga Kornievskaia
  2019-06-11 15:09 ` [PATCH v2 17/19] NFS: Display symbolic status code names in trace log Chuck Lever
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

I noticed that NFS status values stopped working again.

trace_print_symbols_seq() takes an unsigned long. Passing a negative
errno or negative NFSERR value just confuses it, and since we're
using C macros here and not static inline functions, all bets are
off due to implicit type casting.

Straight-line the calling conventions so that error codes are stored
in the trace record as positive values in an unsigned long field.

It's often the case that an error value that is positive is a byte
count but when it's negative, it's an error (e.g. nfs4_write). Fix
those cases so that the value that is eventually stored in the
error field is a positive NFS status or errno, or zero.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs4trace.h |  120 ++++++++++++++++++++++++++--------------------------
 1 file changed, 60 insertions(+), 60 deletions(-)

diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index 6beb1f2..9a01731 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -156,7 +156,7 @@
 TRACE_DEFINE_ENUM(NFS4ERR_XDEV);
 
 #define show_nfsv4_errors(error) \
-	__print_symbolic(-(error), \
+	__print_symbolic(error, \
 		{ NFS4_OK, "OK" }, \
 		/* Mapped by nfs4_stat_to_errno() */ \
 		{ EPERM, "EPERM" }, \
@@ -348,7 +348,7 @@
 
 		TP_STRUCT__entry(
 			__string(dstaddr, clp->cl_hostname)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -357,7 +357,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) dstaddr=%s",
+			"error=%lu (%s) dstaddr=%s",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
 			__get_str(dstaddr)
@@ -420,7 +420,7 @@
 			__field(unsigned int, highest_slotid)
 			__field(unsigned int, target_highest_slotid)
 			__field(unsigned int, status_flags)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -435,7 +435,7 @@
 			__entry->error = res->sr_status;
 		),
 		TP_printk(
-			"error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
+			"error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
 			"highest_slotid=%u target_highest_slotid=%u "
 			"status_flags=%u (%s)",
 			__entry->error,
@@ -467,7 +467,7 @@
 			__field(unsigned int, seq_nr)
 			__field(unsigned int, highest_slotid)
 			__field(unsigned int, cachethis)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -476,11 +476,11 @@
 			__entry->seq_nr = args->csa_sequenceid;
 			__entry->highest_slotid = args->csa_highestslotid;
 			__entry->cachethis = args->csa_cachethis;
-			__entry->error = -be32_to_cpu(status);
+			__entry->error = be32_to_cpu(status);
 		),
 
 		TP_printk(
-			"error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
+			"error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
 			"highest_slotid=%u",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -504,7 +504,7 @@
 			__field(unsigned int, seq_nr)
 			__field(unsigned int, highest_slotid)
 			__field(unsigned int, cachethis)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -513,11 +513,11 @@
 			__entry->seq_nr = args->csa_sequenceid;
 			__entry->highest_slotid = args->csa_highestslotid;
 			__entry->cachethis = args->csa_cachethis;
-			__entry->error = -be32_to_cpu(status);
+			__entry->error = be32_to_cpu(status);
 		),
 
 		TP_printk(
-			"error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
+			"error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
 			"highest_slotid=%u",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -572,18 +572,18 @@
 
 		TP_STRUCT__entry(
 			__field(u32, op)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
 			__entry->op = op;
-			__entry->error = -error;
+			__entry->error = error;
 		),
 
 		TP_printk(
-			"operation %d: nfs status %d (%s)",
-			__entry->op,
-			__entry->error, show_nfsv4_errors(__entry->error)
+			"error=%lu (%s) operation %d:",
+			__entry->error, show_nfsv4_errors(__entry->error),
+			__entry->op
 		)
 );
 
@@ -597,7 +597,7 @@
 		TP_ARGS(ctx, flags, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(unsigned int, flags)
 			__field(unsigned int, fmode)
 			__field(dev_t, dev)
@@ -615,7 +615,7 @@
 			const struct nfs4_state *state = ctx->state;
 			const struct inode *inode = NULL;
 
-			__entry->error = error;
+			__entry->error = -error;
 			__entry->flags = flags;
 			__entry->fmode = (__force unsigned int)ctx->mode;
 			__entry->dev = ctx->dentry->d_sb->s_dev;
@@ -647,7 +647,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) flags=%d (%s) fmode=%s "
+			"error=%lu (%s) flags=%d (%s) fmode=%s "
 			"fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"name=%02x:%02x:%llu/%s stateid=%d:0x%08x "
 			"openstateid=%d:0x%08x",
@@ -733,7 +733,7 @@
 			__field(u32, fhandle)
 			__field(u64, fileid)
 			__field(unsigned int, fmode)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, stateid_seq)
 			__field(u32, stateid_hash)
 		),
@@ -753,7 +753,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fmode=%s fileid=%02x:%02x:%llu "
+			"error=%lu (%s) fmode=%s fileid=%02x:%02x:%llu "
 			"fhandle=0x%08x openstateid=%d:0x%08x",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -795,7 +795,7 @@
 		TP_ARGS(request, state, cmd, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, cmd)
 			__field(char, type)
 			__field(loff_t, start)
@@ -825,7 +825,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) cmd=%s:%s range=%lld:%lld "
+			"error=%lu (%s) cmd=%s:%s range=%lld:%lld "
 			"fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"stateid=%d:0x%08x",
 			__entry->error,
@@ -865,7 +865,7 @@
 		TP_ARGS(request, state, lockstateid, cmd, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, cmd)
 			__field(char, type)
 			__field(loff_t, start)
@@ -901,7 +901,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) cmd=%s:%s range=%lld:%lld "
+			"error=%lu (%s) cmd=%s:%s range=%lld:%lld "
 			"fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"stateid=%d:0x%08x lockstateid=%d:0x%08x",
 			__entry->error,
@@ -970,7 +970,7 @@
 		TP_STRUCT__entry(
 			__field(dev_t, dev)
 			__field(u32, fhandle)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, stateid_seq)
 			__field(u32, stateid_hash)
 		),
@@ -986,7 +986,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) dev=%02x:%02x fhandle=0x%08x "
+			"error=%lu (%s) dev=%02x:%02x fhandle=0x%08x "
 			"stateid=%d:0x%08x",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1007,7 +1007,7 @@
 		TP_ARGS(state, lsp, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(dev_t, dev)
 			__field(u32, fhandle)
 			__field(u64, fileid)
@@ -1029,7 +1029,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"stateid=%d:0x%08x",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1064,7 +1064,7 @@
 
 		TP_STRUCT__entry(
 			__field(dev_t, dev)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(u64, dir)
 			__string(name, name->name)
 		),
@@ -1072,12 +1072,12 @@
 		TP_fast_assign(
 			__entry->dev = dir->i_sb->s_dev;
 			__entry->dir = NFS_FILEID(dir);
-			__entry->error = error;
+			__entry->error = -error;
 			__assign_str(name, name->name);
 		),
 
 		TP_printk(
-			"error=%d (%s) name=%02x:%02x:%llu/%s",
+			"error=%lu (%s) name=%02x:%02x:%llu/%s",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -1114,7 +1114,7 @@
 		TP_STRUCT__entry(
 			__field(dev_t, dev)
 			__field(u64, ino)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -1124,7 +1124,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) inode=%02x:%02x:%llu",
+			"error=%lu (%s) inode=%02x:%02x:%llu",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -1145,7 +1145,7 @@
 
 		TP_STRUCT__entry(
 			__field(dev_t, dev)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(u64, olddir)
 			__string(oldname, oldname->name)
 			__field(u64, newdir)
@@ -1162,7 +1162,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) oldname=%02x:%02x:%llu/%s "
+			"error=%lu (%s) oldname=%02x:%02x:%llu/%s "
 			"newname=%02x:%02x:%llu/%s",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1187,18 +1187,18 @@
 			__field(dev_t, dev)
 			__field(u32, fhandle)
 			__field(u64, fileid)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
 			__entry->dev = inode->i_sb->s_dev;
 			__entry->fileid = NFS_FILEID(inode);
 			__entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
-			__entry->error = error;
+			__entry->error = error < 0 ? -error : 0;
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -1238,7 +1238,7 @@
 			__field(dev_t, dev)
 			__field(u32, fhandle)
 			__field(u64, fileid)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, stateid_seq)
 			__field(u32, stateid_hash)
 		),
@@ -1255,7 +1255,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"stateid=%d:0x%08x",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1295,7 +1295,7 @@
 			__field(u32, fhandle)
 			__field(u64, fileid)
 			__field(unsigned int, valid)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -1307,7 +1307,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"valid=%s",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1342,7 +1342,7 @@
 		TP_ARGS(clp, fhandle, inode, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(dev_t, dev)
 			__field(u32, fhandle)
 			__field(u64, fileid)
@@ -1363,7 +1363,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"dstaddr=%s",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1397,7 +1397,7 @@
 		TP_ARGS(clp, fhandle, inode, stateid, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(dev_t, dev)
 			__field(u32, fhandle)
 			__field(u64, fileid)
@@ -1424,7 +1424,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"stateid=%d:0x%08x dstaddr=%s",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1460,7 +1460,7 @@
 		TP_ARGS(name, len, id, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(u32, id)
 			__dynamic_array(char, name, len > 0 ? len + 1 : 1)
 		),
@@ -1475,8 +1475,8 @@
 		),
 
 		TP_printk(
-			"error=%d id=%u name=%s",
-			__entry->error,
+			"error=%lu (%s) id=%u name=%s",
+			__entry->error, show_nfsv4_errors(__entry->error),
 			__entry->id,
 			__get_str(name)
 		)
@@ -1509,7 +1509,7 @@
 			__field(u64, fileid)
 			__field(loff_t, offset)
 			__field(size_t, count)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, stateid_seq)
 			__field(u32, stateid_hash)
 		),
@@ -1523,7 +1523,7 @@
 			__entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
 			__entry->offset = hdr->args.offset;
 			__entry->count = hdr->args.count;
-			__entry->error = error;
+			__entry->error = error < 0 ? -error : 0;
 			__entry->stateid_seq =
 				be32_to_cpu(state->stateid.seqid);
 			__entry->stateid_hash =
@@ -1531,7 +1531,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"offset=%lld count=%zu stateid=%d:0x%08x",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1569,7 +1569,7 @@
 			__field(u64, fileid)
 			__field(loff_t, offset)
 			__field(size_t, count)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, stateid_seq)
 			__field(u32, stateid_hash)
 		),
@@ -1583,7 +1583,7 @@
 			__entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
 			__entry->offset = hdr->args.offset;
 			__entry->count = hdr->args.count;
-			__entry->error = error;
+			__entry->error = error < 0 ? -error : 0;
 			__entry->stateid_seq =
 				be32_to_cpu(state->stateid.seqid);
 			__entry->stateid_hash =
@@ -1591,7 +1591,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"offset=%lld count=%zu stateid=%d:0x%08x",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1630,7 +1630,7 @@
 			__field(u64, fileid)
 			__field(loff_t, offset)
 			__field(size_t, count)
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -1644,7 +1644,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"offset=%lld count=%zu",
 			__entry->error,
 			show_nfsv4_errors(__entry->error),
@@ -1694,7 +1694,7 @@
 			__field(u32, iomode)
 			__field(u64, offset)
 			__field(u64, count)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(int, stateid_seq)
 			__field(u32, stateid_hash)
 			__field(int, layoutstateid_seq)
@@ -1727,7 +1727,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"iomode=%s offset=%llu count=%llu stateid=%d:0x%08x "
 			"layoutstateid=%d:0x%08x",
 			__entry->error,


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 17/19] NFS: Display symbolic status code names in trace log
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (15 preceding siblings ...)
  2019-06-11 15:09 ` [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 18/19] NFS: Update symbolic flags displayed by trace events Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 19/19] NFS: Record task, client ID, and XID in xdr_status trace points Chuck Lever
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

For improved readability, add nfs_show_status() call-sites in the
generic NFS trace points so that the symbolic status code name is
displayed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfstrace.h |   74 +++++++++++++++++++++++++++++------------------------
 1 file changed, 40 insertions(+), 34 deletions(-)

diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index a0d6910..adc30b0 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -83,7 +83,7 @@
 		TP_ARGS(inode, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(dev_t, dev)
 			__field(u32, fhandle)
 			__field(unsigned char, type)
@@ -96,7 +96,7 @@
 
 		TP_fast_assign(
 			const struct nfs_inode *nfsi = NFS_I(inode);
-			__entry->error = error;
+			__entry->error = error < 0 ? -error : 0;
 			__entry->dev = inode->i_sb->s_dev;
 			__entry->fileid = nfsi->fileid;
 			__entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
@@ -108,10 +108,10 @@
 		),
 
 		TP_printk(
-			"error=%d fileid=%02x:%02x:%llu fhandle=0x%08x "
+			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"type=%u (%s) version=%llu size=%lld "
 			"cache_validity=%lu (%s) nfs_flags=%ld (%s)",
-			__entry->error,
+			__entry->error, nfs_show_status(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
 			(unsigned long long)__entry->fileid,
 			__entry->fhandle,
@@ -219,7 +219,7 @@
 		TP_ARGS(dir, dentry, flags, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(unsigned int, flags)
 			__field(dev_t, dev)
 			__field(u64, dir)
@@ -229,14 +229,14 @@
 		TP_fast_assign(
 			__entry->dev = dir->i_sb->s_dev;
 			__entry->dir = NFS_FILEID(dir);
-			__entry->error = error;
+			__entry->error = error < 0 ? -error : 0;
 			__entry->flags = flags;
 			__assign_str(name, dentry->d_name.name);
 		),
 
 		TP_printk(
-			"error=%d flags=%u (%s) name=%02x:%02x:%llu/%s",
-			__entry->error,
+			"error=%lu (%s) flags=%u (%s) name=%02x:%02x:%llu/%s",
+			__entry->error, nfs_show_status(__entry->error),
 			__entry->flags,
 			show_lookup_flags(__entry->flags),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -323,7 +323,7 @@
 		TP_ARGS(dir, ctx, flags, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(unsigned int, flags)
 			__field(unsigned int, fmode)
 			__field(dev_t, dev)
@@ -332,7 +332,7 @@
 		),
 
 		TP_fast_assign(
-			__entry->error = error;
+			__entry->error = -error;
 			__entry->dev = dir->i_sb->s_dev;
 			__entry->dir = NFS_FILEID(dir);
 			__entry->flags = flags;
@@ -341,9 +341,9 @@
 		),
 
 		TP_printk(
-			"error=%d flags=%u (%s) fmode=%s "
+			"error=%lu (%s) flags=%u (%s) fmode=%s "
 			"name=%02x:%02x:%llu/%s",
-			__entry->error,
+			__entry->error, nfs_show_status(__entry->error),
 			__entry->flags,
 			show_open_flags(__entry->flags),
 			show_fmode_flags(__entry->fmode),
@@ -397,7 +397,7 @@
 		TP_ARGS(dir, dentry, flags, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(unsigned int, flags)
 			__field(dev_t, dev)
 			__field(u64, dir)
@@ -405,7 +405,7 @@
 		),
 
 		TP_fast_assign(
-			__entry->error = error;
+			__entry->error = -error;
 			__entry->dev = dir->i_sb->s_dev;
 			__entry->dir = NFS_FILEID(dir);
 			__entry->flags = flags;
@@ -413,8 +413,8 @@
 		),
 
 		TP_printk(
-			"error=%d flags=%u (%s) name=%02x:%02x:%llu/%s",
-			__entry->error,
+			"error=%lu (%s) flags=%u (%s) name=%02x:%02x:%llu/%s",
+			__entry->error, nfs_show_status(__entry->error),
 			__entry->flags,
 			show_open_flags(__entry->flags),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -469,7 +469,7 @@
 		TP_ARGS(dir, dentry, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(dev_t, dev)
 			__field(u64, dir)
 			__string(name, dentry->d_name.name)
@@ -478,13 +478,13 @@
 		TP_fast_assign(
 			__entry->dev = dir->i_sb->s_dev;
 			__entry->dir = NFS_FILEID(dir);
-			__entry->error = error;
+			__entry->error = error < 0 ? -error : 0;
 			__assign_str(name, dentry->d_name.name);
 		),
 
 		TP_printk(
-			"error=%d name=%02x:%02x:%llu/%s",
-			__entry->error,
+			"error=%lu (%s) name=%02x:%02x:%llu/%s",
+			__entry->error, nfs_show_status(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
 			(unsigned long long)__entry->dir,
 			__get_str(name)
@@ -557,7 +557,7 @@
 		TP_ARGS(inode, dir, dentry, error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(dev_t, dev)
 			__field(u64, fileid)
 			__field(u64, dir)
@@ -568,13 +568,13 @@
 			__entry->dev = inode->i_sb->s_dev;
 			__entry->fileid = NFS_FILEID(inode);
 			__entry->dir = NFS_FILEID(dir);
-			__entry->error = error;
+			__entry->error = error < 0 ? -error : 0;
 			__assign_str(name, dentry->d_name.name);
 		),
 
 		TP_printk(
-			"error=%d fileid=%02x:%02x:%llu name=%02x:%02x:%llu/%s",
-			__entry->error,
+			"error=%lu (%s) fileid=%02x:%02x:%llu name=%02x:%02x:%llu/%s",
+			__entry->error, nfs_show_status(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
 			__entry->fileid,
 			MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -642,7 +642,7 @@
 
 		TP_STRUCT__entry(
 			__field(dev_t, dev)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(u64, old_dir)
 			__string(old_name, old_dentry->d_name.name)
 			__field(u64, new_dir)
@@ -651,17 +651,17 @@
 
 		TP_fast_assign(
 			__entry->dev = old_dir->i_sb->s_dev;
+			__entry->error = -error;
 			__entry->old_dir = NFS_FILEID(old_dir);
 			__entry->new_dir = NFS_FILEID(new_dir);
-			__entry->error = error;
 			__assign_str(old_name, old_dentry->d_name.name);
 			__assign_str(new_name, new_dentry->d_name.name);
 		),
 
 		TP_printk(
-			"error=%d old_name=%02x:%02x:%llu/%s "
+			"error=%lu (%s) old_name=%02x:%02x:%llu/%s "
 			"new_name=%02x:%02x:%llu/%s",
-			__entry->error,
+			__entry->error, nfs_show_status(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
 			(unsigned long long)__entry->old_dir,
 			__get_str(old_name),
@@ -697,7 +697,7 @@
 
 		TP_STRUCT__entry(
 			__field(dev_t, dev)
-			__field(int, error)
+			__field(unsigned long, error)
 			__field(u64, dir)
 			__dynamic_array(char, name, data->args.name.len + 1)
 		),
@@ -707,15 +707,15 @@
 			size_t len = data->args.name.len;
 			__entry->dev = dir->i_sb->s_dev;
 			__entry->dir = NFS_FILEID(dir);
-			__entry->error = error;
+			__entry->error = -error;
 			memcpy(__get_str(name),
 				data->args.name.name, len);
 			__get_str(name)[len] = 0;
 		),
 
 		TP_printk(
-			"error=%d name=%02x:%02x:%llu/%s",
-			__entry->error,
+			"error=%lu (%s) name=%02x:%02x:%llu/%s",
+			__entry->error, nfs_show_status(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
 			(unsigned long long)__entry->dir,
 			__get_str(name)
@@ -974,6 +974,8 @@
 TRACE_DEFINE_ENUM(NFSERR_NOENT);
 TRACE_DEFINE_ENUM(NFSERR_IO);
 TRACE_DEFINE_ENUM(NFSERR_NXIO);
+TRACE_DEFINE_ENUM(ECHILD);
+TRACE_DEFINE_ENUM(NFSERR_EAGAIN);
 TRACE_DEFINE_ENUM(NFSERR_ACCES);
 TRACE_DEFINE_ENUM(NFSERR_EXIST);
 TRACE_DEFINE_ENUM(NFSERR_XDEV);
@@ -985,6 +987,7 @@
 TRACE_DEFINE_ENUM(NFSERR_NOSPC);
 TRACE_DEFINE_ENUM(NFSERR_ROFS);
 TRACE_DEFINE_ENUM(NFSERR_MLINK);
+TRACE_DEFINE_ENUM(NFSERR_OPNOTSUPP);
 TRACE_DEFINE_ENUM(NFSERR_NAMETOOLONG);
 TRACE_DEFINE_ENUM(NFSERR_NOTEMPTY);
 TRACE_DEFINE_ENUM(NFSERR_DQUOT);
@@ -1007,6 +1010,8 @@
 			{ NFSERR_NOENT, "NOENT" }, \
 			{ NFSERR_IO, "IO" }, \
 			{ NFSERR_NXIO, "NXIO" }, \
+			{ ECHILD, "CHILD" }, \
+			{ NFSERR_EAGAIN, "AGAIN" }, \
 			{ NFSERR_ACCES, "ACCES" }, \
 			{ NFSERR_EXIST, "EXIST" }, \
 			{ NFSERR_XDEV, "XDEV" }, \
@@ -1018,6 +1023,7 @@
 			{ NFSERR_NOSPC, "NOSPC" }, \
 			{ NFSERR_ROFS, "ROFS" }, \
 			{ NFSERR_MLINK, "MLINK" }, \
+			{ NFSERR_OPNOTSUPP, "OPNOTSUPP" }, \
 			{ NFSERR_NAMETOOLONG, "NAMETOOLONG" }, \
 			{ NFSERR_NOTEMPTY, "NOTEMPTY" }, \
 			{ NFSERR_DQUOT, "DQUOT" }, \
@@ -1041,7 +1047,7 @@
 		TP_ARGS(error),
 
 		TP_STRUCT__entry(
-			__field(int, error)
+			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
@@ -1049,7 +1055,7 @@
 		),
 
 		TP_printk(
-			"error=%d (%s)",
+			"error=%lu (%s)",
 			__entry->error, nfs_show_status(__entry->error)
 		)
 );


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 18/19] NFS: Update symbolic flags displayed by trace events
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (16 preceding siblings ...)
  2019-06-11 15:09 ` [PATCH v2 17/19] NFS: Display symbolic status code names in trace log Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  2019-06-11 15:09 ` [PATCH v2 19/19] NFS: Record task, client ID, and XID in xdr_status trace points Chuck Lever
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfstrace.h |   98 +++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 84 insertions(+), 14 deletions(-)

diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index adc30b0..3a0ed3d 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -11,6 +11,16 @@
 #include <linux/tracepoint.h>
 #include <linux/iversion.h>
 
+TRACE_DEFINE_ENUM(DT_UNKNOWN);
+TRACE_DEFINE_ENUM(DT_FIFO);
+TRACE_DEFINE_ENUM(DT_CHR);
+TRACE_DEFINE_ENUM(DT_DIR);
+TRACE_DEFINE_ENUM(DT_BLK);
+TRACE_DEFINE_ENUM(DT_REG);
+TRACE_DEFINE_ENUM(DT_LNK);
+TRACE_DEFINE_ENUM(DT_SOCK);
+TRACE_DEFINE_ENUM(DT_WHT);
+
 #define nfs_show_file_type(ftype) \
 	__print_symbolic(ftype, \
 			{ DT_UNKNOWN, "UNKNOWN" }, \
@@ -23,25 +33,57 @@
 			{ DT_SOCK, "SOCK" }, \
 			{ DT_WHT, "WHT" })
 
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_DATA);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_ATIME);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_ACCESS);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_ACL);
+TRACE_DEFINE_ENUM(NFS_INO_REVAL_PAGECACHE);
+TRACE_DEFINE_ENUM(NFS_INO_REVAL_FORCED);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_LABEL);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_CHANGE);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_CTIME);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_MTIME);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_SIZE);
+TRACE_DEFINE_ENUM(NFS_INO_INVALID_OTHER);
+
 #define nfs_show_cache_validity(v) \
 	__print_flags(v, "|", \
-			{ NFS_INO_INVALID_ATTR, "INVALID_ATTR" }, \
 			{ NFS_INO_INVALID_DATA, "INVALID_DATA" }, \
 			{ NFS_INO_INVALID_ATIME, "INVALID_ATIME" }, \
 			{ NFS_INO_INVALID_ACCESS, "INVALID_ACCESS" }, \
 			{ NFS_INO_INVALID_ACL, "INVALID_ACL" }, \
 			{ NFS_INO_REVAL_PAGECACHE, "REVAL_PAGECACHE" }, \
 			{ NFS_INO_REVAL_FORCED, "REVAL_FORCED" }, \
-			{ NFS_INO_INVALID_LABEL, "INVALID_LABEL" })
+			{ NFS_INO_INVALID_LABEL, "INVALID_LABEL" }, \
+			{ NFS_INO_INVALID_CHANGE, "INVALID_CHANGE" }, \
+			{ NFS_INO_INVALID_CTIME, "INVALID_CTIME" }, \
+			{ NFS_INO_INVALID_MTIME, "INVALID_MTIME" }, \
+			{ NFS_INO_INVALID_SIZE, "INVALID_SIZE" }, \
+			{ NFS_INO_INVALID_OTHER, "INVALID_OTHER" })
+
+TRACE_DEFINE_ENUM(NFS_INO_ADVISE_RDPLUS);
+TRACE_DEFINE_ENUM(NFS_INO_STALE);
+TRACE_DEFINE_ENUM(NFS_INO_ACL_LRU_SET);
+TRACE_DEFINE_ENUM(NFS_INO_INVALIDATING);
+TRACE_DEFINE_ENUM(NFS_INO_FSCACHE);
+TRACE_DEFINE_ENUM(NFS_INO_FSCACHE_LOCK);
+TRACE_DEFINE_ENUM(NFS_INO_LAYOUTCOMMIT);
+TRACE_DEFINE_ENUM(NFS_INO_LAYOUTCOMMITTING);
+TRACE_DEFINE_ENUM(NFS_INO_LAYOUTSTATS);
+TRACE_DEFINE_ENUM(NFS_INO_ODIRECT);
 
 #define nfs_show_nfsi_flags(v) \
 	__print_flags(v, "|", \
 			{ 1 << NFS_INO_ADVISE_RDPLUS, "ADVISE_RDPLUS" }, \
 			{ 1 << NFS_INO_STALE, "STALE" }, \
+			{ 1 << NFS_INO_ACL_LRU_SET, "ACL_LRU_SET" }, \
 			{ 1 << NFS_INO_INVALIDATING, "INVALIDATING" }, \
 			{ 1 << NFS_INO_FSCACHE, "FSCACHE" }, \
+			{ 1 << NFS_INO_FSCACHE_LOCK, "FSCACHE_LOCK" }, \
 			{ 1 << NFS_INO_LAYOUTCOMMIT, "NEED_LAYOUTCOMMIT" }, \
-			{ 1 << NFS_INO_LAYOUTCOMMITTING, "LAYOUTCOMMIT" })
+			{ 1 << NFS_INO_LAYOUTCOMMITTING, "LAYOUTCOMMIT" }, \
+			{ 1 << NFS_INO_LAYOUTSTATS, "LAYOUTSTATS" }, \
+			{ 1 << NFS_INO_ODIRECT, "ODIRECT" })
 
 DECLARE_EVENT_CLASS(nfs_inode_event,
 		TP_PROTO(
@@ -110,7 +152,7 @@
 		TP_printk(
 			"error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
 			"type=%u (%s) version=%llu size=%lld "
-			"cache_validity=%lu (%s) nfs_flags=%ld (%s)",
+			"cache_validity=0x%lx (%s) nfs_flags=0x%lx (%s)",
 			__entry->error, nfs_show_status(__entry->error),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
 			(unsigned long long)__entry->fileid,
@@ -260,15 +302,43 @@
 DEFINE_NFS_LOOKUP_EVENT(nfs_lookup_revalidate_enter);
 DEFINE_NFS_LOOKUP_EVENT_DONE(nfs_lookup_revalidate_exit);
 
+TRACE_DEFINE_ENUM(O_WRONLY);
+TRACE_DEFINE_ENUM(O_RDWR);
+TRACE_DEFINE_ENUM(O_CREAT);
+TRACE_DEFINE_ENUM(O_EXCL);
+TRACE_DEFINE_ENUM(O_NOCTTY);
+TRACE_DEFINE_ENUM(O_TRUNC);
+TRACE_DEFINE_ENUM(O_APPEND);
+TRACE_DEFINE_ENUM(O_NONBLOCK);
+TRACE_DEFINE_ENUM(O_DSYNC);
+TRACE_DEFINE_ENUM(O_DIRECT);
+TRACE_DEFINE_ENUM(O_LARGEFILE);
+TRACE_DEFINE_ENUM(O_DIRECTORY);
+TRACE_DEFINE_ENUM(O_NOFOLLOW);
+TRACE_DEFINE_ENUM(O_NOATIME);
+TRACE_DEFINE_ENUM(O_CLOEXEC);
+
 #define show_open_flags(flags) \
-	__print_flags((unsigned long)flags, "|", \
+	__print_flags(flags, "|", \
+		{ O_WRONLY, "O_WRONLY" }, \
+		{ O_RDWR, "O_RDWR" }, \
 		{ O_CREAT, "O_CREAT" }, \
 		{ O_EXCL, "O_EXCL" }, \
+		{ O_NOCTTY, "O_NOCTTY" }, \
 		{ O_TRUNC, "O_TRUNC" }, \
 		{ O_APPEND, "O_APPEND" }, \
+		{ O_NONBLOCK, "O_NONBLOCK" }, \
 		{ O_DSYNC, "O_DSYNC" }, \
 		{ O_DIRECT, "O_DIRECT" }, \
-		{ O_DIRECTORY, "O_DIRECTORY" })
+		{ O_LARGEFILE, "O_LARGEFILE" }, \
+		{ O_DIRECTORY, "O_DIRECTORY" }, \
+		{ O_NOFOLLOW, "O_NOFOLLOW" }, \
+		{ O_NOATIME, "O_NOATIME" }, \
+		{ O_CLOEXEC, "O_CLOEXEC" })
+
+TRACE_DEFINE_ENUM(FMODE_READ);
+TRACE_DEFINE_ENUM(FMODE_WRITE);
+TRACE_DEFINE_ENUM(FMODE_EXEC);
 
 #define show_fmode_flags(mode) \
 	__print_flags(mode, "|", \
@@ -286,7 +356,7 @@
 		TP_ARGS(dir, ctx, flags),
 
 		TP_STRUCT__entry(
-			__field(unsigned int, flags)
+			__field(unsigned long, flags)
 			__field(unsigned int, fmode)
 			__field(dev_t, dev)
 			__field(u64, dir)
@@ -302,7 +372,7 @@
 		),
 
 		TP_printk(
-			"flags=%u (%s) fmode=%s name=%02x:%02x:%llu/%s",
+			"flags=%lu (%s) fmode=%s name=%02x:%02x:%llu/%s",
 			__entry->flags,
 			show_open_flags(__entry->flags),
 			show_fmode_flags(__entry->fmode),
@@ -324,7 +394,7 @@
 
 		TP_STRUCT__entry(
 			__field(unsigned long, error)
-			__field(unsigned int, flags)
+			__field(unsigned long, flags)
 			__field(unsigned int, fmode)
 			__field(dev_t, dev)
 			__field(u64, dir)
@@ -341,7 +411,7 @@
 		),
 
 		TP_printk(
-			"error=%lu (%s) flags=%u (%s) fmode=%s "
+			"error=%lu (%s) flags=%lu (%s) fmode=%s "
 			"name=%02x:%02x:%llu/%s",
 			__entry->error, nfs_show_status(__entry->error),
 			__entry->flags,
@@ -363,7 +433,7 @@
 		TP_ARGS(dir, dentry, flags),
 
 		TP_STRUCT__entry(
-			__field(unsigned int, flags)
+			__field(unsigned long, flags)
 			__field(dev_t, dev)
 			__field(u64, dir)
 			__string(name, dentry->d_name.name)
@@ -377,7 +447,7 @@
 		),
 
 		TP_printk(
-			"flags=%u (%s) name=%02x:%02x:%llu/%s",
+			"flags=%lu (%s) name=%02x:%02x:%llu/%s",
 			__entry->flags,
 			show_open_flags(__entry->flags),
 			MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -398,7 +468,7 @@
 
 		TP_STRUCT__entry(
 			__field(unsigned long, error)
-			__field(unsigned int, flags)
+			__field(unsigned long, flags)
 			__field(dev_t, dev)
 			__field(u64, dir)
 			__string(name, dentry->d_name.name)
@@ -413,7 +483,7 @@
 		),
 
 		TP_printk(
-			"error=%lu (%s) flags=%u (%s) name=%02x:%02x:%llu/%s",
+			"error=%lu (%s) flags=%lu (%s) name=%02x:%02x:%llu/%s",
 			__entry->error, nfs_show_status(__entry->error),
 			__entry->flags,
 			show_open_flags(__entry->flags),


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 19/19] NFS: Record task, client ID, and XID in xdr_status trace points
  2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
                   ` (17 preceding siblings ...)
  2019-06-11 15:09 ` [PATCH v2 18/19] NFS: Update symbolic flags displayed by trace events Chuck Lever
@ 2019-06-11 15:09 ` Chuck Lever
  18 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 15:09 UTC (permalink / raw)
  To: linux-rdma, linux-nfs

When triggering an nfs_xdr_status trace point, record the task ID
and XID of the failing RPC to better pinpoint the problem.

This feels like a bit of a layering violation.

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs2xdr.c   |    2 +-
 fs/nfs/nfs3xdr.c   |    2 +-
 fs/nfs/nfs4trace.h |   15 +++++++++++++--
 fs/nfs/nfs4xdr.c   |    2 +-
 fs/nfs/nfstrace.h  |   15 +++++++++++++--
 5 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
index 572794d..cbc17a2 100644
--- a/fs/nfs/nfs2xdr.c
+++ b/fs/nfs/nfs2xdr.c
@@ -151,7 +151,7 @@ static int decode_stat(struct xdr_stream *xdr, enum nfs_stat *status)
 	return 0;
 out_status:
 	*status = be32_to_cpup(p);
-	trace_nfs_xdr_status((int)*status);
+	trace_nfs_xdr_status(xdr, (int)*status);
 	return 0;
 }
 
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index abbbdde..6027678 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -343,7 +343,7 @@ static int decode_nfsstat3(struct xdr_stream *xdr, enum nfs_stat *status)
 	return 0;
 out_status:
 	*status = be32_to_cpup(p);
-	trace_nfs_xdr_status((int)*status);
+	trace_nfs_xdr_status(xdr, (int)*status);
 	return 0;
 }
 
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index 9a01731..6cba216 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -564,24 +564,35 @@
 
 TRACE_EVENT(nfs4_xdr_status,
 		TP_PROTO(
+			const struct xdr_stream *xdr,
 			u32 op,
 			int error
 		),
 
-		TP_ARGS(op, error),
+		TP_ARGS(xdr, op, error),
 
 		TP_STRUCT__entry(
+			__field(unsigned int, task_id)
+			__field(unsigned int, client_id)
+			__field(u32, xid)
 			__field(u32, op)
 			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
+			const struct rpc_rqst *rqstp = xdr->rqst;
+			const struct rpc_task *task = rqstp->rq_task;
+
+			__entry->task_id = task->tk_pid;
+			__entry->client_id = task->tk_client->cl_clid;
+			__entry->xid = be32_to_cpu(rqstp->rq_xid);
 			__entry->op = op;
 			__entry->error = error;
 		),
 
 		TP_printk(
-			"error=%lu (%s) operation %d:",
+			"task:%u@%d xid=0x%08x error=%lu (%s) operation=%d",
+			__entry->task_id, __entry->client_id, __entry->xid,
 			__entry->error, show_nfsv4_errors(__entry->error),
 			__entry->op
 		)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 6024461..d974ff3 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -3187,7 +3187,7 @@ static bool __decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected,
 	return true;
 out_status:
 	nfserr = be32_to_cpup(p);
-	trace_nfs4_xdr_status(opnum, nfserr);
+	trace_nfs4_xdr_status(xdr, opnum, nfserr);
 	*nfs_retval = nfs4_stat_to_errno(nfserr);
 	return true;
 out_bad_operation:
diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index 3a0ed3d..84a79ba 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -1111,21 +1111,32 @@
 
 TRACE_EVENT(nfs_xdr_status,
 		TP_PROTO(
+			const struct xdr_stream *xdr,
 			int error
 		),
 
-		TP_ARGS(error),
+		TP_ARGS(xdr, error),
 
 		TP_STRUCT__entry(
+			__field(unsigned int, task_id)
+			__field(unsigned int, client_id)
+			__field(u32, xid)
 			__field(unsigned long, error)
 		),
 
 		TP_fast_assign(
+			const struct rpc_rqst *rqstp = xdr->rqst;
+			const struct rpc_task *task = rqstp->rq_task;
+
+			__entry->task_id = task->tk_pid;
+			__entry->client_id = task->tk_client->cl_clid;
+			__entry->xid = be32_to_cpu(rqstp->rq_xid);
 			__entry->error = error;
 		),
 
 		TP_printk(
-			"error=%lu (%s)",
+			"task:%u@%d xid=0x%08x error=%lu (%s)",
+			__entry->task_id, __entry->client_id, __entry->xid,
 			__entry->error, nfs_show_status(__entry->error)
 		)
 );


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again
  2019-06-11 15:09 ` [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again Chuck Lever
@ 2019-06-11 19:33   ` Olga Kornievskaia
  2019-06-11 19:37     ` Chuck Lever
  0 siblings, 1 reply; 24+ messages in thread
From: Olga Kornievskaia @ 2019-06-11 19:33 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, linux-nfs

On Tue, Jun 11, 2019 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> I noticed that NFS status values stopped working again.
>
> trace_print_symbols_seq() takes an unsigned long. Passing a negative
> errno or negative NFSERR value just confuses it, and since we're
> using C macros here and not static inline functions, all bets are
> off due to implicit type casting.
>
> Straight-line the calling conventions so that error codes are stored
> in the trace record as positive values in an unsigned long field.
>
> It's often the case that an error value that is positive is a byte
> count but when it's negative, it's an error (e.g. nfs4_write). Fix
> those cases so that the value that is eventually stored in the
> error field is a positive NFS status or errno, or zero.
>

Hi Chuck,

To clarify, so on error case, we no longer going be seeing a negative
value so error=-5 (EIO) would be error=5 (EIO)? I have always relied
on searching for "error=-" thru the trace_pipe log for errors. Do we
really need to change that?

> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfs/nfs4trace.h |  120 ++++++++++++++++++++++++++--------------------------
>  1 file changed, 60 insertions(+), 60 deletions(-)
>
> diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
> index 6beb1f2..9a01731 100644
> --- a/fs/nfs/nfs4trace.h
> +++ b/fs/nfs/nfs4trace.h
> @@ -156,7 +156,7 @@
>  TRACE_DEFINE_ENUM(NFS4ERR_XDEV);
>
>  #define show_nfsv4_errors(error) \
> -       __print_symbolic(-(error), \
> +       __print_symbolic(error, \
>                 { NFS4_OK, "OK" }, \
>                 /* Mapped by nfs4_stat_to_errno() */ \
>                 { EPERM, "EPERM" }, \
> @@ -348,7 +348,7 @@
>
>                 TP_STRUCT__entry(
>                         __string(dstaddr, clp->cl_hostname)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
> @@ -357,7 +357,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) dstaddr=%s",
> +                       "error=%lu (%s) dstaddr=%s",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
>                         __get_str(dstaddr)
> @@ -420,7 +420,7 @@
>                         __field(unsigned int, highest_slotid)
>                         __field(unsigned int, target_highest_slotid)
>                         __field(unsigned int, status_flags)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
> @@ -435,7 +435,7 @@
>                         __entry->error = res->sr_status;
>                 ),
>                 TP_printk(
> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>                         "highest_slotid=%u target_highest_slotid=%u "
>                         "status_flags=%u (%s)",
>                         __entry->error,
> @@ -467,7 +467,7 @@
>                         __field(unsigned int, seq_nr)
>                         __field(unsigned int, highest_slotid)
>                         __field(unsigned int, cachethis)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
> @@ -476,11 +476,11 @@
>                         __entry->seq_nr = args->csa_sequenceid;
>                         __entry->highest_slotid = args->csa_highestslotid;
>                         __entry->cachethis = args->csa_cachethis;
> -                       __entry->error = -be32_to_cpu(status);
> +                       __entry->error = be32_to_cpu(status);
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>                         "highest_slotid=%u",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -504,7 +504,7 @@
>                         __field(unsigned int, seq_nr)
>                         __field(unsigned int, highest_slotid)
>                         __field(unsigned int, cachethis)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
> @@ -513,11 +513,11 @@
>                         __entry->seq_nr = args->csa_sequenceid;
>                         __entry->highest_slotid = args->csa_highestslotid;
>                         __entry->cachethis = args->csa_cachethis;
> -                       __entry->error = -be32_to_cpu(status);
> +                       __entry->error = be32_to_cpu(status);
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>                         "highest_slotid=%u",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -572,18 +572,18 @@
>
>                 TP_STRUCT__entry(
>                         __field(u32, op)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
>                         __entry->op = op;
> -                       __entry->error = -error;
> +                       __entry->error = error;
>                 ),
>
>                 TP_printk(
> -                       "operation %d: nfs status %d (%s)",
> -                       __entry->op,
> -                       __entry->error, show_nfsv4_errors(__entry->error)
> +                       "error=%lu (%s) operation %d:",
> +                       __entry->error, show_nfsv4_errors(__entry->error),
> +                       __entry->op
>                 )
>  );
>
> @@ -597,7 +597,7 @@
>                 TP_ARGS(ctx, flags, error),
>
>                 TP_STRUCT__entry(
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(unsigned int, flags)
>                         __field(unsigned int, fmode)
>                         __field(dev_t, dev)
> @@ -615,7 +615,7 @@
>                         const struct nfs4_state *state = ctx->state;
>                         const struct inode *inode = NULL;
>
> -                       __entry->error = error;
> +                       __entry->error = -error;
>                         __entry->flags = flags;
>                         __entry->fmode = (__force unsigned int)ctx->mode;
>                         __entry->dev = ctx->dentry->d_sb->s_dev;
> @@ -647,7 +647,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) flags=%d (%s) fmode=%s "
> +                       "error=%lu (%s) flags=%d (%s) fmode=%s "
>                         "fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "name=%02x:%02x:%llu/%s stateid=%d:0x%08x "
>                         "openstateid=%d:0x%08x",
> @@ -733,7 +733,7 @@
>                         __field(u32, fhandle)
>                         __field(u64, fileid)
>                         __field(unsigned int, fmode)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, stateid_seq)
>                         __field(u32, stateid_hash)
>                 ),
> @@ -753,7 +753,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fmode=%s fileid=%02x:%02x:%llu "
> +                       "error=%lu (%s) fmode=%s fileid=%02x:%02x:%llu "
>                         "fhandle=0x%08x openstateid=%d:0x%08x",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -795,7 +795,7 @@
>                 TP_ARGS(request, state, cmd, error),
>
>                 TP_STRUCT__entry(
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, cmd)
>                         __field(char, type)
>                         __field(loff_t, start)
> @@ -825,7 +825,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
>                         "fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "stateid=%d:0x%08x",
>                         __entry->error,
> @@ -865,7 +865,7 @@
>                 TP_ARGS(request, state, lockstateid, cmd, error),
>
>                 TP_STRUCT__entry(
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, cmd)
>                         __field(char, type)
>                         __field(loff_t, start)
> @@ -901,7 +901,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
>                         "fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "stateid=%d:0x%08x lockstateid=%d:0x%08x",
>                         __entry->error,
> @@ -970,7 +970,7 @@
>                 TP_STRUCT__entry(
>                         __field(dev_t, dev)
>                         __field(u32, fhandle)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, stateid_seq)
>                         __field(u32, stateid_hash)
>                 ),
> @@ -986,7 +986,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) dev=%02x:%02x fhandle=0x%08x "
> +                       "error=%lu (%s) dev=%02x:%02x fhandle=0x%08x "
>                         "stateid=%d:0x%08x",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1007,7 +1007,7 @@
>                 TP_ARGS(state, lsp, error),
>
>                 TP_STRUCT__entry(
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(dev_t, dev)
>                         __field(u32, fhandle)
>                         __field(u64, fileid)
> @@ -1029,7 +1029,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "stateid=%d:0x%08x",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1064,7 +1064,7 @@
>
>                 TP_STRUCT__entry(
>                         __field(dev_t, dev)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(u64, dir)
>                         __string(name, name->name)
>                 ),
> @@ -1072,12 +1072,12 @@
>                 TP_fast_assign(
>                         __entry->dev = dir->i_sb->s_dev;
>                         __entry->dir = NFS_FILEID(dir);
> -                       __entry->error = error;
> +                       __entry->error = -error;
>                         __assign_str(name, name->name);
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) name=%02x:%02x:%llu/%s",
> +                       "error=%lu (%s) name=%02x:%02x:%llu/%s",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
>                         MAJOR(__entry->dev), MINOR(__entry->dev),
> @@ -1114,7 +1114,7 @@
>                 TP_STRUCT__entry(
>                         __field(dev_t, dev)
>                         __field(u64, ino)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
> @@ -1124,7 +1124,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) inode=%02x:%02x:%llu",
> +                       "error=%lu (%s) inode=%02x:%02x:%llu",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
>                         MAJOR(__entry->dev), MINOR(__entry->dev),
> @@ -1145,7 +1145,7 @@
>
>                 TP_STRUCT__entry(
>                         __field(dev_t, dev)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(u64, olddir)
>                         __string(oldname, oldname->name)
>                         __field(u64, newdir)
> @@ -1162,7 +1162,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) oldname=%02x:%02x:%llu/%s "
> +                       "error=%lu (%s) oldname=%02x:%02x:%llu/%s "
>                         "newname=%02x:%02x:%llu/%s",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1187,18 +1187,18 @@
>                         __field(dev_t, dev)
>                         __field(u32, fhandle)
>                         __field(u64, fileid)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
>                         __entry->dev = inode->i_sb->s_dev;
>                         __entry->fileid = NFS_FILEID(inode);
>                         __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
> -                       __entry->error = error;
> +                       __entry->error = error < 0 ? -error : 0;
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
>                         MAJOR(__entry->dev), MINOR(__entry->dev),
> @@ -1238,7 +1238,7 @@
>                         __field(dev_t, dev)
>                         __field(u32, fhandle)
>                         __field(u64, fileid)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, stateid_seq)
>                         __field(u32, stateid_hash)
>                 ),
> @@ -1255,7 +1255,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "stateid=%d:0x%08x",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1295,7 +1295,7 @@
>                         __field(u32, fhandle)
>                         __field(u64, fileid)
>                         __field(unsigned int, valid)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
> @@ -1307,7 +1307,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "valid=%s",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1342,7 +1342,7 @@
>                 TP_ARGS(clp, fhandle, inode, error),
>
>                 TP_STRUCT__entry(
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(dev_t, dev)
>                         __field(u32, fhandle)
>                         __field(u64, fileid)
> @@ -1363,7 +1363,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "dstaddr=%s",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1397,7 +1397,7 @@
>                 TP_ARGS(clp, fhandle, inode, stateid, error),
>
>                 TP_STRUCT__entry(
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(dev_t, dev)
>                         __field(u32, fhandle)
>                         __field(u64, fileid)
> @@ -1424,7 +1424,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "stateid=%d:0x%08x dstaddr=%s",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1460,7 +1460,7 @@
>                 TP_ARGS(name, len, id, error),
>
>                 TP_STRUCT__entry(
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(u32, id)
>                         __dynamic_array(char, name, len > 0 ? len + 1 : 1)
>                 ),
> @@ -1475,8 +1475,8 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d id=%u name=%s",
> -                       __entry->error,
> +                       "error=%lu (%s) id=%u name=%s",
> +                       __entry->error, show_nfsv4_errors(__entry->error),
>                         __entry->id,
>                         __get_str(name)
>                 )
> @@ -1509,7 +1509,7 @@
>                         __field(u64, fileid)
>                         __field(loff_t, offset)
>                         __field(size_t, count)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, stateid_seq)
>                         __field(u32, stateid_hash)
>                 ),
> @@ -1523,7 +1523,7 @@
>                         __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>                         __entry->offset = hdr->args.offset;
>                         __entry->count = hdr->args.count;
> -                       __entry->error = error;
> +                       __entry->error = error < 0 ? -error : 0;
>                         __entry->stateid_seq =
>                                 be32_to_cpu(state->stateid.seqid);
>                         __entry->stateid_hash =
> @@ -1531,7 +1531,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "offset=%lld count=%zu stateid=%d:0x%08x",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1569,7 +1569,7 @@
>                         __field(u64, fileid)
>                         __field(loff_t, offset)
>                         __field(size_t, count)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, stateid_seq)
>                         __field(u32, stateid_hash)
>                 ),
> @@ -1583,7 +1583,7 @@
>                         __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>                         __entry->offset = hdr->args.offset;
>                         __entry->count = hdr->args.count;
> -                       __entry->error = error;
> +                       __entry->error = error < 0 ? -error : 0;
>                         __entry->stateid_seq =
>                                 be32_to_cpu(state->stateid.seqid);
>                         __entry->stateid_hash =
> @@ -1591,7 +1591,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "offset=%lld count=%zu stateid=%d:0x%08x",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1630,7 +1630,7 @@
>                         __field(u64, fileid)
>                         __field(loff_t, offset)
>                         __field(size_t, count)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                 ),
>
>                 TP_fast_assign(
> @@ -1644,7 +1644,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "offset=%lld count=%zu",
>                         __entry->error,
>                         show_nfsv4_errors(__entry->error),
> @@ -1694,7 +1694,7 @@
>                         __field(u32, iomode)
>                         __field(u64, offset)
>                         __field(u64, count)
> -                       __field(int, error)
> +                       __field(unsigned long, error)
>                         __field(int, stateid_seq)
>                         __field(u32, stateid_hash)
>                         __field(int, layoutstateid_seq)
> @@ -1727,7 +1727,7 @@
>                 ),
>
>                 TP_printk(
> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>                         "iomode=%s offset=%llu count=%llu stateid=%d:0x%08x "
>                         "layoutstateid=%d:0x%08x",
>                         __entry->error,
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again
  2019-06-11 19:33   ` Olga Kornievskaia
@ 2019-06-11 19:37     ` Chuck Lever
  2019-06-11 20:00       ` Olga Kornievskaia
  0 siblings, 1 reply; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 19:37 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-rdma, Linux NFS Mailing List



> On Jun 11, 2019, at 3:33 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Jun 11, 2019 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> I noticed that NFS status values stopped working again.
>> 
>> trace_print_symbols_seq() takes an unsigned long. Passing a negative
>> errno or negative NFSERR value just confuses it, and since we're
>> using C macros here and not static inline functions, all bets are
>> off due to implicit type casting.
>> 
>> Straight-line the calling conventions so that error codes are stored
>> in the trace record as positive values in an unsigned long field.
>> 
>> It's often the case that an error value that is positive is a byte
>> count but when it's negative, it's an error (e.g. nfs4_write). Fix
>> those cases so that the value that is eventually stored in the
>> error field is a positive NFS status or errno, or zero.
>> 
> 
> Hi Chuck,
> 
> To clarify, so on error case, we no longer going be seeing a negative
> value so error=-5 (EIO) would be error=5 (EIO)? I have always relied
> on searching for "error=-" thru the trace_pipe log for errors. Do we
> really need to change that?

error= will be zero or a positive errno/status code. If the trace point
has a count= or task->tk_status= you can see the byte count when
error=0.

So now the search will be for anything that has "error=" but is not
"error=0".


>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> fs/nfs/nfs4trace.h |  120 ++++++++++++++++++++++++++--------------------------
>> 1 file changed, 60 insertions(+), 60 deletions(-)
>> 
>> diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
>> index 6beb1f2..9a01731 100644
>> --- a/fs/nfs/nfs4trace.h
>> +++ b/fs/nfs/nfs4trace.h
>> @@ -156,7 +156,7 @@
>> TRACE_DEFINE_ENUM(NFS4ERR_XDEV);
>> 
>> #define show_nfsv4_errors(error) \
>> -       __print_symbolic(-(error), \
>> +       __print_symbolic(error, \
>>                { NFS4_OK, "OK" }, \
>>                /* Mapped by nfs4_stat_to_errno() */ \
>>                { EPERM, "EPERM" }, \
>> @@ -348,7 +348,7 @@
>> 
>>                TP_STRUCT__entry(
>>                        __string(dstaddr, clp->cl_hostname)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>> @@ -357,7 +357,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) dstaddr=%s",
>> +                       "error=%lu (%s) dstaddr=%s",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>>                        __get_str(dstaddr)
>> @@ -420,7 +420,7 @@
>>                        __field(unsigned int, highest_slotid)
>>                        __field(unsigned int, target_highest_slotid)
>>                        __field(unsigned int, status_flags)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>> @@ -435,7 +435,7 @@
>>                        __entry->error = res->sr_status;
>>                ),
>>                TP_printk(
>> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>                        "highest_slotid=%u target_highest_slotid=%u "
>>                        "status_flags=%u (%s)",
>>                        __entry->error,
>> @@ -467,7 +467,7 @@
>>                        __field(unsigned int, seq_nr)
>>                        __field(unsigned int, highest_slotid)
>>                        __field(unsigned int, cachethis)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>> @@ -476,11 +476,11 @@
>>                        __entry->seq_nr = args->csa_sequenceid;
>>                        __entry->highest_slotid = args->csa_highestslotid;
>>                        __entry->cachethis = args->csa_cachethis;
>> -                       __entry->error = -be32_to_cpu(status);
>> +                       __entry->error = be32_to_cpu(status);
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>                        "highest_slotid=%u",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -504,7 +504,7 @@
>>                        __field(unsigned int, seq_nr)
>>                        __field(unsigned int, highest_slotid)
>>                        __field(unsigned int, cachethis)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>> @@ -513,11 +513,11 @@
>>                        __entry->seq_nr = args->csa_sequenceid;
>>                        __entry->highest_slotid = args->csa_highestslotid;
>>                        __entry->cachethis = args->csa_cachethis;
>> -                       __entry->error = -be32_to_cpu(status);
>> +                       __entry->error = be32_to_cpu(status);
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>                        "highest_slotid=%u",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -572,18 +572,18 @@
>> 
>>                TP_STRUCT__entry(
>>                        __field(u32, op)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>>                        __entry->op = op;
>> -                       __entry->error = -error;
>> +                       __entry->error = error;
>>                ),
>> 
>>                TP_printk(
>> -                       "operation %d: nfs status %d (%s)",
>> -                       __entry->op,
>> -                       __entry->error, show_nfsv4_errors(__entry->error)
>> +                       "error=%lu (%s) operation %d:",
>> +                       __entry->error, show_nfsv4_errors(__entry->error),
>> +                       __entry->op
>>                )
>> );
>> 
>> @@ -597,7 +597,7 @@
>>                TP_ARGS(ctx, flags, error),
>> 
>>                TP_STRUCT__entry(
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(unsigned int, flags)
>>                        __field(unsigned int, fmode)
>>                        __field(dev_t, dev)
>> @@ -615,7 +615,7 @@
>>                        const struct nfs4_state *state = ctx->state;
>>                        const struct inode *inode = NULL;
>> 
>> -                       __entry->error = error;
>> +                       __entry->error = -error;
>>                        __entry->flags = flags;
>>                        __entry->fmode = (__force unsigned int)ctx->mode;
>>                        __entry->dev = ctx->dentry->d_sb->s_dev;
>> @@ -647,7 +647,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) flags=%d (%s) fmode=%s "
>> +                       "error=%lu (%s) flags=%d (%s) fmode=%s "
>>                        "fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "name=%02x:%02x:%llu/%s stateid=%d:0x%08x "
>>                        "openstateid=%d:0x%08x",
>> @@ -733,7 +733,7 @@
>>                        __field(u32, fhandle)
>>                        __field(u64, fileid)
>>                        __field(unsigned int, fmode)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, stateid_seq)
>>                        __field(u32, stateid_hash)
>>                ),
>> @@ -753,7 +753,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fmode=%s fileid=%02x:%02x:%llu "
>> +                       "error=%lu (%s) fmode=%s fileid=%02x:%02x:%llu "
>>                        "fhandle=0x%08x openstateid=%d:0x%08x",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -795,7 +795,7 @@
>>                TP_ARGS(request, state, cmd, error),
>> 
>>                TP_STRUCT__entry(
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, cmd)
>>                        __field(char, type)
>>                        __field(loff_t, start)
>> @@ -825,7 +825,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
>> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
>>                        "fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "stateid=%d:0x%08x",
>>                        __entry->error,
>> @@ -865,7 +865,7 @@
>>                TP_ARGS(request, state, lockstateid, cmd, error),
>> 
>>                TP_STRUCT__entry(
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, cmd)
>>                        __field(char, type)
>>                        __field(loff_t, start)
>> @@ -901,7 +901,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
>> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
>>                        "fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "stateid=%d:0x%08x lockstateid=%d:0x%08x",
>>                        __entry->error,
>> @@ -970,7 +970,7 @@
>>                TP_STRUCT__entry(
>>                        __field(dev_t, dev)
>>                        __field(u32, fhandle)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, stateid_seq)
>>                        __field(u32, stateid_hash)
>>                ),
>> @@ -986,7 +986,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) dev=%02x:%02x fhandle=0x%08x "
>> +                       "error=%lu (%s) dev=%02x:%02x fhandle=0x%08x "
>>                        "stateid=%d:0x%08x",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1007,7 +1007,7 @@
>>                TP_ARGS(state, lsp, error),
>> 
>>                TP_STRUCT__entry(
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(dev_t, dev)
>>                        __field(u32, fhandle)
>>                        __field(u64, fileid)
>> @@ -1029,7 +1029,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "stateid=%d:0x%08x",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1064,7 +1064,7 @@
>> 
>>                TP_STRUCT__entry(
>>                        __field(dev_t, dev)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(u64, dir)
>>                        __string(name, name->name)
>>                ),
>> @@ -1072,12 +1072,12 @@
>>                TP_fast_assign(
>>                        __entry->dev = dir->i_sb->s_dev;
>>                        __entry->dir = NFS_FILEID(dir);
>> -                       __entry->error = error;
>> +                       __entry->error = -error;
>>                        __assign_str(name, name->name);
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) name=%02x:%02x:%llu/%s",
>> +                       "error=%lu (%s) name=%02x:%02x:%llu/%s",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>>                        MAJOR(__entry->dev), MINOR(__entry->dev),
>> @@ -1114,7 +1114,7 @@
>>                TP_STRUCT__entry(
>>                        __field(dev_t, dev)
>>                        __field(u64, ino)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>> @@ -1124,7 +1124,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) inode=%02x:%02x:%llu",
>> +                       "error=%lu (%s) inode=%02x:%02x:%llu",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>>                        MAJOR(__entry->dev), MINOR(__entry->dev),
>> @@ -1145,7 +1145,7 @@
>> 
>>                TP_STRUCT__entry(
>>                        __field(dev_t, dev)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(u64, olddir)
>>                        __string(oldname, oldname->name)
>>                        __field(u64, newdir)
>> @@ -1162,7 +1162,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) oldname=%02x:%02x:%llu/%s "
>> +                       "error=%lu (%s) oldname=%02x:%02x:%llu/%s "
>>                        "newname=%02x:%02x:%llu/%s",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1187,18 +1187,18 @@
>>                        __field(dev_t, dev)
>>                        __field(u32, fhandle)
>>                        __field(u64, fileid)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>>                        __entry->dev = inode->i_sb->s_dev;
>>                        __entry->fileid = NFS_FILEID(inode);
>>                        __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>> -                       __entry->error = error;
>> +                       __entry->error = error < 0 ? -error : 0;
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>>                        MAJOR(__entry->dev), MINOR(__entry->dev),
>> @@ -1238,7 +1238,7 @@
>>                        __field(dev_t, dev)
>>                        __field(u32, fhandle)
>>                        __field(u64, fileid)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, stateid_seq)
>>                        __field(u32, stateid_hash)
>>                ),
>> @@ -1255,7 +1255,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "stateid=%d:0x%08x",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1295,7 +1295,7 @@
>>                        __field(u32, fhandle)
>>                        __field(u64, fileid)
>>                        __field(unsigned int, valid)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>> @@ -1307,7 +1307,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "valid=%s",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1342,7 +1342,7 @@
>>                TP_ARGS(clp, fhandle, inode, error),
>> 
>>                TP_STRUCT__entry(
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(dev_t, dev)
>>                        __field(u32, fhandle)
>>                        __field(u64, fileid)
>> @@ -1363,7 +1363,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "dstaddr=%s",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1397,7 +1397,7 @@
>>                TP_ARGS(clp, fhandle, inode, stateid, error),
>> 
>>                TP_STRUCT__entry(
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(dev_t, dev)
>>                        __field(u32, fhandle)
>>                        __field(u64, fileid)
>> @@ -1424,7 +1424,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "stateid=%d:0x%08x dstaddr=%s",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1460,7 +1460,7 @@
>>                TP_ARGS(name, len, id, error),
>> 
>>                TP_STRUCT__entry(
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(u32, id)
>>                        __dynamic_array(char, name, len > 0 ? len + 1 : 1)
>>                ),
>> @@ -1475,8 +1475,8 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d id=%u name=%s",
>> -                       __entry->error,
>> +                       "error=%lu (%s) id=%u name=%s",
>> +                       __entry->error, show_nfsv4_errors(__entry->error),
>>                        __entry->id,
>>                        __get_str(name)
>>                )
>> @@ -1509,7 +1509,7 @@
>>                        __field(u64, fileid)
>>                        __field(loff_t, offset)
>>                        __field(size_t, count)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, stateid_seq)
>>                        __field(u32, stateid_hash)
>>                ),
>> @@ -1523,7 +1523,7 @@
>>                        __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>>                        __entry->offset = hdr->args.offset;
>>                        __entry->count = hdr->args.count;
>> -                       __entry->error = error;
>> +                       __entry->error = error < 0 ? -error : 0;
>>                        __entry->stateid_seq =
>>                                be32_to_cpu(state->stateid.seqid);
>>                        __entry->stateid_hash =
>> @@ -1531,7 +1531,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "offset=%lld count=%zu stateid=%d:0x%08x",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1569,7 +1569,7 @@
>>                        __field(u64, fileid)
>>                        __field(loff_t, offset)
>>                        __field(size_t, count)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, stateid_seq)
>>                        __field(u32, stateid_hash)
>>                ),
>> @@ -1583,7 +1583,7 @@
>>                        __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>>                        __entry->offset = hdr->args.offset;
>>                        __entry->count = hdr->args.count;
>> -                       __entry->error = error;
>> +                       __entry->error = error < 0 ? -error : 0;
>>                        __entry->stateid_seq =
>>                                be32_to_cpu(state->stateid.seqid);
>>                        __entry->stateid_hash =
>> @@ -1591,7 +1591,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "offset=%lld count=%zu stateid=%d:0x%08x",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1630,7 +1630,7 @@
>>                        __field(u64, fileid)
>>                        __field(loff_t, offset)
>>                        __field(size_t, count)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                ),
>> 
>>                TP_fast_assign(
>> @@ -1644,7 +1644,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "offset=%lld count=%zu",
>>                        __entry->error,
>>                        show_nfsv4_errors(__entry->error),
>> @@ -1694,7 +1694,7 @@
>>                        __field(u32, iomode)
>>                        __field(u64, offset)
>>                        __field(u64, count)
>> -                       __field(int, error)
>> +                       __field(unsigned long, error)
>>                        __field(int, stateid_seq)
>>                        __field(u32, stateid_hash)
>>                        __field(int, layoutstateid_seq)
>> @@ -1727,7 +1727,7 @@
>>                ),
>> 
>>                TP_printk(
>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>                        "iomode=%s offset=%llu count=%llu stateid=%d:0x%08x "
>>                        "layoutstateid=%d:0x%08x",
>>                        __entry->error,

--
Chuck Lever




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again
  2019-06-11 19:37     ` Chuck Lever
@ 2019-06-11 20:00       ` Olga Kornievskaia
  2019-06-11 20:06         ` Chuck Lever
  0 siblings, 1 reply; 24+ messages in thread
From: Olga Kornievskaia @ 2019-06-11 20:00 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, Linux NFS Mailing List

On Tue, Jun 11, 2019 at 3:37 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
>
>
> > On Jun 11, 2019, at 3:33 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Tue, Jun 11, 2019 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> I noticed that NFS status values stopped working again.
> >>
> >> trace_print_symbols_seq() takes an unsigned long. Passing a negative
> >> errno or negative NFSERR value just confuses it, and since we're
> >> using C macros here and not static inline functions, all bets are
> >> off due to implicit type casting.
> >>
> >> Straight-line the calling conventions so that error codes are stored
> >> in the trace record as positive values in an unsigned long field.
> >>
> >> It's often the case that an error value that is positive is a byte
> >> count but when it's negative, it's an error (e.g. nfs4_write). Fix
> >> those cases so that the value that is eventually stored in the
> >> error field is a positive NFS status or errno, or zero.
> >>
> >
> > Hi Chuck,
> >
> > To clarify, so on error case, we no longer going be seeing a negative
> > value so error=-5 (EIO) would be error=5 (EIO)? I have always relied
> > on searching for "error=-" thru the trace_pipe log for errors. Do we
> > really need to change that?
>
> error= will be zero or a positive errno/status code. If the trace point
> has a count= or task->tk_status= you can see the byte count when
> error=0.
>
> So now the search will be for anything that has "error=" but is not
> "error=0".

Unfortunately, "error=" but not "error=0" isn't easily translated into
the vi search... ("error=-" was more convenient).

Can we keep the value of the error= negative but change it to the
positive value for the show_nfsv4_errors()?

> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >> ---
> >> fs/nfs/nfs4trace.h |  120 ++++++++++++++++++++++++++--------------------------
> >> 1 file changed, 60 insertions(+), 60 deletions(-)
> >>
> >> diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
> >> index 6beb1f2..9a01731 100644
> >> --- a/fs/nfs/nfs4trace.h
> >> +++ b/fs/nfs/nfs4trace.h
> >> @@ -156,7 +156,7 @@
> >> TRACE_DEFINE_ENUM(NFS4ERR_XDEV);
> >>
> >> #define show_nfsv4_errors(error) \
> >> -       __print_symbolic(-(error), \
> >> +       __print_symbolic(error, \
> >>                { NFS4_OK, "OK" }, \
> >>                /* Mapped by nfs4_stat_to_errno() */ \
> >>                { EPERM, "EPERM" }, \
> >> @@ -348,7 +348,7 @@
> >>
> >>                TP_STRUCT__entry(
> >>                        __string(dstaddr, clp->cl_hostname)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >> @@ -357,7 +357,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) dstaddr=%s",
> >> +                       "error=%lu (%s) dstaddr=%s",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >>                        __get_str(dstaddr)
> >> @@ -420,7 +420,7 @@
> >>                        __field(unsigned int, highest_slotid)
> >>                        __field(unsigned int, target_highest_slotid)
> >>                        __field(unsigned int, status_flags)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >> @@ -435,7 +435,7 @@
> >>                        __entry->error = res->sr_status;
> >>                ),
> >>                TP_printk(
> >> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> >> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> >>                        "highest_slotid=%u target_highest_slotid=%u "
> >>                        "status_flags=%u (%s)",
> >>                        __entry->error,
> >> @@ -467,7 +467,7 @@
> >>                        __field(unsigned int, seq_nr)
> >>                        __field(unsigned int, highest_slotid)
> >>                        __field(unsigned int, cachethis)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >> @@ -476,11 +476,11 @@
> >>                        __entry->seq_nr = args->csa_sequenceid;
> >>                        __entry->highest_slotid = args->csa_highestslotid;
> >>                        __entry->cachethis = args->csa_cachethis;
> >> -                       __entry->error = -be32_to_cpu(status);
> >> +                       __entry->error = be32_to_cpu(status);
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> >> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> >>                        "highest_slotid=%u",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -504,7 +504,7 @@
> >>                        __field(unsigned int, seq_nr)
> >>                        __field(unsigned int, highest_slotid)
> >>                        __field(unsigned int, cachethis)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >> @@ -513,11 +513,11 @@
> >>                        __entry->seq_nr = args->csa_sequenceid;
> >>                        __entry->highest_slotid = args->csa_highestslotid;
> >>                        __entry->cachethis = args->csa_cachethis;
> >> -                       __entry->error = -be32_to_cpu(status);
> >> +                       __entry->error = be32_to_cpu(status);
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> >> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
> >>                        "highest_slotid=%u",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -572,18 +572,18 @@
> >>
> >>                TP_STRUCT__entry(
> >>                        __field(u32, op)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >>                        __entry->op = op;
> >> -                       __entry->error = -error;
> >> +                       __entry->error = error;
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "operation %d: nfs status %d (%s)",
> >> -                       __entry->op,
> >> -                       __entry->error, show_nfsv4_errors(__entry->error)
> >> +                       "error=%lu (%s) operation %d:",
> >> +                       __entry->error, show_nfsv4_errors(__entry->error),
> >> +                       __entry->op
> >>                )
> >> );
> >>
> >> @@ -597,7 +597,7 @@
> >>                TP_ARGS(ctx, flags, error),
> >>
> >>                TP_STRUCT__entry(
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(unsigned int, flags)
> >>                        __field(unsigned int, fmode)
> >>                        __field(dev_t, dev)
> >> @@ -615,7 +615,7 @@
> >>                        const struct nfs4_state *state = ctx->state;
> >>                        const struct inode *inode = NULL;
> >>
> >> -                       __entry->error = error;
> >> +                       __entry->error = -error;
> >>                        __entry->flags = flags;
> >>                        __entry->fmode = (__force unsigned int)ctx->mode;
> >>                        __entry->dev = ctx->dentry->d_sb->s_dev;
> >> @@ -647,7 +647,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) flags=%d (%s) fmode=%s "
> >> +                       "error=%lu (%s) flags=%d (%s) fmode=%s "
> >>                        "fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "name=%02x:%02x:%llu/%s stateid=%d:0x%08x "
> >>                        "openstateid=%d:0x%08x",
> >> @@ -733,7 +733,7 @@
> >>                        __field(u32, fhandle)
> >>                        __field(u64, fileid)
> >>                        __field(unsigned int, fmode)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, stateid_seq)
> >>                        __field(u32, stateid_hash)
> >>                ),
> >> @@ -753,7 +753,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fmode=%s fileid=%02x:%02x:%llu "
> >> +                       "error=%lu (%s) fmode=%s fileid=%02x:%02x:%llu "
> >>                        "fhandle=0x%08x openstateid=%d:0x%08x",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -795,7 +795,7 @@
> >>                TP_ARGS(request, state, cmd, error),
> >>
> >>                TP_STRUCT__entry(
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, cmd)
> >>                        __field(char, type)
> >>                        __field(loff_t, start)
> >> @@ -825,7 +825,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
> >> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
> >>                        "fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "stateid=%d:0x%08x",
> >>                        __entry->error,
> >> @@ -865,7 +865,7 @@
> >>                TP_ARGS(request, state, lockstateid, cmd, error),
> >>
> >>                TP_STRUCT__entry(
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, cmd)
> >>                        __field(char, type)
> >>                        __field(loff_t, start)
> >> @@ -901,7 +901,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
> >> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
> >>                        "fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "stateid=%d:0x%08x lockstateid=%d:0x%08x",
> >>                        __entry->error,
> >> @@ -970,7 +970,7 @@
> >>                TP_STRUCT__entry(
> >>                        __field(dev_t, dev)
> >>                        __field(u32, fhandle)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, stateid_seq)
> >>                        __field(u32, stateid_hash)
> >>                ),
> >> @@ -986,7 +986,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) dev=%02x:%02x fhandle=0x%08x "
> >> +                       "error=%lu (%s) dev=%02x:%02x fhandle=0x%08x "
> >>                        "stateid=%d:0x%08x",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1007,7 +1007,7 @@
> >>                TP_ARGS(state, lsp, error),
> >>
> >>                TP_STRUCT__entry(
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(dev_t, dev)
> >>                        __field(u32, fhandle)
> >>                        __field(u64, fileid)
> >> @@ -1029,7 +1029,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "stateid=%d:0x%08x",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1064,7 +1064,7 @@
> >>
> >>                TP_STRUCT__entry(
> >>                        __field(dev_t, dev)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(u64, dir)
> >>                        __string(name, name->name)
> >>                ),
> >> @@ -1072,12 +1072,12 @@
> >>                TP_fast_assign(
> >>                        __entry->dev = dir->i_sb->s_dev;
> >>                        __entry->dir = NFS_FILEID(dir);
> >> -                       __entry->error = error;
> >> +                       __entry->error = -error;
> >>                        __assign_str(name, name->name);
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) name=%02x:%02x:%llu/%s",
> >> +                       "error=%lu (%s) name=%02x:%02x:%llu/%s",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >>                        MAJOR(__entry->dev), MINOR(__entry->dev),
> >> @@ -1114,7 +1114,7 @@
> >>                TP_STRUCT__entry(
> >>                        __field(dev_t, dev)
> >>                        __field(u64, ino)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >> @@ -1124,7 +1124,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) inode=%02x:%02x:%llu",
> >> +                       "error=%lu (%s) inode=%02x:%02x:%llu",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >>                        MAJOR(__entry->dev), MINOR(__entry->dev),
> >> @@ -1145,7 +1145,7 @@
> >>
> >>                TP_STRUCT__entry(
> >>                        __field(dev_t, dev)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(u64, olddir)
> >>                        __string(oldname, oldname->name)
> >>                        __field(u64, newdir)
> >> @@ -1162,7 +1162,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) oldname=%02x:%02x:%llu/%s "
> >> +                       "error=%lu (%s) oldname=%02x:%02x:%llu/%s "
> >>                        "newname=%02x:%02x:%llu/%s",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1187,18 +1187,18 @@
> >>                        __field(dev_t, dev)
> >>                        __field(u32, fhandle)
> >>                        __field(u64, fileid)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >>                        __entry->dev = inode->i_sb->s_dev;
> >>                        __entry->fileid = NFS_FILEID(inode);
> >>                        __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
> >> -                       __entry->error = error;
> >> +                       __entry->error = error < 0 ? -error : 0;
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >>                        MAJOR(__entry->dev), MINOR(__entry->dev),
> >> @@ -1238,7 +1238,7 @@
> >>                        __field(dev_t, dev)
> >>                        __field(u32, fhandle)
> >>                        __field(u64, fileid)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, stateid_seq)
> >>                        __field(u32, stateid_hash)
> >>                ),
> >> @@ -1255,7 +1255,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "stateid=%d:0x%08x",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1295,7 +1295,7 @@
> >>                        __field(u32, fhandle)
> >>                        __field(u64, fileid)
> >>                        __field(unsigned int, valid)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >> @@ -1307,7 +1307,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "valid=%s",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1342,7 +1342,7 @@
> >>                TP_ARGS(clp, fhandle, inode, error),
> >>
> >>                TP_STRUCT__entry(
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(dev_t, dev)
> >>                        __field(u32, fhandle)
> >>                        __field(u64, fileid)
> >> @@ -1363,7 +1363,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "dstaddr=%s",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1397,7 +1397,7 @@
> >>                TP_ARGS(clp, fhandle, inode, stateid, error),
> >>
> >>                TP_STRUCT__entry(
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(dev_t, dev)
> >>                        __field(u32, fhandle)
> >>                        __field(u64, fileid)
> >> @@ -1424,7 +1424,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "stateid=%d:0x%08x dstaddr=%s",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1460,7 +1460,7 @@
> >>                TP_ARGS(name, len, id, error),
> >>
> >>                TP_STRUCT__entry(
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(u32, id)
> >>                        __dynamic_array(char, name, len > 0 ? len + 1 : 1)
> >>                ),
> >> @@ -1475,8 +1475,8 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d id=%u name=%s",
> >> -                       __entry->error,
> >> +                       "error=%lu (%s) id=%u name=%s",
> >> +                       __entry->error, show_nfsv4_errors(__entry->error),
> >>                        __entry->id,
> >>                        __get_str(name)
> >>                )
> >> @@ -1509,7 +1509,7 @@
> >>                        __field(u64, fileid)
> >>                        __field(loff_t, offset)
> >>                        __field(size_t, count)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, stateid_seq)
> >>                        __field(u32, stateid_hash)
> >>                ),
> >> @@ -1523,7 +1523,7 @@
> >>                        __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
> >>                        __entry->offset = hdr->args.offset;
> >>                        __entry->count = hdr->args.count;
> >> -                       __entry->error = error;
> >> +                       __entry->error = error < 0 ? -error : 0;
> >>                        __entry->stateid_seq =
> >>                                be32_to_cpu(state->stateid.seqid);
> >>                        __entry->stateid_hash =
> >> @@ -1531,7 +1531,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "offset=%lld count=%zu stateid=%d:0x%08x",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1569,7 +1569,7 @@
> >>                        __field(u64, fileid)
> >>                        __field(loff_t, offset)
> >>                        __field(size_t, count)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, stateid_seq)
> >>                        __field(u32, stateid_hash)
> >>                ),
> >> @@ -1583,7 +1583,7 @@
> >>                        __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
> >>                        __entry->offset = hdr->args.offset;
> >>                        __entry->count = hdr->args.count;
> >> -                       __entry->error = error;
> >> +                       __entry->error = error < 0 ? -error : 0;
> >>                        __entry->stateid_seq =
> >>                                be32_to_cpu(state->stateid.seqid);
> >>                        __entry->stateid_hash =
> >> @@ -1591,7 +1591,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "offset=%lld count=%zu stateid=%d:0x%08x",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1630,7 +1630,7 @@
> >>                        __field(u64, fileid)
> >>                        __field(loff_t, offset)
> >>                        __field(size_t, count)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                ),
> >>
> >>                TP_fast_assign(
> >> @@ -1644,7 +1644,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "offset=%lld count=%zu",
> >>                        __entry->error,
> >>                        show_nfsv4_errors(__entry->error),
> >> @@ -1694,7 +1694,7 @@
> >>                        __field(u32, iomode)
> >>                        __field(u64, offset)
> >>                        __field(u64, count)
> >> -                       __field(int, error)
> >> +                       __field(unsigned long, error)
> >>                        __field(int, stateid_seq)
> >>                        __field(u32, stateid_hash)
> >>                        __field(int, layoutstateid_seq)
> >> @@ -1727,7 +1727,7 @@
> >>                ),
> >>
> >>                TP_printk(
> >> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
> >>                        "iomode=%s offset=%llu count=%llu stateid=%d:0x%08x "
> >>                        "layoutstateid=%d:0x%08x",
> >>                        __entry->error,
>
> --
> Chuck Lever
>
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again
  2019-06-11 20:00       ` Olga Kornievskaia
@ 2019-06-11 20:06         ` Chuck Lever
  0 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2019-06-11 20:06 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-rdma, Linux NFS Mailing List



> On Jun 11, 2019, at 4:00 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Jun 11, 2019 at 3:37 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Jun 11, 2019, at 3:33 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> On Tue, Jun 11, 2019 at 11:09 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>> 
>>>> I noticed that NFS status values stopped working again.
>>>> 
>>>> trace_print_symbols_seq() takes an unsigned long. Passing a negative
>>>> errno or negative NFSERR value just confuses it, and since we're
>>>> using C macros here and not static inline functions, all bets are
>>>> off due to implicit type casting.
>>>> 
>>>> Straight-line the calling conventions so that error codes are stored
>>>> in the trace record as positive values in an unsigned long field.
>>>> 
>>>> It's often the case that an error value that is positive is a byte
>>>> count but when it's negative, it's an error (e.g. nfs4_write). Fix
>>>> those cases so that the value that is eventually stored in the
>>>> error field is a positive NFS status or errno, or zero.
>>>> 
>>> 
>>> Hi Chuck,
>>> 
>>> To clarify, so on error case, we no longer going be seeing a negative
>>> value so error=-5 (EIO) would be error=5 (EIO)? I have always relied
>>> on searching for "error=-" thru the trace_pipe log for errors. Do we
>>> really need to change that?
>> 
>> error= will be zero or a positive errno/status code. If the trace point
>> has a count= or task->tk_status= you can see the byte count when
>> error=0.
>> 
>> So now the search will be for anything that has "error=" but is not
>> "error=0".
> 
> Unfortunately, "error=" but not "error=0" isn't easily translated into
> the vi search... ("error=-" was more convenient).
> 
> Can we keep the value of the error= negative but change it to the
> positive value for the show_nfsv4_errors()?

The patch strips out implicit and explicit type casts which
makes the trace points less brittle.

You could just as easily use awk or grep instead, or switch
to using trace-cmd, which has more efficient mechanisms for
filtering by value.

I can try storing the value as an unsigned long, and
printing it as negative (signed) long, so that zero is
still displayed as "error=0". But I think some people might
find this confusing.


>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>> fs/nfs/nfs4trace.h |  120 ++++++++++++++++++++++++++--------------------------
>>>> 1 file changed, 60 insertions(+), 60 deletions(-)
>>>> 
>>>> diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
>>>> index 6beb1f2..9a01731 100644
>>>> --- a/fs/nfs/nfs4trace.h
>>>> +++ b/fs/nfs/nfs4trace.h
>>>> @@ -156,7 +156,7 @@
>>>> TRACE_DEFINE_ENUM(NFS4ERR_XDEV);
>>>> 
>>>> #define show_nfsv4_errors(error) \
>>>> -       __print_symbolic(-(error), \
>>>> +       __print_symbolic(error, \
>>>>               { NFS4_OK, "OK" }, \
>>>>               /* Mapped by nfs4_stat_to_errno() */ \
>>>>               { EPERM, "EPERM" }, \
>>>> @@ -348,7 +348,7 @@
>>>> 
>>>>               TP_STRUCT__entry(
>>>>                       __string(dstaddr, clp->cl_hostname)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>> @@ -357,7 +357,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) dstaddr=%s",
>>>> +                       "error=%lu (%s) dstaddr=%s",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>>                       __get_str(dstaddr)
>>>> @@ -420,7 +420,7 @@
>>>>                       __field(unsigned int, highest_slotid)
>>>>                       __field(unsigned int, target_highest_slotid)
>>>>                       __field(unsigned int, status_flags)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>> @@ -435,7 +435,7 @@
>>>>                       __entry->error = res->sr_status;
>>>>               ),
>>>>               TP_printk(
>>>> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>>> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>>>                       "highest_slotid=%u target_highest_slotid=%u "
>>>>                       "status_flags=%u (%s)",
>>>>                       __entry->error,
>>>> @@ -467,7 +467,7 @@
>>>>                       __field(unsigned int, seq_nr)
>>>>                       __field(unsigned int, highest_slotid)
>>>>                       __field(unsigned int, cachethis)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>> @@ -476,11 +476,11 @@
>>>>                       __entry->seq_nr = args->csa_sequenceid;
>>>>                       __entry->highest_slotid = args->csa_highestslotid;
>>>>                       __entry->cachethis = args->csa_cachethis;
>>>> -                       __entry->error = -be32_to_cpu(status);
>>>> +                       __entry->error = be32_to_cpu(status);
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>>> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>>>                       "highest_slotid=%u",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -504,7 +504,7 @@
>>>>                       __field(unsigned int, seq_nr)
>>>>                       __field(unsigned int, highest_slotid)
>>>>                       __field(unsigned int, cachethis)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>> @@ -513,11 +513,11 @@
>>>>                       __entry->seq_nr = args->csa_sequenceid;
>>>>                       __entry->highest_slotid = args->csa_highestslotid;
>>>>                       __entry->cachethis = args->csa_cachethis;
>>>> -                       __entry->error = -be32_to_cpu(status);
>>>> +                       __entry->error = be32_to_cpu(status);
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>>> +                       "error=%lu (%s) session=0x%08x slot_nr=%u seq_nr=%u "
>>>>                       "highest_slotid=%u",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -572,18 +572,18 @@
>>>> 
>>>>               TP_STRUCT__entry(
>>>>                       __field(u32, op)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>>                       __entry->op = op;
>>>> -                       __entry->error = -error;
>>>> +                       __entry->error = error;
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "operation %d: nfs status %d (%s)",
>>>> -                       __entry->op,
>>>> -                       __entry->error, show_nfsv4_errors(__entry->error)
>>>> +                       "error=%lu (%s) operation %d:",
>>>> +                       __entry->error, show_nfsv4_errors(__entry->error),
>>>> +                       __entry->op
>>>>               )
>>>> );
>>>> 
>>>> @@ -597,7 +597,7 @@
>>>>               TP_ARGS(ctx, flags, error),
>>>> 
>>>>               TP_STRUCT__entry(
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(unsigned int, flags)
>>>>                       __field(unsigned int, fmode)
>>>>                       __field(dev_t, dev)
>>>> @@ -615,7 +615,7 @@
>>>>                       const struct nfs4_state *state = ctx->state;
>>>>                       const struct inode *inode = NULL;
>>>> 
>>>> -                       __entry->error = error;
>>>> +                       __entry->error = -error;
>>>>                       __entry->flags = flags;
>>>>                       __entry->fmode = (__force unsigned int)ctx->mode;
>>>>                       __entry->dev = ctx->dentry->d_sb->s_dev;
>>>> @@ -647,7 +647,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) flags=%d (%s) fmode=%s "
>>>> +                       "error=%lu (%s) flags=%d (%s) fmode=%s "
>>>>                       "fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "name=%02x:%02x:%llu/%s stateid=%d:0x%08x "
>>>>                       "openstateid=%d:0x%08x",
>>>> @@ -733,7 +733,7 @@
>>>>                       __field(u32, fhandle)
>>>>                       __field(u64, fileid)
>>>>                       __field(unsigned int, fmode)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, stateid_seq)
>>>>                       __field(u32, stateid_hash)
>>>>               ),
>>>> @@ -753,7 +753,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fmode=%s fileid=%02x:%02x:%llu "
>>>> +                       "error=%lu (%s) fmode=%s fileid=%02x:%02x:%llu "
>>>>                       "fhandle=0x%08x openstateid=%d:0x%08x",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -795,7 +795,7 @@
>>>>               TP_ARGS(request, state, cmd, error),
>>>> 
>>>>               TP_STRUCT__entry(
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, cmd)
>>>>                       __field(char, type)
>>>>                       __field(loff_t, start)
>>>> @@ -825,7 +825,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
>>>> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
>>>>                       "fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "stateid=%d:0x%08x",
>>>>                       __entry->error,
>>>> @@ -865,7 +865,7 @@
>>>>               TP_ARGS(request, state, lockstateid, cmd, error),
>>>> 
>>>>               TP_STRUCT__entry(
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, cmd)
>>>>                       __field(char, type)
>>>>                       __field(loff_t, start)
>>>> @@ -901,7 +901,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) cmd=%s:%s range=%lld:%lld "
>>>> +                       "error=%lu (%s) cmd=%s:%s range=%lld:%lld "
>>>>                       "fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "stateid=%d:0x%08x lockstateid=%d:0x%08x",
>>>>                       __entry->error,
>>>> @@ -970,7 +970,7 @@
>>>>               TP_STRUCT__entry(
>>>>                       __field(dev_t, dev)
>>>>                       __field(u32, fhandle)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, stateid_seq)
>>>>                       __field(u32, stateid_hash)
>>>>               ),
>>>> @@ -986,7 +986,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) dev=%02x:%02x fhandle=0x%08x "
>>>> +                       "error=%lu (%s) dev=%02x:%02x fhandle=0x%08x "
>>>>                       "stateid=%d:0x%08x",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1007,7 +1007,7 @@
>>>>               TP_ARGS(state, lsp, error),
>>>> 
>>>>               TP_STRUCT__entry(
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(dev_t, dev)
>>>>                       __field(u32, fhandle)
>>>>                       __field(u64, fileid)
>>>> @@ -1029,7 +1029,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "stateid=%d:0x%08x",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1064,7 +1064,7 @@
>>>> 
>>>>               TP_STRUCT__entry(
>>>>                       __field(dev_t, dev)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(u64, dir)
>>>>                       __string(name, name->name)
>>>>               ),
>>>> @@ -1072,12 +1072,12 @@
>>>>               TP_fast_assign(
>>>>                       __entry->dev = dir->i_sb->s_dev;
>>>>                       __entry->dir = NFS_FILEID(dir);
>>>> -                       __entry->error = error;
>>>> +                       __entry->error = -error;
>>>>                       __assign_str(name, name->name);
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) name=%02x:%02x:%llu/%s",
>>>> +                       "error=%lu (%s) name=%02x:%02x:%llu/%s",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>>                       MAJOR(__entry->dev), MINOR(__entry->dev),
>>>> @@ -1114,7 +1114,7 @@
>>>>               TP_STRUCT__entry(
>>>>                       __field(dev_t, dev)
>>>>                       __field(u64, ino)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>> @@ -1124,7 +1124,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) inode=%02x:%02x:%llu",
>>>> +                       "error=%lu (%s) inode=%02x:%02x:%llu",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>>                       MAJOR(__entry->dev), MINOR(__entry->dev),
>>>> @@ -1145,7 +1145,7 @@
>>>> 
>>>>               TP_STRUCT__entry(
>>>>                       __field(dev_t, dev)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(u64, olddir)
>>>>                       __string(oldname, oldname->name)
>>>>                       __field(u64, newdir)
>>>> @@ -1162,7 +1162,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) oldname=%02x:%02x:%llu/%s "
>>>> +                       "error=%lu (%s) oldname=%02x:%02x:%llu/%s "
>>>>                       "newname=%02x:%02x:%llu/%s",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1187,18 +1187,18 @@
>>>>                       __field(dev_t, dev)
>>>>                       __field(u32, fhandle)
>>>>                       __field(u64, fileid)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>>                       __entry->dev = inode->i_sb->s_dev;
>>>>                       __entry->fileid = NFS_FILEID(inode);
>>>>                       __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>>>> -                       __entry->error = error;
>>>> +                       __entry->error = error < 0 ? -error : 0;
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>>                       MAJOR(__entry->dev), MINOR(__entry->dev),
>>>> @@ -1238,7 +1238,7 @@
>>>>                       __field(dev_t, dev)
>>>>                       __field(u32, fhandle)
>>>>                       __field(u64, fileid)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, stateid_seq)
>>>>                       __field(u32, stateid_hash)
>>>>               ),
>>>> @@ -1255,7 +1255,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "stateid=%d:0x%08x",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1295,7 +1295,7 @@
>>>>                       __field(u32, fhandle)
>>>>                       __field(u64, fileid)
>>>>                       __field(unsigned int, valid)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>> @@ -1307,7 +1307,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "valid=%s",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1342,7 +1342,7 @@
>>>>               TP_ARGS(clp, fhandle, inode, error),
>>>> 
>>>>               TP_STRUCT__entry(
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(dev_t, dev)
>>>>                       __field(u32, fhandle)
>>>>                       __field(u64, fileid)
>>>> @@ -1363,7 +1363,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "dstaddr=%s",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1397,7 +1397,7 @@
>>>>               TP_ARGS(clp, fhandle, inode, stateid, error),
>>>> 
>>>>               TP_STRUCT__entry(
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(dev_t, dev)
>>>>                       __field(u32, fhandle)
>>>>                       __field(u64, fileid)
>>>> @@ -1424,7 +1424,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "stateid=%d:0x%08x dstaddr=%s",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1460,7 +1460,7 @@
>>>>               TP_ARGS(name, len, id, error),
>>>> 
>>>>               TP_STRUCT__entry(
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(u32, id)
>>>>                       __dynamic_array(char, name, len > 0 ? len + 1 : 1)
>>>>               ),
>>>> @@ -1475,8 +1475,8 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d id=%u name=%s",
>>>> -                       __entry->error,
>>>> +                       "error=%lu (%s) id=%u name=%s",
>>>> +                       __entry->error, show_nfsv4_errors(__entry->error),
>>>>                       __entry->id,
>>>>                       __get_str(name)
>>>>               )
>>>> @@ -1509,7 +1509,7 @@
>>>>                       __field(u64, fileid)
>>>>                       __field(loff_t, offset)
>>>>                       __field(size_t, count)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, stateid_seq)
>>>>                       __field(u32, stateid_hash)
>>>>               ),
>>>> @@ -1523,7 +1523,7 @@
>>>>                       __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>>>>                       __entry->offset = hdr->args.offset;
>>>>                       __entry->count = hdr->args.count;
>>>> -                       __entry->error = error;
>>>> +                       __entry->error = error < 0 ? -error : 0;
>>>>                       __entry->stateid_seq =
>>>>                               be32_to_cpu(state->stateid.seqid);
>>>>                       __entry->stateid_hash =
>>>> @@ -1531,7 +1531,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "offset=%lld count=%zu stateid=%d:0x%08x",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1569,7 +1569,7 @@
>>>>                       __field(u64, fileid)
>>>>                       __field(loff_t, offset)
>>>>                       __field(size_t, count)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, stateid_seq)
>>>>                       __field(u32, stateid_hash)
>>>>               ),
>>>> @@ -1583,7 +1583,7 @@
>>>>                       __entry->fhandle = nfs_fhandle_hash(NFS_FH(inode));
>>>>                       __entry->offset = hdr->args.offset;
>>>>                       __entry->count = hdr->args.count;
>>>> -                       __entry->error = error;
>>>> +                       __entry->error = error < 0 ? -error : 0;
>>>>                       __entry->stateid_seq =
>>>>                               be32_to_cpu(state->stateid.seqid);
>>>>                       __entry->stateid_hash =
>>>> @@ -1591,7 +1591,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "offset=%lld count=%zu stateid=%d:0x%08x",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1630,7 +1630,7 @@
>>>>                       __field(u64, fileid)
>>>>                       __field(loff_t, offset)
>>>>                       __field(size_t, count)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>               ),
>>>> 
>>>>               TP_fast_assign(
>>>> @@ -1644,7 +1644,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "offset=%lld count=%zu",
>>>>                       __entry->error,
>>>>                       show_nfsv4_errors(__entry->error),
>>>> @@ -1694,7 +1694,7 @@
>>>>                       __field(u32, iomode)
>>>>                       __field(u64, offset)
>>>>                       __field(u64, count)
>>>> -                       __field(int, error)
>>>> +                       __field(unsigned long, error)
>>>>                       __field(int, stateid_seq)
>>>>                       __field(u32, stateid_hash)
>>>>                       __field(int, layoutstateid_seq)
>>>> @@ -1727,7 +1727,7 @@
>>>>               ),
>>>> 
>>>>               TP_printk(
>>>> -                       "error=%d (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>> +                       "error=%lu (%s) fileid=%02x:%02x:%llu fhandle=0x%08x "
>>>>                       "iomode=%s offset=%llu count=%llu stateid=%d:0x%08x "
>>>>                       "layoutstateid=%d:0x%08x",
>>>>                       __entry->error,
>> 
>> --
>> Chuck Lever

--
Chuck Lever




^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-06-11 20:06 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-11 15:07 [PATCH v2 00/19] for-5.3 patches for review Chuck Lever
2019-06-11 15:08 ` [PATCH v2 01/19] xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA Chuck Lever
2019-06-11 15:08 ` [PATCH v2 02/19] xprtrdma: Fix use-after-free in rpcrdma_post_recvs Chuck Lever
2019-06-11 15:08 ` [PATCH v2 03/19] xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req Chuck Lever
2019-06-11 15:08 ` [PATCH v2 04/19] xprtrdma: Fix occasional transport deadlock Chuck Lever
2019-06-11 15:08 ` [PATCH v2 05/19] xprtrdma: Remove the RPCRDMA_REQ_F_PENDING flag Chuck Lever
2019-06-11 15:08 ` [PATCH v2 06/19] xprtrdma: Remove fr_state Chuck Lever
2019-06-11 15:08 ` [PATCH v2 07/19] xprtrdma: Add mechanism to place MRs back on the free list Chuck Lever
2019-06-11 15:08 ` [PATCH v2 08/19] xprtrdma: Reduce context switching due to Local Invalidation Chuck Lever
2019-06-11 15:08 ` [PATCH v2 09/19] xprtrdma: Wake RPCs directly in rpcrdma_wc_send path Chuck Lever
2019-06-11 15:08 ` [PATCH v2 10/19] xprtrdma: Simplify rpcrdma_rep_create Chuck Lever
2019-06-11 15:08 ` [PATCH v2 11/19] xprtrdma: Streamline rpcrdma_post_recvs Chuck Lever
2019-06-11 15:09 ` [PATCH v2 12/19] xprtrdma: Refactor chunk encoding Chuck Lever
2019-06-11 15:09 ` [PATCH v2 13/19] xprtrdma: Remove rpcrdma_req::rl_buffer Chuck Lever
2019-06-11 15:09 ` [PATCH v2 14/19] xprtrdma: Modernize ops->connect Chuck Lever
2019-06-11 15:09 ` [PATCH v2 15/19] NFS4: Add a trace event to record invalid CB sequence IDs Chuck Lever
2019-06-11 15:09 ` [PATCH v2 16/19] NFS: Fix show_nfs_errors macros again Chuck Lever
2019-06-11 19:33   ` Olga Kornievskaia
2019-06-11 19:37     ` Chuck Lever
2019-06-11 20:00       ` Olga Kornievskaia
2019-06-11 20:06         ` Chuck Lever
2019-06-11 15:09 ` [PATCH v2 17/19] NFS: Display symbolic status code names in trace log Chuck Lever
2019-06-11 15:09 ` [PATCH v2 18/19] NFS: Update symbolic flags displayed by trace events Chuck Lever
2019-06-11 15:09 ` [PATCH v2 19/19] NFS: Record task, client ID, and XID in xdr_status trace points Chuck Lever

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).