All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Wysochanski <dwysocha@redhat.com>
To: Chuck Lever <chuck.lever@oracle.com>,
	Bruce Fields <bfields@fieldses.org>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 3/3] SUNRPC: Count ops completing with tk_status < 0
Date: Mon, 03 Jun 2019 14:53:54 -0400	[thread overview]
Message-ID: <f7976bde9979e8b763c0009b523331ab5ce6b6ed.camel@redhat.com> (raw)
In-Reply-To: <CD3B0503-ABA0-4670-9A76-0B9DF0AE5B5C@oracle.com>

On Fri, 2019-05-31 at 09:25 -0400, Chuck Lever wrote:
> > On May 30, 2019, at 6:33 PM, Bruce Fields <bfields@fieldses.org>
> > wrote:
> > 
> > On Thu, May 30, 2019 at 06:19:54PM -0400, Chuck Lever wrote:
> > > 
> > > 
> > > > On May 30, 2019, at 5:38 PM, bfields@fieldses.org wrote:
> > > > 
> > > > On Thu, May 23, 2019 at 04:13:50PM -0400, Dave Wysochanski
> > > > wrote:
> > > > > We often see various error conditions with NFS4.x that show
> > > > > up with
> > > > > a very high operation count all completing with tk_status < 0
> > > > > in a
> > > > > short period of time.  Add a count to rpc_iostats to record
> > > > > on a
> > > > > per-op basis the ops that complete in this manner, which will
> > > > > enable lower overhead diagnostics.
> > > > 
> > > > Looks like a good idea to me.
> > > > 
> > > > It's too bad we can't distinguish the errors.  (E.g. ESTALE on
> > > > a lookup
> > > > call is a lot more interesting than ENOENT.)  But understood
> > > > that
> > > > maintaining (number of possible errors) * (number of possible
> > > > ops)
> > > > counters is probably overkill, so just counting the number of
> > > > errors
> > > > seems like a good start.
> > > 
> > > We now have trace points that can do that too.
> > 
> > You mean, that can report every error (and its value)?
> 
> Yes, the nfs_xdr_status trace point reports the error by value and
> symbolic name.
> 

The tracepoint is very useful I agree.  I don't think it will show:
a) the mount
b) the opcode

Or am I mistaken and there's a way to get those with a filter or
another tracepoint?


> 
> > I assume having these statistics in mountstats is still useful,
> > though.
> > 
> > --b.
> > 
> > > 
> > > 
> > > > --b.
> > > > 
> > > > > 
> > > > > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > > > > ---
> > > > > include/linux/sunrpc/metrics.h | 7 ++++++-
> > > > > net/sunrpc/stats.c             | 8 ++++++--
> > > > > 2 files changed, 12 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/include/linux/sunrpc/metrics.h
> > > > > b/include/linux/sunrpc/metrics.h
> > > > > index 1b3751327575..0ee3f7052846 100644
> > > > > --- a/include/linux/sunrpc/metrics.h
> > > > > +++ b/include/linux/sunrpc/metrics.h
> > > > > @@ -30,7 +30,7 @@
> > > > > #include <linux/ktime.h>
> > > > > #include <linux/spinlock.h>
> > > > > 
> > > > > -#define RPC_IOSTATS_VERS	"1.0"
> > > > > +#define RPC_IOSTATS_VERS	"1.1"
> > > > > 
> > > > > struct rpc_iostats {
> > > > > 	spinlock_t		om_lock;
> > > > > @@ -66,6 +66,11 @@ struct rpc_iostats {
> > > > > 	ktime_t			om_queue,	/* queued for
> > > > > xmit */
> > > > > 				om_rtt,		/* RPC RTT */
> > > > > 				om_execute;	/* RPC
> > > > > execution */
> > > > > +	/*
> > > > > +	 * The count of operations that complete with tk_status
> > > > > < 0.
> > > > > +	 * These statuses usually indicate error conditions.
> > > > > +	 */
> > > > > +	unsigned long           om_error_status;
> > > > > } ____cacheline_aligned;
> > > > > 
> > > > > struct rpc_task;
> > > > > diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c
> > > > > index 8b2d3c58ffae..737414247ca7 100644
> > > > > --- a/net/sunrpc/stats.c
> > > > > +++ b/net/sunrpc/stats.c
> > > > > @@ -176,6 +176,8 @@ void rpc_count_iostats_metrics(const
> > > > > struct rpc_task *task,
> > > > > 
> > > > > 	execute = ktime_sub(now, task->tk_start);
> > > > > 	op_metrics->om_execute = ktime_add(op_metrics-
> > > > > >om_execute, execute);
> > > > > +	if (task->tk_status < 0)
> > > > > +		op_metrics->om_error_status++;
> > > > > 
> > > > > 	spin_unlock(&op_metrics->om_lock);
> > > > > 
> > > > > @@ -218,13 +220,14 @@ static void _add_rpc_iostats(struct
> > > > > rpc_iostats *a, struct rpc_iostats *b)
> > > > > 	a->om_queue = ktime_add(a->om_queue, b->om_queue);
> > > > > 	a->om_rtt = ktime_add(a->om_rtt, b->om_rtt);
> > > > > 	a->om_execute = ktime_add(a->om_execute, b-
> > > > > >om_execute);
> > > > > +	a->om_error_status += b->om_error_status;
> > > > > }
> > > > > 
> > > > > static void _print_rpc_iostats(struct seq_file *seq, struct
> > > > > rpc_iostats *stats,
> > > > > 			       int op, const struct
> > > > > rpc_procinfo *procs)
> > > > > {
> > > > > 	_print_name(seq, op, procs);
> > > > > -	seq_printf(seq, "%lu %lu %lu %llu %llu %llu %llu
> > > > > %llu\n",
> > > > > +	seq_printf(seq, "%lu %lu %lu %llu %llu %llu %llu %llu
> > > > > %lu\n",
> > > > > 		   stats->om_ops,
> > > > > 		   stats->om_ntrans,
> > > > > 		   stats->om_timeouts,
> > > > > @@ -232,7 +235,8 @@ static void _print_rpc_iostats(struct
> > > > > seq_file *seq, struct rpc_iostats *stats,
> > > > > 		   stats->om_bytes_recv,
> > > > > 		   ktime_to_ms(stats->om_queue),
> > > > > 		   ktime_to_ms(stats->om_rtt),
> > > > > -		   ktime_to_ms(stats->om_execute));
> > > > > +		   ktime_to_ms(stats->om_execute),
> > > > > +		   stats->om_error_status);
> > > > > }
> > > > > 
> > > > > void rpc_clnt_show_stats(struct seq_file *seq, struct
> > > > > rpc_clnt *clnt)
> > > > > -- 
> > > > > 2.20.1
> > > 
> > > --
> > > Chuck Lever
> 
> --
> Chuck Lever
> 
> 
> 



  reply	other threads:[~2019-06-03 18:53 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-23 20:13 [PATCH 1/3] SUNRPC: Move call to rpc_count_iostats before rpc_call_done Dave Wysochanski
2019-05-23 20:13 ` [PATCH 2/3] SUNRPC: Use proper printk specifiers for unsigned long long Dave Wysochanski
2019-05-23 20:13 ` [PATCH 3/3] SUNRPC: Count ops completing with tk_status < 0 Dave Wysochanski
2019-05-30 21:38   ` J. Bruce Fields
2019-05-30 22:19     ` Chuck Lever
2019-05-30 22:33       ` Bruce Fields
2019-05-31 13:25         ` Chuck Lever
2019-06-03 18:53           ` Dave Wysochanski [this message]
2019-06-03 18:56             ` Chuck Lever
2019-06-03 19:05               ` Dave Wysochanski
2019-06-03 19:08                 ` Chuck Lever
2019-06-04 14:45               ` Bruce Fields
2019-06-04 14:56                 ` Trond Myklebust
2019-06-07 14:27                 ` Dave Wysochanski
2019-05-31  0:17     ` David Wysochanski
2019-05-23 20:13 ` [PATCH] mountstats: add per-op error counts for mountstats command Dave Wysochanski
2019-05-23 20:33   ` [PATCH v2] " Dave Wysochanski
2019-06-03 14:31     ` Steve Dickson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f7976bde9979e8b763c0009b523331ab5ce6b6ed.camel@redhat.com \
    --to=dwysocha@redhat.com \
    --cc=bfields@fieldses.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.