From: Jeff Layton <jlayton@kernel.org>
To: Xiubo Li <xiubli@redhat.com>, idryomov@gmail.com
Cc: zyan@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org
Subject: Re: [PATCH 1/2] ceph: periodically send perf metrics to ceph
Date: Tue, 16 Jun 2020 14:58:05 -0400
Message-ID: <1267c29b99277382d1483c07d3980dbd29e9535b.camel@kernel.org>
In-Reply-To: <c115e073-187c-977b-3950-50ab9a2531c7@redhat.com>

On Wed, 2020-06-17 at 02:14 +0800, Xiubo Li wrote:
> On 2020/6/16 21:40, Jeff Layton wrote:
> > On Tue, 2020-06-16 at 08:52 -0400, xiubli@redhat.com wrote:
> > > From: Xiubo Li <xiubli@redhat.com>
> > > 
> > > This sends the caps/read/write/metadata metrics to any available
> > > MDS once per second by default, matching the userland client, or
> > > every metric_send_interval seconds, where metric_send_interval is
> > > a module parameter.
> > > 
> > > URL: https://tracker.ceph.com/issues/43215
> > > Signed-off-by: Xiubo Li <xiubli@redhat.com>
> > > ---
> > >   fs/ceph/mds_client.c         |  46 +++++++++------
> > >   fs/ceph/mds_client.h         |   4 ++
> > >   fs/ceph/metric.c             | 133 +++++++++++++++++++++++++++++++++++++++++++
> > >   fs/ceph/metric.h             |  78 +++++++++++++++++++++++++
> > >   fs/ceph/super.c              |  29 ++++++++++
> > >   include/linux/ceph/ceph_fs.h |   1 +
> > >   6 files changed, 274 insertions(+), 17 deletions(-)
> > > 
> > Long patch! Might not hurt to break out some of the cleanups into
> > separate patches, but they're fairly straightforward so I won't require
> > it.
> 
> Sure, I will split the patch into smaller ones where possible.
> 
> 
> > [...]
> > 
> > >   /* This is the global metrics */
> > >   struct ceph_client_metric {
> > >   	atomic64_t            total_dentries;
> > > @@ -35,8 +100,21 @@ struct ceph_client_metric {
> > >   	ktime_t metadata_latency_sq_sum;
> > >   	ktime_t metadata_latency_min;
> > >   	ktime_t metadata_latency_max;
> > > +
> > > +	struct delayed_work delayed_work;  /* delayed work */
> > >   };
> > >   
> > > +static inline void metric_schedule_delayed(struct ceph_client_metric *m)
> > > +{
> > > +	if (!metric_send_interval)
> > > +		return;
> > > +
> > > +	/* once per second by default */
> > > +	schedule_delayed_work(&m->delayed_work,
> > > +			      round_jiffies_relative(metric_send_interval * HZ));
> > > +}
> > > +
> > Should this also be gated on a MDS feature bit or anything? What happens
> > if we're running against a really old MDS that doesn't support these
> > stats? Does it just drop them on the floor? Should we disable sending
> > them in that case?
> 
> I tested against a ceph build without the metric code, and when the MDS
> saw an unknown message type it closed the socket connection directly.
> 

Ouch, that sounds bad. How does the userland client handle this? This
seems like the kind of thing we usually add a feature bit for
somewhere...
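
Something like the below is what I have in mind. This is just an
uncompiled sketch: CEPHFS_FEATURE_METRIC_COLLECT is a hypothetical
feature bit that the MDS would have to define and advertise, but the
test_bit-on-s_features pattern is the same one we already use for bits
like CEPHFS_FEATURE_REPLY_ENCODING:

	/*
	 * Hypothetical: treat an MDS as a valid target for metrics only
	 * if it advertised support via a new bit in the session feature
	 * mask. An old MDS then never sees CEPH_MSG_CLIENT_METRICS and
	 * never has a reason to drop the connection.
	 */
	static bool mds_metric_supported(struct ceph_mds_session *s)
	{
		return test_bit(CEPHFS_FEATURE_METRIC_COLLECT, &s->s_features);
	}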

> 
> > >   extern int ceph_metric_init(struct ceph_client_metric *m);
> > >   extern void ceph_metric_destroy(struct ceph_client_metric *m);
> > >   
> > > diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> > > index c9784eb1..66a940c 100644
> > > --- a/fs/ceph/super.c
> > > +++ b/fs/ceph/super.c
> > > @@ -1282,6 +1282,35 @@ static void __exit exit_ceph(void)
> > >   	destroy_caches();
> > >   }
> > >   
> > > +static int param_set_metric_interval(const char *val, const struct kernel_param *kp)
> > > +{
> > > +	int ret;
> > > +	unsigned int interval;
> > > +
> > > +	ret = kstrtouint(val, 0, &interval);
> > > +	if (ret < 0) {
> > > +		pr_err("Failed to parse metric interval '%s'\n", val);
> > > +		return ret;
> > > +	}
> > > +
> > > +	if (interval > 5 || interval < 1) {
> > > +		pr_err("Invalid metric interval %u\n", interval);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	metric_send_interval = interval;
> > > +	return 0;
> > > +}
> > > +
> > > +static const struct kernel_param_ops param_ops_metric_interval = {
> > > +	.set = param_set_metric_interval,
> > > +	.get = param_get_uint,
> > > +};
> > > +
> > > +unsigned int metric_send_interval = 1;
> > > +module_param_cb(metric_send_interval, &param_ops_metric_interval, &metric_send_interval, 0644);
> > > +MODULE_PARM_DESC(metric_send_interval, "Interval (in seconds) of sending perf metric to ceph cluster, valid values are 1~5 (default: 1)");
> > > +
> > Aren't valid values 0-5, with 0 disabling this feature? That's
> > what metric_schedule_delayed() seems to indicate...
> 
> What should the default value be? 0 to disable it, or 1?
> 

I don't have a strong preference here. It's probably safe enough to make
it 1. I do like the ability to turn this off though, as that gives us a
workaround in the event that it does cause trouble.
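
Since module_param_cb gives us a writable sysfs knob, an admin could
then disable it at runtime by writing 0 to
/sys/module/ceph/parameters/metric_send_interval (assuming the setter
is changed to accept 0, per my note above about the valid range).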

> Maybe in the future we should have the ceph side send a notification or
> request to all the clients when it wants to collect the metrics, and
> only enable it here then, keeping it disabled by default.
> 

That seems like a heavier-weight solution than is called for here. This
seems like something that ought to have a feature bit. Then if that's
turned off, we can just disable this altogether.
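
Roughly, and again just an uncompiled sketch with hypothetical helpers
(ceph_mdsc_metric_supported() and ceph_mdsc_send_metrics() don't exist
yet, and this assumes the metric struct is embedded in the mds_client
so the container_of works):

	static void metric_delayed_work(struct work_struct *work)
	{
		struct ceph_client_metric *m =
			container_of(work, struct ceph_client_metric,
				     delayed_work.work);
		struct ceph_mds_client *mdsc =
			container_of(m, struct ceph_mds_client, metric);

		/*
		 * Skip the send when no session has the feature bit, but
		 * keep the timer armed in case a capable MDS turns up.
		 */
		if (ceph_mdsc_metric_supported(mdsc))
			ceph_mdsc_send_metrics(mdsc);

		metric_schedule_delayed(m);
	}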

> 
> > >   module_init(init_ceph);
> > >   module_exit(exit_ceph);
> > >   
> > > diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
> > > index ebf5ba6..455e9b9 100644
> > > --- a/include/linux/ceph/ceph_fs.h
> > > +++ b/include/linux/ceph/ceph_fs.h
> > > @@ -130,6 +130,7 @@ struct ceph_dir_layout {
> > >   #define CEPH_MSG_CLIENT_REQUEST         24
> > >   #define CEPH_MSG_CLIENT_REQUEST_FORWARD 25
> > >   #define CEPH_MSG_CLIENT_REPLY           26
> > > +#define CEPH_MSG_CLIENT_METRICS         29
> > >   #define CEPH_MSG_CLIENT_CAPS            0x310
> > >   #define CEPH_MSG_CLIENT_LEASE           0x311
> > >   #define CEPH_MSG_CLIENT_SNAP            0x312
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>
