* [PATCH v6 0/9] ceph: add perf metrics support
@ 2020-02-10  5:33 xiubli
  2020-02-10  5:33 ` [PATCH v6 1/9] ceph: add global dentry lease metric support xiubli
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: xiubli @ 2020-02-10  5:33 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

Changes in V6:
- fold the r_end_stamp patch into its first user
- remove declarations of some parameters that are only used once
- switch the debugfs sending_metric UI to a module parameter
- make the cap hit/miss metric global per superblock
- some other small fixes

If enabled, the client will send the metrics to the ceph cluster every
metric_send_interval seconds. metric_send_interval is a module
parameter whose default value is 0, which means disabled.
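
For example, to enable sending metrics every 5 seconds (assuming the
usual sysfs location for ceph.ko module parameters):

$ echo 5 > /sys/module/ceph/parameters/metric_send_interval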


We can get the metrics from the debugfs:

$ cat /sys/kernel/debug/ceph/0c93a60d-5645-4c46-8568-4c8f63db4c7f.client4267/metrics 
item          total       sum_lat(us)     avg_lat(us)
-----------------------------------------------------
read          13          417000          32076
write         42          131205000       3123928
metadata      104         493000          4740

item          total           miss            hit
-------------------------------------------------
d_lease       204             0               918
caps          204             213             368218


On the MDS side, we can get the metrics (NOTE: the latencies are in
nanoseconds):

$ ./bin/ceph fs perf stats | python -m json.tool
{
    "client_metadata": {
        "client.4267": {
            "IP": "v1:192.168.195.165",
            "hostname": "fedora1",
            "mount_point": "N/A",
            "root": "/"
        }
    },
    "counters": [
        "cap_hit"
    ],
    "global_counters": [
        "read_latency",
        "write_latency",
        "metadata_latency",
        "dentry_lease_hit"
    ],
    "global_metrics": {
        "client.4267": [
            [
                0,
                32076923
            ],
            [
                3,
                123928571
            ],
            [
                0,
                4740384
            ],
            [
                918,
                0
            ]
        ]
    },
    "metrics": {
        "delayed_ranks": [],
        "mds.0": {
            "client.4267": [
                [
                    368218,
                    213
                ]
            ]
        }
    }
}


The metric flags provided in the client metadata:

$ ./bin/cephfs-journal-tool --rank=1:0 event get --type=SESSION json
Wrote output to JSON file 'dump'
$ cat dump
[ 
    {
        "client instance": "client.4275 v1:192.168.195.165:0/461391971",
        "open": "true",
        "client map version": 1,
        "inos": "[]",
        "inotable version": 0,
        "client_metadata": {
            "client_features": {
                "feature_bits": "0000000000001bff"
            },
            "metric_spec": {
                "metric_flags": {
                    "feature_bits": "000000000000001f"
                }
            },
            "entity_id": "",
            "hostname": "fedora1",
            "kernel_version": "5.5.0-rc2+",
            "root": "/"
        }
    },
[...]





Xiubo Li (9):
  ceph: add global dentry lease metric support
  ceph: add caps perf metric for each session
  ceph: add global read latency metric support
  ceph: add global write latency metric support
  ceph: add global metadata perf metric support
  ceph: periodically send perf metrics to ceph
  ceph: add CEPH_DEFINE_RW_FUNC helper support
  ceph: add reset metrics support
  ceph: send client provided metric flags in client metadata

 fs/ceph/acl.c                   |   2 +
 fs/ceph/addr.c                  |  13 ++
 fs/ceph/caps.c                  |  29 +++
 fs/ceph/debugfs.c               | 107 ++++++++-
 fs/ceph/dir.c                   |  25 ++-
 fs/ceph/file.c                  |  22 ++
 fs/ceph/mds_client.c            | 381 +++++++++++++++++++++++++++++---
 fs/ceph/mds_client.h            |   6 +
 fs/ceph/metric.h                | 155 +++++++++++++
 fs/ceph/quota.c                 |   9 +-
 fs/ceph/super.c                 |   4 +
 fs/ceph/super.h                 |  11 +
 fs/ceph/xattr.c                 |  17 +-
 include/linux/ceph/ceph_fs.h    |   1 +
 include/linux/ceph/debugfs.h    |  14 ++
 include/linux/ceph/osd_client.h |   1 +
 net/ceph/osd_client.c           |   2 +
 17 files changed, 759 insertions(+), 40 deletions(-)
 create mode 100644 fs/ceph/metric.h

-- 
2.21.0


* [PATCH v6 1/9] ceph: add global dentry lease metric support
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
@ 2020-02-10  5:33 ` xiubli
  2020-02-10  5:34 ` [PATCH v6 2/9] ceph: add caps perf metric for each session xiubli
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: xiubli @ 2020-02-10  5:33 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

For the dentry lease we only count the hit/miss info triggered by vfs
calls; cases like request reply handling and the periodic
ceph_trim_dentries() are ignored.

Currently only debugfs output is supported.

The output will be:

item          total           miss            hit
-------------------------------------------------
d_lease       11              7               141

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/debugfs.c    | 32 ++++++++++++++++++++++++++++----
 fs/ceph/dir.c        | 16 ++++++++++++++--
 fs/ceph/mds_client.c | 37 +++++++++++++++++++++++++++++++++++--
 fs/ceph/mds_client.h |  4 ++++
 fs/ceph/metric.h     | 11 +++++++++++
 fs/ceph/super.h      |  1 +
 6 files changed, 93 insertions(+), 8 deletions(-)
 create mode 100644 fs/ceph/metric.h

diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index 481ac97b4d25..15975ba95d9a 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -124,6 +124,22 @@ static int mdsc_show(struct seq_file *s, void *p)
 	return 0;
 }
 
+static int metric_show(struct seq_file *s, void *p)
+{
+	struct ceph_fs_client *fsc = s->private;
+	struct ceph_mds_client *mdsc = fsc->mdsc;
+
+	seq_printf(s, "item          total           miss            hit\n");
+	seq_printf(s, "-------------------------------------------------\n");
+
+	seq_printf(s, "%-14s%-16lld%-16lld%lld\n", "d_lease",
+		   atomic64_read(&mdsc->metric.total_dentries),
+		   percpu_counter_sum(&mdsc->metric.d_lease_mis),
+		   percpu_counter_sum(&mdsc->metric.d_lease_hit));
+
+	return 0;
+}
+
 static int caps_show_cb(struct inode *inode, struct ceph_cap *cap, void *p)
 {
 	struct seq_file *s = p;
@@ -222,6 +238,7 @@ DEFINE_SHOW_ATTRIBUTE(mdsmap);
 DEFINE_SHOW_ATTRIBUTE(mdsc);
 DEFINE_SHOW_ATTRIBUTE(caps);
 DEFINE_SHOW_ATTRIBUTE(mds_sessions);
+DEFINE_SHOW_ATTRIBUTE(metric);
 
 
 /*
@@ -255,6 +272,7 @@ void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc)
 	debugfs_remove(fsc->debugfs_mdsmap);
 	debugfs_remove(fsc->debugfs_mds_sessions);
 	debugfs_remove(fsc->debugfs_caps);
+	debugfs_remove(fsc->debugfs_metric);
 	debugfs_remove(fsc->debugfs_mdsc);
 }
 
@@ -295,11 +313,17 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc)
 						fsc,
 						&mdsc_fops);
 
+	fsc->debugfs_metric = debugfs_create_file("metrics",
+						  0400,
+						  fsc->client->debugfs_dir,
+						  fsc,
+						  &metric_fops);
+
 	fsc->debugfs_caps = debugfs_create_file("caps",
-						   0400,
-						   fsc->client->debugfs_dir,
-						   fsc,
-						   &caps_fops);
+						0400,
+						fsc->client->debugfs_dir,
+						fsc,
+						&caps_fops);
 }
 
 
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 9e6c711c4b70..4771bf61d562 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -38,6 +38,8 @@ static int __dir_lease_try_check(const struct dentry *dentry);
 static int ceph_d_init(struct dentry *dentry)
 {
 	struct ceph_dentry_info *di;
+	struct ceph_fs_client *fsc = ceph_sb_to_client(dentry->d_sb);
+	struct ceph_mds_client *mdsc = fsc->mdsc;
 
 	di = kmem_cache_zalloc(ceph_dentry_cachep, GFP_KERNEL);
 	if (!di)
@@ -48,6 +50,9 @@ static int ceph_d_init(struct dentry *dentry)
 	di->time = jiffies;
 	dentry->d_fsdata = di;
 	INIT_LIST_HEAD(&di->lease_list);
+
+	atomic64_inc(&mdsc->metric.total_dentries);
+
 	return 0;
 }
 
@@ -1613,6 +1618,7 @@ static int dir_lease_is_valid(struct inode *dir, struct dentry *dentry)
  */
 static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
 {
+	struct ceph_mds_client *mdsc;
 	int valid = 0;
 	struct dentry *parent;
 	struct inode *dir, *inode;
@@ -1651,9 +1657,8 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
 		}
 	}
 
+	mdsc = ceph_sb_to_client(dir->i_sb)->mdsc;
 	if (!valid) {
-		struct ceph_mds_client *mdsc =
-			ceph_sb_to_client(dir->i_sb)->mdsc;
 		struct ceph_mds_request *req;
 		int op, err;
 		u32 mask;
@@ -1661,6 +1666,8 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
 		if (flags & LOOKUP_RCU)
 			return -ECHILD;
 
+		percpu_counter_inc(&mdsc->metric.d_lease_mis);
+
 		op = ceph_snap(dir) == CEPH_SNAPDIR ?
 			CEPH_MDS_OP_LOOKUPSNAP : CEPH_MDS_OP_LOOKUP;
 		req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
@@ -1692,6 +1699,8 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
 			dout("d_revalidate %p lookup result=%d\n",
 			     dentry, err);
 		}
+	} else {
+		percpu_counter_inc(&mdsc->metric.d_lease_hit);
 	}
 
 	dout("d_revalidate %p %s\n", dentry, valid ? "valid" : "invalid");
@@ -1734,9 +1743,12 @@ static int ceph_d_delete(const struct dentry *dentry)
 static void ceph_d_release(struct dentry *dentry)
 {
 	struct ceph_dentry_info *di = ceph_dentry(dentry);
+	struct ceph_fs_client *fsc = ceph_sb_to_client(dentry->d_sb);
 
 	dout("d_release %p\n", dentry);
 
+	atomic64_dec(&fsc->mdsc->metric.total_dentries);
+
 	spin_lock(&dentry->d_lock);
 	__dentry_lease_unlist(di);
 	dentry->d_fsdata = NULL;
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 8263f75badfc..a24fd00676b8 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4158,10 +4158,31 @@ static void delayed_work(struct work_struct *work)
 	schedule_delayed(mdsc);
 }
 
+static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
+{
+	int ret;
+
+	if (!metric)
+		return -EINVAL;
+
+	atomic64_set(&metric->total_dentries, 0);
+	ret = percpu_counter_init(&metric->d_lease_hit, 0, GFP_KERNEL);
+	if (ret)
+		return ret;
+	ret = percpu_counter_init(&metric->d_lease_mis, 0, GFP_KERNEL);
+	if (ret) {
+		percpu_counter_destroy(&metric->d_lease_hit);
+		return ret;
+	}
+
+	return 0;
+}
+
 int ceph_mdsc_init(struct ceph_fs_client *fsc)
 
 {
 	struct ceph_mds_client *mdsc;
+	int err;
 
 	mdsc = kzalloc(sizeof(struct ceph_mds_client), GFP_NOFS);
 	if (!mdsc)
@@ -4170,8 +4191,8 @@ int ceph_mdsc_init(struct ceph_fs_client *fsc)
 	mutex_init(&mdsc->mutex);
 	mdsc->mdsmap = kzalloc(sizeof(*mdsc->mdsmap), GFP_NOFS);
 	if (!mdsc->mdsmap) {
-		kfree(mdsc);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto err_mdsc;
 	}
 
 	fsc->mdsc = mdsc;
@@ -4210,6 +4231,9 @@ int ceph_mdsc_init(struct ceph_fs_client *fsc)
 	init_waitqueue_head(&mdsc->cap_flushing_wq);
 	INIT_WORK(&mdsc->cap_reclaim_work, ceph_cap_reclaim_work);
 	atomic_set(&mdsc->cap_reclaim_pending, 0);
+	err = ceph_mdsc_metric_init(&mdsc->metric);
+	if (err)
+		goto err_mdsmap;
 
 	spin_lock_init(&mdsc->dentry_list_lock);
 	INIT_LIST_HEAD(&mdsc->dentry_leases);
@@ -4228,6 +4252,12 @@ int ceph_mdsc_init(struct ceph_fs_client *fsc)
 	strscpy(mdsc->nodename, utsname()->nodename,
 		sizeof(mdsc->nodename));
 	return 0;
+
+err_mdsmap:
+	kfree(mdsc->mdsmap);
+err_mdsc:
+	kfree(mdsc);
+	return err;
 }
 
 /*
@@ -4485,6 +4515,9 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
 
 	ceph_mdsc_stop(mdsc);
 
+	percpu_counter_destroy(&mdsc->metric.d_lease_mis);
+	percpu_counter_destroy(&mdsc->metric.d_lease_hit);
+
 	fsc->mdsc = NULL;
 	kfree(mdsc);
 	dout("mdsc_destroy %p done\n", mdsc);
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 27a7446e10d3..674fc7725913 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -16,6 +16,8 @@
 #include <linux/ceph/mdsmap.h>
 #include <linux/ceph/auth.h>
 
+#include "metric.h"
+
 /* The first 8 bits are reserved for old ceph releases */
 enum ceph_feature_type {
 	CEPHFS_FEATURE_MIMIC = 8,
@@ -446,6 +448,8 @@ struct ceph_mds_client {
 	struct list_head  dentry_leases;     /* fifo list */
 	struct list_head  dentry_dir_leases; /* lru list */
 
+	struct ceph_client_metric metric;
+
 	spinlock_t		snapid_map_lock;
 	struct rb_root		snapid_map_tree;
 	struct list_head	snapid_map_lru;
diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
new file mode 100644
index 000000000000..998fe2a643cf
--- /dev/null
+++ b/fs/ceph/metric.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _FS_CEPH_MDS_METRIC_H
+#define _FS_CEPH_MDS_METRIC_H
+
+/* This is the global metrics */
+struct ceph_client_metric {
+	atomic64_t            total_dentries;
+	struct percpu_counter d_lease_hit;
+	struct percpu_counter d_lease_mis;
+};
+#endif /* _FS_CEPH_MDS_METRIC_H */
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 70aa32cfb64d..5241efe0f9d0 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -128,6 +128,7 @@ struct ceph_fs_client {
 	struct dentry *debugfs_congestion_kb;
 	struct dentry *debugfs_bdi;
 	struct dentry *debugfs_mdsc, *debugfs_mdsmap;
+	struct dentry *debugfs_metric;
 	struct dentry *debugfs_mds_sessions;
 #endif
 
-- 
2.21.0


* [PATCH v6 2/9] ceph: add caps perf metric for each session
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
  2020-02-10  5:33 ` [PATCH v6 1/9] ceph: add global dentry lease metric support xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-17 13:27   ` Jeff Layton
  2020-02-10  5:34 ` [PATCH v6 3/9] ceph: add global read latency metric support xiubli
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

This implements the per-superblock cap hit/miss metric. The hit/miss
counters are counted per inode: if an inode's 'issued & ~revoking'
covers the requested mask it counts as a hit, otherwise as a miss.

item          total           miss            hit
-------------------------------------------------
caps          295             107             4119
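
In pseudo-C, the check added by this patch (a simplified sketch of the
__ceph_caps_metric() hunk below in caps.c) boils down to:

    issued = __ceph_caps_issued(ci, NULL);
    if ((mask & issued) == mask)
            percpu_counter_inc(&metric->i_caps_hit);
    else
            percpu_counter_inc(&metric->i_caps_mis);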

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/acl.c        |  2 ++
 fs/ceph/caps.c       | 29 +++++++++++++++++++++++++++++
 fs/ceph/debugfs.c    | 16 ++++++++++++++++
 fs/ceph/dir.c        |  9 +++++++--
 fs/ceph/file.c       |  2 ++
 fs/ceph/mds_client.c | 26 ++++++++++++++++++++++----
 fs/ceph/metric.h     |  3 +++
 fs/ceph/quota.c      |  9 +++++++--
 fs/ceph/super.h      |  9 +++++++++
 fs/ceph/xattr.c      | 17 ++++++++++++++---
 10 files changed, 111 insertions(+), 11 deletions(-)

diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c
index 26be6520d3fb..58e119e3519f 100644
--- a/fs/ceph/acl.c
+++ b/fs/ceph/acl.c
@@ -22,6 +22,8 @@ static inline void ceph_set_cached_acl(struct inode *inode,
 	struct ceph_inode_info *ci = ceph_inode(inode);
 
 	spin_lock(&ci->i_ceph_lock);
+	__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
+
 	if (__ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 0))
 		set_cached_acl(inode, type, acl);
 	else
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 7fc87b693ba4..b4f122eb74bb 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -818,6 +818,32 @@ int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented)
 	return have;
 }
 
+/*
+ * Counts the cap metric.
+ *
+ * This will try to traverse all the ci->i_caps, if we can
+ * get all the cap 'mask' it will count the hit, or the mis.
+ */
+void __ceph_caps_metric(struct ceph_inode_info *ci, int mask)
+{
+	struct ceph_mds_client *mdsc =
+		ceph_sb_to_client(ci->vfs_inode.i_sb)->mdsc;
+	struct ceph_client_metric *metric = &mdsc->metric;
+	int issued;
+
+	lockdep_assert_held(&ci->i_ceph_lock);
+
+	if (mask <= 0)
+		return;
+
+	issued = __ceph_caps_issued(ci, NULL);
+
+	if ((mask & issued) == mask)
+		percpu_counter_inc(&metric->i_caps_hit);
+	else
+		percpu_counter_inc(&metric->i_caps_mis);
+}
+
 /*
  * Get cap bits issued by caps other than @ocap
  */
@@ -2758,6 +2784,7 @@ int ceph_try_get_caps(struct inode *inode, int need, int want,
 	BUG_ON(want & ~(CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO |
 			CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL |
 			CEPH_CAP_ANY_DIR_OPS));
+	ceph_caps_metric(ceph_inode(inode), need | want);
 	ret = try_get_cap_refs(inode, need, want, 0, nonblock, got);
 	return ret == -EAGAIN ? 0 : ret;
 }
@@ -2784,6 +2811,8 @@ int ceph_get_caps(struct file *filp, int need, int want,
 	    fi->filp_gen != READ_ONCE(fsc->filp_gen))
 		return -EBADF;
 
+	ceph_caps_metric(ci, need | want);
+
 	while (true) {
 		if (endoff > 0)
 			check_max_size(inode, endoff);
diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index 15975ba95d9a..c83e52bd9961 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -128,6 +128,7 @@ static int metric_show(struct seq_file *s, void *p)
 {
 	struct ceph_fs_client *fsc = s->private;
 	struct ceph_mds_client *mdsc = fsc->mdsc;
+	int i, nr_caps = 0;
 
 	seq_printf(s, "item          total           miss            hit\n");
 	seq_printf(s, "-------------------------------------------------\n");
@@ -137,6 +138,21 @@ static int metric_show(struct seq_file *s, void *p)
 		   percpu_counter_sum(&mdsc->metric.d_lease_mis),
 		   percpu_counter_sum(&mdsc->metric.d_lease_hit));
 
+	mutex_lock(&mdsc->mutex);
+	for (i = 0; i < mdsc->max_sessions; i++) {
+		struct ceph_mds_session *s;
+
+		s = __ceph_lookup_mds_session(mdsc, i);
+		if (!s)
+			continue;
+		nr_caps += s->s_nr_caps;
+		ceph_put_mds_session(s);
+	}
+	mutex_unlock(&mdsc->mutex);
+	seq_printf(s, "%-14s%-16d%-16lld%lld\n", "caps", nr_caps,
+		   percpu_counter_sum(&mdsc->metric.i_caps_mis),
+		   percpu_counter_sum(&mdsc->metric.i_caps_hit));
+
 	return 0;
 }
 
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 4771bf61d562..ffeaff5bf211 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -313,7 +313,7 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx)
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_mds_client *mdsc = fsc->mdsc;
 	int i;
-	int err;
+	int err, ret = -1;
 	unsigned frag = -1;
 	struct ceph_mds_reply_info_parsed *rinfo;
 
@@ -346,13 +346,16 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx)
 	    !ceph_test_mount_opt(fsc, NOASYNCREADDIR) &&
 	    ceph_snap(inode) != CEPH_SNAPDIR &&
 	    __ceph_dir_is_complete_ordered(ci) &&
-	    __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1)) {
+	    (ret = __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1))) {
 		int shared_gen = atomic_read(&ci->i_shared_gen);
+		__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
 		spin_unlock(&ci->i_ceph_lock);
 		err = __dcache_readdir(file, ctx, shared_gen);
 		if (err != -EAGAIN)
 			return err;
 	} else {
+		if (ret != -1)
+			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
 		spin_unlock(&ci->i_ceph_lock);
 	}
 
@@ -757,6 +760,8 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
 		struct ceph_dentry_info *di = ceph_dentry(dentry);
 
 		spin_lock(&ci->i_ceph_lock);
+		__ceph_caps_metric(ci, CEPH_CAP_FILE_SHARED);
+
 		dout(" dir %p flags are %d\n", dir, ci->i_ceph_flags);
 		if (strncmp(dentry->d_name.name,
 			    fsc->mount_options->snapdir_name,
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 4d1b5cc6dd3b..96803500b712 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -384,6 +384,8 @@ int ceph_open(struct inode *inode, struct file *file)
 	 * asynchronously.
 	 */
 	spin_lock(&ci->i_ceph_lock);
+	__ceph_caps_metric(ci, wanted);
+
 	if (__ceph_is_any_real_caps(ci) &&
 	    (((fmode & CEPH_FILE_MODE_WR) == 0) || ci->i_auth_cap)) {
 		int mds_wanted = __ceph_caps_mds_wanted(ci, true);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index a24fd00676b8..1431e52e9558 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4169,13 +4169,29 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
 	ret = percpu_counter_init(&metric->d_lease_hit, 0, GFP_KERNEL);
 	if (ret)
 		return ret;
+
 	ret = percpu_counter_init(&metric->d_lease_mis, 0, GFP_KERNEL);
-	if (ret) {
-		percpu_counter_destroy(&metric->d_lease_hit);
-		return ret;
-	}
+	if (ret)
+		goto err_d_lease_mis;
+
+	ret = percpu_counter_init(&metric->i_caps_hit, 0, GFP_KERNEL);
+	if (ret)
+		goto err_i_caps_hit;
+
+	ret = percpu_counter_init(&metric->i_caps_mis, 0, GFP_KERNEL);
+	if (ret)
+		goto err_i_caps_mis;
 
 	return 0;
+
+err_i_caps_mis:
+	percpu_counter_destroy(&metric->i_caps_hit);
+err_i_caps_hit:
+	percpu_counter_destroy(&metric->d_lease_mis);
+err_d_lease_mis:
+	percpu_counter_destroy(&metric->d_lease_hit);
+
+	return ret;
 }
 
 int ceph_mdsc_init(struct ceph_fs_client *fsc)
@@ -4515,6 +4531,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
 
 	ceph_mdsc_stop(mdsc);
 
+	percpu_counter_destroy(&mdsc->metric.i_caps_mis);
+	percpu_counter_destroy(&mdsc->metric.i_caps_hit);
 	percpu_counter_destroy(&mdsc->metric.d_lease_mis);
 	percpu_counter_destroy(&mdsc->metric.d_lease_hit);
 
diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
index 998fe2a643cf..e2fceb38a924 100644
--- a/fs/ceph/metric.h
+++ b/fs/ceph/metric.h
@@ -7,5 +7,8 @@ struct ceph_client_metric {
 	atomic64_t            total_dentries;
 	struct percpu_counter d_lease_hit;
 	struct percpu_counter d_lease_mis;
+
+	struct percpu_counter i_caps_hit;
+	struct percpu_counter i_caps_mis;
 };
 #endif /* _FS_CEPH_MDS_METRIC_H */
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index de56dee60540..4ce2f658e63d 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -147,9 +147,14 @@ static struct inode *lookup_quotarealm_inode(struct ceph_mds_client *mdsc,
 		return NULL;
 	}
 	if (qri->inode) {
+		struct ceph_inode_info *ci = ceph_inode(qri->inode);
+		int ret;
+
+		ceph_caps_metric(ci, CEPH_STAT_CAP_INODE);
+
 		/* get caps */
-		int ret = __ceph_do_getattr(qri->inode, NULL,
-					    CEPH_STAT_CAP_INODE, true);
+		ret = __ceph_do_getattr(qri->inode, NULL,
+					CEPH_STAT_CAP_INODE, true);
 		if (ret >= 0)
 			in = qri->inode;
 		else
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 5241efe0f9d0..44b9a971ec9a 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -641,6 +641,14 @@ static inline bool __ceph_is_any_real_caps(struct ceph_inode_info *ci)
 	return !RB_EMPTY_ROOT(&ci->i_caps);
 }
 
+extern void __ceph_caps_metric(struct ceph_inode_info *ci, int mask);
+static inline void ceph_caps_metric(struct ceph_inode_info *ci, int mask)
+{
+	spin_lock(&ci->i_ceph_lock);
+	__ceph_caps_metric(ci, mask);
+	spin_unlock(&ci->i_ceph_lock);
+}
+
 extern int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented);
 extern int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int t);
 extern int __ceph_caps_issued_other(struct ceph_inode_info *ci,
@@ -927,6 +935,7 @@ extern int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
 			     int mask, bool force);
 static inline int ceph_do_getattr(struct inode *inode, int mask, bool force)
 {
+	ceph_caps_metric(ceph_inode(inode), mask);
 	return __ceph_do_getattr(inode, NULL, mask, force);
 }
 extern int ceph_permission(struct inode *inode, int mask);
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
index 7b8a070a782d..9b28e87b6719 100644
--- a/fs/ceph/xattr.c
+++ b/fs/ceph/xattr.c
@@ -829,6 +829,7 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
 	struct ceph_vxattr *vxattr = NULL;
 	int req_mask;
 	ssize_t err;
+	int ret = -1;
 
 	/* let's see if a virtual xattr was requested */
 	vxattr = ceph_match_vxattr(inode, name);
@@ -856,7 +857,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
 
 	if (ci->i_xattrs.version == 0 ||
 	    !((req_mask & CEPH_CAP_XATTR_SHARED) ||
-	      __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1))) {
+	      (ret = __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1)))) {
+		if (ret != -1)
+			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
 		spin_unlock(&ci->i_ceph_lock);
 
 		/* security module gets xattr while filling trace */
@@ -871,6 +874,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
 		if (err)
 			return err;
 		spin_lock(&ci->i_ceph_lock);
+	} else {
+		if (ret != -1)
+			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
 	}
 
 	err = __build_xattrs(inode);
@@ -907,19 +913,24 @@ ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size)
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	bool len_only = (size == 0);
 	u32 namelen;
-	int err;
+	int err, ret = -1;
 
 	spin_lock(&ci->i_ceph_lock);
 	dout("listxattr %p ver=%lld index_ver=%lld\n", inode,
 	     ci->i_xattrs.version, ci->i_xattrs.index_version);
 
 	if (ci->i_xattrs.version == 0 ||
-	    !__ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1)) {
+	    !(ret = __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1))) {
+		if (ret != -1)
+			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
 		spin_unlock(&ci->i_ceph_lock);
 		err = ceph_do_getattr(inode, CEPH_STAT_CAP_XATTR, true);
 		if (err)
 			return err;
 		spin_lock(&ci->i_ceph_lock);
+	} else {
+		if (ret != -1)
+			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
 	}
 
 	err = __build_xattrs(inode);
-- 
2.21.0


* [PATCH v6 3/9] ceph: add global read latency metric support
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
  2020-02-10  5:33 ` [PATCH v6 1/9] ceph: add global dentry lease metric support xiubli
  2020-02-10  5:34 ` [PATCH v6 2/9] ceph: add caps perf metric for each session xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-10  5:34 ` [PATCH v6 4/9] ceph: add global write " xiubli
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

It calculates the latency of read osd requests, which only includes
the time consumed by the network and the ceph osd. The latency is
accumulated in jiffies and shown in microseconds; in the sample below,
avg_lat = sum_lat / total (848000 / 1036 ≈ 818 us).

item          total       sum_lat(us)     avg_lat(us)
-----------------------------------------------------
read          1036        848000          818

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/addr.c                  |  6 ++++++
 fs/ceph/debugfs.c               | 11 +++++++++++
 fs/ceph/file.c                  | 13 +++++++++++++
 fs/ceph/mds_client.c            | 14 ++++++++++++++
 fs/ceph/metric.h                | 20 ++++++++++++++++++++
 include/linux/ceph/osd_client.h |  1 +
 net/ceph/osd_client.c           |  2 ++
 7 files changed, 67 insertions(+)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 7136f9947354..1cc47a062a6c 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -216,6 +216,8 @@ static int ceph_sync_readpages(struct ceph_fs_client *fsc,
 	if (!rc)
 		rc = ceph_osdc_wait_request(osdc, req);
 
+	ceph_update_read_latency(&fsc->mdsc->metric, req, rc);
+
 	ceph_osdc_put_request(req);
 	dout("readpages result %d\n", rc);
 	return rc;
@@ -299,6 +301,7 @@ static int ceph_readpage(struct file *filp, struct page *page)
 static void finish_read(struct ceph_osd_request *req)
 {
 	struct inode *inode = req->r_inode;
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_osd_data *osd_data;
 	int rc = req->r_result <= 0 ? req->r_result : 0;
 	int bytes = req->r_result >= 0 ? req->r_result : 0;
@@ -336,6 +339,9 @@ static void finish_read(struct ceph_osd_request *req)
 		put_page(page);
 		bytes -= PAGE_SIZE;
 	}
+
+	ceph_update_read_latency(&fsc->mdsc->metric, req, rc);
+
 	kfree(osd_data->pages);
 }
 
diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index c83e52bd9961..d814a3a27611 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -129,7 +129,18 @@ static int metric_show(struct seq_file *s, void *p)
 	struct ceph_fs_client *fsc = s->private;
 	struct ceph_mds_client *mdsc = fsc->mdsc;
 	int i, nr_caps = 0;
+	s64 total, sum, avg = 0;
 
+	seq_printf(s, "item          total       sum_lat(us)     avg_lat(us)\n");
+	seq_printf(s, "-----------------------------------------------------\n");
+
+	total = percpu_counter_sum(&mdsc->metric.total_reads);
+	sum = percpu_counter_sum(&mdsc->metric.read_latency_sum);
+	sum = jiffies_to_usecs(sum);
+	avg = total ? sum / total : 0;
+	seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "read", total, sum, avg);
+
+	seq_printf(s, "\n");
 	seq_printf(s, "item          total           miss            hit\n");
 	seq_printf(s, "-------------------------------------------------\n");
 
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 96803500b712..3526673bd51e 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -660,6 +660,9 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
 		ret = ceph_osdc_start_request(osdc, req, false);
 		if (!ret)
 			ret = ceph_osdc_wait_request(osdc, req);
+
+		ceph_update_read_latency(&fsc->mdsc->metric, req, ret);
+
 		ceph_osdc_put_request(req);
 
 		i_size = i_size_read(inode);
@@ -798,6 +801,8 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
 	struct inode *inode = req->r_inode;
 	struct ceph_aio_request *aio_req = req->r_priv;
 	struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0);
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_client_metric *metric = &fsc->mdsc->metric;
 
 	BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_BVECS);
 	BUG_ON(!osd_data->num_bvecs);
@@ -805,6 +810,10 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
 	dout("ceph_aio_complete_req %p rc %d bytes %u\n",
 	     inode, rc, osd_data->bvec_pos.iter.bi_size);
 
+	/* r_start_stamp == 0 means the request was not submitted */
+	if (req->r_start_stamp && !aio_req->write)
+		ceph_update_read_latency(metric, req, rc);
+
 	if (rc == -EOLDSNAPC) {
 		struct ceph_aio_work *aio_work;
 		BUG_ON(!aio_req->write);
@@ -933,6 +942,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_client_metric *metric = &fsc->mdsc->metric;
 	struct ceph_vino vino;
 	struct ceph_osd_request *req;
 	struct bio_vec *bvecs;
@@ -1049,6 +1059,9 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 		if (!ret)
 			ret = ceph_osdc_wait_request(&fsc->client->osdc, req);
 
+		if (!write)
+			ceph_update_read_latency(metric, req, ret);
+
 		size = i_size_read(inode);
 		if (!write) {
 			if (ret == -ENOENT)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 1431e52e9558..e2d8312cc332 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4182,8 +4182,20 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
 	if (ret)
 		goto err_i_caps_mis;
 
+	ret = percpu_counter_init(&metric->total_reads, 0, GFP_KERNEL);
+	if (ret)
+		goto err_total_reads;
+
+	ret = percpu_counter_init(&metric->read_latency_sum, 0, GFP_KERNEL);
+	if (ret)
+		goto err_read_latency_sum;
+
 	return 0;
 
+err_read_latency_sum:
+	percpu_counter_destroy(&metric->total_reads);
+err_total_reads:
+	percpu_counter_destroy(&metric->i_caps_mis);
 err_i_caps_mis:
 	percpu_counter_destroy(&metric->i_caps_hit);
 err_i_caps_hit:
@@ -4531,6 +4543,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
 
 	ceph_mdsc_stop(mdsc);
 
+	percpu_counter_destroy(&mdsc->metric.read_latency_sum);
+	percpu_counter_destroy(&mdsc->metric.total_reads);
 	percpu_counter_destroy(&mdsc->metric.i_caps_mis);
 	percpu_counter_destroy(&mdsc->metric.i_caps_hit);
 	percpu_counter_destroy(&mdsc->metric.d_lease_mis);
diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
index e2fceb38a924..afea44a3794b 100644
--- a/fs/ceph/metric.h
+++ b/fs/ceph/metric.h
@@ -2,6 +2,8 @@
 #ifndef _FS_CEPH_MDS_METRIC_H
 #define _FS_CEPH_MDS_METRIC_H
 
+#include <linux/ceph/osd_client.h>
+
 /* This is the global metrics */
 struct ceph_client_metric {
 	atomic64_t            total_dentries;
@@ -10,5 +12,23 @@ struct ceph_client_metric {
 
 	struct percpu_counter i_caps_hit;
 	struct percpu_counter i_caps_mis;
+
+	struct percpu_counter total_reads;
+	struct percpu_counter read_latency_sum;
 };
+
+static inline void ceph_update_read_latency(struct ceph_client_metric *m,
+					    struct ceph_osd_request *req,
+					    int rc)
+{
+	if (!m || !req)
+		return;
+
+	if (rc >= 0 || rc == -ENOENT || rc == -ETIMEDOUT) {
+		s64 latency = req->r_end_stamp - req->r_start_stamp;
+
+		percpu_counter_inc(&m->total_reads);
+		percpu_counter_add(&m->read_latency_sum, latency);
+	}
+}
 #endif /* _FS_CEPH_MDS_METRIC_H */
diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 9d9f745b98a1..02ff3a302d26 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -213,6 +213,7 @@ struct ceph_osd_request {
 	/* internal */
 	unsigned long r_stamp;                /* jiffies, send or check time */
 	unsigned long r_start_stamp;          /* jiffies */
+	unsigned long r_end_stamp;            /* jiffies */
 	int r_attempts;
 	u32 r_map_dne_bound;
 
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 8ff2856e2d52..108c9457d629 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -2389,6 +2389,8 @@ static void finish_request(struct ceph_osd_request *req)
 	WARN_ON(lookup_request_mc(&osdc->map_checks, req->r_tid));
 	dout("%s req %p tid %llu\n", __func__, req, req->r_tid);
 
+	req->r_end_stamp = jiffies;
+
 	if (req->r_osd)
 		unlink_request(req->r_osd, req);
 	atomic_dec(&osdc->num_requests);
-- 
2.21.0


* [PATCH v6 4/9] ceph: add global write latency metric support
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
                   ` (2 preceding siblings ...)
  2020-02-10  5:34 ` [PATCH v6 3/9] ceph: add global read latency metric support xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-10  5:34 ` [PATCH v6 5/9] ceph: add global metadata perf " xiubli
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

It calculates the latency of write osd requests, which only includes
the time consumed by the network and the ceph osd.

item          total       sum_lat(us)     avg_lat(us)
-----------------------------------------------------
write         1048        8778000         8375

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/addr.c       |  7 +++++++
 fs/ceph/debugfs.c    |  6 ++++++
 fs/ceph/file.c       | 13 ++++++++++---
 fs/ceph/mds_client.c | 14 ++++++++++++++
 fs/ceph/metric.h     | 18 ++++++++++++++++++
 5 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 1cc47a062a6c..d14392b58f16 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -649,6 +649,8 @@ static int ceph_sync_writepages(struct ceph_fs_client *fsc,
 	if (!rc)
 		rc = ceph_osdc_wait_request(osdc, req);
 
+	ceph_update_write_latency(&fsc->mdsc->metric, req, rc);
+
 	ceph_osdc_put_request(req);
 	if (rc == 0)
 		rc = len;
@@ -800,6 +802,8 @@ static void writepages_finish(struct ceph_osd_request *req)
 		ceph_clear_error_write(ci);
 	}
 
+	ceph_update_write_latency(&fsc->mdsc->metric, req, rc);
+
 	/*
 	 * We lost the cache cap, need to truncate the page before
 	 * it is unlocked, otherwise we'd truncate it later in the
@@ -1858,6 +1862,9 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page)
 	err = ceph_osdc_start_request(&fsc->client->osdc, req, false);
 	if (!err)
 		err = ceph_osdc_wait_request(&fsc->client->osdc, req);
+
+	ceph_update_write_latency(&fsc->mdsc->metric, req, err);
+
 out_put:
 	ceph_osdc_put_request(req);
 	if (err == -ECANCELED)
diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index d814a3a27611..464bfbdb970d 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -140,6 +140,12 @@ static int metric_show(struct seq_file *s, void *p)
 	avg = total ? sum / total : 0;
 	seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "read", total, sum, avg);
 
+	total = percpu_counter_sum(&mdsc->metric.total_writes);
+	sum = percpu_counter_sum(&mdsc->metric.write_latency_sum);
+	sum = jiffies_to_usecs(sum);
+	avg = total ? sum / total : 0;
+	seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "write", total, sum, avg);
+
 	seq_printf(s, "\n");
 	seq_printf(s, "item          total           miss            hit\n");
 	seq_printf(s, "-------------------------------------------------\n");
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 3526673bd51e..f970a3aa349a 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -811,8 +811,12 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
 	     inode, rc, osd_data->bvec_pos.iter.bi_size);
 
 	/* r_start_stamp == 0 means the request was not submitted */
-	if (req->r_start_stamp && !aio_req->write)
-		ceph_update_read_latency(metric, req, rc);
+	if (req->r_start_stamp) {
+		if (aio_req->write)
+			ceph_update_write_latency(metric, req, rc);
+		else
+			ceph_update_read_latency(metric, req, rc);
+	}
 
 	if (rc == -EOLDSNAPC) {
 		struct ceph_aio_work *aio_work;
@@ -1059,7 +1063,9 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 		if (!ret)
 			ret = ceph_osdc_wait_request(&fsc->client->osdc, req);
 
-		if (!write)
+		if (write)
+			ceph_update_write_latency(metric, req, ret);
+		else
 			ceph_update_read_latency(metric, req, ret);
 
 		size = i_size_read(inode);
@@ -1233,6 +1239,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 		if (!ret)
 			ret = ceph_osdc_wait_request(&fsc->client->osdc, req);
 
+		ceph_update_write_latency(&fsc->mdsc->metric, req, ret);
 out:
 		ceph_osdc_put_request(req);
 		if (ret != 0) {
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index e2d8312cc332..cc2b426cd8e4 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4190,8 +4190,20 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
 	if (ret)
 		goto err_read_latency_sum;
 
+	ret = percpu_counter_init(&metric->total_writes, 0, GFP_KERNEL);
+	if (ret)
+		goto err_total_writes;
+
+	ret = percpu_counter_init(&metric->write_latency_sum, 0, GFP_KERNEL);
+	if (ret)
+		goto err_write_latency_sum;
+
 	return 0;
 
+err_write_latency_sum:
+	percpu_counter_destroy(&metric->total_writes);
+err_total_writes:
+	percpu_counter_destroy(&metric->read_latency_sum);
 err_read_latency_sum:
 	percpu_counter_destroy(&metric->total_reads);
 err_total_reads:
@@ -4543,6 +4555,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
 
 	ceph_mdsc_stop(mdsc);
 
+	percpu_counter_destroy(&mdsc->metric.write_latency_sum);
+	percpu_counter_destroy(&mdsc->metric.total_writes);
 	percpu_counter_destroy(&mdsc->metric.read_latency_sum);
 	percpu_counter_destroy(&mdsc->metric.total_reads);
 	percpu_counter_destroy(&mdsc->metric.i_caps_mis);
diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
index afea44a3794b..a87197f3e915 100644
--- a/fs/ceph/metric.h
+++ b/fs/ceph/metric.h
@@ -15,6 +15,9 @@ struct ceph_client_metric {
 
 	struct percpu_counter total_reads;
 	struct percpu_counter read_latency_sum;
+
+	struct percpu_counter total_writes;
+	struct percpu_counter write_latency_sum;
 };
 
 static inline void ceph_update_read_latency(struct ceph_client_metric *m,
@@ -31,4 +34,19 @@ static inline void ceph_update_read_latency(struct ceph_client_metric *m,
 		percpu_counter_add(&m->read_latency_sum, latency);
 	}
 }
+
+static inline void ceph_update_write_latency(struct ceph_client_metric *m,
+					     struct ceph_osd_request *req,
+					     int rc)
+{
+	if (!m || !req)
+		return;
+
+	if (!rc || rc == -ETIMEDOUT) {
+		s64 latency = req->r_end_stamp - req->r_start_stamp;
+
+		percpu_counter_inc(&m->total_writes);
+		percpu_counter_add(&m->write_latency_sum, latency);
+	}
+}
 #endif /* _FS_CEPH_MDS_METRIC_H */
-- 
2.21.0


* [PATCH v6 5/9] ceph: add global metadata perf metric support
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
                   ` (3 preceding siblings ...)
  2020-02-10  5:34 ` [PATCH v6 4/9] ceph: add global write " xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-10  5:34 ` [PATCH v6 6/9] ceph: periodically send perf metrics to ceph xiubli
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

It calculates the latency of metadata requests, which only includes
the time consumed by the network and the ceph mds. Unlike the osd
read/write latencies, this one is measured in handle_reply() as
jiffies - req->r_started.

item          total       sum_lat(us)     avg_lat(us)
-----------------------------------------------------
metadata      113         220000          1946

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/debugfs.c    |  6 ++++++
 fs/ceph/mds_client.c | 20 ++++++++++++++++++++
 fs/ceph/metric.h     | 13 +++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index 464bfbdb970d..60f3e307fca1 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -146,6 +146,12 @@ static int metric_show(struct seq_file *s, void *p)
 	avg = total ? sum / total : 0;
 	seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "write", total, sum, avg);
 
+	total = percpu_counter_sum(&mdsc->metric.total_metadatas);
+	sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum);
+	sum = jiffies_to_usecs(sum);
+	avg = total ? sum / total : 0;
+	seq_printf(s, "%-14s%-12lld%-16lld%lld\n", "metadata", total, sum, avg);
+
 	seq_printf(s, "\n");
 	seq_printf(s, "item          total           miss            hit\n");
 	seq_printf(s, "-------------------------------------------------\n");
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index cc2b426cd8e4..d414eded6810 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -3019,6 +3019,12 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
 
 	/* kick calling process */
 	complete_request(mdsc, req);
+
+	if (!result || result == -ENOENT) {
+		s64 latency = jiffies - req->r_started;
+
+		ceph_update_metadata_latency(&mdsc->metric, latency);
+	}
 out:
 	ceph_mdsc_put_request(req);
 	return;
@@ -4198,8 +4204,20 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
 	if (ret)
 		goto err_write_latency_sum;
 
+	ret = percpu_counter_init(&metric->total_metadatas, 0, GFP_KERNEL);
+	if (ret)
+		goto err_total_metadatas;
+
+	ret = percpu_counter_init(&metric->metadata_latency_sum, 0, GFP_KERNEL);
+	if (ret)
+		goto err_metadata_latency_sum;
+
 	return 0;
 
+err_metadata_latency_sum:
+	percpu_counter_destroy(&metric->total_metadatas);
+err_total_metadatas:
+	percpu_counter_destroy(&metric->write_latency_sum);
 err_write_latency_sum:
 	percpu_counter_destroy(&metric->total_writes);
 err_total_writes:
@@ -4555,6 +4573,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
 
 	ceph_mdsc_stop(mdsc);
 
+	percpu_counter_destroy(&mdsc->metric.metadata_latency_sum);
+	percpu_counter_destroy(&mdsc->metric.total_metadatas);
 	percpu_counter_destroy(&mdsc->metric.write_latency_sum);
 	percpu_counter_destroy(&mdsc->metric.total_writes);
 	percpu_counter_destroy(&mdsc->metric.read_latency_sum);
diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
index a87197f3e915..9de8beb436c7 100644
--- a/fs/ceph/metric.h
+++ b/fs/ceph/metric.h
@@ -18,6 +18,9 @@ struct ceph_client_metric {
 
 	struct percpu_counter total_writes;
 	struct percpu_counter write_latency_sum;
+
+	struct percpu_counter total_metadatas;
+	struct percpu_counter metadata_latency_sum;
 };
 
 static inline void ceph_update_read_latency(struct ceph_client_metric *m,
@@ -49,4 +52,14 @@ static inline void ceph_update_write_latency(struct ceph_client_metric *m,
 		percpu_counter_add(&m->write_latency_sum, latency);
 	}
 }
+
+static inline void ceph_update_metadata_latency(struct ceph_client_metric *m,
+						s64 latency)
+{
+	if (!m)
+		return;
+
+	percpu_counter_inc(&m->total_metadatas);
+	percpu_counter_add(&m->metadata_latency_sum, latency);
+}
 #endif /* _FS_CEPH_MDS_METRIC_H */
-- 
2.21.0


* [PATCH v6 6/9] ceph: periodically send perf metrics to ceph
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
                   ` (4 preceding siblings ...)
  2020-02-10  5:34 ` [PATCH v6 5/9] ceph: add global metadata perf " xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-10 15:34   ` Ilya Dryomov
  2020-02-10  5:34 ` [PATCH v6 7/9] ceph: add CEPH_DEFINE_RW_FUNC helper support xiubli
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

Add support for the metric_send_interval module parameter. The default
value is 0, which means disabled; if non-zero, it enables periodically
sending the metrics to the ceph cluster, every metric_send_interval
seconds.

This sends the caps, dentry lease and read/write/metadata perf
metrics to a single available MDS, once per metric_send_interval
seconds.
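
For reference, the resulting CEPH_MSG_CLIENT_METRICS payload is laid
out as follows (per the encoding in ceph_mdsc_send_metrics() below):

    struct ceph_metric_head             (num = 5)
    struct ceph_metric_cap              (hit, mis, total caps)
    struct ceph_metric_read_latency     (sec, nsec)
    struct ceph_metric_write_latency    (sec, nsec)
    struct ceph_metric_metadata_latency (sec, nsec)
    struct ceph_metric_dentry_lease     (hit, mis, total dentries)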

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/mds_client.c         | 235 +++++++++++++++++++++++++++++++----
 fs/ceph/mds_client.h         |   2 +
 fs/ceph/metric.h             |  76 +++++++++++
 fs/ceph/super.c              |   4 +
 fs/ceph/super.h              |   1 +
 include/linux/ceph/ceph_fs.h |   1 +
 6 files changed, 294 insertions(+), 25 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index d414eded6810..f9a6f95c7941 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4085,16 +4085,167 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc)
 	ceph_force_reconnect(fsc->sb);
 }
 
-/*
- * delayed work -- periodically trim expired leases, renew caps with mds
- */
+static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc,
+				   struct ceph_mds_session *s,
+				   u64 nr_caps)
+{
+	struct ceph_metric_head *head;
+	struct ceph_metric_cap *cap;
+	struct ceph_metric_dentry_lease *lease;
+	struct ceph_metric_read_latency *read;
+	struct ceph_metric_write_latency *write;
+	struct ceph_metric_metadata_latency *meta;
+	struct ceph_msg *msg;
+	struct timespec64 ts;
+	s64 sum, total;
+	s32 items = 0;
+	s32 len;
+
+	if (!mdsc || !s)
+		return false;
+
+	len = sizeof(*head) + sizeof(*cap) + sizeof(*lease) + sizeof(*read)
+	      + sizeof(*write) + sizeof(*meta);
+
+	msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true);
+	if (!msg) {
+		pr_err("send metrics to mds%d, failed to allocate message\n",
+		       s->s_mds);
+		return false;
+	}
+
+	head = msg->front.iov_base;
+
+	/* encode the cap metric */
+	cap = (struct ceph_metric_cap *)(head + 1);
+	cap->type = cpu_to_le32(CLIENT_METRIC_TYPE_CAP_INFO);
+	cap->ver = 1;
+	cap->compat = 1;
+	cap->data_len = cpu_to_le32(sizeof(*cap) - 10);
+	cap->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_hit));
+	cap->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_mis));
+	cap->total = cpu_to_le64(nr_caps);
+	items++;
+
+	dout("cap metric hit %lld, mis %lld, total caps %lld",
+	     le64_to_cpu(cap->hit), le64_to_cpu(cap->mis),
+	     le64_to_cpu(cap->total));
+
+	/* encode the read latency metric */
+	read = (struct ceph_metric_read_latency *)(cap + 1);
+	read->type = cpu_to_le32(CLIENT_METRIC_TYPE_READ_LATENCY);
+	read->ver = 1;
+	read->compat = 1;
+	read->data_len = cpu_to_le32(sizeof(*read) - 10);
+	total = percpu_counter_sum(&mdsc->metric.total_reads),
+	sum = percpu_counter_sum(&mdsc->metric.read_latency_sum);
+	jiffies_to_timespec64(sum, &ts);
+	read->sec = cpu_to_le32(ts.tv_sec);
+	read->nsec = cpu_to_le32(ts.tv_nsec);
+	items++;
+	dout("read latency metric total %lld, sum lat %lld", total, sum);
+
+	/* encode the write latency metric */
+	write = (struct ceph_metric_write_latency *)(read + 1);
+	write->type = cpu_to_le32(CLIENT_METRIC_TYPE_WRITE_LATENCY);
+	write->ver = 1;
+	write->compat = 1;
+	write->data_len = cpu_to_le32(sizeof(*write) - 10);
+	total = percpu_counter_sum(&mdsc->metric.total_writes),
+	sum = percpu_counter_sum(&mdsc->metric.write_latency_sum);
+	jiffies_to_timespec64(sum, &ts);
+	write->sec = cpu_to_le32(ts.tv_sec);
+	write->nsec = cpu_to_le32(ts.tv_nsec);
+	items++;
+	dout("write latency metric total %lld, sum lat %lld", total, sum);
+
+	/* encode the metadata latency metric */
+	meta = (struct ceph_metric_metadata_latency *)(write + 1);
+	meta->type = cpu_to_le32(CLIENT_METRIC_TYPE_METADATA_LATENCY);
+	meta->ver = 1;
+	meta->compat = 1;
+	meta->data_len = cpu_to_le32(sizeof(*meta) - 10);
+	total = percpu_counter_sum(&mdsc->metric.total_metadatas),
+	sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum);
+	jiffies_to_timespec64(sum, &ts);
+	meta->sec = cpu_to_le32(ts.tv_sec);
+	meta->nsec = cpu_to_le32(ts.tv_nsec);
+	items++;
+	dout("metadata latency metric total %lld, sum lat %lld", total, sum);
+
+	/* encode the dentry lease metric */
+	lease = (struct ceph_metric_dentry_lease *)(meta + 1);
+	lease->type = cpu_to_le32(CLIENT_METRIC_TYPE_DENTRY_LEASE);
+	lease->ver = 1;
+	lease->compat = 1;
+	lease->data_len = cpu_to_le32(sizeof(*lease) - 10);
+	lease->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_hit));
+	lease->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_mis));
+	lease->total = cpu_to_le64(atomic64_read(&mdsc->metric.total_dentries));
+	items++;
+	dout("dentry lease metric hit %lld, mis %lld, total dentries %lld",
+	     le64_to_cpu(lease->hit), le64_to_cpu(lease->mis),
+	     le64_to_cpu(lease->total));
+
+	put_unaligned_le32(items, &head->num);
+	msg->front.iov_len = cpu_to_le32(len);
+	msg->hdr.version = cpu_to_le16(1);
+	msg->hdr.compat_version = cpu_to_le16(1);
+	msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
+	dout("send metrics to mds%d %p\n", s->s_mds, msg);
+	ceph_con_send(&s->s_con, msg);
+
+	return true;
+}
+
+#define CEPH_WORK_DELAY_DEF 5
+static void __schedule_delayed(struct delayed_work *work, int delay)
+{
+	unsigned int hz = round_jiffies_relative(HZ * delay);
+
+	schedule_delayed_work(work, hz);
+}
+
 static void schedule_delayed(struct ceph_mds_client *mdsc)
 {
-	int delay = 5;
-	unsigned hz = round_jiffies_relative(HZ * delay);
-	schedule_delayed_work(&mdsc->delayed_work, hz);
+	__schedule_delayed(&mdsc->delayed_work, CEPH_WORK_DELAY_DEF);
+}
+
+static void metric_schedule_delayed(struct ceph_mds_client *mdsc)
+{
+	/* delay CEPH_WORK_DELAY_DEF seconds when idle */
+	int delay = metric_send_interval ? : CEPH_WORK_DELAY_DEF;
+
+	__schedule_delayed(&mdsc->metric_delayed_work, delay);
+}
+
+static bool check_session_state(struct ceph_mds_client *mdsc,
+				struct ceph_mds_session *s)
+{
+	if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
+		dout("resending session close request for mds%d\n",
+				s->s_mds);
+		request_close_session(mdsc, s);
+		return false;
+	}
+	if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
+		if (s->s_state == CEPH_MDS_SESSION_OPEN) {
+			s->s_state = CEPH_MDS_SESSION_HUNG;
+			pr_info("mds%d hung\n", s->s_mds);
+		}
+	}
+	if (s->s_state == CEPH_MDS_SESSION_NEW ||
+	    s->s_state == CEPH_MDS_SESSION_RESTARTING ||
+	    s->s_state == CEPH_MDS_SESSION_REJECTED)
+		/* this mds is failed or recovering, just wait */
+		return false;
+
+	return true;
 }
 
+/*
+ * delayed work -- periodically trim expired leases, renew caps with mds
+ */
 static void delayed_work(struct work_struct *work)
 {
 	int i;
@@ -4116,23 +4267,8 @@ static void delayed_work(struct work_struct *work)
 		struct ceph_mds_session *s = __ceph_lookup_mds_session(mdsc, i);
 		if (!s)
 			continue;
-		if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
-			dout("resending session close request for mds%d\n",
-			     s->s_mds);
-			request_close_session(mdsc, s);
-			ceph_put_mds_session(s);
-			continue;
-		}
-		if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
-			if (s->s_state == CEPH_MDS_SESSION_OPEN) {
-				s->s_state = CEPH_MDS_SESSION_HUNG;
-				pr_info("mds%d hung\n", s->s_mds);
-			}
-		}
-		if (s->s_state == CEPH_MDS_SESSION_NEW ||
-		    s->s_state == CEPH_MDS_SESSION_RESTARTING ||
-		    s->s_state == CEPH_MDS_SESSION_REJECTED) {
-			/* this mds is failed or recovering, just wait */
+
+		if (!check_session_state(mdsc, s)) {
 			ceph_put_mds_session(s);
 			continue;
 		}
@@ -4164,8 +4300,53 @@ static void delayed_work(struct work_struct *work)
 	schedule_delayed(mdsc);
 }
 
-static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
+static void metric_delayed_work(struct work_struct *work)
+{
+	struct ceph_mds_client *mdsc =
+		container_of(work, struct ceph_mds_client, metric_delayed_work.work);
+	struct ceph_mds_session *s;
+	u64 nr_caps = 0;
+	bool ret;
+	int i;
+
+	if (!metric_send_interval)
+		goto idle;
+
+	dout("mdsc metric_delayed_work\n");
+
+	mutex_lock(&mdsc->mutex);
+	for (i = 0; i < mdsc->max_sessions; i++) {
+		s = __ceph_lookup_mds_session(mdsc, i);
+		if (!s)
+			continue;
+		nr_caps += s->s_nr_caps;
+		ceph_put_mds_session(s);
+	}
+
+	for (i = 0; i < mdsc->max_sessions; i++) {
+		s = __ceph_lookup_mds_session(mdsc, i);
+		if (!s)
+			continue;
+		if (!check_session_state(mdsc, s)) {
+			ceph_put_mds_session(s);
+			continue;
+		}
+
+		/* Only send the metric once in any available session */
+		ret = ceph_mdsc_send_metrics(mdsc, s, nr_caps);
+		ceph_put_mds_session(s);
+		if (ret)
+			break;
+	}
+	mutex_unlock(&mdsc->mutex);
+
+idle:
+	metric_schedule_delayed(mdsc);
+}
+
+static int ceph_mdsc_metric_init(struct ceph_mds_client *mdsc)
 {
+	struct ceph_client_metric *metric = &mdsc->metric;
 	int ret;
 
 	if (!metric)
@@ -4289,7 +4470,8 @@ int ceph_mdsc_init(struct ceph_fs_client *fsc)
 	init_waitqueue_head(&mdsc->cap_flushing_wq);
 	INIT_WORK(&mdsc->cap_reclaim_work, ceph_cap_reclaim_work);
 	atomic_set(&mdsc->cap_reclaim_pending, 0);
-	err = ceph_mdsc_metric_init(&mdsc->metric);
+	INIT_DELAYED_WORK(&mdsc->metric_delayed_work, metric_delayed_work);
+	err = ceph_mdsc_metric_init(mdsc);
 	if (err)
 		goto err_mdsmap;
 
@@ -4511,6 +4693,7 @@ void ceph_mdsc_close_sessions(struct ceph_mds_client *mdsc)
 
 	cancel_work_sync(&mdsc->cap_reclaim_work);
 	cancel_delayed_work_sync(&mdsc->delayed_work); /* cancel timer */
+	cancel_delayed_work_sync(&mdsc->metric_delayed_work); /* cancel timer */
 
 	dout("stopped\n");
 }
@@ -4553,6 +4736,7 @@ static void ceph_mdsc_stop(struct ceph_mds_client *mdsc)
 {
 	dout("stop\n");
 	cancel_delayed_work_sync(&mdsc->delayed_work); /* cancel timer */
+	cancel_delayed_work_sync(&mdsc->metric_delayed_work); /* cancel timer */
 	if (mdsc->mdsmap)
 		ceph_mdsmap_destroy(mdsc->mdsmap);
 	kfree(mdsc->sessions);
@@ -4719,6 +4903,7 @@ void ceph_mdsc_handle_mdsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg)
 
 	mutex_unlock(&mdsc->mutex);
 	schedule_delayed(mdsc);
+	metric_schedule_delayed(mdsc);
 	return;
 
 bad_unlock:
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 674fc7725913..c13910da07c4 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -448,7 +448,9 @@ struct ceph_mds_client {
 	struct list_head  dentry_leases;     /* fifo list */
 	struct list_head  dentry_dir_leases; /* lru list */
 
+	/* metrics */
 	struct ceph_client_metric metric;
+	struct delayed_work	  metric_delayed_work;  /* delayed work */
 
 	spinlock_t		snapid_map_lock;
 	struct rb_root		snapid_map_tree;
diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
index 9de8beb436c7..224e92a70d88 100644
--- a/fs/ceph/metric.h
+++ b/fs/ceph/metric.h
@@ -4,6 +4,82 @@
 
 #include <linux/ceph/osd_client.h>
 
+enum ceph_metric_type {
+	CLIENT_METRIC_TYPE_CAP_INFO,
+	CLIENT_METRIC_TYPE_READ_LATENCY,
+	CLIENT_METRIC_TYPE_WRITE_LATENCY,
+	CLIENT_METRIC_TYPE_METADATA_LATENCY,
+	CLIENT_METRIC_TYPE_DENTRY_LEASE,
+
+	CLIENT_METRIC_TYPE_MAX = CLIENT_METRIC_TYPE_DENTRY_LEASE,
+};
+
+/* metric caps header */
+struct ceph_metric_cap {
+	__le32 type;     /* ceph metric type */
+
+	__u8  ver;
+	__u8  compat;
+
+	__le32 data_len; /* length of sizeof(hit + mis + total) */
+	__le64 hit;
+	__le64 mis;
+	__le64 total;
+} __attribute__ ((packed));
+
+/* metric dentry lease header */
+struct ceph_metric_dentry_lease {
+	__le32 type;     /* ceph metric type */
+
+	__u8  ver;
+	__u8  compat;
+
+	__le32 data_len; /* length of sizeof(hit + mis + total) */
+	__le64 hit;
+	__le64 mis;
+	__le64 total;
+} __attribute__ ((packed));
+
+/* metric read latency header */
+struct ceph_metric_read_latency {
+	__le32 type;     /* ceph metric type */
+
+	__u8  ver;
+	__u8  compat;
+
+	__le32 data_len; /* length of sizeof(sec + nsec) */
+	__le32 sec;
+	__le32 nsec;
+} __attribute__ ((packed));
+
+/* metric write latency header */
+struct ceph_metric_write_latency {
+	__le32 type;     /* ceph metric type */
+
+	__u8  ver;
+	__u8  compat;
+
+	__le32 data_len; /* length of sizeof(sec + nsec) */
+	__le32 sec;
+	__le32 nsec;
+} __attribute__ ((packed));
+
+/* metric metadata latency header */
+struct ceph_metric_metadata_latency {
+	__le32 type;     /* ceph metric type */
+
+	__u8  ver;
+	__u8  compat;
+
+	__le32 data_len; /* length of sizeof(sec + nsec) */
+	__le32 sec;
+	__le32 nsec;
+} __attribute__ ((packed));
+
+struct ceph_metric_head {
+	__le32 num;	/* the number of metrics that will be sent */
+} __attribute__ ((packed));
+
 /* This is the global metrics */
 struct ceph_client_metric {
 	atomic64_t            total_dentries;
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 196d547c7054..5fef4f59e13e 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -1315,6 +1315,10 @@ bool enable_async_dirops;
 module_param(enable_async_dirops, bool, 0644);
 MODULE_PARM_DESC(enable_async_dirops, "Asynchronous directory operations enabled");
 
+unsigned int metric_send_interval;
+module_param(metric_send_interval, uint, 0644);
+MODULE_PARM_DESC(metric_send_interval, "Interval (in seconds) of sending perf metric to ceph cluster (default: 0)");
+
 module_init(init_ceph);
 module_exit(exit_ceph);
 
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 44b9a971ec9a..7eda7acc859a 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -73,6 +73,7 @@
 #define CEPH_CAPS_WANTED_DELAY_MAX_DEFAULT     60  /* cap release delay */
 
 extern bool enable_async_dirops;
+extern unsigned int metric_send_interval;
 
 struct ceph_mount_options {
 	unsigned int flags;
diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index a099f60feb7b..6028d3e865e4 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -130,6 +130,7 @@ struct ceph_dir_layout {
 #define CEPH_MSG_CLIENT_REQUEST         24
 #define CEPH_MSG_CLIENT_REQUEST_FORWARD 25
 #define CEPH_MSG_CLIENT_REPLY           26
+#define CEPH_MSG_CLIENT_METRICS         29
 #define CEPH_MSG_CLIENT_CAPS            0x310
 #define CEPH_MSG_CLIENT_LEASE           0x311
 #define CEPH_MSG_CLIENT_SNAP            0x312
-- 
2.21.0
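
A note on the wire format added in fs/ceph/metric.h above: every metric
block starts with the same 10-byte header, type (4 bytes), ver (1),
compat (1) and data_len (4), which is why the sender encodes data_len
as sizeof(*block) - 10. A self-contained userspace mirror of that
header, a sketch for illustration only:

	#include <stdint.h>

	struct metric_hdr {
		uint32_t type;		/* little-endian on the wire */
		uint8_t  ver;
		uint8_t  compat;
		uint32_t data_len;	/* length of the payload that follows */
	} __attribute__((packed));

	_Static_assert(sizeof(struct metric_hdr) == 10,
		       "each metric block carries a 10-byte header");

Note also that metric_send_interval is registered with mode 0644, so
when ceph is built as a module it can be changed at runtime through
/sys/module/ceph/parameters/metric_send_interval.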

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 7/9] ceph: add CEPH_DEFINE_RW_FUNC helper support
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
                   ` (5 preceding siblings ...)
  2020-02-10  5:34 ` [PATCH v6 6/9] ceph: periodically send perf metrics to ceph xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-10  5:34 ` [PATCH v6 8/9] ceph: add reset metrics support xiubli
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

This adds a helper macro for defining debugfs file operations that
support a write (store) handler alongside the usual seq_file show
handler.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
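For context, the macro below pairs a seq_file based name##_show() with
a name##_store() write handler and emits a name##_fops. A minimal usage
sketch with purely illustrative names (the real user is the metrics
file in the following patch):

	#include <linux/debugfs.h>
	#include <linux/seq_file.h>
	#include <linux/ceph/debugfs.h>

	static int foo_show(struct seq_file *s, void *p)
	{
		seq_puts(s, "example\n");
		return 0;
	}

	static ssize_t foo_store(struct file *file, const char __user *buf,
				 size_t count, loff_t *ppos)
	{
		/* parse the user input here */
		return count;
	}

	CEPH_DEFINE_RW_FUNC(foo);	/* defines foo_open and foo_fops */

	/* then, in some init path:
	 * debugfs_create_file("foo", 0600, parent, data, &foo_fops);
	 */
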
 include/linux/ceph/debugfs.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/ceph/debugfs.h b/include/linux/ceph/debugfs.h
index 8b3a1a7a953a..b9100712f87f 100644
--- a/include/linux/ceph/debugfs.h
+++ b/include/linux/ceph/debugfs.h
@@ -4,6 +4,20 @@
 
 #include <linux/ceph/types.h>
 
+#define CEPH_DEFINE_RW_FUNC(name)					\
+static int name##_open(struct inode *inode, struct file *file)		\
+{									\
+	return single_open(file, name##_show, inode->i_private);	\
+}									\
+									\
+static const struct file_operations name##_fops = {			\
+	.open		= name##_open,					\
+	.read		= seq_read,					\
+	.write		= name##_store,					\
+	.llseek		= seq_lseek,					\
+	.release	= single_release,				\
+}
+
 /* debugfs.c */
 extern void ceph_debugfs_init(void);
 extern void ceph_debugfs_cleanup(void);
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 8/9] ceph: add reset metrics support
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
                   ` (6 preceding siblings ...)
  2020-02-10  5:34 ` [PATCH v6 7/9] ceph: add CEPH_DEFINE_RW_FUNC helper support xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-10 15:22   ` Ilya Dryomov
  2020-02-10  5:34 ` [PATCH v6 9/9] ceph: send client provided metric flags in client metadata xiubli
  2020-02-15  0:39 ` [PATCH v6 0/9] ceph: add perf metrics support Xiubo Li
  9 siblings, 1 reply; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

Sometimes we need to discard the old perf metrics and start collecting
new ones. Writing "reset" to the metrics debugfs file will reset most
of the metric counters, except the total numbers of caps and dentries.

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/debugfs.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index 60f3e307fca1..6e595a37af5d 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -179,6 +179,43 @@ static int metric_show(struct seq_file *s, void *p)
 	return 0;
 }
 
+static ssize_t metric_store(struct file *file, const char __user *user_buf,
+			    size_t count, loff_t *ppos)
+{
+	struct seq_file *s = file->private_data;
+	struct ceph_fs_client *fsc = s->private;
+	struct ceph_mds_client *mdsc = fsc->mdsc;
+	struct ceph_client_metric *metric = &mdsc->metric;
+	char buf[8];
+
+	if (copy_from_user(buf, user_buf, 8))
+		return -EFAULT;
+
+	if (strncmp(buf, "reset", strlen("reset"))) {
+		pr_err("Invalid set value '%s', only 'reset' is valid\n", buf);
+		return -EINVAL;
+	}
+
+	percpu_counter_set(&metric->d_lease_hit, 0);
+	percpu_counter_set(&metric->d_lease_mis, 0);
+
+	percpu_counter_set(&metric->i_caps_hit, 0);
+	percpu_counter_set(&metric->i_caps_mis, 0);
+
+	percpu_counter_set(&metric->read_latency_sum, 0);
+	percpu_counter_set(&metric->total_reads, 0);
+
+	percpu_counter_set(&metric->write_latency_sum, 0);
+	percpu_counter_set(&metric->total_writes, 0);
+
+	percpu_counter_set(&metric->metadata_latency_sum, 0);
+	percpu_counter_set(&metric->total_metadatas, 0);
+
+	return count;
+}
+
+CEPH_DEFINE_RW_FUNC(metric);
+
 static int caps_show_cb(struct inode *inode, struct ceph_cap *cap, void *p)
 {
 	struct seq_file *s = p;
@@ -277,7 +314,6 @@ DEFINE_SHOW_ATTRIBUTE(mdsmap);
 DEFINE_SHOW_ATTRIBUTE(mdsc);
 DEFINE_SHOW_ATTRIBUTE(caps);
 DEFINE_SHOW_ATTRIBUTE(mds_sessions);
-DEFINE_SHOW_ATTRIBUTE(metric);
 
 
 /*
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 9/9] ceph: send client provided metric flags in client metadata
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
                   ` (7 preceding siblings ...)
  2020-02-10  5:34 ` [PATCH v6 8/9] ceph: add reset metrics support xiubli
@ 2020-02-10  5:34 ` xiubli
  2020-02-15  0:39 ` [PATCH v6 0/9] ceph: add perf metrics support Xiubo Li
  9 siblings, 0 replies; 18+ messages in thread
From: xiubli @ 2020-02-10  5:34 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

Send the supported metric flags to the MDS in the client metadata.
Currently this covers the cap, dentry lease, read latency, write
latency and metadata latency metrics.

URL: https://tracker.ceph.com/issues/43435
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/mds_client.c | 47 ++++++++++++++++++++++++++++++++++++++++++--
 fs/ceph/metric.h     | 14 +++++++++++++
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index f9a6f95c7941..376e7cf1685f 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1082,6 +1082,41 @@ static void encode_supported_features(void **p, void *end)
 	}
 }
 
+static const unsigned char metric_bits[] = CEPHFS_METRIC_SPEC_CLIENT_SUPPORTED;
+#define METRIC_BYTES(cnt) (DIV_ROUND_UP((size_t)metric_bits[cnt - 1] + 1, 64) * 8)
+static void encode_metric_spec(void **p, void *end)
+{
+	static const size_t count = ARRAY_SIZE(metric_bits);
+
+	/* header */
+	BUG_ON(*p + 2 > end);
+	ceph_encode_8(p, 1); /* version */
+	ceph_encode_8(p, 1); /* compat */
+
+	if (count > 0) {
+		size_t i;
+		size_t size = METRIC_BYTES(count);
+
+		BUG_ON(*p + 4 + 4 + size > end);
+
+		/* metric spec info length */
+		ceph_encode_32(p, 4 + size);
+
+		/* metric spec */
+		ceph_encode_32(p, size);
+		memset(*p, 0, size);
+		for (i = 0; i < count; i++)
+			((unsigned char *)(*p))[i / 8] |= BIT(metric_bits[i] % 8);
+		*p += size;
+	} else {
+		BUG_ON(*p + 4 + 4 > end);
+		/* metric spec info length */
+		ceph_encode_32(p, 4);
+		/* metric spec */
+		ceph_encode_32(p, 0);
+	}
+}
+
 /*
  * session message, specialization for CEPH_SESSION_REQUEST_OPEN
  * to include additional client metadata fields.
@@ -1121,6 +1156,13 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6
 		size = FEATURE_BYTES(count);
 	extra_bytes += 4 + size;
 
+	/* metric spec */
+	size = 0;
+	count = ARRAY_SIZE(metric_bits);
+	if (count > 0)
+		size = METRIC_BYTES(count);
+	extra_bytes += 2 + 4 + 4 + size;
+
 	/* Allocate the message */
 	msg = ceph_msg_new(CEPH_MSG_CLIENT_SESSION, sizeof(*h) + extra_bytes,
 			   GFP_NOFS, false);
@@ -1139,9 +1181,9 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6
 	 * Serialize client metadata into waiting buffer space, using
 	 * the format that userspace expects for map<string, string>
 	 *
-	 * ClientSession messages with metadata are v3
+	 * ClientSession messages with metadata are v4
 	 */
-	msg->hdr.version = cpu_to_le16(3);
+	msg->hdr.version = cpu_to_le16(4);
 	msg->hdr.compat_version = cpu_to_le16(1);
 
 	/* The write pointer, following the session_head structure */
@@ -1164,6 +1206,7 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6
 	}
 
 	encode_supported_features(&p, end);
+	encode_metric_spec(&p, end);
 	msg->front.iov_len = p - msg->front.iov_base;
 	msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
 
diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
index 224e92a70d88..c0149484e71d 100644
--- a/fs/ceph/metric.h
+++ b/fs/ceph/metric.h
@@ -14,6 +14,20 @@ enum ceph_metric_type {
 	CLIENT_METRIC_TYPE_MAX = CLIENT_METRIC_TYPE_DENTRY_LEASE,
 };
 
+/*
+ * This will always have the highest metric bit value
+ * as the last element of the array.
+ */
+#define CEPHFS_METRIC_SPEC_CLIENT_SUPPORTED {	\
+	CLIENT_METRIC_TYPE_CAP_INFO,		\
+	CLIENT_METRIC_TYPE_READ_LATENCY,	\
+	CLIENT_METRIC_TYPE_WRITE_LATENCY,	\
+	CLIENT_METRIC_TYPE_METADATA_LATENCY,	\
+	CLIENT_METRIC_TYPE_DENTRY_LEASE,	\
+						\
+	CLIENT_METRIC_TYPE_MAX,			\
+}
+
 /* metric caps header */
 struct ceph_metric_cap {
 	__le32 type;     /* ceph metric type */
-- 
2.21.0
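
On encode_metric_spec() above: the spec is encoded as a one-byte
version, a one-byte compat, a 4-byte spec info length, then a 4-byte
bitmap length followed by the bitmap itself, padded to a multiple of
8 bytes (METRIC_BYTES). Note that the encoding loop indexes the bitmap
with i / 8 rather than metric_bits[i] / 8; the two are equivalent here
only because every bit value and every array index is below 8, so the
byte index is always 0. A self-contained sketch of the general form of
the packing, with a hypothetical helper name:

	#include <stdint.h>
	#include <string.h>

	/*
	 * Pack a sorted list of bit numbers into a little-endian
	 * bitmap padded to a multiple of 8 bytes, mirroring
	 * METRIC_BYTES()/encode_metric_spec().  Returns the bitmap
	 * size, or 0 on error.  A sketch, not kernel code.
	 */
	static size_t pack_metric_bits(const unsigned char *bits, size_t count,
				       unsigned char *out, size_t out_size)
	{
		size_t size, i;

		if (count == 0)
			return 0;
		/* METRIC_BYTES(): DIV_ROUND_UP(highest bit + 1, 64) * 8 */
		size = (bits[count - 1] / 64 + 1) * 8;
		if (size > out_size)
			return 0;

		memset(out, 0, size);
		for (i = 0; i < count; i++)
			out[bits[i] / 8] |= 1u << (bits[i] % 8);
		return size;
	}

With bits 0..4 set, as in CEPHFS_METRIC_SPEC_CLIENT_SUPPORTED, this
yields an 8-byte bitmap whose first byte is 0x1f.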

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 8/9] ceph: add reset metrics support
  2020-02-10  5:34 ` [PATCH v6 8/9] ceph: add reset metrics support xiubli
@ 2020-02-10 15:22   ` Ilya Dryomov
  0 siblings, 0 replies; 18+ messages in thread
From: Ilya Dryomov @ 2020-02-10 15:22 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Jeff Layton, Sage Weil, Yan, Zheng, Patrick Donnelly, Ceph Development

On Mon, Feb 10, 2020 at 6:34 AM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> Sometimes we need to discard the old perf metrics and start collecting
> new ones. Writing "reset" to the metrics debugfs file will reset most
> of the metric counters, except the total numbers of caps and dentries.
>
> URL: https://tracker.ceph.com/issues/43215
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/debugfs.c | 38 +++++++++++++++++++++++++++++++++++++-
>  1 file changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
> index 60f3e307fca1..6e595a37af5d 100644
> --- a/fs/ceph/debugfs.c
> +++ b/fs/ceph/debugfs.c
> @@ -179,6 +179,43 @@ static int metric_show(struct seq_file *s, void *p)
>         return 0;
>  }
>
> +static ssize_t metric_store(struct file *file, const char __user *user_buf,
> +                           size_t count, loff_t *ppos)
> +{
> +       struct seq_file *s = file->private_data;
> +       struct ceph_fs_client *fsc = s->private;
> +       struct ceph_mds_client *mdsc = fsc->mdsc;
> +       struct ceph_client_metric *metric = &mdsc->metric;
> +       char buf[8];
> +
> +       if (copy_from_user(buf, user_buf, 8))
> +               return -EFAULT;
> +
> +       if (strncmp(buf, "reset", strlen("reset"))) {
> +               pr_err("Invalid set value '%s', only 'reset' is valid\n", buf);
> +               return -EINVAL;
> +       }

Hi Xiubo,

Why strncmp?  How does this handle inputs like "resetfoobar"?
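
For illustration, a stricter variant could bound the copy by count,
NUL-terminate the buffer, and require an exact token. A sketch of the
relevant fragment of metric_store(), not something from the patch:

	char buf[8];
	size_t len = min(count, sizeof(buf) - 1);

	if (copy_from_user(buf, user_buf, len))
		return -EFAULT;
	buf[len] = '\0';

	if (strcmp(strim(buf), "reset"))
		return -EINVAL;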

> +
> +       percpu_counter_set(&metric->d_lease_hit, 0);
> +       percpu_counter_set(&metric->d_lease_mis, 0);
> +
> +       percpu_counter_set(&metric->i_caps_hit, 0);
> +       percpu_counter_set(&metric->i_caps_mis, 0);
> +
> +       percpu_counter_set(&metric->read_latency_sum, 0);
> +       percpu_counter_set(&metric->total_reads, 0);
> +
> +       percpu_counter_set(&metric->write_latency_sum, 0);
> +       percpu_counter_set(&metric->total_writes, 0);
> +
> +       percpu_counter_set(&metric->metadata_latency_sum, 0);
> +       percpu_counter_set(&metric->total_metadatas, 0);
> +
> +       return count;
> +}
> +
> +CEPH_DEFINE_RW_FUNC(metric);

More broadly, how are these metrics going to be used?  I suspect
the MDSes will gradually start relying on them in the future
and probably make decisions based off of them?  If that is the case,
did you think about clients being able to mess with that by zeroing
these counters on a regular basis?

It looks like all of this is still in flight on the userspace side, but
I don't see anything similar in https://github.com/ceph/ceph/pull/32120.
Is there a different PR or is this kernel-only for some reason?

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/9] ceph: periodically send perf metrics to ceph
  2020-02-10  5:34 ` [PATCH v6 6/9] ceph: periodically send perf metrics to ceph xiubli
@ 2020-02-10 15:34   ` Ilya Dryomov
  2020-02-11  1:29     ` Xiubo Li
  0 siblings, 1 reply; 18+ messages in thread
From: Ilya Dryomov @ 2020-02-10 15:34 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Jeff Layton, Sage Weil, Yan, Zheng, Patrick Donnelly, Ceph Development

On Mon, Feb 10, 2020 at 6:34 AM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> Add metric_send_interval module parameter support; the default value
> is 0, which means disabled. If nonzero, it enables the transmission of
> the metrics to the ceph cluster periodically, every metric_send_interval
> seconds.
>
> This will send the caps, dentry lease and read/write/metadata perf
> metrics to any available MDS only once per metric_send_interval
> seconds.
>
> URL: https://tracker.ceph.com/issues/43215
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/mds_client.c         | 235 +++++++++++++++++++++++++++++++----
>  fs/ceph/mds_client.h         |   2 +
>  fs/ceph/metric.h             |  76 +++++++++++
>  fs/ceph/super.c              |   4 +
>  fs/ceph/super.h              |   1 +
>  include/linux/ceph/ceph_fs.h |   1 +
>  6 files changed, 294 insertions(+), 25 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index d414eded6810..f9a6f95c7941 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -4085,16 +4085,167 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc)
>         ceph_force_reconnect(fsc->sb);
>  }
>
> -/*
> - * delayed work -- periodically trim expired leases, renew caps with mds
> - */
> +static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc,
> +                                  struct ceph_mds_session *s,
> +                                  u64 nr_caps)
> +{
> +       struct ceph_metric_head *head;
> +       struct ceph_metric_cap *cap;
> +       struct ceph_metric_dentry_lease *lease;
> +       struct ceph_metric_read_latency *read;
> +       struct ceph_metric_write_latency *write;
> +       struct ceph_metric_metadata_latency *meta;
> +       struct ceph_msg *msg;
> +       struct timespec64 ts;
> +       s64 sum, total;
> +       s32 items = 0;
> +       s32 len;
> +
> +       if (!mdsc || !s)
> +               return false;
> +
> +       len = sizeof(*head) + sizeof(*cap) + sizeof(*lease) + sizeof(*read)
> +             + sizeof(*write) + sizeof(*meta);
> +
> +       msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true);
> +       if (!msg) {
> +               pr_err("send metrics to mds%d, failed to allocate message\n",
> +                      s->s_mds);
> +               return false;
> +       }
> +
> +       head = msg->front.iov_base;
> +
> +       /* encode the cap metric */
> +       cap = (struct ceph_metric_cap *)(head + 1);
> +       cap->type = cpu_to_le32(CLIENT_METRIC_TYPE_CAP_INFO);
> +       cap->ver = 1;
> +       cap->compat = 1;
> +       cap->data_len = cpu_to_le32(sizeof(*cap) - 10);
> +       cap->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_hit));
> +       cap->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_mis));
> +       cap->total = cpu_to_le64(nr_caps);
> +       items++;
> +
> +       dout("cap metric hit %lld, mis %lld, total caps %lld",
> +            le64_to_cpu(cap->hit), le64_to_cpu(cap->mis),
> +            le64_to_cpu(cap->total));
> +
> +       /* encode the read latency metric */
> +       read = (struct ceph_metric_read_latency *)(cap + 1);
> +       read->type = cpu_to_le32(CLIENT_METRIC_TYPE_READ_LATENCY);
> +       read->ver = 1;
> +       read->compat = 1;
> +       read->data_len = cpu_to_le32(sizeof(*read) - 10);
> +       total = percpu_counter_sum(&mdsc->metric.total_reads),
> +       sum = percpu_counter_sum(&mdsc->metric.read_latency_sum);
> +       jiffies_to_timespec64(sum, &ts);
> +       read->sec = cpu_to_le32(ts.tv_sec);
> +       read->nsec = cpu_to_le32(ts.tv_nsec);
> +       items++;
> +       dout("read latency metric total %lld, sum lat %lld", total, sum);
> +
> +       /* encode the write latency metric */
> +       write = (struct ceph_metric_write_latency *)(read + 1);
> +       write->type = cpu_to_le32(CLIENT_METRIC_TYPE_WRITE_LATENCY);
> +       write->ver = 1;
> +       write->compat = 1;
> +       write->data_len = cpu_to_le32(sizeof(*write) - 10);
> +       total = percpu_counter_sum(&mdsc->metric.total_writes),
> +       sum = percpu_counter_sum(&mdsc->metric.write_latency_sum);
> +       jiffies_to_timespec64(sum, &ts);
> +       write->sec = cpu_to_le32(ts.tv_sec);
> +       write->nsec = cpu_to_le32(ts.tv_nsec);
> +       items++;
> +       dout("write latency metric total %lld, sum lat %lld", total, sum);
> +
> +       /* encode the metadata latency metric */
> +       meta = (struct ceph_metric_metadata_latency *)(write + 1);
> +       meta->type = cpu_to_le32(CLIENT_METRIC_TYPE_METADATA_LATENCY);
> +       meta->ver = 1;
> +       meta->compat = 1;
> +       meta->data_len = cpu_to_le32(sizeof(*meta) - 10);
> +       total = percpu_counter_sum(&mdsc->metric.total_metadatas),
> +       sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum);
> +       jiffies_to_timespec64(sum, &ts);
> +       meta->sec = cpu_to_le32(ts.tv_sec);
> +       meta->nsec = cpu_to_le32(ts.tv_nsec);
> +       items++;
> +       dout("metadata latency metric total %lld, sum lat %lld", total, sum);
> +
> +       /* encode the dentry lease metric */
> +       lease = (struct ceph_metric_dentry_lease *)(meta + 1);
> +       lease->type = cpu_to_le32(CLIENT_METRIC_TYPE_DENTRY_LEASE);
> +       lease->ver = 1;
> +       lease->compat = 1;
> +       lease->data_len = cpu_to_le32(sizeof(*lease) - 10);
> +       lease->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_hit));
> +       lease->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_mis));
> +       lease->total = cpu_to_le64(atomic64_read(&mdsc->metric.total_dentries));
> +       items++;
> +       dout("dentry lease metric hit %lld, mis %lld, total dentries %lld",
> +            le64_to_cpu(lease->hit), le64_to_cpu(lease->mis),
> +            le64_to_cpu(lease->total));
> +
> +       put_unaligned_le32(items, &head->num);
> +       msg->front.iov_len = cpu_to_le32(len);
> +       msg->hdr.version = cpu_to_le16(1);
> +       msg->hdr.compat_version = cpu_to_le16(1);
> +       msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
> +       dout("send metrics to mds%d %p\n", s->s_mds, msg);
> +       ceph_con_send(&s->s_con, msg);
> +
> +       return true;
> +}
> +
> +#define CEPH_WORK_DELAY_DEF 5
> +static void __schedule_delayed(struct delayed_work *work, int delay)
> +{
> +       unsigned int hz = round_jiffies_relative(HZ * delay);
> +
> +       schedule_delayed_work(work, hz);
> +}
> +
>  static void schedule_delayed(struct ceph_mds_client *mdsc)
>  {
> -       int delay = 5;
> -       unsigned hz = round_jiffies_relative(HZ * delay);
> -       schedule_delayed_work(&mdsc->delayed_work, hz);
> +       __schedule_delayed(&mdsc->delayed_work, CEPH_WORK_DELAY_DEF);
> +}
> +
> +static void metric_schedule_delayed(struct ceph_mds_client *mdsc)
> +{
> +       /* delay CEPH_WORK_DELAY_DEF seconds when idle */
> +       int delay = metric_send_interval ? : CEPH_WORK_DELAY_DEF;
> +
> +       __schedule_delayed(&mdsc->metric_delayed_work, delay);
> +}
> +
> +static bool check_session_state(struct ceph_mds_client *mdsc,
> +                               struct ceph_mds_session *s)
> +{
> +       if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
> +               dout("resending session close request for mds%d\n",
> +                               s->s_mds);
> +               request_close_session(mdsc, s);
> +               return false;
> +       }
> +       if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
> +               if (s->s_state == CEPH_MDS_SESSION_OPEN) {
> +                       s->s_state = CEPH_MDS_SESSION_HUNG;
> +                       pr_info("mds%d hung\n", s->s_mds);
> +               }
> +       }
> +       if (s->s_state == CEPH_MDS_SESSION_NEW ||
> +           s->s_state == CEPH_MDS_SESSION_RESTARTING ||
> +           s->s_state == CEPH_MDS_SESSION_REJECTED)
> +               /* this mds is failed or recovering, just wait */
> +               return false;
> +
> +       return true;
>  }
>
> +/*
> + * delayed work -- periodically trim expired leases, renew caps with mds
> + */
>  static void delayed_work(struct work_struct *work)
>  {
>         int i;
> @@ -4116,23 +4267,8 @@ static void delayed_work(struct work_struct *work)
>                 struct ceph_mds_session *s = __ceph_lookup_mds_session(mdsc, i);
>                 if (!s)
>                         continue;
> -               if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
> -                       dout("resending session close request for mds%d\n",
> -                            s->s_mds);
> -                       request_close_session(mdsc, s);
> -                       ceph_put_mds_session(s);
> -                       continue;
> -               }
> -               if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
> -                       if (s->s_state == CEPH_MDS_SESSION_OPEN) {
> -                               s->s_state = CEPH_MDS_SESSION_HUNG;
> -                               pr_info("mds%d hung\n", s->s_mds);
> -                       }
> -               }
> -               if (s->s_state == CEPH_MDS_SESSION_NEW ||
> -                   s->s_state == CEPH_MDS_SESSION_RESTARTING ||
> -                   s->s_state == CEPH_MDS_SESSION_REJECTED) {
> -                       /* this mds is failed or recovering, just wait */
> +
> +               if (!check_session_state(mdsc, s)) {
>                         ceph_put_mds_session(s);
>                         continue;
>                 }
> @@ -4164,8 +4300,53 @@ static void delayed_work(struct work_struct *work)
>         schedule_delayed(mdsc);
>  }
>
> -static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
> +static void metric_delayed_work(struct work_struct *work)
> +{
> +       struct ceph_mds_client *mdsc =
> +               container_of(work, struct ceph_mds_client, metric_delayed_work.work);
> +       struct ceph_mds_session *s;
> +       u64 nr_caps = 0;
> +       bool ret;
> +       int i;
> +
> +       if (!metric_send_interval)
> +               goto idle;
> +
> +       dout("mdsc metric_delayed_work\n");
> +
> +       mutex_lock(&mdsc->mutex);
> +       for (i = 0; i < mdsc->max_sessions; i++) {
> +               s = __ceph_lookup_mds_session(mdsc, i);
> +               if (!s)
> +                       continue;
> +               nr_caps += s->s_nr_caps;
> +               ceph_put_mds_session(s);
> +       }
> +
> +       for (i = 0; i < mdsc->max_sessions; i++) {
> +               s = __ceph_lookup_mds_session(mdsc, i);
> +               if (!s)
> +                       continue;
> +               if (!check_session_state(mdsc, s)) {
> +                       ceph_put_mds_session(s);
> +                       continue;
> +               }
> +
> +               /* Only send the metric once in any available session */
> +               ret = ceph_mdsc_send_metrics(mdsc, s, nr_caps);
> +               ceph_put_mds_session(s);
> +               if (ret)
> +                       break;
> +       }
> +       mutex_unlock(&mdsc->mutex);
> +
> +idle:
> +       metric_schedule_delayed(mdsc);

Looks like this will schedule metric_delayed_work() every 5 seconds
even if metric_send_interval = 0 (i.e. sending is disabled).  What is
the reason for that?

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/9] ceph: periodically send perf metrics to ceph
  2020-02-10 15:34   ` Ilya Dryomov
@ 2020-02-11  1:29     ` Xiubo Li
  2020-02-11 17:42       ` Ilya Dryomov
  0 siblings, 1 reply; 18+ messages in thread
From: Xiubo Li @ 2020-02-11  1:29 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Jeff Layton, Sage Weil, Yan, Zheng, Patrick Donnelly, Ceph Development

On 2020/2/10 23:34, Ilya Dryomov wrote:
> On Mon, Feb 10, 2020 at 6:34 AM <xiubli@redhat.com> wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> Add metric_send_interval module parameter support; the default value
>> is 0, which means disabled. If nonzero, it enables the transmission of
>> the metrics to the ceph cluster periodically, every metric_send_interval
>> seconds.
>>
>> This will send the caps, dentry lease and read/write/metadata perf
>> metrics to any available MDS only once per metric_send_interval
>> seconds.
>>
>> URL: https://tracker.ceph.com/issues/43215
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>   fs/ceph/mds_client.c         | 235 +++++++++++++++++++++++++++++++----
>>   fs/ceph/mds_client.h         |   2 +
>>   fs/ceph/metric.h             |  76 +++++++++++
>>   fs/ceph/super.c              |   4 +
>>   fs/ceph/super.h              |   1 +
>>   include/linux/ceph/ceph_fs.h |   1 +
>>   6 files changed, 294 insertions(+), 25 deletions(-)
>>
>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>> index d414eded6810..f9a6f95c7941 100644
>> --- a/fs/ceph/mds_client.c
>> +++ b/fs/ceph/mds_client.c
>> @@ -4085,16 +4085,167 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc)
>>          ceph_force_reconnect(fsc->sb);
>>   }
>>
>> -/*
>> - * delayed work -- periodically trim expired leases, renew caps with mds
>> - */
>> +static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc,
>> +                                  struct ceph_mds_session *s,
>> +                                  u64 nr_caps)
>> +{
>> +       struct ceph_metric_head *head;
>> +       struct ceph_metric_cap *cap;
>> +       struct ceph_metric_dentry_lease *lease;
>> +       struct ceph_metric_read_latency *read;
>> +       struct ceph_metric_write_latency *write;
>> +       struct ceph_metric_metadata_latency *meta;
>> +       struct ceph_msg *msg;
>> +       struct timespec64 ts;
>> +       s64 sum, total;
>> +       s32 items = 0;
>> +       s32 len;
>> +
>> +       if (!mdsc || !s)
>> +               return false;
>> +
>> +       len = sizeof(*head) + sizeof(*cap) + sizeof(*lease) + sizeof(*read)
>> +             + sizeof(*write) + sizeof(*meta);
>> +
>> +       msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true);
>> +       if (!msg) {
>> +               pr_err("send metrics to mds%d, failed to allocate message\n",
>> +                      s->s_mds);
>> +               return false;
>> +       }
>> +
>> +       head = msg->front.iov_base;
>> +
>> +       /* encode the cap metric */
>> +       cap = (struct ceph_metric_cap *)(head + 1);
>> +       cap->type = cpu_to_le32(CLIENT_METRIC_TYPE_CAP_INFO);
>> +       cap->ver = 1;
>> +       cap->compat = 1;
>> +       cap->data_len = cpu_to_le32(sizeof(*cap) - 10);
>> +       cap->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_hit));
>> +       cap->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_mis));
>> +       cap->total = cpu_to_le64(nr_caps);
>> +       items++;
>> +
>> +       dout("cap metric hit %lld, mis %lld, total caps %lld",
>> +            le64_to_cpu(cap->hit), le64_to_cpu(cap->mis),
>> +            le64_to_cpu(cap->total));
>> +
>> +       /* encode the read latency metric */
>> +       read = (struct ceph_metric_read_latency *)(cap + 1);
>> +       read->type = cpu_to_le32(CLIENT_METRIC_TYPE_READ_LATENCY);
>> +       read->ver = 1;
>> +       read->compat = 1;
>> +       read->data_len = cpu_to_le32(sizeof(*read) - 10);
>> +       total = percpu_counter_sum(&mdsc->metric.total_reads),
>> +       sum = percpu_counter_sum(&mdsc->metric.read_latency_sum);
>> +       jiffies_to_timespec64(sum, &ts);
>> +       read->sec = cpu_to_le32(ts.tv_sec);
>> +       read->nsec = cpu_to_le32(ts.tv_nsec);
>> +       items++;
>> +       dout("read latency metric total %lld, sum lat %lld", total, sum);
>> +
>> +       /* encode the write latency metric */
>> +       write = (struct ceph_metric_write_latency *)(read + 1);
>> +       write->type = cpu_to_le32(CLIENT_METRIC_TYPE_WRITE_LATENCY);
>> +       write->ver = 1;
>> +       write->compat = 1;
>> +       write->data_len = cpu_to_le32(sizeof(*write) - 10);
>> +       total = percpu_counter_sum(&mdsc->metric.total_writes),
>> +       sum = percpu_counter_sum(&mdsc->metric.write_latency_sum);
>> +       jiffies_to_timespec64(sum, &ts);
>> +       write->sec = cpu_to_le32(ts.tv_sec);
>> +       write->nsec = cpu_to_le32(ts.tv_nsec);
>> +       items++;
>> +       dout("write latency metric total %lld, sum lat %lld", total, sum);
>> +
>> +       /* encode the metadata latency metric */
>> +       meta = (struct ceph_metric_metadata_latency *)(write + 1);
>> +       meta->type = cpu_to_le32(CLIENT_METRIC_TYPE_METADATA_LATENCY);
>> +       meta->ver = 1;
>> +       meta->compat = 1;
>> +       meta->data_len = cpu_to_le32(sizeof(*meta) - 10);
>> +       total = percpu_counter_sum(&mdsc->metric.total_metadatas),
>> +       sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum);
>> +       jiffies_to_timespec64(sum, &ts);
>> +       meta->sec = cpu_to_le32(ts.tv_sec);
>> +       meta->nsec = cpu_to_le32(ts.tv_nsec);
>> +       items++;
>> +       dout("metadata latency metric total %lld, sum lat %lld", total, sum);
>> +
>> +       /* encode the dentry lease metric */
>> +       lease = (struct ceph_metric_dentry_lease *)(meta + 1);
>> +       lease->type = cpu_to_le32(CLIENT_METRIC_TYPE_DENTRY_LEASE);
>> +       lease->ver = 1;
>> +       lease->compat = 1;
>> +       lease->data_len = cpu_to_le32(sizeof(*lease) - 10);
>> +       lease->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_hit));
>> +       lease->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_mis));
>> +       lease->total = cpu_to_le64(atomic64_read(&mdsc->metric.total_dentries));
>> +       items++;
>> +       dout("dentry lease metric hit %lld, mis %lld, total dentries %lld",
>> +            le64_to_cpu(lease->hit), le64_to_cpu(lease->mis),
>> +            le64_to_cpu(lease->total));
>> +
>> +       put_unaligned_le32(items, &head->num);
>> +       msg->front.iov_len = cpu_to_le32(len);
>> +       msg->hdr.version = cpu_to_le16(1);
>> +       msg->hdr.compat_version = cpu_to_le16(1);
>> +       msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
>> +       dout("send metrics to mds%d %p\n", s->s_mds, msg);
>> +       ceph_con_send(&s->s_con, msg);
>> +
>> +       return true;
>> +}
>> +
>> +#define CEPH_WORK_DELAY_DEF 5
>> +static void __schedule_delayed(struct delayed_work *work, int delay)
>> +{
>> +       unsigned int hz = round_jiffies_relative(HZ * delay);
>> +
>> +       schedule_delayed_work(work, hz);
>> +}
>> +
>>   static void schedule_delayed(struct ceph_mds_client *mdsc)
>>   {
>> -       int delay = 5;
>> -       unsigned hz = round_jiffies_relative(HZ * delay);
>> -       schedule_delayed_work(&mdsc->delayed_work, hz);
>> +       __schedule_delayed(&mdsc->delayed_work, CEPH_WORK_DELAY_DEF);
>> +}
>> +
>> +static void metric_schedule_delayed(struct ceph_mds_client *mdsc)
>> +{
>> +       /* delay CEPH_WORK_DELAY_DEF seconds when idle */
>> +       int delay = metric_send_interval ? : CEPH_WORK_DELAY_DEF;
>> +
>> +       __schedule_delayed(&mdsc->metric_delayed_work, delay);
>> +}
>> +
>> +static bool check_session_state(struct ceph_mds_client *mdsc,
>> +                               struct ceph_mds_session *s)
>> +{
>> +       if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
>> +               dout("resending session close request for mds%d\n",
>> +                               s->s_mds);
>> +               request_close_session(mdsc, s);
>> +               return false;
>> +       }
>> +       if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
>> +               if (s->s_state == CEPH_MDS_SESSION_OPEN) {
>> +                       s->s_state = CEPH_MDS_SESSION_HUNG;
>> +                       pr_info("mds%d hung\n", s->s_mds);
>> +               }
>> +       }
>> +       if (s->s_state == CEPH_MDS_SESSION_NEW ||
>> +           s->s_state == CEPH_MDS_SESSION_RESTARTING ||
>> +           s->s_state == CEPH_MDS_SESSION_REJECTED)
>> +               /* this mds is failed or recovering, just wait */
>> +               return false;
>> +
>> +       return true;
>>   }
>>
>> +/*
>> + * delayed work -- periodically trim expired leases, renew caps with mds
>> + */
>>   static void delayed_work(struct work_struct *work)
>>   {
>>          int i;
>> @@ -4116,23 +4267,8 @@ static void delayed_work(struct work_struct *work)
>>                  struct ceph_mds_session *s = __ceph_lookup_mds_session(mdsc, i);
>>                  if (!s)
>>                          continue;
>> -               if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
>> -                       dout("resending session close request for mds%d\n",
>> -                            s->s_mds);
>> -                       request_close_session(mdsc, s);
>> -                       ceph_put_mds_session(s);
>> -                       continue;
>> -               }
>> -               if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
>> -                       if (s->s_state == CEPH_MDS_SESSION_OPEN) {
>> -                               s->s_state = CEPH_MDS_SESSION_HUNG;
>> -                               pr_info("mds%d hung\n", s->s_mds);
>> -                       }
>> -               }
>> -               if (s->s_state == CEPH_MDS_SESSION_NEW ||
>> -                   s->s_state == CEPH_MDS_SESSION_RESTARTING ||
>> -                   s->s_state == CEPH_MDS_SESSION_REJECTED) {
>> -                       /* this mds is failed or recovering, just wait */
>> +
>> +               if (!check_session_state(mdsc, s)) {
>>                          ceph_put_mds_session(s);
>>                          continue;
>>                  }
>> @@ -4164,8 +4300,53 @@ static void delayed_work(struct work_struct *work)
>>          schedule_delayed(mdsc);
>>   }
>>
>> -static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
>> +static void metric_delayed_work(struct work_struct *work)
>> +{
>> +       struct ceph_mds_client *mdsc =
>> +               container_of(work, struct ceph_mds_client, metric_delayed_work.work);
>> +       struct ceph_mds_session *s;
>> +       u64 nr_caps = 0;
>> +       bool ret;
>> +       int i;
>> +
>> +       if (!metric_send_interval)
>> +               goto idle;
>> +
>> +       dout("mdsc metric_delayed_work\n");
>> +
>> +       mutex_lock(&mdsc->mutex);
>> +       for (i = 0; i < mdsc->max_sessions; i++) {
>> +               s = __ceph_lookup_mds_session(mdsc, i);
>> +               if (!s)
>> +                       continue;
>> +               nr_caps += s->s_nr_caps;
>> +               ceph_put_mds_session(s);
>> +       }
>> +
>> +       for (i = 0; i < mdsc->max_sessions; i++) {
>> +               s = __ceph_lookup_mds_session(mdsc, i);
>> +               if (!s)
>> +                       continue;
>> +               if (!check_session_state(mdsc, s)) {
>> +                       ceph_put_mds_session(s);
>> +                       continue;
>> +               }
>> +
>> +               /* Only send the metric once in any available session */
>> +               ret = ceph_mdsc_send_metrics(mdsc, s, nr_caps);
>> +               ceph_put_mds_session(s);
>> +               if (ret)
>> +                       break;
>> +       }
>> +       mutex_unlock(&mdsc->mutex);
>> +
>> +idle:
>> +       metric_schedule_delayed(mdsc);
> Looks like this will schedule metric_delayed_work() every 5 seconds
> even if metric_send_interval = 0 (i.e. sending is disabled).  What is
> the reason for that?

Hi Ilya,

In an earlier version I folded metric_delayed_work() into delayed_work().
But in this version, since the interval is settable, it is hard to
calculate the next schedule delay there.

When idle it just loops every 5 seconds. I thought that although this
is not a very graceful approach, it won't introduce much overhead. If
we don't like this, let's switch it to a completion.

Thanks,


> Thanks,
>
>                  Ilya
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/9] ceph: periodically send perf metrics to ceph
  2020-02-11  1:29     ` Xiubo Li
@ 2020-02-11 17:42       ` Ilya Dryomov
  2020-02-12  8:38         ` Xiubo Li
  0 siblings, 1 reply; 18+ messages in thread
From: Ilya Dryomov @ 2020-02-11 17:42 UTC (permalink / raw)
  To: Xiubo Li
  Cc: Jeff Layton, Sage Weil, Yan, Zheng, Patrick Donnelly, Ceph Development

On Tue, Feb 11, 2020 at 2:30 AM Xiubo Li <xiubli@redhat.com> wrote:
>
> On 2020/2/10 23:34, Ilya Dryomov wrote:
> > On Mon, Feb 10, 2020 at 6:34 AM <xiubli@redhat.com> wrote:
> >> From: Xiubo Li <xiubli@redhat.com>
> >>
> >> Add metric_send_interval module parameter support; the default value
> >> is 0, which means disabled. If nonzero, it enables the transmission of
> >> the metrics to the ceph cluster periodically, every metric_send_interval
> >> seconds.
> >>
> >> This will send the caps, dentry lease and read/write/metadata perf
> >> metrics to any available MDS only once per metric_send_interval
> >> seconds.
> >>
> >> URL: https://tracker.ceph.com/issues/43215
> >> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> >> ---
> >>   fs/ceph/mds_client.c         | 235 +++++++++++++++++++++++++++++++----
> >>   fs/ceph/mds_client.h         |   2 +
> >>   fs/ceph/metric.h             |  76 +++++++++++
> >>   fs/ceph/super.c              |   4 +
> >>   fs/ceph/super.h              |   1 +
> >>   include/linux/ceph/ceph_fs.h |   1 +
> >>   6 files changed, 294 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> >> index d414eded6810..f9a6f95c7941 100644
> >> --- a/fs/ceph/mds_client.c
> >> +++ b/fs/ceph/mds_client.c
> >> @@ -4085,16 +4085,167 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc)
> >>          ceph_force_reconnect(fsc->sb);
> >>   }
> >>
> >> -/*
> >> - * delayed work -- periodically trim expired leases, renew caps with mds
> >> - */
> >> +static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc,
> >> +                                  struct ceph_mds_session *s,
> >> +                                  u64 nr_caps)
> >> +{
> >> +       struct ceph_metric_head *head;
> >> +       struct ceph_metric_cap *cap;
> >> +       struct ceph_metric_dentry_lease *lease;
> >> +       struct ceph_metric_read_latency *read;
> >> +       struct ceph_metric_write_latency *write;
> >> +       struct ceph_metric_metadata_latency *meta;
> >> +       struct ceph_msg *msg;
> >> +       struct timespec64 ts;
> >> +       s64 sum, total;
> >> +       s32 items = 0;
> >> +       s32 len;
> >> +
> >> +       if (!mdsc || !s)
> >> +               return false;
> >> +
> >> +       len = sizeof(*head) + sizeof(*cap) + sizeof(*lease) + sizeof(*read)
> >> +             + sizeof(*write) + sizeof(*meta);
> >> +
> >> +       msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true);
> >> +       if (!msg) {
> >> +               pr_err("send metrics to mds%d, failed to allocate message\n",
> >> +                      s->s_mds);
> >> +               return false;
> >> +       }
> >> +
> >> +       head = msg->front.iov_base;
> >> +
> >> +       /* encode the cap metric */
> >> +       cap = (struct ceph_metric_cap *)(head + 1);
> >> +       cap->type = cpu_to_le32(CLIENT_METRIC_TYPE_CAP_INFO);
> >> +       cap->ver = 1;
> >> +       cap->compat = 1;
> >> +       cap->data_len = cpu_to_le32(sizeof(*cap) - 10);
> >> +       cap->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_hit));
> >> +       cap->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_mis));
> >> +       cap->total = cpu_to_le64(nr_caps);
> >> +       items++;
> >> +
> >> +       dout("cap metric hit %lld, mis %lld, total caps %lld",
> >> +            le64_to_cpu(cap->hit), le64_to_cpu(cap->mis),
> >> +            le64_to_cpu(cap->total));
> >> +
> >> +       /* encode the read latency metric */
> >> +       read = (struct ceph_metric_read_latency *)(cap + 1);
> >> +       read->type = cpu_to_le32(CLIENT_METRIC_TYPE_READ_LATENCY);
> >> +       read->ver = 1;
> >> +       read->compat = 1;
> >> +       read->data_len = cpu_to_le32(sizeof(*read) - 10);
> >> +       total = percpu_counter_sum(&mdsc->metric.total_reads),
> >> +       sum = percpu_counter_sum(&mdsc->metric.read_latency_sum);
> >> +       jiffies_to_timespec64(sum, &ts);
> >> +       read->sec = cpu_to_le32(ts.tv_sec);
> >> +       read->nsec = cpu_to_le32(ts.tv_nsec);
> >> +       items++;
> >> +       dout("read latency metric total %lld, sum lat %lld", total, sum);
> >> +
> >> +       /* encode the write latency metric */
> >> +       write = (struct ceph_metric_write_latency *)(read + 1);
> >> +       write->type = cpu_to_le32(CLIENT_METRIC_TYPE_WRITE_LATENCY);
> >> +       write->ver = 1;
> >> +       write->compat = 1;
> >> +       write->data_len = cpu_to_le32(sizeof(*write) - 10);
> >> +       total = percpu_counter_sum(&mdsc->metric.total_writes),
> >> +       sum = percpu_counter_sum(&mdsc->metric.write_latency_sum);
> >> +       jiffies_to_timespec64(sum, &ts);
> >> +       write->sec = cpu_to_le32(ts.tv_sec);
> >> +       write->nsec = cpu_to_le32(ts.tv_nsec);
> >> +       items++;
> >> +       dout("write latency metric total %lld, sum lat %lld", total, sum);
> >> +
> >> +       /* encode the metadata latency metric */
> >> +       meta = (struct ceph_metric_metadata_latency *)(write + 1);
> >> +       meta->type = cpu_to_le32(CLIENT_METRIC_TYPE_METADATA_LATENCY);
> >> +       meta->ver = 1;
> >> +       meta->compat = 1;
> >> +       meta->data_len = cpu_to_le32(sizeof(*meta) - 10);
> >> +       total = percpu_counter_sum(&mdsc->metric.total_metadatas),
> >> +       sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum);
> >> +       jiffies_to_timespec64(sum, &ts);
> >> +       meta->sec = cpu_to_le32(ts.tv_sec);
> >> +       meta->nsec = cpu_to_le32(ts.tv_nsec);
> >> +       items++;
> >> +       dout("metadata latency metric total %lld, sum lat %lld", total, sum);
> >> +
> >> +       /* encode the dentry lease metric */
> >> +       lease = (struct ceph_metric_dentry_lease *)(meta + 1);
> >> +       lease->type = cpu_to_le32(CLIENT_METRIC_TYPE_DENTRY_LEASE);
> >> +       lease->ver = 1;
> >> +       lease->compat = 1;
> >> +       lease->data_len = cpu_to_le32(sizeof(*lease) - 10);
> >> +       lease->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_hit));
> >> +       lease->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_mis));
> >> +       lease->total = cpu_to_le64(atomic64_read(&mdsc->metric.total_dentries));
> >> +       items++;
> >> +       dout("dentry lease metric hit %lld, mis %lld, total dentries %lld",
> >> +            le64_to_cpu(lease->hit), le64_to_cpu(lease->mis),
> >> +            le64_to_cpu(lease->total));
> >> +
> >> +       put_unaligned_le32(items, &head->num);
> >> +       msg->front.iov_len = cpu_to_le32(len);
> >> +       msg->hdr.version = cpu_to_le16(1);
> >> +       msg->hdr.compat_version = cpu_to_le16(1);
> >> +       msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
> >> +       dout("send metrics to mds%d %p\n", s->s_mds, msg);
> >> +       ceph_con_send(&s->s_con, msg);
> >> +
> >> +       return true;
> >> +}
> >> +
> >> +#define CEPH_WORK_DELAY_DEF 5
> >> +static void __schedule_delayed(struct delayed_work *work, int delay)
> >> +{
> >> +       unsigned int hz = round_jiffies_relative(HZ * delay);
> >> +
> >> +       schedule_delayed_work(work, hz);
> >> +}
> >> +
> >>   static void schedule_delayed(struct ceph_mds_client *mdsc)
> >>   {
> >> -       int delay = 5;
> >> -       unsigned hz = round_jiffies_relative(HZ * delay);
> >> -       schedule_delayed_work(&mdsc->delayed_work, hz);
> >> +       __schedule_delayed(&mdsc->delayed_work, CEPH_WORK_DELAY_DEF);
> >> +}
> >> +
> >> +static void metric_schedule_delayed(struct ceph_mds_client *mdsc)
> >> +{
> >> +       /* delay CEPH_WORK_DELAY_DEF seconds when idle */
> >> +       int delay = metric_send_interval ? : CEPH_WORK_DELAY_DEF;
> >> +
> >> +       __schedule_delayed(&mdsc->metric_delayed_work, delay);
> >> +}
> >> +
> >> +static bool check_session_state(struct ceph_mds_client *mdsc,
> >> +                               struct ceph_mds_session *s)
> >> +{
> >> +       if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
> >> +               dout("resending session close request for mds%d\n",
> >> +                               s->s_mds);
> >> +               request_close_session(mdsc, s);
> >> +               return false;
> >> +       }
> >> +       if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
> >> +               if (s->s_state == CEPH_MDS_SESSION_OPEN) {
> >> +                       s->s_state = CEPH_MDS_SESSION_HUNG;
> >> +                       pr_info("mds%d hung\n", s->s_mds);
> >> +               }
> >> +       }
> >> +       if (s->s_state == CEPH_MDS_SESSION_NEW ||
> >> +           s->s_state == CEPH_MDS_SESSION_RESTARTING ||
> >> +           s->s_state == CEPH_MDS_SESSION_REJECTED)
> >> +               /* this mds is failed or recovering, just wait */
> >> +               return false;
> >> +
> >> +       return true;
> >>   }
> >>
> >> +/*
> >> + * delayed work -- periodically trim expired leases, renew caps with mds
> >> + */
> >>   static void delayed_work(struct work_struct *work)
> >>   {
> >>          int i;
> >> @@ -4116,23 +4267,8 @@ static void delayed_work(struct work_struct *work)
> >>                  struct ceph_mds_session *s = __ceph_lookup_mds_session(mdsc, i);
> >>                  if (!s)
> >>                          continue;
> >> -               if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
> >> -                       dout("resending session close request for mds%d\n",
> >> -                            s->s_mds);
> >> -                       request_close_session(mdsc, s);
> >> -                       ceph_put_mds_session(s);
> >> -                       continue;
> >> -               }
> >> -               if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
> >> -                       if (s->s_state == CEPH_MDS_SESSION_OPEN) {
> >> -                               s->s_state = CEPH_MDS_SESSION_HUNG;
> >> -                               pr_info("mds%d hung\n", s->s_mds);
> >> -                       }
> >> -               }
> >> -               if (s->s_state == CEPH_MDS_SESSION_NEW ||
> >> -                   s->s_state == CEPH_MDS_SESSION_RESTARTING ||
> >> -                   s->s_state == CEPH_MDS_SESSION_REJECTED) {
> >> -                       /* this mds is failed or recovering, just wait */
> >> +
> >> +               if (!check_session_state(mdsc, s)) {
> >>                          ceph_put_mds_session(s);
> >>                          continue;
> >>                  }
> >> @@ -4164,8 +4300,53 @@ static void delayed_work(struct work_struct *work)
> >>          schedule_delayed(mdsc);
> >>   }
> >>
> >> -static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
> >> +static void metric_delayed_work(struct work_struct *work)
> >> +{
> >> +       struct ceph_mds_client *mdsc =
> >> +               container_of(work, struct ceph_mds_client, metric_delayed_work.work);
> >> +       struct ceph_mds_session *s;
> >> +       u64 nr_caps = 0;
> >> +       bool ret;
> >> +       int i;
> >> +
> >> +       if (!metric_send_interval)
> >> +               goto idle;
> >> +
> >> +       dout("mdsc metric_delayed_work\n");
> >> +
> >> +       mutex_lock(&mdsc->mutex);
> >> +       for (i = 0; i < mdsc->max_sessions; i++) {
> >> +               s = __ceph_lookup_mds_session(mdsc, i);
> >> +               if (!s)
> >> +                       continue;
> >> +               nr_caps += s->s_nr_caps;
> >> +               ceph_put_mds_session(s);
> >> +       }
> >> +
> >> +       for (i = 0; i < mdsc->max_sessions; i++) {
> >> +               s = __ceph_lookup_mds_session(mdsc, i);
> >> +               if (!s)
> >> +                       continue;
> >> +               if (!check_session_state(mdsc, s)) {
> >> +                       ceph_put_mds_session(s);
> >> +                       continue;
> >> +               }
> >> +
> >> +               /* Only send the metric once in any available session */
> >> +               ret = ceph_mdsc_send_metrics(mdsc, s, nr_caps);
> >> +               ceph_put_mds_session(s);
> >> +               if (ret)
> >> +                       break;
> >> +       }
> >> +       mutex_unlock(&mdsc->mutex);
> >> +
> >> +idle:
> >> +       metric_schedule_delayed(mdsc);
> > Looks like this will schedule metric_delayed_work() every 5 seconds
> > even if metric_send_interval = 0 (i.e. sending is disabled).  What is
> > the reason for that?
>
> Hi Ilya,
>
> In an earlier version I folded metric_delayed_work() into delayed_work().
> But in this version, since the interval is settable, it is hard to
> calculate the next schedule delay there.
>
> When idle it just loops every 5 seconds. I thought that although this
> is not a very graceful approach, it won't introduce much overhead. If
> we don't like this, let's switch it to a completion.

Take a look at module_param_cb macro.  I think you can provide a
setter and schedule the first work / modify the delay from there.
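
For reference, a sketch of that approach (assuming this series'
metric_send_interval variable; how the setter would reach each mounted
client to re-arm its delayed work is deliberately not shown):

	static int metric_interval_set(const char *val,
				       const struct kernel_param *kp)
	{
		unsigned int interval;
		int ret;

		ret = kstrtouint(val, 0, &interval);
		if (ret)
			return ret;

		*((unsigned int *)kp->arg) = interval;
		/*
		 * Kick or re-arm the metric sender(s) here, e.g. by
		 * walking the mounted clients and re-scheduling
		 * metric_delayed_work for each (not shown).
		 */
		return 0;
	}

	static const struct kernel_param_ops metric_interval_ops = {
		.set = metric_interval_set,
		.get = param_get_uint,
	};

	module_param_cb(metric_send_interval, &metric_interval_ops,
			&metric_send_interval, 0644);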

That said, I'm not sure making the interval configurable is a good
idea.  I'm not saying you need to change anything -- just that if it
was me, I would send these metrics once per tick (i.e. delayed_work)
with an on/off switch and no other tunables.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 6/9] ceph: periodically send perf metrics to ceph
  2020-02-11 17:42       ` Ilya Dryomov
@ 2020-02-12  8:38         ` Xiubo Li
  0 siblings, 0 replies; 18+ messages in thread
From: Xiubo Li @ 2020-02-12  8:38 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Jeff Layton, Sage Weil, Yan, Zheng, Patrick Donnelly, Ceph Development

On 2020/2/12 1:42, Ilya Dryomov wrote:
> On Tue, Feb 11, 2020 at 2:30 AM Xiubo Li <xiubli@redhat.com> wrote:
>> On 2020/2/10 23:34, Ilya Dryomov wrote:
>>> On Mon, Feb 10, 2020 at 6:34 AM <xiubli@redhat.com> wrote:
>>>> From: Xiubo Li <xiubli@redhat.com>
>>>>
>>>> Add metric_send_interval module parameter support; the default value
>>>> is 0, which means disabled. If nonzero, it enables the transmission of
>>>> the metrics to the ceph cluster periodically, every metric_send_interval
>>>> seconds.
>>>>
>>>> This will send the caps, dentry lease and read/write/metadata perf
>>>> metrics to any available MDS only once per metric_send_interval
>>>> seconds.
>>>>
>>>> URL: https://tracker.ceph.com/issues/43215
>>>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>>>> ---
>>>>    fs/ceph/mds_client.c         | 235 +++++++++++++++++++++++++++++++----
>>>>    fs/ceph/mds_client.h         |   2 +
>>>>    fs/ceph/metric.h             |  76 +++++++++++
>>>>    fs/ceph/super.c              |   4 +
>>>>    fs/ceph/super.h              |   1 +
>>>>    include/linux/ceph/ceph_fs.h |   1 +
>>>>    6 files changed, 294 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>>>> index d414eded6810..f9a6f95c7941 100644
>>>> --- a/fs/ceph/mds_client.c
>>>> +++ b/fs/ceph/mds_client.c
>>>> @@ -4085,16 +4085,167 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc)
>>>>           ceph_force_reconnect(fsc->sb);
>>>>    }
>>>>
>>>> -/*
>>>> - * delayed work -- periodically trim expired leases, renew caps with mds
>>>> - */
>>>> +static bool ceph_mdsc_send_metrics(struct ceph_mds_client *mdsc,
>>>> +                                  struct ceph_mds_session *s,
>>>> +                                  u64 nr_caps)
>>>> +{
>>>> +       struct ceph_metric_head *head;
>>>> +       struct ceph_metric_cap *cap;
>>>> +       struct ceph_metric_dentry_lease *lease;
>>>> +       struct ceph_metric_read_latency *read;
>>>> +       struct ceph_metric_write_latency *write;
>>>> +       struct ceph_metric_metadata_latency *meta;
>>>> +       struct ceph_msg *msg;
>>>> +       struct timespec64 ts;
>>>> +       s64 sum, total;
>>>> +       s32 items = 0;
>>>> +       s32 len;
>>>> +
>>>> +       if (!mdsc || !s)
>>>> +               return false;
>>>> +
>>>> +       len = sizeof(*head) + sizeof(*cap) + sizeof(*lease) + sizeof(*read)
>>>> +             + sizeof(*write) + sizeof(*meta);
>>>> +
>>>> +       msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true);
>>>> +       if (!msg) {
>>>> +               pr_err("send metrics to mds%d, failed to allocate message\n",
>>>> +                      s->s_mds);
>>>> +               return false;
>>>> +       }
>>>> +
>>>> +       head = msg->front.iov_base;
>>>> +
>>>> +       /* encode the cap metric */
>>>> +       cap = (struct ceph_metric_cap *)(head + 1);
>>>> +       cap->type = cpu_to_le32(CLIENT_METRIC_TYPE_CAP_INFO);
>>>> +       cap->ver = 1;
>>>> +       cap->compat = 1;
>>>> +       cap->data_len = cpu_to_le32(sizeof(*cap) - 10);
>>>> +       cap->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_hit));
>>>> +       cap->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_mis));
>>>> +       cap->total = cpu_to_le64(nr_caps);
>>>> +       items++;
>>>> +
>>>> +       dout("cap metric hit %lld, mis %lld, total caps %lld",
>>>> +            le64_to_cpu(cap->hit), le64_to_cpu(cap->mis),
>>>> +            le64_to_cpu(cap->total));
>>>> +
>>>> +       /* encode the read latency metric */
>>>> +       read = (struct ceph_metric_read_latency *)(cap + 1);
>>>> +       read->type = cpu_to_le32(CLIENT_METRIC_TYPE_READ_LATENCY);
>>>> +       read->ver = 1;
>>>> +       read->compat = 1;
>>>> +       read->data_len = cpu_to_le32(sizeof(*read) - 10);
>>>> +       total = percpu_counter_sum(&mdsc->metric.total_reads);
>>>> +       sum = percpu_counter_sum(&mdsc->metric.read_latency_sum);
>>>> +       jiffies_to_timespec64(sum, &ts);
>>>> +       read->sec = cpu_to_le32(ts.tv_sec);
>>>> +       read->nsec = cpu_to_le32(ts.tv_nsec);
>>>> +       items++;
>>>> +       dout("read latency metric total %lld, sum lat %lld\n", total, sum);
>>>> +
>>>> +       /* encode the write latency metric */
>>>> +       write = (struct ceph_metric_write_latency *)(read + 1);
>>>> +       write->type = cpu_to_le32(CLIENT_METRIC_TYPE_WRITE_LATENCY);
>>>> +       write->ver = 1;
>>>> +       write->compat = 1;
>>>> +       write->data_len = cpu_to_le32(sizeof(*write) - 10);
>>>> +       total = percpu_counter_sum(&mdsc->metric.total_writes);
>>>> +       sum = percpu_counter_sum(&mdsc->metric.write_latency_sum);
>>>> +       jiffies_to_timespec64(sum, &ts);
>>>> +       write->sec = cpu_to_le32(ts.tv_sec);
>>>> +       write->nsec = cpu_to_le32(ts.tv_nsec);
>>>> +       items++;
>>>> +       dout("write latency metric total %lld, sum lat %lld\n", total, sum);
>>>> +
>>>> +       /* encode the metadata latency metric */
>>>> +       meta = (struct ceph_metric_metadata_latency *)(write + 1);
>>>> +       meta->type = cpu_to_le32(CLIENT_METRIC_TYPE_METADATA_LATENCY);
>>>> +       meta->ver = 1;
>>>> +       meta->compat = 1;
>>>> +       meta->data_len = cpu_to_le32(sizeof(*meta) - 10);
>>>> +       total = percpu_counter_sum(&mdsc->metric.total_metadatas);
>>>> +       sum = percpu_counter_sum(&mdsc->metric.metadata_latency_sum);
>>>> +       jiffies_to_timespec64(sum, &ts);
>>>> +       meta->sec = cpu_to_le32(ts.tv_sec);
>>>> +       meta->nsec = cpu_to_le32(ts.tv_nsec);
>>>> +       items++;
>>>> +       dout("metadata latency metric total %lld, sum lat %lld\n", total, sum);
>>>> +
>>>> +       /* encode the dentry lease metric */
>>>> +       lease = (struct ceph_metric_dentry_lease *)(meta + 1);
>>>> +       lease->type = cpu_to_le32(CLIENT_METRIC_TYPE_DENTRY_LEASE);
>>>> +       lease->ver = 1;
>>>> +       lease->compat = 1;
>>>> +       lease->data_len = cpu_to_le32(sizeof(*lease) - 10);
>>>> +       lease->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_hit));
>>>> +       lease->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.d_lease_mis));
>>>> +       lease->total = cpu_to_le64(atomic64_read(&mdsc->metric.total_dentries));
>>>> +       items++;
>>>> +       dout("dentry lease metric hit %lld, mis %lld, total dentries %lld\n",
>>>> +            le64_to_cpu(lease->hit), le64_to_cpu(lease->mis),
>>>> +            le64_to_cpu(lease->total));
>>>> +
>>>> +       put_unaligned_le32(items, &head->num);
>>>> +       msg->front.iov_len = len;
>>>> +       msg->hdr.version = cpu_to_le16(1);
>>>> +       msg->hdr.compat_version = cpu_to_le16(1);
>>>> +       msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
>>>> +       dout("send metrics to mds%d %p\n", s->s_mds, msg);
>>>> +       ceph_con_send(&s->s_con, msg);
>>>> +
>>>> +       return true;
>>>> +}
>>>> +
>>>> +#define CEPH_WORK_DELAY_DEF 5
>>>> +static void __schedule_delayed(struct delayed_work *work, int delay)
>>>> +{
>>>> +       unsigned int hz = round_jiffies_relative(HZ * delay);
>>>> +
>>>> +       schedule_delayed_work(work, hz);
>>>> +}
>>>> +
>>>>    static void schedule_delayed(struct ceph_mds_client *mdsc)
>>>>    {
>>>> -       int delay = 5;
>>>> -       unsigned hz = round_jiffies_relative(HZ * delay);
>>>> -       schedule_delayed_work(&mdsc->delayed_work, hz);
>>>> +       __schedule_delayed(&mdsc->delayed_work, CEPH_WORK_DELAY_DEF);
>>>> +}
>>>> +
>>>> +static void metric_schedule_delayed(struct ceph_mds_client *mdsc)
>>>> +{
>>>> +       /* delay CEPH_WORK_DELAY_DEF seconds when idle */
>>>> +       int delay = metric_send_interval ? : CEPH_WORK_DELAY_DEF;
>>>> +
>>>> +       __schedule_delayed(&mdsc->metric_delayed_work, delay);
>>>> +}
>>>> +
>>>> +static bool check_session_state(struct ceph_mds_client *mdsc,
>>>> +                               struct ceph_mds_session *s)
>>>> +{
>>>> +       if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
>>>> +               dout("resending session close request for mds%d\n",
>>>> +                               s->s_mds);
>>>> +               request_close_session(mdsc, s);
>>>> +               return false;
>>>> +       }
>>>> +       if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
>>>> +               if (s->s_state == CEPH_MDS_SESSION_OPEN) {
>>>> +                       s->s_state = CEPH_MDS_SESSION_HUNG;
>>>> +                       pr_info("mds%d hung\n", s->s_mds);
>>>> +               }
>>>> +       }
>>>> +       if (s->s_state == CEPH_MDS_SESSION_NEW ||
>>>> +           s->s_state == CEPH_MDS_SESSION_RESTARTING ||
>>>> +           s->s_state == CEPH_MDS_SESSION_REJECTED)
>>>> +               /* this mds is failed or recovering, just wait */
>>>> +               return false;
>>>> +
>>>> +       return true;
>>>>    }
>>>>
>>>> +/*
>>>> + * delayed work -- periodically trim expired leases, renew caps with mds
>>>> + */
>>>>    static void delayed_work(struct work_struct *work)
>>>>    {
>>>>           int i;
>>>> @@ -4116,23 +4267,8 @@ static void delayed_work(struct work_struct *work)
>>>>                   struct ceph_mds_session *s = __ceph_lookup_mds_session(mdsc, i);
>>>>                   if (!s)
>>>>                           continue;
>>>> -               if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
>>>> -                       dout("resending session close request for mds%d\n",
>>>> -                            s->s_mds);
>>>> -                       request_close_session(mdsc, s);
>>>> -                       ceph_put_mds_session(s);
>>>> -                       continue;
>>>> -               }
>>>> -               if (s->s_ttl && time_after(jiffies, s->s_ttl)) {
>>>> -                       if (s->s_state == CEPH_MDS_SESSION_OPEN) {
>>>> -                               s->s_state = CEPH_MDS_SESSION_HUNG;
>>>> -                               pr_info("mds%d hung\n", s->s_mds);
>>>> -                       }
>>>> -               }
>>>> -               if (s->s_state == CEPH_MDS_SESSION_NEW ||
>>>> -                   s->s_state == CEPH_MDS_SESSION_RESTARTING ||
>>>> -                   s->s_state == CEPH_MDS_SESSION_REJECTED) {
>>>> -                       /* this mds is failed or recovering, just wait */
>>>> +
>>>> +               if (!check_session_state(mdsc, s)) {
>>>>                           ceph_put_mds_session(s);
>>>>                           continue;
>>>>                   }
>>>> @@ -4164,8 +4300,53 @@ static void delayed_work(struct work_struct *work)
>>>>           schedule_delayed(mdsc);
>>>>    }
>>>>
>>>> -static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
>>>> +static void metric_delayed_work(struct work_struct *work)
>>>> +{
>>>> +       struct ceph_mds_client *mdsc =
>>>> +               container_of(work, struct ceph_mds_client, metric_delayed_work.work);
>>>> +       struct ceph_mds_session *s;
>>>> +       u64 nr_caps = 0;
>>>> +       bool ret;
>>>> +       int i;
>>>> +
>>>> +       if (!metric_send_interval)
>>>> +               goto idle;
>>>> +
>>>> +       dout("mdsc metric_delayed_work\n");
>>>> +
>>>> +       mutex_lock(&mdsc->mutex);
>>>> +       for (i = 0; i < mdsc->max_sessions; i++) {
>>>> +               s = __ceph_lookup_mds_session(mdsc, i);
>>>> +               if (!s)
>>>> +                       continue;
>>>> +               nr_caps += s->s_nr_caps;
>>>> +               ceph_put_mds_session(s);
>>>> +       }
>>>> +
>>>> +       for (i = 0; i < mdsc->max_sessions; i++) {
>>>> +               s = __ceph_lookup_mds_session(mdsc, i);
>>>> +               if (!s)
>>>> +                       continue;
>>>> +               if (!check_session_state(mdsc, s)) {
>>>> +                       ceph_put_mds_session(s);
>>>> +                       continue;
>>>> +               }
>>>> +
>>>> +               /* Only send the metrics once, to the first available session */
>>>> +               ret = ceph_mdsc_send_metrics(mdsc, s, nr_caps);
>>>> +               ceph_put_mds_session(s);
>>>> +               if (ret)
>>>> +                       break;
>>>> +       }
>>>> +       mutex_unlock(&mdsc->mutex);
>>>> +
>>>> +idle:
>>>> +       metric_schedule_delayed(mdsc);
>>> Looks like this will schedule metric_delayed_work() every 5 seconds
>>> even if metric_send_interval = 0 (i.e. sending is disabled).  What is
>>> the reason for that?
>> Hi Ilya,
>>
>> Previously I had folded metric_delayed_work() into delayed_work(). But
>> in this version, since the interval is settable, it is hard to calculate
>> the next schedule delay there.
>>
>> When idle it just loops every 5 seconds. Though this is not a very
>> graceful approach, I thought it wouldn't introduce too much overhead.
>> If we don't like this, let's switch it to a completion.
> Take a look at module_param_cb macro.  I think you can provide a
> setter and schedule the first work / modify the delay from there.

Hi Ilya,

Yeah, this is what I was trying to switch to.
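
For reference, a rough, untested sketch of what I have in mind. Note
that ceph_metric_reschedule_all() is a hypothetical helper, not part of
this series -- the param callback is module-global, so re-arming the
delayed work would need some way to reach the mounted clients:

#include <linux/module.h>
#include <linux/moduleparam.h>

static unsigned int metric_send_interval;

/* hypothetical helper (not in this series): re-arm each client's work */
void ceph_metric_reschedule_all(void);

static int param_set_metric_interval(const char *val,
				     const struct kernel_param *kp)
{
	int ret;

	/* parse and store the new interval like a plain uint param */
	ret = param_set_uint(val, kp);
	if (ret)
		return ret;

	/*
	 * Re-arm the metric delayed work so a new non-zero interval
	 * takes effect immediately instead of waiting for the next
	 * idle tick.
	 */
	if (metric_send_interval)
		ceph_metric_reschedule_all();

	return 0;
}

static const struct kernel_param_ops metric_interval_ops = {
	.set = param_set_metric_interval,
	.get = param_get_uint,
};

module_param_cb(metric_send_interval, &metric_interval_ops,
		&metric_send_interval, 0644);
MODULE_PARM_DESC(metric_send_interval,
		 "Interval (in seconds) of sending perf metrics to the MDS, 0 means disabled");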

> That said, I'm not sure making the interval configurable is a good
> idea.  I'm not saying you need to change anything -- just that if it
> was me, I would send these metrics once per tick (i.e. delayed_work)
> with an on/off switch and no other tunables.

Currently I still can't be sure whether sending once per second would
introduce any potential overhead in some cases, but for now I can't
foresee that it will.
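
FWIW, one nice side effect of the module parameter approach is that the
interval can be flipped at runtime for exactly this kind of measurement,
presumably via something like the following (assuming the parameter is
registered writable under the ceph module):

$ echo 1 > /sys/module/ceph/parameters/metric_send_interval
$ echo 0 > /sys/module/ceph/parameters/metric_send_interval

so comparing the load with a 1-second interval against disabled should
be straightforward.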

Thanks,

Xiubo

>
> Thanks,
>
>                  Ilya
>


* Re: [PATCH v6 0/9] ceph: add perf metrics support
  2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
                   ` (8 preceding siblings ...)
  2020-02-10  5:34 ` [PATCH v6 9/9] ceph: send client provided metric flags in client metadata xiubli
@ 2020-02-15  0:39 ` Xiubo Li
  9 siblings, 0 replies; 18+ messages in thread
From: Xiubo Li @ 2020-02-15  0:39 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel

On 2020/2/10 13:33, xiubli@redhat.com wrote:
>
> Xiubo Li (9):
>    ceph: add global dentry lease metric support
>    ceph: add caps perf metric for each session
>    ceph: add global read latency metric support
>    ceph: add global write latency metric support
>    ceph: add global metadata perf metric support

Hi Jeff, Ilya

Currently the corresponding PR in ceph is still not merged, so the
following 4 patches can be ignored for now. I will address the new
comments and repost them after that PR gets merged.

The above 5 patches only concern the kclient. If they look okay, could
we split this series and test/merge them?

Thanks

BRs

Xiubo


>    ceph: periodically send perf metrics to ceph
>    ceph: add CEPH_DEFINE_RW_FUNC helper support
>    ceph: add reset metrics support
>    ceph: send client provided metric flags in client metadata
>
>   fs/ceph/acl.c                   |   2 +
>   fs/ceph/addr.c                  |  13 ++
>   fs/ceph/caps.c                  |  29 +++
>   fs/ceph/debugfs.c               | 107 ++++++++-
>   fs/ceph/dir.c                   |  25 ++-
>   fs/ceph/file.c                  |  22 ++
>   fs/ceph/mds_client.c            | 381 +++++++++++++++++++++++++++++---
>   fs/ceph/mds_client.h            |   6 +
>   fs/ceph/metric.h                | 155 +++++++++++++
>   fs/ceph/quota.c                 |   9 +-
>   fs/ceph/super.c                 |   4 +
>   fs/ceph/super.h                 |  11 +
>   fs/ceph/xattr.c                 |  17 +-
>   include/linux/ceph/ceph_fs.h    |   1 +
>   include/linux/ceph/debugfs.h    |  14 ++
>   include/linux/ceph/osd_client.h |   1 +
>   net/ceph/osd_client.c           |   2 +
>   17 files changed, 759 insertions(+), 40 deletions(-)
>   create mode 100644 fs/ceph/metric.h
>


* Re: [PATCH v6 2/9] ceph: add caps perf metric for each session
  2020-02-10  5:34 ` [PATCH v6 2/9] ceph: add caps perf metric for each session xiubli
@ 2020-02-17 13:27   ` Jeff Layton
  2020-02-17 13:50     ` Xiubo Li
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Layton @ 2020-02-17 13:27 UTC (permalink / raw)
  To: xiubli, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel

On Mon, 2020-02-10 at 00:34 -0500, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> This fulfills the cap hit/miss metrics per superblock. It counts
> the hit/miss counters based on each inode: if an inode's
> 'issued & ~revoking' covers 'mask' it counts as a hit, otherwise
> a miss.
> 
> item          total           miss            hit
> -------------------------------------------------
> caps          295             107             4119
> 
> URL: https://tracker.ceph.com/issues/43215
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/acl.c        |  2 ++
>  fs/ceph/caps.c       | 29 +++++++++++++++++++++++++++++
>  fs/ceph/debugfs.c    | 16 ++++++++++++++++
>  fs/ceph/dir.c        |  9 +++++++--
>  fs/ceph/file.c       |  2 ++
>  fs/ceph/mds_client.c | 26 ++++++++++++++++++++++----
>  fs/ceph/metric.h     |  3 +++
>  fs/ceph/quota.c      |  9 +++++++--
>  fs/ceph/super.h      |  9 +++++++++
>  fs/ceph/xattr.c      | 17 ++++++++++++++---
>  10 files changed, 111 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c
> index 26be6520d3fb..58e119e3519f 100644
> --- a/fs/ceph/acl.c
> +++ b/fs/ceph/acl.c
> @@ -22,6 +22,8 @@ static inline void ceph_set_cached_acl(struct inode *inode,
>  	struct ceph_inode_info *ci = ceph_inode(inode);
>  
>  	spin_lock(&ci->i_ceph_lock);
> +	__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
> +
>  	if (__ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 0))
>  		set_cached_acl(inode, type, acl);
>  	else
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index 7fc87b693ba4..b4f122eb74bb 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -818,6 +818,32 @@ int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented)
>  	return have;
>  }
>  
> +/*
> + * Counts the cap metric.
> + *
> + * This will try to traverse all the ci->i_caps, if we can
> + * get all the cap 'mask' it will count the hit, or the mis.
> + */
> +void __ceph_caps_metric(struct ceph_inode_info *ci, int mask)
> +{
> +	struct ceph_mds_client *mdsc =
> +		ceph_sb_to_client(ci->vfs_inode.i_sb)->mdsc;
> +	struct ceph_client_metric *metric = &mdsc->metric;
> +	int issued;
> +
> +	lockdep_assert_held(&ci->i_ceph_lock);
> +
> +	if (mask <= 0)
> +		return;
> +
> +	issued = __ceph_caps_issued(ci, NULL);
> +
> +	if ((mask & issued) == mask)
> +		percpu_counter_inc(&metric->i_caps_hit);
> +	else
> +		percpu_counter_inc(&metric->i_caps_mis);
> +}
> +
>  /*
>   * Get cap bits issued by caps other than @ocap
>   */
> @@ -2758,6 +2784,7 @@ int ceph_try_get_caps(struct inode *inode, int need, int want,
>  	BUG_ON(want & ~(CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO |
>  			CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL |
>  			CEPH_CAP_ANY_DIR_OPS));
> +	ceph_caps_metric(ceph_inode(inode), need | want);
>  	ret = try_get_cap_refs(inode, need, want, 0, nonblock, got);
>  	return ret == -EAGAIN ? 0 : ret;
>  }
> @@ -2784,6 +2811,8 @@ int ceph_get_caps(struct file *filp, int need, int want,
>  	    fi->filp_gen != READ_ONCE(fsc->filp_gen))
>  		return -EBADF;
>  
> +	ceph_caps_metric(ci, need | want);
> +
>  	while (true) {
>  		if (endoff > 0)
>  			check_max_size(inode, endoff);
> diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
> index 15975ba95d9a..c83e52bd9961 100644
> --- a/fs/ceph/debugfs.c
> +++ b/fs/ceph/debugfs.c
> @@ -128,6 +128,7 @@ static int metric_show(struct seq_file *s, void *p)
>  {
>  	struct ceph_fs_client *fsc = s->private;
>  	struct ceph_mds_client *mdsc = fsc->mdsc;
> +	int i, nr_caps = 0;
>  
>  	seq_printf(s, "item          total           miss            hit\n");
>  	seq_printf(s, "-------------------------------------------------\n");
> @@ -137,6 +138,21 @@ static int metric_show(struct seq_file *s, void *p)
>  		   percpu_counter_sum(&mdsc->metric.d_lease_mis),
>  		   percpu_counter_sum(&mdsc->metric.d_lease_hit));
>  
> +	mutex_lock(&mdsc->mutex);
> +	for (i = 0; i < mdsc->max_sessions; i++) {
> +		struct ceph_mds_session *s;
> +
> +		s = __ceph_lookup_mds_session(mdsc, i);
> +		if (!s)
> +			continue;
> +		nr_caps += s->s_nr_caps;
> +		ceph_put_mds_session(s);
> +	}
> +	mutex_unlock(&mdsc->mutex);
> +	seq_printf(s, "%-14s%-16d%-16lld%lld\n", "caps", nr_caps,
> +		   percpu_counter_sum(&mdsc->metric.i_caps_mis),
> +		   percpu_counter_sum(&mdsc->metric.i_caps_hit));
> +
>  	return 0;
>  }
>  
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 4771bf61d562..ffeaff5bf211 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -313,7 +313,7 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx)
>  	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
>  	struct ceph_mds_client *mdsc = fsc->mdsc;
>  	int i;
> -	int err;
> +	int err, ret = -1;
>  	unsigned frag = -1;
>  	struct ceph_mds_reply_info_parsed *rinfo;
>  
> @@ -346,13 +346,16 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx)
>  	    !ceph_test_mount_opt(fsc, NOASYNCREADDIR) &&
>  	    ceph_snap(inode) != CEPH_SNAPDIR &&
>  	    __ceph_dir_is_complete_ordered(ci) &&
> -	    __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1)) {
> +	    (ret = __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1))) {
>  		int shared_gen = atomic_read(&ci->i_shared_gen);
> +		__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);

Why is this dealing with Xs caps when you've checked Fs?

>  		spin_unlock(&ci->i_ceph_lock);
>  		err = __dcache_readdir(file, ctx, shared_gen);
>  		if (err != -EAGAIN)
>  			return err;
>  	} else {
> +		if (ret != -1)
> +			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);

Ditto.

>  		spin_unlock(&ci->i_ceph_lock);
>  	}
>  
> @@ -757,6 +760,8 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
>  		struct ceph_dentry_info *di = ceph_dentry(dentry);
>  
>  		spin_lock(&ci->i_ceph_lock);
> +		__ceph_caps_metric(ci, CEPH_CAP_FILE_SHARED);
> +
>  		dout(" dir %p flags are %d\n", dir, ci->i_ceph_flags);
>  		if (strncmp(dentry->d_name.name,
>  			    fsc->mount_options->snapdir_name,
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 4d1b5cc6dd3b..96803500b712 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -384,6 +384,8 @@ int ceph_open(struct inode *inode, struct file *file)
>  	 * asynchronously.
>  	 */
>  	spin_lock(&ci->i_ceph_lock);
> +	__ceph_caps_metric(ci, wanted);
> +
>  	if (__ceph_is_any_real_caps(ci) &&
>  	    (((fmode & CEPH_FILE_MODE_WR) == 0) || ci->i_auth_cap)) {
>  		int mds_wanted = __ceph_caps_mds_wanted(ci, true);
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index a24fd00676b8..1431e52e9558 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -4169,13 +4169,29 @@ static int ceph_mdsc_metric_init(struct ceph_client_metric *metric)
>  	ret = percpu_counter_init(&metric->d_lease_hit, 0, GFP_KERNEL);
>  	if (ret)
>  		return ret;
> +
>  	ret = percpu_counter_init(&metric->d_lease_mis, 0, GFP_KERNEL);
> -	if (ret) {
> -		percpu_counter_destroy(&metric->d_lease_hit);
> -		return ret;
> -	}
> +	if (ret)
> +		goto err_d_lease_mis;
> +
> +	ret = percpu_counter_init(&metric->i_caps_hit, 0, GFP_KERNEL);
> +	if (ret)
> +		goto err_i_caps_hit;
> +
> +	ret = percpu_counter_init(&metric->i_caps_mis, 0, GFP_KERNEL);
> +	if (ret)
> +		goto err_i_caps_mis;
>  
>  	return 0;
> +
> +err_i_caps_mis:
> +	percpu_counter_destroy(&metric->i_caps_hit);
> +err_i_caps_hit:
> +	percpu_counter_destroy(&metric->d_lease_mis);
> +err_d_lease_mis:
> +	percpu_counter_destroy(&metric->d_lease_hit);
> +
> +	return ret;
>  }
>  
>  int ceph_mdsc_init(struct ceph_fs_client *fsc)
> @@ -4515,6 +4531,8 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
>  
>  	ceph_mdsc_stop(mdsc);
>  
> +	percpu_counter_destroy(&mdsc->metric.i_caps_mis);
> +	percpu_counter_destroy(&mdsc->metric.i_caps_hit);
>  	percpu_counter_destroy(&mdsc->metric.d_lease_mis);
>  	percpu_counter_destroy(&mdsc->metric.d_lease_hit);
>  
> diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h
> index 998fe2a643cf..e2fceb38a924 100644
> --- a/fs/ceph/metric.h
> +++ b/fs/ceph/metric.h
> @@ -7,5 +7,8 @@ struct ceph_client_metric {
>  	atomic64_t            total_dentries;
>  	struct percpu_counter d_lease_hit;
>  	struct percpu_counter d_lease_mis;
> +
> +	struct percpu_counter i_caps_hit;
> +	struct percpu_counter i_caps_mis;
>  };
>  #endif /* _FS_CEPH_MDS_METRIC_H */
> diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
> index de56dee60540..4ce2f658e63d 100644
> --- a/fs/ceph/quota.c
> +++ b/fs/ceph/quota.c
> @@ -147,9 +147,14 @@ static struct inode *lookup_quotarealm_inode(struct ceph_mds_client *mdsc,
>  		return NULL;
>  	}
>  	if (qri->inode) {
> +		struct ceph_inode_info *ci = ceph_inode(qri->inode);
> +		int ret;
> +
> +		ceph_caps_metric(ci, CEPH_STAT_CAP_INODE);
> +
>  		/* get caps */
> -		int ret = __ceph_do_getattr(qri->inode, NULL,
> -					    CEPH_STAT_CAP_INODE, true);
> +		ret = __ceph_do_getattr(qri->inode, NULL,
> +					CEPH_STAT_CAP_INODE, true);
>  		if (ret >= 0)
>  			in = qri->inode;
>  		else
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index 5241efe0f9d0..44b9a971ec9a 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -641,6 +641,14 @@ static inline bool __ceph_is_any_real_caps(struct ceph_inode_info *ci)
>  	return !RB_EMPTY_ROOT(&ci->i_caps);
>  }
>  
> +extern void __ceph_caps_metric(struct ceph_inode_info *ci, int mask);
> +static inline void ceph_caps_metric(struct ceph_inode_info *ci, int mask)
> +{
> +	spin_lock(&ci->i_ceph_lock);
> +	__ceph_caps_metric(ci, mask);
> +	spin_unlock(&ci->i_ceph_lock);
> +}
> +
>  extern int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented);
>  extern int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int t);
>  extern int __ceph_caps_issued_other(struct ceph_inode_info *ci,
> @@ -927,6 +935,7 @@ extern int __ceph_do_getattr(struct inode *inode, struct page *locked_page,
>  			     int mask, bool force);
>  static inline int ceph_do_getattr(struct inode *inode, int mask, bool force)
>  {
> +	ceph_caps_metric(ceph_inode(inode), mask);
>  	return __ceph_do_getattr(inode, NULL, mask, force);
>  }
>  extern int ceph_permission(struct inode *inode, int mask);
> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> index 7b8a070a782d..9b28e87b6719 100644
> --- a/fs/ceph/xattr.c
> +++ b/fs/ceph/xattr.c
> @@ -829,6 +829,7 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
>  	struct ceph_vxattr *vxattr = NULL;
>  	int req_mask;
>  	ssize_t err;
> +	int ret = -1;
>  
>  	/* let's see if a virtual xattr was requested */
>  	vxattr = ceph_match_vxattr(inode, name);
> @@ -856,7 +857,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
>  
>  	if (ci->i_xattrs.version == 0 ||
>  	    !((req_mask & CEPH_CAP_XATTR_SHARED) ||
> -	      __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1))) {
> +	      (ret = __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1)))) {
> +		if (ret != -1)
> +			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
>  		spin_unlock(&ci->i_ceph_lock);
>  
>  		/* security module gets xattr while filling trace */
> @@ -871,6 +874,9 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value,
>  		if (err)
>  			return err;
>  		spin_lock(&ci->i_ceph_lock);
> +	} else {
> +		if (ret != -1)
> +			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
>  	}
>  
>  	err = __build_xattrs(inode);
> @@ -907,19 +913,24 @@ ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size)
>  	struct ceph_inode_info *ci = ceph_inode(inode);
>  	bool len_only = (size == 0);
>  	u32 namelen;
> -	int err;
> +	int err, ret = -1;
>  
>  	spin_lock(&ci->i_ceph_lock);
>  	dout("listxattr %p ver=%lld index_ver=%lld\n", inode,
>  	     ci->i_xattrs.version, ci->i_xattrs.index_version);
>  
>  	if (ci->i_xattrs.version == 0 ||
> -	    !__ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1)) {
> +	    !(ret = __ceph_caps_issued_mask(ci, CEPH_CAP_XATTR_SHARED, 1))) {
> +		if (ret != -1)
> +			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
>  		spin_unlock(&ci->i_ceph_lock);
>  		err = ceph_do_getattr(inode, CEPH_STAT_CAP_XATTR, true);
>  		if (err)
>  			return err;
>  		spin_lock(&ci->i_ceph_lock);
> +	} else {
> +		if (ret != -1)
> +			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
>  	}
>  
>  	err = __build_xattrs(inode);

-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH v6 2/9] ceph: add caps perf metric for each session
  2020-02-17 13:27   ` Jeff Layton
@ 2020-02-17 13:50     ` Xiubo Li
  0 siblings, 0 replies; 18+ messages in thread
From: Xiubo Li @ 2020-02-17 13:50 UTC (permalink / raw)
  To: Jeff Layton, idryomov; +Cc: sage, zyan, pdonnell, ceph-devel

On 2020/2/17 21:27, Jeff Layton wrote:
> On Mon, 2020-02-10 at 00:34 -0500, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> This fulfills the cap hit/miss metrics per superblock. It counts
>> the hit/miss counters based on each inode: if an inode's
>> 'issued & ~revoking' covers 'mask' it counts as a hit, otherwise
>> a miss.
>>
>> item          total           miss            hit
>> -------------------------------------------------
>> caps          295             107             4119
>>
>> []
[...]
>>   
>> @@ -346,13 +346,16 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx)
>>   	    !ceph_test_mount_opt(fsc, NOASYNCREADDIR) &&
>>   	    ceph_snap(inode) != CEPH_SNAPDIR &&
>>   	    __ceph_dir_is_complete_ordered(ci) &&
>> -	    __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1)) {
>> +	    (ret = __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1))) {
>>   		int shared_gen = atomic_read(&ci->i_shared_gen);
>> +		__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
> Why is this dealing with Xs caps when you've checked Fs?

Good catch.

This was probably just copied from somewhere and I forgot to change it.
Will fix it.
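
The fix should presumably just be to pass the cap actually being
tested, i.e. something like (untested):

-		__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
+		__ceph_caps_metric(ci, CEPH_CAP_FILE_SHARED);

for both call sites in ceph_readdir(), so the metric accounting matches
the Fs cap checked by __ceph_caps_issued_mask().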

>
>>   		spin_unlock(&ci->i_ceph_lock);
>>   		err = __dcache_readdir(file, ctx, shared_gen);
>>   		if (err != -EAGAIN)
>>   			return err;
>>   	} else {
>> +		if (ret != -1)
>> +			__ceph_caps_metric(ci, CEPH_CAP_XATTR_SHARED);
> Ditto.

Here too.

Thanks.
Xiubo


end of thread

Thread overview: 18+ messages
2020-02-10  5:33 [PATCH v6 0/9] ceph: add perf metrics support xiubli
2020-02-10  5:33 ` [PATCH v6 1/9] ceph: add global dentry lease metric support xiubli
2020-02-10  5:34 ` [PATCH v6 2/9] ceph: add caps perf metric for each session xiubli
2020-02-17 13:27   ` Jeff Layton
2020-02-17 13:50     ` Xiubo Li
2020-02-10  5:34 ` [PATCH v6 3/9] ceph: add global read latency metric support xiubli
2020-02-10  5:34 ` [PATCH v6 4/9] ceph: add global write " xiubli
2020-02-10  5:34 ` [PATCH v6 5/9] ceph: add global metadata perf " xiubli
2020-02-10  5:34 ` [PATCH v6 6/9] ceph: periodically send perf metrics to ceph xiubli
2020-02-10 15:34   ` Ilya Dryomov
2020-02-11  1:29     ` Xiubo Li
2020-02-11 17:42       ` Ilya Dryomov
2020-02-12  8:38         ` Xiubo Li
2020-02-10  5:34 ` [PATCH v6 7/9] ceph: add CEPH_DEFINE_RW_FUNC helper support xiubli
2020-02-10  5:34 ` [PATCH v6 8/9] ceph: add reset metrics support xiubli
2020-02-10 15:22   ` Ilya Dryomov
2020-02-10  5:34 ` [PATCH v6 9/9] ceph: send client provided metric flags in client metadata xiubli
2020-02-15  0:39 ` [PATCH v6 0/9] ceph: add perf metrics support Xiubo Li
