All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: ceph-devel@vger.kernel.org
Cc: zyan@redhat.com, sage@redhat.com, idryomov@gmail.com,
	pdonnell@redhat.com
Subject: [RFC PATCH 0/9] ceph: add asynchronous create functionality
Date: Fri, 10 Jan 2020 15:56:38 -0500	[thread overview]
Message-ID: <20200110205647.311023-1-jlayton@kernel.org> (raw)

I recently sent a patchset that allows the client to do an asynchronous
UNLINK call to the MDS when it has the appropriate caps and dentry info.
This set adds the corresponding functionality for creates.

When the client has the appropriate caps on the parent directory and
dentry information, and a delegated inode number, it can satisfy a
request locally without contacting the server. This allows the kernel
client to return very quickly from an O_CREAT open, so it can get on
with doing other things.

These numbers are based on my personal test rig, which is a KVM client
vs a vstart cluster running on my workstation (nothing scientific here).

A simple benchmark (with the cephfs mounted at /mnt/cephfs):
-------------------8<-------------------
#!/bin/sh

TESTDIR=/mnt/cephfs/test-dirops.$$

mkdir $TESTDIR
stat $TESTDIR
echo "Creating files in $TESTDIR"
time for i in `seq 1 10000`; do
    echo "foobarbaz" > $TESTDIR/$i
done
-------------------8<-------------------

With async dirops disabled:

real	0m9.865s
user	0m0.353s
sys	0m0.888s

With async dirops enabled:

real	0m5.272s
user	0m0.104s
sys	0m0.454s

That workload is a bit synthetic though. One workload we're interested
in improving is untar. Untarring a deep directory tree (random kernel
tarball I had laying around):

Disabled:
$ time tar xf ~/linux-4.18.0-153.el8.jlayton.006.tar

real	1m35.774s
user	0m0.835s
sys	0m7.410s

Enabled:
$ time tar xf ~/linux-4.18.0-153.el8.jlayton.006.tar

real	1m32.182s
user	0m0.783s
sys	0m6.830s

Not a huge win there. I suspect at this point that synchronous mkdir
may be serializing behind the async creates.

It needs a lot more performance tuning and analysis, but it's now at the
point where it's basically usable. To enable it, turn on the
ceph.enable_async_dirops module option.

There are some places that need further work:

1) The MDS patchset to delegate inodes to the client is not yet merged:

    https://github.com/ceph/ceph/pull/31817

2) this is 64-bit arch only for the moment. I'm using an xarray to track
the delegated inode numbers, and those don't do 64-bit indexes on
32-bit machines. Is anyone using 32-bit ceph clients? We could probably
build an xarray of xarrays if needed.

3) The error handling is still pretty lame. If the create fails, it'll
set a writeback error on the parent dir and the inode itself, but the
client could end up writing a bunch before it notices, if it even
bothers to check. We probably need to do better here. I'm open to
suggestions on this bit especially.

Jeff Layton (9):
  ceph: ensure we have a new cap before continuing in fill_inode
  ceph: print name of xattr being set in set/getxattr dout message
  ceph: close some holes in struct ceph_mds_request
  ceph: make ceph_fill_inode non-static
  libceph: export ceph_file_layout_is_valid
  ceph: decode interval_sets for delegated inos
  ceph: add flag to delegate an inode number for async create
  ceph: copy layout, max_size and truncate_size on successful sync
    create
  ceph: attempt to do async create when possible

 fs/ceph/caps.c               |  31 +++++-
 fs/ceph/file.c               | 202 +++++++++++++++++++++++++++++++++--
 fs/ceph/inode.c              |  57 +++++-----
 fs/ceph/mds_client.c         | 130 ++++++++++++++++++++--
 fs/ceph/mds_client.h         |  12 ++-
 fs/ceph/super.h              |  10 ++
 fs/ceph/xattr.c              |   5 +-
 include/linux/ceph/ceph_fs.h |   8 +-
 net/ceph/ceph_fs.c           |   1 +
 9 files changed, 396 insertions(+), 60 deletions(-)

-- 
2.24.1

             reply	other threads:[~2020-01-10 20:56 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-10 20:56 Jeff Layton [this message]
2020-01-10 20:56 ` [RFC PATCH 1/9] ceph: ensure we have a new cap before continuing in fill_inode Jeff Layton
2020-01-10 20:56 ` [RFC PATCH 2/9] ceph: print name of xattr being set in set/getxattr dout message Jeff Layton
2020-01-10 20:56 ` [RFC PATCH 3/9] ceph: close some holes in struct ceph_mds_request Jeff Layton
2020-01-10 20:56 ` [RFC PATCH 4/9] ceph: make ceph_fill_inode non-static Jeff Layton
2020-01-10 20:56 ` [RFC PATCH 5/9] libceph: export ceph_file_layout_is_valid Jeff Layton
2020-01-10 20:56 ` [RFC PATCH 6/9] ceph: decode interval_sets for delegated inos Jeff Layton
2020-01-10 20:56 ` [RFC PATCH 7/9] ceph: add flag to delegate an inode number for async create Jeff Layton
2020-01-13  9:17   ` Yan, Zheng
2020-01-13 13:31     ` Jeff Layton
2020-01-13 14:51       ` Yan, Zheng
2020-01-10 20:56 ` [RFC PATCH 8/9] ceph: copy layout, max_size and truncate_size on successful sync create Jeff Layton
2020-01-13  3:51   ` Yan, Zheng
2020-01-13 13:26     ` Jeff Layton
2020-01-13 14:56       ` Yan, Zheng
2020-01-13 15:13         ` Jeff Layton
2020-01-13 16:37           ` Yan, Zheng
2020-01-13  9:01   ` Yan, Zheng
2020-01-13 13:29     ` Jeff Layton
2020-01-10 20:56 ` [RFC PATCH 9/9] ceph: attempt to do async create when possible Jeff Layton
2020-01-13  1:43   ` Xiubo Li
2020-01-13 13:16     ` Jeff Layton
2020-01-13 10:53   ` Yan, Zheng
2020-01-13 13:44     ` Jeff Layton
2020-01-13 14:48       ` Yan, Zheng
2020-01-13 15:20         ` Jeff Layton
2020-01-14  2:08           ` Yan, Zheng
2020-01-13 11:07 ` [RFC PATCH 0/9] ceph: add asynchronous create functionality Yan, Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200110205647.311023-1-jlayton@kernel.org \
    --to=jlayton@kernel.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=idryomov@gmail.com \
    --cc=pdonnell@redhat.com \
    --cc=sage@redhat.com \
    --cc=zyan@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.