linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Luis Henriques <lhenriques@suse.com>
To: "Jeff Layton" <jlayton@kernel.org>
Cc: <ceph-devel@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] ceph: allow object copies across different filesystems in the same cluster
Date: Mon, 09 Sep 2019 11:18:25 +0100	[thread overview]
Message-ID: <87k1ahojri.fsf@suse.com> (raw)
In-Reply-To: <afce4ec5a02c17121e615423a2e7e1b664aa77a3.camel@kernel.org> (Jeff Layton's message of "Sat, 07 Sep 2019 09:53:29 -0400")

"Jeff Layton" <jlayton@kernel.org> writes:

> On Fri, 2019-09-06 at 17:26 +0100, Luis Henriques wrote:
>> "Jeff Layton" <jlayton@kernel.org> writes:
>> 
>> > On Fri, 2019-09-06 at 14:57 +0100, Luis Henriques wrote:
>> > > OSDs are able to perform object copies across different pools.  Thus,
>> > > there's no need to prevent copy_file_range from doing remote copies if the
>> > > source and destination superblocks are different.  Only return -EXDEV if
>> > > they have different fsid (the cluster ID).
>> > > 
>> > > Signed-off-by: Luis Henriques <lhenriques@suse.com>
>> > > ---
>> > >  fs/ceph/file.c | 23 +++++++++++++++++++----
>> > >  1 file changed, 19 insertions(+), 4 deletions(-)
>> > > 
>> > > Hi!
>> > > 
>> > > I've finally managed to run some tests using multiple filesystems, both
>> > > within a single cluster and also using two different clusters.  The
>> > > behaviour of copy_file_range (with this patch, of course) was what I
>> > > expected:
>> > > 
>> > >   - Object copies work fine across different filesystems within the same
>> > >     cluster (even with pools in different PGs);
>> > >   - -EXDEV is returned if the fsid is different
>> > > 
>> > > (OT: I wonder why the cluster ID is named 'fsid'; historical reasons?
>> > >  Because this is actually what's in ceph.conf fsid in "[global]"
>> > >  section.  Anyway...)
>> > > 
>> > > So, what's missing right now is (I always mention this when I have the
>> > > opportunity!) to merge https://github.com/ceph/ceph/pull/25374 :-)
>> > > And add the corresponding support for the new flag to the kernel
>> > > client, of course.
>> > > 
>> > > Cheers,
>> > > --
>> > > Luis
>> > > 
>> > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c
>> > > index 685a03cc4b77..88d116893c2b 100644
>> > > --- a/fs/ceph/file.c
>> > > +++ b/fs/ceph/file.c
>> > > @@ -1904,6 +1904,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,
>> > >  	struct ceph_inode_info *src_ci = ceph_inode(src_inode);
>> > >  	struct ceph_inode_info *dst_ci = ceph_inode(dst_inode);
>> > >  	struct ceph_cap_flush *prealloc_cf;
>> > > +	struct ceph_fs_client *src_fsc = ceph_inode_to_client(src_inode);
>> > >  	struct ceph_object_locator src_oloc, dst_oloc;
>> > >  	struct ceph_object_id src_oid, dst_oid;
>> > >  	loff_t endoff = 0, size;
>> > > @@ -1915,8 +1916,22 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,
>> > >  
>> > >  	if (src_inode == dst_inode)
>> > >  		return -EINVAL;
>> > > -	if (src_inode->i_sb != dst_inode->i_sb)
>> > > -		return -EXDEV;
>> > > +	if (src_inode->i_sb != dst_inode->i_sb) {
>> > > +		struct ceph_fs_client *dst_fsc = ceph_inode_to_client(dst_inode);
>> > > +
>> > > +		if (!src_fsc->client->have_fsid || !dst_fsc->client->have_fsid) {
>> > > +			dout("No fsid in a fs client\n");
>> > > +			return -EXDEV;
>> > > +		}
>> > 
>> > In what situation is there no fsid? Old cluster version?
>> > 
>> > If there is no fsid, can we take that to indicate that there is only a
>> > single filesystem possible in the cluster and that we should attempt the
>> > copy anyway?
>> 
>> TBH I'm not sure if 'have_fsid' can ever be 'false' in this call.  It is
>> set to 'true' when handling the monmap, and it's never changed back to
>> 'false'.  Since I don't think copy_file_range will be invoked *before*
>> we get the monmap, it should be safe to drop this check.  Maybe it could
>> be replaced it by a WARN_ON()?
>> 
>
> Yeah. I think the have_fsid flag just allows us to avoid the pr_err msg
> in ceph_check_fsid when the client is initially created. Maybe there is
> some better way to achieve that?

I guess the struct ceph_fsid embedded in the client(s) could be changed
into a pointer initialized to NULL (and later dynamically allocated).
Then, the have_fsid check could be replaced by a NULL check.  Not sure
if it would bring any real benefit, though.  Want me to give that a try?
Or maybe I misunderstood you question.

> In any case, I'd just drop that condition here.

Ok, I'll send v2 in a second, without this check.

[ BTW, looks like my initial post didn't made it into vger.kernel.org.
  It was probably dropped because I screwed-up the 'To:' field in my
  email (no idea how I did that, TBH). ]

Cheers,
-- 
Luis

  reply	other threads:[~2019-09-09 10:18 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20190906135750.29543-1-lhenriques@suse.com>
     [not found] ` <30b09cb015563913d073c488c8de8ba0cceedd7b.camel@kernel.org>
2019-09-06 16:26   ` [PATCH] ceph: allow object copies across different filesystems in the same cluster Luis Henriques
2019-09-07 13:53     ` Jeff Layton
2019-09-09 10:18       ` Luis Henriques [this message]
2019-09-09 10:28         ` [PATCH v2] " Luis Henriques
2019-09-09 10:35           ` Jeff Layton
2019-09-09 11:05             ` Jeff Layton
2019-09-09 13:55               ` Luis Henriques
2019-09-09 15:21                 ` Jeff Layton
2019-09-09 11:15             ` Luis Henriques
2019-09-09 22:22               ` Gregory Farnum
2019-09-10 10:45                 ` Luis Henriques
2019-09-09 10:51           ` Ilya Dryomov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k1ahojri.fsf@suse.com \
    --to=lhenriques@suse.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=jlayton@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).