linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Luis Henriques <lhenriques@suse.com>,
	Ilya Dryomov <idryomov@gmail.com>,
	ceph-devel@vger.kernel.org,
	linux-kernel <linux-kernel@vger.kernel.org>,
	fstests <fstests@vger.kernel.org>,
	Dave Chinner <dchinner@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Miklos Szeredi <miklos@szeredi.hu>,
	"J. Bruce Fields" <bfields@redhat.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] ceph: don't return -ESTALE if there's still an open file
Date: Sat, 16 May 2020 08:10:27 -0400	[thread overview]
Message-ID: <61e877867de7e683d88e2cc5d0945b0aecce1d2a.camel@kernel.org> (raw)
In-Reply-To: <CAOQ4uxhPzcX6Ti8UX4WOg9gJJn+YTuk9OgU80d9imoJ2QdXaWQ@mail.gmail.com>

On Sat, 2020-05-16 at 09:58 +0300, Amir Goldstein wrote:
> [pulling in nfs guys]
> 
> > > Questions:
> > > 1. Does sync() result in fully purging inodes on MDS?
> > 
> > I don't think so, but again, that code is not trivial to follow. I do
> > know that the MDS keeps around a "strays directory" which contains
> > unlinked inodes that are lazily cleaned up. My suspicion is that it's
> > satisfying lookups out of this cache as well.
> > 
> > Which may be fine...the MDS is not required to be POSIX compliant after
> > all. Only the fs drivers are.
> > 
> > > 2. Is i_nlink synchronized among nodes on deferred delete?
> > > IWO, can inode come back from the dead on client if another node
> > > has linked it before i_nlink 0 was observed?
> > 
> > No, that shouldn't happen. The caps mechanism should ensure that it
> > can't be observed by other clients until after the change.
> > 
> > That said, Luis' current patch doesn't ensure we have the correct caps
> > to check the i_nlink. We may need to add that in before we can roll with
> > this.
> > 
> > > 3. Can an NFS client be "migrated" from one ceph node to another
> > > with an open but unlinked file?
> > > 
> > 
> > No. Open files in ceph are generally per-client. You can't pass around a
> > fd (or equivalent).
> 
> Not sure we are talking about the same thing.
> It's not ceph fd that is being passed around, it's the NFS client's fd.
> If there is no case where NFS client would access ceph client2
> with a handle it got from ceph client1, then there is no reason to satisfy
> an open_by_handle() call for an unlinked file on client2.
> If file was opened on client1, it may be "legal" to satisfy open_by_handle()
> on client2, but I don't see how stopping to satisfy that can break anything.
> 

Not currently, but eventually we may need to allow for that...which is
another good reason to handle this on the (Ceph) client instead, as the
client can then decide whether treat an unlinked file as an ESTALE
return based on its needs.

> > > I think what the test is trying to verify is that a "fully purged" inodes
> > > cannot be opened db handle, but there is no standard way to verify
> > > "fully purged", so the test resorts to sync() + another sync() + drop_caches.
> > > 
> > 
> > Got it. That makes sense.
> > 
> > > Is there anything else that needs to be done on ceph in order to flush
> > > all deferred operations from this client to MDS?
> > 
> > I'm leaning toward something like what Luis has proposed, but adding in
> > appropriate cap handling.
> 
> That sounds fine.
> 
> > Basically, we have to consider the situation where one client has the
> > file open and another client unlinks it, and then does an
> > open_by_handle_at. Should it succeed in that case?
> > 
> > I can see arguments for either way.
> 
> IMO, the behavior should be defined for a client that has the file open.
> For the rest it does not really matter.
> 
> My argument is that is it easy to satisfy the test's expectation and conform
> to behavior of other filesystems without breaking any real workload.
> 
> To satisfy the test's expectation, you only need to change behavior of ceph
> client in i_count 1 use case. If i_count is 1 need to take all relevant caps
> to check that i_nlink is "globally" 0, before returning ESTALE.
> But if i_count > 1, no need to bother.

Makes sense. Thanks.

-- 
Jeff Layton <jlayton@kernel.org>


      reply	other threads:[~2020-05-16 12:11 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200514111453.GA99187@suse.com>
     [not found] ` <8497fe9a11ac1837813ee5f14b6ebae8fa6bf707.camel@kernel.org>
     [not found]   ` <20200514124845.GA12559@suse.com>
     [not found]     ` <4e5bf0e3bf055e53a342b19d168f6cf441781973.camel@kernel.org>
     [not found]       ` <CAOQ4uxhireZBRvcPQzTS8yOoO4gQt78M0ktZo-9yQ-zcaLZbow@mail.gmail.com>
     [not found]         ` <20200515111548.GA54598@suse.com>
     [not found]           ` <61b1f19edcc349641b5383c2ac70cbf9a15ba4bd.camel@kernel.org>
     [not found]             ` <CAOQ4uxiWZoSj3Pjwskd_hu-ErV9096hLt13CDcW6nEEvcwDNVA@mail.gmail.com>
     [not found]               ` <e227d42fdc91587e34bc64ac252970d39d9b4eee.camel@kernel.org>
2020-05-16  6:58                 ` [PATCH] ceph: don't return -ESTALE if there's still an open file Amir Goldstein
2020-05-16 12:10                   ` Jeff Layton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=61e877867de7e683d88e2cc5d0945b0aecce1d2a.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=bfields@redhat.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=dchinner@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=hch@lst.de \
    --cc=idryomov@gmail.com \
    --cc=lhenriques@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).