ceph-devel.vger.kernel.org archive mirror
From: Jeff Layton <jlayton@kernel.org>
To: Eric Biggers <ebiggers@kernel.org>
Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-fscrypt@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [RFC PATCH v2 01/18] vfs: export new_inode_pseudo
Date: Wed, 09 Sep 2020 15:24:02 -0400
Message-ID: <5471b5436b71e860da5bb3bb76a2a8bd0a61387e.camel@kernel.org>
In-Reply-To: <20200909184912.GA425889@gmail.com>

On Wed, 2020-09-09 at 11:49 -0700, Eric Biggers wrote:
> [+Cc Al]
> 
> On Wed, Sep 09, 2020 at 12:51:02PM -0400, Jeff Layton wrote:
> > > > No, more like:
> > > > 
> > > > Syscall					Workqueue
> > > > ------------------------------------------------------------------------------
> > > > 1. allocate an inode
> > > > 2. determine we can do an async create
> > > >    and allocate an inode number for it
> > > > 3. hash the inode (must set I_CREATING
> > > >    if we allocated with new_inode()) 
> > > > 4. issue the call to the MDS
> > > > 5. finish filling out the inode
> > > > 6.					MDS reply comes in, and workqueue thread
> > > > 					looks up new inode (-ESTALE)
> > > > 7. unlock_new_inode()
> > > > 
> > > > 
> > > > Because 6 happens before 7 in this case, we get an ESTALE on that
> > > > lookup.
> > > 
> > > How is ESTALE at (6) possible?  (3) will set I_NEW on the inode when inserting
> > > it into the inode hash table.  Then (6) will wait for I_NEW to be cleared before
> > > returning the inode.  (7) will clear I_NEW and I_CREATING together.
> > > 
> > 
> > Long call chain, but:
> > 
> > ceph_fill_trace
> >    ceph_get_inode
> >       iget5_locked
> >          ilookup5 (then ilookup5_nowait)
> >             find_inode
> > 
> > 
> > ...and find_inode (like find_inode_fast) does:
> > 
> >                 if (unlikely(inode->i_state & I_CREATING)) {
> >                         spin_unlock(&inode->i_lock);
> >                         return ERR_PTR(-ESTALE);
> >                 }
> 
> Why does ilookup5() not wait for I_NEW to be cleared if the inode has
> I_NEW|I_CREATING, but it does wait for I_NEW to be cleared if I_NEW is set on
> its own?  That seems like a bug.
> 

Funny, I asked Al the same thing on IRC the other day:

23:28 < jlayton> viro: wondering if there is a potential race with I_CREATING in find_inode. 
                 Seems like you could have 2 tasks racing in calls to iget5_locked for the 
                 same inode. One creates an inode and starts instantiating it, and the second 
                 one gets NULL back because I_CREATING is set.
23:30 < viro> jlayton: where would I_CREATING come from?
23:30 < viro> it's set on insert_inode_locked() and similar paths
23:31 < viro> where you want iget5_locked() to fuck off and eat ESTALE
23:31 < jlayton> ok, right -- I was trying to pass an inode from new_inode to inode_insert5
23:32 < viro> seeing that it's been asked for an inode number that did _not_ exist until just 
              now (we'd just allocated it)

The assumption is that we'll never go looking for an inode until after
I_NEW is cleared. In the case of an asynchronous create in ceph though,
we may do exactly that if the reply comes back very quickly.
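
To make the ordering concrete, here is a minimal userspace sketch of the
race. It is not kernel code: the flag values, struct model_inode,
model_lookup(), model_unlock_new_inode() and replay_async_create() are all
hypothetical names made up for illustration. It only models the decision
described above: a hashed inode with I_CREATING set fails the lookup with
ESTALE, one with only I_NEW set would make the caller wait, and
unlock_new_inode() clears both bits together.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the i_state bits; the values are assumptions. */
#define I_NEW      (1u << 0)
#define I_CREATING (1u << 1)

/* Hypothetical, heavily reduced model of a hashed inode. */
struct model_inode {
	unsigned long ino;
	unsigned int state;
};

enum lookup_result {
	LOOKUP_MISS,		/* nothing hashed under that number */
	LOOKUP_FOUND,		/* fully instantiated inode returned */
	LOOKUP_WOULD_WAIT,	/* caller would sleep until I_NEW clears */
	LOOKUP_ESTALE		/* lookup refused because of I_CREATING */
};

/* Models the lookup decision: I_CREATING is checked before I_NEW. */
static enum lookup_result model_lookup(const struct model_inode *hashed,
				       unsigned long ino)
{
	if (hashed == NULL || hashed->ino != ino)
		return LOOKUP_MISS;
	if (hashed->state & I_CREATING)
		return LOOKUP_ESTALE;
	if (hashed->state & I_NEW)
		return LOOKUP_WOULD_WAIT;
	return LOOKUP_FOUND;
}

/* Models step 7: I_NEW and I_CREATING are cleared together. */
static void model_unlock_new_inode(struct model_inode *inode)
{
	assert(inode != NULL);
	inode->state &= ~(I_NEW | I_CREATING);
}

/*
 * Replays the async create timeline and returns what the workqueue's
 * lookup (step 6) sees, depending on whether the MDS reply is handled
 * before or after the syscall side reaches step 7.
 */
static enum lookup_result replay_async_create(int reply_before_unlock)
{
	struct model_inode inode = { .ino = 0x1000, .state = 0 };
	enum lookup_result res;

	inode.state |= I_NEW | I_CREATING;		/* step 3: hash it */
	if (reply_before_unlock) {
		res = model_lookup(&inode, inode.ino);	/* step 6 */
		model_unlock_new_inode(&inode);		/* step 7 */
	} else {
		model_unlock_new_inode(&inode);		/* step 7 */
		res = model_lookup(&inode, inode.ino);	/* step 6 */
	}
	return res;
}
```

In this model, replay_async_create(1) is the fast-reply case and ends in
LOOKUP_ESTALE, while replay_async_create(0) is the ordering the VFS assumes
and finds the inode normally.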

-- 
Jeff Layton <jlayton@kernel.org>

