From: Jeff Layton <jlayton@kernel.org>
To: "Yan, Zheng" <ukernel@gmail.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>,
	Ilya Dryomov <idryomov@gmail.com>, Sage Weil <sage@redhat.com>,
	Zheng Yan <zyan@redhat.com>,
	Patrick Donnelly <pdonnell@redhat.com>,
	Xiubo Li <xiubli@redhat.com>
Subject: Re: [PATCH v5 03/12] ceph: add infrastructure for waiting for async create to complete
Date: Thu, 20 Feb 2020 08:01:39 -0500
Message-ID: <89ba8857af29c0e877d22e2188f86142f316454a.camel@kernel.org>
In-Reply-To: <CAAM7YAn-bXrOHGrF4O0WY4hB7ZUj7_uCT=qy3NphbNbw15F6hA@mail.gmail.com>

On Thu, 2020-02-20 at 11:32 +0800, Yan, Zheng wrote:
> On Wed, Feb 19, 2020 at 9:27 PM Jeff Layton <jlayton@kernel.org> wrote:
> > When we issue an async create, we must ensure that any later on-the-wire
> > requests involving it wait for the create reply.
> > 
> > Expand i_ceph_flags to be an unsigned long, and add a new bit that
> > MDS requests can wait on. If the bit is set in the inode when sending
> > caps, then don't send it and just return that it has been delayed.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/ceph/caps.c       | 13 ++++++++++++-
> >  fs/ceph/dir.c        |  2 +-
> >  fs/ceph/mds_client.c | 20 +++++++++++++++++++-
> >  fs/ceph/mds_client.h |  7 +++++++
> >  fs/ceph/super.h      |  4 +++-
> >  5 files changed, 42 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> > index d05717397c2a..85e13aa359d2 100644
> > --- a/fs/ceph/caps.c
> > +++ b/fs/ceph/caps.c
> > @@ -511,7 +511,7 @@ static void __cap_delay_requeue(struct ceph_mds_client *mdsc,
> >                                 struct ceph_inode_info *ci,
> >                                 bool set_timeout)
> >  {
> > -       dout("__cap_delay_requeue %p flags %d at %lu\n", &ci->vfs_inode,
> > +       dout("__cap_delay_requeue %p flags 0x%lx at %lu\n", &ci->vfs_inode,
> >              ci->i_ceph_flags, ci->i_hold_caps_max);
> >         if (!mdsc->stopping) {
> >                 spin_lock(&mdsc->cap_delay_lock);
> > @@ -1294,6 +1294,13 @@ static int __send_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap,
> >         int delayed = 0;
> >         int ret;
> > 
> > +       /* Don't send anything if it's still being created. Return delayed */
> > +       if (ci->i_ceph_flags & CEPH_I_ASYNC_CREATE) {
> > +               spin_unlock(&ci->i_ceph_lock);
> > +               dout("%s async create in flight for %p\n", __func__, inode);
> > +               return 1;
> > +       }
> > +
> 
> Maybe it's better to check this in ceph_check_caps().  Other callers
> of __send_cap() shouldn't encounter an inode with an async create
> still in flight.
> 

I've been looking, but what actually guarantees that?

Only ceph_check_caps calls it for UPDATE, but the other two callers call
it for FLUSH. I don't see what prevents the kernel from (e.g.) calling
write_inode before the create reply comes in, particularly if we just
create and then close the file.
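
To make the concern concrete, here is a minimal userspace sketch. It is
not kernel code: struct model_inode, enum cap_op and both model_send_cap*
functions are made up for illustration, and only the bit number is taken
from the patch. It contrasts gating every op in the send path, as the
patch does, with gating only the UPDATE path, which is what a check
confined to the UPDATE caller would look like under the assumption that
a FLUSH caller can run before the create reply arrives:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; only the bit number comes from the patch. */
#define ASYNC_CREATE_BIT 13
#define I_ASYNC_CREATE   (1UL << ASYNC_CREATE_BIT)

enum cap_op { CAP_OP_UPDATE, CAP_OP_FLUSH };

struct model_inode {
	unsigned long flags;	/* stands in for i_ceph_flags */
};

/*
 * Gate in the send path itself: any op is held back while the create
 * is in flight. Returns 1 for "delayed", 0 for "would be sent".
 */
static int model_send_cap(const struct model_inode *ci, enum cap_op op)
{
	assert(ci != NULL);
	(void)op;	/* the check deliberately ignores the op */
	return (ci->flags & I_ASYNC_CREATE) ? 1 : 0;
}

/*
 * Gate only the UPDATE path: a FLUSH issued while the create is still
 * in flight is not held back.
 */
static int model_send_cap_update_only(const struct model_inode *ci,
				      enum cap_op op)
{
	assert(ci != NULL);
	if (op == CAP_OP_UPDATE && (ci->flags & I_ASYNC_CREATE))
		return 1;
	return 0;
}
```

With the bit set, model_send_cap() reports "delayed" for both ops, while
the update-only variant lets the FLUSH through. Whether that second case
can really happen is exactly the open question here.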

As a side note, I still struggle with the fact that there seems to be no
coherent overall description of the cap protocol. What distinguishes a
FLUSH from an UPDATE, for instance? The MDS code and comments seem to
treat them somewhat interchangeably.


> >         held = cap->issued | cap->implemented;
> >         revoking = cap->implemented & ~cap->issued;
> >         retain &= ~revoking;
> > @@ -2250,6 +2257,10 @@ int ceph_fsync(struct file *file, loff_t start, loff_t end, int datasync)
> >         if (datasync)
> >                 goto out;
> > 
> > +       ret = ceph_wait_on_async_create(inode);
> > +       if (ret)
> > +               goto out;
> > +
> >         dirty = try_flush_caps(inode, &flush_tid);
> >         dout("fsync dirty caps are %s\n", ceph_cap_string(dirty));
> > 
> > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > index a87274935a09..5b83bda57056 100644
> > --- a/fs/ceph/dir.c
> > +++ b/fs/ceph/dir.c
> > @@ -752,7 +752,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
> >                 struct ceph_dentry_info *di = ceph_dentry(dentry);
> > 
> >                 spin_lock(&ci->i_ceph_lock);
> > -               dout(" dir %p flags are %d\n", dir, ci->i_ceph_flags);
> > +               dout(" dir %p flags are 0x%lx\n", dir, ci->i_ceph_flags);
> >                 if (strncmp(dentry->d_name.name,
> >                             fsc->mount_options->snapdir_name,
> >                             dentry->d_name.len) &&
> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > index 94d18e643a3d..38eb9dd5062b 100644
> > --- a/fs/ceph/mds_client.c
> > +++ b/fs/ceph/mds_client.c
> > @@ -2730,7 +2730,7 @@ static void kick_requests(struct ceph_mds_client *mdsc, int mds)
> >  int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
> >                               struct ceph_mds_request *req)
> >  {
> > -       int err;
> > +       int err = 0;
> > 
> >         /* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
> >         if (req->r_inode)
> > @@ -2743,6 +2743,24 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
> >                 ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir),
> >                                   CEPH_CAP_PIN);
> > 
> > +       if (req->r_inode) {
> > +               err = ceph_wait_on_async_create(req->r_inode);
> > +               if (err) {
> > +                       dout("%s: wait for async create returned: %d\n",
> > +                            __func__, err);
> > +                       return err;
> > +               }
> > +       }
> > +
> > +       if (!err && req->r_old_inode) {
> > +               err = ceph_wait_on_async_create(req->r_old_inode);
> > +               if (err) {
> > +                       dout("%s: wait for async create returned: %d\n",
> > +                            __func__, err);
> > +                       return err;
> > +               }
> > +       }
> > +
> >         dout("submit_request on %p for inode %p\n", req, dir);
> >         mutex_lock(&mdsc->mutex);
> >         __register_request(mdsc, req, dir);
> > diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
> > index 95ac00e59e66..8043f2b439b1 100644
> > --- a/fs/ceph/mds_client.h
> > +++ b/fs/ceph/mds_client.h
> > @@ -538,4 +538,11 @@ extern void ceph_mdsc_open_export_target_sessions(struct ceph_mds_client *mdsc,
> >  extern int ceph_trim_caps(struct ceph_mds_client *mdsc,
> >                           struct ceph_mds_session *session,
> >                           int max_caps);
> > +static inline int ceph_wait_on_async_create(struct inode *inode)
> > +{
> > +       struct ceph_inode_info *ci = ceph_inode(inode);
> > +
> > +       return wait_on_bit(&ci->i_ceph_flags, CEPH_ASYNC_CREATE_BIT,
> > +                          TASK_INTERRUPTIBLE);
> > +}
> >  #endif
> > diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> > index 3430d7ffe8f7..bfb03adb4a08 100644
> > --- a/fs/ceph/super.h
> > +++ b/fs/ceph/super.h
> > @@ -316,7 +316,7 @@ struct ceph_inode_info {
> >         u64 i_inline_version;
> >         u32 i_time_warp_seq;
> > 
> > -       unsigned i_ceph_flags;
> > +       unsigned long i_ceph_flags;
> >         atomic64_t i_release_count;
> >         atomic64_t i_ordered_count;
> >         atomic64_t i_complete_seq[2];
> > @@ -524,6 +524,8 @@ static inline struct inode *ceph_find_inode(struct super_block *sb,
> >  #define CEPH_I_ERROR_WRITE     (1 << 10) /* have seen write errors */
> >  #define CEPH_I_ERROR_FILELOCK  (1 << 11) /* have seen file lock errors */
> >  #define CEPH_I_ODIRECT         (1 << 12) /* inode in direct I/O mode */
> > +#define CEPH_ASYNC_CREATE_BIT  (13)      /* async create in flight for this */
> > +#define CEPH_I_ASYNC_CREATE    (1 << CEPH_ASYNC_CREATE_BIT)
> > 
> >  /*
> >   * Masks of ceph inode work.
> > --
> > 2.24.1
> > 

-- 
Jeff Layton <jlayton@kernel.org>

