From: "Yan, Zheng" <ukernel@gmail.com>
To: Ilya Dryomov <idryomov@gmail.com>
Cc: Jeff Layton <jlayton@kernel.org>,
	ceph-devel <ceph-devel@vger.kernel.org>,
	Patrick Donnelly <pdonnell@redhat.com>
Subject: Re: [RFC PATCH 0/4] ceph: fix spurious recover_session=clean errors
Date: Tue, 29 Sep 2020 18:44:38 +0800
Message-ID: <CAAM7YA=bo-pdnLuxFAyChtZCoP6VZ3oUJEX_+Sn5r6i6bO_+8Q@mail.gmail.com>
In-Reply-To: <CAOi1vP9Nz2Art=rq06qBuU3DvKzZs+RR7pf+qsGxYZkrbSB-1Q@mail.gmail.com>

On Tue, Sep 29, 2020 at 4:55 PM Ilya Dryomov <idryomov@gmail.com> wrote:
>
> On Tue, Sep 29, 2020 at 10:28 AM Yan, Zheng <ukernel@gmail.com> wrote:
> >
> > On Fri, Sep 25, 2020 at 10:08 PM Jeff Layton <jlayton@kernel.org> wrote:
> > >
> > > Ilya noticed that he would get spurious EACCES errors on calls done just
> > > after blocklisting the client on mounts with recover_session=clean. The
> > > session would get marked as REJECTED and that caused in-flight calls to
> > > die with EACCES. This patchset seems to smooth over the problem, but I'm
> > > not fully convinced it's the right approach.
> > >
> >
> > The root cause is that the client does not recover the session instantly
> > after getting rejected by the MDS. Until the session is recovered, the
> > client continues to return errors.
>
> Hi Zheng,
>
> I don't think it's about whether that happens instantly or not.
> In the example from [1], the first "ls" would fail even if issued
> minutes after the session reject message and the reconnect.  From
> the user's POV it is well after the automatic recovery promised by
> recover_session=clean.
>
> [1] https://tracker.ceph.com/issues/47385

Reconnect should close all old sessions. This is likely because the
client didn't detect that it was blacklisted.

>
> Thanks,
>
>                 Ilya
>
> >
> >
> > > The potential issue I see is that the client could take cap references to
> > > do a call on a session that has been blocklisted. We then queue the
> > > message and reestablish the session, but we may not have been granted
> > > the same caps by the MDS at that point.
> > >
> > > If this is a problem, then we probably need to rework it so that we
> > > return a distinct error code in this situation and have the upper layers
> > > issue a completely new mds request (with new cap refs, etc.)
> > >
> > > Obviously, that's a much more invasive approach though, so it would be
> > > nice to avoid that if this would suffice.
> > >
> > > Jeff Layton (4):
> > >   ceph: don't WARN when removing caps due to blocklisting
> > >   ceph: don't mark mount as SHUTDOWN when recovering session
> > >   ceph: remove timeout on allowing reconnect after blocklisting
> > >   ceph: queue request when CLEANRECOVER is set
> > >
> > >  fs/ceph/caps.c       |  2 +-
> > >  fs/ceph/mds_client.c | 10 ++++------
> > >  fs/ceph/super.c      | 13 +++++++++----
> > >  fs/ceph/super.h      |  1 -
> > >  4 files changed, 14 insertions(+), 12 deletions(-)
> > >
> > > --
> > > 2.26.2
> > >

