ceph-devel.vger.kernel.org archive mirror
From: Ilya Dryomov <idryomov@gmail.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: "Yan, Zheng" <ukernel@gmail.com>,
	Patrick Donnelly <pdonnell@redhat.com>,
	Xiubo Li <xiubli@redhat.com>,
	Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: [PATCH] ceph: check session state after bumping session->s_seq
Date: Mon, 12 Oct 2020 18:00:32 +0200
Message-ID: <CAOi1vP_xnT8E1Ojex_OgCDDJFDL7YuanUmqiErxjE8JwzZMJ8w@mail.gmail.com>
In-Reply-To: <20201012151326.310268-1-jlayton@kernel.org>

On Mon, Oct 12, 2020 at 5:13 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> Some messages sent by the MDS entail a session sequence number
> increment, and the MDS will drop certain types of requests on the floor
> when the sequence numbers don't match.
>
> In particular, a REQUEST_CLOSE message can cross with one of the
> sequence-morphing messages from the MDS, which can cause the client to
> stall, waiting for a response that will never come.
>
> Originally, this meant a delay of up to 5s before the recurring workqueue
> job kicked in and resent the request, but a recent change made it so
> that the client would never resend, causing a 60s stall when unmounting
> and sometimes a blocklisting event.
>
> Fix this by checking the session state after bumping the session
> sequence, which should cause a retransmit of the REQUEST_CLOSE when
> this occurs.
>
> URL: https://tracker.ceph.com/issues/47563
> Fixes: fa9967734227 ("ceph: fix potential mdsc use-after-free crash")
> Reported-by: Patrick Donnelly <pdonnell@redhat.com>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/ceph/caps.c       | 1 +
>  fs/ceph/mds_client.c | 1 +
>  fs/ceph/quota.c      | 1 +
>  fs/ceph/snap.c       | 1 +
>  4 files changed, 4 insertions(+)
>
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index c00abd7eefc1..ac822c74baea 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -4072,6 +4072,7 @@ void ceph_handle_caps(struct ceph_mds_session *session,
>
>         mutex_lock(&session->s_mutex);
>         session->s_seq++;
> +       check_session_state(session);
>         dout(" mds%d seq %lld cap seq %u\n", session->s_mds, session->s_seq,
>              (unsigned)seq);
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 0190555b1f9e..69f529d894e6 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -4238,6 +4238,7 @@ static void handle_lease(struct ceph_mds_client *mdsc,
>
>         mutex_lock(&session->s_mutex);
>         session->s_seq++;
> +       check_session_state(session);
>
>         if (!inode) {
>                 dout("handle_lease no inode %llx\n", vino.ino);
> diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
> index 83cb4f26b689..a09667ee83c1 100644
> --- a/fs/ceph/quota.c
> +++ b/fs/ceph/quota.c
> @@ -54,6 +54,7 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
>         /* increment msg sequence number */
>         mutex_lock(&session->s_mutex);
>         session->s_seq++;
> +       check_session_state(session);
>         mutex_unlock(&session->s_mutex);
>
>         /* lookup inode */
> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
> index 0da39c16dab4..f1e73a65f4a5 100644
> --- a/fs/ceph/snap.c
> +++ b/fs/ceph/snap.c
> @@ -874,6 +874,7 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
>
>         mutex_lock(&session->s_mutex);
>         session->s_seq++;
> +       check_session_state(session);
>         mutex_unlock(&session->s_mutex);
>
>         down_write(&mdsc->snap_rwsem);
> --
> 2.26.2
>

A new helper just for

   if (s->s_state == CEPH_MDS_SESSION_CLOSING) {
           dout("resending session close request for mds%d\n",
                           s->s_mds);
           request_close_session(s);
   }

would be more precise IMO.  It could also check the return value of
request_close_session() and log any error.
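
To make the suggestion concrete, here is a rough user-space sketch of what
such a helper might look like.  The helper name resend_close_if_closing()
is hypothetical, the struct and enum are cut-down stand-ins for the kernel
definitions, request_close_session() is stubbed out, and the "negative
value means error" return convention is an assumption.  In the kernel this
would use dout()/pr_err() and run with s_mutex held.

```c
#include <errno.h>
#include <stdio.h>

/* User-space stand-ins for the kernel definitions (values are assumptions). */
enum { CEPH_MDS_SESSION_OPEN = 1, CEPH_MDS_SESSION_CLOSING = 2 };

struct ceph_mds_session {
	int s_mds;                 /* MDS rank this session talks to */
	int s_state;               /* one of the CEPH_MDS_SESSION_* values */
	unsigned long long s_seq;  /* session sequence number */
};

/* Test knobs for the stub below; not part of any real API. */
static int stub_close_result = 1;
static int stub_close_calls;

/*
 * Stub: the real function builds and sends a REQUEST_CLOSE message.
 * Assumed here to return a negative errno on failure.
 */
static int request_close_session(struct ceph_mds_session *s)
{
	(void)s;
	stub_close_calls++;
	return stub_close_result;
}

/*
 * Hypothetical helper: if a close is already pending on this session,
 * resend the close request and log any failure.  Returns 0 when there
 * is nothing to do, otherwise the result of request_close_session().
 */
static int resend_close_if_closing(struct ceph_mds_session *s)
{
	int ret;

	if (s->s_state != CEPH_MDS_SESSION_CLOSING)
		return 0;

	fprintf(stderr, "resending session close request for mds%d\n",
		s->s_mds);
	ret = request_close_session(s);
	if (ret < 0)
		fprintf(stderr, "mds%d: resending session close failed: %d\n",
			s->s_mds, ret);
	return ret;
}
```

Each of the four call sites would then invoke the helper right after
session->s_seq++, in place of check_session_state().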

Thanks,

                Ilya
