From: Jeff Layton <jlayton@kernel.org>
To: Luis Henriques <lhenriques@suse.de>
Cc: ceph-devel@vger.kernel.org, pdonnell@redhat.com, idryomov@gmail.com
Subject: Re: [PATCH v3] ceph: dump info about cap flushes when we're waiting too long for them
Date: Fri, 30 Jul 2021 09:32:44 -0400
Message-ID: <8d91e032b65b06807c2ef07fee2590e5a0adad4d.camel@kernel.org>
In-Reply-To: <87zgu4m7un.fsf@suse.de>

On Fri, 2021-07-30 at 11:09 +0100, Luis Henriques wrote:
> Jeff Layton <jlayton@kernel.org> writes:
>
> > We've had some cases of hung umounts in teuthology testing. It looks
> > like client is waiting for cap flushes to complete, but they aren't.
> >
> > Add a field to the inode to track the highest cap flush tid seen for
> > that inode. Also, add a backpointer to the inode to the ceph_cap_flush
> > struct.
> >
> > Change wait_caps_flush to wait 60s, and then dump info about the
> > condition of the list.
> >
> > Also, print pr_info messages if we end up dropping a FLUSH_ACK for an
> > inode onto the floor.
> >
> > Reported-by: Patrick Donnelly <pdonnell@redhat.com>
> > URL: https://tracker.ceph.com/issues/51279
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/ceph/caps.c       | 17 +++++++++++++++--
> >  fs/ceph/inode.c      |  1 +
> >  fs/ceph/mds_client.c | 31 +++++++++++++++++++++++++++++--
> >  fs/ceph/super.h      |  2 ++
> >  4 files changed, 47 insertions(+), 4 deletions(-)
> >
> > v3: more debugging has shown the client waiting on FLUSH_ACK messages
> > that seem to never have come. Add some new printks if we end up
> > dropping a FLUSH_ACK onto the floor.
>
> Since you're adding debug printks, would it be worth to also add one in
> mds_dispatch(), when __verify_registered_session(mdsc, s) < 0?
>
> It's a wild guess, but the FLUSH_ACK could be dropped in that case too.
> Not that I could spot any issue there, but since this seems to be
> happening during umount...
>
> Cheers,

Good point. I had looked at that case and had sort of dismissed it in
this situation, but you're probably right. I've added a similar pr_info
for that case and pushed it to the repo after a little testing here. I
won't bother re-posting it though since the change is trivial.

Thanks,
--
Jeff Layton <jlayton@kernel.org>