From: Josef Bacik <jbacik@fb.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org, darrick.wong@oracle.com, kernel-team@fb.com
Subject: Re: [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback
Date: Fri, 3 Mar 2017 16:34:16 -0500	[thread overview]
Message-ID: <1488576856.9307.13.camel@fb.com> (raw)
In-Reply-To: <20170303204902.GF21245@bfoster.bfoster>

On Fri, 2017-03-03 at 15:49 -0500, Brian Foster wrote:
> On Fri, Mar 03, 2017 at 09:46:44AM -0500, Josef Bacik wrote:
> > 
> > While testing nbd disconnects I kept hitting the following hang
> > 
> > Call Trace:
> >  schedule+0x35/0x80
> >  xfs_ail_push_all_sync+0xa3/0xe0
> >  ? prepare_to_wait_event+0x100/0x100
> >  xfs_unmountfs+0x57/0x190
> >  xfs_fs_put_super+0x32/0x90
> >  generic_shutdown_super+0x6f/0xf0
> >  kill_block_super+0x27/0x70
> >  deactivate_locked_super+0x3e/0x70
> >  deactivate_super+0x46/0x60
> >  cleanup_mnt+0x3f/0x80
> >  __cleanup_mnt+0x12/0x20
> >  task_work_run+0x86/0xb0
> >  exit_to_usermode_loop+0x6d/0x96
> >  do_syscall_64+0x8b/0xa0
> >  entry_SYSCALL64_slow_path+0x25/0x25
> > 
> > After some digging around I found that there was a log item on the
> > AIL with a callback of xfs_iflush_done.  A printk confirmed that the
> > buffer at the call to xfs_buf_attach_iodone in xfs_iflush_int already
> > had XBF_ASYNC set, which means on error we do not call
> > xfs_buf_do_callbacks.  That leaves the log item on the AIL, which
> > causes us to hang on unmount.  I assume the block has XBF_ASYNC set
> > because we did a readahead on it; it doesn't really need to have
> > XBF_ASYNC set at this point, as we do actually care about what
> > happens to the buffer once IO is complete.  With this patch my hang
> > no longer happens.
> > 
> > Signed-off-by: Josef Bacik <jbacik@fb.com>
> > ---
> >  fs/xfs/xfs_buf_item.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> > index 2975cb2..24fcb67 100644
> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -1016,6 +1016,11 @@ xfs_buf_attach_iodone(
> >  
> >  	ASSERT(bp->b_iodone == NULL ||
> >  	       bp->b_iodone == xfs_buf_iodone_callbacks);
> > +	/*
> > +	 * Somebody now cares about the fate of this buffer, clear
> > +	 * XBF_ASYNC so that the iodone callback actually gets called.
> > +	 */
> > +	bp->b_flags &= ~XBF_ASYNC;
> XBF_ASYNC generally describes the type of I/O being submitted for a
> particular buffer (blocking or non-blocking), so whether it is set is
> kind of outside the scope of xfs_iflush(). It really depends on how
> the caller ultimately wants to write the buffer. For example, if we
> get here via xfsaild_push()->xfs_inode_item_push(), the latter will do
> xfs_buf_delwri_queue() and the former will eventually call
> xfs_buf_delwri_submit_nowait(), which submits async I/O and so will
> re-add XBF_ASYNC.
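The round trip described above can be sketched as a minimal userspace model. The struct, flag value, and function names here are illustrative stand-ins for the kernel code, not actual XFS source: even if xfs_buf_attach_iodone() cleared XBF_ASYNC, the delwri submission path would set it again before the I/O is issued.

```c
/* Minimal userspace model (NOT actual XFS code) of the XBF_ASYNC
 * round trip: clearing the flag at attach time does not stick,
 * because async delwri submission re-adds it. */
#include <assert.h>

#define XBF_ASYNC (1u << 4)   /* illustrative bit value, not the kernel's */

struct fake_buf {
	unsigned int b_flags;
};

/* Models the patch's proposed change to xfs_buf_attach_iodone(). */
static void attach_iodone_clears_async(struct fake_buf *bp)
{
	bp->b_flags &= ~XBF_ASYNC;
}

/* Models xfs_buf_delwri_submit_nowait(), which submits async I/O
 * and therefore sets XBF_ASYNC regardless of any earlier clear. */
static void delwri_submit_nowait(struct fake_buf *bp)
{
	bp->b_flags |= XBF_ASYNC;
}
```

Driving the two steps in order shows why the proposed clear has no lasting effect on the aild push path.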
> 
> It is interesting if this prevents your problem; I can't quite make
> out why that would be. The other xfs_iflush() caller is reclaim, which
> calls xfs_bwrite(), which itself clears XBF_ASYNC before submission.
> 
> What I'm guessing you're referring to above wrt not calling
> xfs_buf_do_callbacks() is the async check in
> xfs_buf_iodone_callback_error(), which is only relevant on I/O error.
> If the buffer I/O succeeds, we should always invoke the iodone
> callbacks regardless of async state. Do you observe otherwise, or does
> this I/O indeed fail?
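The dispatch logic Brian describes can be modeled roughly as follows; this is a hedged userspace sketch with made-up names (fake_buf, iodone_callbacks), not XFS source. The point is that the XBF_ASYNC check only matters in the error path, where an async buffer is resubmitted instead of having its callbacks invoked:

```c
/* Rough userspace sketch (NOT XFS source) of the iodone dispatch:
 * on success the callbacks always run; only a failed async buffer
 * takes the retry path and skips them. */
#include <assert.h>

#define XBF_ASYNC (1u << 4)      /* illustrative bit value */

struct fake_buf {
	unsigned int b_flags;
	int b_error;             /* nonzero on I/O failure */
	int callbacks_ran;       /* stands in for xfs_buf_do_callbacks() */
	int resubmitted;
};

/* Models the shape of xfs_buf_iodone_callbacks(): run the callbacks
 * unless the error handler decides to retry the buffer instead. */
static void iodone_callbacks(struct fake_buf *bp)
{
	if (bp->b_error && (bp->b_flags & XBF_ASYNC)) {
		/* error path: async buffer is resubmitted, callbacks skipped */
		bp->resubmitted = 1;
		return;
	}
	bp->callbacks_ran = 1;
}
```

Under this model a successful async I/O still runs its callbacks, which is why the failure (or not) of the I/O is the key question above.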
> 
> Anyway, if we change the buffer flags simply to control the behavior
> of iodone_callback_error(), we've basically decided to treat an
> otherwise async I/O as sync, with the assumption that the submitter is
> going to handle the I/O error. As mentioned above, we don't really
> have that information here, so that is a dangerous assumption. At the
> very least, it bypasses all of the configurable error handling
> infrastructure that is specifically in place for async-submitted
> buffers.
> 
> I'm wondering if you're hitting the flush locked inode retry problem
> that Carlos has been working on. The problem is basically that if the
> I/O fails on a buffer that backs a flush locked inode, we never retry
> the buffer and xfsaild basically spins on pushing the inode. What
> happens if you enable xfs tracepoints while the fs is hung?

Sorry, for some reason my reply to linux-xfs got eaten.  It made it to
Darrick, but vger is being weird (sometimes it fails to get an email,
sometimes it works fine).

So yes, my patch doesn't actually work; I just didn't notice that the
problem had happened until after I sent the patch.

I'm doing this on a 4.6 kernel.  I haven't actually tried it on an
upstream kernel, but my original assumption was that it was still
possible to hit on an upstream kernel.  Basically, on 4.6 we _always_
resubmit the IO once if it fails, and then just drop the buffer if
we've already resubmitted once.  We never do the xfs_buf_do_callbacks()
call, which seems wrong since, as you say, basically everything will
set XBF_ASYNC.

What I've done to fix it for 4.6 was, in the second case where we just
drop the buffer, to go ahead and do the xfs_buf_do_callbacks() call and
then clear the fspriv and iodone callback on the buffer, and that
actually made the problem go away.  Looking at the upstream code, if
it's a permanent IO error we still do the callbacks, so I think the
problem I've hit has already been fixed upstream.  I'll verify next
week just to be sure, since I can usually hit the problem in less than
5 minutes with my reproducer.  Thanks,

Josef
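The 4.6 workaround described above can be sketched as a hypothetical userspace model. The names (fake_buf, handle_write_error, item_on_ail) and structure are illustrative assumptions, not the actual kernel code: when the error path decides to drop a buffer after one resubmission, run the iodone callbacks first and then detach them, so the log item comes off the AIL instead of lingering until unmount hangs.

```c
/* Hypothetical sketch (NOT kernel code) of the described 4.6 fix:
 * on the final drop, run the callbacks and then detach them so the
 * log item does not stay stuck on the AIL. */
#include <assert.h>
#include <stddef.h>

struct fake_buf;
typedef void (*iodone_fn)(struct fake_buf *);

struct fake_buf {
	void *b_fspriv;          /* attached log item(s) */
	iodone_fn b_iodone;      /* e.g. xfs_buf_iodone_callbacks */
	int retries;
	int item_on_ail;         /* the symptom: item stuck on the AIL */
};

static void do_callbacks(struct fake_buf *bp)
{
	bp->item_on_ail = 0;     /* callbacks pull the item off the AIL */
}

/* Error path after a failed write, with the workaround applied. */
static void handle_write_error(struct fake_buf *bp)
{
	if (bp->retries++ == 0)
		return;          /* first failure: resubmit (not modeled) */

	/* second failure: instead of silently dropping the buffer,
	 * run the callbacks and then detach them */
	do_callbacks(bp);
	bp->b_fspriv = NULL;
	bp->b_iodone = NULL;
}
```

With the workaround, the second failure clears the AIL item rather than leaving it behind, which matches the observed fix for the unmount hang.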

Thread overview: 12+ messages
2017-03-03 14:46 [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback Josef Bacik
2017-03-03 20:49 ` Brian Foster
2017-03-03 21:34   ` Josef Bacik [this message]
2017-03-03 22:20     ` Brian Foster
2017-03-04  1:26       ` Josef Bacik
2017-03-04 13:58         ` Brian Foster
2017-03-06 19:00           ` Josef Bacik
2017-03-07  1:22             ` Dave Chinner
2017-03-07  2:37               ` Josef Bacik
2017-03-07 15:56               ` Josef Bacik
2017-03-07 16:12                 ` Brian Foster
2017-04-03  9:19                   ` Carlos Maiolino
