All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Josef Bacik <jbacik@fb.com>
Cc: Brian Foster <bfoster@redhat.com>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	Kernel Team <Kernel-team@fb.com>,
	Carlos Maiolino <cmaiolino@redhat.com>
Subject: Re: [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback
Date: Tue, 7 Mar 2017 12:22:04 +1100	[thread overview]
Message-ID: <20170307012204.GP17542@dastard> (raw)
In-Reply-To: <1488826813.9307.21.camel@fb.com>

On Mon, Mar 06, 2017 at 02:00:13PM -0500, Josef Bacik wrote:
> so you retry once, and the second time through we will return false if
> we are unmounting, which means that xfs_buf_do_callbacks gets called.
> 
> Now you say that xfs_buf_do_callbacks() being called when we errored
> out is dangerous, but I don't understand why.  We have this code
> 
>         if (need_ail) {
>                 struct xfs_log_item *log_items[need_ail];
>                 int i = 0;
>                 spin_lock(&ailp->xa_lock);
>                 for (blip = lip; blip; blip = blip->li_bio_list) {
>                         iip = INODE_ITEM(blip);
>                         if (iip->ili_logged &&
>                             blip->li_lsn == iip->ili_flush_lsn) {
>                                 log_items[i++] = blip;
>                         }
>                         ASSERT(i <= need_ail);
>                 }
>                 /* xfs_trans_ail_delete_bulk() drops the AIL lock. */
>                 xfs_trans_ail_delete_bulk(ailp, log_items, i,
>                                           SHUTDOWN_CORRUPT_INCORE);
>         }

[ Your cut-n-paste is horribly mangled in something non-ascii. ]

> and need_ail is set if we didn't actually end up on disk properly.  So
> we delete the thing and mark the fs as fucked.  Am I mis-reading what
> is going on here?

Yes. That won't shut down the filesystem on an inode write error. It
will simply remove the inode from the AIL, the error will do dropped
on the floor, and the filesystem will not be shut down even though
it's now in a corrupt state. xfs_trans_ail_delete_bulk() only shuts
down the filesytem if it doesn't find the item being removed form
the AIL in the AIL.

As I keep telling people, what needs to happen is this:

	1. buffer write error occurs
	2. retries exhausted, mark buffer as errored, run callbacks
	3. xfs_iflush_done() (and other iodone callbacks) need to
	   look at the buffer error state to determine what to do.
	4. if the buffer is errored and recovery is not possible,
	   the callback needs to shut down the filesytem.
	5. the callback then needs to call flush abort function for
	   item (e.g. xfs_iflush_abort()) rather than the normal
	   IO completion processing path.

>  This is what I did to our 4.6 tree (in testing
> still, not actually pushed to production)
>
>         if (bp->b_flags & XBF_ASYNC) {
>                 ASSERT(bp->b_iodone != NULL);
> 
>                 trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> 
>                 xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the flag
> */
> 
>                 if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL))) {
>                         bp->b_flags |= XBF_WRITE | XBF_ASYNC |
>                                        XBF_DONE | XBF_WRITE_FAIL;
>                         xfs_buf_submit(bp);
>                 } else {
> +                       xfs_buf_do_callbacks(bp);
> +                       bp->b_fspriv = NULL;
> +                       bp->b_iodone = NULL;
>                         xfs_buf_relse(bp);
>                 }
> 
>                 return;
>         }

It's likely you'll end up with silent on disk corruption if you do
that because it's just tossing the error state away.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2017-03-07  1:45 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-03 14:46 [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback Josef Bacik
2017-03-03 20:49 ` Brian Foster
2017-03-03 21:34   ` Josef Bacik
2017-03-03 22:20     ` Brian Foster
2017-03-04  1:26       ` Josef Bacik
2017-03-04 13:58         ` Brian Foster
2017-03-06 19:00           ` Josef Bacik
2017-03-07  1:22             ` Dave Chinner [this message]
2017-03-07  2:37               ` Josef Bacik
2017-03-07 15:56               ` Josef Bacik
2017-03-07 16:12                 ` Brian Foster
2017-04-03  9:19                   ` Carlos Maiolino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170307012204.GP17542@dastard \
    --to=david@fromorbit.com \
    --cc=Kernel-team@fb.com \
    --cc=bfoster@redhat.com \
    --cc=cmaiolino@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=jbacik@fb.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.