From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Allison Henderson <allison.henderson@oracle.com>,
	linux-block@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	martin.petersen@oracle.com, shirley.ma@oracle.com,
	bob.liu@oracle.com
Subject: Re: [PATCH v1 5/7] xfs: Add device retry
Date: Wed, 28 Nov 2018 16:38:22 +1100
Message-ID: <20181128053821.GM6311@dastard>
In-Reply-To: <20181128052245.GD8125@magnolia>

On Tue, Nov 27, 2018 at 09:22:45PM -0800, Darrick J. Wong wrote:
> On Wed, Nov 28, 2018 at 04:08:50PM +1100, Dave Chinner wrote:
> > On Tue, Nov 27, 2018 at 08:49:49PM -0700, Allison Henderson wrote:
> > > Check to see if the _xfs_buf_read fails.  If so loop over the
> > > available mirrors and retry the read
> > > 
> > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > ---
> > >  fs/xfs/xfs_buf.c | 28 +++++++++++++++++++++++++++-
> > >  1 file changed, 27 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > > index dd8ba59..f102d01 100644
> > > --- a/fs/xfs/xfs_buf.c
> > > +++ b/fs/xfs/xfs_buf.c
> > > @@ -21,6 +21,7 @@
> > >  #include <linux/migrate.h>
> > >  #include <linux/backing-dev.h>
> > >  #include <linux/freezer.h>
> > > +#include <linux/blkdev.h>
> > >  
> > >  #include "xfs_format.h"
> > >  #include "xfs_log_format.h"
> > > @@ -808,6 +809,8 @@ xfs_buf_read_map(
> > >  	const struct xfs_buf_ops *ops)
> > >  {
> > >  	struct xfs_buf		*bp;
> > > +	struct request_queue	*q;
> > > +	unsigned short		i;
> > >  
> > >  	flags |= XBF_READ;
> > >  
> > > @@ -820,7 +823,30 @@ xfs_buf_read_map(
> > >  	if (!(bp->b_flags & XBF_DONE)) {
> > >  		XFS_STATS_INC(target->bt_mount, xb_get_read);
> > >  		bp->b_ops = ops;
> > > -		_xfs_buf_read(bp, flags);
> > > +		q = bdev_get_queue(bp->b_target->bt_bdev);
> > > +
> > > +		/*
> > > +		 * Mirrors are indexed 1 - n, specified through the rw_hint.
> > > +		 * Setting the hint to 0 is unspecified and allows the block
> > > +		 * layer to decide.
> > > +		 */
> > > +		for (i = 0; i <= blk_queue_get_mirrors(q); i++) {
> > > +			bp->b_error = 0;
> > > +			bp->b_rw_hint = i;
> > > +			_xfs_buf_read(bp, flags);
> > 
> > So the first time through this loop the block layer decides what
> > device to read from, then we iterate devices 1..n on error.
> > 
> > Which means if device 0 is the only one with good information in it,
> > we may not ever actually read from it.
> > 
> > I'd suggest that a hint of "-1" (or equivalent max value) should be
> > used for "device selects mirror leg" rather than 0, so we can
> > actually read from the first device on command.
> 
> "read from the first device on command" => "set bio.bi_rw_hint = 1"...

Landmine.

> > i.e.
> > 		bp->b_error = 0;
> > 		bp->b_rw_hint = -1;
> 
> ...which is confusing.  The intended behavior for this RFC (though not
> so well documented) is that bi_rw_hint == 0 means "let the device
> choose", and rw_hint >= 1 means "choose mirror (rw_hint - 1)".  That's
> sort of an odd behavior because now we have:
> 
> blk_queue_get_mirrors(q) returns 5 (as in 5 mirrors) but we access the
> 5 mirrors as indices 1-5, not 0-4 like most programmers would probably
> expect.

Yeah, that's not nice, and will lead to bugs in future as it trips
up people who have forgotten about this quirk.
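
As a rough userspace sketch of the alternative (not code from the posted
series): keep mirror indices 0-based and reserve the maximum value of the
hint type for "device selects mirror leg". MIRROR_HINT_ANY and
mirror_hint_resolve() are hypothetical names used only for illustration,
and the device_choice argument stands in for whatever policy the block
layer would apply.

```c
#include <limits.h>

/* Hypothetical sentinel: the max hint value means "device selects the leg". */
#define MIRROR_HINT_ANY	USHRT_MAX

/*
 * Resolve a hint to a 0-based mirror index.
 *
 * Returns the index to read from, or -1 if the hint is out of range.
 * For MIRROR_HINT_ANY the caller-supplied device_choice is used, which
 * is only a stand-in for the block layer's own selection policy.
 */
static int
mirror_hint_resolve(
	unsigned short	hint,
	unsigned short	nr_mirrors,
	unsigned short	device_choice)
{
	if (hint == MIRROR_HINT_ANY)
		return device_choice < nr_mirrors ? device_choice : -1;
	if (hint >= nr_mirrors)
		return -1;
	return hint;
}
```

With this layout a queue reporting 5 mirrors is addressed as hints 0-4,
and hint 0 really does mean "the first device".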

> Also, I think it's probably necessary to create a #define to attach a
> name to the "let the device choose" value...
> 
> #define BIO_RW_HINT_ANY_MIRROR	(0)
> 
> for (i = BIO_RW_HINT_ANY_MIRROR; i <= blk_queue_get_mirrors(q); i++) {
> 	...
> 	bp->b_rw_hint = i;
> 	...
> 	_xfs_buf_read(bp, flags);
> 	...
> }

The recovery algorithms are only going to get more complex as
time goes on, so I'd really like to see an explicit separation of
the simple, unchanging fast path and the fallback recovery code.
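
To illustrate the kind of separation I mean, here is a minimal userspace
sketch. Everything in it is hypothetical: struct toy_dev and toy_read()
are toy stand-ins for the buffer target and _xfs_buf_read(), and
MIRROR_HINT_ANY assumes the "max value means device selects" convention
rather than anything in the posted patches.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical sentinel: the max hint value means "device selects the leg". */
#define MIRROR_HINT_ANY	USHRT_MAX

/* Toy mirrored device: good[i] says whether leg i holds valid data. */
struct toy_dev {
	unsigned short	nr_mirrors;
	const bool	*good;
	unsigned short	default_leg;	/* leg used for MIRROR_HINT_ANY */
};

/* Toy stand-in for _xfs_buf_read(): 0 on success, negative errno on failure. */
static int
toy_read(
	const struct toy_dev	*dev,
	unsigned short		hint)
{
	unsigned short		leg;

	leg = (hint == MIRROR_HINT_ANY) ? dev->default_leg : hint;
	if (leg >= dev->nr_mirrors)
		return -EINVAL;
	return dev->good[leg] ? 0 : -EIO;
}

/* Slow path: only entered after a failed read; tries legs 0..n-1 in turn. */
static int
toy_read_recover(
	const struct toy_dev	*dev)
{
	int			error = -EIO;

	for (unsigned short i = 0; i < dev->nr_mirrors; i++) {
		error = toy_read(dev, i);
		if (!error)
			break;
	}
	return error;
}

/* Fast path: one read with no mirror selection; recovery stays out of line. */
static int
toy_read_map(
	const struct toy_dev	*dev)
{
	int			error = toy_read(dev, MIRROR_HINT_ANY);

	if (!error)
		return 0;
	return toy_read_recover(dev);
}
```

The point of the split is that toy_read_map() never has to change as the
recovery logic in toy_read_recover() grows.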

Cheers,

dave.
-- 
Dave Chinner
david@fromorbit.com
