All of lore.kernel.org
 help / color / mirror / Atom feed
From: Milan Broz <gmazyland@gmail.com>
To: NeilBrown <neilb@suse.com>, Mike Snitzer <snitzer@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>
Cc: device-mapper development <dm-devel@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [dm-devel] DM Regression in 4.16-rc1 - read() returns data when it shouldn't
Date: Thu, 15 Feb 2018 08:37:12 +0100	[thread overview]
Message-ID: <1df6de2e-bfd3-84c4-cffb-83b6299c6d23@gmail.com> (raw)
In-Reply-To: <87y3jvp13k.fsf@notabene.neil.brown.name>

On 02/15/2018 01:07 AM, NeilBrown wrote:
> 
> And looking again at the code, it doesn't seem so bad.  I was worried
> about reassigning ci.io->orig_bio after calling
> __split_and_process_non_flush(), but the important thing is to set it
> before the last dec_pending() call, and the code gets that right.
> 
> I think I've found the problem though:
> 
> 			/* done with normal IO or empty flush */
> 			bio->bi_status = io_error;
> 
> When we have chained bios, there are multiple sources for error which
> need to be merged into bi_status:  all bios that were chained to this
> one, and this bio itself.
> 
> __bio_chain_endio uses:
> 
> 	if (!parent->bi_status)
> 		parent->bi_status = bio->bi_status;
> 
> so the new status won't over-ride the old status (unless there are
> races....), but dm doesn't do that - I guess it didn't have to worry
> about chains before.
> 
> So this:
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index d6de00f367ef..68136806d365 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -903,7 +903,8 @@ static void dec_pending(struct dm_io *io, blk_status_t error)
>  			queue_io(md, bio);
>  		} else {
>  			/* done with normal IO or empty flush */
> -			bio->bi_status = io_error;
> +			if (io_error)
> +				bio->bi_status = io_error;
>  			bio_endio(bio);
>  		}
>  	}
> 

Hi,

well, it indeed fixes the reported problem, thanks.
But I just tested the first (dm.c) change above.

The important part is also that if the error is hits bio later, it will produce
short read (and not fail of the whole operation), this can be easily tested if we switch
the error and the zero target in that reproducer
  //system("echo -e \'0 10000000 error\'\\\\n\'10000000 1000000 zero\' | dmsetup create test");
  system("echo -e \'0 10000000 zero\'\\\\n\'10000000 1000000 error\' | dmsetup create test");

So it seems your patch works.

I am not sure about this part though:

> should fix it.  And __bio_chain_endio should probably be
> 
> diff --git a/block/bio.c b/block/bio.c
> index e1708db48258..ad77140edc6f 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -312,7 +312,7 @@ static struct bio *__bio_chain_endio(struct bio *bio)
>  {
>  	struct bio *parent = bio->bi_private;
>  
> -	if (!parent->bi_status)
> +	if (bio->bi_status)
>  		parent->bi_status = bio->bi_status;

Shouldn't it be
	if (!parent->bi_status && bio->bi_status)
  		parent->bi_status = bio->bi_status;
?

Maybe one rephrased question: what about priorities for errors (bio statuses)?

For example, part of a bio fails with EILSEQ (data integrity check error, BLK_STS_PROTECTION),
part with EIO (media error, BLK_STS_IOERR). What should be reported for parent bio?
(I would expect I always get EIO here because this is hard, uncorrectable error.)

Anyway, thanks for the fix!

Milan


>  	bio_put(bio);
>  	return parent;
> 
> to remove the chance that a race will hide a real error.
> 
> Milan: could you please check that the patch fixes the dm-crypt bug?
> I've confirmed that it fixes your test case.
> When you confirm, I'll send a proper patch.
> 
> I guess I should audit all code that assigns to ->bi_status and make
> sure nothing ever assigns 0 to it.
> 
> Thanks,
> NeilBrown
> 

  reply	other threads:[~2018-02-15  7:37 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-14 13:02 DM Regression in 4.16-rc1 - read() returns data when it shouldn't Milan Broz
2018-02-14 20:39 ` NeilBrown
2018-02-14 23:05   ` Mike Snitzer
2018-02-15  0:07     ` [dm-devel] " NeilBrown
2018-02-15  7:37       ` Milan Broz [this message]
2018-02-15  8:52         ` NeilBrown
2018-02-15  9:00 ` [PATCH] dm: correctly handle chained bios in dec_pending() NeilBrown
2018-02-15  9:01 ` [RFC PATCH] dm: don't assign zero to ->bi_status of an active bio NeilBrown
2018-02-15  9:09 ` [PATCH] block: be more careful about status in __bio_chain_endio NeilBrown
2018-02-15  9:09   ` NeilBrown
2019-02-22 21:10   ` Mike Snitzer
2019-02-22 22:46     ` Jens Axboe
2019-02-22 23:55       ` Mike Snitzer
2019-02-23  2:02         ` John Dorminy
2019-02-23  2:02           ` John Dorminy
2019-02-23  2:44           ` Mike Snitzer
2019-02-23  3:10             ` John Dorminy
2019-06-12  2:56               ` John Dorminy
2019-06-12  7:01                 ` Christoph Hellwig
2019-06-17  7:32                   ` Hannes Reinecke
2018-02-19 13:44 ` DM Regression in 4.16-rc1 - read() returns data when it shouldn't Thorsten Leemhuis
2018-02-19 17:15   ` Mike Snitzer
2018-02-19 17:15     ` Mike Snitzer
2018-02-26 10:14     ` Thorsten Leemhuis
2018-02-26 11:01       ` NeilBrown
2018-02-26 17:31         ` Thorsten Leemhuis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1df6de2e-bfd3-84c4-cffb-83b6299c6d23@gmail.com \
    --to=gmazyland@gmail.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=neilb@suse.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.