* [PATCH] blk-lib: fix error reporting
@ 2014-07-03 19:16 Mikulas Patocka
  2014-07-08  9:50 ` [dm-devel] " Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2014-07-03 19:16 UTC (permalink / raw)
  To: Jens Axboe; +Cc: dm-devel, linux-kernel

The function bio_batch_end_io ignores -EOPNOTSUPP. It doesn't matter for
discard (the device isn't required to discard anything, so missing the
error code and reporting success shouldn't cause any trouble). However,
for the WRITE SAME command, missing the error code is obviously wrong. It may
fool the user into thinking that the data were written while in fact they
weren't.

Note that in device mapper, devices may be dynamically reconfigured, so a
device that supports WRITE SAME may stop supporting it at any time and
return -EOPNOTSUPP. Ignoring -EOPNOTSUPP is wrong.

This patch changes bio_batch->flags to an error field and stores the last
error there - so that the error is reported accurately and it isn't
ignored.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 block/blk-lib.c |   25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

Index: linux-3.16-rc3/block/blk-lib.c
===================================================================
--- linux-3.16-rc3.orig/block/blk-lib.c	2014-07-03 18:17:21.000000000 +0200
+++ linux-3.16-rc3/block/blk-lib.c	2014-07-03 18:36:52.000000000 +0200
@@ -11,7 +11,7 @@
 
 struct bio_batch {
 	atomic_t		done;
-	unsigned long		flags;
+	int			error;
 	struct completion	*wait;
 };
 
@@ -19,8 +19,8 @@ static void bio_batch_end_io(struct bio 
 {
 	struct bio_batch *bb = bio->bi_private;
 
-	if (err && (err != -EOPNOTSUPP))
-		clear_bit(BIO_UPTODATE, &bb->flags);
+	if (unlikely(err))
+		ACCESS_ONCE(bb->error) = err;
 	if (atomic_dec_and_test(&bb->done))
 		complete(bb->wait);
 	bio_put(bio);
@@ -78,7 +78,7 @@ int blkdev_issue_discard(struct block_de
 	}
 
 	atomic_set(&bb.done, 1);
-	bb.flags = 1 << BIO_UPTODATE;
+	bb.error = 0;
 	bb.wait = &wait;
 
 	blk_start_plug(&plug);
@@ -134,8 +134,8 @@ int blkdev_issue_discard(struct block_de
 	if (!atomic_dec_and_test(&bb.done))
 		wait_for_completion_io(&wait);
 
-	if (!test_bit(BIO_UPTODATE, &bb.flags))
-		ret = -EIO;
+	if (likely(!ret))
+		ret = bb.error;
 
 	return ret;
 }
@@ -172,7 +172,7 @@ int blkdev_issue_write_same(struct block
 		return -EOPNOTSUPP;
 
 	atomic_set(&bb.done, 1);
-	bb.flags = 1 << BIO_UPTODATE;
+	bb.error = 0;
 	bb.wait = &wait;
 
 	while (nr_sects) {
@@ -208,8 +208,8 @@ int blkdev_issue_write_same(struct block
 	if (!atomic_dec_and_test(&bb.done))
 		wait_for_completion_io(&wait);
 
-	if (!test_bit(BIO_UPTODATE, &bb.flags))
-		ret = -ENOTSUPP;
+	if (likely(!ret))
+		ret = bb.error;
 
 	return ret;
 }
@@ -236,7 +236,7 @@ static int __blkdev_issue_zeroout(struct
 	DECLARE_COMPLETION_ONSTACK(wait);
 
 	atomic_set(&bb.done, 1);
-	bb.flags = 1 << BIO_UPTODATE;
+	bb.error = 0;
 	bb.wait = &wait;
 
 	ret = 0;
@@ -270,9 +270,8 @@ static int __blkdev_issue_zeroout(struct
 	if (!atomic_dec_and_test(&bb.done))
 		wait_for_completion_io(&wait);
 
-	if (!test_bit(BIO_UPTODATE, &bb.flags))
-		/* One of bios in the batch was completed with error.*/
-		ret = -EIO;
+	if (likely(!ret))
+		ret = bb.error;
 
 	return ret;
 }
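
As a rough userspace illustration of the bookkeeping this patch introduces
(not the kernel code itself), the sketch below mirrors it with C11 atomics:
each completion stores a non-zero error into the batch, and the submitter
picks it up only if it has no error of its own. The names batch_sketch,
batch_init, batch_submit, batch_end_io, batch_finish and
demo_unsupported_write_same are made up for this example, and the wait on
the completion is left out because the demo runs on a single thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct bio_batch after the patch. */
struct batch_sketch {
	atomic_int done;	/* outstanding completions + 1 for the submitter */
	atomic_int error;	/* last non-zero completion status, 0 if none */
};

static void batch_init(struct batch_sketch *bb)
{
	atomic_init(&bb->done, 1);
	atomic_init(&bb->error, 0);
}

/* Account for one more in-flight request. */
static void batch_submit(struct batch_sketch *bb)
{
	atomic_fetch_add(&bb->done, 1);
}

/* Completion path: remember any error; true if this dropped the last reference. */
static bool batch_end_io(struct batch_sketch *bb, int err)
{
	if (err)
		atomic_store_explicit(&bb->error, err, memory_order_relaxed);
	return atomic_fetch_sub(&bb->done, 1) == 1;
}

/* Submitter path: drop its own reference, then prefer its own error. */
static int batch_finish(struct batch_sketch *bb, int ret)
{
	atomic_fetch_sub(&bb->done, 1);
	if (!ret)
		ret = atomic_load(&bb->error);
	return ret;
}

/* Two requests; the second one completes with -EOPNOTSUPP. */
static int demo_unsupported_write_same(void)
{
	struct batch_sketch bb;
	bool last;

	batch_init(&bb);
	batch_submit(&bb);
	batch_submit(&bb);
	batch_end_io(&bb, 0);
	last = batch_end_io(&bb, -EOPNOTSUPP);
	assert(!last);	/* the submitter still holds its reference */
	return batch_finish(&bb, 0);
}
```

Calling demo_unsupported_write_same() returns -EOPNOTSUPP, whereas the old
flag-based check skipped that error code and left ret at 0.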


* Re: [dm-devel] [PATCH] blk-lib: fix error reporting
  2014-07-03 19:16 [PATCH] blk-lib: fix error reporting Mikulas Patocka
@ 2014-07-08  9:50 ` Christoph Hellwig
  2014-07-08 13:05   ` Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2014-07-08  9:50 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: Jens Axboe, dm-devel, linux-kernel

> +	if (unlikely(err))
> +		ACCESS_ONCE(bb->error) = err;

I can't see a reason for the ACCESS_ONCE here.

Also the likely/unlikely annotations here smell like premature
optimization.

Otherwise looks good to me.



* Re: [dm-devel] [PATCH] blk-lib: fix error reporting
  2014-07-08  9:50 ` [dm-devel] " Christoph Hellwig
@ 2014-07-08 13:05   ` Mikulas Patocka
  2014-07-08 14:04     ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2014-07-08 13:05 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, dm-devel, linux-kernel



On Tue, 8 Jul 2014, Christoph Hellwig wrote:

> > +	if (unlikely(err))
> > +		ACCESS_ONCE(bb->error) = err;
> 
> I can't see a reason for the ACCESS_ONCE here.

Multiple bios can be completed concurrently, so they write bb->error at 
the same time. The compiler may do store tearing (see "store tearing" in 
Documentation/memory-barriers.txt) - it may split one 4-byte write into 
several smaller writes - and it could result in setting bb->error to an
invalid value. We need ACCESS_ONCE to make sure that store tearing doesn't 
happen.

Mikulas

> Also the likely/unlikely annotations here smell like premature
> optimization.
> 
> Otherwise looks good to me.
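
For readers outside the kernel tree, a minimal userspace sketch of the store
discussed here, assuming GCC or Clang (it relies on the __typeof__
extension). ACCESS_ONCE_SKETCH, batch_error_sketch, record_error and
demo_record_error are names invented for this example; the macro only
approximates the kernel's ACCESS_ONCE() by accessing the object through a
volatile-qualified lvalue.

```c
#include <assert.h>
#include <errno.h>

/* Userspace approximation of ACCESS_ONCE(): a volatile-qualified access. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in holding only the field under discussion. */
struct batch_error_sketch {
	int error;
};

/*
 * Store the error through a volatile lvalue. The intent is that the
 * compiler neither splits nor merges this int-sized store.
 */
static void record_error(struct batch_error_sketch *bb, int err)
{
	if (err)
		ACCESS_ONCE_SKETCH(bb->error) = err;
}

static int demo_record_error(void)
{
	struct batch_error_sketch bb = { .error = 0 };

	record_error(&bb, 0);		/* success leaves the field untouched */
	assert(bb.error == 0);
	record_error(&bb, -EIO);	/* a failure overwrites it */
	return ACCESS_ONCE_SKETCH(bb.error);
}
```

demo_record_error() returns -EIO; the sketch shows only the shape of the
access, not any concurrency.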


* Re: [dm-devel] [PATCH] blk-lib: fix error reporting
  2014-07-08 13:05   ` Mikulas Patocka
@ 2014-07-08 14:04     ` James Bottomley
  2014-07-16 10:22         ` Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2014-07-08 14:04 UTC (permalink / raw)
  To: device-mapper development; +Cc: Christoph Hellwig, Jens Axboe, linux-kernel

On Tue, 2014-07-08 at 09:05 -0400, Mikulas Patocka wrote:
> 
> On Tue, 8 Jul 2014, Christoph Hellwig wrote:
> 
> > > +	if (unlikely(err))
> > > +		ACCESS_ONCE(bb->error) = err;
> > 
> > I can't see a reason for the ACCESS_ONCE here.
> 
> Multiple bios can be completed concurrently, so they write bb->error at 
> the same time. The compiler may do store tearing (see "store tearing" in 
> Documentation/memory-barriers.txt) - it may split one 4-byte write into 
> several smaller writes - and it could result in setting bb->error to an
> invalid value. We need ACCESS_ONCE to make sure that store tearing doesn't 
> happen.

That's not correct, because it's not applicable in this case.  Tearing
may occur on misalignment (which ACCESS_ONCE() cannot rectify because
it's architectural), short constant loads (again, usually architectural)
and structure copies, none of which applies here.

We can rely on a properly aligned 32 bit write being atomic.

James
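
The alignment premise above can be checked mechanically. The sketch below is
a userspace approximation under stated assumptions: bio_batch_layout_sketch
copies the field order from the patch, with plain int standing in for
atomic_t and void * for struct completion *. The helper names are invented
for this example, and the check says nothing about what the compiler does
with the store itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of struct bio_batch after the patch. */
struct bio_batch_layout_sketch {
	int   done;	/* stands in for atomic_t */
	int   error;
	void *wait;	/* stands in for struct completion * */
};

/* True if the error field sits at a multiple of its own size. */
static bool error_field_is_naturally_aligned(void)
{
	return offsetof(struct bio_batch_layout_sketch, error) % sizeof(int) == 0;
}

/* True if int is 32 bits wide on the platform compiling this sketch. */
static bool error_field_is_32_bits(void)
{
	return sizeof(int) * CHAR_BIT == 32;
}

/* Convenience wrapper that asserts both properties. */
static void check_error_field_layout(void)
{
	assert(error_field_is_naturally_aligned());
	assert(error_field_is_32_bits());
}
```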




* Re: [dm-devel] [PATCH] blk-lib: fix error reporting
  2014-07-08 14:04     ` James Bottomley
@ 2014-07-16 10:22         ` Mikulas Patocka
  0 siblings, 0 replies; 6+ messages in thread
From: Mikulas Patocka @ 2014-07-16 10:22 UTC (permalink / raw)
  To: device-mapper development; +Cc: Christoph Hellwig, Jens Axboe, linux-kernel



On Tue, 8 Jul 2014, James Bottomley wrote:

> On Tue, 2014-07-08 at 09:05 -0400, Mikulas Patocka wrote:
> > 
> > On Tue, 8 Jul 2014, Christoph Hellwig wrote:
> > 
> > > > +	if (unlikely(err))
> > > > +		ACCESS_ONCE(bb->error) = err;
> > > 
> > > I can't see a reason for the ACCESS_ONCE here.
> > 
> > Multiple bios can be completed concurrently, so they write bb->error at 
> > the same time. The compiler may do store tearing (see "store tearing" in 
> > Documentation/memory-barriers.txt) - it may split one 4-byte write into 
> > several smaller writes - and it could result in setting bb->error to an
> > invalid value. We need ACCESS_ONCE to make sure that store tearing doesn't 
> > happen.
> 
> That's not correct, because it's not applicable in this case.  Tearing
> may occur on misalignment (which ACCESS_ONCE() cannot rectify because
> it's architectural), short constant loads (again, usually architectural)
> and structure copies, none of which applies here.

Suppose this scenario:
CPU1 writes low byte of the first error code
CPU2 writes low byte of the second error code
CPU2 writes 3 high bytes of the second error code
CPU1 writes 3 high bytes of the first error code

- now, bb->error contains garbage - a mix of the first and second error 
code. That's why we need ACCESS_ONCE.

It may happen even if the variable is aligned. The compiler is allowed to 
split a larger memory access into several smaller accesses. The compiler 
usually doesn't do this split (that's why omitting ACCESS_ONCE usually 
doesn't result in any observable misbehavior), but it is still a bug to 
omit it - you don't really know that for all 29 architectures gcc won't 
split the memory write...

> We can rely on a properly aligned 32 bit write being atomic.
>
> James

... only if you use ACCESS_ONCE ...

Mikulas
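
The four-step scenario above can be replayed deterministically on a single
thread. This is only a sketch: store_low_byte, store_high_bytes and
replay_torn_stores are hypothetical helpers that model the two halves of a
split 32-bit store with masks, and the input values used with it are
artificial bit patterns chosen so the mix is easy to see, not real error
codes.

```c
#include <assert.h>
#include <stdint.h>

/* One half of a hypothetically split 32-bit store: only the low byte. */
static void store_low_byte(uint32_t *dst, uint32_t val)
{
	*dst = (*dst & 0xFFFFFF00u) | (val & 0x000000FFu);
}

/* The other half: only the three high bytes. */
static void store_high_bytes(uint32_t *dst, uint32_t val)
{
	*dst = (*dst & 0x000000FFu) | (val & 0xFFFFFF00u);
}

/* Replays the four steps in the order given in the scenario. */
static uint32_t replay_torn_stores(uint32_t first, uint32_t second)
{
	uint32_t error = 0;

	store_low_byte(&error, first);		/* CPU1: low byte of first */
	store_low_byte(&error, second);		/* CPU2: low byte of second */
	store_high_bytes(&error, second);	/* CPU2: high bytes of second */
	assert(error == second);		/* second value is intact for a moment */
	store_high_bytes(&error, first);	/* CPU1: high bytes of first */
	return error;
}
```

With the inputs 0x11111111 and 0x22222222 the function returns 0x11111122,
which matches neither of the two values that were written.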

