* [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
@ 2024-03-24 13:37 Ming Lei
  2024-03-24 21:48 ` Mike Snitzer
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Ming Lei @ 2024-03-24 13:37 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: Ming Lei, Keith Busch, Bart Van Assche, Christoph Hellwig,
	Mikulas Patocka, Mike Snitzer

For any FS bio, its start sector and size must be aligned to the
queue's logical block size from the beginning, because the bio split
code cannot turn an unaligned bio into an aligned one.

This rule is obvious, but some users may still send unaligned bios to
the block layer. dm-integrity has been observed doing so, which causes
a double free of the driver's DMA meta buffer.

So fail unaligned bios early in submit_bio_noacct() to avoid further
trouble.

Meanwhile, remove this kind of check from the dio and discard code paths.

Cc: Keith Busch <kbusch@kernel.org>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
V2:
	- remove the check in dio and discard code path
	- check .bi_sector with (logical_block_size >> 9) - 1

 block/blk-core.c | 16 ++++++++++++++++
 block/blk-lib.c  | 17 -----------------
 block/fops.c     |  3 +--
 3 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a16b5abdbbf5..2d86922f95e3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -729,6 +729,19 @@ void submit_bio_noacct_nocheck(struct bio *bio)
 		__submit_bio_noacct(bio);
 }
 
+static bool bio_check_alignment(struct bio *bio, struct request_queue *q)
+{
+	unsigned int bs = q->limits.logical_block_size;
+
+	if (bio->bi_iter.bi_size & (bs - 1))
+		return false;
+
+	if (bio->bi_iter.bi_sector & ((bs >> SECTOR_SHIFT) - 1))
+		return false;
+
+	return true;
+}
+
 /**
  * submit_bio_noacct - re-submit a bio to the block device layer for I/O
  * @bio:  The bio describing the location in memory and on the device.
@@ -780,6 +793,9 @@ void submit_bio_noacct(struct bio *bio)
 		}
 	}
 
+	if (WARN_ON_ONCE(!bio_check_alignment(bio, q)))
+		goto end_io;
+
 	if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
 		bio_clear_polled(bio);
 
diff --git a/block/blk-lib.c b/block/blk-lib.c
index a6954eafb8c8..ea1a7d16ffdf 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -39,7 +39,6 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
 {
 	struct bio *bio = *biop;
-	sector_t bs_mask;
 
 	if (bdev_read_only(bdev))
 		return -EPERM;
@@ -53,10 +52,6 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		return -EOPNOTSUPP;
 	}
 
-	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
-	if ((sector | nr_sects) & bs_mask)
-		return -EINVAL;
-
 	if (!nr_sects)
 		return -EINVAL;
 
@@ -217,11 +212,6 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 		unsigned flags)
 {
 	int ret;
-	sector_t bs_mask;
-
-	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
-	if ((sector | nr_sects) & bs_mask)
-		return -EINVAL;
 
 	ret = __blkdev_issue_write_zeroes(bdev, sector, nr_sects, gfp_mask,
 			biop, flags);
@@ -250,15 +240,10 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, unsigned flags)
 {
 	int ret = 0;
-	sector_t bs_mask;
 	struct bio *bio;
 	struct blk_plug plug;
 	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev);
 
-	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
-	if ((sector | nr_sects) & bs_mask)
-		return -EINVAL;
-
 retry:
 	bio = NULL;
 	blk_start_plug(&plug);
@@ -313,8 +298,6 @@ int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector,
 
 	if (max_sectors == 0)
 		return -EOPNOTSUPP;
-	if ((sector | nr_sects) & bs_mask)
-		return -EINVAL;
 	if (bdev_read_only(bdev))
 		return -EPERM;
 
diff --git a/block/fops.c b/block/fops.c
index 679d9b752fe8..75595c728190 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -37,8 +37,7 @@ static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
 static bool blkdev_dio_unaligned(struct block_device *bdev, loff_t pos,
 			      struct iov_iter *iter)
 {
-	return pos & (bdev_logical_block_size(bdev) - 1) ||
-		!bdev_iter_is_aligned(bdev, iter);
+	return !bdev_iter_is_aligned(bdev, iter);
 }
 
 #define DIO_INLINE_BIO_VECS 4
-- 
2.41.0
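
The mask arithmetic in bio_check_alignment() can be tried outside the
kernel. The following is a minimal userspace sketch, not the kernel code:
struct io_range and range_is_aligned() are hypothetical names that stand in
for the two bio fields the patch inspects, and the logical block size is
assumed to be a power of two of at least 512 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, as in the kernel */

/* Hypothetical stand-in for bio->bi_iter.bi_sector and bi_size. */
struct io_range {
	uint64_t sector; /* start, in 512-byte sectors */
	uint32_t size;   /* length, in bytes */
};

/* lbs: logical block size in bytes, assumed to be a power of two >= 512. */
static bool range_is_aligned(const struct io_range *r, uint32_t lbs)
{
	assert(lbs >= (1u << SECTOR_SHIFT) && (lbs & (lbs - 1)) == 0);

	/* The byte length must be a multiple of the logical block size. */
	if (r->size & (lbs - 1))
		return false;
	/* The start sector must be a multiple of (lbs / 512) sectors. */
	if (r->sector & ((lbs >> SECTOR_SHIFT) - 1))
		return false;
	return true;
}
```

With a 512-byte logical block size the sector mask is zero, so any start
sector passes and only the byte length is constrained.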



* Re: [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
  2024-03-24 13:37 [PATCH V2] block: fail unaligned bio from submit_bio_noacct() Ming Lei
@ 2024-03-24 21:48 ` Mike Snitzer
  2024-03-24 23:25 ` Christoph Hellwig
  2024-03-25 18:53 ` Keith Busch
  2 siblings, 0 replies; 8+ messages in thread
From: Mike Snitzer @ 2024-03-24 21:48 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Keith Busch, Bart Van Assche,
	Christoph Hellwig, Mikulas Patocka

On Sun, Mar 24 2024 at  9:37P -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> For any FS bio, its start sector and size must be aligned to the
> queue's logical block size from the beginning, because the bio split
> code cannot turn an unaligned bio into an aligned one.
> 
> This rule is obvious, but some users may still send unaligned bios to
> the block layer. dm-integrity has been observed doing so, which causes
> a double free of the driver's DMA meta buffer.
> 
> So fail unaligned bios early in submit_bio_noacct() to avoid further
> trouble.
> 
> Meanwhile, remove this kind of check from the dio and discard code paths.
> 
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Mikulas Patocka <mpatocka@redhat.com>
> Cc: Mike Snitzer <snitzer@kernel.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> V2:
> 	- remove the check in dio and discard code path
> 	- check .bi_sector with (logical_block_size >> 9) - 1
> 
>  block/blk-core.c | 16 ++++++++++++++++
>  block/blk-lib.c  | 17 -----------------
>  block/fops.c     |  3 +--
>  3 files changed, 17 insertions(+), 19 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index a16b5abdbbf5..2d86922f95e3 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -729,6 +729,19 @@ void submit_bio_noacct_nocheck(struct bio *bio)
>  		__submit_bio_noacct(bio);
>  }
>  
> +static bool bio_check_alignment(struct bio *bio, struct request_queue *q)
> +{
> +	unsigned int bs = q->limits.logical_block_size;
> +
> +	if (bio->bi_iter.bi_size & (bs - 1))
> +		return false;
> +
> +	if (bio->bi_iter.bi_sector & ((bs >> SECTOR_SHIFT) - 1))
> +		return false;
> +
> +	return true;
> +}
> +

You missed Christoph's reply to v1 where he offered:
"This should just use bdev_logical_block_size() on bio->bi_bdev."

Otherwise, looks good.

Mike


* Re: [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
  2024-03-24 13:37 [PATCH V2] block: fail unaligned bio from submit_bio_noacct() Ming Lei
  2024-03-24 21:48 ` Mike Snitzer
@ 2024-03-24 23:25 ` Christoph Hellwig
  2024-03-25  3:03   ` Ming Lei
  2024-03-25 18:53 ` Keith Busch
  2 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-03-24 23:25 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Keith Busch, Bart Van Assche,
	Christoph Hellwig, Mikulas Patocka, Mike Snitzer

On Sun, Mar 24, 2024 at 09:37:02PM +0800, Ming Lei wrote:
> +static bool bio_check_alignment(struct bio *bio, struct request_queue *q)
> +{
> +	unsigned int bs = q->limits.logical_block_size;
> +
> +	if (bio->bi_iter.bi_size & (bs - 1))
> +		return false;
> +
> +	if (bio->bi_iter.bi_sector & ((bs >> SECTOR_SHIFT) - 1))
> +		return false;
> +
> +	return true;
> +}


This should still use bdev_logical_block_size().  And maybe it's just me,
but I think dropping the blank lines after the false returns would
actually make it more readable.

> diff --git a/block/fops.c b/block/fops.c
> index 679d9b752fe8..75595c728190 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -37,8 +37,7 @@ static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
>  static bool blkdev_dio_unaligned(struct block_device *bdev, loff_t pos,
>  			      struct iov_iter *iter)
>  {
> -	return pos & (bdev_logical_block_size(bdev) - 1) ||
> -		!bdev_iter_is_aligned(bdev, iter);
> +	return !bdev_iter_is_aligned(bdev, iter);

If you drop this:

 - we now actually go all the way down to building and submitting a
   bio for a trivial bounds check.
 - you get a trivial-to-trigger WARN_ON.

I'd strongly advise against dropping this check.
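
The point about keeping the early check can be illustrated with a minimal
userspace sketch. Nothing here is kernel API: pos_is_unaligned(),
dio_submit_sketch() and the bios_built counter are hypothetical and only
model "reject a misaligned file position before any bio is built".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter standing in for "a bio was built and submitted". */
static unsigned int bios_built;

/* lbs: logical block size in bytes, assumed to be a power of two. */
static bool pos_is_unaligned(int64_t pos, uint32_t lbs)
{
	assert(lbs != 0 && (lbs & (lbs - 1)) == 0);
	return (pos & (int64_t)(lbs - 1)) != 0;
}

/* Reject a misaligned position up front, before doing any real work. */
static int dio_submit_sketch(int64_t pos, uint32_t lbs)
{
	if (pos_is_unaligned(pos, lbs))
		return -EINVAL; /* cheap synchronous failure, no warning */
	bios_built++; /* placeholder for building and submitting the bio */
	return 0;
}
```

In this model a misaligned request costs one mask operation and returns
-EINVAL to the caller, instead of reaching a WARN_ON_ONCE() deeper in the
submission path.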



* Re: [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
  2024-03-24 23:25 ` Christoph Hellwig
@ 2024-03-25  3:03   ` Ming Lei
  2024-03-25  3:12     ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Ming Lei @ 2024-03-25  3:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Keith Busch, Bart Van Assche,
	Mikulas Patocka, Mike Snitzer, ming.lei

On Sun, Mar 24, 2024 at 04:25:04PM -0700, Christoph Hellwig wrote:
> On Sun, Mar 24, 2024 at 09:37:02PM +0800, Ming Lei wrote:
> > +static bool bio_check_alignment(struct bio *bio, struct request_queue *q)
> > +{
> > +	unsigned int bs = q->limits.logical_block_size;
> > +
> > +	if (bio->bi_iter.bi_size & (bs - 1))
> > +		return false;
> > +
> > +	if (bio->bi_iter.bi_sector & ((bs >> SECTOR_SHIFT) - 1))
> > +		return false;
> > +
> > +	return true;
> > +}
> 
> 
> This should still use bdev_logical_block_size().  And maybe it's just me,
> but I think dropping the blank lines after the false returns would
> actually make it more readable.

OK, will remove the blank line.

> 
> > diff --git a/block/fops.c b/block/fops.c
> > index 679d9b752fe8..75595c728190 100644
> > --- a/block/fops.c
> > +++ b/block/fops.c
> > @@ -37,8 +37,7 @@ static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
> >  static bool blkdev_dio_unaligned(struct block_device *bdev, loff_t pos,
> >  			      struct iov_iter *iter)
> >  {
> > -	return pos & (bdev_logical_block_size(bdev) - 1) ||
> > -		!bdev_iter_is_aligned(bdev, iter);
> > +	return !bdev_iter_is_aligned(bdev, iter);
> 
> If you drop this:
> 
>  - we now actually go all the way down to building and submitting a
>    bio for a trivial bounds check.
>  - you get a trivial-to-trigger WARN_ON.
> 
> I'd strongly advise against dropping this check.

OK.

Also, q->limits.logical_block_size is the only limit fetched in the
small-BS I/O fast path, so I think log(lbs) could be cached in
request_queue to avoid the extra fetch of q.limits. That should be
easier to do with your recent atomic queue limit update changes.
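
A minimal userspace sketch of the caching idea, assuming a power-of-two
logical block size of at least 512 bytes. struct queue_sketch, its
lbs_shift field and both helpers are hypothetical; no such field is being
claimed for request_queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Hypothetical queue that caches log2(logical block size). */
struct queue_sketch {
	uint8_t lbs_shift; /* e.g. 9 for 512 bytes, 12 for 4096 bytes */
};

/* Compute the shift once, when the limit is set. */
static void queue_set_lbs(struct queue_sketch *q, uint32_t lbs)
{
	assert(lbs >= (1u << SECTOR_SHIFT) && (lbs & (lbs - 1)) == 0);
	q->lbs_shift = 0;
	while ((1u << q->lbs_shift) < lbs)
		q->lbs_shift++;
}

/* Fast-path check that only reads the cached shift. */
static bool aligned_by_shift(const struct queue_sketch *q,
			     uint64_t sector, uint32_t size)
{
	uint32_t size_mask = (1u << q->lbs_shift) - 1;
	uint64_t sector_mask = (1ull << (q->lbs_shift - SECTOR_SHIFT)) - 1;

	return !(size & size_mask) && !(sector & sector_mask);
}
```

Both masks are derived from the single cached byte, so the fast path does
not need to read the full limits structure in this model.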


Thanks, 
Ming



* Re: [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
  2024-03-25  3:03   ` Ming Lei
@ 2024-03-25  3:12     ` Christoph Hellwig
  2024-03-25  3:50       ` Ming Lei
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-03-25  3:12 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Keith Busch,
	Bart Van Assche, Mikulas Patocka, Mike Snitzer

On Mon, Mar 25, 2024 at 11:03:25AM +0800, Ming Lei wrote:
> Also, q->limits.logical_block_size is the only limit fetched in the
> small-BS I/O fast path, so I think log(lbs) could be cached in
> request_queue to avoid the extra fetch of q.limits. That should be
> easier to do with your recent atomic queue limit update changes.

So.  One thing I've been thinking of for a while (and which Bart also
mentioned) is that queue_limits currently is a bit of a mess between
the actual queue limits and the gendisk configuration.  The logical
block size is firmly in the latter, and we should probably move it
to the gendisk eventually.  Depending on how converting the SCSI ULDs
to the atomic queue limits API goes, that might happen rather sooner
than later.


* Re: [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
  2024-03-25  3:12     ` Christoph Hellwig
@ 2024-03-25  3:50       ` Ming Lei
  0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2024-03-25  3:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Keith Busch, Bart Van Assche,
	Mikulas Patocka, Mike Snitzer

On Sun, Mar 24, 2024 at 08:12:01PM -0700, Christoph Hellwig wrote:
> On Mon, Mar 25, 2024 at 11:03:25AM +0800, Ming Lei wrote:
> > Also, q->limits.logical_block_size is the only limit fetched in the
> > small-BS I/O fast path, so I think log(lbs) could be cached in
> > request_queue to avoid the extra fetch of q.limits. That should be
> > easier to do with your recent atomic queue limit update changes.
> 
> So.  One thing I've been thinking of for a while (and which Bart also
> mentioned) is that queue_limits currently is a bit of a mess between
> the actual queue limits and the gendisk configuration.  The logical
> block size is firmly in the latter, and we should probably move it

lbs and pbs belong to the disk, but some of the others may not be so
obvious.

Strictly speaking, elevator/blkcg belong to the disk too, but they still
stay in request_queue :-)

Thanks, 
Ming



* Re: [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
  2024-03-24 13:37 [PATCH V2] block: fail unaligned bio from submit_bio_noacct() Ming Lei
  2024-03-24 21:48 ` Mike Snitzer
  2024-03-24 23:25 ` Christoph Hellwig
@ 2024-03-25 18:53 ` Keith Busch
  2024-03-26  1:19   ` Ming Lei
  2 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2024-03-25 18:53 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Bart Van Assche, Christoph Hellwig,
	Mikulas Patocka, Mike Snitzer

On Sun, Mar 24, 2024 at 09:37:02PM +0800, Ming Lei wrote:
> @@ -780,6 +793,9 @@ void submit_bio_noacct(struct bio *bio)
>  		}
>  	}
>  
> +	if (WARN_ON_ONCE(!bio_check_alignment(bio, q)))
> +		goto end_io;
> +

The "status" at this point is "BLK_STS_IOERR", so user space would see
EIO, but the existing checks return EINVAL. I'm not sure if that's "ok",
but assuming it is, I think the user-visible behavior change should
be mentioned in the changelog.

Alternatively, maybe we want an asynchronous way to return EINVAL for
these conditions. It's more informative to a user where the problem is
than a generic EIO. There is no BLK_STS_ value that translates to
EINVAL, though, so maybe we need a new block status code like
BLK_STS_INVALID_REQUEST.
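
The proposal amounts to one more entry in a status-to-errno mapping. A
minimal userspace sketch follows; the enum, the table and sts_to_errno()
are hypothetical and do not reproduce the kernel's actual table, and
STS_INVALID_REQUEST only mirrors the name suggested above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical status codes; only the names echo the discussion. */
enum sts_sketch {
	STS_OK = 0,
	STS_IOERR,
	STS_INVALID_REQUEST, /* the proposed code, not an existing value */
	STS_COUNT
};

/* Each status maps to the errno that user space would eventually see. */
static const int sts_to_errno_table[STS_COUNT] = {
	[STS_OK] = 0,
	[STS_IOERR] = -EIO,
	[STS_INVALID_REQUEST] = -EINVAL,
};

static int sts_to_errno(enum sts_sketch status)
{
	assert((unsigned int)status < STS_COUNT);
	return sts_to_errno_table[status];
}
```

With such an entry, an alignment failure reported at completion time could
surface as EINVAL rather than a generic EIO.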

> @@ -53,10 +52,6 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  		return -EOPNOTSUPP;
>  	}
>  
> -	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
> -	if ((sector | nr_sects) & bs_mask)
> -		return -EINVAL;
> -
>  	if (!nr_sects)
>  		return -EINVAL;


* Re: [PATCH V2] block: fail unaligned bio from submit_bio_noacct()
  2024-03-25 18:53 ` Keith Busch
@ 2024-03-26  1:19   ` Ming Lei
  0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2024-03-26  1:19 UTC (permalink / raw)
  To: Keith Busch
  Cc: Jens Axboe, linux-block, Bart Van Assche, Christoph Hellwig,
	Mikulas Patocka, Mike Snitzer

On Mon, Mar 25, 2024 at 11:53:45AM -0700, Keith Busch wrote:
> On Sun, Mar 24, 2024 at 09:37:02PM +0800, Ming Lei wrote:
> > @@ -780,6 +793,9 @@ void submit_bio_noacct(struct bio *bio)
> >  		}
> >  	}
> >  
> > +	if (WARN_ON_ONCE(!bio_check_alignment(bio, q)))
> > +		goto end_io;
> > +
> 
> The "status" at this point is "BLK_STS_IOERR", so user space would see
> EIO, but the existing checks return EINVAL. I'm not sure if that's "ok",
> but assuming it is, I think the user-visible behavior change should
> be mentioned in the changelog.
> 
> Alternatively, maybe we want an asynchronous way to return EINVAL for

It has to be returned asynchronously, because submit_bio*() returns
void.

> these conditions. It's more informative to a user where the problem is
> than a generic EIO. There is no BLK_STS_ value that translates to
> EINVAL, though, so maybe we need a new block status code like
> BLK_STS_INVALID_REQUEST.

Yeah, I agree, but that is an existing issue. The 'status' should have
been initialized to 'BLK_STS_INVALID_REQUEST' or 'BLK_STS_INVALID' in
submit_bio_noacct(), and every check failure can be treated as -EINVAL.


Thanks, 
Ming


