* [PATCH] block: re-add discard_granularity and alignment checks
@ 2015-10-22 16:59 Ming Lin
  2015-10-22 17:03 ` Mike Snitzer
  0 siblings, 1 reply; 4+ messages in thread
From: Ming Lin @ 2015-10-22 16:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Christoph Hellwig, Jens Axboe, Mike Snitzer, Martin K. Petersen,
	Kent Overstreet

From: Ming Lin <ming.l@ssi.samsung.com>

In commit b49a087 ("block: remove split code in
blkdev_issue_{discard,write_same}"), discard_granularity and alignment
checks were removed. Ideally, with bio late splitting, the upper layers
shouldn't need to depend on the device's limits.

Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe
device when running mkfs.xfs. We have not found the root cause yet.

This patch re-adds discard_granularity and alignment checks by reverting
the related changes in commit b49a087. The good thing is now we can
remove the 2G discard size cap and just use UINT_MAX to avoid bi_size
overflow.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
---
 block/blk-lib.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index bd40292..9ebf653 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -26,13 +26,6 @@ static void bio_batch_end_io(struct bio *bio)
 	bio_put(bio);
 }
 
-/*
- * Ensure that max discard sectors doesn't overflow bi_size and hopefully
- * it is of the proper granularity as long as the granularity is a power
- * of two.
- */
-#define MAX_BIO_SECTORS ((1U << 31) >> 9)
-
 /**
  * blkdev_issue_discard - queue a discard
  * @bdev:	blockdev to issue discard for
@@ -50,6 +43,8 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	DECLARE_COMPLETION_ONSTACK(wait);
 	struct request_queue *q = bdev_get_queue(bdev);
 	int type = REQ_WRITE | REQ_DISCARD;
+	unsigned int granularity;
+	int alignment;
 	struct bio_batch bb;
 	struct bio *bio;
 	int ret = 0;
@@ -61,6 +56,10 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	if (!blk_queue_discard(q))
 		return -EOPNOTSUPP;
 
+	/* Zero-sector (unknown) and one-sector granularities are the same.  */
+	granularity = max(q->limits.discard_granularity >> 9, 1U);
+	alignment = (bdev_discard_alignment(bdev) >> 9) % granularity;
+
 	if (flags & BLKDEV_DISCARD_SECURE) {
 		if (!blk_queue_secdiscard(q))
 			return -EOPNOTSUPP;
@@ -74,7 +73,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	blk_start_plug(&plug);
 	while (nr_sects) {
 		unsigned int req_sects;
-		sector_t end_sect;
+		sector_t end_sect, tmp;
 
 		bio = bio_alloc(gfp_mask, 1);
 		if (!bio) {
@@ -82,8 +81,22 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 			break;
 		}
 
-		req_sects = min_t(sector_t, nr_sects, MAX_BIO_SECTORS);
+		/* Make sure bi_size doesn't overflow */
+		req_sects = min_t(sector_t, nr_sects, UINT_MAX >> 9);
+
+		/*
+		 * If splitting a request, and the next starting sector would be
+		 * misaligned, stop the discard at the previous aligned sector.
+		 */
 		end_sect = sector + req_sects;
+		tmp = end_sect;
+		if (req_sects < nr_sects &&
+		    sector_div(tmp, granularity) != alignment) {
+			end_sect = end_sect - alignment;
+			sector_div(end_sect, granularity);
+			end_sect = end_sect * granularity + alignment;
+			req_sects = end_sect - sector;
+		}
 
 		bio->bi_iter.bi_sector = sector;
 		bio->bi_end_io = bio_batch_end_io;
-- 
1.9.1
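
For readers following the arithmetic in the hunk above, here is a minimal
userspace sketch of the end-sector trimming. It is not the kernel code: the
helper name next_discard_sects(), the MAX_REQ_SECTS macro and the sector_t
typedef are stand-ins invented for illustration, and plain / and % replace
sector_div(). It assumes alignment < granularity and a granularity far
smaller than the size cap, as in the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* stand-in for the kernel's sector_t */

/* Largest sector count whose byte size still fits a 32-bit bi_size. */
#define MAX_REQ_SECTS	(UINT32_MAX >> 9)

/*
 * Hypothetical helper: number of sectors the next discard bio should cover.
 * All quantities are in 512-byte sectors.
 */
static unsigned int next_discard_sects(sector_t sector, sector_t nr_sects,
				       unsigned int granularity,
				       unsigned int alignment)
{
	sector_t req_sects, end_sect;

	assert(granularity > 0 && alignment < granularity);

	/* Cap the request so that req_sects << 9 cannot overflow. */
	req_sects = nr_sects < MAX_REQ_SECTS ? nr_sects : MAX_REQ_SECTS;
	end_sect = sector + req_sects;

	/*
	 * Only when the range is being split: if the next bio would start
	 * on a misaligned sector, pull the end back to the previous sector
	 * of the form k * granularity + alignment.
	 */
	if (req_sects < nr_sects && end_sect % granularity != alignment) {
		end_sect = (end_sect - alignment) / granularity * granularity
			   + alignment;
		req_sects = end_sect - sector;
	}

	return (unsigned int)req_sects;
}
```

With granularity 8 and alignment 0, a 20000000-sector discard starting at
sector 0 is first capped at 8388607 sectors and then trimmed to 8388600, so
the following bio starts on a granularity boundary. A request that fits in a
single bio is passed through unchanged.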



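The size-cap change mentioned in the commit message can be checked with a
few lines of userspace C. This is only a sketch: OLD_MAX_BIO_SECTORS mirrors
the removed MAX_BIO_SECTORS macro, while NEW_MAX_BIO_SECTORS and
fits_bi_size() are names made up here, and a 32-bit unsigned int is assumed
(bi_size is an unsigned int counted in bytes).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(UINT_MAX == 0xffffffffu, "sketch assumes 32-bit unsigned int");

#define OLD_MAX_BIO_SECTORS	((1U << 31) >> 9)	/* 2 GiB cap removed by the patch */
#define NEW_MAX_BIO_SECTORS	(UINT_MAX >> 9)		/* cap used by the patch */

/* Byte size of a bio covering the given number of 512-byte sectors. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
	return sectors << 9;
}

/* Nonzero if the byte size can be stored in a 32-bit bi_size. */
static int fits_bi_size(uint64_t sectors)
{
	return sectors_to_bytes(sectors) <= UINT_MAX;
}
```

The old cap works out to 4194304 sectors (exactly 2 GiB) and the new one to
8388607 sectors, one sector short of 4 GiB, which is the largest count that
fits_bi_size() still accepts. The new cap is not a power of two, which is
why the end of a split request has to be realigned explicitly.
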

* Re: block: re-add discard_granularity and alignment checks
  2015-10-22 16:59 [PATCH] block: re-add discard_granularity and alignment checks Ming Lin
@ 2015-10-22 17:03 ` Mike Snitzer
  2015-10-27 21:23   ` Ming Lin
  0 siblings, 1 reply; 4+ messages in thread
From: Mike Snitzer @ 2015-10-22 17:03 UTC (permalink / raw)
  To: Ming Lin
  Cc: linux-kernel, Christoph Hellwig, Jens Axboe, Martin K. Petersen,
	Kent Overstreet

On Thu, Oct 22 2015 at 12:59pm -0400,
Ming Lin <mlin@kernel.org> wrote:

> From: Ming Lin <ming.l@ssi.samsung.com>
> 
> In commit b49a087("block: remove split code in
> blkdev_issue_{discard,write_same}"), discard_granularity and alignment
> checks were removed. Ideally, with bio late splitting, the upper layers
> shouldn't need to depend on device's limits.
> 
> Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe
> device when mkfs.xfs. We have not found the root cause yet.
> 
> This patch re-adds discard_granularity and alignment checks by reverting
> the related changes in commit b49a087. The good thing is now we can
> remove the 2G discard size cap and just use UINT_MAX to avoid bi_size
> overflow.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Tested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>

Reviewed-by: Mike Snitzer <snitzer@redhat.com>

Thanks!


* Re: block: re-add discard_granularity and alignment checks
  2015-10-22 17:03 ` Mike Snitzer
@ 2015-10-27 21:23   ` Ming Lin
  2015-10-28  0:13     ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Ming Lin @ 2015-10-27 21:23 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: lkml, Christoph Hellwig, Jens Axboe, Martin K. Petersen, Kent Overstreet

On Thu, Oct 22, 2015 at 10:03 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> On Thu, Oct 22 2015 at 12:59pm -0400,
> Ming Lin <mlin@kernel.org> wrote:
>
>> From: Ming Lin <ming.l@ssi.samsung.com>
>>
>> In commit b49a087("block: remove split code in
>> blkdev_issue_{discard,write_same}"), discard_granularity and alignment
>> checks were removed. Ideally, with bio late splitting, the upper layers
>> shouldn't need to depend on device's limits.
>>
>> Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe
>> device when mkfs.xfs. We have not found the root cause yet.
>>
>> This patch re-adds discard_granularity and alignment checks by reverting
>> the related changes in commit b49a087. The good thing is now we can
>> remove the 2G discard size cap and just use UINT_MAX to avoid bi_size
>> overflow.
>>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Tested-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
>
> Reviewed-by: Mike Snitzer <snitzer@redhat.com>

Hi Jens,

Would you please take this one?

>
> Thanks!


* Re: block: re-add discard_granularity and alignment checks
  2015-10-27 21:23   ` Ming Lin
@ 2015-10-28  0:13     ` Jens Axboe
  0 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2015-10-28  0:13 UTC (permalink / raw)
  To: Ming Lin, Mike Snitzer
  Cc: lkml, Christoph Hellwig, Martin K. Petersen, Kent Overstreet

On 10/28/2015 06:23 AM, Ming Lin wrote:
> On Thu, Oct 22, 2015 at 10:03 AM, Mike Snitzer <snitzer@redhat.com> wrote:
>> On Thu, Oct 22 2015 at 12:59pm -0400,
>> Ming Lin <mlin@kernel.org> wrote:
>>
>>> From: Ming Lin <ming.l@ssi.samsung.com>
>>>
>>> In commit b49a087("block: remove split code in
>>> blkdev_issue_{discard,write_same}"), discard_granularity and alignment
>>> checks were removed. Ideally, with bio late splitting, the upper layers
>>> shouldn't need to depend on device's limits.
>>>
>>> Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe
>>> device when mkfs.xfs. We have not found the root cause yet.
>>>
>>> This patch re-adds discard_granularity and alignment checks by reverting
>>> the related changes in commit b49a087. The good thing is now we can
>>> remove the 2G discard size cap and just use UINT_MAX to avoid bi_size
>>> overflow.
>>>
>>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>>> Tested-by: Christoph Hellwig <hch@lst.de>
>>> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
>>
>> Reviewed-by: Mike Snitzer <snitzer@redhat.com>
>
> Hi Jens,
>
> Would you please take this one?

I was going to add it for 4.4, but hch just pointed out that it's a
regression in this series. I'll send it in for 4.3.

-- 
Jens Axboe

