linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] zoned block device report zones enhancements
@ 2019-10-24  6:50 Damien Le Moal
  2019-10-24  6:50 ` [PATCH 1/4] block: Enhance blk_revalidate_disk_zones() Damien Le Moal
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Damien Le Moal @ 2019-10-24  6:50 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

This series of patches improve the handling and execution of report
zones operations for zoned block devices.

The first patch enhances device revalidation by moving zone information
checks from the low level driver into the block layer. The second patch
remove some unnecessary code. The last two patches introduce generic
allocation of report zones command buffer, further enhancing zoned disk
revalidation.

As always, comments are welcome.

Damien Le Moal (4):
  block: Enhance blk_revalidate_disk_zones()
  block: Simplify report zones execution
  block: Introduce report zones queue limits
  block: Generically handle report zones buffer

 block/blk-settings.c           |   3 +
 block/blk-zoned.c              | 178 +++++++++++++++++----------
 drivers/block/null_blk.h       |   6 +-
 drivers/block/null_blk_zoned.c |   3 +-
 drivers/md/dm.c                |   9 +-
 drivers/scsi/sd.h              |   3 +-
 drivers/scsi/sd_zbc.c          | 212 +++++++++------------------------
 include/linux/blkdev.h         |  12 +-
 8 files changed, 193 insertions(+), 233 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] block: Enhance blk_revalidate_disk_zones()
  2019-10-24  6:50 [PATCH 0/4] zoned block device report zones enhancements Damien Le Moal
@ 2019-10-24  6:50 ` Damien Le Moal
  2019-10-24  7:08   ` Christoph Hellwig
  2019-10-24  7:20   ` Keith Busch
  2019-10-24  6:50 ` [PATCH 2/4] block: Simplify report zones execution Damien Le Moal
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 11+ messages in thread
From: Damien Le Moal @ 2019-10-24  6:50 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

For ZBC and ZAC zoned devices, the scsi driver revalidation processing
implemented by sd_revalidate_disk() includes a call to
sd_zbc_read_zones() which executes a full disk zone report used to
check that all zones of the disk are the same size. This processing is
followed by a call to blk_revalidate_disk_zones(), used to initialize
the device request queue zone bitmaps (zone type and zone write lock
bitmaps). To do so, blk_revalidate_disk_zones() also executes a full
device zone report to obtain zone types. As a result, the entire
zoned block device revalidation process includes two full device zone
report.

By moving the zone size checks into blk_revalidate_disk_zones(), this
process can be optimized to a single full device zone report, leading to
shorter device scan and revalidation times. This patch implements this
optimization, reducing the original full device zone report implemented
in sd_zbc_check_zones() to a single, small, report zones command
execution to obtain the size of the first zone of the device. Checks
whether all zones of the device are the same size as the first zone
size are moved to the generic blk_check_zone() function called from
blk_revalidate_disk_zones().

This optimization also has the following benefits:
1) fewer memory allocations in the scsi layer during disk revalidation
   as the potentailly large buffer for zone report execution is not
   needed.
2) Implement zone checks in a generic manner, reducing the burden on
   device driver which only need to obtain the zone size and check that
   this size is a power of 2 number of LBAs. Any new type of zoned
   block device will benefit from this.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-zoned.c     |  61 +++++++++++++++++++++++-
 drivers/scsi/sd_zbc.c | 107 ++++++++----------------------------------
 2 files changed, 79 insertions(+), 89 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 4bc5f260248a..293891b7068a 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -441,6 +441,57 @@ void blk_queue_free_zone_bitmaps(struct request_queue *q)
 	q->seq_zones_wlock = NULL;
 }
 
+/**
+ * blk_check_zone - Check a zone information
+ * @q:		request queue
+ * @zone:	the zone to  check
+ * @sector:	start sector of the zone
+ *
+ * Helper function to check zones of a zoned block device. Returns true if the
+ * zone is correct and false if a problem is detected.
+ */
+static bool blk_check_zone(struct gendisk *disk, struct blk_zone *zone,
+			   sector_t *sector)
+{
+	struct request_queue *q = disk->queue;
+	sector_t zone_sectors = blk_queue_zone_sectors(q);
+	sector_t capacity = get_capacity(disk);
+
+	/*
+	 * All zones must have the same size, with the exception on an eventual
+	 * smaller last zone.
+	 */
+	if (zone->start + zone_sectors < capacity &&
+	    zone->len != zone_sectors) {
+		pr_warn("%s: Invalid zone device with non constant zone size\n",
+			disk->disk_name);
+		return false;
+	}
+
+	/* Check for holes in the zone report */
+	if (zone->start != *sector) {
+		pr_warn("%s: Zone gap at sectors %llu..%llu\n",
+			disk->disk_name, *sector, zone->start);
+		return false;
+	}
+
+	/* Check zone type */
+	switch (zone->type) {
+	case BLK_ZONE_TYPE_CONVENTIONAL:
+	case BLK_ZONE_TYPE_SEQWRITE_REQ:
+	case BLK_ZONE_TYPE_SEQWRITE_PREF:
+		break;
+	default:
+		pr_warn("%s: Invalid zone type 0x%x at sectors %llu\n",
+			disk->disk_name, (int)zone->type, zone->start);
+		return false;
+	}
+
+	*sector += zone->len;
+
+	return true;
+}
+
 /**
  * blk_revalidate_disk_zones - (re)allocate and initialize zone bitmaps
  * @disk:	Target disk
@@ -490,7 +541,10 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	if (!seq_zones_bitmap)
 		goto out;
 
-	/* Get zone information and initialize seq_zones_bitmap */
+	/*
+	 * Get zone information to check the zones and initialize
+	 * seq_zones_bitmap.
+	 */
 	rep_nr_zones = nr_zones;
 	zones = blk_alloc_zones(&rep_nr_zones);
 	if (!zones)
@@ -504,11 +558,14 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 		if (!nrz)
 			break;
 		for (i = 0; i < nrz; i++) {
+			if (!blk_check_zone(disk, &zones[i], &sector)) {
+				ret = -ENODEV;
+				goto out;
+			}
 			if (zones[i].type != BLK_ZONE_TYPE_CONVENTIONAL)
 				set_bit(z, seq_zones_bitmap);
 			z++;
 		}
-		sector += nrz * blk_queue_zone_sectors(q);
 	}
 
 	if (WARN_ON(z != nr_zones)) {
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index de4019dc0f0b..fbec99db6124 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -344,32 +344,19 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp,
  * Returns the zone size in number of blocks upon success or an error code
  * upon failure.
  */
-static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)
+static int sd_zbc_check_zones(struct scsi_disk *sdkp, unsigned char *buf,
+			      u32 *zblocks)
 {
-	size_t bufsize, buflen;
-	unsigned int noio_flag;
+	size_t buflen;
 	u64 zone_blocks = 0;
-	sector_t max_lba, block = 0;
-	unsigned char *buf;
+	sector_t max_lba;
 	unsigned char *rec;
 	int ret;
-	u8 same;
-
-	/* Do all memory allocations as if GFP_NOIO was specified */
-	noio_flag = memalloc_noio_save();
 
-	/* Get a buffer */
-	buf = sd_zbc_alloc_report_buffer(sdkp, SD_ZBC_REPORT_MAX_ZONES,
-					 &bufsize);
-	if (!buf) {
-		ret = -ENOMEM;
-		goto out;
-	}
-
-	/* Do a report zone to get max_lba and the same field */
-	ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, 0, false);
+	/* Do a report zone to get max_lba and the size of the first zone */
+	ret = sd_zbc_do_report_zones(sdkp, buf, SD_BUF_SIZE, 0, false);
 	if (ret)
-		goto out_free;
+		return ret;
 
 	if (sdkp->rc_basis == 0) {
 		/* The max_lba field is the capacity of this device */
@@ -384,82 +371,28 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)
 		}
 	}
 
-	/*
-	 * Check same field: for any value other than 0, we know that all zones
-	 * have the same size.
-	 */
-	same = buf[4] & 0x0f;
-	if (same > 0) {
-		rec = &buf[64];
-		zone_blocks = get_unaligned_be64(&rec[8]);
-		goto out;
-	}
-
-	/*
-	 * Check the size of all zones: all zones must be of
-	 * equal size, except the last zone which can be smaller
-	 * than other zones.
-	 */
-	do {
-
-		/* Parse REPORT ZONES header */
-		buflen = min_t(size_t, get_unaligned_be32(&buf[0]) + 64,
-			       bufsize);
-		rec = buf + 64;
-
-		/* Parse zone descriptors */
-		while (rec < buf + buflen) {
-			u64 this_zone_blocks = get_unaligned_be64(&rec[8]);
-
-			if (zone_blocks == 0) {
-				zone_blocks = this_zone_blocks;
-			} else if (this_zone_blocks != zone_blocks &&
-				   (block + this_zone_blocks < sdkp->capacity
-				    || this_zone_blocks > zone_blocks)) {
-				zone_blocks = 0;
-				goto out;
-			}
-			block += this_zone_blocks;
-			rec += 64;
-		}
-
-		if (block < sdkp->capacity) {
-			ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, block,
-						     true);
-			if (ret)
-				goto out_free;
-		}
-
-	} while (block < sdkp->capacity);
-
-out:
-	if (!zone_blocks) {
-		if (sdkp->first_scan)
-			sd_printk(KERN_NOTICE, sdkp,
-				  "Devices with non constant zone "
-				  "size are not supported\n");
-		ret = -ENODEV;
-	} else if (!is_power_of_2(zone_blocks)) {
+	/* Parse REPORT ZONES header */
+	buflen = min_t(size_t, get_unaligned_be32(&buf[0]) + 64, SD_BUF_SIZE);
+	rec = buf + 64;
+	zone_blocks = get_unaligned_be64(&rec[8]);
+	if (!zone_blocks || !is_power_of_2(zone_blocks)) {
 		if (sdkp->first_scan)
 			sd_printk(KERN_NOTICE, sdkp,
 				  "Devices with non power of 2 zone "
 				  "size are not supported\n");
-		ret = -ENODEV;
-	} else if (logical_to_sectors(sdkp->device, zone_blocks) > UINT_MAX) {
+		return -ENODEV;
+	}
+
+	if (logical_to_sectors(sdkp->device, zone_blocks) > UINT_MAX) {
 		if (sdkp->first_scan)
 			sd_printk(KERN_NOTICE, sdkp,
 				  "Zone size too large\n");
-		ret = -EFBIG;
-	} else {
-		*zblocks = zone_blocks;
-		ret = 0;
+		return -EFBIG;
 	}
 
-out_free:
-	memalloc_noio_restore(noio_flag);
-	kvfree(buf);
+	*zblocks = zone_blocks;
 
-	return ret;
+	return 0;
 }
 
 int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
@@ -485,7 +418,7 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	 * Check zone size: only devices with a constant zone size (except
 	 * an eventual last runt zone) that is a power of 2 are supported.
 	 */
-	ret = sd_zbc_check_zones(sdkp, &zone_blocks);
+	ret = sd_zbc_check_zones(sdkp, buf, &zone_blocks);
 	if (ret != 0)
 		goto err;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] block: Simplify report zones execution
  2019-10-24  6:50 [PATCH 0/4] zoned block device report zones enhancements Damien Le Moal
  2019-10-24  6:50 ` [PATCH 1/4] block: Enhance blk_revalidate_disk_zones() Damien Le Moal
@ 2019-10-24  6:50 ` Damien Le Moal
  2019-10-24  7:10   ` Christoph Hellwig
  2019-10-25  5:17   ` Javier González
  2019-10-24  6:50 ` [PATCH 3/4] block: Introduce report zones queue limits Damien Le Moal
  2019-10-24  6:50 ` [PATCH 4/4] block: Generically handle report zones buffer Damien Le Moal
  3 siblings, 2 replies; 11+ messages in thread
From: Damien Le Moal @ 2019-10-24  6:50 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

All kernel users of blkdev_report_zones() as well as applications use
through ioctl(BLKZONEREPORT) expect to potentially get less zone
descriptors than requested. As such, the use of the internal report
zones command execution loop implemented by blk_report_zones() is
not necessary and can even be harmful to performance by causing the
execution of inefficient small zones report command to service the
reminder of a requested zone array.

This patch removes blk_report_zones(), simplifying the code. Also
remove a now incorrect comment in dm_blk_report_zones().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-zoned.c | 34 +++++-----------------------------
 drivers/md/dm.c   |  6 ------
 2 files changed, 5 insertions(+), 35 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 293891b7068a..43bfd1be0985 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -119,31 +119,6 @@ static bool blkdev_report_zone(struct block_device *bdev, struct blk_zone *rep)
 	return true;
 }
 
-static int blk_report_zones(struct gendisk *disk, sector_t sector,
-			    struct blk_zone *zones, unsigned int *nr_zones)
-{
-	struct request_queue *q = disk->queue;
-	unsigned int z = 0, n, nrz = *nr_zones;
-	sector_t capacity = get_capacity(disk);
-	int ret;
-
-	while (z < nrz && sector < capacity) {
-		n = nrz - z;
-		ret = disk->fops->report_zones(disk, sector, &zones[z], &n);
-		if (ret)
-			return ret;
-		if (!n)
-			break;
-		sector += blk_queue_zone_sectors(q) * n;
-		z += n;
-	}
-
-	WARN_ON(z > *nr_zones);
-	*nr_zones = z;
-
-	return 0;
-}
-
 /**
  * blkdev_report_zones - Get zones information
  * @bdev:	Target block device
@@ -164,6 +139,7 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 			struct blk_zone *zones, unsigned int *nr_zones)
 {
 	struct request_queue *q = bdev_get_queue(bdev);
+	struct gendisk *disk = bdev->bd_disk;
 	unsigned int i, nrz;
 	int ret;
 
@@ -175,7 +151,7 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 	 * report_zones method. If it does not have one defined, the device
 	 * driver has a bug. So warn about that.
 	 */
-	if (WARN_ON_ONCE(!bdev->bd_disk->fops->report_zones))
+	if (WARN_ON_ONCE(!disk->fops->report_zones))
 		return -EOPNOTSUPP;
 
 	if (!*nr_zones || sector >= bdev->bd_part->nr_sects) {
@@ -185,8 +161,8 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 
 	nrz = min(*nr_zones,
 		  __blkdev_nr_zones(q, bdev->bd_part->nr_sects - sector));
-	ret = blk_report_zones(bdev->bd_disk, get_start_sect(bdev) + sector,
-			       zones, &nrz);
+	ret = disk->fops->report_zones(disk, get_start_sect(bdev) + sector,
+				       zones, &nrz);
 	if (ret)
 		return ret;
 
@@ -552,7 +528,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 
 	while (z < nr_zones) {
 		nrz = min(nr_zones - z, rep_nr_zones);
-		ret = blk_report_zones(disk, sector, zones, &nrz);
+		ret = disk->fops->report_zones(disk, sector, zones, &nrz);
 		if (ret)
 			goto out;
 		if (!nrz)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1a5e328c443a..647aa5b0233b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -473,12 +473,6 @@ static int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
 		goto out;
 	}
 
-	/*
-	 * blkdev_report_zones() will loop and call this again to cover all the
-	 * zones of the target, eventually moving on to the next target.
-	 * So there is no need to loop here trying to fill the entire array
-	 * of zones.
-	 */
 	ret = tgt->type->report_zones(tgt, sector, zones, nr_zones);
 
 out:
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] block: Introduce report zones queue limits
  2019-10-24  6:50 [PATCH 0/4] zoned block device report zones enhancements Damien Le Moal
  2019-10-24  6:50 ` [PATCH 1/4] block: Enhance blk_revalidate_disk_zones() Damien Le Moal
  2019-10-24  6:50 ` [PATCH 2/4] block: Simplify report zones execution Damien Le Moal
@ 2019-10-24  6:50 ` Damien Le Moal
  2019-10-24  6:50 ` [PATCH 4/4] block: Generically handle report zones buffer Damien Le Moal
  3 siblings, 0 replies; 11+ messages in thread
From: Damien Le Moal @ 2019-10-24  6:50 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

In preparation for a generic report zones command buffer allocation to
the block layer, introduce three new request queue limits describing the
device zone descriptor size (zone_descriptor_size limit), the needed
granularity of the report zones command buffer size
(zones_report_granularity limit) and the maximum size of a report zone
command (max_zones_report_size limit).

For scsi, set these values respectively to 64 bytes, SECTOR_SIZE and
the maximum transfer size used for regular read/write commands limited
by the maximum number of pages (segments) that the hardware can map.
This removes the need for the "magic" limit implemented with the macro
SD_ZBC_REPORT_MAX_ZONES.

For the null_blk driver and dm targets, the default value of 0 is used
for these limits, indicating that these zoned devices do not need a
buffer for the execution of report zones.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-settings.c   |  3 +++
 drivers/scsi/sd_zbc.c  | 48 +++++++++++++++++++++---------------------
 include/linux/blkdev.h |  4 ++++
 3 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 5f6dcc7a47bd..674cfc428334 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -53,6 +53,9 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->discard_granularity = 0;
 	lim->discard_alignment = 0;
 	lim->discard_misaligned = 0;
+	lim->zone_descriptor_size = 0;
+	lim->zones_report_granularity = 0;
+	lim->max_zones_report_size = 0;
 	lim->logical_block_size = lim->physical_block_size = lim->io_min = 512;
 	lim->bounce_pfn = (unsigned long)(BLK_BOUNCE_ANY >> PAGE_SHIFT);
 	lim->alignment_offset = 0;
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index fbec99db6124..8dc96f4ea920 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -104,11 +104,6 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }
 
-/*
- * Maximum number of zones to get with one report zones command.
- */
-#define SD_ZBC_REPORT_MAX_ZONES		8192U
-
 /**
  * Allocate a buffer for report zones reply.
  * @sdkp: The target disk
@@ -129,21 +124,8 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
 	size_t bufsize;
 	void *buf;
 
-	/*
-	 * Report zone buffer size should be at most 64B times the number of
-	 * zones requested plus the 64B reply header, but should be at least
-	 * SECTOR_SIZE for ATA devices.
-	 * Make sure that this size does not exceed the hardware capabilities.
-	 * Furthermore, since the report zone command cannot be split, make
-	 * sure that the allocated buffer can always be mapped by limiting the
-	 * number of pages allocated to the HBA max segments limit.
-	 */
-	nr_zones = min(nr_zones, SD_ZBC_REPORT_MAX_ZONES);
-	bufsize = roundup((nr_zones + 1) * 64, 512);
-	bufsize = min_t(size_t, bufsize,
-			queue_max_hw_sectors(q) << SECTOR_SHIFT);
-	bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
-
+	bufsize = min_t(size_t, roundup(nr_zones * 64, SECTOR_SIZE),
+			q->limits.max_zones_report_size);
 	buf = vzalloc(bufsize);
 	if (buf)
 		*buflen = bufsize;
@@ -398,6 +380,8 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, unsigned char *buf,
 int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 {
 	struct gendisk *disk = sdkp->disk;
+	struct request_queue *q = disk->queue;
+	unsigned int max_zones_report_size;
 	unsigned int nr_zones;
 	u32 zone_blocks = 0;
 	int ret;
@@ -423,13 +407,29 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 		goto err;
 
 	/* The drive satisfies the kernel restrictions: set it up */
-	blk_queue_chunk_sectors(sdkp->disk->queue,
+	blk_queue_chunk_sectors(q,
 			logical_to_sectors(sdkp->device, zone_blocks));
-	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, sdkp->disk->queue);
-	blk_queue_required_elevator_features(sdkp->disk->queue,
-					     ELEVATOR_F_ZBD_SEQ_WRITE);
+	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
+	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
 	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
 
+	/*
+	 * Zone descriptors are 64 bytes. A report zone buffer size should be
+	 * at most 64B times the number of zones of the device plus a 64B reply
+	 * header and should be at least be SECTOR_SIZE bytes for ATA devices.
+	 * Make sure that this maximum buffer size does not exceed the hardware
+	 * capabilities in terms of maximum data transfer size. Furthermore,
+	 * make sure that the allocated buffer can always be mapped by limiting
+	 * the number of pages of the buffer to the device max segments limit.
+	 */
+	q->limits.zone_descriptor_size = 64;
+	q->limits.zones_report_granularity = SECTOR_SIZE;
+	max_zones_report_size = min(roundup((nr_zones + 1) * 64, SECTOR_SIZE),
+				    queue_max_hw_sectors(q) << SECTOR_SHIFT);
+	q->limits.max_zones_report_size =
+		min(max_zones_report_size,
+		    (unsigned int)queue_max_segments(q) << PAGE_SHIFT);
+
 	/* READ16/WRITE16 is mandatory for ZBC disks */
 	sdkp->device->use_16_for_rw = 1;
 	sdkp->device->use_10_for_rw = 0;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f3ea78b0c91c..1c76d71fc232 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -338,6 +338,10 @@ struct queue_limits {
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
 
+	unsigned int		zone_descriptor_size;
+	unsigned int		zones_report_granularity;
+	unsigned int		max_zones_report_size;
+
 	unsigned short		logical_block_size;
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] block: Generically handle report zones buffer
  2019-10-24  6:50 [PATCH 0/4] zoned block device report zones enhancements Damien Le Moal
                   ` (2 preceding siblings ...)
  2019-10-24  6:50 ` [PATCH 3/4] block: Introduce report zones queue limits Damien Le Moal
@ 2019-10-24  6:50 ` Damien Le Moal
  2019-10-24  7:13   ` Christoph Hellwig
  3 siblings, 1 reply; 11+ messages in thread
From: Damien Le Moal @ 2019-10-24  6:50 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

Instead of relying on a zoned block device driver to allocate a buffer
for every execution of a report zones command execution, rely on the
block layer use of the device zone report queue limits to allocate a
buffer and keep it around when the device report_zones method is
executed in a loop, e.g.  as in blk_revalidate_disk_zones().

This simplifies the code in the scsi sd_zbc driver as well as simplify
the addition of zone supports for upcoming new zoned device drivers.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-zoned.c              | 99 ++++++++++++++++++++--------------
 drivers/block/null_blk.h       |  6 ++-
 drivers/block/null_blk_zoned.c |  3 +-
 drivers/md/dm.c                |  3 +-
 drivers/scsi/sd.h              |  3 +-
 drivers/scsi/sd_zbc.c          | 61 ++++++---------------
 include/linux/blkdev.h         |  8 +--
 7 files changed, 88 insertions(+), 95 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 43bfd1be0985..6bddaa505df0 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -97,6 +97,29 @@ unsigned int blkdev_nr_zones(struct block_device *bdev)
 }
 EXPORT_SYMBOL_GPL(blkdev_nr_zones);
 
+/*
+ * Allocate a buffer to execute report zones.
+ */
+static void *blk_alloc_report_buffer(struct request_queue *q,
+				     unsigned int *nr_zones, size_t *buflen)
+{
+	unsigned int nrz = *nr_zones;
+	size_t bufsize = nrz * q->limits.zone_descriptor_size;
+	void *buf;
+
+	if (q->limits.zones_report_granularity)
+		bufsize = roundup(bufsize, q->limits.zones_report_granularity);
+	bufsize = min_t(size_t, bufsize, q->limits.max_zones_report_size);
+	buf = vzalloc(bufsize);
+	if (buf) {
+		*buflen = bufsize;
+		*nr_zones = min_t(unsigned int, nrz,
+				  bufsize / q->limits.zone_descriptor_size);
+	}
+
+	return buf;
+}
+
 /*
  * Check that a zone report belongs to this partition, and if yes, fix its start
  * sector and write pointer and return true. Return false otherwise.
@@ -140,7 +163,10 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 {
 	struct request_queue *q = bdev_get_queue(bdev);
 	struct gendisk *disk = bdev->bd_disk;
-	unsigned int i, nrz;
+	unsigned int i, nrz = *nr_zones;
+	sector_t capacity = bdev->bd_part->nr_sects;
+	size_t buflen = 0;
+	void *buf = NULL;
 	int ret;
 
 	if (!blk_queue_is_zoned(q))
@@ -154,27 +180,33 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 	if (WARN_ON_ONCE(!disk->fops->report_zones))
 		return -EOPNOTSUPP;
 
-	if (!*nr_zones || sector >= bdev->bd_part->nr_sects) {
+	if (!nrz || sector >= capacity) {
 		*nr_zones = 0;
 		return 0;
 	}
 
-	nrz = min(*nr_zones,
-		  __blkdev_nr_zones(q, bdev->bd_part->nr_sects - sector));
-	ret = disk->fops->report_zones(disk, get_start_sect(bdev) + sector,
-				       zones, &nrz);
+	nrz = min(nrz, __blkdev_nr_zones(q, capacity - sector));
+	if (q->limits.zone_descriptor_size) {
+		buf = blk_alloc_report_buffer(q, &nrz, &buflen);
+		if (!buf)
+			return -ENOMEM;
+	}
+
+	ret = disk->fops->report_zones(disk, sector, zones, &nrz, buf, buflen);
 	if (ret)
-		return ret;
+		goto out;
 
 	for (i = 0; i < nrz; i++) {
 		if (!blkdev_report_zone(bdev, zones))
 			break;
 		zones++;
 	}
-
 	*nr_zones = i;
 
-	return 0;
+out:
+	kvfree(buf);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(blkdev_report_zones);
 
@@ -384,31 +416,6 @@ static inline unsigned long *blk_alloc_zone_bitmap(int node,
 			    GFP_NOIO, node);
 }
 
-/*
- * Allocate an array of struct blk_zone to get nr_zones zone information.
- * The allocated array may be smaller than nr_zones.
- */
-static struct blk_zone *blk_alloc_zones(unsigned int *nr_zones)
-{
-	struct blk_zone *zones;
-	size_t nrz = min(*nr_zones, BLK_ZONED_REPORT_MAX_ZONES);
-
-	/*
-	 * GFP_KERNEL here is meaningless as the caller task context has
-	 * the PF_MEMALLOC_NOIO flag set in blk_revalidate_disk_zones()
-	 * with memalloc_noio_save().
-	 */
-	zones = kvcalloc(nrz, sizeof(struct blk_zone), GFP_KERNEL);
-	if (!zones) {
-		*nr_zones = 0;
-		return NULL;
-	}
-
-	*nr_zones = nrz;
-
-	return zones;
-}
-
 void blk_queue_free_zone_bitmaps(struct request_queue *q)
 {
 	kfree(q->seq_zones_bitmap);
@@ -482,10 +489,12 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	struct request_queue *q = disk->queue;
 	unsigned int nr_zones = __blkdev_nr_zones(q, get_capacity(disk));
 	unsigned long *seq_zones_wlock = NULL, *seq_zones_bitmap = NULL;
-	unsigned int i, rep_nr_zones = 0, z = 0, nrz;
+	unsigned int i, rep_nr_zones, z = 0, nrz;
 	struct blk_zone *zones = NULL;
 	unsigned int noio_flag;
 	sector_t sector = 0;
+	size_t buflen = 0;
+	void *buf = NULL;
 	int ret = 0;
 
 	/*
@@ -518,17 +527,28 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 		goto out;
 
 	/*
-	 * Get zone information to check the zones and initialize
-	 * seq_zones_bitmap.
+	 * Allocate a report buffer for the driver execution of report zones
+	 * and an array of zones to get the report back.
 	 */
 	rep_nr_zones = nr_zones;
-	zones = blk_alloc_zones(&rep_nr_zones);
+	if (q->limits.zone_descriptor_size) {
+		buf = blk_alloc_report_buffer(q, &rep_nr_zones, &buflen);
+		if (!buf)
+			goto out;
+	}
+
+	zones = kvcalloc(rep_nr_zones, sizeof(struct blk_zone), GFP_KERNEL);
 	if (!zones)
 		goto out;
 
+	/*
+	 * Get zone information to check the zones and initialize
+	 * seq_zones_bitmap.
+	 */
 	while (z < nr_zones) {
 		nrz = min(nr_zones - z, rep_nr_zones);
-		ret = disk->fops->report_zones(disk, sector, zones, &nrz);
+		ret = disk->fops->report_zones(disk, sector, zones, &nrz,
+					       buf, buflen);
 		if (ret)
 			goto out;
 		if (!nrz)
@@ -565,6 +585,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	memalloc_noio_restore(noio_flag);
 
 	kvfree(zones);
+	kvfree(buf);
 	kfree(seq_zones_wlock);
 	kfree(seq_zones_bitmap);
 
diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h
index 93c2a3d403da..6bd0482ec683 100644
--- a/drivers/block/null_blk.h
+++ b/drivers/block/null_blk.h
@@ -92,7 +92,8 @@ struct nullb {
 int null_zone_init(struct nullb_device *dev);
 void null_zone_exit(struct nullb_device *dev);
 int null_zone_report(struct gendisk *disk, sector_t sector,
-		     struct blk_zone *zones, unsigned int *nr_zones);
+		     struct blk_zone *zones, unsigned int *nr_zones,
+		     void *buf, size_t buflen);
 blk_status_t null_handle_zoned(struct nullb_cmd *cmd,
 				enum req_opf op, sector_t sector,
 				sector_t nr_sectors);
@@ -107,7 +108,8 @@ static inline int null_zone_init(struct nullb_device *dev)
 static inline void null_zone_exit(struct nullb_device *dev) {}
 static inline int null_zone_report(struct gendisk *disk, sector_t sector,
 				   struct blk_zone *zones,
-				   unsigned int *nr_zones)
+				   unsigned int *nr_zones,
+				   void *buf, size_t buflen)
 {
 	return -EOPNOTSUPP;
 }
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
index 4e56b17ed3ef..446e083be240 100644
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -67,7 +67,8 @@ void null_zone_exit(struct nullb_device *dev)
 }
 
 int null_zone_report(struct gendisk *disk, sector_t sector,
-		     struct blk_zone *zones, unsigned int *nr_zones)
+		     struct blk_zone *zones, unsigned int *nr_zones,
+		     void *buf, size_t buflen)
 {
 	struct nullb *nullb = disk->private_data;
 	struct nullb_device *dev = nullb->dev;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 647aa5b0233b..5d5a297ceeb1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -441,7 +441,8 @@ static int dm_blk_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 }
 
 static int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
-			       struct blk_zone *zones, unsigned int *nr_zones)
+			       struct blk_zone *zones, unsigned int *nr_zones,
+			       void *buf, size_t buflen)
 {
 #ifdef CONFIG_BLK_DEV_ZONED
 	struct mapped_device *md = disk->private_data;
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 1eab779f812b..b948656b6882 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -213,7 +213,8 @@ extern blk_status_t sd_zbc_setup_reset_cmnd(struct scsi_cmnd *cmd, bool all);
 extern void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
 			    struct scsi_sense_hdr *sshdr);
 extern int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
-			       struct blk_zone *zones, unsigned int *nr_zones);
+			       struct blk_zone *zones, unsigned int *nr_zones,
+			       void *buf, size_t buflen);
 
 #else /* CONFIG_BLK_DEV_ZONED */
 
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 8dc96f4ea920..228522c4338f 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -104,35 +104,6 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }
 
-/**
- * Allocate a buffer for report zones reply.
- * @sdkp: The target disk
- * @nr_zones: Maximum number of zones to report
- * @buflen: Size of the buffer allocated
- *
- * Try to allocate a reply buffer for the number of requested zones.
- * The size of the buffer allocated may be smaller than requested to
- * satify the device constraint (max_hw_sectors, max_segments, etc).
- *
- * Return the address of the allocated buffer and update @buflen with
- * the size of the allocated buffer.
- */
-static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
-					unsigned int nr_zones, size_t *buflen)
-{
-	struct request_queue *q = sdkp->disk->queue;
-	size_t bufsize;
-	void *buf;
-
-	bufsize = min_t(size_t, roundup(nr_zones * 64, SECTOR_SIZE),
-			q->limits.max_zones_report_size);
-	buf = vzalloc(bufsize);
-	if (buf)
-		*buflen = bufsize;
-
-	return buf;
-}
-
 /**
  * sd_zbc_report_zones - Disk report zones operation.
  * @disk: The target disk
@@ -143,40 +114,40 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
  * Execute a report zones command on the target disk.
  */
 int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
-			struct blk_zone *zones, unsigned int *nr_zones)
+			struct blk_zone *zones, unsigned int *nr_zones,
+			void *buf, size_t buflen)
 {
 	struct scsi_disk *sdkp = scsi_disk(disk);
 	unsigned int i, nrz = *nr_zones;
-	unsigned char *buf;
-	size_t buflen = 0, offset = 0;
-	int ret = 0;
+	unsigned char *rep_buf = buf;
+	size_t offset = 0;
+	int ret;
 
 	if (!sd_is_zoned(sdkp))
 		/* Not a zoned device */
 		return -EOPNOTSUPP;
 
-	buf = sd_zbc_alloc_report_buffer(sdkp, nrz, &buflen);
-	if (!buf)
-		return -ENOMEM;
-
-	ret = sd_zbc_do_report_zones(sdkp, buf, buflen,
+	/*
+	 * The buffer prepared by the block layer may be too large for the
+	 * number of zones requested. Tune it here to avoid requesting too
+	 * many zones than necessary.
+	 */
+	buflen = min_t(size_t, roundup((nrz + 1) * 64, SECTOR_SIZE), buflen);
+	ret = sd_zbc_do_report_zones(sdkp, rep_buf, buflen,
 			sectors_to_logical(sdkp->device, sector), true);
 	if (ret)
-		goto out;
+		return ret;
 
-	nrz = min(nrz, get_unaligned_be32(&buf[0]) / 64);
+	nrz = min(nrz, get_unaligned_be32(&rep_buf[0]) / 64);
 	for (i = 0; i < nrz; i++) {
 		offset += 64;
-		sd_zbc_parse_report(sdkp, buf + offset, zones);
+		sd_zbc_parse_report(sdkp, rep_buf + offset, zones);
 		zones++;
 	}
 
 	*nr_zones = nrz;
 
-out:
-	kvfree(buf);
-
-	return ret;
+	return 0;
 }
 
 /**
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1c76d71fc232..f04927a7fb40 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -355,11 +355,6 @@ struct queue_limits {
 
 #ifdef CONFIG_BLK_DEV_ZONED
 
-/*
- * Maximum number of zones to report with a single report zones command.
- */
-#define BLK_ZONED_REPORT_MAX_ZONES	8192U
-
 extern unsigned int blkdev_nr_zones(struct block_device *bdev);
 extern int blkdev_report_zones(struct block_device *bdev,
 			       sector_t sector, struct blk_zone *zones,
@@ -1713,7 +1708,8 @@ struct block_device_operations {
 	/* this callback is with swap_lock and sometimes page table lock held */
 	void (*swap_slot_free_notify) (struct block_device *, unsigned long);
 	int (*report_zones)(struct gendisk *, sector_t sector,
-			    struct blk_zone *zones, unsigned int *nr_zones);
+			    struct blk_zone *zones, unsigned int *nr_zones,
+			    void *buf, size_t buflen);
 	struct module *owner;
 	const struct pr_ops *pr_ops;
 };
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] block: Enhance blk_revalidate_disk_zones()
  2019-10-24  6:50 ` [PATCH 1/4] block: Enhance blk_revalidate_disk_zones() Damien Le Moal
@ 2019-10-24  7:08   ` Christoph Hellwig
  2019-10-25  5:15     ` Damien Le Moal
  2019-10-24  7:20   ` Keith Busch
  1 sibling, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2019-10-24  7:08 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 4bc5f260248a..293891b7068a 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -441,6 +441,57 @@ void blk_queue_free_zone_bitmaps(struct request_queue *q)
>  	q->seq_zones_wlock = NULL;
>  }
>  
> +/**
> + * blk_check_zone - Check a zone information
> + * @q:		request queue
> + * @zone:	the zone to  check
> + * @sector:	start sector of the zone
> + *
> + * Helper function to check zones of a zoned block device. Returns true if the
> + * zone is correct and false if a problem is detected.
> + */
> +static bool blk_check_zone(struct gendisk *disk, struct blk_zone *zone,
> +			   sector_t *sector)

Maybe call this blk_zone_valid?  Also in many places we don't really
do the verbose kerneldoc comments with the duplicated parameter names
for local functions, as the scripts only pick up non-stack ones anyway.

> +	/*
> +	 * All zones must have the same size, with the exception on an eventual
> +	 * smaller last zone.
> +	 */
> +	if (zone->start + zone_sectors < capacity &&
> +	    zone->len != zone_sectors) {
> +		pr_warn("%s: Invalid zone device with non constant zone size\n",
> +			disk->disk_name);
> +		return false;
> +	}

I think we should also move the power of two zone size check here
instead of leaving it in the driver.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] block: Simplify report zones execution
  2019-10-24  6:50 ` [PATCH 2/4] block: Simplify report zones execution Damien Le Moal
@ 2019-10-24  7:10   ` Christoph Hellwig
  2019-10-25  5:17   ` Javier González
  1 sibling, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2019-10-24  7:10 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] block: Generically handle report zones buffer
  2019-10-24  6:50 ` [PATCH 4/4] block: Generically handle report zones buffer Damien Le Moal
@ 2019-10-24  7:13   ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2019-10-24  7:13 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

On Thu, Oct 24, 2019 at 03:50:06PM +0900, Damien Le Moal wrote:
> Instead of relying on a zoned block device driver to allocate a buffer
> for every execution of a report zones command execution, rely on the
> block layer use of the device zone report queue limits to allocate a
> buffer and keep it around when the device report_zones method is
> executed in a loop, e.g.  as in blk_revalidate_disk_zones().
> 
> This simplifies the code in the scsi sd_zbc driver as well as simplify
> the addition of zone supports for upcoming new zoned device drivers.

I wonder if we could just do away with the separate buffer entirely.
As the SCSI zone size (and also ATA even if we don't directly talk
to that) are intentionally the same size as the blk_zone (and the
same true is for the only upcoming standard I know of) we can just
rewrite each entry in-place(-ish) by reusing the same allocation.
Depending on the detailed formate we have to copy a field our two
onto the stack first, but it both avoids the extra allocation, and
the whole queue limits infrastructure in the previous patch and
should simply the code a lot.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] block: Enhance blk_revalidate_disk_zones()
  2019-10-24  6:50 ` [PATCH 1/4] block: Enhance blk_revalidate_disk_zones() Damien Le Moal
  2019-10-24  7:08   ` Christoph Hellwig
@ 2019-10-24  7:20   ` Keith Busch
  1 sibling, 0 replies; 11+ messages in thread
From: Keith Busch @ 2019-10-24  7:20 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

On Thu, Oct 24, 2019 at 03:50:03PM +0900, Damien Le Moal wrote:
> -	/* Do a report zone to get max_lba and the same field */
> -	ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, 0, false);
> +	/* Do a report zone to get max_lba and the size of the first zone */
> +	ret = sd_zbc_do_report_zones(sdkp, buf, SD_BUF_SIZE, 0, false);

This is no longer reading all the zones here, so you could set the
'partial' field to true. And then since this was the only caller that
had set partial to false, you can also remove that parameter.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] block: Enhance blk_revalidate_disk_zones()
  2019-10-24  7:08   ` Christoph Hellwig
@ 2019-10-25  5:15     ` Damien Le Moal
  0 siblings, 0 replies; 11+ messages in thread
From: Damien Le Moal @ 2019-10-25  5:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

On 2019/10/24 16:08, Christoph Hellwig wrote:
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 4bc5f260248a..293891b7068a 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -441,6 +441,57 @@ void blk_queue_free_zone_bitmaps(struct request_queue *q)
>>  	q->seq_zones_wlock = NULL;
>>  }
>>  
>> +/**
>> + * blk_check_zone - Check a zone information
>> + * @q:		request queue
>> + * @zone:	the zone to  check
>> + * @sector:	start sector of the zone
>> + *
>> + * Helper function to check zones of a zoned block device. Returns true if the
>> + * zone is correct and false if a problem is detected.
>> + */
>> +static bool blk_check_zone(struct gendisk *disk, struct blk_zone *zone,
>> +			   sector_t *sector)
> 
> Maybe call this blk_zone_valid?  Also in many places we don't really
> do the verbose kerneldoc comments with the duplicated parameter names
> for local functions, as the scripts only pick up non-stack ones anyway.
> 
>> +	/*
>> +	 * All zones must have the same size, with the exception on an eventual
>> +	 * smaller last zone.
>> +	 */
>> +	if (zone->start + zone_sectors < capacity &&
>> +	    zone->len != zone_sectors) {
>> +		pr_warn("%s: Invalid zone device with non constant zone size\n",
>> +			disk->disk_name);
>> +		return false;
>> +	}
> 
> I think we should also move the power of two zone size check here
> instead of leaving it in the driver.

We should not since the drivers calls blk_queue_chunk_sectors() first to
set the zone size and that function has a power of 2 check. So better
check that earlier than here.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] block: Simplify report zones execution
  2019-10-24  6:50 ` [PATCH 2/4] block: Simplify report zones execution Damien Le Moal
  2019-10-24  7:10   ` Christoph Hellwig
@ 2019-10-25  5:17   ` Javier González
  1 sibling, 0 replies; 11+ messages in thread
From: Javier González @ 2019-10-25  5:17 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	dm-devel, Mike Snitzer

On 24.10.2019 15:50, Damien Le Moal wrote:
>All kernel users of blkdev_report_zones() as well as applications use
>through ioctl(BLKZONEREPORT) expect to potentially get less zone
>descriptors than requested. As such, the use of the internal report
>zones command execution loop implemented by blk_report_zones() is
>not necessary and can even be harmful to performance by causing the
>execution of inefficient small zones report command to service the
>reminder of a requested zone array.
>
>This patch removes blk_report_zones(), simplifying the code. Also
>remove a now incorrect comment in dm_blk_report_zones().
>
>Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>---
> block/blk-zoned.c | 34 +++++-----------------------------
> drivers/md/dm.c   |  6 ------
> 2 files changed, 5 insertions(+), 35 deletions(-)
>
>diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>index 293891b7068a..43bfd1be0985 100644
>--- a/block/blk-zoned.c
>+++ b/block/blk-zoned.c
>@@ -119,31 +119,6 @@ static bool blkdev_report_zone(struct block_device *bdev, struct blk_zone *rep)
> 	return true;
> }
>
>-static int blk_report_zones(struct gendisk *disk, sector_t sector,
>-			    struct blk_zone *zones, unsigned int *nr_zones)
>-{
>-	struct request_queue *q = disk->queue;
>-	unsigned int z = 0, n, nrz = *nr_zones;
>-	sector_t capacity = get_capacity(disk);
>-	int ret;
>-
>-	while (z < nrz && sector < capacity) {
>-		n = nrz - z;
>-		ret = disk->fops->report_zones(disk, sector, &zones[z], &n);
>-		if (ret)
>-			return ret;
>-		if (!n)
>-			break;
>-		sector += blk_queue_zone_sectors(q) * n;
>-		z += n;
>-	}
>-
>-	WARN_ON(z > *nr_zones);
>-	*nr_zones = z;
>-
>-	return 0;
>-}
>-
> /**
>  * blkdev_report_zones - Get zones information
>  * @bdev:	Target block device
>@@ -164,6 +139,7 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
> 			struct blk_zone *zones, unsigned int *nr_zones)
> {
> 	struct request_queue *q = bdev_get_queue(bdev);
>+	struct gendisk *disk = bdev->bd_disk;
> 	unsigned int i, nrz;
> 	int ret;
>
>@@ -175,7 +151,7 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
> 	 * report_zones method. If it does not have one defined, the device
> 	 * driver has a bug. So warn about that.
> 	 */
>-	if (WARN_ON_ONCE(!bdev->bd_disk->fops->report_zones))
>+	if (WARN_ON_ONCE(!disk->fops->report_zones))
> 		return -EOPNOTSUPP;
>
> 	if (!*nr_zones || sector >= bdev->bd_part->nr_sects) {
>@@ -185,8 +161,8 @@ int blkdev_report_zones(struct block_device *bdev, sector_t sector,
>
> 	nrz = min(*nr_zones,
> 		  __blkdev_nr_zones(q, bdev->bd_part->nr_sects - sector));
>-	ret = blk_report_zones(bdev->bd_disk, get_start_sect(bdev) + sector,
>-			       zones, &nrz);
>+	ret = disk->fops->report_zones(disk, get_start_sect(bdev) + sector,
>+				       zones, &nrz);
> 	if (ret)
> 		return ret;
>
>@@ -552,7 +528,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
>
> 	while (z < nr_zones) {
> 		nrz = min(nr_zones - z, rep_nr_zones);
>-		ret = blk_report_zones(disk, sector, zones, &nrz);
>+		ret = disk->fops->report_zones(disk, sector, zones, &nrz);
> 		if (ret)
> 			goto out;
> 		if (!nrz)
>diff --git a/drivers/md/dm.c b/drivers/md/dm.c
>index 1a5e328c443a..647aa5b0233b 100644
>--- a/drivers/md/dm.c
>+++ b/drivers/md/dm.c
>@@ -473,12 +473,6 @@ static int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
> 		goto out;
> 	}
>
>-	/*
>-	 * blkdev_report_zones() will loop and call this again to cover all the
>-	 * zones of the target, eventually moving on to the next target.
>-	 * So there is no need to loop here trying to fill the entire array
>-	 * of zones.
>-	 */
> 	ret = tgt->type->report_zones(tgt, sector, zones, nr_zones);
>
> out:
>-- 
>2.21.0
>

Looks good to me.

Reviewed-by: Javier González <javier@javigon.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-10-25  5:17 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-24  6:50 [PATCH 0/4] zoned block device report zones enhancements Damien Le Moal
2019-10-24  6:50 ` [PATCH 1/4] block: Enhance blk_revalidate_disk_zones() Damien Le Moal
2019-10-24  7:08   ` Christoph Hellwig
2019-10-25  5:15     ` Damien Le Moal
2019-10-24  7:20   ` Keith Busch
2019-10-24  6:50 ` [PATCH 2/4] block: Simplify report zones execution Damien Le Moal
2019-10-24  7:10   ` Christoph Hellwig
2019-10-25  5:17   ` Javier González
2019-10-24  6:50 ` [PATCH 3/4] block: Introduce report zones queue limits Damien Le Moal
2019-10-24  6:50 ` [PATCH 4/4] block: Generically handle report zones buffer Damien Le Moal
2019-10-24  7:13   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).