* [PATCH v3 0/3] block: add zone write granularity limit
@ 2021-01-22  8:00 ` Damien Le Moal
  0 siblings, 0 replies; 28+ messages in thread
From: Damien Le Moal @ 2021-01-22  8:00 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Chaitanya Kulkarni, linux-scsi, Martin K . Petersen, linux-nvme,
	Christoph Hellwig, Keith Busch

The first patch in this series introduces the zone write granularity
queue limit to indicate the alignment constraint for write operations
into sequential zones of zoned block devices. The second patch adds the
missing documentation for zone_append_max_bytes to the sysfs block
documentation. The third patch switches zonefs to use this new limit as
the file block size instead of using the physical block size.

Changes from v2:
* Added patch 3 for zonefs
* Addressed Christoph's comments on patch 1 and added the limit
  initialization for zoned nullblk

Changes from v1:
* Fixed typo in patch 2

Damien Le Moal (3):
  block: introduce zone_write_granularity limit
  block: document zone_append_max_bytes attribute
  zonefs: use zone write granularity as block size

 Documentation/block/queue-sysfs.rst | 13 ++++++++++
 block/blk-settings.c                | 39 +++++++++++++++++++++++++----
 block/blk-sysfs.c                   |  8 ++++++
 drivers/block/null_blk/zoned.c      |  1 +
 drivers/nvme/host/zns.c             |  1 +
 drivers/scsi/sd_zbc.c               | 10 ++++++++
 fs/zonefs/super.c                   |  9 +++----
 include/linux/blkdev.h              | 15 +++++++++++
 8 files changed, 86 insertions(+), 10 deletions(-)

-- 
2.29.2

^ permalink raw reply	[flat|nested] 28+ messages in thread
* [PATCH v3 1/3] block: introduce zone_write_granularity limit
  2021-01-22  8:00 ` Damien Le Moal
@ 2021-01-22  8:00   ` Damien Le Moal
  -1 siblings, 0 replies; 28+ messages in thread
From: Damien Le Moal @ 2021-01-22  8:00 UTC (permalink / raw)
  To: linux-block, Jens Axboe
  Cc: Chaitanya Kulkarni, linux-scsi, Martin K . Petersen, linux-nvme,
	Christoph Hellwig, Keith Busch

Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that
all writes into sequential write required zones be aligned to the device
physical block size. However, NVMe ZNS does not have this constraint and
allows write operations into sequential zones to be logical block size
aligned. This inconsistency does not help with portability of software
across device types.

To solve this, introduce the zone_write_granularity queue limit to
indicate the alignment constraint, in bytes, of write operations into
zones of a zoned block device. This new limit is exported as a read-only
sysfs queue attribute and the helper blk_queue_zone_write_granularity()
is introduced for drivers to set this limit.

The scsi disk driver is modified to use this helper to set host-managed
SMR disk zone write granularity to the disk physical block size. The ZNS
support code of the NVMe driver is also modified to use this helper to
set the new limit to the logical block size of the namespace. The
nullblk driver is similarly modified.

The accessor functions queue_zone_write_granularity() and
bdev_zone_write_granularity() are also introduced.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 Documentation/block/queue-sysfs.rst |  7 ++++++
 block/blk-settings.c                | 39 +++++++++++++++++++++++++----
 block/blk-sysfs.c                   |  8 ++++++
 drivers/block/null_blk/zoned.c      |  1 +
 drivers/nvme/host/zns.c             |  1 +
 drivers/scsi/sd_zbc.c               | 10 ++++++++
 include/linux/blkdev.h              | 15 +++++++++++
 7 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/Documentation/block/queue-sysfs.rst b/Documentation/block/queue-sysfs.rst
index 2638d3446b79..c8bf8bc3c03a 100644
--- a/Documentation/block/queue-sysfs.rst
+++ b/Documentation/block/queue-sysfs.rst
@@ -273,4 +273,11 @@ devices are described in the ZBC (Zoned Block Commands) and ZAC
 do not support zone commands, they will be treated as regular block devices
 and zoned will report "none".
 
+zone_write_granularity (RO)
+---------------------------
+This indicates the alignment constraint, in bytes, for write operations in
+sequential zones of zoned block devices (devices with a zoned attributed
+that reports "host-managed" or "host-aware"). This value is always 0 for
+regular block devices.
+
 Jens Axboe <jens.axboe@oracle.com>, February 2009
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 43990b1d148b..48872b4085d4 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -60,6 +60,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->io_opt = 0;
 	lim->misaligned = 0;
 	lim->zoned = BLK_ZONED_NONE;
+	lim->zone_write_granularity = 0;
 }
 EXPORT_SYMBOL(blk_set_default_limits);
 
@@ -366,6 +367,28 @@ void blk_queue_physical_block_size(struct request_queue *q, unsigned int size)
 }
 EXPORT_SYMBOL(blk_queue_physical_block_size);
 
+/**
+ * blk_queue_zone_write_granularity - set zone write granularity for the queue
+ * @q:  the request queue for the zoned device
+ * @size:  the zone write granularity size, in bytes
+ *
+ * Description:
+ *   This should be set to the lowest possible size allowing to write in
+ *   sequential zones of a zoned block device.
+ */
+void blk_queue_zone_write_granularity(struct request_queue *q,
+				      unsigned int size)
+{
+	if (WARN_ON_ONCE(!blk_queue_is_zoned(q)))
+		return;
+
+	q->limits.zone_write_granularity = size;
+
+	if (q->limits.zone_write_granularity < q->limits.logical_block_size)
+		q->limits.zone_write_granularity = q->limits.logical_block_size;
+}
+EXPORT_SYMBOL_GPL(blk_queue_zone_write_granularity);
+
 /**
  * blk_queue_alignment_offset - set physical block alignment offset
  * @q:	the request queue for the device
@@ -631,6 +654,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 			t->discard_granularity;
 	}
 
+	t->zone_write_granularity = max(t->zone_write_granularity,
+					b->zone_write_granularity);
 	t->zoned = max(t->zoned, b->zoned);
 	return ret;
 }
@@ -847,6 +872,8 @@ EXPORT_SYMBOL_GPL(blk_queue_can_use_dma_map_merging);
  */
 void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 {
+	struct request_queue *q = disk->queue;
+
 	switch (model) {
 	case BLK_ZONED_HM:
 		/*
@@ -864,18 +891,20 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 		 * partitions and zoned block device support is enabled, else
 		 * we do nothing special as far as the block layer is concerned.
 		 */
-		if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED) ||
-		    disk_has_partitions(disk))
-			model = BLK_ZONED_NONE;
-		break;
+		if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
+		    !disk_has_partitions(disk))
+			break;
+		model = BLK_ZONED_NONE;
+		fallthrough;
 	case BLK_ZONED_NONE:
 	default:
 		if (WARN_ON_ONCE(model != BLK_ZONED_NONE))
 			model = BLK_ZONED_NONE;
+		q->limits.zone_write_granularity = 0;
 		break;
 	}
 
-	disk->queue->limits.zoned = model;
+	q->limits.zoned = model;
 }
 EXPORT_SYMBOL_GPL(blk_queue_set_zoned);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b513f1683af0..ae39c7f3d83d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -219,6 +219,12 @@ static ssize_t queue_write_zeroes_max_show(struct request_queue *q, char *page)
 		(unsigned long long)q->limits.max_write_zeroes_sectors << 9);
 }
 
+static ssize_t queue_zone_write_granularity_show(struct request_queue *q,
+						 char *page)
+{
+	return queue_var_show(queue_zone_write_granularity(q), page);
+}
+
 static ssize_t queue_zone_append_max_show(struct request_queue *q, char *page)
 {
 	unsigned long long max_sectors = q->limits.max_zone_append_sectors;
@@ -585,6 +591,7 @@ QUEUE_RO_ENTRY(queue_discard_zeroes_data, "discard_zeroes_data");
 QUEUE_RO_ENTRY(queue_write_same_max, "write_same_max_bytes");
 QUEUE_RO_ENTRY(queue_write_zeroes_max, "write_zeroes_max_bytes");
 QUEUE_RO_ENTRY(queue_zone_append_max, "zone_append_max_bytes");
+QUEUE_RO_ENTRY(queue_zone_write_granularity, "zone_write_granularity");
 
 QUEUE_RO_ENTRY(queue_zoned, "zoned");
 QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones");
@@ -639,6 +646,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_write_same_max_entry.attr,
 	&queue_write_zeroes_max_entry.attr,
 	&queue_zone_append_max_entry.attr,
+	&queue_zone_write_granularity_entry.attr,
 	&queue_nonrot_entry.attr,
 	&queue_zoned_entry.attr,
 	&queue_nr_zones_entry.attr,
diff --git a/drivers/block/null_blk/zoned.c b/drivers/block/null_blk/zoned.c
index 535351570bb2..704d09481e0d 100644
--- a/drivers/block/null_blk/zoned.c
+++ b/drivers/block/null_blk/zoned.c
@@ -172,6 +172,7 @@ int null_register_zoned_dev(struct nullb *nullb)
 	blk_queue_max_zone_append_sectors(q, dev->zone_size_sects);
 	blk_queue_max_open_zones(q, dev->zone_max_open);
 	blk_queue_max_active_zones(q, dev->zone_max_active);
+	blk_queue_zone_write_granularity(q, dev->blocksize);
 
 	return 0;
 }
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c
index 1dfe9a3500e3..f25311ccd996 100644
--- a/drivers/nvme/host/zns.c
+++ b/drivers/nvme/host/zns.c
@@ -113,6 +113,7 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf)
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
 	blk_queue_max_open_zones(q, le32_to_cpu(id->mor) + 1);
 	blk_queue_max_active_zones(q, le32_to_cpu(id->mar) + 1);
+	blk_queue_zone_write_granularity(q, q->limits.logical_block_size);
 free_data:
 	kfree(id);
 	return status;
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index cf07b7f93579..10e9f33cc069 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -789,6 +789,16 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 		blk_queue_max_active_zones(q, 0);
 	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
 
+	/*
+	 * Per ZBC and ZAC specifications, writes in sequential write required
+	 * zones of host-managed devices must be aligned to the device physical
+	 * block size.
+	 */
+	if (blk_queue_zoned_model(q) == BLK_ZONED_HM)
+		blk_queue_zone_write_granularity(q, sdkp->physical_block_size);
+	else if (blk_queue_zoned_model(q) == BLK_ZONED_HA)
+		blk_queue_zone_write_granularity(q, sdkp->device->sector_size);
+
 	/* READ16/WRITE16 is mandatory for ZBC disks */
 	sdkp->device->use_16_for_rw = 1;
 	sdkp->device->use_10_for_rw = 0;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f94ee3089e01..142e3b34be75 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -337,6 +337,7 @@ struct queue_limits {
 	unsigned int		max_zone_append_sectors;
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
+	unsigned int		zone_write_granularity;
 
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
@@ -1161,6 +1162,8 @@ extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
 extern void blk_queue_max_zone_append_sectors(struct request_queue *q,
 		unsigned int max_zone_append_sectors);
 extern void blk_queue_physical_block_size(struct request_queue *, unsigned int);
+void blk_queue_zone_write_granularity(struct request_queue *q,
+				      unsigned int size);
 extern void blk_queue_alignment_offset(struct request_queue *q,
 				       unsigned int alignment);
 void blk_queue_update_readahead(struct request_queue *q);
@@ -1474,6 +1477,18 @@ static inline int bdev_io_opt(struct block_device *bdev)
 	return queue_io_opt(bdev_get_queue(bdev));
 }
 
+static inline unsigned int
+queue_zone_write_granularity(const struct request_queue *q)
+{
+	return q->limits.zone_write_granularity;
+}
+
+static inline unsigned int
+bdev_zone_write_granularity(struct block_device *bdev)
+{
+	return queue_zone_write_granularity(bdev_get_queue(bdev));
+}
+
 static inline int queue_alignment_offset(const struct request_queue *q)
 {
 	if (q->limits.misaligned)
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 28+ messages in thread
* Re: [PATCH v3 1/3] block: introduce zone_write_granularity limit
  2021-01-22  8:00   ` Damien Le Moal
@ 2021-01-22  8:42     ` Christoph Hellwig
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2021-01-22  8:42 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi,
	Martin K . Petersen, linux-nvme, Christoph Hellwig, Keith Busch

> @@ -864,18 +891,20 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
>  		 * partitions and zoned block device support is enabled, else
>  		 * we do nothing special as far as the block layer is concerned.
>  		 */
> -		if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED) ||
> -		    disk_has_partitions(disk))
> -			model = BLK_ZONED_NONE;
> -		break;
> +		if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
> +		    !disk_has_partitions(disk))
> +			break;
> +		model = BLK_ZONED_NONE;
> +		fallthrough;
> 	case BLK_ZONED_NONE:
> 	default:
> 		if (WARN_ON_ONCE(model != BLK_ZONED_NONE))
> 			model = BLK_ZONED_NONE;
> +		q->limits.zone_write_granularity = 0;
> 		break;
> 	}
> 
> -	disk->queue->limits.zoned = model;
> +	q->limits.zoned = model;
> }

This looks a little strange. If we special case zoned vs not zoned
here anyway, why not set the zone_write_granularity to the logical
block size here by default.

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [PATCH v3 1/3] block: introduce zone_write_granularity limit
  2021-01-22  8:42     ` Christoph Hellwig
@ 2021-01-22  8:56       ` Damien Le Moal
  -1 siblings, 0 replies; 28+ messages in thread
From: Damien Le Moal @ 2021-01-22  8:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi,
	Martin K . Petersen, linux-nvme, Keith Busch

On 2021/01/22 17:42, Christoph Hellwig wrote:
>> @@ -864,18 +891,20 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
>>  		 * partitions and zoned block device support is enabled, else
>>  		 * we do nothing special as far as the block layer is concerned.
>>  		 */
>> -		if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED) ||
>> -		    disk_has_partitions(disk))
>> -			model = BLK_ZONED_NONE;
>> -		break;
>> +		if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
>> +		    !disk_has_partitions(disk))
>> +			break;
>> +		model = BLK_ZONED_NONE;
>> +		fallthrough;
>> 	case BLK_ZONED_NONE:
>> 	default:
>> 		if (WARN_ON_ONCE(model != BLK_ZONED_NONE))
>> 			model = BLK_ZONED_NONE;
>> +		q->limits.zone_write_granularity = 0;
>> 		break;
>> 	}
>> 
>> -	disk->queue->limits.zoned = model;
>> +	q->limits.zoned = model;
>> }
> 
> This looks a little strange. If we special case zoned vs not zoned
> here anyway, why not set the zone_write_granularity to the logical
> block size here by default.

The convention is zone_write_granularity == 0 for the BLK_ZONED_NONE case.
Hence the reset here if we force the zoned model to none for HA drives. This
way, this does not create a special case for HA drives used as regular disks.

Of note is that there is something a little weird in the sd_zbc.c code that
needs fixing: blk_queue_set_zoned() is called before sd_zbc_read_zones() is
executed, and that function will check the zones of an HA drive and set the
queue nr_zones and max zone append sectors, even if blk_queue_set_zoned() set
the zoned model to none due to partitions. That makes the BLK_ZONED_NONE case
of HA drives a little weird since zone information is visible and correct but
the model says "none". As long as users separate zoned vs not-zoned cases by
looking at the zoned model, this does not create any problem, but that is not
pretty. Will send a separate patch to clean that up and have something
consistent with regular disks for this special HA case. The above
blk_queue_set_zoned() function can be used to clean up the zone information
for an HA drive that is used as a regular disk (nr_zones, zone append sectors
and zone bitmaps).

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [PATCH v3 1/3] block: introduce zone_write_granularity limit
  2021-01-22  8:56       ` Damien Le Moal
@ 2021-01-24 10:07         ` Christoph Hellwig
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2021-01-24 10:07 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Christoph Hellwig, linux-block, Jens Axboe, Chaitanya Kulkarni,
	linux-scsi, Martin K . Petersen, linux-nvme, Keith Busch

On Fri, Jan 22, 2021 at 08:56:58AM +0000, Damien Le Moal wrote:
> > This looks a little strange. If we special case zoned vs not zoned
> > here anyway, why not set the zone_write_granularity to the logical
> > block size here by default.
> 
> The convention is zone_write_granularity == 0 for the BLK_ZONED_NONE case. Hence
> the reset here if we force the zoned model to none for HA drives. This way, this
> does not create a special case for HA drives used as regular disks.

Just initialize it for all cases if you initialize it for some here.
That way everyone but sd already gets a right default and life becomes
simpler.

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [PATCH v3 1/3] block: introduce zone_write_granularity limit
From: Damien Le Moal @ 2021-01-25 5:32 UTC
To: Christoph Hellwig
Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Keith Busch

On 2021/01/24 19:07, Christoph Hellwig wrote:
> On Fri, Jan 22, 2021 at 08:56:58AM +0000, Damien Le Moal wrote:
>>> This looks a little strange. If we special case zoned vs not zoned
>>> here anyway, why not set the zone_write_granularity to the logical
>>> block size here by default.
>>
>> The convention is zone_write_granularity == 0 for the BLK_ZONED_NONE case. Hence
>> the reset here if we force the zoned model to none for HA drives. This way, this
>> does not create a special case for HA drives used as regular disks.
>
> Just initialize it for all cases if you initialize it for some here.
> That way everyone but sd already gets the right default and life becomes
> simpler.

True for nullblk, and that also simplifies sd a little. But not for nvme: blk_queue_set_zoned() is not used, and nvme_update_zone_info() is called before nvme_update_disk_info(), where the NS logical block size is set. So some surgery/cleanups would be needed to benefit. I could add a cleanup for this, but I am not entirely sure if calling nvme_update_zone_info() after nvme_update_disk_info() is OK. Thoughts?

--
Damien Le Moal
Western Digital Research
* Re: [PATCH v3 1/3] block: introduce zone_write_granularity limit
From: Damien Le Moal @ 2021-01-25 5:34 UTC
To: Christoph Hellwig
Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Keith Busch

On 2021/01/25 14:32, Damien Le Moal wrote:
> On 2021/01/24 19:07, Christoph Hellwig wrote:
>> On Fri, Jan 22, 2021 at 08:56:58AM +0000, Damien Le Moal wrote:
>>>> This looks a little strange. If we special case zoned vs not zoned
>>>> here anyway, why not set the zone_write_granularity to the logical
>>>> block size here by default.
>>>
>>> The convention is zone_write_granularity == 0 for the BLK_ZONED_NONE case. Hence
>>> the reset here if we force the zoned model to none for HA drives. This way, this
>>> does not create a special case for HA drives used as regular disks.
>>
>> Just initialize it for all cases if you initialize it for some here.
>> That way everyone but sd already gets the right default and life becomes
>> simpler.
>
> True for nullblk, and that also simplifies sd a little. But not for nvme:
> blk_queue_set_zoned() is not used, and nvme_update_zone_info() is called before
> nvme_update_disk_info(), where the NS logical block size is set. So some
> surgery/cleanups would be needed to benefit. I could add a cleanup for this, but
> I am not entirely sure if calling nvme_update_zone_info() after
> nvme_update_disk_info() is OK. Thoughts?

Also, because nvme_update_zone_info() is called without the logical block size being set yet, this patch was not good for ZNS.

--
Damien Le Moal
Western Digital Research
* Re: [PATCH v3 1/3] block: introduce zone_write_granularity limit
From: Martin K. Petersen @ 2021-01-23 2:43 UTC
To: Damien Le Moal
Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Christoph Hellwig, Keith Busch

Damien,

> To solve this, introduce the zone_write_granularity queue limit to
> indicate the alignment constraint, in bytes, of write operations into
> zones of a zoned block device. This new limit is exported as a
> read-only sysfs queue attribute, and the helper
> blk_queue_zone_write_granularity() is introduced for drivers to set this
> limit. The scsi disk driver is modified to use this helper to set
> host-managed SMR disk zone write granularity to the disk physical
> block size. The ZNS support code of the NVMe driver is also modified
> to use this helper to set the new limit to the logical block size of
> the namespace. The nullblk driver is similarly modified.

Looks fine.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

--
Martin K. Petersen
Oracle Linux Engineering
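The rule the quoted commit message states (physical block size for ZBC/ZAC host-managed SMR disks, logical block size for NVMe ZNS namespaces) reduces to a one-line choice. The following stand-alone sketch is a hypothetical userspace model of that rule, not the kernel helper itself:

```c
#include <stdbool.h>

/* Hypothetical model of the per-interface granularity rule described
 * in the commit message above; not kernel code. */
unsigned int pick_zone_write_granularity(bool is_nvme_zns,
					 unsigned int logical_bs,
					 unsigned int physical_bs)
{
	/* ZBC/ZAC disks require physical-block-aligned writes to
	 * sequential zones; ZNS only requires logical-block alignment. */
	return is_nvme_zns ? logical_bs : physical_bs;
}
```

For a 512e SMR disk (512 B logical, 4 KB physical blocks) this yields 4096, while a ZNS namespace with a 512 B LBA format gets 512, which is exactly why the limit cannot simply default to one or the other.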
* [PATCH v3 2/3] block: document zone_append_max_bytes attribute
From: Damien Le Moal @ 2021-01-22 8:00 UTC
To: linux-block, Jens Axboe
Cc: Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Christoph Hellwig, Keith Busch

The description of the zone_append_max_bytes sysfs queue attribute is missing from Documentation/block/queue-sysfs.rst. Add it.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/block/queue-sysfs.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/block/queue-sysfs.rst b/Documentation/block/queue-sysfs.rst
index c8bf8bc3c03a..4dc7f0d499a8 100644
--- a/Documentation/block/queue-sysfs.rst
+++ b/Documentation/block/queue-sysfs.rst
@@ -261,6 +261,12 @@ For block drivers that support REQ_OP_WRITE_ZEROES, the maximum number of
 bytes that can be zeroed at once. The value 0 means that REQ_OP_WRITE_ZEROES
 is not supported.
 
+zone_append_max_bytes (RO)
+--------------------------
+This is the maximum number of bytes that can be written to a sequential
+zone of a zoned block device using a zone append write operation
+(REQ_OP_ZONE_APPEND). This value is always 0 for regular block devices.
+
 zoned (RO)
 ----------
 This indicates if the device is a zoned block device and the zone model of the
--
2.29.2
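A userspace consumer of the attribute documented above would read a single decimal value from sysfs and treat 0 as "zone append not supported". This is a minimal hedged sketch; the parser takes an arbitrary path so it is not tied to a real zoned device, and the sysfs path shown in the comment is the one the patch documents:

```c
#include <stdio.h>

/* Sketch of a consumer for the attribute documented above. Reads one
 * decimal value from a file such as
 * /sys/block/<dev>/queue/zone_append_max_bytes; a value of 0 means the
 * device does not support REQ_OP_ZONE_APPEND. Returns -1 on failure. */
long read_long_attr(const char *path)
{
	FILE *f = fopen(path, "r");
	long v;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}
```

Taking the path as a parameter also keeps the helper usable for the other queue attributes in this series, such as zone_write_granularity.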
* Re: [PATCH v3 2/3] block: document zone_append_max_bytes attribute
From: Martin K. Petersen @ 2021-01-23 2:43 UTC
To: Damien Le Moal
Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Christoph Hellwig, Keith Busch

Damien,

> The description of the zone_append_max_bytes sysfs queue attribute is
> missing from Documentation/block/queue-sysfs.rst. Add it.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

--
Martin K. Petersen
Oracle Linux Engineering
* Re: [PATCH v3 2/3] block: document zone_append_max_bytes attribute
From: Chaitanya Kulkarni @ 2021-01-23 3:03 UTC
To: Damien Le Moal, linux-block, Jens Axboe
Cc: linux-scsi, Martin K. Petersen, linux-nvme, Christoph Hellwig, Keith Busch

On 1/22/21 12:00 AM, Damien Le Moal wrote:
> The description of the zone_append_max_bytes sysfs queue attribute is
> missing from Documentation/block/queue-sysfs.rst. Add it.
>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Looks good.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
* [PATCH v3 3/3] zonefs: use zone write granularity as block size
From: Damien Le Moal @ 2021-01-22 8:00 UTC
To: linux-block, Jens Axboe
Cc: Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Christoph Hellwig, Keith Busch

Zoned block devices have different granularity constraints for write operations into sequential zones. E.g. ZBC and ZAC devices require that writes be aligned to the device physical block size, while NVMe ZNS devices allow logical block size aligned write operations. To correctly handle this difference, use the device zone write granularity limit to set the block size of a zonefs volume, thus allowing the smallest possible write unit for all zoned device types.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 fs/zonefs/super.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index bec47f2d074b..8973d77ba000 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -1581,12 +1581,11 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_time_gran	= 1;
 
 	/*
-	 * The block size is set to the device physical sector size to ensure
-	 * that write operations on 512e devices (512B logical block and 4KB
-	 * physical block) are always aligned to the device physical blocks,
-	 * as mandated by the ZBC/ZAC specifications.
+	 * The block size is set to the device zone write granularity to ensure
+	 * that write operations are always aligned according to the device
+	 * interface constraints.
 	 */
-	sb_set_blocksize(sb, bdev_physical_block_size(sb->s_bdev));
+	sb_set_blocksize(sb, bdev_zone_write_granularity(sb->s_bdev));
 	sbi->s_zone_sectors_shift = ilog2(bdev_zone_sectors(sb->s_bdev));
 	sbi->s_uid		= GLOBAL_ROOT_UID;
 	sbi->s_gid		= GLOBAL_ROOT_GID;
--
2.29.2
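The effect of the change above can be illustrated with hypothetical device numbers (the structs and values below are illustrative only, not zonefs code): the old choice was the physical block size, the new one is the zone write granularity, and the two only differ for devices like ZNS where the granularity can be smaller than the physical block size.

```c
/* Illustrative limits for two device types; not zonefs code. */
struct bdev_limits {
	unsigned int logical_bs;
	unsigned int physical_bs;
	unsigned int zone_write_granularity;
};

/* Block size zonefs used before this patch. */
unsigned int old_zonefs_bs(const struct bdev_limits *b)
{
	return b->physical_bs;
}

/* Block size zonefs uses with this patch. */
unsigned int new_zonefs_bs(const struct bdev_limits *b)
{
	return b->zone_write_granularity;
}
```

For a 512e SMR disk the granularity equals the physical block size, so nothing changes; for a ZNS namespace with 512 B LBAs the volume block size can drop to 512 B, the "smallest possible write unit" the commit message refers to.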
* Re: [PATCH v3 3/3] zonefs: use zone write granularity as block size
From: Christoph Hellwig @ 2021-01-22 8:42 UTC
To: Damien Le Moal
Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Christoph Hellwig, Keith Busch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH v3 3/3] zonefs: use zone write granularity as block size
From: Martin K. Petersen @ 2021-01-23 2:44 UTC
To: Damien Le Moal
Cc: linux-block, Jens Axboe, Chaitanya Kulkarni, linux-scsi, Martin K. Petersen, linux-nvme, Christoph Hellwig, Keith Busch

Damien,

> Zoned block devices have different granularity constraints for write
> operations into sequential zones. E.g. ZBC and ZAC devices require
> that writes be aligned to the device physical block size, while NVMe
> ZNS devices allow logical block size aligned write operations. To
> correctly handle this difference, use the device zone write
> granularity limit to set the block size of a zonefs volume, thus
> allowing the smallest possible write unit for all zoned device types.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

--
Martin K. Petersen
Oracle Linux Engineering