* [PATCHv6 00/14] dm-zoned: metadata version 2
@ 2020-05-11  8:24 Hannes Reinecke
  2020-05-11  8:24 ` [PATCH 01/15] dm-zoned: add 'status' callback Hannes Reinecke
                   ` (14 more replies)
  0 siblings, 15 replies; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Hi all,

this patchset adds a new metadata version 2 for dm-zoned, which brings the
following improvements:

- UUIDs and labels: Three more fields are added to the metadata, containing
  the dm-zoned device UUID and label, and the device UUID. This allows
  for a unique identification of the devices, so that several dm-zoned
  sets can coexist and have a persistent identification.
- Extend random zones by an additional regular disk device: A regular
  block device can be added together with the zoned block device, providing
  additional (emulated) random write zones. With this it is possible to
  handle devices that have sequential zones only; there will also be a
  speed-up if the regular block device resides on a fast medium. The regular block device
  is placed logically in front of the zoned block device, so that metadata
  and mapping tables reside on the regular block device, not the zoned device.
- Tertiary superblock support: In addition to the two existing sets of metadata
  another, tertiary, superblock is written to the first block of the zoned
  block device. This superblock is for identification only; the generation
  number is set to '0' and the block itself is never updated. The additional
  metadata, such as bitmap tables, is not copied.

To handle this, some changes to the original handling are introduced:
- Zones are now equidistant. Originally, runt zones were ignored, and
  not counted when sizing the mapping tables. With the dual device setup
  runt zones might occur at the end of the regular block device, making
  direct translation between zone number and sector/block number complex.
  For metadata version 2 all zones are considered to be of the same size,
  and runt zones are simply marked as 'offline' to have them ignored when
  allocating a new zone.
- The block number in the superblock is now the global number, and refers to
  the location of the superblock relative to the resulting device-mapper
  device. This means that the tertiary superblock contains absolute block
  addresses, which need to be translated to device-relative addresses
  to find the referenced block.
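
To illustrate the translation described above, here is a minimal userspace
sketch. It assumes the regular device sits in front of the zoned device and
that all zones have the same power-of-two size. The struct and function names
are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout description; names are illustrative only. */
struct layout {
	unsigned int zone_nr_blocks_shift; /* log2(blocks per zone) */
	unsigned int nr_regular_zones;     /* zones on the regular device,
					    * counting a trailing runt zone */
};

/* Zone number that contains a global (device-mapper relative) block. */
static unsigned int block_to_zone(const struct layout *l, uint64_t block)
{
	assert(l->zone_nr_blocks_shift < 64);
	return (unsigned int)(block >> l->zone_nr_blocks_shift);
}

/* Device index: 0 = regular device, 1 = zoned device. */
static int block_to_dev(const struct layout *l, uint64_t block)
{
	return block_to_zone(l, block) < l->nr_regular_zones ? 0 : 1;
}

/* Block address relative to the start of the device that holds it. */
static uint64_t block_to_dev_offset(const struct layout *l, uint64_t block)
{
	uint64_t regular_blocks =
		(uint64_t)l->nr_regular_zones << l->zone_nr_blocks_shift;

	return block_to_dev(l, block) == 0 ? block : block - regular_blocks;
}
```

Because every zone is treated as full-sized, the zone number is a plain shift
of the block number, and a runt zone only needs to be marked offline instead
of shifting all following zone numbers.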

There is an accompanying patchset for dm-zoned-tools for writing and checking
this new metadata.

As usual, comments and reviews are welcome.

Changes to v5:
- Separate 'message' and 'status' into separate patches
  as suggested by Mike Snitzer
- Update documentation to reflect the new functionality
  as suggested by Mike Snitzer
- Add review tags from Damien

Changes to v4:
- Add reviews from Damien
- Silence logging output as suggested by Mike Snitzer
- Fixup compilation on 32bit archs

Changes to v3:
- Reorder devices such that the regular device is always at position 0,
  and the zoned device is always at position 1.
- Split off dmz_dev_is_dying() into a separate patch
- Include reviews from Damien

Changes to v2:
- Kill dmz_id()
- Include reviews from Damien
- Sanitize uuid handling as suggested by John Dorminy

Hannes Reinecke (15):
  dm-zoned: add 'status' callback
  dm-zoned: add 'message' callback
  dm-zoned: store zone id within the zone structure and kill dmz_id()
  dm-zoned: use array for superblock zones
  dm-zoned: store device in struct dmz_sb
  dm-zoned: move fields from struct dmz_dev to dmz_metadata
  dm-zoned: introduce dmz_metadata_label() to format device name
  dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev()
  dm-zoned: remove 'dev' argument from reclaim
  dm-zoned: replace 'target' pointer in the bio context
  dm-zoned: use dmz_zone_to_dev() when handling metadata I/O
  dm-zoned: add metadata logging functions
  dm-zoned: Reduce logging output on startup
  dm-zoned: ignore metadata zone in dmz_alloc_zone()
  dm-zoned: metadata version 2

 .../admin-guide/device-mapper/dm-zoned.rst         |  62 +-
 drivers/md/dm-zoned-metadata.c                     | 664 ++++++++++++++++-----
 drivers/md/dm-zoned-reclaim.c                      |  88 +--
 drivers/md/dm-zoned-target.c                       | 376 ++++++++----
 drivers/md/dm-zoned.h                              |  35 +-
 5 files changed, 880 insertions(+), 345 deletions(-)

-- 
2.16.4


* [PATCH 01/15] dm-zoned: add 'status' callback
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Add callback to supply information for 'dmsetup status'
and 'dmsetup table'. The output for 'dmsetup status' is

0 <size> zoned <nr_zones> zones <nr_unmap_rnd>/<nr_rnd> random <nr_unmap_seq>/<nr_seq> sequential

where <nr_unmap_rnd> is the number of unmapped (i.e. free) random zones,
<nr_rnd> the total number of random zones, <nr_unmap_seq> the number
of unmapped sequential zones, and <nr_seq> the total number of
sequential zones.
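
As a rough illustration of how a userspace tool might consume this line, the
following sketch parses it with sscanf(). The struct and function names are
hypothetical, and the sketch assumes the line starts with the usual
'<start> <size> <target>' prefix that dmsetup prints.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the fields of the status line. */
struct dmz_status_line {
	unsigned long long start, size;
	unsigned int nr_zones;
	unsigned int nr_unmap_rnd, nr_rnd;
	unsigned int nr_unmap_seq, nr_seq;
};

/*
 * Parse one 'dmsetup status' line of a zoned target.
 * Returns 0 on success, -1 if the line does not match the format.
 * The trailing word "sequential" is not checked.
 */
static int parse_dmz_status(const char *line, struct dmz_status_line *st)
{
	int n;

	assert(line && st);
	n = sscanf(line, "%llu %llu zoned %u zones %u/%u random %u/%u",
		   &st->start, &st->size, &st->nr_zones,
		   &st->nr_unmap_rnd, &st->nr_rnd,
		   &st->nr_unmap_seq, &st->nr_seq);
	return n == 7 ? 0 : -1;
}
```

Called on a made-up line such as
"0 1000 zoned 100 zones 5/10 random 80/90 sequential" this fills in
nr_zones = 100, 5 of 10 random zones free and 80 of 90 sequential zones free.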

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 .../admin-guide/device-mapper/dm-zoned.rst         | 16 +++++++++++++
 drivers/md/dm-zoned-metadata.c                     | 15 +++++++++++++
 drivers/md/dm-zoned-target.c                       | 26 ++++++++++++++++++++++
 drivers/md/dm-zoned.h                              |  3 +++
 4 files changed, 60 insertions(+)

diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
index 07f56ebc1730..4165fbf1aeb6 100644
--- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
+++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
@@ -144,3 +144,19 @@ underlying zoned block device name. Ex::
 
 	echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
 	dmsetup create dmz-`basename ${dev}`
+
+Information about the internal layout and current usage of the zones can
+be obtained with the 'status' callback from dmsetup:
+
+Ex::
+
+	dmsetup status /dev/dm-X
+
+will return a line
+
+	0 <size> zoned <nr_zones> zones <nr_unmap_rnd>/<nr_rnd> random <nr_unmap_seq>/<nr_seq> sequential
+
+where <nr_zones> is the total number of zones, <nr_unmap_rnd> is the number
+of unmapped (ie free) random zones, <nr_rnd> the total number of zones,
+<nr_unmap_seq> the number of unmapped sequential zones, and <nr_seq> the
+total number of sequential zones.
diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 369de15c4e80..c8787560fa9f 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -202,6 +202,11 @@ sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_blocks_shift;
 }
 
+unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
+{
+	return zmd->dev->nr_zones;
+}
+
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd)
 {
 	return zmd->nr_chunks;
@@ -217,6 +222,16 @@ unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd)
 	return atomic_read(&zmd->unmap_nr_rnd);
 }
 
+unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd)
+{
+	return zmd->nr_seq;
+}
+
+unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
+{
+	return atomic_read(&zmd->unmap_nr_seq);
+}
+
 /*
  * Lock/unlock mapping table.
  * The map lock also protects all the zone lists.
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index f4f83d39b3dc..0b4b27d280fb 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -965,6 +965,31 @@ static int dmz_iterate_devices(struct dm_target *ti,
 	return fn(ti, dmz->ddev, 0, capacity, data);
 }
 
+static void dmz_status(struct dm_target *ti, status_type_t type,
+		       unsigned int status_flags, char *result,
+		       unsigned int maxlen)
+{
+	struct dmz_target *dmz = ti->private;
+	ssize_t sz = 0;
+	char buf[BDEVNAME_SIZE];
+
+	switch (type) {
+	case STATUSTYPE_INFO:
+		DMEMIT("%u zones %u/%u random %u/%u sequential",
+		       dmz_nr_zones(dmz->metadata),
+		       dmz_nr_unmap_rnd_zones(dmz->metadata),
+		       dmz_nr_rnd_zones(dmz->metadata),
+		       dmz_nr_unmap_seq_zones(dmz->metadata),
+		       dmz_nr_seq_zones(dmz->metadata));
+		break;
+	case STATUSTYPE_TABLE:
+		format_dev_t(buf, dmz->dev->bdev->bd_dev);
+		DMEMIT("%s", buf);
+		break;
+	}
+	return;
+}
+
 static struct target_type dmz_type = {
 	.name		 = "zoned",
 	.version	 = {1, 1, 0},
@@ -978,6 +1003,7 @@ static struct target_type dmz_type = {
 	.postsuspend	 = dmz_suspend,
 	.resume		 = dmz_resume,
 	.iterate_devices = dmz_iterate_devices,
+	.status		 = dmz_status,
 };
 
 static int __init dmz_init(void)
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 5b5e493d479c..884c0e586082 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -190,8 +190,11 @@ void dmz_free_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
 void dmz_map_zone(struct dmz_metadata *zmd, struct dm_zone *zone,
 		  unsigned int chunk);
 void dmz_unmap_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
+unsigned int dmz_nr_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd);
+unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd);
+unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd);
 
 /*
  * Activate a zone (increment its reference count).
-- 
2.16.4


* [PATCH 02/15] dm-zoned: add 'message' callback
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Add callback for 'dmsetup message' to allow the reclaim process
to be triggered manually.
E.g.:

	dmsetup message /dev/dm-X 0 reclaim

will start the reclaim process even if the default threshold
of 50 percent of free random zones is not reached.
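
The interplay between the threshold and the manual trigger can be modelled
roughly as below. This is a simplified userspace sketch with hypothetical
helper names; it only encodes the "less than 50 percent free random zones"
rule from the documentation and the case-insensitive "reclaim" message, not
the kernel's actual scheduling logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Percentage of random zones that are currently unmapped (free). */
static unsigned int free_rnd_percent(unsigned int nr_unmap_rnd,
				     unsigned int nr_rnd)
{
	assert(nr_rnd > 0 && nr_unmap_rnd <= nr_rnd);
	return (unsigned int)((unsigned long long)nr_unmap_rnd * 100 / nr_rnd);
}

/*
 * Simplified decision: reclaim runs if it was requested by a "reclaim"
 * message (msg may be NULL if no message was sent), or if fewer than
 * 50 percent of the random zones are free.
 */
static bool should_reclaim(unsigned int nr_unmap_rnd, unsigned int nr_rnd,
			   const char *msg)
{
	if (msg && strcasecmp(msg, "reclaim") == 0)
		return true;
	return free_rnd_percent(nr_unmap_rnd, nr_rnd) < 50;
}
```

With 8 of 10 random zones free nothing happens on its own, but sending the
message starts reclaim anyway.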

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 Documentation/admin-guide/device-mapper/dm-zoned.rst | 12 ++++++++++++
 drivers/md/dm-zoned-target.c                         | 15 +++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
index 4165fbf1aeb6..7547ce635161 100644
--- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
+++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
@@ -160,3 +160,15 @@ where <nr_zones> is the total number of zones, <nr_unmap_rnd> is the number
 of unmapped (ie free) random zones, <nr_rnd> the total number of zones,
 <nr_unmap_seq> the number of unmapped sequential zones, and <nr_seq> the
 total number of sequential zones.
+
+Normally the reclaim process will be started once there are less than 50
+percent free random zones. In order to start the reclaim process manually
+even before reaching this threshold the 'dmsetup message' function can be
+used:
+
+Ex::
+
+	dmsetup message /dev/dm-X 0 reclaim
+
+will start the reclaim process and random zones will be moved to sequential
+zones.
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 0b4b27d280fb..0bfe34162dbb 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -990,6 +990,20 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
 	return;
 }
 
+static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
+		       char *result, unsigned int maxlen)
+{
+	struct dmz_target *dmz = ti->private;
+	int r = -EINVAL;
+
+	if (!strcasecmp(argv[0], "reclaim")) {
+		dmz_schedule_reclaim(dmz->reclaim);
+		r = 0;
+	} else
+		DMERR("unrecognized message %s", argv[0]);
+	return r;
+}
+
 static struct target_type dmz_type = {
 	.name		 = "zoned",
 	.version	 = {1, 1, 0},
@@ -1004,6 +1018,7 @@ static struct target_type dmz_type = {
 	.resume		 = dmz_resume,
 	.iterate_devices = dmz_iterate_devices,
 	.status		 = dmz_status,
+	.message	 = dmz_message,
 };
 
 static int __init dmz_init(void)
-- 
2.16.4


* [PATCH 03/15] dm-zoned: store zone id within the zone structure and kill dmz_id()
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Instead of calculating the zone index from the offset within the
zone array, store the index within the structure itself. With that
the helper dmz_id() is pointless and can be replaced by accessing
the ->id value directly.
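
The two ways of obtaining the index can be compared in a small userspace
sketch. The struct below is a stripped-down stand-in for struct dm_zone, and
the helper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Stripped-down stand-in for struct dm_zone. */
struct zone {
	unsigned int id;
};

/* Old approach: derive the index from the position in the zone array. */
static unsigned int zone_id_by_offset(const struct zone *zones,
				      const struct zone *z)
{
	assert(z >= zones);
	return (unsigned int)(z - zones);
}

/* New approach: record the index once while the zones are set up. */
static struct zone *alloc_zones(unsigned int nr)
{
	struct zone *zones = calloc(nr, sizeof(*zones));

	if (!zones)
		return NULL;
	for (unsigned int i = 0; i < nr; i++)
		zones[i].id = i;
	return zones;
}
```

Both give the same number as long as all zones live in one array, but the
stored id needs neither the metadata pointer nor the array base at the call
site.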

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/md/dm-zoned-metadata.c | 40 +++++++++++++++++-----------------------
 drivers/md/dm-zoned-reclaim.c  | 17 ++++++++---------
 drivers/md/dm-zoned-target.c   |  6 +++---
 drivers/md/dm-zoned.h          |  4 +++-
 4 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index c8787560fa9f..1993eeb26bc1 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -187,19 +187,14 @@ struct dmz_metadata {
 /*
  * Various accessors
  */
-unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone)
-{
-	return ((unsigned int)(zone - zmd->zones));
-}
-
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_sectors_shift;
+	return (sector_t)zone->id << zmd->dev->zone_nr_sectors_shift;
 }
 
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_blocks_shift;
+	return (sector_t)zone->id << zmd->dev->zone_nr_blocks_shift;
 }
 
 unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
@@ -1119,6 +1114,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 
 	INIT_LIST_HEAD(&zone->link);
 	atomic_set(&zone->refcount, 0);
+	zone->id = idx;
 	zone->chunk = DMZ_MAP_UNMAPPED;
 
 	switch (blkz->type) {
@@ -1246,7 +1242,7 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 		ret = -EIO;
 	if (ret < 0) {
 		dmz_dev_err(zmd->dev, "Get zone %u report failed",
-			    dmz_id(zmd, zone));
+			    zone->id);
 		dmz_check_bdev(zmd->dev);
 		return ret;
 	}
@@ -1270,7 +1266,7 @@ static int dmz_handle_seq_write_err(struct dmz_metadata *zmd,
 		return ret;
 
 	dmz_dev_warn(zmd->dev, "Processing zone %u write error (zone wp %u/%u)",
-		     dmz_id(zmd, zone), zone->wp_block, wp);
+		     zone->id, zone->wp_block, wp);
 
 	if (zone->wp_block < wp) {
 		dmz_invalidate_blocks(zmd, zone, zone->wp_block,
@@ -1309,7 +1305,7 @@ static int dmz_reset_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 				       dev->zone_nr_sectors, GFP_NOIO);
 		if (ret) {
 			dmz_dev_err(dev, "Reset zone %u failed %d",
-				    dmz_id(zmd, zone), ret);
+				    zone->id, ret);
 			return ret;
 		}
 	}
@@ -1757,8 +1753,7 @@ struct dm_zone *dmz_get_chunk_buffer(struct dmz_metadata *zmd,
 	}
 
 	/* Update the chunk mapping */
-	dmz_set_chunk_mapping(zmd, dzone->chunk, dmz_id(zmd, dzone),
-			      dmz_id(zmd, bzone));
+	dmz_set_chunk_mapping(zmd, dzone->chunk, dzone->id, bzone->id);
 
 	set_bit(DMZ_BUF, &bzone->flags);
 	bzone->chunk = dzone->chunk;
@@ -1810,7 +1805,7 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags)
 		atomic_dec(&zmd->unmap_nr_seq);
 
 	if (dmz_is_offline(zone)) {
-		dmz_dev_warn(zmd->dev, "Zone %u is offline", dmz_id(zmd, zone));
+		dmz_dev_warn(zmd->dev, "Zone %u is offline", zone->id);
 		zone = NULL;
 		goto again;
 	}
@@ -1852,7 +1847,7 @@ void dmz_map_zone(struct dmz_metadata *zmd, struct dm_zone *dzone,
 		  unsigned int chunk)
 {
 	/* Set the chunk mapping */
-	dmz_set_chunk_mapping(zmd, chunk, dmz_id(zmd, dzone),
+	dmz_set_chunk_mapping(zmd, chunk, dzone->id,
 			      DMZ_MAP_UNMAPPED);
 	dzone->chunk = chunk;
 	if (dmz_is_rnd(dzone))
@@ -1880,7 +1875,7 @@ void dmz_unmap_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 		 * Unmapping the chunk buffer zone: clear only
 		 * the chunk buffer mapping
 		 */
-		dzone_id = dmz_id(zmd, zone->bzone);
+		dzone_id = zone->bzone->id;
 		zone->bzone->bzone = NULL;
 		zone->bzone = NULL;
 
@@ -1942,7 +1937,7 @@ static struct dmz_mblock *dmz_get_bitmap(struct dmz_metadata *zmd,
 					 sector_t chunk_block)
 {
 	sector_t bitmap_block = 1 + zmd->nr_map_blocks +
-		(sector_t)(dmz_id(zmd, zone) * zmd->zone_nr_bitmap_blocks) +
+		(sector_t)(zone->id * zmd->zone_nr_bitmap_blocks) +
 		(chunk_block >> DMZ_BLOCK_SHIFT_BITS);
 
 	return dmz_get_mblock(zmd, bitmap_block);
@@ -2022,7 +2017,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	unsigned int n = 0;
 
 	dmz_dev_debug(zmd->dev, "=> VALIDATE zone %u, block %llu, %u blocks",
-		      dmz_id(zmd, zone), (unsigned long long)chunk_block,
+		      zone->id, (unsigned long long)chunk_block,
 		      nr_blocks);
 
 	WARN_ON(chunk_block + nr_blocks > zone_nr_blocks);
@@ -2052,7 +2047,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 		zone->weight += n;
 	else {
 		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be <= %u",
-			     dmz_id(zmd, zone), zone->weight,
+			     zone->id, zone->weight,
 			     zone_nr_blocks - n);
 		zone->weight = zone_nr_blocks;
 	}
@@ -2102,7 +2097,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	unsigned int n = 0;
 
 	dmz_dev_debug(zmd->dev, "=> INVALIDATE zone %u, block %llu, %u blocks",
-		      dmz_id(zmd, zone), (u64)chunk_block, nr_blocks);
+		      zone->id, (u64)chunk_block, nr_blocks);
 
 	WARN_ON(chunk_block + nr_blocks > zmd->dev->zone_nr_blocks);
 
@@ -2132,7 +2127,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 		zone->weight -= n;
 	else {
 		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be >= %u",
-			     dmz_id(zmd, zone), zone->weight, n);
+			     zone->id, zone->weight, n);
 		zone->weight = 0;
 	}
 
@@ -2378,7 +2373,7 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 {
 	struct dmz_metadata *zmd;
-	unsigned int i, zid;
+	unsigned int i;
 	struct dm_zone *zone;
 	int ret;
 
@@ -2419,9 +2414,8 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 		goto err;
 
 	/* Set metadata zones starting from sb_zone */
-	zid = dmz_id(zmd, zmd->sb_zone);
 	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
-		zone = dmz_get(zmd, zid + i);
+		zone = dmz_get(zmd, zmd->sb_zone->id + i);
 		if (!dmz_is_rnd(zone))
 			goto err;
 		set_bit(DMZ_META, &zone->flags);
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index e7ace908a9b7..7f57c4299a2f 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -80,7 +80,7 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
 	if (ret) {
 		dmz_dev_err(zrc->dev,
 			    "Align zone %u wp %llu to %llu (wp+%u) blocks failed %d",
-			    dmz_id(zmd, zone), (unsigned long long)wp_block,
+			    zone->id, (unsigned long long)wp_block,
 			    (unsigned long long)block, nr_blocks, ret);
 		dmz_check_bdev(zrc->dev);
 		return ret;
@@ -196,8 +196,8 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 
 	dmz_dev_debug(zrc->dev,
 		      "Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
-		      dzone->chunk, dmz_id(zmd, bzone), dmz_weight(bzone),
-		      dmz_id(zmd, dzone), dmz_weight(dzone));
+		      dzone->chunk, bzone->id, dmz_weight(bzone),
+		      dzone->id, dmz_weight(dzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, bzone, dzone);
@@ -235,8 +235,8 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 
 	dmz_dev_debug(zrc->dev,
 		      "Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
-		      chunk, dmz_id(zmd, dzone), dmz_weight(dzone),
-		      dmz_id(zmd, bzone), dmz_weight(bzone));
+		      chunk, dzone->id, dmz_weight(dzone),
+		      bzone->id, dmz_weight(bzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, dzone, bzone);
@@ -287,8 +287,7 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 
 	dmz_dev_debug(zrc->dev,
 		      "Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
-		      chunk, dmz_id(zmd, dzone), dmz_weight(dzone),
-		      dmz_id(zmd, szone));
+		      chunk, dzone->id, dmz_weight(dzone), szone->id);
 
 	/* Flush the random data zone into the sequential zone */
 	ret = dmz_reclaim_copy(zrc, dzone, szone);
@@ -403,12 +402,12 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 	if (ret) {
 		dmz_dev_debug(zrc->dev,
 			      "Metadata flush for zone %u failed, err %d\n",
-			      dmz_id(zmd, rzone), ret);
+			      rzone->id, ret);
 		return ret;
 	}
 
 	dmz_dev_debug(zrc->dev, "Reclaimed zone %u in %u ms",
-		      dmz_id(zmd, rzone), jiffies_to_msecs(jiffies - start));
+		      rzone->id, jiffies_to_msecs(jiffies - start));
 	return 0;
 }
 
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 0bfe34162dbb..859ccc30ba7f 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -180,7 +180,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 	dmz_dev_debug(dmz->dev, "READ chunk %llu -> %s zone %u, block %llu, %u blocks",
 		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      dmz_id(dmz->metadata, zone),
+		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
 	/* Check block validity to determine the read location */
@@ -317,7 +317,7 @@ static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
 	dmz_dev_debug(dmz->dev, "WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
 		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      dmz_id(dmz->metadata, zone),
+		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
 	if (dmz_is_rnd(zone) || chunk_block == zone->wp_block) {
@@ -357,7 +357,7 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 
 	dmz_dev_debug(dmz->dev, "DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
 		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
-		      dmz_id(zmd, zone),
+		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
 	/*
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 884c0e586082..30781646741a 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -87,6 +87,9 @@ struct dm_zone {
 	/* Zone activation reference count */
 	atomic_t		refcount;
 
+	/* Zone id */
+	unsigned int		id;
+
 	/* Zone write pointer block (relative to the zone start block) */
 	unsigned int		wp_block;
 
@@ -176,7 +179,6 @@ void dmz_lock_flush(struct dmz_metadata *zmd);
 void dmz_unlock_flush(struct dmz_metadata *zmd);
 int dmz_flush_metadata(struct dmz_metadata *zmd);
 
-unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
-- 
2.16.4


* [PATCH 04/15] dm-zoned: use array for superblock zones
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Instead of storing just the first superblock zone and calculating
the secondary relative to that, we should be using an array for
holding the superblock zones.
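
A minimal userspace sketch of the idea follows. It assumes, as the patch does
for the default case, that the secondary superblock zone is nr_meta_zones
zones after the primary one when it has not been set explicitly. The types
and function names are simplified stand-ins, not the kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for struct dm_zone and struct dmz_sb. */
struct zone {
	unsigned int id;
};

struct sb_slot {
	uint64_t block;    /* start block of the superblock */
	struct zone *zone; /* zone that holds the superblock */
};

static uint64_t zone_start_block(const struct zone *z, unsigned int shift)
{
	return (uint64_t)z->id << shift;
}

/*
 * Fill in both superblock locations. The primary zone must be set by the
 * caller; the secondary zone defaults to primary + nr_meta_zones.
 */
static void place_superblocks(struct sb_slot sb[2], unsigned int nr_meta_zones,
			      unsigned int zone_nr_blocks_shift)
{
	assert(sb[0].zone != NULL);
	sb[0].block = zone_start_block(sb[0].zone, zone_nr_blocks_shift);
	if (!sb[1].zone)
		sb[1].zone = sb[0].zone + nr_meta_zones;
	sb[1].block = zone_start_block(sb[1].zone, zone_nr_blocks_shift);
}
```

Keeping a zone pointer per slot means each superblock's block address can be
derived from its own zone, so the two sets no longer have to be adjacent.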

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 41 +++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 1993eeb26bc1..900b1c1224f5 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -124,6 +124,7 @@ struct dmz_sb {
 	sector_t		block;
 	struct dmz_mblock	*mblk;
 	struct dmz_super	*sb;
+	struct dm_zone		*zone;
 };
 
 /*
@@ -150,7 +151,6 @@ struct dmz_metadata {
 	/* Zone information array */
 	struct dm_zone		*zones;
 
-	struct dm_zone		*sb_zone;
 	struct dmz_sb		sb[2];
 	unsigned int		mblk_primary;
 	u64			sb_gen;
@@ -839,8 +839,9 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 /*
  * Check super block.
  */
-static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
+static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 {
+	struct dmz_super *sb = zmd->sb[set].sb;
 	unsigned int nr_meta_zones, nr_data_zones;
 	struct dmz_dev *dev = zmd->dev;
 	u32 crc, stored_crc;
@@ -932,16 +933,20 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 
 	/* Bad first super block: search for the second one */
 	zmd->sb[1].block = zmd->sb[0].block + zone_nr_blocks;
+	zmd->sb[1].zone = zmd->sb[0].zone + 1;
 	for (i = 0; i < zmd->nr_rnd_zones - 1; i++) {
 		if (dmz_read_sb(zmd, 1) != 0)
 			break;
-		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC)
+		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC) {
+			zmd->sb[1].zone += i;
 			return 0;
+		}
 		zmd->sb[1].block += zone_nr_blocks;
 	}
 
 	dmz_free_mblock(zmd, mblk);
 	zmd->sb[1].mblk = NULL;
+	zmd->sb[1].zone = NULL;
 
 	return -EIO;
 }
@@ -985,11 +990,9 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
 	dmz_dev_warn(zmd->dev, "Metadata set %u invalid: recovering", dst_set);
 
 	if (dst_set == 0)
-		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
-	else {
-		zmd->sb[1].block = zmd->sb[0].block +
-			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
-	}
+		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
+	else
+		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
 
 	page = alloc_page(GFP_NOIO);
 	if (!page)
@@ -1033,21 +1036,27 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 	u64 sb_gen[2] = {0, 0};
 	int ret;
 
+	if (!zmd->sb[0].zone) {
+		dmz_dev_err(zmd->dev, "Primary super block zone not set");
+		return -ENXIO;
+	}
+
 	/* Read and check the primary super block */
-	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
+	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
 	ret = dmz_get_sb(zmd, 0);
 	if (ret) {
 		dmz_dev_err(zmd->dev, "Read primary super block failed");
 		return ret;
 	}
 
-	ret = dmz_check_sb(zmd, zmd->sb[0].sb);
+	ret = dmz_check_sb(zmd, 0);
 
 	/* Read and check secondary super block */
 	if (ret == 0) {
 		sb_good[0] = true;
-		zmd->sb[1].block = zmd->sb[0].block +
-			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
+		if (!zmd->sb[1].zone)
+			zmd->sb[1].zone = zmd->sb[0].zone + zmd->nr_meta_zones;
+		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
 		ret = dmz_get_sb(zmd, 1);
 	} else
 		ret = dmz_lookup_secondary_sb(zmd);
@@ -1057,7 +1066,7 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		return ret;
 	}
 
-	ret = dmz_check_sb(zmd, zmd->sb[1].sb);
+	ret = dmz_check_sb(zmd, 1);
 	if (ret == 0)
 		sb_good[1] = true;
 
@@ -1142,9 +1151,9 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 		zmd->nr_useable_zones++;
 		if (dmz_is_rnd(zone)) {
 			zmd->nr_rnd_zones++;
-			if (!zmd->sb_zone) {
+			if (!zmd->sb[0].zone) {
 				/* Super block zone */
-				zmd->sb_zone = zone;
+				zmd->sb[0].zone = zone;
 			}
 		}
 	}
@@ -2415,7 +2424,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 
 	/* Set metadata zones starting from sb_zone */
 	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
-		zone = dmz_get(zmd, zmd->sb_zone->id + i);
+		zone = dmz_get(zmd, zmd->sb[0].zone->id + i);
 		if (!dmz_is_rnd(zone))
 			goto err;
 		set_bit(DMZ_META, &zone->flags);
-- 
2.16.4


* [PATCH 05/15] dm-zoned: store device in struct dmz_sb
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Store the device together with the superblock so that
we don't have to go through the metadata to find it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 90 +++++++++++++++++++++++++++---------------
 1 file changed, 59 insertions(+), 31 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 900b1c1224f5..def836e12dd9 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -122,6 +122,7 @@ enum {
  */
 struct dmz_sb {
 	sector_t		block;
+	struct dmz_dev		*dev;
 	struct dmz_mblock	*mblk;
 	struct dmz_super	*sb;
 	struct dm_zone		*zone;
@@ -197,6 +198,11 @@ sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 	return (sector_t)zone->id << zmd->dev->zone_nr_blocks_shift;
 }
 
+struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
+{
+	return &zmd->dev[0];
+}
+
 unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
 {
 	return zmd->dev->nr_zones;
@@ -412,9 +418,10 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct dmz_metadata *zmd,
 {
 	struct dmz_mblock *mblk, *m;
 	sector_t block = zmd->sb[zmd->mblk_primary].block + mblk_no;
+	struct dmz_dev *dev = zmd->sb[zmd->mblk_primary].dev;
 	struct bio *bio;
 
-	if (dmz_bdev_is_dying(zmd->dev))
+	if (dmz_bdev_is_dying(dev))
 		return ERR_PTR(-EIO);
 
 	/* Get a new block and a BIO to read it */
@@ -450,7 +457,7 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct dmz_metadata *zmd,
 
 	/* Submit read BIO */
 	bio->bi_iter.bi_sector = dmz_blk2sect(block);
-	bio_set_dev(bio, zmd->dev->bdev);
+	bio_set_dev(bio, dev->bdev);
 	bio->bi_private = mblk;
 	bio->bi_end_io = dmz_mblock_bio_end_io;
 	bio_set_op_attrs(bio, REQ_OP_READ, REQ_META | REQ_PRIO);
@@ -547,6 +554,7 @@ static struct dmz_mblock *dmz_get_mblock(struct dmz_metadata *zmd,
 					 sector_t mblk_no)
 {
 	struct dmz_mblock *mblk;
+	struct dmz_dev *dev = zmd->sb[zmd->mblk_primary].dev;
 
 	/* Check rbtree */
 	spin_lock(&zmd->mblk_lock);
@@ -565,7 +573,7 @@ static struct dmz_mblock *dmz_get_mblock(struct dmz_metadata *zmd,
 		       TASK_UNINTERRUPTIBLE);
 	if (test_bit(DMZ_META_ERROR, &mblk->state)) {
 		dmz_release_mblock(zmd, mblk);
-		dmz_check_bdev(zmd->dev);
+		dmz_check_bdev(dev);
 		return ERR_PTR(-EIO);
 	}
 
@@ -589,10 +597,11 @@ static void dmz_dirty_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk)
 static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
 			    unsigned int set)
 {
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	sector_t block = zmd->sb[set].block + mblk->no;
 	struct bio *bio;
 
-	if (dmz_bdev_is_dying(zmd->dev))
+	if (dmz_bdev_is_dying(dev))
 		return -EIO;
 
 	bio = bio_alloc(GFP_NOIO, 1);
@@ -604,7 +613,7 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
 	set_bit(DMZ_META_WRITING, &mblk->state);
 
 	bio->bi_iter.bi_sector = dmz_blk2sect(block);
-	bio_set_dev(bio, zmd->dev->bdev);
+	bio_set_dev(bio, dev->bdev);
 	bio->bi_private = mblk;
 	bio->bi_end_io = dmz_mblock_bio_end_io;
 	bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_META | REQ_PRIO);
@@ -617,13 +626,13 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
 /*
  * Read/write a metadata block.
  */
-static int dmz_rdwr_block(struct dmz_metadata *zmd, int op, sector_t block,
-			  struct page *page)
+static int dmz_rdwr_block(struct dmz_dev *dev, int op,
+			  sector_t block, struct page *page)
 {
 	struct bio *bio;
 	int ret;
 
-	if (dmz_bdev_is_dying(zmd->dev))
+	if (dmz_bdev_is_dying(dev))
 		return -EIO;
 
 	bio = bio_alloc(GFP_NOIO, 1);
@@ -631,14 +640,14 @@ static int dmz_rdwr_block(struct dmz_metadata *zmd, int op, sector_t block,
 		return -ENOMEM;
 
 	bio->bi_iter.bi_sector = dmz_blk2sect(block);
-	bio_set_dev(bio, zmd->dev->bdev);
+	bio_set_dev(bio, dev->bdev);
 	bio_set_op_attrs(bio, op, REQ_SYNC | REQ_META | REQ_PRIO);
 	bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0);
 	ret = submit_bio_wait(bio);
 	bio_put(bio);
 
 	if (ret)
-		dmz_check_bdev(zmd->dev);
+		dmz_check_bdev(dev);
 	return ret;
 }
 
@@ -650,6 +659,7 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	sector_t block = zmd->sb[set].block;
 	struct dmz_mblock *mblk = zmd->sb[set].mblk;
 	struct dmz_super *sb = zmd->sb[set].sb;
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	u64 sb_gen = zmd->sb_gen + 1;
 	int ret;
 
@@ -669,9 +679,9 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	sb->crc = 0;
 	sb->crc = cpu_to_le32(crc32_le(sb_gen, (unsigned char *)sb, DMZ_BLOCK_SIZE));
 
-	ret = dmz_rdwr_block(zmd, REQ_OP_WRITE, block, mblk->page);
+	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, block, mblk->page);
 	if (ret == 0)
-		ret = blkdev_issue_flush(zmd->dev->bdev, GFP_NOIO, NULL);
+		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 
 	return ret;
 }
@@ -684,6 +694,7 @@ static int dmz_write_dirty_mblocks(struct dmz_metadata *zmd,
 				   unsigned int set)
 {
 	struct dmz_mblock *mblk;
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	struct blk_plug plug;
 	int ret = 0, nr_mblks_submitted = 0;
 
@@ -705,7 +716,7 @@ static int dmz_write_dirty_mblocks(struct dmz_metadata *zmd,
 			       TASK_UNINTERRUPTIBLE);
 		if (test_bit(DMZ_META_ERROR, &mblk->state)) {
 			clear_bit(DMZ_META_ERROR, &mblk->state);
-			dmz_check_bdev(zmd->dev);
+			dmz_check_bdev(dev);
 			ret = -EIO;
 		}
 		nr_mblks_submitted--;
@@ -713,7 +724,7 @@ static int dmz_write_dirty_mblocks(struct dmz_metadata *zmd,
 
 	/* Flush drive cache (this will also sync data) */
 	if (ret == 0)
-		ret = blkdev_issue_flush(zmd->dev->bdev, GFP_NOIO, NULL);
+		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 
 	return ret;
 }
@@ -750,6 +761,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 {
 	struct dmz_mblock *mblk;
 	struct list_head write_list;
+	struct dmz_dev *dev;
 	int ret;
 
 	if (WARN_ON(!zmd))
@@ -763,6 +775,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	 * from modifying metadata.
 	 */
 	down_write(&zmd->mblk_sem);
+	dev = zmd->sb[zmd->mblk_primary].dev;
 
 	/*
 	 * This is called from the target flush work and reclaim work.
@@ -770,7 +783,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	 */
 	dmz_lock_flush(zmd);
 
-	if (dmz_bdev_is_dying(zmd->dev)) {
+	if (dmz_bdev_is_dying(dev)) {
 		ret = -EIO;
 		goto out;
 	}
@@ -782,7 +795,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 
 	/* If there are no dirty metadata blocks, just flush the device cache */
 	if (list_empty(&write_list)) {
-		ret = blkdev_issue_flush(zmd->dev->bdev, GFP_NOIO, NULL);
+		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 		goto err;
 	}
 
@@ -831,7 +844,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 		list_splice(&write_list, &zmd->mblk_dirty_list);
 		spin_unlock(&zmd->mblk_lock);
 	}
-	if (!dmz_check_bdev(zmd->dev))
+	if (!dmz_check_bdev(dev))
 		ret = -EIO;
 	goto out;
 }
@@ -842,8 +855,8 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 {
 	struct dmz_super *sb = zmd->sb[set].sb;
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	unsigned int nr_meta_zones, nr_data_zones;
-	struct dmz_dev *dev = zmd->dev;
 	u32 crc, stored_crc;
 	u64 gen;
 
@@ -908,8 +921,8 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
  */
 static int dmz_read_sb(struct dmz_metadata *zmd, unsigned int set)
 {
-	return dmz_rdwr_block(zmd, REQ_OP_READ, zmd->sb[set].block,
-			      zmd->sb[set].mblk->page);
+	return dmz_rdwr_block(zmd->sb[set].dev, REQ_OP_READ,
+			      zmd->sb[set].block, zmd->sb[set].mblk->page);
 }
 
 /*
@@ -934,6 +947,7 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 	/* Bad first super block: search for the second one */
 	zmd->sb[1].block = zmd->sb[0].block + zone_nr_blocks;
 	zmd->sb[1].zone = zmd->sb[0].zone + 1;
+	zmd->sb[1].dev = dmz_zone_to_dev(zmd, zmd->sb[1].zone);
 	for (i = 0; i < zmd->nr_rnd_zones - 1; i++) {
 		if (dmz_read_sb(zmd, 1) != 0)
 			break;
@@ -942,11 +956,13 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 			return 0;
 		}
 		zmd->sb[1].block += zone_nr_blocks;
+		zmd->sb[1].dev = dmz_zone_to_dev(zmd, zmd->sb[1].zone + i);
 	}
 
 	dmz_free_mblock(zmd, mblk);
 	zmd->sb[1].mblk = NULL;
 	zmd->sb[1].zone = NULL;
+	zmd->sb[1].dev = NULL;
 
 	return -EIO;
 }
@@ -987,7 +1003,8 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
 	struct page *page;
 	int i, ret;
 
-	dmz_dev_warn(zmd->dev, "Metadata set %u invalid: recovering", dst_set);
+	dmz_dev_warn(zmd->sb[dst_set].dev,
+		     "Metadata set %u invalid: recovering", dst_set);
 
 	if (dst_set == 0)
 		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
@@ -1000,11 +1017,11 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
 
 	/* Copy metadata blocks */
 	for (i = 1; i < zmd->nr_meta_blocks; i++) {
-		ret = dmz_rdwr_block(zmd, REQ_OP_READ,
+		ret = dmz_rdwr_block(zmd->sb[src_set].dev, REQ_OP_READ,
 				     zmd->sb[src_set].block + i, page);
 		if (ret)
 			goto out;
-		ret = dmz_rdwr_block(zmd, REQ_OP_WRITE,
+		ret = dmz_rdwr_block(zmd->sb[dst_set].dev, REQ_OP_WRITE,
 				     zmd->sb[dst_set].block + i, page);
 		if (ret)
 			goto out;
@@ -1043,9 +1060,10 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 
 	/* Read and check the primary super block */
 	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
+	zmd->sb[0].dev = dmz_zone_to_dev(zmd, zmd->sb[0].zone);
 	ret = dmz_get_sb(zmd, 0);
 	if (ret) {
-		dmz_dev_err(zmd->dev, "Read primary super block failed");
+		dmz_dev_err(zmd->sb[0].dev, "Read primary super block failed");
 		return ret;
 	}
 
@@ -1057,12 +1075,13 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		if (!zmd->sb[1].zone)
 			zmd->sb[1].zone = zmd->sb[0].zone + zmd->nr_meta_zones;
 		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
+		zmd->sb[1].dev = dmz_zone_to_dev(zmd, zmd->sb[1].zone);
 		ret = dmz_get_sb(zmd, 1);
 	} else
 		ret = dmz_lookup_secondary_sb(zmd);
 
 	if (ret) {
-		dmz_dev_err(zmd->dev, "Read secondary super block failed");
+		dmz_dev_err(zmd->sb[1].dev, "Read secondary super block failed");
 		return ret;
 	}
 
@@ -1078,17 +1097,25 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 
 	if (sb_good[0])
 		sb_gen[0] = le64_to_cpu(zmd->sb[0].sb->gen);
-	else
+	else {
 		ret = dmz_recover_mblocks(zmd, 0);
+		if (ret) {
+			dmz_dev_err(zmd->sb[0].dev,
+				    "Recovery of superblock 0 failed");
+			return -EIO;
+		}
+	}
 
 	if (sb_good[1])
 		sb_gen[1] = le64_to_cpu(zmd->sb[1].sb->gen);
-	else
+	else {
 		ret = dmz_recover_mblocks(zmd, 1);
 
-	if (ret) {
-		dmz_dev_err(zmd->dev, "Recovery failed");
-		return -EIO;
+		if (ret) {
+			dmz_dev_err(zmd->sb[1].dev,
+				    "Recovery of superblock 1 failed");
+			return -EIO;
+		}
 	}
 
 	if (sb_gen[0] >= sb_gen[1]) {
@@ -1099,7 +1126,8 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		zmd->mblk_primary = 1;
 	}
 
-	dmz_dev_debug(zmd->dev, "Using super block %u (gen %llu)",
+	dmz_dev_debug(zmd->sb[zmd->mblk_primary].dev,
+		      "Using super block %u (gen %llu)",
 		      zmd->mblk_primary, zmd->sb_gen);
 
 	return 0;
-- 
2.16.4


* [PATCH 06/15] dm-zoned: move fields from struct dmz_dev to dmz_metadata
  2020-05-11  8:24 [PATCHv6 00/14] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (4 preceding siblings ...)
  2020-05-11  8:24 ` [PATCH 05/15] dm-zoned: store device in struct dmz_sb Hannes Reinecke
@ 2020-05-11  8:24 ` Hannes Reinecke
  2020-05-11  8:24 ` [PATCH 07/15] dm-zoned: introduce dmz_metadata_label() to format device name Hannes Reinecke
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

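Move the derived zone geometry fields (zone_nr_sectors_shift,
zone_nr_blocks, zone_nr_blocks_shift) from the device structure
into the metadata structure, keep copies of zone_nr_sectors and
nr_zones there as well, and provide accessor functions.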

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
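Note for readers: a minimal user-space sketch of how the geometry values are
derived from the zone size, mirroring the initialisation that this patch adds
to dmz_init_zones(). It is not code from the patch. The model_* names are
hypothetical, and the block-to-sector shift of 3 assumes dm-zoned's 4096-byte
block on 512-byte sectors. The zone size is assumed to be a power of two,
which the shift and mask arithmetic relies on.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BLOCK_SECTORS_SHIFT 3	/* assumed: 4096-byte block / 512-byte sector */

/* Hypothetical stand-in for the geometry fields now kept in dmz_metadata. */
struct model_geometry {
	uint64_t zone_nr_sectors;
	unsigned int zone_nr_sectors_shift;
	uint64_t zone_nr_blocks;
	unsigned int zone_nr_blocks_shift;
};

/* Integer log2 for a non-zero value, standing in for the kernel's ilog2(). */
static unsigned int model_ilog2(uint64_t v)
{
	unsigned int shift = 0;

	while (v >>= 1)
		shift++;
	return shift;
}

/* Derive all geometry values from the zone size in sectors. */
static void model_init_geometry(struct model_geometry *g, uint64_t zone_nr_sectors)
{
	/* Must be a power of two and hold at least one block. */
	assert(zone_nr_sectors >= 8);
	assert((zone_nr_sectors & (zone_nr_sectors - 1)) == 0);

	g->zone_nr_sectors = zone_nr_sectors;
	g->zone_nr_sectors_shift = model_ilog2(zone_nr_sectors);
	g->zone_nr_blocks = zone_nr_sectors >> MODEL_BLOCK_SECTORS_SHIFT;
	g->zone_nr_blocks_shift = model_ilog2(g->zone_nr_blocks);
}

/* Models dmz_bio_chunk(): chunk number of a target sector. */
static uint64_t model_bio_chunk(const struct model_geometry *g, uint64_t sector)
{
	return sector >> g->zone_nr_sectors_shift;
}

/* Models dmz_chunk_block(): block offset within its chunk. */
static uint64_t model_chunk_block(const struct model_geometry *g, uint64_t block)
{
	return block & (g->zone_nr_blocks - 1);
}
```

For a 256 MiB zone (524288 sectors) this gives a sector shift of 19, 65536
blocks per zone and a block shift of 16. Sector 1048584 then falls into
chunk 2, at block offset 1 within that chunk.
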
 drivers/md/dm-zoned-metadata.c | 88 ++++++++++++++++++++++++++++--------------
 drivers/md/dm-zoned-reclaim.c  |  8 ++--
 drivers/md/dm-zoned-target.c   | 48 +++++++++++------------
 drivers/md/dm-zoned.h          | 14 +++----
 4 files changed, 95 insertions(+), 63 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index def836e12dd9..b844ff02ae7b 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -138,9 +138,16 @@ struct dmz_metadata {
 	unsigned int		zone_nr_bitmap_blocks;
 	unsigned int		zone_bits_per_mblk;
 
+	sector_t		zone_nr_blocks;
+	sector_t		zone_nr_blocks_shift;
+
+	sector_t		zone_nr_sectors;
+	sector_t		zone_nr_sectors_shift;
+
 	unsigned int		nr_bitmap_blocks;
 	unsigned int		nr_map_blocks;
 
+	unsigned int		nr_zones;
 	unsigned int		nr_useable_zones;
 	unsigned int		nr_meta_blocks;
 	unsigned int		nr_meta_zones;
@@ -190,12 +197,12 @@ struct dmz_metadata {
  */
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->dev->zone_nr_sectors_shift;
+	return (sector_t)zone->id << zmd->zone_nr_sectors_shift;
 }
 
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->dev->zone_nr_blocks_shift;
+	return (sector_t)zone->id << zmd->zone_nr_blocks_shift;
 }
 
 struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
@@ -203,9 +210,29 @@ struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
 	return &zmd->dev[0];
 }
 
+unsigned int dmz_zone_nr_blocks(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_blocks;
+}
+
+unsigned int dmz_zone_nr_blocks_shift(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_blocks_shift;
+}
+
+unsigned int dmz_zone_nr_sectors(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_sectors;
+}
+
+unsigned int dmz_zone_nr_sectors_shift(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_sectors_shift;
+}
+
 unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
 {
-	return zmd->dev->nr_zones;
+	return zmd->nr_zones;
 }
 
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd)
@@ -882,8 +909,8 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 		return -ENXIO;
 	}
 
-	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + dev->zone_nr_blocks - 1)
-		>> dev->zone_nr_blocks_shift;
+	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + zmd->zone_nr_blocks - 1)
+		>> zmd->zone_nr_blocks_shift;
 	if (!nr_meta_zones ||
 	    nr_meta_zones >= zmd->nr_rnd_zones) {
 		dmz_dev_err(dev, "Invalid number of metadata blocks");
@@ -932,7 +959,7 @@ static int dmz_read_sb(struct dmz_metadata *zmd, unsigned int set)
  */
 static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 {
-	unsigned int zone_nr_blocks = zmd->dev->zone_nr_blocks;
+	unsigned int zone_nr_blocks = zmd->zone_nr_blocks;
 	struct dmz_mblock *mblk;
 	int i;
 
@@ -1143,7 +1170,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 	struct dmz_dev *dev = zmd->dev;
 
 	/* Ignore the eventual last runt (smaller) zone */
-	if (blkz->len != dev->zone_nr_sectors) {
+	if (blkz->len != zmd->zone_nr_sectors) {
 		if (blkz->start + blkz->len == dev->capacity)
 			return 0;
 		return -ENXIO;
@@ -1208,19 +1235,24 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	int ret;
 
 	/* Init */
-	zmd->zone_bitmap_size = dev->zone_nr_blocks >> 3;
+	zmd->zone_nr_sectors = dev->zone_nr_sectors;
+	zmd->zone_nr_sectors_shift = ilog2(zmd->zone_nr_sectors);
+	zmd->zone_nr_blocks = dmz_sect2blk(zmd->zone_nr_sectors);
+	zmd->zone_nr_blocks_shift = ilog2(zmd->zone_nr_blocks);
+	zmd->zone_bitmap_size = zmd->zone_nr_blocks >> 3;
 	zmd->zone_nr_bitmap_blocks =
 		max_t(sector_t, 1, zmd->zone_bitmap_size >> DMZ_BLOCK_SHIFT);
-	zmd->zone_bits_per_mblk = min_t(sector_t, dev->zone_nr_blocks,
+	zmd->zone_bits_per_mblk = min_t(sector_t, zmd->zone_nr_blocks,
 					DMZ_BLOCK_SIZE_BITS);
 
 	/* Allocate zone array */
-	zmd->zones = kcalloc(dev->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
+	zmd->nr_zones = dev->nr_zones;
+	zmd->zones = kcalloc(zmd->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
 	if (!zmd->zones)
 		return -ENOMEM;
 
 	dmz_dev_info(dev, "Using %zu B for zone information",
-		     sizeof(struct dm_zone) * dev->nr_zones);
+		     sizeof(struct dm_zone) * zmd->nr_zones);
 
 	/*
 	 * Get zone information and initialize zone descriptors.  At the same
@@ -1339,7 +1371,7 @@ static int dmz_reset_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 
 		ret = blkdev_zone_mgmt(dev->bdev, REQ_OP_ZONE_RESET,
 				       dmz_start_sect(zmd, zone),
-				       dev->zone_nr_sectors, GFP_NOIO);
+				       zmd->zone_nr_sectors, GFP_NOIO);
 		if (ret) {
 			dmz_dev_err(dev, "Reset zone %u failed %d",
 				    zone->id, ret);
@@ -1393,7 +1425,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 		if (dzone_id == DMZ_MAP_UNMAPPED)
 			goto next;
 
-		if (dzone_id >= dev->nr_zones) {
+		if (dzone_id >= zmd->nr_zones) {
 			dmz_dev_err(dev, "Chunk %u mapping: invalid data zone ID %u",
 				    chunk, dzone_id);
 			return -EIO;
@@ -1414,7 +1446,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 		if (bzone_id == DMZ_MAP_UNMAPPED)
 			goto next;
 
-		if (bzone_id >= dev->nr_zones) {
+		if (bzone_id >= zmd->nr_zones) {
 			dmz_dev_err(dev, "Chunk %u mapping: invalid buffer zone ID %u",
 				    chunk, bzone_id);
 			return -EIO;
@@ -1446,7 +1478,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 	 * fully initialized. All remaining zones are unmapped data
 	 * zones. Finish initializing those here.
 	 */
-	for (i = 0; i < dev->nr_zones; i++) {
+	for (i = 0; i < zmd->nr_zones; i++) {
 		dzone = dmz_get(zmd, i);
 		if (dmz_is_meta(dzone))
 			continue;
@@ -1990,7 +2022,7 @@ int dmz_copy_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 	sector_t chunk_block = 0;
 
 	/* Get the zones bitmap blocks */
-	while (chunk_block < zmd->dev->zone_nr_blocks) {
+	while (chunk_block < zmd->zone_nr_blocks) {
 		from_mblk = dmz_get_bitmap(zmd, from_zone, chunk_block);
 		if (IS_ERR(from_mblk))
 			return PTR_ERR(from_mblk);
@@ -2025,7 +2057,7 @@ int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 	int ret;
 
 	/* Get the zones bitmap blocks */
-	while (chunk_block < zmd->dev->zone_nr_blocks) {
+	while (chunk_block < zmd->zone_nr_blocks) {
 		/* Get a valid region from the source zone */
 		ret = dmz_first_valid_block(zmd, from_zone, &chunk_block);
 		if (ret <= 0)
@@ -2049,7 +2081,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 			sector_t chunk_block, unsigned int nr_blocks)
 {
 	unsigned int count, bit, nr_bits;
-	unsigned int zone_nr_blocks = zmd->dev->zone_nr_blocks;
+	unsigned int zone_nr_blocks = zmd->zone_nr_blocks;
 	struct dmz_mblock *mblk;
 	unsigned int n = 0;
 
@@ -2136,7 +2168,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	dmz_dev_debug(zmd->dev, "=> INVALIDATE zone %u, block %llu, %u blocks",
 		      zone->id, (u64)chunk_block, nr_blocks);
 
-	WARN_ON(chunk_block + nr_blocks > zmd->dev->zone_nr_blocks);
+	WARN_ON(chunk_block + nr_blocks > zmd->zone_nr_blocks);
 
 	while (nr_blocks) {
 		/* Get bitmap block */
@@ -2180,7 +2212,7 @@ static int dmz_test_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	struct dmz_mblock *mblk;
 	int ret;
 
-	WARN_ON(chunk_block >= zmd->dev->zone_nr_blocks);
+	WARN_ON(chunk_block >= zmd->zone_nr_blocks);
 
 	/* Get bitmap block */
 	mblk = dmz_get_bitmap(zmd, zone, chunk_block);
@@ -2210,7 +2242,7 @@ static int dmz_to_next_set_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	unsigned long *bitmap;
 	int n = 0;
 
-	WARN_ON(chunk_block + nr_blocks > zmd->dev->zone_nr_blocks);
+	WARN_ON(chunk_block + nr_blocks > zmd->zone_nr_blocks);
 
 	while (nr_blocks) {
 		/* Get bitmap block */
@@ -2254,7 +2286,7 @@ int dmz_block_valid(struct dmz_metadata *zmd, struct dm_zone *zone,
 
 	/* The block is valid: get the number of valid blocks from block */
 	return dmz_to_next_set_block(zmd, zone, chunk_block,
-				     zmd->dev->zone_nr_blocks - chunk_block, 0);
+				     zmd->zone_nr_blocks - chunk_block, 0);
 }
 
 /*
@@ -2270,7 +2302,7 @@ int dmz_first_valid_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	int ret;
 
 	ret = dmz_to_next_set_block(zmd, zone, start_block,
-				    zmd->dev->zone_nr_blocks - start_block, 1);
+				    zmd->zone_nr_blocks - start_block, 1);
 	if (ret < 0)
 		return ret;
 
@@ -2278,7 +2310,7 @@ int dmz_first_valid_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	*chunk_block = start_block;
 
 	return dmz_to_next_set_block(zmd, zone, start_block,
-				     zmd->dev->zone_nr_blocks - start_block, 0);
+				     zmd->zone_nr_blocks - start_block, 0);
 }
 
 /*
@@ -2317,7 +2349,7 @@ static void dmz_get_zone_weight(struct dmz_metadata *zmd, struct dm_zone *zone)
 	struct dmz_mblock *mblk;
 	sector_t chunk_block = 0;
 	unsigned int bit, nr_bits;
-	unsigned int nr_blocks = zmd->dev->zone_nr_blocks;
+	unsigned int nr_blocks = zmd->zone_nr_blocks;
 	void *bitmap;
 	int n = 0;
 
@@ -2488,7 +2520,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
 		     (u64)dev->capacity);
 	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
-		     dev->nr_zones, (u64)dev->zone_nr_sectors);
+		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
 	dmz_dev_info(dev, "  %u metadata zones",
 		     zmd->nr_meta_zones * 2);
 	dmz_dev_info(dev, "  %u data zones for %u chunks",
@@ -2541,7 +2573,7 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 	int ret;
 
 	/* Check zones */
-	for (i = 0; i < dev->nr_zones; i++) {
+	for (i = 0; i < zmd->nr_zones; i++) {
 		zone = dmz_get(zmd, i);
 		if (!zone) {
 			dmz_dev_err(dev, "Unable to get zone %u", i);
@@ -2569,7 +2601,7 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 				    i, (u64)zone->wp_block, (u64)wp_block);
 			zone->wp_block = wp_block;
 			dmz_invalidate_blocks(zmd, zone, zone->wp_block,
-					      dev->zone_nr_blocks - zone->wp_block);
+					      zmd->zone_nr_blocks - zone->wp_block);
 		}
 	}
 
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 7f57c4299a2f..5aa5e5130fe8 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -128,7 +128,7 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 	if (dmz_is_seq(src_zone))
 		end_block = src_zone->wp_block;
 	else
-		end_block = dev->zone_nr_blocks;
+		end_block = dmz_zone_nr_blocks(zmd);
 	src_zone_block = dmz_start_block(zmd, src_zone);
 	dst_zone_block = dmz_start_block(zmd, dst_zone);
 
@@ -210,7 +210,7 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	ret = dmz_merge_valid_blocks(zmd, bzone, dzone, chunk_block);
 	if (ret == 0) {
 		/* Free the buffer zone */
-		dmz_invalidate_blocks(zmd, bzone, 0, zrc->dev->zone_nr_blocks);
+		dmz_invalidate_blocks(zmd, bzone, 0, dmz_zone_nr_blocks(zmd));
 		dmz_lock_map(zmd);
 		dmz_unmap_zone(zmd, bzone);
 		dmz_unlock_zone_reclaim(dzone);
@@ -252,7 +252,7 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 		 * Free the data zone and remap the chunk to
 		 * the buffer zone.
 		 */
-		dmz_invalidate_blocks(zmd, dzone, 0, zrc->dev->zone_nr_blocks);
+		dmz_invalidate_blocks(zmd, dzone, 0, dmz_zone_nr_blocks(zmd));
 		dmz_lock_map(zmd);
 		dmz_unmap_zone(zmd, bzone);
 		dmz_unmap_zone(zmd, dzone);
@@ -305,7 +305,7 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 		dmz_unlock_map(zmd);
 	} else {
 		/* Free the data zone and remap the chunk */
-		dmz_invalidate_blocks(zmd, dzone, 0, zrc->dev->zone_nr_blocks);
+		dmz_invalidate_blocks(zmd, dzone, 0, dmz_zone_nr_blocks(zmd));
 		dmz_lock_map(zmd);
 		dmz_unmap_zone(zmd, dzone);
 		dmz_unlock_zone_reclaim(dzone);
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 859ccc30ba7f..68c5684d7b01 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -165,7 +165,8 @@ static void dmz_handle_read_zero(struct dmz_target *dmz, struct bio *bio,
 static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 			   struct bio *bio)
 {
-	sector_t chunk_block = dmz_chunk_block(dmz->dev, dmz_bio_block(bio));
+	struct dmz_metadata *zmd = dmz->metadata;
+	sector_t chunk_block = dmz_chunk_block(zmd, dmz_bio_block(bio));
 	unsigned int nr_blocks = dmz_bio_blocks(bio);
 	sector_t end_block = chunk_block + nr_blocks;
 	struct dm_zone *rzone, *bzone;
@@ -178,7 +179,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 	}
 
 	dmz_dev_debug(dmz->dev, "READ chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
 		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
@@ -189,7 +190,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 		nr_blocks = 0;
 		if (dmz_is_rnd(zone) || chunk_block < zone->wp_block) {
 			/* Test block validity in the data zone */
-			ret = dmz_block_valid(dmz->metadata, zone, chunk_block);
+			ret = dmz_block_valid(zmd, zone, chunk_block);
 			if (ret < 0)
 				return ret;
 			if (ret > 0) {
@@ -204,7 +205,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 		 * Check the buffer zone, if there is one.
 		 */
 		if (!nr_blocks && bzone) {
-			ret = dmz_block_valid(dmz->metadata, bzone, chunk_block);
+			ret = dmz_block_valid(zmd, bzone, chunk_block);
 			if (ret < 0)
 				return ret;
 			if (ret > 0) {
@@ -308,14 +309,15 @@ static int dmz_handle_buffered_write(struct dmz_target *dmz,
 static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
 			    struct bio *bio)
 {
-	sector_t chunk_block = dmz_chunk_block(dmz->dev, dmz_bio_block(bio));
+	struct dmz_metadata *zmd = dmz->metadata;
+	sector_t chunk_block = dmz_chunk_block(zmd, dmz_bio_block(bio));
 	unsigned int nr_blocks = dmz_bio_blocks(bio);
 
 	if (!zone)
 		return -ENOSPC;
 
 	dmz_dev_debug(dmz->dev, "WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
 		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
@@ -345,7 +347,7 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 	struct dmz_metadata *zmd = dmz->metadata;
 	sector_t block = dmz_bio_block(bio);
 	unsigned int nr_blocks = dmz_bio_blocks(bio);
-	sector_t chunk_block = dmz_chunk_block(dmz->dev, block);
+	sector_t chunk_block = dmz_chunk_block(zmd, block);
 	int ret = 0;
 
 	/* For unmapped chunks, there is nothing to do */
@@ -356,7 +358,7 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 		return -EROFS;
 
 	dmz_dev_debug(dmz->dev, "DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
 		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
@@ -402,7 +404,7 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
 	 * mapping for read and discard. If a mapping is obtained,
 	 + the zone returned will be set to active state.
 	 */
-	zone = dmz_get_chunk_mapping(zmd, dmz_bio_chunk(dmz->dev, bio),
+	zone = dmz_get_chunk_mapping(zmd, dmz_bio_chunk(zmd, bio),
 				     bio_op(bio));
 	if (IS_ERR(zone)) {
 		ret = PTR_ERR(zone);
@@ -525,7 +527,7 @@ static void dmz_flush_work(struct work_struct *work)
  */
 static int dmz_queue_chunk_work(struct dmz_target *dmz, struct bio *bio)
 {
-	unsigned int chunk = dmz_bio_chunk(dmz->dev, bio);
+	unsigned int chunk = dmz_bio_chunk(dmz->metadata, bio);
 	struct dm_chunk_work *cw;
 	int ret = 0;
 
@@ -618,6 +620,7 @@ bool dmz_check_bdev(struct dmz_dev *dmz_dev)
 static int dmz_map(struct dm_target *ti, struct bio *bio)
 {
 	struct dmz_target *dmz = ti->private;
+	struct dmz_metadata *zmd = dmz->metadata;
 	struct dmz_dev *dev = dmz->dev;
 	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
 	sector_t sector = bio->bi_iter.bi_sector;
@@ -630,8 +633,8 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 
 	dmz_dev_debug(dev, "BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
 		      bio_op(bio), (unsigned long long)sector, nr_sectors,
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
-		      (unsigned long long)dmz_chunk_block(dmz->dev, dmz_bio_block(bio)),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
+		      (unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
 		      (unsigned int)dmz_bio_blocks(bio));
 
 	bio_set_dev(bio, dev->bdev);
@@ -659,16 +662,16 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	}
 
 	/* Split zone BIOs to fit entirely into a zone */
-	chunk_sector = sector & (dev->zone_nr_sectors - 1);
-	if (chunk_sector + nr_sectors > dev->zone_nr_sectors)
-		dm_accept_partial_bio(bio, dev->zone_nr_sectors - chunk_sector);
+	chunk_sector = sector & (dmz_zone_nr_sectors(zmd) - 1);
+	if (chunk_sector + nr_sectors > dmz_zone_nr_sectors(zmd))
+		dm_accept_partial_bio(bio, dmz_zone_nr_sectors(zmd) - chunk_sector);
 
 	/* Now ready to handle this BIO */
 	ret = dmz_queue_chunk_work(dmz, bio);
 	if (ret) {
 		dmz_dev_debug(dmz->dev,
 			      "BIO op %d, can't process chunk %llu, err %i\n",
-			      bio_op(bio), (u64)dmz_bio_chunk(dmz->dev, bio),
+			      bio_op(bio), (u64)dmz_bio_chunk(zmd, bio),
 			      ret);
 		return DM_MAPIO_REQUEUE;
 	}
@@ -722,10 +725,6 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
 	}
 
 	dev->zone_nr_sectors = blk_queue_zone_sectors(q);
-	dev->zone_nr_sectors_shift = ilog2(dev->zone_nr_sectors);
-
-	dev->zone_nr_blocks = dmz_sect2blk(dev->zone_nr_sectors);
-	dev->zone_nr_blocks_shift = ilog2(dev->zone_nr_blocks);
 
 	dev->nr_zones = blkdev_nr_zones(dev->bdev->bd_disk);
 
@@ -790,7 +789,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	}
 
 	/* Set target (no write same support) */
-	ti->max_io_len = dev->zone_nr_sectors << 9;
+	ti->max_io_len = dmz_zone_nr_sectors(dmz->metadata) << 9;
 	ti->num_flush_bios = 1;
 	ti->num_discard_bios = 1;
 	ti->num_write_zeroes_bios = 1;
@@ -799,7 +798,8 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	ti->discards_supported = true;
 
 	/* The exposed capacity is the number of chunks that can be mapped */
-	ti->len = (sector_t)dmz_nr_chunks(dmz->metadata) << dev->zone_nr_sectors_shift;
+	ti->len = (sector_t)dmz_nr_chunks(dmz->metadata) <<
+		dmz_zone_nr_sectors_shift(dmz->metadata);
 
 	/* Zone BIO */
 	ret = bioset_init(&dmz->bio_set, DMZ_MIN_BIOS, 0, 0);
@@ -895,7 +895,7 @@ static void dmz_dtr(struct dm_target *ti)
 static void dmz_io_hints(struct dm_target *ti, struct queue_limits *limits)
 {
 	struct dmz_target *dmz = ti->private;
-	unsigned int chunk_sectors = dmz->dev->zone_nr_sectors;
+	unsigned int chunk_sectors = dmz_zone_nr_sectors(dmz->metadata);
 
 	limits->logical_block_size = DMZ_BLOCK_SIZE;
 	limits->physical_block_size = DMZ_BLOCK_SIZE;
@@ -960,7 +960,7 @@ static int dmz_iterate_devices(struct dm_target *ti,
 {
 	struct dmz_target *dmz = ti->private;
 	struct dmz_dev *dev = dmz->dev;
-	sector_t capacity = dev->capacity & ~(dev->zone_nr_sectors - 1);
+	sector_t capacity = dev->capacity & ~(dmz_zone_nr_sectors(dmz->metadata) - 1);
 
 	return fn(ti, dmz->ddev, 0, capacity, data);
 }
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 30781646741a..f997ad62c7b4 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -60,15 +60,11 @@ struct dmz_dev {
 	unsigned int		flags;
 
 	sector_t		zone_nr_sectors;
-	unsigned int		zone_nr_sectors_shift;
-
-	sector_t		zone_nr_blocks;
-	sector_t		zone_nr_blocks_shift;
 };
 
-#define dmz_bio_chunk(dev, bio)	((bio)->bi_iter.bi_sector >> \
-				 (dev)->zone_nr_sectors_shift)
-#define dmz_chunk_block(dev, b)	((b) & ((dev)->zone_nr_blocks - 1))
+#define dmz_bio_chunk(zmd, bio)	((bio)->bi_iter.bi_sector >> \
+				 dmz_zone_nr_sectors_shift(zmd))
+#define dmz_chunk_block(zmd, b)	((b) & (dmz_zone_nr_blocks(zmd) - 1))
 
 /* Device flags. */
 #define DMZ_BDEV_DYING		(1 << 0)
@@ -197,6 +193,10 @@ unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_blocks(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_blocks_shift(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_sectors(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_sectors_shift(struct dmz_metadata *zmd);
 
 /*
  * Activate a zone (increment its reference count).
-- 
2.16.4
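
The shift and mask arithmetic behind dmz_bio_chunk(), dmz_chunk_block()
and the capacity rounding in dmz_iterate_devices() can be sketched in
plain userspace C. This is a minimal illustration, not the driver code:
struct zone_geom and the geom_*() helpers are hypothetical names, and the
sketch assumes a power-of-two zone size. The mask is built in 64 bits so
that capacities beyond 2^32 sectors keep their upper bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for the zone geometry kept in the metadata. */
struct zone_geom {
	unsigned int zone_nr_sectors_shift;	/* log2(sectors per zone) */
	unsigned int zone_nr_blocks;		/* blocks per zone, power of two */
};

/* Chunk number of a 512-byte sector, in the spirit of dmz_bio_chunk(). */
static inline uint64_t geom_sector_to_chunk(const struct zone_geom *g,
					    uint64_t sector)
{
	return sector >> g->zone_nr_sectors_shift;
}

/* Block offset inside its chunk, in the spirit of dmz_chunk_block(). */
static inline uint64_t geom_chunk_block(const struct zone_geom *g,
					uint64_t block)
{
	return block & ((uint64_t)g->zone_nr_blocks - 1);
}

/* Round a capacity in sectors down to a whole number of zones. */
static inline uint64_t geom_align_capacity(const struct zone_geom *g,
					   uint64_t capacity)
{
	/* Widen before inverting so the mask keeps its upper 32 bits set. */
	uint64_t zone_nr_sectors = (uint64_t)1 << g->zone_nr_sectors_shift;

	return capacity & ~(zone_nr_sectors - 1);
}
```

With a shift of 19 (256 MiB zones of 512-byte sectors) and 65536 blocks
per zone, sector 3 * 524288 + 5 falls in chunk 3 and block 2 * 65536 + 7
is block 7 of its chunk.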

* [PATCH 07/15] dm-zoned: introduce dmz_metadata_label() to format device name
From: Hannes Reinecke @ 2020-05-11  8:24 UTC
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Introduce dmz_metadata_label() to format the device-mapper device
name and use it instead of the device name of the underlying device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 11 ++++++-
 drivers/md/dm-zoned-reclaim.c  | 15 +++++----
 drivers/md/dm-zoned-target.c   | 74 +++++++++++++++++++++++-------------------
 drivers/md/dm-zoned.h          |  4 ++-
 4 files changed, 62 insertions(+), 42 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index b844ff02ae7b..7cda48683c0b 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -134,6 +134,8 @@ struct dmz_sb {
 struct dmz_metadata {
 	struct dmz_dev		*dev;
 
+	char			devname[BDEVNAME_SIZE];
+
 	sector_t		zone_bitmap_size;
 	unsigned int		zone_nr_bitmap_blocks;
 	unsigned int		zone_bits_per_mblk;
@@ -260,6 +262,11 @@ unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
 	return atomic_read(&zmd->unmap_nr_seq);
 }
 
+const char *dmz_metadata_label(struct dmz_metadata *zmd)
+{
+	return (const char *)zmd->devname;
+}
+
 /*
  * Lock/unlock mapping table.
  * The map lock also protects all the zone lists.
@@ -2439,7 +2446,8 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 /*
  * Initialize the zoned metadata.
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
+int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
+		     const char *devname)
 {
 	struct dmz_metadata *zmd;
 	unsigned int i;
@@ -2450,6 +2458,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 	if (!zmd)
 		return -ENOMEM;
 
+	strcpy(zmd->devname, devname);
 	zmd->dev = dev;
 	zmd->mblk_rbtree = RB_ROOT;
 	init_rwsem(&zmd->mblk_sem);
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 5aa5e5130fe8..699c4145306e 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -480,15 +480,16 @@ static void dmz_reclaim_work(struct work_struct *work)
 		zrc->kc_throttle.throttle = min(75U, 100U - p_unmap_rnd / 2);
 	}
 
-	dmz_dev_debug(zrc->dev,
-		      "Reclaim (%u): %s, %u%% free rnd zones (%u/%u)",
-		      zrc->kc_throttle.throttle,
-		      (dmz_target_idle(zrc) ? "Idle" : "Busy"),
-		      p_unmap_rnd, nr_unmap_rnd, nr_rnd);
+	DMDEBUG("(%s): Reclaim (%u): %s, %u%% free rnd zones (%u/%u)",
+		dmz_metadata_label(zmd),
+		zrc->kc_throttle.throttle,
+		(dmz_target_idle(zrc) ? "Idle" : "Busy"),
+		p_unmap_rnd, nr_unmap_rnd, nr_rnd);
 
 	ret = dmz_do_reclaim(zrc);
 	if (ret) {
-		dmz_dev_debug(zrc->dev, "Reclaim error %d\n", ret);
+		DMDEBUG("(%s): Reclaim error %d\n",
+			dmz_metadata_label(zmd), ret);
 		if (!dmz_check_bdev(zrc->dev))
 			return;
 	}
@@ -524,7 +525,7 @@ int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
 	/* Reclaim work */
 	INIT_DELAYED_WORK(&zrc->work, dmz_reclaim_work);
 	zrc->wq = alloc_ordered_workqueue("dmz_rwq_%s", WQ_MEM_RECLAIM,
-					  dev->name);
+					  dmz_metadata_label(zmd));
 	if (!zrc->wq) {
 		ret = -ENOMEM;
 		goto err;
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 68c5684d7b01..ba5b8c507c98 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -178,11 +178,12 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 		return 0;
 	}
 
-	dmz_dev_debug(dmz->dev, "READ chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      zone->id,
-		      (unsigned long long)chunk_block, nr_blocks);
+	DMDEBUG("(%s): READ chunk %llu -> %s zone %u, block %llu, %u blocks",
+		dmz_metadata_label(zmd),
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		(dmz_is_rnd(zone) ? "RND" : "SEQ"),
+		zone->id,
+		(unsigned long long)chunk_block, nr_blocks);
 
 	/* Check block validity to determine the read location */
 	bzone = zone->bzone;
@@ -316,11 +317,12 @@ static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
 	if (!zone)
 		return -ENOSPC;
 
-	dmz_dev_debug(dmz->dev, "WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      zone->id,
-		      (unsigned long long)chunk_block, nr_blocks);
+	DMDEBUG("(%s): WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
+		dmz_metadata_label(zmd),
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		(dmz_is_rnd(zone) ? "RND" : "SEQ"),
+		zone->id,
+		(unsigned long long)chunk_block, nr_blocks);
 
 	if (dmz_is_rnd(zone) || chunk_block == zone->wp_block) {
 		/*
@@ -357,10 +359,11 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 	if (dmz_is_readonly(zone))
 		return -EROFS;
 
-	dmz_dev_debug(dmz->dev, "DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      zone->id,
-		      (unsigned long long)chunk_block, nr_blocks);
+	DMDEBUG("(%s): DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
+		dmz_metadata_label(dmz->metadata),
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		zone->id,
+		(unsigned long long)chunk_block, nr_blocks);
 
 	/*
 	 * Invalidate blocks in the data zone and its
@@ -429,8 +432,8 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
 		ret = dmz_handle_discard(dmz, zone, bio);
 		break;
 	default:
-		dmz_dev_err(dmz->dev, "Unsupported BIO operation 0x%x",
-			    bio_op(bio));
+		DMERR("(%s): Unsupported BIO operation 0x%x",
+		      dmz_metadata_label(dmz->metadata), bio_op(bio));
 		ret = -EIO;
 	}
 
@@ -504,7 +507,8 @@ static void dmz_flush_work(struct work_struct *work)
 	/* Flush dirty metadata blocks */
 	ret = dmz_flush_metadata(dmz->metadata);
 	if (ret)
-		dmz_dev_debug(dmz->dev, "Metadata flush failed, rc=%d\n", ret);
+		DMDEBUG("(%s): Metadata flush failed, rc=%d\n",
+			dmz_metadata_label(dmz->metadata), ret);
 
 	/* Process queued flush requests */
 	while (1) {
@@ -631,11 +635,12 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	if (dmz_bdev_is_dying(dmz->dev))
 		return DM_MAPIO_KILL;
 
-	dmz_dev_debug(dev, "BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
-		      bio_op(bio), (unsigned long long)sector, nr_sectors,
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      (unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
-		      (unsigned int)dmz_bio_blocks(bio));
+	DMDEBUG("(%s): BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
+		dmz_metadata_label(zmd),
+		bio_op(bio), (unsigned long long)sector, nr_sectors,
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		(unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
+		(unsigned int)dmz_bio_blocks(bio));
 
 	bio_set_dev(bio, dev->bdev);
 
@@ -669,10 +674,10 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	/* Now ready to handle this BIO */
 	ret = dmz_queue_chunk_work(dmz, bio);
 	if (ret) {
-		dmz_dev_debug(dmz->dev,
-			      "BIO op %d, can't process chunk %llu, err %i\n",
-			      bio_op(bio), (u64)dmz_bio_chunk(zmd, bio),
-			      ret);
+		DMDEBUG("(%s): BIO op %d, can't process chunk %llu, err %i\n",
+			dmz_metadata_label(zmd),
+			bio_op(bio), (u64)dmz_bio_chunk(zmd, bio),
+			ret);
 		return DM_MAPIO_REQUEUE;
 	}
 
@@ -782,7 +787,8 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 	/* Initialize metadata */
 	dev = dmz->dev;
-	ret = dmz_ctr_metadata(dev, &dmz->metadata);
+	ret = dmz_ctr_metadata(dev, &dmz->metadata,
+			       dm_table_device_name(ti->table));
 	if (ret) {
 		ti->error = "Metadata initialization failed";
 		goto err_dev;
@@ -811,8 +817,9 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	/* Chunk BIO work */
 	mutex_init(&dmz->chunk_lock);
 	INIT_RADIX_TREE(&dmz->chunk_rxtree, GFP_NOIO);
-	dmz->chunk_wq = alloc_workqueue("dmz_cwq_%s", WQ_MEM_RECLAIM | WQ_UNBOUND,
-					0, dev->name);
+	dmz->chunk_wq = alloc_workqueue("dmz_cwq_%s",
+					WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
+					dmz_metadata_label(dmz->metadata));
 	if (!dmz->chunk_wq) {
 		ti->error = "Create chunk workqueue failed";
 		ret = -ENOMEM;
@@ -824,7 +831,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	bio_list_init(&dmz->flush_list);
 	INIT_DELAYED_WORK(&dmz->flush_work, dmz_flush_work);
 	dmz->flush_wq = alloc_ordered_workqueue("dmz_fwq_%s", WQ_MEM_RECLAIM,
-						dev->name);
+						dmz_metadata_label(dmz->metadata));
 	if (!dmz->flush_wq) {
 		ti->error = "Create flush workqueue failed";
 		ret = -ENOMEM;
@@ -839,9 +846,10 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		goto err_fwq;
 	}
 
-	dmz_dev_info(dev, "Target device: %llu 512-byte logical sectors (%llu blocks)",
-		     (unsigned long long)ti->len,
-		     (unsigned long long)dmz_sect2blk(ti->len));
+	DMINFO("(%s): Target device: %llu 512-byte logical sectors (%llu blocks)",
+	       dmz_metadata_label(dmz->metadata),
+	       (unsigned long long)ti->len,
+	       (unsigned long long)dmz_sect2blk(ti->len));
 
 	return 0;
 err_fwq:
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index f997ad62c7b4..dd768dc60341 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -163,7 +163,8 @@ struct dmz_reclaim;
 /*
  * Functions defined in dm-zoned-metadata.c
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd);
+int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd,
+		     const char *devname);
 void dmz_dtr_metadata(struct dmz_metadata *zmd);
 int dmz_resume_metadata(struct dmz_metadata *zmd);
 
@@ -174,6 +175,7 @@ void dmz_unlock_metadata(struct dmz_metadata *zmd);
 void dmz_lock_flush(struct dmz_metadata *zmd);
 void dmz_unlock_flush(struct dmz_metadata *zmd);
 int dmz_flush_metadata(struct dmz_metadata *zmd);
+const char *dmz_metadata_label(struct dmz_metadata *zmd);
 
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
-- 
2.16.4
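
The label handling in the patch above boils down to "copy the name once
at construction time, hand it out through an accessor, prefix every log
line with it". A minimal userspace sketch of that pattern follows. The
names md_sketch, md_sketch_init(), md_sketch_label() and md_sketch_log()
are hypothetical, LABEL_SIZE is an assumed stand-in for BDEVNAME_SIZE,
and the bounded snprintf() copy is a choice of this sketch (the patch
itself uses strcpy()).

```c
#include <stdio.h>

#define LABEL_SIZE 32	/* assumption: stands in for BDEVNAME_SIZE */

/* Hypothetical, reduced stand-in for struct dmz_metadata. */
struct md_sketch {
	char devname[LABEL_SIZE];
};

/* Store the label once; an over-long name is truncated, never overrun. */
static void md_sketch_init(struct md_sketch *md, const char *devname)
{
	snprintf(md->devname, sizeof(md->devname), "%s", devname);
}

/* Accessor in the spirit of dmz_metadata_label(). */
static const char *md_sketch_label(const struct md_sketch *md)
{
	return md->devname;
}

/* Format one log line with the "(label): " prefix used by the patch. */
static int md_sketch_log(const struct md_sketch *md, char *buf, size_t len,
			 unsigned int zone_id, int err)
{
	return snprintf(buf, len,
			"(%s): Metadata flush for zone %u failed, err %d",
			md_sketch_label(md), zone_id, err);
}
```

Callers only ever see the accessor, so the storage behind the label can
change later without touching the logging call sites.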

* [PATCH 08/15] dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev()
From: Hannes Reinecke @ 2020-05-11  8:24 UTC
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Introduce accessors dmz_dev_is_dying() and dmz_check_dev() to
avoid having to reference the devices directly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/md/dm-zoned-metadata.c | 14 ++++++++++++--
 drivers/md/dm-zoned-reclaim.c  |  4 ++--
 drivers/md/dm-zoned-target.c   |  2 +-
 drivers/md/dm-zoned.h          |  3 +++
 4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 7cda48683c0b..426af738f1ca 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -267,6 +267,16 @@ const char *dmz_metadata_label(struct dmz_metadata *zmd)
 	return (const char *)zmd->devname;
 }
 
+bool dmz_check_dev(struct dmz_metadata *zmd)
+{
+	return dmz_check_bdev(&zmd->dev[0]);
+}
+
+bool dmz_dev_is_dying(struct dmz_metadata *zmd)
+{
+	return dmz_bdev_is_dying(&zmd->dev[0]);
+}
+
 /*
  * Lock/unlock mapping table.
  * The map lock also protects all the zone lists.
@@ -1719,7 +1729,7 @@ struct dm_zone *dmz_get_chunk_mapping(struct dmz_metadata *zmd, unsigned int chu
 		/* Allocate a random zone */
 		dzone = dmz_alloc_zone(zmd, DMZ_ALLOC_RND);
 		if (!dzone) {
-			if (dmz_bdev_is_dying(zmd->dev)) {
+			if (dmz_dev_is_dying(zmd)) {
 				dzone = ERR_PTR(-EIO);
 				goto out;
 			}
@@ -1820,7 +1830,7 @@ struct dm_zone *dmz_get_chunk_buffer(struct dmz_metadata *zmd,
 	/* Allocate a random zone */
 	bzone = dmz_alloc_zone(zmd, DMZ_ALLOC_RND);
 	if (!bzone) {
-		if (dmz_bdev_is_dying(zmd->dev)) {
+		if (dmz_dev_is_dying(zmd)) {
 			bzone = ERR_PTR(-EIO);
 			goto out;
 		}
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 699c4145306e..5daede0daf92 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -455,7 +455,7 @@ static void dmz_reclaim_work(struct work_struct *work)
 	unsigned int p_unmap_rnd;
 	int ret;
 
-	if (dmz_bdev_is_dying(zrc->dev))
+	if (dmz_dev_is_dying(zmd))
 		return;
 
 	if (!dmz_should_reclaim(zrc)) {
@@ -490,7 +490,7 @@ static void dmz_reclaim_work(struct work_struct *work)
 	if (ret) {
 		DMDEBUG("(%s): Reclaim error %d\n",
 			dmz_metadata_label(zmd), ret);
-		if (!dmz_check_bdev(zrc->dev))
+		if (!dmz_check_dev(zmd))
 			return;
 	}
 
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index ba5b8c507c98..b32e791b8a5c 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -632,7 +632,7 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	sector_t chunk_sector;
 	int ret;
 
-	if (dmz_bdev_is_dying(dmz->dev))
+	if (dmz_dev_is_dying(zmd))
 		return DM_MAPIO_KILL;
 
 	DMDEBUG("(%s): BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index dd768dc60341..e0883df8a903 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -181,6 +181,9 @@ sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
 
+bool dmz_check_dev(struct dmz_metadata *zmd);
+bool dmz_dev_is_dying(struct dmz_metadata *zmd);
+
 #define DMZ_ALLOC_RND		0x01
 #define DMZ_ALLOC_RECLAIM	0x02
 
-- 
2.16.4
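
The accessor pattern introduced above can be illustrated outside the
kernel as follows. This is only a sketch: the acc_* types and functions
are hypothetical, and it models just the "dying" query, forwarding to the
first entry of the device array the way the new wrappers do.

```c
#include <stdbool.h>

#define ACC_BDEV_DYING	(1u << 0)	/* mirrors DMZ_BDEV_DYING */

/* Hypothetical, reduced device and metadata descriptors. */
struct acc_dev {
	unsigned int flags;
};

struct acc_metadata {
	struct acc_dev *dev;	/* device array; only entry 0 is used here */
};

/* Per-device query, in the spirit of dmz_bdev_is_dying(). */
static bool acc_bdev_is_dying(const struct acc_dev *dev)
{
	return (dev->flags & ACC_BDEV_DYING) != 0;
}

/* Metadata-level wrapper: callers no longer touch the device itself. */
static bool acc_dev_is_dying(const struct acc_metadata *zmd)
{
	return acc_bdev_is_dying(&zmd->dev[0]);
}
```

Because callers pass only the metadata handle, the wrapper is the single
place that would need to change if more than one device had to be
consulted.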

* [PATCH 09/15] dm-zoned: remove 'dev' argument from reclaim
From: Hannes Reinecke @ 2020-05-11  8:24 UTC
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Use the dmz_zone_to_dev() mapping function to remove the
'dev' argument from reclaim.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/md/dm-zoned-reclaim.c | 58 +++++++++++++++++++++++--------------------
 drivers/md/dm-zoned-target.c  |  2 +-
 drivers/md/dm-zoned.h         |  4 +--
 3 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 5daede0daf92..39ea0d5d4706 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -13,7 +13,6 @@
 
 struct dmz_reclaim {
 	struct dmz_metadata     *metadata;
-	struct dmz_dev		*dev;
 
 	struct delayed_work	work;
 	struct workqueue_struct *wq;
@@ -59,6 +58,7 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
 				sector_t block)
 {
 	struct dmz_metadata *zmd = zrc->metadata;
+	struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 	sector_t wp_block = zone->wp_block;
 	unsigned int nr_blocks;
 	int ret;
@@ -74,15 +74,15 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
 	 * pointer and the requested position.
 	 */
 	nr_blocks = block - wp_block;
-	ret = blkdev_issue_zeroout(zrc->dev->bdev,
+	ret = blkdev_issue_zeroout(dev->bdev,
 				   dmz_start_sect(zmd, zone) + dmz_blk2sect(wp_block),
 				   dmz_blk2sect(nr_blocks), GFP_NOIO, 0);
 	if (ret) {
-		dmz_dev_err(zrc->dev,
+		dmz_dev_err(dev,
 			    "Align zone %u wp %llu to %llu (wp+%u) blocks failed %d",
 			    zone->id, (unsigned long long)wp_block,
 			    (unsigned long long)block, nr_blocks, ret);
-		dmz_check_bdev(zrc->dev);
+		dmz_check_bdev(dev);
 		return ret;
 	}
 
@@ -116,7 +116,7 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 			    struct dm_zone *src_zone, struct dm_zone *dst_zone)
 {
 	struct dmz_metadata *zmd = zrc->metadata;
-	struct dmz_dev *dev = zrc->dev;
+	struct dmz_dev *src_dev, *dst_dev;
 	struct dm_io_region src, dst;
 	sector_t block = 0, end_block;
 	sector_t nr_blocks;
@@ -130,13 +130,17 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 	else
 		end_block = dmz_zone_nr_blocks(zmd);
 	src_zone_block = dmz_start_block(zmd, src_zone);
+	src_dev = dmz_zone_to_dev(zmd, src_zone);
 	dst_zone_block = dmz_start_block(zmd, dst_zone);
+	dst_dev = dmz_zone_to_dev(zmd, dst_zone);
 
 	if (dmz_is_seq(dst_zone))
 		set_bit(DM_KCOPYD_WRITE_SEQ, &flags);
 
 	while (block < end_block) {
-		if (dev->flags & DMZ_BDEV_DYING)
+		if (src_dev->flags & DMZ_BDEV_DYING)
+			return -EIO;
+		if (dst_dev->flags & DMZ_BDEV_DYING)
 			return -EIO;
 
 		/* Get a valid region from the source zone */
@@ -156,11 +160,11 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 				return ret;
 		}
 
-		src.bdev = dev->bdev;
+		src.bdev = src_dev->bdev;
 		src.sector = dmz_blk2sect(src_zone_block + block);
 		src.count = dmz_blk2sect(nr_blocks);
 
-		dst.bdev = dev->bdev;
+		dst.bdev = dst_dev->bdev;
 		dst.sector = dmz_blk2sect(dst_zone_block + block);
 		dst.count = src.count;
 
@@ -194,10 +198,10 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	struct dmz_metadata *zmd = zrc->metadata;
 	int ret;
 
-	dmz_dev_debug(zrc->dev,
-		      "Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
-		      dzone->chunk, bzone->id, dmz_weight(bzone),
-		      dzone->id, dmz_weight(dzone));
+	DMDEBUG("(%s): Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
+		dmz_metadata_label(zmd),
+		dzone->chunk, bzone->id, dmz_weight(bzone),
+		dzone->id, dmz_weight(dzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, bzone, dzone);
@@ -233,10 +237,10 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	struct dmz_metadata *zmd = zrc->metadata;
 	int ret = 0;
 
-	dmz_dev_debug(zrc->dev,
-		      "Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
-		      chunk, dzone->id, dmz_weight(dzone),
-		      bzone->id, dmz_weight(bzone));
+	DMDEBUG("(%s): Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
+		dmz_metadata_label(zmd),
+		chunk, dzone->id, dmz_weight(dzone),
+		bzone->id, dmz_weight(bzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, dzone, bzone);
@@ -285,9 +289,9 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	if (!szone)
 		return -ENOSPC;
 
-	dmz_dev_debug(zrc->dev,
-		      "Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
-		      chunk, dzone->id, dmz_weight(dzone), szone->id);
+	DMDEBUG("(%s): Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
+		dmz_metadata_label(zmd),
+		chunk, dzone->id, dmz_weight(dzone), szone->id);
 
 	/* Flush the random data zone into the sequential zone */
 	ret = dmz_reclaim_copy(zrc, dzone, szone);
@@ -343,6 +347,7 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 	struct dmz_metadata *zmd = zrc->metadata;
 	struct dm_zone *dzone;
 	struct dm_zone *rzone;
+	struct dmz_dev *dev;
 	unsigned long start;
 	int ret;
 
@@ -352,7 +357,7 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 		return PTR_ERR(dzone);
 
 	start = jiffies;
-
+	dev = dmz_zone_to_dev(zmd, dzone);
 	if (dmz_is_rnd(dzone)) {
 		if (!dmz_weight(dzone)) {
 			/* Empty zone */
@@ -400,14 +405,14 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 
 	ret = dmz_flush_metadata(zrc->metadata);
 	if (ret) {
-		dmz_dev_debug(zrc->dev,
-			      "Metadata flush for zone %u failed, err %d\n",
-			      rzone->id, ret);
+		DMDEBUG("(%s): Metadata flush for zone %u failed, err %d\n",
+			dmz_metadata_label(zmd), rzone->id, ret);
 		return ret;
 	}
 
-	dmz_dev_debug(zrc->dev, "Reclaimed zone %u in %u ms",
-		      rzone->id, jiffies_to_msecs(jiffies - start));
+	DMDEBUG("(%s): Reclaimed zone %u in %u ms",
+		dmz_metadata_label(zmd),
+		rzone->id, jiffies_to_msecs(jiffies - start));
 	return 0;
 }
 
@@ -500,7 +505,7 @@ static void dmz_reclaim_work(struct work_struct *work)
 /*
  * Initialize reclaim.
  */
-int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
+int dmz_ctr_reclaim(struct dmz_metadata *zmd,
 		    struct dmz_reclaim **reclaim)
 {
 	struct dmz_reclaim *zrc;
@@ -510,7 +515,6 @@ int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
 	if (!zrc)
 		return -ENOMEM;
 
-	zrc->dev = dev;
 	zrc->metadata = zmd;
 	zrc->atime = jiffies;
 
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index b32e791b8a5c..520e55df627b 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -840,7 +840,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	mod_delayed_work(dmz->flush_wq, &dmz->flush_work, DMZ_FLUSH_PERIOD);
 
 	/* Initialize reclaim */
-	ret = dmz_ctr_reclaim(dev, dmz->metadata, &dmz->reclaim);
+	ret = dmz_ctr_reclaim(dmz->metadata, &dmz->reclaim);
 	if (ret) {
 		ti->error = "Zone reclaim initialization failed";
 		goto err_fwq;
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index e0883df8a903..2629bd51fa26 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -180,6 +180,7 @@ const char *dmz_metadata_label(struct dmz_metadata *zmd);
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
+struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone);
 
 bool dmz_check_dev(struct dmz_metadata *zmd);
 bool dmz_dev_is_dying(struct dmz_metadata *zmd);
@@ -254,8 +255,7 @@ int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 /*
  * Functions defined in dm-zoned-reclaim.c
  */
-int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
-		    struct dmz_reclaim **zrc);
+int dmz_ctr_reclaim(struct dmz_metadata *zmd, struct dmz_reclaim **zrc);
 void dmz_dtr_reclaim(struct dmz_reclaim *zrc);
 void dmz_suspend_reclaim(struct dmz_reclaim *zrc);
 void dmz_resume_reclaim(struct dmz_reclaim *zrc);
-- 
2.16.4
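
The definition of dmz_zone_to_dev() is not part of this message, so the
following is only a guess at the idea, under loudly stated assumptions:
zones carry a global id, devices are laid out back to back in logical
order, and each device contributes a known number of zones. All z2d_*
names are hypothetical. The second helper mirrors the reworked check in
dmz_reclaim_copy(), where the source and the destination device are
tested separately.

```c
#include <errno.h>
#include <stddef.h>

#define Z2D_BDEV_DYING	(1u << 0)	/* mirrors DMZ_BDEV_DYING */

/* Hypothetical, reduced device, zone and metadata descriptors. */
struct z2d_dev {
	unsigned int nr_zones;	/* zones contributed by this device */
	unsigned int flags;
};

struct z2d_zone {
	unsigned int id;	/* global zone number across all devices */
};

struct z2d_metadata {
	struct z2d_dev *dev;	/* devices in logical order */
	unsigned int nr_devs;
};

/* Walk the devices in logical order and return the one owning the zone. */
static struct z2d_dev *z2d_zone_to_dev(const struct z2d_metadata *zmd,
				       const struct z2d_zone *zone)
{
	unsigned int first = 0;
	unsigned int i;

	for (i = 0; i < zmd->nr_devs; i++) {
		if (zone->id - first < zmd->dev[i].nr_zones)
			return &zmd->dev[i];
		first += zmd->dev[i].nr_zones;
	}
	return NULL;	/* zone id lies beyond the last device */
}

/* Pre-check for a zone-to-zone copy: both devices must be alive. */
static int z2d_copy_precheck(const struct z2d_metadata *zmd,
			     const struct z2d_zone *src,
			     const struct z2d_zone *dst)
{
	const struct z2d_dev *src_dev = z2d_zone_to_dev(zmd, src);
	const struct z2d_dev *dst_dev = z2d_zone_to_dev(zmd, dst);

	if (!src_dev || !dst_dev)
		return -EINVAL;
	if ((src_dev->flags | dst_dev->flags) & Z2D_BDEV_DYING)
		return -EIO;
	return 0;
}
```

With two devices of 4 and 8 zones, zone 3 maps to the first device and
zone 4 to the second; a copy between them fails with -EIO as soon as
either device is marked dying.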

* [PATCH 10/15] dm-zoned: replace 'target' pointer in the bio context
From: Hannes Reinecke @ 2020-05-11  8:24 UTC
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Replace the 'target' pointer in the bio context with the
device pointer as this is what's actually used.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/md/dm-zoned-target.c | 44 ++++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 520e55df627b..a09fb78ffe88 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -17,7 +17,7 @@
  * Zone BIO context.
  */
 struct dmz_bioctx {
-	struct dmz_target	*target;
+	struct dmz_dev		*dev;
 	struct dm_zone		*zone;
 	struct bio		*bio;
 	refcount_t		ref;
@@ -76,12 +76,13 @@ struct dmz_target {
  */
 static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
 {
-	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
+	struct dmz_bioctx *bioctx =
+		dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
 
 	if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
 		bio->bi_status = status;
 	if (bio->bi_status != BLK_STS_OK)
-		bioctx->target->dev->flags |= DMZ_CHECK_BDEV;
+		bioctx->dev->flags |= DMZ_CHECK_BDEV;
 
 	if (refcount_dec_and_test(&bioctx->ref)) {
 		struct dm_zone *zone = bioctx->zone;
@@ -118,14 +119,20 @@ static int dmz_submit_bio(struct dmz_target *dmz, struct dm_zone *zone,
 			  struct bio *bio, sector_t chunk_block,
 			  unsigned int nr_blocks)
 {
-	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
+	struct dmz_bioctx *bioctx =
+		dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
+	struct dmz_dev *dev = dmz_zone_to_dev(dmz->metadata, zone);
 	struct bio *clone;
 
+	if (dev->flags & DMZ_BDEV_DYING)
+		return -EIO;
+
 	clone = bio_clone_fast(bio, GFP_NOIO, &dmz->bio_set);
 	if (!clone)
 		return -ENOMEM;
 
-	bio_set_dev(clone, dmz->dev->bdev);
+	bio_set_dev(clone, dev->bdev);
+	bioctx->dev = dev;
 	clone->bi_iter.bi_sector =
 		dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
 	clone->bi_iter.bi_size = dmz_blk2sect(nr_blocks) << SECTOR_SHIFT;
@@ -218,8 +225,10 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 
 		if (nr_blocks) {
 			/* Valid blocks found: read them */
-			nr_blocks = min_t(unsigned int, nr_blocks, end_block - chunk_block);
-			ret = dmz_submit_bio(dmz, rzone, bio, chunk_block, nr_blocks);
+			nr_blocks = min_t(unsigned int, nr_blocks,
+					  end_block - chunk_block);
+			ret = dmz_submit_bio(dmz, rzone, bio,
+					     chunk_block, nr_blocks);
 			if (ret)
 				return ret;
 			chunk_block += nr_blocks;
@@ -330,7 +339,8 @@ static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
 		 * and the BIO is aligned to the zone write pointer:
 		 * direct write the zone.
 		 */
-		return dmz_handle_direct_write(dmz, zone, bio, chunk_block, nr_blocks);
+		return dmz_handle_direct_write(dmz, zone, bio,
+					       chunk_block, nr_blocks);
 	}
 
 	/*
@@ -383,7 +393,8 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
 			   struct bio *bio)
 {
-	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
+	struct dmz_bioctx *bioctx =
+		dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
 	struct dmz_metadata *zmd = dmz->metadata;
 	struct dm_zone *zone;
 	int ret;
@@ -397,11 +408,6 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
 
 	dmz_lock_metadata(zmd);
 
-	if (dmz->dev->flags & DMZ_BDEV_DYING) {
-		ret = -EIO;
-		goto out;
-	}
-
 	/*
 	 * Get the data zone mapping the chunk. There may be no
 	 * mapping for read and discard. If a mapping is obtained,
@@ -625,7 +631,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 {
 	struct dmz_target *dmz = ti->private;
 	struct dmz_metadata *zmd = dmz->metadata;
-	struct dmz_dev *dev = dmz->dev;
 	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
 	sector_t sector = bio->bi_iter.bi_sector;
 	unsigned int nr_sectors = bio_sectors(bio);
@@ -642,8 +647,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 		(unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
 		(unsigned int)dmz_bio_blocks(bio));
 
-	bio_set_dev(bio, dev->bdev);
-
 	if (!nr_sectors && bio_op(bio) != REQ_OP_WRITE)
 		return DM_MAPIO_REMAPPED;
 
@@ -652,7 +655,7 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 		return DM_MAPIO_KILL;
 
 	/* Initialize the BIO context */
-	bioctx->target = dmz;
+	bioctx->dev = NULL;
 	bioctx->zone = NULL;
 	bioctx->bio = bio;
 	refcount_set(&bioctx->ref, 1);
@@ -931,11 +934,12 @@ static void dmz_io_hints(struct dm_target *ti, struct queue_limits *limits)
 static int dmz_prepare_ioctl(struct dm_target *ti, struct block_device **bdev)
 {
 	struct dmz_target *dmz = ti->private;
+	struct dmz_dev *dev = &dmz->dev[0];
 
-	if (!dmz_check_bdev(dmz->dev))
+	if (!dmz_check_bdev(dev))
 		return -EIO;
 
-	*bdev = dmz->dev->bdev;
+	*bdev = dev->bdev;
 
 	return 0;
 }
-- 
2.16.4
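
The life cycle of the per-BIO context after this change (device unset at
map time, recorded at submit time, flagged for a check on a failed
completion) can be modelled in userspace as below. The ioctx_* names are
hypothetical, the value of IOCTX_CHECK_BDEV is an assumption, and plain
negative errno-style integers stand in for blk_status_t. The NULL guard
in the completion path is an addition of this sketch; it covers a context
that completes with an error before any clone was submitted.

```c
#include <stddef.h>

#define IOCTX_CHECK_BDEV	(1u << 1)	/* assumption: mirrors DMZ_CHECK_BDEV */

/* Hypothetical, reduced device descriptor and per-I/O context. */
struct ioctx_dev {
	unsigned int flags;
};

struct ioctx {
	struct ioctx_dev *dev;	/* set once the I/O is routed to a device */
	int status;		/* first error seen, 0 if none */
};

/* Map time: no device has been chosen yet. */
static void ioctx_init(struct ioctx *ctx)
{
	ctx->dev = NULL;
	ctx->status = 0;
}

/* Submit time: remember which device the clone was sent to. */
static void ioctx_submit(struct ioctx *ctx, struct ioctx_dev *dev)
{
	ctx->dev = dev;
}

/* Completion: keep the first error and ask for a check of that device. */
static void ioctx_endio(struct ioctx *ctx, int status)
{
	if (status && !ctx->status)
		ctx->status = status;
	if (ctx->status && ctx->dev)
		ctx->dev->flags |= IOCTX_CHECK_BDEV;
}
```

Storing the device rather than the whole target keeps the completion
path independent of how many devices the target owns.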

* [PATCH 11/15] dm-zoned: use dmz_zone_to_dev() when handling metadata I/O
From: Hannes Reinecke @ 2020-05-11  8:24 UTC
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Use accessors to retrieve the device pointer in preparation
for adding an additional block device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 426af738f1ca..312194be4cb0 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -1310,6 +1310,7 @@ static int dmz_update_zone_cb(struct blk_zone *blkz, unsigned int idx,
  */
 static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
+	struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 	unsigned int noio_flag;
 	int ret;
 
@@ -1320,16 +1321,16 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 	 * GFP_NOIO was specified.
 	 */
 	noio_flag = memalloc_noio_save();
-	ret = blkdev_report_zones(zmd->dev->bdev, dmz_start_sect(zmd, zone), 1,
+	ret = blkdev_report_zones(dev->bdev, dmz_start_sect(zmd, zone), 1,
 				  dmz_update_zone_cb, zone);
 	memalloc_noio_restore(noio_flag);
 
 	if (ret == 0)
 		ret = -EIO;
 	if (ret < 0) {
-		dmz_dev_err(zmd->dev, "Get zone %u report failed",
+		dmz_dev_err(dev, "Get zone %u report failed",
 			    zone->id);
-		dmz_check_bdev(zmd->dev);
+		dmz_check_bdev(dev);
 		return ret;
 	}
 
@@ -1343,6 +1344,7 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 static int dmz_handle_seq_write_err(struct dmz_metadata *zmd,
 				    struct dm_zone *zone)
 {
+	struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 	unsigned int wp = 0;
 	int ret;
 
@@ -1351,7 +1353,7 @@ static int dmz_handle_seq_write_err(struct dmz_metadata *zmd,
 	if (ret)
 		return ret;
 
-	dmz_dev_warn(zmd->dev, "Processing zone %u write error (zone wp %u/%u)",
+	dmz_dev_warn(dev, "Processing zone %u write error (zone wp %u/%u)",
 		     zone->id, zone->wp_block, wp);
 
 	if (zone->wp_block < wp) {
@@ -1384,7 +1386,7 @@ static int dmz_reset_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 		return 0;
 
 	if (!dmz_is_empty(zone) || dmz_seq_write_err(zone)) {
-		struct dmz_dev *dev = zmd->dev;
+		struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 
 		ret = blkdev_zone_mgmt(dev->bdev, REQ_OP_ZONE_RESET,
 				       dmz_start_sect(zmd, zone),
-- 
2.16.4

* [PATCH 12/15] dm-zoned: add metadata logging functions
  2020-05-11  8:24 [PATCHv6 00/14] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (10 preceding siblings ...)
  2020-05-11  8:24 ` [PATCH 11/15] dm-zoned: use dmz_zone_to_dev() when handling metadata I/O Hannes Reinecke
@ 2020-05-11  8:24 ` Hannes Reinecke
  2020-05-11  8:24 ` [PATCH 13/15] dm-zoned: Reduce logging output on startup Hannes Reinecke
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Use the metadata label for logging and not the underlying
device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
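For illustration only (not part of the patch): a minimal userspace sketch
of the prefixing that the new dmz_zmd_*() helpers perform. The names
struct zmd_sketch and zmd_format() are hypothetical and exist only in this
note; the kernel helpers are macros that expand to DMINFO()/DMERR()/
DMWARN()/DMDEBUG(), whereas this sketch assumes a plain function built on
snprintf() so that it can be compiled outside the kernel.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the one dmz_metadata field the macros use. */
struct zmd_sketch {
	char devname[32];
};

/*
 * Format "(<devname>): <message>" into buf, mirroring the prefix that
 * dmz_zmd_info() and friends put in front of every message.
 * Returns a negative value on a formatting error; a return value >= len
 * means the output was truncated.
 */
static int zmd_format(char *buf, size_t len, const struct zmd_sketch *zmd,
		      const char *fmt, ...)
{
	va_list ap;
	int n, m;

	n = snprintf(buf, len, "(%s): ", zmd->devname);
	if (n < 0)
		return n;
	if ((size_t)n >= len)
		return n;	/* the prefix alone did not fit */

	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}
```

With devname set to "sdb", formatting "Zone %u is offline" for zone 7 gives
"(sdb): Zone 7 is offline". The point of the patch is that the prefix comes
from the metadata object rather than from one particular block device.
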
 drivers/md/dm-zoned-metadata.c | 95 +++++++++++++++++++++++++-----------------
 1 file changed, 56 insertions(+), 39 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 312194be4cb0..77b9ea4bad74 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -194,6 +194,17 @@ struct dmz_metadata {
 	wait_queue_head_t	free_wq;
 };
 
+#define dmz_zmd_info(zmd, format, args...)	\
+	DMINFO("(%s): " format, (zmd)->devname, ## args)
+
+#define dmz_zmd_err(zmd, format, args...)	\
+	DMERR("(%s): " format, (zmd)->devname, ## args)
+
+#define dmz_zmd_warn(zmd, format, args...)	\
+	DMWARN("(%s): " format, (zmd)->devname, ## args)
+
+#define dmz_zmd_debug(zmd, format, args...)	\
+	DMDEBUG("(%s): " format, (zmd)->devname, ## args)
 /*
  * Various accessors
  */
@@ -1098,7 +1109,7 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 	int ret;
 
 	if (!zmd->sb[0].zone) {
-		dmz_dev_err(zmd->dev, "Primary super block zone not set");
+		dmz_zmd_err(zmd, "Primary super block zone not set");
 		return -ENXIO;
 	}
 
@@ -1135,7 +1146,7 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 
 	/* Use highest generation sb first */
 	if (!sb_good[0] && !sb_good[1]) {
-		dmz_dev_err(zmd->dev, "No valid super block found");
+		dmz_zmd_err(zmd, "No valid super block found");
 		return -EIO;
 	}
 
@@ -1248,7 +1259,7 @@ static void dmz_drop_zones(struct dmz_metadata *zmd)
  */
 static int dmz_init_zones(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = zmd->dev;
+	struct dmz_dev *dev = &zmd->dev[0];
 	int ret;
 
 	/* Init */
@@ -1268,8 +1279,8 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	if (!zmd->zones)
 		return -ENOMEM;
 
-	dmz_dev_info(dev, "Using %zu B for zone information",
-		     sizeof(struct dm_zone) * zmd->nr_zones);
+	DMINFO("(%s): Using %zu B for zone information",
+	       zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
 
 	/*
 	 * Get zone information and initialize zone descriptors.  At the same
@@ -1412,7 +1423,6 @@ static void dmz_get_zone_weight(struct dmz_metadata *zmd, struct dm_zone *zone);
  */
 static int dmz_load_mapping(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = zmd->dev;
 	struct dm_zone *dzone, *bzone;
 	struct dmz_mblock *dmap_mblk = NULL;
 	struct dmz_map *dmap;
@@ -1445,7 +1455,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 			goto next;
 
 		if (dzone_id >= zmd->nr_zones) {
-			dmz_dev_err(dev, "Chunk %u mapping: invalid data zone ID %u",
+			dmz_zmd_err(zmd, "Chunk %u mapping: invalid data zone ID %u",
 				    chunk, dzone_id);
 			return -EIO;
 		}
@@ -1466,14 +1476,14 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 			goto next;
 
 		if (bzone_id >= zmd->nr_zones) {
-			dmz_dev_err(dev, "Chunk %u mapping: invalid buffer zone ID %u",
+			dmz_zmd_err(zmd, "Chunk %u mapping: invalid buffer zone ID %u",
 				    chunk, bzone_id);
 			return -EIO;
 		}
 
 		bzone = dmz_get(zmd, bzone_id);
 		if (!dmz_is_rnd(bzone)) {
-			dmz_dev_err(dev, "Chunk %u mapping: invalid buffer zone %u",
+			dmz_zmd_err(zmd, "Chunk %u mapping: invalid buffer zone %u",
 				    chunk, bzone_id);
 			return -EIO;
 		}
@@ -1893,7 +1903,7 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags)
 		atomic_dec(&zmd->unmap_nr_seq);
 
 	if (dmz_is_offline(zone)) {
-		dmz_dev_warn(zmd->dev, "Zone %u is offline", zone->id);
+		dmz_zmd_warn(zmd, "Zone %u is offline", zone->id);
 		zone = NULL;
 		goto again;
 	}
@@ -2104,7 +2114,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	struct dmz_mblock *mblk;
 	unsigned int n = 0;
 
-	dmz_dev_debug(zmd->dev, "=> VALIDATE zone %u, block %llu, %u blocks",
+	dmz_zmd_debug(zmd, "=> VALIDATE zone %u, block %llu, %u blocks",
 		      zone->id, (unsigned long long)chunk_block,
 		      nr_blocks);
 
@@ -2134,7 +2144,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	if (likely(zone->weight + n <= zone_nr_blocks))
 		zone->weight += n;
 	else {
-		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be <= %u",
+		dmz_zmd_warn(zmd, "Zone %u: weight %u should be <= %u",
 			     zone->id, zone->weight,
 			     zone_nr_blocks - n);
 		zone->weight = zone_nr_blocks;
@@ -2184,7 +2194,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	struct dmz_mblock *mblk;
 	unsigned int n = 0;
 
-	dmz_dev_debug(zmd->dev, "=> INVALIDATE zone %u, block %llu, %u blocks",
+	dmz_zmd_debug(zmd, "=> INVALIDATE zone %u, block %llu, %u blocks",
 		      zone->id, (u64)chunk_block, nr_blocks);
 
 	WARN_ON(chunk_block + nr_blocks > zmd->zone_nr_blocks);
@@ -2214,7 +2224,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	if (zone->weight >= n)
 		zone->weight -= n;
 	else {
-		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be >= %u",
+		dmz_zmd_warn(zmd, "Zone %u: weight %u should be >= %u",
 			     zone->id, zone->weight, n);
 		zone->weight = 0;
 	}
@@ -2424,7 +2434,7 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 	while (!list_empty(&zmd->mblk_dirty_list)) {
 		mblk = list_first_entry(&zmd->mblk_dirty_list,
 					struct dmz_mblock, link);
-		dmz_dev_warn(zmd->dev, "mblock %llu still in dirty list (ref %u)",
+		dmz_zmd_warn(zmd, "mblock %llu still in dirty list (ref %u)",
 			     (u64)mblk->no, mblk->ref);
 		list_del_init(&mblk->link);
 		rb_erase(&mblk->node, &zmd->mblk_rbtree);
@@ -2442,7 +2452,7 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 	/* Sanity checks: the mblock rbtree should now be empty */
 	root = &zmd->mblk_rbtree;
 	rbtree_postorder_for_each_entry_safe(mblk, next, root, node) {
-		dmz_dev_warn(zmd->dev, "mblock %llu ref %u still in rbtree",
+		dmz_zmd_warn(zmd, "mblock %llu ref %u still in rbtree",
 			     (u64)mblk->no, mblk->ref);
 		mblk->ref = 0;
 		dmz_free_mblock(zmd, mblk);
@@ -2455,6 +2465,18 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 	mutex_destroy(&zmd->map_lock);
 }
 
+void dmz_print_dev(struct dmz_metadata *zmd, int num)
+{
+	struct dmz_dev *dev = &zmd->dev[num];
+
+	dmz_dev_info(dev, "Host-%s zoned block device",
+		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
+		     "aware" : "managed");
+	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
+		     (u64)dev->capacity);
+	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
+		     dev->nr_zones, (u64)zmd->zone_nr_sectors);
+}
 /*
  * Initialize the zoned metadata.
  */
@@ -2531,34 +2553,31 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 	/* Metadata cache shrinker */
 	ret = register_shrinker(&zmd->mblk_shrinker);
 	if (ret) {
-		dmz_dev_err(dev, "Register metadata cache shrinker failed");
+		dmz_zmd_err(zmd, "Register metadata cache shrinker failed");
 		goto err;
 	}
 
-	dmz_dev_info(dev, "Host-%s zoned block device",
-		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
-		     "aware" : "managed");
-	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
-		     (u64)dev->capacity);
-	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
+	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", DMZ_META_VER);
+	dmz_print_dev(zmd, 0);
+
+	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
 		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
-	dmz_dev_info(dev, "  %u metadata zones",
+	dmz_zmd_info(zmd, "  %u metadata zones",
 		     zmd->nr_meta_zones * 2);
-	dmz_dev_info(dev, "  %u data zones for %u chunks",
+	dmz_zmd_info(zmd, "  %u data zones for %u chunks",
 		     zmd->nr_data_zones, zmd->nr_chunks);
-	dmz_dev_info(dev, "    %u random zones (%u unmapped)",
+	dmz_zmd_info(zmd, "    %u random zones (%u unmapped)",
 		     zmd->nr_rnd, atomic_read(&zmd->unmap_nr_rnd));
-	dmz_dev_info(dev, "    %u sequential zones (%u unmapped)",
+	dmz_zmd_info(zmd, "    %u sequential zones (%u unmapped)",
 		     zmd->nr_seq, atomic_read(&zmd->unmap_nr_seq));
-	dmz_dev_info(dev, "  %u reserved sequential data zones",
+	dmz_zmd_info(zmd, "  %u reserved sequential data zones",
 		     zmd->nr_reserved_seq);
-
-	dmz_dev_debug(dev, "Format:");
-	dmz_dev_debug(dev, "%u metadata blocks per set (%u max cache)",
+	dmz_zmd_debug(zmd, "Format:");
+	dmz_zmd_debug(zmd, "%u metadata blocks per set (%u max cache)",
 		      zmd->nr_meta_blocks, zmd->max_nr_mblks);
-	dmz_dev_debug(dev, "  %u data zone mapping blocks",
+	dmz_zmd_debug(zmd, "  %u data zone mapping blocks",
 		      zmd->nr_map_blocks);
-	dmz_dev_debug(dev, "  %u bitmap blocks",
+	dmz_zmd_debug(zmd, "  %u bitmap blocks",
 		      zmd->nr_bitmap_blocks);
 
 	*metadata = zmd;
@@ -2587,7 +2606,6 @@ void dmz_dtr_metadata(struct dmz_metadata *zmd)
  */
 int dmz_resume_metadata(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = zmd->dev;
 	struct dm_zone *zone;
 	sector_t wp_block;
 	unsigned int i;
@@ -2597,20 +2615,19 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 	for (i = 0; i < zmd->nr_zones; i++) {
 		zone = dmz_get(zmd, i);
 		if (!zone) {
-			dmz_dev_err(dev, "Unable to get zone %u", i);
+			dmz_zmd_err(zmd, "Unable to get zone %u", i);
 			return -EIO;
 		}
-
 		wp_block = zone->wp_block;
 
 		ret = dmz_update_zone(zmd, zone);
 		if (ret) {
-			dmz_dev_err(dev, "Broken zone %u", i);
+			dmz_zmd_err(zmd, "Broken zone %u", i);
 			return ret;
 		}
 
 		if (dmz_is_offline(zone)) {
-			dmz_dev_warn(dev, "Zone %u is offline", i);
+			dmz_zmd_warn(zmd, "Zone %u is offline", i);
 			continue;
 		}
 
@@ -2618,7 +2635,7 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 		if (!dmz_is_seq(zone))
 			zone->wp_block = 0;
 		else if (zone->wp_block != wp_block) {
-			dmz_dev_err(dev, "Zone %u: Invalid wp (%llu / %llu)",
+			dmz_zmd_err(zmd, "Zone %u: Invalid wp (%llu / %llu)",
 				    i, (u64)zone->wp_block, (u64)wp_block);
 			zone->wp_block = wp_block;
 			dmz_invalidate_blocks(zmd, zone, zone->wp_block,
-- 
2.16.4

* [PATCH 13/15] dm-zoned: Reduce logging output on startup
  2020-05-11  8:24 [PATCHv6 00/14] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (11 preceding siblings ...)
  2020-05-11  8:24 ` [PATCH 12/15] dm-zoned: add metadata logging functions Hannes Reinecke
@ 2020-05-11  8:24 ` Hannes Reinecke
  2020-05-11  8:24 ` [PATCH 14/15] dm-zoned: ignore metadata zone in dmz_alloc_zone() Hannes Reinecke
  2020-05-11  8:24 ` [PATCH 15/15] dm-zoned: metadata version 2 Hannes Reinecke
  14 siblings, 0 replies; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

dm-zoned is becoming quite chatty during startup; reduce the noise
by moving some information to 'debug' level.

Suggested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/md/dm-zoned-metadata.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 77b9ea4bad74..80c0fe4c3546 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -1279,8 +1279,8 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	if (!zmd->zones)
 		return -ENOMEM;
 
-	DMINFO("(%s): Using %zu B for zone information",
-	       zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
+	DMDEBUG("(%s): Using %zu B for zone information",
+		zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
 
 	/*
 	 * Get zone information and initialize zone descriptors.  At the same
@@ -2562,16 +2562,16 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 
 	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
 		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
-	dmz_zmd_info(zmd, "  %u metadata zones",
-		     zmd->nr_meta_zones * 2);
-	dmz_zmd_info(zmd, "  %u data zones for %u chunks",
-		     zmd->nr_data_zones, zmd->nr_chunks);
-	dmz_zmd_info(zmd, "    %u random zones (%u unmapped)",
-		     zmd->nr_rnd, atomic_read(&zmd->unmap_nr_rnd));
-	dmz_zmd_info(zmd, "    %u sequential zones (%u unmapped)",
-		     zmd->nr_seq, atomic_read(&zmd->unmap_nr_seq));
-	dmz_zmd_info(zmd, "  %u reserved sequential data zones",
-		     zmd->nr_reserved_seq);
+	dmz_zmd_debug(zmd, "  %u metadata zones",
+		      zmd->nr_meta_zones * 2);
+	dmz_zmd_debug(zmd, "  %u data zones for %u chunks",
+		      zmd->nr_data_zones, zmd->nr_chunks);
+	dmz_zmd_debug(zmd, "    %u random zones (%u unmapped)",
+		      zmd->nr_rnd, atomic_read(&zmd->unmap_nr_rnd));
+	dmz_zmd_debug(zmd, "    %u sequential zones (%u unmapped)",
+		      zmd->nr_seq, atomic_read(&zmd->unmap_nr_seq));
+	dmz_zmd_debug(zmd, "  %u reserved sequential data zones",
+		      zmd->nr_reserved_seq);
 	dmz_zmd_debug(zmd, "Format:");
 	dmz_zmd_debug(zmd, "%u metadata blocks per set (%u max cache)",
 		      zmd->nr_meta_blocks, zmd->max_nr_mblks);
-- 
2.16.4

* [PATCH 14/15] dm-zoned: ignore metadata zone in dmz_alloc_zone()
  2020-05-11  8:24 [PATCHv6 00/14] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (12 preceding siblings ...)
  2020-05-11  8:24 ` [PATCH 13/15] dm-zoned: Reduce logging output on startup Hannes Reinecke
@ 2020-05-11  8:24 ` Hannes Reinecke
  2020-05-11  8:24 ` [PATCH 15/15] dm-zoned: metadata version 2 Hannes Reinecke
  14 siblings, 0 replies; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

When looking up zones in dmz_alloc_zone() we need to ignore
metadata zones so as not to accidentally overwrite metadata.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
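For illustration only (not part of the patch): a simplified userspace
sketch of the allocation rule, assuming the free zones sit in a plain array
instead of the kernel's linked lists. struct sk_zone, the SK_ZONE_* flags
and sk_alloc_zone() are hypothetical names invented for this note.

```c
#include <stddef.h>

/* Hypothetical, simplified zone descriptor; the flag names are made up. */
enum {
	SK_ZONE_OFFLINE = 1u << 0,
	SK_ZONE_META    = 1u << 1,
};

struct sk_zone {
	unsigned int id;
	unsigned int flags;
};

/*
 * Take the next usable zone from a free list of 'count' entries, starting
 * at *next. Offline zones and zones holding metadata are consumed from the
 * list but never returned, which is the effect of the "goto again" paths
 * in dmz_alloc_zone().
 */
static struct sk_zone *sk_alloc_zone(struct sk_zone *free_list, size_t count,
				     size_t *next)
{
	while (*next < count) {
		struct sk_zone *zone = &free_list[(*next)++];

		if (zone->flags & (SK_ZONE_OFFLINE | SK_ZONE_META))
			continue;	/* unusable: drop it and retry */
		return zone;
	}
	return NULL;	/* free list exhausted */
}
```

Given a list of zones flagged meta, offline, none and meta, the first call
returns the third zone and the second call returns NULL.
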
 drivers/md/dm-zoned-metadata.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 80c0fe4c3546..067ce010f457 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -1907,7 +1907,13 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags)
 		zone = NULL;
 		goto again;
 	}
+	if (dmz_is_meta(zone)) {
+		struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 
+		dmz_dev_warn(dev, "Zone %u has metadata", zone->id);
+		zone = NULL;
+		goto again;
+	}
 	return zone;
 }
 
-- 
2.16.4

* [PATCH 15/15] dm-zoned: metadata version 2
  2020-05-11  8:24 [PATCHv6 00/14] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (13 preceding siblings ...)
  2020-05-11  8:24 ` [PATCH 14/15] dm-zoned: ignore metadata zone in dmz_alloc_zone() Hannes Reinecke
@ 2020-05-11  8:24 ` Hannes Reinecke
  2020-05-11  8:36   ` Damien Le Moal
  14 siblings, 1 reply; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  8:24 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Implement handling for metadata version 2. The new metadata adds
a label and UUID for the device mapper device, and an additional UUID
for the underlying block devices.
It also allows for an additional regular drive to be used for
emulating random access zones. The emulated zones will be placed
logically in front of the zones from the zoned block device, causing
the superblocks and metadata to be stored on that device.
The first zone of the original zoned device will be used to hold
another, tertiary copy of the metadata; this copy carries a
generation number of 0 and is never updated; it's just used
for identification.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
---
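For illustration only (not part of the patch): the size arithmetic behind
the new super block fields, written as a compile-time check. The three new
fields take 32 + 16 + 16 = 64 bytes, so the padding shrinks from 464 to 400
bytes and the structure stays at 512 bytes. struct sk_super_v2 is a
hypothetical mirror made up for this note; the version 1 fields are
collapsed into one opaque 48-byte array, based on the offset comment on the
crc field.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the version 2 on-disk super block. Only the
 * sizes matter here, so the version 1 fields are not spelled out.
 */
struct sk_super_v2 {
	uint8_t v1_fields[48];	/* magic through crc, ends at  48 */
	uint8_t dmz_label[32];	/* ends at  80 */
	uint8_t dmz_uuid[16];	/* ends at  96 */
	uint8_t dev_uuid[16];	/* ends at 112 */
	uint8_t reserved[400];	/* ends at 512 */
};

/* C11 compile-time checks: the build fails if the layout drifts. */
_Static_assert(offsetof(struct sk_super_v2, dmz_label) == 48, "label offset");
_Static_assert(offsetof(struct sk_super_v2, dmz_uuid) == 80, "dmz uuid offset");
_Static_assert(offsetof(struct sk_super_v2, dev_uuid) == 96, "dev uuid offset");
_Static_assert(offsetof(struct sk_super_v2, reserved) == 112, "reserved offset");
_Static_assert(sizeof(struct sk_super_v2) == 512, "super block is one sector");
```
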
 .../admin-guide/device-mapper/dm-zoned.rst         |  34 ++-
 drivers/md/dm-zoned-metadata.c                     | 310 +++++++++++++++++----
 drivers/md/dm-zoned-target.c                       | 185 ++++++++----
 drivers/md/dm-zoned.h                              |   7 +-
 4 files changed, 427 insertions(+), 109 deletions(-)
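
For illustration only (not part of the patch): a userspace sketch of how a
global zone ID is translated to a device and a per-device start sector once
the regular device's emulated zones sit in front of the zoned device. The
sk_* names are hypothetical. The sketch assumes that dev[1].zone_offset
holds the global ID of the first zone on the zoned device (i.e. the number
of emulated zones on the regular device); where that field is set is not
shown in this part of the series.

```c
#include <stdint.h>

/* Hypothetical, pared-down versions of the structures in the patch. */
struct sk_dev {
	unsigned int zone_offset;	/* global ID of this device's first zone */
};

struct sk_metadata {
	struct sk_dev dev[2];
	unsigned int nr_devs;
	unsigned int zone_nr_sectors_shift;	/* log2(sectors per zone) */
};

/* Index of the device that owns global zone 'zone_id'. */
static unsigned int sk_zone_to_dev_idx(const struct sk_metadata *zmd,
				       unsigned int zone_id)
{
	if (zmd->nr_devs > 1 && zone_id >= zmd->dev[1].zone_offset)
		return 1;
	return 0;
}

/* Zone number relative to the start of the owning device. */
static unsigned int sk_dev_zone_id(const struct sk_metadata *zmd,
				   unsigned int zone_id)
{
	if (zmd->nr_devs > 1 && zone_id >= zmd->dev[1].zone_offset)
		zone_id -= zmd->dev[1].zone_offset;
	return zone_id;
}

/* First 512-byte sector of the zone on its owning device. */
static uint64_t sk_start_sect(const struct sk_metadata *zmd,
			      unsigned int zone_id)
{
	return (uint64_t)sk_dev_zone_id(zmd, zone_id)
		<< zmd->zone_nr_sectors_shift;
}
```

With 100 emulated zones and 256 MiB zones (shift 19), global zone 103 maps
to device 1, per-device zone 3, start sector 3 << 19 = 1572864, while global
zone 99 stays on device 0 at sector 99 << 19 = 51904512. With a single
device no translation takes place.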

diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
index 7547ce635161..553752ea2521 100644
--- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
+++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
@@ -37,9 +37,13 @@ Algorithm
 dm-zoned implements an on-disk buffering scheme to handle non-sequential
 write accesses to the sequential zones of a zoned block device.
 Conventional zones are used for caching as well as for storing internal
-metadata.
+metadata. It can also use a regular block device together with the zoned
+block device; in that case the regular block device will be split logically
+into zones with the same size as the zoned block device. These zones will be
+placed in front of the zones from the zoned block device and will be handled
+just like conventional zones.
 
-The zones of the device are separated into 2 types:
+The zones of the device(s) are separated into 2 types:
 
 1) Metadata zones: these are conventional zones used to store metadata.
 Metadata zones are not reported as useable capacity to the user.
@@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
 discard requests. Read requests can be processed concurrently while
 metadata flush is being executed.
 
+If a regular device is used in conjunction with the zoned block device,
+a third set of metadata (without the zone bitmaps) is written to the
+start of the zoned block device. This metadata has a generation counter of
+'0' and will never be updated during normal operation; it just serves for
+identification purposes. The first and second copy of the metadata
+are located at the start of the regular block device.
+
 Usage
 =====
 
@@ -138,12 +149,21 @@ Ex::
 
 	dmzadm --format /dev/sdxx
 
-For a formatted device, the target can be created normally with the
-dmsetup utility. The only parameter that dm-zoned requires is the
-underlying zoned block device name. Ex::
 
-	echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
-	dmsetup create dmz-`basename ${dev}`
+If two drives are to be used, both devices must be specified, with the
+regular block device as the first device.
+
+Ex::
+
+	dmzadm --format /dev/sdxx /dev/sdyy
+
+
+Formatted device(s) can be started with the dmzadm utility, too:
+
+Ex::
+
+	dmzadm --start /dev/sdxx /dev/sdyy
+
 
 Information about the internal layout and current usage of the zones can
 be obtained with the 'status' callback from dmsetup:
diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 067ce010f457..d9e256762eff 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -16,7 +16,7 @@
 /*
  * Metadata version.
  */
-#define DMZ_META_VER	1
+#define DMZ_META_VER	2
 
 /*
  * On-disk super block magic.
@@ -69,8 +69,17 @@ struct dmz_super {
 	/* Checksum */
 	__le32		crc;			/*  48 */
 
+	/* DM-Zoned label */
+	u8		dmz_label[32];		/*  80 */
+
+	/* DM-Zoned UUID */
+	u8		dmz_uuid[16];		/*  96 */
+
+	/* Device UUID */
+	u8		dev_uuid[16];		/* 112 */
+
 	/* Padding to full 512B sector */
-	u8		reserved[464];		/* 512 */
+	u8		reserved[400];		/* 512 */
 };
 
 /*
@@ -133,8 +142,11 @@ struct dmz_sb {
  */
 struct dmz_metadata {
 	struct dmz_dev		*dev;
+	unsigned int		nr_devs;
 
 	char			devname[BDEVNAME_SIZE];
+	char			label[BDEVNAME_SIZE];
+	uuid_t			uuid;
 
 	sector_t		zone_bitmap_size;
 	unsigned int		zone_nr_bitmap_blocks;
@@ -161,8 +173,9 @@ struct dmz_metadata {
 	/* Zone information array */
 	struct dm_zone		*zones;
 
-	struct dmz_sb		sb[2];
+	struct dmz_sb		sb[3];
 	unsigned int		mblk_primary;
+	unsigned int		sb_version;
 	u64			sb_gen;
 	unsigned int		min_nr_mblks;
 	unsigned int		max_nr_mblks;
@@ -195,31 +208,56 @@ struct dmz_metadata {
 };
 
 #define dmz_zmd_info(zmd, format, args...)	\
-	DMINFO("(%s): " format, (zmd)->devname, ## args)
+	DMINFO("(%s): " format, (zmd)->label, ## args)
 
 #define dmz_zmd_err(zmd, format, args...)	\
-	DMERR("(%s): " format, (zmd)->devname, ## args)
+	DMERR("(%s): " format, (zmd)->label, ## args)
 
 #define dmz_zmd_warn(zmd, format, args...)	\
-	DMWARN("(%s): " format, (zmd)->devname, ## args)
+	DMWARN("(%s): " format, (zmd)->label, ## args)
 
 #define dmz_zmd_debug(zmd, format, args...)	\
-	DMDEBUG("(%s): " format, (zmd)->devname, ## args)
+	DMDEBUG("(%s): " format, (zmd)->label, ## args)
 /*
  * Various accessors
  */
+unsigned int dmz_dev_zone_id(struct dmz_metadata *zmd, struct dm_zone *zone)
+{
+	unsigned int zone_id;
+
+	if (WARN_ON(!zone))
+		return 0;
+
+	zone_id = zone->id;
+	if (zmd->nr_devs > 1 &&
+	    (zone_id >= zmd->dev[1].zone_offset))
+		zone_id -= zmd->dev[1].zone_offset;
+	return zone_id;
+}
+
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->zone_nr_sectors_shift;
+	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
+
+	return (sector_t)zone_id << zmd->zone_nr_sectors_shift;
 }
 
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->zone_nr_blocks_shift;
+	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
+
+	return (sector_t)zone_id << zmd->zone_nr_blocks_shift;
 }
 
 struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
+	if (WARN_ON(!zone))
+		return &zmd->dev[0];
+
+	if (zmd->nr_devs > 1 &&
+	    zone->id >= zmd->dev[1].zone_offset)
+		return &zmd->dev[1];
+
 	return &zmd->dev[0];
 }
 
@@ -275,17 +313,29 @@ unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
 
 const char *dmz_metadata_label(struct dmz_metadata *zmd)
 {
-	return (const char *)zmd->devname;
+	return (const char *)zmd->label;
 }
 
 bool dmz_check_dev(struct dmz_metadata *zmd)
 {
-	return dmz_check_bdev(&zmd->dev[0]);
+	unsigned int i;
+
+	for (i = 0; i < zmd->nr_devs; i++) {
+		if (!dmz_check_bdev(&zmd->dev[i]))
+			return false;
+	}
+	return true;
 }
 
 bool dmz_dev_is_dying(struct dmz_metadata *zmd)
 {
-	return dmz_bdev_is_dying(&zmd->dev[0]);
+	unsigned int i;
+
+	for (i = 0; i < zmd->nr_devs; i++) {
+		if (dmz_bdev_is_dying(&zmd->dev[i]))
+			return true;
+	}
+	return false;
 }
 
 /*
@@ -687,6 +737,9 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
 	struct bio *bio;
 	int ret;
 
+	if (WARN_ON(!dev))
+		return -EIO;
+
 	if (dmz_bdev_is_dying(dev))
 		return -EIO;
 
@@ -711,19 +764,32 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
  */
 static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 {
-	sector_t block = zmd->sb[set].block;
 	struct dmz_mblock *mblk = zmd->sb[set].mblk;
 	struct dmz_super *sb = zmd->sb[set].sb;
 	struct dmz_dev *dev = zmd->sb[set].dev;
+	sector_t sb_block;
 	u64 sb_gen = zmd->sb_gen + 1;
 	int ret;
 
 	sb->magic = cpu_to_le32(DMZ_MAGIC);
-	sb->version = cpu_to_le32(DMZ_META_VER);
+
+	sb->version = cpu_to_le32(zmd->sb_version);
+	if (zmd->sb_version > 1) {
+		BUILD_BUG_ON(UUID_SIZE != 16);
+		export_uuid(sb->dmz_uuid, &zmd->uuid);
+		memcpy(sb->dmz_label, zmd->label, BDEVNAME_SIZE);
+		export_uuid(sb->dev_uuid, &dev->uuid);
+	}
 
 	sb->gen = cpu_to_le64(sb_gen);
 
-	sb->sb_block = cpu_to_le64(block);
+	/*
+	 * The metadata always references the absolute block address,
+	 * i.e. relative to the entire block range, not the per-device
+	 * block address.
+	 */
+	sb_block = zmd->sb[set].zone->id << zmd->zone_nr_blocks_shift;
+	sb->sb_block = cpu_to_le64(sb_block);
 	sb->nr_meta_blocks = cpu_to_le32(zmd->nr_meta_blocks);
 	sb->nr_reserved_seq = cpu_to_le32(zmd->nr_reserved_seq);
 	sb->nr_chunks = cpu_to_le32(zmd->nr_chunks);
@@ -734,7 +800,8 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	sb->crc = 0;
 	sb->crc = cpu_to_le32(crc32_le(sb_gen, (unsigned char *)sb, DMZ_BLOCK_SIZE));
 
-	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, block, mblk->page);
+	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, zmd->sb[set].block,
+			     mblk->page);
 	if (ret == 0)
 		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 
@@ -915,6 +982,23 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 	u32 crc, stored_crc;
 	u64 gen;
 
+	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
+		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
+			    DMZ_MAGIC, le32_to_cpu(sb->magic));
+		return -ENXIO;
+	}
+
+	zmd->sb_version = le32_to_cpu(sb->version);
+	if (zmd->sb_version > DMZ_META_VER) {
+		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
+			    DMZ_META_VER, zmd->sb_version);
+		return -EINVAL;
+	}
+	if ((zmd->sb_version < 1) && (set == 2)) {
+		dmz_dev_err(dev, "Tertiary superblocks are not supported");
+		return -EINVAL;
+	}
+
 	gen = le64_to_cpu(sb->gen);
 	stored_crc = le32_to_cpu(sb->crc);
 	sb->crc = 0;
@@ -925,16 +1009,45 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 		return -ENXIO;
 	}
 
-	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
-		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
-			    DMZ_MAGIC, le32_to_cpu(sb->magic));
-		return -ENXIO;
-	}
+	if (zmd->sb_version > 1) {
+		uuid_t sb_uuid;
+
+		import_uuid(&sb_uuid, sb->dmz_uuid);
+		if (uuid_is_null(&sb_uuid)) {
+			dmz_dev_err(dev, "NULL DM-Zoned uuid");
+			return -ENXIO;
+		} else if (uuid_is_null(&zmd->uuid)) {
+			uuid_copy(&zmd->uuid, &sb_uuid);
+		} else if (!uuid_equal(&zmd->uuid, &sb_uuid)) {
+			dmz_dev_err(dev, "mismatching DM-Zoned uuid, "
+				    "is %pUl expected %pUl",
+				    &sb_uuid, &zmd->uuid);
+			return -ENXIO;
+		}
+		if (!strlen(zmd->label))
+			memcpy(zmd->label, sb->dmz_label, BDEVNAME_SIZE);
+		else if (memcmp(zmd->label, sb->dmz_label, BDEVNAME_SIZE)) {
+			dmz_dev_err(dev, "mismatching DM-Zoned label, "
+				    "is %s expected %s",
+				    sb->dmz_label, zmd->label);
+			return -ENXIO;
+		}
+		import_uuid(&dev->uuid, sb->dev_uuid);
+		if (uuid_is_null(&dev->uuid)) {
+			dmz_dev_err(dev, "NULL device uuid");
+			return -ENXIO;
+		}
 
-	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
-		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
-			    DMZ_META_VER, le32_to_cpu(sb->version));
-		return -ENXIO;
+		if (set == 2) {
+			/*
+			 * Generation number should be 0, but it doesn't
+			 * really matter if it isn't.
+			 */
+			if (gen != 0)
+				dmz_dev_warn(dev, "Invalid generation %llu",
+					    gen);
+			return 0;
+		}
 	}
 
 	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + zmd->zone_nr_blocks - 1)
@@ -1185,21 +1298,38 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		      "Using super block %u (gen %llu)",
 		      zmd->mblk_primary, zmd->sb_gen);
 
+	if ((zmd->sb_version > 1) && zmd->sb[2].zone) {
+		zmd->sb[2].block = dmz_start_block(zmd, zmd->sb[2].zone);
+		zmd->sb[2].dev = dmz_zone_to_dev(zmd, zmd->sb[2].zone);
+		ret = dmz_get_sb(zmd, 2);
+		if (ret) {
+			dmz_dev_err(zmd->sb[2].dev,
+				    "Read tertiary super block failed");
+			return ret;
+		}
+		ret = dmz_check_sb(zmd, 2);
+		if (ret == -EINVAL)
+			return ret;
+	}
 	return 0;
 }
 
 /*
  * Initialize a zone descriptor.
  */
-static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
+static int dmz_init_zone(struct blk_zone *blkz, unsigned int num, void *data)
 {
 	struct dmz_metadata *zmd = data;
+	struct dmz_dev *dev = zmd->nr_devs > 1 ? &zmd->dev[1] : &zmd->dev[0];
+	int idx = num + dev->zone_offset;
 	struct dm_zone *zone = &zmd->zones[idx];
-	struct dmz_dev *dev = zmd->dev;
 
-	/* Ignore the eventual last runt (smaller) zone */
 	if (blkz->len != zmd->zone_nr_sectors) {
-		if (blkz->start + blkz->len == dev->capacity)
+		if (zmd->sb_version > 1) {
+			/* Ignore the eventual runt (smaller) zone */
+			set_bit(DMZ_OFFLINE, &zone->flags);
+			return 0;
+		} else if (blkz->start + blkz->len == dev->capacity)
 			return 0;
 		return -ENXIO;
 	}
@@ -1234,16 +1364,45 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 		zmd->nr_useable_zones++;
 		if (dmz_is_rnd(zone)) {
 			zmd->nr_rnd_zones++;
-			if (!zmd->sb[0].zone) {
-				/* Super block zone */
+			if (zmd->nr_devs == 1 && !zmd->sb[0].zone) {
+				/* Primary super block zone */
 				zmd->sb[0].zone = zone;
 			}
 		}
+		if (zmd->nr_devs > 1 && !zmd->sb[2].zone) {
+			/* Tertiary superblock zone */
+			zmd->sb[2].zone = zone;
+		}
 	}
 
 	return 0;
 }
 
+static void dmz_emulate_zones(struct dmz_metadata *zmd, struct dmz_dev *dev)
+{
+	int idx;
+	sector_t zone_offset = 0;
+
+	for (idx = 0; idx < dev->nr_zones; idx++) {
+		struct dm_zone *zone = &zmd->zones[idx];
+
+		INIT_LIST_HEAD(&zone->link);
+		atomic_set(&zone->refcount, 0);
+		zone->id = idx;
+		zone->chunk = DMZ_MAP_UNMAPPED;
+		set_bit(DMZ_RND, &zone->flags);
+		zone->wp_block = 0;
+		zmd->nr_rnd_zones++;
+		zmd->nr_useable_zones++;
+		if (dev->capacity - zone_offset < zmd->zone_nr_sectors) {
+			/* Disable runt zone */
+			set_bit(DMZ_OFFLINE, &zone->flags);
+			break;
+		}
+		zone_offset += zmd->zone_nr_sectors;
+	}
+}
+
 /*
  * Free zones descriptors.
  */
@@ -1259,11 +1418,11 @@ static void dmz_drop_zones(struct dmz_metadata *zmd)
  */
 static int dmz_init_zones(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = &zmd->dev[0];
-	int ret;
+	int i, ret;
+	struct dmz_dev *zoned_dev = &zmd->dev[0];
 
 	/* Init */
-	zmd->zone_nr_sectors = dev->zone_nr_sectors;
+	zmd->zone_nr_sectors = zmd->dev[0].zone_nr_sectors;
 	zmd->zone_nr_sectors_shift = ilog2(zmd->zone_nr_sectors);
 	zmd->zone_nr_blocks = dmz_sect2blk(zmd->zone_nr_sectors);
 	zmd->zone_nr_blocks_shift = ilog2(zmd->zone_nr_blocks);
@@ -1274,7 +1433,14 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 					DMZ_BLOCK_SIZE_BITS);
 
 	/* Allocate zone array */
-	zmd->nr_zones = dev->nr_zones;
+	zmd->nr_zones = 0;
+	for (i = 0; i < zmd->nr_devs; i++)
+		zmd->nr_zones += zmd->dev[i].nr_zones;
+
+	if (!zmd->nr_zones) {
+		DMERR("(%s): No zones found", zmd->devname);
+		return -ENXIO;
+	}
 	zmd->zones = kcalloc(zmd->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
 	if (!zmd->zones)
 		return -ENOMEM;
@@ -1282,14 +1448,27 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	DMDEBUG("(%s): Using %zu B for zone information",
 		zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
 
+	if (zmd->nr_devs > 1) {
+		dmz_emulate_zones(zmd, &zmd->dev[0]);
+		/*
+		 * Primary superblock zone is always at zone 0 when multiple
+		 * drives are present.
+		 */
+		zmd->sb[0].zone = &zmd->zones[0];
+
+		zoned_dev = &zmd->dev[1];
+	}
+
 	/*
 	 * Get zone information and initialize zone descriptors.  At the same
 	 * time, determine where the super block should be: first block of the
 	 * first randomly writable zone.
 	 */
-	ret = blkdev_report_zones(dev->bdev, 0, BLK_ALL_ZONES, dmz_init_zone,
-				  zmd);
+	ret = blkdev_report_zones(zoned_dev->bdev, 0, BLK_ALL_ZONES,
+				  dmz_init_zone, zmd);
 	if (ret < 0) {
+		DMDEBUG("(%s): Failed to report zones, error %d",
+			zmd->devname, ret);
 		dmz_drop_zones(zmd);
 		return ret;
 	}
@@ -1325,6 +1504,9 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 	unsigned int noio_flag;
 	int ret;
 
+	if (dev->flags & DMZ_BDEV_REGULAR)
+		return 0;
+
 	/*
 	 * Get zone information from disk. Since blkdev_report_zones() uses
 	 * GFP_KERNEL by default for memory allocations, set the per-task
@@ -2475,18 +2657,33 @@ void dmz_print_dev(struct dmz_metadata *zmd, int num)
 {
 	struct dmz_dev *dev = &zmd->dev[num];
 
-	dmz_dev_info(dev, "Host-%s zoned block device",
-		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
-		     "aware" : "managed");
-	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
-		     (u64)dev->capacity);
-	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
-		     dev->nr_zones, (u64)zmd->zone_nr_sectors);
+	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE)
+		dmz_dev_info(dev, "Regular block device");
+	else
+		dmz_dev_info(dev, "Host-%s zoned block device",
+			     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
+			     "aware" : "managed");
+	if (zmd->sb_version > 1) {
+		sector_t sector_offset =
+			dev->zone_offset << zmd->zone_nr_sectors_shift;
+
+		dmz_dev_info(dev, "  %llu 512-byte logical sectors (offset %llu)",
+			     (u64)dev->capacity, (u64)sector_offset);
+		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors (offset %llu)",
+			     dev->nr_zones, (u64)zmd->zone_nr_sectors,
+			     (u64)dev->zone_offset);
+	} else {
+		dmz_dev_info(dev, "  %llu 512-byte logical sectors",
+			     (u64)dev->capacity);
+		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
+			     dev->nr_zones, (u64)zmd->zone_nr_sectors);
+	}
 }
 /*
  * Initialize the zoned metadata.
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
+int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
+		     struct dmz_metadata **metadata,
 		     const char *devname)
 {
 	struct dmz_metadata *zmd;
@@ -2500,6 +2697,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 
 	strcpy(zmd->devname, devname);
 	zmd->dev = dev;
+	zmd->nr_devs = num_dev;
 	zmd->mblk_rbtree = RB_ROOT;
 	init_rwsem(&zmd->mblk_sem);
 	mutex_init(&zmd->mblk_flush_lock);
@@ -2534,11 +2732,24 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 	/* Set metadata zones starting from sb_zone */
 	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
 		zone = dmz_get(zmd, zmd->sb[0].zone->id + i);
-		if (!dmz_is_rnd(zone))
+		if (!dmz_is_rnd(zone)) {
+			dmz_zmd_err(zmd,
+				    "metadata zone %d is not random", i);
+			ret = -ENXIO;
 			goto err;
+		}
+		set_bit(DMZ_META, &zone->flags);
+	}
+	if (zmd->sb[2].zone) {
+		zone = dmz_get(zmd, zmd->sb[2].zone->id);
+		if (!zone) {
+			dmz_zmd_err(zmd,
+				    "Tertiary metadata zone not present");
+			ret = -ENXIO;
+			goto err;
+		}
 		set_bit(DMZ_META, &zone->flags);
 	}
-
 	/* Load mapping table */
 	ret = dmz_load_mapping(zmd);
 	if (ret)
@@ -2563,8 +2774,9 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 		goto err;
 	}
 
-	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", DMZ_META_VER);
-	dmz_print_dev(zmd, 0);
+	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", zmd->sb_version);
+	for (i = 0; i < zmd->nr_devs; i++)
+		dmz_print_dev(zmd, i);
 
 	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
 		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index a09fb78ffe88..ea43f6892ced 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -13,6 +13,8 @@
 
 #define DMZ_MIN_BIOS		8192
 
+#define DMZ_MAX_DEVS		2
+
 /*
  * Zone BIO context.
  */
@@ -38,7 +40,7 @@ struct dm_chunk_work {
  * Target descriptor.
  */
 struct dmz_target {
-	struct dm_dev		*ddev;
+	struct dm_dev		*ddev[DMZ_MAX_DEVS];
 
 	unsigned long		flags;
 
@@ -81,7 +83,7 @@ static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
 
 	if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
 		bio->bi_status = status;
-	if (bio->bi_status != BLK_STS_OK)
+	if (bioctx->dev && bio->bi_status != BLK_STS_OK)
 		bioctx->dev->flags |= DMZ_CHECK_BDEV;
 
 	if (refcount_dec_and_test(&bioctx->ref)) {
@@ -690,60 +692,64 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 /*
  * Get zoned device information.
  */
-static int dmz_get_zoned_device(struct dm_target *ti, char *path)
+static int dmz_get_zoned_device(struct dm_target *ti, char *path,
+				int idx, int nr_devs)
 {
 	struct dmz_target *dmz = ti->private;
-	struct request_queue *q;
+	struct dm_dev *ddev;
 	struct dmz_dev *dev;
-	sector_t aligned_capacity;
 	int ret;
+	struct block_device *bdev;
 
 	/* Get the target device */
-	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &dmz->ddev);
+	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &ddev);
 	if (ret) {
 		ti->error = "Get target device failed";
-		dmz->ddev = NULL;
 		return ret;
 	}
 
-	dev = kzalloc(sizeof(struct dmz_dev), GFP_KERNEL);
-	if (!dev) {
-		ret = -ENOMEM;
-		goto err;
+	bdev = ddev->bdev;
+	if (bdev_zoned_model(bdev) == BLK_ZONED_NONE) {
+		if (nr_devs == 1) {
+			ti->error = "Invalid regular device";
+			goto err;
+		}
+		if (idx != 0) {
+			ti->error = "First device must be a regular device";
+			goto err;
+		}
+		if (dmz->ddev[0]) {
+			ti->error = "Too many regular devices";
+			goto err;
+		}
+		dev = &dmz->dev[idx];
+		dev->flags = DMZ_BDEV_REGULAR;
+	} else {
+		if (dmz->ddev[idx]) {
+			ti->error = "Too many zoned devices";
+			goto err;
+		}
+		if (nr_devs > 1 && idx == 0) {
+			ti->error = "First device must be a regular device";
+			goto err;
+		}
+		dev = &dmz->dev[idx];
 	}
-
-	dev->bdev = dmz->ddev->bdev;
+	dev->bdev = bdev;
 	(void)bdevname(dev->bdev, dev->name);
 
-	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE) {
-		ti->error = "Not a zoned block device";
-		ret = -EINVAL;
-		goto err;
-	}
-
-	q = bdev_get_queue(dev->bdev);
-	dev->capacity = i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
-	aligned_capacity = dev->capacity &
-				~((sector_t)blk_queue_zone_sectors(q) - 1);
-	if (ti->begin ||
-	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
-		ti->error = "Partial mapping not supported";
-		ret = -EINVAL;
+	dev->capacity = i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
+	if (ti->begin) {
+		ti->error = "Partial mapping is not supported";
 		goto err;
 	}
 
-	dev->zone_nr_sectors = blk_queue_zone_sectors(q);
-
-	dev->nr_zones = blkdev_nr_zones(dev->bdev->bd_disk);
-
-	dmz->dev = dev;
+	dmz->ddev[idx] = ddev;
 
 	return 0;
 err:
-	dm_put_device(ti, dmz->ddev);
-	kfree(dev);
-
-	return ret;
+	dm_put_device(ti, ddev);
+	return -EINVAL;
 }
 
 /*
@@ -752,10 +758,56 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
 static void dmz_put_zoned_device(struct dm_target *ti)
 {
 	struct dmz_target *dmz = ti->private;
+	int i;
 
-	dm_put_device(ti, dmz->ddev);
-	kfree(dmz->dev);
-	dmz->dev = NULL;
+	for (i = 0; i < DMZ_MAX_DEVS; i++) {
+		if (dmz->ddev[i]) {
+			dm_put_device(ti, dmz->ddev[i]);
+			dmz->ddev[i] = NULL;
+		}
+	}
+}
+
+static int dmz_fixup_devices(struct dm_target *ti)
+{
+	struct dmz_target *dmz = ti->private;
+	struct dmz_dev *reg_dev, *zoned_dev;
+	struct request_queue *q;
+
+	/*
+	 * When we have two devices, the first one must be a regular block
+	 * device and the second a zoned block device.
+	 */
+	if (dmz->ddev[0] && dmz->ddev[1]) {
+		reg_dev = &dmz->dev[0];
+		if (!(reg_dev->flags & DMZ_BDEV_REGULAR)) {
+			ti->error = "Primary disk is not a regular device";
+			return -EINVAL;
+		}
+		zoned_dev = &dmz->dev[1];
+		if (zoned_dev->flags & DMZ_BDEV_REGULAR) {
+			ti->error = "Secondary disk is not a zoned device";
+			return -EINVAL;
+		}
+	} else {
+		reg_dev = NULL;
+		zoned_dev = &dmz->dev[0];
+		if (zoned_dev->flags & DMZ_BDEV_REGULAR) {
+			ti->error = "Disk is not a zoned device";
+			return -EINVAL;
+		}
+	}
+	q = bdev_get_queue(zoned_dev->bdev);
+	zoned_dev->zone_nr_sectors = blk_queue_zone_sectors(q);
+	zoned_dev->nr_zones = blkdev_nr_zones(zoned_dev->bdev->bd_disk);
+
+	if (reg_dev) {
+		reg_dev->zone_nr_sectors = zoned_dev->zone_nr_sectors;
+		reg_dev->nr_zones = DIV_ROUND_UP(reg_dev->capacity,
+						 reg_dev->zone_nr_sectors);
+		zoned_dev->zone_offset = reg_dev->nr_zones;
+	}
+	return 0;
 }
 
 /*
@@ -764,11 +816,10 @@ static void dmz_put_zoned_device(struct dm_target *ti)
 static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 {
 	struct dmz_target *dmz;
-	struct dmz_dev *dev;
 	int ret;
 
 	/* Check arguments */
-	if (argc != 1) {
+	if (argc < 1 || argc > 2) {
 		ti->error = "Invalid argument count";
 		return -EINVAL;
 	}
@@ -779,18 +830,34 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		ti->error = "Unable to allocate the zoned target descriptor";
 		return -ENOMEM;
 	}
+	dmz->dev = kcalloc(2, sizeof(struct dmz_dev), GFP_KERNEL);
+	if (!dmz->dev) {
+		ti->error = "Unable to allocate the zoned device descriptors";
+		kfree(dmz);
+		return -ENOMEM;
+	}
 	ti->private = dmz;
 
 	/* Get the target zoned block device */
-	ret = dmz_get_zoned_device(ti, argv[0]);
+	ret = dmz_get_zoned_device(ti, argv[0], 0, argc);
+	if (ret)
+		goto err;
+
+	if (argc == 2) {
+		ret = dmz_get_zoned_device(ti, argv[1], 1, argc);
+		if (ret) {
+			dmz_put_zoned_device(ti);
+			goto err;
+		}
+	}
+	ret = dmz_fixup_devices(ti);
 	if (ret) {
-		dmz->ddev = NULL;
+		dmz_put_zoned_device(ti);
 		goto err;
 	}
 
 	/* Initialize metadata */
-	dev = dmz->dev;
-	ret = dmz_ctr_metadata(dev, &dmz->metadata,
+	ret = dmz_ctr_metadata(dmz->dev, argc, &dmz->metadata,
 			       dm_table_device_name(ti->table));
 	if (ret) {
 		ti->error = "Metadata initialization failed";
@@ -867,6 +934,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 err_dev:
 	dmz_put_zoned_device(ti);
 err:
+	kfree(dmz->dev);
 	kfree(dmz);
 
 	return ret;
@@ -897,6 +965,7 @@ static void dmz_dtr(struct dm_target *ti)
 
 	mutex_destroy(&dmz->chunk_lock);
 
+	kfree(dmz->dev);
 	kfree(dmz);
 }
 
@@ -971,10 +1040,17 @@ static int dmz_iterate_devices(struct dm_target *ti,
 			       iterate_devices_callout_fn fn, void *data)
 {
 	struct dmz_target *dmz = ti->private;
-	struct dmz_dev *dev = dmz->dev;
-	sector_t capacity = dev->capacity & ~(dmz_zone_nr_sectors(dmz->metadata) - 1);
-
-	return fn(ti, dmz->ddev, 0, capacity, data);
+	unsigned int zone_nr_sectors = dmz_zone_nr_sectors(dmz->metadata);
+	sector_t capacity;
+	int r;
+
+	capacity = dmz->dev[0].capacity & ~(zone_nr_sectors - 1);
+	r = fn(ti, dmz->ddev[0], 0, capacity, data);
+	if (!r && dmz->ddev[1]) {
+		capacity = dmz->dev[1].capacity & ~(zone_nr_sectors - 1);
+		r = fn(ti, dmz->ddev[1], 0, capacity, data);
+	}
+	return r;
 }
 
 static void dmz_status(struct dm_target *ti, status_type_t type,
@@ -984,6 +1060,7 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
 	struct dmz_target *dmz = ti->private;
 	ssize_t sz = 0;
 	char buf[BDEVNAME_SIZE];
+	struct dmz_dev *dev;
 
 	switch (type) {
 	case STATUSTYPE_INFO:
@@ -995,8 +1072,14 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
 		       dmz_nr_seq_zones(dmz->metadata));
 		break;
 	case STATUSTYPE_TABLE:
-		format_dev_t(buf, dmz->dev->bdev->bd_dev);
+		dev = &dmz->dev[0];
+		format_dev_t(buf, dev->bdev->bd_dev);
 		DMEMIT("%s", buf);
+		if (dmz->dev[1].bdev) {
+			dev = &dmz->dev[1];
+			format_dev_t(buf, dev->bdev->bd_dev);
+			DMEMIT(" %s", buf);
+		}
 		break;
 	}
 	return;
@@ -1018,7 +1101,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
 
 static struct target_type dmz_type = {
 	.name		 = "zoned",
-	.version	 = {1, 1, 0},
+	.version	 = {2, 0, 0},
 	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
 	.module		 = THIS_MODULE,
 	.ctr		 = dmz_ctr,
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 2629bd51fa26..4971a765be55 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -52,10 +52,12 @@ struct dmz_dev {
 	struct block_device	*bdev;
 
 	char			name[BDEVNAME_SIZE];
+	uuid_t			uuid;
 
 	sector_t		capacity;
 
 	unsigned int		nr_zones;
+	unsigned int		zone_offset;
 
 	unsigned int		flags;
 
@@ -69,6 +71,7 @@ struct dmz_dev {
 /* Device flags. */
 #define DMZ_BDEV_DYING		(1 << 0)
 #define DMZ_CHECK_BDEV		(2 << 0)
+#define DMZ_BDEV_REGULAR	(4 << 0)
 
 /*
  * Zone descriptor.
@@ -163,8 +166,8 @@ struct dmz_reclaim;
 /*
  * Functions defined in dm-zoned-metadata.c
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd,
-		     const char *devname);
+int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
+		     struct dmz_metadata **zmd, const char *devname);
 void dmz_dtr_metadata(struct dmz_metadata *zmd);
 int dmz_resume_metadata(struct dmz_metadata *zmd);
 
-- 
2.16.4


* Re: [PATCH 15/15] dm-zoned: metadata version 2
  2020-05-11  8:24 ` [PATCH 15/15] dm-zoned: metadata version 2 Hannes Reinecke
@ 2020-05-11  8:36   ` Damien Le Moal
  2020-05-11  8:46     ` Hannes Reinecke
  0 siblings, 1 reply; 21+ messages in thread
From: Damien Le Moal @ 2020-05-11  8:36 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/05/11 17:24, Hannes Reinecke wrote:
> Implement handling for metadata version 2. The new metadata adds
> a label and UUID for the device mapper device, and additional UUID
> for the underlying block devices.
> It also allows for an additional regular drive to be used for
> emulating random access zones. The emulated zones will be placed
> logically in front of the zones from the zoned block device, causing
> the superblocks and metadata to be stored on that device.
> The first zone of the original zoned device will be used to hold
> another, tertiary copy of the metadata; this copy carries a
> generation number of 0 and is never updated; it's just used
> for identification.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Bob Liu <bob.liu@oracle.com>
> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

I forgot to read through the documentation update; a couple of comments are added below.

> ---
>  .../admin-guide/device-mapper/dm-zoned.rst         |  34 ++-
>  drivers/md/dm-zoned-metadata.c                     | 310 +++++++++++++++++----
>  drivers/md/dm-zoned-target.c                       | 185 ++++++++----
>  drivers/md/dm-zoned.h                              |   7 +-
>  4 files changed, 427 insertions(+), 109 deletions(-)
> 
> diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
> index 7547ce635161..553752ea2521 100644
> --- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
> +++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
> @@ -37,9 +37,13 @@ Algorithm
>  dm-zoned implements an on-disk buffering scheme to handle non-sequential
>  write accesses to the sequential zones of a zoned block device.
>  Conventional zones are used for caching as well as for storing internal
> -metadata.
> +metadata. It can also use a regular block device together with the zoned
> +block device; in that case the regular block device will be split logically
> +in zones with the same size as the zoned block device. These zones will be
> +placed in front of the zones from the zoned block device and will be handled
> +just like conventional zones.
>  
> -The zones of the device are separated into 2 types:
> +The zones of the device(s) are separated into 2 types:
>  
>  1) Metadata zones: these are conventional zones used to store metadata.
>  Metadata zones are not reported as useable capacity to the user.
> @@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
>  discard requests. Read requests can be processed concurrently while
>  metadata flush is being executed.
>  
> +If a regular device is used in conjunction with the zoned block device,
> +a third set of metadata (without the zone bitmaps) is written to the
> +start of the zoned block device. This metadata has a generation counter of
> +'0' and will never be updated during normal operation; it just serves for
> +identification purposes. The first and second copy of the metadata
> +are located at the start of the regular block device.
> +
>  Usage
>  =====
>  
> @@ -138,12 +149,21 @@ Ex::
>  
>  	dmzadm --format /dev/sdxx
>  
> -For a formatted device, the target can be created normally with the
> -dmsetup utility. The only parameter that dm-zoned requires is the
> -underlying zoned block device name. Ex::
>  
> -	echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
> -	dmsetup create dmz-`basename ${dev}`
> +If two drives are to be used, both devices must be specified, with the
> +regular block device as the first device.

Actually, the zoned block device must be first; otherwise dmzadm complains. We
can change that, or change the doc. Which do you prefer? I have no strong
opinion here.
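
For reference, the ordering the kernel target itself enforces in this patch
(regular device at index 0, zoned device at index 1) can be modelled in a few
lines of userspace C. This is only a sketch: struct fake_dev and
check_dev_order() are hypothetical names that do not exist in the driver, and
the "zoned" flag stands in for bdev_zoned_model() != BLK_ZONED_NONE. The error
strings are the ones dmz_ctr() and dmz_get_zoned_device() set in ti->error.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct fake_dev {
	bool zoned;	/* true if the device reports a zoned model */
};

/*
 * Return NULL if the device list is acceptable to the target constructor,
 * or the error string the patch would report otherwise.
 */
const char *check_dev_order(const struct fake_dev *devs, int nr_devs)
{
	if (nr_devs < 1 || nr_devs > 2)
		return "Invalid argument count";

	/* A single device must be zoned. */
	if (nr_devs == 1)
		return devs[0].zoned ? NULL : "Invalid regular device";

	/* Two devices: regular first, zoned second. */
	if (devs[0].zoned || !devs[1].zoned)
		return "First device must be a regular device";

	return NULL;
}
```

So whatever order dmzadm accepts on its command line, the table line passed to
the kernel has to list the regular device first.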

> +
> +Ex::
> +
> +	dmzadm --format /dev/sdxx /dev/sdyy
> +
> +
> +Formatted device(s) can be started with the dmzadm utility, too:
> +
> +Ex::
> +
> +	dmzadm --start /dev/sdxx /dev/sdyy

And same here, the zoned device must come first. I added a patch that internally
reverses that order for the dm start operation so that the regular device is
specified first.
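
The reason the kernel wants the regular device first is that its emulated
zones sit logically in front of the zoned device's zones. A minimal userspace
sketch of the resulting zone ID translation follows, assuming a simplified
layout; struct layout and the helper names are hypothetical, and only the
arithmetic mirrors dmz_fixup_devices(), dmz_zone_to_dev() and dmz_start_sect()
from the patch.

```c
#include <stdint.h>

/* Hypothetical, simplified layout; field names are illustrative only. */
struct layout {
	uint64_t zone_nr_sectors;	/* zone size in 512-byte sectors */
	uint64_t reg_capacity;		/* regular device capacity in sectors */
};

/*
 * Number of emulated zones on the regular device. Rounding up means a
 * trailing runt zone still gets a zone ID (the patch marks it offline).
 * This value is also the zone offset of the zoned device.
 */
uint64_t reg_nr_zones(const struct layout *l)
{
	return (l->reg_capacity + l->zone_nr_sectors - 1) / l->zone_nr_sectors;
}

/* Device index (0 = regular, 1 = zoned) that a global zone ID falls on. */
int zone_to_dev(const struct layout *l, uint64_t zone_id)
{
	return zone_id >= reg_nr_zones(l) ? 1 : 0;
}

/* Start sector of a zone, relative to the device that holds it. */
uint64_t zone_start_sect(const struct layout *l, uint64_t zone_id)
{
	uint64_t offset = reg_nr_zones(l);

	if (zone_id >= offset)
		zone_id -= offset;
	return zone_id * l->zone_nr_sectors;
}
```

With 256 MiB zones (524288 sectors) and a regular device slightly larger than
three zones, the regular device contributes four zone IDs, and global zone 4
is the first zone of the zoned device, starting at sector 0 of that device.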

> +
>  
>  Information about the internal layout and current usage of the zones can
>  be obtained with the 'status' callback from dmsetup:
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index 067ce010f457..d9e256762eff 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -16,7 +16,7 @@
>  /*
>   * Metadata version.
>   */
> -#define DMZ_META_VER	1
> +#define DMZ_META_VER	2
>  
>  /*
>   * On-disk super block magic.
> @@ -69,8 +69,17 @@ struct dmz_super {
>  	/* Checksum */
>  	__le32		crc;			/*  48 */
>  
> +	/* DM-Zoned label */
> +	u8		dmz_label[32];		/*  80 */
> +
> +	/* DM-Zoned UUID */
> +	u8		dmz_uuid[16];		/*  96 */
> +
> +	/* Device UUID */
> +	u8		dev_uuid[16];		/* 112 */
> +
>  	/* Padding to full 512B sector */
> -	u8		reserved[464];		/* 512 */
> +	u8		reserved[400];		/* 512 */
>  };
>  
>  /*
> @@ -133,8 +142,11 @@ struct dmz_sb {
>   */
>  struct dmz_metadata {
>  	struct dmz_dev		*dev;
> +	unsigned int		nr_devs;
>  
>  	char			devname[BDEVNAME_SIZE];
> +	char			label[BDEVNAME_SIZE];
> +	uuid_t			uuid;
>  
>  	sector_t		zone_bitmap_size;
>  	unsigned int		zone_nr_bitmap_blocks;
> @@ -161,8 +173,9 @@ struct dmz_metadata {
>  	/* Zone information array */
>  	struct dm_zone		*zones;
>  
> -	struct dmz_sb		sb[2];
> +	struct dmz_sb		sb[3];
>  	unsigned int		mblk_primary;
> +	unsigned int		sb_version;
>  	u64			sb_gen;
>  	unsigned int		min_nr_mblks;
>  	unsigned int		max_nr_mblks;
> @@ -195,31 +208,56 @@ struct dmz_metadata {
>  };
>  
>  #define dmz_zmd_info(zmd, format, args...)	\
> -	DMINFO("(%s): " format, (zmd)->devname, ## args)
> +	DMINFO("(%s): " format, (zmd)->label, ## args)
>  
>  #define dmz_zmd_err(zmd, format, args...)	\
> -	DMERR("(%s): " format, (zmd)->devname, ## args)
> +	DMERR("(%s): " format, (zmd)->label, ## args)
>  
>  #define dmz_zmd_warn(zmd, format, args...)	\
> -	DMWARN("(%s): " format, (zmd)->devname, ## args)
> +	DMWARN("(%s): " format, (zmd)->label, ## args)
>  
>  #define dmz_zmd_debug(zmd, format, args...)	\
> -	DMDEBUG("(%s): " format, (zmd)->devname, ## args)
> +	DMDEBUG("(%s): " format, (zmd)->label, ## args)
>  /*
>   * Various accessors
>   */
> +unsigned int dmz_dev_zone_id(struct dmz_metadata *zmd, struct dm_zone *zone)
> +{
> +	unsigned int zone_id;
> +
> +	if (WARN_ON(!zone))
> +		return 0;
> +
> +	zone_id = zone->id;
> +	if (zmd->nr_devs > 1 &&
> +	    (zone_id >= zmd->dev[1].zone_offset))
> +		zone_id -= zmd->dev[1].zone_offset;
> +	return zone_id;
> +}
> +
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> -	return (sector_t)zone->id << zmd->zone_nr_sectors_shift;
> +	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
> +
> +	return (sector_t)zone_id << zmd->zone_nr_sectors_shift;
>  }
>  
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> -	return (sector_t)zone->id << zmd->zone_nr_blocks_shift;
> +	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
> +
> +	return (sector_t)zone_id << zmd->zone_nr_blocks_shift;
>  }
>  
>  struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> +	if (WARN_ON(!zone))
> +		return &zmd->dev[0];
> +
> +	if (zmd->nr_devs > 1 &&
> +	    zone->id >= zmd->dev[1].zone_offset)
> +		return &zmd->dev[1];
> +
>  	return &zmd->dev[0];
>  }
>  
> @@ -275,17 +313,29 @@ unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
>  
>  const char *dmz_metadata_label(struct dmz_metadata *zmd)
>  {
> -	return (const char *)zmd->devname;
> +	return (const char *)zmd->label;
>  }
>  
>  bool dmz_check_dev(struct dmz_metadata *zmd)
>  {
> -	return dmz_check_bdev(&zmd->dev[0]);
> +	unsigned int i;
> +
> +	for (i = 0; i < zmd->nr_devs; i++) {
> +		if (!dmz_check_bdev(&zmd->dev[i]))
> +			return false;
> +	}
> +	return true;
>  }
>  
>  bool dmz_dev_is_dying(struct dmz_metadata *zmd)
>  {
> -	return dmz_bdev_is_dying(&zmd->dev[0]);
> +	unsigned int i;
> +
> +	for (i = 0; i < zmd->nr_devs; i++) {
> +		if (dmz_bdev_is_dying(&zmd->dev[i]))
> +			return true;
> +	}
> +	return false;
>  }
>  
>  /*
> @@ -687,6 +737,9 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
>  	struct bio *bio;
>  	int ret;
>  
> +	if (WARN_ON(!dev))
> +		return -EIO;
> +
>  	if (dmz_bdev_is_dying(dev))
>  		return -EIO;
>  
> @@ -711,19 +764,32 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
>   */
>  static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>  {
> -	sector_t block = zmd->sb[set].block;
>  	struct dmz_mblock *mblk = zmd->sb[set].mblk;
>  	struct dmz_super *sb = zmd->sb[set].sb;
>  	struct dmz_dev *dev = zmd->sb[set].dev;
> +	sector_t sb_block;
>  	u64 sb_gen = zmd->sb_gen + 1;
>  	int ret;
>  
>  	sb->magic = cpu_to_le32(DMZ_MAGIC);
> -	sb->version = cpu_to_le32(DMZ_META_VER);
> +
> +	sb->version = cpu_to_le32(zmd->sb_version);
> +	if (zmd->sb_version > 1) {
> +		BUILD_BUG_ON(UUID_SIZE != 16);
> +		export_uuid(sb->dmz_uuid, &zmd->uuid);
> +		memcpy(sb->dmz_label, zmd->label, BDEVNAME_SIZE);
> +		export_uuid(sb->dev_uuid, &dev->uuid);
> +	}
>  
>  	sb->gen = cpu_to_le64(sb_gen);
>  
> -	sb->sb_block = cpu_to_le64(block);
> +	/*
> +	 * The metadata always references the absolute block address,
> +	 * ie relative to the entire block range, not the per-device
> +	 * block address.
> +	 */
> +	sb_block = zmd->sb[set].zone->id << zmd->zone_nr_blocks_shift;
> +	sb->sb_block = cpu_to_le64(sb_block);
>  	sb->nr_meta_blocks = cpu_to_le32(zmd->nr_meta_blocks);
>  	sb->nr_reserved_seq = cpu_to_le32(zmd->nr_reserved_seq);
>  	sb->nr_chunks = cpu_to_le32(zmd->nr_chunks);
> @@ -734,7 +800,8 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>  	sb->crc = 0;
>  	sb->crc = cpu_to_le32(crc32_le(sb_gen, (unsigned char *)sb, DMZ_BLOCK_SIZE));
>  
> -	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, block, mblk->page);
> +	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, zmd->sb[set].block,
> +			     mblk->page);
>  	if (ret == 0)
>  		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
>  
> @@ -915,6 +982,23 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
>  	u32 crc, stored_crc;
>  	u64 gen;
>  
> +	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
> +		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
> +			    DMZ_MAGIC, le32_to_cpu(sb->magic));
> +		return -ENXIO;
> +	}
> +
> +	zmd->sb_version = le32_to_cpu(sb->version);
> +	if (zmd->sb_version > DMZ_META_VER) {
> +		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
> +			    DMZ_META_VER, zmd->sb_version);
> +		return -EINVAL;
> +	}
> +	if ((zmd->sb_version < 1) && (set == 2)) {
> +		dmz_dev_err(dev, "Tertiary superblocks are not supported");
> +		return -EINVAL;
> +	}
> +
>  	gen = le64_to_cpu(sb->gen);
>  	stored_crc = le32_to_cpu(sb->crc);
>  	sb->crc = 0;
> @@ -925,16 +1009,45 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
>  		return -ENXIO;
>  	}
>  
> -	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
> -		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
> -			    DMZ_MAGIC, le32_to_cpu(sb->magic));
> -		return -ENXIO;
> -	}
> +	if (zmd->sb_version > 1) {
> +		uuid_t sb_uuid;
> +
> +		import_uuid(&sb_uuid, sb->dmz_uuid);
> +		if (uuid_is_null(&sb_uuid)) {
> +			dmz_dev_err(dev, "NULL DM-Zoned uuid");
> +			return -ENXIO;
> +		} else if (uuid_is_null(&zmd->uuid)) {
> +			uuid_copy(&zmd->uuid, &sb_uuid);
> +		} else if (!uuid_equal(&zmd->uuid, &sb_uuid)) {
> +			dmz_dev_err(dev, "mismatching DM-Zoned uuid, "
> +				    "is %pUl expected %pUl",
> +				    &sb_uuid, &zmd->uuid);
> +			return -ENXIO;
> +		}
> +		if (!strlen(zmd->label))
> +			memcpy(zmd->label, sb->dmz_label, BDEVNAME_SIZE);
> +		else if (memcmp(zmd->label, sb->dmz_label, BDEVNAME_SIZE)) {
> +			dmz_dev_err(dev, "mismatching DM-Zoned label, "
> +				    "is %s expected %s",
> +				    sb->dmz_label, zmd->label);
> +			return -ENXIO;
> +		}
> +		import_uuid(&dev->uuid, sb->dev_uuid);
> +		if (uuid_is_null(&dev->uuid)) {
> +			dmz_dev_err(dev, "NULL device uuid");
> +			return -ENXIO;
> +		}
>  
> -	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
> -		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
> -			    DMZ_META_VER, le32_to_cpu(sb->version));
> -		return -ENXIO;
> +		if (set == 2) {
> +			/*
> +			 * Generation number should be 0, but it doesn't
> +			 * really matter if it isn't.
> +			 */
> +			if (gen != 0)
> +				dmz_dev_warn(dev, "Invalid generation %llu",
> +					    gen);
> +			return 0;
> +		}
>  	}
>  
>  	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + zmd->zone_nr_blocks - 1)
> @@ -1185,21 +1298,38 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
>  		      "Using super block %u (gen %llu)",
>  		      zmd->mblk_primary, zmd->sb_gen);
>  
> +	if ((zmd->sb_version > 1) && zmd->sb[2].zone) {
> +		zmd->sb[2].block = dmz_start_block(zmd, zmd->sb[2].zone);
> +		zmd->sb[2].dev = dmz_zone_to_dev(zmd, zmd->sb[2].zone);
> +		ret = dmz_get_sb(zmd, 2);
> +		if (ret) {
> +			dmz_dev_err(zmd->sb[2].dev,
> +				    "Read tertiary super block failed");
> +			return ret;
> +		}
> +		ret = dmz_check_sb(zmd, 2);
> +		if (ret == -EINVAL)
> +			return ret;
> +	}
>  	return 0;
>  }
>  
>  /*
>   * Initialize a zone descriptor.
>   */
> -static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
> +static int dmz_init_zone(struct blk_zone *blkz, unsigned int num, void *data)
>  {
>  	struct dmz_metadata *zmd = data;
> +	struct dmz_dev *dev = zmd->nr_devs > 1 ? &zmd->dev[1] : &zmd->dev[0];
> +	int idx = num + dev->zone_offset;
>  	struct dm_zone *zone = &zmd->zones[idx];
> -	struct dmz_dev *dev = zmd->dev;
>  
> -	/* Ignore the eventual last runt (smaller) zone */
>  	if (blkz->len != zmd->zone_nr_sectors) {
> -		if (blkz->start + blkz->len == dev->capacity)
> +		if (zmd->sb_version > 1) {
> +			/* Ignore the eventual runt (smaller) zone */
> +			set_bit(DMZ_OFFLINE, &zone->flags);
> +			return 0;
> +		} else if (blkz->start + blkz->len == dev->capacity)
>  			return 0;
>  		return -ENXIO;
>  	}
> @@ -1234,16 +1364,45 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>  		zmd->nr_useable_zones++;
>  		if (dmz_is_rnd(zone)) {
>  			zmd->nr_rnd_zones++;
> -			if (!zmd->sb[0].zone) {
> -				/* Super block zone */
> +			if (zmd->nr_devs == 1 && !zmd->sb[0].zone) {
> +				/* Primary super block zone */
>  				zmd->sb[0].zone = zone;
>  			}
>  		}
> +		if (zmd->nr_devs > 1 && !zmd->sb[2].zone) {
> +			/* Tertiary superblock zone */
> +			zmd->sb[2].zone = zone;
> +		}
>  	}
>  
>  	return 0;
>  }
>  
> +static void dmz_emulate_zones(struct dmz_metadata *zmd, struct dmz_dev *dev)
> +{
> +	int idx;
> +	sector_t zone_offset = 0;
> +
> +	for(idx = 0; idx < dev->nr_zones; idx++) {
> +		struct dm_zone *zone = &zmd->zones[idx];
> +
> +		INIT_LIST_HEAD(&zone->link);
> +		atomic_set(&zone->refcount, 0);
> +		zone->id = idx;
> +		zone->chunk = DMZ_MAP_UNMAPPED;
> +		set_bit(DMZ_RND, &zone->flags);
> +		zone->wp_block = 0;
> +		zmd->nr_rnd_zones++;
> +		zmd->nr_useable_zones++;
> +		if (dev->capacity - zone_offset < zmd->zone_nr_sectors) {
> +			/* Disable runt zone */
> +			set_bit(DMZ_OFFLINE, &zone->flags);
> +			break;
> +		}
> +		zone_offset += zmd->zone_nr_sectors;
> +	}
> +}
> +
>  /*
>   * Free zones descriptors.
>   */
> @@ -1259,11 +1418,11 @@ static void dmz_drop_zones(struct dmz_metadata *zmd)
>   */
>  static int dmz_init_zones(struct dmz_metadata *zmd)
>  {
> -	struct dmz_dev *dev = &zmd->dev[0];
> -	int ret;
> +	int i, ret;
> +	struct dmz_dev *zoned_dev = &zmd->dev[0];
>  
>  	/* Init */
> -	zmd->zone_nr_sectors = dev->zone_nr_sectors;
> +	zmd->zone_nr_sectors = zmd->dev[0].zone_nr_sectors;
>  	zmd->zone_nr_sectors_shift = ilog2(zmd->zone_nr_sectors);
>  	zmd->zone_nr_blocks = dmz_sect2blk(zmd->zone_nr_sectors);
>  	zmd->zone_nr_blocks_shift = ilog2(zmd->zone_nr_blocks);
> @@ -1274,7 +1433,14 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
>  					DMZ_BLOCK_SIZE_BITS);
>  
>  	/* Allocate zone array */
> -	zmd->nr_zones = dev->nr_zones;
> +	zmd->nr_zones = 0;
> +	for (i = 0; i < zmd->nr_devs; i++)
> +		zmd->nr_zones += zmd->dev[i].nr_zones;
> +
> +	if (!zmd->nr_zones) {
> +		DMERR("(%s): No zones found", zmd->devname);
> +		return -ENXIO;
> +	}
>  	zmd->zones = kcalloc(zmd->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
>  	if (!zmd->zones)
>  		return -ENOMEM;
> @@ -1282,14 +1448,27 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
>  	DMDEBUG("(%s): Using %zu B for zone information",
>  		zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
>  
> +	if (zmd->nr_devs > 1) {
> +		dmz_emulate_zones(zmd, &zmd->dev[0]);
> +		/*
> +		 * Primary superblock zone is always at zone 0 when multiple
> +		 * drives are present.
> +		 */
> +		zmd->sb[0].zone = &zmd->zones[0];
> +
> +		zoned_dev = &zmd->dev[1];
> +	}
> +
>  	/*
>  	 * Get zone information and initialize zone descriptors.  At the same
>  	 * time, determine where the super block should be: first block of the
>  	 * first randomly writable zone.
>  	 */
> -	ret = blkdev_report_zones(dev->bdev, 0, BLK_ALL_ZONES, dmz_init_zone,
> -				  zmd);
> +	ret = blkdev_report_zones(zoned_dev->bdev, 0, BLK_ALL_ZONES,
> +				  dmz_init_zone, zmd);
>  	if (ret < 0) {
> +		DMDEBUG("(%s): Failed to report zones, error %d",
> +			zmd->devname, ret);
>  		dmz_drop_zones(zmd);
>  		return ret;
>  	}
> @@ -1325,6 +1504,9 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
>  	unsigned int noio_flag;
>  	int ret;
>  
> +	if (dev->flags & DMZ_BDEV_REGULAR)
> +		return 0;
> +
>  	/*
>  	 * Get zone information from disk. Since blkdev_report_zones() uses
>  	 * GFP_KERNEL by default for memory allocations, set the per-task
> @@ -2475,18 +2657,33 @@ void dmz_print_dev(struct dmz_metadata *zmd, int num)
>  {
>  	struct dmz_dev *dev = &zmd->dev[num];
>  
> -	dmz_dev_info(dev, "Host-%s zoned block device",
> -		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
> -		     "aware" : "managed");
> -	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
> -		     (u64)dev->capacity);
> -	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
> -		     dev->nr_zones, (u64)zmd->zone_nr_sectors);
> +	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE)
> +		dmz_dev_info(dev, "Regular block device");
> +	else
> +		dmz_dev_info(dev, "Host-%s zoned block device",
> +			     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
> +			     "aware" : "managed");
> +	if (zmd->sb_version > 1) {
> +		sector_t sector_offset =
> +			dev->zone_offset << zmd->zone_nr_sectors_shift;
> +
> +		dmz_dev_info(dev, "  %llu 512-byte logical sectors (offset %llu)",
> +			     (u64)dev->capacity, (u64)sector_offset);
> +		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors (offset %llu)",
> +			     dev->nr_zones, (u64)zmd->zone_nr_sectors,
> +			     (u64)dev->zone_offset);
> +	} else {
> +		dmz_dev_info(dev, "  %llu 512-byte logical sectors",
> +			     (u64)dev->capacity);
> +		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
> +			     dev->nr_zones, (u64)zmd->zone_nr_sectors);
> +	}
>  }
>  /*
>   * Initialize the zoned metadata.
>   */
> -int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
> +int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
> +		     struct dmz_metadata **metadata,
>  		     const char *devname)
>  {
>  	struct dmz_metadata *zmd;
> @@ -2500,6 +2697,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>  
>  	strcpy(zmd->devname, devname);
>  	zmd->dev = dev;
> +	zmd->nr_devs = num_dev;
>  	zmd->mblk_rbtree = RB_ROOT;
>  	init_rwsem(&zmd->mblk_sem);
>  	mutex_init(&zmd->mblk_flush_lock);
> @@ -2534,11 +2732,24 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>  	/* Set metadata zones starting from sb_zone */
>  	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
>  		zone = dmz_get(zmd, zmd->sb[0].zone->id + i);
> -		if (!dmz_is_rnd(zone))
> +		if (!dmz_is_rnd(zone)) {
> +			dmz_zmd_err(zmd,
> +				    "metadata zone %d is not random", i);
> +			ret = -ENXIO;
>  			goto err;
> +		}
> +		set_bit(DMZ_META, &zone->flags);
> +	}
> +	if (zmd->sb[2].zone) {
> +		zone = dmz_get(zmd, zmd->sb[2].zone->id);
> +		if (!zone) {
> +			dmz_zmd_err(zmd,
> +				    "Tertiary metadata zone not present");
> +			ret = -ENXIO;
> +			goto err;
> +		}
>  		set_bit(DMZ_META, &zone->flags);
>  	}
> -
>  	/* Load mapping table */
>  	ret = dmz_load_mapping(zmd);
>  	if (ret)
> @@ -2563,8 +2774,9 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>  		goto err;
>  	}
>  
> -	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", DMZ_META_VER);
> -	dmz_print_dev(zmd, 0);
> +	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", zmd->sb_version);
> +	for (i = 0; i < zmd->nr_devs; i++)
> +		dmz_print_dev(zmd, i);
>  
>  	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
>  		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index a09fb78ffe88..ea43f6892ced 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -13,6 +13,8 @@
>  
>  #define DMZ_MIN_BIOS		8192
>  
> +#define DMZ_MAX_DEVS		2
> +
>  /*
>   * Zone BIO context.
>   */
> @@ -38,7 +40,7 @@ struct dm_chunk_work {
>   * Target descriptor.
>   */
>  struct dmz_target {
> -	struct dm_dev		*ddev;
> +	struct dm_dev		*ddev[DMZ_MAX_DEVS];
>  
>  	unsigned long		flags;
>  
> @@ -81,7 +83,7 @@ static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
>  
>  	if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
>  		bio->bi_status = status;
> -	if (bio->bi_status != BLK_STS_OK)
> +	if (bioctx->dev && bio->bi_status != BLK_STS_OK)
>  		bioctx->dev->flags |= DMZ_CHECK_BDEV;
>  
>  	if (refcount_dec_and_test(&bioctx->ref)) {
> @@ -690,60 +692,64 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
>  /*
>   * Get zoned device information.
>   */
> -static int dmz_get_zoned_device(struct dm_target *ti, char *path)
> +static int dmz_get_zoned_device(struct dm_target *ti, char *path,
> +				int idx, int nr_devs)
>  {
>  	struct dmz_target *dmz = ti->private;
> -	struct request_queue *q;
> +	struct dm_dev *ddev;
>  	struct dmz_dev *dev;
> -	sector_t aligned_capacity;
>  	int ret;
> +	struct block_device *bdev;
>  
>  	/* Get the target device */
> -	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &dmz->ddev);
> +	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &ddev);
>  	if (ret) {
>  		ti->error = "Get target device failed";
> -		dmz->ddev = NULL;
>  		return ret;
>  	}
>  
> -	dev = kzalloc(sizeof(struct dmz_dev), GFP_KERNEL);
> -	if (!dev) {
> -		ret = -ENOMEM;
> -		goto err;
> +	bdev = ddev->bdev;
> +	if (bdev_zoned_model(bdev) == BLK_ZONED_NONE) {
> +		if (nr_devs == 1) {
> +			ti->error = "Invalid regular device";
> +			goto err;
> +		}
> +		if (idx != 0) {
> +			ti->error = "First device must be a regular device";
> +			goto err;
> +		}
> +		if (dmz->ddev[0]) {
> +			ti->error = "Too many regular devices";
> +			goto err;
> +		}
> +		dev = &dmz->dev[idx];
> +		dev->flags = DMZ_BDEV_REGULAR;
> +	} else {
> +		if (dmz->ddev[idx]) {
> +			ti->error = "Too many zoned devices";
> +			goto err;
> +		}
> +		if (nr_devs > 1 && idx == 0) {
> +			ti->error = "First device must be a regular device";
> +			goto err;
> +		}
> +		dev = &dmz->dev[idx];
>  	}
> -
> -	dev->bdev = dmz->ddev->bdev;
> +	dev->bdev = bdev;
>  	(void)bdevname(dev->bdev, dev->name);
>  
> -	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE) {
> -		ti->error = "Not a zoned block device";
> -		ret = -EINVAL;
> -		goto err;
> -	}
> -
> -	q = bdev_get_queue(dev->bdev);
> -	dev->capacity = i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
> -	aligned_capacity = dev->capacity &
> -				~((sector_t)blk_queue_zone_sectors(q) - 1);
> -	if (ti->begin ||
> -	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
> -		ti->error = "Partial mapping not supported";
> -		ret = -EINVAL;
> +	dev->capacity = i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
> +	if (ti->begin) {
> +		ti->error = "Partial mapping is not supported";
>  		goto err;
>  	}
>  
> -	dev->zone_nr_sectors = blk_queue_zone_sectors(q);
> -
> -	dev->nr_zones = blkdev_nr_zones(dev->bdev->bd_disk);
> -
> -	dmz->dev = dev;
> +	dmz->ddev[idx] = ddev;
>  
>  	return 0;
>  err:
> -	dm_put_device(ti, dmz->ddev);
> -	kfree(dev);
> -
> -	return ret;
> +	dm_put_device(ti, ddev);
> +	return -EINVAL;
>  }
>  
>  /*
> @@ -752,10 +758,56 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
>  static void dmz_put_zoned_device(struct dm_target *ti)
>  {
>  	struct dmz_target *dmz = ti->private;
> +	int i;
>  
> -	dm_put_device(ti, dmz->ddev);
> -	kfree(dmz->dev);
> -	dmz->dev = NULL;
> +	for (i = 0; i < DMZ_MAX_DEVS; i++) {
> +		if (dmz->ddev[i]) {
> +			dm_put_device(ti, dmz->ddev[i]);
> +			dmz->ddev[i] = NULL;
> +		}
> +	}
> +}
> +
> +static int dmz_fixup_devices(struct dm_target *ti)
> +{
> +	struct dmz_target *dmz = ti->private;
> +	struct dmz_dev *reg_dev, *zoned_dev;
> +	struct request_queue *q;
> +
> +	/*
> +	 * When we have two devices, the first one must be a regular block
> +	 * device and the second a zoned block device.
> +	 */
> +	if (dmz->ddev[0] && dmz->ddev[1]) {
> +		reg_dev = &dmz->dev[0];
> +		if (!(reg_dev->flags & DMZ_BDEV_REGULAR)) {
> +			ti->error = "Primary disk is not a regular device";
> +			return -EINVAL;
> +		}
> +		zoned_dev = &dmz->dev[1];
> +		if (zoned_dev->flags & DMZ_BDEV_REGULAR) {
> +			ti->error = "Secondary disk is not a zoned device";
> +			return -EINVAL;
> +		}
> +	} else {
> +		reg_dev = NULL;
> +		zoned_dev = &dmz->dev[0];
> +		if (zoned_dev->flags & DMZ_BDEV_REGULAR) {
> +			ti->error = "Disk is not a zoned device";
> +			return -EINVAL;
> +		}
> +	}
> +	q = bdev_get_queue(zoned_dev->bdev);
> +	zoned_dev->zone_nr_sectors = blk_queue_zone_sectors(q);
> +	zoned_dev->nr_zones = blkdev_nr_zones(zoned_dev->bdev->bd_disk);
> +
> +	if (reg_dev) {
> +		reg_dev->zone_nr_sectors = zoned_dev->zone_nr_sectors;
> +		reg_dev->nr_zones = DIV_ROUND_UP(reg_dev->capacity,
> +						 reg_dev->zone_nr_sectors);
> +		zoned_dev->zone_offset = reg_dev->nr_zones;
> +	}
> +	return 0;
>  }
>  
>  /*
> @@ -764,11 +816,10 @@ static void dmz_put_zoned_device(struct dm_target *ti)
>  static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  {
>  	struct dmz_target *dmz;
> -	struct dmz_dev *dev;
>  	int ret;
>  
>  	/* Check arguments */
> -	if (argc != 1) {
> +	if (argc < 1 || argc > 2) {
>  		ti->error = "Invalid argument count";
>  		return -EINVAL;
>  	}
> @@ -779,18 +830,34 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  		ti->error = "Unable to allocate the zoned target descriptor";
>  		return -ENOMEM;
>  	}
> +	dmz->dev = kcalloc(2, sizeof(struct dmz_dev), GFP_KERNEL);
> +	if (!dmz->dev) {
> +		ti->error = "Unable to allocate the zoned device descriptors";
> +		kfree(dmz);
> +		return -ENOMEM;
> +	}
>  	ti->private = dmz;
>  
>  	/* Get the target zoned block device */
> -	ret = dmz_get_zoned_device(ti, argv[0]);
> +	ret = dmz_get_zoned_device(ti, argv[0], 0, argc);
> +	if (ret)
> +		goto err;
> +
> +	if (argc == 2) {
> +		ret = dmz_get_zoned_device(ti, argv[1], 1, argc);
> +		if (ret) {
> +			dmz_put_zoned_device(ti);
> +			goto err;
> +		}
> +	}
> +	ret = dmz_fixup_devices(ti);
>  	if (ret) {
> -		dmz->ddev = NULL;
> +		dmz_put_zoned_device(ti);
>  		goto err;
>  	}
>  
>  	/* Initialize metadata */
> -	dev = dmz->dev;
> -	ret = dmz_ctr_metadata(dev, &dmz->metadata,
> +	ret = dmz_ctr_metadata(dmz->dev, argc, &dmz->metadata,
>  			       dm_table_device_name(ti->table));
>  	if (ret) {
>  		ti->error = "Metadata initialization failed";
> @@ -867,6 +934,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  err_dev:
>  	dmz_put_zoned_device(ti);
>  err:
> +	kfree(dmz->dev);
>  	kfree(dmz);
>  
>  	return ret;
> @@ -897,6 +965,7 @@ static void dmz_dtr(struct dm_target *ti)
>  
>  	mutex_destroy(&dmz->chunk_lock);
>  
> +	kfree(dmz->dev);
>  	kfree(dmz);
>  }
>  
> @@ -971,10 +1040,17 @@ static int dmz_iterate_devices(struct dm_target *ti,
>  			       iterate_devices_callout_fn fn, void *data)
>  {
>  	struct dmz_target *dmz = ti->private;
> -	struct dmz_dev *dev = dmz->dev;
> -	sector_t capacity = dev->capacity & ~(dmz_zone_nr_sectors(dmz->metadata) - 1);
> -
> -	return fn(ti, dmz->ddev, 0, capacity, data);
> +	unsigned int zone_nr_sectors = dmz_zone_nr_sectors(dmz->metadata);
> +	sector_t capacity;
> +	int r;
> +
> +	capacity = dmz->dev[0].capacity & ~(zone_nr_sectors - 1);
> +	r = fn(ti, dmz->ddev[0], 0, capacity, data);
> +	if (!r && dmz->ddev[1]) {
> +		capacity = dmz->dev[1].capacity & ~(zone_nr_sectors - 1);
> +		r = fn(ti, dmz->ddev[1], 0, capacity, data);
> +	}
> +	return r;
>  }
>  
>  static void dmz_status(struct dm_target *ti, status_type_t type,
> @@ -984,6 +1060,7 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
>  	struct dmz_target *dmz = ti->private;
>  	ssize_t sz = 0;
>  	char buf[BDEVNAME_SIZE];
> +	struct dmz_dev *dev;
>  
>  	switch (type) {
>  	case STATUSTYPE_INFO:
> @@ -995,8 +1072,14 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
>  		       dmz_nr_seq_zones(dmz->metadata));
>  		break;
>  	case STATUSTYPE_TABLE:
> -		format_dev_t(buf, dmz->dev->bdev->bd_dev);
> +		dev = &dmz->dev[0];
> +		format_dev_t(buf, dev->bdev->bd_dev);
>  		DMEMIT("%s", buf);
> +		if (dmz->dev[1].bdev) {
> +			dev = &dmz->dev[1];
> +			format_dev_t(buf, dev->bdev->bd_dev);
> +			DMEMIT(" %s", buf);
> +		}
>  		break;
>  	}
>  	return;
> @@ -1018,7 +1101,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>  
>  static struct target_type dmz_type = {
>  	.name		 = "zoned",
> -	.version	 = {1, 1, 0},
> +	.version	 = {2, 0, 0},
>  	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>  	.module		 = THIS_MODULE,
>  	.ctr		 = dmz_ctr,
> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
> index 2629bd51fa26..4971a765be55 100644
> --- a/drivers/md/dm-zoned.h
> +++ b/drivers/md/dm-zoned.h
> @@ -52,10 +52,12 @@ struct dmz_dev {
>  	struct block_device	*bdev;
>  
>  	char			name[BDEVNAME_SIZE];
> +	uuid_t			uuid;
>  
>  	sector_t		capacity;
>  
>  	unsigned int		nr_zones;
> +	unsigned int		zone_offset;
>  
>  	unsigned int		flags;
>  
> @@ -69,6 +71,7 @@ struct dmz_dev {
>  /* Device flags. */
>  #define DMZ_BDEV_DYING		(1 << 0)
>  #define DMZ_CHECK_BDEV		(2 << 0)
> +#define DMZ_BDEV_REGULAR	(4 << 0)
>  
>  /*
>   * Zone descriptor.
> @@ -163,8 +166,8 @@ struct dmz_reclaim;
>  /*
>   * Functions defined in dm-zoned-metadata.c
>   */
> -int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd,
> -		     const char *devname);
> +int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
> +		     struct dmz_metadata **zmd, const char *devname);
>  void dmz_dtr_metadata(struct dmz_metadata *zmd);
>  int dmz_resume_metadata(struct dmz_metadata *zmd);
>  
> 
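
For reference, the zone layout arithmetic in dmz_fixup_devices() and dmz_print_dev() above can be sketched in standalone userspace C. This is only an illustration under stated assumptions, not the kernel code: the helper names (div_round_up(), emulated_zone_count(), log2_pow2(), zone_start_sector()) are hypothetical, and the zone size is assumed to be a power of two, as the ilog2() use in the patch implies.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division, same idea as the kernel's DIV_ROUND_UP(). */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/*
 * Number of emulated zones on the regular device. A short (runt)
 * last zone still counts as a zone, hence the round-up.
 */
static unsigned int emulated_zone_count(uint64_t capacity,
					uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	return (unsigned int)div_round_up(capacity, zone_sectors);
}

/* log2 of a power-of-two value (hypothetical stand-in for ilog2()). */
static unsigned int log2_pow2(uint64_t v)
{
	unsigned int shift = 0;

	assert(v != 0 && (v & (v - 1)) == 0);
	while (v > 1) {
		v >>= 1;
		shift++;
	}
	return shift;
}

/*
 * First logical sector of a device whose zones start at zone_offset.
 * The offset is widened to 64 bits before shifting.
 */
static uint64_t zone_start_sector(unsigned int zone_offset,
				  uint64_t zone_sectors)
{
	return (uint64_t)zone_offset << log2_pow2(zone_sectors);
}
```

With example numbers (assumed, not taken from real hardware): a regular device of 1000000 sectors and 524288-sector zones gives 2 emulated zones, so the zoned device gets zone_offset 2 and its first zone starts at logical sector 1048576.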


-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 15/15] dm-zoned: metadata version 2
  2020-05-11  8:36   ` Damien Le Moal
@ 2020-05-11  8:46     ` Hannes Reinecke
  2020-05-11  8:51       ` Damien Le Moal
  0 siblings, 1 reply; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  8:46 UTC (permalink / raw)
  To: Damien Le Moal, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 5/11/20 10:36 AM, Damien Le Moal wrote:
> On 2020/05/11 17:24, Hannes Reinecke wrote:
>> Implement handling for metadata version 2. The new metadata adds
>> a label and UUID for the device mapper device, and additional UUID
>> for the underlying block devices.
>> It also allows for an additional regular drive to be used for
>> emulating random access zones. The emulated zones will be placed
>> logically in front of the zones from the zoned block device, causing
>> the superblocks and metadata to be stored on that device.
>> The first zone of the original zoned device will be used to hold
>> another, tertiary copy of the metadata; this copy carries a
>> generation number of 0 and is never updated; it's just used
>> for identification.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> Reviewed-by: Bob Liu <bob.liu@oracle.com>
>> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
> 
> Forgot to read through the documentation update. A couple of comments added below.
> 
>> ---
>>   .../admin-guide/device-mapper/dm-zoned.rst         |  34 ++-
>>   drivers/md/dm-zoned-metadata.c                     | 310 +++++++++++++++++----
>>   drivers/md/dm-zoned-target.c                       | 185 ++++++++----
>>   drivers/md/dm-zoned.h                              |   7 +-
>>   4 files changed, 427 insertions(+), 109 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>> index 7547ce635161..553752ea2521 100644
>> --- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
>> +++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>> @@ -37,9 +37,13 @@ Algorithm
>>   dm-zoned implements an on-disk buffering scheme to handle non-sequential
>>   write accesses to the sequential zones of a zoned block device.
>>   Conventional zones are used for caching as well as for storing internal
>> -metadata.
>> +metadata. It can also use a regular block device together with the zoned
>> +block device; in that case the regular block device will be split logically
>> +in zones with the same size as the zoned block device. These zones will be
>> +placed in front of the zones from the zoned block device and will be handled
>> +just like conventional zones.
>>   
>> -The zones of the device are separated into 2 types:
>> +The zones of the device(s) are separated into 2 types:
>>   
>>   1) Metadata zones: these are conventional zones used to store metadata.
>>   Metadata zones are not reported as useable capacity to the user.
>> @@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
>>   discard requests. Read requests can be processed concurrently while
>>   metadata flush is being executed.
>>   
>> +If a regular device is used in conjunction with the zoned block device,
>> +a third set of metadata (without the zone bitmaps) is written to the
>> +start of the zoned block device. This metadata has a generation counter of
>> +'0' and will never be updated during normal operation; it just serves for
>> +identification purposes. The first and second copy of the metadata
>> +are located at the start of the regular block device.
>> +
>>   Usage
>>   =====
>>   
>> @@ -138,12 +149,21 @@ Ex::
>>   
>>   	dmzadm --format /dev/sdxx
>>   
>> -For a formatted device, the target can be created normally with the
>> -dmsetup utility. The only parameter that dm-zoned requires is the
>> -underlying zoned block device name. Ex::
>>   
>> -	echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
>> -	dmsetup create dmz-`basename ${dev}`
>> +If two drives are to be used, both devices must be specified, with the
>> +regular block device as the first device.
> 
> Actually, the zoned block device must be first. Otherwise dmzadm complains. We
> can change that, or change the doc. Which do you prefer ? No strong opinion here.
> 
Nope, not any more. Fixed it in my local repo (which I haven't pushed, 
sorry).

But after the last discussion we had I thought it better and more 
consistent to have the regular device first, just like the device-mapper 
interface.
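
As a rough illustration of that ordering rule, here is a standalone userspace sketch. It is not the kernel code: check_device_order() is a hypothetical helper, and the error strings are simply borrowed from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Check the device order for a dm-zoned style setup.
 * is_zoned[i] says whether device i is a zoned block device.
 * Returns NULL when the order is acceptable, an error string otherwise.
 */
static const char *check_device_order(const bool *is_zoned, int nr_devs)
{
	assert(is_zoned != NULL);

	if (nr_devs < 1 || nr_devs > 2)
		return "Invalid argument count";

	/* A single device must be zoned. */
	if (nr_devs == 1)
		return is_zoned[0] ? NULL : "Invalid regular device";

	/* Two devices: regular device first, zoned device second. */
	if (is_zoned[0])
		return "First device must be a regular device";
	if (!is_zoned[1])
		return "Secondary disk is not a zoned device";

	return NULL;
}
```

So { regular, zoned } passes, while { zoned, regular } and { regular, regular } are both rejected.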

>> +
>> +Ex::
>> +
>> +	dmzadm --format /dev/sdxx /dev/sdyy
>> +
>> +
>> +Formatted device(s) can be started with the dmzadm utility, too:
>> +
>> +Ex::
>> +
>> +	dmzadm --start /dev/sdxx /dev/sdyy
> 
> And same here, the zoned device must come first. I added a patch that internally
> reverse that order for the dm start operation so that the regular device is
> specified first.
> 
See above. I've fixed up dmzadm for this.

I just hadn't pushed the patch as I wanted to get the kernel bits 
settled. But now that we have I'll be pushing the dm-zoned-tools updates.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


* Re: [PATCH 15/15] dm-zoned: metadata version 2
  2020-05-11  8:46     ` Hannes Reinecke
@ 2020-05-11  8:51       ` Damien Le Moal
  2020-05-11  9:18         ` Hannes Reinecke
  0 siblings, 1 reply; 21+ messages in thread
From: Damien Le Moal @ 2020-05-11  8:51 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/05/11 17:46, Hannes Reinecke wrote:
> On 5/11/20 10:36 AM, Damien Le Moal wrote:
>> On 2020/05/11 17:24, Hannes Reinecke wrote:
>>> Implement handling for metadata version 2. The new metadata adds
>>> a label and UUID for the device mapper device, and additional UUID
>>> for the underlying block devices.
>>> It also allows for an additional regular drive to be used for
>>> emulating random access zones. The emulated zones will be placed
>>> logically in front of the zones from the zoned block device, causing
>>> the superblocks and metadata to be stored on that device.
>>> The first zone of the original zoned device will be used to hold
>>> another, tertiary copy of the metadata; this copy carries a
>>> generation number of 0 and is never updated; it's just used
>>> for identification.
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> Reviewed-by: Bob Liu <bob.liu@oracle.com>
>>> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
>>
>> Forgot to read through the documentation update. A couple of comments added below.
>>
>>> ---
>>>   .../admin-guide/device-mapper/dm-zoned.rst         |  34 ++-
>>>   drivers/md/dm-zoned-metadata.c                     | 310 +++++++++++++++++----
>>>   drivers/md/dm-zoned-target.c                       | 185 ++++++++----
>>>   drivers/md/dm-zoned.h                              |   7 +-
>>>   4 files changed, 427 insertions(+), 109 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>> index 7547ce635161..553752ea2521 100644
>>> --- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>> +++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>> @@ -37,9 +37,13 @@ Algorithm
>>>   dm-zoned implements an on-disk buffering scheme to handle non-sequential
>>>   write accesses to the sequential zones of a zoned block device.
>>>   Conventional zones are used for caching as well as for storing internal
>>> -metadata.
>>> +metadata. It can also use a regular block device together with the zoned
>>> +block device; in that case the regular block device will be split logically
>>> +in zones with the same size as the zoned block device. These zones will be
>>> +placed in front of the zones from the zoned block device and will be handled
>>> +just like conventional zones.
>>>   
>>> -The zones of the device are separated into 2 types:
>>> +The zones of the device(s) are separated into 2 types:
>>>   
>>>   1) Metadata zones: these are conventional zones used to store metadata.
>>>   Metadata zones are not reported as useable capacity to the user.
>>> @@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
>>>   discard requests. Read requests can be processed concurrently while
>>>   metadata flush is being executed.
>>>   
>>> +If a regular device is used in conjunction with the zoned block device,
>>> +a third set of metadata (without the zone bitmaps) is written to the
>>> +start of the zoned block device. This metadata has a generation counter of
>>> +'0' and will never be updated during normal operation; it just serves for
>>> +identification purposes. The first and second copy of the metadata
>>> +are located at the start of the regular block device.
>>> +
>>>   Usage
>>>   =====
>>>   
>>> @@ -138,12 +149,21 @@ Ex::
>>>   
>>>   	dmzadm --format /dev/sdxx
>>>   
>>> -For a formatted device, the target can be created normally with the
>>> -dmsetup utility. The only parameter that dm-zoned requires is the
>>> -underlying zoned block device name. Ex::
>>>   
>>> -	echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
>>> -	dmsetup create dmz-`basename ${dev}`
>>> +If two drives are to be used, both devices must be specified, with the
>>> +regular block device as the first device.
>>
>> Actually, the zoned block device must be first. Otherwise dmzadm complains. We
>> can change that, or change the doc. Which do you prefer ? No strong opinion here.
>>
> Nope, not any more. Fixed it in my local repo (which I haven't pushed, 
> sorry).
> 
> But after the last discussion we had I thought it better and more 
> consistent to have the regular device first, just like the device-mapper 
> interface.

Works for me !

> 
>>> +
>>> +Ex::
>>> +
>>> +	dmzadm --format /dev/sdxx /dev/sdyy
>>> +
>>> +
>>> +Formatted device(s) can be started with the dmzadm utility, too:
>>> +
>>> +Ex::
>>> +
>>> +	dmzadm --start /dev/sdxx /dev/sdyy
>>
>> And same here, the zoned device must come first. I added a patch that internally
>> reverse that order for the dm start operation so that the regular device is
>> specified first.
>>
> See above. I've fixed up dmzadm for this.
> 
> I just hadn't pushed the patch as I wanted to get the kernel bits 
> settled. But now that we have I'll be pushing the dm-zoned-tools updates.

Please send changes on top of the "staging" branch. Your first batch of changes
is already merged in that branch.

> 
> Cheers,
> 
> Hannes
> 


-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 15/15] dm-zoned: metadata version 2
  2020-05-11  8:51       ` Damien Le Moal
@ 2020-05-11  9:18         ` Hannes Reinecke
  2020-05-11  9:20           ` Damien Le Moal
  0 siblings, 1 reply; 21+ messages in thread
From: Hannes Reinecke @ 2020-05-11  9:18 UTC (permalink / raw)
  To: Damien Le Moal, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 5/11/20 10:51 AM, Damien Le Moal wrote:
> On 2020/05/11 17:46, Hannes Reinecke wrote:
>> On 5/11/20 10:36 AM, Damien Le Moal wrote:
>>> On 2020/05/11 17:24, Hannes Reinecke wrote:
>>>> Implement handling for metadata version 2. The new metadata adds
>>>> a label and UUID for the device mapper device, and additional UUID
>>>> for the underlying block devices.
>>>> It also allows for an additional regular drive to be used for
>>>> emulating random access zones. The emulated zones will be placed
>>>> logically in front of the zones from the zoned block device, causing
>>>> the superblocks and metadata to be stored on that device.
>>>> The first zone of the original zoned device will be used to hold
>>>> another, tertiary copy of the metadata; this copy carries a
>>>> generation number of 0 and is never updated; it's just used
>>>> for identification.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>> Reviewed-by: Bob Liu <bob.liu@oracle.com>
>>>> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
>>>
>>> Forgot to read through the documentation update. A couple of comments added below.
>>>
>>>> ---
>>>>    .../admin-guide/device-mapper/dm-zoned.rst         |  34 ++-
>>>>    drivers/md/dm-zoned-metadata.c                     | 310 +++++++++++++++++----
>>>>    drivers/md/dm-zoned-target.c                       | 185 ++++++++----
>>>>    drivers/md/dm-zoned.h                              |   7 +-
>>>>    4 files changed, 427 insertions(+), 109 deletions(-)
>>>>
>>>> diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>>> index 7547ce635161..553752ea2521 100644
>>>> --- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>>> +++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>>> @@ -37,9 +37,13 @@ Algorithm
>>>>    dm-zoned implements an on-disk buffering scheme to handle non-sequential
>>>>    write accesses to the sequential zones of a zoned block device.
>>>>    Conventional zones are used for caching as well as for storing internal
>>>> -metadata.
>>>> +metadata. It can also use a regular block device together with the zoned
>>>> +block device; in that case the regular block device will be split logically
>>>> +in zones with the same size as the zoned block device. These zones will be
>>>> +placed in front of the zones from the zoned block device and will be handled
>>>> +just like conventional zones.
>>>>    
>>>> -The zones of the device are separated into 2 types:
>>>> +The zones of the device(s) are separated into 2 types:
>>>>    
>>>>    1) Metadata zones: these are conventional zones used to store metadata.
>>>>    Metadata zones are not reported as useable capacity to the user.
>>>> @@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
>>>>    discard requests. Read requests can be processed concurrently while
>>>>    metadata flush is being executed.
>>>>    
>>>> +If a regular device is used in conjunction with the zoned block device,
>>>> +a third set of metadata (without the zone bitmaps) is written to the
>>>> +start of the zoned block device. This metadata has a generation counter of
>>>> +'0' and will never be updated during normal operation; it just serves for
>>>> +identification purposes. The first and second copy of the metadata
>>>> +are located at the start of the regular block device.
>>>> +
>>>>    Usage
>>>>    =====
>>>>    
>>>> @@ -138,12 +149,21 @@ Ex::
>>>>    
>>>>    	dmzadm --format /dev/sdxx
>>>>    
>>>> -For a formatted device, the target can be created normally with the
>>>> -dmsetup utility. The only parameter that dm-zoned requires is the
>>>> -underlying zoned block device name. Ex::
>>>>    
>>>> -	echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
>>>> -	dmsetup create dmz-`basename ${dev}`
>>>> +If two drives are to be used, both devices must be specified, with the
>>>> +regular block device as the first device.
>>>
>>> Actually, the zoned block device must be first. Otherwise dmzadm complains. We
>>> can change that, or change the doc. Which do you prefer ? No strong opinion here.
>>>
>> Nope, not any more. Fixed it in my local repo (which I haven't pushed,
>> sorry).
>>
>> But after the last discussion we had I thought it better and more
>> consistent to have the regular device first, just like the device-mapper
>> interface.
> 
> Works for me !
> 

I do hope so :-)
I've spun a new version against the master branch.

>>
>>>> +
>>>> +Ex::
>>>> +
>>>> +	dmzadm --format /dev/sdxx /dev/sdyy
>>>> +
>>>> +
>>>> +Formatted device(s) can be started with the dmzadm utility, too:
>>>> +
>>>> +Ex::
>>>> +
>>>> +	dmzadm --start /dev/sdxx /dev/sdyy
>>>
>>> And same here, the zoned device must come first. I added a patch that internally
>>> reverse that order for the dm start operation so that the regular device is
>>> specified first.
>>>
>> See above. I've fixed up dmzadm for this.
>>
>> I just hadn't pushed the patch as I wanted to get the kernel bits
>> settled. But now that we have I'll be pushing the dm-zoned-tools updates.
> 
> Please send changes on top of the "staging" branch. Your first batch of changes
> is already merged in that branch.
> 
Rah. Sent a new merge request for 'master'; I will be doing an update to
the 'staging' branch, too.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer




* Re: [PATCH 15/15] dm-zoned: metadata version 2
  2020-05-11  9:18         ` Hannes Reinecke
@ 2020-05-11  9:20           ` Damien Le Moal
  0 siblings, 0 replies; 21+ messages in thread
From: Damien Le Moal @ 2020-05-11  9:20 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/05/11 18:19, Hannes Reinecke wrote:
> On 5/11/20 10:51 AM, Damien Le Moal wrote:
>> On 2020/05/11 17:46, Hannes Reinecke wrote:
>>> On 5/11/20 10:36 AM, Damien Le Moal wrote:
>>>> On 2020/05/11 17:24, Hannes Reinecke wrote:
>>>>> Implement handling for metadata version 2. The new metadata adds
>>>>> a label and UUID for the device mapper device, and additional UUID
>>>>> for the underlying block devices.
>>>>> It also allows for an additional regular drive to be used for
>>>>> emulating random access zones. The emulated zones will be placed
>>>>> logically in front of the zones from the zoned block device, causing
>>>>> the superblocks and metadata to be stored on that device.
>>>>> The first zone of the original zoned device will be used to hold
>>>>> another, tertiary copy of the metadata; this copy carries a
>>>>> generation number of 0 and is never updated; it's just used
>>>>> for identification.
>>>>>
>>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>>> Reviewed-by: Bob Liu <bob.liu@oracle.com>
>>>>> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
>>>>
>>>> Forgot to read through the documentation update. A couple of comments added below.
>>>>
>>>>> ---
>>>>>    .../admin-guide/device-mapper/dm-zoned.rst         |  34 ++-
>>>>>    drivers/md/dm-zoned-metadata.c                     | 310 +++++++++++++++++----
>>>>>    drivers/md/dm-zoned-target.c                       | 185 ++++++++----
>>>>>    drivers/md/dm-zoned.h                              |   7 +-
>>>>>    4 files changed, 427 insertions(+), 109 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>>>> index 7547ce635161..553752ea2521 100644
>>>>> --- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>>>> +++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
>>>>> @@ -37,9 +37,13 @@ Algorithm
>>>>>    dm-zoned implements an on-disk buffering scheme to handle non-sequential
>>>>>    write accesses to the sequential zones of a zoned block device.
>>>>>    Conventional zones are used for caching as well as for storing internal
>>>>> -metadata.
>>>>> +metadata. It can also use a regular block device together with the zoned
>>>>> +block device; in that case the regular block device will be split logically
>>>>> +in zones with the same size as the zoned block device. These zones will be
>>>>> +placed in front of the zones from the zoned block device and will be handled
>>>>> +just like conventional zones.
>>>>>    
>>>>> -The zones of the device are separated into 2 types:
>>>>> +The zones of the device(s) are separated into 2 types:
>>>>>    
>>>>>    1) Metadata zones: these are conventional zones used to store metadata.
>>>>>    Metadata zones are not reported as useable capacity to the user.
>>>>> @@ -127,6 +131,13 @@ resumed. Flushing metadata thus only temporarily delays write and
>>>>>    discard requests. Read requests can be processed concurrently while
>>>>>    metadata flush is being executed.
>>>>>    
>>>>> +If a regular device is used in conjunction with the zoned block device,
>>>>> +a third set of metadata (without the zone bitmaps) is written to the
>>>>> +start of the zoned block device. This metadata has a generation counter of
>>>>> +'0' and will never be updated during normal operation; it just serves for
>>>>> +identification purposes. The first and second copy of the metadata
>>>>> +are located at the start of the regular block device.
>>>>> +
>>>>>    Usage
>>>>>    =====
>>>>>    
>>>>> @@ -138,12 +149,21 @@ Ex::
>>>>>    
>>>>>    	dmzadm --format /dev/sdxx
>>>>>    
>>>>> -For a formatted device, the target can be created normally with the
>>>>> -dmsetup utility. The only parameter that dm-zoned requires is the
>>>>> -underlying zoned block device name. Ex::
>>>>>    
>>>>> -	echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
>>>>> -	dmsetup create dmz-`basename ${dev}`
>>>>> +If two drives are to be used, both devices must be specified, with the
>>>>> +regular block device as the first device.
>>>>
>>>> Actually, the zoned block device must be first. Otherwise dmzadm complains. We
>>>> can change that, or change the doc. Which do you prefer? No strong opinion here.
>>>>
>>> Nope, not any more. Fixed it in my local repo (which I haven't pushed,
>>> sorry).
>>>
>>> But after the last discussion we had I thought it better and more
>>> consistent to have the regular device first, just like the device-mapper
>>> interface.
>>
>> Works for me !
>>
> 
> I do hope so :-)
> I've spun a new version against the master branch.
> 
>>>
>>>>> +
>>>>> +Ex::
>>>>> +
>>>>> +	dmzadm --format /dev/sdxx /dev/sdyy
>>>>> +
>>>>> +
>>>>> +Formatted device(s) can be started with the dmzadm utility, too:
>>>>> +
>>>>> +Ex::
>>>>> +
>>>>> +	dmzadm --start /dev/sdxx /dev/sdyy
>>>>
>>>> And same here, the zoned device must come first. I added a patch that internally
>>>> reverses that order for the dm start operation so that the regular device is
>>>> specified first.
>>>>
>>> See above. I've fixed up dmzadm for this.
>>>
>>> I just hadn't pushed the patch as I wanted to get the kernel bits
>>> settled. But now that we have I'll be pushing the dm-zoned-tools updates.
>>
>> Please send changes on top of the "staging" branch. Your first batch of changes
>> is already merged in that branch.
>>
> Rah. Sent a new merge request for 'master'; will be doing an update to
> the 'staging' branch, too.

Don't bother, will respin the staging branch with the new PR.

> 
> Cheers,
> 
> Hannes
> 


-- 
Damien Le Moal
Western Digital Research


end of thread, other threads:[~2020-05-11  9:20 UTC | newest]

Thread overview: 21+ messages
2020-05-11  8:24 [PATCHv6 00/14] dm-zoned: metadata version 2 Hannes Reinecke
2020-05-11  8:24 ` [PATCH 01/15] dm-zoned: add 'status' callback Hannes Reinecke
2020-05-11  8:24 ` [PATCH 02/15] dm-zoned: add 'message' callback Hannes Reinecke
2020-05-11  8:24 ` [PATCH 03/15] dm-zoned: store zone id within the zone structure and kill dmz_id() Hannes Reinecke
2020-05-11  8:24 ` [PATCH 04/15] dm-zoned: use array for superblock zones Hannes Reinecke
2020-05-11  8:24 ` [PATCH 05/15] dm-zoned: store device in struct dmz_sb Hannes Reinecke
2020-05-11  8:24 ` [PATCH 06/15] dm-zoned: move fields from struct dmz_dev to dmz_metadata Hannes Reinecke
2020-05-11  8:24 ` [PATCH 07/15] dm-zoned: introduce dmz_metadata_label() to format device name Hannes Reinecke
2020-05-11  8:24 ` [PATCH 08/15] dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev() Hannes Reinecke
2020-05-11  8:24 ` [PATCH 09/15] dm-zoned: remove 'dev' argument from reclaim Hannes Reinecke
2020-05-11  8:24 ` [PATCH 10/15] dm-zoned: replace 'target' pointer in the bio context Hannes Reinecke
2020-05-11  8:24 ` [PATCH 11/15] dm-zoned: use dmz_zone_to_dev() when handling metadata I/O Hannes Reinecke
2020-05-11  8:24 ` [PATCH 12/15] dm-zoned: add metadata logging functions Hannes Reinecke
2020-05-11  8:24 ` [PATCH 13/15] dm-zoned: Reduce logging output on startup Hannes Reinecke
2020-05-11  8:24 ` [PATCH 14/15] dm-zoned: ignore metadata zone in dmz_alloc_zone() Hannes Reinecke
2020-05-11  8:24 ` [PATCH 15/15] dm-zoned: metadata version 2 Hannes Reinecke
2020-05-11  8:36   ` Damien Le Moal
2020-05-11  8:46     ` Hannes Reinecke
2020-05-11  8:51       ` Damien Le Moal
2020-05-11  9:18         ` Hannes Reinecke
2020-05-11  9:20           ` Damien Le Moal
