* [PATCHv4 00/13] dm-zoned: metadata version 2
@ 2020-04-20 10:08 Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 01/13] dm-zoned: add 'status' and 'message' callbacks Hannes Reinecke
                   ` (13 more replies)
  0 siblings, 14 replies; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Hi all,

this patchset adds a new metadata version 2 for dm-zoned, which brings the
following improvements:
- UUIDs and labels: Adding three more fields to the metadata containing
  the dm-zoned device UUID and label, and the device UUID. This allows
  for a unique identification of the devices, so that several dm-zoned
  sets can coexist and have a persistent identification.
- Extend random zones by an additional regular disk device: A regular
  block device can be added together with the zoned block device, providing
  additional (emulated) random write zones. With this it's possible to
  handle devices with sequential zones only; also there will be a speed-up if
  the regular block device resides on a fast medium. The regular block device
  is placed logically in front of the zoned block device, so that metadata
  and mapping tables reside on the regular block device, not the zoned device.
- Tertiary superblock support: In addition to the two existing sets of metadata
  another, tertiary, superblock is written to the first block of the zoned
  block device. This superblock is for identification only; the generation
  number is set to '0' and the block itself is never updated. The additional
  metadata like bitmap tables etc. are not copied.

To handle this, some changes to the original handling are introduced:
- Zones are now equidistant. Originally, runt zones were ignored, and
  not counted when sizing the mapping tables. With the dual device setup
  runt zones might occur at the end of the regular block device, making
  direct translation between zone number and sector/block number complex.
  For metadata version 2 all zones are considered to be of the same size,
  and runt zones are simply marked as 'offline' to have them ignored when
  allocating a new zone.
- The block number in the superblock is now the global number, and refers to
  the location of the superblock relative to the resulting device-mapper
  device. This means that the tertiary superblock contains absolute block
  addresses, which need to be translated to device-relative addresses
  to find the referenced block.
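
To illustrate the two points above, here is a minimal userspace sketch of
how equidistant zones and the global-to-relative translation could work.
It is not code from this patchset: 'struct layout', zone_start_block() and
global_to_dev_block() are hypothetical names, and the sketch assumes a
power-of-two zone size with the regular device occupying the first
nr_regular_zones zones (runt zone included) and the zoned device following
directly behind it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout description; not a structure from dm-zoned. */
struct layout {
	unsigned int zone_nr_blocks_shift;	/* log2(blocks per zone) */
	unsigned int nr_regular_zones;		/* zones on device 0, runt zone included */
};

/* With equidistant zones the start block is a plain shift of the zone id. */
static uint64_t zone_start_block(const struct layout *l, unsigned int zone_id)
{
	assert(l->zone_nr_blocks_shift < 64);
	return (uint64_t)zone_id << l->zone_nr_blocks_shift;
}

/*
 * Translate a global block number into a device index (0 = regular,
 * 1 = zoned) and a block number relative to the start of that device.
 */
static unsigned int global_to_dev_block(const struct layout *l,
					uint64_t block, uint64_t *rel)
{
	uint64_t zoned_start = zone_start_block(l, l->nr_regular_zones);

	if (block < zoned_start) {
		*rel = block;
		return 0;
	}
	*rel = block - zoned_start;
	return 1;
}
```

Because every zone, including a runt zone, occupies a full zone-sized slot
in the global numbering, the translation needs nothing more than a shift
and a subtraction.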

There is an accompanying patchset for dm-zoned-tools for writing and checking
this new metadata.

As usual, comments and reviews are welcome.

Changes to v3:
- Reorder devices such that the regular device is always at position 0,
  and the zoned device is always at position 1.
- Split off dmz_dev_is_dying() into a separate patch
- Include reviews from Damien

Changes to v2:
- Kill dmz_id()
- Include reviews from Damien
- Sanitize uuid handling as suggested by John Dorminy


Hannes Reinecke (13):
  dm-zoned: add 'status' and 'message' callbacks
  dm-zoned: store zone id within the zone structure and kill dmz_id()
  dm-zoned: use array for superblock zones
  dm-zoned: store device in struct dmz_sb
  dm-zoned: move fields from struct dmz_dev to dmz_metadata
  dm-zoned: introduce dmz_metadata_label() to format device name
  dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev()
  dm-zoned: remove 'dev' argument from reclaim
  dm-zoned: replace 'target' pointer in the bio context
  dm-zoned: use dmz_zone_to_dev() when handling metadata I/O
  dm-zoned: add metadata logging functions
  dm-zoned: ignore metadata zone in dmz_alloc_zone()
  dm-zoned: metadata version 2

 drivers/md/dm-zoned-metadata.c | 658 +++++++++++++++++++++++++++++++----------
 drivers/md/dm-zoned-reclaim.c  |  88 +++---
 drivers/md/dm-zoned-target.c   | 331 +++++++++++++--------
 drivers/md/dm-zoned.h          |  33 ++-
 4 files changed, 780 insertions(+), 330 deletions(-)

-- 
2.16.4


* [PATCH 01/13] dm-zoned: add 'status' and 'message' callbacks
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-28  9:19   ` Damien Le Moal
  2020-04-20 10:08 ` [PATCH 02/13] dm-zoned: store zone id within the zone structure and kill dmz_id() Hannes Reinecke
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Add callbacks to supply information for 'dmsetup status'
and 'dmsetup info', and implement the message 'reclaim'
to start the reclaim worker.
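
As a rough userspace illustration of the message handling (a sketch only,
not the kernel code): handle_message() and the reclaim_requested flag are
hypothetical stand-ins for dmz_message() and dmz_schedule_reclaim(), and
the argc check is an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <strings.h>

/*
 * Dispatch a target message. Only "reclaim" is recognised, and it is
 * matched case-insensitively. On success the flag is set in place of
 * scheduling the reclaim worker.
 */
static int handle_message(unsigned int argc, char **argv,
			  int *reclaim_requested)
{
	assert(reclaim_requested);

	if (argc < 1)
		return -EINVAL;

	if (!strcasecmp(argv[0], "reclaim")) {
		*reclaim_requested = 1;
		return 0;
	}

	return -EINVAL;
}
```

From userspace such a message would typically be sent with
'dmsetup message <device> 0 reclaim'.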

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 15 +++++++++++++++
 drivers/md/dm-zoned-target.c   | 43 ++++++++++++++++++++++++++++++++++++++++++
 drivers/md/dm-zoned.h          |  3 +++
 3 files changed, 61 insertions(+)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 369de15c4e80..c8787560fa9f 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -202,6 +202,11 @@ sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_blocks_shift;
 }
 
+unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
+{
+	return zmd->dev->nr_zones;
+}
+
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd)
 {
 	return zmd->nr_chunks;
@@ -217,6 +222,16 @@ unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd)
 	return atomic_read(&zmd->unmap_nr_rnd);
 }
 
+unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd)
+{
+	return zmd->nr_seq;
+}
+
+unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
+{
+	return atomic_read(&zmd->unmap_nr_seq);
+}
+
 /*
  * Lock/unlock mapping table.
  * The map lock also protects all the zone lists.
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index f4f83d39b3dc..44e30a7de8b9 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -965,6 +965,47 @@ static int dmz_iterate_devices(struct dm_target *ti,
 	return fn(ti, dmz->ddev, 0, capacity, data);
 }
 
+static void dmz_status(struct dm_target *ti, status_type_t type,
+		       unsigned int status_flags, char *result,
+		       unsigned int maxlen)
+{
+	struct dmz_target *dmz = ti->private;
+	ssize_t sz = 0;
+	char buf[BDEVNAME_SIZE];
+
+	switch (type) {
+	case STATUSTYPE_INFO:
+		DMEMIT("%u zones "
+		       "%u/%u random "
+		       "%u/%u sequential",
+		       dmz_nr_zones(dmz->metadata),
+		       dmz_nr_unmap_rnd_zones(dmz->metadata),
+		       dmz_nr_rnd_zones(dmz->metadata),
+		       dmz_nr_unmap_seq_zones(dmz->metadata),
+		       dmz_nr_seq_zones(dmz->metadata));
+		break;
+	case STATUSTYPE_TABLE:
+		format_dev_t(buf, dmz->dev->bdev->bd_dev);
+		DMEMIT("%s ", buf);
+		break;
+	}
+	return;
+}
+
+static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
+		       char *result, unsigned int maxlen)
+{
+	struct dmz_target *dmz = ti->private;
+	int r = -EINVAL;
+
+	if (!strcasecmp(argv[0], "reclaim")) {
+		dmz_schedule_reclaim(dmz->reclaim);
+		r = 0;
+	} else
+		DMERR("unrecognized message %s", argv[0]);
+	return r;
+}
+
 static struct target_type dmz_type = {
 	.name		 = "zoned",
 	.version	 = {1, 1, 0},
@@ -978,6 +1019,8 @@ static struct target_type dmz_type = {
 	.postsuspend	 = dmz_suspend,
 	.resume		 = dmz_resume,
 	.iterate_devices = dmz_iterate_devices,
+	.status		 = dmz_status,
+	.message	 = dmz_message,
 };
 
 static int __init dmz_init(void)
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 5b5e493d479c..884c0e586082 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -190,8 +190,11 @@ void dmz_free_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
 void dmz_map_zone(struct dmz_metadata *zmd, struct dm_zone *zone,
 		  unsigned int chunk);
 void dmz_unmap_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
+unsigned int dmz_nr_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd);
+unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd);
+unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd);
 
 /*
  * Activate a zone (increment its reference count).
-- 
2.16.4


* [PATCH 02/13] dm-zoned: store zone id within the zone structure and kill dmz_id()
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 01/13] dm-zoned: add 'status' and 'message' callbacks Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-28  9:35   ` Damien Le Moal
  2020-04-20 10:08 ` [PATCH 03/13] dm-zoned: use array for superblock zones Hannes Reinecke
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Instead of calculating the zone index from the offset within the
zone array, store the index within the structure itself. With that
the helper dmz_id() is pointless and can be replaced by accessing
the ->id value directly.
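
The difference between the two approaches can be shown with a reduced
userspace sketch. 'struct zone', zone_id_by_offset() and init_zones() are
hypothetical stand-ins for struct dm_zone, dmz_id() and the zone
initialisation, not the kernel code itself.

```c
#include <assert.h>

/* Reduced stand-in for struct dm_zone; only the field needed here. */
struct zone {
	unsigned int id;	/* index of this zone, set once at init time */
};

/* Old approach: derive the index from the position in the zone array. */
static unsigned int zone_id_by_offset(const struct zone *zones,
				      const struct zone *zone)
{
	assert(zone >= zones);
	return (unsigned int)(zone - zones);
}

/* New approach: record the index while the zones are initialised. */
static void init_zones(struct zone *zones, unsigned int nr_zones)
{
	for (unsigned int i = 0; i < nr_zones; i++)
		zones[i].id = i;
}
```

Both yield the same number for a zone inside the array, but reading the
stored id no longer needs the base pointer of the array, so callers do
not have to pass the metadata around just to print a zone number.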

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 40 +++++++++++++++++-----------------------
 drivers/md/dm-zoned-reclaim.c  | 17 ++++++++---------
 drivers/md/dm-zoned-target.c   |  6 +++---
 drivers/md/dm-zoned.h          |  4 +++-
 4 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index c8787560fa9f..1993eeb26bc1 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -187,19 +187,14 @@ struct dmz_metadata {
 /*
  * Various accessors
  */
-unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone)
-{
-	return ((unsigned int)(zone - zmd->zones));
-}
-
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_sectors_shift;
+	return (sector_t)zone->id << zmd->dev->zone_nr_sectors_shift;
 }
 
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_blocks_shift;
+	return (sector_t)zone->id << zmd->dev->zone_nr_blocks_shift;
 }
 
 unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
@@ -1119,6 +1114,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 
 	INIT_LIST_HEAD(&zone->link);
 	atomic_set(&zone->refcount, 0);
+	zone->id = idx;
 	zone->chunk = DMZ_MAP_UNMAPPED;
 
 	switch (blkz->type) {
@@ -1246,7 +1242,7 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 		ret = -EIO;
 	if (ret < 0) {
 		dmz_dev_err(zmd->dev, "Get zone %u report failed",
-			    dmz_id(zmd, zone));
+			    zone->id);
 		dmz_check_bdev(zmd->dev);
 		return ret;
 	}
@@ -1270,7 +1266,7 @@ static int dmz_handle_seq_write_err(struct dmz_metadata *zmd,
 		return ret;
 
 	dmz_dev_warn(zmd->dev, "Processing zone %u write error (zone wp %u/%u)",
-		     dmz_id(zmd, zone), zone->wp_block, wp);
+		     zone->id, zone->wp_block, wp);
 
 	if (zone->wp_block < wp) {
 		dmz_invalidate_blocks(zmd, zone, zone->wp_block,
@@ -1309,7 +1305,7 @@ static int dmz_reset_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 				       dev->zone_nr_sectors, GFP_NOIO);
 		if (ret) {
 			dmz_dev_err(dev, "Reset zone %u failed %d",
-				    dmz_id(zmd, zone), ret);
+				    zone->id, ret);
 			return ret;
 		}
 	}
@@ -1757,8 +1753,7 @@ struct dm_zone *dmz_get_chunk_buffer(struct dmz_metadata *zmd,
 	}
 
 	/* Update the chunk mapping */
-	dmz_set_chunk_mapping(zmd, dzone->chunk, dmz_id(zmd, dzone),
-			      dmz_id(zmd, bzone));
+	dmz_set_chunk_mapping(zmd, dzone->chunk, dzone->id, bzone->id);
 
 	set_bit(DMZ_BUF, &bzone->flags);
 	bzone->chunk = dzone->chunk;
@@ -1810,7 +1805,7 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags)
 		atomic_dec(&zmd->unmap_nr_seq);
 
 	if (dmz_is_offline(zone)) {
-		dmz_dev_warn(zmd->dev, "Zone %u is offline", dmz_id(zmd, zone));
+		dmz_dev_warn(zmd->dev, "Zone %u is offline", zone->id);
 		zone = NULL;
 		goto again;
 	}
@@ -1852,7 +1847,7 @@ void dmz_map_zone(struct dmz_metadata *zmd, struct dm_zone *dzone,
 		  unsigned int chunk)
 {
 	/* Set the chunk mapping */
-	dmz_set_chunk_mapping(zmd, chunk, dmz_id(zmd, dzone),
+	dmz_set_chunk_mapping(zmd, chunk, dzone->id,
 			      DMZ_MAP_UNMAPPED);
 	dzone->chunk = chunk;
 	if (dmz_is_rnd(dzone))
@@ -1880,7 +1875,7 @@ void dmz_unmap_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 		 * Unmapping the chunk buffer zone: clear only
 		 * the chunk buffer mapping
 		 */
-		dzone_id = dmz_id(zmd, zone->bzone);
+		dzone_id = zone->bzone->id;
 		zone->bzone->bzone = NULL;
 		zone->bzone = NULL;
 
@@ -1942,7 +1937,7 @@ static struct dmz_mblock *dmz_get_bitmap(struct dmz_metadata *zmd,
 					 sector_t chunk_block)
 {
 	sector_t bitmap_block = 1 + zmd->nr_map_blocks +
-		(sector_t)(dmz_id(zmd, zone) * zmd->zone_nr_bitmap_blocks) +
+		(sector_t)(zone->id * zmd->zone_nr_bitmap_blocks) +
 		(chunk_block >> DMZ_BLOCK_SHIFT_BITS);
 
 	return dmz_get_mblock(zmd, bitmap_block);
@@ -2022,7 +2017,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	unsigned int n = 0;
 
 	dmz_dev_debug(zmd->dev, "=> VALIDATE zone %u, block %llu, %u blocks",
-		      dmz_id(zmd, zone), (unsigned long long)chunk_block,
+		      zone->id, (unsigned long long)chunk_block,
 		      nr_blocks);
 
 	WARN_ON(chunk_block + nr_blocks > zone_nr_blocks);
@@ -2052,7 +2047,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 		zone->weight += n;
 	else {
 		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be <= %u",
-			     dmz_id(zmd, zone), zone->weight,
+			     zone->id, zone->weight,
 			     zone_nr_blocks - n);
 		zone->weight = zone_nr_blocks;
 	}
@@ -2102,7 +2097,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	unsigned int n = 0;
 
 	dmz_dev_debug(zmd->dev, "=> INVALIDATE zone %u, block %llu, %u blocks",
-		      dmz_id(zmd, zone), (u64)chunk_block, nr_blocks);
+		      zone->id, (u64)chunk_block, nr_blocks);
 
 	WARN_ON(chunk_block + nr_blocks > zmd->dev->zone_nr_blocks);
 
@@ -2132,7 +2127,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 		zone->weight -= n;
 	else {
 		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be >= %u",
-			     dmz_id(zmd, zone), zone->weight, n);
+			     zone->id, zone->weight, n);
 		zone->weight = 0;
 	}
 
@@ -2378,7 +2373,7 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 {
 	struct dmz_metadata *zmd;
-	unsigned int i, zid;
+	unsigned int i;
 	struct dm_zone *zone;
 	int ret;
 
@@ -2419,9 +2414,8 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 		goto err;
 
 	/* Set metadata zones starting from sb_zone */
-	zid = dmz_id(zmd, zmd->sb_zone);
 	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
-		zone = dmz_get(zmd, zid + i);
+		zone = dmz_get(zmd, zmd->sb_zone->id + i);
 		if (!dmz_is_rnd(zone))
 			goto err;
 		set_bit(DMZ_META, &zone->flags);
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index e7ace908a9b7..7f57c4299a2f 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -80,7 +80,7 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
 	if (ret) {
 		dmz_dev_err(zrc->dev,
 			    "Align zone %u wp %llu to %llu (wp+%u) blocks failed %d",
-			    dmz_id(zmd, zone), (unsigned long long)wp_block,
+			    zone->id, (unsigned long long)wp_block,
 			    (unsigned long long)block, nr_blocks, ret);
 		dmz_check_bdev(zrc->dev);
 		return ret;
@@ -196,8 +196,8 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 
 	dmz_dev_debug(zrc->dev,
 		      "Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
-		      dzone->chunk, dmz_id(zmd, bzone), dmz_weight(bzone),
-		      dmz_id(zmd, dzone), dmz_weight(dzone));
+		      dzone->chunk, bzone->id, dmz_weight(bzone),
+		      dzone->id, dmz_weight(dzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, bzone, dzone);
@@ -235,8 +235,8 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 
 	dmz_dev_debug(zrc->dev,
 		      "Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
-		      chunk, dmz_id(zmd, dzone), dmz_weight(dzone),
-		      dmz_id(zmd, bzone), dmz_weight(bzone));
+		      chunk, dzone->id, dmz_weight(dzone),
+		      bzone->id, dmz_weight(bzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, dzone, bzone);
@@ -287,8 +287,7 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 
 	dmz_dev_debug(zrc->dev,
 		      "Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
-		      chunk, dmz_id(zmd, dzone), dmz_weight(dzone),
-		      dmz_id(zmd, szone));
+		      chunk, dzone->id, dmz_weight(dzone), szone->id);
 
 	/* Flush the random data zone into the sequential zone */
 	ret = dmz_reclaim_copy(zrc, dzone, szone);
@@ -403,12 +402,12 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 	if (ret) {
 		dmz_dev_debug(zrc->dev,
 			      "Metadata flush for zone %u failed, err %d\n",
-			      dmz_id(zmd, rzone), ret);
+			      rzone->id, ret);
 		return ret;
 	}
 
 	dmz_dev_debug(zrc->dev, "Reclaimed zone %u in %u ms",
-		      dmz_id(zmd, rzone), jiffies_to_msecs(jiffies - start));
+		      rzone->id, jiffies_to_msecs(jiffies - start));
 	return 0;
 }
 
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 44e30a7de8b9..7268e0af9e17 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -180,7 +180,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 	dmz_dev_debug(dmz->dev, "READ chunk %llu -> %s zone %u, block %llu, %u blocks",
 		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      dmz_id(dmz->metadata, zone),
+		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
 	/* Check block validity to determine the read location */
@@ -317,7 +317,7 @@ static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
 	dmz_dev_debug(dmz->dev, "WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
 		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      dmz_id(dmz->metadata, zone),
+		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
 	if (dmz_is_rnd(zone) || chunk_block == zone->wp_block) {
@@ -357,7 +357,7 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 
 	dmz_dev_debug(dmz->dev, "DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
 		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
-		      dmz_id(zmd, zone),
+		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
 	/*
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 884c0e586082..30781646741a 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -87,6 +87,9 @@ struct dm_zone {
 	/* Zone activation reference count */
 	atomic_t		refcount;
 
+	/* Zone id */
+	unsigned int		id;
+
 	/* Zone write pointer block (relative to the zone start block) */
 	unsigned int		wp_block;
 
@@ -176,7 +179,6 @@ void dmz_lock_flush(struct dmz_metadata *zmd);
 void dmz_unlock_flush(struct dmz_metadata *zmd);
 int dmz_flush_metadata(struct dmz_metadata *zmd);
 
-unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
-- 
2.16.4


* [PATCH 03/13] dm-zoned: use array for superblock zones
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 01/13] dm-zoned: add 'status' and 'message' callbacks Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 02/13] dm-zoned: store zone id within the zone structure and kill dmz_id() Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 04/13] dm-zoned: store device in struct dmz_sb Hannes Reinecke
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Instead of storing just the first superblock zone and calculating
the secondary relative to that, use an array for holding the
superblock zones.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 41 +++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 1993eeb26bc1..900b1c1224f5 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -124,6 +124,7 @@ struct dmz_sb {
 	sector_t		block;
 	struct dmz_mblock	*mblk;
 	struct dmz_super	*sb;
+	struct dm_zone		*zone;
 };
 
 /*
@@ -150,7 +151,6 @@ struct dmz_metadata {
 	/* Zone information array */
 	struct dm_zone		*zones;
 
-	struct dm_zone		*sb_zone;
 	struct dmz_sb		sb[2];
 	unsigned int		mblk_primary;
 	u64			sb_gen;
@@ -839,8 +839,9 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 /*
  * Check super block.
  */
-static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
+static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 {
+	struct dmz_super *sb = zmd->sb[set].sb;
 	unsigned int nr_meta_zones, nr_data_zones;
 	struct dmz_dev *dev = zmd->dev;
 	u32 crc, stored_crc;
@@ -932,16 +933,20 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 
 	/* Bad first super block: search for the second one */
 	zmd->sb[1].block = zmd->sb[0].block + zone_nr_blocks;
+	zmd->sb[1].zone = zmd->sb[0].zone + 1;
 	for (i = 0; i < zmd->nr_rnd_zones - 1; i++) {
 		if (dmz_read_sb(zmd, 1) != 0)
 			break;
-		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC)
+		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC) {
+			zmd->sb[1].zone += i;
 			return 0;
+		}
 		zmd->sb[1].block += zone_nr_blocks;
 	}
 
 	dmz_free_mblock(zmd, mblk);
 	zmd->sb[1].mblk = NULL;
+	zmd->sb[1].zone = NULL;
 
 	return -EIO;
 }
@@ -985,11 +990,9 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
 	dmz_dev_warn(zmd->dev, "Metadata set %u invalid: recovering", dst_set);
 
 	if (dst_set == 0)
-		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
-	else {
-		zmd->sb[1].block = zmd->sb[0].block +
-			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
-	}
+		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
+	else
+		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
 
 	page = alloc_page(GFP_NOIO);
 	if (!page)
@@ -1033,21 +1036,27 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 	u64 sb_gen[2] = {0, 0};
 	int ret;
 
+	if (!zmd->sb[0].zone) {
+		dmz_dev_err(zmd->dev, "Primary super block zone not set");
+		return -ENXIO;
+	}
+
 	/* Read and check the primary super block */
-	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
+	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
 	ret = dmz_get_sb(zmd, 0);
 	if (ret) {
 		dmz_dev_err(zmd->dev, "Read primary super block failed");
 		return ret;
 	}
 
-	ret = dmz_check_sb(zmd, zmd->sb[0].sb);
+	ret = dmz_check_sb(zmd, 0);
 
 	/* Read and check secondary super block */
 	if (ret == 0) {
 		sb_good[0] = true;
-		zmd->sb[1].block = zmd->sb[0].block +
-			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
+		if (!zmd->sb[1].zone)
+			zmd->sb[1].zone = zmd->sb[0].zone + zmd->nr_meta_zones;
+		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
 		ret = dmz_get_sb(zmd, 1);
 	} else
 		ret = dmz_lookup_secondary_sb(zmd);
@@ -1057,7 +1066,7 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		return ret;
 	}
 
-	ret = dmz_check_sb(zmd, zmd->sb[1].sb);
+	ret = dmz_check_sb(zmd, 1);
 	if (ret == 0)
 		sb_good[1] = true;
 
@@ -1142,9 +1151,9 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 		zmd->nr_useable_zones++;
 		if (dmz_is_rnd(zone)) {
 			zmd->nr_rnd_zones++;
-			if (!zmd->sb_zone) {
+			if (!zmd->sb[0].zone) {
 				/* Super block zone */
-				zmd->sb_zone = zone;
+				zmd->sb[0].zone = zone;
 			}
 		}
 	}
@@ -2415,7 +2424,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 
 	/* Set metadata zones starting from sb_zone */
 	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
-		zone = dmz_get(zmd, zmd->sb_zone->id + i);
+		zone = dmz_get(zmd, zmd->sb[0].zone->id + i);
 		if (!dmz_is_rnd(zone))
 			goto err;
 		set_bit(DMZ_META, &zone->flags);
-- 
2.16.4


* [PATCH 04/13] dm-zoned: store device in struct dmz_sb
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (2 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 03/13] dm-zoned: use array for superblock zones Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 05/13] dm-zoned: move fields from struct dmz_dev to dmz_metadata Hannes Reinecke
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Store the device together with the superblock so that
we don't have to resort to the metadata to find it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 90 +++++++++++++++++++++++++++---------------
 1 file changed, 59 insertions(+), 31 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 900b1c1224f5..def836e12dd9 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -122,6 +122,7 @@ enum {
  */
 struct dmz_sb {
 	sector_t		block;
+	struct dmz_dev		*dev;
 	struct dmz_mblock	*mblk;
 	struct dmz_super	*sb;
 	struct dm_zone		*zone;
@@ -197,6 +198,11 @@ sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 	return (sector_t)zone->id << zmd->dev->zone_nr_blocks_shift;
 }
 
+struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
+{
+	return &zmd->dev[0];
+}
+
 unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
 {
 	return zmd->dev->nr_zones;
@@ -412,9 +418,10 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct dmz_metadata *zmd,
 {
 	struct dmz_mblock *mblk, *m;
 	sector_t block = zmd->sb[zmd->mblk_primary].block + mblk_no;
+	struct dmz_dev *dev = zmd->sb[zmd->mblk_primary].dev;
 	struct bio *bio;
 
-	if (dmz_bdev_is_dying(zmd->dev))
+	if (dmz_bdev_is_dying(dev))
 		return ERR_PTR(-EIO);
 
 	/* Get a new block and a BIO to read it */
@@ -450,7 +457,7 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct dmz_metadata *zmd,
 
 	/* Submit read BIO */
 	bio->bi_iter.bi_sector = dmz_blk2sect(block);
-	bio_set_dev(bio, zmd->dev->bdev);
+	bio_set_dev(bio, dev->bdev);
 	bio->bi_private = mblk;
 	bio->bi_end_io = dmz_mblock_bio_end_io;
 	bio_set_op_attrs(bio, REQ_OP_READ, REQ_META | REQ_PRIO);
@@ -547,6 +554,7 @@ static struct dmz_mblock *dmz_get_mblock(struct dmz_metadata *zmd,
 					 sector_t mblk_no)
 {
 	struct dmz_mblock *mblk;
+	struct dmz_dev *dev = zmd->sb[zmd->mblk_primary].dev;
 
 	/* Check rbtree */
 	spin_lock(&zmd->mblk_lock);
@@ -565,7 +573,7 @@ static struct dmz_mblock *dmz_get_mblock(struct dmz_metadata *zmd,
 		       TASK_UNINTERRUPTIBLE);
 	if (test_bit(DMZ_META_ERROR, &mblk->state)) {
 		dmz_release_mblock(zmd, mblk);
-		dmz_check_bdev(zmd->dev);
+		dmz_check_bdev(dev);
 		return ERR_PTR(-EIO);
 	}
 
@@ -589,10 +597,11 @@ static void dmz_dirty_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk)
 static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
 			    unsigned int set)
 {
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	sector_t block = zmd->sb[set].block + mblk->no;
 	struct bio *bio;
 
-	if (dmz_bdev_is_dying(zmd->dev))
+	if (dmz_bdev_is_dying(dev))
 		return -EIO;
 
 	bio = bio_alloc(GFP_NOIO, 1);
@@ -604,7 +613,7 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
 	set_bit(DMZ_META_WRITING, &mblk->state);
 
 	bio->bi_iter.bi_sector = dmz_blk2sect(block);
-	bio_set_dev(bio, zmd->dev->bdev);
+	bio_set_dev(bio, dev->bdev);
 	bio->bi_private = mblk;
 	bio->bi_end_io = dmz_mblock_bio_end_io;
 	bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_META | REQ_PRIO);
@@ -617,13 +626,13 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
 /*
  * Read/write a metadata block.
  */
-static int dmz_rdwr_block(struct dmz_metadata *zmd, int op, sector_t block,
-			  struct page *page)
+static int dmz_rdwr_block(struct dmz_dev *dev, int op,
+			  sector_t block, struct page *page)
 {
 	struct bio *bio;
 	int ret;
 
-	if (dmz_bdev_is_dying(zmd->dev))
+	if (dmz_bdev_is_dying(dev))
 		return -EIO;
 
 	bio = bio_alloc(GFP_NOIO, 1);
@@ -631,14 +640,14 @@ static int dmz_rdwr_block(struct dmz_metadata *zmd, int op, sector_t block,
 		return -ENOMEM;
 
 	bio->bi_iter.bi_sector = dmz_blk2sect(block);
-	bio_set_dev(bio, zmd->dev->bdev);
+	bio_set_dev(bio, dev->bdev);
 	bio_set_op_attrs(bio, op, REQ_SYNC | REQ_META | REQ_PRIO);
 	bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0);
 	ret = submit_bio_wait(bio);
 	bio_put(bio);
 
 	if (ret)
-		dmz_check_bdev(zmd->dev);
+		dmz_check_bdev(dev);
 	return ret;
 }
 
@@ -650,6 +659,7 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	sector_t block = zmd->sb[set].block;
 	struct dmz_mblock *mblk = zmd->sb[set].mblk;
 	struct dmz_super *sb = zmd->sb[set].sb;
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	u64 sb_gen = zmd->sb_gen + 1;
 	int ret;
 
@@ -669,9 +679,9 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	sb->crc = 0;
 	sb->crc = cpu_to_le32(crc32_le(sb_gen, (unsigned char *)sb, DMZ_BLOCK_SIZE));
 
-	ret = dmz_rdwr_block(zmd, REQ_OP_WRITE, block, mblk->page);
+	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, block, mblk->page);
 	if (ret == 0)
-		ret = blkdev_issue_flush(zmd->dev->bdev, GFP_NOIO, NULL);
+		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 
 	return ret;
 }
@@ -684,6 +694,7 @@ static int dmz_write_dirty_mblocks(struct dmz_metadata *zmd,
 				   unsigned int set)
 {
 	struct dmz_mblock *mblk;
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	struct blk_plug plug;
 	int ret = 0, nr_mblks_submitted = 0;
 
@@ -705,7 +716,7 @@ static int dmz_write_dirty_mblocks(struct dmz_metadata *zmd,
 			       TASK_UNINTERRUPTIBLE);
 		if (test_bit(DMZ_META_ERROR, &mblk->state)) {
 			clear_bit(DMZ_META_ERROR, &mblk->state);
-			dmz_check_bdev(zmd->dev);
+			dmz_check_bdev(dev);
 			ret = -EIO;
 		}
 		nr_mblks_submitted--;
@@ -713,7 +724,7 @@ static int dmz_write_dirty_mblocks(struct dmz_metadata *zmd,
 
 	/* Flush drive cache (this will also sync data) */
 	if (ret == 0)
-		ret = blkdev_issue_flush(zmd->dev->bdev, GFP_NOIO, NULL);
+		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 
 	return ret;
 }
@@ -750,6 +761,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 {
 	struct dmz_mblock *mblk;
 	struct list_head write_list;
+	struct dmz_dev *dev;
 	int ret;
 
 	if (WARN_ON(!zmd))
@@ -763,6 +775,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	 * from modifying metadata.
 	 */
 	down_write(&zmd->mblk_sem);
+	dev = zmd->sb[zmd->mblk_primary].dev;
 
 	/*
 	 * This is called from the target flush work and reclaim work.
@@ -770,7 +783,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	 */
 	dmz_lock_flush(zmd);
 
-	if (dmz_bdev_is_dying(zmd->dev)) {
+	if (dmz_bdev_is_dying(dev)) {
 		ret = -EIO;
 		goto out;
 	}
@@ -782,7 +795,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 
 	/* If there are no dirty metadata blocks, just flush the device cache */
 	if (list_empty(&write_list)) {
-		ret = blkdev_issue_flush(zmd->dev->bdev, GFP_NOIO, NULL);
+		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 		goto err;
 	}
 
@@ -831,7 +844,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 		list_splice(&write_list, &zmd->mblk_dirty_list);
 		spin_unlock(&zmd->mblk_lock);
 	}
-	if (!dmz_check_bdev(zmd->dev))
+	if (!dmz_check_bdev(dev))
 		ret = -EIO;
 	goto out;
 }
@@ -842,8 +855,8 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 {
 	struct dmz_super *sb = zmd->sb[set].sb;
+	struct dmz_dev *dev = zmd->sb[set].dev;
 	unsigned int nr_meta_zones, nr_data_zones;
-	struct dmz_dev *dev = zmd->dev;
 	u32 crc, stored_crc;
 	u64 gen;
 
@@ -908,8 +921,8 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
  */
 static int dmz_read_sb(struct dmz_metadata *zmd, unsigned int set)
 {
-	return dmz_rdwr_block(zmd, REQ_OP_READ, zmd->sb[set].block,
-			      zmd->sb[set].mblk->page);
+	return dmz_rdwr_block(zmd->sb[set].dev, REQ_OP_READ,
+			      zmd->sb[set].block, zmd->sb[set].mblk->page);
 }
 
 /*
@@ -934,6 +947,7 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 	/* Bad first super block: search for the second one */
 	zmd->sb[1].block = zmd->sb[0].block + zone_nr_blocks;
 	zmd->sb[1].zone = zmd->sb[0].zone + 1;
+	zmd->sb[1].dev = dmz_zone_to_dev(zmd, zmd->sb[1].zone);
 	for (i = 0; i < zmd->nr_rnd_zones - 1; i++) {
 		if (dmz_read_sb(zmd, 1) != 0)
 			break;
@@ -942,11 +956,13 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 			return 0;
 		}
 		zmd->sb[1].block += zone_nr_blocks;
+		zmd->sb[1].dev = dmz_zone_to_dev(zmd, zmd->sb[1].zone + i);
 	}
 
 	dmz_free_mblock(zmd, mblk);
 	zmd->sb[1].mblk = NULL;
 	zmd->sb[1].zone = NULL;
+	zmd->sb[1].dev = NULL;
 
 	return -EIO;
 }
@@ -987,7 +1003,8 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
 	struct page *page;
 	int i, ret;
 
-	dmz_dev_warn(zmd->dev, "Metadata set %u invalid: recovering", dst_set);
+	dmz_dev_warn(zmd->sb[dst_set].dev,
+		     "Metadata set %u invalid: recovering", dst_set);
 
 	if (dst_set == 0)
 		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
@@ -1000,11 +1017,11 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
 
 	/* Copy metadata blocks */
 	for (i = 1; i < zmd->nr_meta_blocks; i++) {
-		ret = dmz_rdwr_block(zmd, REQ_OP_READ,
+		ret = dmz_rdwr_block(zmd->sb[src_set].dev, REQ_OP_READ,
 				     zmd->sb[src_set].block + i, page);
 		if (ret)
 			goto out;
-		ret = dmz_rdwr_block(zmd, REQ_OP_WRITE,
+		ret = dmz_rdwr_block(zmd->sb[dst_set].dev, REQ_OP_WRITE,
 				     zmd->sb[dst_set].block + i, page);
 		if (ret)
 			goto out;
@@ -1043,9 +1060,10 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 
 	/* Read and check the primary super block */
 	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
+	zmd->sb[0].dev = dmz_zone_to_dev(zmd, zmd->sb[0].zone);
 	ret = dmz_get_sb(zmd, 0);
 	if (ret) {
-		dmz_dev_err(zmd->dev, "Read primary super block failed");
+		dmz_dev_err(zmd->sb[0].dev, "Read primary super block failed");
 		return ret;
 	}
 
@@ -1057,12 +1075,13 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		if (!zmd->sb[1].zone)
 			zmd->sb[1].zone = zmd->sb[0].zone + zmd->nr_meta_zones;
 		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
+		zmd->sb[1].dev = dmz_zone_to_dev(zmd, zmd->sb[1].zone);
 		ret = dmz_get_sb(zmd, 1);
 	} else
 		ret = dmz_lookup_secondary_sb(zmd);
 
 	if (ret) {
-		dmz_dev_err(zmd->dev, "Read secondary super block failed");
+		dmz_dev_err(zmd->sb[1].dev, "Read secondary super block failed");
 		return ret;
 	}
 
@@ -1078,17 +1097,25 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 
 	if (sb_good[0])
 		sb_gen[0] = le64_to_cpu(zmd->sb[0].sb->gen);
-	else
+	else {
 		ret = dmz_recover_mblocks(zmd, 0);
+		if (ret) {
+			dmz_dev_err(zmd->sb[0].dev,
+				    "Recovery of superblock 0 failed");
+			return -EIO;
+		}
+	}
 
 	if (sb_good[1])
 		sb_gen[1] = le64_to_cpu(zmd->sb[1].sb->gen);
-	else
+	else {
 		ret = dmz_recover_mblocks(zmd, 1);
 
-	if (ret) {
-		dmz_dev_err(zmd->dev, "Recovery failed");
-		return -EIO;
+		if (ret) {
+			dmz_dev_err(zmd->sb[1].dev,
+				    "Recovery of superblock 1 failed");
+			return -EIO;
+		}
 	}
 
 	if (sb_gen[0] >= sb_gen[1]) {
@@ -1099,7 +1126,8 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		zmd->mblk_primary = 1;
 	}
 
-	dmz_dev_debug(zmd->dev, "Using super block %u (gen %llu)",
+	dmz_dev_debug(zmd->sb[zmd->mblk_primary].dev,
+		      "Using super block %u (gen %llu)",
 		      zmd->mblk_primary, zmd->sb_gen);
 
 	return 0;
-- 
2.16.4

* [PATCH 05/13] dm-zoned: move fields from struct dmz_dev to dmz_metadata
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

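Move the derived zone geometry fields (zone_nr_sectors_shift, zone_nr_blocks
and zone_nr_blocks_shift) from struct dmz_dev into struct dmz_metadata, keep
copies of zone_nr_sectors and nr_zones in the metadata structure as well,
and provide accessor functions for the zone geometry.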
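
For readers who want to see the accessor pattern in isolation, the following is a minimal user-space sketch, not the kernel code. All names prefixed with geom_ are hypothetical stand-ins, and the block shift of 3 assumes 4 KiB metadata blocks built from 512-byte sectors. It derives the geometry once from the zone size, the way dmz_init_zones() does in this patch, and routes the chunk arithmetic through accessors as the reworked dmz_bio_chunk() and dmz_chunk_block() macros do.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's sector_t. */
typedef uint64_t geom_sector_t;

/* Assumption: 4 KiB blocks of 512-byte sectors, so 8 sectors per block. */
#define GEOM_BLOCK_SECTORS_SHIFT 3

/* Stand-in for the geometry part of struct dmz_metadata. */
struct geom_metadata {
	geom_sector_t zone_nr_sectors;
	unsigned int zone_nr_sectors_shift;
	geom_sector_t zone_nr_blocks;
	unsigned int zone_nr_blocks_shift;
};

/* Integer log2 of a non-zero power of two (stand-in for ilog2()). */
static unsigned int geom_ilog2(uint64_t v)
{
	unsigned int shift = 0;

	while (v >>= 1)
		shift++;
	return shift;
}

/* Derive every geometry field once from the zone size in sectors. */
static void geom_init(struct geom_metadata *zmd, geom_sector_t zone_nr_sectors)
{
	zmd->zone_nr_sectors = zone_nr_sectors;
	zmd->zone_nr_sectors_shift = geom_ilog2(zone_nr_sectors);
	zmd->zone_nr_blocks = zone_nr_sectors >> GEOM_BLOCK_SECTORS_SHIFT;
	zmd->zone_nr_blocks_shift = geom_ilog2(zmd->zone_nr_blocks);
}

/* Accessors: callers outside the metadata code never touch the fields. */
static geom_sector_t geom_zone_nr_blocks(const struct geom_metadata *zmd)
{
	return zmd->zone_nr_blocks;
}

static unsigned int geom_zone_nr_sectors_shift(const struct geom_metadata *zmd)
{
	return zmd->zone_nr_sectors_shift;
}

/* Chunk index of a sector, in the spirit of dmz_bio_chunk(). */
static geom_sector_t geom_sector_to_chunk(const struct geom_metadata *zmd,
					  geom_sector_t sector)
{
	return sector >> geom_zone_nr_sectors_shift(zmd);
}

/* Block offset inside its chunk, in the spirit of dmz_chunk_block(). */
static geom_sector_t geom_chunk_block(const struct geom_metadata *zmd,
				      geom_sector_t block)
{
	return block & (geom_zone_nr_blocks(zmd) - 1);
}
```

The mask in geom_chunk_block() is only valid when the zone size is a power of two, which the shift-based fields in the patch also presuppose.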

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 88 ++++++++++++++++++++++++++++--------------
 drivers/md/dm-zoned-reclaim.c  |  8 ++--
 drivers/md/dm-zoned-target.c   | 48 +++++++++++------------
 drivers/md/dm-zoned.h          | 14 +++----
 4 files changed, 95 insertions(+), 63 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index def836e12dd9..b844ff02ae7b 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -138,9 +138,16 @@ struct dmz_metadata {
 	unsigned int		zone_nr_bitmap_blocks;
 	unsigned int		zone_bits_per_mblk;
 
+	sector_t		zone_nr_blocks;
+	sector_t		zone_nr_blocks_shift;
+
+	sector_t		zone_nr_sectors;
+	sector_t		zone_nr_sectors_shift;
+
 	unsigned int		nr_bitmap_blocks;
 	unsigned int		nr_map_blocks;
 
+	unsigned int		nr_zones;
 	unsigned int		nr_useable_zones;
 	unsigned int		nr_meta_blocks;
 	unsigned int		nr_meta_zones;
@@ -190,12 +197,12 @@ struct dmz_metadata {
  */
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->dev->zone_nr_sectors_shift;
+	return (sector_t)zone->id << zmd->zone_nr_sectors_shift;
 }
 
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->dev->zone_nr_blocks_shift;
+	return (sector_t)zone->id << zmd->zone_nr_blocks_shift;
 }
 
 struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
@@ -203,9 +210,29 @@ struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
 	return &zmd->dev[0];
 }
 
+unsigned int dmz_zone_nr_blocks(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_blocks;
+}
+
+unsigned int dmz_zone_nr_blocks_shift(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_blocks_shift;
+}
+
+unsigned int dmz_zone_nr_sectors(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_sectors;
+}
+
+unsigned int dmz_zone_nr_sectors_shift(struct dmz_metadata *zmd)
+{
+	return zmd->zone_nr_sectors_shift;
+}
+
 unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
 {
-	return zmd->dev->nr_zones;
+	return zmd->nr_zones;
 }
 
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd)
@@ -882,8 +909,8 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 		return -ENXIO;
 	}
 
-	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + dev->zone_nr_blocks - 1)
-		>> dev->zone_nr_blocks_shift;
+	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + zmd->zone_nr_blocks - 1)
+		>> zmd->zone_nr_blocks_shift;
 	if (!nr_meta_zones ||
 	    nr_meta_zones >= zmd->nr_rnd_zones) {
 		dmz_dev_err(dev, "Invalid number of metadata blocks");
@@ -932,7 +959,7 @@ static int dmz_read_sb(struct dmz_metadata *zmd, unsigned int set)
  */
 static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 {
-	unsigned int zone_nr_blocks = zmd->dev->zone_nr_blocks;
+	unsigned int zone_nr_blocks = zmd->zone_nr_blocks;
 	struct dmz_mblock *mblk;
 	int i;
 
@@ -1143,7 +1170,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 	struct dmz_dev *dev = zmd->dev;
 
 	/* Ignore the eventual last runt (smaller) zone */
-	if (blkz->len != dev->zone_nr_sectors) {
+	if (blkz->len != zmd->zone_nr_sectors) {
 		if (blkz->start + blkz->len == dev->capacity)
 			return 0;
 		return -ENXIO;
@@ -1208,19 +1235,24 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	int ret;
 
 	/* Init */
-	zmd->zone_bitmap_size = dev->zone_nr_blocks >> 3;
+	zmd->zone_nr_sectors = dev->zone_nr_sectors;
+	zmd->zone_nr_sectors_shift = ilog2(zmd->zone_nr_sectors);
+	zmd->zone_nr_blocks = dmz_sect2blk(zmd->zone_nr_sectors);
+	zmd->zone_nr_blocks_shift = ilog2(zmd->zone_nr_blocks);
+	zmd->zone_bitmap_size = zmd->zone_nr_blocks >> 3;
 	zmd->zone_nr_bitmap_blocks =
 		max_t(sector_t, 1, zmd->zone_bitmap_size >> DMZ_BLOCK_SHIFT);
-	zmd->zone_bits_per_mblk = min_t(sector_t, dev->zone_nr_blocks,
+	zmd->zone_bits_per_mblk = min_t(sector_t, zmd->zone_nr_blocks,
 					DMZ_BLOCK_SIZE_BITS);
 
 	/* Allocate zone array */
-	zmd->zones = kcalloc(dev->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
+	zmd->nr_zones = dev->nr_zones;
+	zmd->zones = kcalloc(zmd->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
 	if (!zmd->zones)
 		return -ENOMEM;
 
 	dmz_dev_info(dev, "Using %zu B for zone information",
-		     sizeof(struct dm_zone) * dev->nr_zones);
+		     sizeof(struct dm_zone) * zmd->nr_zones);
 
 	/*
 	 * Get zone information and initialize zone descriptors.  At the same
@@ -1339,7 +1371,7 @@ static int dmz_reset_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 
 		ret = blkdev_zone_mgmt(dev->bdev, REQ_OP_ZONE_RESET,
 				       dmz_start_sect(zmd, zone),
-				       dev->zone_nr_sectors, GFP_NOIO);
+				       zmd->zone_nr_sectors, GFP_NOIO);
 		if (ret) {
 			dmz_dev_err(dev, "Reset zone %u failed %d",
 				    zone->id, ret);
@@ -1393,7 +1425,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 		if (dzone_id == DMZ_MAP_UNMAPPED)
 			goto next;
 
-		if (dzone_id >= dev->nr_zones) {
+		if (dzone_id >= zmd->nr_zones) {
 			dmz_dev_err(dev, "Chunk %u mapping: invalid data zone ID %u",
 				    chunk, dzone_id);
 			return -EIO;
@@ -1414,7 +1446,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 		if (bzone_id == DMZ_MAP_UNMAPPED)
 			goto next;
 
-		if (bzone_id >= dev->nr_zones) {
+		if (bzone_id >= zmd->nr_zones) {
 			dmz_dev_err(dev, "Chunk %u mapping: invalid buffer zone ID %u",
 				    chunk, bzone_id);
 			return -EIO;
@@ -1446,7 +1478,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 	 * fully initialized. All remaining zones are unmapped data
 	 * zones. Finish initializing those here.
 	 */
-	for (i = 0; i < dev->nr_zones; i++) {
+	for (i = 0; i < zmd->nr_zones; i++) {
 		dzone = dmz_get(zmd, i);
 		if (dmz_is_meta(dzone))
 			continue;
@@ -1990,7 +2022,7 @@ int dmz_copy_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 	sector_t chunk_block = 0;
 
 	/* Get the zones bitmap blocks */
-	while (chunk_block < zmd->dev->zone_nr_blocks) {
+	while (chunk_block < zmd->zone_nr_blocks) {
 		from_mblk = dmz_get_bitmap(zmd, from_zone, chunk_block);
 		if (IS_ERR(from_mblk))
 			return PTR_ERR(from_mblk);
@@ -2025,7 +2057,7 @@ int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 	int ret;
 
 	/* Get the zones bitmap blocks */
-	while (chunk_block < zmd->dev->zone_nr_blocks) {
+	while (chunk_block < zmd->zone_nr_blocks) {
 		/* Get a valid region from the source zone */
 		ret = dmz_first_valid_block(zmd, from_zone, &chunk_block);
 		if (ret <= 0)
@@ -2049,7 +2081,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 			sector_t chunk_block, unsigned int nr_blocks)
 {
 	unsigned int count, bit, nr_bits;
-	unsigned int zone_nr_blocks = zmd->dev->zone_nr_blocks;
+	unsigned int zone_nr_blocks = zmd->zone_nr_blocks;
 	struct dmz_mblock *mblk;
 	unsigned int n = 0;
 
@@ -2136,7 +2168,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	dmz_dev_debug(zmd->dev, "=> INVALIDATE zone %u, block %llu, %u blocks",
 		      zone->id, (u64)chunk_block, nr_blocks);
 
-	WARN_ON(chunk_block + nr_blocks > zmd->dev->zone_nr_blocks);
+	WARN_ON(chunk_block + nr_blocks > zmd->zone_nr_blocks);
 
 	while (nr_blocks) {
 		/* Get bitmap block */
@@ -2180,7 +2212,7 @@ static int dmz_test_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	struct dmz_mblock *mblk;
 	int ret;
 
-	WARN_ON(chunk_block >= zmd->dev->zone_nr_blocks);
+	WARN_ON(chunk_block >= zmd->zone_nr_blocks);
 
 	/* Get bitmap block */
 	mblk = dmz_get_bitmap(zmd, zone, chunk_block);
@@ -2210,7 +2242,7 @@ static int dmz_to_next_set_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	unsigned long *bitmap;
 	int n = 0;
 
-	WARN_ON(chunk_block + nr_blocks > zmd->dev->zone_nr_blocks);
+	WARN_ON(chunk_block + nr_blocks > zmd->zone_nr_blocks);
 
 	while (nr_blocks) {
 		/* Get bitmap block */
@@ -2254,7 +2286,7 @@ int dmz_block_valid(struct dmz_metadata *zmd, struct dm_zone *zone,
 
 	/* The block is valid: get the number of valid blocks from block */
 	return dmz_to_next_set_block(zmd, zone, chunk_block,
-				     zmd->dev->zone_nr_blocks - chunk_block, 0);
+				     zmd->zone_nr_blocks - chunk_block, 0);
 }
 
 /*
@@ -2270,7 +2302,7 @@ int dmz_first_valid_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	int ret;
 
 	ret = dmz_to_next_set_block(zmd, zone, start_block,
-				    zmd->dev->zone_nr_blocks - start_block, 1);
+				    zmd->zone_nr_blocks - start_block, 1);
 	if (ret < 0)
 		return ret;
 
@@ -2278,7 +2310,7 @@ int dmz_first_valid_block(struct dmz_metadata *zmd, struct dm_zone *zone,
 	*chunk_block = start_block;
 
 	return dmz_to_next_set_block(zmd, zone, start_block,
-				     zmd->dev->zone_nr_blocks - start_block, 0);
+				     zmd->zone_nr_blocks - start_block, 0);
 }
 
 /*
@@ -2317,7 +2349,7 @@ static void dmz_get_zone_weight(struct dmz_metadata *zmd, struct dm_zone *zone)
 	struct dmz_mblock *mblk;
 	sector_t chunk_block = 0;
 	unsigned int bit, nr_bits;
-	unsigned int nr_blocks = zmd->dev->zone_nr_blocks;
+	unsigned int nr_blocks = zmd->zone_nr_blocks;
 	void *bitmap;
 	int n = 0;
 
@@ -2488,7 +2520,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
 		     (u64)dev->capacity);
 	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
-		     dev->nr_zones, (u64)dev->zone_nr_sectors);
+		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
 	dmz_dev_info(dev, "  %u metadata zones",
 		     zmd->nr_meta_zones * 2);
 	dmz_dev_info(dev, "  %u data zones for %u chunks",
@@ -2541,7 +2573,7 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 	int ret;
 
 	/* Check zones */
-	for (i = 0; i < dev->nr_zones; i++) {
+	for (i = 0; i < zmd->nr_zones; i++) {
 		zone = dmz_get(zmd, i);
 		if (!zone) {
 			dmz_dev_err(dev, "Unable to get zone %u", i);
@@ -2569,7 +2601,7 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 				    i, (u64)zone->wp_block, (u64)wp_block);
 			zone->wp_block = wp_block;
 			dmz_invalidate_blocks(zmd, zone, zone->wp_block,
-					      dev->zone_nr_blocks - zone->wp_block);
+					      zmd->zone_nr_blocks - zone->wp_block);
 		}
 	}
 
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 7f57c4299a2f..5aa5e5130fe8 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -128,7 +128,7 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 	if (dmz_is_seq(src_zone))
 		end_block = src_zone->wp_block;
 	else
-		end_block = dev->zone_nr_blocks;
+		end_block = dmz_zone_nr_blocks(zmd);
 	src_zone_block = dmz_start_block(zmd, src_zone);
 	dst_zone_block = dmz_start_block(zmd, dst_zone);
 
@@ -210,7 +210,7 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	ret = dmz_merge_valid_blocks(zmd, bzone, dzone, chunk_block);
 	if (ret == 0) {
 		/* Free the buffer zone */
-		dmz_invalidate_blocks(zmd, bzone, 0, zrc->dev->zone_nr_blocks);
+		dmz_invalidate_blocks(zmd, bzone, 0, dmz_zone_nr_blocks(zmd));
 		dmz_lock_map(zmd);
 		dmz_unmap_zone(zmd, bzone);
 		dmz_unlock_zone_reclaim(dzone);
@@ -252,7 +252,7 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 		 * Free the data zone and remap the chunk to
 		 * the buffer zone.
 		 */
-		dmz_invalidate_blocks(zmd, dzone, 0, zrc->dev->zone_nr_blocks);
+		dmz_invalidate_blocks(zmd, dzone, 0, dmz_zone_nr_blocks(zmd));
 		dmz_lock_map(zmd);
 		dmz_unmap_zone(zmd, bzone);
 		dmz_unmap_zone(zmd, dzone);
@@ -305,7 +305,7 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 		dmz_unlock_map(zmd);
 	} else {
 		/* Free the data zone and remap the chunk */
-		dmz_invalidate_blocks(zmd, dzone, 0, zrc->dev->zone_nr_blocks);
+		dmz_invalidate_blocks(zmd, dzone, 0, dmz_zone_nr_blocks(zmd));
 		dmz_lock_map(zmd);
 		dmz_unmap_zone(zmd, dzone);
 		dmz_unlock_zone_reclaim(dzone);
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 7268e0af9e17..1e22da9d7b40 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -165,7 +165,8 @@ static void dmz_handle_read_zero(struct dmz_target *dmz, struct bio *bio,
 static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 			   struct bio *bio)
 {
-	sector_t chunk_block = dmz_chunk_block(dmz->dev, dmz_bio_block(bio));
+	struct dmz_metadata *zmd = dmz->metadata;
+	sector_t chunk_block = dmz_chunk_block(zmd, dmz_bio_block(bio));
 	unsigned int nr_blocks = dmz_bio_blocks(bio);
 	sector_t end_block = chunk_block + nr_blocks;
 	struct dm_zone *rzone, *bzone;
@@ -178,7 +179,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 	}
 
 	dmz_dev_debug(dmz->dev, "READ chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
 		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
@@ -189,7 +190,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 		nr_blocks = 0;
 		if (dmz_is_rnd(zone) || chunk_block < zone->wp_block) {
 			/* Test block validity in the data zone */
-			ret = dmz_block_valid(dmz->metadata, zone, chunk_block);
+			ret = dmz_block_valid(zmd, zone, chunk_block);
 			if (ret < 0)
 				return ret;
 			if (ret > 0) {
@@ -204,7 +205,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 		 * Check the buffer zone, if there is one.
 		 */
 		if (!nr_blocks && bzone) {
-			ret = dmz_block_valid(dmz->metadata, bzone, chunk_block);
+			ret = dmz_block_valid(zmd, bzone, chunk_block);
 			if (ret < 0)
 				return ret;
 			if (ret > 0) {
@@ -308,14 +309,15 @@ static int dmz_handle_buffered_write(struct dmz_target *dmz,
 static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
 			    struct bio *bio)
 {
-	sector_t chunk_block = dmz_chunk_block(dmz->dev, dmz_bio_block(bio));
+	struct dmz_metadata *zmd = dmz->metadata;
+	sector_t chunk_block = dmz_chunk_block(zmd, dmz_bio_block(bio));
 	unsigned int nr_blocks = dmz_bio_blocks(bio);
 
 	if (!zone)
 		return -ENOSPC;
 
 	dmz_dev_debug(dmz->dev, "WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
 		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
 		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
@@ -345,7 +347,7 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 	struct dmz_metadata *zmd = dmz->metadata;
 	sector_t block = dmz_bio_block(bio);
 	unsigned int nr_blocks = dmz_bio_blocks(bio);
-	sector_t chunk_block = dmz_chunk_block(dmz->dev, block);
+	sector_t chunk_block = dmz_chunk_block(zmd, block);
 	int ret = 0;
 
 	/* For unmapped chunks, there is nothing to do */
@@ -356,7 +358,7 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 		return -EROFS;
 
 	dmz_dev_debug(dmz->dev, "DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
 		      zone->id,
 		      (unsigned long long)chunk_block, nr_blocks);
 
@@ -402,7 +404,7 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
 	 * mapping for read and discard. If a mapping is obtained,
 	 + the zone returned will be set to active state.
 	 */
-	zone = dmz_get_chunk_mapping(zmd, dmz_bio_chunk(dmz->dev, bio),
+	zone = dmz_get_chunk_mapping(zmd, dmz_bio_chunk(zmd, bio),
 				     bio_op(bio));
 	if (IS_ERR(zone)) {
 		ret = PTR_ERR(zone);
@@ -525,7 +527,7 @@ static void dmz_flush_work(struct work_struct *work)
  */
 static int dmz_queue_chunk_work(struct dmz_target *dmz, struct bio *bio)
 {
-	unsigned int chunk = dmz_bio_chunk(dmz->dev, bio);
+	unsigned int chunk = dmz_bio_chunk(dmz->metadata, bio);
 	struct dm_chunk_work *cw;
 	int ret = 0;
 
@@ -618,6 +620,7 @@ bool dmz_check_bdev(struct dmz_dev *dmz_dev)
 static int dmz_map(struct dm_target *ti, struct bio *bio)
 {
 	struct dmz_target *dmz = ti->private;
+	struct dmz_metadata *zmd = dmz->metadata;
 	struct dmz_dev *dev = dmz->dev;
 	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
 	sector_t sector = bio->bi_iter.bi_sector;
@@ -630,8 +633,8 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 
 	dmz_dev_debug(dev, "BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
 		      bio_op(bio), (unsigned long long)sector, nr_sectors,
-		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
-		      (unsigned long long)dmz_chunk_block(dmz->dev, dmz_bio_block(bio)),
+		      (unsigned long long)dmz_bio_chunk(zmd, bio),
+		      (unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
 		      (unsigned int)dmz_bio_blocks(bio));
 
 	bio_set_dev(bio, dev->bdev);
@@ -659,16 +662,16 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	}
 
 	/* Split zone BIOs to fit entirely into a zone */
-	chunk_sector = sector & (dev->zone_nr_sectors - 1);
-	if (chunk_sector + nr_sectors > dev->zone_nr_sectors)
-		dm_accept_partial_bio(bio, dev->zone_nr_sectors - chunk_sector);
+	chunk_sector = sector & (dmz_zone_nr_sectors(zmd) - 1);
+	if (chunk_sector + nr_sectors > dmz_zone_nr_sectors(zmd))
+		dm_accept_partial_bio(bio, dmz_zone_nr_sectors(zmd) - chunk_sector);
 
 	/* Now ready to handle this BIO */
 	ret = dmz_queue_chunk_work(dmz, bio);
 	if (ret) {
 		dmz_dev_debug(dmz->dev,
 			      "BIO op %d, can't process chunk %llu, err %i\n",
-			      bio_op(bio), (u64)dmz_bio_chunk(dmz->dev, bio),
+			      bio_op(bio), (u64)dmz_bio_chunk(zmd, bio),
 			      ret);
 		return DM_MAPIO_REQUEUE;
 	}
@@ -722,10 +725,6 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
 	}
 
 	dev->zone_nr_sectors = blk_queue_zone_sectors(q);
-	dev->zone_nr_sectors_shift = ilog2(dev->zone_nr_sectors);
-
-	dev->zone_nr_blocks = dmz_sect2blk(dev->zone_nr_sectors);
-	dev->zone_nr_blocks_shift = ilog2(dev->zone_nr_blocks);
 
 	dev->nr_zones = blkdev_nr_zones(dev->bdev->bd_disk);
 
@@ -790,7 +789,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	}
 
 	/* Set target (no write same support) */
-	ti->max_io_len = dev->zone_nr_sectors << 9;
+	ti->max_io_len = dmz_zone_nr_sectors(dmz->metadata) << 9;
 	ti->num_flush_bios = 1;
 	ti->num_discard_bios = 1;
 	ti->num_write_zeroes_bios = 1;
@@ -799,7 +798,8 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	ti->discards_supported = true;
 
 	/* The exposed capacity is the number of chunks that can be mapped */
-	ti->len = (sector_t)dmz_nr_chunks(dmz->metadata) << dev->zone_nr_sectors_shift;
+	ti->len = (sector_t)dmz_nr_chunks(dmz->metadata) <<
+		dmz_zone_nr_sectors_shift(dmz->metadata);
 
 	/* Zone BIO */
 	ret = bioset_init(&dmz->bio_set, DMZ_MIN_BIOS, 0, 0);
@@ -895,7 +895,7 @@ static void dmz_dtr(struct dm_target *ti)
 static void dmz_io_hints(struct dm_target *ti, struct queue_limits *limits)
 {
 	struct dmz_target *dmz = ti->private;
-	unsigned int chunk_sectors = dmz->dev->zone_nr_sectors;
+	unsigned int chunk_sectors = dmz_zone_nr_sectors(dmz->metadata);
 
 	limits->logical_block_size = DMZ_BLOCK_SIZE;
 	limits->physical_block_size = DMZ_BLOCK_SIZE;
@@ -960,7 +960,7 @@ static int dmz_iterate_devices(struct dm_target *ti,
 {
 	struct dmz_target *dmz = ti->private;
 	struct dmz_dev *dev = dmz->dev;
-	sector_t capacity = dev->capacity & ~(dev->zone_nr_sectors - 1);
+	sector_t capacity = dev->capacity & ~(dmz_zone_nr_sectors(dmz->metadata) - 1);
 
 	return fn(ti, dmz->ddev, 0, capacity, data);
 }
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 30781646741a..f997ad62c7b4 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -60,15 +60,11 @@ struct dmz_dev {
 	unsigned int		flags;
 
 	sector_t		zone_nr_sectors;
-	unsigned int		zone_nr_sectors_shift;
-
-	sector_t		zone_nr_blocks;
-	sector_t		zone_nr_blocks_shift;
 };
 
-#define dmz_bio_chunk(dev, bio)	((bio)->bi_iter.bi_sector >> \
-				 (dev)->zone_nr_sectors_shift)
-#define dmz_chunk_block(dev, b)	((b) & ((dev)->zone_nr_blocks - 1))
+#define dmz_bio_chunk(zmd, bio)	((bio)->bi_iter.bi_sector >> \
+				 dmz_zone_nr_sectors_shift(zmd))
+#define dmz_chunk_block(zmd, b)	((b) & (dmz_zone_nr_blocks(zmd) - 1))
 
 /* Device flags. */
 #define DMZ_BDEV_DYING		(1 << 0)
@@ -197,6 +193,10 @@ unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd);
 unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_blocks(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_blocks_shift(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_sectors(struct dmz_metadata *zmd);
+unsigned int dmz_zone_nr_sectors_shift(struct dmz_metadata *zmd);
 
 /*
  * Activate a zone (increment its reference count).
-- 
2.16.4

* [PATCH 06/13] dm-zoned: introduce dmz_metadata_label() to format device name
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Introduce dmz_metadata_label() to return the device-mapper device name
stored in the metadata, and use it in log messages and in the reclaim
workqueue name instead of the name of the underlying block device.
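
As an illustration only, here is a small user-space sketch of the label idea: the name is stored once at construction time and every message is prefixed with it. The label_ names are hypothetical, and LABEL_NAME_SIZE is assumed to play the role of BDEVNAME_SIZE. Unlike the patch, which copies the name with strcpy(), this sketch uses a bounded snprintf() so that an overlong name is truncated rather than overflowing the buffer.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumption: same role as BDEVNAME_SIZE in the patch. */
#define LABEL_NAME_SIZE 32

/* Stand-in for the devname member added to struct dmz_metadata. */
struct label_metadata {
	char devname[LABEL_NAME_SIZE];
};

/* Store the label once; longer names are truncated and NUL-terminated. */
static void label_set(struct label_metadata *zmd, const char *devname)
{
	snprintf(zmd->devname, sizeof(zmd->devname), "%s", devname);
}

/* Accessor in the spirit of dmz_metadata_label(). */
static const char *label_get(const struct label_metadata *zmd)
{
	return zmd->devname;
}

/* Build a message with the "(label): " prefix used by the new log calls. */
static int label_format_error(const struct label_metadata *zmd,
			      char *buf, size_t len, int err)
{
	return snprintf(buf, len, "(%s): Reclaim error %d",
			label_get(zmd), err);
}
```

A caller would set the label once when the metadata is created and then pass label_get() to whatever logging call it uses, so the messages name the mapped device rather than one of its backing devices.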

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 11 ++++++-
 drivers/md/dm-zoned-reclaim.c  | 15 +++++----
 drivers/md/dm-zoned-target.c   | 74 +++++++++++++++++++++++-------------------
 drivers/md/dm-zoned.h          |  4 ++-
 4 files changed, 62 insertions(+), 42 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index b844ff02ae7b..7cda48683c0b 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -134,6 +134,8 @@ struct dmz_sb {
 struct dmz_metadata {
 	struct dmz_dev		*dev;
 
+	char			devname[BDEVNAME_SIZE];
+
 	sector_t		zone_bitmap_size;
 	unsigned int		zone_nr_bitmap_blocks;
 	unsigned int		zone_bits_per_mblk;
@@ -260,6 +262,11 @@ unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
 	return atomic_read(&zmd->unmap_nr_seq);
 }
 
+const char *dmz_metadata_label(struct dmz_metadata *zmd)
+{
+	return (const char *)zmd->devname;
+}
+
 /*
  * Lock/unlock mapping table.
  * The map lock also protects all the zone lists.
@@ -2439,7 +2446,8 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 /*
  * Initialize the zoned metadata.
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
+int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
+		     const char *devname)
 {
 	struct dmz_metadata *zmd;
 	unsigned int i;
@@ -2450,6 +2458,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 	if (!zmd)
 		return -ENOMEM;
 
+	strcpy(zmd->devname, devname);
 	zmd->dev = dev;
 	zmd->mblk_rbtree = RB_ROOT;
 	init_rwsem(&zmd->mblk_sem);
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 5aa5e5130fe8..699c4145306e 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -480,15 +480,16 @@ static void dmz_reclaim_work(struct work_struct *work)
 		zrc->kc_throttle.throttle = min(75U, 100U - p_unmap_rnd / 2);
 	}
 
-	dmz_dev_debug(zrc->dev,
-		      "Reclaim (%u): %s, %u%% free rnd zones (%u/%u)",
-		      zrc->kc_throttle.throttle,
-		      (dmz_target_idle(zrc) ? "Idle" : "Busy"),
-		      p_unmap_rnd, nr_unmap_rnd, nr_rnd);
+	DMDEBUG("(%s): Reclaim (%u): %s, %u%% free rnd zones (%u/%u)",
+		dmz_metadata_label(zmd),
+		zrc->kc_throttle.throttle,
+		(dmz_target_idle(zrc) ? "Idle" : "Busy"),
+		p_unmap_rnd, nr_unmap_rnd, nr_rnd);
 
 	ret = dmz_do_reclaim(zrc);
 	if (ret) {
-		dmz_dev_debug(zrc->dev, "Reclaim error %d\n", ret);
+		DMDEBUG("(%s): Reclaim error %d\n",
+			dmz_metadata_label(zmd), ret);
 		if (!dmz_check_bdev(zrc->dev))
 			return;
 	}
@@ -524,7 +525,7 @@ int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
 	/* Reclaim work */
 	INIT_DELAYED_WORK(&zrc->work, dmz_reclaim_work);
 	zrc->wq = alloc_ordered_workqueue("dmz_rwq_%s", WQ_MEM_RECLAIM,
-					  dev->name);
+					  dmz_metadata_label(zmd));
 	if (!zrc->wq) {
 		ret = -ENOMEM;
 		goto err;
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 1e22da9d7b40..748d4cd5d62d 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -178,11 +178,12 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
 		return 0;
 	}
 
-	dmz_dev_debug(dmz->dev, "READ chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      zone->id,
-		      (unsigned long long)chunk_block, nr_blocks);
+	DMDEBUG("(%s): READ chunk %llu -> %s zone %u, block %llu, %u blocks",
+		dmz_metadata_label(zmd),
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		(dmz_is_rnd(zone) ? "RND" : "SEQ"),
+		zone->id,
+		(unsigned long long)chunk_block, nr_blocks);
 
 	/* Check block validity to determine the read location */
 	bzone = zone->bzone;
@@ -316,11 +317,12 @@ static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
 	if (!zone)
 		return -ENOSPC;
 
-	dmz_dev_debug(dmz->dev, "WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
-		      zone->id,
-		      (unsigned long long)chunk_block, nr_blocks);
+	DMDEBUG("(%s): WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
+		dmz_metadata_label(zmd),
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		(dmz_is_rnd(zone) ? "RND" : "SEQ"),
+		zone->id,
+		(unsigned long long)chunk_block, nr_blocks);
 
 	if (dmz_is_rnd(zone) || chunk_block == zone->wp_block) {
 		/*
@@ -357,10 +359,11 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
 	if (dmz_is_readonly(zone))
 		return -EROFS;
 
-	dmz_dev_debug(dmz->dev, "DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      zone->id,
-		      (unsigned long long)chunk_block, nr_blocks);
+	DMDEBUG("(%s): DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
+		dmz_metadata_label(dmz->metadata),
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		zone->id,
+		(unsigned long long)chunk_block, nr_blocks);
 
 	/*
 	 * Invalidate blocks in the data zone and its
@@ -429,8 +432,8 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
 		ret = dmz_handle_discard(dmz, zone, bio);
 		break;
 	default:
-		dmz_dev_err(dmz->dev, "Unsupported BIO operation 0x%x",
-			    bio_op(bio));
+		DMERR("(%s): Unsupported BIO operation 0x%x",
+		      dmz_metadata_label(dmz->metadata), bio_op(bio));
 		ret = -EIO;
 	}
 
@@ -504,7 +507,8 @@ static void dmz_flush_work(struct work_struct *work)
 	/* Flush dirty metadata blocks */
 	ret = dmz_flush_metadata(dmz->metadata);
 	if (ret)
-		dmz_dev_debug(dmz->dev, "Metadata flush failed, rc=%d\n", ret);
+		DMDEBUG("(%s): Metadata flush failed, rc=%d\n",
+			dmz_metadata_label(dmz->metadata), ret);
 
 	/* Process queued flush requests */
 	while (1) {
@@ -631,11 +635,12 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	if (dmz_bdev_is_dying(dmz->dev))
 		return DM_MAPIO_KILL;
 
-	dmz_dev_debug(dev, "BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
-		      bio_op(bio), (unsigned long long)sector, nr_sectors,
-		      (unsigned long long)dmz_bio_chunk(zmd, bio),
-		      (unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
-		      (unsigned int)dmz_bio_blocks(bio));
+	DMDEBUG("(%s): BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
+		dmz_metadata_label(zmd),
+		bio_op(bio), (unsigned long long)sector, nr_sectors,
+		(unsigned long long)dmz_bio_chunk(zmd, bio),
+		(unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
+		(unsigned int)dmz_bio_blocks(bio));
 
 	bio_set_dev(bio, dev->bdev);
 
@@ -669,10 +674,10 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	/* Now ready to handle this BIO */
 	ret = dmz_queue_chunk_work(dmz, bio);
 	if (ret) {
-		dmz_dev_debug(dmz->dev,
-			      "BIO op %d, can't process chunk %llu, err %i\n",
-			      bio_op(bio), (u64)dmz_bio_chunk(zmd, bio),
-			      ret);
+		DMDEBUG("(%s): BIO op %d, can't process chunk %llu, err %i\n",
+			dmz_metadata_label(zmd),
+			bio_op(bio), (u64)dmz_bio_chunk(zmd, bio),
+			ret);
 		return DM_MAPIO_REQUEUE;
 	}
 
@@ -782,7 +787,8 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 	/* Initialize metadata */
 	dev = dmz->dev;
-	ret = dmz_ctr_metadata(dev, &dmz->metadata);
+	ret = dmz_ctr_metadata(dev, &dmz->metadata,
+			       dm_table_device_name(ti->table));
 	if (ret) {
 		ti->error = "Metadata initialization failed";
 		goto err_dev;
@@ -811,8 +817,9 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	/* Chunk BIO work */
 	mutex_init(&dmz->chunk_lock);
 	INIT_RADIX_TREE(&dmz->chunk_rxtree, GFP_NOIO);
-	dmz->chunk_wq = alloc_workqueue("dmz_cwq_%s", WQ_MEM_RECLAIM | WQ_UNBOUND,
-					0, dev->name);
+	dmz->chunk_wq = alloc_workqueue("dmz_cwq_%s",
+					WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
+					dmz_metadata_label(dmz->metadata));
 	if (!dmz->chunk_wq) {
 		ti->error = "Create chunk workqueue failed";
 		ret = -ENOMEM;
@@ -824,7 +831,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	bio_list_init(&dmz->flush_list);
 	INIT_DELAYED_WORK(&dmz->flush_work, dmz_flush_work);
 	dmz->flush_wq = alloc_ordered_workqueue("dmz_fwq_%s", WQ_MEM_RECLAIM,
-						dev->name);
+						dmz_metadata_label(dmz->metadata));
 	if (!dmz->flush_wq) {
 		ti->error = "Create flush workqueue failed";
 		ret = -ENOMEM;
@@ -839,9 +846,10 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		goto err_fwq;
 	}
 
-	dmz_dev_info(dev, "Target device: %llu 512-byte logical sectors (%llu blocks)",
-		     (unsigned long long)ti->len,
-		     (unsigned long long)dmz_sect2blk(ti->len));
+	DMINFO("(%s): Target device: %llu 512-byte logical sectors (%llu blocks)",
+	       dmz_metadata_label(dmz->metadata),
+	       (unsigned long long)ti->len,
+	       (unsigned long long)dmz_sect2blk(ti->len));
 
 	return 0;
 err_fwq:
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index f997ad62c7b4..dd768dc60341 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -163,7 +163,8 @@ struct dmz_reclaim;
 /*
  * Functions defined in dm-zoned-metadata.c
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd);
+int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd,
+		     const char *devname);
 void dmz_dtr_metadata(struct dmz_metadata *zmd);
 int dmz_resume_metadata(struct dmz_metadata *zmd);
 
@@ -174,6 +175,7 @@ void dmz_unlock_metadata(struct dmz_metadata *zmd);
 void dmz_lock_flush(struct dmz_metadata *zmd);
 void dmz_unlock_flush(struct dmz_metadata *zmd);
 int dmz_flush_metadata(struct dmz_metadata *zmd);
+const char *dmz_metadata_label(struct dmz_metadata *zmd);
 
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
-- 
2.16.4

* [PATCH 07/13] dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev()
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (5 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 06/13] dm-zoned: introduce dmz_metadata_label() to format device name Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-28  9:37   ` Damien Le Moal
  2020-04-20 10:08 ` [PATCH 08/13] dm-zoned: remove 'dev' argument from reclaim Hannes Reinecke
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Introduce accessors dmz_dev_is_dying() and dmz_check_dev() to
avoid having to reference the devices directly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
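
The accessor pattern can be tried outside the kernel with a minimal
sketch. All sketch_* names and the flag value are hypothetical stand-ins
for the driver's types, and the "check" helper is reduced to "not dying",
which is an assumption; the driver's own check does more than that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for the driver's DMZ_BDEV_DYING. */
#define SKETCH_BDEV_DYING (1u << 0)

/* Minimal stand-in for struct dmz_dev: only the state flags. */
struct sketch_dev {
	unsigned int flags;
};

/* Minimal stand-in for struct dmz_metadata: an array of devices. */
struct sketch_metadata {
	struct sketch_dev *dev;
};

/* Per-device helper, analogous to the existing *_bdev_* functions. */
static bool sketch_bdev_is_dying(const struct sketch_dev *dev)
{
	return (dev->flags & SKETCH_BDEV_DYING) != 0;
}

/* Simplifying assumption: a device is usable as long as it is not dying. */
static bool sketch_check_bdev(const struct sketch_dev *dev)
{
	return !sketch_bdev_is_dying(dev);
}

/* Metadata-level accessors: callers no longer touch zmd->dev themselves. */
static bool sketch_dev_is_dying(const struct sketch_metadata *zmd)
{
	assert(zmd && zmd->dev);
	return sketch_bdev_is_dying(&zmd->dev[0]);
}

static bool sketch_check_dev(const struct sketch_metadata *zmd)
{
	assert(zmd && zmd->dev);
	return sketch_check_bdev(&zmd->dev[0]);
}
```

Callers only hold the metadata pointer, so changing which device (or how
many devices) the accessors consult later needs no change at call sites.
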
 drivers/md/dm-zoned-metadata.c | 14 ++++++++++++--
 drivers/md/dm-zoned-reclaim.c  |  4 ++--
 drivers/md/dm-zoned-target.c   |  2 +-
 drivers/md/dm-zoned.h          |  3 +++
 4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 7cda48683c0b..426af738f1ca 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -267,6 +267,16 @@ const char *dmz_metadata_label(struct dmz_metadata *zmd)
 	return (const char *)zmd->devname;
 }
 
+bool dmz_check_dev(struct dmz_metadata *zmd)
+{
+	return dmz_check_bdev(&zmd->dev[0]);
+}
+
+bool dmz_dev_is_dying(struct dmz_metadata *zmd)
+{
+	return dmz_bdev_is_dying(&zmd->dev[0]);
+}
+
 /*
  * Lock/unlock mapping table.
  * The map lock also protects all the zone lists.
@@ -1719,7 +1729,7 @@ struct dm_zone *dmz_get_chunk_mapping(struct dmz_metadata *zmd, unsigned int chu
 		/* Allocate a random zone */
 		dzone = dmz_alloc_zone(zmd, DMZ_ALLOC_RND);
 		if (!dzone) {
-			if (dmz_bdev_is_dying(zmd->dev)) {
+			if (dmz_dev_is_dying(zmd)) {
 				dzone = ERR_PTR(-EIO);
 				goto out;
 			}
@@ -1820,7 +1830,7 @@ struct dm_zone *dmz_get_chunk_buffer(struct dmz_metadata *zmd,
 	/* Allocate a random zone */
 	bzone = dmz_alloc_zone(zmd, DMZ_ALLOC_RND);
 	if (!bzone) {
-		if (dmz_bdev_is_dying(zmd->dev)) {
+		if (dmz_dev_is_dying(zmd)) {
 			bzone = ERR_PTR(-EIO);
 			goto out;
 		}
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 699c4145306e..5daede0daf92 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -455,7 +455,7 @@ static void dmz_reclaim_work(struct work_struct *work)
 	unsigned int p_unmap_rnd;
 	int ret;
 
-	if (dmz_bdev_is_dying(zrc->dev))
+	if (dmz_dev_is_dying(zmd))
 		return;
 
 	if (!dmz_should_reclaim(zrc)) {
@@ -490,7 +490,7 @@ static void dmz_reclaim_work(struct work_struct *work)
 	if (ret) {
 		DMDEBUG("(%s): Reclaim error %d\n",
 			dmz_metadata_label(zmd), ret);
-		if (!dmz_check_bdev(zrc->dev))
+		if (!dmz_check_dev(zmd))
 			return;
 	}
 
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 748d4cd5d62d..15f00535060f 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -632,7 +632,7 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 	sector_t chunk_sector;
 	int ret;
 
-	if (dmz_bdev_is_dying(dmz->dev))
+	if (dmz_dev_is_dying(zmd))
 		return DM_MAPIO_KILL;
 
 	DMDEBUG("(%s): BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index dd768dc60341..e0883df8a903 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -181,6 +181,9 @@ sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
 
+bool dmz_check_dev(struct dmz_metadata *zmd);
+bool dmz_dev_is_dying(struct dmz_metadata *zmd);
+
 #define DMZ_ALLOC_RND		0x01
 #define DMZ_ALLOC_RECLAIM	0x02
 
-- 
2.16.4

* [PATCH 08/13] dm-zoned: remove 'dev' argument from reclaim
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (6 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 07/13] dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev() Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-28  9:40   ` Damien Le Moal
  2020-04-20 10:08 ` [PATCH 09/13] dm-zoned: replace 'target' pointer in the bio context Hannes Reinecke
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

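Look up the device for each zone with the dmz_zone_to_dev() mapping
function instead of caching a single device in the reclaim context,
and remove the 'dev' argument from dmz_ctr_reclaim(). Reclaim debug
messages now use the metadata label.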

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
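
For readers without the earlier patches at hand, here is a rough
userspace sketch of a zone-to-device lookup and of the two-sided "dying"
check that the copy loop performs. The body of dmz_zone_to_dev() is not
part of this patch, so the lookup below rests on an assumption: devices
back consecutive zone ID ranges. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for the driver's DMZ_BDEV_DYING. */
#define SKETCH_BDEV_DYING (1u << 0)

struct sketch_dev {
	unsigned int flags;
	unsigned int nr_zones;	/* zones backed by this device */
};

struct sketch_zone {
	unsigned int id;	/* global zone ID */
};

struct sketch_metadata {
	struct sketch_dev *dev;	/* device array */
	size_t nr_devs;
};

/*
 * Assumed layout: device 0 backs zone IDs [0, n0), device 1 backs
 * [n0, n0 + n1), and so on. Returns NULL for an out-of-range zone.
 */
static struct sketch_dev *sketch_zone_to_dev(struct sketch_metadata *zmd,
					     const struct sketch_zone *zone)
{
	unsigned int first = 0;
	size_t i;

	for (i = 0; i < zmd->nr_devs; i++) {
		if (zone->id < first + zmd->dev[i].nr_zones)
			return &zmd->dev[i];
		first += zmd->dev[i].nr_zones;
	}
	return NULL;
}

/* A copy may proceed only if neither the source nor the target is dying. */
static int sketch_copy_precheck(struct sketch_metadata *zmd,
				const struct sketch_zone *src,
				const struct sketch_zone *dst)
{
	struct sketch_dev *src_dev = sketch_zone_to_dev(zmd, src);
	struct sketch_dev *dst_dev = sketch_zone_to_dev(zmd, dst);

	assert(src_dev && dst_dev);
	if (src_dev->flags & SKETCH_BDEV_DYING)
		return -EIO;
	if (dst_dev->flags & SKETCH_BDEV_DYING)
		return -EIO;
	return 0;
}
```

Because source and destination zones may live on different devices, the
check has to consider both, which a single cached device pointer could
not express.
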
 drivers/md/dm-zoned-reclaim.c | 58 +++++++++++++++++++++++--------------------
 drivers/md/dm-zoned-target.c  |  2 +-
 drivers/md/dm-zoned.h         |  3 ++-
 3 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index 5daede0daf92..39ea0d5d4706 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -13,7 +13,6 @@
 
 struct dmz_reclaim {
 	struct dmz_metadata     *metadata;
-	struct dmz_dev		*dev;
 
 	struct delayed_work	work;
 	struct workqueue_struct *wq;
@@ -59,6 +58,7 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
 				sector_t block)
 {
 	struct dmz_metadata *zmd = zrc->metadata;
+	struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 	sector_t wp_block = zone->wp_block;
 	unsigned int nr_blocks;
 	int ret;
@@ -74,15 +74,15 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
 	 * pointer and the requested position.
 	 */
 	nr_blocks = block - wp_block;
-	ret = blkdev_issue_zeroout(zrc->dev->bdev,
+	ret = blkdev_issue_zeroout(dev->bdev,
 				   dmz_start_sect(zmd, zone) + dmz_blk2sect(wp_block),
 				   dmz_blk2sect(nr_blocks), GFP_NOIO, 0);
 	if (ret) {
-		dmz_dev_err(zrc->dev,
+		dmz_dev_err(dev,
 			    "Align zone %u wp %llu to %llu (wp+%u) blocks failed %d",
 			    zone->id, (unsigned long long)wp_block,
 			    (unsigned long long)block, nr_blocks, ret);
-		dmz_check_bdev(zrc->dev);
+		dmz_check_bdev(dev);
 		return ret;
 	}
 
@@ -116,7 +116,7 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 			    struct dm_zone *src_zone, struct dm_zone *dst_zone)
 {
 	struct dmz_metadata *zmd = zrc->metadata;
-	struct dmz_dev *dev = zrc->dev;
+	struct dmz_dev *src_dev, *dst_dev;
 	struct dm_io_region src, dst;
 	sector_t block = 0, end_block;
 	sector_t nr_blocks;
@@ -130,13 +130,17 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 	else
 		end_block = dmz_zone_nr_blocks(zmd);
 	src_zone_block = dmz_start_block(zmd, src_zone);
+	src_dev = dmz_zone_to_dev(zmd, src_zone);
 	dst_zone_block = dmz_start_block(zmd, dst_zone);
+	dst_dev = dmz_zone_to_dev(zmd, dst_zone);
 
 	if (dmz_is_seq(dst_zone))
 		set_bit(DM_KCOPYD_WRITE_SEQ, &flags);
 
 	while (block < end_block) {
-		if (dev->flags & DMZ_BDEV_DYING)
+		if (src_dev->flags & DMZ_BDEV_DYING)
+			return -EIO;
+		if (dst_dev->flags & DMZ_BDEV_DYING)
 			return -EIO;
 
 		/* Get a valid region from the source zone */
@@ -156,11 +160,11 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
 				return ret;
 		}
 
-		src.bdev = dev->bdev;
+		src.bdev = src_dev->bdev;
 		src.sector = dmz_blk2sect(src_zone_block + block);
 		src.count = dmz_blk2sect(nr_blocks);
 
-		dst.bdev = dev->bdev;
+		dst.bdev = dst_dev->bdev;
 		dst.sector = dmz_blk2sect(dst_zone_block + block);
 		dst.count = src.count;
 
@@ -194,10 +198,10 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	struct dmz_metadata *zmd = zrc->metadata;
 	int ret;
 
-	dmz_dev_debug(zrc->dev,
-		      "Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
-		      dzone->chunk, bzone->id, dmz_weight(bzone),
-		      dzone->id, dmz_weight(dzone));
+	DMDEBUG("(%s): Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
+		dmz_metadata_label(zmd),
+		dzone->chunk, bzone->id, dmz_weight(bzone),
+		dzone->id, dmz_weight(dzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, bzone, dzone);
@@ -233,10 +237,10 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	struct dmz_metadata *zmd = zrc->metadata;
 	int ret = 0;
 
-	dmz_dev_debug(zrc->dev,
-		      "Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
-		      chunk, dzone->id, dmz_weight(dzone),
-		      bzone->id, dmz_weight(bzone));
+	DMDEBUG("(%s): Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
+		dmz_metadata_label(zmd),
+		chunk, dzone->id, dmz_weight(dzone),
+		bzone->id, dmz_weight(bzone));
 
 	/* Flush data zone into the buffer zone */
 	ret = dmz_reclaim_copy(zrc, dzone, bzone);
@@ -285,9 +289,9 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
 	if (!szone)
 		return -ENOSPC;
 
-	dmz_dev_debug(zrc->dev,
-		      "Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
-		      chunk, dzone->id, dmz_weight(dzone), szone->id);
+	DMDEBUG("(%s): Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
+		dmz_metadata_label(zmd),
+		chunk, dzone->id, dmz_weight(dzone), szone->id);
 
 	/* Flush the random data zone into the sequential zone */
 	ret = dmz_reclaim_copy(zrc, dzone, szone);
@@ -343,6 +347,7 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 	struct dmz_metadata *zmd = zrc->metadata;
 	struct dm_zone *dzone;
 	struct dm_zone *rzone;
+	struct dmz_dev *dev;
 	unsigned long start;
 	int ret;
 
@@ -352,7 +357,7 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 		return PTR_ERR(dzone);
 
 	start = jiffies;
-
+	dev = dmz_zone_to_dev(zmd, dzone);
 	if (dmz_is_rnd(dzone)) {
 		if (!dmz_weight(dzone)) {
 			/* Empty zone */
@@ -400,14 +405,14 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
 
 	ret = dmz_flush_metadata(zrc->metadata);
 	if (ret) {
-		dmz_dev_debug(zrc->dev,
-			      "Metadata flush for zone %u failed, err %d\n",
-			      rzone->id, ret);
+		DMDEBUG("(%s): Metadata flush for zone %u failed, err %d\n",
+			dmz_metadata_label(zmd), rzone->id, ret);
 		return ret;
 	}
 
-	dmz_dev_debug(zrc->dev, "Reclaimed zone %u in %u ms",
-		      rzone->id, jiffies_to_msecs(jiffies - start));
+	DMDEBUG("(%s): Reclaimed zone %u in %u ms",
+		dmz_metadata_label(zmd),
+		rzone->id, jiffies_to_msecs(jiffies - start));
 	return 0;
 }
 
@@ -500,7 +505,7 @@ static void dmz_reclaim_work(struct work_struct *work)
 /*
  * Initialize reclaim.
  */
-int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
+int dmz_ctr_reclaim(struct dmz_metadata *zmd,
 		    struct dmz_reclaim **reclaim)
 {
 	struct dmz_reclaim *zrc;
@@ -510,7 +515,6 @@ int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
 	if (!zrc)
 		return -ENOMEM;
 
-	zrc->dev = dev;
 	zrc->metadata = zmd;
 	zrc->atime = jiffies;
 
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 15f00535060f..a1f42af2877c 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -840,7 +840,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	mod_delayed_work(dmz->flush_wq, &dmz->flush_work, DMZ_FLUSH_PERIOD);
 
 	/* Initialize reclaim */
-	ret = dmz_ctr_reclaim(dev, dmz->metadata, &dmz->reclaim);
+	ret = dmz_ctr_reclaim(dmz->metadata, &dmz->reclaim);
 	if (ret) {
 		ti->error = "Zone reclaim initialization failed";
 		goto err_fwq;
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index e0883df8a903..454ebd628cca 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -180,6 +180,7 @@ const char *dmz_metadata_label(struct dmz_metadata *zmd);
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
+struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone);
 
 bool dmz_check_dev(struct dmz_metadata *zmd);
 bool dmz_dev_is_dying(struct dmz_metadata *zmd);
@@ -254,7 +255,7 @@ int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 /*
  * Functions defined in dm-zoned-reclaim.c
  */
-int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
+int dmz_ctr_reclaim(struct dmz_metadata *zmd,
 		    struct dmz_reclaim **zrc);
 void dmz_dtr_reclaim(struct dmz_reclaim *zrc);
 void dmz_suspend_reclaim(struct dmz_reclaim *zrc);
-- 
2.16.4

* [PATCH 09/13] dm-zoned: replace 'target' pointer in the bio context
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (7 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 08/13] dm-zoned: remove 'dev' argument from reclaim Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-28  9:43   ` Damien Le Moal
  2020-04-20 10:08 ` [PATCH 10/13] dm-zoned: use dmz_zone_to_dev() when handling metadata I/O Hannes Reinecke
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Replace the 'target' pointer in the bio context with the
device pointer as this is what's actually used.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
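
The life cycle of the device pointer in the bio context can be sketched
in plain C as follows. The sketch_* names and flag values are
hypothetical, and an int status replaces blk_status_t. Since the context
starts out with no device assigned, the completion helper in this sketch
skips the flag update when no device has been recorded yet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for the driver's device flags. */
#define SKETCH_BDEV_DYING (1u << 0)
#define SKETCH_CHECK_BDEV (1u << 1)

struct sketch_dev {
	unsigned int flags;
};

/* Per-bio context: remembers the device the I/O was sent to. */
struct sketch_bioctx {
	struct sketch_dev *dev;
};

/* Map time: no zone has been resolved yet, so no device is known. */
static void sketch_ctx_init(struct sketch_bioctx *ctx)
{
	ctx->dev = NULL;
}

/* Submit time: refuse dying devices, otherwise record the target. */
static int sketch_submit(struct sketch_bioctx *ctx, struct sketch_dev *dev)
{
	assert(ctx && dev);
	if (dev->flags & SKETCH_BDEV_DYING)
		return -EIO;
	ctx->dev = dev;
	return 0;
}

/* Completion: on error, ask for a check of the device that was used. */
static void sketch_endio(struct sketch_bioctx *ctx, int status)
{
	if (status != 0 && ctx->dev)
		ctx->dev->flags |= SKETCH_CHECK_BDEV;
}
```

Recording the device at submit time means an I/O error marks exactly the
device that served the request, which matters once a target can span
more than one device.
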
 drivers/md/dm-zoned-target.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index a1f42af2877c..4897ffae96ca 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -17,7 +17,7 @@
  * Zone BIO context.
  */
 struct dmz_bioctx {
-	struct dmz_target	*target;
+	struct dmz_dev		*dev;
 	struct dm_zone		*zone;
 	struct bio		*bio;
 	refcount_t		ref;
@@ -81,7 +81,7 @@ static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
 	if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
 		bio->bi_status = status;
-	if (bio->bi_status != BLK_STS_OK)
-		bioctx->target->dev->flags |= DMZ_CHECK_BDEV;
+	if (bioctx->dev && bio->bi_status != BLK_STS_OK)
+		bioctx->dev->flags |= DMZ_CHECK_BDEV;
 
 	if (refcount_dec_and_test(&bioctx->ref)) {
 		struct dm_zone *zone = bioctx->zone;
@@ -119,13 +119,18 @@ static int dmz_submit_bio(struct dmz_target *dmz, struct dm_zone *zone,
 			  unsigned int nr_blocks)
 {
 	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
+	struct dmz_dev *dev = dmz_zone_to_dev(dmz->metadata, zone);
 	struct bio *clone;
 
+	if (dev->flags & DMZ_BDEV_DYING)
+		return -EIO;
+
 	clone = bio_clone_fast(bio, GFP_NOIO, &dmz->bio_set);
 	if (!clone)
 		return -ENOMEM;
 
-	bio_set_dev(clone, dmz->dev->bdev);
+	bio_set_dev(clone, dev->bdev);
+	bioctx->dev = dev;
 	clone->bi_iter.bi_sector =
 		dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
 	clone->bi_iter.bi_size = dmz_blk2sect(nr_blocks) << SECTOR_SHIFT;
@@ -397,11 +402,6 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
 
 	dmz_lock_metadata(zmd);
 
-	if (dmz->dev->flags & DMZ_BDEV_DYING) {
-		ret = -EIO;
-		goto out;
-	}
-
 	/*
 	 * Get the data zone mapping the chunk. There may be no
 	 * mapping for read and discard. If a mapping is obtained,
@@ -625,7 +625,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 {
 	struct dmz_target *dmz = ti->private;
 	struct dmz_metadata *zmd = dmz->metadata;
-	struct dmz_dev *dev = dmz->dev;
 	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
 	sector_t sector = bio->bi_iter.bi_sector;
 	unsigned int nr_sectors = bio_sectors(bio);
@@ -642,8 +641,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 		(unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
 		(unsigned int)dmz_bio_blocks(bio));
 
-	bio_set_dev(bio, dev->bdev);
-
 	if (!nr_sectors && bio_op(bio) != REQ_OP_WRITE)
 		return DM_MAPIO_REMAPPED;
 
@@ -652,7 +649,7 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 		return DM_MAPIO_KILL;
 
 	/* Initialize the BIO context */
-	bioctx->target = dmz;
+	bioctx->dev = NULL;
 	bioctx->zone = NULL;
 	bioctx->bio = bio;
 	refcount_set(&bioctx->ref, 1);
@@ -931,11 +928,12 @@ static void dmz_io_hints(struct dm_target *ti, struct queue_limits *limits)
 static int dmz_prepare_ioctl(struct dm_target *ti, struct block_device **bdev)
 {
 	struct dmz_target *dmz = ti->private;
+	struct dmz_dev *dev = &dmz->dev[0];
 
-	if (!dmz_check_bdev(dmz->dev))
+	if (!dmz_check_bdev(dev))
 		return -EIO;
 
-	*bdev = dmz->dev->bdev;
+	*bdev = dev->bdev;
 
 	return 0;
 }
-- 
2.16.4

* [PATCH 10/13] dm-zoned: use dmz_zone_to_dev() when handling metadata I/O
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (8 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 09/13] dm-zoned: replace 'target' pointer in the bio context Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 11/13] dm-zoned: add metadata logging functions Hannes Reinecke
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Use accessors to retrieve the device pointer in preparation
for adding an additional block device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/md/dm-zoned-metadata.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 426af738f1ca..312194be4cb0 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -1310,6 +1310,7 @@ static int dmz_update_zone_cb(struct blk_zone *blkz, unsigned int idx,
  */
 static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
+	struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 	unsigned int noio_flag;
 	int ret;
 
@@ -1320,16 +1321,16 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 	 * GFP_NOIO was specified.
 	 */
 	noio_flag = memalloc_noio_save();
-	ret = blkdev_report_zones(zmd->dev->bdev, dmz_start_sect(zmd, zone), 1,
+	ret = blkdev_report_zones(dev->bdev, dmz_start_sect(zmd, zone), 1,
 				  dmz_update_zone_cb, zone);
 	memalloc_noio_restore(noio_flag);
 
 	if (ret == 0)
 		ret = -EIO;
 	if (ret < 0) {
-		dmz_dev_err(zmd->dev, "Get zone %u report failed",
+		dmz_dev_err(dev, "Get zone %u report failed",
 			    zone->id);
-		dmz_check_bdev(zmd->dev);
+		dmz_check_bdev(dev);
 		return ret;
 	}
 
@@ -1343,6 +1344,7 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 static int dmz_handle_seq_write_err(struct dmz_metadata *zmd,
 				    struct dm_zone *zone)
 {
+	struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 	unsigned int wp = 0;
 	int ret;
 
@@ -1351,7 +1353,7 @@ static int dmz_handle_seq_write_err(struct dmz_metadata *zmd,
 	if (ret)
 		return ret;
 
-	dmz_dev_warn(zmd->dev, "Processing zone %u write error (zone wp %u/%u)",
+	dmz_dev_warn(dev, "Processing zone %u write error (zone wp %u/%u)",
 		     zone->id, zone->wp_block, wp);
 
 	if (zone->wp_block < wp) {
@@ -1384,7 +1386,7 @@ static int dmz_reset_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 		return 0;
 
 	if (!dmz_is_empty(zone) || dmz_seq_write_err(zone)) {
-		struct dmz_dev *dev = zmd->dev;
+		struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 
 		ret = blkdev_zone_mgmt(dev->bdev, REQ_OP_ZONE_RESET,
 				       dmz_start_sect(zmd, zone),
-- 
2.16.4

* [PATCH 11/13] dm-zoned: add metadata logging functions
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (9 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 10/13] dm-zoned: use dmz_zone_to_dev() when handling metadata I/O Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 12/13] dm-zoned: ignore metadata zone in dmz_alloc_zone() Hannes Reinecke
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

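Add dmz_zmd_{info,err,warn,debug}() logging helpers that prefix each
message with the metadata label, and use them in the metadata code
instead of logging against the underlying device.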

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
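
The prefixing trick used by the new macros (string-literal concatenation
of "(%s): " with the caller's format) can be exercised in userspace with
the sketch below. The sketch_* names and the buffer size are
hypothetical, snprintf() stands in for the DM logging macros, and
'##__VA_ARGS__' is a GNU extension accepted by gcc and clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical label size; the driver uses its own constant. */
#define SKETCH_NAME_SIZE 32

struct sketch_metadata {
	char devname[SKETCH_NAME_SIZE];
};

/*
 * Bounded copy of the label. Returns false if the name had to be
 * truncated to fit the fixed-size buffer.
 */
static bool sketch_set_label(struct sketch_metadata *zmd, const char *name)
{
	assert(zmd && name);
	snprintf(zmd->devname, sizeof(zmd->devname), "%s", name);
	return strlen(name) < sizeof(zmd->devname);
}

/*
 * Prefix every message with "(<label>): ". 'format' must be a string
 * literal so that it concatenates with the prefix at compile time.
 */
#define sketch_zmd_snprintf(buf, len, zmd, format, ...)		\
	snprintf((buf), (len), "(%s): " format,			\
		 (zmd)->devname, ##__VA_ARGS__)
```

With the label set to "dmz-test", formatting "Zone %u is offline" with
the argument 7 yields "(dmz-test): Zone 7 is offline". The message then
identifies the dm-zoned set as a whole, not one of its member devices.
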
 drivers/md/dm-zoned-metadata.c | 95 +++++++++++++++++++++++++-----------------
 1 file changed, 56 insertions(+), 39 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 312194be4cb0..77b9ea4bad74 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -194,6 +194,17 @@ struct dmz_metadata {
 	wait_queue_head_t	free_wq;
 };
 
+#define dmz_zmd_info(zmd, format, args...)	\
+	DMINFO("(%s): " format, (zmd)->devname, ## args)
+
+#define dmz_zmd_err(zmd, format, args...)	\
+	DMERR("(%s): " format, (zmd)->devname, ## args)
+
+#define dmz_zmd_warn(zmd, format, args...)	\
+	DMWARN("(%s): " format, (zmd)->devname, ## args)
+
+#define dmz_zmd_debug(zmd, format, args...)	\
+	DMDEBUG("(%s): " format, (zmd)->devname, ## args)
 /*
  * Various accessors
  */
@@ -1098,7 +1109,7 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 	int ret;
 
 	if (!zmd->sb[0].zone) {
-		dmz_dev_err(zmd->dev, "Primary super block zone not set");
+		dmz_zmd_err(zmd, "Primary super block zone not set");
 		return -ENXIO;
 	}
 
@@ -1135,7 +1146,7 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 
 	/* Use highest generation sb first */
 	if (!sb_good[0] && !sb_good[1]) {
-		dmz_dev_err(zmd->dev, "No valid super block found");
+		dmz_zmd_err(zmd, "No valid super block found");
 		return -EIO;
 	}
 
@@ -1248,7 +1259,7 @@ static void dmz_drop_zones(struct dmz_metadata *zmd)
  */
 static int dmz_init_zones(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = zmd->dev;
+	struct dmz_dev *dev = &zmd->dev[0];
 	int ret;
 
 	/* Init */
@@ -1268,8 +1279,8 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	if (!zmd->zones)
 		return -ENOMEM;
 
-	dmz_dev_info(dev, "Using %zu B for zone information",
-		     sizeof(struct dm_zone) * zmd->nr_zones);
+	DMINFO("(%s): Using %zu B for zone information",
+	       zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
 
 	/*
 	 * Get zone information and initialize zone descriptors.  At the same
@@ -1412,7 +1423,6 @@ static void dmz_get_zone_weight(struct dmz_metadata *zmd, struct dm_zone *zone);
  */
 static int dmz_load_mapping(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = zmd->dev;
 	struct dm_zone *dzone, *bzone;
 	struct dmz_mblock *dmap_mblk = NULL;
 	struct dmz_map *dmap;
@@ -1445,7 +1455,7 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 			goto next;
 
 		if (dzone_id >= zmd->nr_zones) {
-			dmz_dev_err(dev, "Chunk %u mapping: invalid data zone ID %u",
+			dmz_zmd_err(zmd, "Chunk %u mapping: invalid data zone ID %u",
 				    chunk, dzone_id);
 			return -EIO;
 		}
@@ -1466,14 +1476,14 @@ static int dmz_load_mapping(struct dmz_metadata *zmd)
 			goto next;
 
 		if (bzone_id >= zmd->nr_zones) {
-			dmz_dev_err(dev, "Chunk %u mapping: invalid buffer zone ID %u",
+			dmz_zmd_err(zmd, "Chunk %u mapping: invalid buffer zone ID %u",
 				    chunk, bzone_id);
 			return -EIO;
 		}
 
 		bzone = dmz_get(zmd, bzone_id);
 		if (!dmz_is_rnd(bzone)) {
-			dmz_dev_err(dev, "Chunk %u mapping: invalid buffer zone %u",
+			dmz_zmd_err(zmd, "Chunk %u mapping: invalid buffer zone %u",
 				    chunk, bzone_id);
 			return -EIO;
 		}
@@ -1893,7 +1903,7 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags)
 		atomic_dec(&zmd->unmap_nr_seq);
 
 	if (dmz_is_offline(zone)) {
-		dmz_dev_warn(zmd->dev, "Zone %u is offline", zone->id);
+		dmz_zmd_warn(zmd, "Zone %u is offline", zone->id);
 		zone = NULL;
 		goto again;
 	}
@@ -2104,7 +2114,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	struct dmz_mblock *mblk;
 	unsigned int n = 0;
 
-	dmz_dev_debug(zmd->dev, "=> VALIDATE zone %u, block %llu, %u blocks",
+	dmz_zmd_debug(zmd, "=> VALIDATE zone %u, block %llu, %u blocks",
 		      zone->id, (unsigned long long)chunk_block,
 		      nr_blocks);
 
@@ -2134,7 +2144,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	if (likely(zone->weight + n <= zone_nr_blocks))
 		zone->weight += n;
 	else {
-		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be <= %u",
+		dmz_zmd_warn(zmd, "Zone %u: weight %u should be <= %u",
 			     zone->id, zone->weight,
 			     zone_nr_blocks - n);
 		zone->weight = zone_nr_blocks;
@@ -2184,7 +2194,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	struct dmz_mblock *mblk;
 	unsigned int n = 0;
 
-	dmz_dev_debug(zmd->dev, "=> INVALIDATE zone %u, block %llu, %u blocks",
+	dmz_zmd_debug(zmd, "=> INVALIDATE zone %u, block %llu, %u blocks",
 		      zone->id, (u64)chunk_block, nr_blocks);
 
 	WARN_ON(chunk_block + nr_blocks > zmd->zone_nr_blocks);
@@ -2214,7 +2224,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
 	if (zone->weight >= n)
 		zone->weight -= n;
 	else {
-		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be >= %u",
+		dmz_zmd_warn(zmd, "Zone %u: weight %u should be >= %u",
 			     zone->id, zone->weight, n);
 		zone->weight = 0;
 	}
@@ -2424,7 +2434,7 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 	while (!list_empty(&zmd->mblk_dirty_list)) {
 		mblk = list_first_entry(&zmd->mblk_dirty_list,
 					struct dmz_mblock, link);
-		dmz_dev_warn(zmd->dev, "mblock %llu still in dirty list (ref %u)",
+		dmz_zmd_warn(zmd, "mblock %llu still in dirty list (ref %u)",
 			     (u64)mblk->no, mblk->ref);
 		list_del_init(&mblk->link);
 		rb_erase(&mblk->node, &zmd->mblk_rbtree);
@@ -2442,7 +2452,7 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 	/* Sanity checks: the mblock rbtree should now be empty */
 	root = &zmd->mblk_rbtree;
 	rbtree_postorder_for_each_entry_safe(mblk, next, root, node) {
-		dmz_dev_warn(zmd->dev, "mblock %llu ref %u still in rbtree",
+		dmz_zmd_warn(zmd, "mblock %llu ref %u still in rbtree",
 			     (u64)mblk->no, mblk->ref);
 		mblk->ref = 0;
 		dmz_free_mblock(zmd, mblk);
@@ -2455,6 +2465,18 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
 	mutex_destroy(&zmd->map_lock);
 }
 
+void dmz_print_dev(struct dmz_metadata *zmd, int num)
+{
+	struct dmz_dev *dev = &zmd->dev[num];
+
+	dmz_dev_info(dev, "Host-%s zoned block device",
+		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
+		     "aware" : "managed");
+	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
+		     (u64)dev->capacity);
+	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
+		     dev->nr_zones, (u64)zmd->zone_nr_sectors);
+}
 /*
  * Initialize the zoned metadata.
  */
@@ -2531,34 +2553,31 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 	/* Metadata cache shrinker */
 	ret = register_shrinker(&zmd->mblk_shrinker);
 	if (ret) {
-		dmz_dev_err(dev, "Register metadata cache shrinker failed");
+		dmz_zmd_err(zmd, "Register metadata cache shrinker failed");
 		goto err;
 	}
 
-	dmz_dev_info(dev, "Host-%s zoned block device",
-		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
-		     "aware" : "managed");
-	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
-		     (u64)dev->capacity);
-	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
+	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", DMZ_META_VER);
+	dmz_print_dev(zmd, 0);
+
+	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
 		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
-	dmz_dev_info(dev, "  %u metadata zones",
+	dmz_zmd_info(zmd, "  %u metadata zones",
 		     zmd->nr_meta_zones * 2);
-	dmz_dev_info(dev, "  %u data zones for %u chunks",
+	dmz_zmd_info(zmd, "  %u data zones for %u chunks",
 		     zmd->nr_data_zones, zmd->nr_chunks);
-	dmz_dev_info(dev, "    %u random zones (%u unmapped)",
+	dmz_zmd_info(zmd, "    %u random zones (%u unmapped)",
 		     zmd->nr_rnd, atomic_read(&zmd->unmap_nr_rnd));
-	dmz_dev_info(dev, "    %u sequential zones (%u unmapped)",
+	dmz_zmd_info(zmd, "    %u sequential zones (%u unmapped)",
 		     zmd->nr_seq, atomic_read(&zmd->unmap_nr_seq));
-	dmz_dev_info(dev, "  %u reserved sequential data zones",
+	dmz_zmd_info(zmd, "  %u reserved sequential data zones",
 		     zmd->nr_reserved_seq);
-
-	dmz_dev_debug(dev, "Format:");
-	dmz_dev_debug(dev, "%u metadata blocks per set (%u max cache)",
+	dmz_zmd_debug(zmd, "Format:");
+	dmz_zmd_debug(zmd, "%u metadata blocks per set (%u max cache)",
 		      zmd->nr_meta_blocks, zmd->max_nr_mblks);
-	dmz_dev_debug(dev, "  %u data zone mapping blocks",
+	dmz_zmd_debug(zmd, "  %u data zone mapping blocks",
 		      zmd->nr_map_blocks);
-	dmz_dev_debug(dev, "  %u bitmap blocks",
+	dmz_zmd_debug(zmd, "  %u bitmap blocks",
 		      zmd->nr_bitmap_blocks);
 
 	*metadata = zmd;
@@ -2587,7 +2606,6 @@ void dmz_dtr_metadata(struct dmz_metadata *zmd)
  */
 int dmz_resume_metadata(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = zmd->dev;
 	struct dm_zone *zone;
 	sector_t wp_block;
 	unsigned int i;
@@ -2597,20 +2615,19 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 	for (i = 0; i < zmd->nr_zones; i++) {
 		zone = dmz_get(zmd, i);
 		if (!zone) {
-			dmz_dev_err(dev, "Unable to get zone %u", i);
+			dmz_zmd_err(zmd, "Unable to get zone %u", i);
 			return -EIO;
 		}
-
 		wp_block = zone->wp_block;
 
 		ret = dmz_update_zone(zmd, zone);
 		if (ret) {
-			dmz_dev_err(dev, "Broken zone %u", i);
+			dmz_zmd_err(zmd, "Broken zone %u", i);
 			return ret;
 		}
 
 		if (dmz_is_offline(zone)) {
-			dmz_dev_warn(dev, "Zone %u is offline", i);
+			dmz_zmd_warn(zmd, "Zone %u is offline", i);
 			continue;
 		}
 
@@ -2618,7 +2635,7 @@ int dmz_resume_metadata(struct dmz_metadata *zmd)
 		if (!dmz_is_seq(zone))
 			zone->wp_block = 0;
 		else if (zone->wp_block != wp_block) {
-			dmz_dev_err(dev, "Zone %u: Invalid wp (%llu / %llu)",
+			dmz_zmd_err(zmd, "Zone %u: Invalid wp (%llu / %llu)",
 				    i, (u64)zone->wp_block, (u64)wp_block);
 			zone->wp_block = wp_block;
 			dmz_invalidate_blocks(zmd, zone, zone->wp_block,
-- 
2.16.4


* [PATCH 12/13] dm-zoned: ignore metadata zone in dmz_alloc_zone()
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (10 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 11/13] dm-zoned: add metadata logging functions Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-20 10:08 ` [PATCH 13/13] dm-zoned: metadata version 2 Hannes Reinecke
  2020-04-22  0:42 ` [PATCHv4 00/13] " Damien Le Moal
  13 siblings, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

When looking up zones in dmz_alloc_zone() we need to ignore
metadata zones so as not to accidentally overwrite metadata.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
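The check described above can be sketched in isolation as follows. This
is a minimal userspace illustration, not the driver code: struct zone
and pick_data_zone() are hypothetical names, and the real
dmz_alloc_zone() walks the unmapped zone lists and retries via
"goto again" rather than scanning an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified zone descriptor; not the driver's struct dm_zone. */
struct zone {
	unsigned int id;
	bool offline;	/* zone cannot be used at all */
	bool meta;	/* zone holds dm-zoned metadata */
};

/*
 * Return the first candidate that is safe to hand out as a data zone,
 * skipping offline zones and zones that hold metadata.
 * Returns NULL when no candidate is usable.
 */
static struct zone *pick_data_zone(struct zone *candidates, size_t nr)
{
	assert(candidates || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		struct zone *z = &candidates[i];

		if (z->offline || z->meta)
			continue;
		return z;
	}
	return NULL;
}
```

Skipping the zone instead of failing the allocation keeps the caller's
contract unchanged: it either gets a usable zone or NULL.
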
 drivers/md/dm-zoned-metadata.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 77b9ea4bad74..c009f2d962e2 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -1907,7 +1907,13 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags)
 		zone = NULL;
 		goto again;
 	}
+	if (dmz_is_meta(zone)) {
+		struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
 
+		dmz_dev_warn(dev, "Zone %u has metadata", zone->id);
+		zone = NULL;
+		goto again;
+	}
 	return zone;
 }
 
-- 
2.16.4


* [PATCH 13/13] dm-zoned: metadata version 2
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (11 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 12/13] dm-zoned: ignore metadata zone in dmz_alloc_zone() Hannes Reinecke
@ 2020-04-20 10:08 ` Hannes Reinecke
  2020-04-28 10:54   ` Damien Le Moal
  2020-04-22  0:42 ` [PATCHv4 00/13] " Damien Le Moal
  13 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-20 10:08 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Implement handling for metadata version 2. The new metadata adds
a label and a UUID for the device-mapper device, and an additional
UUID for each underlying block device.
It also allows an additional regular drive to be used for
emulating random access zones. The emulated zones are placed
logically in front of the zones from the zoned block device, so
the superblocks and metadata are stored on the regular device.
The first zone of the zoned device holds another, tertiary
copy of the superblock; this copy carries a generation number
of 0, is never updated, and is used only for identification.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
---
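As a sanity check of the version 2 super block layout, the sketch below
mirrors the structure in userspace and verifies the byte offsets at
compile time. The three new fields and the 400-byte padding come from
this patch; the fields before crc do not appear in this hunk and are
assumed here from the existing driver layout, and struct dmz_super_v2 is
a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the v2 on-disk super block (little-endian on disk). */
struct dmz_super_v2 {
	uint32_t magic;			/* assumed pre-existing field */
	uint32_t version;		/* assumed pre-existing field */
	uint64_t gen;			/* assumed pre-existing field */
	uint64_t sb_block;		/* assumed pre-existing field */
	uint32_t nr_meta_blocks;	/* assumed pre-existing field */
	uint32_t nr_reserved_seq;	/* assumed pre-existing field */
	uint32_t nr_chunks;		/* assumed pre-existing field */
	uint32_t nr_map_blocks;		/* assumed pre-existing field */
	uint32_t nr_bitmap_blocks;	/* assumed pre-existing field */
	uint32_t crc;			/* ends at byte  48 */
	uint8_t  dmz_label[32];		/* ends at byte  80 */
	uint8_t  dmz_uuid[16];		/* ends at byte  96 */
	uint8_t  dev_uuid[16];		/* ends at byte 112 */
	uint8_t  reserved[400];		/* ends at byte 512 */
};

/* The new fields start right after the checksum and keep the 512 B size. */
static_assert(offsetof(struct dmz_super_v2, dmz_label) == 48, "label offset");
static_assert(offsetof(struct dmz_super_v2, dmz_uuid) == 80, "uuid offset");
static_assert(offsetof(struct dmz_super_v2, dev_uuid) == 96, "dev uuid offset");
static_assert(offsetof(struct dmz_super_v2, reserved) == 112, "padding offset");
static_assert(sizeof(struct dmz_super_v2) == 512, "super block is one sector");
```

The padding shrinks from 464 to 400 bytes because the label and the two
UUIDs together take 32 + 16 + 16 = 64 bytes.
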
 drivers/md/dm-zoned-metadata.c | 314 ++++++++++++++++++++++++++++++++++-------
 drivers/md/dm-zoned-target.c   | 156 +++++++++++++-------
 drivers/md/dm-zoned.h          |  12 +-
 3 files changed, 373 insertions(+), 109 deletions(-)
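
Because the emulated zones sit logically in front of the zoned device,
a global zone ID has to be translated into a device and a device-local
zone before a sector address can be computed. A minimal userspace
sketch of that translation, assuming at most two devices; struct layout
and the helper names are hypothetical and only loosely follow
dmz_zone_to_dev(), dmz_dev_zone_id() and dmz_start_sect():

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the device set; zone_offset[0] is always 0. */
struct layout {
	unsigned int nr_devs;			/* 1 or 2 */
	unsigned int zone_offset[2];		/* first global zone ID per device */
	unsigned int zone_sectors_shift;	/* log2(sectors per zone) */
};

/* Index of the device that holds the given global zone ID. */
static unsigned int zone_to_dev_index(const struct layout *l,
				      unsigned int zone_id)
{
	assert(l && l->nr_devs >= 1 && l->nr_devs <= 2);

	if (l->nr_devs > 1 && zone_id >= l->zone_offset[1])
		return 1;
	return 0;
}

/* Zone number relative to the start of the owning device. */
static unsigned int dev_zone_id(const struct layout *l, unsigned int zone_id)
{
	return zone_id - l->zone_offset[zone_to_dev_index(l, zone_id)];
}

/* Start sector of the zone on its own device; widen before shifting. */
static uint64_t zone_start_sector(const struct layout *l, unsigned int zone_id)
{
	return (uint64_t)dev_zone_id(l, zone_id) << l->zone_sectors_shift;
}
```

With 100 emulated zones in front, global zone 100 is zone 0 of the
zoned device and therefore starts at sector 0 of that device.
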

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index c009f2d962e2..1f31635aba73 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -16,7 +16,7 @@
 /*
  * Metadata version.
  */
-#define DMZ_META_VER	1
+#define DMZ_META_VER	2
 
 /*
  * On-disk super block magic.
@@ -69,8 +69,17 @@ struct dmz_super {
 	/* Checksum */
 	__le32		crc;			/*  48 */
 
+	/* DM-Zoned label */
+	u8		dmz_label[32];		/*  80 */
+
+	/* DM-Zoned UUID */
+	u8		dmz_uuid[16];		/*  96 */
+
+	/* Device UUID */
+	u8		dev_uuid[16];		/* 112 */
+
 	/* Padding to full 512B sector */
-	u8		reserved[464];		/* 512 */
+	u8		reserved[400];		/* 512 */
 };
 
 /*
@@ -133,8 +142,11 @@ struct dmz_sb {
  */
 struct dmz_metadata {
 	struct dmz_dev		*dev;
+	unsigned int		nr_devs;
 
 	char			devname[BDEVNAME_SIZE];
+	char			label[BDEVNAME_SIZE];
+	uuid_t			uuid;
 
 	sector_t		zone_bitmap_size;
 	unsigned int		zone_nr_bitmap_blocks;
@@ -161,8 +173,9 @@ struct dmz_metadata {
 	/* Zone information array */
 	struct dm_zone		*zones;
 
-	struct dmz_sb		sb[2];
+	struct dmz_sb		sb[3];
 	unsigned int		mblk_primary;
+	unsigned int		sb_version;
 	u64			sb_gen;
 	unsigned int		min_nr_mblks;
 	unsigned int		max_nr_mblks;
@@ -195,31 +208,56 @@ struct dmz_metadata {
 };
 
 #define dmz_zmd_info(zmd, format, args...)	\
-	DMINFO("(%s): " format, (zmd)->devname, ## args)
+	DMINFO("(%s): " format, (zmd)->label, ## args)
 
 #define dmz_zmd_err(zmd, format, args...)	\
-	DMERR("(%s): " format, (zmd)->devname, ## args)
+	DMERR("(%s): " format, (zmd)->label, ## args)
 
 #define dmz_zmd_warn(zmd, format, args...)	\
-	DMWARN("(%s): " format, (zmd)->devname, ## args)
+	DMWARN("(%s): " format, (zmd)->label, ## args)
 
 #define dmz_zmd_debug(zmd, format, args...)	\
-	DMDEBUG("(%s): " format, (zmd)->devname, ## args)
+	DMDEBUG("(%s): " format, (zmd)->label, ## args)
 /*
  * Various accessors
  */
+unsigned int dmz_dev_zone_id(struct dmz_metadata *zmd, struct dm_zone *zone)
+{
+	unsigned int zone_id;
+
+	if (WARN_ON(!zone))
+		return 0;
+
+	zone_id = zone->id;
+	if (zmd->nr_devs > 1 &&
+	    (zone_id >= zmd->dev[1].zone_offset))
+		zone_id -= zmd->dev[1].zone_offset;
+	return zone_id;
+}
+
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->zone_nr_sectors_shift;
+	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
+
+	return (sector_t)zone_id << zmd->zone_nr_sectors_shift;
 }
 
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return (sector_t)zone->id << zmd->zone_nr_blocks_shift;
+	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
+
+	return (sector_t)zone_id << zmd->zone_nr_blocks_shift;
 }
 
 struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
+	if (WARN_ON(!zone))
+		return &zmd->dev[0];
+
+	if (zmd->nr_devs > 1 &&
+	    zone->id >= zmd->dev[1].zone_offset)
+		return &zmd->dev[1];
+
 	return &zmd->dev[0];
 }
 
@@ -275,17 +313,33 @@ unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
 
 const char *dmz_metadata_label(struct dmz_metadata *zmd)
 {
-	return (const char *)zmd->devname;
+	return (const char *)zmd->label;
 }
 
 bool dmz_check_dev(struct dmz_metadata *zmd)
 {
-	return dmz_check_bdev(&zmd->dev[0]);
+	unsigned int i;
+
+	for (i = 0; i < zmd->nr_devs; i++) {
+		if (!zmd->dev[i].bdev)
+			continue;
+		if (!dmz_check_bdev(&zmd->dev[i]))
+			return false;
+	}
+	return true;
 }
 
 bool dmz_dev_is_dying(struct dmz_metadata *zmd)
 {
-	return dmz_bdev_is_dying(&zmd->dev[0]);
+	unsigned int i;
+
+	for (i = 0; i < zmd->nr_devs; i++) {
+		if (!zmd->dev[i].bdev)
+			continue;
+		if (dmz_bdev_is_dying(&zmd->dev[i]))
+			return true;
+	}
+	return false;
 }
 
 /*
@@ -687,6 +741,9 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
 	struct bio *bio;
 	int ret;
 
+	if (WARN_ON(!dev))
+		return -EIO;
+
 	if (dmz_bdev_is_dying(dev))
 		return -EIO;
 
@@ -711,7 +768,8 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
  */
 static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 {
-	sector_t block = zmd->sb[set].block;
+	sector_t sb_block =
+		zmd->sb[set].zone->id << zmd->zone_nr_blocks_shift;
 	struct dmz_mblock *mblk = zmd->sb[set].mblk;
 	struct dmz_super *sb = zmd->sb[set].sb;
 	struct dmz_dev *dev = zmd->sb[set].dev;
@@ -719,11 +777,18 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	int ret;
 
 	sb->magic = cpu_to_le32(DMZ_MAGIC);
-	sb->version = cpu_to_le32(DMZ_META_VER);
+
+	sb->version = cpu_to_le32(zmd->sb_version);
+	if (zmd->sb_version > 1) {
+		BUILD_BUG_ON(UUID_SIZE != 16);
+		memcpy(sb->dmz_uuid, &zmd->uuid, UUID_SIZE);
+		memcpy(sb->dmz_label, zmd->label, BDEVNAME_SIZE);
+		memcpy(sb->dev_uuid, &dev->uuid, UUID_SIZE);
+	}
 
 	sb->gen = cpu_to_le64(sb_gen);
 
-	sb->sb_block = cpu_to_le64(block);
+	sb->sb_block = cpu_to_le64(sb_block);
 	sb->nr_meta_blocks = cpu_to_le32(zmd->nr_meta_blocks);
 	sb->nr_reserved_seq = cpu_to_le32(zmd->nr_reserved_seq);
 	sb->nr_chunks = cpu_to_le32(zmd->nr_chunks);
@@ -734,7 +799,8 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	sb->crc = 0;
 	sb->crc = cpu_to_le32(crc32_le(sb_gen, (unsigned char *)sb, DMZ_BLOCK_SIZE));
 
-	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, block, mblk->page);
+	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, zmd->sb[set].block,
+			     mblk->page);
 	if (ret == 0)
 		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
 
@@ -915,6 +981,23 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 	u32 crc, stored_crc;
 	u64 gen;
 
+	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
+		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
+			    DMZ_MAGIC, le32_to_cpu(sb->magic));
+		return -ENXIO;
+	}
+
+	zmd->sb_version = le32_to_cpu(sb->version);
+	if (zmd->sb_version > DMZ_META_VER) {
+		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
+			    DMZ_META_VER, zmd->sb_version);
+		return -EINVAL;
+	}
+	if ((zmd->sb_version < 2) && (set == 2)) {
+		dmz_dev_err(dev, "Tertiary superblocks are not supported");
+		return -EINVAL;
+	}
+
 	gen = le64_to_cpu(sb->gen);
 	stored_crc = le32_to_cpu(sb->crc);
 	sb->crc = 0;
@@ -925,18 +1008,44 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
 		return -ENXIO;
 	}
 
-	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
-		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
-			    DMZ_MAGIC, le32_to_cpu(sb->magic));
-		return -ENXIO;
-	}
+	if (zmd->sb_version > 1) {
+		uuid_t sb_uuid;
+
+		memcpy(&sb_uuid, sb->dmz_uuid, UUID_SIZE);
+		if (uuid_is_null(&sb_uuid)) {
+			dmz_dev_err(dev, "NULL DM-Zoned uuid");
+			return -ENXIO;
+		} else if (uuid_is_null(&zmd->uuid)) {
+			uuid_copy(&zmd->uuid, &sb_uuid);
+		} else if (!uuid_equal(&zmd->uuid, &sb_uuid)) {
+			dmz_dev_err(dev, "mismatching DM-Zoned uuid, "
+				    "is %pUl expected %pUl",
+				    &sb_uuid, &zmd->uuid);
+			return -ENXIO;
+		}
+		if (!strlen(zmd->label))
+			memcpy(zmd->label, sb->dmz_label, BDEVNAME_SIZE);
+		else if (memcmp(zmd->label, sb->dmz_label, BDEVNAME_SIZE)) {
+			dmz_dev_err(dev, "mismatching DM-Zoned label, "
+				    "is %s expected %s",
+				    sb->dmz_label, zmd->label);
+			return -ENXIO;
+		}
+		memcpy(&dev->uuid, sb->dev_uuid, UUID_SIZE);
+		if (uuid_is_null(&dev->uuid)) {
+			dmz_dev_err(dev, "NULL device uuid");
+			return -ENXIO;
+		}
 
-	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
-		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
-			    DMZ_META_VER, le32_to_cpu(sb->version));
-		return -ENXIO;
+		if (set == 2) {
+			if (gen != 0) {
+				dmz_dev_err(dev, "Invalid generation %llu",
+					    gen);
+				return -ENXIO;
+			}
+			return 0;
+		}
 	}
-
 	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + zmd->zone_nr_blocks - 1)
 		>> zmd->zone_nr_blocks_shift;
 	if (!nr_meta_zones ||
@@ -1185,21 +1294,38 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 		      "Using super block %u (gen %llu)",
 		      zmd->mblk_primary, zmd->sb_gen);
 
+	if ((zmd->sb_version > 1) && zmd->sb[2].zone) {
+		zmd->sb[2].block = dmz_start_block(zmd, zmd->sb[2].zone);
+		zmd->sb[2].dev = dmz_zone_to_dev(zmd, zmd->sb[2].zone);
+		ret = dmz_get_sb(zmd, 2);
+		if (ret) {
+			dmz_dev_err(zmd->sb[2].dev,
+				    "Read tertiary super block failed");
+			return ret;
+		}
+		ret = dmz_check_sb(zmd, 2);
+		if (ret == -EINVAL)
+			return ret;
+	}
 	return 0;
 }
 
 /*
  * Initialize a zone descriptor.
  */
-static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
+static int dmz_init_zone(struct blk_zone *blkz, unsigned int num, void *data)
 {
 	struct dmz_metadata *zmd = data;
+	struct dmz_dev *dev = zmd->nr_devs > 1 ? &zmd->dev[1] : &zmd->dev[0];
+	int idx = num + dev->zone_offset;
 	struct dm_zone *zone = &zmd->zones[idx];
-	struct dmz_dev *dev = zmd->dev;
 
-	/* Ignore the eventual last runt (smaller) zone */
 	if (blkz->len != zmd->zone_nr_sectors) {
-		if (blkz->start + blkz->len == dev->capacity)
+		if (zmd->sb_version > 1) {
+			/* Ignore the eventual runt (smaller) zone */
+			set_bit(DMZ_OFFLINE, &zone->flags);
+			return 0;
+		} else if (blkz->start + blkz->len == dev->capacity)
 			return 0;
 		return -ENXIO;
 	}
@@ -1234,16 +1360,46 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 		zmd->nr_useable_zones++;
 		if (dmz_is_rnd(zone)) {
 			zmd->nr_rnd_zones++;
-			if (!zmd->sb[0].zone) {
-				/* Super block zone */
+			if (zmd->nr_devs == 1 && !zmd->sb[0].zone) {
+				/* Primary super block zone */
 				zmd->sb[0].zone = zone;
 			}
 		}
+		if (zmd->nr_devs > 1 && !zmd->sb[2].zone) {
+			/* Tertiary superblock zone */
+			zmd->sb[2].zone = zone;
+		}
 	}
 
 	return 0;
 }
 
+static void dmz_emulate_zones(struct dmz_metadata *zmd, struct dmz_dev *dev)
+{
+	int idx;
+	sector_t zone_offset = 0;
+
+	for (idx = 0; idx < dev->nr_zones; idx++) {
+		struct dm_zone *zone = &zmd->zones[idx];
+
+		INIT_LIST_HEAD(&zone->link);
+		atomic_set(&zone->refcount, 0);
+		zone->id = idx;
+		zone->chunk = DMZ_MAP_UNMAPPED;
+		set_bit(DMZ_RND, &zone->flags);
+		zone->wp_block = 0;
+		zmd->nr_rnd_zones++;
+		zmd->nr_useable_zones++;
+		if (dev->capacity - zone_offset <
+		    zmd->zone_nr_sectors) {
+			/* Disable runt zone */
+			set_bit(DMZ_OFFLINE, &zone->flags);
+			break;
+		}
+		zone_offset += zmd->zone_nr_sectors;
+	}
+}
+
 /*
  * Free zones descriptors.
  */
@@ -1259,11 +1415,11 @@ static void dmz_drop_zones(struct dmz_metadata *zmd)
  */
 static int dmz_init_zones(struct dmz_metadata *zmd)
 {
-	struct dmz_dev *dev = &zmd->dev[0];
-	int ret;
+	int i, ret;
+	struct dmz_dev *zoned_dev = &zmd->dev[0];
 
 	/* Init */
-	zmd->zone_nr_sectors = dev->zone_nr_sectors;
+	zmd->zone_nr_sectors = zmd->dev[0].zone_nr_sectors;
 	zmd->zone_nr_sectors_shift = ilog2(zmd->zone_nr_sectors);
 	zmd->zone_nr_blocks = dmz_sect2blk(zmd->zone_nr_sectors);
 	zmd->zone_nr_blocks_shift = ilog2(zmd->zone_nr_blocks);
@@ -1274,7 +1430,14 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 					DMZ_BLOCK_SIZE_BITS);
 
 	/* Allocate zone array */
-	zmd->nr_zones = dev->nr_zones;
+	zmd->nr_zones = 0;
+	for (i = 0; i < zmd->nr_devs; i++)
+		zmd->nr_zones += zmd->dev[i].nr_zones;
+
+	if (!zmd->nr_zones) {
+		DMERR("(%s): No zones found", zmd->devname);
+		return -ENXIO;
+	}
 	zmd->zones = kcalloc(zmd->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
 	if (!zmd->zones)
 		return -ENOMEM;
@@ -1282,14 +1445,27 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
 	DMINFO("(%s): Using %zu B for zone information",
 	       zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
 
+	if (zmd->nr_devs > 1) {
+		dmz_emulate_zones(zmd, &zmd->dev[0]);
+		/*
+		 * Primary superblock zone is always at zone 0 when multiple
+		 * drives are present.
+		 */
+		zmd->sb[0].zone = &zmd->zones[0];
+
+		zoned_dev = &zmd->dev[1];
+	}
+
 	/*
 	 * Get zone information and initialize zone descriptors.  At the same
 	 * time, determine where the super block should be: first block of the
 	 * first randomly writable zone.
 	 */
-	ret = blkdev_report_zones(dev->bdev, 0, BLK_ALL_ZONES, dmz_init_zone,
-				  zmd);
+	ret = blkdev_report_zones(zoned_dev->bdev, 0, BLK_ALL_ZONES,
+				  dmz_init_zone, zmd);
 	if (ret < 0) {
+		DMDEBUG("(%s): Failed to report zones, error %d",
+			zmd->devname, ret);
 		dmz_drop_zones(zmd);
 		return ret;
 	}
@@ -1325,6 +1501,9 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 	unsigned int noio_flag;
 	int ret;
 
+	if (dev->flags & DMZ_BDEV_REGULAR)
+		return 0;
+
 	/*
 	 * Get zone information from disk. Since blkdev_report_zones() uses
 	 * GFP_KERNEL by default for memory allocations, set the per-task
@@ -2475,18 +2654,34 @@ void dmz_print_dev(struct dmz_metadata *zmd, int num)
 {
 	struct dmz_dev *dev = &zmd->dev[num];
 
-	dmz_dev_info(dev, "Host-%s zoned block device",
-		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
-		     "aware" : "managed");
-	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
-		     (u64)dev->capacity);
-	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
-		     dev->nr_zones, (u64)zmd->zone_nr_sectors);
+	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE)
+		dmz_dev_info(dev, "Regular block device");
+	else
+		dmz_dev_info(dev, "Host-%s zoned block device",
+			     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
+			     "aware" : "managed");
+	if (zmd->sb_version > 1) {
+		sector_t sector_offset =
+			dev->zone_offset << zmd->zone_nr_sectors_shift;
+
+		dmz_dev_info(dev, "  uuid %pUl", &dev->uuid);
+		dmz_dev_info(dev, "  %llu 512-byte logical sectors (offset %llu)",
+			     (u64)dev->capacity, (u64)sector_offset);
+		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors (offset %llu)",
+			     dev->nr_zones, (u64)zmd->zone_nr_sectors,
+			     (u64)dev->zone_offset);
+	} else {
+		dmz_dev_info(dev, "  %llu 512-byte logical sectors",
+			     (u64)dev->capacity);
+		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
+			     dev->nr_zones, (u64)zmd->zone_nr_sectors);
+	}
 }
 /*
  * Initialize the zoned metadata.
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
+int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
+		     struct dmz_metadata **metadata,
 		     const char *devname)
 {
 	struct dmz_metadata *zmd;
@@ -2500,6 +2695,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 
 	strcpy(zmd->devname, devname);
 	zmd->dev = dev;
+	zmd->nr_devs = num_dev;
 	zmd->mblk_rbtree = RB_ROOT;
 	init_rwsem(&zmd->mblk_sem);
 	mutex_init(&zmd->mblk_flush_lock);
@@ -2534,11 +2730,24 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 	/* Set metadata zones starting from sb_zone */
 	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
 		zone = dmz_get(zmd, zmd->sb[0].zone->id + i);
-		if (!dmz_is_rnd(zone))
+		if (!dmz_is_rnd(zone)) {
+			dmz_zmd_err(zmd,
+				    "metadata zone %d is not random", i);
+			ret = -ENXIO;
 			goto err;
+		}
+		set_bit(DMZ_META, &zone->flags);
+	}
+	if (zmd->sb[2].zone) {
+		zone = dmz_get(zmd, zmd->sb[2].zone->id);
+		if (!zone) {
+			dmz_zmd_err(zmd,
+				    "Tertiary metadata zone not present");
+			ret = -ENXIO;
+			goto err;
+		}
 		set_bit(DMZ_META, &zone->flags);
 	}
-
 	/* Load mapping table */
 	ret = dmz_load_mapping(zmd);
 	if (ret)
@@ -2563,8 +2772,13 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
 		goto err;
 	}
 
-	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", DMZ_META_VER);
-	dmz_print_dev(zmd, 0);
+	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", zmd->sb_version);
+	if (zmd->sb_version > 1) {
+		dmz_zmd_info(zmd, "DM UUID %pUl", &zmd->uuid);
+		dmz_zmd_info(zmd, "DM Label %s", zmd->label);
+	}
+	for (i = 0; i < zmd->nr_devs; i++)
+		dmz_print_dev(zmd, i);
 
 	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
 		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 4897ffae96ca..ae05d5d60b37 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -38,7 +38,7 @@ struct dm_chunk_work {
  * Target descriptor.
  */
 struct dmz_target {
-	struct dm_dev		*ddev;
+	struct dm_dev		*ddev[2];
 
 	unsigned long		flags;
 
@@ -684,60 +684,40 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 /*
  * Get zoned device information.
  */
-static int dmz_get_zoned_device(struct dm_target *ti, char *path)
+static int dmz_get_zoned_device(struct dm_target *ti, char *path, int num)
 {
 	struct dmz_target *dmz = ti->private;
-	struct request_queue *q;
 	struct dmz_dev *dev;
-	sector_t aligned_capacity;
 	int ret;
+	struct block_device *bdev;
 
 	/* Get the target device */
-	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &dmz->ddev);
+	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table),
+			    &dmz->ddev[num]);
 	if (ret) {
 		ti->error = "Get target device failed";
-		dmz->ddev = NULL;
+		dmz->ddev[num] = NULL;
 		return ret;
 	}
 
-	dev = kzalloc(sizeof(struct dmz_dev), GFP_KERNEL);
-	if (!dev) {
-		ret = -ENOMEM;
-		goto err;
-	}
-
-	dev->bdev = dmz->ddev->bdev;
+	bdev = dmz->ddev[num]->bdev;
+	if (bdev_zoned_model(bdev) == BLK_ZONED_NONE) {
+		dev = &dmz->dev[0];
+		dev->flags = DMZ_BDEV_REGULAR;
+	} else
+		dev = &dmz->dev[1];
+	dev->bdev = bdev;
 	(void)bdevname(dev->bdev, dev->name);
 
-	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE) {
-		ti->error = "Not a zoned block device";
-		ret = -EINVAL;
-		goto err;
-	}
-
-	q = bdev_get_queue(dev->bdev);
 	dev->capacity = i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
-	aligned_capacity = dev->capacity &
-				~((sector_t)blk_queue_zone_sectors(q) - 1);
-	if (ti->begin ||
-	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
-		ti->error = "Partial mapping not supported";
-		ret = -EINVAL;
-		goto err;
+	if (ti->begin) {
+		ti->error = "Partial mapping is not supported";
+		dm_put_device(ti, dmz->ddev[num]);
+		dmz->ddev[num] = NULL;
+		return -EINVAL;
 	}
 
-	dev->zone_nr_sectors = blk_queue_zone_sectors(q);
-
-	dev->nr_zones = blkdev_nr_zones(dev->bdev->bd_disk);
-
-	dmz->dev = dev;
-
 	return 0;
-err:
-	dm_put_device(ti, dmz->ddev);
-	kfree(dev);
-
-	return ret;
 }
 
 /*
@@ -747,9 +727,46 @@ static void dmz_put_zoned_device(struct dm_target *ti)
 {
 	struct dmz_target *dmz = ti->private;
 
-	dm_put_device(ti, dmz->ddev);
-	kfree(dmz->dev);
-	dmz->dev = NULL;
+	if (dmz->ddev[1]) {
+		dm_put_device(ti, dmz->ddev[1]);
+		dmz->ddev[1] = NULL;
+	}
+	dm_put_device(ti, dmz->ddev[0]);
+	dmz->ddev[0] = NULL;
+}
+
+static int dmz_fixup_devices(struct dm_target *ti)
+{
+	struct dmz_target *dmz = ti->private;
+	struct dmz_dev *pri_dev, *sec_dev;
+	struct request_queue *q;
+
+	pri_dev = &dmz->dev[0];
+	if (!(pri_dev->flags & DMZ_BDEV_REGULAR)) {
+		ti->error = "Primary disk is not a regular device";
+		return -EINVAL;
+	}
+	sec_dev = &dmz->dev[1];
+	if (sec_dev->flags & DMZ_BDEV_REGULAR) {
+		ti->error = "Secondary disk is not a zoned device";
+		return -EINVAL;
+	}
+	q = bdev_get_queue(sec_dev->bdev);
+	sec_dev->zone_nr_sectors = blk_queue_zone_sectors(q);
+	sec_dev->nr_zones = blkdev_nr_zones(sec_dev->bdev->bd_disk);
+
+	pri_dev->zone_nr_sectors = sec_dev->zone_nr_sectors;
+	pri_dev->nr_zones = DIV_ROUND_UP(pri_dev->capacity,
+					 pri_dev->zone_nr_sectors);
+	sec_dev->zone_offset = pri_dev->nr_zones;
+	/* Check if we need to swizzle devices */
+	if (pri_dev->bdev != dmz->ddev[0]->bdev) {
+		struct dm_dev *ddev = dmz->ddev[0];
+
+		dmz->ddev[0] = dmz->ddev[1];
+		dmz->ddev[1] = ddev;
+	}
+	return 0;
 }
 
 /*
@@ -758,11 +775,10 @@ static void dmz_put_zoned_device(struct dm_target *ti)
 static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 {
 	struct dmz_target *dmz;
-	struct dmz_dev *dev;
 	int ret;
 
 	/* Check arguments */
-	if (argc != 1) {
+	if (argc < 1 || argc > 2) {
 		ti->error = "Invalid argument count";
 		return -EINVAL;
 	}
@@ -773,18 +789,34 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		ti->error = "Unable to allocate the zoned target descriptor";
 		return -ENOMEM;
 	}
+	dmz->dev = kcalloc(2, sizeof(struct dmz_dev), GFP_KERNEL);
+	if (!dmz->dev) {
+		ti->error = "Unable to allocate the zoned device descriptors";
+		kfree(dmz);
+		return -ENOMEM;
+	}
 	ti->private = dmz;
 
 	/* Get the target zoned block device */
-	ret = dmz_get_zoned_device(ti, argv[0]);
-	if (ret) {
-		dmz->ddev = NULL;
+	ret = dmz_get_zoned_device(ti, argv[0], 0);
+	if (ret)
 		goto err;
+
+	if (argc == 2) {
+		ret = dmz_get_zoned_device(ti, argv[1], 1);
+		if (ret) {
+			dmz_put_zoned_device(ti);
+			goto err;
+		}
+		ret = dmz_fixup_devices(ti);
+		if (ret) {
+			dmz_put_zoned_device(ti);
+			goto err;
+		}
 	}
 
 	/* Initialize metadata */
-	dev = dmz->dev;
-	ret = dmz_ctr_metadata(dev, &dmz->metadata,
+	ret = dmz_ctr_metadata(dmz->dev, argc, &dmz->metadata,
 			       dm_table_device_name(ti->table));
 	if (ret) {
 		ti->error = "Metadata initialization failed";
@@ -861,6 +893,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 err_dev:
 	dmz_put_zoned_device(ti);
 err:
+	kfree(dmz->dev);
 	kfree(dmz);
 
 	return ret;
@@ -891,6 +924,7 @@ static void dmz_dtr(struct dm_target *ti)
 
 	mutex_destroy(&dmz->chunk_lock);
 
+	kfree(dmz->dev);
 	kfree(dmz);
 }
 
@@ -965,10 +999,17 @@ static int dmz_iterate_devices(struct dm_target *ti,
 			       iterate_devices_callout_fn fn, void *data)
 {
 	struct dmz_target *dmz = ti->private;
-	struct dmz_dev *dev = dmz->dev;
-	sector_t capacity = dev->capacity & ~(dmz_zone_nr_sectors(dmz->metadata) - 1);
-
-	return fn(ti, dmz->ddev, 0, capacity, data);
+	unsigned int zone_nr_sectors = dmz_zone_nr_sectors(dmz->metadata);
+	sector_t capacity;
+	int r;
+
+	capacity = dmz->dev[0].capacity & ~(zone_nr_sectors - 1);
+	r = fn(ti, dmz->ddev[0], 0, capacity, data);
+	if (!r && dmz->ddev[1]) {
+		capacity = dmz->dev[1].capacity & ~(zone_nr_sectors - 1);
+		r = fn(ti, dmz->ddev[1], 0, capacity, data);
+	}
+	return r;
 }
 
 static void dmz_status(struct dm_target *ti, status_type_t type,
@@ -978,6 +1019,7 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
 	struct dmz_target *dmz = ti->private;
 	ssize_t sz = 0;
 	char buf[BDEVNAME_SIZE];
+	struct dmz_dev *dev;
 
 	switch (type) {
 	case STATUSTYPE_INFO:
@@ -991,8 +1033,14 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
 		       dmz_nr_seq_zones(dmz->metadata));
 		break;
 	case STATUSTYPE_TABLE:
-		format_dev_t(buf, dmz->dev->bdev->bd_dev);
+		dev = &dmz->dev[0];
+		format_dev_t(buf, dev->bdev->bd_dev);
 		DMEMIT("%s ", buf);
+		if (dmz->dev[1].bdev) {
+			dev = &dmz->dev[1];
+			format_dev_t(buf, dev->bdev->bd_dev);
+			DMEMIT("%s ", buf);
+		}
 		break;
 	}
 	return;
@@ -1014,7 +1062,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
 
 static struct target_type dmz_type = {
 	.name		 = "zoned",
-	.version	 = {1, 1, 0},
+	.version	 = {1, 2, 0},
 	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
 	.module		 = THIS_MODULE,
 	.ctr		 = dmz_ctr,
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 454ebd628cca..e383d5b2a3c5 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -52,10 +52,12 @@ struct dmz_dev {
 	struct block_device	*bdev;
 
 	char			name[BDEVNAME_SIZE];
+	uuid_t			uuid;
 
 	sector_t		capacity;
 
 	unsigned int		nr_zones;
+	unsigned int		zone_offset;
 
 	unsigned int		flags;
 
@@ -69,6 +71,7 @@ struct dmz_dev {
 /* Device flags. */
 #define DMZ_BDEV_DYING		(1 << 0)
 #define DMZ_CHECK_BDEV		(2 << 0)
+#define DMZ_BDEV_REGULAR	(4 << 0)
 
 /*
  * Zone descriptor.
@@ -163,8 +166,8 @@ struct dmz_reclaim;
 /*
  * Functions defined in dm-zoned-metadata.c
  */
-int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd,
-		     const char *devname);
+int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
+		     struct dmz_metadata **zmd, const char *devname);
 void dmz_dtr_metadata(struct dmz_metadata *zmd);
 int dmz_resume_metadata(struct dmz_metadata *zmd);
 
@@ -176,15 +179,13 @@ void dmz_lock_flush(struct dmz_metadata *zmd);
 void dmz_unlock_flush(struct dmz_metadata *zmd);
 int dmz_flush_metadata(struct dmz_metadata *zmd);
 const char *dmz_metadata_label(struct dmz_metadata *zmd);
+bool dmz_check_dev(struct dmz_metadata *zmd);
 
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
 sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
 unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
 struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone);
 
-bool dmz_check_dev(struct dmz_metadata *zmd);
-bool dmz_dev_is_dying(struct dmz_metadata *zmd);
-
 #define DMZ_ALLOC_RND		0x01
 #define DMZ_ALLOC_RECLAIM	0x02
 
@@ -251,6 +252,7 @@ int dmz_copy_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 			  struct dm_zone *to_zone);
 int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
 			   struct dm_zone *to_zone, sector_t chunk_block);
+bool dmz_dev_is_dying(struct dmz_metadata *zmd);
 
 /*
  * Functions defined in dm-zoned-reclaim.c
-- 
2.16.4
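
Note on the capacity rounding in dmz_iterate_devices() above: each device's
capacity is masked down to a whole number of zones before it is passed to the
callout. A minimal userspace sketch of that rounding follows, assuming the zone
size is a power of two (the mask only works in that case). The names
zone_aligned_capacity and the sector_t typedef are stand-ins for illustration,
not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's sector_t. */
typedef uint64_t sector_t;

/*
 * Round a capacity (in sectors) down to a whole number of zones.
 * Assumes zone_nr_sectors is a non-zero power of two.
 */
sector_t zone_aligned_capacity(sector_t capacity, unsigned int zone_nr_sectors)
{
	assert(zone_nr_sectors && !(zone_nr_sectors & (zone_nr_sectors - 1)));

	/*
	 * Widen to sector_t before inverting so that the mask keeps the
	 * upper 32 bits of the capacity in this sketch.
	 */
	return capacity & ~((sector_t)zone_nr_sectors - 1);
}
```

With 256-sector zones, a 1000-sector device would be reported as 768 sectors;
the trailing partial zone is dropped from the reported capacity.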


* Re: [PATCHv4 00/13] dm-zoned: metadata version 2
  2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
                   ` (12 preceding siblings ...)
  2020-04-20 10:08 ` [PATCH 13/13] dm-zoned: metadata version 2 Hannes Reinecke
@ 2020-04-22  0:42 ` Damien Le Moal
  13 siblings, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-04-22  0:42 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/20 19:08, Hannes Reinecke wrote:
> Hi all,
> 
> this patchset adds a new metadata version 2 for dm-zoned, which brings the
> following improvements:
> - UUIDs and labels: Adding three more fields to the metadata containing
>   the dm-zoned device UUID and label, and the device UUID. This allows
>   for a unique identification of the devices, so that several dm-zoned
>   sets can coexist and have a persistent identification.
> - Extend random zones by an additional regular disk device: A regular
>   block device can be added together with the zoned block device, providing
>   additional (emulated) random write zones. With this it's possible to
>   handle sequential-zones-only devices; also there will be a speed-up if
>   the regular block device resides on a fast medium. The regular block device
>   is placed logically in front of the zoned block device, so that metadata
>   and mapping tables reside on the regular block device, not the zoned device.
> - Tertiary superblock support: In addition to the two existing sets of metadata
>   another, tertiary, superblock is written to the first block of the zoned
>   block device. This superblock is for identification only; the generation
>   number is set to '0' and the block itself is never updated. The additional
>   metadata like bitmap tables etc. are not copied.
> 
> To handle this, some changes to the original handling are introduced:
> - Zones are now equidistant. Originally, runt zones were ignored, and
>   not counted when sizing the mapping tables. With the dual device setup
>   runt zones might occur at the end of the regular block device, making
>   direct translation between zone number and sector/block number complex.
>   For metadata version 2 all zones are considered to be of the same size,
>   and runt zones are simply marked as 'offline' to have them ignored when
>   allocating a new zone.
> - The block number in the superblock is now the global number, and refers to
>   the location of the superblock relative to the resulting device-mapper
>   device. Which means that the tertiary superblock contains absolute block
>   addresses, which need to be translated to the relative device addresses
>   to find the referenced block.
> 
> There is an accompanying patchset for dm-zoned-tools for writing and checking
> this new metadata.
> 
> As usual, comments and reviews are welcome.

Not forgetting this, just late reviewing. At a glance, I think this is all good,
but I would like to have another good round of review and to run it through our
dm-zoned tests :) Will do that today or tomorrow.

Thanks !


> 
> Changes to v3:
> - Reorder devices such that the regular device is always at position 0,
>   and the zoned device is always at position 1.
> - Split off dmz_dev_is_dying() into a separate patch
> - Include reviews from Damien
> 
> Changes to v2:
> - Kill dmz_id()
> - Include reviews from Damien
> - Sanitize uuid handling as suggested by John Dorminy
> 
> 
> Hannes Reinecke (13):
>   dm-zoned: add 'status' and 'message' callbacks
>   dm-zoned: store zone id within the zone structure and kill dmz_id()
>   dm-zoned: use array for superblock zones
>   dm-zoned: store device in struct dmz_sb
>   dm-zoned: move fields from struct dmz_dev to dmz_metadata
>   dm-zoned: introduce dmz_metadata_label() to format device name
>   dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev()
>   dm-zoned: remove 'dev' argument from reclaim
>   dm-zoned: replace 'target' pointer in the bio context
>   dm-zoned: use dmz_zone_to_dev() when handling metadata I/O
>   dm-zoned: add metadata logging functions
>   dm-zoned: ignore metadata zone in dmz_alloc_zone()
>   dm-zoned: metadata version 2
> 
>  drivers/md/dm-zoned-metadata.c | 658 +++++++++++++++++++++++++++++++----------
>  drivers/md/dm-zoned-reclaim.c  |  88 +++---
>  drivers/md/dm-zoned-target.c   | 331 +++++++++++++--------
>  drivers/md/dm-zoned.h          |  33 ++-
>  4 files changed, 780 insertions(+), 330 deletions(-)
> 


-- 
Damien Le Moal
Western Digital Research
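
Note on the quoted cover letter: with the regular device placed logically in
front of the zoned device and all zones treated as equal-sized, a global zone
number has to be mapped to a device plus a device-relative sector. The sketch
below is one way to picture that translation. It is an illustration only: the
field names nr_zones and zone_offset mirror struct dmz_dev in the series, but
the struct, the lookup loop and the helper names are assumptions, not the code
from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's sector_t. */
typedef uint64_t sector_t;

/* Hypothetical, trimmed-down device descriptor (not the kernel struct). */
struct sketch_dev {
	unsigned int nr_zones;    /* zones on this device, runt zone included */
	unsigned int zone_offset; /* global id of this device's first zone */
};

/* Return the device that holds global zone 'id', or NULL if out of range. */
const struct sketch_dev *sketch_zone_to_dev(const struct sketch_dev *devs,
					    size_t nr_devs, unsigned int id)
{
	for (size_t i = 0; i < nr_devs; i++) {
		if (id >= devs[i].zone_offset &&
		    id - devs[i].zone_offset < devs[i].nr_zones)
			return &devs[i];
	}
	return NULL;
}

/* Start sector of zone 'id', relative to the device that holds it. */
sector_t sketch_zone_dev_sect(const struct sketch_dev *dev, unsigned int id,
			      unsigned int zone_nr_sectors_shift)
{
	assert(id >= dev->zone_offset);
	return (sector_t)(id - dev->zone_offset) << zone_nr_sectors_shift;
}
```

For example, with a 10-zone regular device at offset 0 and a 100-zone zoned
device at offset 10, global zone 12 lands on the second device as its zone 2.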


* Re: [PATCH 01/13] dm-zoned: add 'status' and 'message' callbacks
  2020-04-20 10:08 ` [PATCH 01/13] dm-zoned: add 'status' and 'message' callbacks Hannes Reinecke
@ 2020-04-28  9:19   ` Damien Le Moal
  0 siblings, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-04-28  9:19 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/20 19:09, Hannes Reinecke wrote:
> Add callbacks to supply information for 'dmsetup status'
> and 'dmsetup info', and implement the message 'reclaim'
> to start the reclaim worker.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Bob Liu <bob.liu@oracle.com>
> ---
>  drivers/md/dm-zoned-metadata.c | 15 +++++++++++++++
>  drivers/md/dm-zoned-target.c   | 43 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/md/dm-zoned.h          |  3 +++
>  3 files changed, 61 insertions(+)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index 369de15c4e80..c8787560fa9f 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -202,6 +202,11 @@ sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
>  	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_blocks_shift;
>  }
>  
> +unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
> +{
> +	return zmd->dev->nr_zones;
> +}
> +
>  unsigned int dmz_nr_chunks(struct dmz_metadata *zmd)
>  {
>  	return zmd->nr_chunks;
> @@ -217,6 +222,16 @@ unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd)
>  	return atomic_read(&zmd->unmap_nr_rnd);
>  }
>  
> +unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd)
> +{
> +	return zmd->nr_seq;
> +}
> +
> +unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
> +{
> +	return atomic_read(&zmd->unmap_nr_seq);
> +}
> +
>  /*
>   * Lock/unlock mapping table.
>   * The map lock also protects all the zone lists.
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index f4f83d39b3dc..44e30a7de8b9 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -965,6 +965,47 @@ static int dmz_iterate_devices(struct dm_target *ti,
>  	return fn(ti, dmz->ddev, 0, capacity, data);
>  }
>  
> +static void dmz_status(struct dm_target *ti, status_type_t type,
> +		       unsigned int status_flags, char *result,
> +		       unsigned int maxlen)
> +{
> +	struct dmz_target *dmz = ti->private;
> +	ssize_t sz = 0;
> +	char buf[BDEVNAME_SIZE];
> +
> +	switch (type) {
> +	case STATUSTYPE_INFO:
> +		DMEMIT("%u zones "
> +		       "%u/%u random "
> +		       "%u/%u sequential",
> +		       dmz_nr_zones(dmz->metadata),
> +		       dmz_nr_unmap_rnd_zones(dmz->metadata),
> +		       dmz_nr_rnd_zones(dmz->metadata),
> +		       dmz_nr_unmap_seq_zones(dmz->metadata),
> +		       dmz_nr_seq_zones(dmz->metadata));
> +		break;
> +	case STATUSTYPE_TABLE:
> +		format_dev_t(buf, dmz->dev->bdev->bd_dev);
> +		DMEMIT("%s ", buf);
> +		break;
> +	}
> +	return;
> +}
> +
> +static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
> +		       char *result, unsigned int maxlen)
> +{
> +	struct dmz_target *dmz = ti->private;
> +	int r = -EINVAL;
> +
> +	if (!strcasecmp(argv[0], "reclaim")) {
> +		dmz_schedule_reclaim(dmz->reclaim);
> +		r = 0;
> +	} else
> +		DMERR("unrecognized message %s", argv[0]);
> +	return r;
> +}
> +
>  static struct target_type dmz_type = {
>  	.name		 = "zoned",
>  	.version	 = {1, 1, 0},
> @@ -978,6 +1019,8 @@ static struct target_type dmz_type = {
>  	.postsuspend	 = dmz_suspend,
>  	.resume		 = dmz_resume,
>  	.iterate_devices = dmz_iterate_devices,
> +	.status		 = dmz_status,
> +	.message	 = dmz_message,
>  };
>  
>  static int __init dmz_init(void)
> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
> index 5b5e493d479c..884c0e586082 100644
> --- a/drivers/md/dm-zoned.h
> +++ b/drivers/md/dm-zoned.h
> @@ -190,8 +190,11 @@ void dmz_free_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
>  void dmz_map_zone(struct dmz_metadata *zmd, struct dm_zone *zone,
>  		  unsigned int chunk);
>  void dmz_unmap_zone(struct dmz_metadata *zmd, struct dm_zone *zone);
> +unsigned int dmz_nr_zones(struct dmz_metadata *zmd);
>  unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd);
>  unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd);
> +unsigned int dmz_nr_seq_zones(struct dmz_metadata *zmd);
> +unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd);
>  
>  /*
>   * Activate a zone (increment its reference count).
> 

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

-- 
Damien Le Moal
Western Digital Research
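
Note on the STATUSTYPE_INFO output added by this patch: the line reports the
total zone count followed by unmapped/total pairs for random and sequential
zones. The userspace sketch below reproduces the same format string with
snprintf() so the resulting text is easy to see. The struct and function names
are hypothetical; in the kernel the values come from the dmz_nr_*() accessors
and the text is emitted through DMEMIT().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical snapshot of the counters returned by the accessors. */
struct sketch_counts {
	unsigned int nr_zones;
	unsigned int unmap_rnd, nr_rnd;
	unsigned int unmap_seq, nr_seq;
};

/* Format the info line; returns what snprintf() returns. */
int sketch_status_info(char *buf, size_t len, const struct sketch_counts *c)
{
	return snprintf(buf, len, "%u zones %u/%u random %u/%u sequential",
			c->nr_zones, c->unmap_rnd, c->nr_rnd,
			c->unmap_seq, c->nr_seq);
}
```

A target with 100 zones, 5 of 10 random zones unmapped and 60 of 90 sequential
zones unmapped would format as "100 zones 5/10 random 60/90 sequential".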


* Re: [PATCH 02/13] dm-zoned: store zone id within the zone structure and kill dmz_id()
  2020-04-20 10:08 ` [PATCH 02/13] dm-zoned: store zone id within the zone structure and kill dmz_id() Hannes Reinecke
@ 2020-04-28  9:35   ` Damien Le Moal
  0 siblings, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-04-28  9:35 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/20 19:08, Hannes Reinecke wrote:
> Instead of calculating the zone index by the offset within the
> zone array store the index within the structure itself. With that
> the helper dmz_id() is pointless and can be replaced with accessing
> the ->id value directly.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Bob Liu <bob.liu@oracle.com>
> ---
>  drivers/md/dm-zoned-metadata.c | 40 +++++++++++++++++-----------------------
>  drivers/md/dm-zoned-reclaim.c  | 17 ++++++++---------
>  drivers/md/dm-zoned-target.c   |  6 +++---
>  drivers/md/dm-zoned.h          |  4 +++-
>  4 files changed, 31 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index c8787560fa9f..1993eeb26bc1 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -187,19 +187,14 @@ struct dmz_metadata {
>  /*
>   * Various accessors
>   */
> -unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone)
> -{
> -	return ((unsigned int)(zone - zmd->zones));
> -}
> -
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> -	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_sectors_shift;
> +	return (sector_t)zone->id << zmd->dev->zone_nr_sectors_shift;
>  }
>  
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> -	return (sector_t)dmz_id(zmd, zone) << zmd->dev->zone_nr_blocks_shift;
> +	return (sector_t)zone->id << zmd->dev->zone_nr_blocks_shift;
>  }
>  
>  unsigned int dmz_nr_zones(struct dmz_metadata *zmd)
> @@ -1119,6 +1114,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>  
>  	INIT_LIST_HEAD(&zone->link);
>  	atomic_set(&zone->refcount, 0);
> +	zone->id = idx;
>  	zone->chunk = DMZ_MAP_UNMAPPED;
>  
>  	switch (blkz->type) {
> @@ -1246,7 +1242,7 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
>  		ret = -EIO;
>  	if (ret < 0) {
>  		dmz_dev_err(zmd->dev, "Get zone %u report failed",
> -			    dmz_id(zmd, zone));
> +			    zone->id);
>  		dmz_check_bdev(zmd->dev);
>  		return ret;
>  	}
> @@ -1270,7 +1266,7 @@ static int dmz_handle_seq_write_err(struct dmz_metadata *zmd,
>  		return ret;
>  
>  	dmz_dev_warn(zmd->dev, "Processing zone %u write error (zone wp %u/%u)",
> -		     dmz_id(zmd, zone), zone->wp_block, wp);
> +		     zone->id, zone->wp_block, wp);
>  
>  	if (zone->wp_block < wp) {
>  		dmz_invalidate_blocks(zmd, zone, zone->wp_block,
> @@ -1309,7 +1305,7 @@ static int dmz_reset_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
>  				       dev->zone_nr_sectors, GFP_NOIO);
>  		if (ret) {
>  			dmz_dev_err(dev, "Reset zone %u failed %d",
> -				    dmz_id(zmd, zone), ret);
> +				    zone->id, ret);
>  			return ret;
>  		}
>  	}
> @@ -1757,8 +1753,7 @@ struct dm_zone *dmz_get_chunk_buffer(struct dmz_metadata *zmd,
>  	}
>  
>  	/* Update the chunk mapping */
> -	dmz_set_chunk_mapping(zmd, dzone->chunk, dmz_id(zmd, dzone),
> -			      dmz_id(zmd, bzone));
> +	dmz_set_chunk_mapping(zmd, dzone->chunk, dzone->id, bzone->id);
>  
>  	set_bit(DMZ_BUF, &bzone->flags);
>  	bzone->chunk = dzone->chunk;
> @@ -1810,7 +1805,7 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, unsigned long flags)
>  		atomic_dec(&zmd->unmap_nr_seq);
>  
>  	if (dmz_is_offline(zone)) {
> -		dmz_dev_warn(zmd->dev, "Zone %u is offline", dmz_id(zmd, zone));
> +		dmz_dev_warn(zmd->dev, "Zone %u is offline", zone->id);
>  		zone = NULL;
>  		goto again;
>  	}
> @@ -1852,7 +1847,7 @@ void dmz_map_zone(struct dmz_metadata *zmd, struct dm_zone *dzone,
>  		  unsigned int chunk)
>  {
>  	/* Set the chunk mapping */
> -	dmz_set_chunk_mapping(zmd, chunk, dmz_id(zmd, dzone),
> +	dmz_set_chunk_mapping(zmd, chunk, dzone->id,
>  			      DMZ_MAP_UNMAPPED);
>  	dzone->chunk = chunk;
>  	if (dmz_is_rnd(dzone))
> @@ -1880,7 +1875,7 @@ void dmz_unmap_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
>  		 * Unmapping the chunk buffer zone: clear only
>  		 * the chunk buffer mapping
>  		 */
> -		dzone_id = dmz_id(zmd, zone->bzone);
> +		dzone_id = zone->bzone->id;
>  		zone->bzone->bzone = NULL;
>  		zone->bzone = NULL;
>  
> @@ -1942,7 +1937,7 @@ static struct dmz_mblock *dmz_get_bitmap(struct dmz_metadata *zmd,
>  					 sector_t chunk_block)
>  {
>  	sector_t bitmap_block = 1 + zmd->nr_map_blocks +
> -		(sector_t)(dmz_id(zmd, zone) * zmd->zone_nr_bitmap_blocks) +
> +		(sector_t)(zone->id * zmd->zone_nr_bitmap_blocks) +
>  		(chunk_block >> DMZ_BLOCK_SHIFT_BITS);
>  
>  	return dmz_get_mblock(zmd, bitmap_block);
> @@ -2022,7 +2017,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
>  	unsigned int n = 0;
>  
>  	dmz_dev_debug(zmd->dev, "=> VALIDATE zone %u, block %llu, %u blocks",
> -		      dmz_id(zmd, zone), (unsigned long long)chunk_block,
> +		      zone->id, (unsigned long long)chunk_block,
>  		      nr_blocks);
>  
>  	WARN_ON(chunk_block + nr_blocks > zone_nr_blocks);
> @@ -2052,7 +2047,7 @@ int dmz_validate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
>  		zone->weight += n;
>  	else {
>  		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be <= %u",
> -			     dmz_id(zmd, zone), zone->weight,
> +			     zone->id, zone->weight,
>  			     zone_nr_blocks - n);
>  		zone->weight = zone_nr_blocks;
>  	}
> @@ -2102,7 +2097,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
>  	unsigned int n = 0;
>  
>  	dmz_dev_debug(zmd->dev, "=> INVALIDATE zone %u, block %llu, %u blocks",
> -		      dmz_id(zmd, zone), (u64)chunk_block, nr_blocks);
> +		      zone->id, (u64)chunk_block, nr_blocks);
>  
>  	WARN_ON(chunk_block + nr_blocks > zmd->dev->zone_nr_blocks);
>  
> @@ -2132,7 +2127,7 @@ int dmz_invalidate_blocks(struct dmz_metadata *zmd, struct dm_zone *zone,
>  		zone->weight -= n;
>  	else {
>  		dmz_dev_warn(zmd->dev, "Zone %u: weight %u should be >= %u",
> -			     dmz_id(zmd, zone), zone->weight, n);
> +			     zone->id, zone->weight, n);
>  		zone->weight = 0;
>  	}
>  
> @@ -2378,7 +2373,7 @@ static void dmz_cleanup_metadata(struct dmz_metadata *zmd)
>  int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
>  {
>  	struct dmz_metadata *zmd;
> -	unsigned int i, zid;
> +	unsigned int i;
>  	struct dm_zone *zone;
>  	int ret;
>  
> @@ -2419,9 +2414,8 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
>  		goto err;
>  
>  	/* Set metadata zones starting from sb_zone */
> -	zid = dmz_id(zmd, zmd->sb_zone);
>  	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
> -		zone = dmz_get(zmd, zid + i);
> +		zone = dmz_get(zmd, zmd->sb_zone->id + i);
>  		if (!dmz_is_rnd(zone))
>  			goto err;
>  		set_bit(DMZ_META, &zone->flags);
> diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
> index e7ace908a9b7..7f57c4299a2f 100644
> --- a/drivers/md/dm-zoned-reclaim.c
> +++ b/drivers/md/dm-zoned-reclaim.c
> @@ -80,7 +80,7 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
>  	if (ret) {
>  		dmz_dev_err(zrc->dev,
>  			    "Align zone %u wp %llu to %llu (wp+%u) blocks failed %d",
> -			    dmz_id(zmd, zone), (unsigned long long)wp_block,
> +			    zone->id, (unsigned long long)wp_block,
>  			    (unsigned long long)block, nr_blocks, ret);
>  		dmz_check_bdev(zrc->dev);
>  		return ret;
> @@ -196,8 +196,8 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
>  
>  	dmz_dev_debug(zrc->dev,
>  		      "Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
> -		      dzone->chunk, dmz_id(zmd, bzone), dmz_weight(bzone),
> -		      dmz_id(zmd, dzone), dmz_weight(dzone));
> +		      dzone->chunk, bzone->id, dmz_weight(bzone),
> +		      dzone->id, dmz_weight(dzone));
>  
>  	/* Flush data zone into the buffer zone */
>  	ret = dmz_reclaim_copy(zrc, bzone, dzone);
> @@ -235,8 +235,8 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
>  
>  	dmz_dev_debug(zrc->dev,
>  		      "Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
> -		      chunk, dmz_id(zmd, dzone), dmz_weight(dzone),
> -		      dmz_id(zmd, bzone), dmz_weight(bzone));
> +		      chunk, dzone->id, dmz_weight(dzone),
> +		      bzone->id, dmz_weight(bzone));
>  
>  	/* Flush data zone into the buffer zone */
>  	ret = dmz_reclaim_copy(zrc, dzone, bzone);
> @@ -287,8 +287,7 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
>  
>  	dmz_dev_debug(zrc->dev,
>  		      "Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
> -		      chunk, dmz_id(zmd, dzone), dmz_weight(dzone),
> -		      dmz_id(zmd, szone));
> +		      chunk, dzone->id, dmz_weight(dzone), szone->id);
>  
>  	/* Flush the random data zone into the sequential zone */
>  	ret = dmz_reclaim_copy(zrc, dzone, szone);
> @@ -403,12 +402,12 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
>  	if (ret) {
>  		dmz_dev_debug(zrc->dev,
>  			      "Metadata flush for zone %u failed, err %d\n",
> -			      dmz_id(zmd, rzone), ret);
> +			      rzone->id, ret);
>  		return ret;
>  	}
>  
>  	dmz_dev_debug(zrc->dev, "Reclaimed zone %u in %u ms",
> -		      dmz_id(zmd, rzone), jiffies_to_msecs(jiffies - start));
> +		      rzone->id, jiffies_to_msecs(jiffies - start));
>  	return 0;
>  }
>  
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index 44e30a7de8b9..7268e0af9e17 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -180,7 +180,7 @@ static int dmz_handle_read(struct dmz_target *dmz, struct dm_zone *zone,
>  	dmz_dev_debug(dmz->dev, "READ chunk %llu -> %s zone %u, block %llu, %u blocks",
>  		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
>  		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
> -		      dmz_id(dmz->metadata, zone),
> +		      zone->id,
>  		      (unsigned long long)chunk_block, nr_blocks);
>  
>  	/* Check block validity to determine the read location */
> @@ -317,7 +317,7 @@ static int dmz_handle_write(struct dmz_target *dmz, struct dm_zone *zone,
>  	dmz_dev_debug(dmz->dev, "WRITE chunk %llu -> %s zone %u, block %llu, %u blocks",
>  		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
>  		      (dmz_is_rnd(zone) ? "RND" : "SEQ"),
> -		      dmz_id(dmz->metadata, zone),
> +		      zone->id,
>  		      (unsigned long long)chunk_block, nr_blocks);
>  
>  	if (dmz_is_rnd(zone) || chunk_block == zone->wp_block) {
> @@ -357,7 +357,7 @@ static int dmz_handle_discard(struct dmz_target *dmz, struct dm_zone *zone,
>  
>  	dmz_dev_debug(dmz->dev, "DISCARD chunk %llu -> zone %u, block %llu, %u blocks",
>  		      (unsigned long long)dmz_bio_chunk(dmz->dev, bio),
> -		      dmz_id(zmd, zone),
> +		      zone->id,
>  		      (unsigned long long)chunk_block, nr_blocks);
>  
>  	/*
> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
> index 884c0e586082..30781646741a 100644
> --- a/drivers/md/dm-zoned.h
> +++ b/drivers/md/dm-zoned.h
> @@ -87,6 +87,9 @@ struct dm_zone {
>  	/* Zone activation reference count */
>  	atomic_t		refcount;
>  
> +	/* Zone id */
> +	unsigned int		id;
> +
>  	/* Zone write pointer block (relative to the zone start block) */
>  	unsigned int		wp_block;
>  
> @@ -176,7 +179,6 @@ void dmz_lock_flush(struct dmz_metadata *zmd);
>  void dmz_unlock_flush(struct dmz_metadata *zmd);
>  int dmz_flush_metadata(struct dmz_metadata *zmd);
>  
> -unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone);
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
>  unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
> 

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 07/13] dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev()
  2020-04-20 10:08 ` [PATCH 07/13] dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev() Hannes Reinecke
@ 2020-04-28  9:37   ` Damien Le Moal
  0 siblings, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-04-28  9:37 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/20 19:08, Hannes Reinecke wrote:
> Introduce accessors dmz_dev_is_dying() and dmz_check_dev() to
> avoid having to reference the devices directly.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Bob Liu <bob.liu@oracle.com>
> ---
>  drivers/md/dm-zoned-metadata.c | 14 ++++++++++++--
>  drivers/md/dm-zoned-reclaim.c  |  4 ++--
>  drivers/md/dm-zoned-target.c   |  2 +-
>  drivers/md/dm-zoned.h          |  3 +++
>  4 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index 7cda48683c0b..426af738f1ca 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -267,6 +267,16 @@ const char *dmz_metadata_label(struct dmz_metadata *zmd)
>  	return (const char *)zmd->devname;
>  }
>  
> +bool dmz_check_dev(struct dmz_metadata *zmd)
> +{
> +	return dmz_check_bdev(&zmd->dev[0]);
> +}
> +
> +bool dmz_dev_is_dying(struct dmz_metadata *zmd)
> +{
> +	return dmz_bdev_is_dying(&zmd->dev[0]);
> +}
> +
>  /*
>   * Lock/unlock mapping table.
>   * The map lock also protects all the zone lists.
> @@ -1719,7 +1729,7 @@ struct dm_zone *dmz_get_chunk_mapping(struct dmz_metadata *zmd, unsigned int chu
>  		/* Allocate a random zone */
>  		dzone = dmz_alloc_zone(zmd, DMZ_ALLOC_RND);
>  		if (!dzone) {
> -			if (dmz_bdev_is_dying(zmd->dev)) {
> +			if (dmz_dev_is_dying(zmd)) {
>  				dzone = ERR_PTR(-EIO);
>  				goto out;
>  			}
> @@ -1820,7 +1830,7 @@ struct dm_zone *dmz_get_chunk_buffer(struct dmz_metadata *zmd,
>  	/* Allocate a random zone */
>  	bzone = dmz_alloc_zone(zmd, DMZ_ALLOC_RND);
>  	if (!bzone) {
> -		if (dmz_bdev_is_dying(zmd->dev)) {
> +		if (dmz_dev_is_dying(zmd)) {
>  			bzone = ERR_PTR(-EIO);
>  			goto out;
>  		}
> diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
> index 699c4145306e..5daede0daf92 100644
> --- a/drivers/md/dm-zoned-reclaim.c
> +++ b/drivers/md/dm-zoned-reclaim.c
> @@ -455,7 +455,7 @@ static void dmz_reclaim_work(struct work_struct *work)
>  	unsigned int p_unmap_rnd;
>  	int ret;
>  
> -	if (dmz_bdev_is_dying(zrc->dev))
> +	if (dmz_dev_is_dying(zmd))
>  		return;
>  
>  	if (!dmz_should_reclaim(zrc)) {
> @@ -490,7 +490,7 @@ static void dmz_reclaim_work(struct work_struct *work)
>  	if (ret) {
>  		DMDEBUG("(%s): Reclaim error %d\n",
>  			dmz_metadata_label(zmd), ret);
> -		if (!dmz_check_bdev(zrc->dev))
> +		if (!dmz_check_dev(zmd))
>  			return;
>  	}
>  
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index 748d4cd5d62d..15f00535060f 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -632,7 +632,7 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
>  	sector_t chunk_sector;
>  	int ret;
>  
> -	if (dmz_bdev_is_dying(dmz->dev))
> +	if (dmz_dev_is_dying(zmd))
>  		return DM_MAPIO_KILL;
>  
>  	DMDEBUG("(%s): BIO op %d sector %llu + %u => chunk %llu, block %llu, %u blocks",
> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
> index dd768dc60341..e0883df8a903 100644
> --- a/drivers/md/dm-zoned.h
> +++ b/drivers/md/dm-zoned.h
> @@ -181,6 +181,9 @@ sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
>  unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
>  
> +bool dmz_check_dev(struct dmz_metadata *zmd);
> +bool dmz_dev_is_dying(struct dmz_metadata *zmd);
> +
>  #define DMZ_ALLOC_RND		0x01
>  #define DMZ_ALLOC_RECLAIM	0x02
>  
> 

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

-- 
Damien Le Moal
Western Digital Research
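
Note on the new accessors: dmz_dev_is_dying() and dmz_check_dev() take the
metadata and forward to the per-device helpers for the first device, so callers
no longer touch a device pointer themselves. The sketch below models only the
flag test behind such an accessor. The struct names are hypothetical, the flag
value is copied from dm-zoned.h as shown in this thread, and the real
dmz_bdev_is_dying() helper may check more than the flag.

```c
#include <stdbool.h>

/* Same value as DMZ_BDEV_DYING in dm-zoned.h; the name is hypothetical. */
#define SKETCH_BDEV_DYING	(1 << 0)

/* Hypothetical, reduced device and metadata descriptors. */
struct sketch_bdev {
	unsigned int flags;
};

struct sketch_meta {
	struct sketch_bdev *dev; /* array of devices, first entry at index 0 */
};

/* Accessor in the style of the patch: only the first device is consulted. */
bool sketch_dev_is_dying(const struct sketch_meta *zmd)
{
	return (zmd->dev[0].flags & SKETCH_BDEV_DYING) != 0;
}
```

In this sketch a dying flag on a second device is not visible through the
accessor, matching the &zmd->dev[0] argument used in the patch.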


* Re: [PATCH 08/13] dm-zoned: remove 'dev' argument from reclaim
  2020-04-20 10:08 ` [PATCH 08/13] dm-zoned: remove 'dev' argument from reclaim Hannes Reinecke
@ 2020-04-28  9:40   ` Damien Le Moal
  0 siblings, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-04-28  9:40 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/20 19:09, Hannes Reinecke wrote:
> Use the dmz_zone_to_dev() mapping function to remove the
> 'dev' argument from reclaim.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Bob Liu <bob.liu@oracle.com>
> ---
>  drivers/md/dm-zoned-reclaim.c | 58 +++++++++++++++++++++++--------------------
>  drivers/md/dm-zoned-target.c  |  2 +-
>  drivers/md/dm-zoned.h         |  3 ++-
>  3 files changed, 34 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
> index 5daede0daf92..39ea0d5d4706 100644
> --- a/drivers/md/dm-zoned-reclaim.c
> +++ b/drivers/md/dm-zoned-reclaim.c
> @@ -13,7 +13,6 @@
>  
>  struct dmz_reclaim {
>  	struct dmz_metadata     *metadata;
> -	struct dmz_dev		*dev;
>  
>  	struct delayed_work	work;
>  	struct workqueue_struct *wq;
> @@ -59,6 +58,7 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
>  				sector_t block)
>  {
>  	struct dmz_metadata *zmd = zrc->metadata;
> +	struct dmz_dev *dev = dmz_zone_to_dev(zmd, zone);
>  	sector_t wp_block = zone->wp_block;
>  	unsigned int nr_blocks;
>  	int ret;
> @@ -74,15 +74,15 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
>  	 * pointer and the requested position.
>  	 */
>  	nr_blocks = block - wp_block;
> -	ret = blkdev_issue_zeroout(zrc->dev->bdev,
> +	ret = blkdev_issue_zeroout(dev->bdev,
>  				   dmz_start_sect(zmd, zone) + dmz_blk2sect(wp_block),
>  				   dmz_blk2sect(nr_blocks), GFP_NOIO, 0);
>  	if (ret) {
> -		dmz_dev_err(zrc->dev,
> +		dmz_dev_err(dev,
>  			    "Align zone %u wp %llu to %llu (wp+%u) blocks failed %d",
>  			    zone->id, (unsigned long long)wp_block,
>  			    (unsigned long long)block, nr_blocks, ret);
> -		dmz_check_bdev(zrc->dev);
> +		dmz_check_bdev(dev);
>  		return ret;
>  	}
>  
> @@ -116,7 +116,7 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
>  			    struct dm_zone *src_zone, struct dm_zone *dst_zone)
>  {
>  	struct dmz_metadata *zmd = zrc->metadata;
> -	struct dmz_dev *dev = zrc->dev;
> +	struct dmz_dev *src_dev, *dst_dev;
>  	struct dm_io_region src, dst;
>  	sector_t block = 0, end_block;
>  	sector_t nr_blocks;
> @@ -130,13 +130,17 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
>  	else
>  		end_block = dmz_zone_nr_blocks(zmd);
>  	src_zone_block = dmz_start_block(zmd, src_zone);
> +	src_dev = dmz_zone_to_dev(zmd, src_zone);
>  	dst_zone_block = dmz_start_block(zmd, dst_zone);
> +	dst_dev = dmz_zone_to_dev(zmd, dst_zone);
>  
>  	if (dmz_is_seq(dst_zone))
>  		set_bit(DM_KCOPYD_WRITE_SEQ, &flags);
>  
>  	while (block < end_block) {
> -		if (dev->flags & DMZ_BDEV_DYING)
> +		if (src_dev->flags & DMZ_BDEV_DYING)
> +			return -EIO;
> +		if (dst_dev->flags & DMZ_BDEV_DYING)
>  			return -EIO;
>  
>  		/* Get a valid region from the source zone */
> @@ -156,11 +160,11 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
>  				return ret;
>  		}
>  
> -		src.bdev = dev->bdev;
> +		src.bdev = src_dev->bdev;
>  		src.sector = dmz_blk2sect(src_zone_block + block);
>  		src.count = dmz_blk2sect(nr_blocks);
>  
> -		dst.bdev = dev->bdev;
> +		dst.bdev = dst_dev->bdev;
>  		dst.sector = dmz_blk2sect(dst_zone_block + block);
>  		dst.count = src.count;
>  
> @@ -194,10 +198,10 @@ static int dmz_reclaim_buf(struct dmz_reclaim *zrc, struct dm_zone *dzone)
>  	struct dmz_metadata *zmd = zrc->metadata;
>  	int ret;
>  
> -	dmz_dev_debug(zrc->dev,
> -		      "Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
> -		      dzone->chunk, bzone->id, dmz_weight(bzone),
> -		      dzone->id, dmz_weight(dzone));
> +	DMDEBUG("(%s): Chunk %u, move buf zone %u (weight %u) to data zone %u (weight %u)",
> +		dmz_metadata_label(zmd),
> +		dzone->chunk, bzone->id, dmz_weight(bzone),
> +		dzone->id, dmz_weight(dzone));
>  
>  	/* Flush data zone into the buffer zone */
>  	ret = dmz_reclaim_copy(zrc, bzone, dzone);
> @@ -233,10 +237,10 @@ static int dmz_reclaim_seq_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
>  	struct dmz_metadata *zmd = zrc->metadata;
>  	int ret = 0;
>  
> -	dmz_dev_debug(zrc->dev,
> -		      "Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
> -		      chunk, dzone->id, dmz_weight(dzone),
> -		      bzone->id, dmz_weight(bzone));
> +	DMDEBUG("(%s): Chunk %u, move data zone %u (weight %u) to buf zone %u (weight %u)",
> +		dmz_metadata_label(zmd),
> +		chunk, dzone->id, dmz_weight(dzone),
> +		bzone->id, dmz_weight(bzone));
>  
>  	/* Flush data zone into the buffer zone */
>  	ret = dmz_reclaim_copy(zrc, dzone, bzone);
> @@ -285,9 +289,9 @@ static int dmz_reclaim_rnd_data(struct dmz_reclaim *zrc, struct dm_zone *dzone)
>  	if (!szone)
>  		return -ENOSPC;
>  
> -	dmz_dev_debug(zrc->dev,
> -		      "Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
> -		      chunk, dzone->id, dmz_weight(dzone), szone->id);
> +	DMDEBUG("(%s): Chunk %u, move rnd zone %u (weight %u) to seq zone %u",
> +		dmz_metadata_label(zmd),
> +		chunk, dzone->id, dmz_weight(dzone), szone->id);
>  
>  	/* Flush the random data zone into the sequential zone */
>  	ret = dmz_reclaim_copy(zrc, dzone, szone);
> @@ -343,6 +347,7 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
>  	struct dmz_metadata *zmd = zrc->metadata;
>  	struct dm_zone *dzone;
>  	struct dm_zone *rzone;
> +	struct dmz_dev *dev;
>  	unsigned long start;
>  	int ret;
>  
> @@ -352,7 +357,7 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
>  		return PTR_ERR(dzone);
>  
>  	start = jiffies;
> -
> +	dev = dmz_zone_to_dev(zmd, dzone);
>  	if (dmz_is_rnd(dzone)) {
>  		if (!dmz_weight(dzone)) {
>  			/* Empty zone */
> @@ -400,14 +405,14 @@ static int dmz_do_reclaim(struct dmz_reclaim *zrc)
>  
>  	ret = dmz_flush_metadata(zrc->metadata);
>  	if (ret) {
> -		dmz_dev_debug(zrc->dev,
> -			      "Metadata flush for zone %u failed, err %d\n",
> -			      rzone->id, ret);
> +		DMDEBUG("(%s): Metadata flush for zone %u failed, err %d\n",
> +			dmz_metadata_label(zmd), rzone->id, ret);
>  		return ret;
>  	}
>  
> -	dmz_dev_debug(zrc->dev, "Reclaimed zone %u in %u ms",
> -		      rzone->id, jiffies_to_msecs(jiffies - start));
> +	DMDEBUG("(%s): Reclaimed zone %u in %u ms",
> +		dmz_metadata_label(zmd),
> +		rzone->id, jiffies_to_msecs(jiffies - start));
>  	return 0;
>  }
>  
> @@ -500,7 +505,7 @@ static void dmz_reclaim_work(struct work_struct *work)
>  /*
>   * Initialize reclaim.
>   */
> -int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
> +int dmz_ctr_reclaim(struct dmz_metadata *zmd,
>  		    struct dmz_reclaim **reclaim)
>  {
>  	struct dmz_reclaim *zrc;
> @@ -510,7 +515,6 @@ int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
>  	if (!zrc)
>  		return -ENOMEM;
>  
> -	zrc->dev = dev;
>  	zrc->metadata = zmd;
>  	zrc->atime = jiffies;
>  
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index 15f00535060f..a1f42af2877c 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -840,7 +840,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  	mod_delayed_work(dmz->flush_wq, &dmz->flush_work, DMZ_FLUSH_PERIOD);
>  
>  	/* Initialize reclaim */
> -	ret = dmz_ctr_reclaim(dev, dmz->metadata, &dmz->reclaim);
> +	ret = dmz_ctr_reclaim(dmz->metadata, &dmz->reclaim);
>  	if (ret) {
>  		ti->error = "Zone reclaim initialization failed";
>  		goto err_fwq;
> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
> index e0883df8a903..454ebd628cca 100644
> --- a/drivers/md/dm-zoned.h
> +++ b/drivers/md/dm-zoned.h
> @@ -180,6 +180,7 @@ const char *dmz_metadata_label(struct dmz_metadata *zmd);
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
>  unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
> +struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone);
>  
>  bool dmz_check_dev(struct dmz_metadata *zmd);
>  bool dmz_dev_is_dying(struct dmz_metadata *zmd);
> @@ -254,7 +255,7 @@ int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
>  /*
>   * Functions defined in dm-zoned-reclaim.c
>   */
> -int dmz_ctr_reclaim(struct dmz_dev *dev, struct dmz_metadata *zmd,
> +int dmz_ctr_reclaim(struct dmz_metadata *zmd,
>  		    struct dmz_reclaim **zrc);

Nit: this should fit on a single line, no ?

>  void dmz_dtr_reclaim(struct dmz_reclaim *zrc);
>  void dmz_suspend_reclaim(struct dmz_reclaim *zrc);
> 

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 09/13] dm-zoned: replace 'target' pointer in the bio context
  2020-04-20 10:08 ` [PATCH 09/13] dm-zoned: replace 'target' pointer in the bio context Hannes Reinecke
@ 2020-04-28  9:43   ` Damien Le Moal
  0 siblings, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-04-28  9:43 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/20 19:08, Hannes Reinecke wrote:
> Replace the 'target' pointer in the bio context with the
> device pointer as this is what's actually used.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Bob Liu <bob.liu@oracle.com>
> ---
>  drivers/md/dm-zoned-target.c | 26 ++++++++++++--------------
>  1 file changed, 12 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index a1f42af2877c..4897ffae96ca 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -17,7 +17,7 @@
>   * Zone BIO context.
>   */
>  struct dmz_bioctx {
> -	struct dmz_target	*target;
> +	struct dmz_dev		*dev;
>  	struct dm_zone		*zone;
>  	struct bio		*bio;
>  	refcount_t		ref;
> @@ -81,7 +81,7 @@ static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
>  	if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
>  		bio->bi_status = status;
>  	if (bio->bi_status != BLK_STS_OK)
> -		bioctx->target->dev->flags |= DMZ_CHECK_BDEV;
> +		bioctx->dev->flags |= DMZ_CHECK_BDEV;
>  
>  	if (refcount_dec_and_test(&bioctx->ref)) {
>  		struct dm_zone *zone = bioctx->zone;
> @@ -119,13 +119,18 @@ static int dmz_submit_bio(struct dmz_target *dmz, struct dm_zone *zone,
>  			  unsigned int nr_blocks)
>  {
>  	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
> +	struct dmz_dev *dev = dmz_zone_to_dev(dmz->metadata, zone);
>  	struct bio *clone;
>  
> +	if (dev->flags & DMZ_BDEV_DYING)
> +		return -EIO;
> +
>  	clone = bio_clone_fast(bio, GFP_NOIO, &dmz->bio_set);
>  	if (!clone)
>  		return -ENOMEM;
>  
> -	bio_set_dev(clone, dmz->dev->bdev);
> +	bio_set_dev(clone, dev->bdev);
> +	bioctx->dev = dev;
>  	clone->bi_iter.bi_sector =
>  		dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block);
>  	clone->bi_iter.bi_size = dmz_blk2sect(nr_blocks) << SECTOR_SHIFT;
> @@ -397,11 +402,6 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct dm_chunk_work *cw,
>  
>  	dmz_lock_metadata(zmd);
>  
> -	if (dmz->dev->flags & DMZ_BDEV_DYING) {
> -		ret = -EIO;
> -		goto out;
> -	}
> -
>  	/*
>  	 * Get the data zone mapping the chunk. There may be no
>  	 * mapping for read and discard. If a mapping is obtained,
> @@ -625,7 +625,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
>  {
>  	struct dmz_target *dmz = ti->private;
>  	struct dmz_metadata *zmd = dmz->metadata;
> -	struct dmz_dev *dev = dmz->dev;
>  	struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
>  	sector_t sector = bio->bi_iter.bi_sector;
>  	unsigned int nr_sectors = bio_sectors(bio);
> @@ -642,8 +641,6 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
>  		(unsigned long long)dmz_chunk_block(zmd, dmz_bio_block(bio)),
>  		(unsigned int)dmz_bio_blocks(bio));
>  
> -	bio_set_dev(bio, dev->bdev);
> -
>  	if (!nr_sectors && bio_op(bio) != REQ_OP_WRITE)
>  		return DM_MAPIO_REMAPPED;
>  
> @@ -652,7 +649,7 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
>  		return DM_MAPIO_KILL;
>  
>  	/* Initialize the BIO context */
> -	bioctx->target = dmz;
> +	bioctx->dev = NULL;
>  	bioctx->zone = NULL;
>  	bioctx->bio = bio;
>  	refcount_set(&bioctx->ref, 1);
> @@ -931,11 +928,12 @@ static void dmz_io_hints(struct dm_target *ti, struct queue_limits *limits)
>  static int dmz_prepare_ioctl(struct dm_target *ti, struct block_device **bdev)
>  {
>  	struct dmz_target *dmz = ti->private;
> +	struct dmz_dev *dev = &dmz->dev[0];
>  
> -	if (!dmz_check_bdev(dmz->dev))
> +	if (!dmz_check_bdev(dev))
>  		return -EIO;
>  
> -	*bdev = dmz->dev->bdev;
> +	*bdev = dev->bdev;
>  
>  	return 0;
>  }
> 

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 13/13] dm-zoned: metadata version 2
  2020-04-20 10:08 ` [PATCH 13/13] dm-zoned: metadata version 2 Hannes Reinecke
@ 2020-04-28 10:54   ` Damien Le Moal
  2020-04-28 17:37     ` Mike Snitzer
  2020-04-30 14:45     ` Hannes Reinecke
  0 siblings, 2 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-04-28 10:54 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/20 19:09, Hannes Reinecke wrote:
> Implement handling for metadata version 2. The new metadata adds
> a label and UUID for the device mapper device, and additional UUID
> for the underlying block devices.
> It also allows for an additional regular drive to be used for
> emulating random access zones. The emulated zones will be placed
> logically in front of the zones from the zoned block device, causing
> the superblocks and metadata to be stored on that device.
> The first zone of the original zoned device will be used to hold
> another, tertiary copy of the metadata; this copy carries a
> generation number of 0 and is never updated; it's just used
> for identification.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Bob Liu <bob.liu@oracle.com>
> ---
>  drivers/md/dm-zoned-metadata.c | 314 ++++++++++++++++++++++++++++++++++-------
>  drivers/md/dm-zoned-target.c   | 156 +++++++++++++-------
>  drivers/md/dm-zoned.h          |  12 +-
>  3 files changed, 373 insertions(+), 109 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index c009f2d962e2..1f31635aba73 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -16,7 +16,7 @@
>  /*
>   * Metadata version.
>   */
> -#define DMZ_META_VER	1
> +#define DMZ_META_VER	2
>  
>  /*
>   * On-disk super block magic.
> @@ -69,8 +69,17 @@ struct dmz_super {
>  	/* Checksum */
>  	__le32		crc;			/*  48 */
>  
> +	/* DM-Zoned label */
> +	u8		dmz_label[32];		/*  80 */
> +
> +	/* DM-Zoned UUID */
> +	u8		dmz_uuid[16];		/*  96 */
> +
> +	/* Device UUID */
> +	u8		dev_uuid[16];		/* 112 */
> +
>  	/* Padding to full 512B sector */
> -	u8		reserved[464];		/* 512 */
> +	u8		reserved[400];		/* 512 */
>  };
>  
>  /*
> @@ -133,8 +142,11 @@ struct dmz_sb {
>   */
>  struct dmz_metadata {
>  	struct dmz_dev		*dev;
> +	unsigned int		nr_devs;
>  
>  	char			devname[BDEVNAME_SIZE];
> +	char			label[BDEVNAME_SIZE];
> +	uuid_t			uuid;
>  
>  	sector_t		zone_bitmap_size;
>  	unsigned int		zone_nr_bitmap_blocks;
> @@ -161,8 +173,9 @@ struct dmz_metadata {
>  	/* Zone information array */
>  	struct dm_zone		*zones;
>  
> -	struct dmz_sb		sb[2];
> +	struct dmz_sb		sb[3];
>  	unsigned int		mblk_primary;
> +	unsigned int		sb_version;
>  	u64			sb_gen;
>  	unsigned int		min_nr_mblks;
>  	unsigned int		max_nr_mblks;
> @@ -195,31 +208,56 @@ struct dmz_metadata {
>  };
>  
>  #define dmz_zmd_info(zmd, format, args...)	\
> -	DMINFO("(%s): " format, (zmd)->devname, ## args)
> +	DMINFO("(%s): " format, (zmd)->label, ## args)
>  
>  #define dmz_zmd_err(zmd, format, args...)	\
> -	DMERR("(%s): " format, (zmd)->devname, ## args)
> +	DMERR("(%s): " format, (zmd)->label, ## args)
>  
>  #define dmz_zmd_warn(zmd, format, args...)	\
> -	DMWARN("(%s): " format, (zmd)->devname, ## args)
> +	DMWARN("(%s): " format, (zmd)->label, ## args)
>  
>  #define dmz_zmd_debug(zmd, format, args...)	\
> -	DMDEBUG("(%s): " format, (zmd)->devname, ## args)
> +	DMDEBUG("(%s): " format, (zmd)->label, ## args)
>  /*
>   * Various accessors
>   */
> +unsigned int dmz_dev_zone_id(struct dmz_metadata *zmd, struct dm_zone *zone)
> +{
> +	unsigned int zone_id;
> +
> +	if (WARN_ON(!zone))
> +		return 0;
> +
> +	zone_id = zone->id;
> +	if (zmd->nr_devs > 1 &&
> +	    (zone_id >= zmd->dev[1].zone_offset))
> +		zone_id -= zmd->dev[1].zone_offset;

We could have this as:

	if (zone_id >= zmd->dev[0].nr_zones)
		zone_id -= zmd->dev[0].nr_zones;

No? It is simpler, and we could then remove zone_offset entirely.

> +	return zone_id;
> +}
> +
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> -	return (sector_t)zone->id << zmd->zone_nr_sectors_shift;
> +	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
> +
> +	return (sector_t)zone_id << zmd->zone_nr_sectors_shift;
>  }
>  
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> -	return (sector_t)zone->id << zmd->zone_nr_blocks_shift;
> +	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
> +
> +	return (sector_t)zone_id << zmd->zone_nr_blocks_shift;
>  }
>  
>  struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> +	if (WARN_ON(!zone))
> +		return &zmd->dev[0];
> +
> +	if (zmd->nr_devs > 1 &&
> +	    zone->id >= zmd->dev[1].zone_offset)
> +		return &zmd->dev[1];
> +
>  	return &zmd->dev[0];


Same here, simpler version:

	if (zone->id < zmd->dev[0].nr_zones)
		return &zmd->dev[0];

	return &zmd->dev[1];
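
To illustrate the mapping both simplifications rely on, here is an untested user-space sketch. The struct and function names (dev_model, zmd_model, zone_to_dev_idx, dev_zone_id, zone_start_sect) are made up for the example and are not the driver's; it assumes the zones of dev[0] always come first in the global zone array and that dev[0].nr_zones is always populated, so a single-device setup simply has all zones below dev[0].nr_zones.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins for struct dmz_dev / struct dmz_metadata. */
struct dev_model {
	unsigned int nr_zones;		/* zones contributed by this device */
};

struct zmd_model {
	struct dev_model dev[2];	/* zones of dev[0] come first */
	unsigned int nr_zones;		/* sum over all devices */
	unsigned int zone_nr_sectors_shift;
};

/* Global zone id -> index of the device that holds the zone. */
static unsigned int zone_to_dev_idx(const struct zmd_model *zmd,
				    unsigned int zone_id)
{
	assert(zone_id < zmd->nr_zones);
	if (zone_id < zmd->dev[0].nr_zones)
		return 0;
	return 1;
}

/* Global zone id -> zone id relative to the start of its device. */
static unsigned int dev_zone_id(const struct zmd_model *zmd,
				unsigned int zone_id)
{
	assert(zone_id < zmd->nr_zones);
	if (zone_id >= zmd->dev[0].nr_zones)
		zone_id -= zmd->dev[0].nr_zones;
	return zone_id;
}

/* Device-relative start sector of a zone. */
static unsigned long long zone_start_sect(const struct zmd_model *zmd,
					  unsigned int zone_id)
{
	return (unsigned long long)dev_zone_id(zmd, zone_id)
		<< zmd->zone_nr_sectors_shift;
}
```

With dev[0].nr_zones = 4 and dev[1].nr_zones = 8, global zone 5 lands on dev[1] as its zone 1. No separate zone_offset field is needed for either lookup.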

>  }
>  
> @@ -275,17 +313,33 @@ unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
>  
>  const char *dmz_metadata_label(struct dmz_metadata *zmd)
>  {
> -	return (const char *)zmd->devname;
> +	return (const char *)zmd->label;
>  }
>  
>  bool dmz_check_dev(struct dmz_metadata *zmd)
>  {
> -	return dmz_check_bdev(&zmd->dev[0]);
> +	unsigned int i;
> +
> +	for (i = 0; i < zmd->nr_devs; i++) {
> +		if (!zmd->dev[i].bdev)
> +			continue;

This test is not necessary, no? dev[0] is always set now that your latest
changes reshuffle the dev indexes.

> +		if (!dmz_check_bdev(&zmd->dev[i]))
> +			return false;
> +	}
> +	return true;
>  }
>  
>  bool dmz_dev_is_dying(struct dmz_metadata *zmd)
>  {
> -	return dmz_bdev_is_dying(&zmd->dev[0]);
> +	unsigned int i;
> +
> +	for (i = 0; i < zmd->nr_devs; i++) {
> +		if (!zmd->dev[i].bdev)
> +			continue;

Same here.

> +		if (dmz_bdev_is_dying(&zmd->dev[i]))
> +			return true;
> +	}
> +	return false;
>  }
>  
>  /*
> @@ -687,6 +741,9 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
>  	struct bio *bio;
>  	int ret;
>  
> +	if (WARN_ON(!dev))
> +		return -EIO;
> +
>  	if (dmz_bdev_is_dying(dev))
>  		return -EIO;
>  
> @@ -711,7 +768,8 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
>   */
>  static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>  {
> -	sector_t block = zmd->sb[set].block;
> +	sector_t sb_block =
> +		zmd->sb[set].zone->id << zmd->zone_nr_blocks_shift;

I think this is safe as set 2 is read-only, so updates are only for sets 0 and 1
on dev[0]. But a comment pointing that out would be nice...

>  	struct dmz_mblock *mblk = zmd->sb[set].mblk;
>  	struct dmz_super *sb = zmd->sb[set].sb;
>  	struct dmz_dev *dev = zmd->sb[set].dev;
> @@ -719,11 +777,18 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>  	int ret;
>  
>  	sb->magic = cpu_to_le32(DMZ_MAGIC);
> -	sb->version = cpu_to_le32(DMZ_META_VER);
> +
> +	sb->version = cpu_to_le32(zmd->sb_version);
> +	if (zmd->sb_version > 1) {
> +		BUILD_BUG_ON(UUID_SIZE != 16);
> +		memcpy(sb->dmz_uuid, &zmd->uuid, UUID_SIZE);
> +		memcpy(sb->dmz_label, zmd->label, BDEVNAME_SIZE);
> +		memcpy(sb->dev_uuid, &dev->uuid, UUID_SIZE);

export_uuid() ?
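
For reference, include/linux/uuid.h provides import_uuid() and export_uuid() helpers that wrap exactly this kind of memcpy(). The user-space sketch below models them; every model_* and sb_* name is hypothetical and only mirrors the kernel helpers' argument order. The write path exports a UUID into the on-disk byte array and the read path imports it back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_UUID_SIZE 16

/* Hypothetical stand-in for the kernel's uuid_t. */
typedef struct {
	uint8_t b[MODEL_UUID_SIZE];
} model_uuid_t;

/* Same idea as the BUILD_BUG_ON() in the patch, checked at compile time. */
static_assert(sizeof(model_uuid_t) == MODEL_UUID_SIZE, "unexpected UUID size");

/* Models export_uuid(): uuid object -> raw byte array. */
static inline void model_export_uuid(uint8_t *dst, const model_uuid_t *src)
{
	memcpy(dst, src, sizeof(*src));
}

/* Models import_uuid(): raw byte array -> uuid object. */
static inline void model_import_uuid(model_uuid_t *dst, const uint8_t *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Only the UUID fields of the on-disk super block are modelled here. */
struct sb_model {
	uint8_t dmz_uuid[MODEL_UUID_SIZE];
	uint8_t dev_uuid[MODEL_UUID_SIZE];
};

/* Write path: fill the on-disk fields from the in-memory UUIDs. */
static void sb_store_uuids(struct sb_model *sb, const model_uuid_t *dmz_uuid,
			   const model_uuid_t *dev_uuid)
{
	model_export_uuid(sb->dmz_uuid, dmz_uuid);
	model_export_uuid(sb->dev_uuid, dev_uuid);
}

/* Read path: recover the in-memory UUIDs from the on-disk fields. */
static void sb_load_uuids(const struct sb_model *sb, model_uuid_t *dmz_uuid,
			  model_uuid_t *dev_uuid)
{
	model_import_uuid(dmz_uuid, sb->dmz_uuid);
	model_import_uuid(dev_uuid, sb->dev_uuid);
}
```

The helpers make the copy direction explicit at the call site, which the bare memcpy() calls do not.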

> +	}
>  
>  	sb->gen = cpu_to_le64(sb_gen);
>  
> -	sb->sb_block = cpu_to_le64(block);
> +	sb->sb_block = cpu_to_le64(sb_block);
>  	sb->nr_meta_blocks = cpu_to_le32(zmd->nr_meta_blocks);
>  	sb->nr_reserved_seq = cpu_to_le32(zmd->nr_reserved_seq);
>  	sb->nr_chunks = cpu_to_le32(zmd->nr_chunks);
> @@ -734,7 +799,8 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>  	sb->crc = 0;
>  	sb->crc = cpu_to_le32(crc32_le(sb_gen, (unsigned char *)sb, DMZ_BLOCK_SIZE));
>  
> -	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, block, mblk->page);
> +	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, zmd->sb[set].block,
> +			     mblk->page);
>  	if (ret == 0)
>  		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
>  
> @@ -915,6 +981,23 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
>  	u32 crc, stored_crc;
>  	u64 gen;
>  
> +	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
> +		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
> +			    DMZ_MAGIC, le32_to_cpu(sb->magic));
> +		return -ENXIO;
> +	}
> +
> +	zmd->sb_version = le32_to_cpu(sb->version);
> +	if (zmd->sb_version > DMZ_META_VER) {
> +		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
> +			    DMZ_META_VER, zmd->sb_version);
> +		return -EINVAL;
> +	}
> +	if ((zmd->sb_version < 1) && (set == 2)) {
> +		dmz_dev_err(dev, "Tertiary superblocks are not supported");
> +		return -EINVAL;
> +	}
> +
>  	gen = le64_to_cpu(sb->gen);
>  	stored_crc = le32_to_cpu(sb->crc);
>  	sb->crc = 0;
> @@ -925,18 +1008,44 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
>  		return -ENXIO;
>  	}
>  
> -	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
> -		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
> -			    DMZ_MAGIC, le32_to_cpu(sb->magic));
> -		return -ENXIO;
> -	}
> +	if (zmd->sb_version > 1) {
> +		uuid_t sb_uuid;
> +
> +		memcpy(&sb_uuid, sb->dmz_uuid, UUID_SIZE);
> +		if (uuid_is_null(&sb_uuid)) {
> +			dmz_dev_err(dev, "NULL DM-Zoned uuid");
> +			return -ENXIO;
> +		} else if (uuid_is_null(&zmd->uuid)) {
> +			uuid_copy(&zmd->uuid, &sb_uuid);
> +		} else if (!uuid_equal(&zmd->uuid, &sb_uuid)) {
> +			dmz_dev_err(dev, "mismatching DM-Zoned uuid, "
> +				    "is %pUl expected %pUl",
> +				    &sb_uuid, &zmd->uuid);
> +			return -ENXIO;
> +		}
> +		if (!strlen(zmd->label))
> +			memcpy(zmd->label, sb->dmz_label, BDEVNAME_SIZE);
> +		else if (memcmp(zmd->label, sb->dmz_label, BDEVNAME_SIZE)) {
> +			dmz_dev_err(dev, "mismatching DM-Zoned label, "
> +				    "is %s expected %s",
> +				    sb->dmz_label, zmd->label);
> +			return -ENXIO;
> +		}
> +		memcpy(&dev->uuid, sb->dev_uuid, UUID_SIZE);
> +		if (uuid_is_null(&dev->uuid)) {
> +			dmz_dev_err(dev, "NULL device uuid");
> +			return -ENXIO;
> +		}
>  
> -	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
> -		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
> -			    DMZ_META_VER, le32_to_cpu(sb->version));
> -		return -ENXIO;
> +		if (set == 2) {
> +			if (gen != 0) {
> +				dmz_dev_err(dev, "Invalid generation %llu",
> +					    gen);
> +				return -ENXIO;
> +			}
> +			return 0;
> +		}
>  	}
> -
>  	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + zmd->zone_nr_blocks - 1)
>  		>> zmd->zone_nr_blocks_shift;
>  	if (!nr_meta_zones ||
> @@ -1185,21 +1294,38 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
>  		      "Using super block %u (gen %llu)",
>  		      zmd->mblk_primary, zmd->sb_gen);
>  
> +	if ((zmd->sb_version > 1) && zmd->sb[2].zone) {
> +		zmd->sb[2].block = dmz_start_block(zmd, zmd->sb[2].zone);
> +		zmd->sb[2].dev = dmz_zone_to_dev(zmd, zmd->sb[2].zone);
> +		ret = dmz_get_sb(zmd, 2);
> +		if (ret) {
> +			dmz_dev_err(zmd->sb[2].dev,
> +				    "Read tertiary super block failed");
> +			return ret;
> +		}
> +		ret = dmz_check_sb(zmd, 2);
> +		if (ret == -EINVAL)
> +			return ret;
> +	}
>  	return 0;
>  }
>  
>  /*
>   * Initialize a zone descriptor.
>   */
> -static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
> +static int dmz_init_zone(struct blk_zone *blkz, unsigned int num, void *data)
>  {
>  	struct dmz_metadata *zmd = data;
> +	struct dmz_dev *dev = zmd->nr_devs > 1 ? &zmd->dev[1] : &zmd->dev[0];
> +	int idx = num + dev->zone_offset;
>  	struct dm_zone *zone = &zmd->zones[idx];
> -	struct dmz_dev *dev = zmd->dev;
>  
> -	/* Ignore the eventual last runt (smaller) zone */
>  	if (blkz->len != zmd->zone_nr_sectors) {
> -		if (blkz->start + blkz->len == dev->capacity)
> +		if (zmd->sb_version > 1) {
> +			/* Ignore the eventual runt (smaller) zone */
> +			set_bit(DMZ_OFFLINE, &zone->flags);
> +			return 0;
> +		} else if (blkz->start + blkz->len == dev->capacity)
>  			return 0;
>  		return -ENXIO;
>  	}
> @@ -1234,16 +1360,46 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>  		zmd->nr_useable_zones++;
>  		if (dmz_is_rnd(zone)) {
>  			zmd->nr_rnd_zones++;
> -			if (!zmd->sb[0].zone) {
> -				/* Super block zone */
> +			if (zmd->nr_devs == 1 && !zmd->sb[0].zone) {
> +				/* Primary super block zone */
>  				zmd->sb[0].zone = zone;
>  			}
>  		}
> +		if (zmd->nr_devs > 1 && !zmd->sb[2].zone) {
> +			/* Tertiary superblock zone */
> +			zmd->sb[2].zone = zone;
> +		}
>  	}
>  
>  	return 0;
>  }
>  
> +static void dmz_emulate_zones(struct dmz_metadata *zmd, struct dmz_dev *dev)
> +{
> +	int idx;
> +	sector_t zone_offset = 0;
> +
> +	for(idx = 0; idx < dev->nr_zones; idx++) {
> +		struct dm_zone *zone = &zmd->zones[idx];
> +
> +		INIT_LIST_HEAD(&zone->link);
> +		atomic_set(&zone->refcount, 0);
> +		zone->id = idx;
> +		zone->chunk = DMZ_MAP_UNMAPPED;
> +		set_bit(DMZ_RND, &zone->flags);
> +		zone->wp_block = 0;
> +		zmd->nr_rnd_zones++;
> +		zmd->nr_useable_zones++;
> +		if (dev->capacity - zone_offset <
> +		    zmd->zone_nr_sectors) {

No need for the line break here. It fits in an 80-character line.

> +			/* Disable runt zone */
> +			set_bit(DMZ_OFFLINE, &zone->flags);
> +			break;
> +		}
> +		zone_offset += zmd->zone_nr_sectors;
> +	}
> +}
> +
>  /*
>   * Free zones descriptors.
>   */
> @@ -1259,11 +1415,11 @@ static void dmz_drop_zones(struct dmz_metadata *zmd)
>   */
>  static int dmz_init_zones(struct dmz_metadata *zmd)
>  {
> -	struct dmz_dev *dev = &zmd->dev[0];
> -	int ret;
> +	int i, ret;
> +	struct dmz_dev *zoned_dev = &zmd->dev[0];
>  
>  	/* Init */
> -	zmd->zone_nr_sectors = dev->zone_nr_sectors;
> +	zmd->zone_nr_sectors = zmd->dev[0].zone_nr_sectors;
>  	zmd->zone_nr_sectors_shift = ilog2(zmd->zone_nr_sectors);
>  	zmd->zone_nr_blocks = dmz_sect2blk(zmd->zone_nr_sectors);
>  	zmd->zone_nr_blocks_shift = ilog2(zmd->zone_nr_blocks);
> @@ -1274,7 +1430,14 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
>  					DMZ_BLOCK_SIZE_BITS);
>  
>  	/* Allocate zone array */
> -	zmd->nr_zones = dev->nr_zones;
> +	zmd->nr_zones = 0;
> +	for (i = 0; i < zmd->nr_devs; i++)
> +		zmd->nr_zones += zmd->dev[i].nr_zones;
> +
> +	if (!zmd->nr_zones) {
> +		DMERR("(%s): No zones found", zmd->devname);
> +		return -ENXIO;
> +	}

I tested this, and it does not work for the single zoned device case because
nr_zones is set in the device fixup, which runs after this. So this check sees
nr_zones == 0.
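
The ordering problem can be reproduced with a small user-space model. This is an untested sketch: dev_model, fixup_dev and count_zones are hypothetical stand-ins, and deriving nr_zones from the capacity is a simplification (a real zoned device reports its zone count itself).

```c
#include <assert.h>

/* Hypothetical stand-in for the per-device fields involved. */
struct dev_model {
	unsigned long long capacity;		/* in 512-byte sectors */
	unsigned long long zone_nr_sectors;
	unsigned int nr_zones;			/* 0 until the fixup has run */
};

/* Stand-in for the fixup step: set the zone size and the zone count. */
static void fixup_dev(struct dev_model *dev, unsigned long long zone_nr_sectors)
{
	assert(zone_nr_sectors > 0);
	dev->zone_nr_sectors = zone_nr_sectors;
	/* Round up so that a trailing runt zone is counted too. */
	dev->nr_zones = (unsigned int)((dev->capacity + zone_nr_sectors - 1) /
				       zone_nr_sectors);
}

/* Stand-in for the summation done when sizing the zone array. */
static unsigned int count_zones(const struct dev_model *dev,
				unsigned int nr_devs)
{
	unsigned int i, nr_zones = 0;

	for (i = 0; i < nr_devs; i++)
		nr_zones += dev[i].nr_zones;
	return nr_zones;
}
```

Calling count_zones() on a freshly zeroed device returns 0, which is what trips the "No zones found" check. Once fixup_dev() has run, the same call returns the real count.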

>  	zmd->zones = kcalloc(zmd->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
>  	if (!zmd->zones)
>  		return -ENOMEM;
> @@ -1282,14 +1445,27 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
>  	DMINFO("(%s): Using %zu B for zone information",
>  	       zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
>  
> +	if (zmd->nr_devs > 1) {
> +		dmz_emulate_zones(zmd, &zmd->dev[0]);
> +		/*
> +		 * Primary superblock zone is always at zone 0 when multiple
> +		 * drives are present.
> +		 */
> +		zmd->sb[0].zone = &zmd->zones[0];
> +
> +		zoned_dev = &zmd->dev[1];
> +	}
> +
>  	/*
>  	 * Get zone information and initialize zone descriptors.  At the same
>  	 * time, determine where the super block should be: first block of the
>  	 * first randomly writable zone.
>  	 */
> -	ret = blkdev_report_zones(dev->bdev, 0, BLK_ALL_ZONES, dmz_init_zone,
> -				  zmd);
> +	ret = blkdev_report_zones(zoned_dev->bdev, 0, BLK_ALL_ZONES,
> +				  dmz_init_zone, zmd);
>  	if (ret < 0) {
> +		DMDEBUG("(%s): Failed to report zones, error %d",
> +			zmd->devname, ret);
>  		dmz_drop_zones(zmd);
>  		return ret;
>  	}
> @@ -1325,6 +1501,9 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
>  	unsigned int noio_flag;
>  	int ret;
>  
> +	if (dev->flags & DMZ_BDEV_REGULAR)
> +		return 0;
> +
>  	/*
>  	 * Get zone information from disk. Since blkdev_report_zones() uses
>  	 * GFP_KERNEL by default for memory allocations, set the per-task
> @@ -2475,18 +2654,34 @@ void dmz_print_dev(struct dmz_metadata *zmd, int num)
>  {
>  	struct dmz_dev *dev = &zmd->dev[num];
>  
> -	dmz_dev_info(dev, "Host-%s zoned block device",
> -		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
> -		     "aware" : "managed");
> -	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
> -		     (u64)dev->capacity);
> -	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
> -		     dev->nr_zones, (u64)zmd->zone_nr_sectors);
> +	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE)
> +		dmz_dev_info(dev, "Regular block device");
> +	else
> +		dmz_dev_info(dev, "Host-%s zoned block device",
> +			     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
> +			     "aware" : "managed");
> +	if (zmd->sb_version > 1) {
> +		sector_t sector_offset =
> +			dev->zone_offset << zmd->zone_nr_sectors_shift;
> +
> +		dmz_dev_info(dev, "  uuid %pUl", &dev->uuid);
> +		dmz_dev_info(dev, "  %llu 512-byte logical sectors (offset %llu)",
> +			     (u64)dev->capacity, (u64)sector_offset);
> +		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors (offset %llu)",
> +			     dev->nr_zones, (u64)zmd->zone_nr_sectors,
> +			     (u64)dev->zone_offset);
> +	} else {
> +		dmz_dev_info(dev, "  %llu 512-byte logical sectors",
> +			     (u64)dev->capacity);
> +		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
> +			     dev->nr_zones, (u64)zmd->zone_nr_sectors);
> +	}
>  }
>  /*
>   * Initialize the zoned metadata.
>   */
> -int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
> +int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
> +		     struct dmz_metadata **metadata,
>  		     const char *devname)
>  {
>  	struct dmz_metadata *zmd;
> @@ -2500,6 +2695,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>  
>  	strcpy(zmd->devname, devname);
>  	zmd->dev = dev;
> +	zmd->nr_devs = num_dev;
>  	zmd->mblk_rbtree = RB_ROOT;
>  	init_rwsem(&zmd->mblk_sem);
>  	mutex_init(&zmd->mblk_flush_lock);
> @@ -2534,11 +2730,24 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>  	/* Set metadata zones starting from sb_zone */
>  	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
>  		zone = dmz_get(zmd, zmd->sb[0].zone->id + i);
> -		if (!dmz_is_rnd(zone))
> +		if (!dmz_is_rnd(zone)) {
> +			dmz_zmd_err(zmd,
> +				    "metadata zone %d is not random", i);
> +			ret = -ENXIO;
>  			goto err;
> +		}
> +		set_bit(DMZ_META, &zone->flags);
> +	}
> +	if (zmd->sb[2].zone) {
> +		zone = dmz_get(zmd, zmd->sb[2].zone->id);
> +		if (!zone) {
> +			dmz_zmd_err(zmd,
> +				    "Tertiary metadata zone not present");
> +			ret = -ENXIO;
> +			goto err;
> +		}
>  		set_bit(DMZ_META, &zone->flags);
>  	}
> -
>  	/* Load mapping table */
>  	ret = dmz_load_mapping(zmd);
>  	if (ret)
> @@ -2563,8 +2772,13 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>  		goto err;
>  	}
>  
> -	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", DMZ_META_VER);
> -	dmz_print_dev(zmd, 0);
> +	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", zmd->sb_version);
> +	if (zmd->sb_version > 1) {
> +		dmz_zmd_info(zmd, "DM UUID %pUl", &zmd->uuid);
> +		dmz_zmd_info(zmd, "DM Label %s", zmd->label);
> +	}
> +	for (i = 0; i < zmd->nr_devs; i++)
> +		dmz_print_dev(zmd, i);
>  
>  	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
>  		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index 4897ffae96ca..ae05d5d60b37 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -38,7 +38,7 @@ struct dm_chunk_work {
>   * Target descriptor.
>   */
>  struct dmz_target {
> -	struct dm_dev		*ddev;
> +	struct dm_dev		*ddev[2];
>  
>  	unsigned long		flags;
>  
> @@ -684,60 +684,40 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
>  /*
>   * Get zoned device information.
>   */
> -static int dmz_get_zoned_device(struct dm_target *ti, char *path)
> +static int dmz_get_zoned_device(struct dm_target *ti, char *path, int num)
>  {
>  	struct dmz_target *dmz = ti->private;
> -	struct request_queue *q;
>  	struct dmz_dev *dev;
> -	sector_t aligned_capacity;
>  	int ret;
> +	struct block_device *bdev;
>  
>  	/* Get the target device */
> -	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &dmz->ddev);
> +	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table),
> +			    &dmz->ddev[num]);
>  	if (ret) {
>  		ti->error = "Get target device failed";
> -		dmz->ddev = NULL;
> +		dmz->ddev[num] = NULL;
>  		return ret;
>  	}
>  
> -	dev = kzalloc(sizeof(struct dmz_dev), GFP_KERNEL);
> -	if (!dev) {
> -		ret = -ENOMEM;
> -		goto err;
> -	}
> -
> -	dev->bdev = dmz->ddev->bdev;
> +	bdev = dmz->ddev[num]->bdev;
> +	if (bdev_zoned_model(bdev) == BLK_ZONED_NONE) {
> +		dev = &dmz->dev[0];
> +		dev->flags = DMZ_BDEV_REGULAR;
> +	} else
> +		dev = &dmz->dev[1];
> +	dev->bdev = bdev;
>  	(void)bdevname(dev->bdev, dev->name);

I changed this. See below.

>  
> -	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE) {
> -		ti->error = "Not a zoned block device";
> -		ret = -EINVAL;
> -		goto err;
> -	}
> -
> -	q = bdev_get_queue(dev->bdev);
>  	dev->capacity = i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
> -	aligned_capacity = dev->capacity &
> -				~((sector_t)blk_queue_zone_sectors(q) - 1);
> -	if (ti->begin ||
> -	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
> -		ti->error = "Partial mapping not supported";
> -		ret = -EINVAL;
> -		goto err;
> +	if (ti->begin) {
> +		ti->error = "Partial mapping is not supported";
> +		dm_put_device(ti, dmz->ddev[num]);
> +		dmz->ddev[num] = NULL;
> +		return -EINVAL;
>  	}
>  
> -	dev->zone_nr_sectors = blk_queue_zone_sectors(q);
> -
> -	dev->nr_zones = blkdev_nr_zones(dev->bdev->bd_disk);
> -
> -	dmz->dev = dev;
> -
>  	return 0;
> -err:
> -	dm_put_device(ti, dmz->ddev);
> -	kfree(dev);
> -
> -	return ret;
>  }
>  
>  /*
> @@ -747,9 +727,46 @@ static void dmz_put_zoned_device(struct dm_target *ti)
>  {
>  	struct dmz_target *dmz = ti->private;
>  
> -	dm_put_device(ti, dmz->ddev);
> -	kfree(dmz->dev);
> -	dmz->dev = NULL;
> +	if (dmz->ddev[1]) {
> +		dm_put_device(ti, dmz->ddev[1]);
> +		dmz->ddev[1] = NULL;
> +	}
> +	dm_put_device(ti, dmz->ddev[0]);
> +	dmz->ddev[0] = NULL;

Wouldn't a for loop be cleaner here?
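
Roughly along these lines, as an untested user-space sketch. DMZ_MODEL_MAX_DEVS, ddev_model, target_model and put_device are hypothetical names, with put_device() standing in for dm_put_device().

```c
#include <assert.h>
#include <stddef.h>

#define DMZ_MODEL_MAX_DEVS 2	/* hypothetical constant */

/* Hypothetical stand-in for struct dm_dev: only tracks references. */
struct ddev_model {
	int refs;
};

struct target_model {
	struct ddev_model *ddev[DMZ_MODEL_MAX_DEVS];
};

/* Stand-in for dm_put_device(): drop one reference. */
static void put_device(struct ddev_model *ddev)
{
	assert(ddev->refs > 0);
	ddev->refs--;
}

/* Release every device that was acquired and clear its slot. */
static void put_zoned_devices(struct target_model *t)
{
	int i;

	for (i = 0; i < DMZ_MODEL_MAX_DEVS; i++) {
		if (t->ddev[i]) {
			put_device(t->ddev[i]);
			t->ddev[i] = NULL;
		}
	}
}
```

The loop treats both slots the same way, so it also copes with the case where only the first device was ever acquired.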

> +}
> +
> +static int dmz_fixup_devices(struct dm_target *ti)
> +{
> +	struct dmz_target *dmz = ti->private;
> +	struct dmz_dev *pri_dev, *sec_dev;
> +	struct request_queue *q;
> +
> +	pri_dev = &dmz->dev[0];
> +	if (!(pri_dev->flags & DMZ_BDEV_REGULAR)) {
> +		ti->error = "Primary disk is not a regular device";
> +		return -EINVAL;
> +	}
> +	sec_dev = &dmz->dev[1];
> +	if (sec_dev->flags & DMZ_BDEV_REGULAR) {
> +		ti->error = "Secondary disk is not a zoned device";
> +		return -EINVAL;
> +	}
> +	q = bdev_get_queue(sec_dev->bdev);
> +	sec_dev->zone_nr_sectors = blk_queue_zone_sectors(q);
> +	sec_dev->nr_zones = blkdev_nr_zones(sec_dev->bdev->bd_disk);
> +
> +	pri_dev->zone_nr_sectors = sec_dev->zone_nr_sectors;
> +	pri_dev->nr_zones = DIV_ROUND_UP(pri_dev->capacity,
> +					 pri_dev->zone_nr_sectors);
> +	sec_dev->zone_offset = pri_dev->nr_zones;
> +	/* Check if we need to swizzle devices */
> +	if (pri_dev->bdev != dmz->ddev[0]->bdev) {
> +		struct dm_dev *ddev = dmz->ddev[0];
> +
> +		dmz->ddev[0] = dmz->ddev[1];
> +		dmz->ddev[1] = ddev;
> +	}
> +	return 0;

Changed this too. See below.

>  }
>  
>  /*
> @@ -758,11 +775,10 @@ static void dmz_put_zoned_device(struct dm_target *ti)
>  static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  {
>  	struct dmz_target *dmz;
> -	struct dmz_dev *dev;
>  	int ret;
>  
>  	/* Check arguments */
> -	if (argc != 1) {
> +	if (argc < 1 || argc > 2) {
>  		ti->error = "Invalid argument count";
>  		return -EINVAL;
>  	}
> @@ -773,18 +789,34 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  		ti->error = "Unable to allocate the zoned target descriptor";
>  		return -ENOMEM;
>  	}
> +	dmz->dev = kcalloc(2, sizeof(struct dmz_dev), GFP_KERNEL);
> +	if (!dmz->dev) {
> +		ti->error = "Unable to allocate the zoned device descriptors";
> +		kfree(dmz);
> +		return -ENOMEM;
> +	}
>  	ti->private = dmz;
>  
>  	/* Get the target zoned block device */
> -	ret = dmz_get_zoned_device(ti, argv[0]);
> -	if (ret) {
> -		dmz->ddev = NULL;
> +	ret = dmz_get_zoned_device(ti, argv[0], 0);
> +	if (ret)
>  		goto err;
> +
> +	if (argc == 2) {
> +		ret = dmz_get_zoned_device(ti, argv[1], 1);
> +		if (ret) {
> +			dmz_put_zoned_device(ti);
> +			goto err;
> +		}
> +		ret = dmz_fixup_devices(ti);
> +		if (ret) {
> +			dmz_put_zoned_device(ti);
> +			goto err;
> +		}

Fixup devices needs to be called regardless of the number of drives so that
zone_nr_sectors and nr_zones get initialized. See below.

>  	}
>  
>  	/* Initialize metadata */
> -	dev = dmz->dev;
> -	ret = dmz_ctr_metadata(dev, &dmz->metadata,
> +	ret = dmz_ctr_metadata(dmz->dev, argc, &dmz->metadata,
>  			       dm_table_device_name(ti->table));
>  	if (ret) {
>  		ti->error = "Metadata initialization failed";
> @@ -861,6 +893,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  err_dev:
>  	dmz_put_zoned_device(ti);
>  err:
> +	kfree(dmz->dev);
>  	kfree(dmz);
>  
>  	return ret;
> @@ -891,6 +924,7 @@ static void dmz_dtr(struct dm_target *ti)
>  
>  	mutex_destroy(&dmz->chunk_lock);
>  
> +	kfree(dmz->dev);
>  	kfree(dmz);
>  }
>  
> @@ -965,10 +999,17 @@ static int dmz_iterate_devices(struct dm_target *ti,
>  			       iterate_devices_callout_fn fn, void *data)
>  {
>  	struct dmz_target *dmz = ti->private;
> -	struct dmz_dev *dev = dmz->dev;
> -	sector_t capacity = dev->capacity & ~(dmz_zone_nr_sectors(dmz->metadata) - 1);
> -
> -	return fn(ti, dmz->ddev, 0, capacity, data);
> +	unsigned int zone_nr_sectors = dmz_zone_nr_sectors(dmz->metadata);
> +	sector_t capacity;
> +	int r;
> +
> +	capacity = dmz->dev[0].capacity & ~(zone_nr_sectors - 1);
> +	r = fn(ti, dmz->ddev[0], 0, capacity, data);
> +	if (!r && dmz->ddev[1]) {
> +		capacity = dmz->dev[1].capacity & ~(zone_nr_sectors - 1);
> +		r = fn(ti, dmz->ddev[1], 0, capacity, data);
> +	}
> +	return r;
>  }
>  
>  static void dmz_status(struct dm_target *ti, status_type_t type,
> @@ -978,6 +1019,7 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
>  	struct dmz_target *dmz = ti->private;
>  	ssize_t sz = 0;
>  	char buf[BDEVNAME_SIZE];
> +	struct dmz_dev *dev;
>  
>  	switch (type) {
>  	case STATUSTYPE_INFO:
> @@ -991,8 +1033,14 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
>  		       dmz_nr_seq_zones(dmz->metadata));
>  		break;
>  	case STATUSTYPE_TABLE:
> -		format_dev_t(buf, dmz->dev->bdev->bd_dev);
> +		dev = &dmz->dev[0];
> +		format_dev_t(buf, dev->bdev->bd_dev);
>  		DMEMIT("%s ", buf);
> +		if (dmz->dev[1].bdev) {
> +			dev = &dmz->dev[1];
> +			format_dev_t(buf, dev->bdev->bd_dev);
> +			DMEMIT("%s ", buf);
> +		}
>  		break;
>  	}
>  	return;
> @@ -1014,7 +1062,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>  
>  static struct target_type dmz_type = {
>  	.name		 = "zoned",
> -	.version	 = {1, 1, 0},
> +	.version	 = {1, 2, 0},

Maybe go to version 2.0.0 to match the metadata version number?

>  	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>  	.module		 = THIS_MODULE,
>  	.ctr		 = dmz_ctr,
> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
> index 454ebd628cca..e383d5b2a3c5 100644
> --- a/drivers/md/dm-zoned.h
> +++ b/drivers/md/dm-zoned.h
> @@ -52,10 +52,12 @@ struct dmz_dev {
>  	struct block_device	*bdev;
>  
>  	char			name[BDEVNAME_SIZE];
> +	uuid_t			uuid;
>  
>  	sector_t		capacity;
>  
>  	unsigned int		nr_zones;
> +	unsigned int		zone_offset;
>  
>  	unsigned int		flags;
>  
> @@ -69,6 +71,7 @@ struct dmz_dev {
>  /* Device flags. */
>  #define DMZ_BDEV_DYING		(1 << 0)
>  #define DMZ_CHECK_BDEV		(2 << 0)
> +#define DMZ_BDEV_REGULAR	(4 << 0)
>  
>  /*
>   * Zone descriptor.
> @@ -163,8 +166,8 @@ struct dmz_reclaim;
>  /*
>   * Functions defined in dm-zoned-metadata.c
>   */
> -int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd,
> -		     const char *devname);
> +int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
> +		     struct dmz_metadata **zmd, const char *devname);
>  void dmz_dtr_metadata(struct dmz_metadata *zmd);
>  int dmz_resume_metadata(struct dmz_metadata *zmd);
>  
> @@ -176,15 +179,13 @@ void dmz_lock_flush(struct dmz_metadata *zmd);
>  void dmz_unlock_flush(struct dmz_metadata *zmd);
>  int dmz_flush_metadata(struct dmz_metadata *zmd);
>  const char *dmz_metadata_label(struct dmz_metadata *zmd);
> +bool dmz_check_dev(struct dmz_metadata *zmd);
>  
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
>  sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
>  unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
>  struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone);
>  
> -bool dmz_check_dev(struct dmz_metadata *zmd);
> -bool dmz_dev_is_dying(struct dmz_metadata *zmd);
> -
>  #define DMZ_ALLOC_RND		0x01
>  #define DMZ_ALLOC_RECLAIM	0x02
>  
> @@ -251,6 +252,7 @@ int dmz_copy_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
>  			  struct dm_zone *to_zone);
>  int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
>  			   struct dm_zone *to_zone, sector_t chunk_block);
> +bool dmz_dev_is_dying(struct dmz_metadata *zmd);
>  
>  /*
>   * Functions defined in dm-zoned-reclaim.c
> 

I ran the entire series through simple tests. As noted above, the single drive
case is broken. Here is what I applied on top of this patch to fix it:


From fe074133b780a66403e24896c78691312e7b692a Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal@wdc.com>
Date: Tue, 28 Apr 2020 16:58:13 +0900
Subject: [PATCH] Fix initialization

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/md/dm-zoned-target.c | 130 ++++++++++++++++++++++-------------
 1 file changed, 81 insertions(+), 49 deletions(-)

diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index ae05d5d60b37..e420cd7a251e 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -13,6 +13,8 @@

 #define DMZ_MIN_BIOS           8192

+#define DMZ_MAX_DEVS           2
+
 /*
  * Zone BIO context.
  */
@@ -38,7 +40,7 @@ struct dm_chunk_work {
  * Target descriptor.
  */
 struct dmz_target {
-       struct dm_dev           *ddev[2];
+       struct dm_dev           *ddev[DMZ_MAX_DEVS];

        unsigned long           flags;

@@ -684,40 +686,58 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
 /*
  * Get zoned device information.
  */
-static int dmz_get_zoned_device(struct dm_target *ti, char *path, int num)
+static int dmz_get_zoned_device(struct dm_target *ti, char *path, int nr_devs)
 {
        struct dmz_target *dmz = ti->private;
+       struct dm_dev *ddev;
        struct dmz_dev *dev;
-       int ret;
+       int idx, ret;
        struct block_device *bdev;

        /* Get the target device */
-       ret = dm_get_device(ti, path, dm_table_get_mode(ti->table),
-                           &dmz->ddev[num]);
+       ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &ddev);
        if (ret) {
                ti->error = "Get target device failed";
-               dmz->ddev[num] = NULL;
                return ret;
        }

-       bdev = dmz->ddev[num]->bdev;
+       bdev = ddev->bdev;
        if (bdev_zoned_model(bdev) == BLK_ZONED_NONE) {
+               if (nr_devs == 1) {
+                       ti->error = "Invalid regular device";
+                       goto err;
+               }
+               if (dmz->ddev[0]) {
+                       ti->error = "Too many regular devices";
+                       goto err;
+               }
+               idx = 0;
                dev = &dmz->dev[0];
                dev->flags = DMZ_BDEV_REGULAR;
-       } else
-               dev = &dmz->dev[1];
+       } else {
+               idx = nr_devs - 1;
+               if (dmz->ddev[idx]) {
+                       ti->error = "Too many zoned devices";
+                       goto err;
+               }
+               dev = &dmz->dev[idx];
+       }
        dev->bdev = bdev;
        (void)bdevname(dev->bdev, dev->name);

-       dev->capacity = i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
+       dev->capacity = i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
        if (ti->begin) {
                ti->error = "Partial mapping is not supported";
-               dm_put_device(ti, dmz->ddev[num]);
-               dmz->ddev[num] = NULL;
-               return -EINVAL;
+               goto err;
        }

+       dmz->ddev[idx] = ddev;
+
        return 0;
+
+err:
+       dm_put_device(ti, ddev);
+       return -EINVAL;
 }

 /*
@@ -726,46 +746,57 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path, int num)
 static void dmz_put_zoned_device(struct dm_target *ti)
 {
        struct dmz_target *dmz = ti->private;
+       int i;

-       if (dmz->ddev[1]) {
-               dm_put_device(ti, dmz->ddev[1]);
-               dmz->ddev[1] = NULL;
+       for (i = 0; i < DMZ_MAX_DEVS; i++) {
+               if (dmz->ddev[i]) {
+                       dm_put_device(ti, dmz->ddev[i]);
+                       dmz->ddev[i] = NULL;
+               }
        }
-       dm_put_device(ti, dmz->ddev[0]);
-       dmz->ddev[0] = NULL;
 }

 static int dmz_fixup_devices(struct dm_target *ti)
 {
        struct dmz_target *dmz = ti->private;
-       struct dmz_dev *pri_dev, *sec_dev;
+       struct dmz_dev *reg_dev, *zoned_dev;
        struct request_queue *q;

-       pri_dev = &dmz->dev[0];
-       if (!(pri_dev->flags & DMZ_BDEV_REGULAR)) {
-               ti->error = "Primary disk is not a regular device";
-               return -EINVAL;
-       }
-       sec_dev = &dmz->dev[1];
-       if (sec_dev->flags & DMZ_BDEV_REGULAR) {
-               ti->error = "Secondary disk is not a zoned device";
-               return -EINVAL;
+       /*
+        * When we have two devices, the first one must be a regular block
+        * device and the second a zoned block device.
+        */
+       if (dmz->ddev[0] && dmz->ddev[1]) {
+               reg_dev = &dmz->dev[0];
+               if (!(reg_dev->flags & DMZ_BDEV_REGULAR)) {
+                       ti->error = "Primary disk is not a regular device";
+                       return -EINVAL;
+               }
+               zoned_dev = &dmz->dev[1];
+               if (zoned_dev->flags & DMZ_BDEV_REGULAR) {
+                       ti->error = "Secondary disk is not a zoned device";
+                       return -EINVAL;
+               }
+       } else {
+               reg_dev = NULL;
+               zoned_dev = &dmz->dev[0];
+               if (zoned_dev->flags & DMZ_BDEV_REGULAR) {
+                       ti->error = "disk is not a zoned device";
+                       return -EINVAL;
+               }
        }
-       q = bdev_get_queue(sec_dev->bdev);
-       sec_dev->zone_nr_sectors = blk_queue_zone_sectors(q);
-       sec_dev->nr_zones = blkdev_nr_zones(sec_dev->bdev->bd_disk);
-
-       pri_dev->zone_nr_sectors = sec_dev->zone_nr_sectors;
-       pri_dev->nr_zones = DIV_ROUND_UP(pri_dev->capacity,
-                                        pri_dev->zone_nr_sectors);
-       sec_dev->zone_offset = pri_dev->nr_zones;
-       /* Check if we need to swizzle devices */
-       if (pri_dev->bdev != dmz->ddev[0]->bdev) {
-               struct dm_dev *ddev = dmz->ddev[0];
-
-               dmz->ddev[0] = dmz->ddev[1];
-               dmz->ddev[1] = ddev;
+
+       q = bdev_get_queue(zoned_dev->bdev);
+       zoned_dev->zone_nr_sectors = blk_queue_zone_sectors(q);
+       zoned_dev->nr_zones = blkdev_nr_zones(zoned_dev->bdev->bd_disk);
+
+       if (reg_dev) {
+               reg_dev->zone_nr_sectors = zoned_dev->zone_nr_sectors;
+               reg_dev->nr_zones = DIV_ROUND_UP(reg_dev->capacity,
+                                                reg_dev->zone_nr_sectors);
+               zoned_dev->zone_offset = reg_dev->nr_zones;
        }
+
        return 0;
 }

@@ -798,23 +829,24 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
        ti->private = dmz;

        /* Get the target zoned block device */
-       ret = dmz_get_zoned_device(ti, argv[0], 0);
+       ret = dmz_get_zoned_device(ti, argv[0], argc);
        if (ret)
                goto err;

        if (argc == 2) {
-               ret = dmz_get_zoned_device(ti, argv[1], 1);
-               if (ret) {
-                       dmz_put_zoned_device(ti);
-                       goto err;
-               }
-               ret = dmz_fixup_devices(ti);
+               ret = dmz_get_zoned_device(ti, argv[1], argc);
                if (ret) {
                        dmz_put_zoned_device(ti);
                        goto err;
                }
        }

+       ret = dmz_fixup_devices(ti);
+       if (ret) {
+               dmz_put_zoned_device(ti);
+               goto err;
+       }
+
        /* Initialize metadata */
        ret = dmz_ctr_metadata(dmz->dev, argc, &dmz->metadata,
                               dm_table_device_name(ti->table));
-- 
2.25.4

With this, everything works fine for the single and dual device cases. But I only
did very light testing (formatting with ext4, mounting, running a simple fio job,
unmounting). I also noticed these messages on dmzadm --start:

[ 2707.268812] device-mapper: zoned metadata: (253:0): Using 3233664 B for zone
information
[ 2707.921500] device-mapper: zoned metadata: (dmz-sdj): DM-Zoned metadata version 2
[ 2707.929865] device-mapper: zoned metadata: (dmz-sdj): DM UUID
01149f45-1391-d44d-803a-7830d7d62b12
[ 2707.939457] device-mapper: zoned metadata: (dmz-sdj): DM Label dmz-sdj
[ 2707.946371] device-mapper: zoned metadata: (nvme1n1): Regular block device
[ 2707.953616] device-mapper: zoned metadata: (nvme1n1):   uuid
df2c308c-9c98-1845-afad-6bf80bd0ad4a
[ 2707.963097] device-mapper: zoned metadata: (nvme1n1):   976773168 512-byte
logical sectors (offset 0)
[ 2707.972940] device-mapper: zoned metadata: (nvme1n1):   1864 zones of 524288
512-byte logical sectors (offset 0)
[ 2707.983747] device-mapper: zoned metadata: (sdj): Host-managed zoned block device
[ 2707.991852] device-mapper: zoned metadata: (sdj):   uuid
f842e365-53b6-4942-ad00-954e50bec940
[ 2708.001004] device-mapper: zoned metadata: (sdj):   29297213440 512-byte
logical sectors (offset 977272832)
[ 2708.011380] device-mapper: zoned metadata: (sdj):   55880 zones of 524288
512-byte logical sectors (offset 1864)
[ 2708.022184] device-mapper: zoned metadata: (dmz-sdj):   57744 zones of 524288
512-byte logical sectors
[ 2708.032116] device-mapper: zoned metadata: (dmz-sdj):   4 metadata zones
[ 2708.039212] device-mapper: zoned metadata: (dmz-sdj):   57724 data zones for
57724 chunks
[ 2708.048018] device-mapper: zoned metadata: (dmz-sdj):     2383 random zones
(2383 unmapped)
[ 2708.057023] device-mapper: zoned metadata: (dmz-sdj):     55340 sequential
zones (55340 unmapped)
[ 2708.066509] device-mapper: zoned metadata: (dmz-sdj):   16 reserved
sequential data zones
[ 2708.112529] device-mapper: zoned: (dmz-sdj): Target device: 30264000512
512-byte logical sectors (3783000064 blocks)
[ 2708.125465] device-mapper: table: 253:0: adding target device sdj caused an
alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
alignment_offset=0, start=0
[ 2708.142332] device-mapper: table: 253:0: adding target device sdj caused an
alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
alignment_offset=0, start=0
[ 2708.159659] device-mapper: table: 253:0: adding target device sdj caused an
alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
alignment_offset=0, start=0
[ 2708.176600] device-mapper: table: 253:0: adding target device sdj caused an
alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
alignment_offset=0, start=0

Which I think comes from the fact that I mixed a 4Kn SMR drive with a 512B
sector M.2 NVMe drive. The different sector sizes seem to trigger this. I have
not dug further yet.
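
As a cross-check of the layout reported in the log above, here is a minimal
userspace sketch (not kernel code) of the arithmetic that dmz_fixup_devices()
performs in the dual device case. The struct and function names are made up for
illustration; only the round-up of the regular device's zone count and the
zone_offset assignment mirror the patch. Rounding up the zoned device's zone
count is an assumption standing in for blkdev_nr_zones().

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the struct dmz_dev fields used here. */
struct dev_layout {
	uint64_t capacity;        /* device size in 512-byte sectors */
	uint64_t zone_nr_sectors; /* zone size in 512-byte sectors */
	uint64_t nr_zones;        /* number of zones on this device */
	uint64_t zone_offset;     /* first target-wide zone ID of this device */
};

/* Same rounding as the kernel's DIV_ROUND_UP() macro. */
uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/*
 * Lay out a regular device in front of a zoned device. The regular device
 * borrows the zone size of the zoned device, a runt last zone is counted as
 * a full zone, and the zoned device's zones are numbered after it.
 */
void layout_devices(struct dev_layout *reg, struct dev_layout *zoned)
{
	/* Assumption: stands in for blkdev_nr_zones() on the zoned device. */
	zoned->nr_zones = div_round_up(zoned->capacity, zoned->zone_nr_sectors);

	reg->zone_nr_sectors = zoned->zone_nr_sectors;
	reg->nr_zones = div_round_up(reg->capacity, reg->zone_nr_sectors);
	reg->zone_offset = 0;

	zoned->zone_offset = reg->nr_zones;
}
```

With the capacities from the log (976773168 sectors for nvme1n1, 29297213440
sectors for sdj, zones of 524288 sectors) this gives 1864 and 55880 zones, and a
zone offset of 1864 for sdj, which is sector offset 977272832. The sum, 57744
zones, matches the dmz-sdj line as well.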

FYI, I pushed your patches for dmzadm to the "staging" branch on github.

Cheers.


-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 13/13] dm-zoned: metadata version 2
  2020-04-28 10:54   ` Damien Le Moal
@ 2020-04-28 17:37     ` Mike Snitzer
  2020-04-30 14:45     ` Hannes Reinecke
  1 sibling, 0 replies; 24+ messages in thread
From: Mike Snitzer @ 2020-04-28 17:37 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Bob Liu, dm-devel

On Tue, Apr 28 2020 at  6:54am -0400,
Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
 
> With this, everything works fine for single and dual device case.

Cool.

Hannes, please fold Damien's changes in for v3, thanks!

> But I only did very light testing (formatting with ext4, mounting,
> running simple fio, unmount). I also noticed this message on dmzadm
> --start:
> 
> [ 2707.268812] device-mapper: zoned metadata: (253:0): Using 3233664 B for zone
> information
> [ 2707.921500] device-mapper: zoned metadata: (dmz-sdj): DM-Zoned metadata version 2
> [ 2707.929865] device-mapper: zoned metadata: (dmz-sdj): DM UUID
> 01149f45-1391-d44d-803a-7830d7d62b12
> [ 2707.939457] device-mapper: zoned metadata: (dmz-sdj): DM Label dmz-sdj
> [ 2707.946371] device-mapper: zoned metadata: (nvme1n1): Regular block device
> [ 2707.953616] device-mapper: zoned metadata: (nvme1n1):   uuid
> df2c308c-9c98-1845-afad-6bf80bd0ad4a
> [ 2707.963097] device-mapper: zoned metadata: (nvme1n1):   976773168 512-byte
> logical sectors (offset 0)
> [ 2707.972940] device-mapper: zoned metadata: (nvme1n1):   1864 zones of 524288
> 512-byte logical sectors (offset 0)
> [ 2707.983747] device-mapper: zoned metadata: (sdj): Host-managed zoned block device
> [ 2707.991852] device-mapper: zoned metadata: (sdj):   uuid
> f842e365-53b6-4942-ad00-954e50bec940
> [ 2708.001004] device-mapper: zoned metadata: (sdj):   29297213440 512-byte
> logical sectors (offset 977272832)
> [ 2708.011380] device-mapper: zoned metadata: (sdj):   55880 zones of 524288
> 512-byte logical sectors (offset 1864)
> [ 2708.022184] device-mapper: zoned metadata: (dmz-sdj):   57744 zones of 524288
> 512-byte logical sectors
> [ 2708.032116] device-mapper: zoned metadata: (dmz-sdj):   4 metadata zones
> [ 2708.039212] device-mapper: zoned metadata: (dmz-sdj):   57724 data zones for
> 57724 chunks
> [ 2708.048018] device-mapper: zoned metadata: (dmz-sdj):     2383 random zones
> (2383 unmapped)
> [ 2708.057023] device-mapper: zoned metadata: (dmz-sdj):     55340 sequential
> zones (55340 unmapped)
> [ 2708.066509] device-mapper: zoned metadata: (dmz-sdj):   16 reserved
> sequential data zones
> [ 2708.112529] device-mapper: zoned: (dmz-sdj): Target device: 30264000512
> 512-byte logical sectors (3783000064 blocks)

Not liking how chatty DM zoned metadata has become... can that be
removed and the proper .status updates be provided?  (yes I know zoned
never provided .status but this series should introduce a basic one
early in the series, that should've always been there, and then update
it for v2 metadata).

> [ 2708.125465] device-mapper: table: 253:0: adding target device sdj caused an
> alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
> alignment_offset=0, start=0
> [ 2708.142332] device-mapper: table: 253:0: adding target device sdj caused an
> alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
> alignment_offset=0, start=0
> [ 2708.159659] device-mapper: table: 253:0: adding target device sdj caused an
> alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
> alignment_offset=0, start=0
> [ 2708.176600] device-mapper: table: 253:0: adding target device sdj caused an
> alignment inconsistency: physical_block_size=4096, logical_block_size=4096,
> alignment_offset=0, start=0
> 
> Which I think comes from the fact that I mixed a 4Kn SMR drive with a 512B
> sector M.2 NVMe drive. The different sector sizes seem to trigger this. I have
> not dug further yet.

I'd have to dig further myself to understand the disposition of these
messages... if it is born of 512 vs 4096, I'm missing why
.iterate_devices isn't properly establishing limits that are compatible
with your combined 512 and 4096 hybrid, e.g. requiring 4096 to satisfy
the 4K device's constraints.
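
To make the "require 4096" point concrete, here is a small userspace sketch of
the general idea. It is not the block layer's limit stacking code, and both
helper names are hypothetical: the combined logical block size is taken as the
larger of the two devices' sizes, and a range is accepted only if it starts and
ends on a boundary of that size.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u

/* Hypothetical helper: the stacked logical block size is the larger one. */
unsigned int stacked_logical_block_size(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical helper: true if a range given in 512-byte sectors starts and
 * ends on a logical block boundary of size lbs (in bytes).
 */
bool range_is_aligned(uint64_t start, uint64_t len, unsigned int lbs)
{
	uint64_t sectors_per_block = lbs / SKETCH_SECTOR_SIZE;

	return (start % sectors_per_block) == 0 &&
	       (len % sectors_per_block) == 0;
}
```

For the two drives in this report, stacked_logical_block_size(512, 4096) is
4096, so any range handed to the callback would have to be a multiple of 8
sectors to pass range_is_aligned().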

Mike


* Re: [PATCH 13/13] dm-zoned: metadata version 2
  2020-04-28 10:54   ` Damien Le Moal
  2020-04-28 17:37     ` Mike Snitzer
@ 2020-04-30 14:45     ` Hannes Reinecke
  2020-05-01  0:15       ` Damien Le Moal
  1 sibling, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-04-30 14:45 UTC (permalink / raw)
  To: Damien Le Moal, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 4/28/20 12:54 PM, Damien Le Moal wrote:
> On 2020/04/20 19:09, Hannes Reinecke wrote:
>> Implement handling for metadata version 2. The new metadata adds
>> a label and UUID for the device mapper device, and additional UUID
>> for the underlying block devices.
>> It also allows for an additional regular drive to be used for
>> emulating random access zones. The emulated zones will be placed
>> logically in front of the zones from the zoned block device, causing
>> the superblocks and metadata to be stored on that device.
>> The first zone of the original zoned device will be used to hold
>> another, tertiary copy of the metadata; this copy carries a
>> generation number of 0 and is never updated; it's just used
>> for identification.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> Reviewed-by: Bob Liu <bob.liu@oracle.com>
>> ---
>>   drivers/md/dm-zoned-metadata.c | 314 ++++++++++++++++++++++++++++++++++-------
>>   drivers/md/dm-zoned-target.c   | 156 +++++++++++++-------
>>   drivers/md/dm-zoned.h          |  12 +-
>>   3 files changed, 373 insertions(+), 109 deletions(-)
>>
>> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
>> index c009f2d962e2..1f31635aba73 100644
>> --- a/drivers/md/dm-zoned-metadata.c
>> +++ b/drivers/md/dm-zoned-metadata.c
>> @@ -16,7 +16,7 @@
>>   /*
>>    * Metadata version.
>>    */
>> -#define DMZ_META_VER	1
>> +#define DMZ_META_VER	2
>>   
>>   /*
>>    * On-disk super block magic.
>> @@ -69,8 +69,17 @@ struct dmz_super {
>>   	/* Checksum */
>>   	__le32		crc;			/*  48 */
>>   
>> +	/* DM-Zoned label */
>> +	u8		dmz_label[32];		/*  80 */
>> +
>> +	/* DM-Zoned UUID */
>> +	u8		dmz_uuid[16];		/*  96 */
>> +
>> +	/* Device UUID */
>> +	u8		dev_uuid[16];		/* 112 */
>> +
>>   	/* Padding to full 512B sector */
>> -	u8		reserved[464];		/* 512 */
>> +	u8		reserved[400];		/* 512 */
>>   };
>>   
>>   /*
>> @@ -133,8 +142,11 @@ struct dmz_sb {
>>    */
>>   struct dmz_metadata {
>>   	struct dmz_dev		*dev;
>> +	unsigned int		nr_devs;
>>   
>>   	char			devname[BDEVNAME_SIZE];
>> +	char			label[BDEVNAME_SIZE];
>> +	uuid_t			uuid;
>>   
>>   	sector_t		zone_bitmap_size;
>>   	unsigned int		zone_nr_bitmap_blocks;
>> @@ -161,8 +173,9 @@ struct dmz_metadata {
>>   	/* Zone information array */
>>   	struct dm_zone		*zones;
>>   
>> -	struct dmz_sb		sb[2];
>> +	struct dmz_sb		sb[3];
>>   	unsigned int		mblk_primary;
>> +	unsigned int		sb_version;
>>   	u64			sb_gen;
>>   	unsigned int		min_nr_mblks;
>>   	unsigned int		max_nr_mblks;
>> @@ -195,31 +208,56 @@ struct dmz_metadata {
>>   };
>>   
>>   #define dmz_zmd_info(zmd, format, args...)	\
>> -	DMINFO("(%s): " format, (zmd)->devname, ## args)
>> +	DMINFO("(%s): " format, (zmd)->label, ## args)
>>   
>>   #define dmz_zmd_err(zmd, format, args...)	\
>> -	DMERR("(%s): " format, (zmd)->devname, ## args)
>> +	DMERR("(%s): " format, (zmd)->label, ## args)
>>   
>>   #define dmz_zmd_warn(zmd, format, args...)	\
>> -	DMWARN("(%s): " format, (zmd)->devname, ## args)
>> +	DMWARN("(%s): " format, (zmd)->label, ## args)
>>   
>>   #define dmz_zmd_debug(zmd, format, args...)	\
>> -	DMDEBUG("(%s): " format, (zmd)->devname, ## args)
>> +	DMDEBUG("(%s): " format, (zmd)->label, ## args)
>>   /*
>>    * Various accessors
>>    */
>> +unsigned int dmz_dev_zone_id(struct dmz_metadata *zmd, struct dm_zone *zone)
>> +{
>> +	unsigned int zone_id;
>> +
>> +	if (WARN_ON(!zone))
>> +		return 0;
>> +
>> +	zone_id = zone->id;
>> +	if (zmd->nr_devs > 1 &&
>> +	    (zone_id >= zmd->dev[1].zone_offset))
>> +		zone_id -= zmd->dev[1].zone_offset;
> 
> We could have this as:
> 
> 	if (zone_id >= zmd->dev[0].nr_zones)
> 		zone_id -= zmd->dev[0].nr_zones;
> 
> No ? It is simpler and we can kill the zone_offset.
> 
Yes, but it would make the device arrangement implicit. By specifying
the block offset explicitly we keep the option of moving the block
offset into the metadata later, and then having the metadata specify
the layout.
That is something I'd like to keep, as I have this weird idea of also
using other, non-standard drives, which would then require a more
complex layout.
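
To illustrate the difference, here is a userspace sketch of the translation
with an explicit per-device zone_offset. The structs are trimmed-down,
hypothetical stand-ins for struct dmz_dev and struct dmz_metadata; the
comparison itself follows dmz_dev_zone_id() and dmz_zone_to_dev() from the
patch.

```c
/* Hypothetical, trimmed-down stand-in for struct dmz_dev. */
struct dev_info {
	unsigned int nr_zones;
	unsigned int zone_offset; /* first target-wide zone ID on this device */
};

/* Hypothetical, trimmed-down stand-in for struct dmz_metadata. */
struct meta_info {
	struct dev_info *dev;
	unsigned int nr_devs;
};

/* Index of the device that holds the target-wide zone zone_id. */
unsigned int zone_to_dev_idx(const struct meta_info *zmd, unsigned int zone_id)
{
	if (zmd->nr_devs > 1 && zone_id >= zmd->dev[1].zone_offset)
		return 1;
	return 0;
}

/* Zone number relative to the device that holds the zone. */
unsigned int dev_zone_id(const struct meta_info *zmd, unsigned int zone_id)
{
	unsigned int idx = zone_to_dev_idx(zmd, zone_id);

	return zone_id - zmd->dev[idx].zone_offset;
}
```

As long as dev[1].zone_offset equals dev[0].nr_zones, both variants give the
same answer. The explicit offset only starts to matter once the layout is read
from the metadata instead of being derived from the device order.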

>> +	return zone_id;
>> +}
>> +
>>   sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
>>   {
>> -	return (sector_t)zone->id << zmd->zone_nr_sectors_shift;
>> +	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
>> +
>> +	return (sector_t)zone_id << zmd->zone_nr_sectors_shift;
>>   }
>>   
>>   sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone)
>>   {
>> -	return (sector_t)zone->id << zmd->zone_nr_blocks_shift;
>> +	unsigned int zone_id = dmz_dev_zone_id(zmd, zone);
>> +
>> +	return (sector_t)zone_id << zmd->zone_nr_blocks_shift;
>>   }
>>   
>>   struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone)
>>   {
>> +	if (WARN_ON(!zone))
>> +		return &zmd->dev[0];
>> +
>> +	if (zmd->nr_devs > 1 &&
>> +	    zone->id >= zmd->dev[1].zone_offset)
>> +		return &zmd->dev[1];
>> +
>>   	return &zmd->dev[0];
> 
> 
> Same here, simpler version:
> 
> 	if (zone_id < zmd->dev[0].nr_zones)
> 		return &zmd->dev[0];
> 
> 	return &zmd->dev[1];
> 

Same argument here, too :-)

>>   }
>>   
>> @@ -275,17 +313,33 @@ unsigned int dmz_nr_unmap_seq_zones(struct dmz_metadata *zmd)
>>   
>>   const char *dmz_metadata_label(struct dmz_metadata *zmd)
>>   {
>> -	return (const char *)zmd->devname;
>> +	return (const char *)zmd->label;
>>   }
>>   
>>   bool dmz_check_dev(struct dmz_metadata *zmd)
>>   {
>> -	return dmz_check_bdev(&zmd->dev[0]);
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < zmd->nr_devs; i++) {
>> +		if (!zmd->dev[i].bdev)
>> +			continue;
> 
> This test is not necessary, no ? Since dev[0] is always set now with your latest
> changes reshuffling the devs index.
> 
True. Will be removing it.

>> +		if (!dmz_check_bdev(&zmd->dev[i]))
>> +			return false;
>> +	}
>> +	return true;
>>   }
>>   
>>   bool dmz_dev_is_dying(struct dmz_metadata *zmd)
>>   {
>> -	return dmz_bdev_is_dying(&zmd->dev[0]);
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < zmd->nr_devs; i++) {
>> +		if (!zmd->dev[i].bdev)
>> +			continue;
> 
> Same here.
> 
Ok.

>> +		if (dmz_bdev_is_dying(&zmd->dev[i]))
>> +			return true;
>> +	}
>> +	return false;
>>   }
>>   
>>   /*
>> @@ -687,6 +741,9 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
>>   	struct bio *bio;
>>   	int ret;
>>   
>> +	if (WARN_ON(!dev))
>> +		return -EIO;
>> +
>>   	if (dmz_bdev_is_dying(dev))
>>   		return -EIO;
>>   
>> @@ -711,7 +768,8 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
>>    */
>>   static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>>   {
>> -	sector_t block = zmd->sb[set].block;
>> +	sector_t sb_block =
>> +		zmd->sb[set].zone->id << zmd->zone_nr_blocks_shift;
> 
> I think this is safe as set 2 is read-only, so updates are only for set 0 and 1
> on dev[0]. But a comment pointing that out would be nice...
> 
Indeed, as the 'sb_block' variable here is the _absolute_ block address
(i.e. encompassing both drives). As such it is the number written into
the metadata, not the position at which the metadata is written.
But yes, we do need a comment here.

(This was the bit which took me several iterations to sort out ...)
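
A userspace sketch of the two numbers involved, assuming the device-relative
block is derived the way dmz_start_block() does it in this patch. The struct
and the function name are hypothetical.

```c
#include <stdint.h>

/* Hypothetical pair of block addresses for one superblock. */
struct sb_location {
	uint64_t sb_block;  /* absolute block, the value stored in the sb */
	uint64_t dev_block; /* block on the holding device, where I/O goes */
};

/*
 * zone_id is the target-wide zone ID of the superblock zone and
 * dev_zone_offset the zone offset of the device holding that zone.
 * The casts happen before the shift so that the result is computed in
 * 64 bits even for large zone IDs.
 */
struct sb_location locate_sb(unsigned int zone_id,
			     unsigned int dev_zone_offset,
			     unsigned int zone_nr_blocks_shift)
{
	struct sb_location loc;

	loc.sb_block = (uint64_t)zone_id << zone_nr_blocks_shift;
	loc.dev_block = (uint64_t)(zone_id - dev_zone_offset)
			<< zone_nr_blocks_shift;
	return loc;
}
```

Taking the setup from Damien's log as an example (zones of 524288 sectors, so
65536 blocks of 4096 bytes and a shift of 16, with the zoned device starting at
zone 1864), a superblock in zone 1864 has an absolute block of 122159104 but
sits at block 0 of the zoned device.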

>>   	struct dmz_mblock *mblk = zmd->sb[set].mblk;
>>   	struct dmz_super *sb = zmd->sb[set].sb;
>>   	struct dmz_dev *dev = zmd->sb[set].dev;
>> @@ -719,11 +777,18 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>>   	int ret;
>>   
>>   	sb->magic = cpu_to_le32(DMZ_MAGIC);
>> -	sb->version = cpu_to_le32(DMZ_META_VER);
>> +
>> +	sb->version = cpu_to_le32(zmd->sb_version);
>> +	if (zmd->sb_version > 1) {
>> +		BUILD_BUG_ON(UUID_SIZE != 16);
>> +		memcpy(sb->dmz_uuid, &zmd->uuid, UUID_SIZE);
>> +		memcpy(sb->dmz_label, zmd->label, BDEVNAME_SIZE);
>> +		memcpy(sb->dev_uuid, &dev->uuid, UUID_SIZE);
> 
> import_uuid() ?
> 

Oh, do we have that? Cool ...
Only it would have to be 'export_uuid()';
but then I just saw that we have that, too...
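
For reference, a userspace sketch of what the two helpers do, as far as I
understand them: export copies a uuid_t out into a raw 16-byte buffer (the
direction needed in dmz_write_sb()), import copies a raw buffer into a uuid_t
(the direction needed in dmz_check_sb()). The sketch_ names and the type below
are stand-ins for illustration, not the kernel definitions.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_UUID_SIZE 16

/* Stand-in for the kernel's uuid_t, which wraps 16 raw bytes. */
typedef struct {
	uint8_t b[SKETCH_UUID_SIZE];
} sketch_uuid_t;

/* Stand-in for export_uuid(): uuid object -> raw on-disk bytes. */
void sketch_export_uuid(uint8_t *dst, const sketch_uuid_t *src)
{
	memcpy(dst, src, sizeof(*src));
}

/* Stand-in for import_uuid(): raw on-disk bytes -> uuid object. */
void sketch_import_uuid(sketch_uuid_t *dst, const uint8_t *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```

The gain over open-coded memcpy() calls is mostly readability: the direction of
the copy is visible in the name and the size comes from the type rather than
from a separate UUID_SIZE argument.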

>> +	}
>>   
>>   	sb->gen = cpu_to_le64(sb_gen);
>>   
>> -	sb->sb_block = cpu_to_le64(block);
>> +	sb->sb_block = cpu_to_le64(sb_block);
>>   	sb->nr_meta_blocks = cpu_to_le32(zmd->nr_meta_blocks);
>>   	sb->nr_reserved_seq = cpu_to_le32(zmd->nr_reserved_seq);
>>   	sb->nr_chunks = cpu_to_le32(zmd->nr_chunks);
>> @@ -734,7 +799,8 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>>   	sb->crc = 0;
>>   	sb->crc = cpu_to_le32(crc32_le(sb_gen, (unsigned char *)sb, DMZ_BLOCK_SIZE));
>>   
>> -	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, block, mblk->page);
>> +	ret = dmz_rdwr_block(dev, REQ_OP_WRITE, zmd->sb[set].block,
>> +			     mblk->page);
>>   	if (ret == 0)
>>   		ret = blkdev_issue_flush(dev->bdev, GFP_NOIO, NULL);
>>   
>> @@ -915,6 +981,23 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
>>   	u32 crc, stored_crc;
>>   	u64 gen;
>>   
>> +	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
>> +		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
>> +			    DMZ_MAGIC, le32_to_cpu(sb->magic));
>> +		return -ENXIO;
>> +	}
>> +
>> +	zmd->sb_version = le32_to_cpu(sb->version);
>> +	if (zmd->sb_version > DMZ_META_VER) {
>> +		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
>> +			    DMZ_META_VER, zmd->sb_version);
>> +		return -EINVAL;
>> +	}
>> +	if ((zmd->sb_version < 1) && (set == 2)) {
>> +		dmz_dev_err(dev, "Tertiary superblocks are not supported");
>> +		return -EINVAL;
>> +	}
>> +
>>   	gen = le64_to_cpu(sb->gen);
>>   	stored_crc = le32_to_cpu(sb->crc);
>>   	sb->crc = 0;
>> @@ -925,18 +1008,44 @@ static int dmz_check_sb(struct dmz_metadata *zmd, unsigned int set)
>>   		return -ENXIO;
>>   	}
>>   
>> -	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
>> -		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
>> -			    DMZ_MAGIC, le32_to_cpu(sb->magic));
>> -		return -ENXIO;
>> -	}
>> +	if (zmd->sb_version > 1) {
>> +		uuid_t sb_uuid;
>> +
>> +		memcpy(&sb_uuid, sb->dmz_uuid, UUID_SIZE);
>> +		if (uuid_is_null(&sb_uuid)) {
>> +			dmz_dev_err(dev, "NULL DM-Zoned uuid");
>> +			return -ENXIO;
>> +		} else if (uuid_is_null(&zmd->uuid)) {
>> +			uuid_copy(&zmd->uuid, &sb_uuid);
>> +		} else if (!uuid_equal(&zmd->uuid, &sb_uuid)) {
>> +			dmz_dev_err(dev, "mismatching DM-Zoned uuid, "
>> +				    "is %pUl expected %pUl",
>> +				    &sb_uuid, &zmd->uuid);
>> +			return -ENXIO;
>> +		}
>> +		if (!strlen(zmd->label))
>> +			memcpy(zmd->label, sb->dmz_label, BDEVNAME_SIZE);
>> +		else if (memcmp(zmd->label, sb->dmz_label, BDEVNAME_SIZE)) {
>> +			dmz_dev_err(dev, "mismatching DM-Zoned label, "
>> +				    "is %s expected %s",
>> +				    sb->dmz_label, zmd->label);
>> +			return -ENXIO;
>> +		}
>> +		memcpy(&dev->uuid, sb->dev_uuid, UUID_SIZE);
>> +		if (uuid_is_null(&dev->uuid)) {
>> +			dmz_dev_err(dev, "NULL device uuid");
>> +			return -ENXIO;
>> +		}
>>   
>> -	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
>> -		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
>> -			    DMZ_META_VER, le32_to_cpu(sb->version));
>> -		return -ENXIO;
>> +		if (set == 2) {
>> +			if (gen != 0) {
>> +				dmz_dev_err(dev, "Invalid generation %llu",
>> +					    gen);
>> +				return -ENXIO;
>> +			}
>> +			return 0;
>> +		}
>>   	}
>> -
>>   	nr_meta_zones = (le32_to_cpu(sb->nr_meta_blocks) + zmd->zone_nr_blocks - 1)
>>   		>> zmd->zone_nr_blocks_shift;
>>   	if (!nr_meta_zones ||
>> @@ -1185,21 +1294,38 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
>>   		      "Using super block %u (gen %llu)",
>>   		      zmd->mblk_primary, zmd->sb_gen);
>>   
>> +	if ((zmd->sb_version > 1) && zmd->sb[2].zone) {
>> +		zmd->sb[2].block = dmz_start_block(zmd, zmd->sb[2].zone);
>> +		zmd->sb[2].dev = dmz_zone_to_dev(zmd, zmd->sb[2].zone);
>> +		ret = dmz_get_sb(zmd, 2);
>> +		if (ret) {
>> +			dmz_dev_err(zmd->sb[2].dev,
>> +				    "Read tertiary super block failed");
>> +			return ret;
>> +		}
>> +		ret = dmz_check_sb(zmd, 2);
>> +		if (ret == -EINVAL)
>> +			return ret;
>> +	}
>>   	return 0;
>>   }
>>   
>>   /*
>>    * Initialize a zone descriptor.
>>    */
>> -static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>> +static int dmz_init_zone(struct blk_zone *blkz, unsigned int num, void *data)
>>   {
>>   	struct dmz_metadata *zmd = data;
>> +	struct dmz_dev *dev = zmd->nr_devs > 1 ? &zmd->dev[1] : &zmd->dev[0];
>> +	int idx = num + dev->zone_offset;
>>   	struct dm_zone *zone = &zmd->zones[idx];
>> -	struct dmz_dev *dev = zmd->dev;
>>   
>> -	/* Ignore the eventual last runt (smaller) zone */
>>   	if (blkz->len != zmd->zone_nr_sectors) {
>> -		if (blkz->start + blkz->len == dev->capacity)
>> +		if (zmd->sb_version > 1) {
>> +			/* Ignore the eventual runt (smaller) zone */
>> +			set_bit(DMZ_OFFLINE, &zone->flags);
>> +			return 0;
>> +		} else if (blkz->start + blkz->len == dev->capacity)
>>   			return 0;
>>   		return -ENXIO;
>>   	}
>> @@ -1234,16 +1360,46 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>>   		zmd->nr_useable_zones++;
>>   		if (dmz_is_rnd(zone)) {
>>   			zmd->nr_rnd_zones++;
>> -			if (!zmd->sb[0].zone) {
>> -				/* Super block zone */
>> +			if (zmd->nr_devs == 1 && !zmd->sb[0].zone) {
>> +				/* Primary super block zone */
>>   				zmd->sb[0].zone = zone;
>>   			}
>>   		}
>> +		if (zmd->nr_devs > 1 && !zmd->sb[2].zone) {
>> +			/* Tertiary superblock zone */
>> +			zmd->sb[2].zone = zone;
>> +		}
>>   	}
>>   
>>   	return 0;
>>   }
>>   
>> +static void dmz_emulate_zones(struct dmz_metadata *zmd, struct dmz_dev *dev)
>> +{
>> +	int idx;
>> +	sector_t zone_offset = 0;
>> +
>> +	for(idx = 0; idx < dev->nr_zones; idx++) {
>> +		struct dm_zone *zone = &zmd->zones[idx];
>> +
>> +		INIT_LIST_HEAD(&zone->link);
>> +		atomic_set(&zone->refcount, 0);
>> +		zone->id = idx;
>> +		zone->chunk = DMZ_MAP_UNMAPPED;
>> +		set_bit(DMZ_RND, &zone->flags);
>> +		zone->wp_block = 0;
>> +		zmd->nr_rnd_zones++;
>> +		zmd->nr_useable_zones++;
>> +		if (dev->capacity - zone_offset <
>> +		    zmd->zone_nr_sectors) {
> 
> No need for the line break here. It fits in an 80-character line.
> 
ok.

>> +			/* Disable runt zone */
>> +			set_bit(DMZ_OFFLINE, &zone->flags);
>> +			break;
>> +		}
>> +		zone_offset += zmd->zone_nr_sectors;
>> +	}
>> +}
>> +
>>   /*
>>    * Free zones descriptors.
>>    */
>> @@ -1259,11 +1415,11 @@ static void dmz_drop_zones(struct dmz_metadata *zmd)
>>    */
>>   static int dmz_init_zones(struct dmz_metadata *zmd)
>>   {
>> -	struct dmz_dev *dev = &zmd->dev[0];
>> -	int ret;
>> +	int i, ret;
>> +	struct dmz_dev *zoned_dev = &zmd->dev[0];
>>   
>>   	/* Init */
>> -	zmd->zone_nr_sectors = dev->zone_nr_sectors;
>> +	zmd->zone_nr_sectors = zmd->dev[0].zone_nr_sectors;
>>   	zmd->zone_nr_sectors_shift = ilog2(zmd->zone_nr_sectors);
>>   	zmd->zone_nr_blocks = dmz_sect2blk(zmd->zone_nr_sectors);
>>   	zmd->zone_nr_blocks_shift = ilog2(zmd->zone_nr_blocks);
>> @@ -1274,7 +1430,14 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
>>   					DMZ_BLOCK_SIZE_BITS);
>>   
>>   	/* Allocate zone array */
>> -	zmd->nr_zones = dev->nr_zones;
>> +	zmd->nr_zones = 0;
>> +	for (i = 0; i < zmd->nr_devs; i++)
>> +		zmd->nr_zones += zmd->dev[i].nr_zones;
>> +
>> +	if (!zmd->nr_zones) {
>> +		DMERR("(%s): No zones found", zmd->devname);
>> +		return -ENXIO;
>> +	}
> 
> I tested this and it does not work in the single zoned device case, because
> nr_zones is set in the device fixup after this point. So this check sees
> nr_zones == 0.
> 

Right.
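
To illustrate the ordering problem described above, here is a minimal
user-space sketch. It is not the driver code: struct sketch_dev,
sketch_fixup() and sketch_count_zones() are hypothetical stand-ins, assuming
only that nr_zones stays 0 until a fixup step has run and that the zone
array is sized by summing nr_zones over all devices.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct dmz_dev. */
struct sketch_dev {
	unsigned long long capacity;	/* in 512-byte sectors */
	unsigned int zone_nr_sectors;	/* 0 until the fixup step has run */
	unsigned int nr_zones;		/* 0 until the fixup step has run */
};

/* Models the per-device initialization done by the fixup step. */
static void sketch_fixup(struct sketch_dev *dev, unsigned int zone_nr_sectors)
{
	assert(zone_nr_sectors != 0);
	dev->zone_nr_sectors = zone_nr_sectors;
	/* Round up so that a trailing runt zone is counted as well. */
	dev->nr_zones = (dev->capacity + zone_nr_sectors - 1) / zone_nr_sectors;
}

/* Models the summation and the "No zones found" guard. */
static int sketch_count_zones(const struct sketch_dev *devs, size_t nr_devs,
			      unsigned int *nr_zones)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < nr_devs; i++)
		total += devs[i].nr_zones;
	if (!total)
		return -ENXIO;	/* fixup was skipped: nothing to allocate */
	*nr_zones = total;
	return 0;
}
```

In this model, calling sketch_count_zones() on a single device before
sketch_fixup() fails with -ENXIO, while the same call after the fixup
returns the rounded-up zone count.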

>>   	zmd->zones = kcalloc(zmd->nr_zones, sizeof(struct dm_zone), GFP_KERNEL);
>>   	if (!zmd->zones)
>>   		return -ENOMEM;
>> @@ -1282,14 +1445,27 @@ static int dmz_init_zones(struct dmz_metadata *zmd)
>>   	DMINFO("(%s): Using %zu B for zone information",
>>   	       zmd->devname, sizeof(struct dm_zone) * zmd->nr_zones);
>>   
>> +	if (zmd->nr_devs > 1) {
>> +		dmz_emulate_zones(zmd, &zmd->dev[0]);
>> +		/*
>> +		 * Primary superblock zone is always at zone 0 when multiple
>> +		 * drives are present.
>> +		 */
>> +		zmd->sb[0].zone = &zmd->zones[0];
>> +
>> +		zoned_dev = &zmd->dev[1];
>> +	}
>> +
>>   	/*
>>   	 * Get zone information and initialize zone descriptors.  At the same
>>   	 * time, determine where the super block should be: first block of the
>>   	 * first randomly writable zone.
>>   	 */
>> -	ret = blkdev_report_zones(dev->bdev, 0, BLK_ALL_ZONES, dmz_init_zone,
>> -				  zmd);
>> +	ret = blkdev_report_zones(zoned_dev->bdev, 0, BLK_ALL_ZONES,
>> +				  dmz_init_zone, zmd);
>>   	if (ret < 0) {
>> +		DMDEBUG("(%s): Failed to report zones, error %d",
>> +			zmd->devname, ret);
>>   		dmz_drop_zones(zmd);
>>   		return ret;
>>   	}
>> @@ -1325,6 +1501,9 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
>>   	unsigned int noio_flag;
>>   	int ret;
>>   
>> +	if (dev->flags & DMZ_BDEV_REGULAR)
>> +		return 0;
>> +
>>   	/*
>>   	 * Get zone information from disk. Since blkdev_report_zones() uses
>>   	 * GFP_KERNEL by default for memory allocations, set the per-task
>> @@ -2475,18 +2654,34 @@ void dmz_print_dev(struct dmz_metadata *zmd, int num)
>>   {
>>   	struct dmz_dev *dev = &zmd->dev[num];
>>   
>> -	dmz_dev_info(dev, "Host-%s zoned block device",
>> -		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
>> -		     "aware" : "managed");
>> -	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
>> -		     (u64)dev->capacity);
>> -	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
>> -		     dev->nr_zones, (u64)zmd->zone_nr_sectors);
>> +	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE)
>> +		dmz_dev_info(dev, "Regular block device");
>> +	else
>> +		dmz_dev_info(dev, "Host-%s zoned block device",
>> +			     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
>> +			     "aware" : "managed");
>> +	if (zmd->sb_version > 1) {
>> +		sector_t sector_offset =
>> +			dev->zone_offset << zmd->zone_nr_sectors_shift;
>> +
>> +		dmz_dev_info(dev, "  uuid %pUl", &dev->uuid);
>> +		dmz_dev_info(dev, "  %llu 512-byte logical sectors (offset %llu)",
>> +			     (u64)dev->capacity, (u64)sector_offset);
>> +		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors (offset %llu)",
>> +			     dev->nr_zones, (u64)zmd->zone_nr_sectors,
>> +			     (u64)dev->zone_offset);
>> +	} else {
>> +		dmz_dev_info(dev, "  %llu 512-byte logical sectors",
>> +			     (u64)dev->capacity);
>> +		dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
>> +			     dev->nr_zones, (u64)zmd->zone_nr_sectors);
>> +	}
>>   }
>>   /*
>>    * Initialize the zoned metadata.
>>    */
>> -int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>> +int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
>> +		     struct dmz_metadata **metadata,
>>   		     const char *devname)
>>   {
>>   	struct dmz_metadata *zmd;
>> @@ -2500,6 +2695,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>>   
>>   	strcpy(zmd->devname, devname);
>>   	zmd->dev = dev;
>> +	zmd->nr_devs = num_dev;
>>   	zmd->mblk_rbtree = RB_ROOT;
>>   	init_rwsem(&zmd->mblk_sem);
>>   	mutex_init(&zmd->mblk_flush_lock);
>> @@ -2534,11 +2730,24 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>>   	/* Set metadata zones starting from sb_zone */
>>   	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
>>   		zone = dmz_get(zmd, zmd->sb[0].zone->id + i);
>> -		if (!dmz_is_rnd(zone))
>> +		if (!dmz_is_rnd(zone)) {
>> +			dmz_zmd_err(zmd,
>> +				    "metadata zone %d is not random", i);
>> +			ret = -ENXIO;
>>   			goto err;
>> +		}
>> +		set_bit(DMZ_META, &zone->flags);
>> +	}
>> +	if (zmd->sb[2].zone) {
>> +		zone = dmz_get(zmd, zmd->sb[2].zone->id);
>> +		if (!zone) {
>> +			dmz_zmd_err(zmd,
>> +				    "Tertiary metadata zone not present");
>> +			ret = -ENXIO;
>> +			goto err;
>> +		}
>>   		set_bit(DMZ_META, &zone->flags);
>>   	}
>> -
>>   	/* Load mapping table */
>>   	ret = dmz_load_mapping(zmd);
>>   	if (ret)
>> @@ -2563,8 +2772,13 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata,
>>   		goto err;
>>   	}
>>   
>> -	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", DMZ_META_VER);
>> -	dmz_print_dev(zmd, 0);
>> +	dmz_zmd_info(zmd, "DM-Zoned metadata version %d", zmd->sb_version);
>> +	if (zmd->sb_version > 1) {
>> +		dmz_zmd_info(zmd, "DM UUID %pUl", &zmd->uuid);
>> +		dmz_zmd_info(zmd, "DM Label %s", zmd->label);
>> +	}
>> +	for (i = 0; i < zmd->nr_devs; i++)
>> +		dmz_print_dev(zmd, i);
>>   
>>   	dmz_zmd_info(zmd, "  %u zones of %llu 512-byte logical sectors",
>>   		     zmd->nr_zones, (u64)zmd->zone_nr_sectors);
>> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
>> index 4897ffae96ca..ae05d5d60b37 100644
>> --- a/drivers/md/dm-zoned-target.c
>> +++ b/drivers/md/dm-zoned-target.c
>> @@ -38,7 +38,7 @@ struct dm_chunk_work {
>>    * Target descriptor.
>>    */
>>   struct dmz_target {
>> -	struct dm_dev		*ddev;
>> +	struct dm_dev		*ddev[2];
>>   
>>   	unsigned long		flags;
>>   
>> @@ -684,60 +684,40 @@ static int dmz_map(struct dm_target *ti, struct bio *bio)
>>   /*
>>    * Get zoned device information.
>>    */
>> -static int dmz_get_zoned_device(struct dm_target *ti, char *path)
>> +static int dmz_get_zoned_device(struct dm_target *ti, char *path, int num)
>>   {
>>   	struct dmz_target *dmz = ti->private;
>> -	struct request_queue *q;
>>   	struct dmz_dev *dev;
>> -	sector_t aligned_capacity;
>>   	int ret;
>> +	struct block_device *bdev;
>>   
>>   	/* Get the target device */
>> -	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table), &dmz->ddev);
>> +	ret = dm_get_device(ti, path, dm_table_get_mode(ti->table),
>> +			    &dmz->ddev[num]);
>>   	if (ret) {
>>   		ti->error = "Get target device failed";
>> -		dmz->ddev = NULL;
>> +		dmz->ddev[num] = NULL;
>>   		return ret;
>>   	}
>>   
>> -	dev = kzalloc(sizeof(struct dmz_dev), GFP_KERNEL);
>> -	if (!dev) {
>> -		ret = -ENOMEM;
>> -		goto err;
>> -	}
>> -
>> -	dev->bdev = dmz->ddev->bdev;
>> +	bdev = dmz->ddev[num]->bdev;
>> +	if (bdev_zoned_model(bdev) == BLK_ZONED_NONE) {
>> +		dev = &dmz->dev[0];
>> +		dev->flags = DMZ_BDEV_REGULAR;
>> +	} else
>> +		dev = &dmz->dev[1];
>> +	dev->bdev = bdev;
>>   	(void)bdevname(dev->bdev, dev->name);
> 
> I changed this. See below.
> 
>>   
>> -	if (bdev_zoned_model(dev->bdev) == BLK_ZONED_NONE) {
>> -		ti->error = "Not a zoned block device";
>> -		ret = -EINVAL;
>> -		goto err;
>> -	}
>> -
>> -	q = bdev_get_queue(dev->bdev);
>>   	dev->capacity = i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
>> -	aligned_capacity = dev->capacity &
>> -				~((sector_t)blk_queue_zone_sectors(q) - 1);
>> -	if (ti->begin ||
>> -	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
>> -		ti->error = "Partial mapping not supported";
>> -		ret = -EINVAL;
>> -		goto err;
>> +	if (ti->begin) {
>> +		ti->error = "Partial mapping is not supported";
>> +		dm_put_device(ti, dmz->ddev[num]);
>> +		dmz->ddev[num] = NULL;
>> +		return -EINVAL;
>>   	}
>>   
>> -	dev->zone_nr_sectors = blk_queue_zone_sectors(q);
>> -
>> -	dev->nr_zones = blkdev_nr_zones(dev->bdev->bd_disk);
>> -
>> -	dmz->dev = dev;
>> -
>>   	return 0;
>> -err:
>> -	dm_put_device(ti, dmz->ddev);
>> -	kfree(dev);
>> -
>> -	return ret;
>>   }
>>   
>>   /*
>> @@ -747,9 +727,46 @@ static void dmz_put_zoned_device(struct dm_target *ti)
>>   {
>>   	struct dmz_target *dmz = ti->private;
>>   
>> -	dm_put_device(ti, dmz->ddev);
>> -	kfree(dmz->dev);
>> -	dmz->dev = NULL;
>> +	if (dmz->ddev[1]) {
>> +		dm_put_device(ti, dmz->ddev[1]);
>> +		dmz->ddev[1] = NULL;
>> +	}
>> +	dm_put_device(ti, dmz->ddev[0]);
>> +	dmz->ddev[0] = NULL;
> 
> Would a for loop be cleaner here?
> 
>> +}
>> +
>> +static int dmz_fixup_devices(struct dm_target *ti)
>> +{
>> +	struct dmz_target *dmz = ti->private;
>> +	struct dmz_dev *pri_dev, *sec_dev;
>> +	struct request_queue *q;
>> +
>> +	pri_dev = &dmz->dev[0];
>> +	if (!(pri_dev->flags & DMZ_BDEV_REGULAR)) {
>> +		ti->error = "Primary disk is not a regular device";
>> +		return -EINVAL;
>> +	}
>> +	sec_dev = &dmz->dev[1];
>> +	if (sec_dev->flags & DMZ_BDEV_REGULAR) {
>> +		ti->error = "Secondary disk is not a zoned device";
>> +		return -EINVAL;
>> +	}
>> +	q = bdev_get_queue(sec_dev->bdev);
>> +	sec_dev->zone_nr_sectors = blk_queue_zone_sectors(q);
>> +	sec_dev->nr_zones = blkdev_nr_zones(sec_dev->bdev->bd_disk);
>> +
>> +	pri_dev->zone_nr_sectors = sec_dev->zone_nr_sectors;
>> +	pri_dev->nr_zones = DIV_ROUND_UP(pri_dev->capacity,
>> +					 pri_dev->zone_nr_sectors);
>> +	sec_dev->zone_offset = pri_dev->nr_zones;
>> +	/* Check if we need to swizzle devices */
>> +	if (pri_dev->bdev != dmz->ddev[0]->bdev) {
>> +		struct dm_dev *ddev = dmz->ddev[0];
>> +
>> +		dmz->ddev[0] = dmz->ddev[1];
>> +		dmz->ddev[1] = ddev;
>> +	}
>> +	return 0;
> 
> Changed this too. See below.
> 
>>   }
>>   
>>   /*
>> @@ -758,11 +775,10 @@ static void dmz_put_zoned_device(struct dm_target *ti)
>>   static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>>   {
>>   	struct dmz_target *dmz;
>> -	struct dmz_dev *dev;
>>   	int ret;
>>   
>>   	/* Check arguments */
>> -	if (argc != 1) {
>> +	if (argc < 1 || argc > 2) {
>>   		ti->error = "Invalid argument count";
>>   		return -EINVAL;
>>   	}
>> @@ -773,18 +789,34 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>>   		ti->error = "Unable to allocate the zoned target descriptor";
>>   		return -ENOMEM;
>>   	}
>> +	dmz->dev = kcalloc(2, sizeof(struct dmz_dev), GFP_KERNEL);
>> +	if (!dmz->dev) {
>> +		ti->error = "Unable to allocate the zoned device descriptors";
>> +		kfree(dmz);
>> +		return -ENOMEM;
>> +	}
>>   	ti->private = dmz;
>>   
>>   	/* Get the target zoned block device */
>> -	ret = dmz_get_zoned_device(ti, argv[0]);
>> -	if (ret) {
>> -		dmz->ddev = NULL;
>> +	ret = dmz_get_zoned_device(ti, argv[0], 0);
>> +	if (ret)
>>   		goto err;
>> +
>> +	if (argc == 2) {
>> +		ret = dmz_get_zoned_device(ti, argv[1], 1);
>> +		if (ret) {
>> +			dmz_put_zoned_device(ti);
>> +			goto err;
>> +		}
>> +		ret = dmz_fixup_devices(ti);
>> +		if (ret) {
>> +			dmz_put_zoned_device(ti);
>> +			goto err;
>> +		}
> 
> Fixup devices needs to be called regardless of the number of drives so that
> zone_nr_sectors and nr_zones get initialized. See below.
> 
>>   	}
>>   
>>   	/* Initialize metadata */
>> -	dev = dmz->dev;
>> -	ret = dmz_ctr_metadata(dev, &dmz->metadata,
>> +	ret = dmz_ctr_metadata(dmz->dev, argc, &dmz->metadata,
>>   			       dm_table_device_name(ti->table));
>>   	if (ret) {
>>   		ti->error = "Metadata initialization failed";
>> @@ -861,6 +893,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>>   err_dev:
>>   	dmz_put_zoned_device(ti);
>>   err:
>> +	kfree(dmz->dev);
>>   	kfree(dmz);
>>   
>>   	return ret;
>> @@ -891,6 +924,7 @@ static void dmz_dtr(struct dm_target *ti)
>>   
>>   	mutex_destroy(&dmz->chunk_lock);
>>   
>> +	kfree(dmz->dev);
>>   	kfree(dmz);
>>   }
>>   
>> @@ -965,10 +999,17 @@ static int dmz_iterate_devices(struct dm_target *ti,
>>   			       iterate_devices_callout_fn fn, void *data)
>>   {
>>   	struct dmz_target *dmz = ti->private;
>> -	struct dmz_dev *dev = dmz->dev;
>> -	sector_t capacity = dev->capacity & ~(dmz_zone_nr_sectors(dmz->metadata) - 1);
>> -
>> -	return fn(ti, dmz->ddev, 0, capacity, data);
>> +	unsigned int zone_nr_sectors = dmz_zone_nr_sectors(dmz->metadata);
>> +	sector_t capacity;
>> +	int r;
>> +
>> +	capacity = dmz->dev[0].capacity & ~(zone_nr_sectors - 1);
>> +	r = fn(ti, dmz->ddev[0], 0, capacity, data);
>> +	if (!r && dmz->ddev[1]) {
>> +		capacity = dmz->dev[1].capacity & ~(zone_nr_sectors - 1);
>> +		r = fn(ti, dmz->ddev[1], 0, capacity, data);
>> +	}
>> +	return r;
>>   }
>>   
>>   static void dmz_status(struct dm_target *ti, status_type_t type,
>> @@ -978,6 +1019,7 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
>>   	struct dmz_target *dmz = ti->private;
>>   	ssize_t sz = 0;
>>   	char buf[BDEVNAME_SIZE];
>> +	struct dmz_dev *dev;
>>   
>>   	switch (type) {
>>   	case STATUSTYPE_INFO:
>> @@ -991,8 +1033,14 @@ static void dmz_status(struct dm_target *ti, status_type_t type,
>>   		       dmz_nr_seq_zones(dmz->metadata));
>>   		break;
>>   	case STATUSTYPE_TABLE:
>> -		format_dev_t(buf, dmz->dev->bdev->bd_dev);
>> +		dev = &dmz->dev[0];
>> +		format_dev_t(buf, dev->bdev->bd_dev);
>>   		DMEMIT("%s ", buf);
>> +		if (dmz->dev[1].bdev) {
>> +			dev = &dmz->dev[1];
>> +			format_dev_t(buf, dev->bdev->bd_dev);
>> +			DMEMIT("%s ", buf);
>> +		}
>>   		break;
>>   	}
>>   	return;
>> @@ -1014,7 +1062,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>>   
>>   static struct target_type dmz_type = {
>>   	.name		 = "zoned",
>> -	.version	 = {1, 1, 0},
>> +	.version	 = {1, 2, 0},
> 
> Maybe go to version 2.0.0 to match the metadata version number?
> 
Sure. Will be updating it.

>>   	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>>   	.module		 = THIS_MODULE,
>>   	.ctr		 = dmz_ctr,
>> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
>> index 454ebd628cca..e383d5b2a3c5 100644
>> --- a/drivers/md/dm-zoned.h
>> +++ b/drivers/md/dm-zoned.h
>> @@ -52,10 +52,12 @@ struct dmz_dev {
>>   	struct block_device	*bdev;
>>   
>>   	char			name[BDEVNAME_SIZE];
>> +	uuid_t			uuid;
>>   
>>   	sector_t		capacity;
>>   
>>   	unsigned int		nr_zones;
>> +	unsigned int		zone_offset;
>>   
>>   	unsigned int		flags;
>>   
>> @@ -69,6 +71,7 @@ struct dmz_dev {
>>   /* Device flags. */
>>   #define DMZ_BDEV_DYING		(1 << 0)
>>   #define DMZ_CHECK_BDEV		(2 << 0)
>> +#define DMZ_BDEV_REGULAR	(4 << 0)
>>   
>>   /*
>>    * Zone descriptor.
>> @@ -163,8 +166,8 @@ struct dmz_reclaim;
>>   /*
>>    * Functions defined in dm-zoned-metadata.c
>>    */
>> -int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **zmd,
>> -		     const char *devname);
>> +int dmz_ctr_metadata(struct dmz_dev *dev, int num_dev,
>> +		     struct dmz_metadata **zmd, const char *devname);
>>   void dmz_dtr_metadata(struct dmz_metadata *zmd);
>>   int dmz_resume_metadata(struct dmz_metadata *zmd);
>>   
>> @@ -176,15 +179,13 @@ void dmz_lock_flush(struct dmz_metadata *zmd);
>>   void dmz_unlock_flush(struct dmz_metadata *zmd);
>>   int dmz_flush_metadata(struct dmz_metadata *zmd);
>>   const char *dmz_metadata_label(struct dmz_metadata *zmd);
>> +bool dmz_check_dev(struct dmz_metadata *zmd);
>>   
>>   sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone);
>>   sector_t dmz_start_block(struct dmz_metadata *zmd, struct dm_zone *zone);
>>   unsigned int dmz_nr_chunks(struct dmz_metadata *zmd);
>>   struct dmz_dev *dmz_zone_to_dev(struct dmz_metadata *zmd, struct dm_zone *zone);
>>   
>> -bool dmz_check_dev(struct dmz_metadata *zmd);
>> -bool dmz_dev_is_dying(struct dmz_metadata *zmd);
>> -
>>   #define DMZ_ALLOC_RND		0x01
>>   #define DMZ_ALLOC_RECLAIM	0x02
>>   
>> @@ -251,6 +252,7 @@ int dmz_copy_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
>>   			  struct dm_zone *to_zone);
>>   int dmz_merge_valid_blocks(struct dmz_metadata *zmd, struct dm_zone *from_zone,
>>   			   struct dm_zone *to_zone, sector_t chunk_block);
>> +bool dmz_dev_is_dying(struct dmz_metadata *zmd);
>>   
>>   /*
>>    * Functions defined in dm-zoned-reclaim.c
>>
> 
> I ran the entire series through simple tests. As noted above, the single drive
> case is broken. Here is what I applied on top of this patch to fix it:
> 
I'll give it a spin on my testbed, and will be sending a v4 shortly.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer




* Re: [PATCH 13/13] dm-zoned: metadata version 2
  2020-04-30 14:45     ` Hannes Reinecke
@ 2020-05-01  0:15       ` Damien Le Moal
  0 siblings, 0 replies; 24+ messages in thread
From: Damien Le Moal @ 2020-05-01  0:15 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/04/30 23:45, Hannes Reinecke wrote:
>>> +unsigned int dmz_dev_zone_id(struct dmz_metadata *zmd, struct dm_zone *zone)
>>> +{
>>> +	unsigned int zone_id;
>>> +
>>> +	if (WARN_ON(!zone))
>>> +		return 0;
>>> +
>>> +	zone_id = zone->id;
>>> +	if (zmd->nr_devs > 1 &&
>>> +	    (zone_id >= zmd->dev[1].zone_offset))
>>> +		zone_id -= zmd->dev[1].zone_offset;
>>
>> We could have this as:
>>
>> 	if (zone_id >= zmd->dev[0].nr_zones)
>> 		zone_id -= zmd->dev[0].nr_zones;
>>
>> No ? It is simpler and we can kill the zone_offset.
>>
> Yes, but it would make the device arrangement implicit; by specifying
> the block offset we keep the option of moving the block offset into
> the metadata, and then having the metadata specify the layout.
> That is something I'd like to keep, as I have this weird idea of
> using other, non-standard, drives, too, which would then require a
> more complex layout.

OK. Got it. Let's keep this as is then.
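
For reference, the two variants discussed above can be compared in a small
user-space sketch. The names struct sketch_zdev, zone_id_by_offset() and
zone_id_by_count() are hypothetical and only model the quoted snippets; the
sketch assumes dev[0] is the regular device placed in front of the zoned
device dev[1].

```c
#include <assert.h>

/* Hypothetical stand-in holding only the fields used by the translation. */
struct sketch_zdev {
	unsigned int nr_zones;		/* zones on this device */
	unsigned int zone_offset;	/* first global zone id of this device */
};

/* Variant from the patch: subtract the explicit offset of dev[1]. */
static unsigned int zone_id_by_offset(const struct sketch_zdev *devs,
				      unsigned int nr_devs,
				      unsigned int zone_id)
{
	assert(nr_devs >= 1);
	if (nr_devs > 1 && zone_id >= devs[1].zone_offset)
		zone_id -= devs[1].zone_offset;
	return zone_id;
}

/* Suggested variant: derive the offset implicitly from dev[0].nr_zones. */
static unsigned int zone_id_by_count(const struct sketch_zdev *devs,
				     unsigned int zone_id)
{
	if (zone_id >= devs[0].nr_zones)
		zone_id -= devs[0].nr_zones;
	return zone_id;
}
```

As long as dev[1].zone_offset equals dev[0].nr_zones, both functions return
the same device-local zone id. They diverge once the offset is set
independently, for example by a layout read from the metadata, which is the
case the explicit zone_offset keeps open.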


-- 
Damien Le Moal
Western Digital Research


end of thread, other threads:[~2020-05-01  0:15 UTC | newest]

Thread overview: 24+ messages
2020-04-20 10:08 [PATCHv4 00/13] dm-zoned: metadata version 2 Hannes Reinecke
2020-04-20 10:08 ` [PATCH 01/13] dm-zoned: add 'status' and 'message' callbacks Hannes Reinecke
2020-04-28  9:19   ` Damien Le Moal
2020-04-20 10:08 ` [PATCH 02/13] dm-zoned: store zone id within the zone structure and kill dmz_id() Hannes Reinecke
2020-04-28  9:35   ` Damien Le Moal
2020-04-20 10:08 ` [PATCH 03/13] dm-zoned: use array for superblock zones Hannes Reinecke
2020-04-20 10:08 ` [PATCH 04/13] dm-zoned: store device in struct dmz_sb Hannes Reinecke
2020-04-20 10:08 ` [PATCH 05/13] dm-zoned: move fields from struct dmz_dev to dmz_metadata Hannes Reinecke
2020-04-20 10:08 ` [PATCH 06/13] dm-zoned: introduce dmz_metadata_label() to format device name Hannes Reinecke
2020-04-20 10:08 ` [PATCH 07/13] dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev() Hannes Reinecke
2020-04-28  9:37   ` Damien Le Moal
2020-04-20 10:08 ` [PATCH 08/13] dm-zoned: remove 'dev' argument from reclaim Hannes Reinecke
2020-04-28  9:40   ` Damien Le Moal
2020-04-20 10:08 ` [PATCH 09/13] dm-zoned: replace 'target' pointer in the bio context Hannes Reinecke
2020-04-28  9:43   ` Damien Le Moal
2020-04-20 10:08 ` [PATCH 10/13] dm-zoned: use dmz_zone_to_dev() when handling metadata I/O Hannes Reinecke
2020-04-20 10:08 ` [PATCH 11/13] dm-zoned: add metadata logging functions Hannes Reinecke
2020-04-20 10:08 ` [PATCH 12/13] dm-zoned: ignore metadata zone in dmz_alloc_zone() Hannes Reinecke
2020-04-20 10:08 ` [PATCH 13/13] dm-zoned: metadata version 2 Hannes Reinecke
2020-04-28 10:54   ` Damien Le Moal
2020-04-28 17:37     ` Mike Snitzer
2020-04-30 14:45     ` Hannes Reinecke
2020-05-01  0:15       ` Damien Le Moal
2020-04-22  0:42 ` [PATCHv4 00/13] " Damien Le Moal
