All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] dm-zoned: Metadata V2
@ 2020-03-27  7:14 Hannes Reinecke
  2020-03-27  7:14 ` [PATCH 1/4] dm-zoned: store zone id within the zone structure Hannes Reinecke
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Hannes Reinecke @ 2020-03-27  7:14 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Hi all,

here's a patchset to add UUIDs and labels to the metadata,
increasing the metadata version number to 2.
This allows to identify the various devices, and also to
have a persistent device-mapper name based on the label.
To handle and generate these new values I've also send
a pull request to the upstream dm-zoned-tools git repository.

This patchset requires the previous patch to add status
and message callbacks.

As usual, comments and reviews are welcome.

Hannes Reinecke (4):
  dm-zoned: store zone id within the zone structure
  dm-zoned: use array for superblock zones
  dm-zoned: V2 metadata handling
  dm-zoned: allow for device size smaller than the capacity

 drivers/md/dm-zoned-metadata.c | 117 ++++++++++++++++++++++++++++++-----------
 drivers/md/dm-zoned-target.c   |   4 +-
 drivers/md/dm-zoned.h          |   3 ++
 3 files changed, 91 insertions(+), 33 deletions(-)

-- 
2.16.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] dm-zoned: store zone id within the zone structure
  2020-03-27  7:14 [PATCH RFC 0/4] dm-zoned: Metadata V2 Hannes Reinecke
@ 2020-03-27  7:14 ` Hannes Reinecke
  2020-03-31  0:57   ` Damien Le Moal
  2020-03-27  7:14 ` [PATCH 2/4] dm-zoned: use array for superblock zones Hannes Reinecke
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2020-03-27  7:14 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Instead of calculating the zone index by the offset within the
zone array store the index within the structure itself.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-zoned-metadata.c | 3 ++-
 drivers/md/dm-zoned.h          | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index c8787560fa9f..afce594067fb 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -189,7 +189,7 @@ struct dmz_metadata {
  */
 unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone)
 {
-	return ((unsigned int)(zone - zmd->zones));
+	return zone->id;
 }
 
 sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
@@ -1119,6 +1119,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 
 	INIT_LIST_HEAD(&zone->link);
 	atomic_set(&zone->refcount, 0);
+	zone->id = idx;
 	zone->chunk = DMZ_MAP_UNMAPPED;
 
 	switch (blkz->type) {
diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
index 884c0e586082..39d59898abbe 100644
--- a/drivers/md/dm-zoned.h
+++ b/drivers/md/dm-zoned.h
@@ -87,6 +87,9 @@ struct dm_zone {
 	/* Zone activation reference count */
 	atomic_t		refcount;
 
+	/* Zone id */
+	unsigned int		id;
+
 	/* Zone write pointer block (relative to the zone start block) */
 	unsigned int		wp_block;
 
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] dm-zoned: use array for superblock zones
  2020-03-27  7:14 [PATCH RFC 0/4] dm-zoned: Metadata V2 Hannes Reinecke
  2020-03-27  7:14 ` [PATCH 1/4] dm-zoned: store zone id within the zone structure Hannes Reinecke
@ 2020-03-27  7:14 ` Hannes Reinecke
  2020-03-31  0:51   ` Damien Le Moal
  2020-03-31  9:10   ` Bob Liu
  2020-03-27  7:14 ` [PATCH 3/4] dm-zoned: V2 metadata handling Hannes Reinecke
  2020-03-27  7:14 ` [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity Hannes Reinecke
  3 siblings, 2 replies; 19+ messages in thread
From: Hannes Reinecke @ 2020-03-27  7:14 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Instead of storing just the first superblock zone and calculate
the secondary relative to that we should be using an array for
holding the superblock zones.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-zoned-metadata.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index afce594067fb..dc1d17bc3bbb 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -124,6 +124,7 @@ struct dmz_sb {
 	sector_t		block;
 	struct dmz_mblock	*mblk;
 	struct dmz_super	*sb;
+	struct dm_zone		*zone;
 };
 
 /*
@@ -150,7 +151,6 @@ struct dmz_metadata {
 	/* Zone information array */
 	struct dm_zone		*zones;
 
-	struct dm_zone		*sb_zone;
 	struct dmz_sb		sb[2];
 	unsigned int		mblk_primary;
 	u64			sb_gen;
@@ -937,16 +937,20 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
 
 	/* Bad first super block: search for the second one */
 	zmd->sb[1].block = zmd->sb[0].block + zone_nr_blocks;
+	zmd->sb[1].zone = zmd->sb[0].zone + 1;
 	for (i = 0; i < zmd->nr_rnd_zones - 1; i++) {
 		if (dmz_read_sb(zmd, 1) != 0)
 			break;
-		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC)
+		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC) {
+			zmd->sb[1].zone += i;
 			return 0;
+		}
 		zmd->sb[1].block += zone_nr_blocks;
 	}
 
 	dmz_free_mblock(zmd, mblk);
 	zmd->sb[1].mblk = NULL;
+	zmd->sb[1].zone = NULL;
 
 	return -EIO;
 }
@@ -990,11 +994,9 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
 	dmz_dev_warn(zmd->dev, "Metadata set %u invalid: recovering", dst_set);
 
 	if (dst_set == 0)
-		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
-	else {
-		zmd->sb[1].block = zmd->sb[0].block +
-			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
-	}
+		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
+	else
+		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
 
 	page = alloc_page(GFP_NOIO);
 	if (!page)
@@ -1038,8 +1040,13 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 	u64 sb_gen[2] = {0, 0};
 	int ret;
 
+	if (!zmd->sb[0].zone) {
+		dmz_dev_err(zmd->dev, "Primary super block zone not set");
+		return -ENXIO;
+	}
+
 	/* Read and check the primary super block */
-	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
+	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
 	ret = dmz_get_sb(zmd, 0);
 	if (ret) {
 		dmz_dev_err(zmd->dev, "Read primary super block failed");
@@ -1051,8 +1058,9 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
 	/* Read and check secondary super block */
 	if (ret == 0) {
 		sb_good[0] = true;
-		zmd->sb[1].block = zmd->sb[0].block +
-			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
+		if (!zmd->sb[1].zone)
+			zmd->sb[1].zone = zmd->sb[0].zone + zmd->nr_meta_zones;
+		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
 		ret = dmz_get_sb(zmd, 1);
 	} else
 		ret = dmz_lookup_secondary_sb(zmd);
@@ -1147,9 +1155,9 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
 		zmd->nr_useable_zones++;
 		if (dmz_is_rnd(zone)) {
 			zmd->nr_rnd_zones++;
-			if (!zmd->sb_zone) {
+			if (!zmd->sb[0].zone) {
 				/* Super block zone */
-				zmd->sb_zone = zone;
+				zmd->sb[0].zone = zone;
 			}
 		}
 	}
@@ -2420,7 +2428,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 		goto err;
 
 	/* Set metadata zones starting from sb_zone */
-	zid = dmz_id(zmd, zmd->sb_zone);
+	zid = dmz_id(zmd, zmd->sb[0].zone);
 	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
 		zone = dmz_get(zmd, zid + i);
 		if (!dmz_is_rnd(zone))
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-03-27  7:14 [PATCH RFC 0/4] dm-zoned: Metadata V2 Hannes Reinecke
  2020-03-27  7:14 ` [PATCH 1/4] dm-zoned: store zone id within the zone structure Hannes Reinecke
  2020-03-27  7:14 ` [PATCH 2/4] dm-zoned: use array for superblock zones Hannes Reinecke
@ 2020-03-27  7:14 ` Hannes Reinecke
  2020-03-31  0:54   ` Damien Le Moal
  2020-03-31  9:11   ` Bob Liu
  2020-03-27  7:14 ` [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity Hannes Reinecke
  3 siblings, 2 replies; 19+ messages in thread
From: Hannes Reinecke @ 2020-03-27  7:14 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

Add 'V2' metadata which includes UUIDs for the dmz set and for
the device itself.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-zoned-metadata.c | 80 +++++++++++++++++++++++++++++++++---------
 drivers/md/dm-zoned-target.c   |  2 +-
 2 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index dc1d17bc3bbb..026f285fba33 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -16,7 +16,8 @@
 /*
  * Metadata version.
  */
-#define DMZ_META_VER	1
+#define DMZ_META_COMPAT_VER 1
+#define DMZ_META_VER	2
 
 /*
  * On-disk super block magic.
@@ -69,8 +70,17 @@ struct dmz_super {
 	/* Checksum */
 	__le32		crc;			/*  48 */
 
-	/* Padding to full 512B sector */
-	u8		reserved[464];		/* 512 */
+	/* DM-Zoned label */
+	u8		dmz_label[32];		/*  80 */
+
+	/* DM-Zoned UUID */
+	u8		dmz_uuid[16];		/*  96 */
+
+	/* Device UUID */
+	u8		dev_uuid[16];		/* 112 */
+
+	/* Padding to full 512B - CRC sector */
+	u8		reserved[400];		/* 512 */
 };
 
 /*
@@ -133,6 +143,10 @@ struct dmz_sb {
 struct dmz_metadata {
 	struct dmz_dev		*dev;
 
+	char			dmz_label[BDEVNAME_SIZE];
+	uuid_t			dmz_uuid;
+	uuid_t			dev_uuid;
+
 	sector_t		zone_bitmap_size;
 	unsigned int		zone_nr_bitmap_blocks;
 	unsigned int		zone_bits_per_mblk;
@@ -659,7 +673,14 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
 	int ret;
 
 	sb->magic = cpu_to_le32(DMZ_MAGIC);
-	sb->version = cpu_to_le32(DMZ_META_VER);
+
+	if (!uuid_is_null(&zmd->dmz_uuid)) {
+		sb->version = cpu_to_le32(DMZ_META_VER);
+		uuid_copy((uuid_t *)sb->dmz_uuid, &zmd->dmz_uuid);
+		memcpy(sb->dmz_label, zmd->dmz_label, BDEVNAME_SIZE);
+		uuid_copy((uuid_t *)sb->dev_uuid, &zmd->dev_uuid);
+	} else
+		sb->version = cpu_to_le32(DMZ_META_COMPAT_VER);
 
 	sb->gen = cpu_to_le64(sb_gen);
 
@@ -848,28 +869,29 @@ static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
 {
 	unsigned int nr_meta_zones, nr_data_zones;
 	struct dmz_dev *dev = zmd->dev;
-	u32 crc, stored_crc;
+	u32 crc, stored_crc, dmz_version;
 	u64 gen;
 
-	gen = le64_to_cpu(sb->gen);
-	stored_crc = le32_to_cpu(sb->crc);
-	sb->crc = 0;
-	crc = crc32_le(gen, (unsigned char *)sb, DMZ_BLOCK_SIZE);
-	if (crc != stored_crc) {
-		dmz_dev_err(dev, "Invalid checksum (needed 0x%08x, got 0x%08x)",
-			    crc, stored_crc);
-		return -ENXIO;
-	}
-
 	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
 		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
 			    DMZ_MAGIC, le32_to_cpu(sb->magic));
 		return -ENXIO;
 	}
 
-	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
+	dmz_version = le32_to_cpu(sb->version);
+	if (dmz_version > DMZ_META_VER) {
 		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
-			    DMZ_META_VER, le32_to_cpu(sb->version));
+			    DMZ_META_VER, dmz_version);
+		return -ENXIO;
+	}
+
+	gen = le64_to_cpu(sb->gen);
+	stored_crc = le32_to_cpu(sb->crc);
+	sb->crc = 0;
+	crc = crc32_le(gen, (unsigned char *)sb, DMZ_BLOCK_SIZE);
+	if (crc != stored_crc) {
+		dmz_dev_err(dev, "Invalid checksum (needed 0x%08x, got 0x%08x)",
+			    crc, stored_crc);
 		return -ENXIO;
 	}
 
@@ -895,6 +917,21 @@ static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
 		return -ENXIO;
 	}
 
+	if (dmz_version == DMZ_META_VER) {
+		if (uuid_is_null((uuid_t *)sb->dmz_uuid)) {
+			dmz_dev_err(dev, "NULL DM-Zoned uuid");
+			return -ENXIO;
+		}
+		uuid_copy(&zmd->dmz_uuid, (uuid_t *)sb->dmz_uuid);
+		memcpy(zmd->dmz_label, sb->dmz_label, BDEVNAME_SIZE);
+		if (uuid_is_null((uuid_t *)sb->dev_uuid)) {
+			dmz_dev_err(dev, "NULL device uuid");
+			return -ENXIO;
+		}
+		uuid_copy(&zmd->dev_uuid, (uuid_t *)sb->dev_uuid);
+
+	}
+
 	/* OK */
 	zmd->nr_meta_blocks = le32_to_cpu(sb->nr_meta_blocks);
 	zmd->nr_reserved_seq = le32_to_cpu(sb->nr_reserved_seq);
@@ -2460,9 +2497,18 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
 		goto err;
 	}
 
+	dmz_dev_info(dev, "DM-Zoned version %d",
+		     uuid_is_null(&zmd->dmz_uuid) ?
+		     DMZ_META_COMPAT_VER : DMZ_META_VER);
+	if (!uuid_is_null(&zmd->dmz_uuid))
+		dmz_dev_info(dev, "DM UUID %pUl", &zmd->dmz_uuid);
+	if (strlen(zmd->dmz_label))
+		dmz_dev_info(dev, "DM Label %s", zmd->dmz_label);
 	dmz_dev_info(dev, "Host-%s zoned block device",
 		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
 		     "aware" : "managed");
+	if (!uuid_is_null(&zmd->dev_uuid))
+		dmz_dev_info(dev, "  uuid %pUl", &zmd->dev_uuid);
 	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
 		     (u64)dev->capacity);
 	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 44e30a7de8b9..7ec9dde24516 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -1008,7 +1008,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
 
 static struct target_type dmz_type = {
 	.name		 = "zoned",
-	.version	 = {1, 1, 0},
+	.version	 = {1, 2, 0},
 	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
 	.module		 = THIS_MODULE,
 	.ctr		 = dmz_ctr,
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity
  2020-03-27  7:14 [PATCH RFC 0/4] dm-zoned: Metadata V2 Hannes Reinecke
                   ` (2 preceding siblings ...)
  2020-03-27  7:14 ` [PATCH 3/4] dm-zoned: V2 metadata handling Hannes Reinecke
@ 2020-03-27  7:14 ` Hannes Reinecke
  2020-03-31  0:49   ` Damien Le Moal
  3 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2020-03-27  7:14 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Damien LeMoal, Bob Liu, dm-devel

dm-zoned requires several zones for metadata and chunk bitmaps,
so it cannot expose the entire capacity as the device size.
Originally the code would check for the capacity being equal to
the device size, which is arguably wrong.
So relax this check and increase the interface version number
to signal to userspace that it can set a smaller device size.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-zoned-target.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index 7ec9dde24516..89a825d1034e 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -715,7 +715,7 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
 	aligned_capacity = dev->capacity &
 				~((sector_t)blk_queue_zone_sectors(q) - 1);
 	if (ti->begin ||
-	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
+	    ((ti->len > dev->capacity) && (ti->len > aligned_capacity))) {
 		ti->error = "Partial mapping not supported";
 		ret = -EINVAL;
 		goto err;
@@ -1008,7 +1008,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
 
 static struct target_type dmz_type = {
 	.name		 = "zoned",
-	.version	 = {1, 2, 0},
+	.version	 = {1, 3, 0},
 	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
 	.module		 = THIS_MODULE,
 	.ctr		 = dmz_ctr,
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity
  2020-03-27  7:14 ` [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity Hannes Reinecke
@ 2020-03-31  0:49   ` Damien Le Moal
  2020-03-31  8:53     ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: Damien Le Moal @ 2020-03-31  0:49 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/03/27 16:15, Hannes Reinecke wrote:
> dm-zoned requires several zones for metadata and chunk bitmaps,
> so it cannot expose the entire capacity as the device size.
> Originally the code would check for the capacity being equal to
> the device size, which is arguably wrong.
> So relax this check and increase the interface version number
> to signal to userspace that it can set a smaller device size.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/md/dm-zoned-target.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index 7ec9dde24516..89a825d1034e 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -715,7 +715,7 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
>  	aligned_capacity = dev->capacity &
>  				~((sector_t)blk_queue_zone_sectors(q) - 1);
>  	if (ti->begin ||
> -	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
> +	    ((ti->len > dev->capacity) && (ti->len > aligned_capacity))) {
>  		ti->error = "Partial mapping not supported";

The message is now wrong. Also, the condition can now be simplified to:

if ((ti->begin + ti->len) > aligned_capacity) {

Since aligned capacity is equal or smaller than dev capacity. And we have to
account for the potential non-zero begin.

>  		ret = -EINVAL;
>  		goto err;
> @@ -1008,7 +1008,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>  
>  static struct target_type dmz_type = {
>  	.name		 = "zoned",
> -	.version	 = {1, 2, 0},
> +	.version	 = {1, 3, 0},
>  	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>  	.module		 = THIS_MODULE,
>  	.ctr		 = dmz_ctr,
> 

I do not think this is nearly enough: dmz_init_zones() is still considering the
entire drive and does a zone report from 0 on the backend bdev. It should start
at ti->begin and up to ti->begin+ti->len, no ?

Furthermore, this introduce a change in the meaning of the zone ID. Since this
is set to the index of the zone in the report (patch 1), if the mapping is
partial and the zone report does not start at 0, then zone ID is not zone number
on the device anymore. So dmz_start_block() needs to be offset by ti->begin
otherwise IOs will go to the wrong zones.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] dm-zoned: use array for superblock zones
  2020-03-27  7:14 ` [PATCH 2/4] dm-zoned: use array for superblock zones Hannes Reinecke
@ 2020-03-31  0:51   ` Damien Le Moal
  2020-03-31  9:10   ` Bob Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Damien Le Moal @ 2020-03-31  0:51 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/03/27 16:15, Hannes Reinecke wrote:
> Instead of storing just the first superblock zone and calculate
> the secondary relative to that we should be using an array for
> holding the superblock zones.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>

Looks good.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

> ---
>  drivers/md/dm-zoned-metadata.c | 34 +++++++++++++++++++++-------------
>  1 file changed, 21 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index afce594067fb..dc1d17bc3bbb 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -124,6 +124,7 @@ struct dmz_sb {
>  	sector_t		block;
>  	struct dmz_mblock	*mblk;
>  	struct dmz_super	*sb;
> +	struct dm_zone		*zone;
>  };
>  
>  /*
> @@ -150,7 +151,6 @@ struct dmz_metadata {
>  	/* Zone information array */
>  	struct dm_zone		*zones;
>  
> -	struct dm_zone		*sb_zone;
>  	struct dmz_sb		sb[2];
>  	unsigned int		mblk_primary;
>  	u64			sb_gen;
> @@ -937,16 +937,20 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
>  
>  	/* Bad first super block: search for the second one */
>  	zmd->sb[1].block = zmd->sb[0].block + zone_nr_blocks;
> +	zmd->sb[1].zone = zmd->sb[0].zone + 1;
>  	for (i = 0; i < zmd->nr_rnd_zones - 1; i++) {
>  		if (dmz_read_sb(zmd, 1) != 0)
>  			break;
> -		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC)
> +		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC) {
> +			zmd->sb[1].zone += i;
>  			return 0;
> +		}
>  		zmd->sb[1].block += zone_nr_blocks;
>  	}
>  
>  	dmz_free_mblock(zmd, mblk);
>  	zmd->sb[1].mblk = NULL;
> +	zmd->sb[1].zone = NULL;
>  
>  	return -EIO;
>  }
> @@ -990,11 +994,9 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
>  	dmz_dev_warn(zmd->dev, "Metadata set %u invalid: recovering", dst_set);
>  
>  	if (dst_set == 0)
> -		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
> -	else {
> -		zmd->sb[1].block = zmd->sb[0].block +
> -			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
> -	}
> +		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
> +	else
> +		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
>  
>  	page = alloc_page(GFP_NOIO);
>  	if (!page)
> @@ -1038,8 +1040,13 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
>  	u64 sb_gen[2] = {0, 0};
>  	int ret;
>  
> +	if (!zmd->sb[0].zone) {
> +		dmz_dev_err(zmd->dev, "Primary super block zone not set");
> +		return -ENXIO;
> +	}
> +
>  	/* Read and check the primary super block */
> -	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
> +	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
>  	ret = dmz_get_sb(zmd, 0);
>  	if (ret) {
>  		dmz_dev_err(zmd->dev, "Read primary super block failed");
> @@ -1051,8 +1058,9 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
>  	/* Read and check secondary super block */
>  	if (ret == 0) {
>  		sb_good[0] = true;
> -		zmd->sb[1].block = zmd->sb[0].block +
> -			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
> +		if (!zmd->sb[1].zone)
> +			zmd->sb[1].zone = zmd->sb[0].zone + zmd->nr_meta_zones;
> +		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
>  		ret = dmz_get_sb(zmd, 1);
>  	} else
>  		ret = dmz_lookup_secondary_sb(zmd);
> @@ -1147,9 +1155,9 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>  		zmd->nr_useable_zones++;
>  		if (dmz_is_rnd(zone)) {
>  			zmd->nr_rnd_zones++;
> -			if (!zmd->sb_zone) {
> +			if (!zmd->sb[0].zone) {
>  				/* Super block zone */
> -				zmd->sb_zone = zone;
> +				zmd->sb[0].zone = zone;
>  			}
>  		}
>  	}
> @@ -2420,7 +2428,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
>  		goto err;
>  
>  	/* Set metadata zones starting from sb_zone */
> -	zid = dmz_id(zmd, zmd->sb_zone);
> +	zid = dmz_id(zmd, zmd->sb[0].zone);
>  	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
>  		zone = dmz_get(zmd, zid + i);
>  		if (!dmz_is_rnd(zone))
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-03-27  7:14 ` [PATCH 3/4] dm-zoned: V2 metadata handling Hannes Reinecke
@ 2020-03-31  0:54   ` Damien Le Moal
  2020-03-31  9:11   ` Bob Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Damien Le Moal @ 2020-03-31  0:54 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/03/27 16:15, Hannes Reinecke wrote:
> Add 'V2' metadata which includes UUIDs for the dmz set and for
> the device itself.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>

Looks good.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

> ---
>  drivers/md/dm-zoned-metadata.c | 80 +++++++++++++++++++++++++++++++++---------
>  drivers/md/dm-zoned-target.c   |  2 +-
>  2 files changed, 64 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index dc1d17bc3bbb..026f285fba33 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -16,7 +16,8 @@
>  /*
>   * Metadata version.
>   */
> -#define DMZ_META_VER	1
> +#define DMZ_META_COMPAT_VER 1
> +#define DMZ_META_VER	2
>  
>  /*
>   * On-disk super block magic.
> @@ -69,8 +70,17 @@ struct dmz_super {
>  	/* Checksum */
>  	__le32		crc;			/*  48 */
>  
> -	/* Padding to full 512B sector */
> -	u8		reserved[464];		/* 512 */
> +	/* DM-Zoned label */
> +	u8		dmz_label[32];		/*  80 */
> +
> +	/* DM-Zoned UUID */
> +	u8		dmz_uuid[16];		/*  96 */
> +
> +	/* Device UUID */
> +	u8		dev_uuid[16];		/* 112 */
> +
> +	/* Padding to full 512B - CRC sector */
> +	u8		reserved[400];		/* 512 */
>  };
>  
>  /*
> @@ -133,6 +143,10 @@ struct dmz_sb {
>  struct dmz_metadata {
>  	struct dmz_dev		*dev;
>  
> +	char			dmz_label[BDEVNAME_SIZE];
> +	uuid_t			dmz_uuid;
> +	uuid_t			dev_uuid;
> +
>  	sector_t		zone_bitmap_size;
>  	unsigned int		zone_nr_bitmap_blocks;
>  	unsigned int		zone_bits_per_mblk;
> @@ -659,7 +673,14 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>  	int ret;
>  
>  	sb->magic = cpu_to_le32(DMZ_MAGIC);
> -	sb->version = cpu_to_le32(DMZ_META_VER);
> +
> +	if (!uuid_is_null(&zmd->dmz_uuid)) {
> +		sb->version = cpu_to_le32(DMZ_META_VER);
> +		uuid_copy((uuid_t *)sb->dmz_uuid, &zmd->dmz_uuid);
> +		memcpy(sb->dmz_label, zmd->dmz_label, BDEVNAME_SIZE);
> +		uuid_copy((uuid_t *)sb->dev_uuid, &zmd->dev_uuid);
> +	} else
> +		sb->version = cpu_to_le32(DMZ_META_COMPAT_VER);
>  
>  	sb->gen = cpu_to_le64(sb_gen);
>  
> @@ -848,28 +869,29 @@ static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
>  {
>  	unsigned int nr_meta_zones, nr_data_zones;
>  	struct dmz_dev *dev = zmd->dev;
> -	u32 crc, stored_crc;
> +	u32 crc, stored_crc, dmz_version;
>  	u64 gen;
>  
> -	gen = le64_to_cpu(sb->gen);
> -	stored_crc = le32_to_cpu(sb->crc);
> -	sb->crc = 0;
> -	crc = crc32_le(gen, (unsigned char *)sb, DMZ_BLOCK_SIZE);
> -	if (crc != stored_crc) {
> -		dmz_dev_err(dev, "Invalid checksum (needed 0x%08x, got 0x%08x)",
> -			    crc, stored_crc);
> -		return -ENXIO;
> -	}
> -
>  	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
>  		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
>  			    DMZ_MAGIC, le32_to_cpu(sb->magic));
>  		return -ENXIO;
>  	}
>  
> -	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
> +	dmz_version = le32_to_cpu(sb->version);
> +	if (dmz_version > DMZ_META_VER) {
>  		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
> -			    DMZ_META_VER, le32_to_cpu(sb->version));
> +			    DMZ_META_VER, dmz_version);
> +		return -ENXIO;
> +	}
> +
> +	gen = le64_to_cpu(sb->gen);
> +	stored_crc = le32_to_cpu(sb->crc);
> +	sb->crc = 0;
> +	crc = crc32_le(gen, (unsigned char *)sb, DMZ_BLOCK_SIZE);
> +	if (crc != stored_crc) {
> +		dmz_dev_err(dev, "Invalid checksum (needed 0x%08x, got 0x%08x)",
> +			    crc, stored_crc);
>  		return -ENXIO;
>  	}
>  
> @@ -895,6 +917,21 @@ static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
>  		return -ENXIO;
>  	}
>  
> +	if (dmz_version == DMZ_META_VER) {
> +		if (uuid_is_null((uuid_t *)sb->dmz_uuid)) {
> +			dmz_dev_err(dev, "NULL DM-Zoned uuid");
> +			return -ENXIO;
> +		}
> +		uuid_copy(&zmd->dmz_uuid, (uuid_t *)sb->dmz_uuid);
> +		memcpy(zmd->dmz_label, sb->dmz_label, BDEVNAME_SIZE);
> +		if (uuid_is_null((uuid_t *)sb->dev_uuid)) {
> +			dmz_dev_err(dev, "NULL device uuid");
> +			return -ENXIO;
> +		}
> +		uuid_copy(&zmd->dev_uuid, (uuid_t *)sb->dev_uuid);
> +
> +	}
> +
>  	/* OK */
>  	zmd->nr_meta_blocks = le32_to_cpu(sb->nr_meta_blocks);
>  	zmd->nr_reserved_seq = le32_to_cpu(sb->nr_reserved_seq);
> @@ -2460,9 +2497,18 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
>  		goto err;
>  	}
>  
> +	dmz_dev_info(dev, "DM-Zoned version %d",
> +		     uuid_is_null(&zmd->dmz_uuid) ?
> +		     DMZ_META_COMPAT_VER : DMZ_META_VER);
> +	if (!uuid_is_null(&zmd->dmz_uuid))
> +		dmz_dev_info(dev, "DM UUID %pUl", &zmd->dmz_uuid);
> +	if (strlen(zmd->dmz_label))
> +		dmz_dev_info(dev, "DM Label %s", zmd->dmz_label);
>  	dmz_dev_info(dev, "Host-%s zoned block device",
>  		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
>  		     "aware" : "managed");
> +	if (!uuid_is_null(&zmd->dev_uuid))
> +		dmz_dev_info(dev, "  uuid %pUl", &zmd->dev_uuid);
>  	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
>  		     (u64)dev->capacity);
>  	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index 44e30a7de8b9..7ec9dde24516 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -1008,7 +1008,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>  
>  static struct target_type dmz_type = {
>  	.name		 = "zoned",
> -	.version	 = {1, 1, 0},
> +	.version	 = {1, 2, 0},
>  	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>  	.module		 = THIS_MODULE,
>  	.ctr		 = dmz_ctr,
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] dm-zoned: store zone id within the zone structure
  2020-03-27  7:14 ` [PATCH 1/4] dm-zoned: store zone id within the zone structure Hannes Reinecke
@ 2020-03-31  0:57   ` Damien Le Moal
  2020-03-31  8:54     ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: Damien Le Moal @ 2020-03-31  0:57 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/03/27 16:15, Hannes Reinecke wrote:
> Instead of calculating the zone index by the offset within the
> zone array store the index within the structure itself.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/md/dm-zoned-metadata.c | 3 ++-
>  drivers/md/dm-zoned.h          | 3 +++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index c8787560fa9f..afce594067fb 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -189,7 +189,7 @@ struct dmz_metadata {
>   */
>  unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone)
>  {
> -	return ((unsigned int)(zone - zmd->zones));
> +	return zone->id;
>  }
>  
>  sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
> @@ -1119,6 +1119,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>  
>  	INIT_LIST_HEAD(&zone->link);
>  	atomic_set(&zone->refcount, 0);
> +	zone->id = idx;

See my comment on patch 4. Allowing partial bdev mapping changes the meaning of
this ID: for a partial mapping, this is not equal to the zone number anymore. So
we should document that (in the struct declaration), and may be merge this patch
with patch 4 since they are related so closely ?

>  	zone->chunk = DMZ_MAP_UNMAPPED;
>  
>  	switch (blkz->type) {
> diff --git a/drivers/md/dm-zoned.h b/drivers/md/dm-zoned.h
> index 884c0e586082..39d59898abbe 100644
> --- a/drivers/md/dm-zoned.h
> +++ b/drivers/md/dm-zoned.h
> @@ -87,6 +87,9 @@ struct dm_zone {
>  	/* Zone activation reference count */
>  	atomic_t		refcount;
>  
> +	/* Zone id */
> +	unsigned int		id;
> +
>  	/* Zone write pointer block (relative to the zone start block) */
>  	unsigned int		wp_block;
>  
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity
  2020-03-31  0:49   ` Damien Le Moal
@ 2020-03-31  8:53     ` Hannes Reinecke
  2020-04-02  2:45       ` Damien Le Moal
  0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2020-03-31  8:53 UTC (permalink / raw)
  To: Damien Le Moal, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 3/31/20 2:49 AM, Damien Le Moal wrote:
> On 2020/03/27 16:15, Hannes Reinecke wrote:
>> dm-zoned requires several zones for metadata and chunk bitmaps,
>> so it cannot expose the entire capacity as the device size.
>> Originally the code would check for the capacity being equal to
>> the device size, which is arguably wrong.
>> So relax this check and increase the interface version number
>> to signal to userspace that it can set a smaller device size.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/md/dm-zoned-target.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
>> index 7ec9dde24516..89a825d1034e 100644
>> --- a/drivers/md/dm-zoned-target.c
>> +++ b/drivers/md/dm-zoned-target.c
>> @@ -715,7 +715,7 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
>>   	aligned_capacity = dev->capacity &
>>   				~((sector_t)blk_queue_zone_sectors(q) - 1);
>>   	if (ti->begin ||
>> -	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
>> +	    ((ti->len > dev->capacity) && (ti->len > aligned_capacity))) {
>>   		ti->error = "Partial mapping not supported";
> 
> The message is now wrong. Also, the condition can now be simplified to:
> 
> if ((ti->begin + ti->len) > aligned_capacity) {
> 
> Since aligned capacity is equal or smaller than dev capacity. And we have to
> account for the potential non-zero begin.
> 
_Actually_ I would forbid for 'ti->begin' to be anything other than 0.
For a zoned device there is no point in allowing for partial handling at 
all.
Problem is that 'dev->capacity' is the capacity of the zoned block 
device, whereas 'ti-len' is the exported capacity of the device-mapper 
device, which is smaller than the device capacity by the number of 
metadata blocks/zones.

>>   		ret = -EINVAL;
>>   		goto err;
>> @@ -1008,7 +1008,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>>   
>>   static struct target_type dmz_type = {
>>   	.name		 = "zoned",
>> -	.version	 = {1, 2, 0},
>> +	.version	 = {1, 3, 0},
>>   	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>>   	.module		 = THIS_MODULE,
>>   	.ctr		 = dmz_ctr,
>>
> 
> I do not think this is nearly enough: dmz_init_zones() is still considering the
> entire drive and does a zone report from 0 on the backend bdev. It should start
> at ti->begin and up to ti->begin+ti->len, no ?
> 
Yes, and no. I want to disallow 'ti->begin' to be anything else than 0 
as ti->begin and ti->len are relative to exported device-mapper device, 
and we always want to have block 0 mapped :-)

And as such dmz_init_zones() needs to cover all zones, as this is 
relative to the zoned block device.

> Furthermore, this introduce a change in the meaning of the zone ID. Since this
> is set to the index of the zone in the report (patch 1), if the mapping is
> partial and the zone report does not start at 0, then zone ID is not zone number
> on the device anymore. So dmz_start_block() needs to be offset by ti->begin
> otherwise IOs will go to the wrong zones.
> 
As I said: We will never do a partial mapping.
What this patch does it to bring the device-mapper mapping in-line with 
the exported device size.

Originally we would export a device-mapper mapping for blocks up to the 
zone-device capacity. As the resulting device-mapper block device has a 
smaller capacity than the mapping would allow for those 'spare' blocks 
would never been used, thus the invalid mapping was never triggered.

What this patch does is to bring the device-mapper mapping in-line with 
the exported block device capacity, so that we don't have an invalid 
mapping. Nothing else has (and should) be changed.

Especially not the partial zoned device handling :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/4] dm-zoned: store zone id within the zone structure
  2020-03-31  0:57   ` Damien Le Moal
@ 2020-03-31  8:54     ` Hannes Reinecke
  0 siblings, 0 replies; 19+ messages in thread
From: Hannes Reinecke @ 2020-03-31  8:54 UTC (permalink / raw)
  To: Damien Le Moal, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 3/31/20 2:57 AM, Damien Le Moal wrote:
> On 2020/03/27 16:15, Hannes Reinecke wrote:
>> Instead of calculating the zone index by the offset within the
>> zone array store the index within the structure itself.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/md/dm-zoned-metadata.c | 3 ++-
>>   drivers/md/dm-zoned.h          | 3 +++
>>   2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
>> index c8787560fa9f..afce594067fb 100644
>> --- a/drivers/md/dm-zoned-metadata.c
>> +++ b/drivers/md/dm-zoned-metadata.c
>> @@ -189,7 +189,7 @@ struct dmz_metadata {
>>    */
>>   unsigned int dmz_id(struct dmz_metadata *zmd, struct dm_zone *zone)
>>   {
>> -	return ((unsigned int)(zone - zmd->zones));
>> +	return zone->id;
>>   }
>>   
>>   sector_t dmz_start_sect(struct dmz_metadata *zmd, struct dm_zone *zone)
>> @@ -1119,6 +1119,7 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>>   
>>   	INIT_LIST_HEAD(&zone->link);
>>   	atomic_set(&zone->refcount, 0);
>> +	zone->id = idx;
> 
> See my comment on patch 4. Allowing partial bdev mapping changes the meaning of
> this ID: for a partial mapping, this is not equal to the zone number anymore. So
> we should document that (in the struct declaration), and may be merge this patch
> with patch 4 since they are related so closely ?
> 
See my reply to patch 4. I do not allow for partial mapping, nor was it 
ever the intention to do so.
The zone numbering will not change.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/4] dm-zoned: use array for superblock zones
  2020-03-27  7:14 ` [PATCH 2/4] dm-zoned: use array for superblock zones Hannes Reinecke
  2020-03-31  0:51   ` Damien Le Moal
@ 2020-03-31  9:10   ` Bob Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Bob Liu @ 2020-03-31  9:10 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Damien LeMoal, dm-devel

On 3/27/20 3:14 PM, Hannes Reinecke wrote:
> Instead of storing just the first superblock zone and calculate
> the secondary relative to that we should be using an array for
> holding the superblock zones.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>

Reviewed-by: Bob Liu <bob.liu@oracle.com>

> ---
>  drivers/md/dm-zoned-metadata.c | 34 +++++++++++++++++++++-------------
>  1 file changed, 21 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index afce594067fb..dc1d17bc3bbb 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -124,6 +124,7 @@ struct dmz_sb {
>  	sector_t		block;
>  	struct dmz_mblock	*mblk;
>  	struct dmz_super	*sb;
> +	struct dm_zone		*zone;
>  };
>  
>  /*
> @@ -150,7 +151,6 @@ struct dmz_metadata {
>  	/* Zone information array */
>  	struct dm_zone		*zones;
>  
> -	struct dm_zone		*sb_zone;
>  	struct dmz_sb		sb[2];
>  	unsigned int		mblk_primary;
>  	u64			sb_gen;
> @@ -937,16 +937,20 @@ static int dmz_lookup_secondary_sb(struct dmz_metadata *zmd)
>  
>  	/* Bad first super block: search for the second one */
>  	zmd->sb[1].block = zmd->sb[0].block + zone_nr_blocks;
> +	zmd->sb[1].zone = zmd->sb[0].zone + 1;
>  	for (i = 0; i < zmd->nr_rnd_zones - 1; i++) {
>  		if (dmz_read_sb(zmd, 1) != 0)
>  			break;
> -		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC)
> +		if (le32_to_cpu(zmd->sb[1].sb->magic) == DMZ_MAGIC) {
> +			zmd->sb[1].zone += i;
>  			return 0;
> +		}
>  		zmd->sb[1].block += zone_nr_blocks;
>  	}
>  
>  	dmz_free_mblock(zmd, mblk);
>  	zmd->sb[1].mblk = NULL;
> +	zmd->sb[1].zone = NULL;
>  
>  	return -EIO;
>  }
> @@ -990,11 +994,9 @@ static int dmz_recover_mblocks(struct dmz_metadata *zmd, unsigned int dst_set)
>  	dmz_dev_warn(zmd->dev, "Metadata set %u invalid: recovering", dst_set);
>  
>  	if (dst_set == 0)
> -		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
> -	else {
> -		zmd->sb[1].block = zmd->sb[0].block +
> -			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
> -	}
> +		zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
> +	else
> +		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
>  
>  	page = alloc_page(GFP_NOIO);
>  	if (!page)
> @@ -1038,8 +1040,13 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
>  	u64 sb_gen[2] = {0, 0};
>  	int ret;
>  
> +	if (!zmd->sb[0].zone) {
> +		dmz_dev_err(zmd->dev, "Primary super block zone not set");
> +		return -ENXIO;
> +	}
> +
>  	/* Read and check the primary super block */
> -	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb_zone);
> +	zmd->sb[0].block = dmz_start_block(zmd, zmd->sb[0].zone);
>  	ret = dmz_get_sb(zmd, 0);
>  	if (ret) {
>  		dmz_dev_err(zmd->dev, "Read primary super block failed");
> @@ -1051,8 +1058,9 @@ static int dmz_load_sb(struct dmz_metadata *zmd)
>  	/* Read and check secondary super block */
>  	if (ret == 0) {
>  		sb_good[0] = true;
> -		zmd->sb[1].block = zmd->sb[0].block +
> -			(zmd->nr_meta_zones << zmd->dev->zone_nr_blocks_shift);
> +		if (!zmd->sb[1].zone)
> +			zmd->sb[1].zone = zmd->sb[0].zone + zmd->nr_meta_zones;
> +		zmd->sb[1].block = dmz_start_block(zmd, zmd->sb[1].zone);
>  		ret = dmz_get_sb(zmd, 1);
>  	} else
>  		ret = dmz_lookup_secondary_sb(zmd);
> @@ -1147,9 +1155,9 @@ static int dmz_init_zone(struct blk_zone *blkz, unsigned int idx, void *data)
>  		zmd->nr_useable_zones++;
>  		if (dmz_is_rnd(zone)) {
>  			zmd->nr_rnd_zones++;
> -			if (!zmd->sb_zone) {
> +			if (!zmd->sb[0].zone) {
>  				/* Super block zone */
> -				zmd->sb_zone = zone;
> +				zmd->sb[0].zone = zone;
>  			}
>  		}
>  	}
> @@ -2420,7 +2428,7 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
>  		goto err;
>  
>  	/* Set metadata zones starting from sb_zone */
> -	zid = dmz_id(zmd, zmd->sb_zone);
> +	zid = dmz_id(zmd, zmd->sb[0].zone);
>  	for (i = 0; i < zmd->nr_meta_zones << 1; i++) {
>  		zone = dmz_get(zmd, zid + i);
>  		if (!dmz_is_rnd(zone))
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-03-27  7:14 ` [PATCH 3/4] dm-zoned: V2 metadata handling Hannes Reinecke
  2020-03-31  0:54   ` Damien Le Moal
@ 2020-03-31  9:11   ` Bob Liu
  2020-04-02 14:53     ` John Dorminy
  1 sibling, 1 reply; 19+ messages in thread
From: Bob Liu @ 2020-03-31  9:11 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Damien LeMoal, dm-devel

On 3/27/20 3:14 PM, Hannes Reinecke wrote:
> Add 'V2' metadata which includes UUIDs for the dmz set and for
> the device itself.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>

Reviewed-by: Bob Liu <bob.liu@oracle.com>

> ---
>  drivers/md/dm-zoned-metadata.c | 80 +++++++++++++++++++++++++++++++++---------
>  drivers/md/dm-zoned-target.c   |  2 +-
>  2 files changed, 64 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
> index dc1d17bc3bbb..026f285fba33 100644
> --- a/drivers/md/dm-zoned-metadata.c
> +++ b/drivers/md/dm-zoned-metadata.c
> @@ -16,7 +16,8 @@
>  /*
>   * Metadata version.
>   */
> -#define DMZ_META_VER	1
> +#define DMZ_META_COMPAT_VER 1
> +#define DMZ_META_VER	2
>  
>  /*
>   * On-disk super block magic.
> @@ -69,8 +70,17 @@ struct dmz_super {
>  	/* Checksum */
>  	__le32		crc;			/*  48 */
>  
> -	/* Padding to full 512B sector */
> -	u8		reserved[464];		/* 512 */
> +	/* DM-Zoned label */
> +	u8		dmz_label[32];		/*  80 */
> +
> +	/* DM-Zoned UUID */
> +	u8		dmz_uuid[16];		/*  96 */
> +
> +	/* Device UUID */
> +	u8		dev_uuid[16];		/* 112 */
> +
> +	/* Padding to full 512B - CRC sector */
> +	u8		reserved[400];		/* 512 */
>  };
>  
>  /*
> @@ -133,6 +143,10 @@ struct dmz_sb {
>  struct dmz_metadata {
>  	struct dmz_dev		*dev;
>  
> +	char			dmz_label[BDEVNAME_SIZE];
> +	uuid_t			dmz_uuid;
> +	uuid_t			dev_uuid;
> +
>  	sector_t		zone_bitmap_size;
>  	unsigned int		zone_nr_bitmap_blocks;
>  	unsigned int		zone_bits_per_mblk;
> @@ -659,7 +673,14 @@ static int dmz_write_sb(struct dmz_metadata *zmd, unsigned int set)
>  	int ret;
>  
>  	sb->magic = cpu_to_le32(DMZ_MAGIC);
> -	sb->version = cpu_to_le32(DMZ_META_VER);
> +
> +	if (!uuid_is_null(&zmd->dmz_uuid)) {
> +		sb->version = cpu_to_le32(DMZ_META_VER);
> +		uuid_copy((uuid_t *)sb->dmz_uuid, &zmd->dmz_uuid);
> +		memcpy(sb->dmz_label, zmd->dmz_label, BDEVNAME_SIZE);
> +		uuid_copy((uuid_t *)sb->dev_uuid, &zmd->dev_uuid);
> +	} else
> +		sb->version = cpu_to_le32(DMZ_META_COMPAT_VER);
>  
>  	sb->gen = cpu_to_le64(sb_gen);
>  
> @@ -848,28 +869,29 @@ static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
>  {
>  	unsigned int nr_meta_zones, nr_data_zones;
>  	struct dmz_dev *dev = zmd->dev;
> -	u32 crc, stored_crc;
> +	u32 crc, stored_crc, dmz_version;
>  	u64 gen;
>  
> -	gen = le64_to_cpu(sb->gen);
> -	stored_crc = le32_to_cpu(sb->crc);
> -	sb->crc = 0;
> -	crc = crc32_le(gen, (unsigned char *)sb, DMZ_BLOCK_SIZE);
> -	if (crc != stored_crc) {
> -		dmz_dev_err(dev, "Invalid checksum (needed 0x%08x, got 0x%08x)",
> -			    crc, stored_crc);
> -		return -ENXIO;
> -	}
> -
>  	if (le32_to_cpu(sb->magic) != DMZ_MAGIC) {
>  		dmz_dev_err(dev, "Invalid meta magic (needed 0x%08x, got 0x%08x)",
>  			    DMZ_MAGIC, le32_to_cpu(sb->magic));
>  		return -ENXIO;
>  	}
>  
> -	if (le32_to_cpu(sb->version) != DMZ_META_VER) {
> +	dmz_version = le32_to_cpu(sb->version);
> +	if (dmz_version > DMZ_META_VER) {
>  		dmz_dev_err(dev, "Invalid meta version (needed %d, got %d)",
> -			    DMZ_META_VER, le32_to_cpu(sb->version));
> +			    DMZ_META_VER, dmz_version);
> +		return -ENXIO;
> +	}
> +
> +	gen = le64_to_cpu(sb->gen);
> +	stored_crc = le32_to_cpu(sb->crc);
> +	sb->crc = 0;
> +	crc = crc32_le(gen, (unsigned char *)sb, DMZ_BLOCK_SIZE);
> +	if (crc != stored_crc) {
> +		dmz_dev_err(dev, "Invalid checksum (needed 0x%08x, got 0x%08x)",
> +			    crc, stored_crc);
>  		return -ENXIO;
>  	}
>  
> @@ -895,6 +917,21 @@ static int dmz_check_sb(struct dmz_metadata *zmd, struct dmz_super *sb)
>  		return -ENXIO;
>  	}
>  
> +	if (dmz_version == DMZ_META_VER) {
> +		if (uuid_is_null((uuid_t *)sb->dmz_uuid)) {
> +			dmz_dev_err(dev, "NULL DM-Zoned uuid");
> +			return -ENXIO;
> +		}
> +		uuid_copy(&zmd->dmz_uuid, (uuid_t *)sb->dmz_uuid);
> +		memcpy(zmd->dmz_label, sb->dmz_label, BDEVNAME_SIZE);
> +		if (uuid_is_null((uuid_t *)sb->dev_uuid)) {
> +			dmz_dev_err(dev, "NULL device uuid");
> +			return -ENXIO;
> +		}
> +		uuid_copy(&zmd->dev_uuid, (uuid_t *)sb->dev_uuid);
> +
> +	}
> +
>  	/* OK */
>  	zmd->nr_meta_blocks = le32_to_cpu(sb->nr_meta_blocks);
>  	zmd->nr_reserved_seq = le32_to_cpu(sb->nr_reserved_seq);
> @@ -2460,9 +2497,18 @@ int dmz_ctr_metadata(struct dmz_dev *dev, struct dmz_metadata **metadata)
>  		goto err;
>  	}
>  
> +	dmz_dev_info(dev, "DM-Zoned version %d",
> +		     uuid_is_null(&zmd->dmz_uuid) ?
> +		     DMZ_META_COMPAT_VER : DMZ_META_VER);
> +	if (!uuid_is_null(&zmd->dmz_uuid))
> +		dmz_dev_info(dev, "DM UUID %pUl", &zmd->dmz_uuid);
> +	if (strlen(zmd->dmz_label))
> +		dmz_dev_info(dev, "DM Label %s", zmd->dmz_label);
>  	dmz_dev_info(dev, "Host-%s zoned block device",
>  		     bdev_zoned_model(dev->bdev) == BLK_ZONED_HA ?
>  		     "aware" : "managed");
> +	if (!uuid_is_null(&zmd->dev_uuid))
> +		dmz_dev_info(dev, "  uuid %pUl", &zmd->dev_uuid);
>  	dmz_dev_info(dev, "  %llu 512-byte logical sectors",
>  		     (u64)dev->capacity);
>  	dmz_dev_info(dev, "  %u zones of %llu 512-byte logical sectors",
> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
> index 44e30a7de8b9..7ec9dde24516 100644
> --- a/drivers/md/dm-zoned-target.c
> +++ b/drivers/md/dm-zoned-target.c
> @@ -1008,7 +1008,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>  
>  static struct target_type dmz_type = {
>  	.name		 = "zoned",
> -	.version	 = {1, 1, 0},
> +	.version	 = {1, 2, 0},
>  	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>  	.module		 = THIS_MODULE,
>  	.ctr		 = dmz_ctr,
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity
  2020-03-31  8:53     ` Hannes Reinecke
@ 2020-04-02  2:45       ` Damien Le Moal
  0 siblings, 0 replies; 19+ messages in thread
From: Damien Le Moal @ 2020-04-02  2:45 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer; +Cc: Bob Liu, dm-devel

On 2020/03/31 17:53, Hannes Reinecke wrote:
> On 3/31/20 2:49 AM, Damien Le Moal wrote:
>> On 2020/03/27 16:15, Hannes Reinecke wrote:
>>> dm-zoned requires several zones for metadata and chunk bitmaps,
>>> so it cannot expose the entire capacity as the device size.
>>> Originally the code would check for the capacity being equal to
>>> the device size, which is arguably wrong.
>>> So relax this check and increase the interface version number
>>> to signal to userspace that it can set a smaller device size.
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> ---
>>>   drivers/md/dm-zoned-target.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
>>> index 7ec9dde24516..89a825d1034e 100644
>>> --- a/drivers/md/dm-zoned-target.c
>>> +++ b/drivers/md/dm-zoned-target.c
>>> @@ -715,7 +715,7 @@ static int dmz_get_zoned_device(struct dm_target *ti, char *path)
>>>   	aligned_capacity = dev->capacity &
>>>   				~((sector_t)blk_queue_zone_sectors(q) - 1);
>>>   	if (ti->begin ||
>>> -	    ((ti->len != dev->capacity) && (ti->len != aligned_capacity))) {
>>> +	    ((ti->len > dev->capacity) && (ti->len > aligned_capacity))) {
>>>   		ti->error = "Partial mapping not supported";
>>
>> The message is now wrong. Also, the condition can now be simplified to:
>>
>> if ((ti->begin + ti->len) > aligned_capacity) {
>>
>> Since aligned capacity is equal or smaller than dev capacity. And we have to
>> account for the potential non-zero begin.
>>
> _Actually_ I would forbid for 'ti->begin' to be anything other than 0.
> For a zoned device there is no point in allowing for partial handling at 
> all.
> Problem is that 'dev->capacity' is the capacity of the zoned block 
> device, whereas 'ti-len' is the exported capacity of the device-mapper 
> device, which is smaller than the device capacity by the number of 
> metadata blocks/zones.

OK. Got it. I misunderstood that. Sounds good to me.

> 
>>>   		ret = -EINVAL;
>>>   		goto err;
>>> @@ -1008,7 +1008,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv,
>>>   
>>>   static struct target_type dmz_type = {
>>>   	.name		 = "zoned",
>>> -	.version	 = {1, 2, 0},
>>> +	.version	 = {1, 3, 0},
>>>   	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
>>>   	.module		 = THIS_MODULE,
>>>   	.ctr		 = dmz_ctr,
>>>
>>
>> I do not think this is nearly enough: dmz_init_zones() is still considering the
>> entire drive and does a zone report from 0 on the backend bdev. It should start
>> at ti->begin and up to ti->begin+ti->len, no ?
>>
> Yes, and no. I want to disallow 'ti->begin' to be anything else than 0 
> as ti->begin and ti->len are relative to exported device-mapper device, 
> and we always want to have block 0 mapped :-)
> 
> And as such dmz_init_zones() needs to cover all zones, as this is 
> relative to the zoned block device.

Yes. Which is already the case since we need to see the metatdata and reserved
zones too. Makes sense.

> 
>> Furthermore, this introduce a change in the meaning of the zone ID. Since this
>> is set to the index of the zone in the report (patch 1), if the mapping is
>> partial and the zone report does not start at 0, then zone ID is not zone number
>> on the device anymore. So dmz_start_block() needs to be offset by ti->begin
>> otherwise IOs will go to the wrong zones.
>>
> As I said: We will never do a partial mapping.
> What this patch does it to bring the device-mapper mapping in-line with 
> the exported device size.

Got it !

> Originally we would export a device-mapper mapping for blocks up to the 
> zone-device capacity. As the resulting device-mapper block device has a 
> smaller capacity than the mapping would allow for those 'spare' blocks 
> would never been used, thus the invalid mapping was never triggered.
> 
> What this patch does is to bring the device-mapper mapping in-line with 
> the exported block device capacity, so that we don't have an invalid 
> mapping. Nothing else has (and should) be changed.
> 
> Especially not the partial zoned device handling :-)

OK. Thanks for doing that ! I  totally missed these points when I wrote the
mapper :)

Cheers.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-03-31  9:11   ` Bob Liu
@ 2020-04-02 14:53     ` John Dorminy
  2020-04-02 15:09       ` Hannes Reinecke
  0 siblings, 1 reply; 19+ messages in thread
From: John Dorminy @ 2020-04-02 14:53 UTC (permalink / raw)
  To: Bob Liu; +Cc: Damien LeMoal, device-mapper development, Mike Snitzer


[-- Attachment #1.1: Type: text/plain, Size: 468 bytes --]

I'm worried about hardcoding uuid members as u8[16].

May I ask why you're not using uuid_t to define it in the on-disk
structure? It would save the casting of the uuid members to (uuid_t *)
every time you use a uuid.h function.

Possibly it is customary to use only raw datatypes on disk rather than
opaque types like uuid_t, I'm not sure. But in that case, perhaps using the
named constant UUID_SIZE instead of 16 would make the meaning clearer?

Thanks!

[-- Attachment #1.2: Type: text/html, Size: 545 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-04-02 14:53     ` John Dorminy
@ 2020-04-02 15:09       ` Hannes Reinecke
  2020-04-02 15:52         ` John Dorminy
  0 siblings, 1 reply; 19+ messages in thread
From: Hannes Reinecke @ 2020-04-02 15:09 UTC (permalink / raw)
  To: John Dorminy, Bob Liu
  Cc: Damien LeMoal, device-mapper development, Mike Snitzer

On 4/2/20 4:53 PM, John Dorminy wrote:
> I'm worried about hardcoding uuid members as u8[16].
> 
> May I ask why you're not using uuid_t to define it in the on-disk 
> structure? It would save the casting of the uuid members to (uuid_t *) 
> every time you use a uuid.h function.
> 
> Possibly it is customary to use only raw datatypes on disk rather than 
> opaque types like uuid_t, I'm not sure. But in that case, perhaps using 
> the named constant UUID_SIZE instead of 16 would make the meaning clearer?
> 
I prefer to use absolute types (like __u8) when describing the on-disk 
format.
When using opaque types like uuid_t there always is the risk that the 
internal representation will change, leading to an involuntary change of 
the on-disk format structure and subsequent compability issues.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-04-02 15:09       ` Hannes Reinecke
@ 2020-04-02 15:52         ` John Dorminy
  2020-04-02 20:01           ` John Dorminy
  0 siblings, 1 reply; 19+ messages in thread
From: John Dorminy @ 2020-04-02 15:52 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Bob Liu, device-mapper development, Damien LeMoal, Mike Snitzer


[-- Attachment #1.1: Type: text/plain, Size: 1368 bytes --]

That does make sense. May I request, then, using UUID_SIZE instead of 16?
Perhaps with a compile-time assertion that UUID_SIZE has not change from 16?

On Thu, Apr 2, 2020 at 11:10 AM Hannes Reinecke <hare@suse.de> wrote:

> On 4/2/20 4:53 PM, John Dorminy wrote:
> > I'm worried about hardcoding uuid members as u8[16].
> >
> > May I ask why you're not using uuid_t to define it in the on-disk
> > structure? It would save the casting of the uuid members to (uuid_t *)
> > every time you use a uuid.h function.
> >
> > Possibly it is customary to use only raw datatypes on disk rather than
> > opaque types like uuid_t, I'm not sure. But in that case, perhaps using
> > the named constant UUID_SIZE instead of 16 would make the meaning
> clearer?
> >
> I prefer to use absolute types (like __u8) when describing the on-disk
> format.
> When using opaque types like uuid_t there always is the risk that the
> internal representation will change, leading to an involuntary change of
> the on-disk format structure and subsequent compability issues.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke            Teamlead Storage & Networking
> hare@suse.de                               +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
>
>

[-- Attachment #1.2: Type: text/html, Size: 1817 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-04-02 15:52         ` John Dorminy
@ 2020-04-02 20:01           ` John Dorminy
  2020-04-03  5:47             ` Damien Le Moal
  0 siblings, 1 reply; 19+ messages in thread
From: John Dorminy @ 2020-04-02 20:01 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Bob Liu, device-mapper development, Damien LeMoal, Mike Snitzer


[-- Attachment #1.1: Type: text/plain, Size: 2130 bytes --]

Actually, I take that back. If it is a worry that the internal
representation will change, it seems also unsafe to be casting the sixteen
bytes as a uuid_t when copying into them with uuid_copy(). If the internal
representation changed to add a new field, then there would be a buffer
overrun when using uuid_copy(). If the internal representation changed the
order of the bytes, uuid_copy() would propagate that to the on-disk
structure and yield compatibility issues.

Perhaps it would be safer to store the UUIDs on disk as a string. Or add to
uuid.[ch] a function to get the raw bytes in a fixed order suitable for
storing on disk?

On Thu, Apr 2, 2020 at 11:52 AM John Dorminy <jdorminy@redhat.com> wrote:

> That does make sense. May I request, then, using UUID_SIZE instead of 16?
> Perhaps with a compile-time assertion that UUID_SIZE has not change from 16?
>
> On Thu, Apr 2, 2020 at 11:10 AM Hannes Reinecke <hare@suse.de> wrote:
>
>> On 4/2/20 4:53 PM, John Dorminy wrote:
>> > I'm worried about hardcoding uuid members as u8[16].
>> >
>> > May I ask why you're not using uuid_t to define it in the on-disk
>> > structure? It would save the casting of the uuid members to (uuid_t *)
>> > every time you use a uuid.h function.
>> >
>> > Possibly it is customary to use only raw datatypes on disk rather than
>> > opaque types like uuid_t, I'm not sure. But in that case, perhaps using
>> > the named constant UUID_SIZE instead of 16 would make the meaning
>> clearer?
>> >
>> I prefer to use absolute types (like __u8) when describing the on-disk
>> format.
>> When using opaque types like uuid_t there always is the risk that the
>> internal representation will change, leading to an involuntary change of
>> the on-disk format structure and subsequent compability issues.
>>
>> Cheers,
>>
>> Hannes
>> --
>> Dr. Hannes Reinecke            Teamlead Storage & Networking
>> hare@suse.de                               +49 911 74053 688
>> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
>> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
>>
>>

[-- Attachment #1.2: Type: text/html, Size: 2852 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/4] dm-zoned: V2 metadata handling
  2020-04-02 20:01           ` John Dorminy
@ 2020-04-03  5:47             ` Damien Le Moal
  0 siblings, 0 replies; 19+ messages in thread
From: Damien Le Moal @ 2020-04-03  5:47 UTC (permalink / raw)
  To: John Dorminy, Hannes Reinecke
  Cc: Bob Liu, device-mapper development, Mike Snitzer

On 2020/04/03 5:01, John Dorminy wrote:
> Actually, I take that back. If it is a worry that the internal representation
> will change, it seems also unsafe to be casting the sixteen bytes as a uuid_t
> when copying into them with uuid_copy(). If the internal representation changed
> to add a new field, then there would be a buffer overrun when using uuid_copy().
> If the internal representation changed the order of the bytes, uuid_copy() would
> propagate that to the on-disk structure and yield compatibility issues.
> 
> Perhaps it would be safer to store the UUIDs on disk as a string. Or add to
> uuid.[ch] a function to get the raw bytes in a fixed order suitable for storing
> on disk?

Please have a look at file systems on-disk metadata definition and checks. There
are plenty of examples of how UUIDs are handled. For instance, XFS has in
fs/xfs/libxfs/xfs_format.h:

typedef struct xfs_dsb {
	...
	uuid_t          sb_uuid;        /* user-visible file system unique id */
	...
}

For the on-disk super block UUID and the same for the in-memory version.
And you can see code like:
memcpy(&to->sb_uuid, &from->sb_uuid, sizeof(to->sb_uuid));
uuid_copy(&to->sb_meta_uuid, &from->sb_uuid);

and checks such as:

BUILD_BUG_ON(sizeof(geo->uuid) != sizeof(sbp->sb_uuid));

for some UUID declared with:

unsigned char   uuid[16];       /* unique id of the filesystem  */
(struct xfs_fsop_geom_v1 in fs/xfs/libxfs/xfs_fs.h.

On the other hand, f2fs defines its on-disk and in-memory super block UUID as:

__u8 uuid[16];                  /* 128-bit uuid for volume */

And copies it with:

memcpy(&sb->s_uuid, raw_super->uuid, sizeof(raw_super->uuid));

The general pattern is:
* For UUIDs defined as uuid_t, use uuid_copy() or memcpy()
* For UUIDs defined as an array of bytes or char, use memcpy()
* Add BUILD_BUG_ON() checks when mixing both definitions.

There is no need to go as far as using a string for the UUID.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-04-03  5:47 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-27  7:14 [PATCH RFC 0/4] dm-zoned: Metadata V2 Hannes Reinecke
2020-03-27  7:14 ` [PATCH 1/4] dm-zoned: store zone id within the zone structure Hannes Reinecke
2020-03-31  0:57   ` Damien Le Moal
2020-03-31  8:54     ` Hannes Reinecke
2020-03-27  7:14 ` [PATCH 2/4] dm-zoned: use array for superblock zones Hannes Reinecke
2020-03-31  0:51   ` Damien Le Moal
2020-03-31  9:10   ` Bob Liu
2020-03-27  7:14 ` [PATCH 3/4] dm-zoned: V2 metadata handling Hannes Reinecke
2020-03-31  0:54   ` Damien Le Moal
2020-03-31  9:11   ` Bob Liu
2020-04-02 14:53     ` John Dorminy
2020-04-02 15:09       ` Hannes Reinecke
2020-04-02 15:52         ` John Dorminy
2020-04-02 20:01           ` John Dorminy
2020-04-03  5:47             ` Damien Le Moal
2020-03-27  7:14 ` [PATCH 4/4] dm-zoned: allow for device size smaller than the capacity Hannes Reinecke
2020-03-31  0:49   ` Damien Le Moal
2020-03-31  8:53     ` Hannes Reinecke
2020-04-02  2:45       ` Damien Le Moal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.