* [PATCH v2 00/10] dm: zoned block device support
@ 2017-05-01 17:53 damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 01/10] dm-table: Introduce DM_TARGET_ZONED_HM feature damien.lemoal
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

This series introduces zoned block device support to the device mapper
infrastructure. The patches are as follows:

- Patch 1: Add a new target type feature flag to indicate that a target type
  supports host-managed zoned block devices. This prevents the use of these
  drives with the current target types, none of which implement the support
  needed to operate correctly with them.
- Patch 2: If a target device is a zoned block device, check that the range of
  LBAs mapped is aligned to the device zone size and that the device start
  offset also aligns to zone boundaries. This is necessary for zone reset and
  zone report to execute correctly.
- Patch 3: Check that the different target devices of a table have compatible
  zone sizes and models. This is necessary for target types that expose a zone
  model different from the underlying device.
- Patch 4: Fix handling of REQ_OP_ZONE_RESET bios
- Patch 5: Fix handling of REQ_OP_ZONE_REPORT bios
- Patch 6: Introduce a new helper function to reverse map a device zone report
  to the target LBA range
- Patch 7: Add support for host-managed zoned block devices to dm-flakey. This
  is necessary for testing file systems that natively support these drives
  (e.g. f2fs).
- Patch 8: Add support for zoned block devices to dm-linear. This can have
  useful applications during development and testing (e.g. allow creating
  smaller zoned devices with different combinations and positions of zones).
  There are also interesting applications for production, for instance, the
  ability to aggregate conventional zones of different drives to create a
  regular disk.
- Patch 9: Add sequential write enforcement to dm_kcopyd_copy so that
  sequential zones of a host-managed zoned block device can be specified as
  destinations.
- Patch 10: New dm-zoned target type (this was already sent for review twice).
  This resend adds modifications suggested by Hannes to implement reclaim
  using dm-kcopyd. dm-zoned depends on patch 9.

As always, comments and reviews are welcome.

Changes from v1:
- Use for-loop in patch 3 as suggested by Bart
- Add memory shrinker to dm-zoned to shrink the metadata block cache under
  memory pressure (suggested by Bart)
- Added Hannes' Reviewed-by tag

Damien Le Moal (10):
  dm-table: Introduce DM_TARGET_ZONED_HM feature
  dm-table: Check device area zone alignment
  dm-table: Check block devices zone model compatibility
  dm: Fix REQ_OP_ZONE_RESET bio handling
  dm: Fix REQ_OP_ZONE_REPORT bio handling
  dm: Introduce dm_remap_zone_report()
  dm-flakey: Add support for zoned block devices
  dm-linear: Add support for zoned block devices
  dm-kcopyd: Add sequential write feature
  dm-zoned: Drive-managed zoned block device target

 Documentation/device-mapper/dm-zoned.txt |  154 +++
 drivers/md/Kconfig                       |   19 +
 drivers/md/Makefile                      |    2 +
 drivers/md/dm-flakey.c                   |   21 +-
 drivers/md/dm-kcopyd.c                   |   68 +-
 drivers/md/dm-linear.c                   |   14 +-
 drivers/md/dm-table.c                    |  145 ++
 drivers/md/dm-zoned-io.c                 |  998 ++++++++++++++
 drivers/md/dm-zoned-metadata.c           | 2195 ++++++++++++++++++++++++++++++
 drivers/md/dm-zoned-reclaim.c            |  535 ++++++++
 drivers/md/dm-zoned.h                    |  528 +++++++
 drivers/md/dm.c                          |   93 +-
 include/linux/device-mapper.h            |   16 +
 include/linux/dm-kcopyd.h                |    1 +
 14 files changed, 4783 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/device-mapper/dm-zoned.txt
 create mode 100644 drivers/md/dm-zoned-io.c
 create mode 100644 drivers/md/dm-zoned-metadata.c
 create mode 100644 drivers/md/dm-zoned-reclaim.c
 create mode 100644 drivers/md/dm-zoned.h

-- 
2.9.3


* [PATCH v2 01/10] dm-table: Introduce DM_TARGET_ZONED_HM feature
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 02/10] dm-table: Check device area zone alignment damien.lemoal
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

The target drivers currently available will not operate correctly if a
table target maps onto a host-managed zoned block device.

To avoid problems, this patch introduces the new feature flag
DM_TARGET_ZONED_HM for a target driver to explicitly state that it
supports host-managed zoned block devices. This feature is checked
in dm_get_device() to prevent the addition to a table of a target
mapping to a host-managed zoned block device if the target type does
not have the feature enabled.

Note that as host-aware zoned block devices are backward compatible
with regular block devices, they can be used by any of the current
target types. This new feature is thus restricted to host-managed
zoned block devices.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm-table.c         | 23 +++++++++++++++++++++++
 include/linux/device-mapper.h |  6 ++++++
 2 files changed, 29 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 3ad16d9..06d3b7b 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -388,6 +388,24 @@ dev_t dm_get_dev_t(const char *path)
 EXPORT_SYMBOL_GPL(dm_get_dev_t);
 
 /*
+ * Check if the target supports host-managed zoned block devices.
+ */
+static bool device_supported(struct dm_target *ti, struct dm_dev *dev)
+{
+	struct block_device *bdev = dev->bdev;
+	char b[BDEVNAME_SIZE];
+
+	if (bdev_zoned_model(bdev) == BLK_ZONED_HM &&
+	    !dm_target_zoned_hm(ti->type)) {
+		DMWARN("%s: Unsupported host-managed zoned block device %s",
+		       dm_device_name(ti->table->md), bdevname(bdev, b));
+		return false;
+	}
+
+	return true;
+}
+
+/*
  * Add a device to the list, or just increment the usage count if
  * it's already present.
  */
@@ -426,6 +444,11 @@ int dm_get_device(struct dm_target *ti, const char *path, fmode_t mode,
 	}
 	atomic_inc(&dd->count);
 
+	if (!device_supported(ti, dd->dm_dev)) {
+		dm_put_device(ti, dd->dm_dev);
+		return -ENOTSUPP;
+	}
+
 	*result = dd->dm_dev;
 	return 0;
 }
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index a7e6903..b3c2408 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -214,6 +214,12 @@ struct target_type {
 #define dm_target_is_wildcard(type)	((type)->features & DM_TARGET_WILDCARD)
 
 /*
+ * Indicates that a target supports host-managed zoned block devices.
+ */
+#define DM_TARGET_ZONED_HM		0x00000010
+#define dm_target_zoned_hm(type)	((type)->features & DM_TARGET_ZONED_HM)
+
+/*
  * Some targets need to be sent the same WRITE bio severals times so
  * that they can send copies of it to different devices.  This function
  * examines any supplied bio and returns the number of copies of it the
-- 
2.9.3
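
The feature test that this patch adds to dm_get_device() can be illustrated in plain userspace C. This is a minimal sketch and not the kernel code: `enum zoned_model`, `struct target_type_sketch`, `TARGET_ZONED_HM` and `device_supported_sketch()` are hypothetical stand-ins for the kernel types and helpers used in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the block layer zone model enum. */
enum zoned_model { ZONED_NONE, ZONED_HA, ZONED_HM };

/* Same bit value as DM_TARGET_ZONED_HM in the patch. */
#define TARGET_ZONED_HM 0x00000010u

/* Hypothetical, reduced version of struct target_type. */
struct target_type_sketch {
	const char *name;
	uint64_t features;
};

/*
 * Only a host-managed device requires the target type to opt in.
 * Host-aware and regular devices are accepted by any target type.
 */
static bool device_supported_sketch(const struct target_type_sketch *type,
				    enum zoned_model model)
{
	assert(type);

	if (model == ZONED_HM && !(type->features & TARGET_ZONED_HM))
		return false;

	return true;
}
```

With this shape, a target type opts in by setting the feature bit, which is what the later dm-flakey and dm-linear patches in this series do.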


* [PATCH v2 02/10] dm-table: Check device area zone alignment
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 01/10] dm-table: Introduce DM_TARGET_ZONED_HM feature damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 03/10] dm-table: Check block devices zone model compatibility damien.lemoal
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

If a target maps to a zoned block device, check that the device area is
aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET
operations (resetting a partially mapped sequential zone would not be
possible). This also greatly simplifies the processing of zone reports
with REQ_OP_ZONE_REPORT bios.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm-table.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 06d3b7b..6947f0f 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -339,6 +339,33 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev,
 		return 1;
 	}
 
+	/*
+	 * If the target is mapped to a zoned block device, check
+	 * that the device zones are not partially mapped.
+	 */
+	if (bdev_zoned_model(bdev) != BLK_ZONED_NONE) {
+		unsigned int zone_sectors = bdev_zone_sectors(bdev);
+
+		if (start & (zone_sectors - 1)) {
+			DMWARN("%s: start=%llu not aligned to h/w "
+			       "zone size %u of %s",
+			       dm_device_name(ti->table->md),
+			       (unsigned long long)start,
+			       zone_sectors, bdevname(bdev, b));
+			return 1;
+		}
+
+		if (start + len < dev_size &&
+		    len & (zone_sectors - 1)) {
+			DMWARN("%s: len=%llu not aligned to h/w "
+			       "zone size %u of %s",
+			       dm_device_name(ti->table->md),
+			       (unsigned long long)len,
+			       zone_sectors, bdevname(bdev, b));
+			return 1;
+		}
+	}
+
 	return 0;
 }
 
-- 
2.9.3
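
The alignment test above relies on the zone size being a power of two, so that a mask can replace a modulo. A minimal userspace sketch of the same check follows. The function name `area_is_zone_aligned()` and the `sector_t` typedef are assumptions made for the illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* stand-in for the kernel type */

/*
 * Return true if [start, start + len) maps only whole zones.
 * zone_sectors must be a power of two. A range that runs up to the end
 * of the device may end in a smaller last zone, so its length is not
 * checked in that case.
 */
static bool area_is_zone_aligned(sector_t start, sector_t len,
				 sector_t dev_size, uint32_t zone_sectors)
{
	sector_t mask;

	assert(zone_sectors && !(zone_sectors & (zone_sectors - 1)));
	mask = (sector_t)zone_sectors - 1;

	if (start & mask)
		return false;

	if (start + len < dev_size && (len & mask))
		return false;

	return true;
}
```

For example, with 524288-sector zones, a mapping starting at sector 4096 is rejected, while a short mapping that ends exactly at the device capacity is accepted.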


* [PATCH v2 03/10] dm-table: Check block devices zone model compatibility
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 01/10] dm-table: Introduce DM_TARGET_ZONED_HM feature damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 02/10] dm-table: Check device area zone alignment damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 04/10] dm: Fix REQ_OP_ZONE_RESET bio handling damien.lemoal
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

When setting the dm device queue limits, several possibilities exist
for zoned block devices:
1) The dm target driver may want to expose a different zone model (e.g.
host-managed device emulation or regular block device on top of
host-managed zoned block devices)
2) Expose the underlying zone model of the devices as is

To allow both cases, the underlying block device zone model must be set
in the target limits in dm_set_device_limits() and the compatibility of
all devices checked similarly to the logical block size alignment. For
this last check, introduce the function validate_hardware_zone_model()
to check that all targets of a table have the same zone model and that
the zone sizes of the target devices are equal.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm-table.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 6947f0f..cc89a78 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -505,6 +505,8 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
 		       q->limits.alignment_offset,
 		       (unsigned long long) start << SECTOR_SHIFT);
 
+	limits->zoned = bdev_zoned_model(bdev);
+
 	return 0;
 }
 
@@ -720,6 +722,94 @@ static int validate_hardware_logical_block_alignment(struct dm_table *table,
 	return 0;
 }
 
+/*
+ * Check a device's table for compatibility between zoned devices used by
+ * the table targets. The zone model may come directly from a target block
+ * device or may have been set by the target using the io_hints method.
+ * Overall, if any of the table device targets is advertised as a zoned
+ * block device, then all target devices should also be advertised as
+ * using the same model, and the device zone sizes should all be equal.
+ */
+static int validate_hardware_zone_model(struct dm_table *table,
+					struct queue_limits *limits)
+{
+	struct dm_target *ti;
+	struct queue_limits ti_limits;
+	unsigned int zone_sectors = limits->chunk_sectors;
+	unsigned int num_targets = dm_table_get_num_targets(table);
+	int zone_model = -1;
+	unsigned int i;
+
+	if (!num_targets)
+		return 0;
+
+	/*
+	 * Check each entry in the table in turn.
+	 */
+	for (i = 0; i < num_targets; i++) {
+
+		ti = dm_table_get_target(table, i);
+
+		/* Get the target device limits */
+		blk_set_stacking_limits(&ti_limits);
+		if (ti->type->iterate_devices)
+			ti->type->iterate_devices(ti, dm_set_device_limits,
+						  &ti_limits);
+
+		/*
+		 * Let the target driver change the hardware limits, and
+		 * in particular the zone model if needed.
+		 */
+		if (ti->type->io_hints)
+			ti->type->io_hints(ti, &ti_limits);
+
+		/* Check zone model compatibility */
+		if (zone_model == -1)
+			zone_model = ti_limits.zoned;
+		if (ti_limits.zoned != zone_model) {
+			zone_model = -1;
+			break;
+		}
+
+		if (zone_model != BLK_ZONED_NONE) {
+			/* Check zone size validity and compatibility */
+			if (!zone_sectors ||
+			    !is_power_of_2(zone_sectors))
+				break;
+			if (ti_limits.chunk_sectors != zone_sectors) {
+				zone_sectors = ti_limits.chunk_sectors;
+				break;
+			}
+		}
+
+	}
+
+	if (i < num_targets) {
+		if (zone_model == -1)
+			DMWARN("%s: table line %u (start sect %llu len %llu) "
+			       "has an incompatible zone model",
+			       dm_device_name(table->md), i,
+			       (unsigned long long) ti->begin,
+			       (unsigned long long) ti->len);
+		else
+			DMWARN("%s: table line %u (start sect %llu len %llu) "
+			       "has an incompatible zone size %u",
+			       dm_device_name(table->md), i,
+			       (unsigned long long) ti->begin,
+			       (unsigned long long) ti->len,
+			       zone_sectors);
+		return -EINVAL;
+	}
+
+	if (zone_model == BLK_ZONED_HA ||
+	    zone_model == BLK_ZONED_HM) {
+		limits->zoned = zone_model;
+		limits->chunk_sectors = zone_sectors;
+	}
+
+	return 0;
+}
+
 int dm_table_add_target(struct dm_table *t, const char *type,
 			sector_t start, sector_t len, char *params)
 {
@@ -1432,6 +1522,9 @@ int dm_calculate_queue_limits(struct dm_table *table,
 			       (unsigned long long) ti->len);
 	}
 
+	if (validate_hardware_zone_model(table, limits))
+		return -EINVAL;
+
 	return validate_hardware_logical_block_alignment(table, limits);
 }
 
-- 
2.9.3
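
The compatibility rule enforced by validate_hardware_zone_model() can be sketched outside the kernel as a scan over per-target limits. This is a simplified sketch under stated assumptions: `struct target_limits` and `first_incompatible_target()` are hypothetical names, and the sketch compares every target against the first one, whereas the patch compares zone sizes against the chunk_sectors value of the table limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the block layer zone model enum. */
enum zoned_model { ZONED_NONE, ZONED_HA, ZONED_HM };

/* Hypothetical, reduced version of the per-target queue limits. */
struct target_limits {
	enum zoned_model zoned;
	unsigned int zone_sectors;
};

static bool is_pow2(unsigned int n)
{
	return n && !(n & (n - 1));
}

/*
 * Return -1 if all targets are compatible, otherwise the index of the
 * first target with a different zone model or an invalid or different
 * zone size.
 */
static int first_incompatible_target(const struct target_limits *t, size_t n)
{
	size_t i;

	assert(t || n == 0);

	for (i = 0; i < n; i++) {
		if (t[i].zoned != t[0].zoned)
			return (int)i;
		if (t[i].zoned == ZONED_NONE)
			continue;
		if (!is_pow2(t[i].zone_sectors) ||
		    t[i].zone_sectors != t[0].zone_sectors)
			return (int)i;
	}

	return -1;
}
```

The returned index plays the role of the table line number printed by the DMWARN() calls in the patch.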


* [PATCH v2 04/10] dm: Fix REQ_OP_ZONE_RESET bio handling
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
                   ` (2 preceding siblings ...)
  2017-05-01 17:53 ` [PATCH v2 03/10] dm-table: Check block devices zone model compatibility damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 05/10] dm: Fix REQ_OP_ZONE_REPORT " damien.lemoal
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

The REQ_OP_ZONE_RESET bio has no payload and zero sectors. Its position
is the only information used to indicate the zone to reset on the
device. Because of its zero length, the non-flush case in
__split_and_process_bio() neither clones this bio nor sends it to the
target. Add an additional case in that function to call
__split_and_process_non_flush() without checking the clone info size.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index dfb7597..1d98035 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1318,6 +1318,10 @@ static void __split_and_process_bio(struct mapped_device *md,
 		ci.sector_count = 0;
 		error = __send_empty_flush(&ci);
 		/* dec_pending submits any data associated with flush */
+	} else if (bio_op(bio) == REQ_OP_ZONE_RESET) {
+		ci.bio = bio;
+		ci.sector_count = 0;
+		error = __split_and_process_non_flush(&ci);
 	} else {
 		ci.bio = bio;
 		ci.sector_count = bio_sectors(bio);
-- 
2.9.3


* [PATCH v2 05/10] dm: Fix REQ_OP_ZONE_REPORT bio handling
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
                   ` (3 preceding siblings ...)
  2017-05-01 17:53 ` [PATCH v2 04/10] dm: Fix REQ_OP_ZONE_RESET bio handling damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 06/10] dm: Introduce dm_remap_zone_report() damien.lemoal
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

A REQ_OP_ZONE_REPORT bio is not a medium access command. Its number of
sectors indicates the maximum size allowed for the report reply, not a
number of sectors accessed on the device.
REQ_OP_ZONE_REPORT bios should thus not be split depending on the
target device maximum I/O length but passed as is. Note that it is the
responsibility of the target to remap and format the report reply.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1d98035..cd44928 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1098,7 +1098,8 @@ static int clone_bio(struct dm_target_io *tio, struct bio *bio,
 			return r;
 	}
 
-	bio_advance(clone, to_bytes(sector - clone->bi_iter.bi_sector));
+	if (bio_op(bio) != REQ_OP_ZONE_REPORT)
+		bio_advance(clone, to_bytes(sector - clone->bi_iter.bi_sector));
 	clone->bi_iter.bi_size = to_bytes(len);
 
 	if (bio_integrity(bio))
@@ -1275,7 +1276,11 @@ static int __split_and_process_non_flush(struct clone_info *ci)
 	if (!dm_target_is_valid(ti))
 		return -EIO;
 
-	len = min_t(sector_t, max_io_len(ci->sector, ti), ci->sector_count);
+	if (bio_op(bio) == REQ_OP_ZONE_REPORT)
+		len = ci->sector_count;
+	else
+		len = min_t(sector_t, max_io_len(ci->sector, ti),
+			    ci->sector_count);
 
 	r = __clone_and_map_data_bio(ci, ti, ci->sector, &len);
 	if (r < 0)
-- 
2.9.3
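
The length selection changed in __split_and_process_non_flush() reduces to a small decision, sketched here in userspace C. The names `enum bio_op_sketch` and `clone_len()` are hypothetical and only mirror the logic of the hunk above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* stand-in for the kernel type */

/* Hypothetical, reduced set of bio operations. */
enum bio_op_sketch { OP_READ, OP_WRITE, OP_ZONE_REPORT };

/*
 * Return the number of sectors to give to the clone. A zone report
 * carries a reply buffer size, not a media range, so it is never
 * limited by the target maximum I/O length.
 */
static sector_t clone_len(enum bio_op_sketch op, sector_t sector_count,
			  sector_t max_io_len)
{
	assert(max_io_len > 0);

	if (op == OP_ZONE_REPORT)
		return sector_count;

	return sector_count < max_io_len ? sector_count : max_io_len;
}
```

A 2048-sector read against a 256-sector limit is clamped to 256 sectors, while a zone report with a 2048-sector reply buffer keeps its full length.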


* [PATCH v2 06/10] dm: Introduce dm_remap_zone_report()
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
                   ` (4 preceding siblings ...)
  2017-05-01 17:53 ` [PATCH v2 05/10] dm: Fix REQ_OP_ZONE_REPORT " damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 07/10] dm-flakey: Add support for zoned block devices damien.lemoal
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

A target driver that supports zoned block devices and exposes them as
such may receive REQ_OP_ZONE_REPORT requests from a user trying to
determine the mapped device zone configuration. To process such requests
properly, the target driver may need to remap the zone descriptors
provided in the report reply. The helper function dm_remap_zone_report()
does this generically using only the target start offset and length and
the start offset within the target device.

dm_remap_zone_report() will remap the start sector of all zones
reported. If the report includes sequential zones, the write pointer
position of these zones will also be remapped.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm.c               | 80 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/device-mapper.h | 10 ++++++
 2 files changed, 90 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index cd44928..1f6558e 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -975,6 +975,86 @@ void dm_accept_partial_bio(struct bio *bio, unsigned n_sectors)
 }
 EXPORT_SYMBOL_GPL(dm_accept_partial_bio);
 
+#ifdef CONFIG_BLK_DEV_ZONED
+/*
+ * The zone descriptors obtained with a zone report indicate
+ * zone positions within the target device. The zone descriptors
+ * must be remapped to match their position within the dm device.
+ * A target may call dm_remap_zone_report after completion of a
+ * REQ_OP_ZONE_REPORT bio to remap the zone descriptors obtained
+ * from the target device mapping to the dm device.
+ */
+void dm_remap_zone_report(struct dm_target *ti, struct bio *bio, sector_t start)
+{
+	struct dm_target_io *tio =
+		container_of(bio, struct dm_target_io, clone);
+	struct bio *report_bio = tio->io->bio;
+	struct blk_zone_report_hdr *hdr = NULL;
+	struct blk_zone *zone;
+	unsigned int nr_rep = 0;
+	unsigned int ofst;
+	struct bio_vec bvec;
+	struct bvec_iter iter;
+	void *addr;
+
+	if (bio->bi_error)
+		return;
+
+	/*
+	 * Remap the start sector of the reported zones. For sequential zones,
+	 * also remap the write pointer position.
+	 */
+	bio_for_each_segment(bvec, report_bio, iter) {
+
+		addr = kmap_atomic(bvec.bv_page);
+
+		/* Remember the report header in the first page */
+		if (!hdr) {
+			hdr = addr;
+			ofst = sizeof(struct blk_zone_report_hdr);
+		} else {
+			ofst = 0;
+		}
+
+		/* Set zones start sector */
+		while (hdr->nr_zones && ofst < bvec.bv_len) {
+			zone = addr + ofst;
+			if (zone->start >= start + ti->len) {
+				hdr->nr_zones = 0;
+				break;
+			}
+			zone->start = zone->start + ti->begin - start;
+			if (zone->type != BLK_ZONE_TYPE_CONVENTIONAL) {
+				if (zone->cond == BLK_ZONE_COND_FULL)
+					zone->wp = zone->start + zone->len;
+				else if (zone->cond == BLK_ZONE_COND_EMPTY)
+					zone->wp = zone->start;
+				else
+					zone->wp = zone->wp + ti->begin - start;
+			}
+			ofst += sizeof(struct blk_zone);
+			hdr->nr_zones--;
+			nr_rep++;
+		}
+
+		if (addr != hdr)
+			kunmap_atomic(addr);
+
+		if (!hdr->nr_zones)
+			break;
+
+	}
+
+	if (hdr) {
+		hdr->nr_zones = nr_rep;
+		kunmap_atomic(hdr);
+	}
+
+	bio_advance(report_bio, report_bio->bi_iter.bi_size);
+}
+EXPORT_SYMBOL_GPL(dm_remap_zone_report);
+#endif
+
 /*
  * Flush current->bio_list when the target map method blocks.
  * This fixes deadlocks in snapshot and possibly in other targets.
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index b3c2408..d21c761 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -433,6 +433,16 @@ struct gendisk *dm_disk(struct mapped_device *md);
 int dm_suspended(struct dm_target *ti);
 int dm_noflush_suspending(struct dm_target *ti);
 void dm_accept_partial_bio(struct bio *bio, unsigned n_sectors);
+#ifdef CONFIG_BLK_DEV_ZONED
+void dm_remap_zone_report(struct dm_target *ti, struct bio *bio,
+			  sector_t start);
+#else
+static inline void dm_remap_zone_report(struct dm_target *ti, struct bio *bio,
+					sector_t start)
+{
+	bio->bi_error = -ENOTSUPP;
+}
+#endif
 union map_info *dm_get_rq_mapinfo(struct request *rq);
 
 struct queue_limits *dm_get_queue_limits(struct mapped_device *md);
-- 
2.9.3
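
The descriptor arithmetic of dm_remap_zone_report() can be shown without the bio and page mapping machinery. The following is a minimal sketch operating on a flat array. `struct zone_sketch`, the two enums and `remap_zones()` are hypothetical simplifications of struct blk_zone and the helper above, and the sketch assumes the report starts at or after `dev_start`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* stand-in for the kernel type */

enum zone_type { ZONE_CONVENTIONAL, ZONE_SEQ };
enum zone_cond { COND_EMPTY, COND_OPEN, COND_FULL };

/* Hypothetical, reduced version of struct blk_zone. */
struct zone_sketch {
	sector_t start, len, wp;
	enum zone_type type;
	enum zone_cond cond;
};

/*
 * Remap zone descriptors from device sectors to dm device sectors.
 * dev_start is the start of the mapped area on the underlying device,
 * tgt_begin and tgt_len describe the target within the dm device.
 * Returns the number of zones that belong to the target.
 */
static size_t remap_zones(struct zone_sketch *z, size_t nr,
			  sector_t dev_start, sector_t tgt_begin,
			  sector_t tgt_len)
{
	size_t i;

	assert(z || nr == 0);

	for (i = 0; i < nr; i++) {
		/* Zones past the end of the target are dropped */
		if (z[i].start >= dev_start + tgt_len)
			break;

		z[i].start = z[i].start + tgt_begin - dev_start;
		if (z[i].type == ZONE_CONVENTIONAL)
			continue;

		if (z[i].cond == COND_FULL)
			z[i].wp = z[i].start + z[i].len;
		else if (z[i].cond == COND_EMPTY)
			z[i].wp = z[i].start;
		else
			z[i].wp = z[i].wp + tgt_begin - dev_start;
	}

	return i;
}
```

The return value corresponds to the nr_zones count that the helper writes back into the report header.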


* [PATCH v2 07/10] dm-flakey: Add support for zoned block devices
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
                   ` (5 preceding siblings ...)
  2017-05-01 17:53 ` [PATCH v2 06/10] dm: Introduce dm_remap_zone_report() damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 08/10] dm-linear: " damien.lemoal
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

With the development of file system support for zoned block devices
(e.g. f2fs), dm-flakey support for these devices is useful for
testing.

This patch adds support for zoned block devices in dm-flakey, both
host-aware and host-managed. The target type feature is set to
DM_TARGET_ZONED_HM to indicate support for host-managed models. The
rest of the support adds hooks for remapping of REQ_OP_ZONE_RESET
and REQ_OP_ZONE_REPORT bios. Additionally, in the bio completion path,
(backward) remapping of a zone report reply is also added.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm-flakey.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index 13305a1..b419c85 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -251,6 +251,8 @@ static int flakey_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	return 0;
 
 bad:
+	if (fc->dev)
+		dm_put_device(ti, fc->dev);
 	kfree(fc);
 	return r;
 }
@@ -275,7 +277,7 @@ static void flakey_map_bio(struct dm_target *ti, struct bio *bio)
 	struct flakey_c *fc = ti->private;
 
 	bio->bi_bdev = fc->dev->bdev;
-	if (bio_sectors(bio))
+	if (bio_sectors(bio) || bio_op(bio) == REQ_OP_ZONE_RESET)
 		bio->bi_iter.bi_sector =
 			flakey_map_sector(ti, bio->bi_iter.bi_sector);
 }
@@ -306,6 +308,14 @@ static int flakey_map(struct dm_target *ti, struct bio *bio)
 	struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct per_bio_data));
 	pb->bio_submitted = false;
 
+	/* Do not fail reset zone */
+	if (bio_op(bio) == REQ_OP_ZONE_RESET)
+		goto map_bio;
+
+	/* Zone reports are remapped in flakey_end_io(), so do not fail them */
+	if (bio_op(bio) == REQ_OP_ZONE_REPORT)
+		goto map_bio;
+
 	/* Are we alive ? */
 	elapsed = (jiffies - fc->start_time) / HZ;
 	if (elapsed % (fc->up_interval + fc->down_interval) >= fc->up_interval) {
@@ -363,6 +373,14 @@ static int flakey_end_io(struct dm_target *ti, struct bio *bio, int error)
 	struct flakey_c *fc = ti->private;
 	struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct per_bio_data));
 
+	if (bio_op(bio) == REQ_OP_ZONE_RESET)
+		return error;
+
+	if (bio_op(bio) == REQ_OP_ZONE_REPORT) {
+		dm_remap_zone_report(ti, bio, fc->start);
+		return error;
+	}
+
 	if (!error && pb->bio_submitted && (bio_data_dir(bio) == READ)) {
 		if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
 		    all_corrupt_bio_flags_match(bio, fc)) {
@@ -446,6 +464,7 @@ static int flakey_iterate_devices(struct dm_target *ti, iterate_devices_callout_
 static struct target_type flakey_target = {
 	.name   = "flakey",
 	.version = {1, 4, 0},
+	.features = DM_TARGET_ZONED_HM,
 	.module = THIS_MODULE,
 	.ctr    = flakey_ctr,
 	.dtr    = flakey_dtr,
-- 
2.9.3


* [PATCH v2 08/10] dm-linear: Add support for zoned block devices
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
                   ` (6 preceding siblings ...)
  2017-05-01 17:53 ` [PATCH v2 07/10] dm-flakey: Add support for zoned block devices damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
  2017-05-01 17:53 ` [PATCH v2 09/10] dm-kcopyd: Add sequential write feature damien.lemoal
       [not found] ` <20170501175314.10922-11-damien.lemoal@wdc.com>
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

Add support for zoned block devices by allowing targets to map
host-managed zoned block devices, remapping REQ_OP_ZONE_RESET bios and
post-processing (remapping the reply of) REQ_OP_ZONE_REPORT bios.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/md/dm-linear.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 4788b0b..9c4debd 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -87,7 +87,7 @@ static void linear_map_bio(struct dm_target *ti, struct bio *bio)
 	struct linear_c *lc = ti->private;
 
 	bio->bi_bdev = lc->dev->bdev;
-	if (bio_sectors(bio))
+	if (bio_sectors(bio) || bio_op(bio) == REQ_OP_ZONE_RESET)
 		bio->bi_iter.bi_sector =
 			linear_map_sector(ti, bio->bi_iter.bi_sector);
 }
@@ -99,6 +99,16 @@ static int linear_map(struct dm_target *ti, struct bio *bio)
 	return DM_MAPIO_REMAPPED;
 }
 
+static int linear_end_io(struct dm_target *ti, struct bio *bio, int error)
+{
+	struct linear_c *lc = ti->private;
+
+	if (!error && bio_op(bio) == REQ_OP_ZONE_REPORT)
+		dm_remap_zone_report(ti, bio, lc->start);
+
+	return error;
+}
+
 static void linear_status(struct dm_target *ti, status_type_t type,
 			  unsigned status_flags, char *result, unsigned maxlen)
 {
@@ -162,10 +172,12 @@ static long linear_direct_access(struct dm_target *ti, sector_t sector,
 static struct target_type linear_target = {
 	.name   = "linear",
 	.version = {1, 3, 0},
+	.features = DM_TARGET_ZONED_HM,
 	.module = THIS_MODULE,
 	.ctr    = linear_ctr,
 	.dtr    = linear_dtr,
 	.map    = linear_map,
+	.end_io = linear_end_io,
 	.status = linear_status,
 	.prepare_ioctl = linear_prepare_ioctl,
 	.iterate_devices = linear_iterate_devices,
-- 
2.9.3
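
The one-line change to linear_map_bio() matters because a zone reset carries no sectors but still addresses a zone by position. A minimal userspace sketch of that decision follows. `enum bio_op_sketch` and `linear_map_bio_sector()` are hypothetical names, and the mapping formula is an assumption that mirrors a linear target: device start offset plus the offset of the sector within the target.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* stand-in for the kernel type */

/* Hypothetical, reduced set of bio operations. */
enum bio_op_sketch { OP_WRITE, OP_FLUSH, OP_ZONE_RESET };

/*
 * Return the sector to use on the underlying device. Bios with data
 * are remapped, and so is a zone reset even though it has zero
 * sectors. Other empty bios (e.g. a flush) keep their sector because
 * their position carries no meaning.
 */
static sector_t linear_map_bio_sector(enum bio_op_sketch op, sector_t sector,
				      sector_t nr_sectors, sector_t tgt_begin,
				      sector_t dev_start)
{
	if (nr_sectors || op == OP_ZONE_RESET) {
		assert(sector >= tgt_begin);
		return dev_start + (sector - tgt_begin);
	}

	return sector;
}
```

Without the extra condition, a reset aimed at the second zone of the dm device would reach the underlying device with an unmapped sector and reset the wrong zone.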


* [PATCH v2 09/10] dm-kcopyd: Add sequential write feature
  2017-05-01 17:53 [PATCH v2 00/10] dm: zoned block device support damien.lemoal
                   ` (7 preceding siblings ...)
  2017-05-01 17:53 ` [PATCH v2 08/10] dm-linear: " damien.lemoal
@ 2017-05-01 17:53 ` damien.lemoal
       [not found] ` <20170501175314.10922-11-damien.lemoal@wdc.com>
  9 siblings, 0 replies; 12+ messages in thread
From: damien.lemoal @ 2017-05-01 17:53 UTC (permalink / raw)
  To: dm-devel, Mike Snitzer, Alasdair Kergon
  Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche, linux-block,
	Damien Le Moal

From: Damien Le Moal <damien.lemoal@wdc.com>

When copying blocks to host-managed zoned block devices, writes must be
sequential. dm_kcopyd_copy() does not, however, guarantee this, as writes
are issued in the completion order of reads, and reads may complete out
of order despite being issued sequentially.

Fix this by introducing the DM_KCOPYD_WRITE_SEQ flag. This can be
specified by the user when calling dm_kcopyd_copy() and is set
automatically if one of the destinations is a host-managed zoned block
device. For a split job, the master job maintains the write position at
which writes must be issued. This is checked with the pop() function,
which is modified to not return any write I/O sub job that is not at the
correct write position.

When DM_KCOPYD_WRITE_SEQ is specified for a job, errors cannot be
ignored and the flag DM_KCOPYD_IGNORE_ERROR is ignored, even if
specified by the user.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
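The ordering rule described in the commit message can be illustrated with a minimal user-space sketch. It assumes a plain array in place of the kernel job list and takes no lock; struct sketch_job and sketch_pop_seq() are hypothetical names, and only the offset comparison mirrors pop_io_job() in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kcopyd write sub job. */
struct sketch_job {
	unsigned long write_ofst;	/* offset of this sub job within the copy */
	unsigned long count;		/* number of sectors it writes */
	int queued;			/* non-zero while waiting to be issued */
};

/*
 * Return the index of the queued job that starts at *next_ofst and advance
 * *next_ofst past it, or return -1 if that job has not been queued yet.
 */
int sketch_pop_seq(struct sketch_job *jobs, size_t nr,
		   unsigned long *next_ofst)
{
	size_t i;

	assert(jobs != NULL && next_ofst != NULL);

	for (i = 0; i < nr; i++) {
		/* Skip jobs already issued or not at the write position. */
		if (!jobs[i].queued || jobs[i].write_ofst != *next_ofst)
			continue;
		jobs[i].queued = 0;
		*next_ofst += jobs[i].count;
		return (int)i;
	}

	return -1;
}
```

If the reads for offsets 16 and 0 complete before the read for offset 8, the sketch issues the write at 0, then returns -1 until the job at 8 is queued, and only then issues 8 followed by 16.
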
 drivers/md/dm-kcopyd.c    | 68 +++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/dm-kcopyd.h |  1 +
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index 9e9d04cb..477846e 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -356,6 +356,7 @@ struct kcopyd_job {
 	struct mutex lock;
 	atomic_t sub_jobs;
 	sector_t progress;
+	sector_t write_ofst;
 
 	struct kcopyd_job *master_job;
 };
@@ -386,6 +387,34 @@ void dm_kcopyd_exit(void)
  * Functions to push and pop a job onto the head of a given job
  * list.
  */
+static struct kcopyd_job *pop_io_job(struct list_head *jobs,
+				     struct dm_kcopyd_client *kc)
+{
+	struct kcopyd_job *job;
+
+	/*
+	 * For I/O jobs, pop any read, any write without sequential write
+	 * constraint and sequential writes that are at the right position.
+	 */
+	list_for_each_entry(job, jobs, list) {
+
+		if (job->rw == READ ||
+		    !test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags)) {
+			list_del(&job->list);
+			return job;
+		}
+
+		if (job->write_ofst == job->master_job->write_ofst) {
+			job->master_job->write_ofst += job->source.count;
+			list_del(&job->list);
+			return job;
+		}
+
+	}
+
+	return NULL;
+}
+
 static struct kcopyd_job *pop(struct list_head *jobs,
 			      struct dm_kcopyd_client *kc)
 {
@@ -395,8 +424,12 @@ static struct kcopyd_job *pop(struct list_head *jobs,
 	spin_lock_irqsave(&kc->job_lock, flags);
 
 	if (!list_empty(jobs)) {
-		job = list_entry(jobs->next, struct kcopyd_job, list);
-		list_del(&job->list);
+		if (jobs == &kc->io_jobs) {
+			job = pop_io_job(jobs, kc);
+		} else {
+			job = list_entry(jobs->next, struct kcopyd_job, list);
+			list_del(&job->list);
+		}
 	}
 	spin_unlock_irqrestore(&kc->job_lock, flags);
 
@@ -506,6 +539,14 @@ static int run_io_job(struct kcopyd_job *job)
 		.client = job->kc->io_client,
 	};
 
+	/*
+	 * If we need to write sequentially and some reads or writes failed,
+	 * no point in continuing.
+	 */
+	if (test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags) &&
+	    job->master_job->write_err)
+		return -EIO;
+
 	io_job_start(job->kc->throttle);
 
 	if (job->rw == READ)
@@ -655,6 +696,7 @@ static void segment_complete(int read_err, unsigned long write_err,
 		int i;
 
 		*sub_job = *job;
+		sub_job->write_ofst = progress;
 		sub_job->source.sector += progress;
 		sub_job->source.count = count;
 
@@ -723,6 +765,27 @@ int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
 	job->num_dests = num_dests;
 	memcpy(&job->dests, dests, sizeof(*dests) * num_dests);
 
+	/*
+	 * If one of the destinations is a host-managed zoned block device,
+	 * we need to write sequentially. If one of the destinations is a
+	 * host-aware device, then leave it to the caller to choose what to do.
+	 */
+	if (!test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags)) {
+		for (i = 0; i < job->num_dests; i++) {
+			if (bdev_zoned_model(dests[i].bdev) == BLK_ZONED_HM) {
+				set_bit(DM_KCOPYD_WRITE_SEQ, &job->flags);
+				break;
+			}
+		}
+	}
+
+	/*
+	 * If we need to write sequentially, errors cannot be ignored.
+	 */
+	if (test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags) &&
+	    test_bit(DM_KCOPYD_IGNORE_ERROR, &job->flags))
+		clear_bit(DM_KCOPYD_IGNORE_ERROR, &job->flags);
+
 	if (from) {
 		job->source = *from;
 		job->pages = NULL;
@@ -746,6 +809,7 @@ int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
 	job->fn = fn;
 	job->context = context;
 	job->master_job = job;
+	job->write_ofst = 0;
 
 	if (job->source.count <= SUB_JOB_SIZE)
 		dispatch_job(job);
diff --git a/include/linux/dm-kcopyd.h b/include/linux/dm-kcopyd.h
index f486d63..cfac858 100644
--- a/include/linux/dm-kcopyd.h
+++ b/include/linux/dm-kcopyd.h
@@ -20,6 +20,7 @@
 #define DM_KCOPYD_MAX_REGIONS 8
 
 #define DM_KCOPYD_IGNORE_ERROR 1
+#define DM_KCOPYD_WRITE_SEQ    2
 
 struct dm_kcopyd_throttle {
 	unsigned throttle;
-- 
2.9.3


* Re: [PATCH v2 10/10] dm-zoned: Drive-managed zoned block device target
@ 2017-05-02 21:53     ` Bart Van Assche
  0 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2017-05-02 21:53 UTC (permalink / raw)
  To: dm-devel, agk, Damien Le Moal, snitzer; +Cc: hch, hare, linux-block

On Tue, 2017-05-02 at 02:53 +0900, damien.lemoal@wdc.com wrote:
> +static unsigned long dmz_mblock_shrinker_count(struct shrinker *shrink,
> +                                              struct shrink_control *sc)
> +{
> +       struct dmz_target *dmz =
> +               container_of(shrink, struct dmz_target, mblk_shrinker);
> +
> +       return atomic_read(&dmz->nr_mblks);
> +}

Hello Damien,

dmz_mblock_shrinker_count() probably should return the following value since
dmz_shrink_mblock_cache() won't free more than this number of elements:

        max(atomic_read(&dmz->nr_mblks) - dmz->min_nr_mblks, 0) 

But since v2 is IMHO good enough to be merged, for the whole series:

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
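
A minimal user-space sketch of the clamped count suggested above. The types are assumptions, since the declarations of nr_mblks and min_nr_mblks are not shown in this thread. If min_nr_mblks is unsigned, a plain subtraction would wrap rather than go negative, so the sketch compares before subtracting. The function name is hypothetical.

```c
#include <assert.h>

/*
 * Number of cached metadata blocks above the minimum the cache retains.
 * The parameters stand in for atomic_read(&dmz->nr_mblks) and
 * dmz->min_nr_mblks; their real types are assumed here.
 */
unsigned long sketch_reclaimable_mblks(int nr_mblks, unsigned int min_nr_mblks)
{
	assert(nr_mblks >= 0);

	/* Compare before subtracting so the unsigned result cannot wrap. */
	if ((unsigned int)nr_mblks <= min_nr_mblks)
		return 0;

	return (unsigned long)nr_mblks - min_nr_mblks;
}
```

With 10 cached blocks and a minimum of 4, the sketch reports 6 reclaimable blocks; at or below the minimum it reports 0.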
