All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] dm snapshot: add optional discard support features
@ 2019-07-03 16:25 Mike Snitzer
  2019-07-05 16:03 ` Nikos Tsironis
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2019-07-03 16:25 UTC (permalink / raw)
  To: dm-devel; +Cc: John Dorminy

Add 2 _optional_ features to the snapshot target:

discard_zeroes_cow - a discard issued to the snapshot device that maps
to entire chunks to will zero the corresponding exception(s) in the
snapshot's exception store.

discard_passdown_origin - a discard to the snapshot device is passed down
to the snapshot-origin's underlying device.  This doesn't cause copy-out
to the snapshot exception store because the snapshot-origin target is
bypassed.

The discard_passdown_origin feature depends on the discard_zeroes_cow
feature being enabled.

When these 2 features are enabled they allow a temporarily read-only
device that has completely exhausted its free space to recover space.
To do so dm-snapshot provides temporary buffer to accommodate writes
that the temporarily read-only device cannot handle yet.  Once the upper
layer frees space (e.g. fstrim to XFS) the discards issued to the
dm-snapshot target will be issued to underlying read-only device whose
free space was exhausted.  In addition those discards will also cause
zeroes to be written to the snapshot exception store if corresponding
exceptions exist.  If the underlying origin device provides
deduplication for zero blocks then if/when the snapshot is merged backed
to the origin those blocks will become unused.  Once the origin has
gained adequate space, merging the snapshot back to the thinly
provisioned device will permit continued use of that device without the
temporary space provided by the snapshot.

Requested-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 Documentation/device-mapper/snapshot.txt |  16 +++
 drivers/md/dm-snap.c                     | 168 +++++++++++++++++++++++++++----
 2 files changed, 167 insertions(+), 17 deletions(-)

diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.txt
index b8bbb516f989..1810833f6dc6 100644
--- a/Documentation/device-mapper/snapshot.txt
+++ b/Documentation/device-mapper/snapshot.txt
@@ -31,6 +31,7 @@ its visible content unchanged, at least until the <COW device> fills up.
 
 
 *) snapshot <origin> <COW device> <persistent?> <chunksize>
+   [<# feature args> [<arg>]*]
 
 A snapshot of the <origin> block device is created. Changed chunks of
 <chunksize> sectors will be stored on the <COW device>.  Writes will
@@ -53,8 +54,23 @@ When loading or unloading the snapshot target, the corresponding
 snapshot-origin or snapshot-merge target must be suspended. A failure to
 suspend the origin target could result in data corruption.
 
+Optional features:
+
+   discard_zeroes_cow - a discard issued to the snapshot device that
+   maps to entire chunks to will zero the corresponding exception(s) in
+   the snapshot's exception store.
+
+   discard_passdown_origin - a discard to the snapshot device is passed
+   down to the snapshot-origin's underlying device.  This doesn't cause
+   copy-out to the snapshot exception store because the snapshot-origin
+   target is bypassed.
+
+   The discard_passdown_origin feature depends on the discard_zeroes_cow
+   feature being enabled.
+
 
 * snapshot-merge <origin> <COW device> <persistent> <chunksize>
+  [<# feature args> [<arg>]*]
 
 takes the same table arguments as the snapshot target except it only
 works with persistent snapshots.  This target assumes the role of the
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index 3107f2b1988b..e894302619dd 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -134,7 +134,10 @@ struct dm_snapshot {
 	 * - I/O error while merging
 	 *	=> stop merging; set merge_failed; process I/O normally.
 	 */
-	int merge_failed;
+	bool merge_failed:1;
+
+	bool discard_zeroes_cow:1;
+	bool discard_passdown_origin:1;
 
 	/*
 	 * Incoming bios that overlap with chunks being merged must wait
@@ -1173,12 +1176,64 @@ static void stop_merge(struct dm_snapshot *s)
 	clear_bit(SHUTDOWN_MERGE, &s->state_bits);
 }
 
+static int parse_snapshot_features(struct dm_arg_set *as, struct dm_snapshot *s,
+				   struct dm_target *ti)
+{
+	int r;
+	unsigned argc;
+	const char *arg_name;
+
+	static const struct dm_arg _args[] = {
+		{0, 2, "Invalid number of feature arguments"},
+	};
+
+	/*
+	 * No feature arguments supplied.
+	 */
+	if (!as->argc)
+		return 0;
+
+	r = dm_read_arg_group(_args, as, &argc, &ti->error);
+	if (r)
+		return -EINVAL;
+
+	while (argc && !r) {
+		arg_name = dm_shift_arg(as);
+		argc--;
+
+		if (!strcasecmp(arg_name, "discard_zeroes_cow"))
+			s->discard_zeroes_cow = true;
+
+		else if (!strcasecmp(arg_name, "discard_passdown_origin"))
+			s->discard_passdown_origin = true;
+
+		else {
+			ti->error = "Unrecognised feature requested";
+			r = -EINVAL;
+			break;
+		}
+	}
+
+	if (!s->discard_zeroes_cow && s->discard_passdown_origin) {
+		/*
+		 * TODO: really these are disjoint.. but ti->num_discard_bios
+		 * and dm_bio_get_target_bio_nr() require rigid constraints.
+		 */
+		ti->error = "discard_passdown_origin feature depends on discard_zeroes_cow";
+		r = -EINVAL;
+	}
+
+	return r;
+}
+
 /*
- * Construct a snapshot mapping: <origin_dev> <COW-dev> <p|po|n> <chunk-size>
+ * Construct a snapshot mapping:
+ * <origin_dev> <COW-dev> <p|po|n> <chunk-size> [<# feature args> [<arg>]*]
  */
 static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 {
 	struct dm_snapshot *s;
+	struct dm_arg_set as;
 	int i;
 	int r = -EINVAL;
 	char *origin_path, *cow_path;
@@ -1186,8 +1241,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	unsigned args_used, num_flush_bios = 1;
 	fmode_t origin_mode = FMODE_READ;
 
-	if (argc != 4) {
-		ti->error = "requires exactly 4 arguments";
+	if (argc < 4) {
+		ti->error = "requires 4 or more arguments";
 		r = -EINVAL;
 		goto bad;
 	}
@@ -1204,6 +1259,13 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		goto bad;
 	}
 
+	as.argc = argc;
+	as.argv = argv;
+	dm_consume_args(&as, 4);
+	r = parse_snapshot_features(&as, s, ti);
+	if (r)
+		goto bad_features;
+
 	origin_path = argv[0];
 	argv++;
 	argc--;
@@ -1289,6 +1351,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 	ti->private = s;
 	ti->num_flush_bios = num_flush_bios;
+	if (s->discard_zeroes_cow)
+		ti->num_discard_bios = (s->discard_passdown_origin ? 2 : 1);
 	ti->per_io_data_size = sizeof(struct dm_snap_tracked_chunk);
 
 	/* Add snapshot to the list of snapshots for this origin */
@@ -1336,29 +1400,22 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 bad_read_metadata:
 	unregister_snapshot(s);
-
 bad_load_and_register:
 	mempool_exit(&s->pending_pool);
-
 bad_pending_pool:
 	dm_kcopyd_client_destroy(s->kcopyd_client);
-
 bad_kcopyd:
 	dm_exception_table_exit(&s->pending, pending_cache);
 	dm_exception_table_exit(&s->complete, exception_cache);
-
 bad_hash_tables:
 	dm_exception_store_destroy(s->store);
-
 bad_store:
 	dm_put_device(ti, s->cow);
-
 bad_cow:
 	dm_put_device(ti, s->origin);
-
 bad_origin:
+bad_features:
 	kfree(s);
-
 bad:
 	return r;
 }
@@ -1806,6 +1863,32 @@ static void remap_exception(struct dm_snapshot *s, struct dm_exception *e,
 		(bio->bi_iter.bi_sector & s->store->chunk_mask);
 }
 
+static void zero_callback(int read_err, unsigned long write_err, void *context)
+{
+	struct dm_snapshot *s = context;
+
+	up(&s->cow_count);
+}
+
+static void zero_exception(struct dm_snapshot *s, struct dm_exception *e,
+			   struct bio *bio, chunk_t chunk)
+{
+	struct dm_io_region dest;
+
+	dest.bdev = s->cow->bdev;
+	dest.sector = bio->bi_iter.bi_sector;
+	dest.count = s->store->chunk_size;
+
+	down(&s->cow_count);
+	dm_kcopyd_zero(s->kcopyd_client, 1, &dest, 0, zero_callback, s);
+}
+
+static bool io_overlaps_chunk(struct dm_snapshot *s, struct bio *bio)
+{
+	return bio->bi_iter.bi_size ==
+		(s->store->chunk_size << SECTOR_SHIFT);
+}
+
 static int snapshot_map(struct dm_target *ti, struct bio *bio)
 {
 	struct dm_exception *e;
@@ -1839,10 +1922,42 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
 		goto out_unlock;
 	}
 
+	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
+		if (s->discard_passdown_origin && dm_bio_get_target_bio_nr(bio)) {
+			/*
+			 * passdown discard to origin (without triggering
+			 * snapshot exceptions via do_origin; doing so would
+			 * defeat the goal of freeing space in origin that is
+			 * implied by the "discard_passdown_origin" feature)
+			 */
+			bio_set_dev(bio, s->origin->bdev);
+			track_chunk(s, bio, chunk);
+			goto out_unlock;
+		}
+		/* discard to snapshot (target_bio_nr == 0) zeroes exceptions */
+	}
+
 	/* If the block is already remapped - use that, else remap it */
 	e = dm_lookup_exception(&s->complete, chunk);
 	if (e) {
 		remap_exception(s, e, bio, chunk);
+		if (unlikely(bio_op(bio) == REQ_OP_DISCARD) &&
+		    io_overlaps_chunk(s, bio)) {
+			dm_exception_table_unlock(&lock);
+			up_read(&s->lock);
+			zero_exception(s, e, bio, chunk);
+			goto out;
+		}
+		goto out_unlock;
+	}
+
+	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
+		/*
+		 * If no exception exists, complete discard immediately
+		 * otherwise it'll trigger copy-out.
+		 */
+		bio_endio(bio);
+		r = DM_MAPIO_SUBMITTED;
 		goto out_unlock;
 	}
 
@@ -1890,9 +2005,7 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
 
 		r = DM_MAPIO_SUBMITTED;
 
-		if (!pe->started &&
-		    bio->bi_iter.bi_size ==
-		    (s->store->chunk_size << SECTOR_SHIFT)) {
+		if (!pe->started && io_overlaps_chunk(s, bio)) {
 			pe->started = 1;
 
 			dm_exception_table_unlock(&lock);
@@ -2138,6 +2251,7 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
 {
 	unsigned sz = 0;
 	struct dm_snapshot *snap = ti->private;
+	unsigned num_features;
 
 	switch (type) {
 	case STATUSTYPE_INFO:
@@ -2180,6 +2294,14 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
 		DMEMIT("%s %s", snap->origin->name, snap->cow->name);
 		snap->store->type->status(snap->store, type, result + sz,
 					  maxlen - sz);
+		num_features = snap->discard_zeroes_cow + snap->discard_passdown_origin;
+		if (num_features) {
+			DMEMIT(" %u", num_features);
+			if (snap->discard_zeroes_cow)
+				DMEMIT(" discard_zeroes_cow");
+			if (snap->discard_passdown_origin)
+				DMEMIT(" discard_passdown_origin");
+		}
 		break;
 	}
 }
@@ -2198,6 +2320,16 @@ static int snapshot_iterate_devices(struct dm_target *ti,
 	return r;
 }
 
+static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits)
+{
+	struct dm_snapshot *snap = ti->private;
+
+	if (snap->discard_zeroes_cow) {
+		/* All discards are split on chunk_size boundary */
+		limits->discard_granularity = snap->store->chunk_size;
+		limits->max_discard_sectors = snap->store->chunk_size;
+	}
+}
 
 /*-----------------------------------------------------------------
  * Origin methods
@@ -2522,7 +2654,7 @@ static struct target_type origin_target = {
 
 static struct target_type snapshot_target = {
 	.name    = "snapshot",
-	.version = {1, 15, 0},
+	.version = {1, 16, 0},
 	.module  = THIS_MODULE,
 	.ctr     = snapshot_ctr,
 	.dtr     = snapshot_dtr,
@@ -2532,11 +2664,12 @@ static struct target_type snapshot_target = {
 	.resume  = snapshot_resume,
 	.status  = snapshot_status,
 	.iterate_devices = snapshot_iterate_devices,
+	.io_hints = snapshot_io_hints,
 };
 
 static struct target_type merge_target = {
 	.name    = dm_snapshot_merge_target_name,
-	.version = {1, 4, 0},
+	.version = {1, 5, 0},
 	.module  = THIS_MODULE,
 	.ctr     = snapshot_ctr,
 	.dtr     = snapshot_dtr,
@@ -2547,6 +2680,7 @@ static struct target_type merge_target = {
 	.resume  = snapshot_merge_resume,
 	.status  = snapshot_status,
 	.iterate_devices = snapshot_iterate_devices,
+	.io_hints = snapshot_io_hints,
 };
 
 static int __init dm_snapshot_init(void)
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] dm snapshot: add optional discard support features
  2019-07-03 16:25 [PATCH] dm snapshot: add optional discard support features Mike Snitzer
@ 2019-07-05 16:03 ` Nikos Tsironis
  2019-07-11 20:36   ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Nikos Tsironis @ 2019-07-05 16:03 UTC (permalink / raw)
  To: Mike Snitzer, dm-devel; +Cc: John Dorminy

Hi Mike,

A question inline.

On 7/3/19 7:25 PM, Mike Snitzer wrote:
> Add 2 _optional_ features to the snapshot target:
> 
> discard_zeroes_cow - a discard issued to the snapshot device that maps
> to entire chunks to will zero the corresponding exception(s) in the
> snapshot's exception store.
> 
> discard_passdown_origin - a discard to the snapshot device is passed down
> to the snapshot-origin's underlying device.  This doesn't cause copy-out
> to the snapshot exception store because the snapshot-origin target is
> bypassed.
> 
> The discard_passdown_origin feature depends on the discard_zeroes_cow
> feature being enabled.
> 
> When these 2 features are enabled they allow a temporarily read-only
> device that has completely exhausted its free space to recover space.
> To do so dm-snapshot provides temporary buffer to accommodate writes
> that the temporarily read-only device cannot handle yet.  Once the upper
> layer frees space (e.g. fstrim to XFS) the discards issued to the
> dm-snapshot target will be issued to underlying read-only device whose
> free space was exhausted.  In addition those discards will also cause
> zeroes to be written to the snapshot exception store if corresponding
> exceptions exist.  If the underlying origin device provides
> deduplication for zero blocks then if/when the snapshot is merged backed
> to the origin those blocks will become unused.  Once the origin has
> gained adequate space, merging the snapshot back to the thinly
> provisioned device will permit continued use of that device without the
> temporary space provided by the snapshot.
> 
> Requested-by: John Dorminy <jdorminy@redhat.com>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  Documentation/device-mapper/snapshot.txt |  16 +++
>  drivers/md/dm-snap.c                     | 168 +++++++++++++++++++++++++++----
>  2 files changed, 167 insertions(+), 17 deletions(-)
> 
> diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.txt
> index b8bbb516f989..1810833f6dc6 100644
> --- a/Documentation/device-mapper/snapshot.txt
> +++ b/Documentation/device-mapper/snapshot.txt
> @@ -31,6 +31,7 @@ its visible content unchanged, at least until the <COW device> fills up.
>  
>  
>  *) snapshot <origin> <COW device> <persistent?> <chunksize>
> +   [<# feature args> [<arg>]*]
>  
>  A snapshot of the <origin> block device is created. Changed chunks of
>  <chunksize> sectors will be stored on the <COW device>.  Writes will
> @@ -53,8 +54,23 @@ When loading or unloading the snapshot target, the corresponding
>  snapshot-origin or snapshot-merge target must be suspended. A failure to
>  suspend the origin target could result in data corruption.
>  
> +Optional features:
> +
> +   discard_zeroes_cow - a discard issued to the snapshot device that
> +   maps to entire chunks to will zero the corresponding exception(s) in
> +   the snapshot's exception store.
> +
> +   discard_passdown_origin - a discard to the snapshot device is passed
> +   down to the snapshot-origin's underlying device.  This doesn't cause
> +   copy-out to the snapshot exception store because the snapshot-origin
> +   target is bypassed.
> +
> +   The discard_passdown_origin feature depends on the discard_zeroes_cow
> +   feature being enabled.
> +
>  
>  * snapshot-merge <origin> <COW device> <persistent> <chunksize>
> +  [<# feature args> [<arg>]*]
>  
>  takes the same table arguments as the snapshot target except it only
>  works with persistent snapshots.  This target assumes the role of the
> diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
> index 3107f2b1988b..e894302619dd 100644
> --- a/drivers/md/dm-snap.c
> +++ b/drivers/md/dm-snap.c
> @@ -134,7 +134,10 @@ struct dm_snapshot {
>  	 * - I/O error while merging
>  	 *	=> stop merging; set merge_failed; process I/O normally.
>  	 */
> -	int merge_failed;
> +	bool merge_failed:1;
> +
> +	bool discard_zeroes_cow:1;
> +	bool discard_passdown_origin:1;
>  
>  	/*
>  	 * Incoming bios that overlap with chunks being merged must wait
> @@ -1173,12 +1176,64 @@ static void stop_merge(struct dm_snapshot *s)
>  	clear_bit(SHUTDOWN_MERGE, &s->state_bits);
>  }
>  
> +static int parse_snapshot_features(struct dm_arg_set *as, struct dm_snapshot *s,
> +				   struct dm_target *ti)
> +{
> +	int r;
> +	unsigned argc;
> +	const char *arg_name;
> +
> +	static const struct dm_arg _args[] = {
> +		{0, 2, "Invalid number of feature arguments"},
> +	};
> +
> +	/*
> +	 * No feature arguments supplied.
> +	 */
> +	if (!as->argc)
> +		return 0;
> +
> +	r = dm_read_arg_group(_args, as, &argc, &ti->error);
> +	if (r)
> +		return -EINVAL;
> +
> +	while (argc && !r) {
> +		arg_name = dm_shift_arg(as);
> +		argc--;
> +
> +		if (!strcasecmp(arg_name, "discard_zeroes_cow"))
> +			s->discard_zeroes_cow = true;
> +
> +		else if (!strcasecmp(arg_name, "discard_passdown_origin"))
> +			s->discard_passdown_origin = true;
> +
> +		else {
> +			ti->error = "Unrecognised feature requested";
> +			r = -EINVAL;
> +			break;
> +		}
> +	}
> +
> +	if (!s->discard_zeroes_cow && s->discard_passdown_origin) {
> +		/*
> +		 * TODO: really these are disjoint.. but ti->num_discard_bios
> +		 * and dm_bio_get_target_bio_nr() require rigid constraints.
> +		 */
> +		ti->error = "discard_passdown_origin feature depends on discard_zeroes_cow";
> +		r = -EINVAL;
> +	}
> +
> +	return r;
> +}
> +
>  /*
> - * Construct a snapshot mapping: <origin_dev> <COW-dev> <p|po|n> <chunk-size>
> + * Construct a snapshot mapping:
> + * <origin_dev> <COW-dev> <p|po|n> <chunk-size> [<# feature args> [<arg>]*]
>   */
>  static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  {
>  	struct dm_snapshot *s;
> +	struct dm_arg_set as;
>  	int i;
>  	int r = -EINVAL;
>  	char *origin_path, *cow_path;
> @@ -1186,8 +1241,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  	unsigned args_used, num_flush_bios = 1;
>  	fmode_t origin_mode = FMODE_READ;
>  
> -	if (argc != 4) {
> -		ti->error = "requires exactly 4 arguments";
> +	if (argc < 4) {
> +		ti->error = "requires 4 or more arguments";
>  		r = -EINVAL;
>  		goto bad;
>  	}
> @@ -1204,6 +1259,13 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  		goto bad;
>  	}
>  
> +	as.argc = argc;
> +	as.argv = argv;
> +	dm_consume_args(&as, 4);
> +	r = parse_snapshot_features(&as, s, ti);
> +	if (r)
> +		goto bad_features;
> +
>  	origin_path = argv[0];
>  	argv++;
>  	argc--;
> @@ -1289,6 +1351,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  
>  	ti->private = s;
>  	ti->num_flush_bios = num_flush_bios;
> +	if (s->discard_zeroes_cow)
> +		ti->num_discard_bios = (s->discard_passdown_origin ? 2 : 1);
>  	ti->per_io_data_size = sizeof(struct dm_snap_tracked_chunk);
>  
>  	/* Add snapshot to the list of snapshots for this origin */
> @@ -1336,29 +1400,22 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  
>  bad_read_metadata:
>  	unregister_snapshot(s);
> -
>  bad_load_and_register:
>  	mempool_exit(&s->pending_pool);
> -
>  bad_pending_pool:
>  	dm_kcopyd_client_destroy(s->kcopyd_client);
> -
>  bad_kcopyd:
>  	dm_exception_table_exit(&s->pending, pending_cache);
>  	dm_exception_table_exit(&s->complete, exception_cache);
> -
>  bad_hash_tables:
>  	dm_exception_store_destroy(s->store);
> -
>  bad_store:
>  	dm_put_device(ti, s->cow);
> -
>  bad_cow:
>  	dm_put_device(ti, s->origin);
> -
>  bad_origin:
> +bad_features:
>  	kfree(s);
> -
>  bad:
>  	return r;
>  }
> @@ -1806,6 +1863,32 @@ static void remap_exception(struct dm_snapshot *s, struct dm_exception *e,
>  		(bio->bi_iter.bi_sector & s->store->chunk_mask);
>  }
>  
> +static void zero_callback(int read_err, unsigned long write_err, void *context)
> +{
> +	struct dm_snapshot *s = context;
> +
> +	up(&s->cow_count);
> +}
> +
> +static void zero_exception(struct dm_snapshot *s, struct dm_exception *e,
> +			   struct bio *bio, chunk_t chunk)
> +{
> +	struct dm_io_region dest;
> +
> +	dest.bdev = s->cow->bdev;
> +	dest.sector = bio->bi_iter.bi_sector;
> +	dest.count = s->store->chunk_size;
> +
> +	down(&s->cow_count);
> +	dm_kcopyd_zero(s->kcopyd_client, 1, &dest, 0, zero_callback, s);
> +}
> +
> +static bool io_overlaps_chunk(struct dm_snapshot *s, struct bio *bio)
> +{
> +	return bio->bi_iter.bi_size ==
> +		(s->store->chunk_size << SECTOR_SHIFT);
> +}
> +
>  static int snapshot_map(struct dm_target *ti, struct bio *bio)
>  {
>  	struct dm_exception *e;
> @@ -1839,10 +1922,42 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
>  		goto out_unlock;
>  	}
>  
> +	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
> +		if (s->discard_passdown_origin && dm_bio_get_target_bio_nr(bio)) {
> +			/*
> +			 * passdown discard to origin (without triggering
> +			 * snapshot exceptions via do_origin; doing so would
> +			 * defeat the goal of freeing space in origin that is
> +			 * implied by the "discard_passdown_origin" feature)
> +			 */
> +			bio_set_dev(bio, s->origin->bdev);
> +			track_chunk(s, bio, chunk);
> +			goto out_unlock;
> +		}
> +		/* discard to snapshot (target_bio_nr == 0) zeroes exceptions */
> +	}
> +
>  	/* If the block is already remapped - use that, else remap it */
>  	e = dm_lookup_exception(&s->complete, chunk);
>  	if (e) {
>  		remap_exception(s, e, bio, chunk);
> +		if (unlikely(bio_op(bio) == REQ_OP_DISCARD) &&
> +		    io_overlaps_chunk(s, bio)) {
> +			dm_exception_table_unlock(&lock);
> +			up_read(&s->lock);
> +			zero_exception(s, e, bio, chunk);
> +			goto out;
> +		}
> +		goto out_unlock;
> +	}

In case an exception exists for a chunk and we get a discard for it, we
want to zero the corresponding exception in the exception store.

The code remaps the discard bio, issues the zeroing operation by calling
zero_exception() and returns DM_MAPIO_REMAPPED. If I am not missing
something, device mapper core will then submit the discard bio to the
COW device, so we end up both zeroing and discarding the chunk in the
COW device.

Is this deliberate?

Thanks,
Nikos

> +
> +	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
> +		/*
> +		 * If no exception exists, complete discard immediately
> +		 * otherwise it'll trigger copy-out.
> +		 */
> +		bio_endio(bio);
> +		r = DM_MAPIO_SUBMITTED;
>  		goto out_unlock;
>  	}
>  
> @@ -1890,9 +2005,7 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
>  
>  		r = DM_MAPIO_SUBMITTED;
>  
> -		if (!pe->started &&
> -		    bio->bi_iter.bi_size ==
> -		    (s->store->chunk_size << SECTOR_SHIFT)) {
> +		if (!pe->started && io_overlaps_chunk(s, bio)) {
>  			pe->started = 1;
>  
>  			dm_exception_table_unlock(&lock);
> @@ -2138,6 +2251,7 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
>  {
>  	unsigned sz = 0;
>  	struct dm_snapshot *snap = ti->private;
> +	unsigned num_features;
>  
>  	switch (type) {
>  	case STATUSTYPE_INFO:
> @@ -2180,6 +2294,14 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
>  		DMEMIT("%s %s", snap->origin->name, snap->cow->name);
>  		snap->store->type->status(snap->store, type, result + sz,
>  					  maxlen - sz);
> +		num_features = snap->discard_zeroes_cow + snap->discard_passdown_origin;
> +		if (num_features) {
> +			DMEMIT(" %u", num_features);
> +			if (snap->discard_zeroes_cow)
> +				DMEMIT(" discard_zeroes_cow");
> +			if (snap->discard_passdown_origin)
> +				DMEMIT(" discard_passdown_origin");
> +		}
>  		break;
>  	}
>  }
> @@ -2198,6 +2320,16 @@ static int snapshot_iterate_devices(struct dm_target *ti,
>  	return r;
>  }
>  
> +static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits)
> +{
> +	struct dm_snapshot *snap = ti->private;
> +
> +	if (snap->discard_zeroes_cow) {
> +		/* All discards are split on chunk_size boundary */
> +		limits->discard_granularity = snap->store->chunk_size;
> +		limits->max_discard_sectors = snap->store->chunk_size;
> +	}
> +}
>  
>  /*-----------------------------------------------------------------
>   * Origin methods
> @@ -2522,7 +2654,7 @@ static struct target_type origin_target = {
>  
>  static struct target_type snapshot_target = {
>  	.name    = "snapshot",
> -	.version = {1, 15, 0},
> +	.version = {1, 16, 0},
>  	.module  = THIS_MODULE,
>  	.ctr     = snapshot_ctr,
>  	.dtr     = snapshot_dtr,
> @@ -2532,11 +2664,12 @@ static struct target_type snapshot_target = {
>  	.resume  = snapshot_resume,
>  	.status  = snapshot_status,
>  	.iterate_devices = snapshot_iterate_devices,
> +	.io_hints = snapshot_io_hints,
>  };
>  
>  static struct target_type merge_target = {
>  	.name    = dm_snapshot_merge_target_name,
> -	.version = {1, 4, 0},
> +	.version = {1, 5, 0},
>  	.module  = THIS_MODULE,
>  	.ctr     = snapshot_ctr,
>  	.dtr     = snapshot_dtr,
> @@ -2547,6 +2680,7 @@ static struct target_type merge_target = {
>  	.resume  = snapshot_merge_resume,
>  	.status  = snapshot_status,
>  	.iterate_devices = snapshot_iterate_devices,
> +	.io_hints = snapshot_io_hints,
>  };
>  
>  static int __init dm_snapshot_init(void)
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: dm snapshot: add optional discard support features
  2019-07-05 16:03 ` Nikos Tsironis
@ 2019-07-11 20:36   ` Mike Snitzer
  2019-07-11 20:46     ` [PATCH v2] " Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2019-07-11 20:36 UTC (permalink / raw)
  To: Nikos Tsironis; +Cc: dm-devel, John Dorminy

On Fri, Jul 05 2019 at 12:03pm -0400,
Nikos Tsironis <ntsironis@arrikto.com> wrote:

> Hi Mike,
> 
> A question inline.
...
> > diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
> > index 3107f2b1988b..e894302619dd 100644
> > --- a/drivers/md/dm-snap.c
> > +++ b/drivers/md/dm-snap.c
> > @@ -1839,10 +1922,42 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
> >  		goto out_unlock;
> >  	}
> >  
> > +	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
> > +		if (s->discard_passdown_origin && dm_bio_get_target_bio_nr(bio)) {
> > +			/*
> > +			 * passdown discard to origin (without triggering
> > +			 * snapshot exceptions via do_origin; doing so would
> > +			 * defeat the goal of freeing space in origin that is
> > +			 * implied by the "discard_passdown_origin" feature)
> > +			 */
> > +			bio_set_dev(bio, s->origin->bdev);
> > +			track_chunk(s, bio, chunk);
> > +			goto out_unlock;
> > +		}
> > +		/* discard to snapshot (target_bio_nr == 0) zeroes exceptions */
> > +	}
> > +
> >  	/* If the block is already remapped - use that, else remap it */
> >  	e = dm_lookup_exception(&s->complete, chunk);
> >  	if (e) {
> >  		remap_exception(s, e, bio, chunk);
> > +		if (unlikely(bio_op(bio) == REQ_OP_DISCARD) &&
> > +		    io_overlaps_chunk(s, bio)) {
> > +			dm_exception_table_unlock(&lock);
> > +			up_read(&s->lock);
> > +			zero_exception(s, e, bio, chunk);
> > +			goto out;
> > +		}
> > +		goto out_unlock;
> > +	}
> 
> In case an exception exists for a chunk and we get a discard for it, we
> want to zero the corresponding exception in the exception store.
> 
> The code remaps the discard bio, issues the zeroing operation by calling
> zero_exception() and returns DM_MAPIO_REMAPPED. If I am not missing
> something, device mapper core will then submit the discard bio to the
> COW device, so we end up both zeroing and discarding the chunk in the
> COW device.
> 
> Is this deliberate?

No, it was an oversight.

The following incremental patch fixes it and a couple other bugs I found
while fixing the issue you reported.  I'll post v2 in reply to v1.  Your
timely review would be appreciated.  I'd still like to send this
upstream for the 5.3 merge tomorrow (Friday) by my end of day.  Thanks!

diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index e894302619dd..63916e1dc569 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1,6 +1,4 @@
 /*
- * dm-snapshot.c
- *
  * Copyright (C) 2001-2002 Sistina Software (UK) Limited.
  *
  * This file is released under the GPL.
@@ -1865,9 +1863,12 @@ static void remap_exception(struct dm_snapshot *s, struct dm_exception *e,
 
 static void zero_callback(int read_err, unsigned long write_err, void *context)
 {
-	struct dm_snapshot *s = context;
+	struct bio *bio = context;
+	struct dm_snapshot *s = bio->bi_private;
 
 	up(&s->cow_count);
+	bio->bi_status = write_err ? BLK_STS_IOERR : 0;
+	bio_endio(bio);
 }
 
 static void zero_exception(struct dm_snapshot *s, struct dm_exception *e,
@@ -1880,7 +1881,9 @@ static void zero_exception(struct dm_snapshot *s, struct dm_exception *e,
 	dest.count = s->store->chunk_size;
 
 	down(&s->cow_count);
-	dm_kcopyd_zero(s->kcopyd_client, 1, &dest, 0, zero_callback, s);
+	WARN_ON_ONCE(bio->bi_private);
+	bio->bi_private = s;
+	dm_kcopyd_zero(s->kcopyd_client, 1, &dest, 0, zero_callback, bio);
 }
 
 static bool io_overlaps_chunk(struct dm_snapshot *s, struct bio *bio)
@@ -1946,6 +1949,7 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
 			dm_exception_table_unlock(&lock);
 			up_read(&s->lock);
 			zero_exception(s, e, bio, chunk);
+			r = DM_MAPIO_SUBMITTED; /* discard is not issued */
 			goto out;
 		}
 		goto out_unlock;
@@ -2292,8 +2296,8 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
 		 * make sense.
 		 */
 		DMEMIT("%s %s", snap->origin->name, snap->cow->name);
-		snap->store->type->status(snap->store, type, result + sz,
-					  maxlen - sz);
+		sz += snap->store->type->status(snap->store, type, result + sz,
+						maxlen - sz);
 		num_features = snap->discard_zeroes_cow + snap->discard_passdown_origin;
 		if (num_features) {
 			DMEMIT(" %u", num_features);
@@ -2325,6 +2329,12 @@ static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits)
 	struct dm_snapshot *snap = ti->private;
 
 	if (snap->discard_zeroes_cow) {
+		struct dm_snapshot *snap_src = NULL, *snap_dest = NULL;
+
+		(void) __find_snapshots_sharing_cow(snap, &snap_src, &snap_dest, NULL);
+		if (snap_src && snap_dest)
+			snap = snap_src;
+
 		/* All discards are split on chunk_size boundary */
 		limits->discard_granularity = snap->store->chunk_size;
 		limits->max_discard_sectors = snap->store->chunk_size;

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2] dm snapshot: add optional discard support features
  2019-07-11 20:36   ` Mike Snitzer
@ 2019-07-11 20:46     ` Mike Snitzer
  2019-07-12 13:40       ` Nikos Tsironis
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2019-07-11 20:46 UTC (permalink / raw)
  To: Nikos Tsironis; +Cc: dm-devel, John Dorminy

discard_zeroes_cow - a discard issued to the snapshot device that maps
to entire chunks to will zero the corresponding exception(s) in the
snapshot's exception store.

discard_passdown_origin - a discard to the snapshot device is passed down
to the snapshot-origin's underlying device.  This doesn't cause copy-out
to the snapshot exception store because the snapshot-origin target is
bypassed.

The discard_passdown_origin feature depends on the discard_zeroes_cow
feature being enabled.

When these 2 features are enabled they allow a temporarily read-only
device that has completely exhausted its free space to recover space.
To do so dm-snapshot provides temporary buffer to accommodate writes
that the temporarily read-only device cannot handle yet.  Once the upper
layer frees space (e.g. fstrim to XFS) the discards issued to the
dm-snapshot target will be issued to underlying read-only device whose
free space was exhausted.  In addition those discards will also cause
zeroes to be written to the snapshot exception store if corresponding
exceptions exist.  If the underlying origin device provides
deduplication for zero blocks then if/when the snapshot is merged backed
to the origin those blocks will become unused.  Once the origin has
gained adequate space, merging the snapshot back to the thinly
provisioned device will permit continued use of that device without the
temporary space provided by the snapshot.

Requested-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 Documentation/device-mapper/snapshot.txt |  16 +++
 drivers/md/dm-snap.c                     | 186 +++++++++++++++++++++++++++----
 2 files changed, 181 insertions(+), 21 deletions(-)

diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.txt
index b8bbb516f989..1810833f6dc6 100644
--- a/Documentation/device-mapper/snapshot.txt
+++ b/Documentation/device-mapper/snapshot.txt
@@ -31,6 +31,7 @@ its visible content unchanged, at least until the <COW device> fills up.
 
 
 *) snapshot <origin> <COW device> <persistent?> <chunksize>
+   [<# feature args> [<arg>]*]
 
 A snapshot of the <origin> block device is created. Changed chunks of
 <chunksize> sectors will be stored on the <COW device>.  Writes will
@@ -53,8 +54,23 @@ When loading or unloading the snapshot target, the corresponding
 snapshot-origin or snapshot-merge target must be suspended. A failure to
 suspend the origin target could result in data corruption.
 
+Optional features:
+
+   discard_zeroes_cow - a discard issued to the snapshot device that
+   maps to entire chunks to will zero the corresponding exception(s) in
+   the snapshot's exception store.
+
+   discard_passdown_origin - a discard to the snapshot device is passed
+   down to the snapshot-origin's underlying device.  This doesn't cause
+   copy-out to the snapshot exception store because the snapshot-origin
+   target is bypassed.
+
+   The discard_passdown_origin feature depends on the discard_zeroes_cow
+   feature being enabled.
+
 
 * snapshot-merge <origin> <COW device> <persistent> <chunksize>
+  [<# feature args> [<arg>]*]
 
 takes the same table arguments as the snapshot target except it only
 works with persistent snapshots.  This target assumes the role of the
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index 3107f2b1988b..63916e1dc569 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1,6 +1,4 @@
 /*
- * dm-snapshot.c
- *
  * Copyright (C) 2001-2002 Sistina Software (UK) Limited.
  *
  * This file is released under the GPL.
@@ -134,7 +132,10 @@ struct dm_snapshot {
 	 * - I/O error while merging
 	 *	=> stop merging; set merge_failed; process I/O normally.
 	 */
-	int merge_failed;
+	bool merge_failed:1;
+
+	bool discard_zeroes_cow:1;
+	bool discard_passdown_origin:1;
 
 	/*
 	 * Incoming bios that overlap with chunks being merged must wait
@@ -1173,12 +1174,64 @@ static void stop_merge(struct dm_snapshot *s)
 	clear_bit(SHUTDOWN_MERGE, &s->state_bits);
 }
 
+static int parse_snapshot_features(struct dm_arg_set *as, struct dm_snapshot *s,
+				   struct dm_target *ti)
+{
+	int r;
+	unsigned argc;
+	const char *arg_name;
+
+	static const struct dm_arg _args[] = {
+		{0, 2, "Invalid number of feature arguments"},
+	};
+
+	/*
+	 * No feature arguments supplied.
+	 */
+	if (!as->argc)
+		return 0;
+
+	r = dm_read_arg_group(_args, as, &argc, &ti->error);
+	if (r)
+		return -EINVAL;
+
+	while (argc && !r) {
+		arg_name = dm_shift_arg(as);
+		argc--;
+
+		if (!strcasecmp(arg_name, "discard_zeroes_cow"))
+			s->discard_zeroes_cow = true;
+
+		else if (!strcasecmp(arg_name, "discard_passdown_origin"))
+			s->discard_passdown_origin = true;
+
+		else {
+			ti->error = "Unrecognised feature requested";
+			r = -EINVAL;
+			break;
+		}
+	}
+
+	if (!s->discard_zeroes_cow && s->discard_passdown_origin) {
+		/*
+		 * TODO: really these are disjoint.. but ti->num_discard_bios
+		 * and dm_bio_get_target_bio_nr() require rigid constraints.
+		 */
+		ti->error = "discard_passdown_origin feature depends on discard_zeroes_cow";
+		r = -EINVAL;
+	}
+
+	return r;
+}
+
 /*
- * Construct a snapshot mapping: <origin_dev> <COW-dev> <p|po|n> <chunk-size>
+ * Construct a snapshot mapping:
+ * <origin_dev> <COW-dev> <p|po|n> <chunk-size> [<# feature args> [<arg>]*]
  */
 static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 {
 	struct dm_snapshot *s;
+	struct dm_arg_set as;
 	int i;
 	int r = -EINVAL;
 	char *origin_path, *cow_path;
@@ -1186,8 +1239,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	unsigned args_used, num_flush_bios = 1;
 	fmode_t origin_mode = FMODE_READ;
 
-	if (argc != 4) {
-		ti->error = "requires exactly 4 arguments";
+	if (argc < 4) {
+		ti->error = "requires 4 or more arguments";
 		r = -EINVAL;
 		goto bad;
 	}
@@ -1204,6 +1257,13 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		goto bad;
 	}
 
+	as.argc = argc;
+	as.argv = argv;
+	dm_consume_args(&as, 4);
+	r = parse_snapshot_features(&as, s, ti);
+	if (r)
+		goto bad_features;
+
 	origin_path = argv[0];
 	argv++;
 	argc--;
@@ -1289,6 +1349,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 	ti->private = s;
 	ti->num_flush_bios = num_flush_bios;
+	if (s->discard_zeroes_cow)
+		ti->num_discard_bios = (s->discard_passdown_origin ? 2 : 1);
 	ti->per_io_data_size = sizeof(struct dm_snap_tracked_chunk);
 
 	/* Add snapshot to the list of snapshots for this origin */
@@ -1336,29 +1398,22 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 bad_read_metadata:
 	unregister_snapshot(s);
-
 bad_load_and_register:
 	mempool_exit(&s->pending_pool);
-
 bad_pending_pool:
 	dm_kcopyd_client_destroy(s->kcopyd_client);
-
 bad_kcopyd:
 	dm_exception_table_exit(&s->pending, pending_cache);
 	dm_exception_table_exit(&s->complete, exception_cache);
-
 bad_hash_tables:
 	dm_exception_store_destroy(s->store);
-
 bad_store:
 	dm_put_device(ti, s->cow);
-
 bad_cow:
 	dm_put_device(ti, s->origin);
-
 bad_origin:
+bad_features:
 	kfree(s);
-
 bad:
 	return r;
 }
@@ -1806,6 +1861,37 @@ static void remap_exception(struct dm_snapshot *s, struct dm_exception *e,
 		(bio->bi_iter.bi_sector & s->store->chunk_mask);
 }
 
+static void zero_callback(int read_err, unsigned long write_err, void *context)
+{
+	struct bio *bio = context;
+	struct dm_snapshot *s = bio->bi_private;
+
+	up(&s->cow_count);
+	bio->bi_status = write_err ? BLK_STS_IOERR : 0;
+	bio_endio(bio);
+}
+
+static void zero_exception(struct dm_snapshot *s, struct dm_exception *e,
+			   struct bio *bio, chunk_t chunk)
+{
+	struct dm_io_region dest;
+
+	dest.bdev = s->cow->bdev;
+	dest.sector = bio->bi_iter.bi_sector;
+	dest.count = s->store->chunk_size;
+
+	down(&s->cow_count);
+	WARN_ON_ONCE(bio->bi_private);
+	bio->bi_private = s;
+	dm_kcopyd_zero(s->kcopyd_client, 1, &dest, 0, zero_callback, bio);
+}
+
+static bool io_overlaps_chunk(struct dm_snapshot *s, struct bio *bio)
+{
+	return bio->bi_iter.bi_size ==
+		(s->store->chunk_size << SECTOR_SHIFT);
+}
+
 static int snapshot_map(struct dm_target *ti, struct bio *bio)
 {
 	struct dm_exception *e;
@@ -1839,10 +1925,43 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
 		goto out_unlock;
 	}
 
+	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
+		if (s->discard_passdown_origin && dm_bio_get_target_bio_nr(bio)) {
+			/*
+			 * passdown discard to origin (without triggering
+			 * snapshot exceptions via do_origin; doing so would
+			 * defeat the goal of freeing space in origin that is
+			 * implied by the "discard_passdown_origin" feature)
+			 */
+			bio_set_dev(bio, s->origin->bdev);
+			track_chunk(s, bio, chunk);
+			goto out_unlock;
+		}
+		/* discard to snapshot (target_bio_nr == 0) zeroes exceptions */
+	}
+
 	/* If the block is already remapped - use that, else remap it */
 	e = dm_lookup_exception(&s->complete, chunk);
 	if (e) {
 		remap_exception(s, e, bio, chunk);
+		if (unlikely(bio_op(bio) == REQ_OP_DISCARD) &&
+		    io_overlaps_chunk(s, bio)) {
+			dm_exception_table_unlock(&lock);
+			up_read(&s->lock);
+			zero_exception(s, e, bio, chunk);
+			r = DM_MAPIO_SUBMITTED; /* discard is not issued */
+			goto out;
+		}
+		goto out_unlock;
+	}
+
+	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
+		/*
+		 * If no exception exists, complete discard immediately
+		 * otherwise it'll trigger copy-out.
+		 */
+		bio_endio(bio);
+		r = DM_MAPIO_SUBMITTED;
 		goto out_unlock;
 	}
 
@@ -1890,9 +2009,7 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
 
 		r = DM_MAPIO_SUBMITTED;
 
-		if (!pe->started &&
-		    bio->bi_iter.bi_size ==
-		    (s->store->chunk_size << SECTOR_SHIFT)) {
+		if (!pe->started && io_overlaps_chunk(s, bio)) {
 			pe->started = 1;
 
 			dm_exception_table_unlock(&lock);
@@ -2138,6 +2255,7 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
 {
 	unsigned sz = 0;
 	struct dm_snapshot *snap = ti->private;
+	unsigned num_features;
 
 	switch (type) {
 	case STATUSTYPE_INFO:
@@ -2178,8 +2296,16 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
 		 * make sense.
 		 */
 		DMEMIT("%s %s", snap->origin->name, snap->cow->name);
-		snap->store->type->status(snap->store, type, result + sz,
-					  maxlen - sz);
+		sz += snap->store->type->status(snap->store, type, result + sz,
+						maxlen - sz);
+		num_features = snap->discard_zeroes_cow + snap->discard_passdown_origin;
+		if (num_features) {
+			DMEMIT(" %u", num_features);
+			if (snap->discard_zeroes_cow)
+				DMEMIT(" discard_zeroes_cow");
+			if (snap->discard_passdown_origin)
+				DMEMIT(" discard_passdown_origin");
+		}
 		break;
 	}
 }
@@ -2198,6 +2324,22 @@ static int snapshot_iterate_devices(struct dm_target *ti,
 	return r;
 }
 
+static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits)
+{
+	struct dm_snapshot *snap = ti->private;
+
+	if (snap->discard_zeroes_cow) {
+		struct dm_snapshot *snap_src = NULL, *snap_dest = NULL;
+
+		(void) __find_snapshots_sharing_cow(snap, &snap_src, &snap_dest, NULL);
+		if (snap_src && snap_dest)
+			snap = snap_src;
+
+		/* All discards are split on chunk_size boundary */
+		limits->discard_granularity = snap->store->chunk_size;
+		limits->max_discard_sectors = snap->store->chunk_size;
+	}
+}
 
 /*-----------------------------------------------------------------
  * Origin methods
@@ -2522,7 +2664,7 @@ static struct target_type origin_target = {
 
 static struct target_type snapshot_target = {
 	.name    = "snapshot",
-	.version = {1, 15, 0},
+	.version = {1, 16, 0},
 	.module  = THIS_MODULE,
 	.ctr     = snapshot_ctr,
 	.dtr     = snapshot_dtr,
@@ -2532,11 +2674,12 @@ static struct target_type snapshot_target = {
 	.resume  = snapshot_resume,
 	.status  = snapshot_status,
 	.iterate_devices = snapshot_iterate_devices,
+	.io_hints = snapshot_io_hints,
 };
 
 static struct target_type merge_target = {
 	.name    = dm_snapshot_merge_target_name,
-	.version = {1, 4, 0},
+	.version = {1, 5, 0},
 	.module  = THIS_MODULE,
 	.ctr     = snapshot_ctr,
 	.dtr     = snapshot_dtr,
@@ -2547,6 +2690,7 @@ static struct target_type merge_target = {
 	.resume  = snapshot_merge_resume,
 	.status  = snapshot_status,
 	.iterate_devices = snapshot_iterate_devices,
+	.io_hints = snapshot_io_hints,
 };
 
 static int __init dm_snapshot_init(void)
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] dm snapshot: add optional discard support features
  2019-07-11 20:46     ` [PATCH v2] " Mike Snitzer
@ 2019-07-12 13:40       ` Nikos Tsironis
  2019-07-15 18:06         ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread
From: Nikos Tsironis @ 2019-07-12 13:40 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel, John Dorminy

Hi Mike,

I have reviewed the patch. A few comments below.

On 7/11/19 11:46 PM, Mike Snitzer wrote:
> discard_zeroes_cow - a discard issued to the snapshot device that maps
> to entire chunks to will zero the corresponding exception(s) in the
> snapshot's exception store.
> 
> discard_passdown_origin - a discard to the snapshot device is passed down
> to the snapshot-origin's underlying device.  This doesn't cause copy-out
> to the snapshot exception store because the snapshot-origin target is
> bypassed.
> 
> The discard_passdown_origin feature depends on the discard_zeroes_cow
> feature being enabled.
> 
> When these 2 features are enabled they allow a temporarily read-only
> device that has completely exhausted its free space to recover space.
> To do so dm-snapshot provides temporary buffer to accommodate writes
> that the temporarily read-only device cannot handle yet.  Once the upper
> layer frees space (e.g. fstrim to XFS) the discards issued to the
> dm-snapshot target will be issued to underlying read-only device whose
> free space was exhausted.  In addition those discards will also cause
> zeroes to be written to the snapshot exception store if corresponding
> exceptions exist.  If the underlying origin device provides
> deduplication for zero blocks then if/when the snapshot is merged backed
> to the origin those blocks will become unused.  Once the origin has
> gained adequate space, merging the snapshot back to the thinly
> provisioned device will permit continued use of that device without the
> temporary space provided by the snapshot.
> 
> Requested-by: John Dorminy <jdorminy@redhat.com>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  Documentation/device-mapper/snapshot.txt |  16 +++
>  drivers/md/dm-snap.c                     | 186 +++++++++++++++++++++++++++----
>  2 files changed, 181 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.txt
> index b8bbb516f989..1810833f6dc6 100644
> --- a/Documentation/device-mapper/snapshot.txt
> +++ b/Documentation/device-mapper/snapshot.txt
> @@ -31,6 +31,7 @@ its visible content unchanged, at least until the <COW device> fills up.
>  
>  
>  *) snapshot <origin> <COW device> <persistent?> <chunksize>
> +   [<# feature args> [<arg>]*]
>  
>  A snapshot of the <origin> block device is created. Changed chunks of
>  <chunksize> sectors will be stored on the <COW device>.  Writes will
> @@ -53,8 +54,23 @@ When loading or unloading the snapshot target, the corresponding
>  snapshot-origin or snapshot-merge target must be suspended. A failure to
>  suspend the origin target could result in data corruption.
>  
> +Optional features:
> +
> +   discard_zeroes_cow - a discard issued to the snapshot device that
> +   maps to entire chunks to will zero the corresponding exception(s) in
> +   the snapshot's exception store.
> +
> +   discard_passdown_origin - a discard to the snapshot device is passed
> +   down to the snapshot-origin's underlying device.  This doesn't cause
> +   copy-out to the snapshot exception store because the snapshot-origin
> +   target is bypassed.
> +
> +   The discard_passdown_origin feature depends on the discard_zeroes_cow
> +   feature being enabled.
> +
>  
>  * snapshot-merge <origin> <COW device> <persistent> <chunksize>
> +  [<# feature args> [<arg>]*]
>  
>  takes the same table arguments as the snapshot target except it only
>  works with persistent snapshots.  This target assumes the role of the
> diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
> index 3107f2b1988b..63916e1dc569 100644
> --- a/drivers/md/dm-snap.c
> +++ b/drivers/md/dm-snap.c
> @@ -1,6 +1,4 @@
>  /*
> - * dm-snapshot.c
> - *
>   * Copyright (C) 2001-2002 Sistina Software (UK) Limited.
>   *
>   * This file is released under the GPL.
> @@ -134,7 +132,10 @@ struct dm_snapshot {
>  	 * - I/O error while merging
>  	 *	=> stop merging; set merge_failed; process I/O normally.
>  	 */
> -	int merge_failed;
> +	bool merge_failed:1;
> +
> +	bool discard_zeroes_cow:1;
> +	bool discard_passdown_origin:1;
>  
>  	/*
>  	 * Incoming bios that overlap with chunks being merged must wait
> @@ -1173,12 +1174,64 @@ static void stop_merge(struct dm_snapshot *s)
>  	clear_bit(SHUTDOWN_MERGE, &s->state_bits);
>  }
>  
> +static int parse_snapshot_features(struct dm_arg_set *as, struct dm_snapshot *s,
> +				   struct dm_target *ti)
> +{
> +	int r;
> +	unsigned argc;
> +	const char *arg_name;
> +
> +	static const struct dm_arg _args[] = {
> +		{0, 2, "Invalid number of feature arguments"},
> +	};
> +
> +	/*
> +	 * No feature arguments supplied.
> +	 */
> +	if (!as->argc)
> +		return 0;
> +
> +	r = dm_read_arg_group(_args, as, &argc, &ti->error);
> +	if (r)
> +		return -EINVAL;
> +
> +	while (argc && !r) {
> +		arg_name = dm_shift_arg(as);
> +		argc--;
> +
> +		if (!strcasecmp(arg_name, "discard_zeroes_cow"))
> +			s->discard_zeroes_cow = true;
> +
> +		else if (!strcasecmp(arg_name, "discard_passdown_origin"))
> +			s->discard_passdown_origin = true;
> +
> +		else {
> +			ti->error = "Unrecognised feature requested";
> +			r = -EINVAL;
> +			break;
> +		}
> +	}
> +
> +	if (!s->discard_zeroes_cow && s->discard_passdown_origin) {
> +		/*
> +		 * TODO: really these are disjoint.. but ti->num_discard_bios
> +		 * and dm_bio_get_target_bio_nr() require rigid constraints.
> +		 */
> +		ti->error = "discard_passdown_origin feature depends on discard_zeroes_cow";
> +		r = -EINVAL;
> +	}
> +
> +	return r;
> +}
> +
>  /*
> - * Construct a snapshot mapping: <origin_dev> <COW-dev> <p|po|n> <chunk-size>
> + * Construct a snapshot mapping:
> + * <origin_dev> <COW-dev> <p|po|n> <chunk-size> [<# feature args> [<arg>]*]
>   */
>  static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  {
>  	struct dm_snapshot *s;
> +	struct dm_arg_set as;
>  	int i;
>  	int r = -EINVAL;
>  	char *origin_path, *cow_path;
> @@ -1186,8 +1239,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  	unsigned args_used, num_flush_bios = 1;
>  	fmode_t origin_mode = FMODE_READ;
>  
> -	if (argc != 4) {
> -		ti->error = "requires exactly 4 arguments";
> +	if (argc < 4) {
> +		ti->error = "requires 4 or more arguments";
>  		r = -EINVAL;
>  		goto bad;
>  	}
> @@ -1204,6 +1257,13 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  		goto bad;
>  	}
>  
> +	as.argc = argc;
> +	as.argv = argv;
> +	dm_consume_args(&as, 4);
> +	r = parse_snapshot_features(&as, s, ti);
> +	if (r)
> +		goto bad_features;
> +
>  	origin_path = argv[0];
>  	argv++;
>  	argc--;
> @@ -1289,6 +1349,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  
>  	ti->private = s;
>  	ti->num_flush_bios = num_flush_bios;
> +	if (s->discard_zeroes_cow)
> +		ti->num_discard_bios = (s->discard_passdown_origin ? 2 : 1);
>  	ti->per_io_data_size = sizeof(struct dm_snap_tracked_chunk);
>  
>  	/* Add snapshot to the list of snapshots for this origin */
> @@ -1336,29 +1398,22 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  
>  bad_read_metadata:
>  	unregister_snapshot(s);
> -
>  bad_load_and_register:
>  	mempool_exit(&s->pending_pool);
> -
>  bad_pending_pool:
>  	dm_kcopyd_client_destroy(s->kcopyd_client);
> -
>  bad_kcopyd:
>  	dm_exception_table_exit(&s->pending, pending_cache);
>  	dm_exception_table_exit(&s->complete, exception_cache);
> -
>  bad_hash_tables:
>  	dm_exception_store_destroy(s->store);
> -
>  bad_store:
>  	dm_put_device(ti, s->cow);
> -
>  bad_cow:
>  	dm_put_device(ti, s->origin);
> -
>  bad_origin:
> +bad_features:
>  	kfree(s);
> -
>  bad:
>  	return r;
>  }
> @@ -1806,6 +1861,37 @@ static void remap_exception(struct dm_snapshot *s, struct dm_exception *e,
>  		(bio->bi_iter.bi_sector & s->store->chunk_mask);
>  }
>  
> +static void zero_callback(int read_err, unsigned long write_err, void *context)
> +{
> +	struct bio *bio = context;
> +	struct dm_snapshot *s = bio->bi_private;
> +
> +	up(&s->cow_count);
> +	bio->bi_status = write_err ? BLK_STS_IOERR : 0;
> +	bio_endio(bio);
> +}
> +
> +static void zero_exception(struct dm_snapshot *s, struct dm_exception *e,
> +			   struct bio *bio, chunk_t chunk)
> +{
> +	struct dm_io_region dest;
> +
> +	dest.bdev = s->cow->bdev;
> +	dest.sector = bio->bi_iter.bi_sector;
> +	dest.count = s->store->chunk_size;
> +
> +	down(&s->cow_count);
> +	WARN_ON_ONCE(bio->bi_private);
> +	bio->bi_private = s;
> +	dm_kcopyd_zero(s->kcopyd_client, 1, &dest, 0, zero_callback, bio);
> +}
> +
> +static bool io_overlaps_chunk(struct dm_snapshot *s, struct bio *bio)
> +{
> +	return bio->bi_iter.bi_size ==
> +		(s->store->chunk_size << SECTOR_SHIFT);
> +}
> +
>  static int snapshot_map(struct dm_target *ti, struct bio *bio)
>  {
>  	struct dm_exception *e;
> @@ -1839,10 +1925,43 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
>  		goto out_unlock;
>  	}
>  
> +	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
> +		if (s->discard_passdown_origin && dm_bio_get_target_bio_nr(bio)) {
> +			/*
> +			 * passdown discard to origin (without triggering
> +			 * snapshot exceptions via do_origin; doing so would
> +			 * defeat the goal of freeing space in origin that is
> +			 * implied by the "discard_passdown_origin" feature)
> +			 */
> +			bio_set_dev(bio, s->origin->bdev);
> +			track_chunk(s, bio, chunk);
Why track_chunk() is needed here?

We track snapshot reads redirected to the origin to ensure that the
origin will not be written while the reads are in-flight, thus returning
corrupted data. Also, we track writes to a merging snapshot, which are
redirected to the COW device, to ensure we don't merge these COW chunks
before the writes finish.

I am probably missing something, but I can't understand why we need to
track the discard to the origin.

> +			goto out_unlock;
> +		}
> +		/* discard to snapshot (target_bio_nr == 0) zeroes exceptions */
> +	}
> +
>  	/* If the block is already remapped - use that, else remap it */
>  	e = dm_lookup_exception(&s->complete, chunk);
>  	if (e) {
>  		remap_exception(s, e, bio, chunk);
> +		if (unlikely(bio_op(bio) == REQ_OP_DISCARD) &&
> +		    io_overlaps_chunk(s, bio)) {
> +			dm_exception_table_unlock(&lock);
> +			up_read(&s->lock);
> +			zero_exception(s, e, bio, chunk);
> +			r = DM_MAPIO_SUBMITTED; /* discard is not issued */
> +			goto out;
> +		}
> +		goto out_unlock;
> +	}
> +
> +	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
> +		/*
> +		 * If no exception exists, complete discard immediately
> +		 * otherwise it'll trigger copy-out.
> +		 */
> +		bio_endio(bio);
> +		r = DM_MAPIO_SUBMITTED;
>  		goto out_unlock;
>  	}
>  
> @@ -1890,9 +2009,7 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
>  
>  		r = DM_MAPIO_SUBMITTED;
>  
> -		if (!pe->started &&
> -		    bio->bi_iter.bi_size ==
> -		    (s->store->chunk_size << SECTOR_SHIFT)) {
> +		if (!pe->started && io_overlaps_chunk(s, bio)) {
>  			pe->started = 1;
>  
>  			dm_exception_table_unlock(&lock);
> @@ -2138,6 +2255,7 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
>  {
>  	unsigned sz = 0;
>  	struct dm_snapshot *snap = ti->private;
> +	unsigned num_features;
>  
>  	switch (type) {
>  	case STATUSTYPE_INFO:
> @@ -2178,8 +2296,16 @@ static void snapshot_status(struct dm_target *ti, status_type_t type,
>  		 * make sense.
>  		 */
>  		DMEMIT("%s %s", snap->origin->name, snap->cow->name);
> -		snap->store->type->status(snap->store, type, result + sz,
> -					  maxlen - sz);
> +		sz += snap->store->type->status(snap->store, type, result + sz,
> +						maxlen - sz);
> +		num_features = snap->discard_zeroes_cow + snap->discard_passdown_origin;
> +		if (num_features) {
> +			DMEMIT(" %u", num_features);
> +			if (snap->discard_zeroes_cow)
> +				DMEMIT(" discard_zeroes_cow");
> +			if (snap->discard_passdown_origin)
> +				DMEMIT(" discard_passdown_origin");
> +		}
>  		break;
>  	}
>  }
> @@ -2198,6 +2324,22 @@ static int snapshot_iterate_devices(struct dm_target *ti,
>  	return r;
>  }
>  > +static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits)
> +{
> +	struct dm_snapshot *snap = ti->private;
> +
> +	if (snap->discard_zeroes_cow) {
> +		struct dm_snapshot *snap_src = NULL, *snap_dest = NULL;
> +

I think the following:

> +		(void) __find_snapshots_sharing_cow(snap, &snap_src, &snap_dest, NULL);
> +		if (snap_src && snap_dest)
> +			snap = snap_src;
> +
> +		/* All discards are split on chunk_size boundary */
> +		limits->discard_granularity = snap->store->chunk_size;
> +		limits->max_discard_sectors = snap->store->chunk_size;

should be:

	down_read(&_origins_lock);

	(void) __find_snapshots_sharing_cow(snap, &snap_src, &snap_dest, NULL);
	if (snap_src && snap_dest)
		snap = snap_src;

	/* All discards are split on chunk_size boundary */
	limits->discard_granularity = snap->store->chunk_size;
	limits->max_discard_sectors = snap->store->chunk_size;

	up_read(&_origins_lock);

Taking _origins_lock protects us against:

	* Concurrent modification of the _origins hash table by
	  register/unregister_snapshot().
	* snapshot_dtr() unregistering snap_src and freeing the relevant
	  resources, e.g., snap_src->store.

> +	}
> +}>  
>  /*-----------------------------------------------------------------
>   * Origin methods
> @@ -2522,7 +2664,7 @@ static struct target_type origin_target = {
>  
>  static struct target_type snapshot_target = {
>  	.name    = "snapshot",
> -	.version = {1, 15, 0},
> +	.version = {1, 16, 0},
>  	.module  = THIS_MODULE,
>  	.ctr     = snapshot_ctr,
>  	.dtr     = snapshot_dtr,
> @@ -2532,11 +2674,12 @@ static struct target_type snapshot_target = {
>  	.resume  = snapshot_resume,
>  	.status  = snapshot_status,
>  	.iterate_devices = snapshot_iterate_devices,
> +	.io_hints = snapshot_io_hints,
>  };
>  
>  static struct target_type merge_target = {
>  	.name    = dm_snapshot_merge_target_name,
> -	.version = {1, 4, 0},
> +	.version = {1, 5, 0},
>  	.module  = THIS_MODULE,
>  	.ctr     = snapshot_ctr,
>  	.dtr     = snapshot_dtr,
> @@ -2547,6 +2690,7 @@ static struct target_type merge_target = {
>  	.resume  = snapshot_merge_resume,
>  	.status  = snapshot_status,
>  	.iterate_devices = snapshot_iterate_devices,
> +	.io_hints = snapshot_io_hints,
>  };
>  
>  static int __init dm_snapshot_init(void)
> 
One final note. The discard features can also be used by the
snapshot-merge target. But, snapshot_merge_map() is not doing anything
special about discards. They are treated like normal writes.

For chunks that reside in the origin this means that the discard will be
passed down to it using do_origin(). Although this could trigger
exceptions to other snapshots, I think that for the use case these features
are trying to solve there is going to be only one active snapshot. Otherwise,
if we have multiple snapshots, discard_passdown_origin would mean that
discarding one snapshot could affect/corrupt the rest of the snapshots.

For the not-yet-merged, remapped chunks the discard will be redirected to
the COW device, so discard_zeroes_cow is not really zeroing COW in this
case, it is discarding it.

Also, if both discard_zeroes_cow and discard_passdown_origin are enabled,
the relevant chunk, either in origin or in COW, will be discarded twice.

Nikos

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] dm snapshot: add optional discard support features
  2019-07-12 13:40       ` Nikos Tsironis
@ 2019-07-15 18:06         ` Mike Snitzer
  0 siblings, 0 replies; 6+ messages in thread
From: Mike Snitzer @ 2019-07-15 18:06 UTC (permalink / raw)
  To: Nikos Tsironis; +Cc: dm-devel, John Dorminy

On Fri, Jul 12 2019 at  9:40am -0400,
Nikos Tsironis <ntsironis@arrikto.com> wrote:

> Hi Mike,
> 
> I have reviewed the patch. A few comments below.

Thanks, my email filtering caused this mail to miss my INBOX (still yet
to reason through why).  So I'm just seeing this now.

> On 7/11/19 11:46 PM, Mike Snitzer wrote:
> > @@ -1839,10 +1925,43 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
> >  		goto out_unlock;
> >  	}
> >  
> > +	if (unlikely(bio_op(bio) == REQ_OP_DISCARD)) {
> > +		if (s->discard_passdown_origin && dm_bio_get_target_bio_nr(bio)) {
> > +			/*
> > +			 * passdown discard to origin (without triggering
> > +			 * snapshot exceptions via do_origin; doing so would
> > +			 * defeat the goal of freeing space in origin that is
> > +			 * implied by the "discard_passdown_origin" feature)
> > +			 */
> > +			bio_set_dev(bio, s->origin->bdev);
> > +			track_chunk(s, bio, chunk);
> Why track_chunk() is needed here?
> 
> We track snapshot reads redirected to the origin to ensure that the
> origin will not be written while the reads are in-flight, thus returning
> corrupted data. Also, we track writes to a merging snapshot, which are
> redirected to the COW device, to ensure we don't merge these COW chunks
> before the writes finish.
> 
> I am probably missing something, but I can't understand why we need to
> track the discard to the origin.

I followed how reads are remapped to origin when remapping the discard
destined for origin because a competing write to origin could trigger
copyout that undermines the intent behind passing down the discard to
the origin.  That said, even if the write is withheld until the discard
completes that write triggered copyout could still undermine that intent.

But for this usecase, the origin wouldn't even be mounted (and is
incapable of being written due to it being full).  So I quickly stopped
caring and was only interested in ensuring do crash would occur in the
face of this hypotehtical concurrent access.

> > @@ -2198,6 +2324,22 @@ static int snapshot_iterate_devices(struct dm_target *ti,
> >  	return r;
> >  }
> >  > +static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits)
> > +{
> > +	struct dm_snapshot *snap = ti->private;
> > +
> > +	if (snap->discard_zeroes_cow) {
> > +		struct dm_snapshot *snap_src = NULL, *snap_dest = NULL;
> > +
> 
> I think the following:
> 
> > +		(void) __find_snapshots_sharing_cow(snap, &snap_src, &snap_dest, NULL);
> > +		if (snap_src && snap_dest)
> > +			snap = snap_src;
> > +
> > +		/* All discards are split on chunk_size boundary */
> > +		limits->discard_granularity = snap->store->chunk_size;
> > +		limits->max_discard_sectors = snap->store->chunk_size;
> 
> should be:
> 
> 	down_read(&_origins_lock);
> 
> 	(void) __find_snapshots_sharing_cow(snap, &snap_src, &snap_dest, NULL);
> 	if (snap_src && snap_dest)
> 		snap = snap_src;
> 
> 	/* All discards are split on chunk_size boundary */
> 	limits->discard_granularity = snap->store->chunk_size;
> 	limits->max_discard_sectors = snap->store->chunk_size;
> 
> 	up_read(&_origins_lock);
> 
> Taking _origins_lock protects us against:
> 
> 	* Concurrent modification of the _origins hash table by
> 	  register/unregister_snapshot().
> 	* snapshot_dtr() unregistering snap_src and freeing the relevant
> 	  resources, e.g., snap_src->store.

I agree, will stage incremental fix.

> One final note. The discard features can also be used by the
> snapshot-merge target. But, snapshot_merge_map() is not doing anything
> special about discards. They are treated like normal writes.

Another oversight, but the usecase of async discards coming in after the
merge has started (while very much possible) is not the intended
workflow.
 
> For chunks that reside in the origin this means that the discard will be
> passed down to it using do_origin(). Although this could trigger
> exceptions to other snapshots, I think that for the use case these features
> are trying to solve there is going to be only one active snapshot. Otherwise,
> if we have multiple snapshots, discard_passdown_origin would mean that
> discarding one snapshot could affect/corrupt the rest of the snapshots.

Yes, there should probably be a negative check to ensure there is only 1
active snapshot.  The quirky nature of dm-snap.c's snapshot exception
store handover is what teased out the need for me to even care about
multiple "snapshot" or "snapshot-merge" devices sharing the same cow in
the context of snapshot_io_hints() -- I triggered the need to account
for this in followup testing that did:

(snapshot active with table: "0 2097152 snapshot 253:0 253:2 P 8")
echo "0 2097152 snapshot 253:0 253:2 P 8 2 discard_zeroes_cow discard_passdown_origin" | dmsetup load test-snap
dmsetup suspend test-snap
dmsetup resume test-snap

This reload triggers handover.  That handover caused chunk_size to be 0
thanks to snapshiot_ctr()'s: s->store->chunk_size = 0; which caused
snapshot_io_hints() to incorrectly set the 2 discard limits to 0.

> For the not-yet-merged, remapped chunks the discard will be redirected to
> the COW device, so discard_zeroes_cow is not really zeroing COW in this
> case, it is discarding it.
> 
> Also, if both discard_zeroes_cow and discard_passdown_origin are enabled,
> the relevant chunk, either in origin or in COW, will be discarded twice.

snapshot_merge_map() does need to account for discards, not doing so was
an oversight due to the narrow usecase I was catering to (again, only 1
snapshot and all IO meant to free space having been completed before
merge is started).  I'll fix this up as well.

Thanks again for your careful review Nikos, much appreciated!

Mike

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-07-15 18:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-03 16:25 [PATCH] dm snapshot: add optional discard support features Mike Snitzer
2019-07-05 16:03 ` Nikos Tsironis
2019-07-11 20:36   ` Mike Snitzer
2019-07-11 20:46     ` [PATCH v2] " Mike Snitzer
2019-07-12 13:40       ` Nikos Tsironis
2019-07-15 18:06         ` Mike Snitzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.