[PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems
@ 2014-02-21  2:55 Mike Snitzer
  2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 1/8] dm thin: synchronize the pool mode during suspend Mike Snitzer
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:55 UTC (permalink / raw)
  To: dm-devel

This patchset fixes various thinp issues.  Aside from fixing some bugs
in earlier 3.14-rc1 code (as noted with commit references): the most
significant issue addressed is that until now we didn't have the
awareness or ability to force the user to perform consistency checks
on the thin pool's metadata and data if warranted.

While there is a fair amount of change here I feel pretty strongly
that these fixes should get upstream ASAP.  Posting for review with
the hope that I'll get timely feedback/acks so that I can send to
Linus early next week.

(Alasdair, I went with "needs check" for the flag used in patch 7 even
though you initially disagreed with the name.. it is short and to the
point, and really: it is internal to the thin implementation.
Userspace can spin it however you'd like).

Mike Snitzer (8):
  dm thin: synchronize the pool mode during suspend
  dm thin: set flag when over the metadata low watermark threshold
  dm thin: set flag if metadata is out of space
  dm thin: error out I/O if inappropriate for it to be retried
  dm thin: fix the error path for the thin device constructor
  dm thin: fix pool_preresume resize with heavy IO races
  dm thin: ensure user takes action to validate data and metadata consistency
  dm thin: allow metadata space larger than supported to go unused

 Documentation/device-mapper/cache.txt              |  11 +-
 Documentation/device-mapper/thin-provisioning.txt  |  28 +++
 drivers/md/dm-cache-metadata.c                     |   2 +-
 drivers/md/dm-thin-metadata.c                      | 118 ++++++++++-
 drivers/md/dm-thin-metadata.h                      |  23 ++-
 drivers/md/dm-thin.c                               | 225 ++++++++++++++++-----
 drivers/md/persistent-data/dm-space-map-metadata.c |  19 +-
 .../md/persistent-data/dm-transaction-manager.c    |  13 +-
 .../md/persistent-data/dm-transaction-manager.h    |   2 +-
 9 files changed, 350 insertions(+), 91 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 1/8] dm thin: synchronize the pool mode during suspend
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
@ 2014-02-21  2:55 ` Mike Snitzer
  2014-02-21 13:58   ` Joe Thornber
  2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold Mike Snitzer
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:55 UTC (permalink / raw)
  To: dm-devel

Commit b5330655 ("dm thin: handle metadata failures more consistently")
increased potential for the pool's mode to be changed in response to
metadata operation failures.

When the pool mode is changed it isn't synchronized with the mode in
pool_features stored in the target's context (ti->private) that is used
as the basis for (re)establishing the pool mode during resume via
bind_control_target.

It is important that we synchronize the pool mode when suspending
otherwise the pool may experience and unexpected mode transition on the
next resume (especially if there was no new table load).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-thin.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index d501e43..9facc6f 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -2364,6 +2364,12 @@ static void pool_postsuspend(struct dm_target *ti)
 	cancel_delayed_work(&pool->waker);
 	flush_workqueue(pool->wq);
 	(void) commit(pool);
+
+	/*
+	 * The pool mode may have changed, sync it so bind_control_target()
+	 * doesn't cause an unexpected mode transition on resume.
+	 */
+	pt->adjusted_pf.mode = get_pool_mode(pool);
 }

 static int check_arg_count(unsigned argc, unsigned args_required)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
  2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 1/8] dm thin: synchronize the pool mode during suspend Mike Snitzer
@ 2014-02-21  2:55 ` Mike Snitzer
  2014-02-21 13:56   ` Joe Thornber
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space Mike Snitzer
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:55 UTC (permalink / raw)
  To: dm-devel

The threshold boundary code in persistent-data/dm-space-map-metadata.c
was too racey and resulted in a flood of warnings and events.

Check the 'metadata_low_water_triggered' flag in metadata_low_callback()
before logging a warning and sending an event.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-thin.c                               | 11 ++++++++++-
 drivers/md/persistent-data/dm-space-map-metadata.c | 19 +------------------
 2 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 9facc6f..42d08eb 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -165,6 +165,7 @@ struct pool {
 
 	struct pool_features pf;
 	bool low_water_triggered:1;	/* A dm event has been sent */
+	bool metadata_low_water_triggered:1;
 
 	struct dm_bio_prison *prison;
 	struct dm_kcopyd_client *copier;
@@ -1824,6 +1825,7 @@ static struct pool *pool_create(struct mapped_device *pool_md,
 	INIT_LIST_HEAD(&pool->prepared_mappings);
 	INIT_LIST_HEAD(&pool->prepared_discards);
 	pool->low_water_triggered = false;
+	pool->metadata_low_water_triggered = false;
 	bio_list_init(&pool->retry_on_resume_list);
 
 	pool->shared_read_ds = dm_deferred_set_create();
@@ -1993,10 +1995,16 @@ static int parse_pool_features(struct dm_arg_set *as, struct pool_features *pf,
 static void metadata_low_callback(void *context)
 {
 	struct pool *pool = context;
+	unsigned long flags;
+
+	if (pool->metadata_low_water_triggered)
+		return;
 
 	DMWARN("%s: reached low water mark for metadata device: sending event.",
 	       dm_device_name(pool->pool_md));
-
+	spin_lock_irqsave(&pool->lock, flags);
+	pool->metadata_low_water_triggered = true;
+	spin_unlock_irqrestore(&pool->lock, flags);
 	dm_table_event(pool->ti->table);
 }
 
@@ -2350,6 +2358,7 @@ static void pool_resume(struct dm_target *ti)
 
 	spin_lock_irqsave(&pool->lock, flags);
 	pool->low_water_triggered = false;
+	pool->metadata_low_water_triggered = false;
 	__requeue_bios(pool);
 	spin_unlock_irqrestore(&pool->lock, flags);
 
diff --git a/drivers/md/persistent-data/dm-space-map-metadata.c b/drivers/md/persistent-data/dm-space-map-metadata.c
index 536782e..a0231d3 100644
--- a/drivers/md/persistent-data/dm-space-map-metadata.c
+++ b/drivers/md/persistent-data/dm-space-map-metadata.c
@@ -21,9 +21,7 @@
  */
 struct threshold {
 	bool threshold_set;
-	bool value_set;
 	dm_block_t threshold;
-	dm_block_t current_value;
 	dm_sm_threshold_fn fn;
 	void *context;
 };
@@ -31,7 +29,6 @@ struct threshold {
 static void threshold_init(struct threshold *t)
 {
 	t->threshold_set = false;
-	t->value_set = false;
 }
 
 static void set_threshold(struct threshold *t, dm_block_t value,
@@ -43,24 +40,10 @@ static void set_threshold(struct threshold *t, dm_block_t value,
 	t->context = context;
 }
 
-static bool below_threshold(struct threshold *t, dm_block_t value)
-{
-	return t->threshold_set && value <= t->threshold;
-}
-
-static bool threshold_already_triggered(struct threshold *t)
-{
-	return t->value_set && below_threshold(t, t->current_value);
-}
-
 static void check_threshold(struct threshold *t, dm_block_t value)
 {
-	if (below_threshold(t, value) &&
-	    !threshold_already_triggered(t))
+	if (t->threshold_set && value <= t->threshold)
 		t->fn(t->context);
-
-	t->value_set = true;
-	t->current_value = value;
 }
 
 /*----------------------------------------------------------------*/
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
  2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 1/8] dm thin: synchronize the pool mode during suspend Mike Snitzer
  2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold Mike Snitzer
@ 2014-02-21  2:56 ` Mike Snitzer
  2014-02-21 14:20   ` Joe Thornber
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 4/8] dm thin: error out I/O if inappropriate for it to be retried Mike Snitzer
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:56 UTC (permalink / raw)
  To: dm-devel

It is difficult for the metadata space map to accurately account for its
free space.  Manually establish a flag that reflects this out of space
condition -- until/unless dm_pool_get_free_metadata_block_count() can
return a known out of space value (either 0 or some fixed reserve
threshold we create).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-thin-metadata.c | 33 ++++++++++++++++++++++++++++++++-
 drivers/md/dm-thin-metadata.h |  7 +++++++
 drivers/md/dm-thin.c          |  8 +++-----
 3 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index 3bb4506..21f3998 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -192,6 +192,15 @@ struct dm_pool_metadata {
 	 * operation possible in this state is the closing of the device.
 	 */
 	bool fail_io:1;
+
+	/*
+	 * Set if metadata space has been exhausted.  It is difficult for the
+	 * metadata space map to accurately account for its free space.  So
+	 * until/unless dm_pool_get_free_metadata_block_count can return a known
+	 * out of space value (either 0 or some fixed reserve threshold we create)
+	 * manually establish a flag that reflects this out of space condition.
+	 */
+	bool metadata_out_of_space:1;
 };
 
 struct dm_thin_device {
@@ -815,6 +824,7 @@ struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
 	INIT_LIST_HEAD(&pmd->thin_devices);
 	pmd->read_only = false;
 	pmd->fail_io = false;
+	pmd->metadata_out_of_space = false;
 	pmd->bdev = bdev;
 	pmd->data_block_size = data_block_size;
 
@@ -1603,6 +1613,24 @@ int dm_pool_get_free_metadata_block_count(struct dm_pool_metadata *pmd,
 	return r;
 }
 
+void dm_pool_set_metadata_out_of_space(struct dm_pool_metadata *pmd)
+{
+	down_write(&pmd->root_lock);
+	pmd->metadata_out_of_space = true;
+	up_write(&pmd->root_lock);
+}
+
+bool dm_pool_is_metadata_out_of_space(struct dm_pool_metadata *pmd)
+{
+	bool r;
+
+	down_read(&pmd->root_lock);
+	r = pmd->metadata_out_of_space;
+	up_read(&pmd->root_lock);
+
+	return r;
+}
+
 int dm_pool_get_metadata_dev_size(struct dm_pool_metadata *pmd,
 				  dm_block_t *result)
 {
@@ -1719,8 +1747,11 @@ int dm_pool_resize_metadata_dev(struct dm_pool_metadata *pmd, dm_block_t new_cou
 	int r = -EINVAL;
 
 	down_write(&pmd->root_lock);
-	if (!pmd->fail_io)
+	if (!pmd->fail_io) {
 		r = __resize_space_map(pmd->metadata_sm, new_count);
+		if (!r)
+			pmd->metadata_out_of_space = false;
+	}
 	up_write(&pmd->root_lock);
 
 	return r;
diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h
index dd6088a..cc78fbb 100644
--- a/drivers/md/dm-thin-metadata.h
+++ b/drivers/md/dm-thin-metadata.h
@@ -91,6 +91,11 @@ int dm_pool_commit_metadata(struct dm_pool_metadata *pmd);
 int dm_pool_abort_metadata(struct dm_pool_metadata *pmd);
 
 /*
+ * Set if metadata space has been exhausted.
+ */
+void dm_pool_set_metadata_out_of_space(struct dm_pool_metadata *pmd);
+
+/*
  * Set/get userspace transaction id.
  */
 int dm_pool_set_metadata_transaction_id(struct dm_pool_metadata *pmd,
@@ -185,6 +190,8 @@ int dm_pool_get_data_dev_size(struct dm_pool_metadata *pmd, dm_block_t *result);
 
 int dm_pool_block_is_used(struct dm_pool_metadata *pmd, dm_block_t b, bool *result);
 
+bool dm_pool_is_metadata_out_of_space(struct dm_pool_metadata *pmd);
+
 /*
  * Returns -ENOSPC if the new size is too small and already allocated
  * blocks would be lost.
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 42d08eb..8e68831 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -1464,16 +1464,14 @@ static void out_of_data_space(struct pool *pool)
 
 static void metadata_operation_failed(struct pool *pool, const char *op, int r)
 {
-	dm_block_t free_blocks;
-
 	DMERR_LIMIT("%s: metadata operation '%s' failed: error = %d",
 		    dm_device_name(pool->pool_md), op, r);
 
-	if (r == -ENOSPC &&
-	    !dm_pool_get_free_metadata_block_count(pool->pmd, &free_blocks) &&
-	    !free_blocks)
+	if (r == -ENOSPC) {
+		dm_pool_set_metadata_out_of_space(pool->pmd);
 		DMERR_LIMIT("%s: no free metadata space available.",
 			    dm_device_name(pool->pool_md));
+	}
 
 	set_pool_mode(pool, PM_READ_ONLY);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 4/8] dm thin: error out I/O if inappropriate for it to be retried
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
                   ` (2 preceding siblings ...)
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space Mike Snitzer
@ 2014-02-21  2:56 ` Mike Snitzer
  2014-02-21 14:22   ` Joe Thornber
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 5/8] dm thin: fix the error path for the thin device constructor Mike Snitzer
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:56 UTC (permalink / raw)
  To: dm-devel

If the pool is in fail mode, error_if_no_space is enabled or the
metadata space is exhausted do _not_ allow IO to be retried.  This
change complements commit 8c0f0e8c9f0 ("dm thin: requeue bios to DM core
if no_free_space and in read-only mode").

Also, update Documentation to include information about when the thin
provisioning target commits metadata and how it deals with running out
of space.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 Documentation/device-mapper/cache.txt             | 11 +++++------
 Documentation/device-mapper/thin-provisioning.txt | 23 +++++++++++++++++++++++
 drivers/md/dm-thin.c                              | 14 +++++++++++++-
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/Documentation/device-mapper/cache.txt b/Documentation/device-mapper/cache.txt
index e6b72d3..68c0f51 100644
--- a/Documentation/device-mapper/cache.txt
+++ b/Documentation/device-mapper/cache.txt
@@ -124,12 +124,11 @@ the default being 204800 sectors (or 100MB).
 Updating on-disk metadata
 -------------------------
 
-On-disk metadata is committed every time a REQ_SYNC or REQ_FUA bio is
-written.  If no such requests are made then commits will occur every
-second.  This means the cache behaves like a physical disk that has a
-write cache (the same is true of the thin-provisioning target).  If
-power is lost you may lose some recent writes.  The metadata should
-always be consistent in spite of any crash.
+On-disk metadata is committed every time a FLUSH or FUA bio is written.
+If no such requests are made then commits will occur every second.  This
+means the cache behaves like a physical disk that has a volatile write
+cache.  If power is lost you may lose some recent writes.  The metadata
+should always be consistent in spite of any crash.
 
 The 'dirty' state for a cache block changes far too frequently for us
 to keep updating it on the fly.  So we treat it as a hint.  In normal
diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt
index 8a7a3d4..3989dd6 100644
--- a/Documentation/device-mapper/thin-provisioning.txt
+++ b/Documentation/device-mapper/thin-provisioning.txt
@@ -116,6 +116,29 @@ Resuming a device with a new table itself triggers an event so the
 userspace daemon can use this to detect a situation where a new table
 already exceeds the threshold.
 
+A low water mark for the metadata device is maintained in the kernel and
+will trigger a dm event if free space on the metadata device drops below
+it.
+
+Updating on-disk metadata
+-------------------------
+
+On-disk metadata is committed every time a FLUSH or FUA bio is written.
+If no such requests are made then commits will occur every second.  This
+means the thin-provisioning target behaves like a physical disk that has
+a volatile write cache.  If power is lost you may lose some recent
+writes.  The metadata should always be consistent in spite of any crash.
+
+If data space is exhausted the pool will either error or queue IO
+according to the configuration (see: error_if_no_space).  When metadata
+space is exhausted the pool will error IO, that requires new pool block
+allocation, until the pool's metadata device is resized.  When either the
+data or metadata space is exhausted the current metadata transaction
+must be aborted.  Given that the pool will cache IO whose completion may
+have already been acknowledged to the upper IO layers (e.g. filesystem)
+it is strongly suggested that those layers perform consistency checks
+before the data or metadata space is resized after having been exhausted.
+
 Thin provisioning
 -----------------
 
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 8e68831..bc52b3b 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -989,6 +989,13 @@ static void retry_on_resume(struct bio *bio)
 	spin_unlock_irqrestore(&pool->lock, flags);
 }
 
+static bool should_error_unserviceable_bio(struct pool *pool)
+{
+	return (unlikely(get_pool_mode(pool) == PM_FAIL) ||
+		pool->pf.error_if_no_space ||
+		dm_pool_is_metadata_out_of_space(pool->pmd));
+}
+
 static void handle_unserviceable_bio(struct pool *pool, struct bio *bio)
 {
 	/*
@@ -997,7 +1004,7 @@ static void handle_unserviceable_bio(struct pool *pool, struct bio *bio)
 	 */
 	WARN_ON_ONCE(get_pool_mode(pool) != PM_READ_ONLY);
 
-	if (pool->pf.error_if_no_space)
+	if (should_error_unserviceable_bio(pool))
 		bio_io_error(bio);
 	else
 		retry_on_resume(bio);
@@ -1008,6 +1015,11 @@ static void retry_bios_on_resume(struct pool *pool, struct dm_bio_prison_cell *c
 	struct bio *bio;
 	struct bio_list bios;
 
+	if (should_error_unserviceable_bio(pool)) {
+		cell_error(pool, cell);
+		return;
+	}
+
 	bio_list_init(&bios);
 	cell_release(pool, cell, &bios);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 5/8] dm thin: fix the error path for the thin device constructor
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
                   ` (3 preceding siblings ...)
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 4/8] dm thin: error out I/O if inappropriate for it to be retried Mike Snitzer
@ 2014-02-21  2:56 ` Mike Snitzer
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races Mike Snitzer
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:56 UTC (permalink / raw)
  To: dm-devel

dm_pool_close_thin_device() must be called if dm_set_target_max_io_len()
fails in thin_ctr().  Otherwise __pool_destroy() will fail because the
pool will still have an open thin device:

 device-mapper: thin metadata: attempt to close pmd when 1 device(s) are still open
 device-mapper: thin: __pool_destroy: dm_pool_metadata_close() failed.

Also, must establish error code if failing thin_ctr() because the pool
is in fail_io mode.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/md/dm-thin.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index bc52b3b..7db53d7 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -2920,6 +2920,7 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv)
 
 	if (get_pool_mode(tc->pool) == PM_FAIL) {
 		ti->error = "Couldn't open thin device, Pool is in fail mode";
+		r = -EINVAL;
 		goto bad_thin_open;
 	}
 
@@ -2931,7 +2932,7 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv)
 
 	r = dm_set_target_max_io_len(ti, tc->pool->sectors_per_block);
 	if (r)
-		goto bad_thin_open;
+		goto bad_target_max_io_len;
 
 	ti->num_flush_bios = 1;
 	ti->flush_supported = true;
@@ -2952,6 +2953,8 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv)
 
 	return 0;
 
+bad_target_max_io_len:
+	dm_pool_close_thin_device(tc->td);
 bad_thin_open:
 	__pool_dec(tc->pool);
 bad_pool_lookup:
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
                   ` (4 preceding siblings ...)
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 5/8] dm thin: fix the error path for the thin device constructor Mike Snitzer
@ 2014-02-21  2:56 ` Mike Snitzer
  2014-02-21 14:27   ` Joe Thornber
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 7/8] dm thin: ensure user takes action to validate data and metadata consistency Mike Snitzer
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 8/8] dm thin: allow metadata space larger than supported to go unused Mike Snitzer
  7 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:56 UTC (permalink / raw)
  To: dm-devel

Before, when a pool is being resized, on resume the pool's mode was
being immediately set to PM_WRITE and the process_* methods would be set
to the normal writable operations.  These were established before
pool_resume() was able to actually resize either the metadata or data
device and resulted in the resize failing.  This failure occurred due to
the thin device calling wake_worker() which kicked do_worker() during
pool_preresume() -- which caused calls to the writable process_* methods
to occur during the resize.

Now, the pool is forced to stay in read-only mode if it was already in
read-only mode.  This prevents bouncing the pool's mode and associated
process_* methods and consistently keeps read-only processing in place
until the resize(s) complete.  To achieve this the pool can now be in a
PM_READ_ONLY state but the metadata's is !read_only -- so as to allow
the commit() the follows the resize to take place.  Differentiate
between commit_metadata_superblock() and commit() since different
negative checks apply (but factor out a common __commit that each
calls).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-thin-metadata.c | 44 ++++++++++++++++++++++++
 drivers/md/dm-thin-metadata.h |  3 ++
 drivers/md/dm-thin.c          | 78 +++++++++++++++++++++++++++++++++++--------
 3 files changed, 112 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index 21f3998..bea6db6 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -1631,6 +1631,39 @@ bool dm_pool_is_metadata_out_of_space(struct dm_pool_metadata *pmd)
 	return r;
 }
 
+static int dm_pool_is_data_out_of_space(struct dm_pool_metadata *pmd, bool *result)
+{
+	int r;
+	dm_block_t free_blocks;
+
+	*result = false;
+
+	r = dm_pool_get_free_block_count(pmd, &free_blocks);
+	if (r)
+		return r;
+	if (!free_blocks)
+		*result = true;
+
+	return 0;
+}
+
+int dm_pool_is_out_of_space(struct dm_pool_metadata *pmd, bool *result)
+{
+	int r;
+
+	*result = false;
+
+	r = dm_pool_is_data_out_of_space(pmd, result);
+	if (r)
+		return r;
+	if (*result)
+		return 0;
+
+	*result = dm_pool_is_metadata_out_of_space(pmd);
+
+	return 0;
+}
+
 int dm_pool_get_metadata_dev_size(struct dm_pool_metadata *pmd,
 				  dm_block_t *result)
 {
@@ -1757,6 +1790,17 @@ int dm_pool_resize_metadata_dev(struct dm_pool_metadata *pmd, dm_block_t new_cou
 	return r;
 }
 
+bool dm_pool_metadata_is_read_only(struct dm_pool_metadata *pmd)
+{
+	bool r;
+
+	down_read(&pmd->root_lock);
+	r = pmd->read_only;
+	up_read(&pmd->root_lock);
+
+	return r;
+}
+
 void dm_pool_metadata_read_only(struct dm_pool_metadata *pmd)
 {
 	down_write(&pmd->root_lock);
diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h
index cc78fbb..520dd34 100644
--- a/drivers/md/dm-thin-metadata.h
+++ b/drivers/md/dm-thin-metadata.h
@@ -192,6 +192,8 @@ int dm_pool_block_is_used(struct dm_pool_metadata *pmd, dm_block_t b, bool *resu
 
 bool dm_pool_is_metadata_out_of_space(struct dm_pool_metadata *pmd);
 
+int dm_pool_is_out_of_space(struct dm_pool_metadata *pmd, bool *result);
+
 /*
  * Returns -ENOSPC if the new size is too small and already allocated
  * blocks would be lost.
@@ -203,6 +205,7 @@ int dm_pool_resize_metadata_dev(struct dm_pool_metadata *pmd, dm_block_t new_siz
  * Flicks the underlying block manager into read only mode, so you know
  * that nothing is changing.
  */
+bool dm_pool_metadata_is_read_only(struct dm_pool_metadata *pmd);
 void dm_pool_metadata_read_only(struct dm_pool_metadata *pmd);
 void dm_pool_metadata_read_write(struct dm_pool_metadata *pmd);
 
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 7db53d7..e560416 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -894,22 +894,43 @@ static void schedule_zero(struct thin_c *tc, dm_block_t virt_block,
 	}
 }
 
+static int __commit(struct pool *pool)
+{
+	int r = dm_pool_commit_metadata(pool->pmd);
+	if (r)
+		metadata_operation_failed(pool, "dm_pool_commit_metadata", r);
+
+	return r;
+}
+
+/*
+ * Commit that is confined to the metadata superblock due to pool
+ * being in PM_READ_ONLY mode (read-only process_* functions).
+ * A non-zero return indicates metadata is not writable.
+ */
+static int commit_metadata_superblock(struct pool *pool)
+{
+	WARN_ON_ONCE(get_pool_mode(pool) != PM_READ_ONLY);
+
+	if (dm_pool_metadata_is_read_only(pool->pmd)) {
+		DMERR_LIMIT("%s: commit failed: pool metadata is not writable",
+			    dm_device_name(pool->pool_md));
+		return -EINVAL;
+	}
+
+	return __commit(pool);
+}
+
 /*
  * A non-zero return indicates read_only or fail_io mode.
  * Many callers don't care about the return value.
  */
 static int commit(struct pool *pool)
 {
-	int r;
-
 	if (get_pool_mode(pool) != PM_WRITE)
 		return -EINVAL;
 
-	r = dm_pool_commit_metadata(pool->pmd);
-	if (r)
-		metadata_operation_failed(pool, "dm_pool_commit_metadata", r);
-
-	return r;
+	return __commit(pool);
 }
 
 static void check_low_water_mark(struct pool *pool, dm_block_t free_blocks)
@@ -1440,7 +1461,11 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
 			new_mode = PM_FAIL;
 			set_pool_mode(pool, new_mode);
 		} else {
-			dm_pool_metadata_read_only(pool->pmd);
+			bool out_of_space;
+			if (!dm_pool_is_out_of_space(pool->pmd, &out_of_space) && out_of_space)
+				dm_pool_metadata_read_write(pool->pmd);
+			else
+				dm_pool_metadata_read_only(pool->pmd);
 			pool->process_bio = process_bio_read_only;
 			pool->process_discard = process_discard;
 			pool->process_prepared_mapping = process_prepared_mapping_fail;
@@ -1715,12 +1740,17 @@ static int bind_control_target(struct pool *pool, struct dm_target *ti)
 	/*
 	 * If we were in PM_FAIL mode, rollback of metadata failed.  We're
 	 * not going to recover without a thin_repair.  So we never let the
-	 * pool move out of the old mode.  On the other hand a PM_READ_ONLY
-	 * may have been due to a lack of metadata or data space, and may
-	 * now work (ie. if the underlying devices have been resized).
+	 * pool move out of the old mode.  Similarly, if in PM_READ_ONLY mode
+	 * and the pool doesn't have free space we must not switch modes here;
+	 * pool_preresume() will transition to PM_WRITE mode if a resize succeeds.
 	 */
 	if (old_mode == PM_FAIL)
 		new_mode = old_mode;
+	else if (old_mode == PM_READ_ONLY) {
+		bool out_of_space;
+		if (!dm_pool_is_out_of_space(pool->pmd, &out_of_space) && out_of_space)
+			new_mode = old_mode;
+	}
 
 	set_pool_mode(pool, new_mode);
 
@@ -2338,6 +2368,7 @@ static int pool_preresume(struct dm_target *ti)
 	bool need_commit1, need_commit2;
 	struct pool_c *pt = ti->private;
 	struct pool *pool = pt->pool;
+	bool out_of_space = false;
 
 	/*
 	 * Take control of the pool object.
@@ -2346,6 +2377,12 @@ static int pool_preresume(struct dm_target *ti)
 	if (r)
 		return r;
 
+	if (get_pool_mode(pool) == PM_READ_ONLY) {
+		r = dm_pool_is_out_of_space(pool->pmd, &out_of_space);
+		if (r)
+			metadata_operation_failed(pool, "dm_pool_is_out_of_space", r);
+	}
+
 	r = maybe_resize_data_dev(ti, &need_commit1);
 	if (r)
 		return r;
@@ -2354,8 +2391,23 @@ static int pool_preresume(struct dm_target *ti)
 	if (r)
 		return r;
 
-	if (need_commit1 || need_commit2)
-		(void) commit(pool);
+	if (need_commit1 || need_commit2) {
+		if (out_of_space) {
+			/*
+			 * Must update metadata's superblock despite read-only mode.
+			 */
+			r = commit_metadata_superblock(pool);
+			if (r || pt->requested_pf.mode == PM_READ_ONLY)
+				dm_pool_metadata_read_only(pool->pmd);
+			else {
+				/*
+				 * If resize succeeded transition the pool to write mode.
+				 */
+				set_pool_mode(pool, PM_WRITE);
+			}
+		} else
+			(void) commit(pool);
+	}
 
 	return 0;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 7/8] dm thin: ensure user takes action to validate data and metadata consistency
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
                   ` (5 preceding siblings ...)
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races Mike Snitzer
@ 2014-02-21  2:56 ` Mike Snitzer
  2014-02-21 14:35   ` Joe Thornber
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 8/8] dm thin: allow metadata space larger than supported to go unused Mike Snitzer
  7 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:56 UTC (permalink / raw)
  To: dm-devel

If a thin metadata operation fails the current transaction will abort,
whereby causing potential for IO layers up the stack (e.g. filesystems)
to have data loss.  As such, set THIN_METADATA_NEEDS_CHECK_FLAG in the
thin metadata's superblock which forces the user to:
1) verify the thin metadata is consistent (e.g. use thin_check, etc)
2) verify the thin data is consistent (e.g. use fsck)

The only way to clear the superblock's THIN_METADATA_NEEDS_CHECK_FLAG is
to run thin_repair.

On metadata operation failure: abort current metadata transaction, set
pool in read-only mode, and now set the needs_check flag.

As part of this change, constraints are introduced or relaxed:
* don't allow a pool to transition to write mode if needs_check is set
* don't queue IO if needs_check is set
* don't allow data or metadata space to be resized if needs_check is set
* if a thin pool's metadata space is exhausted: the kernel will now
  force the user to take the pool offline for repair before the kernel
  will allow the metadata space to be extended.
* relax response to data allocation failure due to no data space:
  don't abort the current metadata transaction (this allows previously
  allocated and prepared mappings to be committed).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 Documentation/device-mapper/thin-provisioning.txt | 21 +++---
 drivers/md/dm-thin-metadata.c                     | 20 +++++-
 drivers/md/dm-thin-metadata.h                     |  8 +++
 drivers/md/dm-thin.c                              | 82 +++++++++++++++++------
 4 files changed, 102 insertions(+), 29 deletions(-)

diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt
index 3989dd6..075ca84 100644
--- a/Documentation/device-mapper/thin-provisioning.txt
+++ b/Documentation/device-mapper/thin-provisioning.txt
@@ -130,14 +130,19 @@ a volatile write cache.  If power is lost you may lose some recent
 writes.  The metadata should always be consistent in spite of any crash.
 
 If data space is exhausted the pool will either error or queue IO
-according to the configuration (see: error_if_no_space).  When metadata
-space is exhausted the pool will error IO, that requires new pool block
-allocation, until the pool's metadata device is resized.  When either the
-data or metadata space is exhausted the current metadata transaction
-must be aborted.  Given that the pool will cache IO whose completion may
-have already been acknowledged to the upper IO layers (e.g. filesystem)
-it is strongly suggested that those layers perform consistency checks
-before the data or metadata space is resized after having been exhausted.
+according to the configuration (see: error_if_no_space).  If metadata
+space is exhausted or a metadata operation fails: the pool will error IO
+until the pool is taken offline and repair is performed to 1) fix any
+potential inconsistencies and 2) clear the flag that imposes repair.
+Once the pool's metadata device is repaired it may be resized, which
+will allow the pool to return to normal operation.  Note that a pool's
+data and metadata devices cannot be resized until repair is performed.
+It should also be noted that when the pool's metadata space is exhausted
+the current metadata transaction is aborted.  Given that the pool will
+cache IO whose completion may have already been acknowledged to upper IO
+layers (e.g. filesystem) it is strongly suggested that consistency
+checks (e.g. fsck) be performed on those layers when repair of the pool
+is required.
 
 Thin provisioning
 -----------------
diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index bea6db6..7c2bc26 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -76,7 +76,7 @@
 
 #define THIN_SUPERBLOCK_MAGIC 27022010
 #define THIN_SUPERBLOCK_LOCATION 0
-#define THIN_VERSION 1
+#define THIN_VERSION 2
 #define THIN_METADATA_CACHE_SIZE 64
 #define SECTOR_TO_BLOCK_SHIFT 3
 
@@ -1830,3 +1830,21 @@ int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd,
 
 	return r;
 }
+
+void dm_pool_metadata_set_needs_check(struct dm_pool_metadata *pmd)
+{
+	down_write(&pmd->root_lock);
+	pmd->flags |= THIN_METADATA_NEEDS_CHECK_FLAG;
+	up_write(&pmd->root_lock);
+}
+
+bool dm_pool_metadata_needs_check(struct dm_pool_metadata *pmd)
+{
+	bool needs_check;
+
+	down_read(&pmd->root_lock);
+	needs_check = pmd->flags & THIN_METADATA_NEEDS_CHECK_FLAG;
+	up_read(&pmd->root_lock);
+
+	return needs_check;
+}
diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h
index 520dd34..583ff5d 100644
--- a/drivers/md/dm-thin-metadata.h
+++ b/drivers/md/dm-thin-metadata.h
@@ -27,6 +27,11 @@
 
 /*----------------------------------------------------------------*/
 
+/*
+ * Thin metadata superblock flags.
+ */
+#define THIN_METADATA_NEEDS_CHECK_FLAG		(1 << 0)
+
 struct dm_pool_metadata;
 struct dm_thin_device;
 
@@ -214,6 +219,9 @@ int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd,
 					dm_sm_threshold_fn fn,
 					void *context);
 
+void dm_pool_metadata_set_needs_check(struct dm_pool_metadata *pmd);
+bool dm_pool_metadata_needs_check(struct dm_pool_metadata *pmd);
+
 /*----------------------------------------------------------------*/
 
 #endif
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index e560416..cd52cf2 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -1014,6 +1014,7 @@ static bool should_error_unserviceable_bio(struct pool *pool)
 {
 	return (unlikely(get_pool_mode(pool) == PM_FAIL) ||
 		pool->pf.error_if_no_space ||
+		dm_pool_metadata_needs_check(pool->pmd) ||
 		dm_pool_is_metadata_out_of_space(pool->pmd));
 }
 
@@ -1436,8 +1437,19 @@ static enum pool_mode get_pool_mode(struct pool *pool)
 static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
 {
 	int r;
+	bool out_of_space, needs_check = dm_pool_metadata_needs_check(pool->pmd);
 	enum pool_mode old_mode = pool->pf.mode;
 
+	/*
+	 * Never allow the pool to transition to PM_WRITE mode if user
+	 * intervention is required to verify metadata and data consistency.
+	 */
+	if (new_mode == PM_WRITE && old_mode != new_mode && needs_check) {
+		DMERR("%s: unable to switch pool to write mode until repaired.",
+		      dm_device_name(pool->pool_md));
+		new_mode = old_mode;
+	}
+
 	switch (new_mode) {
 	case PM_FAIL:
 		if (old_mode != new_mode)
@@ -1454,23 +1466,35 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
 		if (old_mode != new_mode)
 			DMERR("%s: switching pool to read-only mode",
 			      dm_device_name(pool->pool_md));
-		r = dm_pool_abort_metadata(pool->pmd);
-		if (r) {
-			DMERR("%s: aborting transaction failed",
-			      dm_device_name(pool->pool_md));
-			new_mode = PM_FAIL;
-			set_pool_mode(pool, new_mode);
-		} else {
-			bool out_of_space;
-			if (!dm_pool_is_out_of_space(pool->pmd, &out_of_space) && out_of_space)
-				dm_pool_metadata_read_write(pool->pmd);
-			else
-				dm_pool_metadata_read_only(pool->pmd);
-			pool->process_bio = process_bio_read_only;
-			pool->process_discard = process_discard;
+		if (needs_check) {
+			r = dm_pool_abort_metadata(pool->pmd);
+			if (r) {
+				DMERR("%s: aborting transaction failed",
+				      dm_device_name(pool->pool_md));
+				new_mode = PM_FAIL;
+				set_pool_mode(pool, new_mode);
+				goto out;
+			}
+		}
+		if (!dm_pool_is_out_of_space(pool->pmd, &out_of_space) && out_of_space)
+			dm_pool_metadata_read_write(pool->pmd);
+		else
+			dm_pool_metadata_read_only(pool->pmd);
+		pool->process_bio = process_bio_read_only;
+		pool->process_discard = process_discard;
+		if (needs_check)
 			pool->process_prepared_mapping = process_prepared_mapping_fail;
-			pool->process_prepared_discard = process_prepared_discard_passdown;
+		else {
+			/*
+			 * Allow previously prepared mappings to complete (important
+			 * for proper handling of case when data space is exhausted).
+			 * If read-only mode was requested no new mappings will be
+			 * created (due to process_bio_read_only) so no risk using
+			 * process_prepared_mapping.
+			 */
+			pool->process_prepared_mapping = process_prepared_mapping;
 		}
+		pool->process_prepared_discard = process_prepared_discard_passdown;
 		break;
 
 	case PM_WRITE:
@@ -1484,7 +1508,7 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
 		pool->process_prepared_discard = process_prepared_discard;
 		break;
 	}
-
+out:
 	pool->pf.mode = new_mode;
 }
 
@@ -1501,15 +1525,21 @@ static void out_of_data_space(struct pool *pool)
 
 static void metadata_operation_failed(struct pool *pool, const char *op, int r)
 {
+	const char *pool_device_name = dm_device_name(pool->pool_md);
+
 	DMERR_LIMIT("%s: metadata operation '%s' failed: error = %d",
-		    dm_device_name(pool->pool_md), op, r);
+		    pool_device_name, op, r);
 
 	if (r == -ENOSPC) {
 		dm_pool_set_metadata_out_of_space(pool->pmd);
 		DMERR_LIMIT("%s: no free metadata space available.",
-			    dm_device_name(pool->pool_md));
+			    pool_device_name);
 	}
 
+	DMWARN_LIMIT("%s: consistency of metadata and data must be verified, please repair.",
+		     pool_device_name);
+	dm_pool_metadata_set_needs_check(pool->pmd);
+
 	set_pool_mode(pool, PM_READ_ONLY);
 }
 
@@ -2295,6 +2325,12 @@ static int maybe_resize_data_dev(struct dm_target *ti, bool *need_commit)
 		return -EINVAL;
 
 	} else if (data_size > sb_data_size) {
+		if (dm_pool_metadata_needs_check(pool->pmd)) {
+			DMERR("%s: unable to grow the data device until repaired.",
+			      dm_device_name(pool->pool_md));
+			return -EINVAL;;
+		}
+
 		if (sb_data_size)
 			DMINFO("%s: growing the data device from %llu to %llu blocks",
 			       dm_device_name(pool->pool_md),
@@ -2336,6 +2372,12 @@ static int maybe_resize_metadata_dev(struct dm_target *ti, bool *need_commit)
 		return -EINVAL;
 
 	} else if (metadata_dev_size > sb_metadata_dev_size) {
+		if (dm_pool_metadata_needs_check(pool->pmd)) {
+			DMERR("%s: unable to grow the metadata device until repaired.",
+			      dm_device_name(pool->pool_md));
+			return -EINVAL;;
+		}
+
 		DMINFO("%s: growing the metadata device from %llu to %llu blocks",
 		       dm_device_name(pool->pool_md),
 		       sb_metadata_dev_size, metadata_dev_size);
@@ -2865,7 +2907,7 @@ static struct target_type pool_target = {
 	.name = "thin-pool",
 	.features = DM_TARGET_SINGLETON | DM_TARGET_ALWAYS_WRITEABLE |
 		    DM_TARGET_IMMUTABLE,
-	.version = {1, 10, 0},
+	.version = {1, 11, 0},
 	.module = THIS_MODULE,
 	.ctr = pool_ctr,
 	.dtr = pool_dtr,
@@ -3155,7 +3197,7 @@ static int thin_iterate_devices(struct dm_target *ti,
 
 static struct target_type thin_target = {
 	.name = "thin",
-	.version = {1, 10, 0},
+	.version = {1, 11, 0},
 	.module	= THIS_MODULE,
 	.ctr = thin_ctr,
 	.dtr = thin_dtr,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-dm-3.14-fixes 8/8] dm thin: allow metadata space larger than supported to go unused
  2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
                   ` (6 preceding siblings ...)
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 7/8] dm thin: ensure user takes action to validate data and metadata consistency Mike Snitzer
@ 2014-02-21  2:56 ` Mike Snitzer
  2014-02-21 14:36   ` Joe Thornber
  7 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21  2:56 UTC (permalink / raw)
  To: dm-devel

It was always intended that a user could provide a thin metadata device
that is larger than the max supported by the on-disk format.  The extra
space would just go unused.

That would only work when extending the metadata device, not on initial
thin-pool creation.  If the user attempted to use a larger metadata
device on creation they would get an error like the following:

 device-mapper: space map common: space map too large
 device-mapper: transaction manager: couldn't create metadata space map
 device-mapper: thin metadata: tm_create_with_sm failed
 device-mapper: table: 252:17: thin-pool: Error creating metadata object
 device-mapper: ioctl: error adding target to table

Fix this by allowing the initial thin-pool creation to cap the size of
the metadata area to be smaller than the size of the metadata device.

Also, the calculation for THIN_METADATA_MAX_SECTORS didn't account for
the sizeof the disk_bitmap_header.  So the supported maximum metadata
size is a bit smaller (reduced from 33423360 to 33292800 sectors).

Lastly, remove the "excess space will not be used" warning message from
get_metadata_dev_size(); it resulted in printing the warning multiple
times.  Factor out warn_if_metadata_device_too_big(), call it from
pool_ctr() and maybe_resize_metadata_dev().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-cache-metadata.c                     |  2 +-
 drivers/md/dm-thin-metadata.c                      | 21 +++++++------
 drivers/md/dm-thin-metadata.h                      |  5 ++--
 drivers/md/dm-thin.c                               | 35 ++++++++++++++--------
 .../md/persistent-data/dm-transaction-manager.c    | 13 ++++----
 .../md/persistent-data/dm-transaction-manager.h    |  2 +-
 6 files changed, 47 insertions(+), 31 deletions(-)

diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c
index 9ef0752..1baf615 100644
--- a/drivers/md/dm-cache-metadata.c
+++ b/drivers/md/dm-cache-metadata.c
@@ -323,7 +323,7 @@ static int __format_metadata(struct dm_cache_metadata *cmd)
 {
 	int r;
 
-	r = dm_tm_create_with_sm(cmd->bm, CACHE_SUPERBLOCK_LOCATION,
+	r = dm_tm_create_with_sm(cmd->bm, CACHE_SUPERBLOCK_LOCATION, 0,
 				 &cmd->tm, &cmd->metadata_sm);
 	if (r < 0) {
 		DMERR("tm_create_with_sm failed");
diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index 7c2bc26..3780650 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -503,11 +503,11 @@ bad_locked:
 	return r;
 }
 
-static int __format_metadata(struct dm_pool_metadata *pmd)
+static int __format_metadata(struct dm_pool_metadata *pmd, dm_block_t nr_blocks)
 {
 	int r;
 
-	r = dm_tm_create_with_sm(pmd->bm, THIN_SUPERBLOCK_LOCATION,
+	r = dm_tm_create_with_sm(pmd->bm, THIN_SUPERBLOCK_LOCATION, nr_blocks,
 				 &pmd->tm, &pmd->metadata_sm);
 	if (r < 0) {
 		DMERR("tm_create_with_sm failed");
@@ -642,7 +642,8 @@ bad_unlock_sblock:
 	return r;
 }
 
-static int __open_or_format_metadata(struct dm_pool_metadata *pmd, bool format_device)
+static int __open_or_format_metadata(struct dm_pool_metadata *pmd, bool format_device,
+				     dm_block_t nr_blocks)
 {
 	int r, unformatted;
 
@@ -651,12 +652,13 @@ static int __open_or_format_metadata(struct dm_pool_metadata *pmd, bool format_d
 		return r;
 
 	if (unformatted)
-		return format_device ? __format_metadata(pmd) : -EPERM;
+		return format_device ? __format_metadata(pmd, nr_blocks) : -EPERM;
 
 	return __open_metadata(pmd);
 }
 
-static int __create_persistent_data_objects(struct dm_pool_metadata *pmd, bool format_device)
+static int __create_persistent_data_objects(struct dm_pool_metadata *pmd, bool format_device,
+					    dm_block_t nr_blocks)
 {
 	int r;
 
@@ -668,7 +670,7 @@ static int __create_persistent_data_objects(struct dm_pool_metadata *pmd, bool f
 		return PTR_ERR(pmd->bm);
 	}
 
-	r = __open_or_format_metadata(pmd, format_device);
+	r = __open_or_format_metadata(pmd, format_device, nr_blocks);
 	if (r)
 		dm_block_manager_destroy(pmd->bm);
 
@@ -808,7 +810,8 @@ out_locked:
 
 struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
 					       sector_t data_block_size,
-					       bool format_device)
+					       bool format_device,
+					       dm_block_t nr_blocks)
 {
 	int r;
 	struct dm_pool_metadata *pmd;
@@ -828,7 +831,7 @@ struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
 	pmd->bdev = bdev;
 	pmd->data_block_size = data_block_size;
 
-	r = __create_persistent_data_objects(pmd, format_device);
+	r = __create_persistent_data_objects(pmd, format_device, nr_blocks);
 	if (r) {
 		kfree(pmd);
 		return ERR_PTR(r);
@@ -1578,7 +1581,7 @@ int dm_pool_abort_metadata(struct dm_pool_metadata *pmd)
 
 	__set_abort_with_changes_flags(pmd);
 	__destroy_persistent_data_objects(pmd);
-	r = __create_persistent_data_objects(pmd, false);
+	r = __create_persistent_data_objects(pmd, false, 0);
 	if (r)
 		pmd->fail_io = true;
 
diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h
index 583ff5d..e24169c 100644
--- a/drivers/md/dm-thin-metadata.h
+++ b/drivers/md/dm-thin-metadata.h
@@ -18,7 +18,7 @@
  * We have one block of index, which can hold 255 index entries.  Each
  * index entry contains allocation info about 16k metadata blocks.
  */
-#define THIN_METADATA_MAX_SECTORS (255 * (1 << 14) * (THIN_METADATA_BLOCK_SIZE / (1 << SECTOR_SHIFT)))
+#define THIN_METADATA_MAX_SECTORS (255 * ((1 << 14) - 64) * (THIN_METADATA_BLOCK_SIZE / (1 << SECTOR_SHIFT)))
 
 /*
  * A metadata device larger than 16GB triggers a warning.
@@ -45,7 +45,8 @@ typedef uint64_t dm_thin_id;
  */
 struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
 					       sector_t data_block_size,
-					       bool format_device);
+					       bool format_device,
+					       dm_block_t nr_blocks);
 
 int dm_pool_metadata_close(struct dm_pool_metadata *pmd);
 
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index cd52cf2..c4cee29 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -1829,6 +1829,8 @@ static void __pool_destroy(struct pool *pool)
 
 static struct kmem_cache *_new_mapping_cache;
 
+static dm_block_t get_metadata_dev_size_in_blocks(struct block_device *);
+
 static struct pool *pool_create(struct mapped_device *pool_md,
 				struct block_device *metadata_dev,
 				unsigned long block_size,
@@ -1839,8 +1841,10 @@ static struct pool *pool_create(struct mapped_device *pool_md,
 	struct pool *pool;
 	struct dm_pool_metadata *pmd;
 	bool format_device = read_only ? false : true;
+	dm_block_t nr_blocks = get_metadata_dev_size_in_blocks(metadata_dev);
 
-	pmd = dm_pool_metadata_open(metadata_dev, block_size, format_device);
+	pmd = dm_pool_metadata_open(metadata_dev, block_size, format_device,
+				    nr_blocks);
 	if (IS_ERR(pmd)) {
 		*error = "Error creating metadata object";
 		return (struct pool *)pmd;
@@ -2078,16 +2082,27 @@ static void metadata_low_callback(void *context)
 	dm_table_event(pool->ti->table);
 }
 
-static sector_t get_metadata_dev_size(struct block_device *bdev)
+static sector_t get_dev_size(struct block_device *bdev)
 {
-	sector_t metadata_dev_size = i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
+	return i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
+}
+
+static void warn_if_metadata_device_too_big(struct block_device *bdev)
+{
+	sector_t metadata_dev_size = get_dev_size(bdev);
 	char buffer[BDEVNAME_SIZE];
 
-	if (metadata_dev_size > THIN_METADATA_MAX_SECTORS_WARNING) {
+	if (metadata_dev_size > THIN_METADATA_MAX_SECTORS_WARNING)
 		DMWARN("Metadata device %s is larger than %u sectors: excess space will not be used.",
 		       bdevname(bdev, buffer), THIN_METADATA_MAX_SECTORS);
-		metadata_dev_size = THIN_METADATA_MAX_SECTORS_WARNING;
-	}
+}
+
+static sector_t get_metadata_dev_size(struct block_device *bdev)
+{
+	sector_t metadata_dev_size = get_dev_size(bdev);
+
+	if (metadata_dev_size > THIN_METADATA_MAX_SECTORS)
+		metadata_dev_size = THIN_METADATA_MAX_SECTORS;
 
 	return metadata_dev_size;
 }
@@ -2174,12 +2189,7 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
 		ti->error = "Error opening metadata block device";
 		goto out_unlock;
 	}
-
-	/*
-	 * Run for the side-effect of possibly issuing a warning if the
-	 * device is too big.
-	 */
-	(void) get_metadata_dev_size(metadata_dev->bdev);
+	warn_if_metadata_device_too_big(metadata_dev->bdev);
 
 	r = dm_get_device(ti, argv[1], FMODE_READ | FMODE_WRITE, &data_dev);
 	if (r) {
@@ -2378,6 +2388,7 @@ static int maybe_resize_metadata_dev(struct dm_target *ti, bool *need_commit)
 			return -EINVAL;;
 		}
 
+		warn_if_metadata_device_too_big(pool->md_dev);
 		DMINFO("%s: growing the metadata device from %llu to %llu blocks",
 		       dm_device_name(pool->pool_md),
 		       sb_metadata_dev_size, metadata_dev_size);
diff --git a/drivers/md/persistent-data/dm-transaction-manager.c b/drivers/md/persistent-data/dm-transaction-manager.c
index 81da1a2..495277e 100644
--- a/drivers/md/persistent-data/dm-transaction-manager.c
+++ b/drivers/md/persistent-data/dm-transaction-manager.c
@@ -322,7 +322,7 @@ static int dm_tm_create_internal(struct dm_block_manager *bm,
 				 dm_block_t sb_location,
 				 struct dm_transaction_manager **tm,
 				 struct dm_space_map **sm,
-				 int create,
+				 int create, dm_block_t nr_blocks,
 				 void *sm_root, size_t sm_len)
 {
 	int r;
@@ -338,8 +338,9 @@ static int dm_tm_create_internal(struct dm_block_manager *bm,
 	}
 
 	if (create) {
-		r = dm_sm_metadata_create(*sm, *tm, dm_bm_nr_blocks(bm),
-					  sb_location);
+		if (!nr_blocks)
+			nr_blocks = dm_bm_nr_blocks(bm);
+		r = dm_sm_metadata_create(*sm, *tm, nr_blocks, sb_location);
 		if (r) {
 			DMERR("couldn't create metadata space map");
 			goto bad;
@@ -362,10 +363,10 @@ bad:
 }
 
 int dm_tm_create_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
-			 struct dm_transaction_manager **tm,
+			 dm_block_t nr_blocks, struct dm_transaction_manager **tm,
 			 struct dm_space_map **sm)
 {
-	return dm_tm_create_internal(bm, sb_location, tm, sm, 1, NULL, 0);
+	return dm_tm_create_internal(bm, sb_location, tm, sm, 1, nr_blocks, NULL, 0);
 }
 EXPORT_SYMBOL_GPL(dm_tm_create_with_sm);
 
@@ -374,7 +375,7 @@ int dm_tm_open_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
 		       struct dm_transaction_manager **tm,
 		       struct dm_space_map **sm)
 {
-	return dm_tm_create_internal(bm, sb_location, tm, sm, 0, sm_root, root_len);
+	return dm_tm_create_internal(bm, sb_location, tm, sm, 0, 0, sm_root, root_len);
 }
 EXPORT_SYMBOL_GPL(dm_tm_open_with_sm);
 
diff --git a/drivers/md/persistent-data/dm-transaction-manager.h b/drivers/md/persistent-data/dm-transaction-manager.h
index b5b1390..a8403f0 100644
--- a/drivers/md/persistent-data/dm-transaction-manager.h
+++ b/drivers/md/persistent-data/dm-transaction-manager.h
@@ -120,7 +120,7 @@ struct dm_block_manager *dm_tm_get_bm(struct dm_transaction_manager *tm);
  * shouldn't be used.
  */
 int dm_tm_create_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
-			 struct dm_transaction_manager **tm,
+			 dm_block_t nr_blocks, struct dm_transaction_manager **tm,
 			 struct dm_space_map **sm);
 
 int dm_tm_open_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold
  2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold Mike Snitzer
@ 2014-02-21 13:56   ` Joe Thornber
  2014-02-21 14:05     ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 13:56 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

NACK

On Thu, Feb 20, 2014 at 09:55:59PM -0500, Mike Snitzer wrote:
> The threshold boundary code in persistent-data/dm-space-map-metadata.c
> was too racey and resulted in a flood of warnings and events.

This code never runs concurrently, so I don't see how it can be racey.
Scanning the code it seems to work as advertised; it is an edge
triggered threshold, activating every time the free space crosses the
threshold from high to low.

- You haven't explained what your changes within
  dm-space-map-metadata.c do.  Is the threshold no longer edge
  triggered?

- With your changes, how is the threshold reset?  For instance:

  i) we cross the lwm and issue an event

  ii) the admin deletes a ton of old snapshots to free up some
      metadata space, taking us well above the lwm.

  iii) 3 months later we cross the lwm again.  Will the admin be notified?

I suspect the real issue is when the free space is near the threshold
it 'bobbles', crossing back and forth and triggering often (not a
race).  I suggest the correct way to fix this is to change the
threshold code to introduce some hysteresis.  For example by saying it
can't trigger unless free space has crossed another higher threshold
first (lwm + FUDGE_FACTOR).  Fixes for this should remain within
dm-space-map-metadata.c.

- Joe





> 
> Check the 'metadata_low_water_triggered' flag in metadata_low_callback()
> before logging a warning and sending an event.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-thin.c                               | 11 ++++++++++-
>  drivers/md/persistent-data/dm-space-map-metadata.c | 19 +------------------
>  2 files changed, 11 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index 9facc6f..42d08eb 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -165,6 +165,7 @@ struct pool {
>  
>  	struct pool_features pf;
>  	bool low_water_triggered:1;	/* A dm event has been sent */
> +	bool metadata_low_water_triggered:1;
>  
>  	struct dm_bio_prison *prison;
>  	struct dm_kcopyd_client *copier;
> @@ -1824,6 +1825,7 @@ static struct pool *pool_create(struct mapped_device *pool_md,
>  	INIT_LIST_HEAD(&pool->prepared_mappings);
>  	INIT_LIST_HEAD(&pool->prepared_discards);
>  	pool->low_water_triggered = false;
> +	pool->metadata_low_water_triggered = false;
>  	bio_list_init(&pool->retry_on_resume_list);
>  
>  	pool->shared_read_ds = dm_deferred_set_create();
> @@ -1993,10 +1995,16 @@ static int parse_pool_features(struct dm_arg_set *as, struct pool_features *pf,
>  static void metadata_low_callback(void *context)
>  {
>  	struct pool *pool = context;
> +	unsigned long flags;
> +
> +	if (pool->metadata_low_water_triggered)
> +		return;
>  
>  	DMWARN("%s: reached low water mark for metadata device: sending event.",
>  	       dm_device_name(pool->pool_md));
> -
> +	spin_lock_irqsave(&pool->lock, flags);
> +	pool->metadata_low_water_triggered = true;
> +	spin_unlock_irqrestore(&pool->lock, flags);
>  	dm_table_event(pool->ti->table);
>  }
>  
> @@ -2350,6 +2358,7 @@ static void pool_resume(struct dm_target *ti)
>  
>  	spin_lock_irqsave(&pool->lock, flags);
>  	pool->low_water_triggered = false;
> +	pool->metadata_low_water_triggered = false;
>  	__requeue_bios(pool);
>  	spin_unlock_irqrestore(&pool->lock, flags);
>  
> diff --git a/drivers/md/persistent-data/dm-space-map-metadata.c b/drivers/md/persistent-data/dm-space-map-metadata.c
> index 536782e..a0231d3 100644
> --- a/drivers/md/persistent-data/dm-space-map-metadata.c
> +++ b/drivers/md/persistent-data/dm-space-map-metadata.c
> @@ -21,9 +21,7 @@
>   */
>  struct threshold {
>  	bool threshold_set;
> -	bool value_set;
>  	dm_block_t threshold;
> -	dm_block_t current_value;
>  	dm_sm_threshold_fn fn;
>  	void *context;
>  };
> @@ -31,7 +29,6 @@ struct threshold {
>  static void threshold_init(struct threshold *t)
>  {
>  	t->threshold_set = false;
> -	t->value_set = false;
>  }
>  
>  static void set_threshold(struct threshold *t, dm_block_t value,
> @@ -43,24 +40,10 @@ static void set_threshold(struct threshold *t, dm_block_t value,
>  	t->context = context;
>  }
>  
> -static bool below_threshold(struct threshold *t, dm_block_t value)
> -{
> -	return t->threshold_set && value <= t->threshold;
> -}
> -
> -static bool threshold_already_triggered(struct threshold *t)
> -{
> -	return t->value_set && below_threshold(t, t->current_value);
> -}
> -
>  static void check_threshold(struct threshold *t, dm_block_t value)
>  {
> -	if (below_threshold(t, value) &&
> -	    !threshold_already_triggered(t))
> +	if (t->threshold_set && value <= t->threshold)
>  		t->fn(t->context);
> -
> -	t->value_set = true;
> -	t->current_value = value;
>  }
>  
>  /*----------------------------------------------------------------*/
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 1/8] dm thin: synchronize the pool mode during suspend
  2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 1/8] dm thin: synchronize the pool mode during suspend Mike Snitzer
@ 2014-02-21 13:58   ` Joe Thornber
  0 siblings, 0 replies; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 13:58 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

ACK

On Thu, Feb 20, 2014 at 09:55:58PM -0500, Mike Snitzer wrote:
> Commit b5330655 ("dm thin: handle metadata failures more consistently")
> increased potential for the pool's mode to be changed in response to
> metadata operation failures.
> 
> When the pool mode is changed it isn't synchronized with the mode in
> pool_features stored in the target's context (ti->private) that is used
> as the basis for (re)establishing the pool mode during resume via
> bind_control_target.
> 
> It is important that we synchronize the pool mode when suspending
> otherwise the pool may experience and unexpected mode transition on the
> next resume (especially if there was no new table load).
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-thin.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index d501e43..9facc6f 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -2364,6 +2364,12 @@ static void pool_postsuspend(struct dm_target *ti)
>  	cancel_delayed_work(&pool->waker);
>  	flush_workqueue(pool->wq);
>  	(void) commit(pool);
> +
> +	/*
> +	 * The pool mode may have changed, sync it so bind_control_target()
> +	 * doesn't cause an unexpected mode transition on resume.
> +	 */
> +	pt->adjusted_pf.mode = get_pool_mode(pool);
>  }
>  
>  static int check_arg_count(unsigned argc, unsigned args_required)
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold
  2014-02-21 13:56   ` Joe Thornber
@ 2014-02-21 14:05     ` Mike Snitzer
  2014-02-21 14:10       ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21 14:05 UTC (permalink / raw)
  To: dm-devel

On Fri, Feb 21 2014 at  8:56am -0500,
Joe Thornber <thornber@redhat.com> wrote:

> NACK

You can NACK all you want.  Old code doesn't work.  New code does.

> On Thu, Feb 20, 2014 at 09:55:59PM -0500, Mike Snitzer wrote:
> > The threshold boundary code in persistent-data/dm-space-map-metadata.c
> > was too racey and resulted in a flood of warnings and events.
> 
> This code never runs concurrently, so I don't see how it can be racey.
> Scanning the code it seems to work as advertised; it is an edge
> triggered threshold, activating every time the free space crosses the
> threshold from high to low.
> 
> - You haven't explained what your changes within
>   dm-space-map-metadata.c do.  Is the threshold no longer edge
>   triggered?

It is identical to data device's low water mark.

> - With your changes, how is the threshold reset?  For instance:
> 
>   i) we cross the lwm and issue an event
> 
>   ii) the admin deletes a ton of old snapshots to free up some
>       metadata space, taking us well above the lwm.
> 
>   iii) 3 months later we cross the lwm again.  Will the admin be notified?

It resets the same as the data low water mark, see change in
pool_resume.  So any limitation with this approach applies to data low
water mark too.

> I suspect the real issue is when the free space is near the threshold
> it 'bobbles', crossing back and forth and triggering often (not a
> race).  I suggest the correct way to fix this is to change the
> threshold code to introduce some hysteresis.  For example by saying it
> can't trigger unless free space has crossed another higher threshold
> first (lwm + FUDGE_FACTOR).  Fixes for this should remain within
> dm-space-map-metadata.c.

Yeah, I added tracing and a FUDGE_FACTOR could help.  But in the end I
saw no benefit to having 2 different mechanisms for low water mark.  I
went with the approach that has been proven with data's low water mark.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold
  2014-02-21 14:05     ` Mike Snitzer
@ 2014-02-21 14:10       ` Mike Snitzer
  0 siblings, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21 14:10 UTC (permalink / raw)
  To: dm-devel

On Fri, Feb 21 2014 at  9:05am -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Fri, Feb 21 2014 at  8:56am -0500,
> Joe Thornber <thornber@redhat.com> wrote:
> 
> > NACK
> 
> You can NACK all you want.  Old code doesn't work.  New code does.
> 
> > On Thu, Feb 20, 2014 at 09:55:59PM -0500, Mike Snitzer wrote:
> > > The threshold boundary code in persistent-data/dm-space-map-metadata.c
> > > was too racey and resulted in a flood of warnings and events.
> > 
> > This code never runs concurrently, so I don't see how it can be racey.
> > Scanning the code it seems to work as advertised; it is an edge
> > triggered threshold, activating every time the free space crosses the
> > threshold from high to low.
> > 
> > - You haven't explained what your changes within
> >   dm-space-map-metadata.c do.  Is the threshold no longer edge
> >   triggered?
> 
> It is identical to data device's low water mark.

To be clear, it is still edge triggered from the metadata sm -- it makes
the same upcall to t->fn(t->context).

Only difference dm-thin.c:metadata_low_callback controls the throttle on
the event and warning.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space Mike Snitzer
@ 2014-02-21 14:20   ` Joe Thornber
  2014-02-21 14:35     ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 14:20 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

NACK.

It's odd that the interface has dm-thin telling dm-thin-metadata
whether or not it's out of space (yet thin-metadata clears it
itself).  So I don't like the patch from that point of view, though
it's a minor quibble.

More importantly, running out of metadata space is a serious error
that causes a transaction to be aborted.  Any thins that contained
metadata changes in that aborted transaction need to be umounted and
fscked.  We should not transistion back to WRITE mode just because
more metadata space is provided since this doesn't force the sys admin
to repare/check the thins that have lost io.

So I'm nacking because I think you're planning to special case running
out of metadata space, rather than treating it the same as any other
metadata error (eg, a failed write to the metadata device).

As we've discussed the correct thing to do when getting any error from
a metadata operation is:

  i) Abort the current transaction

  ii) flick to READ_ONLY mode.  Any thin devices that lost writes due
      to the abort will reflect this by erroring every subsequent
      REQ_FUA or REQ_FLUSH.

  iii) Set a needs_check flag in the metadata to prevent a table
       reload transitioning to WRITE mode.

- Joe

On Thu, Feb 20, 2014 at 09:56:00PM -0500, Mike Snitzer wrote:
> It is difficult for the metadata space map to accurately account for its
> free space.  Manually establish a flag that reflects this out of space
> condition -- until/unless dm_pool_get_free_metadata_block_count() can
> return a known out of space value (either 0 or some fixed reserve
> threshold we create).
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-thin-metadata.c | 33 ++++++++++++++++++++++++++++++++-
>  drivers/md/dm-thin-metadata.h |  7 +++++++
>  drivers/md/dm-thin.c          |  8 +++-----
>  3 files changed, 42 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
> index 3bb4506..21f3998 100644
> --- a/drivers/md/dm-thin-metadata.c
> +++ b/drivers/md/dm-thin-metadata.c
> @@ -192,6 +192,15 @@ struct dm_pool_metadata {
>  	 * operation possible in this state is the closing of the device.
>  	 */
>  	bool fail_io:1;
> +
> +	/*
> +	 * Set if metadata space has been exhausted.  It is difficult for the
> +	 * metadata space map to accurately account for its free space.  So
> +	 * until/unless dm_pool_get_free_metadata_block_count can return a known
> +	 * out of space value (either 0 or some fixed reserve threshold we create)
> +	 * manually establish a flag that reflects this out of space condition.
> +	 */
> +	bool metadata_out_of_space:1;
>  };
>  
>  struct dm_thin_device {
> @@ -815,6 +824,7 @@ struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
>  	INIT_LIST_HEAD(&pmd->thin_devices);
>  	pmd->read_only = false;
>  	pmd->fail_io = false;
> +	pmd->metadata_out_of_space = false;
>  	pmd->bdev = bdev;
>  	pmd->data_block_size = data_block_size;
>  
> @@ -1603,6 +1613,24 @@ int dm_pool_get_free_metadata_block_count(struct dm_pool_metadata *pmd,
>  	return r;
>  }
>  
> +void dm_pool_set_metadata_out_of_space(struct dm_pool_metadata *pmd)
> +{
> +	down_write(&pmd->root_lock);
> +	pmd->metadata_out_of_space = true;
> +	up_write(&pmd->root_lock);
> +}
> +
> +bool dm_pool_is_metadata_out_of_space(struct dm_pool_metadata *pmd)
> +{
> +	bool r;
> +
> +	down_read(&pmd->root_lock);
> +	r = pmd->metadata_out_of_space;
> +	up_read(&pmd->root_lock);
> +
> +	return r;
> +}
> +
>  int dm_pool_get_metadata_dev_size(struct dm_pool_metadata *pmd,
>  				  dm_block_t *result)
>  {
> @@ -1719,8 +1747,11 @@ int dm_pool_resize_metadata_dev(struct dm_pool_metadata *pmd, dm_block_t new_cou
>  	int r = -EINVAL;
>  
>  	down_write(&pmd->root_lock);
> -	if (!pmd->fail_io)
> +	if (!pmd->fail_io) {
>  		r = __resize_space_map(pmd->metadata_sm, new_count);
> +		if (!r)
> +			pmd->metadata_out_of_space = false;
> +	}
>  	up_write(&pmd->root_lock);
>  
>  	return r;
> diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h
> index dd6088a..cc78fbb 100644
> --- a/drivers/md/dm-thin-metadata.h
> +++ b/drivers/md/dm-thin-metadata.h
> @@ -91,6 +91,11 @@ int dm_pool_commit_metadata(struct dm_pool_metadata *pmd);
>  int dm_pool_abort_metadata(struct dm_pool_metadata *pmd);
>  
>  /*
> + * Set if metadata space has been exhausted.
> + */
> +void dm_pool_set_metadata_out_of_space(struct dm_pool_metadata *pmd);
> +
> +/*
>   * Set/get userspace transaction id.
>   */
>  int dm_pool_set_metadata_transaction_id(struct dm_pool_metadata *pmd,
> @@ -185,6 +190,8 @@ int dm_pool_get_data_dev_size(struct dm_pool_metadata *pmd, dm_block_t *result);
>  
>  int dm_pool_block_is_used(struct dm_pool_metadata *pmd, dm_block_t b, bool *result);
>  
> +bool dm_pool_is_metadata_out_of_space(struct dm_pool_metadata *pmd);
> +
>  /*
>   * Returns -ENOSPC if the new size is too small and already allocated
>   * blocks would be lost.
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index 42d08eb..8e68831 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -1464,16 +1464,14 @@ static void out_of_data_space(struct pool *pool)
>  
>  static void metadata_operation_failed(struct pool *pool, const char *op, int r)
>  {
> -	dm_block_t free_blocks;
> -
>  	DMERR_LIMIT("%s: metadata operation '%s' failed: error = %d",
>  		    dm_device_name(pool->pool_md), op, r);
>  
> -	if (r == -ENOSPC &&
> -	    !dm_pool_get_free_metadata_block_count(pool->pmd, &free_blocks) &&
> -	    !free_blocks)
> +	if (r == -ENOSPC) {
> +		dm_pool_set_metadata_out_of_space(pool->pmd);
>  		DMERR_LIMIT("%s: no free metadata space available.",
>  			    dm_device_name(pool->pool_md));
> +	}
>  
>  	set_pool_mode(pool, PM_READ_ONLY);
>  }
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 4/8] dm thin: error out I/O if inappropriate for it to be retried
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 4/8] dm thin: error out I/O if inappropriate for it to be retried Mike Snitzer
@ 2014-02-21 14:22   ` Joe Thornber
  2014-02-21 14:48     ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 14:22 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

NACK (but close).

On Thu, Feb 20, 2014 at 09:56:01PM -0500, Mike Snitzer wrote:
> If the pool is in fail mode, error_if_no_space is enabled or the
> metadata space is exhausted do _not_ allow IO to be retried.

This should be:

"If the pool is in fail mode, error_if_no_space is enabled or a
transaction has been aborted, do _not_ allow IO to be retried."

(In practise 'transaction has been aborted' is the same as the
needs_check flag we've discussed.)

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races Mike Snitzer
@ 2014-02-21 14:27   ` Joe Thornber
  2014-02-21 14:37     ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 14:27 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

NACK.

On Thu, Feb 20, 2014 at 09:56:03PM -0500, Mike Snitzer wrote:
> Before, when a pool is being resized, on resume the pool's mode was
> being immediately set to PM_WRITE and the process_* methods would be set
> to the normal writable operations.  These were established before
> pool_resume() was able to actually resize either the metadata or data
> device and resulted in the resize failing.  This failure occurred due to
> the thin device calling wake_worker() which kicked do_worker() during
> pool_preresume() -- which caused calls to the writable process_* methods
> to occur during the resize.

The bug is the worker should not be able to be woken until
pool_preresume has completed.  A simple check in wake_worker() should
fix this.

> Now, the pool is forced to stay in read-only mode if it was already in
> read-only mode.  This prevents bouncing the pool's mode and associated
> process_* methods and consistently keeps read-only processing in place
> until the resize(s) complete.  To achieve this the pool can now be in a
> PM_READ_ONLY state but the metadata's is !read_only -- so as to allow
> the commit() the follows the resize to take place.  Differentiate
> between commit_metadata_superblock() and commit() since different
> negative checks apply (but factor out a common __commit that each
> calls).

Too complicated.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 7/8] dm thin: ensure user takes action to validate data and metadata consistency
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 7/8] dm thin: ensure user takes action to validate data and metadata consistency Mike Snitzer
@ 2014-02-21 14:35   ` Joe Thornber
  0 siblings, 0 replies; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 14:35 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

ACK.  But we need a lot of test cases for dmtest before this can go
upstream.

On Thu, Feb 20, 2014 at 09:56:04PM -0500, Mike Snitzer wrote:
> If a thin metadata operation fails the current transaction will abort,
> whereby causing potential for IO layers up the stack (e.g. filesystems)
> to have data loss.  As such, set THIN_METADATA_NEEDS_CHECK_FLAG in the
> thin metadata's superblock which forces the user to:
> 1) verify the thin metadata is consistent (e.g. use thin_check, etc)
> 2) verify the thin data is consistent (e.g. use fsck)
> 
> The only way to clear the superblock's THIN_METADATA_NEEDS_CHECK_FLAG is
> to run thin_repair.
> 
> On metadata operation failure: abort current metadata transaction, set
> pool in read-only mode, and now set the needs_check flag.
> 
> As part of this change, constraints are introduced or relaxed:
> * don't allow a pool to transition to write mode if needs_check is set
> * don't queue IO if needs_check is set
> * don't allow data or metadata space to be resized if needs_check is set
> * if a thin pool's metadata space is exhausted: the kernel will now
>   force the user to take the pool offline for repair before the kernel
>   will allow the metadata space to be extended.
> * relax response to data allocation failure due to no data space:
>   don't abort the current metadata transaction (this allows previously
>   allocated and prepared mappings to be committed).
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  Documentation/device-mapper/thin-provisioning.txt | 21 +++---
>  drivers/md/dm-thin-metadata.c                     | 20 +++++-
>  drivers/md/dm-thin-metadata.h                     |  8 +++
>  drivers/md/dm-thin.c                              | 82 +++++++++++++++++------
>  4 files changed, 102 insertions(+), 29 deletions(-)
> 
> diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt
> index 3989dd6..075ca84 100644
> --- a/Documentation/device-mapper/thin-provisioning.txt
> +++ b/Documentation/device-mapper/thin-provisioning.txt
> @@ -130,14 +130,19 @@ a volatile write cache.  If power is lost you may lose some recent
>  writes.  The metadata should always be consistent in spite of any crash.
>  
>  If data space is exhausted the pool will either error or queue IO
> -according to the configuration (see: error_if_no_space).  When metadata
> -space is exhausted the pool will error IO, that requires new pool block
> -allocation, until the pool's metadata device is resized.  When either the
> -data or metadata space is exhausted the current metadata transaction
> -must be aborted.  Given that the pool will cache IO whose completion may
> -have already been acknowledged to the upper IO layers (e.g. filesystem)
> -it is strongly suggested that those layers perform consistency checks
> -before the data or metadata space is resized after having been exhausted.
> +according to the configuration (see: error_if_no_space).  If metadata
> +space is exhausted or a metadata operation fails: the pool will error IO
> +until the pool is taken offline and repair is performed to 1) fix any
> +potential inconsistencies and 2) clear the flag that imposes repair.
> +Once the pool's metadata device is repaired it may be resized, which
> +will allow the pool to return to normal operation.  Note that a pool's
> +data and metadata devices cannot be resized until repair is performed.
> +It should also be noted that when the pool's metadata space is exhausted
> +the current metadata transaction is aborted.  Given that the pool will
> +cache IO whose completion may have already been acknowledged to upper IO
> +layers (e.g. filesystem) it is strongly suggested that consistency
> +checks (e.g. fsck) be performed on those layers when repair of the pool
> +is required.
>  
>  Thin provisioning
>  -----------------
> diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
> index bea6db6..7c2bc26 100644
> --- a/drivers/md/dm-thin-metadata.c
> +++ b/drivers/md/dm-thin-metadata.c
> @@ -76,7 +76,7 @@
>  
>  #define THIN_SUPERBLOCK_MAGIC 27022010
>  #define THIN_SUPERBLOCK_LOCATION 0
> -#define THIN_VERSION 1
> +#define THIN_VERSION 2
>  #define THIN_METADATA_CACHE_SIZE 64
>  #define SECTOR_TO_BLOCK_SHIFT 3
>  
> @@ -1830,3 +1830,21 @@ int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd,
>  
>  	return r;
>  }
> +
> +void dm_pool_metadata_set_needs_check(struct dm_pool_metadata *pmd)
> +{
> +	down_write(&pmd->root_lock);
> +	pmd->flags |= THIN_METADATA_NEEDS_CHECK_FLAG;
> +	up_write(&pmd->root_lock);
> +}
> +
> +bool dm_pool_metadata_needs_check(struct dm_pool_metadata *pmd)
> +{
> +	bool needs_check;
> +
> +	down_read(&pmd->root_lock);
> +	needs_check = pmd->flags & THIN_METADATA_NEEDS_CHECK_FLAG;
> +	up_read(&pmd->root_lock);
> +
> +	return needs_check;
> +}
> diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h
> index 520dd34..583ff5d 100644
> --- a/drivers/md/dm-thin-metadata.h
> +++ b/drivers/md/dm-thin-metadata.h
> @@ -27,6 +27,11 @@
>  
>  /*----------------------------------------------------------------*/
>  
> +/*
> + * Thin metadata superblock flags.
> + */
> +#define THIN_METADATA_NEEDS_CHECK_FLAG		(1 << 0)
> +
>  struct dm_pool_metadata;
>  struct dm_thin_device;
>  
> @@ -214,6 +219,9 @@ int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd,
>  					dm_sm_threshold_fn fn,
>  					void *context);
>  
> +void dm_pool_metadata_set_needs_check(struct dm_pool_metadata *pmd);
> +bool dm_pool_metadata_needs_check(struct dm_pool_metadata *pmd);
> +
>  /*----------------------------------------------------------------*/
>  
>  #endif
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index e560416..cd52cf2 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -1014,6 +1014,7 @@ static bool should_error_unserviceable_bio(struct pool *pool)
>  {
>  	return (unlikely(get_pool_mode(pool) == PM_FAIL) ||
>  		pool->pf.error_if_no_space ||
> +		dm_pool_metadata_needs_check(pool->pmd) ||
>  		dm_pool_is_metadata_out_of_space(pool->pmd));
>  }
>  
> @@ -1436,8 +1437,19 @@ static enum pool_mode get_pool_mode(struct pool *pool)
>  static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
>  {
>  	int r;
> +	bool out_of_space, needs_check = dm_pool_metadata_needs_check(pool->pmd);
>  	enum pool_mode old_mode = pool->pf.mode;
>  
> +	/*
> +	 * Never allow the pool to transition to PM_WRITE mode if user
> +	 * intervention is required to verify metadata and data consistency.
> +	 */
> +	if (new_mode == PM_WRITE && old_mode != new_mode && needs_check) {
> +		DMERR("%s: unable to switch pool to write mode until repaired.",
> +		      dm_device_name(pool->pool_md));
> +		new_mode = old_mode;
> +	}
> +
>  	switch (new_mode) {
>  	case PM_FAIL:
>  		if (old_mode != new_mode)
> @@ -1454,23 +1466,35 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
>  		if (old_mode != new_mode)
>  			DMERR("%s: switching pool to read-only mode",
>  			      dm_device_name(pool->pool_md));
> -		r = dm_pool_abort_metadata(pool->pmd);
> -		if (r) {
> -			DMERR("%s: aborting transaction failed",
> -			      dm_device_name(pool->pool_md));
> -			new_mode = PM_FAIL;
> -			set_pool_mode(pool, new_mode);
> -		} else {
> -			bool out_of_space;
> -			if (!dm_pool_is_out_of_space(pool->pmd, &out_of_space) && out_of_space)
> -				dm_pool_metadata_read_write(pool->pmd);
> -			else
> -				dm_pool_metadata_read_only(pool->pmd);
> -			pool->process_bio = process_bio_read_only;
> -			pool->process_discard = process_discard;
> +		if (needs_check) {
> +			r = dm_pool_abort_metadata(pool->pmd);
> +			if (r) {
> +				DMERR("%s: aborting transaction failed",
> +				      dm_device_name(pool->pool_md));
> +				new_mode = PM_FAIL;
> +				set_pool_mode(pool, new_mode);
> +				goto out;
> +			}
> +		}
> +		if (!dm_pool_is_out_of_space(pool->pmd, &out_of_space) && out_of_space)
> +			dm_pool_metadata_read_write(pool->pmd);
> +		else
> +			dm_pool_metadata_read_only(pool->pmd);
> +		pool->process_bio = process_bio_read_only;
> +		pool->process_discard = process_discard;
> +		if (needs_check)
>  			pool->process_prepared_mapping = process_prepared_mapping_fail;
> -			pool->process_prepared_discard = process_prepared_discard_passdown;
> +		else {
> +			/*
> +			 * Allow previously prepared mappings to complete (important
> +			 * for proper handling of case when data space is exhausted).
> +			 * If read-only mode was requested no new mappings will be
> +			 * created (due to process_bio_read_only) so no risk using
> +			 * process_prepared_mapping.
> +			 */
> +			pool->process_prepared_mapping = process_prepared_mapping;
>  		}
> +		pool->process_prepared_discard = process_prepared_discard_passdown;
>  		break;
>  
>  	case PM_WRITE:
> @@ -1484,7 +1508,7 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
>  		pool->process_prepared_discard = process_prepared_discard;
>  		break;
>  	}
> -
> +out:
>  	pool->pf.mode = new_mode;
>  }
>  
> @@ -1501,15 +1525,21 @@ static void out_of_data_space(struct pool *pool)
>  
>  static void metadata_operation_failed(struct pool *pool, const char *op, int r)
>  {
> +	const char *pool_device_name = dm_device_name(pool->pool_md);
> +
>  	DMERR_LIMIT("%s: metadata operation '%s' failed: error = %d",
> -		    dm_device_name(pool->pool_md), op, r);
> +		    pool_device_name, op, r);
>  
>  	if (r == -ENOSPC) {
>  		dm_pool_set_metadata_out_of_space(pool->pmd);
>  		DMERR_LIMIT("%s: no free metadata space available.",
> -			    dm_device_name(pool->pool_md));
> +			    pool_device_name);
>  	}
>  
> +	DMWARN_LIMIT("%s: consistency of metadata and data must be verified, please repair.",
> +		     pool_device_name);
> +	dm_pool_metadata_set_needs_check(pool->pmd);
> +
>  	set_pool_mode(pool, PM_READ_ONLY);
>  }
>  
> @@ -2295,6 +2325,12 @@ static int maybe_resize_data_dev(struct dm_target *ti, bool *need_commit)
>  		return -EINVAL;
>  
>  	} else if (data_size > sb_data_size) {
> +		if (dm_pool_metadata_needs_check(pool->pmd)) {
> +			DMERR("%s: unable to grow the data device until repaired.",
> +			      dm_device_name(pool->pool_md));
> +			return -EINVAL;;
> +		}
> +
>  		if (sb_data_size)
>  			DMINFO("%s: growing the data device from %llu to %llu blocks",
>  			       dm_device_name(pool->pool_md),
> @@ -2336,6 +2372,12 @@ static int maybe_resize_metadata_dev(struct dm_target *ti, bool *need_commit)
>  		return -EINVAL;
>  
>  	} else if (metadata_dev_size > sb_metadata_dev_size) {
> +		if (dm_pool_metadata_needs_check(pool->pmd)) {
> +			DMERR("%s: unable to grow the metadata device until repaired.",
> +			      dm_device_name(pool->pool_md));
> +			return -EINVAL;;
> +		}
> +
>  		DMINFO("%s: growing the metadata device from %llu to %llu blocks",
>  		       dm_device_name(pool->pool_md),
>  		       sb_metadata_dev_size, metadata_dev_size);
> @@ -2865,7 +2907,7 @@ static struct target_type pool_target = {
>  	.name = "thin-pool",
>  	.features = DM_TARGET_SINGLETON | DM_TARGET_ALWAYS_WRITEABLE |
>  		    DM_TARGET_IMMUTABLE,
> -	.version = {1, 10, 0},
> +	.version = {1, 11, 0},
>  	.module = THIS_MODULE,
>  	.ctr = pool_ctr,
>  	.dtr = pool_dtr,
> @@ -3155,7 +3197,7 @@ static int thin_iterate_devices(struct dm_target *ti,
>  
>  static struct target_type thin_target = {
>  	.name = "thin",
> -	.version = {1, 10, 0},
> +	.version = {1, 11, 0},
>  	.module	= THIS_MODULE,
>  	.ctr = thin_ctr,
>  	.dtr = thin_dtr,
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space
  2014-02-21 14:20   ` Joe Thornber
@ 2014-02-21 14:35     ` Mike Snitzer
  2014-02-21 14:44       ` Joe Thornber
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21 14:35 UTC (permalink / raw)
  To: dm-devel

On Fri, Feb 21 2014 at  9:20am -0500,
Joe Thornber <thornber@redhat.com> wrote:

> NACK.
> 
> It's odd that the interface has dm-thin telling dm-thin-metadata
> whether or not it's out of space (yet thin-metadata clears it
> itself).  So I don't like the patch from that point of view, though
> it's a minor quibble.

Yeap, minor quibble.. I can expose an interface that lets me reset it to
false after resize -- from thin.  Seems unnecessary given the expected
short lifetime of this code (see below).

> More importantly, running out of metadata space is a serious error
> that causes a transaction to be aborted.  Any thins that contained
> metadata changes in that aborted transaction need to be umounted and
> fscked.  We should not transistion back to WRITE mode just because
> more metadata space is provided since this doesn't force the sys admin
> to repare/check the thins that have lost io.

This is an incremental change that just adds the control of whether the
metadata is out of space or not.  The current kernel code cannot give us
that info.  This is a stop gap until you implement the reserve that
gives us a concrete value to test for to _know_ that metadata is out of
space.  That is why I said as much in the patch header.  Once we have
that this change can be reverted.  But for now it is needed.

> So I'm nacking because I think you're planning to special case running
> out of metadata space, rather than treating it the same as any other
> metadata error (eg, a failed write to the metadata device).
> 
> As we've discussed the correct thing to do when getting any error from
> a metadata operation is:
> 
>   i) Abort the current transaction
> 
>   ii) flick to READ_ONLY mode.  Any thin devices that lost writes due
>       to the abort will reflect this by erroring every subsequent
>       REQ_FUA or REQ_FLUSH.
> 
>   iii) Set a needs_check flag in the metadata to prevent a table
>        reload transitioning to WRITE mode.

I'm not seeing how running out of space on metadata is any different
than any other metadata operation that failed.  The transaction is
aborted, as soon as that happens all bets are off on upper level data
consistency.  So while thin metadata may be perfectly fine, it is still
prudent for the user to check their data.

Patch 7 impsoes this constraint.  It provides all 3 points you listed
above.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 8/8] dm thin: allow metadata space larger than supported to go unused
  2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 8/8] dm thin: allow metadata space larger than supported to go unused Mike Snitzer
@ 2014-02-21 14:36   ` Joe Thornber
  0 siblings, 0 replies; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 14:36 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel

ACK

On Thu, Feb 20, 2014 at 09:56:05PM -0500, Mike Snitzer wrote:
> It was always intended that a user could provide a thin metadata device
> that is larger than the max supported by the on-disk format.  The extra
> space would just go unused.
> 
> That would only work when extending the metadata device, not on initial
> thin-pool creation.  If the user attempted to use a larger metadata
> device on creation they would get an error like the following:
> 
>  device-mapper: space map common: space map too large
>  device-mapper: transaction manager: couldn't create metadata space map
>  device-mapper: thin metadata: tm_create_with_sm failed
>  device-mapper: table: 252:17: thin-pool: Error creating metadata object
>  device-mapper: ioctl: error adding target to table
> 
> Fix this by allowing the initial thin-pool creation to cap the size of
> the metadata area to be smaller than the size of the metadata device.
> 
> Also, the calculation for THIN_METADATA_MAX_SECTORS didn't account for
> the sizeof the disk_bitmap_header.  So the supported maximum metadata
> size is a bit smaller (reduced from 33423360 to 33292800 sectors).
> 
> Lastly, remove the "excess space will not be used" warning message from
> get_metadata_dev_size(); it resulted in printing the warning multiple
> times.  Factor out warn_if_metadata_device_too_big(), call it from
> pool_ctr() and maybe_resize_metadata_dev().
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-cache-metadata.c                     |  2 +-
>  drivers/md/dm-thin-metadata.c                      | 21 +++++++------
>  drivers/md/dm-thin-metadata.h                      |  5 ++--
>  drivers/md/dm-thin.c                               | 35 ++++++++++++++--------
>  .../md/persistent-data/dm-transaction-manager.c    | 13 ++++----
>  .../md/persistent-data/dm-transaction-manager.h    |  2 +-
>  6 files changed, 47 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c
> index 9ef0752..1baf615 100644
> --- a/drivers/md/dm-cache-metadata.c
> +++ b/drivers/md/dm-cache-metadata.c
> @@ -323,7 +323,7 @@ static int __format_metadata(struct dm_cache_metadata *cmd)
>  {
>  	int r;
>  
> -	r = dm_tm_create_with_sm(cmd->bm, CACHE_SUPERBLOCK_LOCATION,
> +	r = dm_tm_create_with_sm(cmd->bm, CACHE_SUPERBLOCK_LOCATION, 0,
>  				 &cmd->tm, &cmd->metadata_sm);
>  	if (r < 0) {
>  		DMERR("tm_create_with_sm failed");
> diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
> index 7c2bc26..3780650 100644
> --- a/drivers/md/dm-thin-metadata.c
> +++ b/drivers/md/dm-thin-metadata.c
> @@ -503,11 +503,11 @@ bad_locked:
>  	return r;
>  }
>  
> -static int __format_metadata(struct dm_pool_metadata *pmd)
> +static int __format_metadata(struct dm_pool_metadata *pmd, dm_block_t nr_blocks)
>  {
>  	int r;
>  
> -	r = dm_tm_create_with_sm(pmd->bm, THIN_SUPERBLOCK_LOCATION,
> +	r = dm_tm_create_with_sm(pmd->bm, THIN_SUPERBLOCK_LOCATION, nr_blocks,
>  				 &pmd->tm, &pmd->metadata_sm);
>  	if (r < 0) {
>  		DMERR("tm_create_with_sm failed");
> @@ -642,7 +642,8 @@ bad_unlock_sblock:
>  	return r;
>  }
>  
> -static int __open_or_format_metadata(struct dm_pool_metadata *pmd, bool format_device)
> +static int __open_or_format_metadata(struct dm_pool_metadata *pmd, bool format_device,
> +				     dm_block_t nr_blocks)
>  {
>  	int r, unformatted;
>  
> @@ -651,12 +652,13 @@ static int __open_or_format_metadata(struct dm_pool_metadata *pmd, bool format_d
>  		return r;
>  
>  	if (unformatted)
> -		return format_device ? __format_metadata(pmd) : -EPERM;
> +		return format_device ? __format_metadata(pmd, nr_blocks) : -EPERM;
>  
>  	return __open_metadata(pmd);
>  }
>  
> -static int __create_persistent_data_objects(struct dm_pool_metadata *pmd, bool format_device)
> +static int __create_persistent_data_objects(struct dm_pool_metadata *pmd, bool format_device,
> +					    dm_block_t nr_blocks)
>  {
>  	int r;
>  
> @@ -668,7 +670,7 @@ static int __create_persistent_data_objects(struct dm_pool_metadata *pmd, bool f
>  		return PTR_ERR(pmd->bm);
>  	}
>  
> -	r = __open_or_format_metadata(pmd, format_device);
> +	r = __open_or_format_metadata(pmd, format_device, nr_blocks);
>  	if (r)
>  		dm_block_manager_destroy(pmd->bm);
>  
> @@ -808,7 +810,8 @@ out_locked:
>  
>  struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
>  					       sector_t data_block_size,
> -					       bool format_device)
> +					       bool format_device,
> +					       dm_block_t nr_blocks)
>  {
>  	int r;
>  	struct dm_pool_metadata *pmd;
> @@ -828,7 +831,7 @@ struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
>  	pmd->bdev = bdev;
>  	pmd->data_block_size = data_block_size;
>  
> -	r = __create_persistent_data_objects(pmd, format_device);
> +	r = __create_persistent_data_objects(pmd, format_device, nr_blocks);
>  	if (r) {
>  		kfree(pmd);
>  		return ERR_PTR(r);
> @@ -1578,7 +1581,7 @@ int dm_pool_abort_metadata(struct dm_pool_metadata *pmd)
>  
>  	__set_abort_with_changes_flags(pmd);
>  	__destroy_persistent_data_objects(pmd);
> -	r = __create_persistent_data_objects(pmd, false);
> +	r = __create_persistent_data_objects(pmd, false, 0);
>  	if (r)
>  		pmd->fail_io = true;
>  
> diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h
> index 583ff5d..e24169c 100644
> --- a/drivers/md/dm-thin-metadata.h
> +++ b/drivers/md/dm-thin-metadata.h
> @@ -18,7 +18,7 @@
>   * We have one block of index, which can hold 255 index entries.  Each
>   * index entry contains allocation info about 16k metadata blocks.
>   */
> -#define THIN_METADATA_MAX_SECTORS (255 * (1 << 14) * (THIN_METADATA_BLOCK_SIZE / (1 << SECTOR_SHIFT)))
> +#define THIN_METADATA_MAX_SECTORS (255 * ((1 << 14) - 64) * (THIN_METADATA_BLOCK_SIZE / (1 << SECTOR_SHIFT)))
>  
>  /*
>   * A metadata device larger than 16GB triggers a warning.
> @@ -45,7 +45,8 @@ typedef uint64_t dm_thin_id;
>   */
>  struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev,
>  					       sector_t data_block_size,
> -					       bool format_device);
> +					       bool format_device,
> +					       dm_block_t nr_blocks);
>  
>  int dm_pool_metadata_close(struct dm_pool_metadata *pmd);
>  
> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> index cd52cf2..c4cee29 100644
> --- a/drivers/md/dm-thin.c
> +++ b/drivers/md/dm-thin.c
> @@ -1829,6 +1829,8 @@ static void __pool_destroy(struct pool *pool)
>  
>  static struct kmem_cache *_new_mapping_cache;
>  
> +static dm_block_t get_metadata_dev_size_in_blocks(struct block_device *);
> +
>  static struct pool *pool_create(struct mapped_device *pool_md,
>  				struct block_device *metadata_dev,
>  				unsigned long block_size,
> @@ -1839,8 +1841,10 @@ static struct pool *pool_create(struct mapped_device *pool_md,
>  	struct pool *pool;
>  	struct dm_pool_metadata *pmd;
>  	bool format_device = read_only ? false : true;
> +	dm_block_t nr_blocks = get_metadata_dev_size_in_blocks(metadata_dev);
>  
> -	pmd = dm_pool_metadata_open(metadata_dev, block_size, format_device);
> +	pmd = dm_pool_metadata_open(metadata_dev, block_size, format_device,
> +				    nr_blocks);
>  	if (IS_ERR(pmd)) {
>  		*error = "Error creating metadata object";
>  		return (struct pool *)pmd;
> @@ -2078,16 +2082,27 @@ static void metadata_low_callback(void *context)
>  	dm_table_event(pool->ti->table);
>  }
>  
> -static sector_t get_metadata_dev_size(struct block_device *bdev)
> +static sector_t get_dev_size(struct block_device *bdev)
>  {
> -	sector_t metadata_dev_size = i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
> +	return i_size_read(bdev->bd_inode) >> SECTOR_SHIFT;
> +}
> +
> +static void warn_if_metadata_device_too_big(struct block_device *bdev)
> +{
> +	sector_t metadata_dev_size = get_dev_size(bdev);
>  	char buffer[BDEVNAME_SIZE];
>  
> -	if (metadata_dev_size > THIN_METADATA_MAX_SECTORS_WARNING) {
> +	if (metadata_dev_size > THIN_METADATA_MAX_SECTORS_WARNING)
>  		DMWARN("Metadata device %s is larger than %u sectors: excess space will not be used.",
>  		       bdevname(bdev, buffer), THIN_METADATA_MAX_SECTORS);
> -		metadata_dev_size = THIN_METADATA_MAX_SECTORS_WARNING;
> -	}
> +}
> +
> +static sector_t get_metadata_dev_size(struct block_device *bdev)
> +{
> +	sector_t metadata_dev_size = get_dev_size(bdev);
> +
> +	if (metadata_dev_size > THIN_METADATA_MAX_SECTORS)
> +		metadata_dev_size = THIN_METADATA_MAX_SECTORS;
>  
>  	return metadata_dev_size;
>  }
> @@ -2174,12 +2189,7 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
>  		ti->error = "Error opening metadata block device";
>  		goto out_unlock;
>  	}
> -
> -	/*
> -	 * Run for the side-effect of possibly issuing a warning if the
> -	 * device is too big.
> -	 */
> -	(void) get_metadata_dev_size(metadata_dev->bdev);
> +	warn_if_metadata_device_too_big(metadata_dev->bdev);
>  
>  	r = dm_get_device(ti, argv[1], FMODE_READ | FMODE_WRITE, &data_dev);
>  	if (r) {
> @@ -2378,6 +2388,7 @@ static int maybe_resize_metadata_dev(struct dm_target *ti, bool *need_commit)
>  			return -EINVAL;;
>  		}
>  
> +		warn_if_metadata_device_too_big(pool->md_dev);
>  		DMINFO("%s: growing the metadata device from %llu to %llu blocks",
>  		       dm_device_name(pool->pool_md),
>  		       sb_metadata_dev_size, metadata_dev_size);
> diff --git a/drivers/md/persistent-data/dm-transaction-manager.c b/drivers/md/persistent-data/dm-transaction-manager.c
> index 81da1a2..495277e 100644
> --- a/drivers/md/persistent-data/dm-transaction-manager.c
> +++ b/drivers/md/persistent-data/dm-transaction-manager.c
> @@ -322,7 +322,7 @@ static int dm_tm_create_internal(struct dm_block_manager *bm,
>  				 dm_block_t sb_location,
>  				 struct dm_transaction_manager **tm,
>  				 struct dm_space_map **sm,
> -				 int create,
> +				 int create, dm_block_t nr_blocks,
>  				 void *sm_root, size_t sm_len)
>  {
>  	int r;
> @@ -338,8 +338,9 @@ static int dm_tm_create_internal(struct dm_block_manager *bm,
>  	}
>  
>  	if (create) {
> -		r = dm_sm_metadata_create(*sm, *tm, dm_bm_nr_blocks(bm),
> -					  sb_location);
> +		if (!nr_blocks)
> +			nr_blocks = dm_bm_nr_blocks(bm);
> +		r = dm_sm_metadata_create(*sm, *tm, nr_blocks, sb_location);
>  		if (r) {
>  			DMERR("couldn't create metadata space map");
>  			goto bad;
> @@ -362,10 +363,10 @@ bad:
>  }
>  
>  int dm_tm_create_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
> -			 struct dm_transaction_manager **tm,
> +			 dm_block_t nr_blocks, struct dm_transaction_manager **tm,
>  			 struct dm_space_map **sm)
>  {
> -	return dm_tm_create_internal(bm, sb_location, tm, sm, 1, NULL, 0);
> +	return dm_tm_create_internal(bm, sb_location, tm, sm, 1, nr_blocks, NULL, 0);
>  }
>  EXPORT_SYMBOL_GPL(dm_tm_create_with_sm);
>  
> @@ -374,7 +375,7 @@ int dm_tm_open_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
>  		       struct dm_transaction_manager **tm,
>  		       struct dm_space_map **sm)
>  {
> -	return dm_tm_create_internal(bm, sb_location, tm, sm, 0, sm_root, root_len);
> +	return dm_tm_create_internal(bm, sb_location, tm, sm, 0, 0, sm_root, root_len);
>  }
>  EXPORT_SYMBOL_GPL(dm_tm_open_with_sm);
>  
> diff --git a/drivers/md/persistent-data/dm-transaction-manager.h b/drivers/md/persistent-data/dm-transaction-manager.h
> index b5b1390..a8403f0 100644
> --- a/drivers/md/persistent-data/dm-transaction-manager.h
> +++ b/drivers/md/persistent-data/dm-transaction-manager.h
> @@ -120,7 +120,7 @@ struct dm_block_manager *dm_tm_get_bm(struct dm_transaction_manager *tm);
>   * shouldn't be used.
>   */
>  int dm_tm_create_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
> -			 struct dm_transaction_manager **tm,
> +			 dm_block_t nr_blocks, struct dm_transaction_manager **tm,
>  			 struct dm_space_map **sm);
>  
>  int dm_tm_open_with_sm(struct dm_block_manager *bm, dm_block_t sb_location,
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races
  2014-02-21 14:27   ` Joe Thornber
@ 2014-02-21 14:37     ` Mike Snitzer
  2014-02-21 14:47       ` Joe Thornber
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21 14:37 UTC (permalink / raw)
  To: dm-devel

On Fri, Feb 21 2014 at  9:27am -0500,
Joe Thornber <thornber@redhat.com> wrote:

> NACK.
> 
> On Thu, Feb 20, 2014 at 09:56:03PM -0500, Mike Snitzer wrote:
> > Before, when a pool is being resized, on resume the pool's mode was
> > being immediately set to PM_WRITE and the process_* methods would be set
> > to the normal writable operations.  These were established before
> > pool_resume() was able to actually resize either the metadata or data
> > device and resulted in the resize failing.  This failure occurred due to
> > the thin device calling wake_worker() which kicked do_worker() during
> > pool_preresume() -- which caused calls to the writable process_* methods
> > to occur during the resize.
> 
> The bug is the worker should not be able to be woken until
> pool_preresume has completed.  A simple check in wake_worker() should
> fix this.
> 
> > Now, the pool is forced to stay in read-only mode if it was already in
> > read-only mode.  This prevents bouncing the pool's mode and associated
> > process_* methods and consistently keeps read-only processing in place
> > until the resize(s) complete.  To achieve this the pool can now be in a
> > PM_READ_ONLY state but the metadata's is !read_only -- so as to allow
> > the commit() the follows the resize to take place.  Differentiate
> > between commit_metadata_superblock() and commit() since different
> > negative checks apply (but factor out a common __commit that each
> > calls).
> 
> Too complicated.

You may think it complicated, and yeah it isn't dead simple, but it
unlocks the nuanced state that is needed to allow for fixing data space
exhaustion that still allows for prepared_mappings to complete -- even
though the pool is in read-only mode.

So patch 7 depends on this patch.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space
  2014-02-21 14:35     ` Mike Snitzer
@ 2014-02-21 14:44       ` Joe Thornber
  0 siblings, 0 replies; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 14:44 UTC (permalink / raw)
  To: device-mapper development

On Fri, Feb 21, 2014 at 09:35:41AM -0500, Mike Snitzer wrote:
> I'm not seeing how running out of space on metadata is any different
> than any other metadata operation that failed.

Then why do you need a new interface to tell you it's out of space?

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races
  2014-02-21 14:37     ` Mike Snitzer
@ 2014-02-21 14:47       ` Joe Thornber
  0 siblings, 0 replies; 23+ messages in thread
From: Joe Thornber @ 2014-02-21 14:47 UTC (permalink / raw)
  To: device-mapper development

On Fri, Feb 21, 2014 at 09:37:47AM -0500, Mike Snitzer wrote:
> You may think it complicated, and yeah it isn't dead simple, but it
> unlocks the nuanced state that is needed to allow for fixing data space
> exhaustion that still allows for prepared_mappings to complete -- even
> though the pool is in read-only mode.

ok, but we still need a patch to stop the worker being woken while the
pool is suspended.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-dm-3.14-fixes 4/8] dm thin: error out I/O if inappropriate for it to be retried
  2014-02-21 14:22   ` Joe Thornber
@ 2014-02-21 14:48     ` Mike Snitzer
  0 siblings, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2014-02-21 14:48 UTC (permalink / raw)
  To: dm-devel

On Fri, Feb 21 2014 at  9:22am -0500,
Joe Thornber <thornber@redhat.com> wrote:

> NACK (but close).
> 
> On Thu, Feb 20, 2014 at 09:56:01PM -0500, Mike Snitzer wrote:
> > If the pool is in fail mode, error_if_no_space is enabled or the
> > metadata space is exhausted do _not_ allow IO to be retried.
> 
> This should be:
> 
> "If the pool is in fail mode, error_if_no_space is enabled or a
> transaction has been aborted, do _not_ allow IO to be retried."
> 
> (In practise 'transaction has been aborted' is the same as the
> needs_check flag we've discussed.)

Yeah, which is made available in patch 7.  These are all building to the
final state in patch 7.  No shipping kernel should have the intermediate
state that these incremental patches introduce.

I left them split out to see the progression, as I think it useful.
If I folded all of these changes together it would be difficult to see
the progression.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-02-21 14:48 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-21  2:55 [PATCH for-dm-3.14-fixes 0/8] dm thin: address a few fundamental problems Mike Snitzer
2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 1/8] dm thin: synchronize the pool mode during suspend Mike Snitzer
2014-02-21 13:58   ` Joe Thornber
2014-02-21  2:55 ` [PATCH for-dm-3.14-fixes 2/8] dm thin: set flag when over the metadata low watermark threshold Mike Snitzer
2014-02-21 13:56   ` Joe Thornber
2014-02-21 14:05     ` Mike Snitzer
2014-02-21 14:10       ` Mike Snitzer
2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 3/8] dm thin: set flag if metadata is out of space Mike Snitzer
2014-02-21 14:20   ` Joe Thornber
2014-02-21 14:35     ` Mike Snitzer
2014-02-21 14:44       ` Joe Thornber
2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 4/8] dm thin: error out I/O if inappropriate for it to be retried Mike Snitzer
2014-02-21 14:22   ` Joe Thornber
2014-02-21 14:48     ` Mike Snitzer
2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 5/8] dm thin: fix the error path for the thin device constructor Mike Snitzer
2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 6/8] dm thin: fix pool_preresume resize with heavy IO races Mike Snitzer
2014-02-21 14:27   ` Joe Thornber
2014-02-21 14:37     ` Mike Snitzer
2014-02-21 14:47       ` Joe Thornber
2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 7/8] dm thin: ensure user takes action to validate data and metadata consistency Mike Snitzer
2014-02-21 14:35   ` Joe Thornber
2014-02-21  2:56 ` [PATCH for-dm-3.14-fixes 8/8] dm thin: allow metadata space larger than supported to go unused Mike Snitzer
2014-02-21 14:36   ` Joe Thornber

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.