* [PATCH v4 0/2] dm: restrict conflicting table loads and improve queue initialization
@ 2010-05-27 12:47 Mike Snitzer
2010-05-27 12:47 ` [PATCH v4 1/2] dm: prevent table type changes after initial table load Mike Snitzer
2010-05-27 12:47 ` [PATCH v4 2/2 "v11"] dm: only initialize full request_queue for request-based device Mike Snitzer
0 siblings, 2 replies; 11+ messages in thread
From: Mike Snitzer @ 2010-05-27 12:47 UTC (permalink / raw)
To: dm-devel; +Cc: Kiyoshi Ueda, Alasdair Kergon
A mapped_device now has a specific immutable type (md->type) that is
set during the initial table_load (see: patch 1/2).
The DM device's request_queue will only have an elevator allocated and
exposed via sysfs if it is request-based (see: patch 2/2).
Please see the individual patch headers for more details.
This patchset is based on agk's DM "editing" tree:
http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/
(apply all patches listed in the series file up to NEXT_PATCHES_END)
This patchset also depends on the following linux-2.6-block.git
'for-linus' branch commits:
01effb0 block: allow initialization of previously allocated request_queue
e36f724 block: Adjust elv_iosched_show to return "none" for bio-based DM
I've made a full quilt series available here (builds on 2.6.34):
http://people.redhat.com/msnitzer/patches/dm-queue-init/latest/
I welcome all comments/review.
v4 change:
- dm_set_md_type takes a specific unsigned type rather than dm_table*
v3 changes:
- reduce table_load's md->type_lock protected critical section
- only call dm_setup_md_queue on first table load
- simplified dm_setup_md_queue as a result
- use existing DM_TYPE_* definitions instead of duplicating them
- eliminate overly verbose comments as the code speaks for itself much
more clearly
v2 changes:
- remove do_resume and table_clear interlock
- rebase to agk's latest "editing" tree
- drop the dm_get_verified_mdptr and find_device_noinit patches that
were at the end of v1; can revisit that code duplication cleanup later
Mike Snitzer (2):
dm: prevent table type changes after initial table load
dm: only initialize full request_queue for request-based device
drivers/md/dm-ioctl.c | 24 +++++++++
drivers/md/dm.c | 131 +++++++++++++++++++++++++++++++++++++------------
drivers/md/dm.h | 7 +++
3 files changed, 130 insertions(+), 32 deletions(-)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v4 1/2] dm: prevent table type changes after initial table load
2010-05-27 12:47 [PATCH v4 0/2] dm: restrict conflicting table loads and improve queue initialization Mike Snitzer
@ 2010-05-27 12:47 ` Mike Snitzer
2010-06-29 11:40 ` Alasdair G Kergon
2010-05-27 12:47 ` [PATCH v4 2/2 "v11"] dm: only initialize full request_queue for request-based device Mike Snitzer
1 sibling, 1 reply; 11+ messages in thread
From: Mike Snitzer @ 2010-05-27 12:47 UTC (permalink / raw)
To: dm-devel; +Cc: Kiyoshi Ueda, Alasdair Kergon
Before table_load() stages an inactive table atomically check if its
type conflicts with a previously established device type (md->type).
Introduce 'type_lock' in mapped_device structure and use it to protect
md->type access. table_load() sets md->type without concern for:
- another table_load() racing to set conflicting md->type.
- do_resume() making a conflicting table live.
Allowed transitions:
DM_TYPE_NONE => DM_TYPE_BIO_BASED
DM_TYPE_NONE => DM_TYPE_REQUEST_BASED
Once a table load occurs the DM device's type is completely immutable.
This prevents a table reload from switching the inactive table directly
to a conflicting type (even if the table is explicitly cleared before
load).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
---
drivers/md/dm-ioctl.c | 15 +++++++++++++++
drivers/md/dm.c | 37 ++++++++++++++++++++++++++++++-------
drivers/md/dm.h | 5 +++++
3 files changed, 50 insertions(+), 7 deletions(-)
Index: linux-2.6/drivers/md/dm-ioctl.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-ioctl.c
+++ linux-2.6/drivers/md/dm-ioctl.c
@@ -1176,6 +1176,21 @@ static int table_load(struct dm_ioctl *p
goto out;
}
+ /* Protect md->type against concurrent table loads. */
+ dm_lock_md_type(md);
+ if (dm_get_md_type(md) == DM_TYPE_NONE) {
+ /* initial table load, set md's type based on table's type */
+ dm_set_md_type(md, dm_table_get_type(t));
+ } else if (dm_get_md_type(md) != dm_table_get_type(t)) {
+ DMWARN("can't change device type after initial table load.");
+ dm_table_destroy(t);
+ dm_unlock_md_type(md);
+ r = -EINVAL;
+ goto out;
+ }
+ dm_unlock_md_type(md);
+
+ /* stage inactive table */
down_write(&_hash_lock);
hc = dm_get_mdptr(md);
if (!hc || hc->md != md) {
Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c
+++ linux-2.6/drivers/md/dm.c
@@ -123,6 +123,10 @@ struct mapped_device {
unsigned long flags;
struct request_queue *queue;
+ unsigned type;
+ /* Protect type against concurrent access. */
+ struct mutex type_lock;
+
struct gendisk *disk;
char name[16];
@@ -1874,8 +1878,10 @@ static struct mapped_device *alloc_dev(i
if (r < 0)
goto bad_minor;
+ md->type = DM_TYPE_NONE;
init_rwsem(&md->io_lock);
mutex_init(&md->suspend_lock);
+ mutex_init(&md->type_lock);
spin_lock_init(&md->deferred_lock);
spin_lock_init(&md->barrier_error_lock);
rwlock_init(&md->map_lock);
@@ -2128,6 +2134,30 @@ int dm_create(int minor, struct mapped_d
return 0;
}
+/*
+ * Functions to manage md->type.
+ * All are required to hold md->type_lock.
+ */
+void dm_lock_md_type(struct mapped_device *md)
+{
+ mutex_lock(&md->type_lock);
+}
+
+void dm_unlock_md_type(struct mapped_device *md)
+{
+ mutex_unlock(&md->type_lock);
+}
+
+void dm_set_md_type(struct mapped_device *md, unsigned type)
+{
+ md->type = type;
+}
+
+unsigned dm_get_md_type(struct mapped_device *md)
+{
+ return md->type;
+}
+
static struct mapped_device *dm_find_md(dev_t dev)
{
struct mapped_device *md;
@@ -2403,13 +2433,6 @@ struct dm_table *dm_swap_table(struct ma
goto out;
}
- /* cannot change the device type, once a table is bound */
- if (md->map &&
- (dm_table_get_type(md->map) != dm_table_get_type(table))) {
- DMWARN("can't change the device type after a table is bound");
- goto out;
- }
-
map = __bind(md, table, &limits);
out:
Index: linux-2.6/drivers/md/dm.h
===================================================================
--- linux-2.6.orig/drivers/md/dm.h
+++ linux-2.6/drivers/md/dm.h
@@ -66,6 +66,11 @@ int dm_table_alloc_md_mempools(struct dm
void dm_table_free_md_mempools(struct dm_table *t);
struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t);
+void dm_lock_md_type(struct mapped_device *md);
+void dm_unlock_md_type(struct mapped_device *md);
+void dm_set_md_type(struct mapped_device *md, unsigned type);
+unsigned dm_get_md_type(struct mapped_device *md);
+
/*
* To check the return value from dm_table_find_target().
*/
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v4 2/2 "v11"] dm: only initialize full request_queue for request-based device
2010-05-27 12:47 [PATCH v4 0/2] dm: restrict conflicting table loads and improve queue initialization Mike Snitzer
2010-05-27 12:47 ` [PATCH v4 1/2] dm: prevent table type changes after initial table load Mike Snitzer
@ 2010-05-27 12:47 ` Mike Snitzer
2010-06-04 20:15 ` [PATCH 3/2] dm: table load must always try dm_setup_md_queue Mike Snitzer
1 sibling, 1 reply; 11+ messages in thread
From: Mike Snitzer @ 2010-05-27 12:47 UTC (permalink / raw)
To: dm-devel; +Cc: Kiyoshi Ueda, Alasdair Kergon
Allocate a minimalist request_queue structure initially (needed for both
bio and request-based DM). A bio-based DM device no longer defaults to
having a fully initialized request_queue (request_fn, elevator, etc).
So bio-based DM devices no longer register elevator sysfs attributes
('iosched/' tree or 'scheduler' other than "none").
Initialization of a full request_queue (request_fn, elevator, etc) is
deferred until it is known that the DM device is request-based -- at the
end of the table load sequence.
Factor DM device's request_queue initialization:
- common to both request-based and bio-based into dm_init_md_queue().
- specific to request-based into dm_init_request_based_queue().
md->type_lock is also used to protect md->queue during table_load().
md->queue is setup without concern for:
- another table_load() racing to setup conflicting queue state.
- do_resume() making a conflicting table live.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
---
drivers/md/dm-ioctl.c | 11 +++++
drivers/md/dm.c | 93 ++++++++++++++++++++++++++++++++++++--------------
drivers/md/dm.h | 2 +
3 files changed, 79 insertions(+), 27 deletions(-)
Index: linux-2.6/drivers/md/dm-ioctl.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-ioctl.c
+++ linux-2.6/drivers/md/dm-ioctl.c
@@ -1176,11 +1176,20 @@ static int table_load(struct dm_ioctl *p
goto out;
}
- /* Protect md->type against concurrent table loads. */
+ /* Protect md->type and md->queue against concurrent table loads. */
dm_lock_md_type(md);
if (dm_get_md_type(md) == DM_TYPE_NONE) {
/* initial table load, set md's type based on table's type */
dm_set_md_type(md, dm_table_get_type(t));
+
+ /* setup md->queue to reflect md's type (may block) */
+ r = dm_setup_md_queue(md);
+ if (r) {
+ DMWARN("unable to setup device queue for this table.");
+ dm_table_destroy(t);
+ dm_unlock_md_type(md);
+ goto out;
+ }
} else if (dm_get_md_type(md) != dm_table_get_type(t)) {
DMWARN("can't change device type after initial table load.");
dm_table_destroy(t);
Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c
+++ linux-2.6/drivers/md/dm.c
@@ -124,7 +124,7 @@ struct mapped_device {
struct request_queue *queue;
unsigned type;
- /* Protect type against concurrent access. */
+ /* Protect queue and type against concurrent access. */
struct mutex type_lock;
struct gendisk *disk;
@@ -1853,6 +1853,28 @@ static const struct block_device_operati
static void dm_wq_work(struct work_struct *work);
static void dm_rq_barrier_work(struct work_struct *work);
+static void dm_init_md_queue(struct mapped_device *md)
+{
+ /*
+ * Request-based dm devices cannot be stacked on top of bio-based dm
+ * devices. The type of this dm device has not been decided yet.
+ * The type is decided at the first table loading time.
+ * To prevent problematic device stacking, clear the queue flag
+ * for request stacking support until then.
+ *
+ * This queue is new, so no concurrency on the queue_flags.
+ */
+ queue_flag_clear_unlocked(QUEUE_FLAG_STACKABLE, md->queue);
+
+ md->queue->queuedata = md;
+ md->queue->backing_dev_info.congested_fn = dm_any_congested;
+ md->queue->backing_dev_info.congested_data = md;
+ blk_queue_make_request(md->queue, dm_request);
+ blk_queue_bounce_limit(md->queue, BLK_BOUNCE_ANY);
+ md->queue->unplug_fn = dm_unplug_all;
+ blk_queue_merge_bvec(md->queue, dm_merge_bvec);
+}
+
/*
* Allocate and initialise a blank device with a given minor.
*/
@@ -1892,34 +1914,11 @@ static struct mapped_device *alloc_dev(i
INIT_LIST_HEAD(&md->uevent_list);
spin_lock_init(&md->uevent_lock);
- md->queue = blk_init_queue(dm_request_fn, NULL);
+ md->queue = blk_alloc_queue(GFP_KERNEL);
if (!md->queue)
goto bad_queue;
- /*
- * Request-based dm devices cannot be stacked on top of bio-based dm
- * devices. The type of this dm device has not been decided yet,
- * although we initialized the queue using blk_init_queue().
- * The type is decided at the first table loading time.
- * To prevent problematic device stacking, clear the queue flag
- * for request stacking support until then.
- *
- * This queue is new, so no concurrency on the queue_flags.
- */
- queue_flag_clear_unlocked(QUEUE_FLAG_STACKABLE, md->queue);
- md->saved_make_request_fn = md->queue->make_request_fn;
- md->queue->queuedata = md;
- md->queue->backing_dev_info.congested_fn = dm_any_congested;
- md->queue->backing_dev_info.congested_data = md;
- blk_queue_make_request(md->queue, dm_request);
- blk_queue_bounce_limit(md->queue, BLK_BOUNCE_ANY);
- md->queue->unplug_fn = dm_unplug_all;
- blk_queue_merge_bvec(md->queue, dm_merge_bvec);
- blk_queue_softirq_done(md->queue, dm_softirq_done);
- blk_queue_prep_rq(md->queue, dm_prep_fn);
- blk_queue_lld_busy(md->queue, dm_lld_busy);
- blk_queue_ordered(md->queue, QUEUE_ORDERED_DRAIN_FLUSH,
- dm_rq_prepare_flush);
+ dm_init_md_queue(md);
md->disk = alloc_disk(1);
if (!md->disk)
@@ -2158,6 +2157,48 @@ unsigned dm_get_md_type(struct mapped_de
return md->type;
}
+/*
+ * Fully initialize a request-based queue (->elevator, ->request_fn, etc).
+ */
+static int dm_init_request_based_queue(struct mapped_device *md)
+{
+ struct request_queue *q = NULL;
+
+ BUG_ON(md->queue->elevator);
+
+ /* Fully initialize the queue */
+ q = blk_init_allocated_queue(md->queue, dm_request_fn, NULL);
+ if (!q)
+ return 0;
+
+ md->queue = q;
+ md->saved_make_request_fn = md->queue->make_request_fn;
+ dm_init_md_queue(md);
+ blk_queue_softirq_done(md->queue, dm_softirq_done);
+ blk_queue_prep_rq(md->queue, dm_prep_fn);
+ blk_queue_lld_busy(md->queue, dm_lld_busy);
+ blk_queue_ordered(md->queue, QUEUE_ORDERED_DRAIN_FLUSH,
+ dm_rq_prepare_flush);
+
+ elv_register_queue(md->queue);
+
+ return 1;
+}
+
+/*
+ * Setup the DM device's queue based on md's type
+ */
+int dm_setup_md_queue(struct mapped_device *md)
+{
+ if ((dm_get_md_type(md) == DM_TYPE_REQUEST_BASED) &&
+ !dm_init_request_based_queue(md)) {
+ DMWARN("Cannot initialize queue for Request-based dm");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static struct mapped_device *dm_find_md(dev_t dev)
{
struct mapped_device *md;
Index: linux-2.6/drivers/md/dm.h
===================================================================
--- linux-2.6.orig/drivers/md/dm.h
+++ linux-2.6/drivers/md/dm.h
@@ -71,6 +71,8 @@ void dm_unlock_md_type(struct mapped_dev
void dm_set_md_type(struct mapped_device *md, unsigned type);
unsigned dm_get_md_type(struct mapped_device *md);
+int dm_setup_md_queue(struct mapped_device *md);
+
/*
* To check the return value from dm_table_find_target().
*/
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/2] dm: table load must always try dm_setup_md_queue
2010-05-27 12:47 ` [PATCH v4 2/2 "v11"] dm: only initialize full request_queue for request-based device Mike Snitzer
@ 2010-06-04 20:15 ` Mike Snitzer
2010-06-09 5:38 ` Kiyoshi Ueda
0 siblings, 1 reply; 11+ messages in thread
From: Mike Snitzer @ 2010-06-04 20:15 UTC (permalink / raw)
To: dm-devel; +Cc: Kiyoshi Ueda, Alasdair Kergon
A DM device's first table_load will establish the immutable md->type.
But md->queue initialization, based on md->type, may fail at that time
(if blk_init_allocated_queue cannot allocate memory).
Therefore any subsequent table_load must (re)try dm_setup_md_queue
independent of establishing md->type.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/md/dm-ioctl.c | 18 +++++++++---------
drivers/md/dm.c | 3 ++-
2 files changed, 11 insertions(+), 10 deletions(-)
NOTE: coping with blk_init_allocated_queue failure was mistakenly
"optimized" away during review of the patch that introduced
dm_setup_md_queue:
https://www.redhat.com/archives/dm-devel/2010-May/msg00173.html
Index: linux-2.6/drivers/md/dm-ioctl.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-ioctl.c
+++ linux-2.6/drivers/md/dm-ioctl.c
@@ -1181,15 +1181,6 @@ static int table_load(struct dm_ioctl *p
if (dm_get_md_type(md) == DM_TYPE_NONE) {
/* initial table load, set md's type based on table's type */
dm_set_md_type(md, dm_table_get_type(t));
-
- /* setup md->queue to reflect md's type (may block) */
- r = dm_setup_md_queue(md);
- if (r) {
- DMWARN("unable to setup device queue for this table.");
- dm_table_destroy(t);
- dm_unlock_md_type(md);
- goto out;
- }
} else if (dm_get_md_type(md) != dm_table_get_type(t)) {
DMWARN("can't change device type after initial table load.");
dm_table_destroy(t);
@@ -1197,6 +1188,15 @@ static int table_load(struct dm_ioctl *p
r = -EINVAL;
goto out;
}
+
+ /* setup md->queue to reflect md's type (may block) */
+ r = dm_setup_md_queue(md);
+ if (r) {
+ DMWARN("unable to setup device queue for this table.");
+ dm_table_destroy(t);
+ dm_unlock_md_type(md);
+ goto out;
+ }
dm_unlock_md_type(md);
/* stage inactive table */
Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c
+++ linux-2.6/drivers/md/dm.c
@@ -2164,7 +2164,8 @@ static int dm_init_request_based_queue(s
{
struct request_queue *q = NULL;
- BUG_ON(md->queue->elevator);
+ if (unlikely(md->queue->elevator))
+ return 1;
/* Fully initialize the queue */
q = blk_init_allocated_queue(md->queue, dm_request_fn, NULL);
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/2] dm: table load must always try dm_setup_md_queue
2010-06-04 20:15 ` [PATCH 3/2] dm: table load must always try dm_setup_md_queue Mike Snitzer
@ 2010-06-09 5:38 ` Kiyoshi Ueda
2010-06-09 8:53 ` Mike Snitzer
0 siblings, 1 reply; 11+ messages in thread
From: Kiyoshi Ueda @ 2010-06-09 5:38 UTC (permalink / raw)
To: Mike Snitzer; +Cc: dm-devel, Alasdair Kergon
Hi Mike,
On 06/05/2010 05:15 AM +0900, Mike Snitzer wrote:
> @@ -2164,7 +2164,8 @@ static int dm_init_request_based_queue(s
> {
> struct request_queue *q = NULL;
>
> - BUG_ON(md->queue->elevator);
> + if (unlikely(md->queue->elevator))
> + return 1;
I think the "unlikely" should be rather "likely", since
dm_init_request_based_queue() is now called whenever request-based
table is loaded even after the actual initialization has been done.
For others,
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Thanks,
Kiyoshi Ueda
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/2] dm: table load must always try dm_setup_md_queue
2010-06-09 5:38 ` Kiyoshi Ueda
@ 2010-06-09 8:53 ` Mike Snitzer
2010-06-10 2:54 ` Kiyoshi Ueda
0 siblings, 1 reply; 11+ messages in thread
From: Mike Snitzer @ 2010-06-09 8:53 UTC (permalink / raw)
To: Kiyoshi Ueda; +Cc: dm-devel, Alasdair Kergon
On Wed, Jun 09 2010 at 1:38am -0400,
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> wrote:
> Hi Mike,
>
> On 06/05/2010 05:15 AM +0900, Mike Snitzer wrote:
> > @@ -2164,7 +2164,8 @@ static int dm_init_request_based_queue(s
> > {
> > struct request_queue *q = NULL;
> >
> > - BUG_ON(md->queue->elevator);
> > + if (unlikely(md->queue->elevator))
> > + return 1;
>
> I think the "unlikely" should be rather "likely", since
> dm_init_request_based_queue() is now called whenever request-based
> table is loaded even after the actual initialization has been done.
Yes, good point. Though I'd expect reloading a request-based table is
actually fairly rare. Is it really all that worthwhile to have any
branch prediction here? Should we just remove the likely/unlikely
entirely?
Thanks,
Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/2] dm: table load must always try dm_setup_md_queue
2010-06-09 8:53 ` Mike Snitzer
@ 2010-06-10 2:54 ` Kiyoshi Ueda
0 siblings, 0 replies; 11+ messages in thread
From: Kiyoshi Ueda @ 2010-06-10 2:54 UTC (permalink / raw)
To: Mike Snitzer; +Cc: dm-devel, Alasdair Kergon
Hi Mike,
On 06/09/2010 05:53 PM +0900, Mike Snitzer wrote:
> On Wed, Jun 09 2010 at 1:38am -0400, Kiyoshi Ueda wrote:
>> On 06/05/2010 05:15 AM +0900, Mike Snitzer wrote:
>>> @@ -2164,7 +2164,8 @@ static int dm_init_request_based_queue(s
>>> {
>>> struct request_queue *q = NULL;
>>>
>>> - BUG_ON(md->queue->elevator);
>>> + if (unlikely(md->queue->elevator))
>>> + return 1;
>>
>> I think the "unlikely" should be rather "likely", since
>> dm_init_request_based_queue() is now called whenever request-based
>> table is loaded even after the actual initialization has been done.
>
> Yes, good point. Though I'd expect reloading a request-based table is
> actually fairly rare. Is it really all that worthwhile to have any
> branch prediction here? Should we just remove the likely/unlikely
> entirely?
For multipath, table reloading happens when path connection state is
changed (e.g. link-up => link-down), since if a path state becomes down,
the device for the path disappears and multipath configuration is changed.
Although I'm not sure "it's actually fairly rare", I believe it's rare.
So I don't object to remove the branch prediction.
Thanks,
Kiyoshi Ueda
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 1/2] dm: prevent table type changes after initial table load
2010-05-27 12:47 ` [PATCH v4 1/2] dm: prevent table type changes after initial table load Mike Snitzer
@ 2010-06-29 11:40 ` Alasdair G Kergon
2010-06-29 13:52 ` Mike Snitzer
0 siblings, 1 reply; 11+ messages in thread
From: Alasdair G Kergon @ 2010-06-29 11:40 UTC (permalink / raw)
To: device-mapper development
On Thu, May 27, 2010 at 08:47:48AM -0400, Mike Snitzer wrote:
> Introduce 'type_lock' in mapped_device structure and use it to protect
> md->type access. table_load() sets md->type without concern for:
> - another table_load() racing to set conflicting md->type.
which leads to non-deterministic behaviour that is of no practical use and
so there is no need to support it.
> - do_resume() making a conflicting table live.
Any table being made live must already have passed through table_load so I
don't see how there can be conflicting types.
The mutex here seems to be overkill: instead of protecting one field to
cope with a useless race, we should be making the output of the race
deterministic (e.g. perhaps giving an error on the parallel table_load
attempt).
Anyway, I can take this patch for now and we can deal with the mutex/race
differently later.
Alasdair
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 1/2] dm: prevent table type changes after initial table load
2010-06-29 11:40 ` Alasdair G Kergon
@ 2010-06-29 13:52 ` Mike Snitzer
2010-06-29 14:19 ` Alasdair G Kergon
0 siblings, 1 reply; 11+ messages in thread
From: Mike Snitzer @ 2010-06-29 13:52 UTC (permalink / raw)
To: device-mapper development
On Tue, Jun 29 2010 at 7:40am -0400,
Alasdair G Kergon <agk@redhat.com> wrote:
> On Thu, May 27, 2010 at 08:47:48AM -0400, Mike Snitzer wrote:
> > Introduce 'type_lock' in mapped_device structure and use it to protect
> > md->type access. table_load() sets md->type without concern for:
>
> > - another table_load() racing to set conflicting md->type.
> which leads to non-deterministic behaviour that is of no practical use and
> so there is no need to support it.
>
> > - do_resume() making a conflicting table live.
> Any table being made live must already have passed through table_load so I
> don't see how there can be conflicting types.
As we talked about on irc. There is potential for do_resume() to
destage the "inactive" table slot, in the process of making the table
"live", and in parallel another table load can arrive to stage another
"inactive" table (which can have a conflicting type). Without a
critical section there is room for this to occur.
And the critical section must span more than setting md->type; as
md->queue initialization may fail (elevator memory allocation fails
ENOMEM). So competing table_load needs protection from concurrent queue
initialization.
(do understand that I know you're commenting on this patch header in
isolation; but I figured I'd give context for "why the mutex"). I have
an updated patch header too:
Before table_load() stages an inactive table check if its type
conflicts with a previously established device type (md->type).
Allowed md->type transitions:
DM_TYPE_NONE => DM_TYPE_BIO_BASED
DM_TYPE_NONE => DM_TYPE_REQUEST_BASED
Once a table load occurs the DM device's type is completely immutable.
This prevents a table reload from switching the inactive table directly
to a conflicting type (even if the table is explicitly cleared before
load).
Introduce 'type_lock' in mapped_device structure and use it to protect
md->type access. Use of mutex prepares for follow-on DM device queue
initialization changes that will allocate memory while md->type_lock
is held.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
> The mutex here seems to be overkill: instead of protecting one field to
> cope with a useless race, we should be making the output of the race
> deterministic (e.g. perhaps giving an error on the parallel table_load
> attempt).
We're protecting 2 fields in the end (md->type and md->queue): once 2/2
is applied.
I agree 100% on improving the DM core to be trained to error out on
competing table loads (at a higher level).. given the parallelism that
we have right now in DM operations (across devices) we get this awkward
inability to _know_ the state transitions of:
no table -> inactive table -> live table
As you pointed out that is an unwanted side-effect of supporting
operations we do want.
> Anyway, I can take this patch for now and we can deal with the mutex/race
> differently later.
OK, sounds good.
Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 1/2] dm: prevent table type changes after initial table load
2010-06-29 13:52 ` Mike Snitzer
@ 2010-06-29 14:19 ` Alasdair G Kergon
2010-06-29 15:07 ` Mike Snitzer
0 siblings, 1 reply; 11+ messages in thread
From: Alasdair G Kergon @ 2010-06-29 14:19 UTC (permalink / raw)
To: device-mapper development
On Tue, Jun 29, 2010 at 09:52:17AM -0400, Mike Snitzer wrote:
> As we talked about on irc. There is potential for do_resume() to
> destage the "inactive" table slot, in the process of making the table
> "live", and in parallel another table load can arrive to stage another
> "inactive" table (which can have a conflicting type).
But the type already got set earlier and, once set, is an immutable table
property. There cannot be conflicts with do_resume - it doesn't touch
this.
Alasdair
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 1/2] dm: prevent table type changes after initial table load
2010-06-29 14:19 ` Alasdair G Kergon
@ 2010-06-29 15:07 ` Mike Snitzer
0 siblings, 0 replies; 11+ messages in thread
From: Mike Snitzer @ 2010-06-29 15:07 UTC (permalink / raw)
To: device-mapper development
On Tue, Jun 29 2010 at 10:19am -0400,
Alasdair G Kergon <agk@redhat.com> wrote:
> On Tue, Jun 29, 2010 at 09:52:17AM -0400, Mike Snitzer wrote:
> > As we talked about on irc. There is potential for do_resume() to
> > destage the "inactive" table slot, in the process of making the table
> > "live", and in parallel another table load can arrive to stage another
> > "inactive" table (which can have a conflicting type).
>
> But the type already got set earlier and, once set, is an immutable table
> property. There cannot be conflicts with do_resume - it doesn't touch
> this.
Yes, do_resume doesn't operate with any concern for md->type or
md->queue changing because table_load safely sets these immutable
properties.
That safety comes at the expense of protecting a critical section in
table_load. And that protection must span the access to both md->type
and md->queue.
Would you rather I ignore do_resume entirely? I'm not understanding why
you keep going back to this notion that do_resume doesn't matter. The
only reason it doesn't matter is because table_load takes care to make
it that way.
Without appropriate table_load serialization do_resume would be prone to
the races I explained above. That's all I've been trying to convey.
Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-06-29 15:07 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-27 12:47 [PATCH v4 0/2] dm: restrict conflicting table loads and improve queue initialization Mike Snitzer
2010-05-27 12:47 ` [PATCH v4 1/2] dm: prevent table type changes after initial table load Mike Snitzer
2010-06-29 11:40 ` Alasdair G Kergon
2010-06-29 13:52 ` Mike Snitzer
2010-06-29 14:19 ` Alasdair G Kergon
2010-06-29 15:07 ` Mike Snitzer
2010-05-27 12:47 ` [PATCH v4 2/2 "v11"] dm: only initialize full request_queue for request-based device Mike Snitzer
2010-06-04 20:15 ` [PATCH 3/2] dm: table load must always try dm_setup_md_queue Mike Snitzer
2010-06-09 5:38 ` Kiyoshi Ueda
2010-06-09 8:53 ` Mike Snitzer
2010-06-10 2:54 ` Kiyoshi Ueda
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.