* [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd
@ 2011-09-01 12:48 Philipp Reisner
  2011-09-01 12:48 ` [PATCH 01/18] drbd: default to detach on-io-error Philipp Reisner
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

This is the first request for review of drbd-8.4. The complete set has
492 patches. This is the fourth installment containing 18 patches.

The whole set is available here:
  git://git.drbd.org/linux-2.6-drbd.git for-jens

and is jens_for-3.2_drivers...for-jens
and this part is cb3bd33...5d96817

The most noticeable change is the support for multiple replicated volumes in
a single DRBD connection.  Write-ordering is obeyed among all writes in all
volumes in a single connection.  This feature is really important for users
who use DRBD for mirroring over longer distances (Protocol A).



* [PATCH 01/18] drbd: default to detach on-io-error
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 02/18] drbd: only wakeup if something changed in update_peer_seq Philipp Reisner
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Old default behaviour was "pass-on",
which is not useful in production at all.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 include/linux/drbd_limits.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/drbd_limits.h b/include/linux/drbd_limits.h
index 75f05af..22920a8 100644
--- a/include/linux/drbd_limits.h
+++ b/include/linux/drbd_limits.h
@@ -125,7 +125,7 @@
 #define DRBD_DISK_SIZE_SECT_MAX  (1 * (2LLU << 40))
 #define DRBD_DISK_SIZE_SECT_DEF  0 /* = disabled = no user size... */
 
-#define DRBD_ON_IO_ERROR_DEF EP_PASS_ON
+#define DRBD_ON_IO_ERROR_DEF EP_DETACH
 #define DRBD_FENCING_DEF FP_DONT_CARE
 #define DRBD_AFTER_SB_0P_DEF ASB_DISCONNECT
 #define DRBD_AFTER_SB_1P_DEF ASB_DISCONNECT
-- 
1.7.4.1



* [PATCH 02/18] drbd: only wakeup if something changed in update_peer_seq
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
  2011-09-01 12:48 ` [PATCH 01/18] drbd: default to detach on-io-error Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 03/18] drbd: add page pool to be used for meta data IO Philipp Reisner
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

This commit got it wrong:
    drbd: Make the peer_seq updating code more obvious

    Make it more clear that update_peer_seq() is supposed to wake up the
    seq_wait queue whenever the sequence number changes.

We don't need to wake up every time we receive a sequence number
that is _different_ from our currently stored "newest" sequence number,
but only if we receive a sequence number _newer_ than what we already
have, when we actually change mdev->peer_seq.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_receiver.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 620f27f..64f658b 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1737,14 +1737,15 @@ static bool need_peer_seq(struct drbd_conf *mdev)
 
 static void update_peer_seq(struct drbd_conf *mdev, unsigned int peer_seq)
 {
-	unsigned int old_peer_seq;
+	unsigned int newest_peer_seq;
 
 	if (need_peer_seq(mdev)) {
 		spin_lock(&mdev->peer_seq_lock);
-		old_peer_seq = mdev->peer_seq;
-		mdev->peer_seq = seq_max(mdev->peer_seq, peer_seq);
+		newest_peer_seq = seq_max(mdev->peer_seq, peer_seq);
+		mdev->peer_seq = newest_peer_seq;
 		spin_unlock(&mdev->peer_seq_lock);
-		if (old_peer_seq != peer_seq)
+		/* wake up only if we actually changed mdev->peer_seq */
+		if (peer_seq == newest_peer_seq)
 			wake_up(&mdev->seq_wait);
 	}
 }
-- 
1.7.4.1
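
The wake-up rule from the commit message above can be tried out in user
space. This is only a sketch under stated assumptions: seq_newer(),
seq_newest() and struct toy_peer_seq are hypothetical stand-ins (the
patch itself uses seq_max() and mdev->peer_seq), the comparison assumes
32-bit wrap-around, and locking and the wait queue are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a is newer than b, assuming sequence
 * numbers wrap around at 32 bit. */
static bool seq_newer(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Hypothetical stand-in for seq_max(): the newer of two numbers. */
static uint32_t seq_newest(uint32_t a, uint32_t b)
{
	return seq_newer(a, b) ? a : b;
}

/* Toy replacement for the peer_seq member of struct drbd_conf. */
struct toy_peer_seq {
	uint32_t peer_seq;
	unsigned int wakeups;	/* counts what would be wake_up() calls */
};

/* Mirrors the condition in the patch: store the newest number, and
 * "wake up" only if the received number is the one now stored.
 * Returns true if a wake-up would have been issued. */
static bool toy_update_peer_seq(struct toy_peer_seq *st, uint32_t peer_seq)
{
	uint32_t newest = seq_newest(st->peer_seq, peer_seq);

	st->peer_seq = newest;
	if (peer_seq == newest) {
		st->wakeups++;
		return true;
	}
	return false;
}
```

An older number arriving late leaves peer_seq untouched and triggers no
wake-up. Note that the condition also holds when the received number
equals the one already stored, so a duplicate still counts as a wake-up
in this sketch.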



* [PATCH 03/18] drbd: add page pool to be used for meta data IO
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
  2011-09-01 12:48 ` [PATCH 01/18] drbd: default to detach on-io-error Philipp Reisner
  2011-09-01 12:48 ` [PATCH 02/18] drbd: only wakeup if something changed in update_peer_seq Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 04/18] drbd: use the newly introduced page pool for bitmap IO Philipp Reisner
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_int.h  |   23 ++++++++++++++++++++++-
 drivers/block/drbd/drbd_main.c |    9 +++++++++
 2 files changed, 31 insertions(+), 1 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index e3f542c..4debab7 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1463,11 +1463,32 @@ extern struct kmem_cache *drbd_al_ext_cache;	/* activity log extents */
 extern mempool_t *drbd_request_mempool;
 extern mempool_t *drbd_ee_mempool;
 
-extern struct page *drbd_pp_pool; /* drbd's page pool */
+/* drbd's page pool, used to buffer data received from the peer,
+ * or data requested by the peer.
+ *
+ * This does not have an emergency reserve.
+ *
+ * When allocating from this pool, it first takes pages from the pool.
+ * Only if the pool is depleted will it try to allocate from the system.
+ *
+ * The assumption is that pages taken from this pool will be processed,
+ * and given back, "quickly", and then can be recycled, so we can avoid
+ * frequent calls to alloc_page(), and still will be able to make progress even
+ * under memory pressure.
+ */
+extern struct page *drbd_pp_pool;
 extern spinlock_t   drbd_pp_lock;
 extern int	    drbd_pp_vacant;
 extern wait_queue_head_t drbd_pp_wait;
 
+/* We also need a standard (emergency-reserve backed) page pool
+ * for meta data IO (activity log, bitmap).
+ * We can keep it global, as long as it is used as "N pages at a time".
+ * 128 should be plenty, currently we probably can get away with as few as 1.
+ */
+#define DRBD_MIN_POOL_PAGES	128
+extern mempool_t *drbd_md_io_page_pool;
+
 extern rwlock_t global_state_lock;
 
 extern int conn_lowest_minor(struct drbd_tconn *tconn);
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 6d761cb..cb1636c 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -129,6 +129,7 @@ struct kmem_cache *drbd_bm_ext_cache;	/* bitmap extents */
 struct kmem_cache *drbd_al_ext_cache;	/* activity log extents */
 mempool_t *drbd_request_mempool;
 mempool_t *drbd_ee_mempool;
+mempool_t *drbd_md_io_page_pool;
 
 /* I do not use a standard mempool, because:
    1) I want to hand out the pre-allocated objects first.
@@ -1952,6 +1953,8 @@ static void drbd_destroy_mempools(void)
 
 	/* D_ASSERT(atomic_read(&drbd_pp_vacant)==0); */
 
+	if (drbd_md_io_page_pool)
+		mempool_destroy(drbd_md_io_page_pool);
 	if (drbd_ee_mempool)
 		mempool_destroy(drbd_ee_mempool);
 	if (drbd_request_mempool)
@@ -1965,6 +1968,7 @@ static void drbd_destroy_mempools(void)
 	if (drbd_al_ext_cache)
 		kmem_cache_destroy(drbd_al_ext_cache);
 
+	drbd_md_io_page_pool = NULL;
 	drbd_ee_mempool      = NULL;
 	drbd_request_mempool = NULL;
 	drbd_ee_cache        = NULL;
@@ -1988,6 +1992,7 @@ static int drbd_create_mempools(void)
 	drbd_bm_ext_cache    = NULL;
 	drbd_al_ext_cache    = NULL;
 	drbd_pp_pool         = NULL;
+	drbd_md_io_page_pool = NULL;
 
 	/* caches */
 	drbd_request_cache = kmem_cache_create(
@@ -2011,6 +2016,10 @@ static int drbd_create_mempools(void)
 		goto Enomem;
 
 	/* mempools */
+	drbd_md_io_page_pool = mempool_create_page_pool(DRBD_MIN_POOL_PAGES, 0);
+	if (drbd_md_io_page_pool == NULL)
+		goto Enomem;
+
 	drbd_request_mempool = mempool_create(number,
 		mempool_alloc_slab, mempool_free_slab, drbd_request_cache);
 	if (drbd_request_mempool == NULL)
-- 
1.7.4.1
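
The comments in this patch contrast drbd_pp_pool (hand out pooled pages
first, no emergency reserve) with a standard, reserve-backed mempool.
The second allocation order can be sketched in user space as below. All
names (toy_page_pool, TOY_MIN_PAGES, the allocator_fails switch) are
hypothetical, malloc() stands in for the page allocator, and a real
mempool may also sleep until an element is returned, which is omitted.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE	4096	/* assumed page size for this sketch */
#define TOY_MIN_PAGES	4	/* plays the role of DRBD_MIN_POOL_PAGES */

struct toy_page_pool {
	void *reserve[TOY_MIN_PAGES];	/* pre-allocated emergency reserve */
	size_t nr_reserved;
};

static void toy_pool_destroy(struct toy_page_pool *pool)
{
	while (pool->nr_reserved)
		free(pool->reserve[--pool->nr_reserved]);
}

/* Fill the reserve up front, so later allocations can still succeed
 * when the regular allocator cannot. */
static int toy_pool_init(struct toy_page_pool *pool)
{
	pool->nr_reserved = 0;
	while (pool->nr_reserved < TOY_MIN_PAGES) {
		void *page = malloc(TOY_PAGE_SIZE);

		if (!page) {
			toy_pool_destroy(pool);
			return -1;
		}
		pool->reserve[pool->nr_reserved++] = page;
	}
	return 0;
}

/* Try the regular allocator first, fall back to the reserve.
 * allocator_fails simulates memory pressure for the sketch. */
static void *toy_pool_alloc(struct toy_page_pool *pool, int allocator_fails)
{
	void *page = allocator_fails ? NULL : malloc(TOY_PAGE_SIZE);

	if (page)
		return page;
	if (pool->nr_reserved)
		return pool->reserve[--pool->nr_reserved];
	return NULL;	/* a real mempool could wait here instead */
}

/* Refill the reserve before giving memory back to the system. */
static void toy_pool_free(struct toy_page_pool *pool, void *page)
{
	if (pool->nr_reserved < TOY_MIN_PAGES)
		pool->reserve[pool->nr_reserved++] = page;
	else
		free(page);
}
```

The point of the reserve is forward progress: as long as callers take
"N pages at a time" with N no larger than the reserve and give them
back, meta data IO can complete even when the allocator fails.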



* [PATCH 04/18] drbd: use the newly introduced page pool for bitmap IO
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (2 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 03/18] drbd: add page pool to be used for meta data IO Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 05/18] drbd: introduce a bio_set to allocate housekeeping bios from Philipp Reisner
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_bitmap.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 2fc3dfa..412ca94 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -957,9 +957,8 @@ static void bm_async_io_complete(struct bio *bio, int error)
 
 	bm_page_unlock_io(mdev, idx);
 
-	/* FIXME give back to page pool */
 	if (ctx->flags & BM_AIO_COPY_PAGES)
-		put_page(bio->bi_io_vec[0].bv_page);
+		mempool_free(bio->bi_io_vec[0].bv_page, drbd_md_io_page_pool);
 
 	bio_put(bio);
 
@@ -993,10 +992,8 @@ static void bm_page_io_async(struct bm_aio_ctx *ctx, int page_nr, int rw) __must
 	bm_set_page_unchanged(b->bm_pages[page_nr]);
 
 	if (ctx->flags & BM_AIO_COPY_PAGES) {
-		/* FIXME alloc_page is good enough for now, but actually needs
-		 * to use pre-allocated page pool */
 		void *src, *dest;
-		page = alloc_page(__GFP_HIGHMEM|__GFP_WAIT);
+		page = mempool_alloc(drbd_md_io_page_pool, __GFP_HIGHMEM|__GFP_WAIT);
 		dest = kmap_atomic(page, KM_USER0);
 		src = kmap_atomic(b->bm_pages[page_nr], KM_USER1);
 		memcpy(dest, src, PAGE_SIZE);
@@ -1008,6 +1005,8 @@ static void bm_page_io_async(struct bm_aio_ctx *ctx, int page_nr, int rw) __must
 
 	bio->bi_bdev = mdev->ldev->md_bdev;
 	bio->bi_sector = on_disk_sector;
+	/* bio_add_page of a single page to an empty bio will always succeed,
+	 * according to api.  Do we want to assert that? */
 	bio_add_page(bio, page, len, 0);
 	bio->bi_private = ctx;
 	bio->bi_end_io = bm_async_io_complete;
-- 
1.7.4.1



* [PATCH 05/18] drbd: introduce a bio_set to allocate housekeeping bios from
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (3 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 04/18] drbd: use the newly introduced page pool for bitmap IO Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 06/18] drbd: fix drbd_delete_device: remove vnr from volumes; idr_remove(); synchronize_rcu(); before cleanup Philipp Reisner
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Don't rely on the availability of bios from the global fs_bio_set;
use our own bio_set for meta data IO instead.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_actlog.c   |    2 +-
 drivers/block/drbd/drbd_bitmap.c   |    3 +--
 drivers/block/drbd/drbd_int.h      |    6 ++++++
 drivers/block/drbd/drbd_main.c     |   28 ++++++++++++++++++++++++++++
 drivers/block/drbd/drbd_receiver.c |    6 +++++-
 5 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 55f5a7a..9ab6365 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -125,7 +125,7 @@ static int _drbd_md_sync_page_io(struct drbd_conf *mdev,
 		rw |= REQ_FUA | REQ_FLUSH;
 	rw |= REQ_SYNC;
 
-	bio = bio_alloc(GFP_NOIO, 1);
+	bio = bio_alloc_drbd(GFP_NOIO);
 	bio->bi_bdev = bdev->md_bdev;
 	bio->bi_sector = sector;
 	ok = (bio_add_page(bio, page, size, 0) == size);
diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 412ca94..a1d72db 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -968,8 +968,7 @@ static void bm_async_io_complete(struct bio *bio, int error)
 
 static void bm_page_io_async(struct bm_aio_ctx *ctx, int page_nr, int rw) __must_hold(local)
 {
-	/* we are process context. we always get a bio */
-	struct bio *bio = bio_alloc(GFP_KERNEL, 1);
+	struct bio *bio = bio_alloc_drbd(GFP_KERNEL);
 	struct drbd_conf *mdev = ctx->mdev;
 	struct drbd_bitmap *b = mdev->bitmap;
 	struct page *page;
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 4debab7..5d90b9d 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1489,6 +1489,12 @@ extern wait_queue_head_t drbd_pp_wait;
 #define DRBD_MIN_POOL_PAGES	128
 extern mempool_t *drbd_md_io_page_pool;
 
+/* We also need to make sure we get a bio
+ * when we need it for housekeeping purposes */
+extern struct bio_set *drbd_md_io_bio_set;
+/* to allocate from that set */
+extern struct bio *bio_alloc_drbd(gfp_t gfp_mask);
+
 extern rwlock_t global_state_lock;
 
 extern int conn_lowest_minor(struct drbd_tconn *tconn);
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index cb1636c..26bcd13 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -130,6 +130,7 @@ struct kmem_cache *drbd_al_ext_cache;	/* activity log extents */
 mempool_t *drbd_request_mempool;
 mempool_t *drbd_ee_mempool;
 mempool_t *drbd_md_io_page_pool;
+struct bio_set *drbd_md_io_bio_set;
 
 /* I do not use a standard mempool, because:
    1) I want to hand out the pre-allocated objects first.
@@ -150,6 +151,25 @@ static const struct block_device_operations drbd_ops = {
 	.release = drbd_release,
 };
 
+static void bio_destructor_drbd(struct bio *bio)
+{
+	bio_free(bio, drbd_md_io_bio_set);
+}
+
+struct bio *bio_alloc_drbd(gfp_t gfp_mask)
+{
+	struct bio *bio;
+
+	if (!drbd_md_io_bio_set)
+		return bio_alloc(gfp_mask, 1);
+
+	bio = bio_alloc_bioset(gfp_mask, 1, drbd_md_io_bio_set);
+	if (!bio)
+		return NULL;
+	bio->bi_destructor = bio_destructor_drbd;
+	return bio;
+}
+
 #ifdef __CHECKER__
 /* When checking with sparse, and this is an inline function, sparse will
    give tons of false positives. When this is a real functions sparse works.
@@ -1953,6 +1973,8 @@ static void drbd_destroy_mempools(void)
 
 	/* D_ASSERT(atomic_read(&drbd_pp_vacant)==0); */
 
+	if (drbd_md_io_bio_set)
+		bioset_free(drbd_md_io_bio_set);
 	if (drbd_md_io_page_pool)
 		mempool_destroy(drbd_md_io_page_pool);
 	if (drbd_ee_mempool)
@@ -1968,6 +1990,7 @@ static void drbd_destroy_mempools(void)
 	if (drbd_al_ext_cache)
 		kmem_cache_destroy(drbd_al_ext_cache);
 
+	drbd_md_io_bio_set   = NULL;
 	drbd_md_io_page_pool = NULL;
 	drbd_ee_mempool      = NULL;
 	drbd_request_mempool = NULL;
@@ -1993,6 +2016,7 @@ static int drbd_create_mempools(void)
 	drbd_al_ext_cache    = NULL;
 	drbd_pp_pool         = NULL;
 	drbd_md_io_page_pool = NULL;
+	drbd_md_io_bio_set   = NULL;
 
 	/* caches */
 	drbd_request_cache = kmem_cache_create(
@@ -2016,6 +2040,10 @@ static int drbd_create_mempools(void)
 		goto Enomem;
 
 	/* mempools */
+	drbd_md_io_bio_set = bioset_create(DRBD_MIN_POOL_PAGES, 0);
+	if (drbd_md_io_bio_set == NULL)
+		goto Enomem;
+
 	drbd_md_io_page_pool = mempool_create_page_pool(DRBD_MIN_POOL_PAGES, 0);
 	if (drbd_md_io_page_pool == NULL)
 		goto Enomem;
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 64f658b..12b1533 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1127,7 +1127,11 @@ int drbd_submit_peer_request(struct drbd_conf *mdev,
 	/* In most cases, we will only need one bio.  But in case the lower
 	 * level restrictions happen to be different at this offset on this
 	 * side than those of the sending peer, we may need to submit the
-	 * request in more than one bio. */
+	 * request in more than one bio.
+	 *
+	 * Plain bio_alloc is good enough here, this is no DRBD internally
+	 * generated bio, but a bio allocated on behalf of the peer.
+	 */
 next_bio:
 	bio = bio_alloc(GFP_NOIO, nr_pages);
 	if (!bio) {
-- 
1.7.4.1



* [PATCH 06/18] drbd: fix drbd_delete_device: remove vnr from volumes; idr_remove(); synchronize_rcu(); before cleanup
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (4 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 05/18] drbd: introduce a bio_set to allocate housekeeping bios from Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 07/18] drbd: get rid of drbd_bcast_ee, it is of no use anymore Philipp Reisner
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Still missing: rcu_read_lock() on the various call sites that
access/iterate over those idrs.

We don't need a specific write lock, as we only modify from
configuration context, which is already strictly serialized.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_main.c |   42 ++++++++++++++++++++++-----------------
 1 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 26bcd13..21914f4 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2124,7 +2124,9 @@ void drbd_delete_device(unsigned int minor)
 	if (!mdev)
 		return;
 
-	idr_remove(&mdev->tconn->volumes, minor);
+	idr_remove(&mdev->tconn->volumes, mdev->vnr);
+	idr_remove(&minors, minor);
+	synchronize_rcu();
 
 	/* paranoia asserts */
 	D_ASSERT(mdev->open_cnt == 0);
@@ -2153,7 +2155,6 @@ void drbd_delete_device(unsigned int minor)
 	 * allocated from drbd_new_device
 	 * and actually free the mdev itself */
 	drbd_free_mdev(mdev);
-	idr_remove(&minors, minor);
 }
 
 static void drbd_cleanup(void)
@@ -2331,15 +2332,6 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor,
 		return ERR_NOMEM;
 
 	mdev->tconn = tconn;
-	if (!idr_pre_get(&tconn->volumes, GFP_KERNEL))
-		goto out_no_idr;
-	if (idr_get_new(&tconn->volumes, mdev, &vnr_got))
-		goto out_no_idr;
-	if (vnr_got != vnr) {
-		dev_err(DEV, "vnr_got (%d) != vnr (%d)\n", vnr_got, vnr);
-		goto out_no_q;
-	}
-
 	mdev->minor = minor;
 
 	drbd_init_set_defaults(mdev);
@@ -2395,19 +2387,35 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor,
 	INIT_LIST_HEAD(&mdev->current_epoch->list);
 	mdev->epochs = 1;
 
+	if (!idr_pre_get(&tconn->volumes, GFP_KERNEL))
+		goto out_no_vol_idr;
+	if (idr_get_new(&tconn->volumes, mdev, &vnr_got))
+		goto out_no_vol_idr;
+	if (vnr_got != vnr) {
+		dev_err(DEV, "vnr_got (%d) != vnr (%d)\n", vnr_got, vnr);
+		goto out_idr_remove_vol;
+	}
+
 	if (!idr_pre_get(&minors, GFP_KERNEL))
-		goto out_no_minor_idr;
+		goto out_idr_remove_vol;
 	if (idr_get_new(&minors, mdev, &minor_got))
-		goto out_no_minor_idr;
+		goto out_idr_remove_vol;
 	if (minor_got != minor) {
-		idr_remove(&minors, minor_got);
-		goto out_no_minor_idr;
+		/* minor exists, or other idr strangeness? */
+		dev_err(DEV, "available minor (%d) != requested minor (%d)\n",
+				minor_got, minor);
+		goto out_idr_remove_minor;
 	}
 	add_disk(disk);
 
 	return NO_ERROR;
 
-out_no_minor_idr:
+out_idr_remove_minor:
+	idr_remove(&minors, minor_got);
+out_idr_remove_vol:
+	idr_remove(&tconn->volumes, vnr_got);
+	synchronize_rcu();
+out_no_vol_idr:
 	kfree(mdev->current_epoch);
 out_no_epoch:
 	drbd_bm_cleanup(mdev);
@@ -2418,8 +2426,6 @@ out_no_io_page:
 out_no_disk:
 	blk_cleanup_queue(q);
 out_no_q:
-	idr_remove(&tconn->volumes, vnr_got);
-out_no_idr:
 	kfree(mdev);
 	return ERR_NOMEM;
 }
-- 
1.7.4.1
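
The reordering in conn_new_minor() follows a common pattern: set the
object up completely, publish it in the lookup tables last, and unwind
in reverse order on failure. A user-space sketch of that pattern is
below. The toy_registry type and all toy_* names are hypothetical; a
fixed-size array stands in for the idr, and the point where kernel code
would call synchronize_rcu() is only marked by a comment.

```c
#include <stdlib.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for an idr: a small id -> pointer table. */
struct toy_registry {
	void *slot[TOY_SLOTS];
};

struct toy_dev {
	int vnr;
	int minor;
	char *bitmap;
};

static int toy_register(struct toy_registry *r, int id, void *obj)
{
	if (id < 0 || id >= TOY_SLOTS || r->slot[id])
		return -1;	/* out of range, or id already taken */
	r->slot[id] = obj;
	return 0;
}

static void toy_unregister(struct toy_registry *r, int id)
{
	r->slot[id] = NULL;
}

static struct toy_dev *toy_new_dev(struct toy_registry *volumes,
				   struct toy_registry *minors,
				   int vnr, int minor)
{
	struct toy_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->vnr = vnr;
	dev->minor = minor;
	dev->bitmap = malloc(64);
	if (!dev->bitmap)
		goto out_no_bitmap;

	/* Publish only after the object is fully initialized. */
	if (toy_register(volumes, vnr, dev))
		goto out_no_vol;
	if (toy_register(minors, minor, dev))
		goto out_remove_vol;
	return dev;

out_remove_vol:
	toy_unregister(volumes, vnr);
	/* kernel code would wait for readers here (synchronize_rcu())
	 * before freeing anything they might still look at */
out_no_vol:
	free(dev->bitmap);
out_no_bitmap:
	free(dev);
	return NULL;
}
```

Each label undoes exactly the steps that succeeded before the jump, so
a failed registration never leaves a stale pointer in either table.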



* [PATCH 07/18] drbd: get rid of drbd_bcast_ee, it is of no use anymore
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (5 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 06/18] drbd: fix drbd_delete_device: remove vnr from volumes; idr_remove(); synchronize_rcu(); before cleanup Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 08/18] drbd: prepare the transition from connector to genetlink Philipp Reisner
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

This function was used to broadcast the (leading part of the)
bio payload in case we see a data integrity error.  It could be received
from userland with the drbdsetup events subcommand,
to have a peek into the payload that caused the checksum mismatch,
and guess from there what may have caused the mismatch,
mainly to guess whether it was modification of in-flight data,
or data corruption by broken hardware or software bugs.

Meanwhile we support bios that are larger than the maximum payload a
netlink datagram can carry.
And we have means to reliably detect modification of in-flight data by
calculating, and comparing, the checksum before and after sendmsg.
There is no need to carry this around anymore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_receiver.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 12b1533..6dfea2f 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1356,8 +1356,6 @@ read_in_block(struct drbd_conf *mdev, u64 id, sector_t sector,
 		if (memcmp(dig_in, dig_vv, dgs)) {
 			dev_err(DEV, "Digest integrity check FAILED: %llus +%u\n",
 				(unsigned long long)sector, data_size);
-			drbd_bcast_ee(mdev, "digest failed",
-					dgs, dig_in, dig_vv, peer_req);
 			drbd_free_ee(mdev, peer_req);
 			return NULL;
 		}
-- 
1.7.4.1



* [PATCH 08/18] drbd: prepare the transition from connector to genetlink
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (6 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 07/18] drbd: get rid of drbd_bcast_ee, it is of no use anymore Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 09/18] drbd: switch configuration interface " Philipp Reisner
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

This adds the new API header and helper files.
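
The overview comment in drbd_genl.h describes one field list being
expanded several times (struct declaration, tag enum, policy). The
general technique can be sketched with a plain X-macro, as below. This
is not the genl_magic implementation: the TOY_* macros, the chosen C
types and the name table are assumptions made for illustration. Only
the T_ ## field_name naming and the field names are taken from the
patch.

```c
#include <stdint.h>

/* Hypothetical field list: X(tag number, C type, field name).
 * Field names follow resize_parms from drbd_genl.h. */
#define TOY_RESIZE_PARMS_FIELDS(X)	\
	X(1, uint64_t, resize_size)	\
	X(2, uint8_t,  resize_force)	\
	X(3, uint8_t,  no_resync)

/* Expansion 1: the struct declaration. */
#define TOY_AS_MEMBER(nr, type, name)	type name;
struct toy_resize_parms {
	TOY_RESIZE_PARMS_FIELDS(TOY_AS_MEMBER)
};

/* Expansion 2: per-struct tag numbers, available as T_<field name>.
 * Tags start at 1, leaving 0 unused. */
#define TOY_AS_TAG(nr, type, name)	T_##name = nr,
enum toy_resize_parms_tags {
	TOY_RESIZE_PARMS_FIELDS(TOY_AS_TAG)
};

/* Expansion 3: a table indexed by tag number, the same shape a
 * policy array indexed by tag would have. */
#define TOY_AS_NAME(nr, type, name)	[nr] = #name,
static const char *const toy_resize_parms_names[] = {
	TOY_RESIZE_PARMS_FIELDS(TOY_AS_NAME)
};
```

Because every T_ name lands in the same C namespace, two structs cannot
share a field name, which matches the restriction stated in the header
comment.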

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 include/linux/drbd_genl.h         |  349 +++++++++++++++++++++++++++++++
 include/linux/drbd_genl_api.h     |   55 +++++
 include/linux/genl_magic_func.h   |  417 +++++++++++++++++++++++++++++++++++++
 include/linux/genl_magic_struct.h |  260 +++++++++++++++++++++++
 4 files changed, 1081 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/drbd_genl.h
 create mode 100644 include/linux/drbd_genl_api.h
 create mode 100644 include/linux/genl_magic_func.h
 create mode 100644 include/linux/genl_magic_struct.h

diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
new file mode 100644
index 0000000..84e1684
--- /dev/null
+++ b/include/linux/drbd_genl.h
@@ -0,0 +1,349 @@
+/*
+ * General overview:
+ * full generic netlink message:
+ * |nlmsghdr|genlmsghdr|<payload>
+ *
+ * payload:
+ * |optional fixed size family header|<sequence of netlink attributes>
+ *
+ * sequence of netlink attributes:
+ * I chose to have all "top level" attributes NLA_NESTED,
+ * corresponding to some real struct.
+ * So we have a sequence of |tla, len|<nested nla sequence>
+ *
+ * nested nla sequence:
+ * may be empty, or contain a sequence of netlink attributes
+ * representing the struct fields.
+ *
+ * The tag number of any field (regardless of containing struct)
+ * will be available as T_ ## field_name,
+ * so you cannot have the same field name in two different structs.
+ *
+ * The tag numbers themselves are per struct, though,
+ * so should always begin at 1 (not 0, that is the special "NLA_UNSPEC" type,
+ * which we won't use here).
+ * The tag numbers are used as index in the respective nla_policy array.
+ *
+ * GENL_struct(tag_name, tag_number, struct name, struct fields) - struct and policy
+ *	genl_magic_struct.h
+ *		generates the struct declaration,
+ *		generates an entry in the tla enum,
+ *	genl_magic_func.h
+ *		generates an entry in the static tla policy
+ *		with .type = NLA_NESTED
+ *		generates the static <struct_name>_nl_policy definition,
+ *		and static conversion functions
+ *
+ *	genl_magic_func.h
+ *
+ * GENL_mc_group(group)
+ *	genl_magic_struct.h
+ *		does nothing
+ *	genl_magic_func.h
+ *		defines and registers the mcast group,
+ *		and provides a send helper
+ *
+ * GENL_notification(op_name, op_num, mcast_group, tla list)
+ *	These are notifications to userspace.
+ *
+ *	genl_magic_struct.h
+ *		generates an entry in the genl_ops enum,
+ *	genl_magic_func.h
+ *		does nothing
+ *
+ *	mcast group: the name of the mcast group this notification should be
+ *	expected on
+ *	tla list: the list of expected top level attributes,
+ *	for documentation and sanity checking.
+ *
+ * GENL_op(op_name, op_num, flags and handler, tla list) - "genl operations"
+ *	These are requests from userspace.
+ *
+ *	_op and _notification share the same "number space",
+ *	op_nr will be assigned to "genlmsghdr->cmd"
+ *
+ *	genl_magic_struct.h
+ *		generates an entry in the genl_ops enum,
+ *	genl_magic_func.h
+ *		generates an entry in the static genl_ops array,
+ *		and static register/unregister functions to
+ *		genl_register_family_with_ops().
+ *
+ *	flags and handler:
+ *		GENL_op_init( .doit = x, .dumpit = y, .flags = something)
+ *		GENL_doit(x) => .dumpit = NULL, .flags = GENL_ADMIN_PERM
+ *	tla list: the list of expected top level attributes,
+ *	for documentation and sanity checking.
+ */
+
+/*
+ * STRUCTS
+ */
+
+/* this is sent kernel -> userland on various error conditions, and contains
+ * informational textual info, which is supposedly human readable.
+ * The computer relevant return code is in the drbd_genlmsghdr.
+ */
+GENL_struct(DRBD_NLA_CFG_REPLY, 1, drbd_cfg_reply,
+		/* "arbitrary" size strings, nla_policy.len = 0 */
+	__str_field(1, GENLA_F_MANDATORY,	info_text, 0)
+)
+
+/* Configuration requests typically need a context to operate on.
+ * Possible keys are device minor (fits in the drbd_genlmsghdr),
+ * the replication link (aka connection) name,
+ * and/or the replication group (aka resource) name,
+ * and the volume id within the resource. */
+GENL_struct(DRBD_NLA_CFG_CONTEXT, 2, drbd_cfg_context,
+		/* currently only 256 volumes per group,
+		 * but maybe we still change that */
+	__u32_field(1, GENLA_F_MANDATORY,	ctx_volume)
+	__str_field(2, GENLA_F_MANDATORY,	ctx_conn_name, 128)
+)
+
+GENL_struct(DRBD_NLA_DISK_CONF, 3, disk_conf,
+	__u64_field(1, GENLA_F_MANDATORY,	disk_size)
+	__str_field(2, GENLA_F_REQUIRED,	backing_dev,	128)
+	__str_field(3, GENLA_F_REQUIRED,	meta_dev,	128)
+	__u32_field(4, GENLA_F_REQUIRED,	meta_dev_idx)
+	__u32_field(5, GENLA_F_MANDATORY,	max_bio_bvecs)
+	__u32_field(6, GENLA_F_MANDATORY,	on_io_error)
+	__u32_field(7, GENLA_F_MANDATORY,	fencing)
+	__flg_field(8, GENLA_F_MANDATORY,	no_disk_barrier)
+	__flg_field(9, GENLA_F_MANDATORY,	no_disk_flush)
+	__flg_field(10, GENLA_F_MANDATORY,	no_disk_drain)
+	__flg_field(11, GENLA_F_MANDATORY,	no_md_flush)
+	__flg_field(12, GENLA_F_MANDATORY,	use_bmbv)
+)
+
+GENL_struct(DRBD_NLA_SYNCER_CONF, 4, syncer_conf,
+	__u32_field(1,	GENLA_F_MANDATORY,	rate)
+	__u32_field(2,	GENLA_F_MANDATORY,	after)
+	__u32_field(3,	GENLA_F_MANDATORY,	al_extents)
+	__str_field(4,	GENLA_F_MANDATORY,	cpu_mask,       32)
+	__str_field(5,	GENLA_F_MANDATORY,	verify_alg,     SHARED_SECRET_MAX)
+	__str_field(6,	GENLA_F_MANDATORY,	csums_alg,	SHARED_SECRET_MAX)
+	__flg_field(7,	GENLA_F_MANDATORY,	use_rle)
+	__u32_field(8,	GENLA_F_MANDATORY,	on_no_data)
+	__u32_field(9,	GENLA_F_MANDATORY,	c_plan_ahead)
+	__u32_field(10,	GENLA_F_MANDATORY,	c_delay_target)
+	__u32_field(11,	GENLA_F_MANDATORY,	c_fill_target)
+	__u32_field(12,	GENLA_F_MANDATORY,	c_max_rate)
+	__u32_field(13,	GENLA_F_MANDATORY,	c_min_rate)
+)
+
+GENL_struct(DRBD_NLA_NET_CONF, 5, net_conf,
+	__str_field(1,	GENLA_F_MANDATORY | GENLA_F_SENSITIVE,
+						shared_secret,	SHARED_SECRET_MAX)
+	__str_field(2,	GENLA_F_MANDATORY,	cram_hmac_alg,	SHARED_SECRET_MAX)
+	__str_field(3,	GENLA_F_MANDATORY,	integrity_alg,	SHARED_SECRET_MAX)
+	__str_field(4,	GENLA_F_REQUIRED,	my_addr,	128)
+	__str_field(5,	GENLA_F_REQUIRED,	peer_addr,	128)
+	__u32_field(6,	GENLA_F_REQUIRED,	wire_protocol)
+	__u32_field(7,	GENLA_F_MANDATORY,	try_connect_int)
+	__u32_field(8,	GENLA_F_MANDATORY,	timeout)
+	__u32_field(9,	GENLA_F_MANDATORY,	ping_int)
+	__u32_field(10,	GENLA_F_MANDATORY,	ping_timeo)
+	__u32_field(11,	GENLA_F_MANDATORY,	sndbuf_size)
+	__u32_field(12,	GENLA_F_MANDATORY,	rcvbuf_size)
+	__u32_field(13,	GENLA_F_MANDATORY,	ko_count)
+	__u32_field(14,	GENLA_F_MANDATORY,	max_buffers)
+	__u32_field(15,	GENLA_F_MANDATORY,	max_epoch_size)
+	__u32_field(16,	GENLA_F_MANDATORY,	unplug_watermark)
+	__u32_field(17,	GENLA_F_MANDATORY,	after_sb_0p)
+	__u32_field(18,	GENLA_F_MANDATORY,	after_sb_1p)
+	__u32_field(19,	GENLA_F_MANDATORY,	after_sb_2p)
+	__u32_field(20,	GENLA_F_MANDATORY,	rr_conflict)
+	__u32_field(21,	GENLA_F_MANDATORY,	on_congestion)
+	__u32_field(22,	GENLA_F_MANDATORY,	cong_fill)
+	__u32_field(23,	GENLA_F_MANDATORY,	cong_extents)
+	__flg_field(24, GENLA_F_MANDATORY,	two_primaries)
+	__flg_field(25, GENLA_F_MANDATORY,	want_lose)
+	__flg_field(26, GENLA_F_MANDATORY,	no_cork)
+	__flg_field(27, GENLA_F_MANDATORY,	always_asbp)
+	__flg_field(28, GENLA_F_MANDATORY,	dry_run)
+)
+
+GENL_struct(DRBD_NLA_SET_ROLE_PARMS, 6, set_role_parms,
+	__flg_field(1, GENLA_F_MANDATORY,	assume_uptodate)
+)
+
+GENL_struct(DRBD_NLA_RESIZE_PARMS, 7, resize_parms,
+	__u64_field(1, GENLA_F_MANDATORY,	resize_size)
+	__flg_field(2, GENLA_F_MANDATORY,	resize_force)
+	__flg_field(3, GENLA_F_MANDATORY,	no_resync)
+)
+
+GENL_struct(DRBD_NLA_STATE_INFO, 8, state_info,
+	/* the reason for the broadcast,
+	 * if this is an event triggered broadcast. */
+	__u32_field(1, GENLA_F_MANDATORY,	sib_reason)
+	__u32_field(2, GENLA_F_REQUIRED,	current_state)
+	__u64_field(3, GENLA_F_MANDATORY,	capacity)
+	__u64_field(4, GENLA_F_MANDATORY,	ed_uuid)
+
+	/* These are for broadcast from after state change work.
+	 * prev_state and new_state are from the moment the state change took
+	 * place; new_state is not necessarily the same as current_state,
+	 * as there may have been more state changes since.  Those will be
+	 * broadcast soon, in their respective after state change work.  */
+	__u32_field(5, GENLA_F_MANDATORY,	prev_state)
+	__u32_field(6, GENLA_F_MANDATORY,	new_state)
+
+	/* if we have a local disk: */
+	__bin_field(7, GENLA_F_MANDATORY,	uuids, (UI_SIZE*sizeof(__u64)))
+	__u32_field(8, GENLA_F_MANDATORY,	disk_flags)
+	__u64_field(9, GENLA_F_MANDATORY,	bits_total)
+	__u64_field(10, GENLA_F_MANDATORY,	bits_oos)
+	/* and in case resync or online verify is active */
+	__u64_field(11, GENLA_F_MANDATORY,	bits_rs_total)
+	__u64_field(12, GENLA_F_MANDATORY,	bits_rs_failed)
+
+	/* for pre and post notifications of helper execution */
+	__str_field(13, GENLA_F_MANDATORY,	helper, 32)
+	__u32_field(14, GENLA_F_MANDATORY,	helper_exit_code)
+)
+
+GENL_struct(DRBD_NLA_START_OV_PARMS, 9, start_ov_parms,
+	__u64_field(1, GENLA_F_MANDATORY,	ov_start_sector)
+)
+
+GENL_struct(DRBD_NLA_NEW_C_UUID_PARMS, 10, new_c_uuid_parms,
+	__flg_field(1, GENLA_F_MANDATORY, clear_bm)
+)
+
+GENL_struct(DRBD_NLA_TIMEOUT_PARMS, 11, timeout_parms,
+	__u32_field(1,	GENLA_F_REQUIRED,	timeout_type)
+)
+
+GENL_struct(DRBD_NLA_DISCONNECT_PARMS, 12, disconnect_parms,
+	__flg_field(1, GENLA_F_MANDATORY,	force_disconnect)
+)
+
+/*
+ * Notifications and commands (genlmsghdr->cmd)
+ */
+GENL_mc_group(events)
+
+	/* kernel -> userspace announcement of changes */
+GENL_notification(
+	DRBD_EVENT, 1, events,
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_STATE_INFO, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_NET_CONF, GENLA_F_MANDATORY)
+	GENL_tla_expected(DRBD_NLA_DISK_CONF, GENLA_F_MANDATORY)
+	GENL_tla_expected(DRBD_NLA_SYNCER_CONF, GENLA_F_MANDATORY)
+)
+
+	/* query kernel for specific or all info */
+GENL_op(
+	DRBD_ADM_GET_STATUS, 2,
+	GENL_op_init(
+		.doit = drbd_adm_get_status,
+		.dumpit = drbd_adm_get_status_all,
+		/* anyone may ask for the status,
+		 * it is broadcast anyway */
+	),
+	/* To select the object in .doit,
+	 * or a subset of objects in .dumpit. */
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_MANDATORY)
+)
+
+#if 0
+	/* TO BE DONE */
+	/* create or destroy resources, aka replication groups */
+GENL_op(DRBD_ADM_CREATE_RESOURCE, 3, GENL_doit(drbd_adm_create_resource),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_DELETE_RESOURCE, 4, GENL_doit(drbd_adm_delete_resource),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+#endif
+
+	/* add DRBD minor devices as volumes to resources */
+GENL_op(DRBD_ADM_ADD_MINOR, 5, GENL_doit(drbd_adm_add_minor),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_DEL_MINOR, 6, GENL_doit(drbd_adm_delete_minor),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+
+	/* add or delete replication links to resources */
+GENL_op(DRBD_ADM_ADD_LINK, 7, GENL_doit(drbd_adm_create_connection),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_DEL_LINK, 8, GENL_doit(drbd_adm_delete_connection),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+
+	/* operates on replication links */
+GENL_op(DRBD_ADM_SYNCER, 9,
+	GENL_doit(drbd_adm_syncer),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_SYNCER_CONF, GENLA_F_MANDATORY)
+)
+
+GENL_op(
+	DRBD_ADM_CONNECT, 10,
+	GENL_doit(drbd_adm_connect),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_NET_CONF, GENLA_F_REQUIRED)
+)
+
+GENL_op(DRBD_ADM_DISCONNECT, 11, GENL_doit(drbd_adm_disconnect),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+
+	/* operates on minors */
+GENL_op(DRBD_ADM_ATTACH, 12,
+	GENL_doit(drbd_adm_attach),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_DISK_CONF, GENLA_F_REQUIRED)
+)
+
+GENL_op(
+	DRBD_ADM_RESIZE, 13,
+	GENL_doit(drbd_adm_resize),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_RESIZE_PARMS, GENLA_F_MANDATORY)
+)
+
+	/* operates on all volumes within a resource */
+GENL_op(
+	DRBD_ADM_PRIMARY, 14,
+	GENL_doit(drbd_adm_set_role),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_SET_ROLE_PARMS, GENLA_F_REQUIRED)
+)
+
+GENL_op(
+	DRBD_ADM_SECONDARY, 15,
+	GENL_doit(drbd_adm_set_role),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_SET_ROLE_PARMS, GENLA_F_REQUIRED)
+)
+
+GENL_op(
+	DRBD_ADM_NEW_C_UUID, 16,
+	GENL_doit(drbd_adm_new_c_uuid),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED)
+	GENL_tla_expected(DRBD_NLA_NEW_C_UUID_PARMS, GENLA_F_MANDATORY)
+)
+
+GENL_op(
+	DRBD_ADM_START_OV, 17,
+	GENL_doit(drbd_adm_start_ov),
+	GENL_tla_expected(DRBD_NLA_START_OV_PARMS, GENLA_F_MANDATORY)
+)
+
+GENL_op(DRBD_ADM_DETACH,	18, GENL_doit(drbd_adm_detach),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_INVALIDATE,	19, GENL_doit(drbd_adm_invalidate),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_INVAL_PEER,	20, GENL_doit(drbd_adm_invalidate_peer),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_PAUSE_SYNC,	21, GENL_doit(drbd_adm_pause_sync),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_RESUME_SYNC,	22, GENL_doit(drbd_adm_resume_sync),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_SUSPEND_IO,	23, GENL_doit(drbd_adm_suspend_io),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_RESUME_IO,	24, GENL_doit(drbd_adm_resume_io),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_OUTDATE,	25, GENL_doit(drbd_adm_outdate),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_GET_TIMEOUT_TYPE, 26, GENL_doit(drbd_adm_get_timeout_type),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
diff --git a/include/linux/drbd_genl_api.h b/include/linux/drbd_genl_api.h
new file mode 100644
index 0000000..9ef50d5
--- /dev/null
+++ b/include/linux/drbd_genl_api.h
@@ -0,0 +1,55 @@
+#ifndef DRBD_GENL_STRUCT_H
+#define DRBD_GENL_STRUCT_H
+
+/**
+ * struct drbd_genlmsghdr - DRBD specific header used in NETLINK_GENERIC requests
+ * @minor:
+ *     For admin requests (user -> kernel): which minor device to operate on.
+ *     For (unicast) replies or informational (broadcast) messages
+ *     (kernel -> user): which minor device the information is about.
+ *     If we do not operate on minors, but on connections or resources,
+ *     the minor value shall be (~0), and the attribute DRBD_NLA_CFG_CONTEXT
+ *     is used instead.
+ * @flags: possible operation modifiers (relevant only for user->kernel):
+ *     DRBD_GENL_F_SET_DEFAULTS
+ * @volume:
+ *     When creating a new minor (adding it to a resource), the resource needs
+ *     to know which volume number within the resource this is supposed to be.
+ *     The volume number corresponds to the same volume number on the remote side,
+ *     whereas the minor number on the remote side may be different
+ *     (union with flags).
+ * @ret_code: kernel->userland unicast cfg reply return code (union with flags);
+ */
+struct drbd_genlmsghdr {
+	__u32 minor;
+	union {
+	__u32 flags;
+	__s32 ret_code;
+	};
+};
+
+/* To be used in drbd_genlmsghdr.flags */
+enum {
+	DRBD_GENL_F_SET_DEFAULTS = 1,
+};
+
+enum drbd_state_info_bcast_reason {
+	SIB_GET_STATUS_REPLY = 1,
+	SIB_STATE_CHANGE = 2,
+	SIB_HELPER_PRE = 3,
+	SIB_HELPER_POST = 4,
+	SIB_SYNC_PROGRESS = 5,
+};
+
+/* hack around predefined gcc/cpp "linux=1",
+ * we cannot possibly include <1/drbd_genl.h> */
+#undef linux
+
+#include <linux/drbd.h>
+#define GENL_MAGIC_VERSION	API_VERSION
+#define GENL_MAGIC_FAMILY	drbd
+#define GENL_MAGIC_FAMILY_HDRSZ	sizeof(struct drbd_genlmsghdr)
+#define GENL_MAGIC_INCLUDE_FILE <linux/drbd_genl.h>
+#include <linux/genl_magic_struct.h>
+
+#endif
diff --git a/include/linux/genl_magic_func.h b/include/linux/genl_magic_func.h
new file mode 100644
index 0000000..8a86f65
--- /dev/null
+++ b/include/linux/genl_magic_func.h
@@ -0,0 +1,417 @@
+#ifndef GENL_MAGIC_FUNC_H
+#define GENL_MAGIC_FUNC_H
+
+#include <linux/genl_magic_struct.h>
+
+/*
+ * Extension of genl attribute validation policies			{{{1
+ *									{{{2
+ */
+
+/**
+ * nla_is_required - return true if this attribute is required
+ * @nla: netlink attribute
+ */
+static inline int nla_is_required(const struct nlattr *nla)
+{
+        return nla->nla_type & GENLA_F_REQUIRED;
+}
+
+/**
+ * nla_is_mandatory - return true if understanding this attribute is mandatory
+ * @nla: netlink attribute
+ * Note: REQUIRED attributes are implicitly MANDATORY as well
+ */
+static inline int nla_is_mandatory(const struct nlattr *nla)
+{
+        return nla->nla_type & (GENLA_F_MANDATORY | GENLA_F_REQUIRED);
+}
+
+/* Functionality to be integrated into nla_parse(), and validate_nla(),
+ * respectively.
+ *
+ * Enforcing the "mandatory" bit is done here,
+ * by rejecting unknown mandatory attributes.
+ *
+ * Part of enforcing the "required" flag would mean to embed it into
+ * nla_policy.type, and extending validate_nla(), which currently does
+ * BUG_ON(pt->type > NLA_TYPE_MAX); we have to work on existing kernels,
+ * so we cannot do that.  That's why enforcing "required" is done in the
+ * generated assignment functions below. */
+static int nla_check_unknown(int maxtype, struct nlattr *head, int len)
+{
+	struct nlattr *nla;
+	int rem;
+        nla_for_each_attr(nla, head, len, rem) {
+		__u16 type = nla_type(nla);
+		if (type > maxtype && nla_is_mandatory(nla))
+			return -EOPNOTSUPP;
+	}
+	return 0;
+}
+
+/*
+ * Magic: declare tla policy						{{{1
+ * Magic: declare nested policies
+ *									{{{2
+ */
+#undef GENL_mc_group
+#define GENL_mc_group(group)
+
+#undef GENL_notification
+#define GENL_notification(op_name, op_num, mcast_group, tla_list)
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, tla_list)
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)		\
+	[tag_name] = { .type = NLA_NESTED },
+
+static struct nla_policy CONCAT_(GENL_MAGIC_FAMILY, _tla_nl_policy)[] = {
+#include GENL_MAGIC_INCLUDE_FILE
+};
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)		\
+static struct nla_policy s_name ## _nl_policy[] __read_mostly =		\
+{ s_fields };
+
+#undef __field
+#define __field(attr_nr, attr_flag, name, nla_type, _type, __get, __put) \
+	[__nla_type(attr_nr)] = { .type = nla_type },
+
+#undef __array
+#define __array(attr_nr, attr_flag, name, nla_type, _type, maxlen,	\
+		__get, __put)						\
+	[__nla_type(attr_nr)] = { .type = nla_type,			\
+		.len = maxlen - (nla_type == NLA_NUL_STRING) },
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+#ifndef __KERNEL__
+#ifndef pr_info
+#define pr_info(args...)	fprintf(stderr, args)
+#endif
+#endif
+
+#if 1
+static void dprint_field(const char *dir, int nla_type,
+		const char *name, void *valp)
+{
+	__u64 val = valp ? *(__u32 *)valp : 1;
+	switch (nla_type) {
+	case NLA_U8:  val = (__u8)val;
+	case NLA_U16: val = (__u16)val;
+	case NLA_U32: val = (__u32)val;
+		pr_info("%s attr %s: %d 0x%08x\n", dir,
+			name, (int)val, (unsigned)val);
+		break;
+	case NLA_U64:
+		val = *(__u64*)valp;
+		pr_info("%s attr %s: %lld 0x%08llx\n", dir,
+			name, (long long)val, (unsigned long long)val);
+		break;
+	case NLA_FLAG:
+		if (val)
+			pr_info("%s attr %s: set\n", dir, name);
+		break;
+	}
+}
+
+static void dprint_array(const char *dir, int nla_type,
+		const char *name, const char *val, unsigned len)
+{
+	switch (nla_type) {
+	case NLA_NUL_STRING:
+		if (len && val[len-1] == '\0')
+			len--;
+		pr_info("%s attr %s: [len:%u] '%s'\n", dir, name, len, val);
+		break;
+	default:
+		/* we can always show 4 bytes,
+		 * that's what nlattrs are aligned to. */
+		pr_info("%s attr %s: [len:%u] %02x%02x%02x%02x ...\n",
+			dir, name, len, val[0], val[1], val[2], val[3]);
+	}
+}
+
+#define DPRINT_TLA(a, op, b) pr_info("%s %s %s\n", a, op, b)
+
+/* Name is a member field name of the struct s.
+ * If s is NULL (only parsing, no copy requested in *_from_attrs()),
+ * nla is supposed to point to the attribute containing the information
+ * corresponding to that struct member. */
+#define DPRINT_FIELD(dir, nla_type, name, s, nla)			\
+	do {								\
+		if (s)							\
+			dprint_field(dir, nla_type, #name, &s->name);	\
+		else if (nla)						\
+			dprint_field(dir, nla_type, #name,		\
+				(nla_type == NLA_FLAG) ? NULL		\
+						: nla_data(nla));	\
+	} while (0)
+
+#define	DPRINT_ARRAY(dir, nla_type, name, s, nla)			\
+	do {								\
+		if (s)							\
+			dprint_array(dir, nla_type, #name,		\
+					s->name, s->name ## _len);	\
+		else if (nla)						\
+			dprint_array(dir, nla_type, #name,		\
+					nla_data(nla), nla_len(nla));	\
+	} while (0)
+#else
+#define DPRINT_TLA(a, op, b) do {} while (0)
+#define DPRINT_FIELD(dir, nla_type, name, s, nla) do {} while (0)
+#define	DPRINT_ARRAY(dir, nla_type, name, s, nla) do {} while (0)
+#endif
+
+/*
+ * Magic: provide conversion functions					{{{1
+ * populate struct from attribute table:
+ *									{{{2
+ */
+
+/* processing of generic netlink messages is serialized.
+ * use one static buffer for parsing of nested attributes */
+static struct nlattr *nested_attr_tb[128];
+
+#ifndef BUILD_BUG_ON
+/* Force a compilation error if condition is true */
+#define BUILD_BUG_ON(condition) ((void)BUILD_BUG_ON_ZERO(condition))
+/* Force a compilation error if condition is true, but also produce a
+   result (of value 0 and type size_t), so the expression can be used
+   e.g. in a structure initializer (or where-ever else comma expressions
+   aren't permitted). */
+#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
+#define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); }))
+#endif
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)		\
+	/* static, potentially unused */				\
+int s_name ## _from_attrs(struct s_name *s, struct nlattr *tb[])	\
+{									\
+	const int maxtype = ARRAY_SIZE(s_name ## _nl_policy)-1;		\
+	struct nlattr *tla = tb[tag_number];				\
+	struct nlattr **ntb = nested_attr_tb;				\
+	struct nlattr *nla;						\
+	int err;							\
+	BUILD_BUG_ON(ARRAY_SIZE(s_name ## _nl_policy) > ARRAY_SIZE(nested_attr_tb));	\
+	if (!tla)							\
+		return -ENOMSG;						\
+	DPRINT_TLA(#s_name, "<=-", #tag_name);				\
+	err = nla_parse_nested(ntb, maxtype, tla, s_name ## _nl_policy); \
+	if (err)							\
+		return err;						\
+	err = nla_check_unknown(maxtype, nla_data(tla), nla_len(tla));	\
+	if (err)							\
+	      return err;						\
+									\
+	s_fields							\
+	return 0;							\
+}
+
+#undef __field
+#define __field(attr_nr, attr_flag, name, nla_type, type, __get, __put)	\
+		nla = ntb[__nla_type(attr_nr)];				\
+		if (nla) {						\
+			if (s)						\
+				s->name = __get(nla);			\
+			DPRINT_FIELD("<<", nla_type, name, s, nla);	\
+		} else if ((attr_flag) & GENLA_F_REQUIRED) {		\
+			pr_info("<< missing attr: %s\n", #name);	\
+			return -ENOMSG;					\
+		}
+
+/* validate_nla() already checked nla_len <= maxlen appropriately. */
+#undef __array
+#define __array(attr_nr, attr_flag, name, nla_type, type, maxlen, __get, __put) \
+		nla = ntb[__nla_type(attr_nr)];				\
+		if (nla) {						\
+			if (s)						\
+				s->name ## _len =			\
+					__get(s->name, nla, maxlen);	\
+			DPRINT_ARRAY("<<", nla_type, name, s, nla);	\
+		} else if ((attr_flag) & GENLA_F_REQUIRED) {		\
+			pr_info("<< missing attr: %s\n", #name);	\
+			return -ENOMSG;					\
+		}							\
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)
+
+/*
+ * Magic: define op number to op name mapping				{{{1
+ *									{{{2
+ */
+const char *CONCAT_(GENL_MAGIC_FAMILY, _genl_cmd_to_str)(__u8 cmd)
+{
+	switch (cmd) {
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, tla_list)		\
+	case op_num: return #op_name;
+#include GENL_MAGIC_INCLUDE_FILE
+	default:
+		     return "unknown";
+	}
+}
+
+#ifdef __KERNEL__
+#include <linux/stringify.h>
+/*
+ * Magic: define genl_ops						{{{1
+ *									{{{2
+ */
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, tla_list)		\
+{								\
+	handler							\
+	.cmd = op_name,						\
+	.policy	= CONCAT_(GENL_MAGIC_FAMILY, _tla_nl_policy),	\
+},
+
+#define ZZZ_genl_ops		CONCAT_(GENL_MAGIC_FAMILY, _genl_ops)
+static struct genl_ops ZZZ_genl_ops[] __read_mostly = {
+#include GENL_MAGIC_INCLUDE_FILE
+};
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, tla_list)
+
+/*
+ * Define the genl_family, multicast groups,				{{{1
+ * and provide register/unregister functions.
+ *									{{{2
+ */
+#define ZZZ_genl_family		CONCAT_(GENL_MAGIC_FAMILY, _genl_family)
+static struct genl_family ZZZ_genl_family __read_mostly = {
+	.id = GENL_ID_GENERATE,
+	.name = __stringify(GENL_MAGIC_FAMILY),
+	.version = GENL_MAGIC_VERSION,
+#ifdef GENL_MAGIC_FAMILY_HDRSZ
+	.hdrsize = NLA_ALIGN(GENL_MAGIC_FAMILY_HDRSZ),
+#endif
+	.maxattr = ARRAY_SIZE(CONCAT_(GENL_MAGIC_FAMILY, _tla_nl_policy))-1,
+};
+
+/*
+ * Magic: define multicast groups
+ * Magic: define multicast group registration helper
+ */
+#undef GENL_mc_group
+#define GENL_mc_group(group)						\
+static struct genl_multicast_group					\
+CONCAT_(GENL_MAGIC_FAMILY, _mcg_ ## group) __read_mostly = {		\
+	.name = #group,							\
+};									\
+static int CONCAT_(GENL_MAGIC_FAMILY, _genl_multicast_ ## group)(	\
+	struct sk_buff *skb, gfp_t flags)				\
+{									\
+	unsigned int group_id =						\
+		CONCAT_(GENL_MAGIC_FAMILY, _mcg_ ## group).id;	\
+	if (!group_id)							\
+		return -EINVAL;						\
+	return genlmsg_multicast(skb, 0, group_id, flags);		\
+}
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+int CONCAT_(GENL_MAGIC_FAMILY, _genl_register)(void)
+{
+	int err = genl_register_family_with_ops(&ZZZ_genl_family,
+		ZZZ_genl_ops, ARRAY_SIZE(ZZZ_genl_ops));
+	if (err)
+		return err;
+#undef GENL_mc_group
+#define GENL_mc_group(group)						\
+	err = genl_register_mc_group(&ZZZ_genl_family,			\
+		&CONCAT_(GENL_MAGIC_FAMILY, _mcg_ ## group));		\
+	if (err)							\
+		goto fail;						\
+	else								\
+		pr_info("%s: mcg %s: %u\n", #group,			\
+			__stringify(GENL_MAGIC_FAMILY),			\
+			CONCAT_(GENL_MAGIC_FAMILY, _mcg_ ## group).id);
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+#undef GENL_mc_group
+#define GENL_mc_group(group)
+	return 0;
+fail:
+	genl_unregister_family(&ZZZ_genl_family);
+	return err;
+}
+
+void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void)
+{
+	genl_unregister_family(&ZZZ_genl_family);
+}
+
+/*
+ * Magic: provide conversion functions					{{{1
+ * populate skb from struct.
+ *									{{{2
+ */
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, tla_list)
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)		\
+static int s_name ## _to_skb(struct sk_buff *skb, struct s_name *s,	\
+		const bool exclude_sensitive)				\
+{									\
+	struct nlattr *tla = nla_nest_start(skb, tag_number);		\
+	if (!tla)							\
+		goto nla_put_failure;					\
+	DPRINT_TLA(#s_name, "-=>", #tag_name);				\
+	s_fields							\
+	nla_nest_end(skb, tla);						\
+	return 0;							\
+									\
+nla_put_failure:							\
+	if (tla)							\
+		nla_nest_cancel(skb, tla);				\
+        return -EMSGSIZE;						\
+}									\
+static inline int s_name ## _to_priv_skb(struct sk_buff *skb,		\
+		struct s_name *s)					\
+{									\
+	return s_name ## _to_skb(skb, s, 0);				\
+}									\
+static inline int s_name ## _to_unpriv_skb(struct sk_buff *skb,		\
+		struct s_name *s)					\
+{									\
+	return s_name ## _to_skb(skb, s, 1);				\
+}
+
+
+#undef __field
+#define __field(attr_nr, attr_flag, name, nla_type, type, __get, __put)	\
+	if (!exclude_sensitive || !((attr_flag) & GENLA_F_SENSITIVE)) {	\
+		DPRINT_FIELD(">>", nla_type, name, s, NULL);		\
+		__put(skb, attr_nr, s->name);				\
+	}
+
+#undef __array
+#define __array(attr_nr, attr_flag, name, nla_type, type, maxlen, __get, __put) \
+	if (!exclude_sensitive || !((attr_flag) & GENLA_F_SENSITIVE)) {	\
+		DPRINT_ARRAY(">>",nla_type, name, s, NULL);		\
+		__put(skb, attr_nr, min_t(int, maxlen,			\
+			s->name ## _len + (nla_type == NLA_NUL_STRING)),\
+						s->name);		\
+	}
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+#endif /* __KERNEL__ */
+
+/* }}}1 */
+#endif /* GENL_MAGIC_FUNC_H */
+/* vim: set foldmethod=marker foldlevel=1 nofoldenable : */
diff --git a/include/linux/genl_magic_struct.h b/include/linux/genl_magic_struct.h
new file mode 100644
index 0000000..745ebfd
--- /dev/null
+++ b/include/linux/genl_magic_struct.h
@@ -0,0 +1,260 @@
+#ifndef GENL_MAGIC_STRUCT_H
+#define GENL_MAGIC_STRUCT_H
+
+#ifndef GENL_MAGIC_FAMILY
+# error "you need to define GENL_MAGIC_FAMILY before inclusion"
+#endif
+
+#ifndef GENL_MAGIC_VERSION
+# error "you need to define GENL_MAGIC_VERSION before inclusion"
+#endif
+
+#ifndef GENL_MAGIC_INCLUDE_FILE
+# error "you need to define GENL_MAGIC_INCLUDE_FILE before inclusion"
+#endif
+
+#include <linux/genetlink.h>
+#include <linux/types.h>
+
+#define CONCAT__(a,b)	a ## b
+#define CONCAT_(a,b)	CONCAT__(a,b)
+
+extern int CONCAT_(GENL_MAGIC_FAMILY, _genl_register)(void);
+extern void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void);
+
+/*
+ * Extension of genl attribute validation policies			{{{2
+ */
+
+/**
+ * GENLA_F_FLAGS - policy type flags to ease compatible ABI evolution
+ *
+ * @GENLA_F_REQUIRED: attribute has to be present, or message is considered invalid.
+ * Adding new REQUIRED attributes breaks ABI compatibility, so don't do that.
+ *
+ * @GENLA_F_MANDATORY: if present, receiver _must_ understand it.
+ * Without this, unknown attributes (> maxtype) are _silently_ ignored
+ * by validate_nla().
+ *
+ * To be used for API extensions, so older kernel can reject requests for not
+ * yet implemented features, if newer userland tries to use them even though
+ * the genl_family version clearly indicates they are not available.
+ *
+ * @GENLA_F_MAY_IGNORE: To clearly document the fact, for good measure.
+ * To be used for API extensions for things that have sane defaults,
+ * so newer userland can still talk to older kernel, knowing it will
+ * silently ignore these attributes if not yet known.
+ *
+ * NOTE: These flags overload
+ *   NLA_F_NESTED		(1 << 15)
+ *   NLA_F_NET_BYTEORDER	(1 << 14)
+ * from linux/netlink.h, which are not useful for validate_nla():
+ * NET_BYTEORDER is not used anywhere, and NESTED would be specified by setting
+ * .type = NLA_NESTED in the appropriate policy.
+ *
+ * See also: nla_type()
+ */
+enum {
+	GENLA_F_MAY_IGNORE	= 0,
+	GENLA_F_MANDATORY	= 1 << 14,
+	GENLA_F_REQUIRED	= 1 << 15,
+
+	/* This will not be present in the __u16 .nla_type, but can be
+	 * triggered on in <struct>_to_skb, to exclude "sensitive"
+	 * information from broadcasts, or on unprivileged get requests.
+	 * This is useful because genetlink multicast groups can be listened in
+	 * on by anyone.  */
+	GENLA_F_SENSITIVE	= 1 << 16,
+};
+
+#define __nla_type(x)	((__u16)((__u16)(x) & (__u16)NLA_TYPE_MASK))
+
+/*									}}}1
+ * MAGIC
+ * multi-include macro expansion magic starts here
+ */
+
+/* MAGIC helpers							{{{2 */
+
+/* possible field types */
+#define __flg_field(attr_nr, attr_flag, name) \
+	__field(attr_nr, attr_flag, name, NLA_FLAG, char, \
+			nla_get_flag, __nla_put_flag)
+#define __u8_field(attr_nr, attr_flag, name)	\
+	__field(attr_nr, attr_flag, name, NLA_U8, unsigned char, \
+			nla_get_u8, NLA_PUT_U8)
+#define __u16_field(attr_nr, attr_flag, name)	\
+	__field(attr_nr, attr_flag, name, NLA_U16, __u16, \
+			nla_get_u16, NLA_PUT_U16)
+#define __u32_field(attr_nr, attr_flag, name)	\
+	__field(attr_nr, attr_flag, name, NLA_U32, __u32, \
+			nla_get_u32, NLA_PUT_U32)
+#define __u64_field(attr_nr, attr_flag, name)	\
+	__field(attr_nr, attr_flag, name, NLA_U64, __u64, \
+			nla_get_u64, NLA_PUT_U64)
+#define __str_field(attr_nr, attr_flag, name, maxlen) \
+	__array(attr_nr, attr_flag, name, NLA_NUL_STRING, char, maxlen, \
+			nla_strlcpy, NLA_PUT)
+#define __bin_field(attr_nr, attr_flag, name, maxlen) \
+	__array(attr_nr, attr_flag, name, NLA_BINARY, char, maxlen, \
+			nla_memcpy, NLA_PUT)
+
+#define __nla_put_flag(skb, attrtype, value)		\
+	do {						\
+		if (value)				\
+			NLA_PUT_FLAG(skb, attrtype);	\
+	} while (0)
+
+#define GENL_op_init(args...)	args
+#define GENL_doit(handler)		\
+	.doit = handler,		\
+	.flags = GENL_ADMIN_PERM,
+#define GENL_dumpit(handler)		\
+	.dumpit = handler,		\
+	.flags = GENL_ADMIN_PERM,
+
+/*									}}}1
+ * Magic: define the enum symbols for genl_ops
+ * Magic: define the enum symbols for top level attributes
+ * Magic: define the enum symbols for nested attributes
+ *									{{{2
+ */
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)
+
+#undef GENL_mc_group
+#define GENL_mc_group(group)
+
+#undef GENL_notification
+#define GENL_notification(op_name, op_num, mcast_group, tla_list)	\
+	op_name = op_num,
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, tla_list)			\
+	op_name = op_num,
+
+enum {
+#include GENL_MAGIC_INCLUDE_FILE
+};
+
+#undef GENL_notification
+#define GENL_notification(op_name, op_num, mcast_group, tla_list)
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, attr_list)
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields) \
+		tag_name = tag_number,
+
+enum {
+#include GENL_MAGIC_INCLUDE_FILE
+};
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)	\
+enum {								\
+	s_fields						\
+};
+
+#undef __field
+#define __field(attr_nr, attr_flag, name, nla_type, type, __get, __put)	\
+	T_ ## name = (__u16)(attr_nr | attr_flag),
+
+#undef __array
+#define __array(attr_nr, attr_flag, name, nla_type, type, maxlen, __get, __put) \
+	T_ ## name = (__u16)(attr_nr | attr_flag),
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+/*									}}}1
+ * Magic: compile time assert unique numbers for operations
+ * Magic: -"- unique numbers for top level attributes
+ * Magic: -"- unique numbers for nested attributes
+ *									{{{2
+ */
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, attr_list)	\
+	case op_name:
+
+#undef GENL_notification
+#define GENL_notification(op_name, op_num, mcast_group, tla_list)	\
+	case op_name:
+
+static inline void ct_assert_unique_operations(void)
+{
+	switch (0) {
+#include GENL_MAGIC_INCLUDE_FILE
+		;
+	}
+}
+
+#undef GENL_op
+#define GENL_op(op_name, op_num, handler, attr_list)
+
+#undef GENL_notification
+#define GENL_notification(op_name, op_num, mcast_group, tla_list)
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)		\
+		case tag_number:
+
+static inline void ct_assert_unique_top_level_attributes(void)
+{
+	switch (0) {
+#include GENL_MAGIC_INCLUDE_FILE
+		;
+	}
+}
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)		\
+static inline void ct_assert_unique_ ## s_name ## _attributes(void)	\
+{									\
+	switch (0) {							\
+		s_fields						\
+			;						\
+	}								\
+}
+
+#undef __field
+#define __field(attr_nr, attr_flag, name, nla_type, type, __get, __put)	\
+	case attr_nr:
+
+#undef __array
+#define __array(attr_nr, attr_flag, name, nla_type, type, maxlen, __get, __put) \
+	case attr_nr:
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+/*									}}}1
+ * Magic: declare structs
+ * struct <name> {
+ *	fields
+ * };
+ *									{{{2
+ */
+
+#undef GENL_struct
+#define GENL_struct(tag_name, tag_number, s_name, s_fields)		\
+struct s_name { s_fields };
+
+#undef __field
+#define __field(attr_nr, attr_flag, name, nla_type, type, __get, __put) \
+	type name;
+
+#undef __array
+#define __array(attr_nr, attr_flag, name, nla_type, type, maxlen, __get, __put) \
+	type name[maxlen];	\
+	__u32 name ## _len;
+
+#include GENL_MAGIC_INCLUDE_FILE
+
+/* }}}1 */
+#endif /* GENL_MAGIC_STRUCT_H */
+/* vim: set foldmethod=marker nofoldenable : */
-- 
1.7.4.1
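
Note on the multi-include technique used by genl_magic_struct.h above:
the same description is expanded several times, each time with the
helper macros redefined, to produce struct members, enum symbols,
a compile-time uniqueness check, and a name lookup.  A minimal
single-file sketch of that idea follows.  It passes the expander as a
macro argument instead of re-including a file, and DEMO_FIELDS,
struct demo_conf and the demo_* helpers are hypothetical names chosen
for illustration, not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical field list; stands in for the re-included description file. */
#define DEMO_FIELDS(F)				\
	F(1, unsigned int, timeout)		\
	F(2, unsigned int, ping_int)		\
	F(3, unsigned int, ko_count)

/* Pass 1: expand each entry into a struct member. */
#define AS_MEMBER(nr, type, name)	type name;
struct demo_conf { DEMO_FIELDS(AS_MEMBER) };

/* Pass 2: expand each entry into an enum symbol carrying its number. */
#define AS_ENUM(nr, type, name)		T_ ## name = nr,
enum { DEMO_FIELDS(AS_ENUM) };

/* Pass 3: a reused number yields a duplicate case label,
 * which the compiler rejects. */
#define AS_CASE(nr, type, name)		case nr:
static inline void ct_assert_unique_demo_attributes(void)
{
	switch (0) {
	DEMO_FIELDS(AS_CASE)
		;
	}
}

/* Pass 4: map an attribute number back to its field name. */
#define AS_NAME(nr, type, name)		case nr: return #name;
static inline const char *demo_attr_name(int nr)
{
	switch (nr) {
	DEMO_FIELDS(AS_NAME)
	default:
		return "unknown";
	}
}

/* Small self-check tying the four expansions together. */
static inline void demo_selftest(void)
{
	struct demo_conf c = { .timeout = 60, .ping_int = 10, .ko_count = 7 };

	assert(c.ping_int == 10);
	assert(T_ko_count == 3);
	assert(strcmp(demo_attr_name(T_timeout), "timeout") == 0);
}
```

Changing the number of ko_count to 2 in DEMO_FIELDS should make pass 3
fail to compile, which is the property the ct_assert_unique_*
functions in the patch rely on.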

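Note on GENLA_F_MANDATORY and GENLA_F_REQUIRED in the patch above: both
flags live in the two top bits of the 16-bit attribute type, and
nla_check_unknown() rejects a message only when an attribute number
beyond the known range carries one of them.  The sketch below models
that rule on a plain array of type values rather than on real netlink
attributes; the DEMO_* names, demo_check_unknown() and the 0x3fff mask
are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the flag bits defined in the patch. */
enum {
	DEMO_F_MANDATORY = 1 << 14,
	DEMO_F_REQUIRED  = 1 << 15,
};
/* Low 14 bits hold the attribute number (assumed to match NLA_TYPE_MASK). */
#define DEMO_TYPE_MASK	0x3fff

/* Return -EOPNOTSUPP if any attribute number above maxtype is flagged
 * mandatory or required; unflagged unknown attributes are ignored. */
static inline int demo_check_unknown(const uint16_t *types, size_t n,
				     unsigned int maxtype)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int nr = types[i] & DEMO_TYPE_MASK;
		unsigned int flags = types[i] &
			(DEMO_F_MANDATORY | DEMO_F_REQUIRED);

		if (nr > maxtype && flags)
			return -EOPNOTSUPP;
	}
	return 0;
}

/* Small self-check: a flagged unknown attribute is refused. */
static inline void demo_flags_selftest(void)
{
	uint16_t rejected[] = { 1, 40 | DEMO_F_MANDATORY };

	assert(demo_check_unknown(rejected, 2, 13) == -EOPNOTSUPP);
}
```

With maxtype 13, an unflagged attribute 40 is skipped silently, while
the same number with the mandatory bit set makes the whole request
fail.  That lets an older kernel refuse features it does not
implement.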


* [PATCH 09/18] drbd: switch configuration interface from connector to genetlink
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (7 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 08/18] drbd: prepare the transition from connector to genetlink Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 10/18] drbd: allow holes in minor and volume id allocation Philipp Reisner
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_actlog.c |    3 +-
 drivers/block/drbd/drbd_int.h    |   36 +-
 drivers/block/drbd/drbd_main.c   |   27 +-
 drivers/block/drbd/drbd_nl.c     | 1569 +++++++++++++++++++-------------------
 drivers/block/drbd/drbd_state.c  |    7 +-
 include/linux/drbd.h             |   35 +-
 6 files changed, 839 insertions(+), 838 deletions(-)
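
Note on struct sib_info in this patch: the separate broadcast functions
are replaced by a single drbd_bcast_event() that takes a reason code
plus a union whose valid members depend on that reason.  A minimal
userspace sketch of that tagged-union pattern follows.  The demo_*
names and the message strings are hypothetical, the state values are
reduced to plain unsigned integers, and the anonymous struct/union
members assume C11 or GNU C.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reason codes; values mirror the SIB_* enum in the patch. */
enum demo_reason {
	DEMO_STATE_CHANGE  = 2,
	DEMO_HELPER_POST   = 4,
	DEMO_SYNC_PROGRESS = 5,
};

/* The reason selects which anonymous struct in the union is valid. */
struct demo_info {
	enum demo_reason reason;
	union {
		struct {
			const char *helper_name;
			unsigned int helper_exit_code;
		};
		struct {
			unsigned int os;	/* old state */
			unsigned int ns;	/* new state */
		};
	};
};

/* One entry point for all event kinds; dispatches on the reason. */
static inline int demo_format_event(char *buf, size_t len,
				    const struct demo_info *info)
{
	switch (info->reason) {
	case DEMO_HELPER_POST:
		return snprintf(buf, len, "helper %s exit %u",
				info->helper_name, info->helper_exit_code);
	case DEMO_STATE_CHANGE:
		return snprintf(buf, len, "state 0x%x -> 0x%x",
				info->os, info->ns);
	case DEMO_SYNC_PROGRESS:
		return snprintf(buf, len, "sync progress");
	}
	return -1;
}

/* Small self-check for the payload-free case. */
static inline void demo_event_selftest(void)
{
	struct demo_info info = { .reason = DEMO_SYNC_PROGRESS };
	char buf[64];

	demo_format_event(buf, sizeof(buf), &info);
	assert(strcmp(buf, "sync progress") == 0);
}
```

A caller that only reports sync progress sets just the reason and
leaves the union untouched, as w_update_odbm() does in the
drbd_actlog.c hunk of this patch.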

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 9ab6365..6d1e892 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -692,6 +692,7 @@ static int w_update_odbm(struct drbd_work *w, int unused)
 {
 	struct update_odbm_work *udw = container_of(w, struct update_odbm_work, w);
 	struct drbd_conf *mdev = w->mdev;
+	struct sib_info sib = { .sib_reason = SIB_SYNC_PROGRESS, };
 
 	if (!get_ldev(mdev)) {
 		if (__ratelimit(&drbd_ratelimit_state))
@@ -715,7 +716,7 @@ static int w_update_odbm(struct drbd_work *w, int unused)
 			break;
 		}
 	}
-	drbd_bcast_sync_progress(mdev);
+	drbd_bcast_event(mdev, &sib);
 
 	return 1;
 }
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 5d90b9d..acd2877 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -44,6 +44,7 @@
 #include <net/tcp.h>
 #include <linux/lru_cache.h>
 #include <linux/prefetch.h>
+#include <linux/drbd_genl_api.h>
 #include <linux/drbd.h>
 #include "drbd_state.h"
 
@@ -65,7 +66,6 @@
 extern unsigned int minor_count;
 extern int disable_sendpage;
 extern int allow_oos;
-extern unsigned int cn_idx;
 
 #ifdef CONFIG_DRBD_FAULT_INJECTION
 extern int enable_faults;
@@ -865,14 +865,6 @@ struct drbd_md {
 	 */
 };
 
-/* for sync_conf and other types... */
-#define NL_PACKET(name, number, fields) struct name { fields };
-#define NL_INTEGER(pn,pr,member) int member;
-#define NL_INT64(pn,pr,member) __u64 member;
-#define NL_BIT(pn,pr,member)   unsigned member:1;
-#define NL_STRING(pn,pr,member,len) unsigned char member[len]; int member ## _len;
-#include "linux/drbd_nl.h"
-
 struct drbd_backing_dev {
 	struct block_device *backing_bdev;
 	struct block_device *md_bdev;
@@ -1502,7 +1494,7 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor,
 extern void drbd_free_mdev(struct drbd_conf *mdev);
 extern void drbd_delete_device(unsigned int minor);
 
-struct drbd_tconn *drbd_new_tconn(char *name);
+struct drbd_tconn *drbd_new_tconn(const char *name);
 extern void drbd_free_tconn(struct drbd_tconn *tconn);
 struct drbd_tconn *conn_by_name(const char *name);
 
@@ -1679,16 +1671,22 @@ extern int __drbd_set_out_of_sync(struct drbd_conf *mdev, sector_t sector,
 extern void drbd_al_apply_to_bm(struct drbd_conf *mdev);
 extern void drbd_al_shrink(struct drbd_conf *mdev);
 
-
 /* drbd_nl.c */
-
-void drbd_nl_cleanup(void);
-int __init drbd_nl_init(void);
-void drbd_bcast_state(struct drbd_conf *mdev, union drbd_state);
-void drbd_bcast_sync_progress(struct drbd_conf *mdev);
-void drbd_bcast_ee(struct drbd_conf *, const char *, const int, const char *,
-		   const char *, const struct drbd_peer_request *);
-
+/* state info broadcast */
+struct sib_info {
+	enum drbd_state_info_bcast_reason sib_reason;
+	union {
+		struct {
+			char *helper_name;
+			unsigned helper_exit_code;
+		};
+		struct {
+			union drbd_state os;
+			union drbd_state ns;
+		};
+	};
+};
+void drbd_bcast_event(struct drbd_conf *mdev, const struct sib_info *sib);
 
 /*
  * inline helper functions
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 21914f4..9b41213 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -86,7 +86,6 @@ MODULE_PARM_DESC(allow_oos, "DONT USE!");
 module_param(minor_count, uint, 0444);
 module_param(disable_sendpage, bool, 0644);
 module_param(allow_oos, bool, 0);
-module_param(cn_idx, uint, 0444);
 module_param(proc_details, int, 0644);
 
 #ifdef CONFIG_DRBD_FAULT_INJECTION
@@ -108,7 +107,6 @@ module_param(fault_devs, int, 0644);
 unsigned int minor_count = DRBD_MINOR_COUNT_DEF;
 int disable_sendpage;
 int allow_oos;
-unsigned int cn_idx = CN_IDX_DRBD;
 int proc_details;       /* Detail level in proc drbd*/
 
 /* Module parameter for setting the user mode helper program
@@ -2175,7 +2173,7 @@ static void drbd_cleanup(void)
 	if (drbd_proc)
 		remove_proc_entry("drbd", NULL);
 
-	drbd_nl_cleanup();
+	drbd_genl_unregister();
 
 	idr_for_each_entry(&minors, mdev, i)
 		drbd_delete_device(i);
@@ -2237,6 +2235,9 @@ struct drbd_tconn *conn_by_name(const char *name)
 {
 	struct drbd_tconn *tconn;
 
+	if (!name || !name[0])
+		return NULL;
+
 	write_lock_irq(&global_state_lock);
 	list_for_each_entry(tconn, &drbd_tconns, all_tconn) {
 		if (!strcmp(tconn->name, name))
@@ -2248,7 +2249,7 @@ found:
 	return tconn;
 }
 
-struct drbd_tconn *drbd_new_tconn(char *name)
+struct drbd_tconn *drbd_new_tconn(const char *name)
 {
 	struct drbd_tconn *tconn;
 
@@ -2333,6 +2334,7 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor,
 
 	mdev->tconn = tconn;
 	mdev->minor = minor;
+	mdev->vnr = vnr;
 
 	drbd_init_set_defaults(mdev);
 
@@ -2462,10 +2464,6 @@ int __init drbd_init(void)
 #endif
 	}
 
-	err = drbd_nl_init();
-	if (err)
-		return err;
-
 	err = register_blkdev(DRBD_MAJOR, "drbd");
 	if (err) {
 		printk(KERN_ERR
@@ -2474,6 +2472,13 @@ int __init drbd_init(void)
 		return err;
 	}
 
+	err = drbd_genl_register();
+	if (err) {
+		pr_err("drbd: unable to register generic netlink family\n");
+		goto fail;
+	}
+
+
 	register_reboot_notifier(&drbd_notifier);
 
 	/*
@@ -2488,12 +2493,12 @@ int __init drbd_init(void)
 
 	err = drbd_create_mempools();
 	if (err)
-		goto Enomem;
+		goto fail;
 
 	drbd_proc = proc_create_data("drbd", S_IFREG | S_IRUGO , NULL, &drbd_proc_fops, NULL);
 	if (!drbd_proc)	{
 		printk(KERN_ERR "drbd: unable to register proc file\n");
-		goto Enomem;
+		goto fail;
 	}
 
 	rwlock_init(&global_state_lock);
@@ -2508,7 +2513,7 @@ int __init drbd_init(void)
 
 	return 0; /* Success! */
 
-Enomem:
+fail:
 	drbd_cleanup();
 	if (err == -ENOMEM)
 		/* currently always the case */
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index f2739fd..a8f27cb 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -29,110 +29,258 @@
 #include <linux/fs.h>
 #include <linux/file.h>
 #include <linux/slab.h>
-#include <linux/connector.h>
 #include <linux/blkpg.h>
 #include <linux/cpumask.h>
 #include "drbd_int.h"
 #include "drbd_req.h"
 #include "drbd_wrappers.h"
 #include <asm/unaligned.h>
-#include <linux/drbd_tag_magic.h>
 #include <linux/drbd_limits.h>
-#include <linux/compiler.h>
 #include <linux/kthread.h>
 
-static unsigned short *tl_add_blob(unsigned short *, enum drbd_tags, const void *, int);
-static unsigned short *tl_add_str(unsigned short *, enum drbd_tags, const char *);
-static unsigned short *tl_add_int(unsigned short *, enum drbd_tags, const void *);
+#include <net/genetlink.h>
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,31)
+/*
+ * copied from more recent kernel source
+ */
+int genl_register_family_with_ops(struct genl_family *family,
+	struct genl_ops *ops, size_t n_ops)
+{
+	int err, i;
 
-/* see get_sb_bdev and bd_claim */
+	err = genl_register_family(family);
+	if (err)
+		return err;
+
+	for (i = 0; i < n_ops; ++i, ++ops) {
+		err = genl_register_ops(family, ops);
+		if (err)
+			goto err_out;
+	}
+	return 0;
+err_out:
+	genl_unregister_family(family);
+	return err;
+}
+#endif
+
+/* .doit */
+/* int drbd_adm_create_resource(struct sk_buff *skb, struct genl_info *info); */
+/* int drbd_adm_delete_resource(struct sk_buff *skb, struct genl_info *info); */
+
+int drbd_adm_add_minor(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_delete_minor(struct sk_buff *skb, struct genl_info *info);
+
+int drbd_adm_create_connection(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_delete_connection(struct sk_buff *skb, struct genl_info *info);
+
+int drbd_adm_set_role(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_detach(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_connect(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_start_ov(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_new_c_uuid(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_disconnect(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_invalidate(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_invalidate_peer(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_pause_sync(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_resume_sync(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_suspend_io(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_resume_io(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_outdate(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_syncer(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_get_status(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_get_timeout_type(struct sk_buff *skb, struct genl_info *info);
+/* .dumpit */
+int drbd_adm_get_status_all(struct sk_buff *skb, struct netlink_callback *cb);
+
+#include <linux/drbd_genl_api.h>
+#include <linux/genl_magic_func.h>
+
+/* used by blkdev_get_by_path() to claim our meta data device(s) */
 static char *drbd_m_holder = "Hands off! this is DRBD's meta data device.";
 
-/* Generate the tag_list to struct functions */
-#define NL_PACKET(name, number, fields) \
-static int name ## _from_tags( \
-	unsigned short *tags, struct name *arg) __attribute__ ((unused)); \
-static int name ## _from_tags( \
-	unsigned short *tags, struct name *arg) \
-{ \
-	int tag; \
-	int dlen; \
-	\
-	while ((tag = get_unaligned(tags++)) != TT_END) {	\
-		dlen = get_unaligned(tags++);			\
-		switch (tag_number(tag)) { \
-		fields \
-		default: \
-			if (tag & T_MANDATORY) { \
-				printk(KERN_ERR "drbd: Unknown tag: %d\n", tag_number(tag)); \
-				return 0; \
-			} \
-		} \
-		tags = (unsigned short *)((char *)tags + dlen); \
-	} \
-	return 1; \
+/* Configuration is strictly serialized, because generic netlink message
+ * processing is strictly serialized by the genl_lock().
+ * This means we can use one static global drbd_config_context struct.
+ */
+static struct drbd_config_context {
+	/* assigned from drbd_genlmsghdr */
+	unsigned int minor;
+	/* assigned from request attributes, if present */
+	unsigned int volume;
+#define VOLUME_UNSPECIFIED		(-1U)
+	/* pointer into the request skb,
+	 * limited lifetime! */
+	char *conn_name;
+
+	/* reply buffer */
+	struct sk_buff *reply_skb;
+	/* pointer into reply buffer */
+	struct drbd_genlmsghdr *reply_dh;
+	/* resolved from attributes, if possible */
+	struct drbd_conf *mdev;
+	struct drbd_tconn *tconn;
+} adm_ctx;
+
+static void drbd_adm_send_reply(struct sk_buff *skb, struct genl_info *info)
+{
+	genlmsg_end(skb, genlmsg_data(nlmsg_data(nlmsg_hdr(skb))));
+	if (genlmsg_reply(skb, info))
+		printk(KERN_ERR "drbd: error sending genl reply\n");
 }
-#define NL_INTEGER(pn, pr, member) \
-	case pn: /* D_ASSERT( tag_type(tag) == TT_INTEGER ); */ \
-		arg->member = get_unaligned((int *)(tags));	\
-		break;
-#define NL_INT64(pn, pr, member) \
-	case pn: /* D_ASSERT( tag_type(tag) == TT_INT64 ); */ \
-		arg->member = get_unaligned((u64 *)(tags));	\
-		break;
-#define NL_BIT(pn, pr, member) \
-	case pn: /* D_ASSERT( tag_type(tag) == TT_BIT ); */ \
-		arg->member = *(char *)(tags) ? 1 : 0; \
-		break;
-#define NL_STRING(pn, pr, member, len) \
-	case pn: /* D_ASSERT( tag_type(tag) == TT_STRING ); */ \
-		if (dlen > len) { \
-			printk(KERN_ERR "drbd: arg too long: %s (%u wanted, max len: %u bytes)\n", \
-				#member, dlen, (unsigned int)len); \
-			return 0; \
-		} \
-		 arg->member ## _len = dlen; \
-		 memcpy(arg->member, tags, min_t(size_t, dlen, len)); \
-		 break;
-#include "linux/drbd_nl.h"
-
-/* Generate the struct to tag_list functions */
-#define NL_PACKET(name, number, fields) \
-static unsigned short* \
-name ## _to_tags( \
-	struct name *arg, unsigned short *tags) __attribute__ ((unused)); \
-static unsigned short* \
-name ## _to_tags( \
-	struct name *arg, unsigned short *tags) \
-{ \
-	fields \
-	return tags; \
+
+/* Used on a freshly drbd_adm_prepare()d reply_skb, this cannot fail: the only
+ * possible failure is lack of space in the skb, and 4k are available. */
+static int drbd_msg_put_info(const char *info)
+{
+	struct sk_buff *skb = adm_ctx.reply_skb;
+	struct nlattr *nla;
+	int err = -EMSGSIZE;
+
+	if (!info || !info[0])
+		return 0;
+
+	nla = nla_nest_start(skb, DRBD_NLA_CFG_REPLY);
+	if (!nla)
+		return err;
+
+	err = nla_put_string(skb, T_info_text, info);
+	if (err) {
+		nla_nest_cancel(skb, nla);
+		return err;
+	} else
+		nla_nest_end(skb, nla);
+	return 0;
+}
+
+/* This would be a good candidate for a "pre_doit" hook,
+ * and per-family private info->pointers.
+ * But we need to stay compatible with older kernels.
+ * If it returns successfully, adm_ctx members are valid.
+ */
+#define DRBD_ADM_NEED_MINOR	1
+#define DRBD_ADM_NEED_CONN	2
+static int drbd_adm_prepare(struct sk_buff *skb, struct genl_info *info,
+		unsigned flags)
+{
+	struct drbd_genlmsghdr *d_in = info->userhdr;
+	const u8 cmd = info->genlhdr->cmd;
+	int err;
+
+	memset(&adm_ctx, 0, sizeof(adm_ctx));
+
+	/* genl_rcv_msg only checks for CAP_NET_ADMIN on "GENL_ADMIN_PERM" :( */
+	if (cmd != DRBD_ADM_GET_STATUS
+	&& security_netlink_recv(skb, CAP_SYS_ADMIN))
+		return -EPERM;
+
+	adm_ctx.reply_skb = genlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!adm_ctx.reply_skb)
+		goto fail;
+
+	adm_ctx.reply_dh = genlmsg_put_reply(adm_ctx.reply_skb,
+					info, &drbd_genl_family, 0, cmd);
+	/* Putting a few bytes into a fresh skb of >= 4k will always succeed,
+	 * but check anyway. */
+	if (!adm_ctx.reply_dh)
+		goto fail;
+
+	adm_ctx.reply_dh->minor = d_in->minor;
+	adm_ctx.reply_dh->ret_code = NO_ERROR;
+
+	if (info->attrs[DRBD_NLA_CFG_CONTEXT]) {
+		struct nlattr *nla;
+		/* parse and validate only */
+		err = drbd_cfg_context_from_attrs(NULL, info->attrs);
+		if (err)
+			goto fail;
+
+		/* It was present, and valid,
+		 * copy it over to the reply skb. */
+		err = nla_put_nohdr(adm_ctx.reply_skb,
+				info->attrs[DRBD_NLA_CFG_CONTEXT]->nla_len,
+				info->attrs[DRBD_NLA_CFG_CONTEXT]);
+		if (err)
+			goto fail;
+
+		/* and assign stuff to the global adm_ctx */
+		nla = nested_attr_tb[__nla_type(T_ctx_volume)];
+		adm_ctx.volume = nla ? nla_get_u32(nla) : VOLUME_UNSPECIFIED;
+		nla = nested_attr_tb[__nla_type(T_ctx_conn_name)];
+		if (nla)
+			adm_ctx.conn_name = nla_data(nla);
+	} else
+		adm_ctx.volume = VOLUME_UNSPECIFIED;
+
+	adm_ctx.minor = d_in->minor;
+	adm_ctx.mdev = minor_to_mdev(d_in->minor);
+	adm_ctx.tconn = conn_by_name(adm_ctx.conn_name);
+
+	pr_info("adm request: cmd=%u[%s], flags=0x%x, minor=%d, conn=%s\n",
+		cmd, drbd_genl_cmd_to_str(cmd), d_in->flags,
+		d_in->minor, adm_ctx.conn_name ?: "n/a");
+
+	if (!adm_ctx.mdev && (flags & DRBD_ADM_NEED_MINOR)) {
+		drbd_msg_put_info("unknown minor");
+		return ERR_MINOR_INVALID;
+	}
+	if (!adm_ctx.tconn && (flags & DRBD_ADM_NEED_CONN)) {
+		drbd_msg_put_info("unknown connection");
+		return ERR_INVALID_REQUEST;
+	}
+
+	/* some more paranoia, if the request was over-determined */
+	if (adm_ctx.mdev &&
+	    adm_ctx.volume != VOLUME_UNSPECIFIED &&
+	    adm_ctx.volume != adm_ctx.mdev->vnr) {
+		pr_warning("request: minor=%u, volume=%u; but that minor is volume %u in %s\n",
+				adm_ctx.minor, adm_ctx.volume,
+				adm_ctx.mdev->vnr, adm_ctx.mdev->tconn->name);
+		drbd_msg_put_info("over-determined configuration context mismatch");
+		return ERR_INVALID_REQUEST;
+	}
+	if (adm_ctx.mdev && adm_ctx.tconn &&
+	    adm_ctx.mdev->tconn != adm_ctx.tconn) {
+		pr_warning("request: minor=%u, conn=%s; but that minor belongs to connection %s\n",
+				adm_ctx.minor, adm_ctx.conn_name, adm_ctx.mdev->tconn->name);
+		drbd_msg_put_info("over-determined configuration context mismatch");
+		return ERR_INVALID_REQUEST;
+	}
+	return NO_ERROR;
+
+fail:
+	nlmsg_free(adm_ctx.reply_skb);
+	adm_ctx.reply_skb = NULL;
+	return -ENOMEM;
 }
 
-#define NL_INTEGER(pn, pr, member) \
-	put_unaligned(pn | pr | TT_INTEGER, tags++);	\
-	put_unaligned(sizeof(int), tags++);		\
-	put_unaligned(arg->member, (int *)tags);	\
-	tags = (unsigned short *)((char *)tags+sizeof(int));
-#define NL_INT64(pn, pr, member) \
-	put_unaligned(pn | pr | TT_INT64, tags++);	\
-	put_unaligned(sizeof(u64), tags++);		\
-	put_unaligned(arg->member, (u64 *)tags);	\
-	tags = (unsigned short *)((char *)tags+sizeof(u64));
-#define NL_BIT(pn, pr, member) \
-	put_unaligned(pn | pr | TT_BIT, tags++);	\
-	put_unaligned(sizeof(char), tags++);		\
-	*(char *)tags = arg->member; \
-	tags = (unsigned short *)((char *)tags+sizeof(char));
-#define NL_STRING(pn, pr, member, len) \
-	put_unaligned(pn | pr | TT_STRING, tags++);	\
-	put_unaligned(arg->member ## _len, tags++);	\
-	memcpy(tags, arg->member, arg->member ## _len); \
-	tags = (unsigned short *)((char *)tags + arg->member ## _len);
-#include "linux/drbd_nl.h"
-
-void drbd_bcast_ev_helper(struct drbd_conf *mdev, char *helper_name);
-void drbd_nl_send_reply(struct cn_msg *, int);
+static int drbd_adm_finish(struct genl_info *info, int retcode)
+{
+	struct nlattr *nla;
+	const char *conn_name = NULL;
+	const u8 cmd = info->genlhdr->cmd;
+
+	if (!adm_ctx.reply_skb)
+		return -ENOMEM;
+
+	adm_ctx.reply_dh->ret_code = retcode;
+
+	nla = info->attrs[DRBD_NLA_CFG_CONTEXT];
+	if (nla) {
+		nla = nla_find_nested(nla, __nla_type(T_ctx_conn_name));
+		if (nla)
+			conn_name = nla_data(nla);
+	}
+
+	pr_info("adm reply: cmd=%u[%s], retcode=%d, minor=%d, conn=%s\n",
+		cmd, drbd_genl_cmd_to_str(cmd), retcode,
+		adm_ctx.minor, conn_name ?: "n/a");
+
+	drbd_adm_send_reply(adm_ctx.reply_skb, info);
+	return 0;
+}
 
 int drbd_khelper(struct drbd_conf *mdev, char *cmd)
 {
@@ -142,9 +290,9 @@ int drbd_khelper(struct drbd_conf *mdev, char *cmd)
 			NULL, /* Will be set to address family */
 			NULL, /* Will be set to address */
 			NULL };
-
 	char mb[12], af[20], ad[60], *afs;
 	char *argv[] = {usermode_helper, cmd, mb, NULL };
+	struct sib_info sib;
 	int ret;
 
 	snprintf(mb, 12, "minor-%d", mdev_to_minor(mdev));
@@ -177,8 +325,9 @@ int drbd_khelper(struct drbd_conf *mdev, char *cmd)
 	drbd_md_sync(mdev);
 
 	dev_info(DEV, "helper command: %s %s %s\n", usermode_helper, cmd, mb);
-
-	drbd_bcast_ev_helper(mdev, cmd);
+	sib.sib_reason = SIB_HELPER_PRE;
+	sib.helper_name = cmd;
+	drbd_bcast_event(mdev, &sib);
 	ret = call_usermodehelper(usermode_helper, argv, envp, 1);
 	if (ret)
 		dev_warn(DEV, "helper command: %s %s %s exit code %u (0x%x)\n",
@@ -188,6 +337,9 @@ int drbd_khelper(struct drbd_conf *mdev, char *cmd)
 		dev_info(DEV, "helper command: %s %s %s exit code %u (0x%x)\n",
 				usermode_helper, cmd, mb,
 				(ret >> 8) & 0xff, ret);
+	sib.sib_reason = SIB_HELPER_POST;
+	sib.helper_exit_code = ret;
+	drbd_bcast_event(mdev, &sib);
 
 	if (ret < 0) /* Ignore any ERRNOs we got. */
 		ret = 0;
@@ -362,7 +514,7 @@ drbd_set_role(struct drbd_conf *mdev, enum drbd_role new_role, int force)
 		}
 
 		if (rv == SS_NOTHING_TO_DO)
-			goto fail;
+			goto out;
 		if (rv == SS_PRIMARY_NOP && mask.pdsk == 0) {
 			nps = drbd_try_outdate_peer(mdev);
 
@@ -388,13 +540,13 @@ drbd_set_role(struct drbd_conf *mdev, enum drbd_role new_role, int force)
 			rv = _drbd_request_state(mdev, mask, val,
 						CS_VERBOSE + CS_WAIT_COMPLETE);
 			if (rv < SS_SUCCESS)
-				goto fail;
+				goto out;
 		}
 		break;
 	}
 
 	if (rv < SS_SUCCESS)
-		goto fail;
+		goto out;
 
 	if (forced)
 		dev_warn(DEV, "Forced to consider local data as UpToDate!\n");
@@ -438,33 +590,46 @@ drbd_set_role(struct drbd_conf *mdev, enum drbd_role new_role, int force)
 	drbd_md_sync(mdev);
 
 	kobject_uevent(&disk_to_dev(mdev->vdisk)->kobj, KOBJ_CHANGE);
- fail:
+out:
 	mutex_unlock(mdev->state_mutex);
 	return rv;
 }
 
-static int drbd_nl_primary(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			   struct drbd_nl_cfg_reply *reply)
+static const char *from_attrs_err_to_txt(int err)
 {
-	struct primary primary_args;
-
-	memset(&primary_args, 0, sizeof(struct primary));
-	if (!primary_from_tags(nlp->tag_list, &primary_args)) {
-		reply->ret_code = ERR_MANDATORY_TAG;
-		return 0;
-	}
-
-	reply->ret_code =
-		drbd_set_role(mdev, R_PRIMARY, primary_args.primary_force);
-
-	return 0;
+	return	err == -ENOMSG ? "required attribute missing" :
+		err == -EOPNOTSUPP ? "unknown mandatory attribute" :
+		"invalid attribute value";
 }
 
-static int drbd_nl_secondary(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			     struct drbd_nl_cfg_reply *reply)
+int drbd_adm_set_role(struct sk_buff *skb, struct genl_info *info)
 {
-	reply->ret_code = drbd_set_role(mdev, R_SECONDARY, 0);
+	struct set_role_parms parms;
+	int err;
+	enum drbd_ret_code retcode;
 
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
+
+	memset(&parms, 0, sizeof(parms));
+	if (info->attrs[DRBD_NLA_SET_ROLE_PARMS]) {
+		err = set_role_parms_from_attrs(&parms, info->attrs);
+		if (err) {
+			retcode = ERR_MANDATORY_TAG;
+			drbd_msg_put_info(from_attrs_err_to_txt(err));
+			goto out;
+		}
+	}
+
+	if (info->genlhdr->cmd == DRBD_ADM_PRIMARY)
+		retcode = drbd_set_role(adm_ctx.mdev, R_PRIMARY, parms.assume_uptodate);
+	else
+		retcode = drbd_set_role(adm_ctx.mdev, R_SECONDARY, 0);
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
@@ -541,6 +706,12 @@ char *ppsize(char *buf, unsigned long long size)
  *  R_PRIMARY D_INCONSISTENT, and C_SYNC_TARGET:
  *  peer may not initiate a resize.
  */
+/* Note these are not to be confused with
+ * drbd_adm_suspend_io/drbd_adm_resume_io,
+ * which are (sub) state changes triggered by admin (drbdsetup),
+ * and can be long lived.
+ * This changes an mdev->flag, is triggered by drbd internals,
+ * and should be short-lived. */
 void drbd_suspend_io(struct drbd_conf *mdev)
 {
 	set_bit(SUSPEND_IO, &mdev->flags);
@@ -881,11 +1052,10 @@ static void drbd_suspend_al(struct drbd_conf *mdev)
 		dev_info(DEV, "Suspended AL updates\n");
 }
 
-/* does always return 0;
- * interesting return code is in reply->ret_code */
-static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			     struct drbd_nl_cfg_reply *reply)
+int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
 {
+	struct drbd_conf *mdev;
+	int err;
 	enum drbd_ret_code retcode;
 	enum determine_dev_size dd;
 	sector_t max_possible_sectors;
@@ -897,6 +1067,13 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 	enum drbd_state_rv rv;
 	int cp_discovered = 0;
 
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto fail;
+
+	mdev = adm_ctx.mdev;
 	conn_reconfig_start(mdev->tconn);
 
 	/* if you want to reconfigure, please tear down first */
@@ -910,7 +1087,7 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 	 * to realize a "hot spare" feature (not that I'd recommend that) */
 	wait_event(mdev->misc_wait, !atomic_read(&mdev->local_cnt));
 
-	/* allocation not in the IO path, cqueue thread context */
+	/* allocation not in the IO path, drbdsetup context */
 	nbc = kzalloc(sizeof(struct drbd_backing_dev), GFP_KERNEL);
 	if (!nbc) {
 		retcode = ERR_NOMEM;
@@ -922,12 +1099,14 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 	nbc->dc.fencing       = DRBD_FENCING_DEF;
 	nbc->dc.max_bio_bvecs = DRBD_MAX_BIO_BVECS_DEF;
 
-	if (!disk_conf_from_tags(nlp->tag_list, &nbc->dc)) {
+	err = disk_conf_from_attrs(&nbc->dc, info->attrs);
+	if (err) {
 		retcode = ERR_MANDATORY_TAG;
+		drbd_msg_put_info(from_attrs_err_to_txt(err));
 		goto fail;
 	}
 
-	if (nbc->dc.meta_dev_idx < DRBD_MD_INDEX_FLEX_INT) {
+	if ((int)nbc->dc.meta_dev_idx < DRBD_MD_INDEX_FLEX_INT) {
 		retcode = ERR_MD_IDX_INVALID;
 		goto fail;
 	}
@@ -961,7 +1140,7 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 	 */
 	bdev = blkdev_get_by_path(nbc->dc.meta_dev,
 				  FMODE_READ | FMODE_WRITE | FMODE_EXCL,
-				  (nbc->dc.meta_dev_idx < 0) ?
+				  ((int)nbc->dc.meta_dev_idx < 0) ?
 				  (void *)mdev : (void *)drbd_m_holder);
 	if (IS_ERR(bdev)) {
 		dev_err(DEV, "open(\"%s\") failed with %ld\n", nbc->dc.meta_dev,
@@ -997,7 +1176,7 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 		goto fail;
 	}
 
-	if (nbc->dc.meta_dev_idx < 0) {
+	if ((int)nbc->dc.meta_dev_idx < 0) {
 		max_possible_sectors = DRBD_MAX_SECTORS_FLEX;
 		/* at least one MB, otherwise it does not make sense */
 		min_md_device_sectors = (2<<10);
@@ -1028,7 +1207,7 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 		dev_warn(DEV, "==> truncating very big lower level device "
 			"to currently maximum possible %llu sectors <==\n",
 			(unsigned long long) max_possible_sectors);
-		if (nbc->dc.meta_dev_idx >= 0)
+		if ((int)nbc->dc.meta_dev_idx >= 0)
 			dev_warn(DEV, "==>> using internal or flexible "
 				      "meta data may help <<==\n");
 	}
@@ -1242,8 +1421,8 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 
 	kobject_uevent(&disk_to_dev(mdev->vdisk)->kobj, KOBJ_CHANGE);
 	put_ldev(mdev);
-	reply->ret_code = retcode;
 	conn_reconfig_done(mdev->tconn);
+	drbd_adm_finish(info, retcode);
 	return 0;
 
  force_diskless_dec:
@@ -1251,6 +1430,7 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
  force_diskless:
 	drbd_force_state(mdev, NS(disk, D_FAILED));
 	drbd_md_sync(mdev);
+	conn_reconfig_done(mdev->tconn);
  fail:
 	if (nbc) {
 		if (nbc->backing_bdev)
@@ -1263,8 +1443,7 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 	}
 	lc_destroy(resync_lru);
 
-	reply->ret_code = retcode;
-	conn_reconfig_done(mdev->tconn);
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
@@ -1273,42 +1452,54 @@ static int drbd_nl_disk_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
  * Then we transition to D_DISKLESS, and wait for put_ldev() to return all
  * internal references as well.
  * Only then we have finally detached. */
-static int drbd_nl_detach(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			  struct drbd_nl_cfg_reply *reply)
+int drbd_adm_detach(struct sk_buff *skb, struct genl_info *info)
 {
+	struct drbd_conf *mdev;
 	enum drbd_ret_code retcode;
-	int ret;
+
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
+
+	mdev = adm_ctx.mdev;
 	drbd_suspend_io(mdev); /* so no-one is stuck in drbd_al_begin_io */
-	retcode = drbd_request_state(mdev, NS(disk, D_FAILED));
-	/* D_FAILED will transition to DISKLESS. */
-	ret = wait_event_interruptible(mdev->misc_wait,
-			mdev->state.disk != D_FAILED);
+	retcode = drbd_request_state(mdev, NS(disk, D_DISKLESS));
+	wait_event(mdev->misc_wait,
+			mdev->state.disk != D_DISKLESS ||
+			!atomic_read(&mdev->local_cnt));
 	drbd_resume_io(mdev);
-	if ((int)retcode == (int)SS_IS_DISKLESS)
-		retcode = SS_NOTHING_TO_DO;
-	if (ret)
-		retcode = ERR_INTR;
-	reply->ret_code = retcode;
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_net_conf(struct drbd_tconn *tconn, struct drbd_nl_cfg_req *nlp,
-			    struct drbd_nl_cfg_reply *reply)
+int drbd_adm_connect(struct sk_buff *skb, struct genl_info *info)
 {
-	int i;
-	enum drbd_ret_code retcode;
+	char hmac_name[CRYPTO_MAX_ALG_NAME];
+	struct drbd_conf *mdev;
 	struct net_conf *new_conf = NULL;
 	struct crypto_hash *tfm = NULL;
 	struct crypto_hash *integrity_w_tfm = NULL;
 	struct crypto_hash *integrity_r_tfm = NULL;
-	struct drbd_conf *mdev;
-	char hmac_name[CRYPTO_MAX_ALG_NAME];
 	void *int_dig_out = NULL;
 	void *int_dig_in = NULL;
 	void *int_dig_vv = NULL;
 	struct drbd_tconn *oconn;
+	struct drbd_tconn *tconn;
 	struct sockaddr *new_my_addr, *new_peer_addr, *taken_addr;
+	enum drbd_ret_code retcode;
+	int i;
+	int err;
+
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_CONN);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
+	tconn = adm_ctx.tconn;
 	conn_reconfig_start(tconn);
 
 	if (tconn->cstate > C_STANDALONE) {
@@ -1343,8 +1534,10 @@ static int drbd_nl_net_conf(struct drbd_tconn *tconn, struct drbd_nl_cfg_req *nl
 	new_conf->on_congestion    = DRBD_ON_CONGESTION_DEF;
 	new_conf->cong_extents     = DRBD_CONG_EXTENTS_DEF;
 
-	if (!net_conf_from_tags(nlp->tag_list, new_conf)) {
+	err = net_conf_from_attrs(new_conf, info->attrs);
+	if (err) {
 		retcode = ERR_MANDATORY_TAG;
+		drbd_msg_put_info(from_attrs_err_to_txt(err));
 		goto fail;
 	}
 
@@ -1495,8 +1688,8 @@ static int drbd_nl_net_conf(struct drbd_tconn *tconn, struct drbd_nl_cfg_req *nl
 		mdev->recv_cnt = 0;
 		kobject_uevent(&disk_to_dev(mdev->vdisk)->kobj, KOBJ_CHANGE);
 	}
-	reply->ret_code = retcode;
 	conn_reconfig_done(tconn);
+	drbd_adm_finish(info, retcode);
 	return 0;
 
 fail:
@@ -1508,24 +1701,37 @@ fail:
 	crypto_free_hash(integrity_r_tfm);
 	kfree(new_conf);
 
-	reply->ret_code = retcode;
 	conn_reconfig_done(tconn);
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_disconnect(struct drbd_tconn *tconn, struct drbd_nl_cfg_req *nlp,
-			      struct drbd_nl_cfg_reply *reply)
+int drbd_adm_disconnect(struct sk_buff *skb, struct genl_info *info)
 {
-	int retcode;
-	struct disconnect dc;
+	struct disconnect_parms parms;
+	struct drbd_tconn *tconn;
+	enum drbd_ret_code retcode;
+	int err;
 
-	memset(&dc, 0, sizeof(struct disconnect));
-	if (!disconnect_from_tags(nlp->tag_list, &dc)) {
-		retcode = ERR_MANDATORY_TAG;
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_CONN);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
 		goto fail;
+
+	tconn = adm_ctx.tconn;
+	memset(&parms, 0, sizeof(parms));
+	if (info->attrs[DRBD_NLA_DISCONNECT_PARMS]) {
+		err = disconnect_parms_from_attrs(&parms, info->attrs);
+		if (err) {
+			retcode = ERR_MANDATORY_TAG;
+			drbd_msg_put_info(from_attrs_err_to_txt(err));
+			goto fail;
+		}
 	}
 
-	if (dc.force) {
+	if (parms.force_disconnect) {
 		spin_lock_irq(&tconn->req_lock);
 		if (tconn->cstate >= C_WF_CONNECTION)
 			_conn_request_state(tconn, NS(conn, C_DISCONNECTING), CS_HARD);
@@ -1567,7 +1773,7 @@ static int drbd_nl_disconnect(struct drbd_tconn *tconn, struct drbd_nl_cfg_req *
  done:
 	retcode = NO_ERROR;
  fail:
-	reply->ret_code = retcode;
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
@@ -1587,20 +1793,32 @@ void resync_after_online_grow(struct drbd_conf *mdev)
 		_drbd_request_state(mdev, NS(conn, C_WF_SYNC_UUID), CS_VERBOSE + CS_SERIALIZE);
 }
 
-static int drbd_nl_resize(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			  struct drbd_nl_cfg_reply *reply)
+int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)
 {
-	struct resize rs;
-	int retcode = NO_ERROR;
+	struct resize_parms rs;
+	struct drbd_conf *mdev;
+	enum drbd_ret_code retcode;
 	enum determine_dev_size dd;
 	enum dds_flags ddsf;
+	int err;
 
-	memset(&rs, 0, sizeof(struct resize));
-	if (!resize_from_tags(nlp->tag_list, &rs)) {
-		retcode = ERR_MANDATORY_TAG;
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
 		goto fail;
+
+	memset(&rs, 0, sizeof(struct resize_parms));
+	if (info->attrs[DRBD_NLA_RESIZE_PARMS]) {
+		err = resize_parms_from_attrs(&rs, info->attrs);
+		if (err) {
+			retcode = ERR_MANDATORY_TAG;
+			drbd_msg_put_info(from_attrs_err_to_txt(err));
+			goto fail;
+		}
 	}
 
+	mdev = adm_ctx.mdev;
 	if (mdev->state.conn > C_CONNECTED) {
 		retcode = ERR_RESIZE_RESYNC;
 		goto fail;
@@ -1644,14 +1862,14 @@ static int drbd_nl_resize(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
 	}
 
  fail:
-	reply->ret_code = retcode;
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_syncer_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			       struct drbd_nl_cfg_reply *reply)
+int drbd_adm_syncer(struct sk_buff *skb, struct genl_info *info)
 {
-	int retcode = NO_ERROR;
+	struct drbd_conf *mdev;
+	enum drbd_ret_code retcode;
 	int err;
 	int ovr; /* online verify running */
 	int rsr; /* re-sync running */
@@ -1662,12 +1880,21 @@ static int drbd_nl_syncer_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *n
 	int *rs_plan_s = NULL;
 	int fifo_size;
 
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto fail;
+	mdev = adm_ctx.mdev;
+
 	if (!zalloc_cpumask_var(&new_cpu_mask, GFP_KERNEL)) {
 		retcode = ERR_NOMEM;
+		drbd_msg_put_info("unable to allocate cpumask");
 		goto fail;
 	}
 
-	if (nlp->flags & DRBD_NL_SET_DEFAULTS) {
+	if (((struct drbd_genlmsghdr*)info->userhdr)->flags
+			& DRBD_GENL_F_SET_DEFAULTS) {
 		memset(&sc, 0, sizeof(struct syncer_conf));
 		sc.rate       = DRBD_RATE_DEF;
 		sc.after      = DRBD_AFTER_DEF;
@@ -1681,8 +1908,10 @@ static int drbd_nl_syncer_conf(struct drbd_conf *mdev, struct drbd_nl_cfg_req *n
 	} else
 		memcpy(&sc, &mdev->sync_conf, sizeof(struct syncer_conf));
 
-	if (!syncer_conf_from_tags(nlp->tag_list, &sc)) {
+	err = syncer_conf_from_attrs(&sc, info->attrs);
+	if (err) {
 		retcode = ERR_MANDATORY_TAG;
+		drbd_msg_put_info(from_attrs_err_to_txt(err));
 		goto fail;
 	}
 
@@ -1832,14 +2061,23 @@ fail:
 	free_cpumask_var(new_cpu_mask);
 	crypto_free_hash(csums_tfm);
 	crypto_free_hash(verify_tfm);
-	reply->ret_code = retcode;
+
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_invalidate(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			      struct drbd_nl_cfg_reply *reply)
+int drbd_adm_invalidate(struct sk_buff *skb, struct genl_info *info)
 {
-	int retcode;
+	struct drbd_conf *mdev;
+	enum drbd_ret_code retcode;
+
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
+
+	mdev = adm_ctx.mdev;
 
 	/* If there is still bitmap IO pending, probably because of a previous
 	 * resync just being finished, wait for it before requesting a new resync. */
@@ -1862,7 +2100,8 @@ static int drbd_nl_invalidate(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nl
 		retcode = drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_T));
 	}
 
-	reply->ret_code = retcode;
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
@@ -1875,56 +2114,58 @@ static int drbd_bmio_set_susp_al(struct drbd_conf *mdev)
 	return rv;
 }
 
-static int drbd_nl_invalidate_peer(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-				   struct drbd_nl_cfg_reply *reply)
+static int drbd_adm_simple_request_state(struct sk_buff *skb, struct genl_info *info,
+		union drbd_state mask, union drbd_state val)
 {
-	int retcode;
-
-	/* If there is still bitmap IO pending, probably because of a previous
-	 * resync just being finished, wait for it before requesting a new resync. */
-	wait_event(mdev->misc_wait, !test_bit(BITMAP_IO, &mdev->flags));
+	enum drbd_ret_code retcode;
 
-	retcode = _drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_S), CS_ORDERED);
-
-	if (retcode < SS_SUCCESS) {
-		if (retcode == SS_NEED_CONNECTION && mdev->state.role == R_PRIMARY) {
-			/* The peer will get a resync upon connect anyways. Just make that
-			   into a full resync. */
-			retcode = drbd_request_state(mdev, NS(pdsk, D_INCONSISTENT));
-			if (retcode >= SS_SUCCESS) {
-				if (drbd_bitmap_io(mdev, &drbd_bmio_set_susp_al,
-					"set_n_write from invalidate_peer",
-					BM_LOCKED_SET_ALLOWED))
-					retcode = ERR_IO_MD_DISK;
-			}
-		} else
-			retcode = drbd_request_state(mdev, NS(conn, C_STARTING_SYNC_S));
-	}
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	reply->ret_code = retcode;
+	retcode = drbd_request_state(adm_ctx.mdev, mask, val);
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_pause_sync(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			      struct drbd_nl_cfg_reply *reply)
+int drbd_adm_invalidate_peer(struct sk_buff *skb, struct genl_info *info)
 {
-	int retcode = NO_ERROR;
+	return drbd_adm_simple_request_state(skb, info, NS(conn, C_STARTING_SYNC_S));
+}
 
-	if (drbd_request_state(mdev, NS(user_isp, 1)) == SS_NOTHING_TO_DO)
-		retcode = ERR_PAUSE_IS_SET;
+int drbd_adm_pause_sync(struct sk_buff *skb, struct genl_info *info)
+{
+	enum drbd_ret_code retcode;
+
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	reply->ret_code = retcode;
+	if (drbd_request_state(adm_ctx.mdev, NS(user_isp, 1)) == SS_NOTHING_TO_DO)
+		retcode = ERR_PAUSE_IS_SET;
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_resume_sync(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			       struct drbd_nl_cfg_reply *reply)
+int drbd_adm_resume_sync(struct sk_buff *skb, struct genl_info *info)
 {
-	int retcode = NO_ERROR;
 	union drbd_state s;
+	enum drbd_ret_code retcode;
+
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	if (drbd_request_state(mdev, NS(user_isp, 0)) == SS_NOTHING_TO_DO) {
-		s = mdev->state;
+	if (drbd_request_state(adm_ctx.mdev, NS(user_isp, 0)) == SS_NOTHING_TO_DO) {
+		s = adm_ctx.mdev->state;
 		if (s.conn == C_PAUSED_SYNC_S || s.conn == C_PAUSED_SYNC_T) {
 			retcode = s.aftr_isp ? ERR_PIC_AFTER_DEP :
 				  s.peer_isp ? ERR_PIC_PEER_DEP : ERR_PAUSE_IS_CLEAR;
@@ -1933,28 +2174,35 @@ static int drbd_nl_resume_sync(struct drbd_conf *mdev, struct drbd_nl_cfg_req *n
 		}
 	}
 
-	reply->ret_code = retcode;
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_suspend_io(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			      struct drbd_nl_cfg_reply *reply)
+int drbd_adm_suspend_io(struct sk_buff *skb, struct genl_info *info)
 {
-	reply->ret_code = drbd_request_state(mdev, NS(susp, 1));
-
-	return 0;
+	return drbd_adm_simple_request_state(skb, info, NS(susp, 1));
 }
 
-static int drbd_nl_resume_io(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			     struct drbd_nl_cfg_reply *reply)
+int drbd_adm_resume_io(struct sk_buff *skb, struct genl_info *info)
 {
+	struct drbd_conf *mdev;
+	enum drbd_ret_code retcode;
+
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
+
+	mdev = adm_ctx.mdev;
 	if (test_bit(NEW_CUR_UUID, &mdev->flags)) {
 		drbd_uuid_new_current(mdev);
 		clear_bit(NEW_CUR_UUID, &mdev->flags);
 	}
 	drbd_suspend_io(mdev);
-	reply->ret_code = drbd_request_state(mdev, NS3(susp, 0, susp_nod, 0, susp_fen, 0));
-	if (reply->ret_code == SS_SUCCESS) {
+	retcode = drbd_request_state(mdev, NS3(susp, 0, susp_nod, 0, susp_fen, 0));
+	if (retcode == SS_SUCCESS) {
 		if (mdev->state.conn < C_CONNECTED)
 			tl_clear(mdev->tconn);
 		if (mdev->state.disk == D_DISKLESS || mdev->state.disk == D_FAILED)
@@ -1962,138 +2210,261 @@ static int drbd_nl_resume_io(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp
 	}
 	drbd_resume_io(mdev);
 
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_outdate(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			   struct drbd_nl_cfg_reply *reply)
+int drbd_adm_outdate(struct sk_buff *skb, struct genl_info *info)
 {
-	reply->ret_code = drbd_request_state(mdev, NS(disk, D_OUTDATED));
-	return 0;
+	return drbd_adm_simple_request_state(skb, info, NS(disk, D_OUTDATED));
 }
 
-static int drbd_nl_get_config(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			   struct drbd_nl_cfg_reply *reply)
+int nla_put_status_info(struct sk_buff *skb, struct drbd_conf *mdev,
+		const struct sib_info *sib)
 {
-	unsigned short *tl;
-
-	tl = reply->tag_list;
-
-	if (get_ldev(mdev)) {
-		tl = disk_conf_to_tags(&mdev->ldev->dc, tl);
-		put_ldev(mdev);
+	struct state_info *si = NULL; /* for sizeof(si->member); */
+	struct nlattr *nla;
+	int got_ldev;
+	int got_net;
+	int err = 0;
+	int exclude_sensitive;
+
+	/* If sib != NULL, this is drbd_bcast_event, which anyone can listen
+	 * to.  So we better exclude_sensitive information.
+	 *
+	 * If sib == NULL, this is drbd_adm_get_status, executed synchronously
+	 * in the context of the requesting user process. Exclude sensitive
+	 * information, unless current has superuser.
+	 *
+	 * NOTE: for drbd_adm_get_status_all(), this is a netlink dump, and
+	 * relies on the current implementation of netlink_dump(), which
+	 * executes the dump callback successively from netlink_recvmsg(),
+	 * always in the context of the receiving process */
+	exclude_sensitive = sib || !capable(CAP_SYS_ADMIN);
+
+	got_ldev = get_ldev(mdev);
+	got_net = get_net_conf(mdev->tconn);
+
+	/* We need to add connection name and volume number information still.
+	 * Minor number is in drbd_genlmsghdr. */
+	nla = nla_nest_start(skb, DRBD_NLA_CFG_CONTEXT);
+	if (!nla)
+		goto nla_put_failure;
+	NLA_PUT_U32(skb, T_ctx_volume, mdev->vnr);
+	NLA_PUT_STRING(skb, T_ctx_conn_name, mdev->tconn->name);
+	nla_nest_end(skb, nla);
+
+	if (got_ldev)
+		if (disk_conf_to_skb(skb, &mdev->ldev->dc, exclude_sensitive))
+			goto nla_put_failure;
+	if (got_net)
+		if (net_conf_to_skb(skb, mdev->tconn->net_conf, exclude_sensitive))
+			goto nla_put_failure;
+
+	if (syncer_conf_to_skb(skb, &mdev->sync_conf, exclude_sensitive))
+		goto nla_put_failure;
+
+	nla = nla_nest_start(skb, DRBD_NLA_STATE_INFO);
+	if (!nla)
+		goto nla_put_failure;
+	NLA_PUT_U32(skb, T_sib_reason, sib ? sib->sib_reason : SIB_GET_STATUS_REPLY);
+	NLA_PUT_U32(skb, T_current_state, mdev->state.i);
+	NLA_PUT_U64(skb, T_ed_uuid, mdev->ed_uuid);
+	NLA_PUT_U64(skb, T_capacity, drbd_get_capacity(mdev->this_bdev));
+
+	if (got_ldev) {
+		NLA_PUT_U32(skb, T_disk_flags, mdev->ldev->md.flags);
+		NLA_PUT(skb, T_uuids, sizeof(si->uuids), mdev->ldev->md.uuid);
+		NLA_PUT_U64(skb, T_bits_total, drbd_bm_bits(mdev));
+		NLA_PUT_U64(skb, T_bits_oos, drbd_bm_total_weight(mdev));
+		if (C_SYNC_SOURCE <= mdev->state.conn &&
+		    C_PAUSED_SYNC_T >= mdev->state.conn) {
+			NLA_PUT_U64(skb, T_bits_rs_total, mdev->rs_total);
+			NLA_PUT_U64(skb, T_bits_rs_failed, mdev->rs_failed);
+		}
 	}
 
-	if (get_net_conf(mdev->tconn)) {
-		tl = net_conf_to_tags(mdev->tconn->net_conf, tl);
-		put_net_conf(mdev->tconn);
+	if (sib) {
+		switch (sib->sib_reason) {
+		case SIB_SYNC_PROGRESS:
+		case SIB_GET_STATUS_REPLY:
+			break;
+		case SIB_STATE_CHANGE:
+			NLA_PUT_U32(skb, T_prev_state, sib->os.i);
+			NLA_PUT_U32(skb, T_new_state, sib->ns.i);
+			break;
+		case SIB_HELPER_POST:
+			NLA_PUT_U32(skb,
+				T_helper_exit_code, sib->helper_exit_code);
+			/* fall through */
+		case SIB_HELPER_PRE:
+			NLA_PUT_STRING(skb, T_helper, sib->helper_name);
+			break;
+		}
 	}
-	tl = syncer_conf_to_tags(&mdev->sync_conf, tl);
-
-	put_unaligned(TT_END, tl++); /* Close the tag list */
+	nla_nest_end(skb, nla);
 
-	return (int)((char *)tl - (char *)reply->tag_list);
+	if (0)
+nla_put_failure:
+		err = -EMSGSIZE;
+	if (got_ldev)
+		put_ldev(mdev);
+	if (got_net)
+		put_net_conf(mdev->tconn);
+	return err;
 }
 
-static int drbd_nl_get_state(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			     struct drbd_nl_cfg_reply *reply)
+int drbd_adm_get_status(struct sk_buff *skb, struct genl_info *info)
 {
-	unsigned short *tl = reply->tag_list;
-	union drbd_state s = mdev->state;
-	unsigned long rs_left;
-	unsigned int res;
+	enum drbd_ret_code retcode;
+	int err;
 
-	tl = get_state_to_tags((struct get_state *)&s, tl);
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	/* no local ref, no bitmap, no syncer progress. */
-	if (s.conn >= C_SYNC_SOURCE && s.conn <= C_PAUSED_SYNC_T) {
-		if (get_ldev(mdev)) {
-			drbd_get_syncer_progress(mdev, &rs_left, &res);
-			tl = tl_add_int(tl, T_sync_progress, &res);
-			put_ldev(mdev);
-		}
+	err = nla_put_status_info(adm_ctx.reply_skb, adm_ctx.mdev, NULL);
+	if (err) {
+		nlmsg_free(adm_ctx.reply_skb);
+		return err;
 	}
-	put_unaligned(TT_END, tl++); /* Close the tag list */
-
-	return (int)((char *)tl - (char *)reply->tag_list);
+out:
+	drbd_adm_finish(info, retcode);
+	return 0;
 }
 
-static int drbd_nl_get_uuids(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			     struct drbd_nl_cfg_reply *reply)
+int drbd_adm_get_status_all(struct sk_buff *skb, struct netlink_callback *cb)
 {
-	unsigned short *tl;
-
-	tl = reply->tag_list;
+	struct drbd_conf *mdev;
+	struct drbd_genlmsghdr *dh;
+	int minor = cb->args[0];
+
+	/* Open coded deferred single idr_for_each_entry iteration.
+	 * This may miss entries inserted after this dump started,
+	 * or entries deleted before they are reached.
+	 * But we need to make sure the mdev won't disappear while
+	 * we are looking at it. */
+
+	rcu_read_lock();
+	mdev = idr_get_next(&minors, &minor);
+	if (mdev) {
+		dh = genlmsg_put(skb, NETLINK_CB(cb->skb).pid,
+				cb->nlh->nlmsg_seq, &drbd_genl_family,
+				NLM_F_MULTI, DRBD_ADM_GET_STATUS);
+		if (!dh)
+			goto errout;
+
+		D_ASSERT(mdev->minor == minor);
+
+		dh->minor = minor;
+		dh->ret_code = NO_ERROR;
+
+		pr_info("dump: minor=%u, conn=%s[%u]\n",
+			dh->minor, mdev->tconn->name, mdev->vnr);
+		if (nla_put_status_info(skb, mdev, NULL)) {
+			genlmsg_cancel(skb, dh);
+			goto errout;
+		}
+		genlmsg_end(skb, dh);
+	}
 
-	if (get_ldev(mdev)) {
-		tl = tl_add_blob(tl, T_uuids, mdev->ldev->md.uuid, UI_SIZE*sizeof(u64));
-		tl = tl_add_int(tl, T_uuids_flags, &mdev->ldev->md.flags);
-		put_ldev(mdev);
-	}
-	put_unaligned(TT_END, tl++); /* Close the tag list */
+errout:
+	rcu_read_unlock();
+	/* where to start idr_get_next with the next iteration */
+	cb->args[0] = minor+1;
 
-	return (int)((char *)tl - (char *)reply->tag_list);
+	/* No more minors found: empty skb. Which will terminate the dump. */
+	return skb->len;
 }
 
-/**
- * drbd_nl_get_timeout_flag() - Used by drbdsetup to find out which timeout value to use
- * @mdev:	DRBD device.
- * @nlp:	Netlink/connector packet from drbdsetup
- * @reply:	Reply packet for drbdsetup
- */
-static int drbd_nl_get_timeout_flag(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-				    struct drbd_nl_cfg_reply *reply)
+int drbd_adm_get_timeout_type(struct sk_buff *skb, struct genl_info *info)
 {
-	unsigned short *tl;
-	char rv;
-
-	tl = reply->tag_list;
+	enum drbd_ret_code retcode;
+	struct timeout_parms tp;
+	int err;
 
-	rv = mdev->state.pdsk == D_OUTDATED        ? UT_PEER_OUTDATED :
-	  test_bit(USE_DEGR_WFC_T, &mdev->flags) ? UT_DEGRADED : UT_DEFAULT;
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	tl = tl_add_blob(tl, T_use_degraded, &rv, sizeof(rv));
-	put_unaligned(TT_END, tl++); /* Close the tag list */
+	tp.timeout_type =
+		adm_ctx.mdev->state.pdsk == D_OUTDATED ? UT_PEER_OUTDATED :
+		test_bit(USE_DEGR_WFC_T, &adm_ctx.mdev->flags) ? UT_DEGRADED :
+		UT_DEFAULT;
 
-	return (int)((char *)tl - (char *)reply->tag_list);
+	err = timeout_parms_to_priv_skb(adm_ctx.reply_skb, &tp);
+	if (err) {
+		nlmsg_free(adm_ctx.reply_skb);
+		return err;
+	}
+out:
+	drbd_adm_finish(info, retcode);
+	return 0;
 }
 
-static int drbd_nl_start_ov(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-				    struct drbd_nl_cfg_reply *reply)
+int drbd_adm_start_ov(struct sk_buff *skb, struct genl_info *info)
 {
-	/* default to resume from last known position, if possible */
-	struct start_ov args =
-		{ .start_sector = mdev->ov_start_sector };
+	struct drbd_conf *mdev;
+	enum drbd_ret_code retcode;
 
-	if (!start_ov_from_tags(nlp->tag_list, &args)) {
-		reply->ret_code = ERR_MANDATORY_TAG;
-		return 0;
-	}
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
+	mdev = adm_ctx.mdev;
+	if (info->attrs[DRBD_NLA_START_OV_PARMS]) {
+		/* resume from last known position, if possible */
+		struct start_ov_parms parms =
+			{ .ov_start_sector = mdev->ov_start_sector };
+		int err = start_ov_parms_from_attrs(&parms, info->attrs);
+		if (err) {
+			retcode = ERR_MANDATORY_TAG;
+			drbd_msg_put_info(from_attrs_err_to_txt(err));
+			goto out;
+		}
+		/* w_make_ov_request expects position to be aligned */
+		mdev->ov_start_sector = parms.ov_start_sector & ~(BM_SECT_PER_BIT-1);
+	}
 	/* If there is still bitmap IO pending, e.g. previous resync or verify
 	 * just being finished, wait for it before requesting a new resync. */
 	wait_event(mdev->misc_wait, !test_bit(BITMAP_IO, &mdev->flags));
-
-	/* w_make_ov_request expects position to be aligned */
-	mdev->ov_start_sector = args.start_sector & ~BM_SECT_PER_BIT;
-	reply->ret_code = drbd_request_state(mdev,NS(conn,C_VERIFY_S));
+	retcode = drbd_request_state(mdev,NS(conn,C_VERIFY_S));
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
 
-static int drbd_nl_new_c_uuid(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			      struct drbd_nl_cfg_reply *reply)
+int drbd_adm_new_c_uuid(struct sk_buff *skb, struct genl_info *info)
 {
-	int retcode = NO_ERROR;
+	struct drbd_conf *mdev;
+	enum drbd_ret_code retcode;
 	int skip_initial_sync = 0;
 	int err;
+	struct new_c_uuid_parms args;
 
-	struct new_c_uuid args;
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out_nolock;
 
-	memset(&args, 0, sizeof(struct new_c_uuid));
-	if (!new_c_uuid_from_tags(nlp->tag_list, &args)) {
-		reply->ret_code = ERR_MANDATORY_TAG;
-		return 0;
+	mdev = adm_ctx.mdev;
+	memset(&args, 0, sizeof(args));
+	if (info->attrs[DRBD_NLA_NEW_C_UUID_PARMS]) {
+		err = new_c_uuid_parms_from_attrs(&args, info->attrs);
+		if (err) {
+			retcode = ERR_MANDATORY_TAG;
+			drbd_msg_put_info(from_attrs_err_to_txt(err));
+			goto out_nolock;
+		}
 	}
 
 	mutex_lock(mdev->state_mutex); /* Protects us against serialized state changes. */
@@ -2139,510 +2510,164 @@ out_dec:
 	put_ldev(mdev);
 out:
 	mutex_unlock(mdev->state_mutex);
-
-	reply->ret_code = retcode;
-	return 0;
-}
-
-static int drbd_nl_new_conn(struct drbd_nl_cfg_req *nlp, struct drbd_nl_cfg_reply *reply)
-{
-	struct new_connection args;
-
-	if (!new_connection_from_tags(nlp->tag_list, &args)) {
-		reply->ret_code = ERR_MANDATORY_TAG;
-		return 0;
-	}
-
-	reply->ret_code = NO_ERROR;
-	if (!drbd_new_tconn(args.name))
-		reply->ret_code = ERR_NOMEM;
-
-	return 0;
-}
-
-static int drbd_nl_new_minor(struct drbd_tconn *tconn,
-		      struct drbd_nl_cfg_req *nlp, struct drbd_nl_cfg_reply *reply)
-{
-	struct new_minor args;
-
-	args.vol_nr = 0;
-	args.minor = 0;
-
-	if (!new_minor_from_tags(nlp->tag_list, &args)) {
-		reply->ret_code = ERR_MANDATORY_TAG;
-		return 0;
-	}
-
-	reply->ret_code = conn_new_minor(tconn, args.minor, args.vol_nr);
-
+out_nolock:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-static int drbd_nl_del_minor(struct drbd_conf *mdev, struct drbd_nl_cfg_req *nlp,
-			     struct drbd_nl_cfg_reply *reply)
+static enum drbd_ret_code
+drbd_check_conn_name(const char *name)
 {
-	if (mdev->state.disk == D_DISKLESS &&
-	    mdev->state.conn == C_STANDALONE &&
-	    mdev->state.role == R_SECONDARY) {
-		drbd_delete_device(mdev_to_minor(mdev));
-		reply->ret_code = NO_ERROR;
-	} else {
-		reply->ret_code = ERR_MINOR_CONFIGURED;
+	if (!name || !name[0]) {
+		drbd_msg_put_info("connection name missing");
+		return ERR_MANDATORY_TAG;
 	}
-	return 0;
-}
-
-static int drbd_nl_del_conn(struct drbd_tconn *tconn,
-			    struct drbd_nl_cfg_req *nlp, struct drbd_nl_cfg_reply *reply)
-{
-	if (conn_lowest_minor(tconn) < 0) {
-		drbd_free_tconn(tconn);
-		reply->ret_code = NO_ERROR;
-	} else {
-		reply->ret_code = ERR_CONN_IN_USE;
+	/* if we want to use these in sysfs/configfs/debugfs some day,
+	 * we must not allow slashes */
+	if (strchr(name, '/')) {
+		drbd_msg_put_info("invalid connection name");
+		return ERR_INVALID_REQUEST;
 	}
-
-	return 0;
+	return NO_ERROR;
 }
 
-enum cn_handler_type {
-	CHT_MINOR,
-	CHT_CONN,
-	CHT_CTOR,
-	/* CHT_RES, later */
-};
-struct cn_handler_struct {
-	enum cn_handler_type type;
-	union {
-		int (*minor_based)(struct drbd_conf *,
-				   struct drbd_nl_cfg_req *,
-				   struct drbd_nl_cfg_reply *);
-		int (*conn_based)(struct drbd_tconn *,
-				  struct drbd_nl_cfg_req *,
-				  struct drbd_nl_cfg_reply *);
-		int (*constructor)(struct drbd_nl_cfg_req *,
-				   struct drbd_nl_cfg_reply *);
-	};
-	int reply_body_size;
-};
-
-static struct cn_handler_struct cnd_table[] = {
-	[ P_primary ]		= { CHT_MINOR, { &drbd_nl_primary },	0 },
-	[ P_secondary ]		= { CHT_MINOR, { &drbd_nl_secondary },	0 },
-	[ P_disk_conf ]		= { CHT_MINOR, { &drbd_nl_disk_conf },	0 },
-	[ P_detach ]		= { CHT_MINOR, { &drbd_nl_detach },	0 },
-	[ P_net_conf ]		= { CHT_CONN,  { .conn_based = &drbd_nl_net_conf },	0 },
-	[ P_disconnect ]	= { CHT_CONN,  { .conn_based = &drbd_nl_disconnect },	0 },
-	[ P_resize ]		= { CHT_MINOR, { &drbd_nl_resize },	0 },
-	[ P_syncer_conf ]	= { CHT_MINOR, { &drbd_nl_syncer_conf },0 },
-	[ P_invalidate ]	= { CHT_MINOR, { &drbd_nl_invalidate },	0 },
-	[ P_invalidate_peer ]	= { CHT_MINOR, { &drbd_nl_invalidate_peer },0 },
-	[ P_pause_sync ]	= { CHT_MINOR, { &drbd_nl_pause_sync },	0 },
-	[ P_resume_sync ]	= { CHT_MINOR, { &drbd_nl_resume_sync },0 },
-	[ P_suspend_io ]	= { CHT_MINOR, { &drbd_nl_suspend_io },	0 },
-	[ P_resume_io ]		= { CHT_MINOR, { &drbd_nl_resume_io },	0 },
-	[ P_outdate ]		= { CHT_MINOR, { &drbd_nl_outdate },	0 },
-	[ P_get_config ]	= { CHT_MINOR, { &drbd_nl_get_config },
-				    sizeof(struct syncer_conf_tag_len_struct) +
-				    sizeof(struct disk_conf_tag_len_struct) +
-				    sizeof(struct net_conf_tag_len_struct) },
-	[ P_get_state ]		= { CHT_MINOR, { &drbd_nl_get_state },
-				    sizeof(struct get_state_tag_len_struct) +
-				    sizeof(struct sync_progress_tag_len_struct)	},
-	[ P_get_uuids ]		= { CHT_MINOR, { &drbd_nl_get_uuids },
-				    sizeof(struct get_uuids_tag_len_struct) },
-	[ P_get_timeout_flag ]	= { CHT_MINOR, { &drbd_nl_get_timeout_flag },
-				    sizeof(struct get_timeout_flag_tag_len_struct)},
-	[ P_start_ov ]		= { CHT_MINOR, { &drbd_nl_start_ov },	0 },
-	[ P_new_c_uuid ]	= { CHT_MINOR, { &drbd_nl_new_c_uuid },	0 },
-	[ P_new_connection ]	= { CHT_CTOR,  { .constructor = &drbd_nl_new_conn }, 0 },
-	[ P_new_minor ]		= { CHT_CONN,  { .conn_based = &drbd_nl_new_minor }, 0 },
-	[ P_del_minor ]		= { CHT_MINOR, { &drbd_nl_del_minor },	0 },
-	[ P_del_connection ]    = { CHT_CONN,  { .conn_based = &drbd_nl_del_conn }, 0 },
-};
-
-static void drbd_connector_callback(struct cn_msg *req, struct netlink_skb_parms *nsp)
+int drbd_adm_create_connection(struct sk_buff *skb, struct genl_info *info)
 {
-	struct drbd_nl_cfg_req *nlp = (struct drbd_nl_cfg_req *)req->data;
-	struct cn_handler_struct *cm;
-	struct cn_msg *cn_reply;
-	struct drbd_nl_cfg_reply *reply;
-	struct drbd_conf *mdev;
-	struct drbd_tconn *tconn;
-	int retcode, rr;
-	int reply_size = sizeof(struct cn_msg)
-		+ sizeof(struct drbd_nl_cfg_reply)
-		+ sizeof(short int);
-
-	if (!try_module_get(THIS_MODULE)) {
-		printk(KERN_ERR "drbd: try_module_get() failed!\n");
-		return;
-	}
-
-	if (!cap_raised(current_cap(), CAP_SYS_ADMIN)) {
-		retcode = ERR_PERM;
-		goto fail;
-	}
+	enum drbd_ret_code retcode;
 
-	if (nlp->packet_type >= P_nl_after_last_packet ||
-	    nlp->packet_type == P_return_code_only) {
-		retcode = ERR_PACKET_NR;
-		goto fail;
-	}
+	retcode = drbd_adm_prepare(skb, info, 0);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	cm = cnd_table + nlp->packet_type;
+	retcode = drbd_check_conn_name(adm_ctx.conn_name);
+	if (retcode != NO_ERROR)
+		goto out;
 
-	/* This may happen if packet number is 0: */
-	if (cm->minor_based == NULL) {
-		retcode = ERR_PACKET_NR;
-		goto fail;
+	if (adm_ctx.tconn) {
+		retcode = ERR_INVALID_REQUEST;
+		drbd_msg_put_info("connection exists");
+		goto out;
 	}
 
-	reply_size += cm->reply_body_size;
-
-	/* allocation not in the IO path, cqueue thread context */
-	cn_reply = kzalloc(reply_size, GFP_KERNEL);
-	if (!cn_reply) {
+	if (!drbd_new_tconn(adm_ctx.conn_name))
 		retcode = ERR_NOMEM;
-		goto fail;
-	}
-	reply = (struct drbd_nl_cfg_reply *) cn_reply->data;
-
-	reply->packet_type =
-		cm->reply_body_size ? nlp->packet_type : P_return_code_only;
-	reply->minor = nlp->drbd_minor;
-	reply->ret_code = NO_ERROR; /* Might by modified by cm->function. */
-	/* reply->tag_list; might be modified by cm->function. */
-
-	retcode = ERR_MINOR_INVALID;
-	rr = 0;
-	switch (cm->type) {
-	case CHT_MINOR:
-		mdev = minor_to_mdev(nlp->drbd_minor);
-		if (!mdev)
-			goto fail;
-		rr = cm->minor_based(mdev, nlp, reply);
-		break;
-	case CHT_CONN:
-		tconn = conn_by_name(nlp->obj_name);
-		if (!tconn) {
-			retcode = ERR_CONN_NOT_KNOWN;
-			goto fail;
-		}
-		rr = cm->conn_based(tconn, nlp, reply);
-		break;
-	case CHT_CTOR:
-		rr = cm->constructor(nlp, reply);
-		break;
-	/* case CHT_RES: */
-	}
-
-	cn_reply->id = req->id;
-	cn_reply->seq = req->seq;
-	cn_reply->ack = req->ack  + 1;
-	cn_reply->len = sizeof(struct drbd_nl_cfg_reply) + rr;
-	cn_reply->flags = 0;
-
-	rr = cn_netlink_send(cn_reply, CN_IDX_DRBD, GFP_KERNEL);
-	if (rr && rr != -ESRCH)
-		printk(KERN_INFO "drbd: cn_netlink_send()=%d\n", rr);
-
-	kfree(cn_reply);
-	module_put(THIS_MODULE);
-	return;
- fail:
-	drbd_nl_send_reply(req, retcode);
-	module_put(THIS_MODULE);
-}
-
-static atomic_t drbd_nl_seq = ATOMIC_INIT(2); /* two. */
-
-static unsigned short *
-__tl_add_blob(unsigned short *tl, enum drbd_tags tag, const void *data,
-	unsigned short len, int nul_terminated)
-{
-	unsigned short l = tag_descriptions[tag_number(tag)].max_len;
-	len = (len < l) ? len :  l;
-	put_unaligned(tag, tl++);
-	put_unaligned(len, tl++);
-	memcpy(tl, data, len);
-	tl = (unsigned short*)((char*)tl + len);
-	if (nul_terminated)
-		*((char*)tl - 1) = 0;
-	return tl;
-}
-
-static unsigned short *
-tl_add_blob(unsigned short *tl, enum drbd_tags tag, const void *data, int len)
-{
-	return __tl_add_blob(tl, tag, data, len, 0);
-}
-
-static unsigned short *
-tl_add_str(unsigned short *tl, enum drbd_tags tag, const char *str)
-{
-	return __tl_add_blob(tl, tag, str, strlen(str)+1, 0);
-}
-
-static unsigned short *
-tl_add_int(unsigned short *tl, enum drbd_tags tag, const void *val)
-{
-	put_unaligned(tag, tl++);
-	switch(tag_type(tag)) {
-	case TT_INTEGER:
-		put_unaligned(sizeof(int), tl++);
-		put_unaligned(*(int *)val, (int *)tl);
-		tl = (unsigned short*)((char*)tl+sizeof(int));
-		break;
-	case TT_INT64:
-		put_unaligned(sizeof(u64), tl++);
-		put_unaligned(*(u64 *)val, (u64 *)tl);
-		tl = (unsigned short*)((char*)tl+sizeof(u64));
-		break;
-	default:
-		/* someone did something stupid. */
-		;
-	}
-	return tl;
-}
-
-void drbd_bcast_state(struct drbd_conf *mdev, union drbd_state state)
-{
-	char buffer[sizeof(struct cn_msg)+
-		    sizeof(struct drbd_nl_cfg_reply)+
-		    sizeof(struct get_state_tag_len_struct)+
-		    sizeof(short int)];
-	struct cn_msg *cn_reply = (struct cn_msg *) buffer;
-	struct drbd_nl_cfg_reply *reply =
-		(struct drbd_nl_cfg_reply *)cn_reply->data;
-	unsigned short *tl = reply->tag_list;
-
-	/* dev_warn(DEV, "drbd_bcast_state() got called\n"); */
-
-	tl = get_state_to_tags((struct get_state *)&state, tl);
-
-	put_unaligned(TT_END, tl++); /* Close the tag list */
-
-	cn_reply->id.idx = CN_IDX_DRBD;
-	cn_reply->id.val = CN_VAL_DRBD;
-
-	cn_reply->seq = atomic_inc_return(&drbd_nl_seq);
-	cn_reply->ack = 0; /* not used here. */
-	cn_reply->len = sizeof(struct drbd_nl_cfg_reply) +
-		(int)((char *)tl - (char *)reply->tag_list);
-	cn_reply->flags = 0;
-
-	reply->packet_type = P_get_state;
-	reply->minor = mdev_to_minor(mdev);
-	reply->ret_code = NO_ERROR;
-
-	cn_netlink_send(cn_reply, CN_IDX_DRBD, GFP_NOIO);
-}
-
-void drbd_bcast_ev_helper(struct drbd_conf *mdev, char *helper_name)
-{
-	char buffer[sizeof(struct cn_msg)+
-		    sizeof(struct drbd_nl_cfg_reply)+
-		    sizeof(struct call_helper_tag_len_struct)+
-		    sizeof(short int)];
-	struct cn_msg *cn_reply = (struct cn_msg *) buffer;
-	struct drbd_nl_cfg_reply *reply =
-		(struct drbd_nl_cfg_reply *)cn_reply->data;
-	unsigned short *tl = reply->tag_list;
-
-	/* dev_warn(DEV, "drbd_bcast_state() got called\n"); */
-
-	tl = tl_add_str(tl, T_helper, helper_name);
-	put_unaligned(TT_END, tl++); /* Close the tag list */
-
-	cn_reply->id.idx = CN_IDX_DRBD;
-	cn_reply->id.val = CN_VAL_DRBD;
-
-	cn_reply->seq = atomic_inc_return(&drbd_nl_seq);
-	cn_reply->ack = 0; /* not used here. */
-	cn_reply->len = sizeof(struct drbd_nl_cfg_reply) +
-		(int)((char *)tl - (char *)reply->tag_list);
-	cn_reply->flags = 0;
-
-	reply->packet_type = P_call_helper;
-	reply->minor = mdev_to_minor(mdev);
-	reply->ret_code = NO_ERROR;
-
-	cn_netlink_send(cn_reply, CN_IDX_DRBD, GFP_NOIO);
+out:
+	drbd_adm_finish(info, retcode);
+	return 0;
 }
 
-void drbd_bcast_ee(struct drbd_conf *mdev, const char *reason, const int dgs,
-		   const char *seen_hash, const char *calc_hash,
-			   const struct drbd_peer_request *peer_req)
+int drbd_adm_add_minor(struct sk_buff *skb, struct genl_info *info)
 {
-	struct cn_msg *cn_reply;
-	struct drbd_nl_cfg_reply *reply;
-	unsigned short *tl;
-	struct page *page;
-	unsigned len;
+	struct drbd_genlmsghdr *dh = info->userhdr;
+	enum drbd_ret_code retcode;
 
-	if (!peer_req)
-		return;
-	if (!reason || !reason[0])
-		return;
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_CONN);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	/* apparently we have to memcpy twice, first to prepare the data for the
-	 * struct cn_msg, then within cn_netlink_send from the cn_msg to the
-	 * netlink skb. */
-	/* receiver thread context, which is not in the writeout path (of this node),
-	 * but may be in the writeout path of the _other_ node.
-	 * GFP_NOIO to avoid potential "distributed deadlock". */
-	cn_reply = kzalloc(
-		sizeof(struct cn_msg)+
-		sizeof(struct drbd_nl_cfg_reply)+
-		sizeof(struct dump_ee_tag_len_struct)+
-		sizeof(short int),
-		GFP_NOIO);
-
-	if (!cn_reply) {
-		dev_err(DEV, "could not kmalloc buffer for drbd_bcast_ee, "
-			     "sector %llu, size %u\n",
-			(unsigned long long)peer_req->i.sector,
-			peer_req->i.size);
-		return;
+	/* FIXME drop minor_count parameter, limit to MINORMASK */
+	if (dh->minor >= minor_count) {
+		drbd_msg_put_info("requested minor out of range");
+		retcode = ERR_INVALID_REQUEST;
+		goto out;
 	}
-
-	reply = (struct drbd_nl_cfg_reply*)cn_reply->data;
-	tl = reply->tag_list;
-
-	tl = tl_add_str(tl, T_dump_ee_reason, reason);
-	tl = tl_add_blob(tl, T_seen_digest, seen_hash, dgs);
-	tl = tl_add_blob(tl, T_calc_digest, calc_hash, dgs);
-	tl = tl_add_int(tl, T_ee_sector, &peer_req->i.sector);
-	tl = tl_add_int(tl, T_ee_block_id, &peer_req->block_id);
-
-	/* dump the first 32k */
-	len = min_t(unsigned, peer_req->i.size, 32 << 10);
-	put_unaligned(T_ee_data, tl++);
-	put_unaligned(len, tl++);
-
-	page = peer_req->pages;
-	page_chain_for_each(page) {
-		void *d = kmap_atomic(page, KM_USER0);
-		unsigned l = min_t(unsigned, len, PAGE_SIZE);
-		memcpy(tl, d, l);
-		kunmap_atomic(d, KM_USER0);
-		tl = (unsigned short*)((char*)tl + l);
-		len -= l;
-		if (len == 0)
-			break;
+	/* FIXME we need a define here */
+	if (adm_ctx.volume >= 256) {
+		drbd_msg_put_info("requested volume id out of range");
+		retcode = ERR_INVALID_REQUEST;
+		goto out;
 	}
-	put_unaligned(TT_END, tl++); /* Close the tag list */
-
-	cn_reply->id.idx = CN_IDX_DRBD;
-	cn_reply->id.val = CN_VAL_DRBD;
-
-	cn_reply->seq = atomic_inc_return(&drbd_nl_seq);
-	cn_reply->ack = 0; // not used here.
-	cn_reply->len = sizeof(struct drbd_nl_cfg_reply) +
-		(int)((char*)tl - (char*)reply->tag_list);
-	cn_reply->flags = 0;
 
-	reply->packet_type = P_dump_ee;
-	reply->minor = mdev_to_minor(mdev);
-	reply->ret_code = NO_ERROR;
-
-	cn_netlink_send(cn_reply, CN_IDX_DRBD, GFP_NOIO);
-	kfree(cn_reply);
+	retcode = conn_new_minor(adm_ctx.tconn, dh->minor, adm_ctx.volume);
+out:
+	drbd_adm_finish(info, retcode);
+	return 0;
 }
 
-void drbd_bcast_sync_progress(struct drbd_conf *mdev)
+int drbd_adm_delete_minor(struct sk_buff *skb, struct genl_info *info)
 {
-	char buffer[sizeof(struct cn_msg)+
-		    sizeof(struct drbd_nl_cfg_reply)+
-		    sizeof(struct sync_progress_tag_len_struct)+
-		    sizeof(short int)];
-	struct cn_msg *cn_reply = (struct cn_msg *) buffer;
-	struct drbd_nl_cfg_reply *reply =
-		(struct drbd_nl_cfg_reply *)cn_reply->data;
-	unsigned short *tl = reply->tag_list;
-	unsigned long rs_left;
-	unsigned int res;
-
-	/* no local ref, no bitmap, no syncer progress, no broadcast. */
-	if (!get_ldev(mdev))
-		return;
-	drbd_get_syncer_progress(mdev, &rs_left, &res);
-	put_ldev(mdev);
-
-	tl = tl_add_int(tl, T_sync_progress, &res);
-	put_unaligned(TT_END, tl++); /* Close the tag list */
-
-	cn_reply->id.idx = CN_IDX_DRBD;
-	cn_reply->id.val = CN_VAL_DRBD;
-
-	cn_reply->seq = atomic_inc_return(&drbd_nl_seq);
-	cn_reply->ack = 0; /* not used here. */
-	cn_reply->len = sizeof(struct drbd_nl_cfg_reply) +
-		(int)((char *)tl - (char *)reply->tag_list);
-	cn_reply->flags = 0;
+	struct drbd_conf *mdev;
+	enum drbd_ret_code retcode;
 
-	reply->packet_type = P_sync_progress;
-	reply->minor = mdev_to_minor(mdev);
-	reply->ret_code = NO_ERROR;
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
 
-	cn_netlink_send(cn_reply, CN_IDX_DRBD, GFP_NOIO);
+	mdev = adm_ctx.mdev;
+	if (mdev->state.disk == D_DISKLESS &&
+	    mdev->state.conn == C_STANDALONE &&
+	    mdev->state.role == R_SECONDARY) {
+		drbd_delete_device(mdev_to_minor(mdev));
+		retcode = NO_ERROR;
+	} else
+		retcode = ERR_MINOR_CONFIGURED;
+out:
+	drbd_adm_finish(info, retcode);
+	return 0;
 }
 
-int __init drbd_nl_init(void)
+int drbd_adm_delete_connection(struct sk_buff *skb, struct genl_info *info)
 {
-	static struct cb_id cn_id_drbd;
-	int err, try=10;
-
-	cn_id_drbd.val = CN_VAL_DRBD;
-	do {
-		cn_id_drbd.idx = cn_idx;
-		err = cn_add_callback(&cn_id_drbd, "cn_drbd", &drbd_connector_callback);
-		if (!err)
-			break;
-		cn_idx = (cn_idx + CN_IDX_STEP);
-	} while (try--);
+	enum drbd_ret_code retcode;
 
-	if (err) {
-		printk(KERN_ERR "drbd: cn_drbd failed to register\n");
-		return err;
+	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_CONN);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
+
+	if (conn_lowest_minor(adm_ctx.tconn) < 0) {
+		drbd_free_tconn(adm_ctx.tconn);
+		retcode = NO_ERROR;
+	} else {
+		retcode = ERR_CONN_IN_USE;
 	}
 
+out:
+	drbd_adm_finish(info, retcode);
 	return 0;
 }
 
-void drbd_nl_cleanup(void)
+void drbd_bcast_event(struct drbd_conf *mdev, const struct sib_info *sib)
 {
-	static struct cb_id cn_id_drbd;
-
-	cn_id_drbd.idx = cn_idx;
-	cn_id_drbd.val = CN_VAL_DRBD;
+	static atomic_t drbd_genl_seq = ATOMIC_INIT(2); /* two. */
+	struct sk_buff *msg;
+	struct drbd_genlmsghdr *d_out;
+	unsigned seq;
+	int err = -ENOMEM;
+
+	seq = atomic_inc_return(&drbd_genl_seq);
+	msg = genlmsg_new(NLMSG_GOODSIZE, GFP_NOIO);
+	if (!msg)
+		goto failed;
+
+	err = -EMSGSIZE;
+	d_out = genlmsg_put(msg, 0, seq, &drbd_genl_family, 0, DRBD_EVENT);
+	if (!d_out) /* cannot happen, but anyways. */
+		goto nla_put_failure;
+	d_out->minor = mdev_to_minor(mdev);
+	d_out->ret_code = 0;
+
+	pr_info("event: minor=%u, conn=%s\n", d_out->minor, mdev->tconn->name);
+
+	if (nla_put_status_info(msg, mdev, sib))
+		goto nla_put_failure;
+	genlmsg_end(msg, d_out);
+	err = drbd_genl_multicast_events(msg, 0);
+	/* msg has been consumed or freed in netlink_broadcast() */
+	if (err && err != -ESRCH)
+		goto failed;
 
-	cn_del_callback(&cn_id_drbd);
-}
+	return;
 
-void drbd_nl_send_reply(struct cn_msg *req, int ret_code)
-{
-	char buffer[sizeof(struct cn_msg)+sizeof(struct drbd_nl_cfg_reply)];
-	struct cn_msg *cn_reply = (struct cn_msg *) buffer;
-	struct drbd_nl_cfg_reply *reply =
-		(struct drbd_nl_cfg_reply *)cn_reply->data;
-	int rr;
-
-	memset(buffer, 0, sizeof(buffer));
-	cn_reply->id = req->id;
-
-	cn_reply->seq = req->seq;
-	cn_reply->ack = req->ack  + 1;
-	cn_reply->len = sizeof(struct drbd_nl_cfg_reply);
-	cn_reply->flags = 0;
-
-	reply->packet_type = P_return_code_only;
-	reply->minor = ((struct drbd_nl_cfg_req *)req->data)->drbd_minor;
-	reply->ret_code = ret_code;
-
-	rr = cn_netlink_send(cn_reply, CN_IDX_DRBD, GFP_NOIO);
-	if (rr && rr != -ESRCH)
-		printk(KERN_INFO "drbd: cn_netlink_send()=%d\n", rr);
+nla_put_failure:
+	nlmsg_free(msg);
+failed:
+	dev_err(DEV, "Error %d while broadcasting event. "
+			"Event seq:%u sib_reason:%u\n",
+			err, seq, sib->sib_reason);
 }
-
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 100d48b..34be2ef 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -971,6 +971,11 @@ static void after_state_ch(struct drbd_conf *mdev, union drbd_state os,
 	enum drbd_fencing_p fp;
 	enum drbd_req_event what = NOTHING;
 	union drbd_state nsm = (union drbd_state){ .i = -1 };
+	struct sib_info sib;
+
+	sib.sib_reason = SIB_STATE_CHANGE;
+	sib.os = os;
+	sib.ns = ns;
 
 	if (os.conn != C_CONNECTED && ns.conn == C_CONNECTED) {
 		clear_bit(CRASHED_PRIMARY, &mdev->flags);
@@ -985,7 +990,7 @@ static void after_state_ch(struct drbd_conf *mdev, union drbd_state os,
 	}
 
 	/* Inform userspace about the change... */
-	drbd_bcast_state(mdev, ns);
+	drbd_bcast_event(mdev, &sib);
 
 	if (!(os.role == R_PRIMARY && os.disk < D_UP_TO_DATE && os.pdsk < D_UP_TO_DATE) &&
 	    (ns.role == R_PRIMARY && ns.disk < D_UP_TO_DATE && ns.pdsk < D_UP_TO_DATE))
diff --git a/include/linux/drbd.h b/include/linux/drbd.h
index e192167..d28fdd8 100644
--- a/include/linux/drbd.h
+++ b/include/linux/drbd.h
@@ -51,7 +51,6 @@
 
 #endif
 
-
 extern const char *drbd_buildtag(void);
 #define REL_VERSION "8.3.11"
 #define API_VERSION 88
@@ -159,6 +158,7 @@ enum drbd_ret_code {
 	ERR_CONN_IN_USE         = 159,
 	ERR_MINOR_CONFIGURED    = 160,
 	ERR_MINOR_EXISTS	= 161,
+	ERR_INVALID_REQUEST	= 162,
 
 	/* insert new ones above this line */
 	AFTER_LAST_ERR_CODE
@@ -349,37 +349,4 @@ enum drbd_timeout_flag {
 #define DRBD_MD_INDEX_FLEX_EXT -2
 #define DRBD_MD_INDEX_FLEX_INT -3
 
-/* Start of the new netlink/connector stuff */
-
-enum drbd_ncr_flags {
-	DRBD_NL_CREATE_DEVICE = 0x01,
-	DRBD_NL_SET_DEFAULTS =  0x02,
-};
-#define DRBD_NL_OBJ_NAME_LEN 32
-
-
-/* For searching a vacant cn_idx value */
-#define CN_IDX_STEP			6977
-
-struct drbd_nl_cfg_req {
-	int packet_type;
-	union {
-		struct {
-			unsigned int drbd_minor;
-			enum drbd_ncr_flags flags;
-		};
-		struct {
-			char obj_name[DRBD_NL_OBJ_NAME_LEN];
-		};
-	};
-	unsigned short tag_list[];
-};
-
-struct drbd_nl_cfg_reply {
-	int packet_type;
-	unsigned int minor;
-	int ret_code; /* enum ret_code or set_st_err_t */
-	unsigned short tag_list[]; /* only used with get_* calls */
-};
-
 #endif
-- 
1.7.4.1



* [PATCH 10/18] drbd: allow holes in minor and volume id allocation
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (8 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 09/18] drbd: switch configuration interface " Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 11/18] drbd: remove now unused connector related files Philipp Reisner
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

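s/idr_get_new/idr_get_new_above/, so that a specific minor or volume
number can be requested even while lower numbers are still unused.

If the requested number is already taken, return ERR_MINOR_EXISTS
(minor) or ERR_INVALID_REQUEST (volume) instead of the generic
ERR_NOMEM.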

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
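A minimal userspace sketch of the allocation pattern used in
conn_new_minor() below: ask for the lowest free id at or above the
requested one, then treat any other result as "already exists" and roll
it back. All sketch_* names and the fixed-size table are hypothetical
stand-ins for illustration; this is not the kernel idr API.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_IDS 16	/* hypothetical table size, illustration only */

/* Hypothetical stand-in for an idr: slot[i] != NULL means id i is in use. */
struct sketch_ids {
	void *slot[SKETCH_MAX_IDS];
};

/* Mimics the "lowest free id >= start" contract of idr_get_new_above().
 * Returns 0 and stores the id in *got, or -1 if no such id is free. */
static int sketch_get_new_above(struct sketch_ids *ids, void *ptr,
				int start, int *got)
{
	int i;

	for (i = start; i >= 0 && i < SKETCH_MAX_IDS; i++) {
		if (!ids->slot[i]) {
			ids->slot[i] = ptr;
			*got = i;
			return 0;
		}
	}
	return -1;
}

enum sketch_rv { SKETCH_OK, SKETCH_EXISTS, SKETCH_NOMEM };

/* Claim exactly "wanted": allocate at or above it, and undo the
 * allocation if we were handed a different id. */
static enum sketch_rv sketch_claim_id(struct sketch_ids *ids, void *ptr,
				      int wanted)
{
	int got;

	if (sketch_get_new_above(ids, ptr, wanted, &got))
		return SKETCH_NOMEM;
	if (got != wanted) {
		ids->slot[got] = NULL;	/* roll back the unwanted id */
		return SKETCH_EXISTS;
	}
	return SKETCH_OK;
}
```

Claiming id 5 on an empty table succeeds and leaves ids 0..4 free (the
"holes"); claiming 5 a second time yields SKETCH_EXISTS and leaves id 6
untouched.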
 drivers/block/drbd/drbd_int.h  |    1 +
 drivers/block/drbd/drbd_main.c |   33 +++++++++++++++++----------------
 drivers/block/drbd/drbd_nl.c   |    2 +-
 3 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index acd2877..f4c3c71 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1509,6 +1509,7 @@ extern int is_valid_ar_handle(struct drbd_request *, sector_t);
 
 
 /* drbd_nl.c */
+extern int drbd_msg_put_info(const char *info);
 extern void drbd_suspend_io(struct drbd_conf *mdev);
 extern void drbd_resume_io(struct drbd_conf *mdev);
 extern char *ppsize(char *buf, unsigned long long size);
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 9b41213..64bf9b6 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2322,6 +2322,7 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor,
 	struct request_queue *q;
 	int vnr_got = vnr;
 	int minor_got = minor;
+	enum drbd_ret_code err = ERR_NOMEM;
 
 	mdev = minor_to_mdev(minor);
 	if (mdev)
@@ -2389,33 +2390,33 @@ enum drbd_ret_code conn_new_minor(struct drbd_tconn *tconn, unsigned int minor,
 	INIT_LIST_HEAD(&mdev->current_epoch->list);
 	mdev->epochs = 1;
 
-	if (!idr_pre_get(&tconn->volumes, GFP_KERNEL))
-		goto out_no_vol_idr;
-	if (idr_get_new(&tconn->volumes, mdev, &vnr_got))
-		goto out_no_vol_idr;
-	if (vnr_got != vnr) {
-		dev_err(DEV, "vnr_got (%d) != vnr (%d)\n", vnr_got, vnr);
-		goto out_idr_remove_vol;
-	}
-
 	if (!idr_pre_get(&minors, GFP_KERNEL))
 		goto out_idr_remove_vol;
-	if (idr_get_new(&minors, mdev, &minor_got))
+	if (idr_get_new_above(&minors, mdev, minor, &minor_got))
 		goto out_idr_remove_vol;
 	if (minor_got != minor) {
-		/* minor exists, or other idr strangeness? */
-		dev_err(DEV, "available minor (%d) != requested minor (%d)\n",
-				minor_got, minor);
+		err = ERR_MINOR_EXISTS;
+		drbd_msg_put_info("requested minor exists already");
 		goto out_idr_remove_minor;
 	}
+
+	if (!idr_pre_get(&tconn->volumes, GFP_KERNEL))
+		goto out_no_vol_idr;
+	if (idr_get_new_above(&tconn->volumes, mdev, vnr, &vnr_got))
+		goto out_no_vol_idr;
+	if (vnr_got != vnr) {
+		err = ERR_INVALID_REQUEST;
+		drbd_msg_put_info("requested volume exists already");
+		goto out_idr_remove_vol;
+	}
 	add_disk(disk);
 
 	return NO_ERROR;
 
-out_idr_remove_minor:
-	idr_remove(&minors, minor_got);
 out_idr_remove_vol:
 	idr_remove(&tconn->volumes, vnr_got);
+out_idr_remove_minor:
+	idr_remove(&minors, minor_got);
 	synchronize_rcu();
 out_no_vol_idr:
 	kfree(mdev->current_epoch);
@@ -2429,7 +2430,7 @@ out_no_disk:
 	blk_cleanup_queue(q);
 out_no_q:
 	kfree(mdev);
-	return ERR_NOMEM;
+	return err;
 }
 
 /* counterpart of drbd_new_device.
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index a8f27cb..b40c83d 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -133,7 +133,7 @@ static void drbd_adm_send_reply(struct sk_buff *skb, struct genl_info *info)
 
 /* Used on a fresh "drbd_adm_prepare"d reply_skb, this cannot fail: The only
  * reason it could fail was no space in skb, and there are 4k available. */
-static int drbd_msg_put_info(const char *info)
+int drbd_msg_put_info(const char *info)
 {
 	struct sk_buff *skb = adm_ctx.reply_skb;
 	struct nlattr *nla;
-- 
1.7.4.1



* [PATCH 11/18] drbd: remove now unused connector related files
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (9 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 10/18] drbd: allow holes in minor and volume id allocation Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:48 ` [PATCH 12/18] drbd: drbd_adm_get_status needs to show some more detail Philipp Reisner
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 include/linux/drbd_nl.h        |  172 ----------------------------------------
 include/linux/drbd_tag_magic.h |   84 -------------------
 2 files changed, 0 insertions(+), 256 deletions(-)
 delete mode 100644 include/linux/drbd_nl.h
 delete mode 100644 include/linux/drbd_tag_magic.h

diff --git a/include/linux/drbd_nl.h b/include/linux/drbd_nl.h
deleted file mode 100644
index 1216c7a..0000000
--- a/include/linux/drbd_nl.h
+++ /dev/null
@@ -1,172 +0,0 @@
-/*
-   PAKET( name,
-	  TYPE ( pn, pr, member )
-	  ...
-   )
-
-   You may never reissue one of the pn arguments
-*/
-
-#if !defined(NL_PACKET) || !defined(NL_STRING) || !defined(NL_INTEGER) || !defined(NL_BIT) || !defined(NL_INT64)
-#error "The macros NL_PACKET, NL_STRING, NL_INTEGER, NL_INT64 and NL_BIT needs to be defined"
-#endif
-
-NL_PACKET(primary, 1,
-       NL_BIT(		1,	T_MAY_IGNORE,	primary_force)
-)
-
-NL_PACKET(secondary, 2, )
-
-NL_PACKET(disk_conf, 3,
-	NL_INT64(	2,	T_MAY_IGNORE,	disk_size)
-	NL_STRING(	3,	T_MANDATORY,	backing_dev,	128)
-	NL_STRING(	4,	T_MANDATORY,	meta_dev,	128)
-	NL_INTEGER(	5,	T_MANDATORY,	meta_dev_idx)
-	NL_INTEGER(	6,	T_MAY_IGNORE,	on_io_error)
-	NL_INTEGER(	7,	T_MAY_IGNORE,	fencing)
-	NL_BIT(		37,	T_MAY_IGNORE,	use_bmbv)
-	NL_BIT(		53,	T_MAY_IGNORE,	no_disk_flush)
-	NL_BIT(		54,	T_MAY_IGNORE,	no_md_flush)
-	  /*  55 max_bio_size was available in 8.2.6rc2 */
-	NL_INTEGER(	56,	T_MAY_IGNORE,	max_bio_bvecs)
-	NL_BIT(		57,	T_MAY_IGNORE,	no_disk_barrier)
-	NL_BIT(		58,	T_MAY_IGNORE,	no_disk_drain)
-)
-
-NL_PACKET(detach, 4, )
-
-NL_PACKET(net_conf, 5,
-	NL_STRING(	8,	T_MANDATORY,	my_addr,	128)
-	NL_STRING(	9,	T_MANDATORY,	peer_addr,	128)
-	NL_STRING(	10,	T_MAY_IGNORE,	shared_secret,	SHARED_SECRET_MAX)
-	NL_STRING(	11,	T_MAY_IGNORE,	cram_hmac_alg,	SHARED_SECRET_MAX)
-	NL_STRING(	44,	T_MAY_IGNORE,	integrity_alg,	SHARED_SECRET_MAX)
-	NL_INTEGER(	14,	T_MAY_IGNORE,	timeout)
-	NL_INTEGER(	15,	T_MANDATORY,	wire_protocol)
-	NL_INTEGER(	16,	T_MAY_IGNORE,	try_connect_int)
-	NL_INTEGER(	17,	T_MAY_IGNORE,	ping_int)
-	NL_INTEGER(	18,	T_MAY_IGNORE,	max_epoch_size)
-	NL_INTEGER(	19,	T_MAY_IGNORE,	max_buffers)
-	NL_INTEGER(	20,	T_MAY_IGNORE,	unplug_watermark)
-	NL_INTEGER(	21,	T_MAY_IGNORE,	sndbuf_size)
-	NL_INTEGER(	22,	T_MAY_IGNORE,	ko_count)
-	NL_INTEGER(	24,	T_MAY_IGNORE,	after_sb_0p)
-	NL_INTEGER(	25,	T_MAY_IGNORE,	after_sb_1p)
-	NL_INTEGER(	26,	T_MAY_IGNORE,	after_sb_2p)
-	NL_INTEGER(	39,	T_MAY_IGNORE,	rr_conflict)
-	NL_INTEGER(	40,	T_MAY_IGNORE,	ping_timeo)
-	NL_INTEGER(	67,	T_MAY_IGNORE,	rcvbuf_size)
-	NL_INTEGER(	81,	T_MAY_IGNORE,	on_congestion)
-	NL_INTEGER(	82,	T_MAY_IGNORE,	cong_fill)
-	NL_INTEGER(	83,	T_MAY_IGNORE,	cong_extents)
-	  /* 59 addr_family was available in GIT, never released */
-	NL_BIT(		60,	T_MANDATORY,	mind_af)
-	NL_BIT(		27,	T_MAY_IGNORE,	want_lose)
-	NL_BIT(		28,	T_MAY_IGNORE,	two_primaries)
-	NL_BIT(		41,	T_MAY_IGNORE,	always_asbp)
-	NL_BIT(		61,	T_MAY_IGNORE,	no_cork)
-	NL_BIT(		62,	T_MANDATORY,	auto_sndbuf_size)
-	NL_BIT(		70,	T_MANDATORY,	dry_run)
-)
-
-NL_PACKET(disconnect, 6,
-	NL_BIT(		84,	T_MAY_IGNORE,	force)
-)
-
-NL_PACKET(resize, 7,
-	NL_INT64(		29,	T_MAY_IGNORE,	resize_size)
-	NL_BIT(			68,	T_MAY_IGNORE,	resize_force)
-	NL_BIT(			69,	T_MANDATORY,	no_resync)
-)
-
-NL_PACKET(syncer_conf, 8,
-	NL_INTEGER(	30,	T_MAY_IGNORE,	rate)
-	NL_INTEGER(	31,	T_MAY_IGNORE,	after)
-	NL_INTEGER(	32,	T_MAY_IGNORE,	al_extents)
-/*	NL_INTEGER(     71,	T_MAY_IGNORE,	dp_volume)
- *	NL_INTEGER(     72,	T_MAY_IGNORE,	dp_interval)
- *	NL_INTEGER(     73,	T_MAY_IGNORE,	throttle_th)
- *	NL_INTEGER(     74,	T_MAY_IGNORE,	hold_off_th)
- * feature will be reimplemented differently with 8.3.9 */
-	NL_STRING(      52,     T_MAY_IGNORE,   verify_alg,     SHARED_SECRET_MAX)
-	NL_STRING(      51,     T_MAY_IGNORE,   cpu_mask,       32)
-	NL_STRING(	64,	T_MAY_IGNORE,	csums_alg,	SHARED_SECRET_MAX)
-	NL_BIT(         65,     T_MAY_IGNORE,   use_rle)
-	NL_INTEGER(	75,	T_MAY_IGNORE,	on_no_data)
-	NL_INTEGER(	76,	T_MAY_IGNORE,	c_plan_ahead)
-	NL_INTEGER(     77,	T_MAY_IGNORE,	c_delay_target)
-	NL_INTEGER(     78,	T_MAY_IGNORE,	c_fill_target)
-	NL_INTEGER(     79,	T_MAY_IGNORE,	c_max_rate)
-	NL_INTEGER(     80,	T_MAY_IGNORE,	c_min_rate)
-)
-
-NL_PACKET(invalidate, 9, )
-NL_PACKET(invalidate_peer, 10, )
-NL_PACKET(pause_sync, 11, )
-NL_PACKET(resume_sync, 12, )
-NL_PACKET(suspend_io, 13, )
-NL_PACKET(resume_io, 14, )
-NL_PACKET(outdate, 15, )
-NL_PACKET(get_config, 16, )
-NL_PACKET(get_state, 17,
-	NL_INTEGER(	33,	T_MAY_IGNORE,	state_i)
-)
-
-NL_PACKET(get_uuids, 18,
-	NL_STRING(	34,	T_MAY_IGNORE,	uuids,	(UI_SIZE*sizeof(__u64)))
-	NL_INTEGER(	35,	T_MAY_IGNORE,	uuids_flags)
-)
-
-NL_PACKET(get_timeout_flag, 19,
-	NL_BIT(		36,	T_MAY_IGNORE,	use_degraded)
-)
-
-NL_PACKET(call_helper, 20,
-	NL_STRING(	38,	T_MAY_IGNORE,	helper,		32)
-)
-
-/* Tag nr 42 already allocated in drbd-8.1 development. */
-
-NL_PACKET(sync_progress, 23,
-	NL_INTEGER(	43,	T_MAY_IGNORE,	sync_progress)
-)
-
-NL_PACKET(dump_ee, 24,
-	NL_STRING(	45,	T_MAY_IGNORE,	dump_ee_reason, 32)
-	NL_STRING(	46,	T_MAY_IGNORE,	seen_digest, SHARED_SECRET_MAX)
-	NL_STRING(	47,	T_MAY_IGNORE,	calc_digest, SHARED_SECRET_MAX)
-	NL_INT64(	48,	T_MAY_IGNORE,	ee_sector)
-	NL_INT64(	49,	T_MAY_IGNORE,	ee_block_id)
-	NL_STRING(	50,	T_MAY_IGNORE,	ee_data,	32 << 10)
-)
-
-NL_PACKET(start_ov, 25,
-	NL_INT64(	66,	T_MAY_IGNORE,	start_sector)
-)
-
-NL_PACKET(new_c_uuid, 26,
-       NL_BIT(		63,	T_MANDATORY,	clear_bm)
-)
-
-#ifdef NL_RESPONSE
-NL_RESPONSE(return_code_only, 27)
-#endif
-
-NL_PACKET(new_connection, 28, /* CHT_CTOR */
-	NL_STRING(	85,	T_MANDATORY,	name, DRBD_NL_OBJ_NAME_LEN)
-)
-
-NL_PACKET(new_minor, 29, /* CHT_CONN */
-	NL_INTEGER(	86,	T_MANDATORY,	minor)
-	NL_INTEGER(	87,	T_MANDATORY,	vol_nr)
-)
-
-NL_PACKET(del_minor, 30, ) /* CHT_MINOR */
-NL_PACKET(del_connection, 31, ) /* CHT_CONN */
-
-#undef NL_PACKET
-#undef NL_INTEGER
-#undef NL_INT64
-#undef NL_BIT
-#undef NL_STRING
-#undef NL_RESPONSE
diff --git a/include/linux/drbd_tag_magic.h b/include/linux/drbd_tag_magic.h
deleted file mode 100644
index 0695431..0000000
--- a/include/linux/drbd_tag_magic.h
+++ /dev/null
@@ -1,84 +0,0 @@
-#ifndef DRBD_TAG_MAGIC_H
-#define DRBD_TAG_MAGIC_H
-
-#define TT_END     0
-#define TT_REMOVED 0xE000
-
-/* declare packet_type enums */
-enum packet_types {
-#define NL_PACKET(name, number, fields) P_ ## name = number,
-#define NL_RESPONSE(name, number) P_ ## name = number,
-#define NL_INTEGER(pn, pr, member)
-#define NL_INT64(pn, pr, member)
-#define NL_BIT(pn, pr, member)
-#define NL_STRING(pn, pr, member, len)
-#include "drbd_nl.h"
-	P_nl_after_last_packet,
-};
-
-/* These struct are used to deduce the size of the tag lists: */
-#define NL_PACKET(name, number, fields)	\
-	struct name ## _tag_len_struct { fields };
-#define NL_INTEGER(pn, pr, member)		\
-	int member; int tag_and_len ## member;
-#define NL_INT64(pn, pr, member)		\
-	__u64 member; int tag_and_len ## member;
-#define NL_BIT(pn, pr, member)		\
-	unsigned char member:1; int tag_and_len ## member;
-#define NL_STRING(pn, pr, member, len)	\
-	unsigned char member[len]; int member ## _len; \
-	int tag_and_len ## member;
-#include "linux/drbd_nl.h"
-
-/* declare tag-list-sizes */
-static const int tag_list_sizes[] = {
-#define NL_PACKET(name, number, fields) 2 fields ,
-#define NL_INTEGER(pn, pr, member)      + 4 + 4
-#define NL_INT64(pn, pr, member)        + 4 + 8
-#define NL_BIT(pn, pr, member)          + 4 + 1
-#define NL_STRING(pn, pr, member, len)  + 4 + (len)
-#include "drbd_nl.h"
-};
-
-/* The two highest bits are used for the tag type */
-#define TT_MASK      0xC000
-#define TT_INTEGER   0x0000
-#define TT_INT64     0x4000
-#define TT_BIT       0x8000
-#define TT_STRING    0xC000
-/* The next bit indicates if processing of the tag is mandatory */
-#define T_MANDATORY  0x2000
-#define T_MAY_IGNORE 0x0000
-#define TN_MASK      0x1fff
-/* The remaining 13 bits are used to enumerate the tags */
-
-#define tag_type(T)   ((T) & TT_MASK)
-#define tag_number(T) ((T) & TN_MASK)
-
-/* declare tag enums */
-#define NL_PACKET(name, number, fields) fields
-enum drbd_tags {
-#define NL_INTEGER(pn, pr, member)     T_ ## member = pn | TT_INTEGER | pr ,
-#define NL_INT64(pn, pr, member)       T_ ## member = pn | TT_INT64   | pr ,
-#define NL_BIT(pn, pr, member)         T_ ## member = pn | TT_BIT     | pr ,
-#define NL_STRING(pn, pr, member, len) T_ ## member = pn | TT_STRING  | pr ,
-#include "drbd_nl.h"
-};
-
-struct tag {
-	const char *name;
-	int type_n_flags;
-	int max_len;
-};
-
-/* declare tag names */
-#define NL_PACKET(name, number, fields) fields
-static const struct tag tag_descriptions[] = {
-#define NL_INTEGER(pn, pr, member)     [ pn ] = { #member, TT_INTEGER | pr, sizeof(int)   },
-#define NL_INT64(pn, pr, member)       [ pn ] = { #member, TT_INT64   | pr, sizeof(__u64) },
-#define NL_BIT(pn, pr, member)         [ pn ] = { #member, TT_BIT     | pr, sizeof(int)   },
-#define NL_STRING(pn, pr, member, len) [ pn ] = { #member, TT_STRING  | pr, (len)         },
-#include "drbd_nl.h"
-};
-
-#endif
-- 
1.7.4.1



* [PATCH 12/18] drbd: drbd_adm_get_status needs to show some more detail
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (10 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 11/18] drbd: remove now unused connector related files Philipp Reisner
@ 2011-09-01 12:48 ` Philipp Reisner
  2011-09-01 12:49 ` [PATCH 13/18] drbd: simplify conn_all_vols_unconf, make it bool Philipp Reisner
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:48 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

We want to see existing connection objects, even if they do not
currently have volumes attached.

Change the .dumpit variant of drbd_adm_get_status to iterate not over
minor devices, but over connections + volumes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
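A rough userspace sketch of the deferred iteration described above: one
item is emitted per call, the position is saved between calls, and the
saved connection pointer is only trusted after it has been found on the
list again. The sketch_* types are hypothetical stand-ins for drbd_tconn
and its volumes idr, locking is left out, and an explicit "emitted" flag
replaces the patch's "volume != 0" check for simplicity.

```c
#include <stddef.h>

#define SKETCH_MAX_VOLS 4	/* hypothetical limit, illustration only */

/* Hypothetical stand-ins for drbd_tconn and its volumes idr. */
struct sketch_conn {
	struct sketch_conn *next;		/* singly linked connection list */
	const char *name;
	int vol_present[SKETCH_MAX_VOLS];	/* non-zero: volume exists */
};

/* State carried between calls, like cb->args[] in a netlink dump. */
struct sketch_cursor {
	struct sketch_conn *pos;	/* NULL: start at the list head */
	int volume;			/* next volume number to look at */
	int emitted;			/* emitted anything for pos yet? */
	int done;			/* dump finished or position lost */
};

struct sketch_item {
	const char *name;
	int volume;			/* -1: connection without volumes */
};

/* Emit at most one item per call.  Returns 1 if *item was filled,
 * 0 once the dump is complete. */
static int sketch_dump_next(struct sketch_conn *head,
			    struct sketch_cursor *cur,
			    struct sketch_item *item)
{
	struct sketch_conn *conn;
	int v, emitted;

	while (!cur->done) {
		/* Revalidate: trust cur->pos only if it is still listed. */
		for (conn = head; conn; conn = conn->next)
			if (!cur->pos || conn == cur->pos)
				break;
		if (!conn) {		/* empty list, or pos was deleted */
			cur->done = 1;
			break;
		}
		cur->pos = conn;

		for (v = cur->volume; v < SKETCH_MAX_VOLS; v++)
			if (conn->vol_present[v])
				break;
		if (v < SKETCH_MAX_VOLS) {
			item->name = conn->name;
			item->volume = v;
			cur->volume = v + 1;
			cur->emitted = 1;
			return 1;
		}

		/* No more volumes here: advance to the next connection. */
		emitted = cur->emitted;
		if (conn->next)
			cur->pos = conn->next;
		else
			cur->done = 1;
		cur->volume = 0;
		cur->emitted = 0;
		if (!emitted) {		/* connection without any volume */
			item->name = conn->name;
			item->volume = -1;
			return 1;
		}
	}
	return 0;
}
```

As in the patch, a connection without volumes still produces one item,
and a saved position that has been unlinked in the meantime simply ends
the dump instead of being dereferenced.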
 drivers/block/drbd/drbd_int.h  |    3 +-
 drivers/block/drbd/drbd_main.c |   15 +++---
 drivers/block/drbd/drbd_nl.c   |  117 +++++++++++++++++++++++++++++++++-------
 3 files changed, 107 insertions(+), 28 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index f4c3c71..d84a073 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -171,6 +171,7 @@ drbd_insert_fault(struct drbd_conf *mdev, unsigned int type) {
 extern struct ratelimit_state drbd_ratelimit_state;
 extern struct idr minors;
 extern struct list_head drbd_tconns;
+extern struct mutex drbd_cfg_mutex;
 
 /* on the wire */
 enum drbd_packet {
@@ -918,7 +919,7 @@ enum {
 
 struct drbd_tconn {			/* is a resource from the config file */
 	char *name;			/* Resource name */
-	struct list_head all_tconn;	/* List of all drbd_tconn, prot by global_state_lock */
+	struct list_head all_tconn;	/* linked on global drbd_tconns */
 	struct idr volumes;		/* <tconn, vnr> to mdev mapping */
 	enum drbd_conns cstate;		/* Only C_STANDALONE to C_WF_REPORT_PARAMS */
 	struct mutex cstate_mutex;	/* Protects graceful disconnects */
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 64bf9b6..2e79032 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -120,6 +120,7 @@ module_param_string(usermode_helper, usermode_helper, sizeof(usermode_helper), 0
  */
 struct idr minors;
 struct list_head drbd_tconns;  /* list of struct drbd_tconn */
+DEFINE_MUTEX(drbd_cfg_mutex);
 
 struct kmem_cache *drbd_request_cache;
 struct kmem_cache *drbd_ee_cache;	/* peer requests */
@@ -2238,14 +2239,14 @@ struct drbd_tconn *conn_by_name(const char *name)
 	if (!name || !name[0])
 		return NULL;
 
-	write_lock_irq(&global_state_lock);
+	mutex_lock(&drbd_cfg_mutex);
 	list_for_each_entry(tconn, &drbd_tconns, all_tconn) {
 		if (!strcmp(tconn->name, name))
 			goto found;
 	}
 	tconn = NULL;
 found:
-	write_unlock_irq(&global_state_lock);
+	mutex_unlock(&drbd_cfg_mutex);
 	return tconn;
 }
 
@@ -2285,9 +2286,9 @@ struct drbd_tconn *drbd_new_tconn(const char *name)
 	drbd_thread_init(tconn, &tconn->worker, drbd_worker, "worker");
 	drbd_thread_init(tconn, &tconn->asender, drbd_asender, "asender");
 
-	write_lock_irq(&global_state_lock);
-	list_add(&tconn->all_tconn, &drbd_tconns);
-	write_unlock_irq(&global_state_lock);
+	mutex_lock(&drbd_cfg_mutex);
+	list_add_tail(&tconn->all_tconn, &drbd_tconns);
+	mutex_unlock(&drbd_cfg_mutex);
 
 	return tconn;
 
@@ -2302,9 +2303,9 @@ fail:
 
 void drbd_free_tconn(struct drbd_tconn *tconn)
 {
-	write_lock_irq(&global_state_lock);
+	mutex_lock(&drbd_cfg_mutex);
 	list_del(&tconn->all_tconn);
-	write_unlock_irq(&global_state_lock);
+	mutex_unlock(&drbd_cfg_mutex);
 	idr_destroy(&tconn->volumes);
 
 	free_cpumask_var(tconn->cpu_mask);
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index b40c83d..c389995 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1577,6 +1577,10 @@ int drbd_adm_connect(struct sk_buff *skb, struct genl_info *info)
 
 	new_my_addr = (struct sockaddr *)&new_conf->my_addr;
 	new_peer_addr = (struct sockaddr *)&new_conf->peer_addr;
+
+	/* No need to take drbd_cfg_mutex here.  All reconfiguration is
+	 * strictly serialized on genl_lock(). We are protected against
+	 * concurrent reconfiguration/addition/deletion */
 	list_for_each_entry(oconn, &drbd_tconns, all_tconn) {
 		if (oconn == tconn)
 			continue;
@@ -2220,6 +2224,24 @@ int drbd_adm_outdate(struct sk_buff *skb, struct genl_info *info)
 	return drbd_adm_simple_request_state(skb, info, NS(disk, D_OUTDATED));
 }
 
+int nla_put_drbd_cfg_context(struct sk_buff *skb, const char *conn_name, unsigned vnr)
+{
+	struct nlattr *nla;
+	nla = nla_nest_start(skb, DRBD_NLA_CFG_CONTEXT);
+	if (!nla)
+		goto nla_put_failure;
+	if (vnr != VOLUME_UNSPECIFIED)
+		NLA_PUT_U32(skb, T_ctx_volume, vnr);
+	NLA_PUT_STRING(skb, T_ctx_conn_name, conn_name);
+	nla_nest_end(skb, nla);
+	return 0;
+
+nla_put_failure:
+	if (nla)
+		nla_nest_cancel(skb, nla);
+	return -EMSGSIZE;
+}
+
 int nla_put_status_info(struct sk_buff *skb, struct drbd_conf *mdev,
 		const struct sib_info *sib)
 {
@@ -2248,12 +2270,8 @@ int nla_put_status_info(struct sk_buff *skb, struct drbd_conf *mdev,
 
 	/* We need to add connection name and volume number information still.
 	 * Minor number is in drbd_genlmsghdr. */
-	nla = nla_nest_start(skb, DRBD_NLA_CFG_CONTEXT);
-	if (!nla)
+	if (nla_put_drbd_cfg_context(skb, mdev->tconn->name, mdev->vnr))
 		goto nla_put_failure;
-	NLA_PUT_U32(skb, T_ctx_volume, mdev->vnr);
-	NLA_PUT_STRING(skb, T_ctx_conn_name, mdev->tconn->name);
-	nla_nest_end(skb, nla);
 
 	if (got_ldev)
 		if (disk_conf_to_skb(skb, &mdev->ldev->dc, exclude_sensitive))
@@ -2340,43 +2358,102 @@ int drbd_adm_get_status_all(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct drbd_conf *mdev;
 	struct drbd_genlmsghdr *dh;
-	int minor = cb->args[0];
-
-	/* Open coded deferred single idr_for_each_entry iteration.
+	struct drbd_tconn *pos = (struct drbd_tconn*)cb->args[0];
+	struct drbd_tconn *tconn = NULL;
+	struct drbd_tconn *tmp;
+	unsigned volume = cb->args[1];
+
+	/* Open coded, deferred, iteration:
+	 * list_for_each_entry_safe(tconn, tmp, &drbd_tconns, all_tconn) {
+	 *	idr_for_each_entry(&tconn->volumes, mdev, i) {
+	 *	  ...
+	 *	}
+	 * }
+	 * where tconn is cb->args[0];
+	 * and i is cb->args[1];
+	 *
 	 * This may miss entries inserted after this dump started,
 	 * or entries deleted before they are reached.
-	 * But we need to make sure the mdev won't disappear while
-	 * we are looking at it. */
+	 *
+	 * We need to make sure the mdev won't disappear while
+	 * we are looking at it, and revalidate our iterators
+	 * on each iteration.
+	 */
 
+	/* synchronize with drbd_new_tconn/drbd_free_tconn */
+	mutex_lock(&drbd_cfg_mutex);
+	/* synchronize with drbd_delete_device */
 	rcu_read_lock();
-	mdev = idr_get_next(&minors, &minor);
-	if (mdev) {
+next_tconn:
+	/* revalidate iterator position */
+	list_for_each_entry(tmp, &drbd_tconns, all_tconn) {
+		if (pos == NULL) {
+			/* first iteration */
+			pos = tmp;
+			tconn = pos;
+			break;
+		}
+		if (tmp == pos) {
+			tconn = pos;
+			break;
+		}
+	}
+	if (tconn) {
+		mdev = idr_get_next(&tconn->volumes, &volume);
+		if (!mdev) {
+			/* No more volumes to dump on this tconn.
+			 * Advance tconn iterator. */
+			pos = list_entry(tconn->all_tconn.next,
+					struct drbd_tconn, all_tconn);
+			/* But, did we dump any volume on this tconn yet? */
+			if (volume != 0) {
+				tconn = NULL;
+				volume = 0;
+				goto next_tconn;
+			}
+		}
+
 		dh = genlmsg_put(skb, NETLINK_CB(cb->skb).pid,
 				cb->nlh->nlmsg_seq, &drbd_genl_family,
 				NLM_F_MULTI, DRBD_ADM_GET_STATUS);
 		if (!dh)
-			goto errout;
+			goto out;
+
+		if (!mdev) {
+			/* this is a tconn without a single volume */
+			dh->minor = -1U;
+			dh->ret_code = NO_ERROR;
+			if (nla_put_drbd_cfg_context(skb, tconn->name, VOLUME_UNSPECIFIED))
+				genlmsg_cancel(skb, dh);
+			else
+				genlmsg_end(skb, dh);
+			goto out;
+		}
 
-		D_ASSERT(mdev->minor == minor);
+		D_ASSERT(mdev->vnr == volume);
+		D_ASSERT(mdev->tconn == tconn);
 
-		dh->minor = minor;
+		dh->minor = mdev_to_minor(mdev);
 		dh->ret_code = NO_ERROR;
 
 		pr_info("dump: minor=%u, conn=%s[%u]\n",
 			dh->minor, mdev->tconn->name, mdev->vnr);
 		if (nla_put_status_info(skb, mdev, NULL)) {
 			genlmsg_cancel(skb, dh);
-			goto errout;
+			goto out;
 		}
 		genlmsg_end(skb, dh);
         }
 
-errout:
+out:
 	rcu_read_unlock();
-	/* where to start idr_get_next with the next iteration */
-        cb->args[0] = minor+1;
+	mutex_unlock(&drbd_cfg_mutex);
+	/* where to start the next iteration */
+        cb->args[0] = (long)pos;
+        cb->args[1] = (pos == tconn) ? volume + 1 : 0;
 
-	/* No more minors found: empty skb. Which will terminate the dump. */
+	/* No more tconns/volumes/minors found results in an empty skb.
+	 * Which will terminate the dump. */
         return skb->len;
 }
 
-- 
1.7.4.1



* [PATCH 13/18] drbd: simplify conn_all_vols_unconf, make it bool
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (11 preceding siblings ...)
  2011-09-01 12:48 ` [PATCH 12/18] drbd: drbd_adm_get_status needs to show some more detail Philipp Reisner
@ 2011-09-01 12:49 ` Philipp Reisner
  2011-09-01 12:49 ` [PATCH 14/18] drbd: Allow a Diskless Secondary volume to be removed Philipp Reisner
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:49 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Get rid of a temporary variable and the funny bit-and assignment.
Just short-circuit, returning false, once we encounter the first
still-configured volume.

FIXME verify call sites for need of rcu_read_lock or stronger.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
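A small userspace sketch of the short-circuit form, assuming simplified
stand-in types. The sketch_* names and the reduced enums are hypothetical
and only mirror the three state fields tested in the diff below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the DRBD state enums. */
enum sketch_disk { SK_DISKLESS, SK_UP_TO_DATE };
enum sketch_conn_state { SK_STANDALONE, SK_CONNECTED };
enum sketch_role { SK_SECONDARY, SK_PRIMARY };

struct sketch_vol {
	enum sketch_disk disk;
	enum sketch_conn_state conn;
	enum sketch_role role;
};

/* True if every volume is Diskless, StandAlone and Secondary.
 * Returns on the first volume that is still configured, instead of
 * accumulating the result with "uncfg &= ..." and breaking later. */
static bool sketch_all_vols_unconf(const struct sketch_vol *vols, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (vols[i].disk != SK_DISKLESS ||
		    vols[i].conn != SK_STANDALONE ||
		    vols[i].role != SK_SECONDARY)
			return false;
	}
	return true;
}
```

An empty set of volumes counts as unconfigured, which matches the old
behaviour of starting with uncfg = 1.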
 drivers/block/drbd/drbd_state.c |   16 +++++++---------
 drivers/block/drbd/drbd_state.h |    2 +-
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 34be2ef..c6a659b 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -48,20 +48,18 @@ static enum drbd_state_rv is_valid_transition(union drbd_state os, union drbd_st
 static union drbd_state sanitize_state(struct drbd_conf *mdev, union drbd_state ns,
 				       const char **warn_sync_abort);
 
-int conn_all_vols_unconf(struct drbd_tconn *tconn)
+bool conn_all_vols_unconf(struct drbd_tconn *tconn)
 {
 	struct drbd_conf *mdev;
-	int minor, uncfg = 1;
+	int minor;
 
 	idr_for_each_entry(&tconn->volumes, mdev, minor) {
-		uncfg &= (mdev->state.disk == D_DISKLESS &&
-			  mdev->state.conn == C_STANDALONE &&
-			  mdev->state.role == R_SECONDARY);
-		if (!uncfg)
-			break;
+		if (mdev->state.disk != D_DISKLESS ||
+		    mdev->state.conn != C_STANDALONE ||
+		    mdev->state.role != R_SECONDARY)
+			return false;
 	}
-
-	return uncfg;
+	return true;
 }
 
 /**
diff --git a/drivers/block/drbd/drbd_state.h b/drivers/block/drbd/drbd_state.h
index d9536cd..55df072 100644
--- a/drivers/block/drbd/drbd_state.h
+++ b/drivers/block/drbd/drbd_state.h
@@ -91,7 +91,7 @@ conn_request_state(struct drbd_tconn *tconn, union drbd_state mask, union drbd_s
 		   enum chg_state_flags flags);
 
 extern void drbd_resume_al(struct drbd_conf *mdev);
-extern int conn_all_vols_unconf(struct drbd_tconn *tconn);
+extern bool conn_all_vols_unconf(struct drbd_tconn *tconn);
 
 /**
  * drbd_request_state() - Reqest a state change
-- 
1.7.4.1



* [PATCH 14/18] drbd: Allow a Diskless Secondary volume to be removed
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (12 preceding siblings ...)
  2011-09-01 12:49 ` [PATCH 13/18] drbd: simplify conn_all_vols_unconf, make it bool Philipp Reisner
@ 2011-09-01 12:49 ` Philipp Reisner
  2011-09-01 12:49 ` [PATCH 15/18] drbd: new-connection and new-minor succeed, if the object already exists Philipp Reisner
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:49 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Allow this even if the connection is still established.
We should be able to remove a volume from a replication group
without taking the whole group offline.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_nl.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index c389995..740649b 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -248,6 +248,8 @@ static int drbd_adm_prepare(struct sk_buff *skb, struct genl_info *info,
 		drbd_msg_put_info("over-determined configuration context mismatch");
 		return ERR_INVALID_REQUEST;
 	}
+	if (adm_ctx.mdev && !adm_ctx.tconn)
+		adm_ctx.tconn = adm_ctx.mdev->tconn;
 	return NO_ERROR;
 
 fail:
@@ -2676,10 +2678,15 @@ int drbd_adm_delete_minor(struct sk_buff *skb, struct genl_info *info)
 
 	mdev = adm_ctx.mdev;
 	if (mdev->state.disk == D_DISKLESS &&
-	    mdev->state.conn == C_STANDALONE &&
+	    /* no need to be mdev->state.conn == C_STANDALONE &&
+	     * we may want to delete a minor from a live replication group.
+	     */
 	    mdev->state.role == R_SECONDARY) {
 		drbd_delete_device(mdev_to_minor(mdev));
 		retcode = NO_ERROR;
+		/* if this was the last volume of this connection,
+		 * this will terminate all threads */
+		conn_reconfig_done(adm_ctx.tconn);
 	} else
 		retcode = ERR_MINOR_CONFIGURED;
 out:
-- 
1.7.4.1



* [PATCH 15/18] drbd: new-connection and new-minor succeed, if the object already exists
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (13 preceding siblings ...)
  2011-09-01 12:49 ` [PATCH 14/18] drbd: Allow a Diskless Secondary volume to be removed Philipp Reisner
@ 2011-09-01 12:49 ` Philipp Reisner
  2011-09-01 12:49 ` [PATCH 16/18] drbd: bail out if a config requrest is over-determined, and not matching Philipp Reisner
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:49 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Follow O_CREAT semantics when creating connection or minor device/volume
objects.  If we need O_CREAT|O_EXCL semantics some time down the road,
we can add NLM_F_EXCL to the netlink message flags.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_nl.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 740649b..e89d108 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -2625,8 +2625,11 @@ int drbd_adm_create_connection(struct sk_buff *skb, struct genl_info *info)
 		goto out;
 
 	if (adm_ctx.tconn) {
-		retcode = ERR_INVALID_REQUEST;
-		drbd_msg_put_info("connection exists");
+		if (info->nlhdr->nlmsg_flags & NLM_F_EXCL) {
+			retcode = ERR_INVALID_REQUEST;
+			drbd_msg_put_info("connection exists");
+		}
+		/* else: still NO_ERROR */
 		goto out;
 	}
 
@@ -2659,6 +2662,15 @@ int drbd_adm_add_minor(struct sk_buff *skb, struct genl_info *info)
 		return ERR_INVALID_REQUEST;
 	}
 
+	/* drbd_adm_prepare made sure already
+	 * that mdev->tconn and mdev->vnr match the request. */
+	if (adm_ctx.mdev) {
+		if (info->nlhdr->nlmsg_flags & NLM_F_EXCL)
+			retcode = ERR_MINOR_EXISTS;
+		/* else: still NO_ERROR */
+		goto out;
+	}
+
 	retcode = conn_new_minor(adm_ctx.tconn, dh->minor, adm_ctx.volume);
 out:
 	drbd_adm_finish(info, retcode);
-- 
1.7.4.1



* [PATCH 16/18] drbd: bail out if a config requrest is over-determined, and not matching
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (14 preceding siblings ...)
  2011-09-01 12:49 ` [PATCH 15/18] drbd: new-connection and new-minor succeed, if the object already exists Philipp Reisner
@ 2011-09-01 12:49 ` Philipp Reisner
  2011-09-01 12:49 ` [PATCH 17/18] drbd: add forgotten spin_unlock Philipp Reisner
  2011-09-01 12:49 ` [PATCH 18/18] drbd: introduce in-kernel "down" command Philipp Reisner
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:49 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

We have resources (that is, connections), volumes, and minor numbers.
A config request may specify all three of them.
If it turns out that the minor belongs to a different connection, or a
different volume number in the same connection, that configuration
request is invalid.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_nl.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index e89d108..773946d 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -232,20 +232,20 @@ static int drbd_adm_prepare(struct sk_buff *skb, struct genl_info *info,
 	}
 
 	/* some more paranoia, if the request was over-determined */
+	if (adm_ctx.mdev && adm_ctx.tconn &&
+	    adm_ctx.mdev->tconn != adm_ctx.tconn) {
+		pr_warning("request: minor=%u, conn=%s; but that minor belongs to connection %s\n",
+				adm_ctx.minor, adm_ctx.conn_name, adm_ctx.mdev->tconn->name);
+		drbd_msg_put_info("minor exists in different connection");
+		return ERR_INVALID_REQUEST;
+	}
 	if (adm_ctx.mdev &&
 	    adm_ctx.volume != VOLUME_UNSPECIFIED &&
 	    adm_ctx.volume != adm_ctx.mdev->vnr) {
 		pr_warning("request: minor=%u, volume=%u; but that minor is volume %u in %s\n",
 				adm_ctx.minor, adm_ctx.volume,
 				adm_ctx.mdev->vnr, adm_ctx.mdev->tconn->name);
-		drbd_msg_put_info("over-determined configuration context mismatch");
-		return ERR_INVALID_REQUEST;
-	}
-	if (adm_ctx.mdev && adm_ctx.tconn &&
-	    adm_ctx.mdev->tconn != adm_ctx.tconn) {
-		pr_warning("request: minor=%u, conn=%s; but that minor belongs to connection %s\n",
-				adm_ctx.minor, adm_ctx.conn_name, adm_ctx.mdev->tconn->name);
-		drbd_msg_put_info("over-determined configuration context mismatch");
+		drbd_msg_put_info("minor exists as different volume");
 		return ERR_INVALID_REQUEST;
 	}
 	if (adm_ctx.mdev && !adm_ctx.tconn)
-- 
1.7.4.1
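
The two consistency checks that patch 16 reorders can be sketched in userspace C as below. The `demo_*` structures are hypothetical simplifications of `adm_ctx`, `struct drbd_conf` and `struct drbd_tconn`; only the order and shape of the comparisons follow the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_VOLUME_UNSPECIFIED (~0U)	/* hypothetical sentinel value */

enum demo_verdict { DEMO_OK, DEMO_WRONG_CONN, DEMO_WRONG_VOLUME };

struct demo_tconn { const char *name; };
struct demo_mdev { const struct demo_tconn *tconn; unsigned vnr; };

/* What the request resolved to; NULL means "not given or not found". */
struct demo_ctx {
	const struct demo_mdev *mdev;
	const struct demo_tconn *tconn;
	unsigned volume;
};

static enum demo_verdict demo_check_ctx(const struct demo_ctx *ctx)
{
	assert(ctx != NULL);
	/* Connection mismatch is checked first, as in the patch. */
	if (ctx->mdev && ctx->tconn && ctx->mdev->tconn != ctx->tconn)
		return DEMO_WRONG_CONN;
	/* Then a volume number that disagrees with the minor. */
	if (ctx->mdev && ctx->volume != DEMO_VOLUME_UNSPECIFIED &&
	    ctx->volume != ctx->mdev->vnr)
		return DEMO_WRONG_VOLUME;
	return DEMO_OK;
}
```

Distinct verdicts mirror the patch's move from one generic "over-determined configuration context mismatch" message to two specific ones.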



* [PATCH 17/18] drbd: add forgotten spin_unlock
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (15 preceding siblings ...)
  2011-09-01 12:49 ` [PATCH 16/18] drbd: bail out if a config requrest is over-determined, and not matching Philipp Reisner
@ 2011-09-01 12:49 ` Philipp Reisner
  2011-09-01 12:49 ` [PATCH 18/18] drbd: introduce in-kernel "down" command Philipp Reisner
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:49 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Somehow a "goto abort" was introduced with commit
  drbd: Extracted is_valid_transition() out of sanitize_state()
which left drbd_req_state() still holding the spin lock on that error path.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_state.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index c6a659b..3ca535e 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -181,8 +181,10 @@ drbd_req_state(struct drbd_conf *mdev, union drbd_state mask,
 	os = mdev->state;
 	ns = sanitize_state(mdev, apply_mask_val(os, mask, val), NULL);
 	rv = is_valid_transition(os, ns);
-	if (rv < SS_SUCCESS)
+	if (rv < SS_SUCCESS) {
+		spin_unlock_irqrestore(&mdev->tconn->req_lock, flags);
 		goto abort;
+	}
 
 	if (cl_wide_st_chg(mdev, os, ns)) {
 		rv = is_valid_state(mdev, ns);
-- 
1.7.4.1
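
The bug class fixed by patch 17 is an early exit that skips the unlock. A minimal sketch, assuming a toy lock that only tracks whether it is held (`struct demo_lock` and the helpers are hypothetical and are not spinlocks):

```c
#include <assert.h>

/* Toy lock that only records whether it is held; not a real spinlock. */
struct demo_lock { int held; };

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Returns 1 on success, -1 if the transition is rejected.
 * Every path leaves the lock released. */
static int demo_req_state(struct demo_lock *l, int transition_valid)
{
	int rv;

	demo_lock_acquire(l);
	rv = transition_valid ? 1 : -1;
	if (rv < 0) {
		demo_lock_release(l);	/* the unlock the early exit needs */
		goto out;
	}
	/* ... the state change would be applied here, under the lock ... */
	demo_lock_release(l);
out:
	return rv;
}
```

Calling `demo_req_state()` twice in a row with a rejected transition would trip the assertion in `demo_lock_acquire()` if the early exit forgot the release, which is the userspace analogue of the deadlock the patch avoids.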



* [PATCH 18/18] drbd: introduce in-kernel "down" command
  2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
                   ` (16 preceding siblings ...)
  2011-09-01 12:49 ` [PATCH 17/18] drbd: add forgotten spin_unlock Philipp Reisner
@ 2011-09-01 12:49 ` Philipp Reisner
  17 siblings, 0 replies; 19+ messages in thread
From: Philipp Reisner @ 2011-09-01 12:49 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

This greatly simplifies deconfiguration of whole resources.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_main.c |    2 -
 drivers/block/drbd/drbd_nl.c   |  203 ++++++++++++++++++++++++++++++----------
 include/linux/drbd_genl.h      |    2 +
 3 files changed, 154 insertions(+), 53 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 2e79032..de49c8d 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2303,9 +2303,7 @@ fail:
 
 void drbd_free_tconn(struct drbd_tconn *tconn)
 {
-	mutex_lock(&drbd_cfg_mutex);
 	list_del(&tconn->all_tconn);
-	mutex_unlock(&drbd_cfg_mutex);
 	idr_destroy(&tconn->volumes);
 
 	free_cpumask_var(tconn->cpu_mask);
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 773946d..2970f45 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -73,6 +73,7 @@ int drbd_adm_delete_minor(struct sk_buff *skb, struct genl_info *info);
 
 int drbd_adm_create_connection(struct sk_buff *skb, struct genl_info *info);
 int drbd_adm_delete_connection(struct sk_buff *skb, struct genl_info *info);
+int drbd_adm_down(struct sk_buff *skb, struct genl_info *info);
 
 int drbd_adm_set_role(struct sk_buff *skb, struct genl_info *info);
 int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info);
@@ -1449,6 +1450,18 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
 	return 0;
 }
 
+static int adm_detach(struct drbd_conf *mdev)
+{
+	enum drbd_ret_code retcode;
+	drbd_suspend_io(mdev); /* so no-one is stuck in drbd_al_begin_io */
+	retcode = drbd_request_state(mdev, NS(disk, D_DISKLESS));
+	wait_event(mdev->misc_wait,
+			mdev->state.disk != D_DISKLESS ||
+			!atomic_read(&mdev->local_cnt));
+	drbd_resume_io(mdev);
+	return retcode;
+}
+
 /* Detaching the disk is a process in multiple stages.  First we need to lock
  * out application IO, in-flight IO, IO stuck in drbd_al_begin_io.
  * Then we transition to D_DISKLESS, and wait for put_ldev() to return all
@@ -1456,7 +1469,6 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
  * Only then we have finally detached. */
 int drbd_adm_detach(struct sk_buff *skb, struct genl_info *info)
 {
-	struct drbd_conf *mdev;
 	enum drbd_ret_code retcode;
 
 	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
@@ -1465,13 +1477,7 @@ int drbd_adm_detach(struct sk_buff *skb, struct genl_info *info)
 	if (retcode != NO_ERROR)
 		goto out;
 
-	mdev = adm_ctx.mdev;
-	drbd_suspend_io(mdev); /* so no-one is stuck in drbd_al_begin_io */
-	retcode = drbd_request_state(mdev, NS(disk, D_DISKLESS));
-	wait_event(mdev->misc_wait,
-			mdev->state.disk != D_DISKLESS ||
-			!atomic_read(&mdev->local_cnt));
-	drbd_resume_io(mdev);
+	retcode = adm_detach(adm_ctx.mdev);
 out:
 	drbd_adm_finish(info, retcode);
 	return 0;
@@ -1713,10 +1719,49 @@ out:
 	return 0;
 }
 
+static enum drbd_state_rv conn_try_disconnect(struct drbd_tconn *tconn, bool force)
+{
+	enum drbd_state_rv rv;
+	if (force) {
+		spin_lock_irq(&tconn->req_lock);
+		if (tconn->cstate >= C_WF_CONNECTION)
+			_conn_request_state(tconn, NS(conn, C_DISCONNECTING), CS_HARD);
+		spin_unlock_irq(&tconn->req_lock);
+		return SS_SUCCESS;
+	}
+
+	rv = conn_request_state(tconn, NS(conn, C_DISCONNECTING), 0);
+
+	switch (rv) {
+	case SS_NOTHING_TO_DO:
+	case SS_ALREADY_STANDALONE:
+		return SS_SUCCESS;
+	case SS_PRIMARY_NOP:
+		/* Our state checking code wants to see the peer outdated. */
+		rv = conn_request_state(tconn, NS2(conn, C_DISCONNECTING,
+							pdsk, D_OUTDATED), CS_VERBOSE);
+		break;
+	case SS_CW_FAILED_BY_PEER:
+		/* The peer probably wants to see us outdated. */
+		rv = conn_request_state(tconn, NS2(conn, C_DISCONNECTING,
+							disk, D_OUTDATED), 0);
+		if (rv == SS_IS_DISKLESS || rv == SS_LOWER_THAN_OUTDATED) {
+			conn_request_state(tconn, NS(conn, C_DISCONNECTING), CS_HARD);
+			rv = SS_SUCCESS;
+		}
+		break;
+	default:;
+		/* no special handling necessary */
+	}
+
+	return rv;
+}
+
 int drbd_adm_disconnect(struct sk_buff *skb, struct genl_info *info)
 {
 	struct disconnect_parms parms;
 	struct drbd_tconn *tconn;
+	enum drbd_state_rv rv;
 	enum drbd_ret_code retcode;
 	int err;
 
@@ -1737,35 +1782,8 @@ int drbd_adm_disconnect(struct sk_buff *skb, struct genl_info *info)
 		}
 	}
 
-	if (parms.force_disconnect) {
-		spin_lock_irq(&tconn->req_lock);
-		if (tconn->cstate >= C_WF_CONNECTION)
-			_conn_request_state(tconn, NS(conn, C_DISCONNECTING), CS_HARD);
-		spin_unlock_irq(&tconn->req_lock);
-		goto done;
-	}
-
-	retcode = conn_request_state(tconn, NS(conn, C_DISCONNECTING), 0);
-
-	if (retcode == SS_NOTHING_TO_DO)
-		goto done;
-	else if (retcode == SS_ALREADY_STANDALONE)
-		goto done;
-	else if (retcode == SS_PRIMARY_NOP) {
-		/* Our state checking code wants to see the peer outdated. */
-		retcode = conn_request_state(tconn, NS2(conn, C_DISCONNECTING,
-							pdsk, D_OUTDATED), CS_VERBOSE);
-	} else if (retcode == SS_CW_FAILED_BY_PEER) {
-		/* The peer probably wants to see us outdated. */
-		retcode = conn_request_state(tconn, NS2(conn, C_DISCONNECTING,
-							disk, D_OUTDATED), 0);
-		if (retcode == SS_IS_DISKLESS || retcode == SS_LOWER_THAN_OUTDATED) {
-			conn_request_state(tconn, NS(conn, C_DISCONNECTING), CS_HARD);
-			retcode = SS_SUCCESS;
-		}
-	}
-
-	if (retcode < SS_SUCCESS)
+	rv = conn_try_disconnect(tconn, parms.force_disconnect);
+	if (rv < SS_SUCCESS)
 		goto fail;
 
 	if (wait_event_interruptible(tconn->ping_wait,
@@ -1776,7 +1794,6 @@ int drbd_adm_disconnect(struct sk_buff *skb, struct genl_info *info)
 		goto fail;
 	}
 
- done:
 	retcode = NO_ERROR;
  fail:
 	drbd_adm_finish(info, retcode);
@@ -2677,9 +2694,21 @@ out:
 	return 0;
 }
 
+static enum drbd_ret_code adm_delete_minor(struct drbd_conf *mdev)
+{
+	if (mdev->state.disk == D_DISKLESS &&
+	    /* no need to be mdev->state.conn == C_STANDALONE &&
+	     * we may want to delete a minor from a live replication group.
+	     */
+	    mdev->state.role == R_SECONDARY) {
+		drbd_delete_device(mdev_to_minor(mdev));
+		return NO_ERROR;
+	} else
+		return ERR_MINOR_CONFIGURED;
+}
+
 int drbd_adm_delete_minor(struct sk_buff *skb, struct genl_info *info)
 {
-	struct drbd_conf *mdev;
 	enum drbd_ret_code retcode;
 
 	retcode = drbd_adm_prepare(skb, info, DRBD_ADM_NEED_MINOR);
@@ -2688,19 +2717,89 @@ int drbd_adm_delete_minor(struct sk_buff *skb, struct genl_info *info)
 	if (retcode != NO_ERROR)
 		goto out;
 
-	mdev = adm_ctx.mdev;
-	if (mdev->state.disk == D_DISKLESS &&
-	    /* no need to be mdev->state.conn == C_STANDALONE &&
-	     * we may want to delete a minor from a live replication group.
-	     */
-	    mdev->state.role == R_SECONDARY) {
-		drbd_delete_device(mdev_to_minor(mdev));
-		retcode = NO_ERROR;
-		/* if this was the last volume of this connection,
-		 * this will terminate all threads */
+	mutex_lock(&drbd_cfg_mutex);
+	retcode = adm_delete_minor(adm_ctx.mdev);
+	mutex_unlock(&drbd_cfg_mutex);
+	/* if this was the last volume of this connection,
+	 * this will terminate all threads */
+	if (retcode == NO_ERROR)
 		conn_reconfig_done(adm_ctx.tconn);
-	} else
-		retcode = ERR_MINOR_CONFIGURED;
+out:
+	drbd_adm_finish(info, retcode);
+	return 0;
+}
+
+int drbd_adm_down(struct sk_buff *skb, struct genl_info *info)
+{
+	enum drbd_ret_code retcode;
+	enum drbd_state_rv rv;
+	struct drbd_conf *mdev;
+	unsigned i;
+
+	retcode = drbd_adm_prepare(skb, info, 0);
+	if (!adm_ctx.reply_skb)
+		return retcode;
+	if (retcode != NO_ERROR)
+		goto out;
+
+	if (!adm_ctx.tconn) {
+		retcode = ERR_CONN_NOT_KNOWN;
+		goto out;
+	}
+
+	mutex_lock(&drbd_cfg_mutex);
+	/* demote */
+	idr_for_each_entry(&adm_ctx.tconn->volumes, mdev, i) {
+		retcode = drbd_set_role(mdev, R_SECONDARY, 0);
+		if (retcode < SS_SUCCESS) {
+			drbd_msg_put_info("failed to demote");
+			goto out_unlock;
+		}
+	}
+
+	/* disconnect */
+	rv = conn_try_disconnect(adm_ctx.tconn, 0);
+	if (rv < SS_SUCCESS) {
+		retcode = rv; /* enum type mismatch! */
+		drbd_msg_put_info("failed to disconnect");
+		goto out_unlock;
+	}
+
+	/* detach */
+	idr_for_each_entry(&adm_ctx.tconn->volumes, mdev, i) {
+		rv = adm_detach(mdev);
+		if (rv < SS_SUCCESS) {
+			retcode = rv; /* enum type mismatch! */
+			drbd_msg_put_info("failed to detach");
+			goto out_unlock;
+		}
+	}
+
+	/* delete volumes */
+	idr_for_each_entry(&adm_ctx.tconn->volumes, mdev, i) {
+		retcode = adm_delete_minor(mdev);
+		if (retcode != NO_ERROR) {
+			/* "can not happen" */
+			drbd_msg_put_info("failed to delete volume");
+			goto out_unlock;
+		}
+	}
+
+	/* stop all threads */
+	conn_reconfig_done(adm_ctx.tconn);
+
+	/* delete connection */
+	if (conn_lowest_minor(adm_ctx.tconn) < 0) {
+		drbd_free_tconn(adm_ctx.tconn);
+		retcode = NO_ERROR;
+	} else {
+		/* "can not happen" */
+		retcode = ERR_CONN_IN_USE;
+		drbd_msg_put_info("failed to delete connection");
+		goto out_unlock;
+	}
+out_unlock:
+	mutex_unlock(&drbd_cfg_mutex);
 out:
 	drbd_adm_finish(info, retcode);
 	return 0;
@@ -2716,12 +2815,14 @@ int drbd_adm_delete_connection(struct sk_buff *skb, struct genl_info *info)
 	if (retcode != NO_ERROR)
 		goto out;
 
+	mutex_lock(&drbd_cfg_mutex);
 	if (conn_lowest_minor(adm_ctx.tconn) < 0) {
 		drbd_free_tconn(adm_ctx.tconn);
 		retcode = NO_ERROR;
 	} else {
 		retcode = ERR_CONN_IN_USE;
 	}
+	mutex_unlock(&drbd_cfg_mutex);
 
 out:
 	drbd_adm_finish(info, retcode);
diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
index 84e1684..a07d692 100644
--- a/include/linux/drbd_genl.h
+++ b/include/linux/drbd_genl.h
@@ -347,3 +347,5 @@ GENL_op(DRBD_ADM_OUTDATE,	25, GENL_doit(drbd_adm_outdate),
 	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
 GENL_op(DRBD_ADM_GET_TIMEOUT_TYPE, 26, GENL_doit(drbd_adm_get_timeout_type),
 	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
+GENL_op(DRBD_ADM_DOWN,		27, GENL_doit(drbd_adm_down),
+	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, GENLA_F_REQUIRED))
-- 
1.7.4.1
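
The "down" command in patch 18 runs its stages in a fixed order (demote, disconnect, detach, delete volumes, free the connection) and stops at the first failure. The sketch below shows only that control flow; `struct demo_resource`, the stage enum and the injected failure are hypothetical, and no locking is modelled.

```c
#include <assert.h>
#include <stddef.h>

enum demo_stage {
	DEMO_DEMOTE,
	DEMO_DISCONNECT,
	DEMO_DETACH,
	DEMO_DELETE_VOLUMES,
	DEMO_FREE_CONN,
	DEMO_NSTAGES
};

struct demo_resource {
	int fail_at;			/* stage that should fail, or -1 for none */
	int done[DEMO_NSTAGES];		/* which stages completed */
};

/* Pretend to run one stage; fails only at the injected stage. */
static int demo_run_stage(struct demo_resource *res, enum demo_stage s)
{
	if ((int)s == res->fail_at)
		return -1;
	res->done[s] = 1;
	return 0;
}

/* Returns DEMO_NSTAGES on success, else the index of the failing stage. */
static int demo_down(struct demo_resource *res)
{
	assert(res != NULL);
	for (int s = 0; s < DEMO_NSTAGES; s++) {
		if (demo_run_stage(res, (enum demo_stage)s) < 0)
			return s;	/* later stages are not attempted */
	}
	return DEMO_NSTAGES;
}
```

As in the patch, a failure leaves the resource partially torn down: earlier stages stay completed and the caller learns which stage refused.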



Thread overview: 19+ messages
2011-09-01 12:48 [RFC 00/18] drbd: part 4 of adding multiple volume support to drbd Philipp Reisner
2011-09-01 12:48 ` [PATCH 01/18] drbd: default to detach on-io-error Philipp Reisner
2011-09-01 12:48 ` [PATCH 02/18] drbd: only wakeup if something changed in update_peer_seq Philipp Reisner
2011-09-01 12:48 ` [PATCH 03/18] drbd: add page pool to be used for meta data IO Philipp Reisner
2011-09-01 12:48 ` [PATCH 04/18] drbd: use the newly introduced page pool for bitmap IO Philipp Reisner
2011-09-01 12:48 ` [PATCH 05/18] drbd: introduce a bio_set to allocate housekeeping bios from Philipp Reisner
2011-09-01 12:48 ` [PATCH 06/18] drbd: fix drbd_delete_device: remove vnr from volumes; idr_remove(); synchronize_rcu(); before cleanup Philipp Reisner
2011-09-01 12:48 ` [PATCH 07/18] drbd: get rid of drbd_bcast_ee, it is of no use anymore Philipp Reisner
2011-09-01 12:48 ` [PATCH 08/18] drbd: prepare the transition from connector to genetlink Philipp Reisner
2011-09-01 12:48 ` [PATCH 09/18] drbd: switch configuration interface " Philipp Reisner
2011-09-01 12:48 ` [PATCH 10/18] drbd: allow holes in minor and volume id allocation Philipp Reisner
2011-09-01 12:48 ` [PATCH 11/18] drbd: remove now unused connector related files Philipp Reisner
2011-09-01 12:48 ` [PATCH 12/18] drbd: drbd_adm_get_status needs to show some more detail Philipp Reisner
2011-09-01 12:49 ` [PATCH 13/18] drbd: simplify conn_all_vols_unconf, make it bool Philipp Reisner
2011-09-01 12:49 ` [PATCH 14/18] drbd: Allow a Diskless Secondary volume to be removed Philipp Reisner
2011-09-01 12:49 ` [PATCH 15/18] drbd: new-connection and new-minor succeed, if the object already exists Philipp Reisner
2011-09-01 12:49 ` [PATCH 16/18] drbd: bail out if a config requrest is over-determined, and not matching Philipp Reisner
2011-09-01 12:49 ` [PATCH 17/18] drbd: add forgotten spin_unlock Philipp Reisner
2011-09-01 12:49 ` [PATCH 18/18] drbd: introduce in-kernel "down" command Philipp Reisner
