linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/9] device-mapper snapshot: load metadata on creation
@ 2006-01-20 21:11 Alasdair G Kergon
  2006-01-23  5:33 ` Andrew Morton
  2006-04-15 11:19 ` BUG at drivers/md/kcopyd.c:146 (was: [PATCH 1/9] device-mapper snapshot: load metadata on creation) Juergen Kreileder
  0 siblings, 2 replies; 8+ messages in thread
From: Alasdair G Kergon @ 2006-01-20 21:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Move snapshot metadata loading to happen when the table is created 
instead of when the device is resumed.  Writes to the origin device
don't trigger exceptions until each snapshot table becomes active 
when resume() is called on each snapshot.

If you're using lvm2, for this patch to work properly you should update 
to lvm2 version 2.02.01 or later and device-mapper version 1.02.02 or later.

Signed-Off-By: Alasdair G Kergon <agk@redhat.com>

Index: linux-2.6.16-rc1/drivers/md/dm-snap.c
===================================================================
--- linux-2.6.16-rc1.orig/drivers/md/dm-snap.c
+++ linux-2.6.16-rc1/drivers/md/dm-snap.c
@@ -373,16 +373,11 @@ static inline ulong round_up(ulong n, ul
 
 static void read_snapshot_metadata(struct dm_snapshot *s)
 {
-	if (s->have_metadata)
-		return;
-
 	if (s->store.read_metadata(&s->store)) {
 		down_write(&s->lock);
 		s->valid = 0;
 		up_write(&s->lock);
 	}
-
-	s->have_metadata = 1;
 }
 
 /*
@@ -471,7 +466,7 @@ static int snapshot_ctr(struct dm_target
 	s->chunk_shift = ffs(chunk_size) - 1;
 
 	s->valid = 1;
-	s->have_metadata = 0;
+	s->active = 0;
 	s->last_percent = 0;
 	init_rwsem(&s->lock);
 	s->table = ti->table;
@@ -506,7 +501,11 @@ static int snapshot_ctr(struct dm_target
 		goto bad5;
 	}
 
+	/* Metadata must only be loaded into one table at once */
+	read_snapshot_metadata(s);
+
 	/* Add snapshot to the list of snapshots for this origin */
+	/* Exceptions aren't triggered till snapshot_resume() is called */
 	if (register_snapshot(s)) {
 		r = -EINVAL;
 		ti->error = "Cannot register snapshot origin";
@@ -862,7 +861,9 @@ static void snapshot_resume(struct dm_ta
 {
 	struct dm_snapshot *s = (struct dm_snapshot *) ti->private;
 
-	read_snapshot_metadata(s);
+	down_write(&s->lock);
+	s->active = 1;
+	up_write(&s->lock);
 }
 
 static int snapshot_status(struct dm_target *ti, status_type_t type,
@@ -932,8 +933,8 @@ static int __origin_write(struct list_he
 	/* Do all the snapshots on this origin */
 	list_for_each_entry (snap, snapshots, list) {
 
-		/* Only deal with valid snapshots */
-		if (!snap->valid)
+		/* Only deal with valid and active snapshots */
+		if (!snap->valid || !snap->active)
 			continue;
 
 		/* Nothing to do if writing beyond end of snapshot */
@@ -1104,7 +1105,7 @@ static int origin_status(struct dm_targe
 
 static struct target_type origin_target = {
 	.name    = "snapshot-origin",
-	.version = {1, 0, 1},
+	.version = {1, 1, 0},
 	.module  = THIS_MODULE,
 	.ctr     = origin_ctr,
 	.dtr     = origin_dtr,
@@ -1115,7 +1116,7 @@ static struct target_type origin_target 
 
 static struct target_type snapshot_target = {
 	.name    = "snapshot",
-	.version = {1, 0, 1},
+	.version = {1, 1, 0},
 	.module  = THIS_MODULE,
 	.ctr     = snapshot_ctr,
 	.dtr     = snapshot_dtr,
Index: linux-2.6.16-rc1/drivers/md/dm-snap.h
===================================================================
--- linux-2.6.16-rc1.orig/drivers/md/dm-snap.h
+++ linux-2.6.16-rc1/drivers/md/dm-snap.h
@@ -99,7 +99,9 @@ struct dm_snapshot {
 
 	/* You can't use a snapshot if this is 0 (e.g. if full) */
 	int valid;
-	int have_metadata;
+
+	/* Origin writes don't trigger exceptions until this is set */
+	int active;
 
 	/* Used for display of table */
 	char type;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/9] device-mapper snapshot: load metadata on creation
  2006-01-20 21:11 [PATCH 1/9] device-mapper snapshot: load metadata on creation Alasdair G Kergon
@ 2006-01-23  5:33 ` Andrew Morton
  2006-01-23 15:34   ` Alasdair G Kergon
  2006-04-15 11:19 ` BUG at drivers/md/kcopyd.c:146 (was: [PATCH 1/9] device-mapper snapshot: load metadata on creation) Juergen Kreileder
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-01-23  5:33 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: linux-kernel

Alasdair G Kergon <agk@redhat.com> wrote:
>
> If you're using lvm2, for this patch to work properly you should update 
>  to lvm2 version 2.02.01 or later and device-mapper version 1.02.02 or later.

If people are using older tools and they run with this patch, what will
happen to them?


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/9] device-mapper snapshot: load metadata on creation
  2006-01-23  5:33 ` Andrew Morton
@ 2006-01-23 15:34   ` Alasdair G Kergon
  0 siblings, 0 replies; 8+ messages in thread
From: Alasdair G Kergon @ 2006-01-23 15:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sun, Jan 22, 2006 at 09:33:11PM -0800, Andrew Morton wrote:
> If people are using older tools and they run with this patch, what will
> happen to them?
 
The problem I have noticed is that in certain configurations LVM2 commands 
to deactivate a snapshot may fail part-way through and need to be completed
using dmsetup.

Deactivating snapshots before removing them seems to avoid that problem,
but it's not always feasible.

Example: 
  lvol1 is a snapshot of lvol0.  Both are active.
  You run 'lvremove vgg/lvol1' but the command hangs.
  Workaround: from another shell run 'dmsetup resume vgg-lvol1-cow' and
  the command then completes.

Having more than one snapshot of the same device can also lead to problems.

But without this patch and the updated LVM2 tools, whenever you run
'lvcreate -s' to create a snapshot you're taking the risk that the command 
will hang irrecoverably (leading you to want to reboot the machine even if 
it didn't panic).

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* BUG at drivers/md/kcopyd.c:146 (was: [PATCH 1/9] device-mapper snapshot: load metadata on creation)
  2006-01-20 21:11 [PATCH 1/9] device-mapper snapshot: load metadata on creation Alasdair G Kergon
  2006-01-23  5:33 ` Andrew Morton
@ 2006-04-15 11:19 ` Juergen Kreileder
  2006-04-20 17:05   ` Alasdair G Kergon
  1 sibling, 1 reply; 8+ messages in thread
From: Juergen Kreileder @ 2006-04-15 11:19 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: Andrew Morton, linux-kernel

Alasdair G. Kergon <agk@redhat.com> writes:

> Move snapshot metadata loading to happen when the table is created 
> instead of when the device is resumed.  Writes to the origin device
> don't trigger exceptions until each snapshot table becomes active 
> when resume() is called on each snapshot.
>
> If you're using lvm2, for this patch to work properly you should
> update to lvm2 version 2.02.01 or later and device-mapper version
> 1.02.02 or later.

I'm using devmapper 1.02.03 and lvm2 2.02.02 with 2.6.16.2,
nevertheless my logical volumes locked up three time when removing
snapshots so far.  Twice I got BUG at drivers/md/kcopyd.c:146, the
third time logging stopped at the first lvremove.

2.6.15 and earlier kernels in combination with older tools worked fine
over the last year.


        Juergen

kernel BUG at drivers/md/kcopyd.c:146!
invalid opcode: 0000 [#1]
CPU:    0
EIP:    0060:[client_free_pages+42/64]    Not tainted VLI
EFLAGS: 00010283   (2.6.16.2-exec-shield #1) 
EIP is at client_free_pages+0x2a/0x40
eax: 00000100   ebx: f6a92880   ecx: c18df0c0   edx: 00000000
esi: f4636880   edi: c044b13c   ebp: f88dc040   esp: e855fdb8
ds: 007b   es: 007b   ss: 0068
Process lvremove (pid: 3034, threadinfo=e855e000 task=d2f4ea70)
Stack: <0>f6a92880 c02a6382 f545fac0 c02a85ad f88dc040 f6002d80 00000000 00000000 
       c02a22c6 00000008 f6002d80 f6a92c60 00000004 c02a4a40 c02a3fc3 c03b894c 
       f88aa000 c02a4a74 00000000 00000000 f88aa000 c02a418e 00000001 00000202 
Call Trace:
 [kcopyd_client_destroy+18/80] kcopyd_client_destroy+0x12/0x50
 [snapshot_dtr+173/240] snapshot_dtr+0xad/0xf0
 [dm_table_put+86/208] dm_table_put+0x56/0xd0
 [dev_remove+0/128] dev_remove+0x0/0x80
 [__hash_remove+99/128] __hash_remove+0x63/0x80
 [dev_remove+52/128] dev_remove+0x34/0x80
 [ctl_ioctl+430/688] ctl_ioctl+0x1ae/0x2b0
 [ctl_ioctl+0/688] ctl_ioctl+0x0/0x2b0
 [do_ioctl+105/112] do_ioctl+0x69/0x70
 [vfs_ioctl+92/640] vfs_ioctl+0x5c/0x280
 [sys_ioctl+61/112] sys_ioctl+0x3d/0x70
 [sysenter_past_esp+86/121] sysenter_past_esp+0x56/0x79
Code: 00 53 89 c3 8b 40 0c 39 43 10 75 1f 8b 43 08 e8 bd ff ff ff c7 43 08 00 00 00 00 c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 5b c3 <0f> 0b 92 00 68 a7 37 c0 eb d7 8d b6 00 00 00 00 8d bf 00 00 00 
 <1>Unable to handle kernel paging request at virtual address f8929fa8
 printing eip:
c02a9c90
*pde = 00000000
Oops: 0002 [#2]
CPU:    0
EIP:    0060:[persistent_commit+64/272]    Not tainted VLI
EFLAGS: 00010286   (2.6.16.2-exec-shield #1) 
EIP is at persistent_commit+0x40/0x110
eax: f8929fa0   ebx: f4c59840   ecx: 00000000   edx: 0000076b
esi: 000003fd   edi: 00000000   ebp: c02a8ec0   esp: ee497ef4
ds: 007b   es: 007b   ss: 0068
Process kcopyd (pid: 2511, threadinfo=ee496000 task=e65d5030)
Stack: <0>0000076b 00000000 000003fd 00000000 c1b57cf4 f545fac0 c1b57cf4 c02a8e70 
       c02a8eb2 c1b57cf4 c86cf1e0 00000000 c02a61fd 00000000 00000282 c86cf1e0 
       c03b8a08 c02a61b0 c02a6a38 00000000 c044b104 f4bab580 00000296 00000000 
Call Trace:
 [copy_callback+0/80] copy_callback+0x0/0x50
 [copy_callback+66/80] copy_callback+0x42/0x50
 [run_complete_job+77/112] run_complete_job+0x4d/0x70
 [run_complete_job+0/112] run_complete_job+0x0/0x70
 [process_jobs+24/192] process_jobs+0x18/0xc0
 [do_work+15/45] do_work+0xf/0x2d
 [run_workqueue+94/208] run_workqueue+0x5e/0xd0
 [do_work+0/45] do_work+0x0/0x2d
 [worker_thread+238/288] worker_thread+0xee/0x120
 [default_wake_function+0/16] default_wake_function+0x0/0x10
 [kthread+214/224] kthread+0xd6/0xe0
 [worker_thread+0/288] worker_thread+0x0/0x120
 [kthread+0/224] kthread+0x0/0xe0
 [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Code: 31 ff 89 34 24 8b 72 0c 89 7c 24 0c 89 74 24 08 8b 53 20 8d 42 01 89 43 20 89 d8 e8 ab f9 ff ff 85 c0 74 12 8b 14 24 8b 4c 24 04 <89> 70 08 89 78 0c 89 10 89 48 04 8b 43 28 8b 53 2c 8d 14 c2 40 




kernel BUG at drivers/md/kcopyd.c:146!
invalid opcode: 0000 [#1]
CPU:    0
EIP:    0060:[client_free_pages+42/64]    Not tainted VLI
EFLAGS: 00010283   (2.6.16.2-exec-shield #1) 
EIP is at client_free_pages+0x2a/0x40
eax: 00000100   ebx: f628c520   ecx: c18df0c0   edx: 00000000
esi: eeb1a660   edi: c044b13c   ebp: f88d8040   esp: eb26bdb8
ds: 007b   es: 007b   ss: 0068
Process lvremove (pid: 5735, threadinfo=eb26a000 task=f0522a70)
Stack: <0>f628c520 c02a6382 f65ca1c0 c02a85ad f88d8040 f1e1e3c0 00000000 00000000 
       c02a22c6 00000008 f1e1e3c0 f628c2a0 00000004 c02a4a40 c02a3fc3 c03b894c 
       f88aa000 c02a4a74 00000246 00000000 f88aa000 c02a418e 00000001 00000202 
Call Trace:
 [kcopyd_client_destroy+18/80] kcopyd_client_destroy+0x12/0x50
 [snapshot_dtr+173/240] snapshot_dtr+0xad/0xf0
 [dm_table_put+86/208] dm_table_put+0x56/0xd0
 [dev_remove+0/128] dev_remove+0x0/0x80
 [__hash_remove+99/128] __hash_remove+0x63/0x80
 [dev_remove+52/128] dev_remove+0x34/0x80
 [ctl_ioctl+430/688] ctl_ioctl+0x1ae/0x2b0
 [ctl_ioctl+0/688] ctl_ioctl+0x0/0x2b0
 [do_ioctl+105/112] do_ioctl+0x69/0x70
 [vfs_ioctl+92/640] vfs_ioctl+0x5c/0x280
 [sys_ioctl+61/112] sys_ioctl+0x3d/0x70
 [sysenter_past_esp+86/121] sysenter_past_esp+0x56/0x79
Code: 00 53 89 c3 8b 40 0c 39 43 10 75 1f 8b 43 08 e8 bd ff ff ff c7 43 08 00 00 00 00 c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 5b c3 <0f> 0b 92 00 68 a7 37 c0 eb d7 8d b6 00 00 00 00 8d bf 00 00 00 
 <1>Unable to handle kernel NULL pointer dereference at virtual address 00000034
 printing eip:
c01551a5
*pde = 2b207001
*pte = 2e12c067
Oops: 0000 [#2]
CPU:    0
EIP:    0060:[bio_add_page+21/80]    Not tainted VLI
EFLAGS: 00010286   (2.6.16.2-exec-shield #1) 
EIP is at bio_add_page+0x15/0x50
eax: 00000000   ebx: e84ef580   ecx: 00001000   edx: c15da5c0
esi: c15da5c0   edi: ef705f04   ebp: ee8ba20c   esp: ef705e74
ds: 007b   es: 007b   ss: 0068
Process kcopyd (pid: 5203, threadinfo=ef704000 task=f0f2a030)
Stack: <0>0000000c 00000010 00000000 00000010 e84ef580 c02a5c4b 00000000 ee8ba20c 
       00000001 00000001 00000000 00000000 c02a59f0 c02a5a10 00000000 ef4af780 
       00000000 00001000 c15da5c0 00000001 00000001 ee8ba20c c02a6950 c02a5d28 
Call Trace:
 [dispatch_io+299/400] dispatch_io+0x12b/0x190
 [list_get_page+0/32] list_get_page+0x0/0x20
 [list_next_page+0/16] list_next_page+0x0/0x10
 [complete_io+0/208] complete_io+0x0/0xd0
 [async_io+120/208] async_io+0x78/0xd0
 [dm_io_async+74/96] dm_io_async+0x4a/0x60
 [complete_io+0/208] complete_io+0x0/0xd0
 [list_get_page+0/32] list_get_page+0x0/0x20
 [list_next_page+0/16] list_next_page+0x0/0x10
 [run_io_job+0/144] run_io_job+0x0/0x90
 [run_io_job+124/144] run_io_job+0x7c/0x90
 [complete_io+0/208] complete_io+0x0/0xd0
 [process_jobs+24/192] process_jobs+0x18/0xc0
 [run_workqueue+94/208] run_workqueue+0x5e/0xd0
 [do_work+0/45] do_work+0x0/0x2d
 [worker_thread+238/288] worker_thread+0xee/0x120
 [default_wake_function+0/16] default_wake_function+0x0/0x10
 [kthread+214/224] kthread+0xd6/0xe0
 [worker_thread+0/288] worker_thread+0x0/0x120
 [kthread+0/224] kthread+0x0/0xe0
 [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Code: 14 89 1c 24 e8 fd fd ff ff 83 c4 0c 5b c3 90 8d b4 26 00 00 00 00 83 ec 14 89 5c 24 0c 89 74 24 10 89 c3 8b 40 08 89 d6 8b 40 50 <8b> 40 34 0f b7 90 38 01 00 00 89 0c 24 89 f1 89 54 24 08 8b 54 

-- 
Juergen Kreileder, Blackdown Java-Linux Team
http://blog.blackdown.de/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: BUG at drivers/md/kcopyd.c:146 (was: [PATCH 1/9] device-mapper snapshot: load metadata on creation)
  2006-04-15 11:19 ` BUG at drivers/md/kcopyd.c:146 (was: [PATCH 1/9] device-mapper snapshot: load metadata on creation) Juergen Kreileder
@ 2006-04-20 17:05   ` Alasdair G Kergon
  2006-04-20 19:48     ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Alasdair G Kergon @ 2006-04-20 17:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton

On Sat, Apr 15, 2006 at 01:19:51PM +0200, Juergen Kreileder wrote:
> I'm using devmapper 1.02.03 and lvm2 2.02.02 with 2.6.16.2,
> nevertheless my logical volumes locked up three time when removing
> snapshots so far.  Twice I got BUG at drivers/md/kcopyd.c:146, the
> third time logging stopped at the first lvremove.
 
> 2.6.15 and earlier kernels in combination with older tools worked fine
> over the last year.
 
I found several bugs in the snapshot code when I reviewed it,
including (thankfully hard-to-trigger) silent data corruption.

Patches went into 2.6.17-rc1.  [There's one unfinished patch
outstanding for a theoretical race that I've only been able to
reproduce under artificial conditions.]

> kernel BUG at drivers/md/kcopyd.c:146!

Probably needs this patch (12th March):

  dm snapshot: fix kcopyd destructor

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: BUG at drivers/md/kcopyd.c:146 (was: [PATCH 1/9] device-mapper snapshot: load metadata on creation)
  2006-04-20 17:05   ` Alasdair G Kergon
@ 2006-04-20 19:48     ` Andrew Morton
  2006-04-21 15:12       ` BUG at drivers/md/kcopyd.c:146 Juergen Kreileder
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-04-20 19:48 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: linux-kernel, Juergen Kreileder


(restored Juergen's cc)

Alasdair G Kergon <agk@redhat.com> wrote:
>
> On Sat, Apr 15, 2006 at 01:19:51PM +0200, Juergen Kreileder wrote:
> > I'm using devmapper 1.02.03 and lvm2 2.02.02 with 2.6.16.2,
> > nevertheless my logical volumes locked up three time when removing
> > snapshots so far.  Twice I got BUG at drivers/md/kcopyd.c:146, the
> > third time logging stopped at the first lvremove.
>  
> > 2.6.15 and earlier kernels in combination with older tools worked fine
> > over the last year.
>  
> I found several bugs in the snapshot code when I reviewed it,
> including (thankfully hard-to-trigger) silent data corruption.
> 
> Patches went into 2.6.17-rc1.  [There's one unfinished patch
> outstanding for a theoretical race that I've only been able to
> reproduce under artificial conditions.]
> 
> > kernel BUG at drivers/md/kcopyd.c:146!
> 
> Probably needs this patch (12th March):
> 
>   dm snapshot: fix kcopyd destructor
> 

Thanks, I've appended a copy here.

Juergen, can you please test this?

Alasdair, what are your thoughts on backporting this to 2.6.16.x?


From: Alasdair G Kergon <agk@redhat.com>

Before removing a snapshot, wait for the completion of any kcopyd jobs using
it.

Do this by maintaining a count (nr_jobs) of how many outstanding jobs each
kcopyd_client has.

The snapshot destructor first unregisters the snapshot so that no new kcopyd
jobs (created by writes to the origin) will reference that particular
snapshot.  kcopyd_client_destroy() is now run next to wait for the completion
of any outstanding jobs before the snapshot exception structures (that those
jobs reference) are freed.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/md/dm-snap.c |    6 +++++-
 drivers/md/kcopyd.c  |   17 ++++++++++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff -puN drivers/md/dm-snap.c~dm-snapshot-fix-kcopyd-destructor drivers/md/dm-snap.c
--- devel/drivers/md/dm-snap.c~dm-snapshot-fix-kcopyd-destructor	2006-03-27 01:10:28.000000000 -0800
+++ devel-akpm/drivers/md/dm-snap.c	2006-03-27 01:10:28.000000000 -0800
@@ -559,8 +559,12 @@ static void snapshot_dtr(struct dm_targe
 {
 	struct dm_snapshot *s = (struct dm_snapshot *) ti->private;
 
+	/* Prevent further origin writes from using this snapshot. */
+	/* After this returns there can be no new kcopyd jobs. */
 	unregister_snapshot(s);
 
+	kcopyd_client_destroy(s->kcopyd_client);
+
 	exit_exception_table(&s->pending, pending_cache);
 	exit_exception_table(&s->complete, exception_cache);
 
@@ -569,7 +573,7 @@ static void snapshot_dtr(struct dm_targe
 
 	dm_put_device(ti, s->origin);
 	dm_put_device(ti, s->cow);
-	kcopyd_client_destroy(s->kcopyd_client);
+
 	kfree(s);
 }
 
diff -puN drivers/md/kcopyd.c~dm-snapshot-fix-kcopyd-destructor drivers/md/kcopyd.c
--- devel/drivers/md/kcopyd.c~dm-snapshot-fix-kcopyd-destructor	2006-03-27 01:10:28.000000000 -0800
+++ devel-akpm/drivers/md/kcopyd.c	2006-03-27 01:10:28.000000000 -0800
@@ -44,6 +44,9 @@ struct kcopyd_client {
 	struct page_list *pages;
 	unsigned int nr_pages;
 	unsigned int nr_free_pages;
+
+	wait_queue_head_t destroyq;
+	atomic_t nr_jobs;
 };
 
 static struct page_list *alloc_pl(void)
@@ -292,10 +295,15 @@ static int run_complete_job(struct kcopy
 	int read_err = job->read_err;
 	unsigned int write_err = job->write_err;
 	kcopyd_notify_fn fn = job->fn;
+	struct kcopyd_client *kc = job->kc;
 
-	kcopyd_put_pages(job->kc, job->pages);
+	kcopyd_put_pages(kc, job->pages);
 	mempool_free(job, _job_pool);
 	fn(read_err, write_err, context);
+
+	if (atomic_dec_and_test(&kc->nr_jobs))
+		wake_up(&kc->destroyq);
+
 	return 0;
 }
 
@@ -430,6 +438,7 @@ static void do_work(void *ignored)
  */
 static void dispatch_job(struct kcopyd_job *job)
 {
+	atomic_inc(&job->kc->nr_jobs);
 	push(&_pages_jobs, job);
 	wake();
 }
@@ -669,6 +678,9 @@ int kcopyd_client_create(unsigned int nr
 		return r;
 	}
 
+	init_waitqueue_head(&kc->destroyq);
+	atomic_set(&kc->nr_jobs, 0);
+
 	client_add(kc);
 	*result = kc;
 	return 0;
@@ -676,6 +688,9 @@ int kcopyd_client_create(unsigned int nr
 
 void kcopyd_client_destroy(struct kcopyd_client *kc)
 {
+	/* Wait for completion of all jobs submitted by this client. */
+	wait_event(kc->destroyq, !atomic_read(&kc->nr_jobs));
+
 	dm_io_put(kc->nr_pages);
 	client_free_pages(kc);
 	client_del(kc);
_


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: BUG at drivers/md/kcopyd.c:146
  2006-04-20 19:48     ` Andrew Morton
@ 2006-04-21 15:12       ` Juergen Kreileder
  2006-04-24 18:31         ` Juergen Kreileder
  0 siblings, 1 reply; 8+ messages in thread
From: Juergen Kreileder @ 2006-04-21 15:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alasdair G Kergon, linux-kernel

Andrew Morton <akpm@osdl.org> writes:

> (restored Juergen's cc)

[The cc list was correct actually, I had set a Mail-Followup-To header.]

> Alasdair G Kergon <agk@redhat.com> wrote:
>>
>> On Sat, Apr 15, 2006 at 01:19:51PM +0200, Juergen Kreileder wrote:

[...]

>> I found several bugs in the snapshot code when I reviewed it,
>> including (thankfully hard-to-trigger) silent data corruption.
>>
>> Patches went into 2.6.17-rc1.  [There's one unfinished patch
>> outstanding for a theoretical race that I've only been able to
>> reproduce under artificial conditions.]
>>
>>> kernel BUG at drivers/md/kcopyd.c:146!
>>
>> Probably needs this patch (12th March):
>>
>> dm snapshot: fix kcopyd destructor
>>
>
> Thanks, I've appended a copy here.
>
> Juergen, can you please test this?

Works fine so far but it's too early to call it fixed.
I'll let it run over the weekend and report back.


        Juergen

-- 
Juergen Kreileder, Blackdown Java-Linux Team
http://blog.blackdown.de/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: BUG at drivers/md/kcopyd.c:146
  2006-04-21 15:12       ` BUG at drivers/md/kcopyd.c:146 Juergen Kreileder
@ 2006-04-24 18:31         ` Juergen Kreileder
  0 siblings, 0 replies; 8+ messages in thread
From: Juergen Kreileder @ 2006-04-24 18:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alasdair G Kergon, linux-kernel

Juergen Kreileder <jk@blackdown.de> writes:

> Andrew Morton <akpm@osdl.org> writes:
>
>> (restored Juergen's cc)
>
> [The cc list was correct actually, I had set a Mail-Followup-To
> header.]
>
>> Alasdair G Kergon <agk@redhat.com> wrote:
>>>
>>> On Sat, Apr 15, 2006 at 01:19:51PM +0200, Juergen Kreileder wrote:
>
> [...]
>
>>> I found several bugs in the snapshot code when I reviewed it,
>>> including (thankfully hard-to-trigger) silent data corruption.
>>>
>>> Patches went into 2.6.17-rc1.  [There's one unfinished patch
>>> outstanding for a theoretical race that I've only been able to
>>> reproduce under artificial conditions.]
>>>
>>>> kernel BUG at drivers/md/kcopyd.c:146!
>>>
>>> Probably needs this patch (12th March):
>>>
>>> dm snapshot: fix kcopyd destructor
>>>
>>
>> Thanks, I've appended a copy here.
>>
>> Juergen, can you please test this?
>
> Works fine so far but it's too early to call it fixed.
> I'll let it run over the weekend and report back.

I've seen no problems, snapshots seem to work fine with the patch.



        Juergen

-- 
Juergen Kreileder, Blackdown Java-Linux Team
http://blog.blackdown.de/

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2006-04-24 18:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-20 21:11 [PATCH 1/9] device-mapper snapshot: load metadata on creation Alasdair G Kergon
2006-01-23  5:33 ` Andrew Morton
2006-01-23 15:34   ` Alasdair G Kergon
2006-04-15 11:19 ` BUG at drivers/md/kcopyd.c:146 (was: [PATCH 1/9] device-mapper snapshot: load metadata on creation) Juergen Kreileder
2006-04-20 17:05   ` Alasdair G Kergon
2006-04-20 19:48     ` Andrew Morton
2006-04-21 15:12       ` BUG at drivers/md/kcopyd.c:146 Juergen Kreileder
2006-04-24 18:31         ` Juergen Kreileder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).