All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mikulas Patocka <mpatocka@redhat.com>
To: device-mapper development <dm-devel@redhat.com>
Cc: Alasdair G Kergon <agk@redhat.com>, Jacky Kim <jcy_2008@163.com>,
	Milan Broz <mbroz@redhat.com>
Subject: Re: Re: 2.6.28.2 & dm-snapshot or kcopyd Oops
Date: Thu, 5 Feb 2009 22:26:33 -0500 (EST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0902052218040.5380@hs20-bc2-1.build.redhat.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0902051749150.5515@hs20-bc2-1.build.redhat.com>

> +		push(&job->kc->complete_jobs, job);
> +		wake(job->kc);

I still found one bug with my patch. When I push the job to complete_jobs 
queue, it may be immediatelly taken and cease to exist. So job->kc 
dereference is wrong.

A new patch:



Under specific circumstances, kcopyd callback could be called from
the thread that called dm_kcopyd_copy not from kcopyd workqueue.

dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()

This code path is taken if thread calling dm_kcopyd_copy is delayed due to
scheduling inside split_job/segment_complete and the subjobs complete before
the loop in split_job completes.

Snapshots depend on the fact that callbacks are called from the singlethreaded
kcopyd workqueue and expect that there is no racing between individual
callbacks. The racing between callbacks can lead to corruption of exception
store and it can also cause that exception store callbacks are called twice
for the same exception --- a likely reason for crashes inside
pending_complete() / remove_exception().


When I reviewed kcopyd further, I found four total problems:

1. job->fn being called from the thread that submitted the job (see above).

2. job->fn(read_err, write_err, job->context); in segment_complete
reports the error of the last subjob, not the union of all errors.

3. This comment is bogus (the jobs are not cancelable), plus the code
is prone to deadlock because the callback may allocate another job from
the same mempool.
/*
 * To avoid a race we must keep the job around
 * until after the notify function has completed.
 * Otherwise the client may try and stop the job
 * after we've completed.
 */
job->fn(read_err, write_err, job->context);
mempool_free(job, job->kc->job_pool);

4. Master jobs and subjobs are allocated from the same mempool.
Completion and freeing of a master job depends on successful allocation of
all subjobs -> deadlock.


This patch moves completion of master jobs to run_complete_job (being called
from the kcopyd workqueue). The patch fixes problems 1, 2, 3.

Problem 4 will need more changes and another patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-kcopyd.c |   18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

Index: linux-2.6.29-rc3-devel/drivers/md/dm-kcopyd.c
===================================================================
--- linux-2.6.29-rc3-devel.orig/drivers/md/dm-kcopyd.c	2009-02-05 23:01:50.000000000 +0100
+++ linux-2.6.29-rc3-devel/drivers/md/dm-kcopyd.c	2009-02-06 04:15:52.000000000 +0100
@@ -297,7 +297,8 @@ static int run_complete_job(struct kcopy
 	dm_kcopyd_notify_fn fn = job->fn;
 	struct dm_kcopyd_client *kc = job->kc;
 
-	kcopyd_put_pages(kc, job->pages);
+	if (job->pages)
+		kcopyd_put_pages(kc, job->pages);
 	mempool_free(job, kc->job_pool);
 	fn(read_err, write_err, context);
 
@@ -461,6 +462,7 @@ static void segment_complete(int read_er
 	sector_t progress = 0;
 	sector_t count = 0;
 	struct kcopyd_job *job = (struct kcopyd_job *) context;
+	struct dm_kcopyd_client *kc = job->kc;
 
 	mutex_lock(&job->lock);
 
@@ -490,7 +492,7 @@ static void segment_complete(int read_er
 
 	if (count) {
 		int i;
-		struct kcopyd_job *sub_job = mempool_alloc(job->kc->job_pool,
+		struct kcopyd_job *sub_job = mempool_alloc(kc->job_pool,
 							   GFP_NOIO);
 
 		*sub_job = *job;
@@ -508,14 +510,8 @@ static void segment_complete(int read_er
 
 	} else if (atomic_dec_and_test(&job->sub_jobs)) {
 
-		/*
-		 * To avoid a race we must keep the job around
-		 * until after the notify function has completed.
-		 * Otherwise the client may try and stop the job
-		 * after we've completed.
-		 */
-		job->fn(read_err, write_err, job->context);
-		mempool_free(job, job->kc->job_pool);
+		push(&kc->complete_jobs, job);
+		wake(kc);
 	}
 }
 
@@ -528,6 +524,8 @@ static void split_job(struct kcopyd_job 
 {
 	int i;
 
+	atomic_inc(&job->kc->nr_jobs);
+
 	atomic_set(&job->sub_jobs, SPLIT_COUNT);
 	for (i = 0; i < SPLIT_COUNT; i++)
 		segment_complete(0, 0u, job);

  reply	other threads:[~2009-02-06  3:26 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <200901231836184950432@163.com>
     [not found] ` <Pine.LNX.4.64.0901231514590.9877@hs20-bc2-1.build.redhat.com>
     [not found]   ` <200901301800149019891@163.com>
     [not found]     ` <Pine.LNX.4.64.0901301533390.5744@hs20-bc2-1.build.redhat.com>
     [not found]       ` <200901311416111648168@163.com>
     [not found]         ` <Pine.LNX.4.64.0902031505180.28433@hs20-bc2-1.build.redhat.com>
     [not found]           ` <200902051113011850587@163.com>
2009-02-05 22:52             ` 2.6.28.2 & dm-snapshot or kcopyd Oops Mikulas Patocka
2009-02-06  3:26               ` Mikulas Patocka [this message]
     [not found]                 ` <200902061924107769776@163.com>
2009-02-06 23:17                   ` Re: " Mikulas Patocka
     [not found]                     ` <200902072041046488539@163.com>
2009-02-09  8:13                       ` Mikulas Patocka
     [not found]                         ` <200902101305353713189@163.com>
2009-02-10 23:36                           ` Mikulas Patocka
2009-02-11 10:54                             ` Mikulas Patocka
     [not found]                               ` <200902121131140773281@163.com>
2009-02-13  6:41                                 ` Mikulas Patocka
2009-02-13  7:31                                   ` Mikulas Patocka
     [not found]                                     ` <200902131912139165975@163.com>
2009-02-16  7:14                                       ` Mikulas Patocka
     [not found]                                         ` <200902161703497923912@163.com>
2009-02-16 16:18                                           ` Mikulas Patocka
2009-02-16  7:40                                       ` Mikulas Patocka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0902052218040.5380@hs20-bc2-1.build.redhat.com \
    --to=mpatocka@redhat.com \
    --cc=agk@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=jcy_2008@163.com \
    --cc=mbroz@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.