From: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>,
	Kiyoshi Ueda <k-ueda@ct.jp.nec.com>, Jan Kara <jack@suse.cz>,
	linux-scsi@vger.kernel.org, jaxboe@fusionio.com,
	linux-raid@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	James.Bottomley@suse.de, konishi.ryusuke@lab.ntt.co.jp,
	tj@kernel.org, tytso@mit.edu, swhiteho@redhat.com,
	chris.mason@oracle.com, dm-devel@redhat.com
Subject: Re: [PATCH, RFC 2/2] dm: support REQ_FLUSH directly
Date: Fri, 27 Aug 2010 10:43:46 +0900
Message-ID: <4C771852.3050500@ce.jp.nec.com>
In-Reply-To: <20100826225024.GB17832@redhat.com>

Hi Mike,

(08/27/10 07:50), Mike Snitzer wrote:
>> Special casing is necessary because device-mapper may have to
>> send multiple copies of REQ_FLUSH request to multiple
>> targets, while normal request is just sent to single target.
> 
> Yes, request-based DM is meant to have all the same capabilities as
> bio-based DM.  So in theory it should support multiple targets but in
> practice it doesn't.  DM's multipath target is the only consumer of
> request-based DM and it only ever clones a single flush request
> (num_flush_requests = 1).

This is correct. But,

> So why not remove all of request-based DM's barrier infrastructure and
> simply rely on the revised block layer to sequence the FLUSH+WRITE
> request for request-based DM?
> 
> Given that we do not have a request-based DM target that requires
> cloning multiple FLUSH requests its unused code that is delaying DM
> support for the new FLUSH+FUA work (NOTE: bio-based DM obviously still
> needs work in this area).

the above-mentioned 'special casing' is not the hard part.
See the attached patch.

The hard part is discerning the error type of a flush failure,
as discussed in the other thread.
And as Kiyoshi wrote, that's an existing problem, so it can
be worked on separately from the new FLUSH work.

Thanks,
-- 
Jun'ichi Nomura, NEC Corporation


Cope with new sequencing of flush requests in the block layer.

Request-based dm used to depend on the barrier sequencer in the block layer
to guarantee that no other requests are in flight when a flush request is
dispatched. So it reused the md->pending counter to check completion of
cloned flush requests.

This patch adds a separate pending counter for flush requests
as preparation for the new FLUSH work, where a flush request can be
dispatched while other normal requests are in flight.
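
For illustration only, the counting scheme can be modelled in userspace
roughly as below. This is a sketch, not the kernel code: struct md_model
and the md_model_*() helpers are hypothetical names, C11 atomics stand in
for the kernel's atomic_t, and the completion path tests the value
returned by the decrement instead of doing a separate read as the patch
does. The point it shows is that a flush waiter is satisfied as soon as
the flush clones are done, even if normal requests are still in flight.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two counters (not the kernel struct). */
struct md_model {
	atomic_int pending;       /* all in-flight requests, flush clones included */
	atomic_int flush_pending; /* in-flight flush clones only */
};

static void md_model_init(struct md_model *md)
{
	atomic_init(&md->pending, 0);
	atomic_init(&md->flush_pending, 0);
}

/* Dispatch: every request is counted in pending; flush clones also in flush_pending. */
static void md_model_dispatch(struct md_model *md, bool is_flush)
{
	atomic_fetch_add(&md->pending, 1);
	if (is_flush)
		atomic_fetch_add(&md->flush_pending, 1);
}

/*
 * Completion: returns true when this was the last outstanding flush clone,
 * i.e. the point at which a flush waiter would be woken up.
 */
static bool md_model_complete(struct md_model *md, bool is_flush)
{
	int prev = atomic_fetch_sub(&md->pending, 1);

	assert(prev > 0);
	if (is_flush) {
		prev = atomic_fetch_sub(&md->flush_pending, 1);
		assert(prev > 0);
		return prev == 1;
	}
	return false;
}

/*
 * Wait predicate: a flush waiter only needs flush_pending to reach zero;
 * any waiter is also satisfied once nothing at all is in flight.
 */
static bool md_model_wait_done(struct md_model *md, bool for_flush)
{
	if (for_flush && atomic_load(&md->flush_pending) == 0)
		return true;
	return atomic_load(&md->pending) == 0;
}
```

With one normal request and one flush clone dispatched, completing the
flush clone makes md_model_wait_done(md, true) true while
md_model_wait_done(md, false) stays false until the normal request
completes as well.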

Index: linux-2.6.36-rc2/drivers/md/dm.c
===================================================================
--- linux-2.6.36-rc2.orig/drivers/md/dm.c
+++ linux-2.6.36-rc2/drivers/md/dm.c
@@ -162,6 +162,7 @@ struct mapped_device {
 
 	/* A pointer to the currently processing pre/post flush request */
 	struct request *flush_request;
+	atomic_t flush_pending;
 
 	/*
 	 * The current mapping.
@@ -777,10 +778,16 @@ static void store_barrier_error(struct m
  * the md may be freed in dm_put() at the end of this function.
  * Or do dm_get() before calling this function and dm_put() later.
  */
-static void rq_completed(struct mapped_device *md, int rw, int run_queue)
+static void rq_completed(struct mapped_device *md, int rw, int run_queue, bool is_flush)
 {
 	atomic_dec(&md->pending[rw]);
 
+	if (is_flush) {
+		atomic_dec(&md->flush_pending);
+		if (!atomic_read(&md->flush_pending))
+			wake_up(&md->wait);
+	}
+
 	/* nudge anyone waiting on suspend queue */
 	if (!md_in_flight(md))
 		wake_up(&md->wait);
@@ -837,7 +844,7 @@ static void dm_end_request(struct reques
 	} else
 		blk_end_request_all(rq, error);
 
-	rq_completed(md, rw, run_queue);
+	rq_completed(md, rw, run_queue, is_barrier);
 }
 
 static void dm_unprep_request(struct request *rq)
@@ -880,7 +887,7 @@ void dm_requeue_unmapped_request(struct 
 	blk_requeue_request(q, rq);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
-	rq_completed(md, rw, 0);
+	rq_completed(md, rw, 0, false);
 }
 EXPORT_SYMBOL_GPL(dm_requeue_unmapped_request);
 
@@ -1993,6 +2000,7 @@ static struct mapped_device *alloc_dev(i
 
 	atomic_set(&md->pending[0], 0);
 	atomic_set(&md->pending[1], 0);
+	atomic_set(&md->flush_pending, 0);
 	init_waitqueue_head(&md->wait);
 	INIT_WORK(&md->work, dm_wq_work);
 	INIT_WORK(&md->barrier_work, dm_rq_barrier_work);
@@ -2375,7 +2383,7 @@ void dm_put(struct mapped_device *md)
 }
 EXPORT_SYMBOL_GPL(dm_put);
 
-static int dm_wait_for_completion(struct mapped_device *md, int interruptible)
+static int dm_wait_for_completion(struct mapped_device *md, int interruptible, bool for_flush)
 {
 	int r = 0;
 	DECLARE_WAITQUEUE(wait, current);
@@ -2388,6 +2396,8 @@ static int dm_wait_for_completion(struct
 		set_current_state(interruptible);
 
 		smp_mb();
+		if (for_flush && !atomic_read(&md->flush_pending))
+			break;
 		if (!md_in_flight(md))
 			break;
 
@@ -2408,14 +2418,14 @@ static int dm_wait_for_completion(struct
 
 static void dm_flush(struct mapped_device *md)
 {
-	dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
+	dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE, false);
 
 	bio_init(&md->barrier_bio);
 	md->barrier_bio.bi_bdev = md->bdev;
 	md->barrier_bio.bi_rw = WRITE_BARRIER;
 	__split_and_process_bio(md, &md->barrier_bio);
 
-	dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
+	dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE, false);
 }
 
 static void process_barrier(struct mapped_device *md, struct bio *bio)
@@ -2512,11 +2522,12 @@ static int dm_rq_barrier(struct mapped_d
 			clone = clone_rq(md->flush_request, md, GFP_NOIO);
 			dm_rq_set_target_request_nr(clone, j);
 			atomic_inc(&md->pending[rq_data_dir(clone)]);
+			atomic_inc(&md->flush_pending);
 			map_request(ti, clone, md);
 		}
 	}
 
-	dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
+	dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE, true);
 	dm_table_put(map);
 
 	return md->barrier_error;
@@ -2705,7 +2716,7 @@ int dm_suspend(struct mapped_device *md,
 	 * We call dm_wait_for_completion to wait for all existing requests
 	 * to finish.
 	 */
-	r = dm_wait_for_completion(md, TASK_INTERRUPTIBLE);
+	r = dm_wait_for_completion(md, TASK_INTERRUPTIBLE, false);
 
 	down_write(&md->io_lock);
 	if (noflush)
