linux-kernel.vger.kernel.org archive mirror
From: Shaohua Li <shli@fb.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: <linux-kernel@vger.kernel.org>, <axboe@fb.com>, <hch@lst.de>,
	<neilb@suse.de>
Subject: Re: [PATCH 5/5] blk-mq: make plug work for multiple disks and queues
Date: Mon, 4 May 2015 12:44:10 -0700	[thread overview]
Message-ID: <20150504194410.GA3365187@devbig257.prn2.facebook.com> (raw)
In-Reply-To: <x49618cyq44.fsf@segfault.boston.devel.redhat.com>

On Fri, May 01, 2015 at 04:55:39PM -0400, Jeff Moyer wrote:
> Shaohua Li <shli@fb.com> writes:
> 
> > The last patch made plug work for the multiple-queue case. However, it
> > only works for the single-disk case, because it assumes there is only
> > one request in the plug list. If a task is accessing multiple disks,
> > e.g. MD/DM, that assumption is wrong. Let blk_attempt_plug_merge()
> > record the request from the same queue.
> 
> I understand the desire to be performant, and that's why you
> piggy-backed the same_queue_rq onto this function, but it sure looks
> hackish.  I'd almost rather walk the list a second time than add
> warts like this.  To be perfectly clear, this is what I really don't
> like about it: we're relying on the nuance that we will only add a
> single request per queue to the plug_list in the mq case.  When you
> start to look at what happens for the sq case, it doesn't make much
> sense at all, as you'll just return the first entry in the list (last
> one visited walking the list backwards) that has the same request queue.
> That would be fine if this code were mq-specific, but it isn't.  It will just
> lead to confusion for others reviewing the code, and may trip up anyone
> modifying the mq plugging code.
> 
> I'll leave it up to Jens to decide if this is fit for inclusion.  The
> patch /functionally/ looks fine to me.

I really don't want to re-walk the list, for performance reasons.
I added some comments in the code; hopefully it's clearer now.
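
For illustration only, the "walk the list a second time" alternative would
look roughly like the sketch below (the helper name is made up and this is
not part of the patch). It leaves blk_attempt_plug_merge() untouched, but
adds a second pass over the plug list for every plugged bio:

/*
 * Hypothetical sketch: a separate walk over the plug list to find the
 * request from the same queue, instead of recording it during the
 * merge attempt.
 */
static struct request *blk_plug_find_same_queue_rq(struct blk_plug *plug,
						   struct request_queue *q)
{
	struct request *rq;

	/* newest requests sit at the tail, so walk backwards */
	list_for_each_entry_reverse(rq, &plug->mq_list, queuelist) {
		if (rq->q == q)
			return rq;
	}
	return NULL;
}

Recording same_queue_rq during the existing merge walk avoids that extra
pass, at the cost of the extra out parameter.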
 
> > Cc: Jens Axboe <axboe@fb.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Shaohua Li <shli@fb.com>
> > ---
> >  block/blk-core.c | 10 +++++++---
> >  block/blk-mq.c   | 11 ++++++-----
> >  block/blk.h      |  3 ++-
> >  3 files changed, 15 insertions(+), 9 deletions(-)
> >
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index d51ed61..a5e1574 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1521,7 +1521,8 @@ bool bio_attempt_front_merge(struct request_queue *q, struct request *req,
> >   * Caller must ensure !blk_queue_nomerges(q) beforehand.
> >   */
> >  bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
> > -			    unsigned int *request_count)
> > +			    unsigned int *request_count,
> > +			    struct request **same_queue_rq)
> >  {
> >  	struct blk_plug *plug;
> >  	struct request *rq;
> > @@ -1541,8 +1542,10 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
> >  	list_for_each_entry_reverse(rq, plug_list, queuelist) {
> >  		int el_ret;
> >  
> > -		if (rq->q == q)
> > +		if (rq->q == q) {
> >  			(*request_count)++;
> > +			*same_queue_rq = rq;
> > +		}
> 
> Out of the 3 callers of blk_attempt_plug_merge, only one will use the
> result, yet all of them have to provide the argument.  How about just
> handling NULL in there?

Ok, fixed. The updated patch also fixes a bug that was in the previous version.
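
With that change the hint is optional: call sites that do not care pass
NULL, and only blk_mq_make_request() asks for it. Distilled from the hunk
in the patch below, the core of the change is:

	if (rq->q == q) {
		(*request_count)++;
		if (same_queue_rq)
			*same_queue_rq = rq;
	}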


From 1218a50d39dbd11d0fbd15547516d1a4a92df0b5 Mon Sep 17 00:00:00 2001
Message-Id: <1218a50d39dbd11d0fbd15547516d1a4a92df0b5.1430766392.git.shli@fb.com>
In-Reply-To: <f3bfe60a013827942790c89d658f63c920653437.1430766392.git.shli@fb.com>
References: <f3bfe60a013827942790c89d658f63c920653437.1430766392.git.shli@fb.com>
From: Shaohua Li <shli@fb.com>
Date: Wed, 29 Apr 2015 16:58:20 -0700
Subject: [PATCH 5/5] blk-mq: make plug work for multiple disks and queues

The last patch made plug work for the multiple-queue case. However, it
only works for the single-disk case, because it assumes there is only
one request in the plug list. If a task is accessing multiple disks,
e.g. MD/DM, that assumption is wrong. Let blk_attempt_plug_merge()
record the request from the same queue.

Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 block/blk-core.c | 15 ++++++++++++---
 block/blk-mq.c   | 12 ++++++++----
 block/blk.h      |  3 ++-
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d51ed61..503927e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1521,7 +1521,8 @@ bool bio_attempt_front_merge(struct request_queue *q, struct request *req,
  * Caller must ensure !blk_queue_nomerges(q) beforehand.
  */
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
-			    unsigned int *request_count)
+			    unsigned int *request_count,
+			    struct request **same_queue_rq)
 {
 	struct blk_plug *plug;
 	struct request *rq;
@@ -1541,8 +1542,16 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 	list_for_each_entry_reverse(rq, plug_list, queuelist) {
 		int el_ret;
 
-		if (rq->q == q)
+		if (rq->q == q) {
 			(*request_count)++;
+			/*
+			 * Only the blk-mq multiple hardware queue case looks at
+			 * same_queue_rq; in that case there should be at most one
+			 * such rq in a queue.
+			 */
+			if (same_queue_rq)
+				*same_queue_rq = rq;
+		}
 
 		if (rq->q != q || !blk_rq_merge_ok(rq, bio))
 			continue;
@@ -1607,7 +1616,7 @@ void blk_queue_bio(struct request_queue *q, struct bio *bio)
 	 * any locks.
 	 */
 	if (!blk_queue_nomerges(q) &&
-	    blk_attempt_plug_merge(q, bio, &request_count))
+	    blk_attempt_plug_merge(q, bio, &request_count, NULL))
 		return;
 
 	spin_lock_irq(q->queue_lock);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6a6b6d0..be1169f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1269,6 +1269,7 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	struct request *rq;
 	unsigned int request_count = 0;
 	struct blk_plug *plug;
+	struct request *same_queue_rq = NULL;
 
 	blk_queue_bounce(q, &bio);
 
@@ -1278,7 +1279,7 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	}
 
 	if (likely(!is_flush_fua) && !blk_queue_nomerges(q) &&
-	    blk_attempt_plug_merge(q, bio, &request_count))
+	    blk_attempt_plug_merge(q, bio, &request_count, &same_queue_rq))
 		return;
 
 	rq = blk_mq_map_request(q, bio, &data);
@@ -1309,9 +1310,12 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 		 * issued. So the plug list will have one request at most
 		 */
 		if (plug) {
+			/*
+			 * The plug list might get flushed before this; if that
+			 * happens, same_queue_rq is stale and the plug list is empty.
+			 */
 			if (!list_empty(&plug->mq_list)) {
-				old_rq = list_first_entry(&plug->mq_list,
-					struct request, queuelist);
+				old_rq = same_queue_rq;
 				list_del_init(&old_rq->queuelist);
 			}
 			list_add_tail(&rq->queuelist, &plug->mq_list);
@@ -1360,7 +1364,7 @@ static void blk_sq_make_request(struct request_queue *q, struct bio *bio)
 	}
 
 	if (likely(!is_flush_fua) && !blk_queue_nomerges(q) &&
-	    blk_attempt_plug_merge(q, bio, &request_count))
+	    blk_attempt_plug_merge(q, bio, &request_count, NULL))
 		return;
 
 	rq = blk_mq_map_request(q, bio, &data);
diff --git a/block/blk.h b/block/blk.h
index 43b0361..aa8633c 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -78,7 +78,8 @@ bool bio_attempt_front_merge(struct request_queue *q, struct request *req,
 bool bio_attempt_back_merge(struct request_queue *q, struct request *req,
 			    struct bio *bio);
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
-			    unsigned int *request_count);
+			    unsigned int *request_count,
+			    struct request **same_queue_rq);
 
 void blk_account_io_start(struct request *req, bool new_io);
 void blk_account_io_completion(struct request *req, unsigned int bytes);
-- 
1.8.1


Thread overview: 22+ messages
2015-04-30 17:45 [PATCH 0/5] blk plug fixes Shaohua Li
2015-04-30 17:45 ` [PATCH 1/5] blk: clean up plug Shaohua Li
2015-05-01 17:11   ` Christoph Hellwig
2015-04-30 17:45 ` [PATCH 2/5] sched: always use blk_schedule_flush_plug in io_schedule_out Shaohua Li
2015-05-01 17:14   ` Christoph Hellwig
2015-05-01 18:05     ` Shaohua Li
2015-05-01 17:42   ` Jeff Moyer
2015-05-01 18:07     ` Jeff Moyer
2015-05-01 18:28       ` Shaohua Li
2015-05-01 19:37         ` Jeff Moyer
2015-04-30 17:45 ` [PATCH 3/5] blk-mq: fix plugging in blk_sq_make_request Shaohua Li
2015-05-01 17:16   ` Christoph Hellwig
2015-05-01 17:47     ` Jeff Moyer
2015-04-30 17:45 ` [PATCH 4/5] blk-mq: do limited block plug for multiple queue case Shaohua Li
2015-05-01 20:16   ` Jeff Moyer
2015-05-04 19:40     ` Shaohua Li
2015-05-04 19:46       ` Jens Axboe
2015-05-04 20:33         ` Shaohua Li
2015-05-04 20:35           ` Jens Axboe
2015-04-30 17:45 ` [PATCH 5/5] blk-mq: make plug work for multiple disks and queues Shaohua Li
2015-05-01 20:55   ` Jeff Moyer
2015-05-04 19:44     ` Shaohua Li [this message]
