All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: axboe@kernel.dk, tytso@mit.edu, djwong@us.ibm.com,
	shli@kernel.org, neilb@suse.de, adilger.kernel@dilger.ca,
	jack@suse.cz, linux-kernel@vger.kernel.org, kmannth@us.ibm.com,
	cmm@us.ibm.com, linux-ext4@vger.kernel.org, rwheeler@redhat.com,
	hch@lst.de, josef@redhat.com, jmoyer@redhat.com
Subject: [KNOWN BUGGY RFC PATCH 4/3] block: skip elevator initialization for flush requests
Date: Tue, 25 Jan 2011 15:41:58 -0500	[thread overview]
Message-ID: <20110125204158.GA3013@redhat.com> (raw)
In-Reply-To: <1295625598-15203-4-git-send-email-tj@kernel.org>

Hi Tejun,

On Fri, Jan 21 2011 at 10:59am -0500,
Tejun Heo <tj@kernel.org> wrote:
> 
> * As flush requests are never put on the IO scheduler, request fields
>   used for flush share space with rq->rb_node.  rq->completion_data is
>   moved out of the union.  This increases the request size by one
>   pointer.
> 
>   As rq->elevator_private* are used only by the iosched too, it is
>   possible to reduce the request size further.  However, to do that,
>   we need to modify request allocation path such that iosched data is
>   not allocated for flush requests.

I decided to take a crack at using rq->elevator_private* and came up
with the following patch.

Unfortunately, in testing I found that flush requests that have data do
in fact eventually get added to the queue as normal requests, via:
1) "data but flush is not necessary" case in blk_insert_flush
2) REQ_FSEQ_DATA case in blk_flush_complete_seq

I know this because in my following get_request() change to _not_ call
elv_set_request() for flush requests hit cfq_put_request()'s
BUG_ON(!cfqq->allocated[rw]).  cfqq->allocated[rw] gets set via
elv_set_request()'s call to cfq_set_request().

So this seems to call in to question the running theory that flush
requests can share 'struct request' space with elevator-specific members
(via union) -- be it rq->rb_node or rq->elevator_private*.

Please advise, thanks!
Mike



From: Mike Snitzer <snitzer@redhat.com>

Skip elevator initialization for flush requests by passing priv=0 to
blk_alloc_request() in get_request().  As such elv_set_request() is
never called for flush requests.

Move elevator_private* into 'struct elevator' and have the flush fields
share a union with it.

Reclaim the space lost in 'struct request' by moving 'completion_data'
back in to the union with 'rb_node'.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-core.c       |   13 +++++++++----
 block/cfq-iosched.c    |   18 +++++++++---------
 block/elevator.c       |    2 +-
 include/linux/blkdev.h |   26 +++++++++++++++-----------
 4 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 72dd23b..f507888 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -764,7 +764,7 @@ static struct request *get_request(struct request_queue *q, int rw_flags,
 	struct request_list *rl = &q->rq;
 	struct io_context *ioc = NULL;
 	const bool is_sync = rw_is_sync(rw_flags) != 0;
-	int may_queue, priv;
+	int may_queue, priv = 0;
 
 	may_queue = elv_may_queue(q, rw_flags);
 	if (may_queue == ELV_MQUEUE_NO)
@@ -808,9 +808,14 @@ static struct request *get_request(struct request_queue *q, int rw_flags,
 	rl->count[is_sync]++;
 	rl->starved[is_sync] = 0;
 
-	priv = !test_bit(QUEUE_FLAG_ELVSWITCH, &q->queue_flags);
-	if (priv)
-		rl->elvpriv++;
+	/*
+	 * Skip elevator initialization for flush requests
+	 */
+	if (!(bio && (bio->bi_rw & (REQ_FLUSH | REQ_FUA)))) {
+		priv = !test_bit(QUEUE_FLAG_ELVSWITCH, &q->queue_flags);
+		if (priv)
+			rl->elvpriv++;
+	}
 
 	if (blk_queue_io_stat(q))
 		rw_flags |= REQ_IO_STAT;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 501ffdf..580ae0a 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -54,9 +54,9 @@ static const int cfq_hist_divisor = 4;
 #define CFQQ_SEEKY(cfqq)	(hweight32(cfqq->seek_history) > 32/8)
 
 #define RQ_CIC(rq)		\
-	((struct cfq_io_context *) (rq)->elevator_private)
-#define RQ_CFQQ(rq)		(struct cfq_queue *) ((rq)->elevator_private2)
-#define RQ_CFQG(rq)		(struct cfq_group *) ((rq)->elevator_private3)
+	((struct cfq_io_context *) (rq)->elevator.private)
+#define RQ_CFQQ(rq)		(struct cfq_queue *) ((rq)->elevator.private2)
+#define RQ_CFQG(rq)		(struct cfq_group *) ((rq)->elevator.private3)
 
 static struct kmem_cache *cfq_pool;
 static struct kmem_cache *cfq_ioc_pool;
@@ -3609,12 +3609,12 @@ static void cfq_put_request(struct request *rq)
 
 		put_io_context(RQ_CIC(rq)->ioc);
 
-		rq->elevator_private = NULL;
-		rq->elevator_private2 = NULL;
+		rq->elevator.private = NULL;
+		rq->elevator.private2 = NULL;
 
 		/* Put down rq reference on cfqg */
 		cfq_put_cfqg(RQ_CFQG(rq));
-		rq->elevator_private3 = NULL;
+		rq->elevator.private3 = NULL;
 
 		cfq_put_queue(cfqq);
 	}
@@ -3702,9 +3702,9 @@ new_queue:
 
 	cfqq->allocated[rw]++;
 	cfqq->ref++;
-	rq->elevator_private = cic;
-	rq->elevator_private2 = cfqq;
-	rq->elevator_private3 = cfq_ref_get_cfqg(cfqq->cfqg);
+	rq->elevator.private = cic;
+	rq->elevator.private2 = cfqq;
+	rq->elevator.private3 = cfq_ref_get_cfqg(cfqq->cfqg);
 
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
diff --git a/block/elevator.c b/block/elevator.c
index 270e097..02b66be 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -764,7 +764,7 @@ int elv_set_request(struct request_queue *q, struct request *rq, gfp_t gfp_mask)
 	if (e->ops->elevator_set_req_fn)
 		return e->ops->elevator_set_req_fn(q, rq, gfp_mask);
 
-	rq->elevator_private = NULL;
+	rq->elevator.private = NULL;
 	return 0;
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8a082a5..0c569ec 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -99,25 +99,29 @@ struct request {
 	/*
 	 * The rb_node is only used inside the io scheduler, requests
 	 * are pruned when moved to the dispatch queue. So let the
-	 * flush fields share space with the rb_node.
+	 * completion_data share space with the rb_node.
 	 */
 	union {
 		struct rb_node rb_node;	/* sort/lookup */
-		struct {
-			unsigned int			seq;
-			struct list_head		list;
-		} flush;
+		void *completion_data;
 	};
 
-	void *completion_data;
-
 	/*
 	 * Three pointers are available for the IO schedulers, if they need
-	 * more they have to dynamically allocate it.
+	 * more they have to dynamically allocate it.  Let the flush fields
+	 * share space with these three pointers.
 	 */
-	void *elevator_private;
-	void *elevator_private2;
-	void *elevator_private3;
+	union {
+		struct {
+			void *private;
+			void *private2;
+			void *private3;
+		} elevator;
+		struct {
+			unsigned int			seq;
+			struct list_head		list;
+		} flush;
+	};
 
 	struct gendisk *rq_disk;
 	struct hd_struct *part;

WARNING: multiple messages have this Message-ID (diff)
From: Mike Snitzer <snitzer@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: axboe@kernel.dk, tytso@mit.edu, djwong@us.ibm.com,
	shli@kernel.org, neilb@suse.de, adilger.kernel@dilger.ca,
	jack@suse.cz, linux-kernel@vger.kernel.org, kmannth@us.ibm.com,
	cmm@us.ibm.com, linux-ext4@vger.kernel.org, rwheeler@redhat.com,
	hch@lst.de, josef@redhat.com, jmoyer@redhat.com
Subject: [KNOWN BUGGY RFC PATCH 4/3] block: skip elevator initialization for flush requests
Date: Tue, 25 Jan 2011 15:41:58 -0500	[thread overview]
Message-ID: <20110125204158.GA3013@redhat.com> (raw)
In-Reply-To: <1295625598-15203-4-git-send-email-tj@kernel.org>

Hi Tejun,

On Fri, Jan 21 2011 at 10:59am -0500,
Tejun Heo <tj@kernel.org> wrote:
> 
> * As flush requests are never put on the IO scheduler, request fields
>   used for flush share space with rq->rb_node.  rq->completion_data is
>   moved out of the union.  This increases the request size by one
>   pointer.
> 
>   As rq->elevator_private* are used only by the iosched too, it is
>   possible to reduce the request size further.  However, to do that,
>   we need to modify request allocation path such that iosched data is
>   not allocated for flush requests.

I decided to take a crack at using rq->elevator_private* and came up
with the following patch.

Unfortunately, in testing I found that flush requests that have data do
in fact eventually get added to the queue as normal requests, via:
1) "data but flush is not necessary" case in blk_insert_flush
2) REQ_FSEQ_DATA case in blk_flush_complete_seq

I know this because in my following get_request() change to _not_ call
elv_set_request() for flush requests hit cfq_put_request()'s
BUG_ON(!cfqq->allocated[rw]).  cfqq->allocated[rw] gets set via
elv_set_request()'s call to cfq_set_request().

So this seems to call in to question the running theory that flush
requests can share 'struct request' space with elevator-specific members
(via union) -- be it rq->rb_node or rq->elevator_private*.

Please advise, thanks!
Mike



From: Mike Snitzer <snitzer@redhat.com>

Skip elevator initialization for flush requests by passing priv=0 to
blk_alloc_request() in get_request().  As such elv_set_request() is
never called for flush requests.

Move elevator_private* into 'struct elevator' and have the flush fields
share a union with it.

Reclaim the space lost in 'struct request' by moving 'completion_data'
back in to the union with 'rb_node'.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-core.c       |   13 +++++++++----
 block/cfq-iosched.c    |   18 +++++++++---------
 block/elevator.c       |    2 +-
 include/linux/blkdev.h |   26 +++++++++++++++-----------
 4 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 72dd23b..f507888 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -764,7 +764,7 @@ static struct request *get_request(struct request_queue *q, int rw_flags,
 	struct request_list *rl = &q->rq;
 	struct io_context *ioc = NULL;
 	const bool is_sync = rw_is_sync(rw_flags) != 0;
-	int may_queue, priv;
+	int may_queue, priv = 0;
 
 	may_queue = elv_may_queue(q, rw_flags);
 	if (may_queue == ELV_MQUEUE_NO)
@@ -808,9 +808,14 @@ static struct request *get_request(struct request_queue *q, int rw_flags,
 	rl->count[is_sync]++;
 	rl->starved[is_sync] = 0;
 
-	priv = !test_bit(QUEUE_FLAG_ELVSWITCH, &q->queue_flags);
-	if (priv)
-		rl->elvpriv++;
+	/*
+	 * Skip elevator initialization for flush requests
+	 */
+	if (!(bio && (bio->bi_rw & (REQ_FLUSH | REQ_FUA)))) {
+		priv = !test_bit(QUEUE_FLAG_ELVSWITCH, &q->queue_flags);
+		if (priv)
+			rl->elvpriv++;
+	}
 
 	if (blk_queue_io_stat(q))
 		rw_flags |= REQ_IO_STAT;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 501ffdf..580ae0a 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -54,9 +54,9 @@ static const int cfq_hist_divisor = 4;
 #define CFQQ_SEEKY(cfqq)	(hweight32(cfqq->seek_history) > 32/8)
 
 #define RQ_CIC(rq)		\
-	((struct cfq_io_context *) (rq)->elevator_private)
-#define RQ_CFQQ(rq)		(struct cfq_queue *) ((rq)->elevator_private2)
-#define RQ_CFQG(rq)		(struct cfq_group *) ((rq)->elevator_private3)
+	((struct cfq_io_context *) (rq)->elevator.private)
+#define RQ_CFQQ(rq)		(struct cfq_queue *) ((rq)->elevator.private2)
+#define RQ_CFQG(rq)		(struct cfq_group *) ((rq)->elevator.private3)
 
 static struct kmem_cache *cfq_pool;
 static struct kmem_cache *cfq_ioc_pool;
@@ -3609,12 +3609,12 @@ static void cfq_put_request(struct request *rq)
 
 		put_io_context(RQ_CIC(rq)->ioc);
 
-		rq->elevator_private = NULL;
-		rq->elevator_private2 = NULL;
+		rq->elevator.private = NULL;
+		rq->elevator.private2 = NULL;
 
 		/* Put down rq reference on cfqg */
 		cfq_put_cfqg(RQ_CFQG(rq));
-		rq->elevator_private3 = NULL;
+		rq->elevator.private3 = NULL;
 
 		cfq_put_queue(cfqq);
 	}
@@ -3702,9 +3702,9 @@ new_queue:
 
 	cfqq->allocated[rw]++;
 	cfqq->ref++;
-	rq->elevator_private = cic;
-	rq->elevator_private2 = cfqq;
-	rq->elevator_private3 = cfq_ref_get_cfqg(cfqq->cfqg);
+	rq->elevator.private = cic;
+	rq->elevator.private2 = cfqq;
+	rq->elevator.private3 = cfq_ref_get_cfqg(cfqq->cfqg);
 
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
diff --git a/block/elevator.c b/block/elevator.c
index 270e097..02b66be 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -764,7 +764,7 @@ int elv_set_request(struct request_queue *q, struct request *rq, gfp_t gfp_mask)
 	if (e->ops->elevator_set_req_fn)
 		return e->ops->elevator_set_req_fn(q, rq, gfp_mask);
 
-	rq->elevator_private = NULL;
+	rq->elevator.private = NULL;
 	return 0;
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8a082a5..0c569ec 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -99,25 +99,29 @@ struct request {
 	/*
 	 * The rb_node is only used inside the io scheduler, requests
 	 * are pruned when moved to the dispatch queue. So let the
-	 * flush fields share space with the rb_node.
+	 * completion_data share space with the rb_node.
 	 */
 	union {
 		struct rb_node rb_node;	/* sort/lookup */
-		struct {
-			unsigned int			seq;
-			struct list_head		list;
-		} flush;
+		void *completion_data;
 	};
 
-	void *completion_data;

  parent reply	other threads:[~2011-01-25 20:43 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-21 15:59 [PATCHSET] block: reimplement FLUSH/FUA to support merge Tejun Heo
2011-01-21 15:59 ` Tejun Heo
2011-01-21 15:59 ` [PATCH 1/3] block: add REQ_FLUSH_SEQ Tejun Heo
2011-01-21 15:59   ` Tejun Heo
2011-01-21 15:59 ` [PATCH 2/3] block: improve flush bio completion Tejun Heo
2011-01-21 15:59   ` Tejun Heo
2011-01-21 15:59 ` [PATCH 3/3] block: reimplement FLUSH/FUA to support merge Tejun Heo
2011-01-21 18:56   ` Vivek Goyal
2011-01-21 19:19     ` Vivek Goyal
2011-01-23 10:25     ` Tejun Heo
2011-01-23 10:29       ` Tejun Heo
2011-01-24 20:31       ` Darrick J. Wong
2011-01-25 10:21         ` Tejun Heo
2011-01-25 11:39           ` Jens Axboe
2011-03-23 23:37             ` Darrick J. Wong
2011-01-25 22:56           ` Darrick J. Wong
2011-01-22  0:49   ` Mike Snitzer
2011-01-23 10:31     ` Tejun Heo
2011-01-25 20:46       ` Vivek Goyal
2011-01-25 21:04         ` Mike Snitzer
2011-01-23 10:48   ` [PATCH UPDATED " Tejun Heo
2011-01-23 10:48     ` Tejun Heo
2011-01-25 20:41   ` Mike Snitzer [this message]
2011-01-25 20:41     ` [KNOWN BUGGY RFC PATCH 4/3] block: skip elevator initialization for flush requests Mike Snitzer
2011-01-25 21:55     ` Mike Snitzer
2011-01-26  5:27       ` [RFC PATCH 4/3] block: skip elevator initialization for flush requests -- was never BUGGY relative to upstream Mike Snitzer
2011-01-26 10:03     ` [KNOWN BUGGY RFC PATCH 4/3] block: skip elevator initialization for flush requests Tejun Heo
2011-01-26 10:05       ` Tejun Heo
2011-02-01 17:38       ` [RFC " Mike Snitzer
2011-02-01 18:52         ` Tejun Heo
2011-02-01 22:46           ` [PATCH v2 1/2] " Mike Snitzer
2011-02-02 21:51             ` Vivek Goyal
2011-02-02 22:06               ` Mike Snitzer
2011-02-02 22:55             ` [PATCH v3 1/2] block: skip elevator data " Mike Snitzer
2011-02-03  9:28               ` Tejun Heo
2011-02-03 14:48                 ` [PATCH v4 " Mike Snitzer
2011-02-03 13:24               ` [PATCH v3 " Jens Axboe
2011-02-03 13:38                 ` Tejun Heo
2011-02-04 15:04                   ` Vivek Goyal
2011-02-04 15:08                     ` Tejun Heo
2011-02-04 16:58                     ` [PATCH v5 " Mike Snitzer
2011-02-03 14:54                 ` [PATCH v3 " Mike Snitzer
2011-02-01 22:46           ` [PATCH v2 2/2] block: share request flush fields with elevator_private Mike Snitzer
2011-02-01 22:46             ` Mike Snitzer
2011-02-02 21:52             ` Vivek Goyal
2011-02-03  9:24             ` Tejun Heo
2011-02-11 10:08             ` Jens Axboe
2011-01-21 15:59 ` [PATCH 3/3] block: reimplement FLUSH/FUA to support merge Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110125204158.GA3013@redhat.com \
    --to=snitzer@redhat.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=axboe@kernel.dk \
    --cc=cmm@us.ibm.com \
    --cc=djwong@us.ibm.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jmoyer@redhat.com \
    --cc=josef@redhat.com \
    --cc=kmannth@us.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=rwheeler@redhat.com \
    --cc=shli@kernel.org \
    --cc=tj@kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.