* [PATCH 0/2] Enable polling on stackable devices
@ 2020-01-26  4:41 Andrzej Jakowski
  2020-01-26  4:41 ` [PATCH 1/2] block: introduce polling on bio level Andrzej Jakowski
  2020-01-26  4:41 ` [PATCH 2/2] md: enable io polling Andrzej Jakowski
  0 siblings, 2 replies; 10+ messages in thread
From: Andrzej Jakowski @ 2020-01-26  4:41 UTC (permalink / raw)
  To: axboe, song; +Cc: linux-block, linux-raid, Andrzej Jakowski

IO polling is available on blk-mq devices. Currently it is not possible to
perform IO polling on stackable devices like MD.

We have built a prototype exposing a new callback that bio-based devices can
use for IO polling. That callback is available on the request_queue, so
bio-based drivers can provide a handler for it.

IO polling support is provided for the MD RAID-0 level.

Andrzej Jakowski (2):
  block: introduce polling on bio level
  md: enable io polling

 block/blk-core.c       |  3 ++-
 block/blk-mq.c         | 26 ++++++++++++++++++++++++++
 drivers/md/md.c        | 39 +++++++++++++++++++++++++++++++++++----
 include/linux/blkdev.h |  2 ++
 4 files changed, 65 insertions(+), 5 deletions(-)

-- 
2.20.1



* [PATCH 1/2] block: introduce polling on bio level
  2020-01-26  4:41 [PATCH 0/2] Enable polling on stackable devices Andrzej Jakowski
@ 2020-01-26  4:41 ` Andrzej Jakowski
  2020-01-30  4:01   ` Jens Axboe
  2020-01-31  6:34   ` Christoph Hellwig
  2020-01-26  4:41 ` [PATCH 2/2] md: enable io polling Andrzej Jakowski
  1 sibling, 2 replies; 10+ messages in thread
From: Andrzej Jakowski @ 2020-01-26  4:41 UTC (permalink / raw)
  To: axboe, song; +Cc: linux-block, linux-raid, Andrzej Jakowski, Artur Paszkiewicz

In the current implementation it is possible to perform IO polling only
if the underlying device is a blk-mq device. It is not possible, however,
to do polled IO on stackable block devices like MD.

We have explored two paths for enabling IO polling on bio-based devices.
The first idea revolved around converting MD to the blk-mq interface, but
it proved to be complex. For the second idea we built a prototype which
introduces a new operation on the request_queue - bio_poll_fn. bio_poll_fn,
if provided by a stackable block driver, is called when the user polls for
IO completion. The bio_poll_fn approach was initially discussed and
confirmed with Jens.
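
For illustration, the intended usage from a bio-based driver looks roughly
like this (simplified sketch of what the MD patch in this series does;
member_queue stands in for an underlying blk-mq queue):

/* driver-provided handler: poll the underlying queue(s) for the cookie */
static int my_bio_poll(struct request_queue *q, blk_qc_t cookie)
{
	/* >0: completions found, 0: nothing yet, <0: error */
	return blk_poll(member_queue, cookie, false);
}

/* registered once when the queue is set up */
q->bio_poll_fn = my_bio_poll;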

We collected performance data on a RAID-0 volume built on top of two
P4800X devices, with polling on and off. Here are the results:
Polling      QD       Latency         IOPS
----------------------------------------------
off           1       12.03us         78800
off           2       13.27us        144000
off           4       15.83us        245000
off           8       31.14us        253000
off          16       63.03us        252000
off          32      128.89us        247000
off          64      259.10us        246000
off         128      517.27us        247000
on            1        9.00us        108000
on            2        9.07us        214000
on            4       12.00us        327000
on            8       21.43us        369000
on           16       43.18us        369000
on           32       85.75us        372000
on           64      169.87us        376000
on          128      346.15us        370000

Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
---
 block/blk-core.c       |  3 ++-
 block/blk-mq.c         | 26 ++++++++++++++++++++++++++
 include/linux/blkdev.h |  2 ++
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index e0a094fddee5..5d8706dfebe0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -889,7 +889,8 @@ generic_make_request_checks(struct bio *bio)
 	 * if queue is not a request based queue.
 	 */
 	if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q))
-		goto not_supported;
+		if (!(bio->bi_opf & REQ_HIPRI))
+			goto not_supported;
 
 	if (should_fail_bio(bio))
 		goto end_io;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a12b1763508d..ebbb88a336a7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3533,6 +3533,32 @@ int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
 	if (current->plug)
 		blk_flush_plug_list(current->plug, false);
 
+	if (q->bio_poll_fn != NULL) {
+		state = current->state;
+		do {
+			int ret;
+
+			ret = q->bio_poll_fn(q, cookie);
+			if (ret > 0) {
+				__set_current_state(TASK_RUNNING);
+				return ret;
+			}
+
+			if (signal_pending_state(state, current))
+				__set_current_state(TASK_RUNNING);
+
+			if (current->state == TASK_RUNNING)
+				return 1;
+			if (ret < 0 || !spin)
+				break;
+			cpu_relax();
+		} while (!need_resched());
+
+		__set_current_state(TASK_RUNNING);
+
+		return 0;
+	}
+
 	hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
 
 	/*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 47eb22a3b7f9..1c21c3ff3ad1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -287,6 +287,7 @@ static inline unsigned short req_get_ioprio(struct request *req)
 
 struct blk_queue_ctx;
 
+typedef int (bio_poll_fn) (struct request_queue *q, blk_qc_t cookie);
 typedef blk_qc_t (make_request_fn) (struct request_queue *q, struct bio *bio);
 
 struct bio_vec;
@@ -398,6 +399,7 @@ struct request_queue {
 	struct blk_queue_stats	*stats;
 	struct rq_qos		*rq_qos;
 
+	bio_poll_fn		*bio_poll_fn;
 	make_request_fn		*make_request_fn;
 	dma_drain_needed_fn	*dma_drain_needed;
 
-- 
2.20.1



* [PATCH 2/2] md: enable io polling
  2020-01-26  4:41 [PATCH 0/2] Enable polling on stackable devices Andrzej Jakowski
  2020-01-26  4:41 ` [PATCH 1/2] block: introduce polling on bio level Andrzej Jakowski
@ 2020-01-26  4:41 ` Andrzej Jakowski
  2020-01-31  6:36   ` Christoph Hellwig
  1 sibling, 1 reply; 10+ messages in thread
From: Andrzej Jakowski @ 2020-01-26  4:41 UTC (permalink / raw)
  To: axboe, song; +Cc: linux-block, linux-raid, Andrzej Jakowski, Artur Paszkiewicz

Provide a callback for polling the mddev which in turn polls the active
member devices. Activate it only if all members support polling.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com>
---
 drivers/md/md.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e7c9f398bc6..95173cd4f8fd 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5421,6 +5421,27 @@ int mddev_init_writes_pending(struct mddev *mddev)
 }
 EXPORT_SYMBOL_GPL(mddev_init_writes_pending);
 
+static int md_poll(struct request_queue *q, blk_qc_t cookie)
+{
+	struct mddev *mddev = q->queuedata;
+	struct md_rdev *rdev;
+	int ret = 0;
+	int rv;
+
+	rdev_for_each(rdev, mddev) {
+		if (rdev->raid_disk >= 0 && !test_bit(Faulty, &rdev->flags)) {
+			rv = blk_poll(bdev_get_queue(rdev->bdev), cookie, false);
+			if (rv < 0) {
+				ret = rv;
+				break;
+			}
+			ret += rv;
+		}
+	}
+
+	return ret;
+}
+
 static int md_alloc(dev_t dev, char *name)
 {
 	/*
@@ -5485,6 +5506,7 @@ static int md_alloc(dev_t dev, char *name)
 
 	blk_queue_make_request(mddev->queue, md_make_request);
 	blk_set_stacking_limits(&mddev->queue->limits);
+	mddev->queue->bio_poll_fn = md_poll;
 
 	disk = alloc_disk(1 << shift);
 	if (!disk) {
@@ -5789,12 +5811,17 @@ int md_run(struct mddev *mddev)
 
 	if (mddev->queue) {
 		bool nonrot = true;
+		bool poll = true;
 
 		rdev_for_each(rdev, mddev) {
-			if (rdev->raid_disk >= 0 &&
-			    !blk_queue_nonrot(bdev_get_queue(rdev->bdev))) {
-				nonrot = false;
-				break;
+			if (rdev->raid_disk >= 0) {
+				struct request_queue *q;
+
+				q = bdev_get_queue(rdev->bdev);
+				if (!blk_queue_nonrot(q))
+					nonrot = false;
+				if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
+					poll = false;
 			}
 		}
 		if (mddev->degraded)
@@ -5803,6 +5830,10 @@ int md_run(struct mddev *mddev)
 			blk_queue_flag_set(QUEUE_FLAG_NONROT, mddev->queue);
 		else
 			blk_queue_flag_clear(QUEUE_FLAG_NONROT, mddev->queue);
+		if (poll)
+			blk_queue_flag_set(QUEUE_FLAG_POLL, mddev->queue);
+		else
+			blk_queue_flag_clear(QUEUE_FLAG_POLL, mddev->queue);
 		mddev->queue->backing_dev_info->congested_data = mddev;
 		mddev->queue->backing_dev_info->congested_fn = md_congested;
 	}
-- 
2.20.1



* Re: [PATCH 1/2] block: introduce polling on bio level
  2020-01-26  4:41 ` [PATCH 1/2] block: introduce polling on bio level Andrzej Jakowski
@ 2020-01-30  4:01   ` Jens Axboe
  2020-01-31  6:35     ` Christoph Hellwig
  2020-01-31  6:34   ` Christoph Hellwig
  1 sibling, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-01-30  4:01 UTC (permalink / raw)
  To: Andrzej Jakowski, song; +Cc: linux-block, linux-raid, Artur Paszkiewicz

On 1/25/20 9:41 PM, Andrzej Jakowski wrote:
> In the current implementation it is possible to perform IO polling only
> if the underlying device is a blk-mq device. It is not possible, however,
> to do polled IO on stackable block devices like MD.
> 
> We have explored two paths for enabling IO polling on bio-based devices.
> The first idea revolved around converting MD to the blk-mq interface, but
> it proved to be complex. For the second idea we built a prototype which
> introduces a new operation on the request_queue - bio_poll_fn. bio_poll_fn,
> if provided by a stackable block driver, is called when the user polls for
> IO completion. The bio_poll_fn approach was initially discussed and
> confirmed with Jens.
> 
> We collected performance data on a RAID-0 volume built on top of two
> P4800X devices, with polling on and off. Here are the results:
> Polling      QD       Latency         IOPS
> ----------------------------------------------
> off           1       12.03us         78800
> off           2       13.27us        144000
> off           4       15.83us        245000
> off           8       31.14us        253000
> off          16       63.03us        252000
> off          32      128.89us        247000
> off          64      259.10us        246000
> off         128      517.27us        247000
> on            1        9.00us        108000
> on            2        9.07us        214000
> on            4       12.00us        327000
> on            8       21.43us        369000
> on           16       43.18us        369000
> on           32       85.75us        372000
> on           64      169.87us        376000
> on          128      346.15us        370000

blk_poll() used to be a pointer in the queue, but since we just had
one implementation, we got rid of it. Might make sense to
reintroduce that, and just make it an optimized indirect call. I
think that would be prettier than adding the bio hack in the middle of
it, and you're having to add a queue pointer anyway.

Alternatively, do it like you did, but have it be:

if (q->poll_fn)
	return q->poll_fn(...);

instead of duplicating most of the core of the function with essentially
the same thing, just calling ->bio_poll_fn() instead of mq_ops->poll().
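
Roughly like this (untested sketch, assuming the current body of blk_poll()
gets split out into a blk_mq_poll() helper):

int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
{
	if (!blk_qc_t_valid(cookie) ||
	    !test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
		return 0;

	/* bio based drivers set ->poll_fn, blk-mq does not */
	if (q->poll_fn)
		return q->poll_fn(q, cookie, spin);

	return blk_mq_poll(q, cookie, spin);
}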

In other words, I like the feature, but I think the implementation
currently leaves a lot to be desired. It should be nicely integrated,
not some hack off to the side.

-- 
Jens Axboe



* Re: [PATCH 1/2] block: introduce polling on bio level
  2020-01-26  4:41 ` [PATCH 1/2] block: introduce polling on bio level Andrzej Jakowski
  2020-01-30  4:01   ` Jens Axboe
@ 2020-01-31  6:34   ` Christoph Hellwig
  2020-01-31 18:51     ` Andrzej Jakowski
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2020-01-31  6:34 UTC (permalink / raw)
  To: Andrzej Jakowski; +Cc: axboe, song, linux-block, linux-raid, Artur Paszkiewicz

On Sat, Jan 25, 2020 at 09:41:37PM -0700, Andrzej Jakowski wrote:
>  	if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q))
> -		goto not_supported;
> +		if (!(bio->bi_opf & REQ_HIPRI))
> +			goto not_supported;

Can you explain this check?  This looks weird to me.  I think we need
a generalized check if a make_request based driver supports REQ_NOWAIT
instead (and as a separate patch / patchset).
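
E.g. with a per-queue capability bit (sketch only; neither the flag nor the
helper below exists today):

	/* drivers that can handle REQ_NOWAIT would set QUEUE_FLAG_NOWAIT */
	if ((bio->bi_opf & REQ_NOWAIT) && !blk_queue_nowait(q))
		goto not_supported;

where blk_queue_nowait() would just test that new flag in q->queue_flags.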

> +	if (q->bio_poll_fn != NULL) {
> +		state = current->state;
> +		do {
> +			int ret;
> +
> +			ret = q->bio_poll_fn(q, cookie);
> +			if (ret > 0) {
> +				__set_current_state(TASK_RUNNING);
> +				return ret;
> +			}
> +
> +			if (signal_pending_state(state, current))
> +				__set_current_state(TASK_RUNNING);
> +
> +			if (current->state == TASK_RUNNING)
> +				return 1;
> +			if (ret < 0 || !spin)
> +				break;
> +			cpu_relax();
> +		} while (!need_resched());
> +
> +		__set_current_state(TASK_RUNNING);
> +
> +		return 0;
> +	}

This needs to be factored out into a helper at least.
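
Something like this maybe (untested sketch, the helper name is just a
placeholder):

static int blk_bio_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
{
	long state = current->state;

	do {
		int ret = q->bio_poll_fn(q, cookie);

		if (ret > 0) {
			__set_current_state(TASK_RUNNING);
			return ret;
		}

		if (signal_pending_state(state, current))
			__set_current_state(TASK_RUNNING);
		if (current->state == TASK_RUNNING)
			return 1;
		if (ret < 0 || !spin)
			break;
		cpu_relax();
	} while (!need_resched());

	__set_current_state(TASK_RUNNING);
	return 0;
}

and then blk_poll() only keeps the "if (q->bio_poll_fn) return
blk_bio_poll(q, cookie, spin);" part.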


* Re: [PATCH 1/2] block: introduce polling on bio level
  2020-01-30  4:01   ` Jens Axboe
@ 2020-01-31  6:35     ` Christoph Hellwig
  2020-02-01  4:41       ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2020-01-31  6:35 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Andrzej Jakowski, song, linux-block, linux-raid, Artur Paszkiewicz

On Wed, Jan 29, 2020 at 09:01:49PM -0700, Jens Axboe wrote:
> blk_poll() used to be a pointer in the queue, but since we just had
> one implementation, we got rid of it. Might make sense to
> reintroduce that, and just make it an optimized indirect call. I
> think that would be prettier than adding the bio hack in the middle of
> it, and you're having to add a queue pointer anyway.

Well, the other reason is to avoid indirect calls for the blk-mq
case, which are fairly expensive.  In fact I'm pretty sure we avoid
indirect calls from the bio layer into blk-mq entirely for the fast
path at the moment, and it would be great to keep it that way.


* Re: [PATCH 2/2] md: enable io polling
  2020-01-26  4:41 ` [PATCH 2/2] md: enable io polling Andrzej Jakowski
@ 2020-01-31  6:36   ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-01-31  6:36 UTC (permalink / raw)
  To: Andrzej Jakowski; +Cc: axboe, song, linux-block, linux-raid, Artur Paszkiewicz

> +	rdev_for_each(rdev, mddev) {
> +		if (rdev->raid_disk >= 0 && !test_bit(Faulty, &rdev->flags)) {
> +			rv = blk_poll(bdev_get_queue(rdev->bdev), cookie, false);

This adds a > 80 char line.  But if you just use a continue to skip
the non-applicable rdevs, you can even clean up the code while avoiding that.
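
I.e. something like this (untested sketch):

	rdev_for_each(rdev, mddev) {
		if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags))
			continue;

		rv = blk_poll(bdev_get_queue(rdev->bdev), cookie, false);
		if (rv < 0)
			return rv;
		ret += rv;
	}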


* Re: [PATCH 1/2] block: introduce polling on bio level
  2020-01-31  6:34   ` Christoph Hellwig
@ 2020-01-31 18:51     ` Andrzej Jakowski
  2020-02-01  8:20       ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Andrzej Jakowski @ 2020-01-31 18:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, song, linux-block, linux-raid, Artur Paszkiewicz

On 1/30/20 11:34 PM, Christoph Hellwig wrote:
> Can you explain this check?  This looks weird to me.  I think we need
> a generalized check if a make_request based driver supports REQ_NOWAIT
> instead (and as a separate patch / patchset).

The original check rejected polled IO for stackable block devices as
"not supported". To solve that I introduced an additional check to reject
only non-REQ_HIPRI requests. That check is not intended to be the
generalized one you suggested, but to conservatively select which
requests to accept.
Perhaps there is a better way to do that. Any suggestions?
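
For reference, the resulting check is equivalent to this flattened form:

	/* REQ_NOWAIT on a non-mq queue is rejected unless the bio is polled */
	if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q) &&
	    !(bio->bi_opf & REQ_HIPRI))
		goto not_supported;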



* Re: [PATCH 1/2] block: introduce polling on bio level
  2020-01-31  6:35     ` Christoph Hellwig
@ 2020-02-01  4:41       ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-02-01  4:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrzej Jakowski, song, linux-block, linux-raid, Artur Paszkiewicz

On 1/30/20 11:35 PM, Christoph Hellwig wrote:
> On Wed, Jan 29, 2020 at 09:01:49PM -0700, Jens Axboe wrote:
>> blk_poll() used to be a pointer in the queue, but since we just had
>> one implementation, we got rid of it. Might make sense to
>> reintroduce that, and just make it an optimized indirect call. I
>> think that would be prettier than adding the bio hack in the middle of
>> it, and you're having to add a queue pointer anyway.
> 
> Well, the other reason is to avoid indirect calls for the blk-mq
> case, which are fairly expensive.  In fact I'm pretty sure we avoid
> indirect calls from the bio layer into blk-mq entirely for the fast
> path at the moment, and it would be great to keep it that way.

Sure, my suggestion was to provide it as an alternative - if set,
then call that. Though with the optimized indirect calls, at least
it's just a branch, the actual call is the same.

This patch sits a bit in no man's land right now. It's duplicating
the main loop, and it's shoved in the middle of the function. This
has to get cleaned up.

-- 
Jens Axboe



* Re: [PATCH 1/2] block: introduce polling on bio level
  2020-01-31 18:51     ` Andrzej Jakowski
@ 2020-02-01  8:20       ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-02-01  8:20 UTC (permalink / raw)
  To: Andrzej Jakowski
  Cc: Christoph Hellwig, axboe, song, linux-block, linux-raid,
	Artur Paszkiewicz

On Fri, Jan 31, 2020 at 11:51:43AM -0700, Andrzej Jakowski wrote:
> On 1/30/20 11:34 PM, Christoph Hellwig wrote:
> > Can you explain this check?  This looks weird to me.  I think we need
> > a generalized check if a make_request based driver supports REQ_NOWAIT
> > instead (and as a separate patch / patchset).
> 
> The original check rejected polled IO for stackable block devices as
> "not supported". To solve that I introduced an additional check to reject
> only non-REQ_HIPRI requests. That check is not intended to be the
> generalized one you suggested, but to conservatively select which
> requests to accept.
> Perhaps there is a better way to do that. Any suggestions?

REQ_NOWAIT and REQ_HIPRI are completely unrelated concepts and they need
to be dealt with entirely separately.


