* [PATCHSET v3] block: IO polling improvements
From: Jens Axboe @ 2016-11-12  5:11 UTC
  To: axboe, linux-block, linux-fsdevel; +Cc: hch

Respun on top of for-4.10/block, dropping patches that have been
merged. This patchset adds Christoph's simplified sync O_DIRECT
bdev access mode, and implements a more efficient IO polling on
top of it. For more details, see the v2 posting here:

http://www.mail-archive.com/linux-block@vger.kernel.org/msg02079.html

Changes since v2:

- Adapt to blk stat changes
- Make it explicit that only blk-mq supports polling
- Fix a bug in the hrtimer code, switching to absolute mode if we
  have to loop around (thanks Omar).

* [PATCH 1/3] block: fast-path for small and simple direct I/O requests
From: Jens Axboe @ 2016-11-12  5:11 UTC
  To: axboe, linux-block, linux-fsdevel; +Cc: hch, Christoph Hellwig, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

This patch adds a small and simple fast path for small direct I/O
requests on block devices that don't use AIO.  Between the neat
bio_iov_iter_get_pages helper that avoids allocating a page array
for get_user_pages and the on-stack bio and biovec, this avoids memory
allocations and atomic operations entirely in the direct I/O code
(lower levels might still do memory allocations and will usually
have at least some atomic operations, though).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
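For illustration, a minimal userspace sketch of the kind of request this fast
path targets: a synchronous, single-segment O_DIRECT read (a sync kiocb with
at most DIO_INLINE_BIO_VECS segments).  The device path and the 4 KiB size are
assumptions for the example, not something the patch mandates.

/*
 * Illustrative only: a small synchronous O_DIRECT read that should be
 * handled by __blkdev_direct_IO_simple() (sync kiocb, few segments).
 * /dev/nvme0n1 and the 4 KiB size are assumptions for the example.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	void *buf;
	int fd;

	fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* O_DIRECT requires logical-block-size alignment */
	if (posix_memalign(&buf, 4096, 4096))
		return 1;

	if (pread(fd, buf, 4096, 0) < 0)
		perror("pread");

	free(buf);
	close(fd);
	return 0;
}
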
 fs/block_dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 05b553368bb4..7c3ec6049073 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -30,6 +30,7 @@
 #include <linux/cleancache.h>
 #include <linux/dax.h>
 #include <linux/badblocks.h>
+#include <linux/task_io_accounting_ops.h>
 #include <linux/falloc.h>
 #include <asm/uaccess.h>
 #include "internal.h"
@@ -175,12 +176,91 @@ static struct inode *bdev_file_inode(struct file *file)
 	return file->f_mapping->host;
 }
 
+#define DIO_INLINE_BIO_VECS 4
+
+static void blkdev_bio_end_io_simple(struct bio *bio)
+{
+	struct task_struct *waiter = bio->bi_private;
+
+	WRITE_ONCE(bio->bi_private, NULL);
+	wake_up_process(waiter);
+}
+
+static ssize_t
+__blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
+		int nr_pages)
+{
+	struct file *file = iocb->ki_filp;
+	struct block_device *bdev = I_BDEV(bdev_file_inode(file));
+	unsigned blkbits = blksize_bits(bdev_logical_block_size(bdev));
+	struct bio_vec inline_vecs[DIO_INLINE_BIO_VECS], *bvec;
+	loff_t pos = iocb->ki_pos;
+	bool should_dirty = false;
+	struct bio bio;
+	ssize_t ret;
+	blk_qc_t qc;
+	int i;
+
+	if ((pos | iov_iter_alignment(iter)) & ((1 << blkbits) - 1))
+		return -EINVAL;
+
+	bio_init(&bio);
+	bio.bi_max_vecs = nr_pages;
+	bio.bi_io_vec = inline_vecs;
+	bio.bi_bdev = bdev;
+	bio.bi_iter.bi_sector = pos >> blkbits;
+	bio.bi_private = current;
+	bio.bi_end_io = blkdev_bio_end_io_simple;
+
+	ret = bio_iov_iter_get_pages(&bio, iter);
+	if (unlikely(ret))
+		return ret;
+	ret = bio.bi_iter.bi_size;
+
+	if (iov_iter_rw(iter) == READ) {
+		bio_set_op_attrs(&bio, REQ_OP_READ, 0);
+		if (iter_is_iovec(iter))
+			should_dirty = true;
+	} else {
+		bio_set_op_attrs(&bio, REQ_OP_WRITE, REQ_SYNC | REQ_IDLE);
+		task_io_account_write(ret);
+	}
+
+	qc = submit_bio(&bio);
+	for (;;) {
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		if (!READ_ONCE(bio.bi_private))
+			break;
+		if (!(iocb->ki_flags & IOCB_HIPRI) ||
+		    !blk_mq_poll(bdev_get_queue(bdev), qc))
+			io_schedule();
+	}
+	__set_current_state(TASK_RUNNING);
+
+	bio_for_each_segment_all(bvec, &bio, i) {
+		if (should_dirty && !PageCompound(bvec->bv_page))
+			set_page_dirty_lock(bvec->bv_page);
+		put_page(bvec->bv_page);
+	}
+
+	if (unlikely(bio.bi_error))
+		return bio.bi_error;
+	iocb->ki_pos += ret;
+	return ret;
+}
+
 static ssize_t
 blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = bdev_file_inode(file);
+	int nr_pages;
 
+	nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
+	if (!nr_pages)
+		return 0;
+	if (is_sync_kiocb(iocb) && nr_pages <= DIO_INLINE_BIO_VECS)
+		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
 	return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter,
 				    blkdev_get_block, NULL, NULL,
 				    DIO_SKIP_DIO_COUNT);
-- 
2.7.4

* [PATCH 2/3] blk-mq: implement hybrid poll mode for sync O_DIRECT
From: Jens Axboe @ 2016-11-12  5:11 UTC
  To: axboe, linux-block, linux-fsdevel; +Cc: hch, Jens Axboe

This patch enables a hybrid polling mode. Instead of polling after IO
submission, we can induce an artificial delay, and then poll after that.
For example, if the IO is presumed to complete in 8 usecs from now, we
can sleep for 4 usecs, wake up, and then do our polling. This still puts
a sleep/wakeup cycle in the IO path, but instead of the wakeup happening
after the IO has completed, it'll happen before. With this hybrid
scheme, we can achieve big latency reductions while still using the same
(or less) amount of CPU.

Signed-off-by: Jens Axboe <axboe@fb.com>
---
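Usage sketch, not part of the patch: hybrid sleep only applies to IOCB_HIPRI
requests, which a synchronous userspace caller can get with preadv2() and
RWF_HIPRI on an O_DIRECT file descriptor, assuming polling is enabled on the
queue (the existing io_poll attribute).  The device name and the 4 usec delay
below are assumptions for the example; an older libc may need to invoke
preadv2 via syscall().

/*
 * Illustrative only: set a fixed 4 usec hybrid-poll delay, then issue a
 * polled (RWF_HIPRI) synchronous O_DIRECT read.  Device name, delay and
 * read size are assumptions.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
	struct iovec iov;
	void *buf;
	int fd, sfd;

	/* io_poll_delay takes microseconds with this patch applied */
	sfd = open("/sys/block/nvme0n1/queue/io_poll_delay", O_WRONLY);
	if (sfd >= 0) {
		write(sfd, "4", 1);
		close(sfd);
	}

	fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);
	if (fd < 0 || posix_memalign(&buf, 4096, 4096))
		return 1;

	iov.iov_base = buf;
	iov.iov_len = 4096;

	/* RWF_HIPRI marks the kiocb IOCB_HIPRI, so completion is polled */
	if (preadv2(fd, &iov, 1, 0, RWF_HIPRI) < 0)
		perror("preadv2");

	free(buf);
	close(fd);
	return 0;
}
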
 block/blk-mq.c         | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 block/blk-sysfs.c      | 29 +++++++++++++++++++++++++++++
 block/blk.h            |  1 +
 include/linux/blkdev.h |  1 +
 4 files changed, 78 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ae8df5ec20d3..2c77a2da123a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -332,6 +332,7 @@ static void __blk_mq_free_request(struct blk_mq_hw_ctx *hctx,
 	rq->rq_flags = 0;
 
 	clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
+	clear_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
 	blk_mq_put_tag(hctx, ctx, tag);
 	blk_queue_exit(q);
 }
@@ -2461,11 +2462,57 @@ void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
 }
 EXPORT_SYMBOL_GPL(blk_mq_update_nr_hw_queues);
 
+static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
+				     struct request *rq)
+{
+	struct hrtimer_sleeper hs;
+	ktime_t kt;
+
+	if (!q->poll_nsec || test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
+		return false;
+
+	set_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
+
+	/*
+	 * This will be replaced with the stats tracking code, using
+	 * 'avg_completion_time / 2' as the pre-sleep target.
+	 */
+	kt = ktime_set(0, q->poll_nsec);
+
+	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer_set_expires(&hs.timer, kt);
+
+	hrtimer_init_sleeper(&hs, current);
+	do {
+		if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
+			break;
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		hrtimer_start_expires(&hs.timer, HRTIMER_MODE_REL);
+		if (hs.task)
+			io_schedule();
+		hrtimer_cancel(&hs.timer);
+	} while (hs.task && !signal_pending(current));
+
+	__set_current_state(TASK_RUNNING);
+	destroy_hrtimer_on_stack(&hs.timer);
+	return true;
+}
+
 static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 {
 	struct request_queue *q = hctx->queue;
 	long state;
 
+	/*
+	 * If we sleep, have the caller restart the poll loop to reset
+	 * the state. Like for the other success return cases, the
+	 * caller is responsible for checking if the IO completed. If
+	 * the IO isn't complete, we'll get called again and will go
+	 * straight to the busy poll loop.
+	 */
+	if (blk_mq_poll_hybrid_sleep(q, rq))
+		return true;
+
 	hctx->poll_considered++;
 
 	state = current->state;
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 9262d2d60a09..b87f992fdbd7 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -350,6 +350,28 @@ queue_rq_affinity_store(struct request_queue *q, const char *page, size_t count)
 	return ret;
 }
 
+static ssize_t queue_poll_delay_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(q->poll_nsec / 1000, page);
+}
+
+static ssize_t queue_poll_delay_store(struct request_queue *q, const char *page,
+				size_t count)
+{
+	unsigned long poll_usec;
+	ssize_t ret;
+
+	if (!q->mq_ops || !q->mq_ops->poll)
+		return -EINVAL;
+
+	ret = queue_var_store(&poll_usec, page, count);
+	if (ret < 0)
+		return ret;
+
+	q->poll_nsec = poll_usec * 1000;
+	return ret;
+}
+
 static ssize_t queue_poll_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(test_bit(QUEUE_FLAG_POLL, &q->queue_flags), page);
@@ -602,6 +624,12 @@ static struct queue_sysfs_entry queue_poll_entry = {
 	.store = queue_poll_store,
 };
 
+static struct queue_sysfs_entry queue_poll_delay_entry = {
+	.attr = {.name = "io_poll_delay", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_poll_delay_show,
+	.store = queue_poll_delay_store,
+};
+
 static struct queue_sysfs_entry queue_wc_entry = {
 	.attr = {.name = "write_cache", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_wc_show,
@@ -655,6 +683,7 @@ static struct attribute *default_attrs[] = {
 	&queue_dax_entry.attr,
 	&queue_stats_entry.attr,
 	&queue_wb_lat_entry.attr,
+	&queue_poll_delay_entry.attr,
 	NULL,
 };
 
diff --git a/block/blk.h b/block/blk.h
index aa132dea598c..041185e5f129 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -111,6 +111,7 @@ void blk_account_io_done(struct request *req);
 enum rq_atomic_flags {
 	REQ_ATOM_COMPLETE = 0,
 	REQ_ATOM_STARTED,
+	REQ_ATOM_POLL_SLEPT,
 };
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bab18ee5810d..37ed4ea705c8 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -509,6 +509,7 @@ struct request_queue {
 	unsigned int		request_fn_active;
 
 	unsigned int		rq_timeout;
+	unsigned int		poll_nsec;
 	struct timer_list	timeout;
 	struct work_struct	timeout_work;
 	struct list_head	timeout_list;
-- 
2.7.4

* [PATCH 3/3] blk-mq: make the polling code adaptive
From: Jens Axboe @ 2016-11-12  5:11 UTC
  To: axboe, linux-block, linux-fsdevel; +Cc: hch, Jens Axboe

The previous commit introduced the hybrid sleep/poll mode. Take
that one step further, and use the completion latencies to
automatically sleep for half the mean completion time. This is
a good approximation.

This changes the 'io_poll_delay' sysfs file a bit to expose the
various options. Depending on the value, the polling code will
behave differently:

-1	Never enter hybrid sleep mode
 0	Use half of the completion mean for the sleep delay
>0	Use this specific value as the sleep delay

Signed-off-by: Jens Axboe <axboe@fb.com>
---
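Configuration sketch, not part of the patch, showing the three modes; the
device name is an assumption, and echoing the same strings into the sysfs
file from a shell works just as well.

/*
 * Illustrative only: drive the reworked io_poll_delay attribute.
 * "-1" = never hybrid sleep, "0" = adaptive (half of the mean completion
 * time), ">0" = fixed delay in microseconds.  Device name is an assumption.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int set_io_poll_delay(const char *dev, const char *val)
{
	char path[128];
	int fd, ret = 0;

	snprintf(path, sizeof(path), "/sys/block/%s/queue/io_poll_delay", dev);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	if (write(fd, val, strlen(val)) < 0)
		ret = -1;
	close(fd);
	return ret;
}

int main(void)
{
	return set_io_poll_delay("nvme0n1", "0");	/* adaptive hybrid polling */
}
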
 block/blk-mq.c         | 74 ++++++++++++++++++++++++++++++++++++++++++++++----
 block/blk-sysfs.c      | 26 ++++++++++++------
 include/linux/blkdev.h |  2 +-
 3 files changed, 88 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2c77a2da123a..70b1b59ed0d3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2125,6 +2125,11 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	 */
 	q->nr_requests = set->queue_depth;
 
+	/*
+	 * Default to classic polling
+	 */
+	q->poll_nsec = -1;
+
 	if (set->ops->complete)
 		blk_queue_softirq_done(q, set->ops->complete);
 
@@ -2462,13 +2467,70 @@ void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
 }
 EXPORT_SYMBOL_GPL(blk_mq_update_nr_hw_queues);
 
+static unsigned long blk_mq_poll_nsecs(struct request_queue *q,
+				       struct blk_mq_hw_ctx *hctx,
+				       struct request *rq)
+{
+	struct blk_rq_stat stat[2];
+	unsigned long ret = 0;
+
+	/*
+	 * If stats collection isn't on, don't sleep but turn it on for
+	 * future users
+	 */
+	if (!blk_stat_enable(q))
+		return 0;
+
+	/*
+	 * We don't have to do this once per IO, should optimize this
+	 * to just use the current window of stats until it changes
+	 */
+	memset(&stat, 0, sizeof(stat));
+	blk_hctx_stat_get(hctx, stat);
+
+	/*
+	 * As an optimistic guess, use half of the mean service time
+	 * for this type of request. We can (and should) make this smarter.
+	 * For instance, if the completion latencies are tight, we can
+	 * get closer than just half the mean. This is especially
+	 * important on devices where the completion latencies are longer
+	 * than ~10 usec.
+	 */
+	if (req_op(rq) == REQ_OP_READ && stat[BLK_STAT_READ].nr_samples)
+		ret = (stat[BLK_STAT_READ].mean + 1) / 2;
+	else if (req_op(rq) == REQ_OP_WRITE && stat[BLK_STAT_WRITE].nr_samples)
+		ret = (stat[BLK_STAT_WRITE].mean + 1) / 2;
+
+	return ret;
+}
+
 static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
+				     struct blk_mq_hw_ctx *hctx,
 				     struct request *rq)
 {
 	struct hrtimer_sleeper hs;
+	enum hrtimer_mode mode;
+	unsigned int nsecs;
 	ktime_t kt;
 
-	if (!q->poll_nsec || test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
+	if (test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
+		return false;
+
+	/*
+	 * poll_nsec can be:
+	 *
+	 * -1:	don't ever hybrid sleep
+	 *  0:	use half of prev avg
+	 * >0:	use this specific value
+	 */
+	if (q->poll_nsec == -1)
+		return false;
+	else if (q->poll_nsec > 0)
+		nsecs = q->poll_nsec;
+	else
+		nsecs = blk_mq_poll_nsecs(q, hctx, rq);
+
+	if (!nsecs)
 		return false;
 
 	set_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
@@ -2477,9 +2539,10 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 	 * This will be replaced with the stats tracking code, using
 	 * 'avg_completion_time / 2' as the pre-sleep target.
 	 */
-	kt = ktime_set(0, q->poll_nsec);
+	kt = ktime_set(0, nsecs);
 
-	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	mode = HRTIMER_MODE_REL;
+	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, mode);
 	hrtimer_set_expires(&hs.timer, kt);
 
 	hrtimer_init_sleeper(&hs, current);
@@ -2487,10 +2550,11 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 		if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
 			break;
 		set_current_state(TASK_UNINTERRUPTIBLE);
-		hrtimer_start_expires(&hs.timer, HRTIMER_MODE_REL);
+		hrtimer_start_expires(&hs.timer, mode);
 		if (hs.task)
 			io_schedule();
 		hrtimer_cancel(&hs.timer);
+		mode = HRTIMER_MODE_ABS;
 	} while (hs.task && !signal_pending(current));
 
 	__set_current_state(TASK_RUNNING);
@@ -2510,7 +2574,7 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 	 * the IO isn't complete, we'll get called again and will go
 	 * straight to the busy poll loop.
 	 */
-	if (blk_mq_poll_hybrid_sleep(q, rq))
+	if (blk_mq_poll_hybrid_sleep(q, hctx, rq))
 		return true;
 
 	hctx->poll_considered++;
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b87f992fdbd7..652a36eef00c 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -352,24 +352,34 @@ queue_rq_affinity_store(struct request_queue *q, const char *page, size_t count)
 
 static ssize_t queue_poll_delay_show(struct request_queue *q, char *page)
 {
-	return queue_var_show(q->poll_nsec / 1000, page);
+	int val;
+
+	if (q->poll_nsec == -1)
+		val = -1;
+	else
+		val = q->poll_nsec / 1000;
+
+	return sprintf(page, "%d\n", val);
 }
 
 static ssize_t queue_poll_delay_store(struct request_queue *q, const char *page,
 				size_t count)
 {
-	unsigned long poll_usec;
-	ssize_t ret;
+	int err, val;
 
 	if (!q->mq_ops || !q->mq_ops->poll)
 		return -EINVAL;
 
-	ret = queue_var_store(&poll_usec, page, count);
-	if (ret < 0)
-		return ret;
+	err = kstrtoint(page, 10, &val);
+	if (err < 0)
+		return err;
 
-	q->poll_nsec = poll_usec * 1000;
-	return ret;
+	if (val == -1)
+		q->poll_nsec = -1;
+	else
+		q->poll_nsec = val * 1000;
+
+	return count;
 }
 
 static ssize_t queue_poll_show(struct request_queue *q, char *page)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 37ed4ea705c8..85699bc90a51 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -509,7 +509,7 @@ struct request_queue {
 	unsigned int		request_fn_active;
 
 	unsigned int		rq_timeout;
-	unsigned int		poll_nsec;
+	int			poll_nsec;
 	struct timer_list	timeout;
 	struct work_struct	timeout_work;
 	struct list_head	timeout_list;
-- 
2.7.4

* Re: [PATCH 1/3] block: fast-path for small and simple direct I/O requests
From: Omar Sandoval @ 2016-11-14 19:33 UTC
  To: Jens Axboe; +Cc: axboe, linux-block, linux-fsdevel, hch, Christoph Hellwig

On Fri, Nov 11, 2016 at 10:11:25PM -0700, Jens Axboe wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> This patch adds a small and simple fast path for small direct I/O
> requests on block devices that don't use AIO.  Between the neat
> bio_iov_iter_get_pages helper that avoids allocating a page array
> for get_user_pages and the on-stack bio and biovec, this avoids memory
> allocations and atomic operations entirely in the direct I/O code
> (lower levels might still do memory allocations and will usually
> have at least some atomic operations, though).
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jens Axboe <axboe@fb.com>
> ---
>  fs/block_dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
> 

[snip]

>  static ssize_t
>  blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>  {
>  	struct file *file = iocb->ki_filp;
>  	struct inode *inode = bdev_file_inode(file);
> +	int nr_pages;
>  
> +	nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
> +	if (!nr_pages)
> +		return 0;
> +	if (is_sync_kiocb(iocb) && nr_pages <= DIO_INLINE_BIO_VECS)
> +		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>  	return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter,
>  				    blkdev_get_block, NULL, NULL,
>  				    DIO_SKIP_DIO_COUNT);

__blockdev_direct_IO() does a few cache prefetches that we're now
bypassing, do we want to do the same in __blkdev_direct_IO_simple()?
That's the stuff added in 65dd2aa90aa1 ("dio: optimize cache misses in
the submission path").
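
Purely for illustration, a sketch of what mirroring such prefetches into the
new helper might look like; which cache lines are actually worth warming is an
assumption here, not a claim about what that commit did.

/*
 * Hypothetical sketch only: warm the state that __blkdev_direct_IO_simple()
 * touches at submission time.  The choice of the request_queue and the
 * first biovec is an assumption and would need profiling.
 */
#include <linux/prefetch.h>

static inline void blkdev_dio_simple_prefetch(struct block_device *bdev,
					      struct bio_vec *vecs)
{
	prefetch(bdev_get_queue(bdev));	/* touched by submit_bio()/polling */
	prefetch(vecs);			/* on-stack biovec filled next */
}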

-- 
Omar

* Re: [PATCH 3/3] blk-mq: make the polling code adaptive
From: Omar Sandoval @ 2016-11-14 19:43 UTC
  To: Jens Axboe; +Cc: axboe, linux-block, linux-fsdevel, hch

On Fri, Nov 11, 2016 at 10:11:27PM -0700, Jens Axboe wrote:
> The previous commit introduced the hybrid sleep/poll mode. Take
> that one step further, and use the completion latencies to
> automatically sleep for half the mean completion time. This is
> a good approximation.
> 
> This changes the 'io_poll_delay' sysfs file a bit to expose the
> various options. Depending on the value, the polling code will
> behave differently:
> 
> -1	Never enter hybrid sleep mode
>  0	Use half of the completion mean for the sleep delay
> >0	Use this specific value as the sleep delay
> 
> Signed-off-by: Jens Axboe <axboe@fb.com>
> ---
>  block/blk-mq.c         | 74 ++++++++++++++++++++++++++++++++++++++++++++++----
>  block/blk-sysfs.c      | 26 ++++++++++++------
>  include/linux/blkdev.h |  2 +-
>  3 files changed, 88 insertions(+), 14 deletions(-)
> 

[snip]

>  static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
> +				     struct blk_mq_hw_ctx *hctx,
>  				     struct request *rq)
>  {
>  	struct hrtimer_sleeper hs;
> +	enum hrtimer_mode mode;
> +	unsigned int nsecs;
>  	ktime_t kt;
>  
> -	if (!q->poll_nsec || test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
> +	if (test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
> +		return false;
> +
> +	/*
> +	 * poll_nsec can be:
> +	 *
> +	 * -1:	don't ever hybrid sleep
> +	 *  0:	use half of prev avg
> +	 * >0:	use this specific value
> +	 */
> +	if (q->poll_nsec == -1)
> +		return false;
> +	else if (q->poll_nsec > 0)
> +		nsecs = q->poll_nsec;
> +	else
> +		nsecs = blk_mq_poll_nsecs(q, hctx, rq);
> +
> +	if (!nsecs)
>  		return false;
>  
>  	set_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
> @@ -2477,9 +2539,10 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
>  	 * This will be replaced with the stats tracking code, using
>  	 * 'avg_completion_time / 2' as the pre-sleep target.
>  	 */
> -	kt = ktime_set(0, q->poll_nsec);
> +	kt = ktime_set(0, nsecs);
>  
> -	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	mode = HRTIMER_MODE_REL;
> +	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, mode);
>  	hrtimer_set_expires(&hs.timer, kt);
>  
>  	hrtimer_init_sleeper(&hs, current);
> @@ -2487,10 +2550,11 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
>  		if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
>  			break;
>  		set_current_state(TASK_UNINTERRUPTIBLE);
> -		hrtimer_start_expires(&hs.timer, HRTIMER_MODE_REL);
> +		hrtimer_start_expires(&hs.timer, mode);
>  		if (hs.task)
>  			io_schedule();
>  		hrtimer_cancel(&hs.timer);
> +		mode = HRTIMER_MODE_ABS;
>  	} while (hs.task && !signal_pending(current));

This fix should be folded into patch 2.

>  	__set_current_state(TASK_RUNNING);
> @@ -2510,7 +2574,7 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
>  	 * the IO isn't complete, we'll get called again and will go
>  	 * straight to the busy poll loop.
>  	 */
> -	if (blk_mq_poll_hybrid_sleep(q, rq))
> +	if (blk_mq_poll_hybrid_sleep(q, hctx, rq))
>  		return true;
>  
>  	hctx->poll_considered++;

[snip]

-- 
Omar

* Re: [PATCH 3/3] blk-mq: make the polling code adaptive
From: Jens Axboe @ 2016-11-14 19:58 UTC
  To: Omar Sandoval; +Cc: axboe, linux-block, linux-fsdevel, hch

On 11/14/2016 12:43 PM, Omar Sandoval wrote:
>> @@ -2477,9 +2539,10 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
>>  	 * This will be replaced with the stats tracking code, using
>>  	 * 'avg_completion_time / 2' as the pre-sleep target.
>>  	 */
>> -	kt = ktime_set(0, q->poll_nsec);
>> +	kt = ktime_set(0, nsecs);
>>
>> -	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>> +	mode = HRTIMER_MODE_REL;
>> +	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, mode);
>>  	hrtimer_set_expires(&hs.timer, kt);
>>
>>  	hrtimer_init_sleeper(&hs, current);
>> @@ -2487,10 +2550,11 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
>>  		if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
>>  			break;
>>  		set_current_state(TASK_UNINTERRUPTIBLE);
>> -		hrtimer_start_expires(&hs.timer, HRTIMER_MODE_REL);
>> +		hrtimer_start_expires(&hs.timer, mode);
>>  		if (hs.task)
>>  			io_schedule();
>>  		hrtimer_cancel(&hs.timer);
>> +		mode = HRTIMER_MODE_ABS;
>>  	} while (hs.task && !signal_pending(current));
>
> This fix should be folded into patch 2.

Good point, I'll do that.

-- 
Jens Axboe

* Re: [PATCH 1/3] block: fast-path for small and simple direct I/O requests
From: Jens Axboe @ 2016-11-14 20:00 UTC
  To: Omar Sandoval; +Cc: axboe, linux-block, linux-fsdevel, hch, Christoph Hellwig

On 11/14/2016 12:33 PM, Omar Sandoval wrote:
> On Fri, Nov 11, 2016 at 10:11:25PM -0700, Jens Axboe wrote:
>> From: Christoph Hellwig <hch@lst.de>
>>
>> This patch adds a small and simple fast path for small direct I/O
>> requests on block devices that don't use AIO.  Between the neat
>> bio_iov_iter_get_pages helper that avoids allocating a page array
>> for get_user_pages and the on-stack bio and biovec, this avoids memory
>> allocations and atomic operations entirely in the direct I/O code
>> (lower levels might still do memory allocations and will usually
>> have at least some atomic operations, though).
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Jens Axboe <axboe@fb.com>
>> ---
>>  fs/block_dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 80 insertions(+)
>>
>
> [snip]
>
>>  static ssize_t
>>  blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>>  {
>>  	struct file *file = iocb->ki_filp;
>>  	struct inode *inode = bdev_file_inode(file);
>> +	int nr_pages;
>>
>> +	nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
>> +	if (!nr_pages)
>> +		return 0;
>> +	if (is_sync_kiocb(iocb) && nr_pages <= DIO_INLINE_BIO_VECS)
>> +		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>  	return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter,
>>  				    blkdev_get_block, NULL, NULL,
>>  				    DIO_SKIP_DIO_COUNT);
>
> __blockdev_direct_IO() does a few cache prefetches that we're now
> bypassing, do we want to do the same in __blkdev_direct_IO_simple()?
> That's the stuff added in 65dd2aa90aa1 ("dio: optimize cache misses in
> the submission path").

Prefetches like that tend to grow stale, in my experience. So we should
probably just evaluate the new path cache behavior and see if it makes
sense.

-- 
Jens Axboe

* Re: [PATCH 1/3] block: fast-path for small and simple direct I/O requests
From: Christoph Hellwig @ 2016-11-15 13:49 UTC
  To: Jens Axboe
  Cc: Omar Sandoval, axboe, linux-block, linux-fsdevel, hch, Christoph Hellwig

On Mon, Nov 14, 2016 at 01:00:09PM -0700, Jens Axboe wrote:
> Prefetches like that tend to grow stale, in my experience. So we should
> probably just evaluate the new path cache behavior and see if it makes
> sense.

Yes.  I've tested the patches with that in place, but it didn't
make a difference.  Probably because we're not wasting many cycles
between the possible place for the prefetch and the use of it anyway.

* Re: [PATCHSET v3] block: IO polling improvements
From: Stephen Bates @ 2016-11-16 17:31 UTC
  To: Jens Axboe; +Cc: axboe, linux-block, linux-fsdevel, hch

On Fri, November 11, 2016 11:11 pm, Jens Axboe wrote:
> Respun on top of for-4.10/block, dropping patches that have been
> merged. This patchset adds Christoph's simplified sync O_DIRECT bdev access
> mode, and implements a more efficient IO polling on top of it. For more
> details, see the v2 posting here:
>
> http://www.mail-archive.com/linux-block@vger.kernel.org/msg02079.html
>
>
> Changes since v2:
>
>
> - Adapt to blk stat changes
> - Make it explicit that only blk-mq supports polling
> - Fix a bug in the hrtimer code, switching to absolute mode if we
> have to loop around (thanks Omar).
>

Hi Jens

I applied this series cleanly on top of 2868f13c303e147 in your
for-4.10/block branch. I tested using a simple fio wrapper script [1] on a
low-latency NVMe SSD [2]. Here are some results:

io_poll  io_poll_delay  threads  latency  CPU/threads

0        N/A            1        17.0     47%
1        -1             1        13.8     100%
1        0              1        14.6     73%
1        10             1        17.0     56%

0        N/A            8        18.7     47%
1        -1             8        14.4     100%
1        0              8        15.7     73%
1        10             8        19.0     56%

The io_poll_delay option 0 is definitely a nice compromise between no
polling and 100% polling. The result for io_poll_delay=10us is interesting
as it implies there might be some mismatch between that value and the
completion time. However I think that is something we can improve on over
time and I don't see it as a reason to not get this series upstream.

For the series:

Tested-By: Stephen Bates <sbates@raithlin.com>
Reviewed-By: Stephen Bates <sbates@raithlin.com>

Cheers

Stephen

[1] https://github.com/sbates130272/fio-stuff/blob/master/misc/iopoll-test.sh
[2] http://www.microsemi.com/products/storage/flashtec-nvram-drives/flashtec-nvram-drives

* Re: [PATCHSET v3] block: IO polling improvements
From: Jens Axboe @ 2016-11-16 20:59 UTC
  To: Stephen Bates; +Cc: axboe, linux-block, linux-fsdevel, hch

On 11/16/2016 10:31 AM, Stephen Bates wrote:
> On Fri, November 11, 2016 11:11 pm, Jens Axboe wrote:
>> Respun on top of for-4.10/block, dropping patches that have been
>> merged. This patchset adds Christoph's simplified sync O_DIRECT bdev access
>> mode, and implements a more efficient IO polling on top of it. For more
>> details, see the v2 posting here:
>>
>> http://www.mail-archive.com/linux-block@vger.kernel.org/msg02079.html
>>
>>
>> Changes since v2:
>>
>>
>> - Adapt to blk stat changes
>> - Make it explicit that only blk-mq supports polling
>> - Fix a bug in the hrtimer code, switching to absolute mode if we
>> have to loop around (thanks Omar).
>>
>
> Hi Jens
>
> I applied this series cleanly on top of 2868f13c303e147 in your
> for-4.10/block branch. I tested using a simple fio wrapper script [1] on a
> low-latency NVMe SSD [2]. Here are some results:
>
> io_poll  io_poll_delay  threads  latency  CPU/threads
>
> 0        N/A            1        17.0     47%
> 1        -1             1        13.8     100%
> 1        0              1        14.6     73%
> 1        10             1        17.0     56%
>
> 0        N/A            8        18.7     47%
> 1        -1             8        14.4     100%
> 1        0              8        15.7     73%
> 1        10             8        19.0     56%
>
> The io_poll_delay option 0 is definitely a nice compromise between no
> polling and 100% polling. The result for io_poll_delay=10us is interesting
> as it implies there might be some mismatch between that value and the
> completion time. However I think that is something we can improve on over
> time and I don't see it as a reason to not get this series upstream.
>
> For the series:
>
> Tested-By: Stephen Bates <sbates@raithlin.com>
> Reviewed-By: Stephen Bates <sbates@raithlin.com>

Thanks for testing, Stephen, I'll add your tested/reviewed-by tags. As
to the specific setting of the time, it might just be that it's a poor
choice of value. We'll need some time to set up the delay, and if we end
up being late, then we'll make things worse. But I agree, we can look
into that going forward. I've got some more ideas on how to improve the
timing in general, so we can both get closer to -1 and get better
efficiency as well.

-- 
Jens Axboe
