* [PATCHSET v2 0/11] Various block optimizations
@ 2018-11-15 19:51 Jens Axboe
  2018-11-15 19:51 ` [PATCH 01/11] nvme: provide optimized poll function for separate poll queues Jens Axboe
                   ` (10 more replies)
  0 siblings, 11 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block

Some of these are optimizations; the latter part is prep work
for supporting polling with aio.

Patches are against my for-4.21/block branch. They can also
be found in my mq-perf branch, though there are other patches
sitting on top of this series (notably aio polling, as mentioned).

Changes since v2:

- Include polled swap IO in the poll optimizations
- Get rid of unnecessary write barrier for DIO wakeup
- Fix a potential stall if need_resched() was set and preempt
  wasn't enabled
- Provide separate mq_ops for NVMe with poll queues
- Drop q->mq_ops patch
- Rebase on top of for-4.21/block

Changes since v1:

- Improve nvme irq disabling for polled IO
- Fix barriers in the ordered wakeup for polled O_DIRECT
- Add patch to allow polling to find any command that is done
- Add patch to control whether polling spins or not
- Have async O_DIRECT mark a bio as pollable
- Don't plug for polling





* [PATCH 01/11] nvme: provide optimized poll function for separate poll queues
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:35   ` Christoph Hellwig
  2018-11-15 19:51 ` [PATCH 02/11] block: add queue_is_mq() helper Jens Axboe
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

If we have separate poll queues, we know that they aren't using
interrupts. Hence we don't need to disable interrupts around
finding completions.

Provide a separate set of blk_mq_ops for such devices.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 45 +++++++++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ffbab5b01df4..fc7dd49f22fc 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1082,6 +1082,23 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx, unsigned int tag)
 	return __nvme_poll(nvmeq, tag);
 }
 
+static int nvme_poll_noirq(struct blk_mq_hw_ctx *hctx, unsigned int tag)
+{
+	struct nvme_queue *nvmeq = hctx->driver_data;
+	u16 start, end;
+	bool found;
+
+	if (!nvme_cqe_pending(nvmeq))
+		return 0;
+
+	spin_lock(&nvmeq->cq_lock);
+	found = nvme_process_cq(nvmeq, &start, &end, tag);
+	spin_unlock(&nvmeq->cq_lock);
+
+	nvme_complete_cqes(nvmeq, start, end);
+	return found;
+}
+
 static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
 {
 	struct nvme_dev *dev = to_nvme_dev(ctrl);
@@ -1584,17 +1601,25 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 	.timeout	= nvme_timeout,
 };
 
+#define NVME_SHARED_MQ_OPS					\
+	.queue_rq		= nvme_queue_rq,		\
+	.rq_flags_to_type	= nvme_rq_flags_to_type,	\
+	.complete		= nvme_pci_complete_rq,		\
+	.init_hctx		= nvme_init_hctx,		\
+	.init_request		= nvme_init_request,		\
+	.map_queues		= nvme_pci_map_queues,		\
+	.timeout		= nvme_timeout			\
+
 static const struct blk_mq_ops nvme_mq_ops = {
-	.queue_rq		= nvme_queue_rq,
-	.rq_flags_to_type	= nvme_rq_flags_to_type,
-	.complete		= nvme_pci_complete_rq,
-	.init_hctx		= nvme_init_hctx,
-	.init_request		= nvme_init_request,
-	.map_queues		= nvme_pci_map_queues,
-	.timeout		= nvme_timeout,
+	NVME_SHARED_MQ_OPS,
 	.poll			= nvme_poll,
 };
 
+static const struct blk_mq_ops nvme_mq_poll_noirq_ops = {
+	NVME_SHARED_MQ_OPS,
+	.poll			= nvme_poll_noirq,
+};
+
 static void nvme_dev_remove_admin(struct nvme_dev *dev)
 {
 	if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q)) {
@@ -2274,7 +2299,11 @@ static int nvme_dev_add(struct nvme_dev *dev)
 	int ret;
 
 	if (!dev->ctrl.tagset) {
-		dev->tagset.ops = &nvme_mq_ops;
+		if (!dev->io_queues[NVMEQ_TYPE_POLL])
+			dev->tagset.ops = &nvme_mq_ops;
+		else
+			dev->tagset.ops = &nvme_mq_poll_noirq_ops;
+
 		dev->tagset.nr_hw_queues = dev->online_queues - 1;
 		dev->tagset.nr_maps = NVMEQ_TYPE_NR;
 		dev->tagset.timeout = NVME_IO_TIMEOUT;
-- 
2.17.1



* [PATCH 02/11] block: add queue_is_mq() helper
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
  2018-11-15 19:51 ` [PATCH 01/11] nvme: provide optimized poll function for separate poll queues Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:35   ` Christoph Hellwig
  2018-11-15 19:51 ` [PATCH 03/11] blk-rq-qos: inline check for q->rq_qos functions Jens Axboe
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

Various spots check for q->mq_ops being non-NULL; provide a
helper to do this instead.

Where the ->mq_ops != NULL check is redundant, remove it.

Since mq == rq-based now that legacy is gone, get rid of
queue_is_rq_based() and just use queue_is_mq() everywhere.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-cgroup.c     |  8 ++++----
 block/blk-core.c       | 12 ++++++------
 block/blk-flush.c      |  3 +--
 block/blk-mq.c         |  2 +-
 block/blk-sysfs.c      | 14 +++++++-------
 block/blk-throttle.c   |  2 +-
 block/blk-wbt.c        |  2 +-
 block/blk-zoned.c      |  2 +-
 block/bsg.c            |  2 +-
 block/elevator.c       | 11 +++++------
 block/genhd.c          |  8 ++++----
 drivers/md/dm-rq.c     |  2 +-
 drivers/md/dm-table.c  |  4 ++--
 include/linux/blkdev.h |  6 +-----
 14 files changed, 36 insertions(+), 42 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 0f6b44614165..63d226a084cd 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1324,7 +1324,7 @@ int blkcg_activate_policy(struct request_queue *q,
 	if (blkcg_policy_enabled(q, pol))
 		return 0;
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_freeze_queue(q);
 pd_prealloc:
 	if (!pd_prealloc) {
@@ -1363,7 +1363,7 @@ int blkcg_activate_policy(struct request_queue *q,
 
 	spin_unlock_irq(&q->queue_lock);
 out_bypass_end:
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_unfreeze_queue(q);
 	if (pd_prealloc)
 		pol->pd_free_fn(pd_prealloc);
@@ -1387,7 +1387,7 @@ void blkcg_deactivate_policy(struct request_queue *q,
 	if (!blkcg_policy_enabled(q, pol))
 		return;
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_freeze_queue(q);
 
 	spin_lock_irq(&q->queue_lock);
@@ -1405,7 +1405,7 @@ void blkcg_deactivate_policy(struct request_queue *q,
 
 	spin_unlock_irq(&q->queue_lock);
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_unfreeze_queue(q);
 }
 EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
diff --git a/block/blk-core.c b/block/blk-core.c
index 92b6b200e9fb..0b684a520a11 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -232,7 +232,7 @@ void blk_sync_queue(struct request_queue *q)
 	del_timer_sync(&q->timeout);
 	cancel_work_sync(&q->timeout_work);
 
-	if (q->mq_ops) {
+	if (queue_is_mq(q)) {
 		struct blk_mq_hw_ctx *hctx;
 		int i;
 
@@ -281,7 +281,7 @@ void blk_set_queue_dying(struct request_queue *q)
 	 */
 	blk_freeze_queue_start(q);
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_wake_waiters(q);
 
 	/* Make blk_queue_enter() reexamine the DYING flag. */
@@ -356,7 +356,7 @@ void blk_cleanup_queue(struct request_queue *q)
 	 * blk_freeze_queue() should be enough for cases of passthrough
 	 * request.
 	 */
-	if (q->mq_ops && blk_queue_init_done(q))
+	if (queue_is_mq(q) && blk_queue_init_done(q))
 		blk_mq_quiesce_queue(q);
 
 	/* for synchronous bio-based driver finish in-flight integrity i/o */
@@ -374,7 +374,7 @@ void blk_cleanup_queue(struct request_queue *q)
 
 	blk_exit_queue(q);
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_free_queue(q);
 
 	percpu_ref_exit(&q->q_usage_counter);
@@ -982,7 +982,7 @@ generic_make_request_checks(struct bio *bio)
 	 * For a REQ_NOWAIT based request, return -EOPNOTSUPP
 	 * if queue is not a request based queue.
 	 */
-	if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_rq_based(q))
+	if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q))
 		goto not_supported;
 
 	if (should_fail_bio(bio))
@@ -1657,7 +1657,7 @@ EXPORT_SYMBOL_GPL(rq_flush_dcache_pages);
  */
 int blk_lld_busy(struct request_queue *q)
 {
-	if (q->mq_ops && q->mq_ops->busy)
+	if (queue_is_mq(q) && q->mq_ops->busy)
 		return q->mq_ops->busy(q);
 
 	return 0;
diff --git a/block/blk-flush.c b/block/blk-flush.c
index fcd18b158fd6..a3fc7191c694 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -273,8 +273,7 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	 * assigned to empty flushes, and we deadlock if we are expecting
 	 * other requests to make progress. Don't defer for that case.
 	 */
-	if (!list_empty(&fq->flush_data_in_flight) &&
-	    !(q->mq_ops && q->elevator) &&
+	if (!list_empty(&fq->flush_data_in_flight) && q->elevator &&
 	    time_before(jiffies,
 			fq->flush_pending_since + FLUSH_PENDING_TIMEOUT))
 		return;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3b823891b3ef..32b246ed44c0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -150,7 +150,7 @@ void blk_freeze_queue_start(struct request_queue *q)
 	freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
 	if (freeze_depth == 1) {
 		percpu_ref_kill(&q->q_usage_counter);
-		if (q->mq_ops)
+		if (queue_is_mq(q))
 			blk_mq_run_hw_queues(q, false);
 	}
 }
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 1e370207a20e..80eef48fddc8 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -68,7 +68,7 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)
 	unsigned long nr;
 	int ret, err;
 
-	if (!q->mq_ops)
+	if (!queue_is_mq(q))
 		return -EINVAL;
 
 	ret = queue_var_store(&nr, page, count);
@@ -835,12 +835,12 @@ static void __blk_release_queue(struct work_struct *work)
 
 	blk_queue_free_zone_bitmaps(q);
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_release(q);
 
 	blk_trace_shutdown(q);
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_debugfs_unregister(q);
 
 	bioset_exit(&q->bio_split);
@@ -914,7 +914,7 @@ int blk_register_queue(struct gendisk *disk)
 		goto unlock;
 	}
 
-	if (q->mq_ops) {
+	if (queue_is_mq(q)) {
 		__blk_mq_register_dev(dev, q);
 		blk_mq_debugfs_register(q);
 	}
@@ -925,7 +925,7 @@ int blk_register_queue(struct gendisk *disk)
 
 	blk_throtl_register_queue(q);
 
-	if ((q->mq_ops && q->elevator)) {
+	if (q->elevator) {
 		ret = elv_register_queue(q);
 		if (ret) {
 			mutex_unlock(&q->sysfs_lock);
@@ -974,7 +974,7 @@ void blk_unregister_queue(struct gendisk *disk)
 	 * Remove the sysfs attributes before unregistering the queue data
 	 * structures that can be modified through sysfs.
 	 */
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		blk_mq_unregister_dev(disk_to_dev(disk), q);
 	mutex_unlock(&q->sysfs_lock);
 
@@ -983,7 +983,7 @@ void blk_unregister_queue(struct gendisk *disk)
 	blk_trace_remove_sysfs(disk_to_dev(disk));
 
 	mutex_lock(&q->sysfs_lock);
-	if (q->mq_ops && q->elevator)
+	if (q->elevator)
 		elv_unregister_queue(q);
 	mutex_unlock(&q->sysfs_lock);
 
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index d0a23f0bb3ed..8f0a104770ee 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2456,7 +2456,7 @@ void blk_throtl_register_queue(struct request_queue *q)
 	td->throtl_slice = DFL_THROTL_SLICE_HD;
 #endif
 
-	td->track_bio_latency = !queue_is_rq_based(q);
+	td->track_bio_latency = !queue_is_mq(q);
 	if (!td->track_bio_latency)
 		blk_stat_enable_accounting(q);
 }
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 9f142b84dc85..d051ebfb4852 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -701,7 +701,7 @@ void wbt_enable_default(struct request_queue *q)
 	if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
 		return;
 
-	if (q->mq_ops && IS_ENABLED(CONFIG_BLK_WBT_MQ))
+	if (queue_is_mq(q) && IS_ENABLED(CONFIG_BLK_WBT_MQ))
 		wbt_init(q);
 }
 EXPORT_SYMBOL_GPL(wbt_enable_default);
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 13ba2011a306..e9c332b1d9da 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -421,7 +421,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	 * BIO based queues do not use a scheduler so only q->nr_zones
 	 * needs to be updated so that the sysfs exposed value is correct.
 	 */
-	if (!queue_is_rq_based(q)) {
+	if (!queue_is_mq(q)) {
 		q->nr_zones = nr_zones;
 		return 0;
 	}
diff --git a/block/bsg.c b/block/bsg.c
index 9a442c23a715..44f6028b9567 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -471,7 +471,7 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
 	/*
 	 * we need a proper transport to send commands, not a stacked device
 	 */
-	if (!queue_is_rq_based(q))
+	if (!queue_is_mq(q))
 		return 0;
 
 	bcd = &q->bsg_dev;
diff --git a/block/elevator.c b/block/elevator.c
index 796436270682..f05e90d4e695 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -667,7 +667,7 @@ static int __elevator_change(struct request_queue *q, const char *name)
 	/*
 	 * Special case for mq, turn off scheduling
 	 */
-	if (q->mq_ops && !strncmp(name, "none", 4))
+	if (!strncmp(name, "none", 4))
 		return elevator_switch(q, NULL);
 
 	strlcpy(elevator_name, name, sizeof(elevator_name));
@@ -685,8 +685,7 @@ static int __elevator_change(struct request_queue *q, const char *name)
 
 static inline bool elv_support_iosched(struct request_queue *q)
 {
-	if (q->mq_ops && q->tag_set && (q->tag_set->flags &
-				BLK_MQ_F_NO_SCHED))
+	if (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))
 		return false;
 	return true;
 }
@@ -696,7 +695,7 @@ ssize_t elv_iosched_store(struct request_queue *q, const char *name,
 {
 	int ret;
 
-	if (!q->mq_ops || !elv_support_iosched(q))
+	if (!queue_is_mq(q) || !elv_support_iosched(q))
 		return count;
 
 	ret = __elevator_change(q, name);
@@ -713,7 +712,7 @@ ssize_t elv_iosched_show(struct request_queue *q, char *name)
 	struct elevator_type *__e;
 	int len = 0;
 
-	if (!queue_is_rq_based(q))
+	if (!queue_is_mq(q))
 		return sprintf(name, "none\n");
 
 	if (!q->elevator)
@@ -732,7 +731,7 @@ ssize_t elv_iosched_show(struct request_queue *q, char *name)
 	}
 	spin_unlock(&elv_list_lock);
 
-	if (q->mq_ops && q->elevator)
+	if (q->elevator)
 		len += sprintf(name+len, "none");
 
 	len += sprintf(len+name, "\n");
diff --git a/block/genhd.c b/block/genhd.c
index cff6bdf27226..0145bcb0cc76 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -47,7 +47,7 @@ static void disk_release_events(struct gendisk *disk);
 
 void part_inc_in_flight(struct request_queue *q, struct hd_struct *part, int rw)
 {
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		return;
 
 	atomic_inc(&part->in_flight[rw]);
@@ -57,7 +57,7 @@ void part_inc_in_flight(struct request_queue *q, struct hd_struct *part, int rw)
 
 void part_dec_in_flight(struct request_queue *q, struct hd_struct *part, int rw)
 {
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		return;
 
 	atomic_dec(&part->in_flight[rw]);
@@ -68,7 +68,7 @@ void part_dec_in_flight(struct request_queue *q, struct hd_struct *part, int rw)
 void part_in_flight(struct request_queue *q, struct hd_struct *part,
 		    unsigned int inflight[2])
 {
-	if (q->mq_ops) {
+	if (queue_is_mq(q)) {
 		blk_mq_in_flight(q, part, inflight);
 		return;
 	}
@@ -85,7 +85,7 @@ void part_in_flight(struct request_queue *q, struct hd_struct *part,
 void part_in_flight_rw(struct request_queue *q, struct hd_struct *part,
 		       unsigned int inflight[2])
 {
-	if (q->mq_ops) {
+	if (queue_is_mq(q)) {
 		blk_mq_in_flight_rw(q, part, inflight);
 		return;
 	}
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 7cd36e4d1310..1f1fe9a618ea 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -43,7 +43,7 @@ static unsigned dm_get_blk_mq_queue_depth(void)
 
 int dm_request_based(struct mapped_device *md)
 {
-	return queue_is_rq_based(md->queue);
+	return queue_is_mq(md->queue);
 }
 
 void dm_start_queue(struct request_queue *q)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 9038c302d5c2..844f7d0f2ef8 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -919,12 +919,12 @@ static int device_is_rq_based(struct dm_target *ti, struct dm_dev *dev,
 	struct request_queue *q = bdev_get_queue(dev->bdev);
 	struct verify_rq_based_data *v = data;
 
-	if (q->mq_ops)
+	if (queue_is_mq(q))
 		v->mq_count++;
 	else
 		v->sq_count++;
 
-	return queue_is_rq_based(q);
+	return queue_is_mq(q);
 }
 
 static int dm_table_determine_type(struct dm_table *t)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1d185f1fc333..41aaa05e42c1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -656,11 +656,7 @@ static inline bool blk_account_rq(struct request *rq)
 
 #define rq_data_dir(rq)		(op_is_write(req_op(rq)) ? WRITE : READ)
 
-/*
- * Driver can handle struct request, if it either has an old style
- * request_fn defined, or is blk-mq based.
- */
-static inline bool queue_is_rq_based(struct request_queue *q)
+static inline bool queue_is_mq(struct request_queue *q)
 {
 	return q->mq_ops;
 }
-- 
2.17.1



* [PATCH 03/11] blk-rq-qos: inline check for q->rq_qos functions
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
  2018-11-15 19:51 ` [PATCH 01/11] nvme: provide optimized poll function for separate poll queues Jens Axboe
  2018-11-15 19:51 ` [PATCH 02/11] block: add queue_is_mq() helper Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:38   ` Christoph Hellwig
  2018-11-15 19:51 ` [PATCH 04/11] block: avoid ordered task state change for polled IO Jens Axboe
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe, Josef Bacik

Put the short inline check in the fast path, for the case where we
don't have any functions attached to the queue. This minimizes the
impact on the hot path in the core code.

Cc: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-rq-qos.c | 63 +++++++++++++++++++++-------------------------
 block/blk-rq-qos.h | 59 +++++++++++++++++++++++++++++++++++++------
 2 files changed, 80 insertions(+), 42 deletions(-)

diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index f8a4d3fbb98c..80f603b76f61 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -27,74 +27,67 @@ bool rq_wait_inc_below(struct rq_wait *rq_wait, unsigned int limit)
 	return atomic_inc_below(&rq_wait->inflight, limit);
 }
 
-void rq_qos_cleanup(struct request_queue *q, struct bio *bio)
+void __rq_qos_cleanup(struct rq_qos *rqos, struct bio *bio)
 {
-	struct rq_qos *rqos;
-
-	for (rqos = q->rq_qos; rqos; rqos = rqos->next) {
+	do {
 		if (rqos->ops->cleanup)
 			rqos->ops->cleanup(rqos, bio);
-	}
+		rqos = rqos->next;
+	} while (rqos);
 }
 
-void rq_qos_done(struct request_queue *q, struct request *rq)
+void __rq_qos_done(struct rq_qos *rqos, struct request *rq)
 {
-	struct rq_qos *rqos;
-
-	for (rqos = q->rq_qos; rqos; rqos = rqos->next) {
+	do {
 		if (rqos->ops->done)
 			rqos->ops->done(rqos, rq);
-	}
+		rqos = rqos->next;
+	} while (rqos);
 }
 
-void rq_qos_issue(struct request_queue *q, struct request *rq)
+void __rq_qos_issue(struct rq_qos *rqos, struct request *rq)
 {
-	struct rq_qos *rqos;
-
-	for(rqos = q->rq_qos; rqos; rqos = rqos->next) {
+	do {
 		if (rqos->ops->issue)
 			rqos->ops->issue(rqos, rq);
-	}
+		rqos = rqos->next;
+	} while (rqos);
 }
 
-void rq_qos_requeue(struct request_queue *q, struct request *rq)
+void __rq_qos_requeue(struct rq_qos *rqos, struct request *rq)
 {
-	struct rq_qos *rqos;
-
-	for(rqos = q->rq_qos; rqos; rqos = rqos->next) {
+	do {
 		if (rqos->ops->requeue)
 			rqos->ops->requeue(rqos, rq);
-	}
+		rqos = rqos->next;
+	} while (rqos);
 }
 
-void rq_qos_throttle(struct request_queue *q, struct bio *bio)
+void __rq_qos_throttle(struct rq_qos *rqos, struct bio *bio)
 {
-	struct rq_qos *rqos;
-
-	for(rqos = q->rq_qos; rqos; rqos = rqos->next) {
+	do {
 		if (rqos->ops->throttle)
 			rqos->ops->throttle(rqos, bio);
-	}
+		rqos = rqos->next;
+	} while (rqos);
 }
 
-void rq_qos_track(struct request_queue *q, struct request *rq, struct bio *bio)
+void __rq_qos_track(struct rq_qos *rqos, struct request *rq, struct bio *bio)
 {
-	struct rq_qos *rqos;
-
-	for(rqos = q->rq_qos; rqos; rqos = rqos->next) {
+	do {
 		if (rqos->ops->track)
 			rqos->ops->track(rqos, rq, bio);
-	}
+		rqos = rqos->next;
+	} while (rqos);
 }
 
-void rq_qos_done_bio(struct request_queue *q, struct bio *bio)
+void __rq_qos_done_bio(struct rq_qos *rqos, struct bio *bio)
 {
-	struct rq_qos *rqos;
-
-	for(rqos = q->rq_qos; rqos; rqos = rqos->next) {
+	do {
 		if (rqos->ops->done_bio)
 			rqos->ops->done_bio(rqos, bio);
-	}
+		rqos = rqos->next;
+	} while (rqos);
 }
 
 /*
diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h
index b6b11d496007..6e09e98b93ea 100644
--- a/block/blk-rq-qos.h
+++ b/block/blk-rq-qos.h
@@ -98,12 +98,57 @@ void rq_depth_scale_up(struct rq_depth *rqd);
 void rq_depth_scale_down(struct rq_depth *rqd, bool hard_throttle);
 bool rq_depth_calc_max_depth(struct rq_depth *rqd);
 
-void rq_qos_cleanup(struct request_queue *, struct bio *);
-void rq_qos_done(struct request_queue *, struct request *);
-void rq_qos_issue(struct request_queue *, struct request *);
-void rq_qos_requeue(struct request_queue *, struct request *);
-void rq_qos_done_bio(struct request_queue *q, struct bio *bio);
-void rq_qos_throttle(struct request_queue *, struct bio *);
-void rq_qos_track(struct request_queue *q, struct request *, struct bio *);
+void __rq_qos_cleanup(struct rq_qos *rqos, struct bio *bio);
+void __rq_qos_done(struct rq_qos *rqos, struct request *rq);
+void __rq_qos_issue(struct rq_qos *rqos, struct request *rq);
+void __rq_qos_requeue(struct rq_qos *rqos, struct request *rq);
+void __rq_qos_throttle(struct rq_qos *rqos, struct bio *bio);
+void __rq_qos_track(struct rq_qos *rqos, struct request *rq, struct bio *bio);
+void __rq_qos_done_bio(struct rq_qos *rqos, struct bio *bio);
+
+static inline void rq_qos_cleanup(struct request_queue *q, struct bio *bio)
+{
+	if (q->rq_qos)
+		__rq_qos_cleanup(q->rq_qos, bio);
+}
+
+static inline void rq_qos_done(struct request_queue *q, struct request *rq)
+{
+	if (q->rq_qos)
+		__rq_qos_done(q->rq_qos, rq);
+}
+
+static inline void rq_qos_issue(struct request_queue *q, struct request *rq)
+{
+	if (q->rq_qos)
+		__rq_qos_issue(q->rq_qos, rq);
+}
+
+static inline void rq_qos_requeue(struct request_queue *q, struct request *rq)
+{
+	if (q->rq_qos)
+		__rq_qos_requeue(q->rq_qos, rq);
+}
+
+static inline void rq_qos_done_bio(struct request_queue *q, struct bio *bio)
+{
+	if (q->rq_qos)
+		__rq_qos_done_bio(q->rq_qos, bio);
+}
+
+static inline void rq_qos_throttle(struct request_queue *q, struct bio *bio)
+{
+	if (q->rq_qos)
+		__rq_qos_throttle(q->rq_qos, bio);
+}
+
+static inline void rq_qos_track(struct request_queue *q, struct request *rq,
+				struct bio *bio)
+{
+	if (q->rq_qos)
+		__rq_qos_track(q->rq_qos, rq, bio);
+}
+
 void rq_qos_exit(struct request_queue *);
+
 #endif
-- 
2.17.1



* [PATCH 04/11] block: avoid ordered task state change for polled IO
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (2 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 03/11] blk-rq-qos: inline check for q->rq_qos functions Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:41   ` Christoph Hellwig
  2018-11-15 19:51 ` [PATCH 05/11] block: add polled wakeup task helper Jens Axboe
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

Ensure that writes to the dio/bio waiter field are ordered
correctly. With the smp_rmb() before the READ_ONCE() check,
we should be able to use a more relaxed ordering for the
task state setting. We don't need a heavier barrier on
the wakeup side after writing the waiter field, since we are
either going to be in the task we care about, or go through
wake_up_process(), which implies a strong enough barrier.

For the core poll helper, the task state setting doesn't need
to imply any atomics, as it's the current task itself that
is being modified and we're not going to sleep.
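
To make the pairing explicit, here is a rough sketch of both sides,
paraphrased from the hunks below and from the existing end_io
callbacks (this is not new code, just the pattern spelled out):

	/* completion side, e.g. blkdev_bio_end_io_simple() */
	struct task_struct *waiter = bio->bi_private;

	WRITE_ONCE(bio->bi_private, NULL);
	wake_up_process(waiter);	/* implies a strong enough barrier */

	/* submission side, waiting for the IO to complete */
	for (;;) {
		/* plain state write, no atomics or implied barrier */
		__set_current_state(TASK_UNINTERRUPTIBLE);

		smp_rmb();
		if (!READ_ONCE(bio.bi_private))
			break;

		if (!(iocb->ki_flags & IOCB_HIPRI) ||
		    !blk_poll(bdev_get_queue(bdev), qc))
			io_schedule();
	}
	__set_current_state(TASK_RUNNING);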

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c | 4 ++--
 fs/block_dev.c | 9 +++++++--
 fs/iomap.c     | 4 +++-
 mm/page_io.c   | 4 +++-
 4 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 32b246ed44c0..7fc4abb4cc36 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3331,12 +3331,12 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 		ret = q->mq_ops->poll(hctx, rq->tag);
 		if (ret > 0) {
 			hctx->poll_success++;
-			set_current_state(TASK_RUNNING);
+			__set_current_state(TASK_RUNNING);
 			return true;
 		}
 
 		if (signal_pending_state(state, current))
-			set_current_state(TASK_RUNNING);
+			__set_current_state(TASK_RUNNING);
 
 		if (current->state == TASK_RUNNING)
 			return true;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index c039abfb2052..5b754f84c814 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -237,9 +237,12 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 
 	qc = submit_bio(&bio);
 	for (;;) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
+		__set_current_state(TASK_UNINTERRUPTIBLE);
+
+		smp_rmb();
 		if (!READ_ONCE(bio.bi_private))
 			break;
+
 		if (!(iocb->ki_flags & IOCB_HIPRI) ||
 		    !blk_poll(bdev_get_queue(bdev), qc))
 			io_schedule();
@@ -403,7 +406,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 		return -EIOCBQUEUED;
 
 	for (;;) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
+		__set_current_state(TASK_UNINTERRUPTIBLE);
+
+		smp_rmb();
 		if (!READ_ONCE(dio->waiter))
 			break;
 
diff --git a/fs/iomap.c b/fs/iomap.c
index f61d13dfdf09..3373ea4984d9 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1888,7 +1888,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			return -EIOCBQUEUED;
 
 		for (;;) {
-			set_current_state(TASK_UNINTERRUPTIBLE);
+			__set_current_state(TASK_UNINTERRUPTIBLE);
+
+			smp_rmb();
 			if (!READ_ONCE(dio->submit.waiter))
 				break;
 
diff --git a/mm/page_io.c b/mm/page_io.c
index d4d1c89bcddd..008f6d00c47c 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -405,7 +405,9 @@ int swap_readpage(struct page *page, bool synchronous)
 	bio_get(bio);
 	qc = submit_bio(bio);
 	while (synchronous) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
+		__set_current_state(TASK_UNINTERRUPTIBLE);
+
+		smp_rmb();
 		if (!READ_ONCE(bio->bi_private))
 			break;
 
-- 
2.17.1



* [PATCH 05/11] block: add polled wakeup task helper
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (3 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 04/11] block: avoid ordered task state change for polled IO Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:41   ` Christoph Hellwig
  2018-11-15 19:51 ` [PATCH 06/11] block: have ->poll_fn() return number of entries polled Jens Axboe
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

If we're polling for IO on a device that doesn't use interrupts, then
the IO completion loop (and the wake of the task) is done by the
submitting task itself. In that case we don't need to enter the
wake_up_process() function; we can simply mark ourselves as
TASK_RUNNING.
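
To illustrate the two cases, this is roughly what an end_io callback
looks like with the helper (paraphrasing blkdev_bio_end_io_simple()
from the hunk below, nothing new):

	struct task_struct *waiter = bio->bi_private;

	WRITE_ONCE(bio->bi_private, NULL);
	blk_wake_io_task(waiter);
	/*
	 * IRQ completion:    waiter != current, falls back to
	 *                    wake_up_process(waiter) as before.
	 * Polled completion: the callback runs from the submitting
	 *                    task's own blk_poll() loop, so
	 *                    waiter == current and __set_current_state()
	 *                    is all that's needed.
	 */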

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/block_dev.c         |  4 ++--
 fs/iomap.c             |  2 +-
 include/linux/blkdev.h | 13 +++++++++++++
 mm/page_io.c           |  2 +-
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 5b754f84c814..0ed9be8906a8 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -181,7 +181,7 @@ static void blkdev_bio_end_io_simple(struct bio *bio)
 	struct task_struct *waiter = bio->bi_private;
 
 	WRITE_ONCE(bio->bi_private, NULL);
-	wake_up_process(waiter);
+	blk_wake_io_task(waiter);
 }
 
 static ssize_t
@@ -308,7 +308,7 @@ static void blkdev_bio_end_io(struct bio *bio)
 			struct task_struct *waiter = dio->waiter;
 
 			WRITE_ONCE(dio->waiter, NULL);
-			wake_up_process(waiter);
+			blk_wake_io_task(waiter);
 		}
 	}
 
diff --git a/fs/iomap.c b/fs/iomap.c
index 3373ea4984d9..38c9bc63296a 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1525,7 +1525,7 @@ static void iomap_dio_bio_end_io(struct bio *bio)
 		if (dio->wait_for_completion) {
 			struct task_struct *waiter = dio->submit.waiter;
 			WRITE_ONCE(dio->submit.waiter, NULL);
-			wake_up_process(waiter);
+			blk_wake_io_task(waiter);
 		} else if (dio->flags & IOMAP_DIO_WRITE) {
 			struct inode *inode = file_inode(dio->iocb->ki_filp);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 41aaa05e42c1..91c44f7a7f62 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1772,4 +1772,17 @@ static inline int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask,
 
 #endif /* CONFIG_BLOCK */
 
+static inline void blk_wake_io_task(struct task_struct *waiter)
+{
+	/*
+	 * If we're polling, the task itself is doing the completions. For
+	 * that case, we don't need to signal a wakeup, it's enough to just
+	 * mark us as RUNNING.
+	 */
+	if (waiter == current)
+		__set_current_state(TASK_RUNNING);
+	else
+		wake_up_process(waiter);
+}
+
 #endif
diff --git a/mm/page_io.c b/mm/page_io.c
index 008f6d00c47c..f277459db805 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -140,7 +140,7 @@ static void end_swap_bio_read(struct bio *bio)
 	unlock_page(page);
 	WRITE_ONCE(bio->bi_private, NULL);
 	bio_put(bio);
-	wake_up_process(waiter);
+	blk_wake_io_task(waiter);
 	put_task_struct(waiter);
 }
 
-- 
2.17.1



* [PATCH 06/11] block: have ->poll_fn() return number of entries polled
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (4 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 05/11] block: add polled wakeup task helper Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-15 19:51 ` [PATCH 07/11] blk-mq: when polling for IO, look for any completion Jens Axboe
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

We currently only really support sync poll, i.e. poll with 1
IO in flight. This prepares us for supporting async poll.

Note that the returned value isn't necessarily 100% accurate.
If poll races with IRQ completion, we assume that the fact
that the task is now runnable means we found at least one
entry. In reality it could be more than 1, or not even 1.
This is fine; the caller will just need to take this into
account.
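
For the current sync callers nothing changes: a positive return just
means "something may have completed, re-check the waiter field". A
future async user would treat the count as a hint only. A minimal
sketch of that, where reap_completed_iocbs() and nr_reaped are
made-up names and not part of this series:

	int found = q->poll_fn(q, cookie);

	if (found > 0) {
		/*
		 * 'found' is only a hint: if we raced with an IRQ
		 * completion it can be off in either direction, so
		 * count from the actual completion state instead of
		 * trusting it blindly.
		 */
		nr_reaped += reap_completed_iocbs();
	}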

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c                | 18 +++++++++---------
 drivers/nvme/host/multipath.c |  4 ++--
 include/linux/blkdev.h        |  2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7fc4abb4cc36..52b1c97cd7c6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -38,7 +38,7 @@
 #include "blk-mq-sched.h"
 #include "blk-rq-qos.h"
 
-static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
 
@@ -3305,7 +3305,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 	return true;
 }
 
-static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
+static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 {
 	struct request_queue *q = hctx->queue;
 	long state;
@@ -3318,7 +3318,7 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 	 * straight to the busy poll loop.
 	 */
 	if (blk_mq_poll_hybrid_sleep(q, hctx, rq))
-		return true;
+		return 1;
 
 	hctx->poll_considered++;
 
@@ -3332,30 +3332,30 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 		if (ret > 0) {
 			hctx->poll_success++;
 			__set_current_state(TASK_RUNNING);
-			return true;
+			return ret;
 		}
 
 		if (signal_pending_state(state, current))
 			__set_current_state(TASK_RUNNING);
 
 		if (current->state == TASK_RUNNING)
-			return true;
+			return 1;
 		if (ret < 0)
 			break;
 		cpu_relax();
 	}
 
 	__set_current_state(TASK_RUNNING);
-	return false;
+	return 0;
 }
 
-static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 {
 	struct blk_mq_hw_ctx *hctx;
 	struct request *rq;
 
 	if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
-		return false;
+		return 0;
 
 	hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
 	if (!blk_qc_t_is_internal(cookie))
@@ -3369,7 +3369,7 @@ static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 		 * so we should be safe with just the NULL check.
 		 */
 		if (!rq)
-			return false;
+			return 0;
 	}
 
 	return __blk_mq_poll(hctx, rq);
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index b82b0d3ca39a..65539c8df11d 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -220,11 +220,11 @@ static blk_qc_t nvme_ns_head_make_request(struct request_queue *q,
 	return ret;
 }
 
-static bool nvme_ns_head_poll(struct request_queue *q, blk_qc_t qc)
+static int nvme_ns_head_poll(struct request_queue *q, blk_qc_t qc)
 {
 	struct nvme_ns_head *head = q->queuedata;
 	struct nvme_ns *ns;
-	bool found = false;
+	int found = 0;
 	int srcu_idx;
 
 	srcu_idx = srcu_read_lock(&head->srcu);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 91c44f7a7f62..e96dc16ef8aa 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -283,7 +283,7 @@ static inline unsigned short req_get_ioprio(struct request *req)
 struct blk_queue_ctx;
 
 typedef blk_qc_t (make_request_fn) (struct request_queue *q, struct bio *bio);
-typedef bool (poll_q_fn) (struct request_queue *q, blk_qc_t);
+typedef int (poll_q_fn) (struct request_queue *q, blk_qc_t);
 
 struct bio_vec;
 typedef int (dma_drain_needed_fn)(struct request *);
-- 
2.17.1



* [PATCH 07/11] blk-mq: when polling for IO, look for any completion
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (5 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 06/11] block: have ->poll_fn() return number of entries polled Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:43   ` Christoph Hellwig
  2018-11-15 19:51 ` [PATCH 08/11] block: make blk_poll() take a parameter on whether to spin or not Jens Axboe
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

If we want to support async IO polling, then we have to allow
finding completions that aren't just for the one we are
looking for. Always pass in -1 to the mq_ops->poll() helper,
and have that return how many events were found in this poll
loop.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c          | 69 +++++++++++++++++++++++------------------
 drivers/nvme/host/pci.c | 14 ++++-----
 2 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 52b1c97cd7c6..3ca00d712158 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3266,9 +3266,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 	 *  0:	use half of prev avg
 	 * >0:	use this specific value
 	 */
-	if (q->poll_nsec == -1)
-		return false;
-	else if (q->poll_nsec > 0)
+	if (q->poll_nsec > 0)
 		nsecs = q->poll_nsec;
 	else
 		nsecs = blk_mq_poll_nsecs(q, hctx, rq);
@@ -3305,21 +3303,36 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 	return true;
 }
 
-static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
+static bool blk_mq_poll_hybrid(struct request_queue *q,
+			       struct blk_mq_hw_ctx *hctx, blk_qc_t cookie)
+{
+	struct request *rq;
+
+	if (q->poll_nsec == -1)
+		return false;
+
+	if (!blk_qc_t_is_internal(cookie))
+		rq = blk_mq_tag_to_rq(hctx->tags, blk_qc_t_to_tag(cookie));
+	else {
+		rq = blk_mq_tag_to_rq(hctx->sched_tags, blk_qc_t_to_tag(cookie));
+		/*
+		 * With scheduling, if the request has completed, we'll
+		 * get a NULL return here, as we clear the sched tag when
+		 * that happens. The request still remains valid, like always,
+		 * so we should be safe with just the NULL check.
+		 */
+		if (!rq)
+			return false;
+	}
+
+	return blk_mq_poll_hybrid_sleep(q, hctx, rq);
+}
+
+static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
 	long state;
 
-	/*
-	 * If we sleep, have the caller restart the poll loop to reset
-	 * the state. Like for the other success return cases, the
-	 * caller is responsible for checking if the IO completed. If
-	 * the IO isn't complete, we'll get called again and will go
-	 * straight to the busy poll loop.
-	 */
-	if (blk_mq_poll_hybrid_sleep(q, hctx, rq))
-		return 1;
-
 	hctx->poll_considered++;
 
 	state = current->state;
@@ -3328,7 +3341,7 @@ static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 
 		hctx->poll_invoked++;
 
-		ret = q->mq_ops->poll(hctx, rq->tag);
+		ret = q->mq_ops->poll(hctx, -1U);
 		if (ret > 0) {
 			hctx->poll_success++;
 			__set_current_state(TASK_RUNNING);
@@ -3352,27 +3365,23 @@ static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 {
 	struct blk_mq_hw_ctx *hctx;
-	struct request *rq;
 
 	if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
 		return 0;
 
 	hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
-	if (!blk_qc_t_is_internal(cookie))
-		rq = blk_mq_tag_to_rq(hctx->tags, blk_qc_t_to_tag(cookie));
-	else {
-		rq = blk_mq_tag_to_rq(hctx->sched_tags, blk_qc_t_to_tag(cookie));
-		/*
-		 * With scheduling, if the request has completed, we'll
-		 * get a NULL return here, as we clear the sched tag when
-		 * that happens. The request still remains valid, like always,
-		 * so we should be safe with just the NULL check.
-		 */
-		if (!rq)
-			return 0;
-	}
 
-	return __blk_mq_poll(hctx, rq);
+	/*
+	 * If we sleep, have the caller restart the poll loop to reset
+	 * the state. Like for the other success return cases, the
+	 * caller is responsible for checking if the IO completed. If
+	 * the IO isn't complete, we'll get called again and will go
+	 * straight to the busy poll loop.
+	 */
+	if (blk_mq_poll_hybrid(q, hctx, cookie))
+		return 1;
+
+	return __blk_mq_poll(hctx);
 }
 
 unsigned int blk_mq_rq_cpu(struct request *rq)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index fc7dd49f22fc..6c03461ad988 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1012,15 +1012,15 @@ static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
 	}
 }
 
-static inline bool nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,
-		u16 *end, int tag)
+static inline int nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,
+				  u16 *end, unsigned int tag)
 {
-	bool found = false;
+	int found = 0;
 
 	*start = nvmeq->cq_head;
-	while (!found && nvme_cqe_pending(nvmeq)) {
-		if (nvmeq->cqes[nvmeq->cq_head].command_id == tag)
-			found = true;
+	while (nvme_cqe_pending(nvmeq)) {
+		if (tag == -1U || nvmeq->cqes[nvmeq->cq_head].command_id == tag)
+			found++;
 		nvme_update_cq_head(nvmeq);
 	}
 	*end = nvmeq->cq_head;
@@ -1062,7 +1062,7 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
 static int __nvme_poll(struct nvme_queue *nvmeq, unsigned int tag)
 {
 	u16 start, end;
-	bool found;
+	int found;
 
 	if (!nvme_cqe_pending(nvmeq))
 		return 0;
-- 
2.17.1



* [PATCH 08/11] block: make blk_poll() take a parameter on whether to spin or not
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (6 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 07/11] blk-mq: when polling for IO, look for any completion Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-15 19:51 ` [PATCH 09/11] blk-mq: ensure mq_ops ->poll() is entered at least once Jens Axboe
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

blk_poll() has always kept spinning until it found an IO. This is
fine for SYNC polling, since we need to find the one request we
have pending, but in preparation for ASYNC polling it can be
beneficial to just check whether we have any entries available or
not.

Existing callers are converted to pass in 'spin == true', to retain
the old behavior.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c                  |  4 ++--
 block/blk-mq.c                    | 10 +++++-----
 drivers/nvme/host/multipath.c     |  4 ++--
 drivers/nvme/target/io-cmd-bdev.c |  2 +-
 fs/block_dev.c                    |  4 ++--
 fs/direct-io.c                    |  2 +-
 fs/iomap.c                        |  2 +-
 include/linux/blkdev.h            |  4 ++--
 mm/page_io.c                      |  2 +-
 9 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 0b684a520a11..ccf40f853afd 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1284,14 +1284,14 @@ blk_qc_t submit_bio(struct bio *bio)
 }
 EXPORT_SYMBOL(submit_bio);
 
-bool blk_poll(struct request_queue *q, blk_qc_t cookie)
+bool blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
 {
 	if (!q->poll_fn || !blk_qc_t_valid(cookie))
 		return false;
 
 	if (current->plug)
 		blk_flush_plug_list(current->plug, false);
-	return q->poll_fn(q, cookie);
+	return q->poll_fn(q, cookie, spin);
 }
 EXPORT_SYMBOL_GPL(blk_poll);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3ca00d712158..695aa9363a6e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -38,7 +38,7 @@
 #include "blk-mq-sched.h"
 #include "blk-rq-qos.h"
 
-static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, bool spin);
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
 
@@ -3328,7 +3328,7 @@ static bool blk_mq_poll_hybrid(struct request_queue *q,
 	return blk_mq_poll_hybrid_sleep(q, hctx, rq);
 }
 
-static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx)
+static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, bool spin)
 {
 	struct request_queue *q = hctx->queue;
 	long state;
@@ -3353,7 +3353,7 @@ static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx)
 
 		if (current->state == TASK_RUNNING)
 			return 1;
-		if (ret < 0)
+		if (ret < 0 || !spin)
 			break;
 		cpu_relax();
 	}
@@ -3362,7 +3362,7 @@ static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx)
 	return 0;
 }
 
-static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
 {
 	struct blk_mq_hw_ctx *hctx;
 
@@ -3381,7 +3381,7 @@ static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 	if (blk_mq_poll_hybrid(q, hctx, cookie))
 		return 1;
 
-	return __blk_mq_poll(hctx);
+	return __blk_mq_poll(hctx, spin);
 }
 
 unsigned int blk_mq_rq_cpu(struct request *rq)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 65539c8df11d..c83bb3302684 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -220,7 +220,7 @@ static blk_qc_t nvme_ns_head_make_request(struct request_queue *q,
 	return ret;
 }
 
-static int nvme_ns_head_poll(struct request_queue *q, blk_qc_t qc)
+static int nvme_ns_head_poll(struct request_queue *q, blk_qc_t qc, bool spin)
 {
 	struct nvme_ns_head *head = q->queuedata;
 	struct nvme_ns *ns;
@@ -230,7 +230,7 @@ static int nvme_ns_head_poll(struct request_queue *q, blk_qc_t qc)
 	srcu_idx = srcu_read_lock(&head->srcu);
 	ns = srcu_dereference(head->current_path[numa_node_id()], &head->srcu);
 	if (likely(ns && nvme_path_is_optimized(ns)))
-		found = ns->queue->poll_fn(q, qc);
+		found = ns->queue->poll_fn(q, qc, spin);
 	srcu_read_unlock(&head->srcu, srcu_idx);
 	return found;
 }
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index c1ec3475a140..f6971b45bc54 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -116,7 +116,7 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 
 	cookie = submit_bio(bio);
 
-	blk_poll(bdev_get_queue(req->ns->bdev), cookie);
+	blk_poll(bdev_get_queue(req->ns->bdev), cookie, true);
 }
 
 static void nvmet_bdev_execute_flush(struct nvmet_req *req)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 0ed9be8906a8..7810f5b588ea 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -244,7 +244,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 			break;
 
 		if (!(iocb->ki_flags & IOCB_HIPRI) ||
-		    !blk_poll(bdev_get_queue(bdev), qc))
+		    !blk_poll(bdev_get_queue(bdev), qc, true))
 			io_schedule();
 	}
 	__set_current_state(TASK_RUNNING);
@@ -413,7 +413,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 			break;
 
 		if (!(iocb->ki_flags & IOCB_HIPRI) ||
-		    !blk_poll(bdev_get_queue(bdev), qc))
+		    !blk_poll(bdev_get_queue(bdev), qc, true))
 			io_schedule();
 	}
 	__set_current_state(TASK_RUNNING);
diff --git a/fs/direct-io.c b/fs/direct-io.c
index ea07d5a34317..a5a4e5a1423e 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -518,7 +518,7 @@ static struct bio *dio_await_one(struct dio *dio)
 		dio->waiter = current;
 		spin_unlock_irqrestore(&dio->bio_lock, flags);
 		if (!(dio->iocb->ki_flags & IOCB_HIPRI) ||
-		    !blk_poll(dio->bio_disk->queue, dio->bio_cookie))
+		    !blk_poll(dio->bio_disk->queue, dio->bio_cookie, true))
 			io_schedule();
 		/* wake up sets us TASK_RUNNING */
 		spin_lock_irqsave(&dio->bio_lock, flags);
diff --git a/fs/iomap.c b/fs/iomap.c
index 38c9bc63296a..1ef4e063f068 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1897,7 +1897,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			if (!(iocb->ki_flags & IOCB_HIPRI) ||
 			    !dio->submit.last_queue ||
 			    !blk_poll(dio->submit.last_queue,
-					 dio->submit.cookie))
+					 dio->submit.cookie, true))
 				io_schedule();
 		}
 		__set_current_state(TASK_RUNNING);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e96dc16ef8aa..e83ad6f15281 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -283,7 +283,7 @@ static inline unsigned short req_get_ioprio(struct request *req)
 struct blk_queue_ctx;
 
 typedef blk_qc_t (make_request_fn) (struct request_queue *q, struct bio *bio);
-typedef int (poll_q_fn) (struct request_queue *q, blk_qc_t);
+typedef int (poll_q_fn) (struct request_queue *q, blk_qc_t, bool spin);
 
 struct bio_vec;
 typedef int (dma_drain_needed_fn)(struct request *);
@@ -868,7 +868,7 @@ extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *,
 int blk_status_to_errno(blk_status_t status);
 blk_status_t errno_to_blk_status(int errno);
 
-bool blk_poll(struct request_queue *q, blk_qc_t cookie);
+bool blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin);
 
 static inline struct request_queue *bdev_get_queue(struct block_device *bdev)
 {
diff --git a/mm/page_io.c b/mm/page_io.c
index f277459db805..1518f459866d 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -411,7 +411,7 @@ int swap_readpage(struct page *page, bool synchronous)
 		if (!READ_ONCE(bio->bi_private))
 			break;
 
-		if (!blk_poll(disk->queue, qc))
+		if (!blk_poll(disk->queue, qc, true))
 			break;
 	}
 	__set_current_state(TASK_RUNNING);
-- 
2.17.1



* [PATCH 09/11] blk-mq: ensure mq_ops ->poll() is entered at least once
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (7 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 08/11] block: make blk_poll() take a parameter on whether to spin or not Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-15 19:51 ` [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to Jens Axboe
  2018-11-15 19:51 ` [PATCH 11/11] block: don't plug for aio/O_DIRECT HIPRI IO Jens Axboe
  10 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

Right now we immediately bail if need_resched() is true, but
we need to do at least one loop in case we have entries waiting.
So just invert the need_resched() check, putting it at the
bottom of the loop.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 695aa9363a6e..0ff70dfc8c0e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3336,7 +3336,7 @@ static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, bool spin)
 	hctx->poll_considered++;
 
 	state = current->state;
-	while (!need_resched()) {
+	do {
 		int ret;
 
 		hctx->poll_invoked++;
@@ -3356,7 +3356,7 @@ static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, bool spin)
 		if (ret < 0 || !spin)
 			break;
 		cpu_relax();
-	}
+	} while (!need_resched());
 
 	__set_current_state(TASK_RUNNING);
 	return 0;
-- 
2.17.1



* [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (8 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 09/11] blk-mq: ensure mq_ops ->poll() is entered at least once Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:47   ` Christoph Hellwig
  2018-11-15 19:51 ` [PATCH 11/11] block: don't plug for aio/O_DIRECT HIPRI IO Jens Axboe
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

Inherit the iocb IOCB_HIPRI flag, and pass on REQ_HIPRI for
those kinds of requests.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/block_dev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 7810f5b588ea..c124982b810d 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -386,6 +386,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 
 		nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
 		if (!nr_pages) {
+			if (iocb->ki_flags & IOCB_HIPRI)
+				bio->bi_opf |= REQ_HIPRI;
+
 			qc = submit_bio(bio);
 			break;
 		}
-- 
2.17.1



* [PATCH 11/11] block: don't plug for aio/O_DIRECT HIPRI IO
  2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
                   ` (9 preceding siblings ...)
  2018-11-15 19:51 ` [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to Jens Axboe
@ 2018-11-15 19:51 ` Jens Axboe
  2018-11-16  8:49   ` Christoph Hellwig
  10 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-15 19:51 UTC (permalink / raw)
  To: linux-block; +Cc: Jens Axboe

Those will go straight to issue inside blk-mq, so don't bother
setting up a block plug for them.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/block_dev.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index c124982b810d..9dc695a3af4e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -356,7 +356,13 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 	dio->multi_bio = false;
 	dio->should_dirty = is_read && iter_is_iovec(iter);
 
-	blk_start_plug(&plug);
+	/*
+	 * Don't plug for HIPRI/polled IO, as those should go straight
+	 * to issue
+	 */
+	if (!(iocb->ki_flags & IOCB_HIPRI))
+		blk_start_plug(&plug);
+
 	for (;;) {
 		bio_set_dev(bio, bdev);
 		bio->bi_iter.bi_sector = pos >> 9;
@@ -403,7 +409,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 		submit_bio(bio);
 		bio = bio_alloc(GFP_KERNEL, nr_pages);
 	}
-	blk_finish_plug(&plug);
+
+	if (!(iocb->ki_flags & IOCB_HIPRI))
+		blk_finish_plug(&plug);
 
 	if (!is_sync)
 		return -EIOCBQUEUED;
-- 
2.17.1



* Re: [PATCH 01/11] nvme: provide optimized poll function for separate poll queues
  2018-11-15 19:51 ` [PATCH 01/11] nvme: provide optimized poll function for separate poll queues Jens Axboe
@ 2018-11-16  8:35   ` Christoph Hellwig
  2018-11-16 15:22     ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Thu, Nov 15, 2018 at 12:51:25PM -0700, Jens Axboe wrote:
> If we have separate poll queues, we know that they aren't using
> interrupts. Hence we don't need to disable interrupts around
> finding completions.
> 
> Provide a separate set of blk_mq_ops for such devices.

This looks ok, but I'd prefer if we could offer to just support
polling with the separate queue.  That way we get ourselves out of
all kinds of potential races of the interrupt path vs poll path.


* Re: [PATCH 02/11] block: add queue_is_mq() helper
  2018-11-15 19:51 ` [PATCH 02/11] block: add queue_is_mq() helper Jens Axboe
@ 2018-11-16  8:35   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Thu, Nov 15, 2018 at 12:51:26PM -0700, Jens Axboe wrote:
> Various spots check for q->mq_ops being non-NULL, but provide
> a helper to do this instead.
> 
> Where the ->mq_ops != NULL check is redundant, remove it.
> 
> Since mq == rq-based now that legacy is gone, get rid of the
> queue_is_rq_based() and just use queue_is_mq() everywhere.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 03/11] blk-rq-qos: inline check for q->rq_qos functions
  2018-11-15 19:51 ` [PATCH 03/11] blk-rq-qos: inline check for q->rq_qos functions Jens Axboe
@ 2018-11-16  8:38   ` Christoph Hellwig
  2018-11-16 15:18     ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Josef Bacik

On Thu, Nov 15, 2018 at 12:51:27PM -0700, Jens Axboe wrote:
> Put the short code in the fast path, where we don't have any
> functions attached to the queue. This minimizes the impact on
> the hot path in the core code.

This looks mechanically fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

But since I seem to have missed the introduction of it - why do we need
multiple struct rq_qos per request to start with?  This sort of stacking
seems rather odd and counter-productive, and the commit introducing
this code doesn't explain the rationale at all.


* Re: [PATCH 04/11] block: avoid ordered task state change for polled IO
  2018-11-15 19:51 ` [PATCH 04/11] block: avoid ordered task state change for polled IO Jens Axboe
@ 2018-11-16  8:41   ` Christoph Hellwig
  2018-11-16 15:32     ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Thu, Nov 15, 2018 at 12:51:28PM -0700, Jens Axboe wrote:
> Ensure that writes to the dio/bio waiter field are ordered
> correctly. With the smp_rmb() before the READ_ONCE() check,
> we should be able to use a more relaxed ordering for the
> task state setting. We don't need a heavier barrier on
> the wakeup side after writing the waiter field, since we
> are either going to be in the task we care about, or go through
> wake_up_process() which implies a strong enough barrier.
> 
> For the core poll helper, the task state setting doesn't need
> to imply any atomics, as it's the current task itself that
> is being modified and we're not going to sleep.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  block/blk-mq.c | 4 ++--
>  fs/block_dev.c | 9 +++++++--
>  fs/iomap.c     | 4 +++-
>  mm/page_io.c   | 4 +++-
>  4 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 32b246ed44c0..7fc4abb4cc36 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -3331,12 +3331,12 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
>  		ret = q->mq_ops->poll(hctx, rq->tag);
>  		if (ret > 0) {
>  			hctx->poll_success++;
> -			set_current_state(TASK_RUNNING);
> +			__set_current_state(TASK_RUNNING);
>  			return true;
>  		}
>  
>  		if (signal_pending_state(state, current))
> -			set_current_state(TASK_RUNNING);
> +			__set_current_state(TASK_RUNNING);
>  
>  		if (current->state == TASK_RUNNING)
>  			return true;
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index c039abfb2052..5b754f84c814 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -237,9 +237,12 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
>  
>  	qc = submit_bio(&bio);
>  	for (;;) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> +		__set_current_state(TASK_UNINTERRUPTIBLE);
> +
> +		smp_rmb();
>  		if (!READ_ONCE(bio.bi_private))
>  			break;
> +
>  		if (!(iocb->ki_flags & IOCB_HIPRI) ||
>  		    !blk_poll(bdev_get_queue(bdev), qc))
>  			io_schedule();
> @@ -403,7 +406,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>  		return -EIOCBQUEUED;
>  
>  	for (;;) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> +		__set_current_state(TASK_UNINTERRUPTIBLE);
> +
> +		smp_rmb();
>  		if (!READ_ONCE(dio->waiter))
>  			break;
>  
> diff --git a/fs/iomap.c b/fs/iomap.c
> index f61d13dfdf09..3373ea4984d9 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -1888,7 +1888,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  			return -EIOCBQUEUED;
>  
>  		for (;;) {
> -			set_current_state(TASK_UNINTERRUPTIBLE);
> +			__set_current_state(TASK_UNINTERRUPTIBLE);
> +
> +			smp_rmb();
>  			if (!READ_ONCE(dio->submit.waiter))
>  				break;
>  
> diff --git a/mm/page_io.c b/mm/page_io.c
> index d4d1c89bcddd..008f6d00c47c 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -405,7 +405,9 @@ int swap_readpage(struct page *page, bool synchronous)
>  	bio_get(bio);
>  	qc = submit_bio(bio);
>  	while (synchronous) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> +		__set_current_state(TASK_UNINTERRUPTIBLE);
> +
> +		smp_rmb();
>  		if (!READ_ONCE(bio->bi_private))

I think any smp_rmb() should have a big fat comment explaining it.

Also to help stupid people like me who don't understand why we even
need it here, given the READ_ONCE below.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 05/11] block: add polled wakeup task helper
  2018-11-15 19:51 ` [PATCH 05/11] block: add polled wakeup task helper Jens Axboe
@ 2018-11-16  8:41   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Thu, Nov 15, 2018 at 12:51:29PM -0700, Jens Axboe wrote:
> If we're polling for IO on a device that doesn't use interrupts, then
> the IO completion loop (and waking of the task) is done by the submitting
> task itself. If that is the case, then we don't need to enter the
> wake_up_process() function; we can simply mark ourselves as TASK_RUNNING.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>
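
A minimal sketch of what such a helper looks like, with the name taken
from the patch title (where exactly it lives is assumed):

static inline void blk_wake_io_task(struct task_struct *waiter)
{
	/*
	 * For polled IO, the submitting task is the one reaping the
	 * completion, so there is nobody to wake: just mark ourselves
	 * runnable. Otherwise fall back to a normal wakeup.
	 */
	if (waiter == current)
		__set_current_state(TASK_RUNNING);
	else
		wake_up_process(waiter);
}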

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 07/11] blk-mq: when polling for IO, look for any completion
  2018-11-15 19:51 ` [PATCH 07/11] blk-mq: when polling for IO, look for any completion Jens Axboe
@ 2018-11-16  8:43   ` Christoph Hellwig
  2018-11-16 15:19     ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:43 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Thu, Nov 15, 2018 at 12:51:31PM -0700, Jens Axboe wrote:
> If we want to support async IO polling, then we have to allow
> finding completions that aren't just for the one we are
> looking for. Always pass in -1 to the mq_ops->poll() helper,
> and have that return how many events were found in this poll
> loop.

Well, if we always put -1 in we can as well remove the argument,
especially given that you change the prototype in the patch before
this one.

Still digesting the rest of this.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to
  2018-11-15 19:51 ` [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to Jens Axboe
@ 2018-11-16  8:47   ` Christoph Hellwig
  2018-11-16  8:48     ` Christoph Hellwig
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:47 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Thu, Nov 15, 2018 at 12:51:34PM -0700, Jens Axboe wrote:
> Inherit the iocb IOCB_HIPRI flag, and pass on REQ_HIPRI for
> those kinds of requests.

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>
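
The gist of the change, as a sketch of the submission path (variable
names assumed; the real hunk is in fs/block_dev.c):

/* carry the caller's polling intent from the iocb onto each bio */
if (iocb->ki_flags & IOCB_HIPRI)
	bio->bi_opf |= REQ_HIPRI;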

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to
  2018-11-16  8:47   ` Christoph Hellwig
@ 2018-11-16  8:48     ` Christoph Hellwig
  2018-11-16 15:19       ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Fri, Nov 16, 2018 at 12:47:39AM -0800, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 12:51:34PM -0700, Jens Axboe wrote:
> > Inherit the iocb IOCB_HIPRI flag, and pass on REQ_HIPRI for
> > those kinds of requests.
> 
> Looks fine,

Actually.  Who is going to poll for them?  With the separate poll
queue this means they are now on the poll queue where there are
no interrupts, but aio won't end up calling blk_poll..

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 11/11] block: don't plug for aio/O_DIRECT HIPRI IO
  2018-11-15 19:51 ` [PATCH 11/11] block: don't plug for aio/O_DIRECT HIPRI IO Jens Axboe
@ 2018-11-16  8:49   ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2018-11-16  8:49 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Thu, Nov 15, 2018 at 12:51:35PM -0700, Jens Axboe wrote:
> Those will go straight to issue inside blk-mq, so don't bother
> setting up a block plug for them.

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>
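
The shape of the change, as a sketch (the finish side matches the
__blkdev_direct_IO() hunk visible earlier in the thread; the start side
is assumed symmetric):

/* Polled IO goes straight to issue inside blk-mq, so a plug buys
 * nothing; only plug for IO we won't be polling for. */
if (!(iocb->ki_flags & IOCB_HIPRI))
	blk_start_plug(&plug);

/* ... bio allocation and submit_bio() loop ... */

if (!(iocb->ki_flags & IOCB_HIPRI))
	blk_finish_plug(&plug);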

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 03/11] blk-rq-qos: inline check for q->rq_qos functions
  2018-11-16  8:38   ` Christoph Hellwig
@ 2018-11-16 15:18     ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-16 15:18 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block, Josef Bacik

On 11/16/18 1:38 AM, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 12:51:27PM -0700, Jens Axboe wrote:
>> Put the short code in the fast path, where we don't have any
>> functions attached to the queue. This minimizes the impact on
>> the hot path in the core code.
> 
> This looks mechanically fine:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> But since I seem to have missed the introduction of it - why do we need
> multiple struct rq_qos per request to start with?  This sort of stacking
> seems rather odd and counter-productive, and the commit introducing
> this code doesn't explain the rationale at all.

Per request-queue, not per request. One would be iolatency, one would
be wbt, etc.
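
Roughly, each policy contributes one node to a singly linked list
hanging off the request queue; a sketch of the layout (field order and
exact members assumed):

struct rq_qos {
	struct rq_qos_ops *ops;		/* throttle/track/done callbacks */
	struct request_queue *q;
	enum rq_qos_id id;		/* e.g. RQ_QOS_WBT, RQ_QOS_CGROUP */
	struct rq_qos *next;		/* next policy on this queue */
};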

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 07/11] blk-mq: when polling for IO, look for any completion
  2018-11-16  8:43   ` Christoph Hellwig
@ 2018-11-16 15:19     ` Jens Axboe
  2018-11-16 16:57       ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2018-11-16 15:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 11/16/18 1:43 AM, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 12:51:31PM -0700, Jens Axboe wrote:
>> If we want to support async IO polling, then we have to allow
>> finding completions that aren't just for the one we are
>> looking for. Always pass in -1 to the mq_ops->poll() helper,
>> and have that return how many events were found in this poll
>> loop.
> 
> Well, if we always put -1 in we can as well remove the argument,
> especially given that you change the prototype in the patch before
> this one.
> 
> Still digesting the rest of this.

We can remove the argument for the mq_ops->poll, as long as we
retain it for the q->poll. For the latter, we need it for
destination lookup.
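
A rough sketch of the prototype change under discussion (illustration
only; the final form is whatever ends up in the tree):

/* before: poll until a specific tag is found */
int (*poll)(struct blk_mq_hw_ctx *hctx, unsigned int tag);

/* after: reap anything that has completed on this hardware queue and
 * return the number of completions found */
int (*poll)(struct blk_mq_hw_ctx *hctx);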

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to
  2018-11-16  8:48     ` Christoph Hellwig
@ 2018-11-16 15:19       ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-16 15:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 11/16/18 1:48 AM, Christoph Hellwig wrote:
> On Fri, Nov 16, 2018 at 12:47:39AM -0800, Christoph Hellwig wrote:
>> On Thu, Nov 15, 2018 at 12:51:34PM -0700, Jens Axboe wrote:
>>> Inherit the iocb IOCB_HIPRI flag, and pass on REQ_HIPRI for
>>> those kinds of requests.
>>
>> Looks fine,
> 
> Actually.  Who is going to poll for them?  With the separate poll
> queue this means they are now on the poll queue where there are
> no interrupts, but aio won't end up calling blk_poll..

It's a prep patch; I have patches that enable polling for libaio,
so aio will call blk_poll.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 01/11] nvme: provide optimized poll function for separate poll queues
  2018-11-16  8:35   ` Christoph Hellwig
@ 2018-11-16 15:22     ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-16 15:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 11/16/18 1:35 AM, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 12:51:25PM -0700, Jens Axboe wrote:
>> If we have separate poll queues, we know that they aren't using
>> interrupts. Hence we don't need to disable interrupts around
>> finding completions.
>>
>> Provide a separate set of blk_mq_ops for such devices.
> 
> This looks ok, but I'd prefer it if we could offer to support polling
> only with the separate queue.  That way we get ourselves out of all
> kinds of potential races between the interrupt path and the poll path.

As Keith mentioned, we do use polling to find missing completions
in case of timeouts. And that has actually been really useful.

I'd rather keep such a change separate. If we do go down that
route, then there are more optimizations we can make.

Finally, let's not forget that polling is/was still a win even
if we did trigger interrupts. That's how NVMe has been since
polling was introduced. While the newer stuff is a lot more
efficient, I don't think we should totally abandon an easy opt-in
for polling for hardware unless we have strong reasons to do so.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/11] block: avoid ordered task state change for polled IO
  2018-11-16  8:41   ` Christoph Hellwig
@ 2018-11-16 15:32     ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-16 15:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 11/16/18 1:41 AM, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 12:51:28PM -0700, Jens Axboe wrote:
>> Ensure that writes to the dio/bio waiter field are ordered
>> correctly. With the smp_rmb() before the READ_ONCE() check,
>> we should be able to use a more relaxed ordering for the
>> task state setting. We don't need a heavier barrier on
>> the wakeup side after writing the waiter field, since we
>> are either going to be in the task we care about, or go through
>> wake_up_process() which implies a strong enough barrier.
>>
>> For the core poll helper, the task state setting doesn't need
>> to imply any atomics, as it's the current task itself that
>> is being modified and we're not going to sleep.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  block/blk-mq.c | 4 ++--
>>  fs/block_dev.c | 9 +++++++--
>>  fs/iomap.c     | 4 +++-
>>  mm/page_io.c   | 4 +++-
>>  4 files changed, 15 insertions(+), 6 deletions(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 32b246ed44c0..7fc4abb4cc36 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -3331,12 +3331,12 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
>>  		ret = q->mq_ops->poll(hctx, rq->tag);
>>  		if (ret > 0) {
>>  			hctx->poll_success++;
>> -			set_current_state(TASK_RUNNING);
>> +			__set_current_state(TASK_RUNNING);
>>  			return true;
>>  		}
>>  
>>  		if (signal_pending_state(state, current))
>> -			set_current_state(TASK_RUNNING);
>> +			__set_current_state(TASK_RUNNING);
>>  
>>  		if (current->state == TASK_RUNNING)
>>  			return true;
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index c039abfb2052..5b754f84c814 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -237,9 +237,12 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
>>  
>>  	qc = submit_bio(&bio);
>>  	for (;;) {
>> -		set_current_state(TASK_UNINTERRUPTIBLE);
>> +		__set_current_state(TASK_UNINTERRUPTIBLE);
>> +
>> +		smp_rmb();
>>  		if (!READ_ONCE(bio.bi_private))
>>  			break;
>> +
>>  		if (!(iocb->ki_flags & IOCB_HIPRI) ||
>>  		    !blk_poll(bdev_get_queue(bdev), qc))
>>  			io_schedule();
>> @@ -403,7 +406,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>>  		return -EIOCBQUEUED;
>>  
>>  	for (;;) {
>> -		set_current_state(TASK_UNINTERRUPTIBLE);
>> +		__set_current_state(TASK_UNINTERRUPTIBLE);
>> +
>> +		smp_rmb();
>>  		if (!READ_ONCE(dio->waiter))
>>  			break;
>>  
>> diff --git a/fs/iomap.c b/fs/iomap.c
>> index f61d13dfdf09..3373ea4984d9 100644
>> --- a/fs/iomap.c
>> +++ b/fs/iomap.c
>> @@ -1888,7 +1888,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>>  			return -EIOCBQUEUED;
>>  
>>  		for (;;) {
>> -			set_current_state(TASK_UNINTERRUPTIBLE);
>> +			__set_current_state(TASK_UNINTERRUPTIBLE);
>> +
>> +			smp_rmb();
>>  			if (!READ_ONCE(dio->submit.waiter))
>>  				break;
>>  
>> diff --git a/mm/page_io.c b/mm/page_io.c
>> index d4d1c89bcddd..008f6d00c47c 100644
>> --- a/mm/page_io.c
>> +++ b/mm/page_io.c
>> @@ -405,7 +405,9 @@ int swap_readpage(struct page *page, bool synchronous)
>>  	bio_get(bio);
>>  	qc = submit_bio(bio);
>>  	while (synchronous) {
>> -		set_current_state(TASK_UNINTERRUPTIBLE);
>> +		__set_current_state(TASK_UNINTERRUPTIBLE);
>> +
>> +		smp_rmb();
>>  		if (!READ_ONCE(bio->bi_private))
> 
> I think any smp_rmb() should have a big fat comment explaining it.
> 
> Also to help stupid people like me who don't understand why we even
> need it here, given the READ_ONCE below.

Thinking about it, I don't think we need it at all. The barrier for
the task check is done on the wakeup side, and the READ_ONCE() should
be enough for the read below. I'll update it.
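
The resulting wait loop would then look roughly like the following
(a sketch of the __blkdev_direct_IO_simple() case with the smp_rmb()
dropped; the completion side clears bi_private and wakes the task):

for (;;) {
	__set_current_state(TASK_UNINTERRUPTIBLE);

	if (!READ_ONCE(bio.bi_private))
		break;

	if (!(iocb->ki_flags & IOCB_HIPRI) ||
	    !blk_poll(bdev_get_queue(bdev), qc))
		io_schedule();
}
__set_current_state(TASK_RUNNING);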

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 07/11] blk-mq: when polling for IO, look for any completion
  2018-11-16 15:19     ` Jens Axboe
@ 2018-11-16 16:57       ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-16 16:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 11/16/18 8:19 AM, Jens Axboe wrote:
> On 11/16/18 1:43 AM, Christoph Hellwig wrote:
>> On Thu, Nov 15, 2018 at 12:51:31PM -0700, Jens Axboe wrote:
>>> If we want to support async IO polling, then we have to allow
>>> finding completions that aren't just for the one we are
>>> looking for. Always pass in -1 to the mq_ops->poll() helper,
>>> and have that return how many events were found in this poll
>>> loop.
>>
>> Well, if we always put -1 in we can as well remove the argument,
>> especially given that you change the prototype in the patch before
>> this one.
>>
>> Still digesting the rest of this.
> 
> We can remove the argument for the mq_ops->poll, as long as we
> retain it for the q->poll. For the latter, we need it for
> destination lookup.

Alright, went over this. I killed the nvme-fc poll implementation;
it isn't very useful at all. I fixed up the RDMA polling to work
fine with this, and then I killed the parameter. It's all in my
mq-perf branch; I believe it should now be palatable and good to
go.

Meat of it:

http://git.kernel.dk/cgit/linux-block/commit/?h=mq-perf&id=fee7a35981777e5df508baaacebff50147aad966

and removal:

http://git.kernel.dk/cgit/linux-block/commit/?h=mq-perf&id=bec293e546c6c7b01a276998a287427ba7e8b775

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCHSET v2 0/11] Various block optimizations
@ 2018-11-13 15:42 Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2018-11-13 15:42 UTC (permalink / raw)
  To: linux-block

Some of these are optimizations, the latter part is prep work
for supporting polling with aio.

Patches against my for-4.21/block branch. These patches can also
be found in my mq-perf branch, though there are other patches
sitting on top of this series (notably aio polling, as mentioned).

Changes since v1:

- Improve nvme irq disabling for polled IO
- Fix barriers in the ordered wakeup for polled O_DIRECT
- Add patch to allow polling to find any command that is done
- Add patch to control whether polling spins or not
- Have async O_DIRECT mark a bio as pollable
- Don't plug for polling


 block/blk-cgroup.c                |   8 +--
 block/blk-core.c                  |  20 ++++----
 block/blk-flush.c                 |   3 +-
 block/blk-mq-debugfs.c            |   2 +-
 block/blk-mq.c                    | 105 +++++++++++++++++++++-----------------
 block/blk-mq.h                    |  12 ++---
 block/blk-rq-qos.c                |  90 +++++++++-----------------------
 block/blk-rq-qos.h                |  35 ++++++++++---
 block/blk-softirq.c               |   4 +-
 block/blk-sysfs.c                 |  18 +++----
 block/blk-wbt.c                   |   2 +-
 block/elevator.c                  |   9 ++--
 block/genhd.c                     |   8 +--
 drivers/md/dm-table.c             |   2 +-
 drivers/nvme/host/multipath.c     |   6 +--
 drivers/nvme/host/pci.c           |  45 +++++++++-------
 drivers/nvme/target/io-cmd-bdev.c |   2 +-
 drivers/scsi/scsi_lib.c           |   2 +-
 fs/block_dev.c                    |  32 +++++++++---
 fs/direct-io.c                    |   2 +-
 fs/iomap.c                        |   9 ++--
 include/linux/blk-mq-ops.h        | 100 ++++++++++++++++++++++++++++++++++++
 include/linux/blk-mq.h            |  94 +---------------------------------
 include/linux/blkdev.h            |  37 +++++++++++---
 mm/page_io.c                      |   2 +-
 25 files changed, 347 insertions(+), 302 deletions(-)

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-11-16 16:57 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-15 19:51 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
2018-11-15 19:51 ` [PATCH 01/11] nvme: provide optimized poll function for separate poll queues Jens Axboe
2018-11-16  8:35   ` Christoph Hellwig
2018-11-16 15:22     ` Jens Axboe
2018-11-15 19:51 ` [PATCH 02/11] block: add queue_is_mq() helper Jens Axboe
2018-11-16  8:35   ` Christoph Hellwig
2018-11-15 19:51 ` [PATCH 03/11] blk-rq-qos: inline check for q->rq_qos functions Jens Axboe
2018-11-16  8:38   ` Christoph Hellwig
2018-11-16 15:18     ` Jens Axboe
2018-11-15 19:51 ` [PATCH 04/11] block: avoid ordered task state change for polled IO Jens Axboe
2018-11-16  8:41   ` Christoph Hellwig
2018-11-16 15:32     ` Jens Axboe
2018-11-15 19:51 ` [PATCH 05/11] block: add polled wakeup task helper Jens Axboe
2018-11-16  8:41   ` Christoph Hellwig
2018-11-15 19:51 ` [PATCH 06/11] block: have ->poll_fn() return number of entries polled Jens Axboe
2018-11-15 19:51 ` [PATCH 07/11] blk-mq: when polling for IO, look for any completion Jens Axboe
2018-11-16  8:43   ` Christoph Hellwig
2018-11-16 15:19     ` Jens Axboe
2018-11-16 16:57       ` Jens Axboe
2018-11-15 19:51 ` [PATCH 08/11] block: make blk_poll() take a parameter on whether to spin or not Jens Axboe
2018-11-15 19:51 ` [PATCH 09/11] blk-mq: ensure mq_ops ->poll() is entered at least once Jens Axboe
2018-11-15 19:51 ` [PATCH 10/11] block: for async O_DIRECT, mark us as polling if asked to Jens Axboe
2018-11-16  8:47   ` Christoph Hellwig
2018-11-16  8:48     ` Christoph Hellwig
2018-11-16 15:19       ` Jens Axboe
2018-11-15 19:51 ` [PATCH 11/11] block: don't plug for aio/O_DIRECT HIPRI IO Jens Axboe
2018-11-16  8:49   ` Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2018-11-13 15:42 [PATCHSET v2 0/11] Various block optimizations Jens Axboe
