All of lore.kernel.org
 help / color / mirror / Atom feed
* [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-07 17:24 Paolo Valente
  2017-02-07 17:24 ` [WIP PATCHSET 1/4] blk-mq: pass bio to blk_mq_sched_get_rq_priv Paolo Valente
                   ` (7 more replies)
  0 siblings, 8 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-07 17:24 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, broonie,
	Paolo Valente

Hi,

I have finally pushed here [1] the current WIP branch of bfq for
blk-mq, which I have tentatively named bfq-mq.

This branch *IS NOT* meant for merging into mainline and contain code
that mau easily violate code style, and not only, in many
places. Commits implement the following main steps:
1) Add the last version of bfq for blk
2) Clone bfq source files into identical bfq-mq source files
3) Modify bfq-mq files to get a working version of bfq for blk-mq
(cgroups support not yet functional)

In my intentions, the main goals of this branch are:

1) Show, as soon as I could, the changes I made to let bfq-mq comply
with blk-mq-sched framework. I though this could be particularly
useful for Jens, being BFQ identical to CFQ in terms of hook
interfaces and io-context handling, and almost identical in terms
request-merging.

2) Enable people to test this first version bfq-mq. Code is purposely
overfull of log messages and invariant checks that halt the system on
failure (lock assertions, BUG_ONs, ...).

To make it easier to revise commits, I'm sending the patches that
transform bfq into bfq-mq (last four patches in the branch [1]). They
work on two files, bfq-mq-iosched.c and bfq-mq.h, which, at the
beginning, are just copies of bfq-iosched.c and bfq.h.

Thanks,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

Paolo Valente (4):
  blk-mq: pass bio to blk_mq_sched_get_rq_priv
  Move thinktime from bic to bfqq
  Embed bfq-ioc.c and add locking on request queue
  Modify interface and operation to comply with blk-mq-sched

 block/bfq-cgroup.c       |   4 -
 block/bfq-mq-iosched.c   | 852 +++++++++++++++++++++++++++++------------------
 block/bfq-mq.h           |  65 ++--
 block/blk-mq-sched.c     |   8 +-
 block/blk-mq-sched.h     |   5 +-
 include/linux/elevator.h |   2 +-
 6 files changed, 567 insertions(+), 369 deletions(-)

--
2.10.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [WIP PATCHSET 1/4] blk-mq: pass bio to blk_mq_sched_get_rq_priv
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
@ 2017-02-07 17:24 ` Paolo Valente
  2017-02-10 16:10   ` Jens Axboe
  2017-02-07 17:24 ` [WIP PATCHSET 2/4] Move thinktime from bic to bfqq Paolo Valente
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Paolo Valente @ 2017-02-07 17:24 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, broonie,
	Paolo Valente

bio is used in bfq-mq's get_rq_priv, to get the request group. We could
pass directly the group here, but I thought that passing the bio was
more general, giving the possibility to get other pieces of information
if needed.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/blk-mq-sched.c     | 8 +++++---
 block/blk-mq-sched.h     | 5 +++--
 include/linux/elevator.h | 2 +-
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index ee455e7..314a8ed 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -68,7 +68,9 @@ int blk_mq_sched_init_hctx_data(struct request_queue *q, size_t size,
 EXPORT_SYMBOL_GPL(blk_mq_sched_init_hctx_data);
 
 static void __blk_mq_sched_assign_ioc(struct request_queue *q,
-				      struct request *rq, struct io_context *ioc)
+				      struct request *rq,
+				      struct bio *bio,
+				      struct io_context *ioc)
 {
 	struct io_cq *icq;
 
@@ -83,7 +85,7 @@ static void __blk_mq_sched_assign_ioc(struct request_queue *q,
 	}
 
 	rq->elv.icq = icq;
-	if (!blk_mq_sched_get_rq_priv(q, rq)) {
+	if (!blk_mq_sched_get_rq_priv(q, rq, bio)) {
 		rq->rq_flags |= RQF_ELVPRIV;
 		get_io_context(icq->ioc);
 		return;
@@ -99,7 +101,7 @@ static void blk_mq_sched_assign_ioc(struct request_queue *q,
 
 	ioc = rq_ioc(bio);
 	if (ioc)
-		__blk_mq_sched_assign_ioc(q, rq, ioc);
+		__blk_mq_sched_assign_ioc(q, rq, bio, ioc);
 }
 
 struct request *blk_mq_sched_get_request(struct request_queue *q,
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index 5954859..7b5f3b9 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -49,12 +49,13 @@ blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 }
 
 static inline int blk_mq_sched_get_rq_priv(struct request_queue *q,
-					   struct request *rq)
+					   struct request *rq,
+					   struct bio *bio)
 {
 	struct elevator_queue *e = q->elevator;
 
 	if (e && e->type->ops.mq.get_rq_priv)
-		return e->type->ops.mq.get_rq_priv(q, rq);
+		return e->type->ops.mq.get_rq_priv(q, rq, bio);
 
 	return 0;
 }
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index b5825c4..ce14ede 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -99,7 +99,7 @@ struct elevator_mq_ops {
 	void (*requeue_request)(struct request *);
 	struct request *(*former_request)(struct request_queue *, struct request *);
 	struct request *(*next_request)(struct request_queue *, struct request *);
-	int (*get_rq_priv)(struct request_queue *, struct request *);
+	int (*get_rq_priv)(struct request_queue *, struct request *, struct bio *);
 	void (*put_rq_priv)(struct request_queue *, struct request *);
 	void (*init_icq)(struct io_cq *);
 	void (*exit_icq)(struct io_cq *);
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [WIP PATCHSET 2/4] Move thinktime from bic to bfqq
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
  2017-02-07 17:24 ` [WIP PATCHSET 1/4] blk-mq: pass bio to blk_mq_sched_get_rq_priv Paolo Valente
@ 2017-02-07 17:24 ` Paolo Valente
  2017-02-07 17:24 ` [WIP PATCHSET 3/4] Embed bfq-ioc.c and add locking on request queue Paolo Valente
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-07 17:24 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, broonie,
	Paolo Valente

Prep change to make it possible to protect this field with a
scheduler lock.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-mq-iosched.c | 28 ++++++++++++++--------------
 block/bfq-mq.h         | 30 ++++++++++++++++--------------
 2 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/block/bfq-mq-iosched.c b/block/bfq-mq-iosched.c
index 0a7da4e..7e3f429 100644
--- a/block/bfq-mq-iosched.c
+++ b/block/bfq-mq-iosched.c
@@ -670,6 +670,7 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_io_cq *bic)
 	else
 		bfq_clear_bfqq_IO_bound(bfqq);
 
+	bfqq->ttime = bic->saved_ttime;
 	bfqq->wr_coeff = bic->saved_wr_coeff;
 	bfqq->wr_start_at_switch_to_srt = bic->saved_wr_start_at_switch_to_srt;
 	BUG_ON(time_is_after_jiffies(bfqq->wr_start_at_switch_to_srt));
@@ -1247,7 +1248,7 @@ static void bfq_bfqq_handle_idle_busy_switch(struct bfq_data *bfqd,
 		 * details on the usage of the next variable.
 		 */
 		arrived_in_time =  ktime_get_ns() <=
-			RQ_BIC(rq)->ttime.last_end_request +
+			bfqq->ttime.last_end_request +
 			bfqd->bfq_slice_idle * 3;
 
 	bfq_log_bfqq(bfqd, bfqq,
@@ -2003,6 +2004,7 @@ static void bfq_bfqq_save_state(struct bfq_queue *bfqq)
 	if (!bic)
 		return;
 
+	bic->saved_ttime = bfqq->ttime;
 	bic->saved_idle_window = bfq_bfqq_idle_window(bfqq);
 	bic->saved_IO_bound = bfq_bfqq_IO_bound(bfqq);
 	bic->saved_in_large_burst = bfq_bfqq_in_large_burst(bfqq);
@@ -3870,11 +3872,6 @@ static void bfq_exit_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 	bfq_put_queue(bfqq);
 }
 
-static void bfq_init_icq(struct io_cq *icq)
-{
-	icq_to_bic(icq)->ttime.last_end_request = ktime_get_ns() - (1ULL<<32);
-}
-
 static void bfq_exit_icq(struct io_cq *icq)
 {
 	struct bfq_io_cq *bic = icq_to_bic(icq);
@@ -4000,6 +3997,9 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		bfq_mark_bfqq_just_created(bfqq);
 	} else
 		bfq_clear_bfqq_sync(bfqq);
+
+	bfqq->ttime.last_end_request = ktime_get_ns() - (1ULL<<32);
+
 	bfq_mark_bfqq_IO_bound(bfqq);
 
 	/* Tentative initial value to trade off between thr and lat */
@@ -4107,14 +4107,14 @@ static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 }
 
 static void bfq_update_io_thinktime(struct bfq_data *bfqd,
-				    struct bfq_io_cq *bic)
+				    struct bfq_queue *bfqq)
 {
-	struct bfq_ttime *ttime = &bic->ttime;
-	u64 elapsed = ktime_get_ns() - bic->ttime.last_end_request;
+	struct bfq_ttime *ttime = &bfqq->ttime;
+	u64 elapsed = ktime_get_ns() - bfqq->ttime.last_end_request;
 
 	elapsed = min_t(u64, elapsed, 2 * bfqd->bfq_slice_idle);
 
-	ttime->ttime_samples = (7*bic->ttime.ttime_samples + 256) / 8;
+	ttime->ttime_samples = (7*bfqq->ttime.ttime_samples + 256) / 8;
 	ttime->ttime_total = div_u64(7*ttime->ttime_total + 256*elapsed,  8);
 	ttime->ttime_mean = div64_ul(ttime->ttime_total + 128,
 				     ttime->ttime_samples);
@@ -4157,8 +4157,8 @@ static void bfq_update_idle_window(struct bfq_data *bfqd,
 		(bfqd->hw_tag && BFQQ_SEEKY(bfqq) &&
 			bfqq->wr_coeff == 1))
 		enable_idle = 0;
-	else if (bfq_sample_valid(bic->ttime.ttime_samples)) {
-		if (bic->ttime.ttime_mean > bfqd->bfq_slice_idle &&
+	else if (bfq_sample_valid(bfqq->ttime.ttime_samples)) {
+		if (bfqq->ttime.ttime_mean > bfqd->bfq_slice_idle &&
 			bfqq->wr_coeff == 1)
 			enable_idle = 0;
 		else
@@ -4185,7 +4185,7 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	if (rq->cmd_flags & REQ_META)
 		bfqq->meta_pending++;
 
-	bfq_update_io_thinktime(bfqd, bic);
+	bfq_update_io_thinktime(bfqd, bfqq);
 	bfq_update_io_seektime(bfqd, bfqq, rq);
 	if (bfqq->entity.service > bfq_max_budget(bfqd) / 8 ||
 	    !BFQQ_SEEKY(bfqq))
@@ -4354,7 +4354,7 @@ static void bfq_completed_request(struct request_queue *q, struct request *rq)
 
 	now_ns = ktime_get_ns();
 
-	RQ_BIC(rq)->ttime.last_end_request = now_ns;
+	bfqq->ttime.last_end_request = now_ns;
 
 	/*
 	 * Using us instead of ns, to get a reasonable precision in
diff --git a/block/bfq-mq.h b/block/bfq-mq.h
index f28feb6..c6acee2 100644
--- a/block/bfq-mq.h
+++ b/block/bfq-mq.h
@@ -199,6 +199,18 @@ struct bfq_entity {
 struct bfq_group;
 
 /**
+ * struct bfq_ttime - per process thinktime stats.
+ */
+struct bfq_ttime {
+	u64 last_end_request; /* completion time of last request */
+
+	u64 ttime_total; /* total process thinktime */
+	unsigned long ttime_samples; /* number of thinktime samples */
+	u64 ttime_mean; /* average process thinktime */
+
+};
+
+/**
  * struct bfq_queue - leaf schedulable entity.
  *
  * A bfq_queue is a leaf request queue; it can be associated with an
@@ -259,6 +271,9 @@ struct bfq_queue {
 	/* node for active/idle bfqq list inside parent bfqd */
 	struct list_head bfqq_list;
 
+	/* associated @bfq_ttime struct */
+	struct bfq_ttime ttime;
+
 	/* bit vector: a 1 for each seeky requests in history */
 	u32 seek_history;
 
@@ -322,18 +337,6 @@ struct bfq_queue {
 };
 
 /**
- * struct bfq_ttime - per process thinktime stats.
- */
-struct bfq_ttime {
-	u64 last_end_request; /* completion time of last request */
-
-	u64 ttime_total; /* total process thinktime */
-	unsigned long ttime_samples; /* number of thinktime samples */
-	u64 ttime_mean; /* average process thinktime */
-
-};
-
-/**
  * struct bfq_io_cq - per (request_queue, io_context) structure.
  */
 struct bfq_io_cq {
@@ -341,8 +344,6 @@ struct bfq_io_cq {
 	struct io_cq icq; /* must be the first member */
 	/* array of two process queues, the sync and the async */
 	struct bfq_queue *bfqq[2];
-	/* associated @bfq_ttime struct */
-	struct bfq_ttime ttime;
 	/* per (request_queue, blkcg) ioprio */
 	int ioprio;
 #ifdef BFQ_GROUP_IOSCHED_ENABLED
@@ -379,6 +380,7 @@ struct bfq_io_cq {
 	unsigned long saved_last_wr_start_finish;
 	unsigned long saved_wr_start_at_switch_to_srt;
 	unsigned int saved_wr_cur_max_time;
+	struct bfq_ttime saved_ttime;
 };
 
 enum bfq_device_speed {
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [WIP PATCHSET 3/4] Embed bfq-ioc.c and add locking on request queue
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
  2017-02-07 17:24 ` [WIP PATCHSET 1/4] blk-mq: pass bio to blk_mq_sched_get_rq_priv Paolo Valente
  2017-02-07 17:24 ` [WIP PATCHSET 2/4] Move thinktime from bic to bfqq Paolo Valente
@ 2017-02-07 17:24 ` Paolo Valente
  2017-02-07 17:24 ` [WIP PATCHSET 4/4] Modify interface and operation to comply with blk-mq-sched Paolo Valente
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-07 17:24 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, broonie,
	Paolo Valente

The version of bfq-ioc.c for bfq-iosched.c is not correct any more for
bfq-mq, because, in bfq-mq, the request queue lock is not being held
when bfq_bic_lookup is invoked. That function must then take that look
on its own. This commit removes the inclusion of bfq-ioc.c, copies the
content of bfq-ioc.c into bfq-mq-iosched.c, and adds the grabbing of
the lock.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-mq-iosched.c | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/block/bfq-mq-iosched.c b/block/bfq-mq-iosched.c
index 7e3f429..a8679de 100644
--- a/block/bfq-mq-iosched.c
+++ b/block/bfq-mq-iosched.c
@@ -190,7 +190,39 @@ static int device_speed_thresh[2];
 
 static void bfq_schedule_dispatch(struct bfq_data *bfqd);
 
-#include "bfq-ioc.c"
+/**
+ * icq_to_bic - convert iocontext queue structure to bfq_io_cq.
+ * @icq: the iocontext queue.
+ */
+static struct bfq_io_cq *icq_to_bic(struct io_cq *icq)
+{
+	/* bic->icq is the first member, %NULL will convert to %NULL */
+	return container_of(icq, struct bfq_io_cq, icq);
+}
+
+/**
+ * bfq_bic_lookup - search into @ioc a bic associated to @bfqd.
+ * @bfqd: the lookup key.
+ * @ioc: the io_context of the process doing I/O.
+ * @q: the request queue.
+ */
+static struct bfq_io_cq *bfq_bic_lookup(struct bfq_data *bfqd,
+					struct io_context *ioc,
+					struct request_queue *q)
+{
+	if (ioc) {
+		struct bfq_io_cq *icq;
+
+		spin_lock_irq(q->queue_lock);
+		icq = icq_to_bic(ioc_lookup_icq(ioc, q));
+		spin_unlock_irq(q->queue_lock);
+
+		return icq;
+	}
+
+	return NULL;
+}
+
 #include "bfq-sched.c"
 #include "bfq-cgroup.c"
 
@@ -1480,13 +1512,14 @@ static void bfq_add_request(struct request *rq)
 }
 
 static struct request *bfq_find_rq_fmerge(struct bfq_data *bfqd,
-					  struct bio *bio)
+					  struct bio *bio,
+					  struct request_queue *q)
 {
 	struct task_struct *tsk = current;
 	struct bfq_io_cq *bic;
 	struct bfq_queue *bfqq;
 
-	bic = bfq_bic_lookup(bfqd, tsk->io_context);
+	bic = bfq_bic_lookup(bfqd, tsk->io_context, q);
 	if (!bic)
 		return NULL;
 
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [WIP PATCHSET 4/4] Modify interface and operation to comply with blk-mq-sched
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
                   ` (2 preceding siblings ...)
  2017-02-07 17:24 ` [WIP PATCHSET 3/4] Embed bfq-ioc.c and add locking on request queue Paolo Valente
@ 2017-02-07 17:24 ` Paolo Valente
  2017-02-10 16:08   ` Bart Van Assche
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-07 17:24 UTC (permalink / raw)
  To: Jens Axboe, Tejun Heo
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, broonie,
	Paolo Valente

As for modifications of the operation, the major changes are the introduction
of a scheduler lock, and the moving to deferred work of the body of the hook
exit_icq. The latter change has been made to avoid deadlocks caused by the
combination of the following facts: 1) such a body  takes the scheduler lock,
and, if not deferred, 2) it does so from inside the exit_icq hook, which is
invoked with the queue lock held, and 3) there is at least one code path,
namely that starting from bfq_bio_merge, which takes these locks in the
opposite order.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-cgroup.c     |   4 -
 block/bfq-mq-iosched.c | 791 ++++++++++++++++++++++++++++++-------------------
 block/bfq-mq.h         |  37 +--
 3 files changed, 496 insertions(+), 336 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 36b68ec..7ecce47 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -472,8 +472,6 @@ static struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd,
 	struct bfq_group *bfqg, *parent;
 	struct bfq_entity *entity;
 
-	assert_spin_locked(bfqd->queue->queue_lock);
-
 	bfqg = bfq_lookup_bfqg(bfqd, blkcg);
 
 	if (unlikely(!bfqg))
@@ -602,8 +600,6 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
 	struct bfq_group *bfqg;
 	struct bfq_entity *entity;
 
-	lockdep_assert_held(bfqd->queue->queue_lock);
-
 	bfqg = bfq_find_set_group(bfqd, blkcg);
 
 	if (unlikely(!bfqg))
diff --git a/block/bfq-mq-iosched.c b/block/bfq-mq-iosched.c
index a8679de..05a12b6 100644
--- a/block/bfq-mq-iosched.c
+++ b/block/bfq-mq-iosched.c
@@ -76,9 +76,14 @@
 #include <linux/jiffies.h>
 #include <linux/rbtree.h>
 #include <linux/ioprio.h>
-#undef CONFIG_BFQ_GROUP_IOSCHED /* cgroups support not yet functional */
-#include "bfq.h"
+#include <linux/sbitmap.h>
+#include <linux/delay.h>
+
 #include "blk.h"
+#include "blk-mq.h"
+#include "blk-mq-tag.h"
+#include "blk-mq-sched.h"
+#include "bfq-mq.h"
 
 /* Expiration time of sync (0) and async (1) requests, in ns. */
 static const u64 bfq_fifo_expire[2] = { NSEC_PER_SEC / 4, NSEC_PER_SEC / 8 };
@@ -188,8 +193,6 @@ static int device_speed_thresh[2];
 #define RQ_BIC(rq)		((struct bfq_io_cq *) (rq)->elv.priv[0])
 #define RQ_BFQQ(rq)		((rq)->elv.priv[1])
 
-static void bfq_schedule_dispatch(struct bfq_data *bfqd);
-
 /**
  * icq_to_bic - convert iocontext queue structure to bfq_io_cq.
  * @icq: the iocontext queue.
@@ -211,11 +214,12 @@ static struct bfq_io_cq *bfq_bic_lookup(struct bfq_data *bfqd,
 					struct request_queue *q)
 {
 	if (ioc) {
+		unsigned long flags;
 		struct bfq_io_cq *icq;
 
-		spin_lock_irq(q->queue_lock);
+		spin_lock_irqsave(q->queue_lock, flags);
 		icq = icq_to_bic(ioc_lookup_icq(ioc, q));
-		spin_unlock_irq(q->queue_lock);
+		spin_unlock_irqrestore(q->queue_lock, flags);
 
 		return icq;
 	}
@@ -239,7 +243,7 @@ static void bfq_schedule_dispatch(struct bfq_data *bfqd)
 {
 	if (bfqd->queued != 0) {
 		bfq_log(bfqd, "schedule dispatch");
-		kblockd_schedule_work(&bfqd->unplug_work);
+		blk_mq_run_hw_queues(bfqd->queue, true);
 	}
 }
 
@@ -728,9 +732,9 @@ static int bfqq_process_refs(struct bfq_queue *bfqq)
 {
 	int process_refs, io_refs;
 
-	lockdep_assert_held(bfqq->bfqd->queue->queue_lock);
+	lockdep_assert_held(&bfqq->bfqd->lock);
 
-	io_refs = bfqq->allocated[READ] + bfqq->allocated[WRITE];
+	io_refs = bfqq->allocated;
 	process_refs = bfqq->ref - io_refs - bfqq->entity.on_st;
 	BUG_ON(process_refs < 0);
 	return process_refs;
@@ -1441,6 +1445,8 @@ static void bfq_add_request(struct request *rq)
 	bfqq->queued[rq_is_sync(rq)]++;
 	bfqd->queued++;
 
+	BUG_ON(!RQ_BFQQ(rq));
+	BUG_ON(RQ_BFQQ(rq) != bfqq);
 	elv_rb_add(&bfqq->sort_list, rq);
 
 	/*
@@ -1449,6 +1455,8 @@ static void bfq_add_request(struct request *rq)
 	prev = bfqq->next_rq;
 	next_rq = bfq_choose_req(bfqd, bfqq->next_rq, rq, bfqd->last_position);
 	BUG_ON(!next_rq);
+	BUG_ON(!RQ_BFQQ(next_rq));
+	BUG_ON(RQ_BFQQ(next_rq) != bfqq);
 	bfqq->next_rq = next_rq;
 
 	/*
@@ -1544,6 +1552,7 @@ static sector_t get_sdist(sector_t last_pos, struct request *rq)
 	return sdist;
 }
 
+#if 0 /* Still not clear if we can do without next two functions */
 static void bfq_activate_request(struct request_queue *q, struct request *rq)
 {
 	struct bfq_data *bfqd = q->elevator->elevator_data;
@@ -1557,8 +1566,10 @@ static void bfq_deactivate_request(struct request_queue *q, struct request *rq)
 	BUG_ON(bfqd->rq_in_driver == 0);
 	bfqd->rq_in_driver--;
 }
+#endif
 
-static void bfq_remove_request(struct request *rq)
+static void bfq_remove_request(struct request_queue *q,
+			       struct request *rq)
 {
 	struct bfq_queue *bfqq = RQ_BFQQ(rq);
 	struct bfq_data *bfqd = bfqq->bfqd;
@@ -1569,6 +1580,19 @@ static void bfq_remove_request(struct request *rq)
 
 	if (bfqq->next_rq == rq) {
 		bfqq->next_rq = bfq_find_next_rq(bfqd, bfqq, rq);
+		if (bfqq->next_rq && !RQ_BFQQ(bfqq->next_rq)) {
+			pr_crit("no bfqq! for next rq %p bfqq %p\n",
+				bfqq->next_rq, bfqq);
+		}
+
+		BUG_ON(bfqq->next_rq && !RQ_BFQQ(bfqq->next_rq));
+		if (bfqq->next_rq && RQ_BFQQ(bfqq->next_rq) != bfqq) {
+			pr_crit(
+			"wrong bfqq! for next rq %p, rq_bfqq %p bfqq %p\n",
+			bfqq->next_rq, RQ_BFQQ(bfqq->next_rq), bfqq);
+		}
+		BUG_ON(bfqq->next_rq && RQ_BFQQ(bfqq->next_rq) != bfqq);
+
 		bfq_updated_next_req(bfqd, bfqq);
 	}
 
@@ -1579,6 +1603,10 @@ static void bfq_remove_request(struct request *rq)
 	bfqd->queued--;
 	elv_rb_del(&bfqq->sort_list, rq);
 
+	elv_rqhash_del(q, rq);
+	if (q->last_merge == rq)
+		q->last_merge = NULL;
+
 	if (RB_EMPTY_ROOT(&bfqq->sort_list)) {
 		bfqq->next_rq = NULL;
 
@@ -1616,22 +1644,47 @@ static void bfq_remove_request(struct request *rq)
 	bfqg_stats_update_io_remove(bfqq_group(bfqq), rq->cmd_flags);
 }
 
-static int bfq_merge(struct request_queue *q, struct request **req,
-		     struct bio *bio)
+static bool bfq_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio)
+{
+	struct request_queue *q = hctx->queue;
+	struct bfq_data *bfqd = q->elevator->elevator_data;
+	struct request *free = NULL;
+	bool ret;
+
+	spin_lock_irq(&bfqd->lock);
+	ret = blk_mq_sched_try_merge(q, bio, &free);
+
+	/*
+	 * XXX Not yet freeing without lock held, to avoid an
+	 * inconsistency with respect to the lock-protected invocation
+	 * of blk_mq_sched_try_insert_merge in bfq_bio_merge. Waiting
+	 * for clarifications from Jens.
+	 */
+	if (free)
+		blk_mq_free_request(free);
+	spin_unlock_irq(&bfqd->lock);
+
+	return ret;
+}
+
+static int bfq_request_merge(struct request_queue *q, struct request **req,
+			     struct bio *bio)
 {
 	struct bfq_data *bfqd = q->elevator->elevator_data;
 	struct request *__rq;
 
-	__rq = bfq_find_rq_fmerge(bfqd, bio);
+	__rq = bfq_find_rq_fmerge(bfqd, bio, q);
 	if (__rq && elv_bio_merge_ok(__rq, bio)) {
 		*req = __rq;
+		bfq_log(bfqd, "request_merge: req %p", __rq);
+
 		return ELEVATOR_FRONT_MERGE;
 	}
 
 	return ELEVATOR_NO_MERGE;
 }
 
-static void bfq_merged_request(struct request_queue *q, struct request *req,
+static void bfq_request_merged(struct request_queue *q, struct request *req,
 			       int type)
 {
 	if (type == ELEVATOR_FRONT_MERGE &&
@@ -1645,13 +1698,23 @@ static void bfq_merged_request(struct request_queue *q, struct request *req,
 
 		/* Reposition request in its sort_list */
 		elv_rb_del(&bfqq->sort_list, req);
+		BUG_ON(!RQ_BFQQ(req));
+		BUG_ON(RQ_BFQQ(req) != bfqq);
 		elv_rb_add(&bfqq->sort_list, req);
+
+		spin_lock_irq(&bfqd->lock);
 		/* Choose next request to be served for bfqq */
 		prev = bfqq->next_rq;
 		next_rq = bfq_choose_req(bfqd, bfqq->next_rq, req,
 					 bfqd->last_position);
 		BUG_ON(!next_rq);
+
 		bfqq->next_rq = next_rq;
+
+		bfq_log_bfqq(bfqd, bfqq,
+			"requests_merged: req %p prev %p next_rq %p bfqq %p",
+			     req, prev, next_rq, bfqq);
+
 		/*
 		 * If next_rq changes, update both the queue's budget to
 		 * fit the new request and the queue's position in its
@@ -1661,22 +1724,27 @@ static void bfq_merged_request(struct request_queue *q, struct request *req,
 			bfq_updated_next_req(bfqd, bfqq);
 			bfq_pos_tree_add_move(bfqd, bfqq);
 		}
+		spin_unlock_irq(&bfqd->lock);
 	}
 }
 
-#ifdef BFQ_GROUP_IOSCHED_ENABLED
-static void bfq_bio_merged(struct request_queue *q, struct request *req,
-			   struct bio *bio)
-{
-	bfqg_stats_update_io_merged(bfqq_group(RQ_BFQQ(req)), bio->bi_opf);
-}
-#endif
-
-static void bfq_merged_requests(struct request_queue *q, struct request *rq,
+static void bfq_requests_merged(struct request_queue *q, struct request *rq,
 				struct request *next)
 {
 	struct bfq_queue *bfqq = RQ_BFQQ(rq), *next_bfqq = RQ_BFQQ(next);
 
+	BUG_ON(!RQ_BFQQ(rq));
+	BUG_ON(!RQ_BFQQ(next));
+
+	if (!RB_EMPTY_NODE(&rq->rb_node))
+		goto end;
+
+	bfq_log_bfqq(bfqq->bfqd, bfqq,
+		     "requests_merged: rq %p next %p bfqq %p next_bfqq %p",
+		     rq, next, bfqq, next_bfqq);
+
+	spin_lock_irq(&bfqq->bfqd->lock);
+
 	/*
 	 * If next and rq belong to the same bfq_queue and next is older
 	 * than rq, then reposition rq in the fifo (by substituting next
@@ -1697,7 +1765,10 @@ static void bfq_merged_requests(struct request_queue *q, struct request *rq,
 	if (bfqq->next_rq == next)
 		bfqq->next_rq = rq;
 
-	bfq_remove_request(next);
+	bfq_remove_request(q, next);
+
+	spin_unlock_irq(&bfqq->bfqd->lock);
+end:
 	bfqg_stats_update_io_merged(bfqq_group(bfqq), next->cmd_flags);
 }
 
@@ -1741,7 +1812,7 @@ static void bfq_end_wr(struct bfq_data *bfqd)
 {
 	struct bfq_queue *bfqq;
 
-	spin_lock_irq(bfqd->queue->queue_lock);
+	spin_lock_irq(&bfqd->lock);
 
 	list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list)
 		bfq_bfqq_end_wr(bfqq);
@@ -1749,7 +1820,7 @@ static void bfq_end_wr(struct bfq_data *bfqd)
 		bfq_bfqq_end_wr(bfqq);
 	bfq_end_wr_async(bfqd);
 
-	spin_unlock_irq(bfqd->queue->queue_lock);
+	spin_unlock_irq(&bfqd->lock);
 }
 
 static sector_t bfq_io_struct_pos(void *io_struct, bool request)
@@ -2131,8 +2202,8 @@ bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
 	bfq_put_queue(bfqq);
 }
 
-static int bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
-			       struct bio *bio)
+static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
+				struct bio *bio)
 {
 	struct bfq_data *bfqd = q->elevator->elevator_data;
 	bool is_sync = op_is_sync(bio->bi_opf);
@@ -2150,7 +2221,7 @@ static int bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
 	 * merge only if rq is queued there.
 	 * Queue lock is held here.
 	 */
-	bic = bfq_bic_lookup(bfqd, current->io_context);
+	bic = bfq_bic_lookup(bfqd, current->io_context, q);
 	if (!bic)
 		return false;
 
@@ -2175,12 +2246,6 @@ static int bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
 	return bfqq == RQ_BFQQ(rq);
 }
 
-static int bfq_allow_rq_merge(struct request_queue *q, struct request *rq,
-			      struct request *next)
-{
-	return RQ_BFQQ(rq) == RQ_BFQQ(next);
-}
-
 /*
  * Set the maximum time for the in-service queue to consume its
  * budget. This prevents seeky processes from lowering the throughput.
@@ -2211,7 +2276,6 @@ static void __bfq_set_in_service_queue(struct bfq_data *bfqd,
 {
 	if (bfqq) {
 		bfqg_stats_update_avg_queue_size(bfqq_group(bfqq));
-		bfq_mark_bfqq_must_alloc(bfqq);
 		bfq_clear_bfqq_fifo_expire(bfqq);
 
 		bfqd->budgets_assigned = (bfqd->budgets_assigned*7 + 256) / 8;
@@ -2650,27 +2714,28 @@ static void bfq_update_peak_rate(struct bfq_data *bfqd, struct request *rq)
 }
 
 /*
- * Move request from internal lists to the dispatch list of the request queue
+ * Remove request from internal lists.
  */
-static void bfq_dispatch_insert(struct request_queue *q, struct request *rq)
+static void bfq_dispatch_remove(struct request_queue *q, struct request *rq)
 {
 	struct bfq_queue *bfqq = RQ_BFQQ(rq);
 
 	/*
-	 * For consistency, the next instruction should have been executed
-	 * after removing the request from the queue and dispatching it.
-	 * We execute instead this instruction before bfq_remove_request()
-	 * (and hence introduce a temporary inconsistency), for efficiency.
-	 * In fact, in a forced_dispatch, this prevents two counters related
-	 * to bfqq->dispatched to risk to be uselessly decremented if bfqq
-	 * is not in service, and then to be incremented again after
-	 * incrementing bfqq->dispatched.
+	 * For consistency, the next instruction should have been
+	 * executed after removing the request from the queue and
+	 * dispatching it.  We execute instead this instruction before
+	 * bfq_remove_request() (and hence introduce a temporary
+	 * inconsistency), for efficiency.  In fact, should this
+	 * dispatch occur for a non in-service bfqq, this anticipated
+	 * increment prevents two counters related to bfqq->dispatched
+	 * from risking to be, first, uselessly decremented, and then
+	 * incremented again when the (new) value of bfqq->dispatched
+	 * happens to be taken into account.
 	 */
 	bfqq->dispatched++;
 	bfq_update_peak_rate(q->elevator->elevator_data, rq);
 
-	bfq_remove_request(rq);
-	elv_dispatch_sort(q, rq);
+	bfq_remove_request(q, rq);
 }
 
 static void __bfq_bfqq_expire(struct bfq_data *bfqd, struct bfq_queue *bfqq)
@@ -3534,7 +3599,7 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 	bfq_log_bfqq(bfqd, bfqq, "select_queue: already in-service queue");
 
 	if (bfq_may_expire_for_budg_timeout(bfqq) &&
-	    !hrtimer_active(&bfqd->idle_slice_timer) &&
+	    !bfq_bfqq_wait_request(bfqq) &&
 	    !bfq_bfqq_must_idle(bfqq))
 		goto expire;
 
@@ -3570,7 +3635,6 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 			 * arrives.
 			 */
 			if (bfq_bfqq_wait_request(bfqq)) {
-				BUG_ON(!hrtimer_active(&bfqd->idle_slice_timer));
 				/*
 				 * If we get here: 1) at least a new request
 				 * has arrived but we have not disabled the
@@ -3597,7 +3661,7 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
 	 * for a new request, or has requests waiting for a completion and
 	 * may idle after their completion, then keep it anyway.
 	 */
-	if (hrtimer_active(&bfqd->idle_slice_timer) ||
+	if (bfq_bfqq_wait_request(bfqq) ||
 	    (bfqq->dispatched != 0 && bfq_bfqq_may_idle(bfqq))) {
 		bfqq = NULL;
 		goto keep_queue;
@@ -3676,13 +3740,11 @@ static void bfq_update_wr_data(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 }
 
 /*
- * Dispatch one request from bfqq, moving it to the request queue
- * dispatch list.
+ * Dispatch next request from bfqq.
  */
-static int bfq_dispatch_request(struct bfq_data *bfqd,
-				struct bfq_queue *bfqq)
+static struct request *bfq_dispatch_rq_from_bfqq(struct bfq_data *bfqd,
+						 struct bfq_queue *bfqq)
 {
-	int dispatched = 0;
 	struct request *rq = bfqq->next_rq;
 	unsigned long service_to_charge;
 
@@ -3698,7 +3760,7 @@ static int bfq_dispatch_request(struct bfq_data *bfqd,
 
 	BUG_ON(bfqq->entity.budget < bfqq->entity.service);
 
-	bfq_dispatch_insert(bfqd->queue, rq);
+	bfq_dispatch_remove(bfqd->queue, rq);
 
 	/*
 	 * If weight raising has to terminate for bfqq, then next
@@ -3714,86 +3776,66 @@ static int bfq_dispatch_request(struct bfq_data *bfqd,
 	bfq_update_wr_data(bfqd, bfqq);
 
 	bfq_log_bfqq(bfqd, bfqq,
-			"dispatched %u sec req (%llu), budg left %d",
+	     "dispatched %u sec req (%llu), budg left %d, new disp_nr %d",
 			blk_rq_sectors(rq),
 			(unsigned long long) blk_rq_pos(rq),
-			bfq_bfqq_budget_left(bfqq));
-
-	dispatched++;
+		     bfq_bfqq_budget_left(bfqq),
+		     bfqq->dispatched);
 
 	if (!bfqd->in_service_bic) {
 		atomic_long_inc(&RQ_BIC(rq)->icq.ioc->refcount);
 		bfqd->in_service_bic = RQ_BIC(rq);
 	}
 
+	/*
+	 * Expire bfqq, pretending that its budget expired, if bfqq
+	 * belongs to CLASS_IDLE and other queues are waiting for
+	 * service.
+	 */
 	if (bfqd->busy_queues > 1 && bfq_class_idle(bfqq))
 		goto expire;
 
-	return dispatched;
+	return rq;
 
 expire:
 	bfq_bfqq_expire(bfqd, bfqq, false, BFQ_BFQQ_BUDGET_EXHAUSTED);
-	return dispatched;
-}
-
-static int __bfq_forced_dispatch_bfqq(struct bfq_queue *bfqq)
-{
-	int dispatched = 0;
-
-	while (bfqq->next_rq) {
-		bfq_dispatch_insert(bfqq->bfqd->queue, bfqq->next_rq);
-		dispatched++;
-	}
-
-	BUG_ON(!list_empty(&bfqq->fifo));
-	return dispatched;
+	return rq;
 }
 
-/*
- * Drain our current requests.
- * Used for barriers and when switching io schedulers on-the-fly.
- */
-static int bfq_forced_dispatch(struct bfq_data *bfqd)
+static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
 {
-	struct bfq_queue *bfqq, *n;
-	struct bfq_service_tree *st;
-	int dispatched = 0;
+	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
 
-	bfqq = bfqd->in_service_queue;
-	if (bfqq)
-		__bfq_bfqq_expire(bfqd, bfqq);
+	bfq_log(bfqd, "has_work, dispatch_non_empty %d busy_queues %d",
+		!list_empty_careful(&bfqd->dispatch), bfqd->busy_queues > 0);
 
 	/*
-	 * Loop through classes, and be careful to leave the scheduler
-	 * in a consistent state, as feedback mechanisms and vtime
-	 * updates cannot be disabled during the process.
+	 * Avoiding lock: a race on bfqd->busy_queues should cause at
+	 * most a call to dispatch for nothing
 	 */
-	list_for_each_entry_safe(bfqq, n, &bfqd->active_list, bfqq_list) {
-		st = bfq_entity_service_tree(&bfqq->entity);
-
-		dispatched += __bfq_forced_dispatch_bfqq(bfqq);
-
-		bfqq->max_budget = bfq_max_budget(bfqd);
-		bfq_forget_idle(st);
-	}
-
-	BUG_ON(bfqd->busy_queues != 0);
-
-	return dispatched;
+	return !list_empty_careful(&bfqd->dispatch) ||
+		bfqd->busy_queues > 0;
 }
 
-static int bfq_dispatch_requests(struct request_queue *q, int force)
+static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
-	struct bfq_data *bfqd = q->elevator->elevator_data;
-	struct bfq_queue *bfqq;
+	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
+	struct request *rq = NULL;
+	struct bfq_queue *bfqq = NULL;
+
+	if (!list_empty(&bfqd->dispatch)) {
+		rq = list_first_entry(&bfqd->dispatch, struct request,
+				      queuelist);
+		list_del_init(&rq->queuelist);
+		bfq_log(bfqd,
+			"dispatch requests: picked %p from dispatch list", rq);
+		goto exit;
+	}
 
 	bfq_log(bfqd, "dispatch requests: %d busy queues", bfqd->busy_queues);
 
 	if (bfqd->busy_queues == 0)
-		return 0;
-
-	if (unlikely(force))
-		return bfq_forced_dispatch(bfqd);
+		goto exit;
 
 	/*
 	 * Force device to serve one request at a time if
@@ -3808,25 +3850,53 @@ static int bfq_dispatch_requests(struct request_queue *q, int force)
 	 * throughput.
 	 */
 	if (bfqd->strict_guarantees && bfqd->rq_in_driver > 0)
-		return 0;
+		goto exit;
 
 	bfqq = bfq_select_queue(bfqd);
 	if (!bfqq)
-		return 0;
+		goto exit;
 
 	BUG_ON(bfqq->entity.budget < bfqq->entity.service);
 
 	BUG_ON(bfq_bfqq_wait_request(bfqq));
 
-	if (!bfq_dispatch_request(bfqd, bfqq))
-		return 0;
-
-	bfq_log_bfqq(bfqd, bfqq, "dispatched %s request",
-			bfq_bfqq_sync(bfqq) ? "sync" : "async");
+	rq = bfq_dispatch_rq_from_bfqq(bfqd, bfqq);
 
 	BUG_ON(bfqq->next_rq == NULL &&
 	       bfqq->entity.budget < bfqq->entity.service);
-	return 1;
+exit:
+	if (rq) {
+		rq->rq_flags |= RQF_STARTED;
+		bfqd->rq_in_driver++;
+		if (bfqq)
+			bfq_log_bfqq(bfqd, bfqq,
+				"dispatched %s request %p, rq_in_driver %d",
+				     bfq_bfqq_sync(bfqq) ? "sync" : "async",
+				     rq,
+				     bfqd->rq_in_driver);
+		else
+			bfq_log(bfqd,
+		"dispatched request %p from dispatch list, rq_in_driver %d",
+				rq, bfqd->rq_in_driver);
+	} else
+		bfq_log(bfqd,
+		"returned NULL request, rq_in_driver %d",
+			bfqd->rq_in_driver);
+
+	return rq;
+}
+
+
+static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
+{
+	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
+	struct request *rq;
+
+	spin_lock_irq(&bfqd->lock);
+	rq = __bfq_dispatch_request(hctx);
+	spin_unlock_irq(&bfqd->lock);
+
+	return rq;
 }
 
 /*
@@ -3843,13 +3913,15 @@ static void bfq_put_queue(struct bfq_queue *bfqq)
 
 	BUG_ON(bfqq->ref <= 0);
 
-	bfq_log_bfqq(bfqq->bfqd, bfqq, "put_queue: %p %d", bfqq, bfqq->ref);
+	if (bfqq->bfqd)
+		bfq_log_bfqq(bfqq->bfqd, bfqq, "put_queue: %p %d", bfqq, bfqq->ref);
+
 	bfqq->ref--;
 	if (bfqq->ref)
 		return;
 
 	BUG_ON(rb_first(&bfqq->sort_list));
-	BUG_ON(bfqq->allocated[READ] + bfqq->allocated[WRITE] != 0);
+	BUG_ON(bfqq->allocated != 0);
 	BUG_ON(bfqq->entity.tree);
 	BUG_ON(bfq_bfqq_busy(bfqq));
 
@@ -3864,7 +3936,8 @@ static void bfq_put_queue(struct bfq_queue *bfqq)
 		 */
 		hlist_del_init(&bfqq->burst_list_node);
 
-	bfq_log_bfqq(bfqq->bfqd, bfqq, "put_queue: %p freed", bfqq);
+	if (bfqq->bfqd)
+		bfq_log_bfqq(bfqq->bfqd, bfqq, "put_queue: %p freed", bfqq);
 
 	kmem_cache_free(bfq_pool, bfqq);
 #ifdef BFQ_GROUP_IOSCHED_ENABLED
@@ -3905,29 +3978,53 @@ static void bfq_exit_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 	bfq_put_queue(bfqq);
 }
 
-static void bfq_exit_icq(struct io_cq *icq)
+static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
 {
-	struct bfq_io_cq *bic = icq_to_bic(icq);
-	struct bfq_data *bfqd = bic_to_bfqd(bic);
+	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync);
+	struct bfq_data *bfqd;
 
-	if (bic_to_bfqq(bic, false)) {
-		bfq_exit_bfqq(bfqd, bic_to_bfqq(bic, false));
-		bic_set_bfqq(bic, NULL, false);
-	}
+	if (bfqq)
+		bfqd = bfqq->bfqd; /* NULL if scheduler already exited */
 
-	if (bic_to_bfqq(bic, true)) {
+	if (bfqq && bfqd) {
+		spin_lock_irq(&bfqd->lock);
 		/*
 		 * If the bic is using a shared queue, put the reference
 		 * taken on the io_context when the bic started using a
 		 * shared bfq_queue.
 		 */
-		if (bfq_bfqq_coop(bic_to_bfqq(bic, true)))
-			put_io_context(icq->ioc);
-		bfq_exit_bfqq(bfqd, bic_to_bfqq(bic, true));
-		bic_set_bfqq(bic, NULL, true);
+		if (is_sync && bfq_bfqq_coop(bfqq))
+			put_io_context(bic->icq.ioc);
+		bfq_exit_bfqq(bfqd, bfqq);
+		bic_set_bfqq(bic, NULL, is_sync);
+		spin_unlock_irq(&bfqd->lock);
 	}
 }
 
+static void bfq_exit_icq_body(struct work_struct *work)
+{
+	struct bfq_io_cq *bic =
+		container_of(work, struct bfq_io_cq, exit_icq_work);
+
+	bfq_exit_icq_bfqq(bic, true);
+	bfq_exit_icq_bfqq(bic, false);
+}
+
+static void bfq_init_icq(struct io_cq *icq)
+{
+	struct bfq_io_cq *bic = icq_to_bic(icq);
+
+	INIT_WORK(&bic->exit_icq_work, bfq_exit_icq_body);
+}
+
+static void bfq_exit_icq(struct io_cq *icq)
+{
+	struct bfq_io_cq *bic = icq_to_bic(icq);
+
+	BUG_ON(!bic);
+	kblockd_schedule_work(&bic->exit_icq_work);
+}
+
 /*
  * Update the entity prio values; note that the new values will not
  * be used until the next (re)activation.
@@ -3937,6 +4034,11 @@ static void bfq_set_next_ioprio_data(struct bfq_queue *bfqq,
 {
 	struct task_struct *tsk = current;
 	int ioprio_class;
+	struct bfq_data *bfqd = bfqq->bfqd;
+
+	WARN_ON(!bfqd);
+	if (!bfqd)
+		return;
 
 	ioprio_class = IOPRIO_PRIO_CLASS(bic->ioprio);
 	switch (ioprio_class) {
@@ -4017,6 +4119,8 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	INIT_HLIST_NODE(&bfqq->burst_list_node);
 	BUG_ON(!hlist_unhashed(&bfqq->burst_list_node));
 
+	spin_lock_init(&bfqq->lock);
+
 	bfqq->ref = 0;
 	bfqq->bfqd = bfqd;
 
@@ -4273,21 +4377,17 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		if (budget_timeout)
 			bfq_bfqq_expire(bfqd, bfqq, false,
 					BFQ_BFQQ_BUDGET_TIMEOUT);
-
-		/*
-		 * Let the request rip immediately, or let a new queue be
-		 * selected if bfqq has just been expired.
-		 */
-		__blk_run_queue(bfqd->queue);
 	}
 }
 
-static void bfq_insert_request(struct request_queue *q, struct request *rq)
+
+static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
 {
-	struct bfq_data *bfqd = q->elevator->elevator_data;
 	struct bfq_queue *bfqq = RQ_BFQQ(rq), *new_bfqq;
 
-	assert_spin_locked(bfqd->queue->queue_lock);
+	assert_spin_locked(&bfqd->lock);
+
+	bfq_log_bfqq(bfqd, bfqq, "__insert_req: rq %p bfqq %p", rq, bfqq);
 
 	/*
 	 * An unplug may trigger a requeue of a request from the device
@@ -4303,8 +4403,14 @@ static void bfq_insert_request(struct request_queue *q, struct request *rq)
 			 * Release the request's reference to the old bfqq
 			 * and make sure one is taken to the shared queue.
 			 */
-			new_bfqq->allocated[rq_data_dir(rq)]++;
-			bfqq->allocated[rq_data_dir(rq)]--;
+			new_bfqq->allocated++;
+			bfqq->allocated--;
+			bfq_log_bfqq(bfqd, bfqq,
+		     "insert_request: new allocated %d", bfqq->allocated);
+			bfq_log_bfqq(bfqd, new_bfqq,
+		     "insert_request: new_bfqq new allocated %d",
+				     bfqq->allocated);
+
 			new_bfqq->ref++;
 			bfq_clear_bfqq_just_created(bfqq);
 			bfq_put_queue(bfqq);
@@ -4324,6 +4430,55 @@ static void bfq_insert_request(struct request_queue *q, struct request *rq)
 	bfq_rq_enqueued(bfqd, bfqq, rq);
 }
 
+static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
+			      bool at_head)
+{
+	struct request_queue *q = hctx->queue;
+	struct bfq_data *bfqd = q->elevator->elevator_data;
+
+	spin_lock_irq(&bfqd->lock);
+	if (blk_mq_sched_try_insert_merge(q, rq))
+		goto done;
+	spin_unlock_irq(&bfqd->lock);
+
+	blk_mq_sched_request_inserted(rq);
+
+	spin_lock_irq(&bfqd->lock);
+	if (at_head || blk_rq_is_passthrough(rq)) {
+		struct bfq_queue *bfqq = RQ_BFQQ(rq);
+
+		if (at_head)
+			list_add(&rq->queuelist, &bfqd->dispatch);
+		else
+			list_add_tail(&rq->queuelist, &bfqd->dispatch);
+
+		if (bfqq)
+			bfqq->dispatched++;
+	} else {
+		__bfq_insert_request(bfqd, rq);
+
+		if (rq_mergeable(rq)) {
+			elv_rqhash_add(q, rq);
+			if (!q->last_merge)
+				q->last_merge = rq;
+		}
+	}
+done:
+	spin_unlock_irq(&bfqd->lock);
+}
+
+static void bfq_insert_requests(struct blk_mq_hw_ctx *hctx,
+				struct list_head *list, bool at_head)
+{
+	while (!list_empty(list)) {
+		struct request *rq;
+
+		rq = list_first_entry(list, struct request, queuelist);
+		list_del_init(&rq->queuelist);
+		bfq_insert_request(hctx, rq, at_head);
+	}
+}
+
 static void bfq_update_hw_tag(struct bfq_data *bfqd)
 {
 	bfqd->max_rq_in_driver = max_t(int, bfqd->max_rq_in_driver,
@@ -4349,27 +4504,21 @@ static void bfq_update_hw_tag(struct bfq_data *bfqd)
 	bfqd->hw_tag_samples = 0;
 }
 
-static void bfq_completed_request(struct request_queue *q, struct request *rq)
+static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
 {
-	struct bfq_queue *bfqq = RQ_BFQQ(rq);
-	struct bfq_data *bfqd = bfqq->bfqd;
 	u64 now_ns;
 	u32 delta_us;
 
-	bfq_log_bfqq(bfqd, bfqq, "completed one req with %u sects left",
-		     blk_rq_sectors(rq));
-
-	assert_spin_locked(bfqd->queue->queue_lock);
 	bfq_update_hw_tag(bfqd);
 
 	BUG_ON(!bfqd->rq_in_driver);
 	BUG_ON(!bfqq->dispatched);
 	bfqd->rq_in_driver--;
 	bfqq->dispatched--;
-	bfqg_stats_update_completion(bfqq_group(bfqq),
-				     rq_start_time_ns(rq),
-				     rq_io_start_time_ns(rq),
-				     rq->cmd_flags);
+
+	bfq_log_bfqq(bfqd, bfqq,
+		     "completed_requests: new disp %d, new rq_in_driver %d",
+		     bfqq->dispatched, bfqd->rq_in_driver);
 
 	if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
 		BUG_ON(!RB_EMPTY_ROOT(&bfqq->sort_list));
@@ -4395,7 +4544,8 @@ static void bfq_completed_request(struct request_queue *q, struct request *rq)
 	 */
 	delta_us = div_u64(now_ns - bfqd->last_completion, NSEC_PER_USEC);
 
-	bfq_log(bfqd, "rq_completed: delta %uus/%luus max_size %u rate %llu/%llu",
+	bfq_log_bfqq(bfqd, bfqq,
+		"rq_completed: delta %uus/%luus max_size %u rate %llu/%llu",
 		delta_us, BFQ_MIN_TT/NSEC_PER_USEC, bfqd->last_rq_max_size,
 		(USEC_PER_SEC*
 		(u64)((bfqd->last_rq_max_size<<BFQ_RATE_SHIFT)/delta_us))
@@ -4445,7 +4595,7 @@ static void bfq_completed_request(struct request_queue *q, struct request *rq)
 	if (bfqd->in_service_queue == bfqq) {
 		if (bfqq->dispatched == 0 && bfq_bfqq_must_idle(bfqq)) {
 			bfq_arm_slice_timer(bfqd);
-			goto out;
+			return;
 		} else if (bfq_may_expire_for_budg_timeout(bfqq))
 			bfq_bfqq_expire(bfqd, bfqq, false,
 					BFQ_BFQQ_BUDGET_TIMEOUT);
@@ -4455,68 +4605,81 @@ static void bfq_completed_request(struct request_queue *q, struct request *rq)
 			bfq_bfqq_expire(bfqd, bfqq, false,
 					BFQ_BFQQ_NO_MORE_REQUESTS);
 	}
-
-	if (!bfqd->rq_in_driver)
-		bfq_schedule_dispatch(bfqd);
-
-out:
-	return;
 }
 
-static int __bfq_may_queue(struct bfq_queue *bfqq)
+static void bfq_put_rq_priv_body(struct bfq_queue *bfqq)
 {
-	if (bfq_bfqq_wait_request(bfqq) && bfq_bfqq_must_alloc(bfqq)) {
-		bfq_clear_bfqq_must_alloc(bfqq);
-		return ELV_MQUEUE_MUST;
-	}
+	bfq_log_bfqq(bfqq->bfqd, bfqq,
+		     "put_request_body: allocated %d", bfqq->allocated);
+	BUG_ON(!bfqq->allocated);
+	bfqq->allocated--;
 
-	return ELV_MQUEUE_MAY;
+	bfq_put_queue(bfqq);
 }
 
-static int bfq_may_queue(struct request_queue *q, unsigned int op)
+static void bfq_put_rq_private(struct request_queue *q, struct request *rq)
 {
-	struct bfq_data *bfqd = q->elevator->elevator_data;
-	struct task_struct *tsk = current;
-	struct bfq_io_cq *bic;
 	struct bfq_queue *bfqq;
+	struct bfq_data *bfqd;
+	struct bfq_io_cq *bic;
 
-	/*
-	 * Don't force setup of a queue from here, as a call to may_queue
-	 * does not necessarily imply that a request actually will be
-	 * queued. So just lookup a possibly existing queue, or return
-	 * 'may queue' if that fails.
-	 */
-	bic = bfq_bic_lookup(bfqd, tsk->io_context);
-	if (!bic)
-		return ELV_MQUEUE_MAY;
+	BUG_ON(!rq);
+	bfqq = RQ_BFQQ(rq);
+	BUG_ON(!bfqq);
 
-	bfqq = bic_to_bfqq(bic, op_is_sync(op));
-	if (bfqq)
-		return __bfq_may_queue(bfqq);
+	bic = RQ_BIC(rq);
+	BUG_ON(!bic);
 
-	return ELV_MQUEUE_MAY;
-}
+	bfqd = bfqq->bfqd;
+	BUG_ON(!bfqd);
 
-/*
- * Queue lock held here.
- */
-static void bfq_put_request(struct request *rq)
-{
-	struct bfq_queue *bfqq = RQ_BFQQ(rq);
+	BUG_ON(rq->rq_flags & RQF_QUEUED);
+	BUG_ON(!(rq->rq_flags & RQF_ELVPRIV));
 
-	if (bfqq) {
-		const int rw = rq_data_dir(rq);
+	bfq_log_bfqq(bfqd, bfqq,
+		     "putting rq %p with %u sects left, STARTED %d",
+		     rq, blk_rq_sectors(rq),
+		     rq->rq_flags & RQF_STARTED);
 
-		BUG_ON(!bfqq->allocated[rw]);
-		bfqq->allocated[rw]--;
+	if (rq->rq_flags & RQF_STARTED)
+		bfqg_stats_update_completion(bfqq_group(bfqq),
+					     rq_start_time_ns(rq),
+					     rq_io_start_time_ns(rq),
+					     rq->cmd_flags);
 
-		rq->elv.priv[0] = NULL;
-		rq->elv.priv[1] = NULL;
+	BUG_ON(blk_rq_sectors(rq) == 0 && !(rq->rq_flags & RQF_STARTED));
 
-		bfq_log_bfqq(bfqq->bfqd, bfqq, "put_request %p, %d",
-			     bfqq, bfqq->ref);
-		bfq_put_queue(bfqq);
+	if (likely(rq->rq_flags & RQF_STARTED)) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&bfqd->lock, flags);
+
+		bfq_completed_request(bfqq, bfqd);
+		bfq_put_rq_priv_body(bfqq);
+
+		spin_unlock_irqrestore(&bfqd->lock, flags);
+	} else {
+		/*
+		 * Request rq may be still/already in the scheduler,
+		 * in which case we need to remove it. And we cannot
+		 * defer such a check and removal, to avoid
+		 * inconsistencies in the time interval from the end
+		 * of this function to the start of the deferred work.
+		 * Fortunately, this situation occurs only in process
+		 * context, so taking the scheduler lock does not
+		 * cause any deadlock, even if other locks are already
+		 * (correctly) held by this process.
+		 */
+		BUG_ON(in_interrupt());
+
+		assert_spin_locked(&bfqd->lock);
+		if (!RB_EMPTY_NODE(&rq->rb_node))
+			bfq_remove_request(q, rq);
+		bfq_put_rq_priv_body(bfqq);
 	}
+
+	rq->elv.priv[0] = NULL;
+	rq->elv.priv[1] = NULL;
 }
 
 /*
@@ -4548,18 +4711,17 @@ bfq_split_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq)
 /*
  * Allocate bfq data structures associated with this request.
  */
-static int bfq_set_request(struct request_queue *q, struct request *rq,
-			   struct bio *bio, gfp_t gfp_mask)
+static int bfq_get_rq_private(struct request_queue *q, struct request *rq,
+			      struct bio *bio)
 {
 	struct bfq_data *bfqd = q->elevator->elevator_data;
 	struct bfq_io_cq *bic = icq_to_bic(rq->elv.icq);
-	const int rw = rq_data_dir(rq);
 	const int is_sync = rq_is_sync(rq);
 	struct bfq_queue *bfqq;
-	unsigned long flags;
 	bool split = false;
 
-	spin_lock_irqsave(q->queue_lock, flags);
+	spin_lock_irq(&bfqd->lock);
+
 	bfq_check_ioprio_change(bic, bio);
 
 	if (!bic)
@@ -4578,7 +4740,7 @@ static int bfq_set_request(struct request_queue *q, struct request *rq,
 		bic_set_bfqq(bic, bfqq, is_sync);
 		if (split && is_sync) {
 			bfq_log_bfqq(bfqd, bfqq,
-				     "set_request: was_in_list %d "
+				     "get_request: was_in_list %d "
 				     "was_in_large_burst %d "
 				     "large burst in progress %d",
 				     bic->was_in_burst_list,
@@ -4588,12 +4750,12 @@ static int bfq_set_request(struct request_queue *q, struct request *rq,
 			if ((bic->was_in_burst_list && bfqd->large_burst) ||
 			    bic->saved_in_large_burst) {
 				bfq_log_bfqq(bfqd, bfqq,
-					     "set_request: marking in "
+					     "get_request: marking in "
 					     "large burst");
 				bfq_mark_bfqq_in_large_burst(bfqq);
 			} else {
 				bfq_log_bfqq(bfqd, bfqq,
-					     "set_request: clearing in "
+					     "get_request: clearing in "
 					     "large burst");
 				bfq_clear_bfqq_in_large_burst(bfqq);
 				if (bic->was_in_burst_list)
@@ -4618,9 +4780,12 @@ static int bfq_set_request(struct request_queue *q, struct request *rq,
 		}
 	}
 
-	bfqq->allocated[rw]++;
+	bfqq->allocated++;
+	bfq_log_bfqq(bfqq->bfqd, bfqq,
+		     "get_request: new allocated %d", bfqq->allocated);
+
 	bfqq->ref++;
-	bfq_log_bfqq(bfqd, bfqq, "set_request: bfqq %p, %d", bfqq, bfqq->ref);
+	bfq_log_bfqq(bfqd, bfqq, "get_request: bfqq %p, %d", bfqq, bfqq->ref);
 
 	rq->elv.priv[0] = bic;
 	rq->elv.priv[1] = bfqq;
@@ -4647,26 +4812,55 @@ static int bfq_set_request(struct request_queue *q, struct request *rq,
 	if (unlikely(bfq_bfqq_just_created(bfqq)))
 		bfq_handle_burst(bfqd, bfqq);
 
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	spin_unlock_irq(&bfqd->lock);
 
 	return 0;
 
 queue_fail:
-	bfq_schedule_dispatch(bfqd);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	spin_unlock_irq(&bfqd->lock);
 
 	return 1;
 }
 
-static void bfq_kick_queue(struct work_struct *work)
+static void bfq_idle_slice_timer_body(struct bfq_queue *bfqq)
 {
-	struct bfq_data *bfqd =
-		container_of(work, struct bfq_data, unplug_work);
-	struct request_queue *q = bfqd->queue;
+	struct bfq_data *bfqd = bfqq->bfqd;
+	enum bfqq_expiration reason;
+	unsigned long flags;
+
+	BUG_ON(!bfqd);
+	spin_lock_irqsave(&bfqd->lock, flags);
+	bfq_log_bfqq(bfqd, bfqq, "handling slice_timer expiration");
+	bfq_clear_bfqq_wait_request(bfqq);
 
-	spin_lock_irq(q->queue_lock);
-	__blk_run_queue(q);
-	spin_unlock_irq(q->queue_lock);
+	if (bfqq != bfqd->in_service_queue) {
+		spin_unlock_irqrestore(&bfqd->lock, flags);
+		return;
+	}
+
+	if (bfq_bfqq_budget_timeout(bfqq))
+		/*
+		 * Also here the queue can be safely expired
+		 * for budget timeout without wasting
+		 * guarantees
+		 */
+		reason = BFQ_BFQQ_BUDGET_TIMEOUT;
+	else if (bfqq->queued[0] == 0 && bfqq->queued[1] == 0)
+		/*
+		 * The queue may not be empty upon timer expiration,
+		 * because we may not disable the timer when the
+		 * first request of the in-service queue arrives
+		 * during disk idling.
+		 */
+		reason = BFQ_BFQQ_TOO_IDLE;
+	else
+		goto schedule_dispatch;
+
+	bfq_bfqq_expire(bfqd, bfqq, true, reason);
+
+schedule_dispatch:
+	spin_unlock_irqrestore(&bfqd->lock, flags);
+	bfq_schedule_dispatch(bfqd);
 }
 
 /*
@@ -4677,59 +4871,24 @@ static enum hrtimer_restart bfq_idle_slice_timer(struct hrtimer *timer)
 {
 	struct bfq_data *bfqd = container_of(timer, struct bfq_data,
 					     idle_slice_timer);
-	struct bfq_queue *bfqq;
-	unsigned long flags;
-	enum bfqq_expiration reason;
+	struct bfq_queue *bfqq = bfqd->in_service_queue;
 
-	spin_lock_irqsave(bfqd->queue->queue_lock, flags);
+	bfq_log(bfqd, "slice_timer expired");
 
-	bfqq = bfqd->in_service_queue;
 	/*
 	 * Theoretical race here: the in-service queue can be NULL or
-	 * different from the queue that was idling if the timer handler
-	 * spins on the queue_lock and a new request arrives for the
-	 * current queue and there is a full dispatch cycle that changes
-	 * the in-service queue.  This can hardly happen, but in the worst
-	 * case we just expire a queue too early.
+	 * different from the queue that was idling if a new request
+	 * arrives for the current queue and there is a full dispatch
+	 * cycle that changes the in-service queue.  This can hardly
+	 * happen, but in the worst case we just expire a queue too
+	 * early.
 	 */
-	if (bfqq) {
-		bfq_log_bfqq(bfqd, bfqq, "slice_timer expired");
-		bfq_clear_bfqq_wait_request(bfqq);
-
-		if (bfq_bfqq_budget_timeout(bfqq))
-			/*
-			 * Also here the queue can be safely expired
-			 * for budget timeout without wasting
-			 * guarantees
-			 */
-			reason = BFQ_BFQQ_BUDGET_TIMEOUT;
-		else if (bfqq->queued[0] == 0 && bfqq->queued[1] == 0)
-			/*
-			 * The queue may not be empty upon timer expiration,
-			 * because we may not disable the timer when the
-			 * first request of the in-service queue arrives
-			 * during disk idling.
-			 */
-			reason = BFQ_BFQQ_TOO_IDLE;
-		else
-			goto schedule_dispatch;
-
-		bfq_bfqq_expire(bfqd, bfqq, true, reason);
-	}
-
-schedule_dispatch:
-	bfq_schedule_dispatch(bfqd);
+	if (bfqq)
+		bfq_idle_slice_timer_body(bfqq);
 
-	spin_unlock_irqrestore(bfqd->queue->queue_lock, flags);
 	return HRTIMER_NORESTART;
 }
 
-static void bfq_shutdown_timer_wq(struct bfq_data *bfqd)
-{
-	hrtimer_cancel(&bfqd->idle_slice_timer);
-	cancel_work_sync(&bfqd->unplug_work);
-}
-
 static void __bfq_put_async_bfqq(struct bfq_data *bfqd,
 					struct bfq_queue **bfqq_ptr)
 {
@@ -4766,30 +4925,44 @@ static void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg)
 static void bfq_exit_queue(struct elevator_queue *e)
 {
 	struct bfq_data *bfqd = e->elevator_data;
-	struct request_queue *q = bfqd->queue;
 	struct bfq_queue *bfqq, *n;
 
-	bfq_shutdown_timer_wq(bfqd);
+	bfq_log(bfqd, "exit_queue: starting ...");
 
-	spin_lock_irq(q->queue_lock);
+	hrtimer_cancel(&bfqd->idle_slice_timer);
 
 	BUG_ON(bfqd->in_service_queue);
-	list_for_each_entry_safe(bfqq, n, &bfqd->idle_list, bfqq_list)
-		bfq_deactivate_bfqq(bfqd, bfqq, false, false);
+	BUG_ON(!list_empty(&bfqd->active_list));
 
-	spin_unlock_irq(q->queue_lock);
+	list_for_each_entry_safe(bfqq, n, &bfqd->idle_list, bfqq_list) {
+		if (bfqq->bic) /* bfqqs without bic are handled below */
+			cancel_work_sync(&bfqq->bic->exit_icq_work);
+	}
 
-	bfq_shutdown_timer_wq(bfqd);
+	spin_lock_irq(&bfqd->lock);
+	list_for_each_entry_safe(bfqq, n, &bfqd->idle_list, bfqq_list) {
+		bfq_deactivate_bfqq(bfqd, bfqq, false, false);
+		/*
+		 * Make sure that deferred exit_icq_work completes
+		 * without errors for bfq_queues without bic
+		 */
+		if (!bfqq->bic)
+			bfqq->bfqd = NULL;
+	}
+	spin_unlock_irq(&bfqd->lock);
+
+	hrtimer_cancel(&bfqd->idle_slice_timer);
 
 	BUG_ON(hrtimer_active(&bfqd->idle_slice_timer));
 
 #ifdef BFQ_GROUP_IOSCHED_ENABLED
-	blkcg_deactivate_policy(q, &blkcg_policy_bfq);
+	blkcg_deactivate_policy(bfqd->queue, &blkcg_policy_bfq);
 #else
 	bfq_put_async_queues(bfqd, bfqd->root_group);
 	kfree(bfqd->root_group);
 #endif
 
+	bfq_log(bfqd, "exit_queue: finished ...");
 	kfree(bfqd);
 }
 
@@ -4848,10 +5021,6 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 
 	bfqd->queue = q;
 
-	spin_lock_irq(q->queue_lock);
-	q->elevator = eq;
-	spin_unlock_irq(q->queue_lock);
-
 	bfqd->root_group = bfq_create_group_hierarchy(bfqd, q->node);
 	if (!bfqd->root_group)
 		goto out_free;
@@ -4865,8 +5034,6 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	bfqd->queue_weights_tree = RB_ROOT;
 	bfqd->group_weights_tree = RB_ROOT;
 
-	INIT_WORK(&bfqd->unplug_work, bfq_kick_queue);
-
 	INIT_LIST_HEAD(&bfqd->active_list);
 	INIT_LIST_HEAD(&bfqd->idle_list);
 	INIT_HLIST_HEAD(&bfqd->burst_list);
@@ -4915,6 +5082,11 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	bfqd->peak_rate = R_fast[blk_queue_nonrot(bfqd->queue)] * 2 / 3;
 	bfqd->device_speed = BFQ_BFQD_FAST;
 
+	spin_lock_init(&bfqd->lock);
+	INIT_LIST_HEAD(&bfqd->dispatch);
+
+	q->elevator = eq;
+
 	return 0;
 
 out_free:
@@ -4971,7 +5143,7 @@ static ssize_t bfq_weights_show(struct elevator_queue *e, char *page)
 	num_char += sprintf(page + num_char, "Tot reqs queued %d\n\n",
 			    bfqd->queued);
 
-	spin_lock_irq(bfqd->queue->queue_lock);
+	spin_lock_irq(&bfqd->lock);
 
 	num_char += sprintf(page + num_char, "Active:\n");
 	list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list) {
@@ -5000,7 +5172,7 @@ static ssize_t bfq_weights_show(struct elevator_queue *e, char *page)
 				    jiffies_to_msecs(bfqq->wr_cur_max_time));
 	}
 
-	spin_unlock_irq(bfqd->queue->queue_lock);
+	spin_unlock_irq(&bfqd->lock);
 
 	return num_char;
 }
@@ -5208,35 +5380,31 @@ static struct elv_fs_entry bfq_attrs[] = {
 	__ATTR_NULL
 };
 
-static struct elevator_type iosched_bfq = {
-	.ops.sq = {
-		.elevator_merge_fn =		bfq_merge,
-		.elevator_merged_fn =		bfq_merged_request,
-		.elevator_merge_req_fn =	bfq_merged_requests,
-#ifdef BFQ_GROUP_IOSCHED_ENABLED
-		.elevator_bio_merged_fn =	bfq_bio_merged,
-#endif
-		.elevator_allow_bio_merge_fn =	bfq_allow_bio_merge,
-		.elevator_allow_rq_merge_fn =	bfq_allow_rq_merge,
-		.elevator_dispatch_fn =		bfq_dispatch_requests,
-		.elevator_add_req_fn =		bfq_insert_request,
-		.elevator_activate_req_fn =	bfq_activate_request,
-		.elevator_deactivate_req_fn =	bfq_deactivate_request,
-		.elevator_completed_req_fn =	bfq_completed_request,
-		.elevator_former_req_fn =	elv_rb_former_request,
-		.elevator_latter_req_fn =	elv_rb_latter_request,
-		.elevator_init_icq_fn =		bfq_init_icq,
-		.elevator_exit_icq_fn =		bfq_exit_icq,
-		.elevator_set_req_fn =		bfq_set_request,
-		.elevator_put_req_fn =		bfq_put_request,
-		.elevator_may_queue_fn =	bfq_may_queue,
-		.elevator_init_fn =		bfq_init_queue,
-		.elevator_exit_fn =		bfq_exit_queue,
+static struct elevator_type iosched_bfq_mq = {
+	.ops.mq = {
+		.get_rq_priv		= bfq_get_rq_private,
+		.put_rq_priv		= bfq_put_rq_private,
+		.init_icq		= bfq_init_icq,
+		.exit_icq		= bfq_exit_icq,
+		.insert_requests	= bfq_insert_requests,
+		.dispatch_request	= bfq_dispatch_request,
+		.next_request		= elv_rb_latter_request,
+		.former_request		= elv_rb_former_request,
+		.allow_merge		= bfq_allow_bio_merge,
+		.bio_merge		= bfq_bio_merge,
+		.request_merge		= bfq_request_merge,
+		.requests_merged	= bfq_requests_merged,
+		.request_merged		= bfq_request_merged,
+		.has_work		= bfq_has_work,
+		.init_sched		= bfq_init_queue,
+		.exit_sched		= bfq_exit_queue,
 	},
+
+	.uses_mq = 		true,
 	.icq_size =		sizeof(struct bfq_io_cq),
 	.icq_align =		__alignof__(struct bfq_io_cq),
 	.elevator_attrs =	bfq_attrs,
-	.elevator_name =	"bfq",
+	.elevator_name =	"bfq-mq",
 	.elevator_owner =	THIS_MODULE,
 };
 
@@ -5261,7 +5429,7 @@ static struct blkcg_policy blkcg_policy_bfq = {
 static int __init bfq_init(void)
 {
 	int ret;
-	char msg[60] = "BFQ I/O-scheduler: v8r8-rc2";
+	char msg[60] = "BFQ-MQ I/O-scheduler: v8r8-rc2";
 
 #ifdef BFQ_GROUP_IOSCHED_ENABLED
 	ret = blkcg_policy_register(&blkcg_policy_bfq);
@@ -5306,7 +5474,7 @@ static int __init bfq_init(void)
 	device_speed_thresh[0] = (4 * R_slow[0]) / 3;
 	device_speed_thresh[1] = (4 * R_slow[1]) / 3;
 
-	ret = elv_register(&iosched_bfq);
+	ret = elv_register(&iosched_bfq_mq);
 	if (ret)
 		goto err_pol_unreg;
 
@@ -5326,8 +5494,8 @@ static int __init bfq_init(void)
 
 static void __exit bfq_exit(void)
 {
-	elv_unregister(&iosched_bfq);
-#ifdef BFQ_GROUP_IOSCHED_ENABLED
+	elv_unregister(&iosched_bfq_mq);
+#ifdef CONFIG_BFQ_GROUP_ENABLED
 	blkcg_policy_unregister(&blkcg_policy_bfq);
 #endif
 	bfq_slab_kill();
@@ -5336,5 +5504,6 @@ static void __exit bfq_exit(void)
 module_init(bfq_init);
 module_exit(bfq_exit);
 
-MODULE_AUTHOR("Arianna Avanzini, Fabio Checconi, Paolo Valente");
+MODULE_AUTHOR("Paolo Valente");
 MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("MQ Budget Fair IO scheduler");
diff --git a/block/bfq-mq.h b/block/bfq-mq.h
index c6acee2..6e1c0d8 100644
--- a/block/bfq-mq.h
+++ b/block/bfq-mq.h
@@ -1,5 +1,5 @@
 /*
- * BFQ v8r8-rc2 for 4.10.0: data structures and common functions prototypes.
+ * BFQ-MQ v8r8-rc2 for 4.10.0: data structures and common functions prototypes.
  *
  * Based on ideas and code from CFQ:
  * Copyright (C) 2003 Jens Axboe <axboe@kernel.dk>
@@ -21,15 +21,8 @@
 #include <linux/rbtree.h>
 #include <linux/blk-cgroup.h>
 
-/*
- * Define an alternative macro to compile cgroups support. This is one
- * of the steps needed to let bfq-mq share the files bfq-sched.c and
- * bfq-cgroup.c with bfq. For bfq-mq, the macro
- * BFQ_GROUP_IOSCHED_ENABLED will be defined as a function of whether
- * the configuration option CONFIG_BFQ_MQ_GROUP_IOSCHED, and not
- * CONFIG_BFQ_GROUP_IOSCHED, is defined.
- */
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
+/* see comments on CONFIG_BFQ_GROUP_IOSCHED in bfq.h */
+#ifdef CONFIG_BFQ_MQ_GROUP_IOSCHED
 #define BFQ_GROUP_IOSCHED_ENABLED
 #endif
 
@@ -248,8 +241,8 @@ struct bfq_queue {
 	struct request *next_rq;
 	/* number of sync and async requests queued */
 	int queued[2];
-	/* number of sync and async requests currently allocated */
-	int allocated[2];
+	/* number of requests currently allocated */
+	int allocated;
 	/* number of pending metadata requests */
 	int meta_pending;
 	/* fifo list of requests in sort_list */
@@ -334,6 +327,8 @@ struct bfq_queue {
 	unsigned long wr_start_at_switch_to_srt;
 
 	unsigned long split_time; /* time of last split */
+
+	spinlock_t lock;
 };
 
 /**
@@ -350,6 +345,9 @@ struct bfq_io_cq {
 	uint64_t blkcg_serial_nr; /* the current blkcg serial */
 #endif
 
+	/* delayed work to exec the body of the the exit_icq handler */
+	struct work_struct exit_icq_work;
+
 	/*
 	 * Snapshot of the idle window before merging; taken to
 	 * remember this value while the queue is merged, so as to be
@@ -391,11 +389,13 @@ enum bfq_device_speed {
 /**
  * struct bfq_data - per-device data structure.
  *
- * All the fields are protected by the @queue lock.
+ * All the fields are protected by @lock.
  */
 struct bfq_data {
-	/* request queue for the device */
+	/* device request queue */
 	struct request_queue *queue;
+	/* dispatch queue */
+	struct list_head dispatch;
 
 	/* root bfq_group for the device */
 	struct bfq_group *root_group;
@@ -449,8 +449,6 @@ struct bfq_data {
 	 * the queue in service.
 	 */
 	struct hrtimer idle_slice_timer;
-	/* delayed work to restart dispatching on the request queue */
-	struct work_struct unplug_work;
 
 	/* bfq_queue in service */
 	struct bfq_queue *in_service_queue;
@@ -603,6 +601,8 @@ struct bfq_data {
 
 	/* fallback dummy bfqq for extreme OOM conditions */
 	struct bfq_queue oom_bfqq;
+
+	spinlock_t lock;
 };
 
 enum bfqq_state_flags {
@@ -613,7 +613,6 @@ enum bfqq_state_flags {
 					     * waiting for a request
 					     * without idling the device
 					     */
-	BFQ_BFQQ_FLAG_must_alloc,	/* must be allowed rq alloc */
 	BFQ_BFQQ_FLAG_fifo_expire,	/* FIFO checked in this slice */
 	BFQ_BFQQ_FLAG_idle_window,	/* slice idling enabled */
 	BFQ_BFQQ_FLAG_sync,		/* synchronous queue */
@@ -652,7 +651,6 @@ BFQ_BFQQ_FNS(just_created);
 BFQ_BFQQ_FNS(busy);
 BFQ_BFQQ_FNS(wait_request);
 BFQ_BFQQ_FNS(non_blocking_wait_rq);
-BFQ_BFQQ_FNS(must_alloc);
 BFQ_BFQQ_FNS(fifo_expire);
 BFQ_BFQQ_FNS(idle_window);
 BFQ_BFQQ_FNS(sync);
@@ -672,7 +670,6 @@ static struct blkcg_gq *bfqg_to_blkg(struct bfq_group *bfqg);
 #define bfq_log_bfqq(bfqd, bfqq, fmt, args...)	do {			\
 	char __pbuf[128];						\
 									\
-	assert_spin_locked((bfqd)->queue->queue_lock);			\
 	blkg_path(bfqg_to_blkg(bfqq_group(bfqq)), __pbuf, sizeof(__pbuf)); \
 	pr_crit("bfq%d%c %s " fmt "\n", 			\
 		(bfqq)->pid,						\
@@ -708,7 +705,6 @@ static struct blkcg_gq *bfqg_to_blkg(struct bfq_group *bfqg);
 #define bfq_log_bfqq(bfqd, bfqq, fmt, args...)	do {			\
 	char __pbuf[128];						\
 									\
-	assert_spin_locked((bfqd)->queue->queue_lock);			\
 	blkg_path(bfqg_to_blkg(bfqq_group(bfqq)), __pbuf, sizeof(__pbuf)); \
 	blk_add_trace_msg((bfqd)->queue, "bfq%d%c %s " fmt, \
 			  (bfqq)->pid,			  \
@@ -935,7 +931,6 @@ static struct bfq_group *bfq_bfqq_to_bfqg(struct bfq_queue *bfqq)
 
 static void bfq_check_ioprio_change(struct bfq_io_cq *bic, struct bio *bio);
 static void bfq_put_queue(struct bfq_queue *bfqq);
-static void bfq_dispatch_insert(struct request_queue *q, struct request *rq);
 static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
 				       struct bio *bio, bool is_sync,
 				       struct bfq_io_cq *bic);
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
@ 2017-02-10 16:08   ` Bart Van Assche
  2017-02-07 17:24 ` [WIP PATCHSET 2/4] Move thinktime from bic to bfqq Paolo Valente
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 16:08 UTC (permalink / raw)
  To: tj, paolo.valente, axboe
  Cc: ulf.hansson, linux-kernel, linux-block, broonie, linus.walleij

On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
> [1] https://github.com/Algodev-github/bfq-mq

Hello Paolo,

That branch includes two changes of the version suffix (EXTRAVERSION in Mak=
efile).
Please don't do that but set CONFIG_LOCALVERSION in .config to add a suffix=
 to
the kernel version string.

Thanks,

Bart.
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality N=
otice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or l=
egally privileged information of WDC and/or its affiliates, and are intende=
d solely for the use of the individual or entity to which they are addresse=
d. If you are not the intended recipient, any disclosure, copying, distribu=
tion or any action taken or omitted to be taken in reliance on it, is prohi=
bited. If you have received this e-mail in error, please notify the sender =
immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 16:08   ` Bart Van Assche
  0 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 16:08 UTC (permalink / raw)
  To: tj, paolo.valente, axboe
  Cc: ulf.hansson, linux-kernel, linux-block, broonie, linus.walleij

On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
> [1] https://github.com/Algodev-github/bfq-mq

Hello Paolo,

That branch includes two changes of the version suffix (EXTRAVERSION in Makefile).
Please don't do that but set CONFIG_LOCALVERSION in .config to add a suffix to
the kernel version string.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 1/4] blk-mq: pass bio to blk_mq_sched_get_rq_priv
  2017-02-07 17:24 ` [WIP PATCHSET 1/4] blk-mq: pass bio to blk_mq_sched_get_rq_priv Paolo Valente
@ 2017-02-10 16:10   ` Jens Axboe
  0 siblings, 0 replies; 37+ messages in thread
From: Jens Axboe @ 2017-02-10 16:10 UTC (permalink / raw)
  To: Paolo Valente, Tejun Heo
  Cc: linux-block, linux-kernel, ulf.hansson, linus.walleij, broonie

On 02/07/2017 10:24 AM, Paolo Valente wrote:
> bio is used in bfq-mq's get_rq_priv, to get the request group. We could
> pass directly the group here, but I thought that passing the bio was
> more general, giving the possibility to get other pieces of information
> if needed.

I applied this one, thanks Paolo.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
@ 2017-02-10 16:45   ` Bart Van Assche
  2017-02-07 17:24 ` [WIP PATCHSET 2/4] Move thinktime from bic to bfqq Paolo Valente
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 16:45 UTC (permalink / raw)
  To: tj, paolo.valente, axboe
  Cc: ulf.hansson, linux-kernel, linux-block, broonie, linus.walleij

On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
> 2) Enable people to test this first version bfq-mq.

Hello Paolo,

I installed this version of bfq-mq on a server that boots from a SATA
disk. That server boots fine with kernel v4.10-rc7 but not with this
tree. The first 30 seconds of the boot process seem to proceed normally
but after that time the messages on the console stop scrolling and
about another 30 seconds later the server reboots. I haven't found
anything useful in the system log. I configured the block layer as
follows:

$ grep '^C.*_MQ_' .config
CONFIG_BLK_MQ_PCI=3Dy
CONFIG_MQ_IOSCHED_BFQ=3Dy
CONFIG_MQ_IOSCHED_DEADLINE=3Dy
CONFIG_MQ_IOSCHED_NONE=3Dy
CONFIG_DEFAULT_MQ_BFQ_MQ=3Dy
CONFIG_DEFAULT_MQ_IOSCHED=3D"bfq-mq"
CONFIG_SCSI_MQ_DEFAULT=3Dy
CONFIG_DM_MQ_DEFAULT=3Dy

Bart.
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality N=
otice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or l=
egally privileged information of WDC and/or its affiliates, and are intende=
d solely for the use of the individual or entity to which they are addresse=
d. If you are not the intended recipient, any disclosure, copying, distribu=
tion or any action taken or omitted to be taken in reliance on it, is prohi=
bited. If you have received this e-mail in error, please notify the sender =
immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 16:45   ` Bart Van Assche
  0 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 16:45 UTC (permalink / raw)
  To: tj, paolo.valente, axboe
  Cc: ulf.hansson, linux-kernel, linux-block, broonie, linus.walleij

On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
> 2) Enable people to test this first version bfq-mq.

Hello Paolo,

I installed this version of bfq-mq on a server that boots from a SATA
disk. That server boots fine with kernel v4.10-rc7 but not with this
tree. The first 30 seconds of the boot process seem to proceed normally
but after that time the messages on the console stop scrolling and
about another 30 seconds later the server reboots. I haven't found
anything useful in the system log. I configured the block layer as
follows:

$ grep '^C.*_MQ_' .config
CONFIG_BLK_MQ_PCI=y
CONFIG_MQ_IOSCHED_BFQ=y
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_MQ_IOSCHED_NONE=y
CONFIG_DEFAULT_MQ_BFQ_MQ=y
CONFIG_DEFAULT_MQ_IOSCHED="bfq-mq"
CONFIG_SCSI_MQ_DEFAULT=y
CONFIG_DM_MQ_DEFAULT=y

Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-10 16:45   ` Bart Van Assche
@ 2017-02-10 16:49     ` Paolo Valente
  -1 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 16:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 17:45, Bart Van Assche =
<Bart.VanAssche@sandisk.com> ha scritto:
>=20
> On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
>> 2) Enable people to test this first version bfq-mq.
>=20
> Hello Paolo,
>=20
> I installed this version of bfq-mq on a server that boots from a SATA
> disk. That server boots fine with kernel v4.10-rc7 but not with this
> tree. The first 30 seconds of the boot process seem to proceed =
normally
> but after that time the messages on the console stop scrolling and
> about another 30 seconds later the server reboots. I haven't found
> anything useful in the system log. I configured the block layer as
> follows:
>=20
> $ grep '^C.*_MQ_' .config
> CONFIG_BLK_MQ_PCI=3Dy
> CONFIG_MQ_IOSCHED_BFQ=3Dy
> CONFIG_MQ_IOSCHED_DEADLINE=3Dy
> CONFIG_MQ_IOSCHED_NONE=3Dy
> CONFIG_DEFAULT_MQ_BFQ_MQ=3Dy
> CONFIG_DEFAULT_MQ_IOSCHED=3D"bfq-mq"
> CONFIG_SCSI_MQ_DEFAULT=3Dy
> CONFIG_DM_MQ_DEFAULT=3Dy
>=20

Could you reconfigure with none or mq-deadline as default, check
whether the system boots, and, it it does, switch manually to bfq-mq,
check what happens, and, in the likely case of a failure, try to get
the oops?

Thank you very much,
Paolo

> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail =
Confidentiality Notice & Disclaimer:
>=20
> This e-mail and any files transmitted with it may contain confidential =
or legally privileged information of WDC and/or its affiliates, and are =
intended solely for the use of the individual or entity to which they =
are addressed. If you are not the intended recipient, any disclosure, =
copying, distribution or any action taken or omitted to be taken in =
reliance on it, is prohibited. If you have received this e-mail in =
error, please notify the sender immediately and delete the e-mail in its =
entirety from your system.
>=20

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 16:49     ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 16:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 17:45, Bart Van Assche <Bart.VanAssche@sandisk.com> ha scritto:
> 
> On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
>> 2) Enable people to test this first version bfq-mq.
> 
> Hello Paolo,
> 
> I installed this version of bfq-mq on a server that boots from a SATA
> disk. That server boots fine with kernel v4.10-rc7 but not with this
> tree. The first 30 seconds of the boot process seem to proceed normally
> but after that time the messages on the console stop scrolling and
> about another 30 seconds later the server reboots. I haven't found
> anything useful in the system log. I configured the block layer as
> follows:
> 
> $ grep '^C.*_MQ_' .config
> CONFIG_BLK_MQ_PCI=y
> CONFIG_MQ_IOSCHED_BFQ=y
> CONFIG_MQ_IOSCHED_DEADLINE=y
> CONFIG_MQ_IOSCHED_NONE=y
> CONFIG_DEFAULT_MQ_BFQ_MQ=y
> CONFIG_DEFAULT_MQ_IOSCHED="bfq-mq"
> CONFIG_SCSI_MQ_DEFAULT=y
> CONFIG_DM_MQ_DEFAULT=y
> 

Could you reconfigure with none or mq-deadline as default, check
whether the system boots, and, it it does, switch manually to bfq-mq,
check what happens, and, in the likely case of a failure, try to get
the oops?

Thank you very much,
Paolo

> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:
> 
> This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-10 16:08   ` Bart Van Assche
@ 2017-02-10 16:49     ` Paolo Valente
  -1 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 16:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 17:08, Bart Van Assche =
<bart.vanassche@sandisk.com> ha scritto:
>=20
> On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
>> [1] https://github.com/Algodev-github/bfq-mq
>=20
> Hello Paolo,
>=20
> That branch includes two changes of the version suffix (EXTRAVERSION =
in Makefile).
> Please don't do that but set CONFIG_LOCALVERSION in .config to add a =
suffix to
> the kernel version string.
>=20

I know it, thanks. Unfortunately, many other irregular things you will =
probably find in that sort of private branch (as for that suffix, for =
some reason it was handy for me to have it tracked by git).

Thanks,
Paolo

> Thanks,
>=20
> Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 16:49     ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 16:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 17:08, Bart Van Assche <bart.vanassche@sandisk.com> ha scritto:
> 
> On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
>> [1] https://github.com/Algodev-github/bfq-mq
> 
> Hello Paolo,
> 
> That branch includes two changes of the version suffix (EXTRAVERSION in Makefile).
> Please don't do that but set CONFIG_LOCALVERSION in .config to add a suffix to
> the kernel version string.
> 

I know it, thanks. Unfortunately, many other irregular things you will probably find in that sort of private branch (as for that suffix, for some reason it was handy for me to have it tracked by git).

Thanks,
Paolo

> Thanks,
> 
> Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-10 16:49     ` Paolo Valente
@ 2017-02-10 18:13       ` Bart Van Assche
  -1 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 18:13 UTC (permalink / raw)
  To: Paolo Valente
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij

On 02/10/2017 08:49 AM, Paolo Valente wrote:
>> $ grep '^C.*_MQ_' .config
>> CONFIG_BLK_MQ_PCI=3Dy
>> CONFIG_MQ_IOSCHED_BFQ=3Dy
>> CONFIG_MQ_IOSCHED_DEADLINE=3Dy
>> CONFIG_MQ_IOSCHED_NONE=3Dy
>> CONFIG_DEFAULT_MQ_BFQ_MQ=3Dy
>> CONFIG_DEFAULT_MQ_IOSCHED=3D"bfq-mq"
>> CONFIG_SCSI_MQ_DEFAULT=3Dy
>> CONFIG_DM_MQ_DEFAULT=3Dy
>>
> =

> Could you reconfigure with none or mq-deadline as default, check
> whether the system boots, and, it it does, switch manually to bfq-mq,
> check what happens, and, in the likely case of a failure, try to get
> the oops?

Hello Paolo,

I just finished performing that test with the following kernel config:
$ grep '^C.*_MQ_' .config
CONFIG_BLK_MQ_PCI=3Dy
CONFIG_MQ_IOSCHED_BFQ=3Dy
CONFIG_MQ_IOSCHED_DEADLINE=3Dy
CONFIG_MQ_IOSCHED_NONE=3Dy
CONFIG_DEFAULT_MQ_DEADLINE=3Dy
CONFIG_DEFAULT_MQ_IOSCHED=3D"mq-deadline"
CONFIG_SCSI_MQ_DEFAULT=3Dy
CONFIG_DM_MQ_DEFAULT=3Dy

After the system came up I logged in, switched to the bfq-mq scheduler
and ran several I/O tests against the boot disk. Sorry but nothing
interesting appeared in the kernel log.

Bart.
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality N=
otice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or l=
egally privileged information of WDC and/or its affiliates, and are intende=
d solely for the use of the individual or entity to which they are addresse=
d. If you are not the intended recipient, any disclosure, copying, distribu=
tion or any action taken or omitted to be taken in reliance on it, is prohi=
bited. If you have received this e-mail in error, please notify the sender =
immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 18:13       ` Bart Van Assche
  0 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 18:13 UTC (permalink / raw)
  To: Paolo Valente
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij

On 02/10/2017 08:49 AM, Paolo Valente wrote:
>> $ grep '^C.*_MQ_' .config
>> CONFIG_BLK_MQ_PCI=y
>> CONFIG_MQ_IOSCHED_BFQ=y
>> CONFIG_MQ_IOSCHED_DEADLINE=y
>> CONFIG_MQ_IOSCHED_NONE=y
>> CONFIG_DEFAULT_MQ_BFQ_MQ=y
>> CONFIG_DEFAULT_MQ_IOSCHED="bfq-mq"
>> CONFIG_SCSI_MQ_DEFAULT=y
>> CONFIG_DM_MQ_DEFAULT=y
>>
> 
> Could you reconfigure with none or mq-deadline as default, check
> whether the system boots, and, it it does, switch manually to bfq-mq,
> check what happens, and, in the likely case of a failure, try to get
> the oops?

Hello Paolo,

I just finished performing that test with the following kernel config:
$ grep '^C.*_MQ_' .config
CONFIG_BLK_MQ_PCI=y
CONFIG_MQ_IOSCHED_BFQ=y
CONFIG_MQ_IOSCHED_DEADLINE=y
CONFIG_MQ_IOSCHED_NONE=y
CONFIG_DEFAULT_MQ_DEADLINE=y
CONFIG_DEFAULT_MQ_IOSCHED="mq-deadline"
CONFIG_SCSI_MQ_DEFAULT=y
CONFIG_DM_MQ_DEFAULT=y

After the system came up I logged in, switched to the bfq-mq scheduler
and ran several I/O tests against the boot disk. Sorry but nothing
interesting appeared in the kernel log.

Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
@ 2017-02-10 18:34   ` Bart Van Assche
  2017-02-07 17:24 ` [WIP PATCHSET 2/4] Move thinktime from bic to bfqq Paolo Valente
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 18:34 UTC (permalink / raw)
  To: tj, paolo.valente, axboe
  Cc: ulf.hansson, linux-kernel, linux-block, broonie, linus.walleij

On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
> (lock assertions, BUG_ONs, ...).

Hello Paolo,

If you are using BUG_ON(), does that mean that you are not aware of Linus'
opinion about BUG_ON()? Please read https://lkml.org/lkml/2016/10/4/1.

Thanks,

Bart.
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality N=
otice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or l=
egally privileged information of WDC and/or its affiliates, and are intende=
d solely for the use of the individual or entity to which they are addresse=
d. If you are not the intended recipient, any disclosure, copying, distribu=
tion or any action taken or omitted to be taken in reliance on it, is prohi=
bited. If you have received this e-mail in error, please notify the sender =
immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 18:34   ` Bart Van Assche
  0 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-10 18:34 UTC (permalink / raw)
  To: tj, paolo.valente, axboe
  Cc: ulf.hansson, linux-kernel, linux-block, broonie, linus.walleij

On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
> (lock assertions, BUG_ONs, ...).

Hello Paolo,

If you are using BUG_ON(), does that mean that you are not aware of Linus'
opinion about BUG_ON()? Please read https://lkml.org/lkml/2016/10/4/1.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-10 18:34   ` Bart Van Assche
@ 2017-02-10 19:33     ` Paolo Valente
  -1 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 19:33 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 19:34, Bart Van Assche =
<bart.vanassche@sandisk.com> ha scritto:
>=20
> On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
>> (lock assertions, BUG_ONs, ...).
>=20
> Hello Paolo,
>=20
> If you are using BUG_ON(), does that mean that you are not aware of =
Linus'
> opinion about BUG_ON()? Please read https://lkml.org/lkml/2016/10/4/1.
>=20

I am, thanks.  But this is a testing version, overfull of assertions
as a form of hysteric defensive programming.  I will of course remove
all halting assertions in the submission for merging.

Thanks,
Paolo

> Thanks,
>=20
> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail =
Confidentiality Notice & Disclaimer:
>=20
> This e-mail and any files transmitted with it may contain confidential =
or legally privileged information of WDC and/or its affiliates, and are =
intended solely for the use of the individual or entity to which they =
are addressed. If you are not the intended recipient, any disclosure, =
copying, distribution or any action taken or omitted to be taken in =
reliance on it, is prohibited. If you have received this e-mail in =
error, please notify the sender immediately and delete the e-mail in its =
entirety from your system.
>=20

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 19:33     ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 19:33 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 19:34, Bart Van Assche <bart.vanassche@sandisk.com> ha scritto:
> 
> On Tue, 2017-02-07 at 18:24 +0100, Paolo Valente wrote:
>> (lock assertions, BUG_ONs, ...).
> 
> Hello Paolo,
> 
> If you are using BUG_ON(), does that mean that you are not aware of Linus'
> opinion about BUG_ON()? Please read https://lkml.org/lkml/2016/10/4/1.
> 

I am, thanks.  But this is a testing version, overfull of assertions
as a form of hysteric defensive programming.  I will of course remove
all halting assertions in the submission for merging.

Thanks,
Paolo

> Thanks,
> 
> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:
> 
> This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-10 18:13       ` Bart Van Assche
@ 2017-02-10 19:49         ` Paolo Valente
  -1 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 19:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 19:13, Bart Van Assche =
<bart.vanassche@sandisk.com> ha scritto:
>=20
> On 02/10/2017 08:49 AM, Paolo Valente wrote:
>>> $ grep '^C.*_MQ_' .config
>>> CONFIG_BLK_MQ_PCI=3Dy
>>> CONFIG_MQ_IOSCHED_BFQ=3Dy
>>> CONFIG_MQ_IOSCHED_DEADLINE=3Dy
>>> CONFIG_MQ_IOSCHED_NONE=3Dy
>>> CONFIG_DEFAULT_MQ_BFQ_MQ=3Dy
>>> CONFIG_DEFAULT_MQ_IOSCHED=3D"bfq-mq"
>>> CONFIG_SCSI_MQ_DEFAULT=3Dy
>>> CONFIG_DM_MQ_DEFAULT=3Dy
>>>=20
>>=20
>> Could you reconfigure with none or mq-deadline as default, check
>> whether the system boots, and, it it does, switch manually to bfq-mq,
>> check what happens, and, in the likely case of a failure, try to get
>> the oops?
>=20
> Hello Paolo,
>=20
> I just finished performing that test with the following kernel config:
> $ grep '^C.*_MQ_' .config
> CONFIG_BLK_MQ_PCI=3Dy
> CONFIG_MQ_IOSCHED_BFQ=3Dy
> CONFIG_MQ_IOSCHED_DEADLINE=3Dy
> CONFIG_MQ_IOSCHED_NONE=3Dy
> CONFIG_DEFAULT_MQ_DEADLINE=3Dy
> CONFIG_DEFAULT_MQ_IOSCHED=3D"mq-deadline"
> CONFIG_SCSI_MQ_DEFAULT=3Dy
> CONFIG_DM_MQ_DEFAULT=3Dy
>=20
> After the system came up I logged in, switched to the bfq-mq scheduler
> and ran several I/O tests against the boot disk.

Without any failure, right?

Unfortunately, as you can imagine, no boot failure occurred on
any of my test systems so far :(

This version of bfq-mq can be configured to print all its activity in
the kernel log, by just defining a macro.  This will of course slow
down the system so much to make it probably unusable, if bfq-mq is
active from boot.  Yet, the failure may still occur so early to make
this approach useful to discover where bfq-mq gets stuck.  As of now I
have no better ideas.  Any suggestion is welcome.

Thanks,
Paolo

> Sorry but nothing
> interesting appeared in the kernel log.
>=20
> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail =
Confidentiality Notice & Disclaimer:
>=20
> This e-mail and any files transmitted with it may contain confidential =
or legally privileged information of WDC and/or its affiliates, and are =
intended solely for the use of the individual or entity to which they =
are addressed. If you are not the intended recipient, any disclosure, =
copying, distribution or any action taken or omitted to be taken in =
reliance on it, is prohibited. If you have received this e-mail in =
error, please notify the sender immediately and delete the e-mail in its =
entirety from your system.
>=20

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-10 19:49         ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-10 19:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 19:13, Bart Van Assche <bart.vanassche@sandisk.com> ha scritto:
> 
> On 02/10/2017 08:49 AM, Paolo Valente wrote:
>>> $ grep '^C.*_MQ_' .config
>>> CONFIG_BLK_MQ_PCI=y
>>> CONFIG_MQ_IOSCHED_BFQ=y
>>> CONFIG_MQ_IOSCHED_DEADLINE=y
>>> CONFIG_MQ_IOSCHED_NONE=y
>>> CONFIG_DEFAULT_MQ_BFQ_MQ=y
>>> CONFIG_DEFAULT_MQ_IOSCHED="bfq-mq"
>>> CONFIG_SCSI_MQ_DEFAULT=y
>>> CONFIG_DM_MQ_DEFAULT=y
>>> 
>> 
>> Could you reconfigure with none or mq-deadline as default, check
>> whether the system boots, and, it it does, switch manually to bfq-mq,
>> check what happens, and, in the likely case of a failure, try to get
>> the oops?
> 
> Hello Paolo,
> 
> I just finished performing that test with the following kernel config:
> $ grep '^C.*_MQ_' .config
> CONFIG_BLK_MQ_PCI=y
> CONFIG_MQ_IOSCHED_BFQ=y
> CONFIG_MQ_IOSCHED_DEADLINE=y
> CONFIG_MQ_IOSCHED_NONE=y
> CONFIG_DEFAULT_MQ_DEADLINE=y
> CONFIG_DEFAULT_MQ_IOSCHED="mq-deadline"
> CONFIG_SCSI_MQ_DEFAULT=y
> CONFIG_DM_MQ_DEFAULT=y
> 
> After the system came up I logged in, switched to the bfq-mq scheduler
> and ran several I/O tests against the boot disk.

Without any failure, right?

Unfortunately, as you can imagine, no boot failure occurred on
any of my test systems so far :(

This version of bfq-mq can be configured to print all its activity in
the kernel log, by just defining a macro.  This will of course slow
down the system so much to make it probably unusable, if bfq-mq is
active from boot.  Yet, the failure may still occur so early to make
this approach useful to discover where bfq-mq gets stuck.  As of now I
have no better ideas.  Any suggestion is welcome.

Thanks,
Paolo

> Sorry but nothing
> interesting appeared in the kernel log.
> 
> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:
> 
> This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-10 19:49         ` Paolo Valente
@ 2017-02-13 21:07           ` Paolo Valente
  -1 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-13 21:07 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 20:49, Paolo Valente =
<paolo.valente@linaro.org> ha scritto:
>=20
>>=20
>> Il giorno 10 feb 2017, alle ore 19:13, Bart Van Assche =
<bart.vanassche@sandisk.com> ha scritto:
>>=20
>> On 02/10/2017 08:49 AM, Paolo Valente wrote:
>>>> $ grep '^C.*_MQ_' .config
>>>> CONFIG_BLK_MQ_PCI=3Dy
>>>> CONFIG_MQ_IOSCHED_BFQ=3Dy
>>>> CONFIG_MQ_IOSCHED_DEADLINE=3Dy
>>>> CONFIG_MQ_IOSCHED_NONE=3Dy
>>>> CONFIG_DEFAULT_MQ_BFQ_MQ=3Dy
>>>> CONFIG_DEFAULT_MQ_IOSCHED=3D"bfq-mq"
>>>> CONFIG_SCSI_MQ_DEFAULT=3Dy
>>>> CONFIG_DM_MQ_DEFAULT=3Dy
>>>>=20
>>>=20
>>> Could you reconfigure with none or mq-deadline as default, check
>>> whether the system boots, and, it it does, switch manually to =
bfq-mq,
>>> check what happens, and, in the likely case of a failure, try to get
>>> the oops?
>>=20
>> Hello Paolo,
>>=20
>> I just finished performing that test with the following kernel =
config:
>> $ grep '^C.*_MQ_' .config
>> CONFIG_BLK_MQ_PCI=3Dy
>> CONFIG_MQ_IOSCHED_BFQ=3Dy
>> CONFIG_MQ_IOSCHED_DEADLINE=3Dy
>> CONFIG_MQ_IOSCHED_NONE=3Dy
>> CONFIG_DEFAULT_MQ_DEADLINE=3Dy
>> CONFIG_DEFAULT_MQ_IOSCHED=3D"mq-deadline"
>> CONFIG_SCSI_MQ_DEFAULT=3Dy
>> CONFIG_DM_MQ_DEFAULT=3Dy
>>=20
>> After the system came up I logged in, switched to the bfq-mq =
scheduler
>> and ran several I/O tests against the boot disk.
>=20
> Without any failure, right?
>=20
> Unfortunately, as you can imagine, no boot failure occurred on
> any of my test systems so far :(
>=20
> This version of bfq-mq can be configured to print all its activity in
> the kernel log, by just defining a macro.  This will of course slow
> down the system so much to make it probably unusable, if bfq-mq is
> active from boot.  Yet, the failure may still occur so early to make
> this approach useful to discover where bfq-mq gets stuck.  As of now I
> have no better ideas.  Any suggestion is welcome.
>=20

Hi Bart,
I have found a machine crashing at boot, yet not only when bfq-mq is
chosen, but also when mq-deadline is chosen as the default
scheduler.  I have found and just reported the cause of the failure,
together with a fix.  Probably this is not the cause of your failure,
but what do you think about trying this fix?  BTW, I have rebased the
branch [1] against the new commits in Jens for-4.11/next.

Otherwise, if you have no news or suggestions, would you be willing to
try my micro-logging proposal?

Thanks,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

> Thanks,
> Paolo
>=20
>> Sorry but nothing
>> interesting appeared in the kernel log.
>>=20
>> Bart.
>> Western Digital Corporation (and its subsidiaries) E-mail =
Confidentiality Notice & Disclaimer:
>>=20
>> This e-mail and any files transmitted with it may contain =
confidential or legally privileged information of WDC and/or its =
affiliates, and are intended solely for the use of the individual or =
entity to which they are addressed. If you are not the intended =
recipient, any disclosure, copying, distribution or any action taken or =
omitted to be taken in reliance on it, is prohibited. If you have =
received this e-mail in error, please notify the sender immediately and =
delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-13 21:07           ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-13 21:07 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: tj, axboe, ulf.hansson, linux-kernel, linux-block, broonie,
	linus.walleij


> Il giorno 10 feb 2017, alle ore 20:49, Paolo Valente <paolo.valente@linaro.org> ha scritto:
> 
>> 
>> Il giorno 10 feb 2017, alle ore 19:13, Bart Van Assche <bart.vanassche@sandisk.com> ha scritto:
>> 
>> On 02/10/2017 08:49 AM, Paolo Valente wrote:
>>>> $ grep '^C.*_MQ_' .config
>>>> CONFIG_BLK_MQ_PCI=y
>>>> CONFIG_MQ_IOSCHED_BFQ=y
>>>> CONFIG_MQ_IOSCHED_DEADLINE=y
>>>> CONFIG_MQ_IOSCHED_NONE=y
>>>> CONFIG_DEFAULT_MQ_BFQ_MQ=y
>>>> CONFIG_DEFAULT_MQ_IOSCHED="bfq-mq"
>>>> CONFIG_SCSI_MQ_DEFAULT=y
>>>> CONFIG_DM_MQ_DEFAULT=y
>>>> 
>>> 
>>> Could you reconfigure with none or mq-deadline as default, check
>>> whether the system boots, and, it it does, switch manually to bfq-mq,
>>> check what happens, and, in the likely case of a failure, try to get
>>> the oops?
>> 
>> Hello Paolo,
>> 
>> I just finished performing that test with the following kernel config:
>> $ grep '^C.*_MQ_' .config
>> CONFIG_BLK_MQ_PCI=y
>> CONFIG_MQ_IOSCHED_BFQ=y
>> CONFIG_MQ_IOSCHED_DEADLINE=y
>> CONFIG_MQ_IOSCHED_NONE=y
>> CONFIG_DEFAULT_MQ_DEADLINE=y
>> CONFIG_DEFAULT_MQ_IOSCHED="mq-deadline"
>> CONFIG_SCSI_MQ_DEFAULT=y
>> CONFIG_DM_MQ_DEFAULT=y
>> 
>> After the system came up I logged in, switched to the bfq-mq scheduler
>> and ran several I/O tests against the boot disk.
> 
> Without any failure, right?
> 
> Unfortunately, as you can imagine, no boot failure occurred on
> any of my test systems so far :(
> 
> This version of bfq-mq can be configured to print all its activity in
> the kernel log, by just defining a macro.  This will of course slow
> down the system so much to make it probably unusable, if bfq-mq is
> active from boot.  Yet, the failure may still occur so early to make
> this approach useful to discover where bfq-mq gets stuck.  As of now I
> have no better ideas.  Any suggestion is welcome.
> 

Hi Bart,
I have found a machine crashing at boot, yet not only when bfq-mq is
chosen, but also when mq-deadline is chosen as the default
scheduler.  I have found and just reported the cause of the failure,
together with a fix.  Probably this is not the cause of your failure,
but what do you think about trying this fix?  BTW, I have rebased the
branch [1] against the new commits in Jens for-4.11/next.

Otherwise, if you have no news or suggestions, would you be willing to
try my micro-logging proposal?

Thanks,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

> Thanks,
> Paolo
> 
>> Sorry but nothing
>> interesting appeared in the kernel log.
>> 
>> Bart.
>> Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:
>> 
>> This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
@ 2017-02-13 21:09   ` Paolo Valente
  2017-02-07 17:24 ` [WIP PATCHSET 2/4] Move thinktime from bic to bfqq Paolo Valente
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-13 21:09 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Jens Axboe, Tejun Heo, linux-block, linux-kernel, ulf.hansson,
	linus.walleij, broonie


> Il giorno 07 feb 2017, alle ore 18:24, Paolo Valente =
<paolo.valente@linaro.org> ha scritto:
>=20
> Hi,
>=20
> I have finally pushed here [1] the current WIP branch of bfq for
> blk-mq, which I have tentatively named bfq-mq.
>=20
> This branch *IS NOT* meant for merging into mainline and contain code
> that mau easily violate code style, and not only, in many
> places. Commits implement the following main steps:
> 1) Add the last version of bfq for blk
> 2) Clone bfq source files into identical bfq-mq source files
> 3) Modify bfq-mq files to get a working version of bfq for blk-mq
> (cgroups support not yet functional)
>=20
> In my intentions, the main goals of this branch are:
>=20
> 1) Show, as soon as I could, the changes I made to let bfq-mq comply
> with blk-mq-sched framework. I though this could be particularly
> useful for Jens, being BFQ identical to CFQ in terms of hook
> interfaces and io-context handling, and almost identical in terms
> request-merging.
>=20
> 2) Enable people to test this first version bfq-mq. Code is purposely
> overfull of log messages and invariant checks that halt the system on
> failure (lock assertions, BUG_ONs, ...).
>=20
> To make it easier to revise commits, I'm sending the patches that
> transform bfq into bfq-mq (last four patches in the branch [1]). They
> work on two files, bfq-mq-iosched.c and bfq-mq.h, which, at the
> beginning, are just copies of bfq-iosched.c and bfq.h.
>=20

Hi,
this is just to inform that, as I just wrote to Bart, I have rebase
the branch [1] against the current content of for-4.11/next.

Jens, Omar, did you find the time to have a look at the main commits
or to run some test?

Thanks,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

> Thanks,
> Paolo
>=20
> [1] https://github.com/Algodev-github/bfq-mq
>=20
> Paolo Valente (4):
>  blk-mq: pass bio to blk_mq_sched_get_rq_priv
>  Move thinktime from bic to bfqq
>  Embed bfq-ioc.c and add locking on request queue
>  Modify interface and operation to comply with blk-mq-sched
>=20
> block/bfq-cgroup.c       |   4 -
> block/bfq-mq-iosched.c   | 852 =
+++++++++++++++++++++++++++++------------------
> block/bfq-mq.h           |  65 ++--
> block/blk-mq-sched.c     |   8 +-
> block/blk-mq-sched.h     |   5 +-
> include/linux/elevator.h |   2 +-
> 6 files changed, 567 insertions(+), 369 deletions(-)
>=20
> --
> 2.10.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-13 21:09   ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-13 21:09 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Jens Axboe, Tejun Heo, linux-block, linux-kernel, ulf.hansson,
	linus.walleij, broonie


> Il giorno 07 feb 2017, alle ore 18:24, Paolo Valente <paolo.valente@linaro.org> ha scritto:
> 
> Hi,
> 
> I have finally pushed here [1] the current WIP branch of bfq for
> blk-mq, which I have tentatively named bfq-mq.
> 
> This branch *IS NOT* meant for merging into mainline and contain code
> that mau easily violate code style, and not only, in many
> places. Commits implement the following main steps:
> 1) Add the last version of bfq for blk
> 2) Clone bfq source files into identical bfq-mq source files
> 3) Modify bfq-mq files to get a working version of bfq for blk-mq
> (cgroups support not yet functional)
> 
> In my intentions, the main goals of this branch are:
> 
> 1) Show, as soon as I could, the changes I made to let bfq-mq comply
> with blk-mq-sched framework. I though this could be particularly
> useful for Jens, being BFQ identical to CFQ in terms of hook
> interfaces and io-context handling, and almost identical in terms
> request-merging.
> 
> 2) Enable people to test this first version bfq-mq. Code is purposely
> overfull of log messages and invariant checks that halt the system on
> failure (lock assertions, BUG_ONs, ...).
> 
> To make it easier to revise commits, I'm sending the patches that
> transform bfq into bfq-mq (last four patches in the branch [1]). They
> work on two files, bfq-mq-iosched.c and bfq-mq.h, which, at the
> beginning, are just copies of bfq-iosched.c and bfq.h.
> 

Hi,
this is just to inform that, as I just wrote to Bart, I have rebase
the branch [1] against the current content of for-4.11/next.

Jens, Omar, did you find the time to have a look at the main commits
or to run some test?

Thanks,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

> Thanks,
> Paolo
> 
> [1] https://github.com/Algodev-github/bfq-mq
> 
> Paolo Valente (4):
>  blk-mq: pass bio to blk_mq_sched_get_rq_priv
>  Move thinktime from bic to bfqq
>  Embed bfq-ioc.c and add locking on request queue
>  Modify interface and operation to comply with blk-mq-sched
> 
> block/bfq-cgroup.c       |   4 -
> block/bfq-mq-iosched.c   | 852 +++++++++++++++++++++++++++++------------------
> block/bfq-mq.h           |  65 ++--
> block/blk-mq-sched.c     |   8 +-
> block/blk-mq-sched.h     |   5 +-
> include/linux/elevator.h |   2 +-
> 6 files changed, 567 insertions(+), 369 deletions(-)
> 
> --
> 2.10.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-13 21:07           ` Paolo Valente
@ 2017-02-13 22:07             ` Bart Van Assche
  -1 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-13 22:07 UTC (permalink / raw)
  To: paolo.valente
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe

On Mon, 2017-02-13 at 22:07 +0100, Paolo Valente wrote:
> but what do you think about trying this fix?

Sorry but with ... the same server I used for the previous test still
didn't boot up properly. A screenshot is available at
https://goo.gl/photos/Za9QVGCNe2BJBwxVA.

> Otherwise, if you have no news or suggestions, would you be willing to
> try my micro-logging proposal https://github.com/Algodev-github/bfq-mq?

Sorry but it's not clear to me what logging mechanism you are referring
to and how to enable it? Are you perhaps referring to
CONFIG_BFQ_REDIRECT_TO_CONSOLE?

Thanks,

Bart.
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality N=
otice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or l=
egally privileged information of WDC and/or its affiliates, and are intende=
d solely for the use of the individual or entity to which they are addresse=
d. If you are not the intended recipient, any disclosure, copying, distribu=
tion or any action taken or omitted to be taken in reliance on it, is prohi=
bited. If you have received this e-mail in error, please notify the sender =
immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-13 22:07             ` Bart Van Assche
  0 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-13 22:07 UTC (permalink / raw)
  To: paolo.valente
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe

On Mon, 2017-02-13 at 22:07 +0100, Paolo Valente wrote:
> but what do you think about trying this fix?

Sorry but with ... the same server I used for the previous test still
didn't boot up properly. A screenshot is available at
https://goo.gl/photos/Za9QVGCNe2BJBwxVA.

> Otherwise, if you have no news or suggestions, would you be willing to
> try my micro-logging proposal https://github.com/Algodev-github/bfq-mq?

Sorry but it's not clear to me what logging mechanism you are referring
to and how to enable it? Are you perhaps referring to
CONFIG_BFQ_REDIRECT_TO_CONSOLE?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-13 21:09   ` Paolo Valente
  (?)
@ 2017-02-13 22:29   ` Jens Axboe
  -1 siblings, 0 replies; 37+ messages in thread
From: Jens Axboe @ 2017-02-13 22:29 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Tejun Heo, linux-block, linux-kernel, ulf.hansson, linus.walleij,
	broonie

On 02/13/2017 02:09 PM, Paolo Valente wrote:
> 
>> Il giorno 07 feb 2017, alle ore 18:24, Paolo Valente <paolo.valente@linaro.org> ha scritto:
>>
>> Hi,
>>
>> I have finally pushed here [1] the current WIP branch of bfq for
>> blk-mq, which I have tentatively named bfq-mq.
>>
>> This branch *IS NOT* meant for merging into mainline and contain code
>> that mau easily violate code style, and not only, in many
>> places. Commits implement the following main steps:
>> 1) Add the last version of bfq for blk
>> 2) Clone bfq source files into identical bfq-mq source files
>> 3) Modify bfq-mq files to get a working version of bfq for blk-mq
>> (cgroups support not yet functional)
>>
>> In my intentions, the main goals of this branch are:
>>
>> 1) Show, as soon as I could, the changes I made to let bfq-mq comply
>> with blk-mq-sched framework. I though this could be particularly
>> useful for Jens, being BFQ identical to CFQ in terms of hook
>> interfaces and io-context handling, and almost identical in terms
>> request-merging.
>>
>> 2) Enable people to test this first version bfq-mq. Code is purposely
>> overfull of log messages and invariant checks that halt the system on
>> failure (lock assertions, BUG_ONs, ...).
>>
>> To make it easier to revise commits, I'm sending the patches that
>> transform bfq into bfq-mq (last four patches in the branch [1]). They
>> work on two files, bfq-mq-iosched.c and bfq-mq.h, which, at the
>> beginning, are just copies of bfq-iosched.c and bfq.h.
>>
> 
> Hi,
> this is just to inform that, as I just wrote to Bart, I have rebase
> the branch [1] against the current content of for-4.11/next.
> 
> Jens, Omar, did you find the time to have a look at the main commits
> or to run some test?

I only looked at the core change you proposed for passing in the
bio as well, and Omar fixed up the icq exit part and I also applied
that patch. I haven't look at any of the bfq-mq patches at all yet.
Not sure what I can do with those, I don't think those are
particularly useful to anyone but you.

Might make more sense to post the conversion for review as a
whole.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-13 22:07             ` Bart Van Assche
@ 2017-02-13 22:38               ` Bart Van Assche
  -1 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-13 22:38 UTC (permalink / raw)
  To: paolo.valente
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe

On Mon, 2017-02-13 at 22:07 +0000, Bart Van Assche wrote:
> On Mon, 2017-02-13 at 22:07 +0100, Paolo Valente wrote:
> > but what do you think about trying this fix?
> =

> Sorry but with ... the same server I used for the previous test still
> didn't boot up properly. A screenshot is available at
> https://goo.gl/photos/Za9QVGCNe2BJBwxVA.
> =

> > Otherwise, if you have no news or suggestions, would you be willing to
> > try my micro-logging proposal https://github.com/Algodev-github/bfq-mq?
> =

> Sorry but it's not clear to me what logging mechanism you are referring
> to and how to enable it? Are you perhaps referring to
> CONFIG_BFQ_REDIRECT_TO_CONSOLE?

Anyway, a second screenshot has been added to the same album after I had
applied the following patch:

diff --git a/block/Makefile b/block/Makefile
index 1c04fe19e825..bf472ac82c08 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -2,6 +2,8 @@
 # Makefile for the kernel block layer
 #
 =A0
+KBUILD_CFLAGS +=3D -DCONFIG_BFQ_REDIRECT_TO_CONSOLE
+

 obj-$(CONFIG_BLOCK) :=3D bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o=
 \
 =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0blk-f=
lush.o blk-settings.o blk-ioc.o blk-map.o \
 =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0blk-e=
xec.o blk-merge.o blk-softirq.o blk-timeout.o \

Bart.
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality N=
otice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or l=
egally privileged information of WDC and/or its affiliates, and are intende=
d solely for the use of the individual or entity to which they are addresse=
d. If you are not the intended recipient, any disclosure, copying, distribu=
tion or any action taken or omitted to be taken in reliance on it, is prohi=
bited. If you have received this e-mail in error, please notify the sender =
immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-13 22:38               ` Bart Van Assche
  0 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-13 22:38 UTC (permalink / raw)
  To: paolo.valente
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe

On Mon, 2017-02-13 at 22:07 +0000, Bart Van Assche wrote:
> On Mon, 2017-02-13 at 22:07 +0100, Paolo Valente wrote:
> > but what do you think about trying this fix?
> 
> Sorry but with ... the same server I used for the previous test still
> didn't boot up properly. A screenshot is available at
> https://goo.gl/photos/Za9QVGCNe2BJBwxVA.
> 
> > Otherwise, if you have no news or suggestions, would you be willing to
> > try my micro-logging proposal https://github.com/Algodev-github/bfq-mq?
> 
> Sorry but it's not clear to me what logging mechanism you are referring
> to and how to enable it? Are you perhaps referring to
> CONFIG_BFQ_REDIRECT_TO_CONSOLE?

Anyway, a second screenshot has been added to the same album after I had
applied the following patch:

diff --git a/block/Makefile b/block/Makefile
index 1c04fe19e825..bf472ac82c08 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -2,6 +2,8 @@
 # Makefile for the kernel block layer
 #
  
+KBUILD_CFLAGS += -DCONFIG_BFQ_REDIRECT_TO_CONSOLE
+

 obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o \
                        blk-flush.o blk-settings.o blk-ioc.o blk-map.o \
                        blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \

Bart.

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-13 22:38               ` Bart Van Assche
@ 2017-02-22 21:29                 ` Paolo Valente
  -1 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-22 21:29 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe


> Il giorno 13 feb 2017, alle ore 23:38, Bart Van Assche =
<bart.vanassche@sandisk.com> ha scritto:
>=20
> On Mon, 2017-02-13 at 22:07 +0000, Bart Van Assche wrote:
>> On Mon, 2017-02-13 at 22:07 +0100, Paolo Valente wrote:
>>> but what do you think about trying this fix?
>>=20
>> Sorry but with ... the same server I used for the previous test still
>> didn't boot up properly. A screenshot is available at
>> https://goo.gl/photos/Za9QVGCNe2BJBwxVA.
>>=20
>>> Otherwise, if you have no news or suggestions, would you be willing =
to
>>> try my micro-logging proposal =
https://github.com/Algodev-github/bfq-mq?
>>=20
>> Sorry but it's not clear to me what logging mechanism you are =
referring
>> to and how to enable it? Are you perhaps referring to
>> CONFIG_BFQ_REDIRECT_TO_CONSOLE?
>=20
> Anyway, a second screenshot has been added to the same album after I =
had
> applied the following patch:
>=20

Hi Bart,
thanks for this second attempt of yours.  Although, unfortunately, not
providing some clear indication of the exact cause of your hang (apart
from a possible deadlock), your log helped me notice another bug.

At any rate, as I have just written to Jens, I have pushed a new
version of the branch [1] (not just added new commits, but also
integrated some old commit with new changes, to make it more quickly).
The branch now contains both a fix for the above bug, and, more
importantly, a fix for the circular dependencies that were still
lurking around.  Could you please test it?

Crossing my fingers,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

> diff --git a/block/Makefile b/block/Makefile
> index 1c04fe19e825..bf472ac82c08 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -2,6 +2,8 @@
> # Makefile for the kernel block layer
> #
> =20
> +KBUILD_CFLAGS +=3D -DCONFIG_BFQ_REDIRECT_TO_CONSOLE
> +
>=20
> obj-$(CONFIG_BLOCK) :=3D bio.o elevator.o blk-core.o blk-tag.o =
blk-sysfs.o \
>                        blk-flush.o blk-settings.o blk-ioc.o blk-map.o =
\
>                        blk-exec.o blk-merge.o blk-softirq.o =
blk-timeout.o \
>=20
> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail =
Confidentiality Notice & Disclaimer:
>=20
> This e-mail and any files transmitted with it may contain confidential =
or legally privileged information of WDC and/or its affiliates, and are =
intended solely for the use of the individual or entity to which they =
are addressed. If you are not the intended recipient, any disclosure, =
copying, distribution or any action taken or omitted to be taken in =
reliance on it, is prohibited. If you have received this e-mail in =
error, please notify the sender immediately and delete the e-mail in its =
entirety from your system.
>=20

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-22 21:29                 ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-22 21:29 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe


> Il giorno 13 feb 2017, alle ore 23:38, Bart Van Assche <bart.vanassche@sandisk.com> ha scritto:
> 
> On Mon, 2017-02-13 at 22:07 +0000, Bart Van Assche wrote:
>> On Mon, 2017-02-13 at 22:07 +0100, Paolo Valente wrote:
>>> but what do you think about trying this fix?
>> 
>> Sorry but with ... the same server I used for the previous test still
>> didn't boot up properly. A screenshot is available at
>> https://goo.gl/photos/Za9QVGCNe2BJBwxVA.
>> 
>>> Otherwise, if you have no news or suggestions, would you be willing to
>>> try my micro-logging proposal https://github.com/Algodev-github/bfq-mq?
>> 
>> Sorry but it's not clear to me what logging mechanism you are referring
>> to and how to enable it? Are you perhaps referring to
>> CONFIG_BFQ_REDIRECT_TO_CONSOLE?
> 
> Anyway, a second screenshot has been added to the same album after I had
> applied the following patch:
> 

Hi Bart,
thanks for this second attempt of yours.  Although, unfortunately, not
providing some clear indication of the exact cause of your hang (apart
from a possible deadlock), your log helped me notice another bug.

At any rate, as I have just written to Jens, I have pushed a new
version of the branch [1] (not just added new commits, but also
integrated some old commit with new changes, to make it more quickly).
The branch now contains both a fix for the above bug, and, more
importantly, a fix for the circular dependencies that were still
lurking around.  Could you please test it?

Crossing my fingers,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

> diff --git a/block/Makefile b/block/Makefile
> index 1c04fe19e825..bf472ac82c08 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -2,6 +2,8 @@
> # Makefile for the kernel block layer
> #
>  
> +KBUILD_CFLAGS += -DCONFIG_BFQ_REDIRECT_TO_CONSOLE
> +
> 
> obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o \
>                        blk-flush.o blk-settings.o blk-ioc.o blk-map.o \
>                        blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \
> 
> Bart.
> Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:
> 
> This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-22 21:29                 ` Paolo Valente
@ 2017-02-24 18:44                   ` Bart Van Assche
  -1 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-24 18:44 UTC (permalink / raw)
  To: paolo.valente
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe

On Wed, 2017-02-22 at 22:29 +0100, Paolo Valente wrote:
> thanks for this second attempt of yours.  Although, unfortunately, not
> providing some clear indication of the exact cause of your hang (apart
> from a possible deadlock), your log helped me notice another bug.
>=20
> At any rate, as I have just written to Jens, I have pushed a new
> version of the branch [1] (not just added new commits, but also
> integrated some old commit with new changes, to make it more quickly).
> The branch now contains both a fix for the above bug, and, more
> importantly, a fix for the circular dependencies that were still
> lurking around.  Could you please test it?

Hello Paolo,

I have good news: the same test system boots normally with the same
kernel config I used during my previous tests and with the latest
bfq-mq code (commit a965d19585c0) merged with kernel v4.10.

Thanks,

Bart.=

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-24 18:44                   ` Bart Van Assche
  0 siblings, 0 replies; 37+ messages in thread
From: Bart Van Assche @ 2017-02-24 18:44 UTC (permalink / raw)
  To: paolo.valente
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe

On Wed, 2017-02-22 at 22:29 +0100, Paolo Valente wrote:
> thanks for this second attempt of yours.  Although, unfortunately, not
> providing some clear indication of the exact cause of your hang (apart
> from a possible deadlock), your log helped me notice another bug.
> 
> At any rate, as I have just written to Jens, I have pushed a new
> version of the branch [1] (not just added new commits, but also
> integrated some old commit with new changes, to make it more quickly).
> The branch now contains both a fix for the above bug, and, more
> importantly, a fix for the circular dependencies that were still
> lurking around.  Could you please test it?

Hello Paolo,

I have good news: the same test system boots normally with the same
kernel config I used during my previous tests and with the latest
bfq-mq code (commit a965d19585c0) merged with kernel v4.10.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
  2017-02-24 18:44                   ` Bart Van Assche
@ 2017-02-25 17:45                     ` Paolo Valente
  -1 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-25 17:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe


> Il giorno 24 feb 2017, alle ore 19:44, Bart Van Assche =
<bart.vanassche@sandisk.com> ha scritto:
>=20
> On Wed, 2017-02-22 at 22:29 +0100, Paolo Valente wrote:
>> thanks for this second attempt of yours.  Although, unfortunately, =
not
>> providing some clear indication of the exact cause of your hang =
(apart
>> from a possible deadlock), your log helped me notice another bug.
>>=20
>> At any rate, as I have just written to Jens, I have pushed a new
>> version of the branch [1] (not just added new commits, but also
>> integrated some old commit with new changes, to make it more =
quickly).
>> The branch now contains both a fix for the above bug, and, more
>> importantly, a fix for the circular dependencies that were still
>> lurking around.  Could you please test it?
>=20
> Hello Paolo,
>=20

Hi

> I have good news: the same test system boots normally with the same
> kernel config I used during my previous tests and with the latest
> bfq-mq code (commit a965d19585c0) merged with kernel v4.10.
>=20

Whew, I was longing for you reply, thanks :)

Should you want to have a look at it, I have just finished completing
cgroups support too, as you have probably already read from my
previous email.

Thanks,
Paolo

> Thanks,
>=20
> Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [WIP PATCHSET 0/4] WIP branch for bfq-mq
@ 2017-02-25 17:45                     ` Paolo Valente
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Valente @ 2017-02-25 17:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: ulf.hansson, tj, linux-kernel, linux-block, broonie,
	linus.walleij, axboe


> Il giorno 24 feb 2017, alle ore 19:44, Bart Van Assche <bart.vanassche@sandisk.com> ha scritto:
> 
> On Wed, 2017-02-22 at 22:29 +0100, Paolo Valente wrote:
>> thanks for this second attempt of yours.  Although, unfortunately, not
>> providing some clear indication of the exact cause of your hang (apart
>> from a possible deadlock), your log helped me notice another bug.
>> 
>> At any rate, as I have just written to Jens, I have pushed a new
>> version of the branch [1] (not just added new commits, but also
>> integrated some old commit with new changes, to make it more quickly).
>> The branch now contains both a fix for the above bug, and, more
>> importantly, a fix for the circular dependencies that were still
>> lurking around.  Could you please test it?
> 
> Hello Paolo,
> 

Hi

> I have good news: the same test system boots normally with the same
> kernel config I used during my previous tests and with the latest
> bfq-mq code (commit a965d19585c0) merged with kernel v4.10.
> 

Whew, I was longing for you reply, thanks :)

Should you want to have a look at it, I have just finished completing
cgroups support too, as you have probably already read from my
previous email.

Thanks,
Paolo

> Thanks,
> 
> Bart.

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2017-02-25 17:46 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-07 17:24 [WIP PATCHSET 0/4] WIP branch for bfq-mq Paolo Valente
2017-02-07 17:24 ` [WIP PATCHSET 1/4] blk-mq: pass bio to blk_mq_sched_get_rq_priv Paolo Valente
2017-02-10 16:10   ` Jens Axboe
2017-02-07 17:24 ` [WIP PATCHSET 2/4] Move thinktime from bic to bfqq Paolo Valente
2017-02-07 17:24 ` [WIP PATCHSET 3/4] Embed bfq-ioc.c and add locking on request queue Paolo Valente
2017-02-07 17:24 ` [WIP PATCHSET 4/4] Modify interface and operation to comply with blk-mq-sched Paolo Valente
2017-02-10 16:08 ` [WIP PATCHSET 0/4] WIP branch for bfq-mq Bart Van Assche
2017-02-10 16:08   ` Bart Van Assche
2017-02-10 16:49   ` Paolo Valente
2017-02-10 16:49     ` Paolo Valente
2017-02-10 16:45 ` Bart Van Assche
2017-02-10 16:45   ` Bart Van Assche
2017-02-10 16:49   ` Paolo Valente
2017-02-10 16:49     ` Paolo Valente
2017-02-10 18:13     ` Bart Van Assche
2017-02-10 18:13       ` Bart Van Assche
2017-02-10 19:49       ` Paolo Valente
2017-02-10 19:49         ` Paolo Valente
2017-02-13 21:07         ` Paolo Valente
2017-02-13 21:07           ` Paolo Valente
2017-02-13 22:07           ` Bart Van Assche
2017-02-13 22:07             ` Bart Van Assche
2017-02-13 22:38             ` Bart Van Assche
2017-02-13 22:38               ` Bart Van Assche
2017-02-22 21:29               ` Paolo Valente
2017-02-22 21:29                 ` Paolo Valente
2017-02-24 18:44                 ` Bart Van Assche
2017-02-24 18:44                   ` Bart Van Assche
2017-02-25 17:45                   ` Paolo Valente
2017-02-25 17:45                     ` Paolo Valente
2017-02-10 18:34 ` Bart Van Assche
2017-02-10 18:34   ` Bart Van Assche
2017-02-10 19:33   ` Paolo Valente
2017-02-10 19:33     ` Paolo Valente
2017-02-13 21:09 ` Paolo Valente
2017-02-13 21:09   ` Paolo Valente
2017-02-13 22:29   ` Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.