* [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
@ 2018-10-25 21:10 Jens Axboe
  2018-10-25 21:10 ` [PATCH 01/28] sunvdc: convert to blk-mq Jens Axboe
                   ` (29 more replies)
  0 siblings, 30 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide

The first round of this went into 4.20-rc, but some of the conversions
are still pending. This patch series converts the remaining drivers to
blk-mq. Drivers that support dual paths (like SCSI and DM) have the
non-mq path removed. At the end, the legacy IO code and the legacy IO
schedulers are killed off.

This patch series is on top of my for-linus branch. It can also
be found in my mq-conversions branch.
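
For anyone not familiar with the mq interface, the rough shape all of
these conversions follow is sketched below. This is only an illustration
of the common pattern, not code from any one driver in the series; the
foo_* names are made up, and the two foo_hw_*() stubs stand in for
whatever a real driver does to talk to its hardware.

#include <linux/blk-mq.h>

struct foo_dev {
        struct blk_mq_tag_set   tag_set;
        struct request_queue    *queue;
};

/* stubs for the sketch; a real driver submits to and polls its hardware */
static bool foo_hw_has_space(struct foo_dev *dev)
{
        return true;
}

static int foo_hw_submit(struct foo_dev *dev, struct request *rq)
{
        blk_mq_end_request(rq, BLK_STS_OK);     /* complete immediately */
        return 0;
}

/* ->queue_rq() replaces the old request_fn; it is called per request */
static blk_status_t foo_queue_rq(struct blk_mq_hw_ctx *hctx,
                                 const struct blk_mq_queue_data *bd)
{
        struct foo_dev *dev = hctx->queue->queuedata;
        struct request *rq = bd->rq;

        blk_mq_start_request(rq);

        if (!foo_hw_has_space(dev)) {
                /*
                 * Out of device resources: stop the queue and let blk-mq
                 * retry the request once the driver restarts the queue.
                 */
                blk_mq_stop_hw_queue(hctx);
                return BLK_STS_DEV_RESOURCE;
        }

        return foo_hw_submit(dev, rq) ? BLK_STS_IOERR : BLK_STS_OK;
}

static const struct blk_mq_ops foo_mq_ops = {
        .queue_rq       = foo_queue_rq,
};

static int foo_init_queue(struct foo_dev *dev)
{
        struct request_queue *q;

        /* single hw queue; the depth of 64 is arbitrary for the sketch */
        q = blk_mq_init_sq_queue(&dev->tag_set, &foo_mq_ops, 64,
                                 BLK_MQ_F_SHOULD_MERGE);
        if (IS_ERR(q))
                return PTR_ERR(q);

        q->queuedata = dev;
        dev->queue = q;
        return 0;
}

Completion moves from __blk_end_request() and friends to
blk_mq_end_request()/blk_update_request(), and queue start/stop moves
from blk_start_queue()/blk_stop_queue() to blk_mq_start_hw_queues()/
blk_mq_stop_hw_queues(), as the individual patches show.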

 Documentation/block/biodoc.txt         |   88 -
 Documentation/block/cfq-iosched.txt    |  291 --
 Documentation/scsi/scsi-parameters.txt |    5 -
 block/Kconfig                          |    6 -
 block/Kconfig.iosched                  |   61 -
 block/Makefile                         |    5 +-
 block/bfq-iosched.c                    |    1 -
 block/blk-cgroup.c                     |   55 -
 block/blk-core.c                       | 1860 +-----------
 block/blk-exec.c                       |   20 +-
 block/blk-flush.c                      |  154 +-
 block/blk-ioc.c                        |   46 +-
 block/blk-merge.c                      |   35 +-
 block/blk-mq-debugfs.c                 |    2 -
 block/blk-mq-tag.c                     |    6 +-
 block/blk-mq.c                         |   13 +-
 block/blk-settings.c                   |   49 -
 block/blk-softirq.c                    |   20 -
 block/blk-sysfs.c                      |   39 +-
 block/blk-tag.c                        |  378 ---
 block/blk-timeout.c                    |   99 +-
 block/blk-wbt.c                        |    3 +-
 block/blk.h                            |   60 +-
 block/bsg-lib.c                        |  131 +-
 block/cfq-iosched.c                    | 4916 --------------------------------
 block/deadline-iosched.c               |  560 ----
 block/elevator.c                       |  447 +--
 block/kyber-iosched.c                  |    1 -
 block/mq-deadline.c                    |    1 -
 block/noop-iosched.c                   |  124 -
 drivers/block/sunvdc.c                 |  149 +-
 drivers/ide/ide-atapi.c                |   25 +-
 drivers/ide/ide-cd.c                   |  175 +-
 drivers/ide/ide-disk.c                 |    5 +-
 drivers/ide/ide-io.c                   |  101 +-
 drivers/ide/ide-park.c                 |    4 +-
 drivers/ide/ide-pm.c                   |   28 +-
 drivers/ide/ide-probe.c                |   68 +-
 drivers/infiniband/ulp/srp/ib_srp.c    |    7 -
 drivers/md/Kconfig                     |   11 -
 drivers/md/dm-core.h                   |   10 -
 drivers/md/dm-mpath.c                  |   18 +-
 drivers/md/dm-rq.c                     |  293 +-
 drivers/md/dm-rq.h                     |    4 -
 drivers/md/dm-sysfs.c                  |    3 +-
 drivers/md/dm-table.c                  |   36 +-
 drivers/md/dm.c                        |   21 +-
 drivers/md/dm.h                        |    1 -
 drivers/memstick/core/ms_block.c       |  110 +-
 drivers/memstick/core/ms_block.h       |    1 +
 drivers/memstick/core/mspro_block.c    |  121 +-
 drivers/s390/block/dasd_ioctl.c        |   22 +-
 drivers/scsi/Kconfig                   |   12 -
 drivers/scsi/cxlflash/main.c           |    6 -
 drivers/scsi/hosts.c                   |   29 +-
 drivers/scsi/lpfc/lpfc_scsi.c          |    2 +-
 drivers/scsi/osd/osd_initiator.c       |    4 +-
 drivers/scsi/osst.c                    |    2 +-
 drivers/scsi/qedi/qedi_main.c          |    3 +-
 drivers/scsi/qla2xxx/qla_os.c          |   30 +-
 drivers/scsi/scsi.c                    |    5 +-
 drivers/scsi/scsi_debug.c              |    3 +-
 drivers/scsi/scsi_error.c              |    4 +-
 drivers/scsi/scsi_lib.c                |  624 +---
 drivers/scsi/scsi_priv.h               |    1 -
 drivers/scsi/scsi_scan.c               |   10 +-
 drivers/scsi/scsi_sysfs.c              |    8 +-
 drivers/scsi/scsi_transport_fc.c       |   72 +-
 drivers/scsi/scsi_transport_iscsi.c    |    9 +-
 drivers/scsi/scsi_transport_sas.c      |   10 +-
 drivers/scsi/sg.c                      |    2 +-
 drivers/scsi/st.c                      |    2 +-
 drivers/scsi/ufs/ufshcd.c              |    6 -
 drivers/target/target_core_pscsi.c     |    2 +-
 include/linux/blk-cgroup.h             |  108 -
 include/linux/blkdev.h                 |  174 +-
 include/linux/bsg-lib.h                |    3 +-
 include/linux/elevator.h               |   90 +-
 include/linux/ide.h                    |   13 +-
 include/linux/init.h                   |    1 -
 include/scsi/scsi_host.h               |   18 +-
 include/scsi/scsi_tcq.h                |   14 +-
 init/do_mounts_initrd.c                |    3 -
 init/initramfs.c                       |    6 -
 init/main.c                            |   12 -
 85 files changed, 833 insertions(+), 11144 deletions(-)

-- 
Jens Axboe

* [PATCH 01/28] sunvdc: convert to blk-mq
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-27 10:42   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 02/28] ms_block: " Jens Axboe
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe, David Miller

Convert from the old request_fn style driver to blk-mq.

Cc: David Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/block/sunvdc.c | 149 +++++++++++++++++++++++++++--------------
 1 file changed, 98 insertions(+), 51 deletions(-)

diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
index b54fa6726303..95cb4ea8e402 100644
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -6,7 +6,7 @@
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
-#include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/hdreg.h>
 #include <linux/genhd.h>
 #include <linux/cdrom.h>
@@ -66,9 +66,10 @@ struct vdc_port {
 
 	u64			max_xfer_size;
 	u32			vdisk_block_size;
+	u32			drain;
 
 	u64			ldc_timeout;
-	struct timer_list	ldc_reset_timer;
+	struct delayed_work	ldc_reset_timer_work;
 	struct work_struct	ldc_reset_work;
 
 	/* The server fills these in for us in the disk attribute
@@ -80,12 +81,14 @@ struct vdc_port {
 	u8			vdisk_mtype;
 	u32			vdisk_phys_blksz;
 
+	struct blk_mq_tag_set	tag_set;
+
 	char			disk_name[32];
 };
 
 static void vdc_ldc_reset(struct vdc_port *port);
 static void vdc_ldc_reset_work(struct work_struct *work);
-static void vdc_ldc_reset_timer(struct timer_list *t);
+static void vdc_ldc_reset_timer_work(struct work_struct *work);
 
 static inline struct vdc_port *to_vdc_port(struct vio_driver_state *vio)
 {
@@ -175,11 +178,8 @@ static void vdc_blk_queue_start(struct vdc_port *port)
 	 * handshake completes, so check for initial handshake before we've
 	 * allocated a disk.
 	 */
-	if (port->disk && blk_queue_stopped(port->disk->queue) &&
-	    vdc_tx_dring_avail(dr) * 100 / VDC_TX_RING_SIZE >= 50) {
-		blk_start_queue(port->disk->queue);
-	}
-
+	if (port->disk && vdc_tx_dring_avail(dr) * 100 / VDC_TX_RING_SIZE >= 50)
+		blk_mq_start_hw_queues(port->disk->queue);
 }
 
 static void vdc_finish(struct vio_driver_state *vio, int err, int waiting_for)
@@ -197,7 +197,7 @@ static void vdc_handshake_complete(struct vio_driver_state *vio)
 {
 	struct vdc_port *port = to_vdc_port(vio);
 
-	del_timer(&port->ldc_reset_timer);
+	cancel_delayed_work(&port->ldc_reset_timer_work);
 	vdc_finish(vio, 0, WAITING_FOR_LINK_UP);
 	vdc_blk_queue_start(port);
 }
@@ -320,7 +320,7 @@ static void vdc_end_one(struct vdc_port *port, struct vio_dring_state *dr,
 
 	rqe->req = NULL;
 
-	__blk_end_request(req, (desc->status ? BLK_STS_IOERR : 0), desc->size);
+	blk_mq_end_request(req, desc->status ? BLK_STS_IOERR : 0);
 
 	vdc_blk_queue_start(port);
 }
@@ -525,29 +525,40 @@ static int __send_request(struct request *req)
 	return err;
 }
 
-static void do_vdc_request(struct request_queue *rq)
+static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
+				 const struct blk_mq_queue_data *bd)
 {
-	struct request *req;
+	struct vdc_port *port = hctx->queue->queuedata;
+	struct vio_dring_state *dr;
+	unsigned long flags;
 
-	while ((req = blk_peek_request(rq)) != NULL) {
-		struct vdc_port *port;
-		struct vio_dring_state *dr;
+	dr = &port->vio.drings[VIO_DRIVER_TX_RING];
 
-		port = req->rq_disk->private_data;
-		dr = &port->vio.drings[VIO_DRIVER_TX_RING];
-		if (unlikely(vdc_tx_dring_avail(dr) < 1))
-			goto wait;
+	blk_mq_start_request(bd->rq);
 
-		blk_start_request(req);
+	spin_lock_irqsave(&port->vio.lock, flags);
 
-		if (__send_request(req) < 0) {
-			blk_requeue_request(rq, req);
-wait:
-			/* Avoid pointless unplugs. */
-			blk_stop_queue(rq);
-			break;
-		}
+	/*
+	 * Doing drain, just end the request in error
+	 */
+	if (unlikely(port->drain)) {
+		spin_unlock_irqrestore(&port->vio.lock, flags);
+		return BLK_STS_IOERR;
 	}
+
+	if (unlikely(vdc_tx_dring_avail(dr) < 1)) {
+		spin_unlock_irqrestore(&port->vio.lock, flags);
+		blk_mq_stop_hw_queue(hctx);
+		return BLK_STS_DEV_RESOURCE;
+	}
+
+	if (__send_request(bd->rq) < 0) {
+		spin_unlock_irqrestore(&port->vio.lock, flags);
+		return BLK_STS_IOERR;
+	}
+
+	spin_unlock_irqrestore(&port->vio.lock, flags);
+	return BLK_STS_OK;
 }
 
 static int generic_request(struct vdc_port *port, u8 op, void *buf, int len)
@@ -759,6 +770,32 @@ static void vdc_port_down(struct vdc_port *port)
 	vio_ldc_free(&port->vio);
 }
 
+static const struct blk_mq_ops vdc_mq_ops = {
+	.queue_rq	= vdc_queue_rq,
+};
+
+static void cleanup_queue(struct request_queue *q)
+{
+	struct vdc_port *port = q->queuedata;
+
+	blk_cleanup_queue(q);
+	blk_mq_free_tag_set(&port->tag_set);
+}
+
+static struct request_queue *init_queue(struct vdc_port *port)
+{
+	struct request_queue *q;
+	int ret;
+
+	q = blk_mq_init_sq_queue(&port->tag_set, &vdc_mq_ops, VDC_TX_RING_SIZE,
+					BLK_MQ_F_SHOULD_MERGE);
+	if (IS_ERR(q))
+		return q;
+
+	q->queuedata = port;
+	return q;
+}
+
 static int probe_disk(struct vdc_port *port)
 {
 	struct request_queue *q;
@@ -796,17 +833,17 @@ static int probe_disk(struct vdc_port *port)
 				    (u64)geom.num_sec);
 	}
 
-	q = blk_init_queue(do_vdc_request, &port->vio.lock);
-	if (!q) {
+	q = init_queue(port);
+	if (IS_ERR(q)) {
 		printk(KERN_ERR PFX "%s: Could not allocate queue.\n",
 		       port->vio.name);
-		return -ENOMEM;
+		return PTR_ERR(q);
 	}
 	g = alloc_disk(1 << PARTITION_SHIFT);
 	if (!g) {
 		printk(KERN_ERR PFX "%s: Could not allocate gendisk.\n",
 		       port->vio.name);
-		blk_cleanup_queue(q);
+		cleanup_queue(q);
 		return -ENOMEM;
 	}
 
@@ -981,7 +1018,7 @@ static int vdc_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	 */
 	ldc_timeout = mdesc_get_property(hp, vdev->mp, "vdc-timeout", NULL);
 	port->ldc_timeout = ldc_timeout ? *ldc_timeout : 0;
-	timer_setup(&port->ldc_reset_timer, vdc_ldc_reset_timer, 0);
+	INIT_DELAYED_WORK(&port->ldc_reset_timer_work, vdc_ldc_reset_timer_work);
 	INIT_WORK(&port->ldc_reset_work, vdc_ldc_reset_work);
 
 	err = vio_driver_init(&port->vio, vdev, VDEV_DISK,
@@ -1034,18 +1071,14 @@ static int vdc_port_remove(struct vio_dev *vdev)
 	struct vdc_port *port = dev_get_drvdata(&vdev->dev);
 
 	if (port) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&port->vio.lock, flags);
-		blk_stop_queue(port->disk->queue);
-		spin_unlock_irqrestore(&port->vio.lock, flags);
+		blk_mq_stop_hw_queues(port->disk->queue);
 
 		flush_work(&port->ldc_reset_work);
-		del_timer_sync(&port->ldc_reset_timer);
+		cancel_delayed_work_sync(&port->ldc_reset_timer_work);
 		del_timer_sync(&port->vio.timer);
 
 		del_gendisk(port->disk);
-		blk_cleanup_queue(port->disk->queue);
+		cleanup_queue(port->disk->queue);
 		put_disk(port->disk);
 		port->disk = NULL;
 
@@ -1080,32 +1113,46 @@ static void vdc_requeue_inflight(struct vdc_port *port)
 		}
 
 		rqe->req = NULL;
-		blk_requeue_request(port->disk->queue, req);
+		blk_mq_requeue_request(req, false);
 	}
 }
 
 static void vdc_queue_drain(struct vdc_port *port)
 {
-	struct request *req;
+	struct request_queue *q = port->disk->queue;
 
-	while ((req = blk_fetch_request(port->disk->queue)) != NULL)
-		__blk_end_request_all(req, BLK_STS_IOERR);
+	/*
+	 * Mark the queue as draining, then freeze/quiesce to ensure
+	 * that all existing requests are seen in ->queue_rq() and killed
+	 */
+	port->drain = 1;
+	spin_unlock_irq(&port->vio.lock);
+
+	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
+
+	spin_lock_irq(&port->vio.lock);
+	port->drain = 0;
+	blk_mq_unquiesce_queue(q);
+	blk_mq_unfreeze_queue(q);
 }
 
-static void vdc_ldc_reset_timer(struct timer_list *t)
+static void vdc_ldc_reset_timer_work(struct work_struct *work)
 {
-	struct vdc_port *port = from_timer(port, t, ldc_reset_timer);
-	struct vio_driver_state *vio = &port->vio;
-	unsigned long flags;
+	struct vdc_port *port;
+	struct vio_driver_state *vio;
 
-	spin_lock_irqsave(&vio->lock, flags);
+	port = container_of(work, struct vdc_port, ldc_reset_timer_work.work);
+	vio = &port->vio;
+
+	spin_lock_irq(&vio->lock);
 	if (!(port->vio.hs_state & VIO_HS_COMPLETE)) {
 		pr_warn(PFX "%s ldc down %llu seconds, draining queue\n",
 			port->disk_name, port->ldc_timeout);
 		vdc_queue_drain(port);
 		vdc_blk_queue_start(port);
 	}
-	spin_unlock_irqrestore(&vio->lock, flags);
+	spin_unlock_irq(&vio->lock);
 }
 
 static void vdc_ldc_reset_work(struct work_struct *work)
@@ -1129,7 +1176,7 @@ static void vdc_ldc_reset(struct vdc_port *port)
 	assert_spin_locked(&port->vio.lock);
 
 	pr_warn(PFX "%s ldc link reset\n", port->disk_name);
-	blk_stop_queue(port->disk->queue);
+	blk_mq_stop_hw_queues(port->disk->queue);
 	vdc_requeue_inflight(port);
 	vdc_port_down(port);
 
@@ -1146,7 +1193,7 @@ static void vdc_ldc_reset(struct vdc_port *port)
 	}
 
 	if (port->ldc_timeout)
-		mod_timer(&port->ldc_reset_timer,
+		mod_delayed_work(system_wq, &port->ldc_reset_timer_work,
 			  round_jiffies(jiffies + HZ * port->ldc_timeout));
 	mod_timer(&port->vio.timer, round_jiffies(jiffies + HZ));
 	return;
-- 
2.17.1

* [PATCH 02/28] ms_block: convert to blk-mq
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
  2018-10-25 21:10 ` [PATCH 01/28] sunvdc: convert to blk-mq Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-27 10:43   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 03/28] mspro_block: " Jens Axboe
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe, Maxim Levitsky

Straightforward conversion, though there is room for optimization in
how everything is punted to a work queue. The state changes also look
plenty racy all over the map; I fixed a bunch of the races up while
doing the conversion, but there are surely more.
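
Condensed, the punt boils down to the sketch below. This is a simplified
restatement of the msb_queue_rq() hunk further down, not the driver code
verbatim (hence the _sketch suffix); the card-dead and queue-stopped
handling is trimmed.

static blk_status_t msb_queue_rq_sketch(struct blk_mq_hw_ctx *hctx,
                                        const struct blk_mq_queue_data *bd)
{
        struct memstick_dev *card = hctx->queue->queuedata;
        struct msb_data *msb = memstick_get_drvdata(card);

        spin_lock_irq(&msb->q_lock);
        if (msb->req) {
                /* one request in flight at a time; have blk-mq retry later */
                spin_unlock_irq(&msb->q_lock);
                return BLK_STS_DEV_RESOURCE;
        }

        blk_mq_start_request(bd->rq);
        msb->req = bd->rq;

        /* the actual I/O happens in msb_io_work(), in process context */
        queue_work(msb->io_queue, &msb->io_work);
        spin_unlock_irq(&msb->q_lock);
        return BLK_STS_OK;
}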

Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/memstick/core/ms_block.c | 110 +++++++++++++++++--------------
 drivers/memstick/core/ms_block.h |   1 +
 2 files changed, 62 insertions(+), 49 deletions(-)

diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c
index 8a02f11076f9..ee0be66a9a03 100644
--- a/drivers/memstick/core/ms_block.c
+++ b/drivers/memstick/core/ms_block.c
@@ -15,7 +15,7 @@
 #define pr_fmt(fmt) DRIVER_NAME ": " fmt
 
 #include <linux/module.h>
-#include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/memstick.h>
 #include <linux/idr.h>
 #include <linux/hdreg.h>
@@ -1873,69 +1873,65 @@ static void msb_io_work(struct work_struct *work)
 	struct msb_data *msb = container_of(work, struct msb_data, io_work);
 	int page, error, len;
 	sector_t lba;
-	unsigned long flags;
 	struct scatterlist *sg = msb->prealloc_sg;
+	struct request *req;
 
 	dbg_verbose("IO: work started");
 
 	while (1) {
-		spin_lock_irqsave(&msb->q_lock, flags);
+		spin_lock_irq(&msb->q_lock);
 
 		if (msb->need_flush_cache) {
 			msb->need_flush_cache = false;
-			spin_unlock_irqrestore(&msb->q_lock, flags);
+			spin_unlock_irq(&msb->q_lock);
 			msb_cache_flush(msb);
 			continue;
 		}
 
-		if (!msb->req) {
-			msb->req = blk_fetch_request(msb->queue);
-			if (!msb->req) {
-				dbg_verbose("IO: no more requests exiting");
-				spin_unlock_irqrestore(&msb->q_lock, flags);
-				return;
-			}
+		req = msb->req;
+		if (!req) {
+			dbg_verbose("IO: no more requests exiting");
+			spin_unlock_irq(&msb->q_lock);
+			return;
 		}
 
-		spin_unlock_irqrestore(&msb->q_lock, flags);
-
-		/* If card was removed meanwhile */
-		if (!msb->req)
-			return;
+		spin_unlock_irq(&msb->q_lock);
 
 		/* process the request */
 		dbg_verbose("IO: processing new request");
-		blk_rq_map_sg(msb->queue, msb->req, sg);
+		blk_rq_map_sg(msb->queue, req, sg);
 
-		lba = blk_rq_pos(msb->req);
+		lba = blk_rq_pos(req);
 
 		sector_div(lba, msb->page_size / 512);
 		page = sector_div(lba, msb->pages_in_block);
 
 		if (rq_data_dir(msb->req) == READ)
 			error = msb_do_read_request(msb, lba, page, sg,
-				blk_rq_bytes(msb->req), &len);
+				blk_rq_bytes(req), &len);
 		else
 			error = msb_do_write_request(msb, lba, page, sg,
-				blk_rq_bytes(msb->req), &len);
-
-		spin_lock_irqsave(&msb->q_lock, flags);
+				blk_rq_bytes(req), &len);
 
-		if (len)
-			if (!__blk_end_request(msb->req, BLK_STS_OK, len))
-				msb->req = NULL;
+		if (len && !blk_update_request(req, BLK_STS_OK, len)) {
+			__blk_mq_end_request(req, BLK_STS_OK);
+			spin_lock_irq(&msb->q_lock);
+			msb->req = NULL;
+			spin_unlock_irq(&msb->q_lock);
+		}
 
 		if (error && msb->req) {
 			blk_status_t ret = errno_to_blk_status(error);
+
 			dbg_verbose("IO: ending one sector of the request with error");
-			if (!__blk_end_request(msb->req, ret, msb->page_size))
-				msb->req = NULL;
+			blk_mq_end_request(req, ret);
+			spin_lock_irq(&msb->q_lock);
+			msb->req = NULL;
+			spin_unlock_irq(&msb->q_lock);
 		}
 
 		if (msb->req)
 			dbg_verbose("IO: request still pending");
-
-		spin_unlock_irqrestore(&msb->q_lock, flags);
 	}
 }
 
@@ -2002,29 +1998,40 @@ static int msb_bd_getgeo(struct block_device *bdev,
 	return 0;
 }
 
-static void msb_submit_req(struct request_queue *q)
+static blk_status_t msb_queue_rq(struct blk_mq_hw_ctx *hctx,
+				 const struct blk_mq_queue_data *bd)
 {
-	struct memstick_dev *card = q->queuedata;
+	struct memstick_dev *card = hctx->queue->queuedata;
 	struct msb_data *msb = memstick_get_drvdata(card);
-	struct request *req = NULL;
+	struct request *req = bd->rq;
 
 	dbg_verbose("Submit request");
 
+	spin_lock_irq(&msb->q_lock);
+
 	if (msb->card_dead) {
 		dbg("Refusing requests on removed card");
 
 		WARN_ON(!msb->io_queue_stopped);
 
-		while ((req = blk_fetch_request(q)) != NULL)
-			__blk_end_request_all(req, BLK_STS_IOERR);
-		return;
+		spin_unlock_irq(&msb->q_lock);
+		blk_mq_start_request(req);
+		return BLK_STS_IOERR;
 	}
 
-	if (msb->req)
-		return;
+	if (msb->req) {
+		spin_unlock_irq(&msb->q_lock);
+		return BLK_STS_DEV_RESOURCE;
+	}
+
+	blk_mq_start_request(req);
+	msb->req = req;
 
 	if (!msb->io_queue_stopped)
 		queue_work(msb->io_queue, &msb->io_work);
+
+	spin_unlock_irq(&msb->q_lock);
+	return BLK_STS_OK;
 }
 
 static int msb_check_card(struct memstick_dev *card)
@@ -2040,21 +2047,20 @@ static void msb_stop(struct memstick_dev *card)
 
 	dbg("Stopping all msblock IO");
 
+	blk_mq_stop_hw_queues(msb->queue);
 	spin_lock_irqsave(&msb->q_lock, flags);
-	blk_stop_queue(msb->queue);
 	msb->io_queue_stopped = true;
 	spin_unlock_irqrestore(&msb->q_lock, flags);
 
 	del_timer_sync(&msb->cache_flush_timer);
 	flush_workqueue(msb->io_queue);
 
+	spin_lock_irqsave(&msb->q_lock, flags);
 	if (msb->req) {
-		spin_lock_irqsave(&msb->q_lock, flags);
-		blk_requeue_request(msb->queue, msb->req);
+		blk_mq_requeue_request(msb->req, false);
 		msb->req = NULL;
-		spin_unlock_irqrestore(&msb->q_lock, flags);
 	}
-
+	spin_unlock_irqrestore(&msb->q_lock, flags);
 }
 
 static void msb_start(struct memstick_dev *card)
@@ -2077,9 +2083,7 @@ static void msb_start(struct memstick_dev *card)
 	msb->need_flush_cache = true;
 	msb->io_queue_stopped = false;
 
-	spin_lock_irqsave(&msb->q_lock, flags);
-	blk_start_queue(msb->queue);
-	spin_unlock_irqrestore(&msb->q_lock, flags);
+	blk_mq_start_hw_queues(msb->queue);
 
 	queue_work(msb->io_queue, &msb->io_work);
 
@@ -2092,10 +2096,15 @@ static const struct block_device_operations msb_bdops = {
 	.owner   = THIS_MODULE
 };
 
+static const struct blk_mq_ops msb_mq_ops = {
+	.queue_rq	= msb_queue_rq,
+};
+
 /* Registers the block device */
 static int msb_init_disk(struct memstick_dev *card)
 {
 	struct msb_data *msb = memstick_get_drvdata(card);
+	struct blk_mq_tag_set *set = NULL;
 	int rc;
 	unsigned long capacity;
 
@@ -2112,9 +2121,11 @@ static int msb_init_disk(struct memstick_dev *card)
 		goto out_release_id;
 	}
 
-	msb->queue = blk_init_queue(msb_submit_req, &msb->q_lock);
-	if (!msb->queue) {
-		rc = -ENOMEM;
+	msb->queue = blk_mq_init_sq_queue(&msb->tag_set, &msb_mq_ops, 2,
+						BLK_MQ_F_SHOULD_MERGE);
+	if (IS_ERR(msb->queue)) {
+		rc = PTR_ERR(msb->queue);
+		msb->queue = NULL;
 		goto out_put_disk;
 	}
 
@@ -2202,12 +2213,13 @@ static void msb_remove(struct memstick_dev *card)
 	/* Take care of unhandled + new requests from now on */
 	spin_lock_irqsave(&msb->q_lock, flags);
 	msb->card_dead = true;
-	blk_start_queue(msb->queue);
 	spin_unlock_irqrestore(&msb->q_lock, flags);
+	blk_mq_start_hw_queues(msb->queue);
 
 	/* Remove the disk */
 	del_gendisk(msb->disk);
 	blk_cleanup_queue(msb->queue);
+	blk_mq_free_tag_set(&msb->tag_set);
 	msb->queue = NULL;
 
 	mutex_lock(&msb_disk_lock);
diff --git a/drivers/memstick/core/ms_block.h b/drivers/memstick/core/ms_block.h
index 53962c3b21df..9ba84e0ced63 100644
--- a/drivers/memstick/core/ms_block.h
+++ b/drivers/memstick/core/ms_block.h
@@ -152,6 +152,7 @@ struct msb_data {
 	struct gendisk			*disk;
 	struct request_queue		*queue;
 	spinlock_t			q_lock;
+	struct blk_mq_tag_set		tag_set;
 	struct hd_geometry		geometry;
 	struct attribute_group		attr_group;
 	struct request			*req;
-- 
2.17.1

* [PATCH 03/28] mspro_block: convert to blk-mq
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
  2018-10-25 21:10 ` [PATCH 01/28] sunvdc: convert to blk-mq Jens Axboe
  2018-10-25 21:10 ` [PATCH 02/28] ms_block: " Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-27 10:44   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 04/28] ide: " Jens Axboe
                   ` (26 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Straightforward conversion; there's room for improvement.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/memstick/core/mspro_block.c | 121 +++++++++++++++-------------
 1 file changed, 66 insertions(+), 55 deletions(-)

diff --git a/drivers/memstick/core/mspro_block.c b/drivers/memstick/core/mspro_block.c
index 0cd30dcb6801..aba50ec98b4d 100644
--- a/drivers/memstick/core/mspro_block.c
+++ b/drivers/memstick/core/mspro_block.c
@@ -12,7 +12,7 @@
  *
  */
 
-#include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/idr.h>
 #include <linux/hdreg.h>
 #include <linux/kthread.h>
@@ -142,6 +142,7 @@ struct mspro_block_data {
 	struct gendisk        *disk;
 	struct request_queue  *queue;
 	struct request        *block_req;
+	struct blk_mq_tag_set tag_set;
 	spinlock_t            q_lock;
 
 	unsigned short        page_size;
@@ -152,7 +153,6 @@ struct mspro_block_data {
 	unsigned char         system;
 	unsigned char         read_only:1,
 			      eject:1,
-			      has_request:1,
 			      data_dir:1,
 			      active:1;
 	unsigned char         transfer_cmd;
@@ -694,13 +694,12 @@ static void h_mspro_block_setup_cmd(struct memstick_dev *card, u64 offset,
 
 /*** Data transfer ***/
 
-static int mspro_block_issue_req(struct memstick_dev *card, int chunk)
+static int mspro_block_issue_req(struct memstick_dev *card, bool chunk)
 {
 	struct mspro_block_data *msb = memstick_get_drvdata(card);
 	u64 t_off;
 	unsigned int count;
 
-try_again:
 	while (chunk) {
 		msb->current_page = 0;
 		msb->current_seg = 0;
@@ -709,9 +708,17 @@ static int mspro_block_issue_req(struct memstick_dev *card, int chunk)
 					       msb->req_sg);
 
 		if (!msb->seg_count) {
-			chunk = __blk_end_request_cur(msb->block_req,
-					BLK_STS_RESOURCE);
-			continue;
+			unsigned int bytes = blk_rq_cur_bytes(msb->block_req);
+
+			chunk = blk_update_request(msb->block_req,
+							BLK_STS_RESOURCE,
+							bytes);
+			if (chunk)
+				continue;
+			__blk_mq_end_request(msb->block_req,
+						BLK_STS_RESOURCE);
+			msb->block_req = NULL;
+			break;
 		}
 
 		t_off = blk_rq_pos(msb->block_req);
@@ -729,30 +736,22 @@ static int mspro_block_issue_req(struct memstick_dev *card, int chunk)
 		return 0;
 	}
 
-	dev_dbg(&card->dev, "blk_fetch\n");
-	msb->block_req = blk_fetch_request(msb->queue);
-	if (!msb->block_req) {
-		dev_dbg(&card->dev, "issue end\n");
-		return -EAGAIN;
-	}
-
-	dev_dbg(&card->dev, "trying again\n");
-	chunk = 1;
-	goto try_again;
+	return 1;
 }
 
 static int mspro_block_complete_req(struct memstick_dev *card, int error)
 {
 	struct mspro_block_data *msb = memstick_get_drvdata(card);
-	int chunk, cnt;
+	int cnt;
+	bool chunk;
 	unsigned int t_len = 0;
 	unsigned long flags;
 
 	spin_lock_irqsave(&msb->q_lock, flags);
-	dev_dbg(&card->dev, "complete %d, %d\n", msb->has_request ? 1 : 0,
+	dev_dbg(&card->dev, "complete %d, %d\n", msb->block_req ? 1 : 0,
 		error);
 
-	if (msb->has_request) {
+	if (msb->block_req) {
 		/* Nothing to do - not really an error */
 		if (error == -EAGAIN)
 			error = 0;
@@ -777,15 +776,17 @@ static int mspro_block_complete_req(struct memstick_dev *card, int error)
 		if (error && !t_len)
 			t_len = blk_rq_cur_bytes(msb->block_req);
 
-		chunk = __blk_end_request(msb->block_req,
+		chunk = blk_update_request(msb->block_req,
 				errno_to_blk_status(error), t_len);
-
-		error = mspro_block_issue_req(card, chunk);
-
-		if (!error)
-			goto out;
-		else
-			msb->has_request = 0;
+		if (chunk) {
+			error = mspro_block_issue_req(card, chunk);
+			if (!error)
+				goto out;
+		} else {
+			__blk_mq_end_request(msb->block_req,
+						errno_to_blk_status(error));
+			msb->block_req = NULL;
+		}
 	} else {
 		if (!error)
 			error = -EAGAIN;
@@ -806,8 +807,8 @@ static void mspro_block_stop(struct memstick_dev *card)
 
 	while (1) {
 		spin_lock_irqsave(&msb->q_lock, flags);
-		if (!msb->has_request) {
-			blk_stop_queue(msb->queue);
+		if (!msb->block_req) {
+			blk_mq_stop_hw_queues(msb->queue);
 			rc = 1;
 		}
 		spin_unlock_irqrestore(&msb->q_lock, flags);
@@ -822,32 +823,37 @@ static void mspro_block_stop(struct memstick_dev *card)
 static void mspro_block_start(struct memstick_dev *card)
 {
 	struct mspro_block_data *msb = memstick_get_drvdata(card);
-	unsigned long flags;
 
-	spin_lock_irqsave(&msb->q_lock, flags);
-	blk_start_queue(msb->queue);
-	spin_unlock_irqrestore(&msb->q_lock, flags);
+	blk_mq_start_hw_queues(msb->queue);
 }
 
-static void mspro_block_submit_req(struct request_queue *q)
+static blk_status_t mspro_queue_rq(struct blk_mq_hw_ctx *hctx,
+				   const struct blk_mq_queue_data *bd)
 {
-	struct memstick_dev *card = q->queuedata;
+	struct memstick_dev *card = hctx->queue->queuedata;
 	struct mspro_block_data *msb = memstick_get_drvdata(card);
-	struct request *req = NULL;
 
-	if (msb->has_request)
-		return;
+	spin_lock_irq(&msb->q_lock);
 
-	if (msb->eject) {
-		while ((req = blk_fetch_request(q)) != NULL)
-			__blk_end_request_all(req, BLK_STS_IOERR);
+	if (msb->block_req) {
+		spin_unlock_irq(&msb->q_lock);
+		return BLK_STS_DEV_RESOURCE;
+	}
 
-		return;
+	if (msb->eject) {
+		spin_unlock_irq(&msb->q_lock);
+		blk_mq_start_request(bd->rq);
+		return BLK_STS_IOERR;
 	}
 
-	msb->has_request = 1;
-	if (mspro_block_issue_req(card, 0))
-		msb->has_request = 0;
+	msb->block_req = bd->rq;
+	blk_mq_start_request(bd->rq);
+
+	if (mspro_block_issue_req(card, true))
+		msb->block_req = NULL;
+
+	spin_unlock_irq(&msb->q_lock);
+	return BLK_STS_OK;
 }
 
 /*** Initialization ***/
@@ -1167,6 +1173,10 @@ static int mspro_block_init_card(struct memstick_dev *card)
 
 }
 
+static const struct blk_mq_ops mspro_mq_ops = {
+	.queue_rq	= mspro_queue_rq,
+};
+
 static int mspro_block_init_disk(struct memstick_dev *card)
 {
 	struct mspro_block_data *msb = memstick_get_drvdata(card);
@@ -1206,9 +1216,11 @@ static int mspro_block_init_disk(struct memstick_dev *card)
 		goto out_release_id;
 	}
 
-	msb->queue = blk_init_queue(mspro_block_submit_req, &msb->q_lock);
-	if (!msb->queue) {
-		rc = -ENOMEM;
+	msb->queue = blk_mq_init_sq_queue(&msb->tag_set, &mspro_mq_ops, 2,
+						BLK_MQ_F_SHOULD_MERGE);
+	if (IS_ERR(msb->queue)) {
+		rc = PTR_ERR(msb->queue);
+		msb->queue = NULL;
 		goto out_put_disk;
 	}
 
@@ -1318,13 +1330,14 @@ static void mspro_block_remove(struct memstick_dev *card)
 
 	spin_lock_irqsave(&msb->q_lock, flags);
 	msb->eject = 1;
-	blk_start_queue(msb->queue);
 	spin_unlock_irqrestore(&msb->q_lock, flags);
+	blk_mq_start_hw_queues(msb->queue);
 
 	del_gendisk(msb->disk);
 	dev_dbg(&card->dev, "mspro block remove\n");
 
 	blk_cleanup_queue(msb->queue);
+	blk_mq_free_tag_set(&msb->tag_set);
 	msb->queue = NULL;
 
 	sysfs_remove_group(&card->dev.kobj, &msb->attr_group);
@@ -1344,8 +1357,9 @@ static int mspro_block_suspend(struct memstick_dev *card, pm_message_t state)
 	struct mspro_block_data *msb = memstick_get_drvdata(card);
 	unsigned long flags;
 
+	blk_mq_stop_hw_queues(msb->queue);
+
 	spin_lock_irqsave(&msb->q_lock, flags);
-	blk_stop_queue(msb->queue);
 	msb->active = 0;
 	spin_unlock_irqrestore(&msb->q_lock, flags);
 
@@ -1355,7 +1369,6 @@ static int mspro_block_suspend(struct memstick_dev *card, pm_message_t state)
 static int mspro_block_resume(struct memstick_dev *card)
 {
 	struct mspro_block_data *msb = memstick_get_drvdata(card);
-	unsigned long flags;
 	int rc = 0;
 
 #ifdef CONFIG_MEMSTICK_UNSAFE_RESUME
@@ -1401,9 +1414,7 @@ static int mspro_block_resume(struct memstick_dev *card)
 
 #endif /* CONFIG_MEMSTICK_UNSAFE_RESUME */
 
-	spin_lock_irqsave(&msb->q_lock, flags);
-	blk_start_queue(msb->queue);
-	spin_unlock_irqrestore(&msb->q_lock, flags);
+	blk_mq_start_hw_queues(msb->queue);
 	return rc;
 }
 
-- 
2.17.1

* [PATCH 04/28] ide: convert to blk-mq
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (2 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 03/28] mspro_block: " Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-27 10:51   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 05/28] IB/srp: remove old request_fn_active check Jens Axboe
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe, David Miller

ide-disk and ide-cd have been tested as working just fine; ide-tape and
ide-floppy haven't been. But the latter don't require changes of their
own, so they should work without issue.

Add a helper function to insert a request from a work queue, since we
cannot invoke the blk-mq request insertion from IRQ context.
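
Condensed, the helper just parks the request on a per-drive list and
kicks a kblockd work item that performs the insertion from process
context. This is the ide_insert_request_head() / drive_rq_insert_work()
pair from the hunks below, trimmed for illustration:

void ide_insert_request_head(ide_drive_t *drive, struct request *rq)
{
        unsigned long flags;

        spin_lock_irqsave(&drive->hwif->lock, flags);
        list_add_tail(&rq->queuelist, &drive->rq_list);
        spin_unlock_irqrestore(&drive->hwif->lock, flags);

        /* defer the actual insertion to process context */
        kblockd_schedule_work(&drive->rq_work);
}

static void drive_rq_insert_work(struct work_struct *work)
{
        ide_drive_t *drive = container_of(work, ide_drive_t, rq_work);
        struct request *rq;
        LIST_HEAD(list);

        spin_lock_irq(&drive->hwif->lock);
        list_splice_init(&drive->rq_list, &list);
        spin_unlock_irq(&drive->hwif->lock);

        while (!list_empty(&list)) {
                rq = list_first_entry(&list, struct request, queuelist);
                list_del_init(&rq->queuelist);
                /* head insertion, no completion callback */
                blk_execute_rq_nowait(drive->queue, rq->rq_disk, rq, true, NULL);
        }
}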

Cc: David Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/ide/ide-atapi.c |  25 ++++--
 drivers/ide/ide-cd.c    | 175 +++++++++++++++++++++-------------------
 drivers/ide/ide-disk.c  |   5 +-
 drivers/ide/ide-io.c    | 101 +++++++++++++----------
 drivers/ide/ide-park.c  |   4 +-
 drivers/ide/ide-pm.c    |  28 ++-----
 drivers/ide/ide-probe.c |  68 +++++++++++-----
 include/linux/ide.h     |  13 ++-
 8 files changed, 240 insertions(+), 179 deletions(-)

diff --git a/drivers/ide/ide-atapi.c b/drivers/ide/ide-atapi.c
index 8b2b72b93885..33210bc67618 100644
--- a/drivers/ide/ide-atapi.c
+++ b/drivers/ide/ide-atapi.c
@@ -172,8 +172,8 @@ EXPORT_SYMBOL_GPL(ide_create_request_sense_cmd);
 void ide_prep_sense(ide_drive_t *drive, struct request *rq)
 {
 	struct request_sense *sense = &drive->sense_data;
-	struct request *sense_rq = drive->sense_rq;
-	struct scsi_request *req = scsi_req(sense_rq);
+	struct request *sense_rq;
+	struct scsi_request *req;
 	unsigned int cmd_len, sense_len;
 	int err;
 
@@ -196,9 +196,16 @@ void ide_prep_sense(ide_drive_t *drive, struct request *rq)
 	if (ata_sense_request(rq) || drive->sense_rq_armed)
 		return;
 
+	sense_rq = drive->sense_rq;
+	if (!sense_rq) {
+		sense_rq = blk_mq_alloc_request(drive->queue, REQ_OP_DRV_IN,
+					BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
+		drive->sense_rq = sense_rq;
+	}
+	req = scsi_req(sense_rq);
+
 	memset(sense, 0, sizeof(*sense));
 
-	blk_rq_init(rq->q, sense_rq);
 	scsi_req_init(req);
 
 	err = blk_rq_map_kern(drive->queue, sense_rq, sense, sense_len,
@@ -207,6 +214,8 @@ void ide_prep_sense(ide_drive_t *drive, struct request *rq)
 		if (printk_ratelimit())
 			printk(KERN_WARNING PFX "%s: failed to map sense "
 					    "buffer\n", drive->name);
+		blk_mq_free_request(sense_rq);
+		drive->sense_rq = NULL;
 		return;
 	}
 
@@ -226,6 +235,8 @@ EXPORT_SYMBOL_GPL(ide_prep_sense);
 
 int ide_queue_sense_rq(ide_drive_t *drive, void *special)
 {
+	struct request *sense_rq = drive->sense_rq;
+
 	/* deferred failure from ide_prep_sense() */
 	if (!drive->sense_rq_armed) {
 		printk(KERN_WARNING PFX "%s: error queuing a sense request\n",
@@ -233,12 +244,12 @@ int ide_queue_sense_rq(ide_drive_t *drive, void *special)
 		return -ENOMEM;
 	}
 
-	drive->sense_rq->special = special;
+	sense_rq->special = special;
 	drive->sense_rq_armed = false;
 
 	drive->hwif->rq = NULL;
 
-	elv_add_request(drive->queue, drive->sense_rq, ELEVATOR_INSERT_FRONT);
+	ide_insert_request_head(drive, sense_rq);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ide_queue_sense_rq);
@@ -270,10 +281,8 @@ void ide_retry_pc(ide_drive_t *drive)
 	 */
 	drive->hwif->rq = NULL;
 	ide_requeue_and_plug(drive, failed_rq);
-	if (ide_queue_sense_rq(drive, pc)) {
-		blk_start_request(failed_rq);
+	if (ide_queue_sense_rq(drive, pc))
 		ide_complete_rq(drive, BLK_STS_IOERR, blk_rq_bytes(failed_rq));
-	}
 }
 EXPORT_SYMBOL_GPL(ide_retry_pc);
 
diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index f9b59d41813f..4ecaf2ace4cb 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -258,11 +258,22 @@ static int ide_cd_breathe(ide_drive_t *drive, struct request *rq)
 		/*
 		 * take a breather
 		 */
-		blk_delay_queue(drive->queue, 1);
+		blk_mq_requeue_request(rq, false);
+		blk_mq_delay_kick_requeue_list(drive->queue, 1);
 		return 1;
 	}
 }
 
+static void ide_cd_free_sense(ide_drive_t *drive)
+{
+	if (!drive->sense_rq)
+		return;
+
+	blk_mq_free_request(drive->sense_rq);
+	drive->sense_rq = NULL;
+	drive->sense_rq_armed = false;
+}
+
 /**
  * Returns:
  * 0: if the request should be continued.
@@ -516,6 +527,82 @@ static bool ide_cd_error_cmd(ide_drive_t *drive, struct ide_cmd *cmd)
 	return false;
 }
 
+/* standard prep_rq_fn that builds 10 byte cmds */
+static int ide_cdrom_prep_fs(struct request_queue *q, struct request *rq)
+{
+	int hard_sect = queue_logical_block_size(q);
+	long block = (long)blk_rq_pos(rq) / (hard_sect >> 9);
+	unsigned long blocks = blk_rq_sectors(rq) / (hard_sect >> 9);
+	struct scsi_request *req = scsi_req(rq);
+
+	if (rq_data_dir(rq) == READ)
+		req->cmd[0] = GPCMD_READ_10;
+	else
+		req->cmd[0] = GPCMD_WRITE_10;
+
+	/*
+	 * fill in lba
+	 */
+	req->cmd[2] = (block >> 24) & 0xff;
+	req->cmd[3] = (block >> 16) & 0xff;
+	req->cmd[4] = (block >>  8) & 0xff;
+	req->cmd[5] = block & 0xff;
+
+	/*
+	 * and transfer length
+	 */
+	req->cmd[7] = (blocks >> 8) & 0xff;
+	req->cmd[8] = blocks & 0xff;
+	req->cmd_len = 10;
+	return BLKPREP_OK;
+}
+
+/*
+ * Most of the SCSI commands are supported directly by ATAPI devices.
+ * This transform handles the few exceptions.
+ */
+static int ide_cdrom_prep_pc(struct request *rq)
+{
+	u8 *c = scsi_req(rq)->cmd;
+
+	/* transform 6-byte read/write commands to the 10-byte version */
+	if (c[0] == READ_6 || c[0] == WRITE_6) {
+		c[8] = c[4];
+		c[5] = c[3];
+		c[4] = c[2];
+		c[3] = c[1] & 0x1f;
+		c[2] = 0;
+		c[1] &= 0xe0;
+		c[0] += (READ_10 - READ_6);
+		scsi_req(rq)->cmd_len = 10;
+		return BLKPREP_OK;
+	}
+
+	/*
+	 * it's silly to pretend we understand 6-byte sense commands, just
+	 * reject with ILLEGAL_REQUEST and the caller should take the
+	 * appropriate action
+	 */
+	if (c[0] == MODE_SENSE || c[0] == MODE_SELECT) {
+		scsi_req(rq)->result = ILLEGAL_REQUEST;
+		return BLKPREP_KILL;
+	}
+
+	return BLKPREP_OK;
+}
+
+static int ide_cdrom_prep_fn(ide_drive_t *drive, struct request *rq)
+{
+	if (!blk_rq_is_passthrough(rq)) {
+		scsi_req_init(scsi_req(rq));
+
+		return ide_cdrom_prep_fs(drive->queue, rq);
+	} else if (blk_rq_is_scsi(rq))
+		return ide_cdrom_prep_pc(rq);
+
+	return 0;
+}
+
 static ide_startstop_t cdrom_newpc_intr(ide_drive_t *drive)
 {
 	ide_hwif_t *hwif = drive->hwif;
@@ -675,7 +762,7 @@ static ide_startstop_t cdrom_newpc_intr(ide_drive_t *drive)
 out_end:
 	if (blk_rq_is_scsi(rq) && rc == 0) {
 		scsi_req(rq)->resid_len = 0;
-		blk_end_request_all(rq, BLK_STS_OK);
+		blk_mq_end_request(rq, BLK_STS_OK);
 		hwif->rq = NULL;
 	} else {
 		if (sense && uptodate)
@@ -705,6 +792,8 @@ static ide_startstop_t cdrom_newpc_intr(ide_drive_t *drive)
 		if (sense && rc == 2)
 			ide_error(drive, "request sense failure", stat);
 	}
+
+	ide_cd_free_sense(drive);
 	return ide_stopped;
 }
 
@@ -729,7 +818,7 @@ static ide_startstop_t cdrom_start_rw(ide_drive_t *drive, struct request *rq)
 		 * We may be retrying this request after an error.  Fix up any
 		 * weirdness which might be present in the request packet.
 		 */
-		q->prep_rq_fn(q, rq);
+		ide_cdrom_prep_fn(drive, rq);
 	}
 
 	/* fs requests *must* be hardware frame aligned */
@@ -1323,82 +1412,6 @@ static int ide_cdrom_probe_capabilities(ide_drive_t *drive)
 	return nslots;
 }
 
-/* standard prep_rq_fn that builds 10 byte cmds */
-static int ide_cdrom_prep_fs(struct request_queue *q, struct request *rq)
-{
-	int hard_sect = queue_logical_block_size(q);
-	long block = (long)blk_rq_pos(rq) / (hard_sect >> 9);
-	unsigned long blocks = blk_rq_sectors(rq) / (hard_sect >> 9);
-	struct scsi_request *req = scsi_req(rq);
-
-	q->initialize_rq_fn(rq);
-
-	if (rq_data_dir(rq) == READ)
-		req->cmd[0] = GPCMD_READ_10;
-	else
-		req->cmd[0] = GPCMD_WRITE_10;
-
-	/*
-	 * fill in lba
-	 */
-	req->cmd[2] = (block >> 24) & 0xff;
-	req->cmd[3] = (block >> 16) & 0xff;
-	req->cmd[4] = (block >>  8) & 0xff;
-	req->cmd[5] = block & 0xff;
-
-	/*
-	 * and transfer length
-	 */
-	req->cmd[7] = (blocks >> 8) & 0xff;
-	req->cmd[8] = blocks & 0xff;
-	req->cmd_len = 10;
-	return BLKPREP_OK;
-}
-
-/*
- * Most of the SCSI commands are supported directly by ATAPI devices.
- * This transform handles the few exceptions.
- */
-static int ide_cdrom_prep_pc(struct request *rq)
-{
-	u8 *c = scsi_req(rq)->cmd;
-
-	/* transform 6-byte read/write commands to the 10-byte version */
-	if (c[0] == READ_6 || c[0] == WRITE_6) {
-		c[8] = c[4];
-		c[5] = c[3];
-		c[4] = c[2];
-		c[3] = c[1] & 0x1f;
-		c[2] = 0;
-		c[1] &= 0xe0;
-		c[0] += (READ_10 - READ_6);
-		scsi_req(rq)->cmd_len = 10;
-		return BLKPREP_OK;
-	}
-
-	/*
-	 * it's silly to pretend we understand 6-byte sense commands, just
-	 * reject with ILLEGAL_REQUEST and the caller should take the
-	 * appropriate action
-	 */
-	if (c[0] == MODE_SENSE || c[0] == MODE_SELECT) {
-		scsi_req(rq)->result = ILLEGAL_REQUEST;
-		return BLKPREP_KILL;
-	}
-
-	return BLKPREP_OK;
-}
-
-static int ide_cdrom_prep_fn(struct request_queue *q, struct request *rq)
-{
-	if (!blk_rq_is_passthrough(rq))
-		return ide_cdrom_prep_fs(q, rq);
-	else if (blk_rq_is_scsi(rq))
-		return ide_cdrom_prep_pc(rq);
-
-	return 0;
-}
-
 struct cd_list_entry {
 	const char	*id_model;
 	const char	*id_firmware;
@@ -1508,7 +1521,7 @@ static int ide_cdrom_setup(ide_drive_t *drive)
 
 	ide_debug_log(IDE_DBG_PROBE, "enter");
 
-	blk_queue_prep_rq(q, ide_cdrom_prep_fn);
+	drive->prep_rq = ide_cdrom_prep_fn;
 	blk_queue_dma_alignment(q, 31);
 	blk_queue_update_dma_pad(q, 15);
 
@@ -1569,7 +1582,7 @@ static void ide_cd_release(struct device *dev)
 	if (devinfo->handle == drive)
 		unregister_cdrom(devinfo);
 	drive->driver_data = NULL;
-	blk_queue_prep_rq(drive->queue, NULL);
+	drive->prep_rq = NULL;
 	g->private_data = NULL;
 	put_disk(g);
 	kfree(info);
diff --git a/drivers/ide/ide-disk.c b/drivers/ide/ide-disk.c
index e3b4e659082d..f8567c8c9dd1 100644
--- a/drivers/ide/ide-disk.c
+++ b/drivers/ide/ide-disk.c
@@ -427,9 +427,8 @@ static void ide_disk_unlock_native_capacity(ide_drive_t *drive)
 		drive->dev_flags |= IDE_DFLAG_NOHPA; /* disable HPA on resume */
 }
 
-static int idedisk_prep_fn(struct request_queue *q, struct request *rq)
+static int idedisk_prep_fn(ide_drive_t *drive, struct request *rq)
 {
-	ide_drive_t *drive = q->queuedata;
 	struct ide_cmd *cmd;
 
 	if (req_op(rq) != REQ_OP_FLUSH)
@@ -548,7 +547,7 @@ static void update_flush(ide_drive_t *drive)
 
 		if (barrier) {
 			wc = true;
-			blk_queue_prep_rq(drive->queue, idedisk_prep_fn);
+			drive->prep_rq = idedisk_prep_fn;
 		}
 	}
 
diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index 0d93e0cfbeaf..01e7195507a2 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -67,7 +67,15 @@ int ide_end_rq(ide_drive_t *drive, struct request *rq, blk_status_t error,
 		ide_dma_on(drive);
 	}
 
-	return blk_end_request(rq, error, nr_bytes);
+	if (!blk_update_request(rq, error, nr_bytes)) {
+		if (rq == drive->sense_rq)
+			drive->sense_rq = NULL;
+
+		__blk_mq_end_request(rq, error);
+		return 0;
+	}
+
+	return 1;
 }
 EXPORT_SYMBOL_GPL(ide_end_rq);
 
@@ -307,8 +315,6 @@ static ide_startstop_t start_request (ide_drive_t *drive, struct request *rq)
 {
 	ide_startstop_t startstop;
 
-	BUG_ON(!(rq->rq_flags & RQF_STARTED));
-
 #ifdef DEBUG
 	printk("%s: start_request: current=0x%08lx\n",
 		drive->hwif->name, (unsigned long) rq);
@@ -320,6 +326,9 @@ static ide_startstop_t start_request (ide_drive_t *drive, struct request *rq)
 		goto kill_rq;
 	}
 
+	if (drive->prep_rq && drive->prep_rq(drive, rq))
+		return ide_stopped;
+
 	if (ata_pm_request(rq))
 		ide_check_pm_state(drive, rq);
 
@@ -430,44 +439,38 @@ static inline void ide_unlock_host(struct ide_host *host)
 	}
 }
 
-static void __ide_requeue_and_plug(struct request_queue *q, struct request *rq)
-{
-	if (rq)
-		blk_requeue_request(q, rq);
-	if (rq || blk_peek_request(q)) {
-		/* Use 3ms as that was the old plug delay */
-		blk_delay_queue(q, 3);
-	}
-}
-
 void ide_requeue_and_plug(ide_drive_t *drive, struct request *rq)
 {
 	struct request_queue *q = drive->queue;
-	unsigned long flags;
 
-	spin_lock_irqsave(q->queue_lock, flags);
-	__ide_requeue_and_plug(q, rq);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	/* Use 3ms as that was the old plug delay */
+	if (rq) {
+		blk_mq_requeue_request(rq, false);
+		blk_mq_delay_kick_requeue_list(q, 3);
+	} else
+		blk_mq_delay_run_hw_queue(q->queue_hw_ctx[0], 3);
 }
 
 /*
  * Issue a new request to a device.
  */
-void do_ide_request(struct request_queue *q)
+blk_status_t ide_queue_rq(struct blk_mq_hw_ctx *hctx,
+			  const struct blk_mq_queue_data *bd)
 {
-	ide_drive_t	*drive = q->queuedata;
+	ide_drive_t	*drive = hctx->queue->queuedata;
 	ide_hwif_t	*hwif = drive->hwif;
 	struct ide_host *host = hwif->host;
 	struct request	*rq = NULL;
 	ide_startstop_t	startstop;
 
-	spin_unlock_irq(q->queue_lock);
-
 	/* HLD do_request() callback might sleep, make sure it's okay */
 	might_sleep();
 
 	if (ide_lock_host(host, hwif))
-		goto plug_device_2;
+		return BLK_STS_DEV_RESOURCE;
+
+	rq = bd->rq;
+	blk_mq_start_request(rq);
 
 	spin_lock_irq(&hwif->lock);
 
@@ -503,21 +506,16 @@ void do_ide_request(struct request_queue *q)
 		hwif->cur_dev = drive;
 		drive->dev_flags &= ~(IDE_DFLAG_SLEEPING | IDE_DFLAG_PARKED);
 
-		spin_unlock_irq(&hwif->lock);
-		spin_lock_irq(q->queue_lock);
 		/*
 		 * we know that the queue isn't empty, but this can happen
 		 * if the q->prep_rq_fn() decides to kill a request
 		 */
-		if (!rq)
-			rq = blk_fetch_request(drive->queue);
-
-		spin_unlock_irq(q->queue_lock);
-		spin_lock_irq(&hwif->lock);
-
 		if (!rq) {
-			ide_unlock_port(hwif);
-			goto out;
+			rq = bd->rq;
+			if (!rq) {
+				ide_unlock_port(hwif);
+				goto out;
+			}
 		}
 
 		/*
@@ -551,23 +549,23 @@ void do_ide_request(struct request_queue *q)
 		if (startstop == ide_stopped) {
 			rq = hwif->rq;
 			hwif->rq = NULL;
-			goto repeat;
+			if (rq)
+				goto repeat;
+			goto out;
 		}
-	} else
-		goto plug_device;
+	} else {
+plug_device:
+		spin_unlock_irq(&hwif->lock);
+		ide_unlock_host(host);
+		ide_requeue_and_plug(drive, rq);
+		return BLK_STS_OK;
+	}
+
 out:
 	spin_unlock_irq(&hwif->lock);
 	if (rq == NULL)
 		ide_unlock_host(host);
-	spin_lock_irq(q->queue_lock);
-	return;
-
-plug_device:
-	spin_unlock_irq(&hwif->lock);
-	ide_unlock_host(host);
-plug_device_2:
-	spin_lock_irq(q->queue_lock);
-	__ide_requeue_and_plug(q, rq);
+	return BLK_STS_OK;
 }
 
 static int drive_is_ready(ide_drive_t *drive)
@@ -617,6 +615,8 @@ void ide_timer_expiry (struct timer_list *t)
 	int		plug_device = 0;
 	struct request	*uninitialized_var(rq_in_flight);
 
+	return;
+
 	spin_lock_irqsave(&hwif->lock, flags);
 
 	handler = hwif->handler;
@@ -887,3 +887,16 @@ void ide_pad_transfer(ide_drive_t *drive, int write, int len)
 	}
 }
 EXPORT_SYMBOL_GPL(ide_pad_transfer);
+
+void ide_insert_request_head(ide_drive_t *drive, struct request *rq)
+{
+	ide_hwif_t *hwif = drive->hwif;
+	unsigned long flags;
+
+	spin_lock_irqsave(&hwif->lock, flags);
+	list_add_tail(&rq->queuelist, &drive->rq_list);
+	spin_unlock_irqrestore(&hwif->lock, flags);
+
+	kblockd_schedule_work(&drive->rq_work);
+}
+EXPORT_SYMBOL_GPL(ide_insert_request_head);
diff --git a/drivers/ide/ide-park.c b/drivers/ide/ide-park.c
index 622f0edb3945..de9e85cf74d1 100644
--- a/drivers/ide/ide-park.c
+++ b/drivers/ide/ide-park.c
@@ -27,7 +27,7 @@ static void issue_park_cmd(ide_drive_t *drive, unsigned long timeout)
 		spin_unlock_irq(&hwif->lock);
 
 		if (start_queue)
-			blk_run_queue(q);
+			blk_mq_run_hw_queues(q, true);
 		return;
 	}
 	spin_unlock_irq(&hwif->lock);
@@ -54,7 +54,7 @@ static void issue_park_cmd(ide_drive_t *drive, unsigned long timeout)
 	scsi_req(rq)->cmd[0] = REQ_UNPARK_HEADS;
 	scsi_req(rq)->cmd_len = 1;
 	ide_req(rq)->type = ATA_PRIV_MISC;
-	elv_add_request(q, rq, ELEVATOR_INSERT_FRONT);
+	ide_insert_request_head(drive, rq);
 
 out:
 	return;
diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c
index 59217aa1d1fb..ea10507e5190 100644
--- a/drivers/ide/ide-pm.c
+++ b/drivers/ide/ide-pm.c
@@ -40,32 +40,20 @@ int generic_ide_suspend(struct device *dev, pm_message_t mesg)
 	return ret;
 }
 
-static void ide_end_sync_rq(struct request *rq, blk_status_t error)
-{
-	complete(rq->end_io_data);
-}
-
 static int ide_pm_execute_rq(struct request *rq)
 {
 	struct request_queue *q = rq->q;
-	DECLARE_COMPLETION_ONSTACK(wait);
-
-	rq->end_io_data = &wait;
-	rq->end_io = ide_end_sync_rq;
 
 	spin_lock_irq(q->queue_lock);
 	if (unlikely(blk_queue_dying(q))) {
 		rq->rq_flags |= RQF_QUIET;
 		scsi_req(rq)->result = -ENXIO;
-		__blk_end_request_all(rq, BLK_STS_OK);
 		spin_unlock_irq(q->queue_lock);
+		blk_mq_end_request(rq, BLK_STS_OK);
 		return -ENXIO;
 	}
-	__elv_add_request(q, rq, ELEVATOR_INSERT_FRONT);
-	__blk_run_queue_uncond(q);
 	spin_unlock_irq(q->queue_lock);
-
-	wait_for_completion_io(&wait);
+	blk_execute_rq(q, NULL, rq, true);
 
 	return scsi_req(rq)->result ? -EIO : 0;
 }
@@ -79,6 +67,8 @@ int generic_ide_resume(struct device *dev)
 	struct ide_pm_state rqpm;
 	int err;
 
+	blk_mq_start_stopped_hw_queues(drive->queue, true);
+
 	if (ide_port_acpi(hwif)) {
 		/* call ACPI _PS0 / _STM only once */
 		if ((drive->dn & 1) == 0 || pair == NULL) {
@@ -226,15 +216,14 @@ void ide_complete_pm_rq(ide_drive_t *drive, struct request *rq)
 #endif
 	spin_lock_irqsave(q->queue_lock, flags);
 	if (ide_req(rq)->type == ATA_PRIV_PM_SUSPEND)
-		blk_stop_queue(q);
+		blk_mq_stop_hw_queues(q);
 	else
 		drive->dev_flags &= ~IDE_DFLAG_BLOCKED;
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
 	drive->hwif->rq = NULL;
 
-	if (blk_end_request(rq, BLK_STS_OK, 0))
-		BUG();
+	blk_mq_end_request(rq, BLK_STS_OK);
 }
 
 void ide_check_pm_state(ide_drive_t *drive, struct request *rq)
@@ -260,7 +249,6 @@ void ide_check_pm_state(ide_drive_t *drive, struct request *rq)
 		ide_hwif_t *hwif = drive->hwif;
 		const struct ide_tp_ops *tp_ops = hwif->tp_ops;
 		struct request_queue *q = drive->queue;
-		unsigned long flags;
 		int rc;
 #ifdef DEBUG_PM
 		printk("%s: Wakeup request inited, waiting for !BSY...\n", drive->name);
@@ -274,8 +262,6 @@ void ide_check_pm_state(ide_drive_t *drive, struct request *rq)
 		if (rc)
 			printk(KERN_WARNING "%s: drive not ready on wakeup\n", drive->name);
 
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
+		blk_mq_start_hw_queues(q);
 	}
 }
diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 3b75a7b7a284..0089995b091f 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -750,6 +750,11 @@ static void ide_initialize_rq(struct request *rq)
 	req->sreq.sense = req->sense;
 }
 
+static const struct blk_mq_ops ide_mq_ops = {
+	.queue_rq		= ide_queue_rq,
+	.initialize_rq_fn	= ide_initialize_rq,
+};
+
 /*
  * init request queue
  */
@@ -759,6 +764,7 @@ static int ide_init_queue(ide_drive_t *drive)
 	ide_hwif_t *hwif = drive->hwif;
 	int max_sectors = 256;
 	int max_sg_entries = PRD_ENTRIES;
+	struct blk_mq_tag_set *set;
 
 	/*
 	 *	Our default set up assumes the normal IDE case,
@@ -767,19 +773,26 @@ static int ide_init_queue(ide_drive_t *drive)
 	 *	limits and LBA48 we could raise it but as yet
 	 *	do not.
 	 */
-	q = blk_alloc_queue_node(GFP_KERNEL, hwif_to_node(hwif), NULL);
-	if (!q)
+
+	set = &drive->tag_set;
+	set->ops = &ide_mq_ops;
+	set->nr_hw_queues = 1;
+	set->queue_depth = 32;
+	set->reserved_tags = 1;
+	set->cmd_size = sizeof(struct ide_request);
+	set->numa_node = hwif_to_node(hwif);
+	set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
+	if (blk_mq_alloc_tag_set(set))
 		return 1;
 
-	q->request_fn = do_ide_request;
-	q->initialize_rq_fn = ide_initialize_rq;
-	q->cmd_size = sizeof(struct ide_request);
-	blk_queue_flag_set(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
-	if (blk_init_allocated_queue(q) < 0) {
-		blk_cleanup_queue(q);
+	q = blk_mq_init_queue(set);
+	if (IS_ERR(q)) {
+		blk_mq_free_tag_set(set);
 		return 1;
 	}
 
+	blk_queue_flag_set(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
+
 	q->queuedata = drive;
 	blk_queue_segment_boundary(q, 0xffff);
 
@@ -965,6 +978,10 @@ static void drive_release_dev (struct device *dev)
 
 	ide_proc_unregister_device(drive);
 
+	if (drive->sense_rq)
+		blk_mq_free_request(drive->sense_rq);
+
+	blk_mq_free_tag_set(&drive->tag_set);
 	blk_cleanup_queue(drive->queue);
 	drive->queue = NULL;
 
@@ -1133,6 +1150,28 @@ static void ide_port_cable_detect(ide_hwif_t *hwif)
 	}
 }
 
+/*
+ * Deferred request list insertion handler
+ */
+static void drive_rq_insert_work(struct work_struct *work)
+{
+	ide_drive_t *drive = container_of(work, ide_drive_t, rq_work);
+	ide_hwif_t *hwif = drive->hwif;
+	struct request *rq;
+	LIST_HEAD(list);
+
+	spin_lock_irq(&hwif->lock);
+	if (!list_empty(&drive->rq_list))
+		list_splice_init(&drive->rq_list, &list);
+	spin_unlock_irq(&hwif->lock);
+
+	while (!list_empty(&list)) {
+		rq = list_first_entry(&list, struct request, queuelist);
+		list_del_init(&rq->queuelist);
+		blk_execute_rq_nowait(drive->queue, rq->rq_disk, rq, true, NULL);
+	}
+}
+
 static const u8 ide_hwif_to_major[] =
 	{ IDE0_MAJOR, IDE1_MAJOR, IDE2_MAJOR, IDE3_MAJOR, IDE4_MAJOR,
 	  IDE5_MAJOR, IDE6_MAJOR, IDE7_MAJOR, IDE8_MAJOR, IDE9_MAJOR };
@@ -1145,12 +1184,10 @@ static void ide_port_init_devices_data(ide_hwif_t *hwif)
 	ide_port_for_each_dev(i, drive, hwif) {
 		u8 j = (hwif->index * MAX_DRIVES) + i;
 		u16 *saved_id = drive->id;
-		struct request *saved_sense_rq = drive->sense_rq;
 
 		memset(drive, 0, sizeof(*drive));
 		memset(saved_id, 0, SECTOR_SIZE);
 		drive->id = saved_id;
-		drive->sense_rq = saved_sense_rq;
 
 		drive->media			= ide_disk;
 		drive->select			= (i << 4) | ATA_DEVICE_OBS;
@@ -1166,6 +1203,9 @@ static void ide_port_init_devices_data(ide_hwif_t *hwif)
 
 		INIT_LIST_HEAD(&drive->list);
 		init_completion(&drive->gendev_rel_comp);
+
+		INIT_WORK(&drive->rq_work, drive_rq_insert_work);
+		INIT_LIST_HEAD(&drive->rq_list);
 	}
 }
 
@@ -1255,7 +1295,6 @@ static void ide_port_free_devices(ide_hwif_t *hwif)
 	int i;
 
 	ide_port_for_each_dev(i, drive, hwif) {
-		kfree(drive->sense_rq);
 		kfree(drive->id);
 		kfree(drive);
 	}
@@ -1283,17 +1322,10 @@ static int ide_port_alloc_devices(ide_hwif_t *hwif, int node)
 		if (drive->id == NULL)
 			goto out_free_drive;
 
-		drive->sense_rq = kmalloc(sizeof(struct request) +
-				sizeof(struct ide_request), GFP_KERNEL);
-		if (!drive->sense_rq)
-			goto out_free_id;
-
 		hwif->devices[i] = drive;
 	}
 	return 0;
 
-out_free_id:
-	kfree(drive->id);
 out_free_drive:
 	kfree(drive);
 out_nomem:
diff --git a/include/linux/ide.h b/include/linux/ide.h
index c74b0321922a..079f8bc0b0f4 100644
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -10,7 +10,7 @@
 #include <linux/init.h>
 #include <linux/ioport.h>
 #include <linux/ata.h>
-#include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/proc_fs.h>
 #include <linux/interrupt.h>
 #include <linux/bitops.h>
@@ -529,6 +529,10 @@ struct ide_drive_s {
 
 	struct request_queue	*queue;	/* request queue */
 
+	int (*prep_rq)(struct ide_drive_s *, struct request *);
+
+	struct blk_mq_tag_set	tag_set;
+
 	struct request		*rq;	/* current request */
 	void		*driver_data;	/* extra driver data */
 	u16			*id;	/* identification info */
@@ -612,6 +616,10 @@ struct ide_drive_s {
 	bool sense_rq_armed;
 	struct request *sense_rq;
 	struct request_sense sense_data;
+
+	/* async sense insertion */
+	struct work_struct rq_work;
+	struct list_head rq_list;
 };
 
 typedef struct ide_drive_s ide_drive_t;
@@ -1089,6 +1097,7 @@ extern int ide_pci_clk;
 
 int ide_end_rq(ide_drive_t *, struct request *, blk_status_t, unsigned int);
 void ide_kill_rq(ide_drive_t *, struct request *);
+void ide_insert_request_head(ide_drive_t *, struct request *);
 
 void __ide_set_handler(ide_drive_t *, ide_handler_t *, unsigned int);
 void ide_set_handler(ide_drive_t *, ide_handler_t *, unsigned int);
@@ -1208,7 +1217,7 @@ extern void ide_stall_queue(ide_drive_t *drive, unsigned long timeout);
 
 extern void ide_timer_expiry(struct timer_list *t);
 extern irqreturn_t ide_intr(int irq, void *dev_id);
-extern void do_ide_request(struct request_queue *);
+extern blk_status_t ide_queue_rq(struct blk_mq_hw_ctx *, const struct blk_mq_queue_data *);
 extern void ide_requeue_and_plug(ide_drive_t *drive, struct request *rq);
 
 void ide_init_disk(struct gendisk *, ide_drive_t *);
-- 
2.17.1

* [PATCH 05/28] IB/srp: remove old request_fn_active check
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (3 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 04/28] ide: " Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-25 21:23   ` Bart Van Assche
  2018-10-26  7:08   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 06/28] blk-mq: remove the request_list usage Jens Axboe
                   ` (24 subsequent siblings)
  29 siblings, 2 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide
  Cc: Jens Axboe, Bart Van Assche, Parav Pandit

This check is only viable for the non-scsi-mq path. Since that path is
going away, kill this legacy check.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Parav Pandit <parav@mellanox.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 0b34e909505f..5a79444c2f3c 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport *rport)
 	struct scsi_device *sdev;
 	int i, j;
 
-	/*
-	 * Invoking srp_terminate_io() while srp_queuecommand() is running
-	 * is not safe. Hence the warning statement below.
-	 */
-	shost_for_each_device(sdev, shost)
-		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
-
 	for (i = 0; i < target->ch_count; i++) {
 		ch = &target->ch[i];
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 06/28] blk-mq: remove the request_list usage
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (4 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 05/28] IB/srp: remove old request_fn_active check Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-27 10:52   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 07/28] blk-mq: remove legacy check in queue blk_freeze_queue() Jens Axboe
                   ` (23 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

We don't do anything with it; it's only there for the legacy path.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3f91c6e5b17a..4c82dc44d4d8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -510,9 +510,6 @@ void blk_mq_free_request(struct request *rq)
 
 	rq_qos_done(q, rq);
 
-	if (blk_rq_rl(rq))
-		blk_put_rl(blk_rq_rl(rq));
-
 	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 	if (refcount_dec_and_test(&rq->ref))
 		__blk_mq_free_request(rq);
@@ -1675,8 +1672,6 @@ static void blk_mq_bio_to_request(struct request *rq, struct bio *bio)
 {
 	blk_init_request_from_bio(rq, bio);
 
-	blk_rq_set_rl(rq, blk_get_rl(rq->q, bio));
-
 	blk_account_io_start(rq, true);
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 07/28] blk-mq: remove legacy check in queue blk_freeze_queue()
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (5 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 06/28] blk-mq: remove the request_list usage Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-27 10:52   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 08/28] scsi: kill off the legacy IO path Jens Axboe
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4c82dc44d4d8..a58d2d953876 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -177,8 +177,6 @@ void blk_freeze_queue(struct request_queue *q)
 	 * exported to drivers as the only user for unfreeze is blk_mq.
 	 */
 	blk_freeze_queue_start(q);
-	if (!q->mq_ops)
-		blk_drain_queue(q);
 	blk_mq_freeze_queue_wait(q);
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread
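
With the legacy branch gone, blk_freeze_queue() is purely the
start-then-wait sequence shown in the hunk above. The usual caller pattern
looks roughly like the sketch below; mydrv_update_limits() and max_sectors
are placeholder names, while the freeze/unfreeze and limit helpers are
existing block layer calls.

#include <linux/blk-mq.h>
#include <linux/blkdev.h>

static void mydrv_update_limits(struct request_queue *q, unsigned int max_sectors)
{
	blk_mq_freeze_queue(q);			/* block new requests, wait for in-flight ones */
	blk_queue_max_hw_sectors(q, max_sectors);	/* example change that wants a frozen queue */
	blk_mq_unfreeze_queue(q);		/* resume request allocation and dispatch */
}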

* [PATCH 08/28] scsi: kill off the legacy IO path
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (6 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 07/28] blk-mq: remove legacy check in queue blk_freeze_queue() Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-25 21:36   ` Bart Van Assche
  2018-10-29  6:48   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 09/28] dm: remove " Jens Axboe
                   ` (21 subsequent siblings)
  29 siblings, 2 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 Documentation/scsi/scsi-parameters.txt |   5 -
 drivers/scsi/Kconfig                   |  12 -
 drivers/scsi/cxlflash/main.c           |   6 -
 drivers/scsi/hosts.c                   |  29 +-
 drivers/scsi/lpfc/lpfc_scsi.c          |   2 +-
 drivers/scsi/qedi/qedi_main.c          |   3 +-
 drivers/scsi/qla2xxx/qla_os.c          |  30 +-
 drivers/scsi/scsi.c                    |   5 +-
 drivers/scsi/scsi_debug.c              |   3 +-
 drivers/scsi/scsi_error.c              |   2 +-
 drivers/scsi/scsi_lib.c                | 624 ++-----------------------
 drivers/scsi/scsi_priv.h               |   1 -
 drivers/scsi/scsi_scan.c               |  10 +-
 drivers/scsi/scsi_sysfs.c              |   8 +-
 drivers/scsi/ufs/ufshcd.c              |   6 -
 include/scsi/scsi_host.h               |  18 +-
 include/scsi/scsi_tcq.h                |  14 +-
 17 files changed, 73 insertions(+), 705 deletions(-)

diff --git a/Documentation/scsi/scsi-parameters.txt b/Documentation/scsi/scsi-parameters.txt
index 92999d4e0cb8..25a4b4cf04a6 100644
--- a/Documentation/scsi/scsi-parameters.txt
+++ b/Documentation/scsi/scsi-parameters.txt
@@ -97,11 +97,6 @@ parameters may be changed at runtime by the command
 			allowing boot to proceed.  none ignores them, expecting
 			user space to do the scan.
 
-	scsi_mod.use_blk_mq=
-			[SCSI] use blk-mq I/O path by default
-			See SCSI_MQ_DEFAULT in drivers/scsi/Kconfig.
-			Format: <y/n>
-
 	sim710=		[SCSI,HW]
 			See header of drivers/scsi/sim710.c.
 
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 70988c381268..ff5a569fdbcb 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -50,18 +50,6 @@ config SCSI_NETLINK
 	default	n
 	depends on NET
 
-config SCSI_MQ_DEFAULT
-	bool "SCSI: use blk-mq I/O path by default"
-	default y
-	depends on SCSI
-	---help---
-	  This option enables the blk-mq based I/O path for SCSI devices by
-	  default.  With this option the scsi_mod.use_blk_mq module/boot
-	  option defaults to Y, without it to N, but it can still be
-	  overridden either way.
-
-	  If unsure say Y.
-
 config SCSI_PROC_FS
 	bool "legacy /proc/scsi/ support"
 	depends on SCSI && PROC_FS
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 6637116529aa..abdc9eac4173 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -3088,12 +3088,6 @@ static ssize_t hwq_mode_store(struct device *dev,
 		return -EINVAL;
 	}
 
-	if ((mode == HWQ_MODE_TAG) && !shost_use_blk_mq(shost)) {
-		dev_info(cfgdev, "SCSI-MQ is not enabled, use a different "
-			 "HWQ steering mode.\n");
-		return -EINVAL;
-	}
-
 	afu->hwq_mode = mode;
 
 	return count;
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index ea4b0bb0c1cd..cc71136ba300 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -222,18 +222,9 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	if (error)
 		goto fail;
 
-	if (shost_use_blk_mq(shost)) {
-		error = scsi_mq_setup_tags(shost);
-		if (error)
-			goto fail;
-	} else {
-		shost->bqt = blk_init_tags(shost->can_queue,
-				shost->hostt->tag_alloc_policy);
-		if (!shost->bqt) {
-			error = -ENOMEM;
-			goto fail;
-		}
-	}
+	error = scsi_mq_setup_tags(shost);
+	if (error)
+		goto fail;
 
 	if (!shost->shost_gendev.parent)
 		shost->shost_gendev.parent = dev ? dev : &platform_bus;
@@ -309,8 +300,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	pm_runtime_disable(&shost->shost_gendev);
 	pm_runtime_set_suspended(&shost->shost_gendev);
 	pm_runtime_put_noidle(&shost->shost_gendev);
-	if (shost_use_blk_mq(shost))
-		scsi_mq_destroy_tags(shost);
+	scsi_mq_destroy_tags(shost);
  fail:
 	return error;
 }
@@ -344,13 +334,8 @@ static void scsi_host_dev_release(struct device *dev)
 		kfree(dev_name(&shost->shost_dev));
 	}
 
-	if (shost_use_blk_mq(shost)) {
-		if (shost->tag_set.tags)
-			scsi_mq_destroy_tags(shost);
-	} else {
-		if (shost->bqt)
-			blk_free_tags(shost->bqt);
-	}
+	if (shost->tag_set.tags)
+		scsi_mq_destroy_tags(shost);
 
 	kfree(shost->shost_data);
 
@@ -472,8 +457,6 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	else
 		shost->dma_boundary = 0xffffffff;
 
-	shost->use_blk_mq = scsi_use_blk_mq || shost->hostt->force_blk_mq;
-
 	device_initialize(&shost->shost_gendev);
 	dev_set_name(&shost->shost_gendev, "host%d", shost->host_no);
 	shost->shost_gendev.bus = &scsi_bus_type;
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 4fa6703a9ec9..baed2b891efb 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -3914,7 +3914,7 @@ int lpfc_sli4_scmd_to_wqidx_distr(struct lpfc_hba *phba,
 	uint32_t tag;
 	uint16_t hwq;
 
-	if (cmnd && shost_use_blk_mq(cmnd->device->host)) {
+	if (cmnd) {
 		tag = blk_mq_unique_tag(cmnd->request);
 		hwq = blk_mq_unique_tag_to_hwq(tag);
 
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index 105b0e4d7818..311eb22068e1 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -644,8 +644,7 @@ static struct qedi_ctx *qedi_host_alloc(struct pci_dev *pdev)
 	qedi->max_active_conns = ISCSI_MAX_SESS_PER_HBA;
 	qedi->max_sqes = QEDI_SQ_SIZE;
 
-	if (shost_use_blk_mq(shost))
-		shost->nr_hw_queues = MIN_NUM_CPUS_MSIX(qedi);
+	shost->nr_hw_queues = MIN_NUM_CPUS_MSIX(qedi);
 
 	pci_set_drvdata(pdev, qedi);
 
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 8794e54f43a9..3e2665c66bc4 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -857,13 +857,9 @@ qla2xxx_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 	}
 
 	if (ha->mqenable) {
-		if (shost_use_blk_mq(vha->host)) {
-			tag = blk_mq_unique_tag(cmd->request);
-			hwq = blk_mq_unique_tag_to_hwq(tag);
-			qpair = ha->queue_pair_map[hwq];
-		} else if (vha->vp_idx && vha->qpair) {
-			qpair = vha->qpair;
-		}
+		tag = blk_mq_unique_tag(cmd->request);
+		hwq = blk_mq_unique_tag_to_hwq(tag);
+		qpair = ha->queue_pair_map[hwq];
 
 		if (qpair)
 			return qla2xxx_mqueuecommand(host, cmd, qpair);
@@ -3153,7 +3149,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto probe_failed;
 	}
 
-	if (ha->mqenable && shost_use_blk_mq(host)) {
+	if (ha->mqenable) {
 		/* number of hardware queues supported by blk/scsi-mq*/
 		host->nr_hw_queues = ha->max_qpairs;
 
@@ -3265,25 +3261,17 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	    base_vha->mgmt_svr_loop_id, host->sg_tablesize);
 
 	if (ha->mqenable) {
-		bool mq = false;
 		bool startit = false;
 
-		if (QLA_TGT_MODE_ENABLED()) {
-			mq = true;
+		if (QLA_TGT_MODE_ENABLED())
 			startit = false;
-		}
 
-		if ((ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED) &&
-		    shost_use_blk_mq(host)) {
-			mq = true;
+		if (ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED)
 			startit = true;
-		}
 
-		if (mq) {
-			/* Create start of day qpairs for Block MQ */
-			for (i = 0; i < ha->max_qpairs; i++)
-				qla2xxx_create_qpair(base_vha, 5, 0, startit);
-		}
+		/* Create start of day qpairs for Block MQ */
+		for (i = 0; i < ha->max_qpairs; i++)
+			qla2xxx_create_qpair(base_vha, 5, 0, startit);
 	}
 
 	if (ha->flags.running_gold_fw)
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fc1356d101b0..99db3f4316b5 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -780,11 +780,8 @@ MODULE_LICENSE("GPL");
 module_param(scsi_logging_level, int, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(scsi_logging_level, "a bit mask of logging levels");
 
-#ifdef CONFIG_SCSI_MQ_DEFAULT
+/* Kill module parameter */
 bool scsi_use_blk_mq = true;
-#else
-bool scsi_use_blk_mq = false;
-#endif
 module_param_named(use_blk_mq, scsi_use_blk_mq, bool, S_IWUSR | S_IRUGO);
 
 static int __init init_scsi(void)
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 60bcc6df97a9..4740f1e9dd17 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5881,8 +5881,7 @@ static int sdebug_driver_probe(struct device *dev)
 	}
 	/* Decide whether to tell scsi subsystem that we want mq */
 	/* Following should give the same answer for each host */
-	if (shost_use_blk_mq(hpnt))
-		hpnt->nr_hw_queues = submit_queues;
+	hpnt->nr_hw_queues = submit_queues;
 
 	sdbg_host->shost = hpnt;
 	*((struct sdebug_host_info **)hpnt->hostdata) = sdbg_host;
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c736d61b1648..fff128aa9ec2 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -308,7 +308,7 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		 * error handler. In that case we can return immediately as no
 		 * further action is required.
 		 */
-		if (req->q->mq_ops && !blk_mq_mark_complete(req))
+		if (!blk_mq_mark_complete(req))
 			return rtn;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c7fccbb8f554..77b0f10e1be1 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -168,8 +168,6 @@ static void scsi_mq_requeue_cmd(struct scsi_cmnd *cmd)
 static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, bool unbusy)
 {
 	struct scsi_device *device = cmd->device;
-	struct request_queue *q = device->request_queue;
-	unsigned long flags;
 
 	SCSI_LOG_MLQUEUE(1, scmd_printk(KERN_INFO, cmd,
 		"Inserting command %p into mlqueue\n", cmd));
@@ -190,26 +188,20 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, bool unbusy)
 	 * before blk_cleanup_queue() finishes.
 	 */
 	cmd->result = 0;
-	if (q->mq_ops) {
-		/*
-		 * Before a SCSI command is dispatched,
-		 * get_device(&sdev->sdev_gendev) is called and the host,
-		 * target and device busy counters are increased. Since
-		 * requeuing a request causes these actions to be repeated and
-		 * since scsi_device_unbusy() has already been called,
-		 * put_device(&device->sdev_gendev) must still be called. Call
-		 * put_device() after blk_mq_requeue_request() to avoid that
-		 * removal of the SCSI device can start before requeueing has
-		 * happened.
-		 */
-		blk_mq_requeue_request(cmd->request, true);
-		put_device(&device->sdev_gendev);
-		return;
-	}
-	spin_lock_irqsave(q->queue_lock, flags);
-	blk_requeue_request(q, cmd->request);
-	kblockd_schedule_work(&device->requeue_work);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	/*
+	 * Before a SCSI command is dispatched,
+	 * get_device(&sdev->sdev_gendev) is called and the host,
+	 * target and device busy counters are increased. Since
+	 * requeuing a request causes these actions to be repeated and
+	 * since scsi_device_unbusy() has already been called,
+	 * put_device(&device->sdev_gendev) must still be called. Call
+	 * put_device() after blk_mq_requeue_request() to avoid that
+	 * removal of the SCSI device can start before requeueing has
+	 * happened.
+	 */
+	blk_mq_requeue_request(cmd->request, true);
+	put_device(&device->sdev_gendev);
 }
 
 /*
@@ -370,10 +362,7 @@ void scsi_device_unbusy(struct scsi_device *sdev)
 
 static void scsi_kick_queue(struct request_queue *q)
 {
-	if (q->mq_ops)
-		blk_mq_run_hw_queues(q, false);
-	else
-		blk_run_queue(q);
+	blk_mq_run_hw_queues(q, false);
 }
 
 /*
@@ -534,10 +523,7 @@ static void scsi_run_queue(struct request_queue *q)
 	if (!list_empty(&sdev->host->starved_list))
 		scsi_starved_list_run(sdev->host);
 
-	if (q->mq_ops)
-		blk_mq_run_hw_queues(q, false);
-	else
-		blk_run_queue(q);
+	blk_mq_run_hw_queues(q, false);
 }
 
 void scsi_requeue_run_queue(struct work_struct *work)
@@ -550,42 +536,6 @@ void scsi_requeue_run_queue(struct work_struct *work)
 	scsi_run_queue(q);
 }
 
-/*
- * Function:	scsi_requeue_command()
- *
- * Purpose:	Handle post-processing of completed commands.
- *
- * Arguments:	q	- queue to operate on
- *		cmd	- command that may need to be requeued.
- *
- * Returns:	Nothing
- *
- * Notes:	After command completion, there may be blocks left
- *		over which weren't finished by the previous command
- *		this can be for a number of reasons - the main one is
- *		I/O errors in the middle of the request, in which case
- *		we need to request the blocks that come after the bad
- *		sector.
- * Notes:	Upon return, cmd is a stale pointer.
- */
-static void scsi_requeue_command(struct request_queue *q, struct scsi_cmnd *cmd)
-{
-	struct scsi_device *sdev = cmd->device;
-	struct request *req = cmd->request;
-	unsigned long flags;
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	blk_unprep_request(req);
-	req->special = NULL;
-	scsi_put_command(cmd);
-	blk_requeue_request(q, req);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-
-	scsi_run_queue(q);
-
-	put_device(&sdev->sdev_gendev);
-}
-
 void scsi_run_host_queues(struct Scsi_Host *shost)
 {
 	struct scsi_device *sdev;
@@ -626,42 +576,6 @@ static void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd)
 	scsi_del_cmd_from_list(cmd);
 }
 
-/*
- * Function:    scsi_release_buffers()
- *
- * Purpose:     Free resources allocate for a scsi_command.
- *
- * Arguments:   cmd	- command that we are bailing.
- *
- * Lock status: Assumed that no lock is held upon entry.
- *
- * Returns:     Nothing
- *
- * Notes:       In the event that an upper level driver rejects a
- *		command, we must release resources allocated during
- *		the __init_io() function.  Primarily this would involve
- *		the scatter-gather table.
- */
-static void scsi_release_buffers(struct scsi_cmnd *cmd)
-{
-	if (cmd->sdb.table.nents)
-		sg_free_table_chained(&cmd->sdb.table, false);
-
-	memset(&cmd->sdb, 0, sizeof(cmd->sdb));
-
-	if (scsi_prot_sg_count(cmd))
-		sg_free_table_chained(&cmd->prot_sdb->table, false);
-}
-
-static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd)
-{
-	struct scsi_data_buffer *bidi_sdb = cmd->request->next_rq->special;
-
-	sg_free_table_chained(&bidi_sdb->table, false);
-	kmem_cache_free(scsi_sdb_cache, bidi_sdb);
-	cmd->request->next_rq->special = NULL;
-}
-
 /* Returns false when no more bytes to process, true if there are more */
 static bool scsi_end_request(struct request *req, blk_status_t error,
 		unsigned int bytes, unsigned int bidi_bytes)
@@ -687,37 +601,22 @@ static bool scsi_end_request(struct request *req, blk_status_t error,
 		destroy_rcu_head(&cmd->rcu);
 	}
 
-	if (req->mq_ctx) {
-		/*
-		 * In the MQ case the command gets freed by __blk_mq_end_request,
-		 * so we have to do all cleanup that depends on it earlier.
-		 *
-		 * We also can't kick the queues from irq context, so we
-		 * will have to defer it to a workqueue.
-		 */
-		scsi_mq_uninit_cmd(cmd);
-
-		__blk_mq_end_request(req, error);
-
-		if (scsi_target(sdev)->single_lun ||
-		    !list_empty(&sdev->host->starved_list))
-			kblockd_schedule_work(&sdev->requeue_work);
-		else
-			blk_mq_run_hw_queues(q, true);
-	} else {
-		unsigned long flags;
-
-		if (bidi_bytes)
-			scsi_release_bidi_buffers(cmd);
-		scsi_release_buffers(cmd);
-		scsi_put_command(cmd);
+	/*
+	 * In the MQ case the command gets freed by __blk_mq_end_request,
+	 * so we have to do all cleanup that depends on it earlier.
+	 *
+	 * We also can't kick the queues from irq context, so we
+	 * will have to defer it to a workqueue.
+	 */
+	scsi_mq_uninit_cmd(cmd);
 
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_finish_request(req, error);
-		spin_unlock_irqrestore(q->queue_lock, flags);
+	__blk_mq_end_request(req, error);
 
-		scsi_run_queue(q);
-	}
+	if (scsi_target(sdev)->single_lun ||
+	    !list_empty(&sdev->host->starved_list))
+		kblockd_schedule_work(&sdev->requeue_work);
+	else
+		blk_mq_run_hw_queues(q, true);
 
 	put_device(&sdev->sdev_gendev);
 	return false;
@@ -766,13 +665,7 @@ static void scsi_io_completion_reprep(struct scsi_cmnd *cmd,
 				      struct request_queue *q)
 {
 	/* A new command will be prepared and issued. */
-	if (q->mq_ops) {
-		scsi_mq_requeue_cmd(cmd);
-	} else {
-		/* Unprep request and put it back at head of the queue. */
-		scsi_release_buffers(cmd);
-		scsi_requeue_command(q, cmd);
-	}
+	scsi_mq_requeue_cmd(cmd);
 }
 
 /* Helper for scsi_io_completion() when special action required. */
@@ -1147,9 +1040,7 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb)
  */
 int scsi_init_io(struct scsi_cmnd *cmd)
 {
-	struct scsi_device *sdev = cmd->device;
 	struct request *rq = cmd->request;
-	bool is_mq = (rq->mq_ctx != NULL);
 	int error = BLKPREP_KILL;
 
 	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq)))
@@ -1160,17 +1051,6 @@ int scsi_init_io(struct scsi_cmnd *cmd)
 		goto err_exit;
 
 	if (blk_bidi_rq(rq)) {
-		if (!rq->q->mq_ops) {
-			struct scsi_data_buffer *bidi_sdb =
-				kmem_cache_zalloc(scsi_sdb_cache, GFP_ATOMIC);
-			if (!bidi_sdb) {
-				error = BLKPREP_DEFER;
-				goto err_exit;
-			}
-
-			rq->next_rq->special = bidi_sdb;
-		}
-
 		error = scsi_init_sgtable(rq->next_rq, rq->next_rq->special);
 		if (error)
 			goto err_exit;
@@ -1210,14 +1090,7 @@ int scsi_init_io(struct scsi_cmnd *cmd)
 
 	return BLKPREP_OK;
 err_exit:
-	if (is_mq) {
-		scsi_mq_free_sgtables(cmd);
-	} else {
-		scsi_release_buffers(cmd);
-		cmd->request->special = NULL;
-		scsi_put_command(cmd);
-		put_device(&sdev->sdev_gendev);
-	}
+	scsi_mq_free_sgtables(cmd);
 	return error;
 }
 EXPORT_SYMBOL(scsi_init_io);
@@ -1423,75 +1296,6 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
 	return ret;
 }
 
-static int
-scsi_prep_return(struct request_queue *q, struct request *req, int ret)
-{
-	struct scsi_device *sdev = q->queuedata;
-
-	switch (ret) {
-	case BLKPREP_KILL:
-	case BLKPREP_INVALID:
-		scsi_req(req)->result = DID_NO_CONNECT << 16;
-		/* release the command and kill it */
-		if (req->special) {
-			struct scsi_cmnd *cmd = req->special;
-			scsi_release_buffers(cmd);
-			scsi_put_command(cmd);
-			put_device(&sdev->sdev_gendev);
-			req->special = NULL;
-		}
-		break;
-	case BLKPREP_DEFER:
-		/*
-		 * If we defer, the blk_peek_request() returns NULL, but the
-		 * queue must be restarted, so we schedule a callback to happen
-		 * shortly.
-		 */
-		if (atomic_read(&sdev->device_busy) == 0)
-			blk_delay_queue(q, SCSI_QUEUE_DELAY);
-		break;
-	default:
-		req->rq_flags |= RQF_DONTPREP;
-	}
-
-	return ret;
-}
-
-static int scsi_prep_fn(struct request_queue *q, struct request *req)
-{
-	struct scsi_device *sdev = q->queuedata;
-	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
-	int ret;
-
-	ret = scsi_prep_state_check(sdev, req);
-	if (ret != BLKPREP_OK)
-		goto out;
-
-	if (!req->special) {
-		/* Bail if we can't get a reference to the device */
-		if (unlikely(!get_device(&sdev->sdev_gendev))) {
-			ret = BLKPREP_DEFER;
-			goto out;
-		}
-
-		scsi_init_command(sdev, cmd);
-		req->special = cmd;
-	}
-
-	cmd->tag = req->tag;
-	cmd->request = req;
-	cmd->prot_op = SCSI_PROT_NORMAL;
-
-	ret = scsi_setup_cmnd(sdev, req);
-out:
-	return scsi_prep_return(q, req, ret);
-}
-
-static void scsi_unprep_fn(struct request_queue *q, struct request *req)
-{
-	scsi_uninit_cmd(blk_mq_rq_to_pdu(req));
-}
-
 /*
  * scsi_dev_queue_ready: if we can send requests to sdev, return 1 else
  * return 0.
@@ -1511,14 +1315,8 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
 		/*
 		 * unblock after device_blocked iterates to zero
 		 */
-		if (atomic_dec_return(&sdev->device_blocked) > 0) {
-			/*
-			 * For the MQ case we take care of this in the caller.
-			 */
-			if (!q->mq_ops)
-				blk_delay_queue(q, SCSI_QUEUE_DELAY);
+		if (atomic_dec_return(&sdev->device_blocked) > 0)
 			goto out_dec;
-		}
 		SCSI_LOG_MLQUEUE(3, sdev_printk(KERN_INFO, sdev,
 				   "unblocking device at zero depth\n"));
 	}
@@ -1641,74 +1439,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 	return 0;
 }
 
-/*
- * Busy state exporting function for request stacking drivers.
- *
- * For efficiency, no lock is taken to check the busy state of
- * shost/starget/sdev, since the returned value is not guaranteed and
- * may be changed after request stacking drivers call the function,
- * regardless of taking lock or not.
- *
- * When scsi can't dispatch I/Os anymore and needs to kill I/Os scsi
- * needs to return 'not busy'. Otherwise, request stacking drivers
- * may hold requests forever.
- */
-static int scsi_lld_busy(struct request_queue *q)
-{
-	struct scsi_device *sdev = q->queuedata;
-	struct Scsi_Host *shost;
-
-	if (blk_queue_dying(q))
-		return 0;
-
-	shost = sdev->host;
-
-	/*
-	 * Ignore host/starget busy state.
-	 * Since block layer does not have a concept of fairness across
-	 * multiple queues, congestion of host/starget needs to be handled
-	 * in SCSI layer.
-	 */
-	if (scsi_host_in_recovery(shost) || scsi_device_is_busy(sdev))
-		return 1;
-
-	return 0;
-}
-
-/*
- * Kill a request for a dead device
- */
-static void scsi_kill_request(struct request *req, struct request_queue *q)
-{
-	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
-	struct scsi_device *sdev;
-	struct scsi_target *starget;
-	struct Scsi_Host *shost;
-
-	blk_start_request(req);
-
-	scmd_printk(KERN_INFO, cmd, "killing request\n");
-
-	sdev = cmd->device;
-	starget = scsi_target(sdev);
-	shost = sdev->host;
-	scsi_init_cmd_errh(cmd);
-	cmd->result = DID_NO_CONNECT << 16;
-	atomic_inc(&cmd->device->iorequest_cnt);
-
-	/*
-	 * SCSI request completion path will do scsi_device_unbusy(),
-	 * bump busy counts.  To bump the counters, we need to dance
-	 * with the locks as normal issue path does.
-	 */
-	atomic_inc(&sdev->device_busy);
-	atomic_inc(&shost->host_busy);
-	if (starget->can_queue > 0)
-		atomic_inc(&starget->target_busy);
-
-	blk_complete_request(req);
-}
-
 static void scsi_softirq_done(struct request *rq)
 {
 	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
@@ -1829,158 +1559,6 @@ static int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
 	return 0;
 }
 
-/**
- * scsi_done - Invoke completion on finished SCSI command.
- * @cmd: The SCSI Command for which a low-level device driver (LLDD) gives
- * ownership back to SCSI Core -- i.e. the LLDD has finished with it.
- *
- * Description: This function is the mid-level's (SCSI Core) interrupt routine,
- * which regains ownership of the SCSI command (de facto) from a LLDD, and
- * calls blk_complete_request() for further processing.
- *
- * This function is interrupt context safe.
- */
-static void scsi_done(struct scsi_cmnd *cmd)
-{
-	trace_scsi_dispatch_cmd_done(cmd);
-	blk_complete_request(cmd->request);
-}
-
-/*
- * Function:    scsi_request_fn()
- *
- * Purpose:     Main strategy routine for SCSI.
- *
- * Arguments:   q       - Pointer to actual queue.
- *
- * Returns:     Nothing
- *
- * Lock status: request queue lock assumed to be held when called.
- *
- * Note: See sd_zbc.c sd_zbc_write_lock_zone() for write order
- * protection for ZBC disks.
- */
-static void scsi_request_fn(struct request_queue *q)
-	__releases(q->queue_lock)
-	__acquires(q->queue_lock)
-{
-	struct scsi_device *sdev = q->queuedata;
-	struct Scsi_Host *shost;
-	struct scsi_cmnd *cmd;
-	struct request *req;
-
-	/*
-	 * To start with, we keep looping until the queue is empty, or until
-	 * the host is no longer able to accept any more requests.
-	 */
-	shost = sdev->host;
-	for (;;) {
-		int rtn;
-		/*
-		 * get next queueable request.  We do this early to make sure
-		 * that the request is fully prepared even if we cannot
-		 * accept it.
-		 */
-		req = blk_peek_request(q);
-		if (!req)
-			break;
-
-		if (unlikely(!scsi_device_online(sdev))) {
-			sdev_printk(KERN_ERR, sdev,
-				    "rejecting I/O to offline device\n");
-			scsi_kill_request(req, q);
-			continue;
-		}
-
-		if (!scsi_dev_queue_ready(q, sdev))
-			break;
-
-		/*
-		 * Remove the request from the request list.
-		 */
-		if (!(blk_queue_tagged(q) && !blk_queue_start_tag(q, req)))
-			blk_start_request(req);
-
-		spin_unlock_irq(q->queue_lock);
-		cmd = blk_mq_rq_to_pdu(req);
-		if (cmd != req->special) {
-			printk(KERN_CRIT "impossible request in %s.\n"
-					 "please mail a stack trace to "
-					 "linux-scsi@vger.kernel.org\n",
-					 __func__);
-			blk_dump_rq_flags(req, "foo");
-			BUG();
-		}
-
-		/*
-		 * We hit this when the driver is using a host wide
-		 * tag map. For device level tag maps the queue_depth check
-		 * in the device ready fn would prevent us from trying
-		 * to allocate a tag. Since the map is a shared host resource
-		 * we add the dev to the starved list so it eventually gets
-		 * a run when a tag is freed.
-		 */
-		if (blk_queue_tagged(q) && !(req->rq_flags & RQF_QUEUED)) {
-			spin_lock_irq(shost->host_lock);
-			if (list_empty(&sdev->starved_entry))
-				list_add_tail(&sdev->starved_entry,
-					      &shost->starved_list);
-			spin_unlock_irq(shost->host_lock);
-			goto not_ready;
-		}
-
-		if (!scsi_target_queue_ready(shost, sdev))
-			goto not_ready;
-
-		if (!scsi_host_queue_ready(q, shost, sdev))
-			goto host_not_ready;
-	
-		if (sdev->simple_tags)
-			cmd->flags |= SCMD_TAGGED;
-		else
-			cmd->flags &= ~SCMD_TAGGED;
-
-		/*
-		 * Finally, initialize any error handling parameters, and set up
-		 * the timers for timeouts.
-		 */
-		scsi_init_cmd_errh(cmd);
-
-		/*
-		 * Dispatch the command to the low-level driver.
-		 */
-		cmd->scsi_done = scsi_done;
-		rtn = scsi_dispatch_cmd(cmd);
-		if (rtn) {
-			scsi_queue_insert(cmd, rtn);
-			spin_lock_irq(q->queue_lock);
-			goto out_delay;
-		}
-		spin_lock_irq(q->queue_lock);
-	}
-
-	return;
-
- host_not_ready:
-	if (scsi_target(sdev)->can_queue > 0)
-		atomic_dec(&scsi_target(sdev)->target_busy);
- not_ready:
-	/*
-	 * lock q, handle tag, requeue req, and decrement device_busy. We
-	 * must return with queue_lock held.
-	 *
-	 * Decrementing device_busy without checking it is OK, as all such
-	 * cases (host limits or settings) should run the queue at some
-	 * later time.
-	 */
-	spin_lock_irq(q->queue_lock);
-	blk_requeue_request(q, req);
-	atomic_dec(&sdev->device_busy);
-out_delay:
-	if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
-		blk_delay_queue(q, SCSI_QUEUE_DELAY);
-}
-
 static inline blk_status_t prep_to_mq(int ret)
 {
 	switch (ret) {
@@ -2243,77 +1821,6 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(__scsi_init_queue);
 
-static int scsi_old_init_rq(struct request_queue *q, struct request *rq,
-			    gfp_t gfp)
-{
-	struct Scsi_Host *shost = q->rq_alloc_data;
-	const bool unchecked_isa_dma = shost->unchecked_isa_dma;
-	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
-
-	memset(cmd, 0, sizeof(*cmd));
-
-	if (unchecked_isa_dma)
-		cmd->flags |= SCMD_UNCHECKED_ISA_DMA;
-	cmd->sense_buffer = scsi_alloc_sense_buffer(unchecked_isa_dma, gfp,
-						    NUMA_NO_NODE);
-	if (!cmd->sense_buffer)
-		goto fail;
-	cmd->req.sense = cmd->sense_buffer;
-
-	if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) {
-		cmd->prot_sdb = kmem_cache_zalloc(scsi_sdb_cache, gfp);
-		if (!cmd->prot_sdb)
-			goto fail_free_sense;
-	}
-
-	return 0;
-
-fail_free_sense:
-	scsi_free_sense_buffer(unchecked_isa_dma, cmd->sense_buffer);
-fail:
-	return -ENOMEM;
-}
-
-static void scsi_old_exit_rq(struct request_queue *q, struct request *rq)
-{
-	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
-
-	if (cmd->prot_sdb)
-		kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb);
-	scsi_free_sense_buffer(cmd->flags & SCMD_UNCHECKED_ISA_DMA,
-			       cmd->sense_buffer);
-}
-
-struct request_queue *scsi_old_alloc_queue(struct scsi_device *sdev)
-{
-	struct Scsi_Host *shost = sdev->host;
-	struct request_queue *q;
-
-	q = blk_alloc_queue_node(GFP_KERNEL, NUMA_NO_NODE, NULL);
-	if (!q)
-		return NULL;
-	q->cmd_size = sizeof(struct scsi_cmnd) + shost->hostt->cmd_size;
-	q->rq_alloc_data = shost;
-	q->request_fn = scsi_request_fn;
-	q->init_rq_fn = scsi_old_init_rq;
-	q->exit_rq_fn = scsi_old_exit_rq;
-	q->initialize_rq_fn = scsi_initialize_rq;
-
-	if (blk_init_allocated_queue(q) < 0) {
-		blk_cleanup_queue(q);
-		return NULL;
-	}
-
-	__scsi_init_queue(shost, q);
-	blk_queue_flag_set(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
-	blk_queue_prep_rq(q, scsi_prep_fn);
-	blk_queue_unprep_rq(q, scsi_unprep_fn);
-	blk_queue_softirq_done(q, scsi_softirq_done);
-	blk_queue_rq_timed_out(q, scsi_times_out);
-	blk_queue_lld_busy(q, scsi_lld_busy);
-	return q;
-}
-
 static const struct blk_mq_ops scsi_mq_ops = {
 	.get_budget	= scsi_mq_get_budget,
 	.put_budget	= scsi_mq_put_budget,
@@ -2380,10 +1887,7 @@ struct scsi_device *scsi_device_from_queue(struct request_queue *q)
 {
 	struct scsi_device *sdev = NULL;
 
-	if (q->mq_ops) {
-		if (q->mq_ops == &scsi_mq_ops)
-			sdev = q->queuedata;
-	} else if (q->request_fn == scsi_request_fn)
+	if (q->mq_ops == &scsi_mq_ops)
 		sdev = q->queuedata;
 	if (!sdev || !get_device(&sdev->sdev_gendev))
 		sdev = NULL;
@@ -2986,39 +2490,6 @@ void sdev_evt_send_simple(struct scsi_device *sdev,
 }
 EXPORT_SYMBOL_GPL(sdev_evt_send_simple);
 
-/**
- * scsi_request_fn_active() - number of kernel threads inside scsi_request_fn()
- * @sdev: SCSI device to count the number of scsi_request_fn() callers for.
- */
-static int scsi_request_fn_active(struct scsi_device *sdev)
-{
-	struct request_queue *q = sdev->request_queue;
-	int request_fn_active;
-
-	WARN_ON_ONCE(sdev->host->use_blk_mq);
-
-	spin_lock_irq(q->queue_lock);
-	request_fn_active = q->request_fn_active;
-	spin_unlock_irq(q->queue_lock);
-
-	return request_fn_active;
-}
-
-/**
- * scsi_wait_for_queuecommand() - wait for ongoing queuecommand() calls
- * @sdev: SCSI device pointer.
- *
- * Wait until the ongoing shost->hostt->queuecommand() calls that are
- * invoked from scsi_request_fn() have finished.
- */
-static void scsi_wait_for_queuecommand(struct scsi_device *sdev)
-{
-	WARN_ON_ONCE(sdev->host->use_blk_mq);
-
-	while (scsi_request_fn_active(sdev))
-		msleep(20);
-}
-
 /**
  *	scsi_device_quiesce - Block user issued commands.
  *	@sdev:	scsi device to quiesce.
@@ -3142,7 +2613,6 @@ EXPORT_SYMBOL(scsi_target_resume);
 int scsi_internal_device_block_nowait(struct scsi_device *sdev)
 {
 	struct request_queue *q = sdev->request_queue;
-	unsigned long flags;
 	int err = 0;
 
 	err = scsi_device_set_state(sdev, SDEV_BLOCK);
@@ -3158,14 +2628,7 @@ int scsi_internal_device_block_nowait(struct scsi_device *sdev)
 	 * block layer from calling the midlayer with this device's
 	 * request queue. 
 	 */
-	if (q->mq_ops) {
-		blk_mq_quiesce_queue_nowait(q);
-	} else {
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
-
+	blk_mq_quiesce_queue_nowait(q);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(scsi_internal_device_block_nowait);
@@ -3196,12 +2659,8 @@ static int scsi_internal_device_block(struct scsi_device *sdev)
 
 	mutex_lock(&sdev->state_mutex);
 	err = scsi_internal_device_block_nowait(sdev);
-	if (err == 0) {
-		if (q->mq_ops)
-			blk_mq_quiesce_queue(q);
-		else
-			scsi_wait_for_queuecommand(sdev);
-	}
+	if (err == 0)
+		blk_mq_quiesce_queue(q);
 	mutex_unlock(&sdev->state_mutex);
 
 	return err;
@@ -3210,15 +2669,8 @@ static int scsi_internal_device_block(struct scsi_device *sdev)
 void scsi_start_queue(struct scsi_device *sdev)
 {
 	struct request_queue *q = sdev->request_queue;
-	unsigned long flags;
 
-	if (q->mq_ops) {
-		blk_mq_unquiesce_queue(q);
-	} else {
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
+	blk_mq_unquiesce_queue(q);
 }
 
 /**
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 99f1db5e467e..5f21547b2ad2 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -92,7 +92,6 @@ extern void scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
 extern void scsi_io_completion(struct scsi_cmnd *, unsigned int);
 extern void scsi_run_host_queues(struct Scsi_Host *shost);
 extern void scsi_requeue_run_queue(struct work_struct *work);
-extern struct request_queue *scsi_old_alloc_queue(struct scsi_device *sdev);
 extern struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev);
 extern void scsi_start_queue(struct scsi_device *sdev);
 extern int scsi_mq_setup_tags(struct Scsi_Host *shost);
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 78ca63dfba4a..dd0d516f65e2 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -266,10 +266,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	 */
 	sdev->borken = 1;
 
-	if (shost_use_blk_mq(shost))
-		sdev->request_queue = scsi_mq_alloc_queue(sdev);
-	else
-		sdev->request_queue = scsi_old_alloc_queue(sdev);
+	sdev->request_queue = scsi_mq_alloc_queue(sdev);
 	if (!sdev->request_queue) {
 		/* release fn is set up in scsi_sysfs_device_initialise, so
 		 * have to free and put manually here */
@@ -280,11 +277,6 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	WARN_ON_ONCE(!blk_get_queue(sdev->request_queue));
 	sdev->request_queue->queuedata = sdev;
 
-	if (!shost_use_blk_mq(sdev->host)) {
-		blk_queue_init_tags(sdev->request_queue,
-				    sdev->host->cmd_per_lun, shost->bqt,
-				    shost->hostt->tag_alloc_policy);
-	}
 	scsi_change_queue_depth(sdev, sdev->host->cmd_per_lun ?
 					sdev->host->cmd_per_lun : 1);
 
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 3aee9464a7bf..12e2c2829df2 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -367,7 +367,6 @@ store_shost_eh_deadline(struct device *dev, struct device_attribute *attr,
 
 static DEVICE_ATTR(eh_deadline, S_IRUGO | S_IWUSR, show_shost_eh_deadline, store_shost_eh_deadline);
 
-shost_rd_attr(use_blk_mq, "%d\n");
 shost_rd_attr(unique_id, "%u\n");
 shost_rd_attr(cmd_per_lun, "%hd\n");
 shost_rd_attr(can_queue, "%hd\n");
@@ -386,6 +385,13 @@ show_host_busy(struct device *dev, struct device_attribute *attr, char *buf)
 }
 static DEVICE_ATTR(host_busy, S_IRUGO, show_host_busy, NULL);
 
+static ssize_t
+show_use_blk_mq(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return snprintf(buf, 20, "1\n");
+}
+static DEVICE_ATTR(use_blk_mq, S_IRUGO, show_use_blk_mq, NULL);
+
 static struct attribute *scsi_sysfs_shost_attrs[] = {
 	&dev_attr_use_blk_mq.attr,
 	&dev_attr_unique_id.attr,
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 23d7cca36ff0..fb308ea8e9a5 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -8100,12 +8100,6 @@ int ufshcd_alloc_host(struct device *dev, struct ufs_hba **hba_handle)
 		goto out_error;
 	}
 
-	/*
-	 * Do not use blk-mq at this time because blk-mq does not support
-	 * runtime pm.
-	 */
-	host->use_blk_mq = false;
-
 	hba = shost_priv(host);
 	hba->host = host;
 	hba->dev = dev;
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 5ea06d310a25..aa760df8c6b3 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -11,7 +11,6 @@
 #include <linux/blk-mq.h>
 #include <scsi/scsi.h>
 
-struct request_queue;
 struct block_device;
 struct completion;
 struct module;
@@ -22,7 +21,6 @@ struct scsi_target;
 struct Scsi_Host;
 struct scsi_host_cmd_pool;
 struct scsi_transport_template;
-struct blk_queue_tags;
 
 
 /*
@@ -547,14 +545,8 @@ struct Scsi_Host {
 	struct scsi_host_template *hostt;
 	struct scsi_transport_template *transportt;
 
-	/*
-	 * Area to keep a shared tag map (if needed, will be
-	 * NULL if not).
-	 */
-	union {
-		struct blk_queue_tag	*bqt;
-		struct blk_mq_tag_set	tag_set;
-	};
+	/* Area to keep a shared tag map */
+	struct blk_mq_tag_set	tag_set;
 
 	atomic_t host_busy;		   /* commands actually active on low-level */
 	atomic_t host_blocked;
@@ -648,7 +640,6 @@ struct Scsi_Host {
 	/* The controller does not support WRITE SAME */
 	unsigned no_write_same:1;
 
-	unsigned use_blk_mq:1;
 	unsigned use_cmd_list:1;
 
 	/* Host responded with short (<36 bytes) INQUIRY result */
@@ -742,11 +733,6 @@ static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
 		shost->tmf_in_progress;
 }
 
-static inline bool shost_use_blk_mq(struct Scsi_Host *shost)
-{
-	return shost->use_blk_mq;
-}
-
 extern int scsi_queue_work(struct Scsi_Host *, struct work_struct *);
 extern void scsi_flush_work(struct Scsi_Host *);
 
diff --git a/include/scsi/scsi_tcq.h b/include/scsi/scsi_tcq.h
index e192a0caa850..6053d46e794e 100644
--- a/include/scsi/scsi_tcq.h
+++ b/include/scsi/scsi_tcq.h
@@ -23,19 +23,15 @@ static inline struct scsi_cmnd *scsi_host_find_tag(struct Scsi_Host *shost,
 		int tag)
 {
 	struct request *req = NULL;
+	u16 hwq;
 
 	if (tag == SCSI_NO_TAG)
 		return NULL;
 
-	if (shost_use_blk_mq(shost)) {
-		u16 hwq = blk_mq_unique_tag_to_hwq(tag);
-
-		if (hwq < shost->tag_set.nr_hw_queues) {
-			req = blk_mq_tag_to_rq(shost->tag_set.tags[hwq],
-				blk_mq_unique_tag_to_tag(tag));
-		}
-	} else {
-		req = blk_map_queue_find_tag(shost->bqt, tag);
+	hwq = blk_mq_unique_tag_to_hwq(tag);
+	if (hwq < shost->tag_set.nr_hw_queues) {
+		req = blk_mq_tag_to_rq(shost->tag_set.tags[hwq],
+					blk_mq_unique_tag_to_tag(tag));
 	}
 
 	if (!req)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread
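
With the request_fn path gone, the block/unblock helpers above map directly
onto the blk-mq quiesce API. A rough sketch of that pattern, with mydrv_*
as placeholder names:

#include <linux/blk-mq.h>

static void mydrv_block_nowait(struct request_queue *q)
{
	/* stop further ->queue_rq() calls without waiting for ongoing ones */
	blk_mq_quiesce_queue_nowait(q);
}

static void mydrv_block(struct request_queue *q)
{
	/* stop dispatch and wait until every ongoing ->queue_rq() has returned */
	blk_mq_quiesce_queue(q);
}

static void mydrv_unblock(struct request_queue *q)
{
	/* allow dispatch again */
	blk_mq_unquiesce_queue(q);
}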

* [PATCH 09/28] dm: remove legacy IO path
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (7 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 08/28] scsi: kill off the legacy IO path Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  6:53   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 10/28] dasd: remove dead code Jens Axboe
                   ` (20 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

dm supports both the legacy and blk-mq request paths, and since we're
killing off the legacy path in general, get rid of it in dm as well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/md/Kconfig    |  11 --
 drivers/md/dm-core.h  |  10 --
 drivers/md/dm-mpath.c |  14 +-
 drivers/md/dm-rq.c    | 293 ++++--------------------------------------
 drivers/md/dm-rq.h    |   4 -
 drivers/md/dm-sysfs.c |   3 +-
 drivers/md/dm-table.c |  36 +-----
 drivers/md/dm.c       |  21 +--
 drivers/md/dm.h       |   1 -
 9 files changed, 35 insertions(+), 358 deletions(-)

diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 8b8c123cae66..3db222509e44 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -215,17 +215,6 @@ config BLK_DEV_DM
 
 	  If unsure, say N.
 
-config DM_MQ_DEFAULT
-	bool "request-based DM: use blk-mq I/O path by default"
-	depends on BLK_DEV_DM
-	---help---
-	  This option enables the blk-mq based I/O path for request-based
-	  DM devices by default.  With the option the dm_mod.use_blk_mq
-	  module/boot option defaults to Y, without it to N, but it can
-	  still be overriden either way.
-
-	  If unsure say N.
-
 config DM_DEBUG
 	bool "Device mapper debugging support"
 	depends on BLK_DEV_DM
diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 7d480c930eaf..224d44503a06 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -112,18 +112,8 @@ struct mapped_device {
 
 	struct dm_stats stats;
 
-	struct kthread_worker kworker;
-	struct task_struct *kworker_task;
-
-	/* for request-based merge heuristic in dm_request_fn() */
-	unsigned seq_rq_merge_deadline_usecs;
-	int last_rq_rw;
-	sector_t last_rq_pos;
-	ktime_t last_rq_start_time;
-
 	/* for blk-mq request-based DM support */
 	struct blk_mq_tag_set *tag_set;
-	bool use_blk_mq:1;
 	bool init_tio_pdu:1;
 
 	struct srcu_struct io_barrier;
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 419362c2d8ac..a24ed3973e7c 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -203,14 +203,7 @@ static struct multipath *alloc_multipath(struct dm_target *ti)
 static int alloc_multipath_stage2(struct dm_target *ti, struct multipath *m)
 {
 	if (m->queue_mode == DM_TYPE_NONE) {
-		/*
-		 * Default to request-based.
-		 */
-		if (dm_use_blk_mq(dm_table_get_md(ti->table)))
-			m->queue_mode = DM_TYPE_MQ_REQUEST_BASED;
-		else
-			m->queue_mode = DM_TYPE_REQUEST_BASED;
-
+		m->queue_mode = DM_TYPE_MQ_REQUEST_BASED;
 	} else if (m->queue_mode == DM_TYPE_BIO_BASED) {
 		INIT_WORK(&m->process_queued_bios, process_queued_bios);
 		/*
@@ -537,10 +530,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 		 * get the queue busy feedback (via BLK_STS_RESOURCE),
 		 * otherwise I/O merging can suffer.
 		 */
-		if (q->mq_ops)
-			return DM_MAPIO_REQUEUE;
-		else
-			return DM_MAPIO_DELAY_REQUEUE;
+		return DM_MAPIO_REQUEUE;
 	}
 	clone->bio = clone->biotail = NULL;
 	clone->rq_disk = bdev->bd_disk;
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 6e547b8dd298..37192b396473 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -23,19 +23,6 @@ static unsigned dm_mq_queue_depth = DM_MQ_QUEUE_DEPTH;
 #define RESERVED_REQUEST_BASED_IOS	256
 static unsigned reserved_rq_based_ios = RESERVED_REQUEST_BASED_IOS;
 
-static bool use_blk_mq = IS_ENABLED(CONFIG_DM_MQ_DEFAULT);
-
-bool dm_use_blk_mq_default(void)
-{
-	return use_blk_mq;
-}
-
-bool dm_use_blk_mq(struct mapped_device *md)
-{
-	return md->use_blk_mq;
-}
-EXPORT_SYMBOL_GPL(dm_use_blk_mq);
-
 unsigned dm_get_reserved_rq_based_ios(void)
 {
 	return __dm_get_module_param(&reserved_rq_based_ios,
@@ -59,16 +46,6 @@ int dm_request_based(struct mapped_device *md)
 	return queue_is_rq_based(md->queue);
 }
 
-static void dm_old_start_queue(struct request_queue *q)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	if (blk_queue_stopped(q))
-		blk_start_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-}
-
 static void dm_mq_start_queue(struct request_queue *q)
 {
 	blk_mq_unquiesce_queue(q);
@@ -77,20 +54,7 @@ static void dm_mq_start_queue(struct request_queue *q)
 
 void dm_start_queue(struct request_queue *q)
 {
-	if (!q->mq_ops)
-		dm_old_start_queue(q);
-	else
-		dm_mq_start_queue(q);
-}
-
-static void dm_old_stop_queue(struct request_queue *q)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	if (!blk_queue_stopped(q))
-		blk_stop_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	dm_mq_start_queue(q);
 }
 
 static void dm_mq_stop_queue(struct request_queue *q)
@@ -103,10 +67,7 @@ static void dm_mq_stop_queue(struct request_queue *q)
 
 void dm_stop_queue(struct request_queue *q)
 {
-	if (!q->mq_ops)
-		dm_old_stop_queue(q);
-	else
-		dm_mq_stop_queue(q);
+	dm_mq_stop_queue(q);
 }
 
 /*
@@ -179,27 +140,12 @@ static void rq_end_stats(struct mapped_device *md, struct request *orig)
  */
 static void rq_completed(struct mapped_device *md, int rw, bool run_queue)
 {
-	struct request_queue *q = md->queue;
-	unsigned long flags;
-
 	atomic_dec(&md->pending[rw]);
 
 	/* nudge anyone waiting on suspend queue */
 	if (!md_in_flight(md))
 		wake_up(&md->wait);
 
-	/*
-	 * Run this off this callpath, as drivers could invoke end_io while
-	 * inside their request_fn (and holding the queue lock). Calling
-	 * back into ->request_fn() could deadlock attempting to grab the
-	 * queue lock again.
-	 */
-	if (!q->mq_ops && run_queue) {
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_run_queue_async(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
-
 	/*
 	 * dm_put() must be at the end of this function. See the comment above
 	 */
@@ -222,27 +168,10 @@ static void dm_end_request(struct request *clone, blk_status_t error)
 	tio->ti->type->release_clone_rq(clone);
 
 	rq_end_stats(md, rq);
-	if (!rq->q->mq_ops)
-		blk_end_request_all(rq, error);
-	else
-		blk_mq_end_request(rq, error);
+	blk_mq_end_request(rq, error);
 	rq_completed(md, rw, true);
 }
 
-/*
- * Requeue the original request of a clone.
- */
-static void dm_old_requeue_request(struct request *rq, unsigned long delay_ms)
-{
-	struct request_queue *q = rq->q;
-	unsigned long flags;
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	blk_requeue_request(q, rq);
-	blk_delay_queue(q, delay_ms);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-}
-
 static void __dm_mq_kick_requeue_list(struct request_queue *q, unsigned long msecs)
 {
 	blk_mq_delay_kick_requeue_list(q, msecs);
@@ -273,11 +202,7 @@ static void dm_requeue_original_request(struct dm_rq_target_io *tio, bool delay_
 		tio->ti->type->release_clone_rq(tio->clone);
 	}
 
-	if (!rq->q->mq_ops)
-		dm_old_requeue_request(rq, delay_ms);
-	else
-		dm_mq_delay_requeue_request(rq, delay_ms);
-
+	dm_mq_delay_requeue_request(rq, delay_ms);
 	rq_completed(md, rw, false);
 }
 
@@ -340,10 +265,7 @@ static void dm_softirq_done(struct request *rq)
 
 		rq_end_stats(md, rq);
 		rw = rq_data_dir(rq);
-		if (!rq->q->mq_ops)
-			blk_end_request_all(rq, tio->error);
-		else
-			blk_mq_end_request(rq, tio->error);
+		blk_mq_end_request(rq, tio->error);
 		rq_completed(md, rw, false);
 		return;
 	}
@@ -363,10 +285,7 @@ static void dm_complete_request(struct request *rq, blk_status_t error)
 	struct dm_rq_target_io *tio = tio_from_request(rq);
 
 	tio->error = error;
-	if (!rq->q->mq_ops)
-		blk_complete_request(rq);
-	else
-		blk_mq_complete_request(rq);
+	blk_mq_complete_request(rq);
 }
 
 /*
@@ -446,8 +365,6 @@ static int setup_clone(struct request *clone, struct request *rq,
 	return 0;
 }
 
-static void map_tio_request(struct kthread_work *work);
-
 static void init_tio(struct dm_rq_target_io *tio, struct request *rq,
 		     struct mapped_device *md)
 {
@@ -464,8 +381,6 @@ static void init_tio(struct dm_rq_target_io *tio, struct request *rq,
 	 */
 	if (!md->init_tio_pdu)
 		memset(&tio->info, 0, sizeof(tio->info));
-	if (md->kworker_task)
-		kthread_init_work(&tio->work, map_tio_request);
 }
 
 /*
@@ -504,10 +419,7 @@ static int map_request(struct dm_rq_target_io *tio)
 			blk_rq_unprep_clone(clone);
 			tio->ti->type->release_clone_rq(clone);
 			tio->clone = NULL;
-			if (!rq->q->mq_ops)
-				r = DM_MAPIO_DELAY_REQUEUE;
-			else
-				r = DM_MAPIO_REQUEUE;
+			r = DM_MAPIO_REQUEUE;
 			goto check_again;
 		}
 		break;
@@ -530,20 +442,23 @@ static int map_request(struct dm_rq_target_io *tio)
 	return r;
 }
 
+/* DEPRECATED: previously used for request-based merge heuristic in dm_request_fn() */
+ssize_t dm_attr_rq_based_seq_io_merge_deadline_show(struct mapped_device *md, char *buf)
+{
+	return sprintf(buf, "%u\n", 0);
+}
+
+ssize_t dm_attr_rq_based_seq_io_merge_deadline_store(struct mapped_device *md,
+						     const char *buf, size_t count)
+{
+	return count;
+}
+
 static void dm_start_request(struct mapped_device *md, struct request *orig)
 {
-	if (!orig->q->mq_ops)
-		blk_start_request(orig);
-	else
-		blk_mq_start_request(orig);
+	blk_mq_start_request(orig);
 	atomic_inc(&md->pending[rq_data_dir(orig)]);
 
-	if (md->seq_rq_merge_deadline_usecs) {
-		md->last_rq_pos = rq_end_sector(orig);
-		md->last_rq_rw = rq_data_dir(orig);
-		md->last_rq_start_time = ktime_get();
-	}
-
 	if (unlikely(dm_stats_used(&md->stats))) {
 		struct dm_rq_target_io *tio = tio_from_request(orig);
 		tio->duration_jiffies = jiffies;
@@ -563,8 +478,10 @@ static void dm_start_request(struct mapped_device *md, struct request *orig)
 	dm_get(md);
 }
 
-static int __dm_rq_init_rq(struct mapped_device *md, struct request *rq)
+static int dm_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
+			      unsigned int hctx_idx, unsigned int numa_node)
 {
+	struct mapped_device *md = set->driver_data;
 	struct dm_rq_target_io *tio = blk_mq_rq_to_pdu(rq);
 
 	/*
@@ -581,163 +498,6 @@ static int __dm_rq_init_rq(struct mapped_device *md, struct request *rq)
 	return 0;
 }
 
-static int dm_rq_init_rq(struct request_queue *q, struct request *rq, gfp_t gfp)
-{
-	return __dm_rq_init_rq(q->rq_alloc_data, rq);
-}
-
-static void map_tio_request(struct kthread_work *work)
-{
-	struct dm_rq_target_io *tio = container_of(work, struct dm_rq_target_io, work);
-
-	if (map_request(tio) == DM_MAPIO_REQUEUE)
-		dm_requeue_original_request(tio, false);
-}
-
-ssize_t dm_attr_rq_based_seq_io_merge_deadline_show(struct mapped_device *md, char *buf)
-{
-	return sprintf(buf, "%u\n", md->seq_rq_merge_deadline_usecs);
-}
-
-#define MAX_SEQ_RQ_MERGE_DEADLINE_USECS 100000
-
-ssize_t dm_attr_rq_based_seq_io_merge_deadline_store(struct mapped_device *md,
-						     const char *buf, size_t count)
-{
-	unsigned deadline;
-
-	if (dm_get_md_type(md) != DM_TYPE_REQUEST_BASED)
-		return count;
-
-	if (kstrtouint(buf, 10, &deadline))
-		return -EINVAL;
-
-	if (deadline > MAX_SEQ_RQ_MERGE_DEADLINE_USECS)
-		deadline = MAX_SEQ_RQ_MERGE_DEADLINE_USECS;
-
-	md->seq_rq_merge_deadline_usecs = deadline;
-
-	return count;
-}
-
-static bool dm_old_request_peeked_before_merge_deadline(struct mapped_device *md)
-{
-	ktime_t kt_deadline;
-
-	if (!md->seq_rq_merge_deadline_usecs)
-		return false;
-
-	kt_deadline = ns_to_ktime((u64)md->seq_rq_merge_deadline_usecs * NSEC_PER_USEC);
-	kt_deadline = ktime_add_safe(md->last_rq_start_time, kt_deadline);
-
-	return !ktime_after(ktime_get(), kt_deadline);
-}
-
-/*
- * q->request_fn for old request-based dm.
- * Called with the queue lock held.
- */
-static void dm_old_request_fn(struct request_queue *q)
-{
-	struct mapped_device *md = q->queuedata;
-	struct dm_target *ti = md->immutable_target;
-	struct request *rq;
-	struct dm_rq_target_io *tio;
-	sector_t pos = 0;
-
-	if (unlikely(!ti)) {
-		int srcu_idx;
-		struct dm_table *map = dm_get_live_table(md, &srcu_idx);
-
-		if (unlikely(!map)) {
-			dm_put_live_table(md, srcu_idx);
-			return;
-		}
-		ti = dm_table_find_target(map, pos);
-		dm_put_live_table(md, srcu_idx);
-	}
-
-	/*
-	 * For suspend, check blk_queue_stopped() and increment
-	 * ->pending within a single queue_lock not to increment the
-	 * number of in-flight I/Os after the queue is stopped in
-	 * dm_suspend().
-	 */
-	while (!blk_queue_stopped(q)) {
-		rq = blk_peek_request(q);
-		if (!rq)
-			return;
-
-		/* always use block 0 to find the target for flushes for now */
-		pos = 0;
-		if (req_op(rq) != REQ_OP_FLUSH)
-			pos = blk_rq_pos(rq);
-
-		if ((dm_old_request_peeked_before_merge_deadline(md) &&
-		     md_in_flight(md) && rq->bio && !bio_multiple_segments(rq->bio) &&
-		     md->last_rq_pos == pos && md->last_rq_rw == rq_data_dir(rq)) ||
-		    (ti->type->busy && ti->type->busy(ti))) {
-			blk_delay_queue(q, 10);
-			return;
-		}
-
-		dm_start_request(md, rq);
-
-		tio = tio_from_request(rq);
-		init_tio(tio, rq, md);
-		/* Establish tio->ti before queuing work (map_tio_request) */
-		tio->ti = ti;
-		kthread_queue_work(&md->kworker, &tio->work);
-		BUG_ON(!irqs_disabled());
-	}
-}
-
-/*
- * Fully initialize a .request_fn request-based queue.
- */
-int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
-{
-	struct dm_target *immutable_tgt;
-
-	/* Fully initialize the queue */
-	md->queue->cmd_size = sizeof(struct dm_rq_target_io);
-	md->queue->rq_alloc_data = md;
-	md->queue->request_fn = dm_old_request_fn;
-	md->queue->init_rq_fn = dm_rq_init_rq;
-
-	immutable_tgt = dm_table_get_immutable_target(t);
-	if (immutable_tgt && immutable_tgt->per_io_data_size) {
-		/* any target-specific per-io data is immediately after the tio */
-		md->queue->cmd_size += immutable_tgt->per_io_data_size;
-		md->init_tio_pdu = true;
-	}
-	if (blk_init_allocated_queue(md->queue) < 0)
-		return -EINVAL;
-
-	/* disable dm_old_request_fn's merge heuristic by default */
-	md->seq_rq_merge_deadline_usecs = 0;
-
-	blk_queue_softirq_done(md->queue, dm_softirq_done);
-
-	/* Initialize the request-based DM worker thread */
-	kthread_init_worker(&md->kworker);
-	md->kworker_task = kthread_run(kthread_worker_fn, &md->kworker,
-				       "kdmwork-%s", dm_device_name(md));
-	if (IS_ERR(md->kworker_task)) {
-		int error = PTR_ERR(md->kworker_task);
-		md->kworker_task = NULL;
-		return error;
-	}
-
-	return 0;
-}
-
-static int dm_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
-		unsigned int hctx_idx, unsigned int numa_node)
-{
-	return __dm_rq_init_rq(set->driver_data, rq);
-}
-
 static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 			  const struct blk_mq_queue_data *bd)
 {
@@ -790,11 +550,6 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	struct dm_target *immutable_tgt;
 	int err;
 
-	if (!dm_table_all_blk_mq_devices(t)) {
-		DMERR("request-based dm-mq may only be stacked on blk-mq device(s)");
-		return -EINVAL;
-	}
-
 	md->tag_set = kzalloc_node(sizeof(struct blk_mq_tag_set), GFP_KERNEL, md->numa_node_id);
 	if (!md->tag_set)
 		return -ENOMEM;
@@ -845,6 +600,8 @@ void dm_mq_cleanup_mapped_device(struct mapped_device *md)
 module_param(reserved_rq_based_ios, uint, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(reserved_rq_based_ios, "Reserved IOs in request-based mempools");
 
+/* Unused, but preserved for userspace compatibility */
+static bool use_blk_mq = true;
 module_param(use_blk_mq, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(use_blk_mq, "Use block multiqueue for request-based DM devices");
 
diff --git a/drivers/md/dm-rq.h b/drivers/md/dm-rq.h
index f43c45460aac..b39245545229 100644
--- a/drivers/md/dm-rq.h
+++ b/drivers/md/dm-rq.h
@@ -46,10 +46,6 @@ struct dm_rq_clone_bio_info {
 	struct bio clone;
 };
 
-bool dm_use_blk_mq_default(void);
-bool dm_use_blk_mq(struct mapped_device *md);
-
-int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t);
 int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t);
 void dm_mq_cleanup_mapped_device(struct mapped_device *md);
 
diff --git a/drivers/md/dm-sysfs.c b/drivers/md/dm-sysfs.c
index c209b8a19b84..a05fcd50e1b9 100644
--- a/drivers/md/dm-sysfs.c
+++ b/drivers/md/dm-sysfs.c
@@ -92,7 +92,8 @@ static ssize_t dm_attr_suspended_show(struct mapped_device *md, char *buf)
 
 static ssize_t dm_attr_use_blk_mq_show(struct mapped_device *md, char *buf)
 {
-	sprintf(buf, "%d\n", dm_use_blk_mq(md));
+	/* Purely for userspace compatibility */
+	sprintf(buf, "%d\n", true);
 
 	return strlen(buf);
 }
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index fb4bea20657b..aa01277cc0d1 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -47,7 +47,6 @@ struct dm_table {
 
 	bool integrity_supported:1;
 	bool singleton:1;
-	bool all_blk_mq:1;
 	unsigned integrity_added:1;
 
 	/*
@@ -921,10 +920,7 @@ static int device_is_rq_based(struct dm_target *ti, struct dm_dev *dev,
 	struct request_queue *q = bdev_get_queue(dev->bdev);
 	struct verify_rq_based_data *v = data;
 
-	if (q->mq_ops)
-		v->mq_count++;
-	else
-		v->sq_count++;
+	v->mq_count++;
 
 	return queue_is_rq_based(q);
 }
@@ -1022,11 +1018,9 @@ static int dm_table_determine_type(struct dm_table *t)
 		int srcu_idx;
 		struct dm_table *live_table = dm_get_live_table(t->md, &srcu_idx);
 
-		/* inherit live table's type and all_blk_mq */
-		if (live_table) {
+		/* inherit live table's type */
+		if (live_table)
 			t->type = live_table->type;
-			t->all_blk_mq = live_table->all_blk_mq;
-		}
 		dm_put_live_table(t->md, srcu_idx);
 		return 0;
 	}
@@ -1050,13 +1044,6 @@ static int dm_table_determine_type(struct dm_table *t)
 		DMERR("table load rejected: not all devices are blk-mq request-stackable");
 		return -EINVAL;
 	}
-	t->all_blk_mq = v.mq_count > 0;
-
-	if (!t->all_blk_mq &&
-	    (t->type == DM_TYPE_MQ_REQUEST_BASED || t->type == DM_TYPE_NVME_BIO_BASED)) {
-		DMERR("table load rejected: all devices are not blk-mq request-stackable");
-		return -EINVAL;
-	}
 
 	return 0;
 }
@@ -1105,11 +1092,6 @@ bool dm_table_request_based(struct dm_table *t)
 	return __table_type_request_based(dm_table_get_type(t));
 }
 
-bool dm_table_all_blk_mq_devices(struct dm_table *t)
-{
-	return t->all_blk_mq;
-}
-
 static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *md)
 {
 	enum dm_queue_mode type = dm_table_get_type(t);
@@ -2093,22 +2075,14 @@ void dm_table_run_md_queue_async(struct dm_table *t)
 {
 	struct mapped_device *md;
 	struct request_queue *queue;
-	unsigned long flags;
 
 	if (!dm_table_request_based(t))
 		return;
 
 	md = dm_table_get_md(t);
 	queue = dm_get_md_queue(md);
-	if (queue) {
-		if (queue->mq_ops)
-			blk_mq_run_hw_queues(queue, true);
-		else {
-			spin_lock_irqsave(queue->queue_lock, flags);
-			blk_run_queue_async(queue);
-			spin_unlock_irqrestore(queue->queue_lock, flags);
-		}
-	}
+	if (queue)
+		blk_mq_run_hw_queues(queue, true);
 }
 EXPORT_SYMBOL(dm_table_run_md_queue_async);
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 6be21dc210a1..224aabd997df 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1806,8 +1806,6 @@ static void dm_wq_work(struct work_struct *work);
 
 static void dm_init_normal_md_queue(struct mapped_device *md)
 {
-	md->use_blk_mq = false;
-
 	/*
 	 * Initialize aspects of queue that aren't relevant for blk-mq
 	 */
@@ -1818,8 +1816,6 @@ static void cleanup_mapped_device(struct mapped_device *md)
 {
 	if (md->wq)
 		destroy_workqueue(md->wq);
-	if (md->kworker_task)
-		kthread_stop(md->kworker_task);
 	bioset_exit(&md->bs);
 	bioset_exit(&md->io_bs);
 
@@ -1886,7 +1882,6 @@ static struct mapped_device *alloc_dev(int minor)
 		goto bad_io_barrier;
 
 	md->numa_node_id = numa_node_id;
-	md->use_blk_mq = dm_use_blk_mq_default();
 	md->init_tio_pdu = false;
 	md->type = DM_TYPE_NONE;
 	mutex_init(&md->suspend_lock);
@@ -1917,7 +1912,6 @@ static struct mapped_device *alloc_dev(int minor)
 	INIT_WORK(&md->work, dm_wq_work);
 	init_waitqueue_head(&md->eventq);
 	init_completion(&md->kobj_holder.completion);
-	md->kworker_task = NULL;
 
 	md->disk->major = _major;
 	md->disk->first_minor = minor;
@@ -2217,13 +2211,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 
 	switch (type) {
 	case DM_TYPE_REQUEST_BASED:
-		dm_init_normal_md_queue(md);
-		r = dm_old_init_request_queue(md, t);
-		if (r) {
-			DMERR("Cannot initialize queue for request-based mapped device");
-			return r;
-		}
-		break;
 	case DM_TYPE_MQ_REQUEST_BASED:
 		r = dm_mq_init_request_queue(md, t);
 		if (r) {
@@ -2329,9 +2316,6 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
 
 	blk_set_queue_dying(md->queue);
 
-	if (dm_request_based(md) && md->kworker_task)
-		kthread_flush_worker(&md->kworker);
-
 	/*
 	 * Take suspend_lock so that presuspend and postsuspend methods
 	 * do not race with internal suspend.
@@ -2584,11 +2568,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
 	 * Stop md->queue before flushing md->wq in case request-based
 	 * dm defers requests to md->wq from md->queue.
 	 */
-	if (dm_request_based(md)) {
+	if (dm_request_based(md))
 		dm_stop_queue(md->queue);
-		if (md->kworker_task)
-			kthread_flush_worker(&md->kworker);
-	}
 
 	flush_workqueue(md->wq);
 
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 114a81b27c37..2d539b82ec08 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -70,7 +70,6 @@ struct dm_target *dm_table_get_immutable_target(struct dm_table *t);
 struct dm_target *dm_table_get_wildcard_target(struct dm_table *t);
 bool dm_table_bio_based(struct dm_table *t);
 bool dm_table_request_based(struct dm_table *t);
-bool dm_table_all_blk_mq_devices(struct dm_table *t);
 void dm_table_free_md_mempools(struct dm_table *t);
 struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t);
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 10/28] dasd: remove dead code
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (8 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 09/28] dm: remove " Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  6:54   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 11/28] bsg: pass in desired timeout handler Jens Axboe
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Since e443343e509a we haven't had a request_fn attached to this
driver, hence any code guarded by an if (q->request_fn) check is
unreachable.
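
Put differently, on a blk-mq queue ->request_fn is never set, so the
branch removed below can never be taken. A minimal sketch of the
pattern being deleted (names as in the hunk; comment added for
illustration only):

	/* never set once the driver is converted to blk-mq */
	if (block->request_queue->request_fn) {
		/* dead code: only the chanq_len accounting survives,
		 * now done under the queue lock below
		 */
	}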

Fixes: e443343e509a ("s390/dasd: blk-mq conversion")
[sth: Keep and fix the dasd_info->chanq_len counter.]
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/s390/block/dasd_ioctl.c | 22 +++++-----------------
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index 2016e0ed5865..8e26001dc11c 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -412,6 +412,7 @@ static int dasd_ioctl_information(struct dasd_block *block,
 	struct ccw_dev_id dev_id;
 	struct dasd_device *base;
 	struct ccw_device *cdev;
+	struct list_head *l;
 	unsigned long flags;
 	int rc;
 
@@ -462,23 +463,10 @@ static int dasd_ioctl_information(struct dasd_block *block,
 
 	memcpy(dasd_info->type, base->discipline->name, 4);
 
-	if (block->request_queue->request_fn) {
-		struct list_head *l;
-#ifdef DASD_EXTENDED_PROFILING
-		{
-			struct list_head *l;
-			spin_lock_irqsave(&block->lock, flags);
-			list_for_each(l, &block->request_queue->queue_head)
-				dasd_info->req_queue_len++;
-			spin_unlock_irqrestore(&block->lock, flags);
-		}
-#endif				/* DASD_EXTENDED_PROFILING */
-		spin_lock_irqsave(get_ccwdev_lock(base->cdev), flags);
-		list_for_each(l, &base->ccw_queue)
-			dasd_info->chanq_len++;
-		spin_unlock_irqrestore(get_ccwdev_lock(base->cdev),
-				       flags);
-	}
+	spin_lock_irqsave(&block->queue_lock, flags);
+	list_for_each(l, &base->ccw_queue)
+		dasd_info->chanq_len++;
+	spin_unlock_irqrestore(&block->queue_lock, flags);
 
 	rc = 0;
 	if (copy_to_user(argp, dasd_info,
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 11/28] bsg: pass in desired timeout handler
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (9 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 10/28] dasd: remove dead code Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-28 15:53   ` Christoph Hellwig
  2018-10-29  6:55   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 12/28] bsg: provide bsg_remove_queue() helper Jens Axboe
                   ` (18 subsequent siblings)
  29 siblings, 2 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide
  Cc: Jens Axboe, Johannes Thumshirn, Benjamin Block

This will ease the conversion to blk-mq, where we can't set a
timeout handler after queue init.
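
For callers the change is mechanical: the timeout handler is passed to
bsg_setup_queue() instead of being installed afterwards with
blk_queue_rq_timed_out(). A condensed sketch, with identifiers taken
from the hunks below; transports without their own handler pass NULL:

	/* FC host/rport queues keep using fc_bsg_job_timeout */
	q = bsg_setup_queue(dev, bsg_name, fc_bsg_dispatch,
				fc_bsg_job_timeout, i->f->dd_bsg_size);

	/* iSCSI and SAS have no bsg timeout handler */
	q = bsg_setup_queue(dev, bsg_name, iscsi_bsg_host_dispatch, NULL, 0);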

Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bsg-lib.c                     | 3 ++-
 drivers/scsi/scsi_transport_fc.c    | 7 +++----
 drivers/scsi/scsi_transport_iscsi.c | 2 +-
 drivers/scsi/scsi_transport_sas.c   | 4 ++--
 include/linux/bsg-lib.h             | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index f3501cdaf1a6..1da011ec04e6 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -304,7 +304,7 @@ static void bsg_exit_rq(struct request_queue *q, struct request *req)
  * @dd_job_size: size of LLD data needed for each job
  */
 struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
-		bsg_job_fn *job_fn, int dd_job_size)
+		bsg_job_fn *job_fn, rq_timed_out_fn *timeout, int dd_job_size)
 {
 	struct request_queue *q;
 	int ret;
@@ -327,6 +327,7 @@ struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 	blk_queue_flag_set(QUEUE_FLAG_BIDI, q);
 	blk_queue_softirq_done(q, bsg_softirq_done);
 	blk_queue_rq_timeout(q, BLK_DEFAULT_SG_TIMEOUT);
+	blk_queue_rq_timed_out(q, timeout);
 
 	ret = bsg_register_queue(q, dev, name, &bsg_transport_ops);
 	if (ret) {
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 381668fa135d..98aaffb4c715 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3780,7 +3780,8 @@ fc_bsg_hostadd(struct Scsi_Host *shost, struct fc_host_attrs *fc_host)
 	snprintf(bsg_name, sizeof(bsg_name),
 		 "fc_host%d", shost->host_no);
 
-	q = bsg_setup_queue(dev, bsg_name, fc_bsg_dispatch, i->f->dd_bsg_size);
+	q = bsg_setup_queue(dev, bsg_name, fc_bsg_dispatch, fc_bsg_job_timeout,
+				i->f->dd_bsg_size);
 	if (IS_ERR(q)) {
 		dev_err(dev,
 			"fc_host%d: bsg interface failed to initialize - setup queue\n",
@@ -3788,7 +3789,6 @@ fc_bsg_hostadd(struct Scsi_Host *shost, struct fc_host_attrs *fc_host)
 		return PTR_ERR(q);
 	}
 	__scsi_init_queue(shost, q);
-	blk_queue_rq_timed_out(q, fc_bsg_job_timeout);
 	blk_queue_rq_timeout(q, FC_DEFAULT_BSG_TIMEOUT);
 	fc_host->rqst_q = q;
 	return 0;
@@ -3826,14 +3826,13 @@ fc_bsg_rportadd(struct Scsi_Host *shost, struct fc_rport *rport)
 		return -ENOTSUPP;
 
 	q = bsg_setup_queue(dev, dev_name(dev), fc_bsg_dispatch,
-			i->f->dd_bsg_size);
+				fc_bsg_job_timeout, i->f->dd_bsg_size);
 	if (IS_ERR(q)) {
 		dev_err(dev, "failed to setup bsg queue\n");
 		return PTR_ERR(q);
 	}
 	__scsi_init_queue(shost, q);
 	blk_queue_prep_rq(q, fc_bsg_rport_prep);
-	blk_queue_rq_timed_out(q, fc_bsg_job_timeout);
 	blk_queue_rq_timeout(q, BLK_DEFAULT_SG_TIMEOUT);
 	rport->rqst_q = q;
 	return 0;
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 6fd2fe210fc3..26b11a775be9 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1542,7 +1542,7 @@ iscsi_bsg_host_add(struct Scsi_Host *shost, struct iscsi_cls_host *ihost)
 		return -ENOTSUPP;
 
 	snprintf(bsg_name, sizeof(bsg_name), "iscsi_host%d", shost->host_no);
-	q = bsg_setup_queue(dev, bsg_name, iscsi_bsg_host_dispatch, 0);
+	q = bsg_setup_queue(dev, bsg_name, iscsi_bsg_host_dispatch, NULL, 0);
 	if (IS_ERR(q)) {
 		shost_printk(KERN_ERR, shost, "bsg interface failed to "
 			     "initialize - no request queue\n");
diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 0a165b2b3e81..cf6d47891d77 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -198,7 +198,7 @@ static int sas_bsg_initialize(struct Scsi_Host *shost, struct sas_rphy *rphy)
 
 	if (rphy) {
 		q = bsg_setup_queue(&rphy->dev, dev_name(&rphy->dev),
-				sas_smp_dispatch, 0);
+				sas_smp_dispatch, NULL, 0);
 		if (IS_ERR(q))
 			return PTR_ERR(q);
 		rphy->q = q;
@@ -207,7 +207,7 @@ static int sas_bsg_initialize(struct Scsi_Host *shost, struct sas_rphy *rphy)
 
 		snprintf(name, sizeof(name), "sas_host%d", shost->host_no);
 		q = bsg_setup_queue(&shost->shost_gendev, name,
-				sas_smp_dispatch, 0);
+				sas_smp_dispatch, NULL, 0);
 		if (IS_ERR(q))
 			return PTR_ERR(q);
 		to_sas_host_attrs(shost)->q = q;
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index 6aeaf6472665..b13ae143e7ef 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -72,7 +72,7 @@ struct bsg_job {
 void bsg_job_done(struct bsg_job *job, int result,
 		  unsigned int reply_payload_rcv_len);
 struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
-		bsg_job_fn *job_fn, int dd_job_size);
+		bsg_job_fn *job_fn, rq_timed_out_fn *timeout, int dd_job_size);
 void bsg_job_put(struct bsg_job *job);
 int __must_check bsg_job_get(struct bsg_job *job);
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 12/28] bsg: provide bsg_remove_queue() helper
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (10 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 11/28] bsg: pass in desired timeout handler Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-28 15:53   ` Christoph Hellwig
                     ` (2 more replies)
  2018-10-25 21:10 ` [PATCH 13/28] bsg: convert to use blk-mq Jens Axboe
                   ` (17 subsequent siblings)
  29 siblings, 3 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide
  Cc: Jens Axboe, Johannes Thumshirn, Benjamin Block

All drivers do unregister + cleanup, so provide a helper for that.
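
The helper just pairs the two calls; a sketch of it and of the caller
side, mirroring the hunks below:

	void bsg_remove_queue(struct request_queue *q)
	{
		bsg_unregister_queue(q);
		blk_cleanup_queue(q);
	}

	/* callers collapse their open-coded teardown to */
	if (q)
		bsg_remove_queue(q);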

Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bsg-lib.c                     | 7 +++++++
 drivers/scsi/scsi_transport_fc.c    | 6 ++----
 drivers/scsi/scsi_transport_iscsi.c | 7 +++----
 drivers/scsi/scsi_transport_sas.c   | 6 ++----
 include/linux/bsg-lib.h             | 1 +
 5 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index 1da011ec04e6..267f965af77a 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -296,6 +296,13 @@ static void bsg_exit_rq(struct request_queue *q, struct request *req)
 	kfree(job->reply);
 }
 
+void bsg_remove_queue(struct request_queue *q)
+{
+	bsg_unregister_queue(q);
+	blk_cleanup_queue(q);
+}
+EXPORT_SYMBOL_GPL(bsg_remove_queue);
+
 /**
  * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
  * @dev: device to attach bsg device to
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 98aaffb4c715..4d64956bb5d3 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3851,10 +3851,8 @@ fc_bsg_rportadd(struct Scsi_Host *shost, struct fc_rport *rport)
 static void
 fc_bsg_remove(struct request_queue *q)
 {
-	if (q) {
-		bsg_unregister_queue(q);
-		blk_cleanup_queue(q);
-	}
+	if (q)
+		bsg_remove_queue(q);
 }
 
 
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 26b11a775be9..3ead0dba5d8d 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1576,10 +1576,9 @@ static int iscsi_remove_host(struct transport_container *tc,
 	struct Scsi_Host *shost = dev_to_shost(dev);
 	struct iscsi_cls_host *ihost = shost->shost_data;
 
-	if (ihost->bsg_q) {
-		bsg_unregister_queue(ihost->bsg_q);
-		blk_cleanup_queue(ihost->bsg_q);
-	}
+	if (ihost->bsg_q)
+		bsg_remove_queue(ihost->bsg_q);
+
 	return 0;
 }
 
diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index cf6d47891d77..c46d642dc133 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -246,10 +246,8 @@ static int sas_host_remove(struct transport_container *tc, struct device *dev,
 	struct Scsi_Host *shost = dev_to_shost(dev);
 	struct request_queue *q = to_sas_host_attrs(shost)->q;
 
-	if (q) {
-		bsg_unregister_queue(q);
-		blk_cleanup_queue(q);
-	}
+	if (q)
+		bsg_remove_queue(q);
 
 	return 0;
 }
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index b13ae143e7ef..9c9b134b1fa5 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -73,6 +73,7 @@ void bsg_job_done(struct bsg_job *job, int result,
 		  unsigned int reply_payload_rcv_len);
 struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 		bsg_job_fn *job_fn, rq_timed_out_fn *timeout, int dd_job_size);
+void bsg_remove_queue(struct request_queue *q);
 void bsg_job_put(struct bsg_job *job);
 int __must_check bsg_job_get(struct bsg_job *job);
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 13/28] bsg: convert to use blk-mq
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (11 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 12/28] bsg: provide bsg_remove_queue() helper Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-28 16:07   ` Christoph Hellwig
  2018-10-29  6:57   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 14/28] block: remove blk_complete_request() Jens Axboe
                   ` (16 subsequent siblings)
  29 siblings, 2 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide
  Cc: Jens Axboe, Johannes Thumshirn, Benjamin Block

Requires a few changes to the FC transport class as well.
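
At its core the conversion swaps the request_fn based queue for a
single hardware queue blk-mq tag set driven by bsg_mq_ops (queue_rq,
init/exit_request, complete, timeout). A condensed sketch of the new
setup path, taken from the bsg_setup_queue() hunk below with error
unwinding trimmed:

	set = kzalloc(sizeof(*set), GFP_KERNEL);
	set->ops = &bsg_mq_ops;
	set->nr_hw_queues = 1;
	set->queue_depth = 128;
	set->numa_node = NUMA_NO_NODE;
	set->cmd_size = sizeof(struct bsg_job) + dd_job_size;
	set->flags = BLK_MQ_F_NO_SCHED | BLK_MQ_F_BLOCKING;
	if (blk_mq_alloc_tag_set(set))
		goto out_tag_set;

	q = blk_mq_init_queue(set);

	q->queuedata = dev;
	q->bsg_job_fn = job_fn;
	q->rq_timed_out_fn = timeout;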

Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bsg-lib.c                  | 123 +++++++++++++++++++------------
 drivers/scsi/scsi_transport_fc.c |  59 +++++++++------
 2 files changed, 110 insertions(+), 72 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index 267f965af77a..ef176b472914 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -21,7 +21,7 @@
  *
  */
 #include <linux/slab.h>
-#include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/delay.h>
 #include <linux/scatterlist.h>
 #include <linux/bsg-lib.h>
@@ -129,7 +129,7 @@ static void bsg_teardown_job(struct kref *kref)
 	kfree(job->request_payload.sg_list);
 	kfree(job->reply_payload.sg_list);
 
-	blk_end_request_all(rq, BLK_STS_OK);
+	blk_mq_end_request(rq, BLK_STS_OK);
 }
 
 void bsg_job_put(struct bsg_job *job)
@@ -157,15 +157,15 @@ void bsg_job_done(struct bsg_job *job, int result,
 {
 	job->result = result;
 	job->reply_payload_rcv_len = reply_payload_rcv_len;
-	blk_complete_request(blk_mq_rq_from_pdu(job));
+	blk_mq_complete_request(blk_mq_rq_from_pdu(job));
 }
 EXPORT_SYMBOL_GPL(bsg_job_done);
 
 /**
- * bsg_softirq_done - softirq done routine for destroying the bsg requests
+ * bsg_complete - softirq done routine for destroying the bsg requests
  * @rq: BSG request that holds the job to be destroyed
  */
-static void bsg_softirq_done(struct request *rq)
+static void bsg_complete(struct request *rq)
 {
 	struct bsg_job *job = blk_mq_rq_to_pdu(rq);
 
@@ -224,54 +224,46 @@ static bool bsg_prepare_job(struct device *dev, struct request *req)
 }
 
 /**
- * bsg_request_fn - generic handler for bsg requests
- * @q: request queue to manage
+ * bsg_queue_rq - generic handler for bsg requests
+ * @hctx: hardware queue
+ * @bd: queue data
  *
  * On error the create_bsg_job function should return a -Exyz error value
  * that will be set to ->result.
  *
  * Drivers/subsys should pass this to the queue init function.
  */
-static void bsg_request_fn(struct request_queue *q)
-	__releases(q->queue_lock)
-	__acquires(q->queue_lock)
+static blk_status_t bsg_queue_rq(struct blk_mq_hw_ctx *hctx,
+				 const struct blk_mq_queue_data *bd)
 {
+	struct request_queue *q = hctx->queue;
 	struct device *dev = q->queuedata;
-	struct request *req;
+	struct request *req = bd->rq;
 	int ret;
 
+	blk_mq_start_request(req);
+
 	if (!get_device(dev))
-		return;
-
-	while (1) {
-		req = blk_fetch_request(q);
-		if (!req)
-			break;
-		spin_unlock_irq(q->queue_lock);
-
-		if (!bsg_prepare_job(dev, req)) {
-			blk_end_request_all(req, BLK_STS_OK);
-			spin_lock_irq(q->queue_lock);
-			continue;
-		}
-
-		ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
-		spin_lock_irq(q->queue_lock);
-		if (ret)
-			break;
-	}
+		return BLK_STS_IOERR;
+
+	if (!bsg_prepare_job(dev, req))
+		return BLK_STS_IOERR;
+
+	ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
+	if (ret)
+		return BLK_STS_IOERR;
 
-	spin_unlock_irq(q->queue_lock);
 	put_device(dev);
-	spin_lock_irq(q->queue_lock);
+	return BLK_STS_OK;
 }
 
 /* called right after the request is allocated for the request_queue */
-static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
+static int bsg_init_rq(struct blk_mq_tag_set *set, struct request *req,
+		       unsigned int hctx_idx, unsigned int numa_node)
 {
 	struct bsg_job *job = blk_mq_rq_to_pdu(req);
 
-	job->reply = kzalloc(SCSI_SENSE_BUFFERSIZE, gfp);
+	job->reply = kzalloc(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL);
 	if (!job->reply)
 		return -ENOMEM;
 	return 0;
@@ -289,7 +281,8 @@ static void bsg_initialize_rq(struct request *req)
 	job->dd_data = job + 1;
 }
 
-static void bsg_exit_rq(struct request_queue *q, struct request *req)
+static void bsg_exit_rq(struct blk_mq_tag_set *set, struct request *req,
+		       unsigned int hctx_idx)
 {
 	struct bsg_job *job = blk_mq_rq_to_pdu(req);
 
@@ -298,11 +291,35 @@ static void bsg_exit_rq(struct request_queue *q, struct request *req)
 
 void bsg_remove_queue(struct request_queue *q)
 {
+	struct blk_mq_tag_set *set = q->tag_set;
+
 	bsg_unregister_queue(q);
 	blk_cleanup_queue(q);
+	blk_mq_free_tag_set(set);
+	kfree(set);
 }
 EXPORT_SYMBOL_GPL(bsg_remove_queue);
 
+static enum blk_eh_timer_return bsg_timeout(struct request *rq, bool reserved)
+{
+	enum blk_eh_timer_return ret = BLK_EH_DONE;
+	struct request_queue *q = rq->q;
+
+	if (q->rq_timed_out_fn)
+		ret = q->rq_timed_out_fn(rq);
+
+	return ret;
+}
+
+static const struct blk_mq_ops bsg_mq_ops = {
+	.queue_rq		= bsg_queue_rq,
+	.init_request		= bsg_init_rq,
+	.exit_request		= bsg_exit_rq,
+	.initialize_rq_fn	= bsg_initialize_rq,
+	.complete		= bsg_complete,
+	.timeout		= bsg_timeout,
+};
+
 /**
  * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
  * @dev: device to attach bsg device to
@@ -313,28 +330,34 @@ EXPORT_SYMBOL_GPL(bsg_remove_queue);
 struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 		bsg_job_fn *job_fn, rq_timed_out_fn *timeout, int dd_job_size)
 {
+	struct blk_mq_tag_set *set;
 	struct request_queue *q;
-	int ret;
+	int ret = -ENOMEM;
 
-	q = blk_alloc_queue(GFP_KERNEL);
-	if (!q)
+	set = kzalloc(sizeof(*set), GFP_KERNEL);
+	if (!set)
 		return ERR_PTR(-ENOMEM);
-	q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
-	q->init_rq_fn = bsg_init_rq;
-	q->exit_rq_fn = bsg_exit_rq;
-	q->initialize_rq_fn = bsg_initialize_rq;
-	q->request_fn = bsg_request_fn;
 
-	ret = blk_init_allocated_queue(q);
-	if (ret)
-		goto out_cleanup_queue;
+	set->ops = &bsg_mq_ops,
+	set->nr_hw_queues = 1;
+	set->queue_depth = 128;
+	set->numa_node = NUMA_NO_NODE;
+	set->cmd_size = sizeof(struct bsg_job) + dd_job_size;
+	set->flags = BLK_MQ_F_NO_SCHED | BLK_MQ_F_BLOCKING;
+	if (blk_mq_alloc_tag_set(set))
+		goto out_tag_set;
+
+	q = blk_mq_init_queue(set);
+	if (IS_ERR(q)) {
+		ret = PTR_ERR(q);
+		goto out_queue;
+	}
 
 	q->queuedata = dev;
 	q->bsg_job_fn = job_fn;
 	blk_queue_flag_set(QUEUE_FLAG_BIDI, q);
-	blk_queue_softirq_done(q, bsg_softirq_done);
 	blk_queue_rq_timeout(q, BLK_DEFAULT_SG_TIMEOUT);
-	blk_queue_rq_timed_out(q, timeout);
+	q->rq_timed_out_fn = timeout;
 
 	ret = bsg_register_queue(q, dev, name, &bsg_transport_ops);
 	if (ret) {
@@ -346,6 +369,10 @@ struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 	return q;
 out_cleanup_queue:
 	blk_cleanup_queue(q);
+out_queue:
+	blk_mq_free_tag_set(set);
+out_tag_set:
+	kfree(set);
 	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL_GPL(bsg_setup_queue);
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 4d64956bb5d3..0452fe718853 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3592,7 +3592,7 @@ fc_bsg_job_timeout(struct request *req)
 
 	/* the blk_end_sync_io() doesn't check the error */
 	if (inflight)
-		__blk_complete_request(req);
+		blk_mq_end_request(req, BLK_STS_IOERR);
 	return BLK_EH_DONE;
 }
 
@@ -3684,14 +3684,9 @@ static void
 fc_bsg_goose_queue(struct fc_rport *rport)
 {
 	struct request_queue *q = rport->rqst_q;
-	unsigned long flags;
-
-	if (!q)
-		return;
 
-	spin_lock_irqsave(q->queue_lock, flags);
-	blk_run_queue_async(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	if (q)
+		blk_mq_run_hw_queues(q, true);
 }
 
 /**
@@ -3759,6 +3754,37 @@ static int fc_bsg_dispatch(struct bsg_job *job)
 		return fc_bsg_host_dispatch(shost, job);
 }
 
+static blk_status_t fc_bsg_rport_prep(struct fc_rport *rport)
+{
+	if (rport->port_state == FC_PORTSTATE_BLOCKED &&
+	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
+		return BLK_STS_RESOURCE;
+
+	if (rport->port_state != FC_PORTSTATE_ONLINE)
+		return BLK_STS_IOERR;
+
+	return BLK_STS_OK;
+}
+
+
+static int fc_bsg_dispatch_prep(struct bsg_job *job)
+{
+	struct fc_rport *rport = fc_bsg_to_rport(job);
+	blk_status_t ret;
+
+	ret = fc_bsg_rport_prep(rport);
+	switch (ret) {
+	case BLK_STS_OK:
+		break;
+	case BLK_STS_RESOURCE:
+		return -EAGAIN;
+	default:
+		return -EIO;
+	}
+
+	return fc_bsg_dispatch(job);
+}
+
 /**
  * fc_bsg_hostadd - Create and add the bsg hooks so we can receive requests
  * @shost:	shost for fc_host
@@ -3794,20 +3820,6 @@ fc_bsg_hostadd(struct Scsi_Host *shost, struct fc_host_attrs *fc_host)
 	return 0;
 }
 
-static int fc_bsg_rport_prep(struct request_queue *q, struct request *req)
-{
-	struct fc_rport *rport = dev_to_rport(q->queuedata);
-
-	if (rport->port_state == FC_PORTSTATE_BLOCKED &&
-	    !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT))
-		return BLKPREP_DEFER;
-
-	if (rport->port_state != FC_PORTSTATE_ONLINE)
-		return BLKPREP_KILL;
-
-	return BLKPREP_OK;
-}
-
 /**
  * fc_bsg_rportadd - Create and add the bsg hooks so we can receive requests
  * @shost:	shost that rport is attached to
@@ -3825,14 +3837,13 @@ fc_bsg_rportadd(struct Scsi_Host *shost, struct fc_rport *rport)
 	if (!i->f->bsg_request)
 		return -ENOTSUPP;
 
-	q = bsg_setup_queue(dev, dev_name(dev), fc_bsg_dispatch,
+	q = bsg_setup_queue(dev, dev_name(dev), fc_bsg_dispatch_prep,
 				fc_bsg_job_timeout, i->f->dd_bsg_size);
 	if (IS_ERR(q)) {
 		dev_err(dev, "failed to setup bsg queue\n");
 		return PTR_ERR(q);
 	}
 	__scsi_init_queue(shost, q);
-	blk_queue_prep_rq(q, fc_bsg_rport_prep);
 	blk_queue_rq_timeout(q, BLK_DEFAULT_SG_TIMEOUT);
 	rport->rqst_q = q;
 	return 0;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 14/28] block: remove blk_complete_request()
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (12 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 13/28] bsg: convert to use blk-mq Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  6:59   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 15/28] blk-wbt: kill check for legacy queue type Jens Axboe
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

It's now unused.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-softirq.c    | 20 --------------------
 include/linux/blkdev.h |  1 -
 2 files changed, 21 deletions(-)

diff --git a/block/blk-softirq.c b/block/blk-softirq.c
index e47a2f751884..8ca0f6caf174 100644
--- a/block/blk-softirq.c
+++ b/block/blk-softirq.c
@@ -145,26 +145,6 @@ void __blk_complete_request(struct request *req)
 }
 EXPORT_SYMBOL(__blk_complete_request);
 
-/**
- * blk_complete_request - end I/O on a request
- * @req:      the request being processed
- *
- * Description:
- *     Ends all I/O on a request. It does not handle partial completions,
- *     unless the driver actually implements this in its completion callback
- *     through requeueing. The actual completion happens out-of-order,
- *     through a softirq handler. The user must have registered a completion
- *     callback through blk_queue_softirq_done().
- **/
-void blk_complete_request(struct request *req)
-{
-	if (unlikely(blk_should_fake_timeout(req->q)))
-		return;
-	if (!blk_mark_rq_complete(req))
-		__blk_complete_request(req);
-}
-EXPORT_SYMBOL(blk_complete_request);
-
 static __init int blk_softirq_init(void)
 {
 	int i;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4293dc1cd160..9ff9ab6fc1fe 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1205,7 +1205,6 @@ extern bool __blk_end_request(struct request *rq, blk_status_t error,
 extern void __blk_end_request_all(struct request *rq, blk_status_t error);
 extern bool __blk_end_request_cur(struct request *rq, blk_status_t error);
 
-extern void blk_complete_request(struct request *);
 extern void __blk_complete_request(struct request *);
 extern void blk_abort_request(struct request *);
 extern void blk_unprep_request(struct request *);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 15/28] blk-wbt: kill check for legacy queue type
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (13 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 14/28] block: remove blk_complete_request() Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  6:59   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 16/28] blk-cgroup: remove legacy queue bypassing Jens Axboe
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Everything is blk-mq at this point, so it no longer makes sense to
offer this option; it does nothing.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/Kconfig   | 6 ------
 block/blk-wbt.c | 3 +--
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/block/Kconfig b/block/Kconfig
index f7045aa47edb..8044452a4fd3 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -155,12 +155,6 @@ config BLK_CGROUP_IOLATENCY
 
 	Note, this is an experimental interface and could be changed someday.
 
-config BLK_WBT_SQ
-	bool "Single queue writeback throttling"
-	depends on BLK_WBT
-	---help---
-	Enable writeback throttling by default on legacy single queue devices
-
 config BLK_WBT_MQ
 	bool "Multiqueue writeback throttling"
 	default y
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 8ac93fcbaa2e..49fac89a981c 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -709,8 +709,7 @@ void wbt_enable_default(struct request_queue *q)
 	if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
 		return;
 
-	if ((q->mq_ops && IS_ENABLED(CONFIG_BLK_WBT_MQ)) ||
-	    (q->request_fn && IS_ENABLED(CONFIG_BLK_WBT_SQ)))
+	if (IS_ENABLED(CONFIG_BLK_WBT_MQ))
 		wbt_init(q);
 }
 EXPORT_SYMBOL_GPL(wbt_enable_default);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 16/28] blk-cgroup: remove legacy queue bypassing
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (14 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 15/28] blk-wbt: kill check for legacy queue type Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:00   ` Hannes Reinecke
  2018-10-29 11:00   ` Johannes Thumshirn
  2018-10-25 21:10 ` [PATCH 17/28] block: remove legacy rq tagging Jens Axboe
                   ` (13 subsequent siblings)
  29 siblings, 2 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

We only support mq devices now.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-cgroup.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 992da5592c6e..5f10d755ec52 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1446,8 +1446,6 @@ int blkcg_activate_policy(struct request_queue *q,
 
 	if (q->mq_ops)
 		blk_mq_freeze_queue(q);
-	else
-		blk_queue_bypass_start(q);
 pd_prealloc:
 	if (!pd_prealloc) {
 		pd_prealloc = pol->pd_alloc_fn(GFP_KERNEL, q->node);
@@ -1487,8 +1485,6 @@ int blkcg_activate_policy(struct request_queue *q,
 out_bypass_end:
 	if (q->mq_ops)
 		blk_mq_unfreeze_queue(q);
-	else
-		blk_queue_bypass_end(q);
 	if (pd_prealloc)
 		pol->pd_free_fn(pd_prealloc);
 	return ret;
@@ -1513,8 +1509,6 @@ void blkcg_deactivate_policy(struct request_queue *q,
 
 	if (q->mq_ops)
 		blk_mq_freeze_queue(q);
-	else
-		blk_queue_bypass_start(q);
 
 	spin_lock_irq(q->queue_lock);
 
@@ -1533,8 +1527,6 @@ void blkcg_deactivate_policy(struct request_queue *q,
 
 	if (q->mq_ops)
 		blk_mq_unfreeze_queue(q);
-	else
-		blk_queue_bypass_end(q);
 }
 EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 17/28] block: remove legacy rq tagging
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (15 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 16/28] blk-cgroup: remove legacy queue bypassing Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:01   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 18/28] block: remove non mq parts from the flush code Jens Axboe
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

It's now unused, kill it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 Documentation/block/biodoc.txt |  88 --------
 block/Makefile                 |   2 +-
 block/blk-core.c               |   6 -
 block/blk-mq-debugfs.c         |   2 -
 block/blk-mq-tag.c             |   6 +-
 block/blk-sysfs.c              |   3 -
 block/blk-tag.c                | 378 ---------------------------------
 include/linux/blkdev.h         |  35 ---
 8 files changed, 3 insertions(+), 517 deletions(-)
 delete mode 100644 block/blk-tag.c

diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index 207eca58efaa..ac18b488cb5e 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -65,7 +65,6 @@ Description of Contents:
     3.2.3 I/O completion
     3.2.4 Implications for drivers that do not interpret bios (don't handle
  	  multiple segments)
-    3.2.5 Request command tagging
   3.3 I/O submission
 4. The I/O scheduler
 5. Scalability related changes
@@ -708,93 +707,6 @@ is crossed on completion of a transfer. (The end*request* functions should
 be used if only if the request has come down from block/bio path, not for
 direct access requests which only specify rq->buffer without a valid rq->bio)
 
-3.2.5 Generic request command tagging
-
-3.2.5.1 Tag helpers
-
-Block now offers some simple generic functionality to help support command
-queueing (typically known as tagged command queueing), ie manage more than
-one outstanding command on a queue at any given time.
-
-	blk_queue_init_tags(struct request_queue *q, int depth)
-
-	Initialize internal command tagging structures for a maximum
-	depth of 'depth'.
-
-	blk_queue_free_tags((struct request_queue *q)
-
-	Teardown tag info associated with the queue. This will be done
-	automatically by block if blk_queue_cleanup() is called on a queue
-	that is using tagging.
-
-The above are initialization and exit management, the main helpers during
-normal operations are:
-
-	blk_queue_start_tag(struct request_queue *q, struct request *rq)
-
-	Start tagged operation for this request. A free tag number between
-	0 and 'depth' is assigned to the request (rq->tag holds this number),
-	and 'rq' is added to the internal tag management. If the maximum depth
-	for this queue is already achieved (or if the tag wasn't started for
-	some other reason), 1 is returned. Otherwise 0 is returned.
-
-	blk_queue_end_tag(struct request_queue *q, struct request *rq)
-
-	End tagged operation on this request. 'rq' is removed from the internal
-	book keeping structures.
-
-To minimize struct request and queue overhead, the tag helpers utilize some
-of the same request members that are used for normal request queue management.
-This means that a request cannot both be an active tag and be on the queue
-list at the same time. blk_queue_start_tag() will remove the request, but
-the driver must remember to call blk_queue_end_tag() before signalling
-completion of the request to the block layer. This means ending tag
-operations before calling end_that_request_last()! For an example of a user
-of these helpers, see the IDE tagged command queueing support.
-
-3.2.5.2 Tag info
-
-Some block functions exist to query current tag status or to go from a
-tag number to the associated request. These are, in no particular order:
-
-	blk_queue_tagged(q)
-
-	Returns 1 if the queue 'q' is using tagging, 0 if not.
-
-	blk_queue_tag_request(q, tag)
-
-	Returns a pointer to the request associated with tag 'tag'.
-
-	blk_queue_tag_depth(q)
-	
-	Return current queue depth.
-
-	blk_queue_tag_queue(q)
-
-	Returns 1 if the queue can accept a new queued command, 0 if we are
-	at the maximum depth already.
-
-	blk_queue_rq_tagged(rq)
-
-	Returns 1 if the request 'rq' is tagged.
-
-3.2.5.2 Internal structure
-
-Internally, block manages tags in the blk_queue_tag structure:
-
-	struct blk_queue_tag {
-		struct request **tag_index;	/* array or pointers to rq */
-		unsigned long *tag_map;		/* bitmap of free tags */
-		struct list_head busy_list;	/* fifo list of busy tags */
-		int busy;			/* queue depth */
-		int max_depth;			/* max queue depth */
-	};
-
-Most of the above is simple and straight forward, however busy_list may need
-a bit of explaining. Normally we don't care too much about request ordering,
-but in the event of any barrier requests in the tag queue we need to ensure
-that requests are restarted in the order they were queue.
-
 3.3 I/O Submission
 
 The routine submit_bio() is used to submit a single io. Higher level i/o
diff --git a/block/Makefile b/block/Makefile
index 27eac600474f..213674c8faaa 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -3,7 +3,7 @@
 # Makefile for the kernel block layer
 #
 
-obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o \
+obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-sysfs.o \
 			blk-flush.o blk-settings.o blk-ioc.o blk-map.o \
 			blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \
 			blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \
diff --git a/block/blk-core.c b/block/blk-core.c
index bc6ea87d10e0..1a814bce323d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1655,9 +1655,6 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
 	trace_block_rq_requeue(q, rq);
 	rq_qos_requeue(q, rq);
 
-	if (rq->rq_flags & RQF_QUEUED)
-		blk_queue_end_tag(q, rq);
-
 	BUG_ON(blk_queued_rq(rq));
 
 	elv_requeue_request(q, rq);
@@ -3172,9 +3169,6 @@ void blk_finish_request(struct request *req, blk_status_t error)
 	if (req->rq_flags & RQF_STATS)
 		blk_stat_add(req, now);
 
-	if (req->rq_flags & RQF_QUEUED)
-		blk_queue_end_tag(q, req);
-
 	BUG_ON(blk_queued_rq(req));
 
 	if (unlikely(laptop_mode) && !blk_rq_is_passthrough(req))
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 10b284a1f18d..9ed43a7c70b5 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -112,7 +112,6 @@ static int queue_pm_only_show(void *data, struct seq_file *m)
 
 #define QUEUE_FLAG_NAME(name) [QUEUE_FLAG_##name] = #name
 static const char *const blk_queue_flag_name[] = {
-	QUEUE_FLAG_NAME(QUEUED),
 	QUEUE_FLAG_NAME(STOPPED),
 	QUEUE_FLAG_NAME(DYING),
 	QUEUE_FLAG_NAME(BYPASS),
@@ -318,7 +317,6 @@ static const char *const cmd_flag_name[] = {
 static const char *const rqf_name[] = {
 	RQF_NAME(SORTED),
 	RQF_NAME(STARTED),
-	RQF_NAME(QUEUED),
 	RQF_NAME(SOFTBARRIER),
 	RQF_NAME(FLUSH_SEQ),
 	RQF_NAME(MIXED_MERGE),
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index cfda95b85d34..4254e74c1446 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -530,10 +530,8 @@ u32 blk_mq_unique_tag(struct request *rq)
 	struct blk_mq_hw_ctx *hctx;
 	int hwq = 0;
 
-	if (q->mq_ops) {
-		hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
-		hwq = hctx->queue_num;
-	}
+	hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
+	hwq = hctx->queue_num;
 
 	return (hwq << BLK_MQ_UNIQUE_TAG_BITS) |
 		(rq->tag & BLK_MQ_UNIQUE_TAG_MASK);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 0641533597f1..e4fc3bd9c32e 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -849,9 +849,6 @@ static void __blk_release_queue(struct work_struct *work)
 
 	blk_exit_rl(q, &q->root_rl);
 
-	if (q->queue_tags)
-		__blk_queue_free_tags(q);
-
 	blk_queue_free_zone_bitmaps(q);
 
 	if (!q->mq_ops) {
diff --git a/block/blk-tag.c b/block/blk-tag.c
deleted file mode 100644
index fbc153aef166..000000000000
--- a/block/blk-tag.c
+++ /dev/null
@@ -1,378 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Functions related to tagged command queuing
- */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/bio.h>
-#include <linux/blkdev.h>
-#include <linux/slab.h>
-
-#include "blk.h"
-
-/**
- * blk_queue_find_tag - find a request by its tag and queue
- * @q:	 The request queue for the device
- * @tag: The tag of the request
- *
- * Notes:
- *    Should be used when a device returns a tag and you want to match
- *    it with a request.
- *
- *    no locks need be held.
- **/
-struct request *blk_queue_find_tag(struct request_queue *q, int tag)
-{
-	return blk_map_queue_find_tag(q->queue_tags, tag);
-}
-EXPORT_SYMBOL(blk_queue_find_tag);
-
-/**
- * blk_free_tags - release a given set of tag maintenance info
- * @bqt:	the tag map to free
- *
- * Drop the reference count on @bqt and frees it when the last reference
- * is dropped.
- */
-void blk_free_tags(struct blk_queue_tag *bqt)
-{
-	if (atomic_dec_and_test(&bqt->refcnt)) {
-		BUG_ON(find_first_bit(bqt->tag_map, bqt->max_depth) <
-							bqt->max_depth);
-
-		kfree(bqt->tag_index);
-		bqt->tag_index = NULL;
-
-		kfree(bqt->tag_map);
-		bqt->tag_map = NULL;
-
-		kfree(bqt);
-	}
-}
-EXPORT_SYMBOL(blk_free_tags);
-
-/**
- * __blk_queue_free_tags - release tag maintenance info
- * @q:  the request queue for the device
- *
- *  Notes:
- *    blk_cleanup_queue() will take care of calling this function, if tagging
- *    has been used. So there's no need to call this directly.
- **/
-void __blk_queue_free_tags(struct request_queue *q)
-{
-	struct blk_queue_tag *bqt = q->queue_tags;
-
-	if (!bqt)
-		return;
-
-	blk_free_tags(bqt);
-
-	q->queue_tags = NULL;
-	queue_flag_clear_unlocked(QUEUE_FLAG_QUEUED, q);
-}
-
-/**
- * blk_queue_free_tags - release tag maintenance info
- * @q:  the request queue for the device
- *
- *  Notes:
- *	This is used to disable tagged queuing to a device, yet leave
- *	queue in function.
- **/
-void blk_queue_free_tags(struct request_queue *q)
-{
-	queue_flag_clear_unlocked(QUEUE_FLAG_QUEUED, q);
-}
-EXPORT_SYMBOL(blk_queue_free_tags);
-
-static int
-init_tag_map(struct request_queue *q, struct blk_queue_tag *tags, int depth)
-{
-	struct request **tag_index;
-	unsigned long *tag_map;
-	int nr_ulongs;
-
-	if (q && depth > q->nr_requests * 2) {
-		depth = q->nr_requests * 2;
-		printk(KERN_ERR "%s: adjusted depth to %d\n",
-		       __func__, depth);
-	}
-
-	tag_index = kcalloc(depth, sizeof(struct request *), GFP_ATOMIC);
-	if (!tag_index)
-		goto fail;
-
-	nr_ulongs = ALIGN(depth, BITS_PER_LONG) / BITS_PER_LONG;
-	tag_map = kcalloc(nr_ulongs, sizeof(unsigned long), GFP_ATOMIC);
-	if (!tag_map)
-		goto fail;
-
-	tags->real_max_depth = depth;
-	tags->max_depth = depth;
-	tags->tag_index = tag_index;
-	tags->tag_map = tag_map;
-
-	return 0;
-fail:
-	kfree(tag_index);
-	return -ENOMEM;
-}
-
-static struct blk_queue_tag *__blk_queue_init_tags(struct request_queue *q,
-						int depth, int alloc_policy)
-{
-	struct blk_queue_tag *tags;
-
-	tags = kmalloc(sizeof(struct blk_queue_tag), GFP_ATOMIC);
-	if (!tags)
-		goto fail;
-
-	if (init_tag_map(q, tags, depth))
-		goto fail;
-
-	atomic_set(&tags->refcnt, 1);
-	tags->alloc_policy = alloc_policy;
-	tags->next_tag = 0;
-	return tags;
-fail:
-	kfree(tags);
-	return NULL;
-}
-
-/**
- * blk_init_tags - initialize the tag info for an external tag map
- * @depth:	the maximum queue depth supported
- * @alloc_policy: tag allocation policy
- **/
-struct blk_queue_tag *blk_init_tags(int depth, int alloc_policy)
-{
-	return __blk_queue_init_tags(NULL, depth, alloc_policy);
-}
-EXPORT_SYMBOL(blk_init_tags);
-
-/**
- * blk_queue_init_tags - initialize the queue tag info
- * @q:  the request queue for the device
- * @depth:  the maximum queue depth supported
- * @tags: the tag to use
- * @alloc_policy: tag allocation policy
- *
- * Queue lock must be held here if the function is called to resize an
- * existing map.
- **/
-int blk_queue_init_tags(struct request_queue *q, int depth,
-			struct blk_queue_tag *tags, int alloc_policy)
-{
-	int rc;
-
-	BUG_ON(tags && q->queue_tags && tags != q->queue_tags);
-
-	if (!tags && !q->queue_tags) {
-		tags = __blk_queue_init_tags(q, depth, alloc_policy);
-
-		if (!tags)
-			return -ENOMEM;
-
-	} else if (q->queue_tags) {
-		rc = blk_queue_resize_tags(q, depth);
-		if (rc)
-			return rc;
-		queue_flag_set(QUEUE_FLAG_QUEUED, q);
-		return 0;
-	} else
-		atomic_inc(&tags->refcnt);
-
-	/*
-	 * assign it, all done
-	 */
-	q->queue_tags = tags;
-	queue_flag_set_unlocked(QUEUE_FLAG_QUEUED, q);
-	return 0;
-}
-EXPORT_SYMBOL(blk_queue_init_tags);
-
-/**
- * blk_queue_resize_tags - change the queueing depth
- * @q:  the request queue for the device
- * @new_depth: the new max command queueing depth
- *
- *  Notes:
- *    Must be called with the queue lock held.
- **/
-int blk_queue_resize_tags(struct request_queue *q, int new_depth)
-{
-	struct blk_queue_tag *bqt = q->queue_tags;
-	struct request **tag_index;
-	unsigned long *tag_map;
-	int max_depth, nr_ulongs;
-
-	if (!bqt)
-		return -ENXIO;
-
-	/*
-	 * if we already have large enough real_max_depth.  just
-	 * adjust max_depth.  *NOTE* as requests with tag value
-	 * between new_depth and real_max_depth can be in-flight, tag
-	 * map can not be shrunk blindly here.
-	 */
-	if (new_depth <= bqt->real_max_depth) {
-		bqt->max_depth = new_depth;
-		return 0;
-	}
-
-	/*
-	 * Currently cannot replace a shared tag map with a new
-	 * one, so error out if this is the case
-	 */
-	if (atomic_read(&bqt->refcnt) != 1)
-		return -EBUSY;
-
-	/*
-	 * save the old state info, so we can copy it back
-	 */
-	tag_index = bqt->tag_index;
-	tag_map = bqt->tag_map;
-	max_depth = bqt->real_max_depth;
-
-	if (init_tag_map(q, bqt, new_depth))
-		return -ENOMEM;
-
-	memcpy(bqt->tag_index, tag_index, max_depth * sizeof(struct request *));
-	nr_ulongs = ALIGN(max_depth, BITS_PER_LONG) / BITS_PER_LONG;
-	memcpy(bqt->tag_map, tag_map, nr_ulongs * sizeof(unsigned long));
-
-	kfree(tag_index);
-	kfree(tag_map);
-	return 0;
-}
-EXPORT_SYMBOL(blk_queue_resize_tags);
-
-/**
- * blk_queue_end_tag - end tag operations for a request
- * @q:  the request queue for the device
- * @rq: the request that has completed
- *
- *  Description:
- *    Typically called when end_that_request_first() returns %0, meaning
- *    all transfers have been done for a request. It's important to call
- *    this function before end_that_request_last(), as that will put the
- *    request back on the free list thus corrupting the internal tag list.
- **/
-void blk_queue_end_tag(struct request_queue *q, struct request *rq)
-{
-	struct blk_queue_tag *bqt = q->queue_tags;
-	unsigned tag = rq->tag; /* negative tags invalid */
-
-	lockdep_assert_held(q->queue_lock);
-
-	BUG_ON(tag >= bqt->real_max_depth);
-
-	list_del_init(&rq->queuelist);
-	rq->rq_flags &= ~RQF_QUEUED;
-	rq->tag = -1;
-	rq->internal_tag = -1;
-
-	if (unlikely(bqt->tag_index[tag] == NULL))
-		printk(KERN_ERR "%s: tag %d is missing\n",
-		       __func__, tag);
-
-	bqt->tag_index[tag] = NULL;
-
-	if (unlikely(!test_bit(tag, bqt->tag_map))) {
-		printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
-		       __func__, tag);
-		return;
-	}
-	/*
-	 * The tag_map bit acts as a lock for tag_index[bit], so we need
-	 * unlock memory barrier semantics.
-	 */
-	clear_bit_unlock(tag, bqt->tag_map);
-}
-
-/**
- * blk_queue_start_tag - find a free tag and assign it
- * @q:  the request queue for the device
- * @rq:  the block request that needs tagging
- *
- *  Description:
- *    This can either be used as a stand-alone helper, or possibly be
- *    assigned as the queue &prep_rq_fn (in which case &struct request
- *    automagically gets a tag assigned). Note that this function
- *    assumes that any type of request can be queued! if this is not
- *    true for your device, you must check the request type before
- *    calling this function.  The request will also be removed from
- *    the request queue, so it's the drivers responsibility to readd
- *    it if it should need to be restarted for some reason.
- **/
-int blk_queue_start_tag(struct request_queue *q, struct request *rq)
-{
-	struct blk_queue_tag *bqt = q->queue_tags;
-	unsigned max_depth;
-	int tag;
-
-	lockdep_assert_held(q->queue_lock);
-
-	if (unlikely((rq->rq_flags & RQF_QUEUED))) {
-		printk(KERN_ERR
-		       "%s: request %p for device [%s] already tagged %d",
-		       __func__, rq,
-		       rq->rq_disk ? rq->rq_disk->disk_name : "?", rq->tag);
-		BUG();
-	}
-
-	/*
-	 * Protect against shared tag maps, as we may not have exclusive
-	 * access to the tag map.
-	 *
-	 * We reserve a few tags just for sync IO, since we don't want
-	 * to starve sync IO on behalf of flooding async IO.
-	 */
-	max_depth = bqt->max_depth;
-	if (!rq_is_sync(rq) && max_depth > 1) {
-		switch (max_depth) {
-		case 2:
-			max_depth = 1;
-			break;
-		case 3:
-			max_depth = 2;
-			break;
-		default:
-			max_depth -= 2;
-		}
-		if (q->in_flight[BLK_RW_ASYNC] > max_depth)
-			return 1;
-	}
-
-	do {
-		if (bqt->alloc_policy == BLK_TAG_ALLOC_FIFO) {
-			tag = find_first_zero_bit(bqt->tag_map, max_depth);
-			if (tag >= max_depth)
-				return 1;
-		} else {
-			int start = bqt->next_tag;
-			int size = min_t(int, bqt->max_depth, max_depth + start);
-			tag = find_next_zero_bit(bqt->tag_map, size, start);
-			if (tag >= size && start + size > bqt->max_depth) {
-				size = start + size - bqt->max_depth;
-				tag = find_first_zero_bit(bqt->tag_map, size);
-			}
-			if (tag >= size)
-				return 1;
-		}
-
-	} while (test_and_set_bit_lock(tag, bqt->tag_map));
-	/*
-	 * We need lock ordering semantics given by test_and_set_bit_lock.
-	 * See blk_queue_end_tag for details.
-	 */
-
-	bqt->next_tag = (tag + 1) % bqt->max_depth;
-	rq->rq_flags |= RQF_QUEUED;
-	rq->tag = tag;
-	bqt->tag_index[tag] = rq;
-	blk_start_request(rq);
-	return 0;
-}
-EXPORT_SYMBOL(blk_queue_start_tag);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 9ff9ab6fc1fe..2e683b1f1ee1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -85,8 +85,6 @@ typedef __u32 __bitwise req_flags_t;
 #define RQF_SORTED		((__force req_flags_t)(1 << 0))
 /* drive already may have started this one */
 #define RQF_STARTED		((__force req_flags_t)(1 << 1))
-/* uses tagged queueing */
-#define RQF_QUEUED		((__force req_flags_t)(1 << 2))
 /* may not be passed by ioscheduler */
 #define RQF_SOFTBARRIER		((__force req_flags_t)(1 << 3))
 /* request for flush sequence */
@@ -337,15 +335,6 @@ enum blk_queue_state {
 	Queue_up,
 };
 
-struct blk_queue_tag {
-	struct request **tag_index;	/* map of busy tags */
-	unsigned long *tag_map;		/* bit map of free/busy tags */
-	int max_depth;			/* what we will send to device */
-	int real_max_depth;		/* what the array can hold */
-	atomic_t refcnt;		/* map can be shared */
-	int alloc_policy;		/* tag allocation policy */
-	int next_tag;			/* next tag */
-};
 #define BLK_TAG_ALLOC_FIFO 0 /* allocate starting from 0 */
 #define BLK_TAG_ALLOC_RR 1 /* allocate starting from last allocated tag */
 
@@ -570,8 +559,6 @@ struct request_queue {
 	unsigned int		dma_pad_mask;
 	unsigned int		dma_alignment;
 
-	struct blk_queue_tag	*queue_tags;
-
 	unsigned int		nr_sorted;
 	unsigned int		in_flight[2];
 
@@ -682,7 +669,6 @@ struct request_queue {
 	u64			write_hints[BLK_MAX_WRITE_HINTS];
 };
 
-#define QUEUE_FLAG_QUEUED	0	/* uses generic tag queueing */
 #define QUEUE_FLAG_STOPPED	1	/* queue is stopped */
 #define QUEUE_FLAG_DYING	2	/* queue being torn down */
 #define QUEUE_FLAG_BYPASS	3	/* act as dumb FIFO queue */
@@ -726,7 +712,6 @@ void blk_queue_flag_clear(unsigned int flag, struct request_queue *q);
 bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 bool blk_queue_flag_test_and_clear(unsigned int flag, struct request_queue *q);
 
-#define blk_queue_tagged(q)	test_bit(QUEUE_FLAG_QUEUED, &(q)->queue_flags)
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
 #define blk_queue_dying(q)	test_bit(QUEUE_FLAG_DYING, &(q)->queue_flags)
 #define blk_queue_dead(q)	test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)
@@ -1362,26 +1347,6 @@ static inline bool blk_needs_flush_plug(struct task_struct *tsk)
 		 !list_empty(&plug->cb_list));
 }
 
-/*
- * tag stuff
- */
-extern int blk_queue_start_tag(struct request_queue *, struct request *);
-extern struct request *blk_queue_find_tag(struct request_queue *, int);
-extern void blk_queue_end_tag(struct request_queue *, struct request *);
-extern int blk_queue_init_tags(struct request_queue *, int, struct blk_queue_tag *, int);
-extern void blk_queue_free_tags(struct request_queue *);
-extern int blk_queue_resize_tags(struct request_queue *, int);
-extern struct blk_queue_tag *blk_init_tags(int, int);
-extern void blk_free_tags(struct blk_queue_tag *);
-
-static inline struct request *blk_map_queue_find_tag(struct blk_queue_tag *bqt,
-						int tag)
-{
-	if (unlikely(bqt == NULL || tag >= bqt->real_max_depth))
-		return NULL;
-	return bqt->tag_index[tag];
-}
-
 extern int blkdev_issue_flush(struct block_device *, gfp_t, sector_t *);
 extern int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct page *page);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 18/28] block: remove non mq parts from the flush code
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (16 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 17/28] block: remove legacy rq tagging Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:02   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 19/28] block: remove legacy IO schedulers Jens Axboe
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

With all remaining users converted to blk-mq, the !q->mq_ops branches in
the flush machinery are dead code. Drop them, along with the legacy
flush_data_end_io() completion path.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-flush.c | 154 +++++++++-------------------------------------
 block/blk.h       |   4 +-
 2 files changed, 31 insertions(+), 127 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 8b44b86779da..9baa9a119447 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -134,16 +134,8 @@ static void blk_flush_restore_request(struct request *rq)
 
 static bool blk_flush_queue_rq(struct request *rq, bool add_front)
 {
-	if (rq->q->mq_ops) {
-		blk_mq_add_to_requeue_list(rq, add_front, true);
-		return false;
-	} else {
-		if (add_front)
-			list_add(&rq->queuelist, &rq->q->queue_head);
-		else
-			list_add_tail(&rq->queuelist, &rq->q->queue_head);
-		return true;
-	}
+	blk_mq_add_to_requeue_list(rq, add_front, true);
+	return false;
 }
 
 /**
@@ -204,10 +196,7 @@ static bool blk_flush_complete_seq(struct request *rq,
 		BUG_ON(!list_empty(&rq->queuelist));
 		list_del_init(&rq->flush.list);
 		blk_flush_restore_request(rq);
-		if (q->mq_ops)
-			blk_mq_end_request(rq, error);
-		else
-			__blk_end_request_all(rq, error);
+		blk_mq_end_request(rq, error);
 		break;
 
 	default:
@@ -226,20 +215,17 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 	struct request *rq, *n;
 	unsigned long flags = 0;
 	struct blk_flush_queue *fq = blk_get_flush_queue(q, flush_rq->mq_ctx);
+	struct blk_mq_hw_ctx *hctx;
 
-	if (q->mq_ops) {
-		struct blk_mq_hw_ctx *hctx;
-
-		/* release the tag's ownership to the req cloned from */
-		spin_lock_irqsave(&fq->mq_flush_lock, flags);
-		hctx = blk_mq_map_queue(q, flush_rq->mq_ctx->cpu);
-		if (!q->elevator) {
-			blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq);
-			flush_rq->tag = -1;
-		} else {
-			blk_mq_put_driver_tag_hctx(hctx, flush_rq);
-			flush_rq->internal_tag = -1;
-		}
+	/* release the tag's ownership to the req cloned from */
+	spin_lock_irqsave(&fq->mq_flush_lock, flags);
+	hctx = blk_mq_map_queue(q, flush_rq->mq_ctx->cpu);
+	if (!q->elevator) {
+		blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq);
+		flush_rq->tag = -1;
+	} else {
+		blk_mq_put_driver_tag_hctx(hctx, flush_rq);
+		flush_rq->internal_tag = -1;
 	}
 
 	running = &fq->flush_queue[fq->flush_running_idx];
@@ -248,9 +234,6 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 	/* account completion of the flush request */
 	fq->flush_running_idx ^= 1;
 
-	if (!q->mq_ops)
-		elv_completed_request(q, flush_rq);
-
 	/* and push the waiting requests to the next stage */
 	list_for_each_entry_safe(rq, n, running, flush.list) {
 		unsigned int seq = blk_flush_cur_seq(rq);
@@ -259,24 +242,8 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 		queued |= blk_flush_complete_seq(rq, fq, seq, error);
 	}
 
-	/*
-	 * Kick the queue to avoid stall for two cases:
-	 * 1. Moving a request silently to empty queue_head may stall the
-	 * queue.
-	 * 2. When flush request is running in non-queueable queue, the
-	 * queue is hold. Restart the queue after flush request is finished
-	 * to avoid stall.
-	 * This function is called from request completion path and calling
-	 * directly into request_fn may confuse the driver.  Always use
-	 * kblockd.
-	 */
-	if (queued || fq->flush_queue_delayed) {
-		WARN_ON(q->mq_ops);
-		blk_run_queue_async(q);
-	}
 	fq->flush_queue_delayed = 0;
-	if (q->mq_ops)
-		spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
+	spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
 }
 
 /**
@@ -301,6 +268,7 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	struct request *first_rq =
 		list_first_entry(pending, struct request, flush.list);
 	struct request *flush_rq = fq->flush_rq;
+	struct blk_mq_hw_ctx *hctx;
 
 	/* C1 described at the top of this file */
 	if (fq->flush_pending_idx != fq->flush_running_idx || list_empty(pending))
@@ -334,19 +302,15 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	 * In case of IO scheduler, flush rq need to borrow scheduler tag
 	 * just for cheating put/get driver tag.
 	 */
-	if (q->mq_ops) {
-		struct blk_mq_hw_ctx *hctx;
-
-		flush_rq->mq_ctx = first_rq->mq_ctx;
-
-		if (!q->elevator) {
-			fq->orig_rq = first_rq;
-			flush_rq->tag = first_rq->tag;
-			hctx = blk_mq_map_queue(q, first_rq->mq_ctx->cpu);
-			blk_mq_tag_set_rq(hctx, first_rq->tag, flush_rq);
-		} else {
-			flush_rq->internal_tag = first_rq->internal_tag;
-		}
+	flush_rq->mq_ctx = first_rq->mq_ctx;
+
+	if (!q->elevator) {
+		fq->orig_rq = first_rq;
+		flush_rq->tag = first_rq->tag;
+		hctx = blk_mq_map_queue(q, first_rq->mq_ctx->cpu);
+		blk_mq_tag_set_rq(hctx, first_rq->tag, flush_rq);
+	} else {
+		flush_rq->internal_tag = first_rq->internal_tag;
 	}
 
 	flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
@@ -358,49 +322,6 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	return blk_flush_queue_rq(flush_rq, false);
 }
 
-static void flush_data_end_io(struct request *rq, blk_status_t error)
-{
-	struct request_queue *q = rq->q;
-	struct blk_flush_queue *fq = blk_get_flush_queue(q, NULL);
-
-	lockdep_assert_held(q->queue_lock);
-
-	/*
-	 * Updating q->in_flight[] here for making this tag usable
-	 * early. Because in blk_queue_start_tag(),
-	 * q->in_flight[BLK_RW_ASYNC] is used to limit async I/O and
-	 * reserve tags for sync I/O.
-	 *
-	 * More importantly this way can avoid the following I/O
-	 * deadlock:
-	 *
-	 * - suppose there are 40 fua requests comming to flush queue
-	 *   and queue depth is 31
-	 * - 30 rqs are scheduled then blk_queue_start_tag() can't alloc
-	 *   tag for async I/O any more
-	 * - all the 30 rqs are completed before FLUSH_PENDING_TIMEOUT
-	 *   and flush_data_end_io() is called
-	 * - the other rqs still can't go ahead if not updating
-	 *   q->in_flight[BLK_RW_ASYNC] here, meantime these rqs
-	 *   are held in flush data queue and make no progress of
-	 *   handling post flush rq
-	 * - only after the post flush rq is handled, all these rqs
-	 *   can be completed
-	 */
-
-	elv_completed_request(q, rq);
-
-	/* for avoiding double accounting */
-	rq->rq_flags &= ~RQF_STARTED;
-
-	/*
-	 * After populating an empty queue, kick it to avoid stall.  Read
-	 * the comment in flush_end_io().
-	 */
-	if (blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error))
-		blk_run_queue_async(q);
-}
-
 static void mq_flush_data_end_io(struct request *rq, blk_status_t error)
 {
 	struct request_queue *q = rq->q;
@@ -443,9 +364,6 @@ void blk_insert_flush(struct request *rq)
 	unsigned int policy = blk_flush_policy(fflags, rq);
 	struct blk_flush_queue *fq = blk_get_flush_queue(q, rq->mq_ctx);
 
-	if (!q->mq_ops)
-		lockdep_assert_held(q->queue_lock);
-
 	/*
 	 * @policy now records what operations need to be done.  Adjust
 	 * REQ_PREFLUSH and FUA for the driver.
@@ -468,10 +386,7 @@ void blk_insert_flush(struct request *rq)
 	 * complete the request.
 	 */
 	if (!policy) {
-		if (q->mq_ops)
-			blk_mq_end_request(rq, 0);
-		else
-			__blk_end_request(rq, 0, 0);
+		blk_mq_end_request(rq, 0);
 		return;
 	}
 
@@ -484,10 +399,7 @@ void blk_insert_flush(struct request *rq)
 	 */
 	if ((policy & REQ_FSEQ_DATA) &&
 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
-		if (q->mq_ops)
-			blk_mq_request_bypass_insert(rq, false);
-		else
-			list_add_tail(&rq->queuelist, &q->queue_head);
+		blk_mq_request_bypass_insert(rq, false);
 		return;
 	}
 
@@ -499,17 +411,12 @@ void blk_insert_flush(struct request *rq)
 	INIT_LIST_HEAD(&rq->flush.list);
 	rq->rq_flags |= RQF_FLUSH_SEQ;
 	rq->flush.saved_end_io = rq->end_io; /* Usually NULL */
-	if (q->mq_ops) {
-		rq->end_io = mq_flush_data_end_io;
 
-		spin_lock_irq(&fq->mq_flush_lock);
-		blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0);
-		spin_unlock_irq(&fq->mq_flush_lock);
-		return;
-	}
-	rq->end_io = flush_data_end_io;
+	rq->end_io = mq_flush_data_end_io;
 
+	spin_lock_irq(&fq->mq_flush_lock);
 	blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0);
+	spin_unlock_irq(&fq->mq_flush_lock);
 }
 
 /**
@@ -575,8 +482,7 @@ struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q,
 	if (!fq)
 		goto fail;
 
-	if (q->mq_ops)
-		spin_lock_init(&fq->mq_flush_lock);
+	spin_lock_init(&fq->mq_flush_lock);
 
 	rq_sz = round_up(rq_sz + cmd_size, cache_line_size());
 	fq->flush_rq = kzalloc_node(rq_sz, flags, node);
diff --git a/block/blk.h b/block/blk.h
index a1841b8ff129..57a302bf5a70 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -114,9 +114,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 static inline struct blk_flush_queue *blk_get_flush_queue(
 		struct request_queue *q, struct blk_mq_ctx *ctx)
 {
-	if (q->mq_ops)
-		return blk_mq_map_queue(q, ctx->cpu)->fq;
-	return q->fq;
+	return blk_mq_map_queue(q, ctx->cpu)->fq;
 }
 
 static inline void __blk_get_queue(struct request_queue *q)
-- 
2.17.1

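For the driver-facing side of the flush machinery this patch trims down to mq-only: a blk-mq driver opts in by advertising a write cache (and optionally FUA) on its queue, and then simply sees REQ_OP_FLUSH / REQ_FUA requests in ->queue_rq(). A minimal sketch, not part of the patch (the sketch_* names are hypothetical, the calls are the existing block-layer API):

#include <linux/blk-mq.h>
#include <linux/blkdev.h>

/* advertise a volatile write cache plus FUA so the flush state machine is used */
static void sketch_setup_queue(struct request_queue *q)
{
	blk_queue_write_cache(q, true, true);
}

static blk_status_t sketch_queue_rq(struct blk_mq_hw_ctx *hctx,
				    const struct blk_mq_queue_data *bd)
{
	struct request *rq = bd->rq;

	blk_mq_start_request(rq);
	if (req_op(rq) == REQ_OP_FLUSH || (rq->cmd_flags & REQ_FUA)) {
		/* a real driver would flush its write cache / do a FUA write here */
	}
	blk_mq_end_request(rq, BLK_STS_OK);
	return BLK_STS_OK;
}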

* [PATCH 19/28] block: remove legacy IO schedulers
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (17 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 18/28] block: remove non mq parts from the flush code Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-25 21:10 ` [PATCH 20/28] block: remove dead elevator code Jens Axboe
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Retain the deadline documentation, as that carries over to mq-deadline
as well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 Documentation/block/cfq-iosched.txt |  291 --
 block/Kconfig.iosched               |   61 -
 block/Makefile                      |    3 -
 block/cfq-iosched.c                 | 4916 ---------------------------
 block/deadline-iosched.c            |  560 ---
 block/elevator.c                    |   70 -
 block/noop-iosched.c                |  124 -
 7 files changed, 6025 deletions(-)
 delete mode 100644 Documentation/block/cfq-iosched.txt
 delete mode 100644 block/cfq-iosched.c
 delete mode 100644 block/deadline-iosched.c
 delete mode 100644 block/noop-iosched.c

diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt
deleted file mode 100644
index 895bd3813115..000000000000
--- a/Documentation/block/cfq-iosched.txt
+++ /dev/null
@@ -1,291 +0,0 @@
-CFQ (Complete Fairness Queueing)
-===============================
-
-The main aim of CFQ scheduler is to provide a fair allocation of the disk
-I/O bandwidth for all the processes which requests an I/O operation.
-
-CFQ maintains the per process queue for the processes which request I/O
-operation(synchronous requests). In case of asynchronous requests, all the
-requests from all the processes are batched together according to their
-process's I/O priority.
-
-CFQ ioscheduler tunables
-========================
-
-slice_idle
-----------
-This specifies how long CFQ should idle for next request on certain cfq queues
-(for sequential workloads) and service trees (for random workloads) before
-queue is expired and CFQ selects next queue to dispatch from.
-
-By default slice_idle is a non-zero value. That means by default we idle on
-queues/service trees. This can be very helpful on highly seeky media like
-single spindle SATA/SAS disks where we can cut down on overall number of
-seeks and see improved throughput.
-
-Setting slice_idle to 0 will remove all the idling on queues/service tree
-level and one should see an overall improved throughput on faster storage
-devices like multiple SATA/SAS disks in hardware RAID configuration. The down
-side is that isolation provided from WRITES also goes down and notion of
-IO priority becomes weaker.
-
-So depending on storage and workload, it might be useful to set slice_idle=0.
-In general I think for SATA/SAS disks and software RAID of SATA/SAS disks
-keeping slice_idle enabled should be useful. For any configurations where
-there are multiple spindles behind single LUN (Host based hardware RAID
-controller or for storage arrays), setting slice_idle=0 might end up in better
-throughput and acceptable latencies.
-
-back_seek_max
--------------
-This specifies, given in Kbytes, the maximum "distance" for backward seeking.
-The distance is the amount of space from the current head location to the
-sectors that are backward in terms of distance.
-
-This parameter allows the scheduler to anticipate requests in the "backward"
-direction and consider them as being the "next" if they are within this
-distance from the current head location.
-
-back_seek_penalty
------------------
-This parameter is used to compute the cost of backward seeking. If the
-backward distance of request is just 1/back_seek_penalty from a "front"
-request, then the seeking cost of two requests is considered equivalent.
-
-So scheduler will not bias toward one or the other request (otherwise scheduler
-will bias toward front request). Default value of back_seek_penalty is 2.
-
-fifo_expire_async
------------------
-This parameter is used to set the timeout of asynchronous requests. Default
-value of this is 248ms.
-
-fifo_expire_sync
-----------------
-This parameter is used to set the timeout of synchronous requests. Default
-value of this is 124ms. In case to favor synchronous requests over asynchronous
-one, this value should be decreased relative to fifo_expire_async.
-
-group_idle
------------
-This parameter forces idling at the CFQ group level instead of CFQ
-queue level. This was introduced after a bottleneck was observed
-in higher end storage due to idle on sequential queue and allow dispatch
-from a single queue. The idea with this parameter is that it can be run with
-slice_idle=0 and group_idle=8, so that idling does not happen on individual
-queues in the group but happens overall on the group and thus still keeps the
-IO controller working.
-Not idling on individual queues in the group will dispatch requests from
-multiple queues in the group at the same time and achieve higher throughput
-on higher end storage.
-
-Default value for this parameter is 8ms.
-
-low_latency
------------
-This parameter is used to enable/disable the low latency mode of the CFQ
-scheduler. If enabled, CFQ tries to recompute the slice time for each process
-based on the target_latency set for the system. This favors fairness over
-throughput. Disabling low latency (setting it to 0) ignores target latency,
-allowing each process in the system to get a full time slice.
-
-By default low latency mode is enabled.
-
-target_latency
---------------
-This parameter is used to calculate the time slice for a process if cfq's
-latency mode is enabled. It will ensure that sync requests have an estimated
-latency. But if sequential workload is higher(e.g. sequential read),
-then to meet the latency constraints, throughput may decrease because of less
-time for each process to issue I/O request before the cfq queue is switched.
-
-Though this can be overcome by disabling the latency_mode, it may increase
-the read latency for some applications. This parameter allows for changing
-target_latency through the sysfs interface which can provide the balanced
-throughput and read latency.
-
-Default value for target_latency is 300ms.
-
-slice_async
------------
-This parameter is same as of slice_sync but for asynchronous queue. The
-default value is 40ms.
-
-slice_async_rq
---------------
-This parameter is used to limit the dispatching of asynchronous request to
-device request queue in queue's slice time. The maximum number of request that
-are allowed to be dispatched also depends upon the io priority. Default value
-for this is 2.
-
-slice_sync
-----------
-When a queue is selected for execution, the queues IO requests are only
-executed for a certain amount of time(time_slice) before switching to another
-queue. This parameter is used to calculate the time slice of synchronous
-queue.
-
-time_slice is computed using the below equation:-
-time_slice = slice_sync + (slice_sync/5 * (4 - prio)). To increase the
-time_slice of synchronous queue, increase the value of slice_sync. Default
-value is 100ms.
-
-quantum
--------
-This specifies the number of request dispatched to the device queue. In a
-queue's time slice, a request will not be dispatched if the number of request
-in the device exceeds this parameter. This parameter is used for synchronous
-request.
-
-In case of storage with several disk, this setting can limit the parallel
-processing of request. Therefore, increasing the value can improve the
-performance although this can cause the latency of some I/O to increase due
-to more number of requests.
-
-CFQ Group scheduling
-====================
-
-CFQ supports blkio cgroup and has "blkio." prefixed files in each
-blkio cgroup directory. It is weight-based and there are four knobs
-for configuration - weight[_device] and leaf_weight[_device].
-Internal cgroup nodes (the ones with children) can also have tasks in
-them, so the former two configure how much proportion the cgroup as a
-whole is entitled to at its parent's level while the latter two
-configure how much proportion the tasks in the cgroup have compared to
-its direct children.
-
-Another way to think about it is assuming that each internal node has
-an implicit leaf child node which hosts all the tasks whose weight is
-configured by leaf_weight[_device]. Let's assume a blkio hierarchy
-composed of five cgroups - root, A, B, AA and AB - with the following
-weights where the names represent the hierarchy.
-
-        weight leaf_weight
- root :  125    125
- A    :  500    750
- B    :  250    500
- AA   :  500    500
- AB   : 1000    500
-
-root never has a parent making its weight is meaningless. For backward
-compatibility, weight is always kept in sync with leaf_weight. B, AA
-and AB have no child and thus its tasks have no children cgroup to
-compete with. They always get 100% of what the cgroup won at the
-parent level. Considering only the weights which matter, the hierarchy
-looks like the following.
-
-          root
-       /    |   \
-      A     B    leaf
-     500   250   125
-   /  |  \
-  AA  AB  leaf
- 500 1000 750
-
-If all cgroups have active IOs and competing with each other, disk
-time will be distributed like the following.
-
-Distribution below root. The total active weight at this level is
-A:500 + B:250 + C:125 = 875.
-
- root-leaf :   125 /  875      =~ 14%
- A         :   500 /  875      =~ 57%
- B(-leaf)  :   250 /  875      =~ 28%
-
-A has children and further distributes its 57% among the children and
-the implicit leaf node. The total active weight at this level is
-AA:500 + AB:1000 + A-leaf:750 = 2250.
-
- A-leaf    : ( 750 / 2250) * A =~ 19%
- AA(-leaf) : ( 500 / 2250) * A =~ 12%
- AB(-leaf) : (1000 / 2250) * A =~ 25%
-
-CFQ IOPS Mode for group scheduling
-===================================
-Basic CFQ design is to provide priority based time slices. Higher priority
-process gets bigger time slice and lower priority process gets smaller time
-slice. Measuring time becomes harder if storage is fast and supports NCQ and
-it would be better to dispatch multiple requests from multiple cfq queues in
-request queue at a time. In such scenario, it is not possible to measure time
-consumed by single queue accurately.
-
-What is possible though is to measure number of requests dispatched from a
-single queue and also allow dispatch from multiple cfq queue at the same time.
-This effectively becomes the fairness in terms of IOPS (IO operations per
-second).
-
-If one sets slice_idle=0 and if storage supports NCQ, CFQ internally switches
-to IOPS mode and starts providing fairness in terms of number of requests
-dispatched. Note that this mode switching takes effect only for group
-scheduling. For non-cgroup users nothing should change.
-
-CFQ IO scheduler Idling Theory
-===============================
-Idling on a queue is primarily about waiting for the next request to come
-on same queue after completion of a request. In this process CFQ will not
-dispatch requests from other cfq queues even if requests are pending there.
-
-The rationale behind idling is that it can cut down on number of seeks
-on rotational media. For example, if a process is doing dependent
-sequential reads (next read will come on only after completion of previous
-one), then not dispatching request from other queue should help as we
-did not move the disk head and kept on dispatching sequential IO from
-one queue.
-
-CFQ has following service trees and various queues are put on these trees.
-
-	sync-idle	sync-noidle	async
-
-All cfq queues doing synchronous sequential IO go on to sync-idle tree.
-On this tree we idle on each queue individually.
-
-All synchronous non-sequential queues go on sync-noidle tree. Also any
-synchronous write request which is not marked with REQ_IDLE goes on this
-service tree. On this tree we do not idle on individual queues instead idle
-on the whole group of queues or the tree. So if there are 4 queues waiting
-for IO to dispatch we will idle only once last queue has dispatched the IO
-and there is no more IO on this service tree.
-
-All async writes go on async service tree. There is no idling on async
-queues.
-
-CFQ has some optimizations for SSDs and if it detects a non-rotational
-media which can support higher queue depth (multiple requests at in
-flight at a time), then it cuts down on idling of individual queues and
-all the queues move to sync-noidle tree and only tree idle remains. This
-tree idling provides isolation with buffered write queues on async tree.
-
-FAQ
-===
-Q1. Why to idle at all on queues not marked with REQ_IDLE.
-
-A1. We only do tree idle (all queues on sync-noidle tree) on queues not marked
-    with REQ_IDLE. This helps in providing isolation with all the sync-idle
-    queues. Otherwise in presence of many sequential readers, other
-    synchronous IO might not get fair share of disk.
-
-    For example, if there are 10 sequential readers doing IO and they get
-    100ms each. If a !REQ_IDLE request comes in, it will be scheduled
-    roughly after 1 second. If after completion of !REQ_IDLE request we
-    do not idle, and after a couple of milli seconds a another !REQ_IDLE
-    request comes in, again it will be scheduled after 1second. Repeat it
-    and notice how a workload can lose its disk share and suffer due to
-    multiple sequential readers.
-
-    fsync can generate dependent IO where bunch of data is written in the
-    context of fsync, and later some journaling data is written. Journaling
-    data comes in only after fsync has finished its IO (atleast for ext4
-    that seemed to be the case). Now if one decides not to idle on fsync
-    thread due to !REQ_IDLE, then next journaling write will not get
-    scheduled for another second. A process doing small fsync, will suffer
-    badly in presence of multiple sequential readers.
-
-    Hence doing tree idling on threads using !REQ_IDLE flag on requests
-    provides isolation from multiple sequential readers and at the same
-    time we do not idle on individual threads.
-
-Q2. When to specify REQ_IDLE
-A2. I would think whenever one is doing synchronous write and expecting
-    more writes to be dispatched from same context soon, should be able
-    to specify REQ_IDLE on writes and that probably should work well for
-    most of the cases.
diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
index f95a48b0d7b2..4626b88b2d5a 100644
--- a/block/Kconfig.iosched
+++ b/block/Kconfig.iosched
@@ -3,67 +3,6 @@ if BLOCK
 
 menu "IO Schedulers"
 
-config IOSCHED_NOOP
-	bool
-	default y
-	---help---
-	  The no-op I/O scheduler is a minimal scheduler that does basic merging
-	  and sorting. Its main uses include non-disk based block devices like
-	  memory devices, and specialised software or hardware environments
-	  that do their own scheduling and require only minimal assistance from
-	  the kernel.
-
-config IOSCHED_DEADLINE
-	tristate "Deadline I/O scheduler"
-	default y
-	---help---
-	  The deadline I/O scheduler is simple and compact. It will provide
-	  CSCAN service with FIFO expiration of requests, switching to
-	  a new point in the service tree and doing a batch of IO from there
-	  in case of expiry.
-
-config IOSCHED_CFQ
-	tristate "CFQ I/O scheduler"
-	default y
-	---help---
-	  The CFQ I/O scheduler tries to distribute bandwidth equally
-	  among all processes in the system. It should provide a fair
-	  and low latency working environment, suitable for both desktop
-	  and server systems.
-
-	  This is the default I/O scheduler.
-
-config CFQ_GROUP_IOSCHED
-	bool "CFQ Group Scheduling support"
-	depends on IOSCHED_CFQ && BLK_CGROUP
-	---help---
-	  Enable group IO scheduling in CFQ.
-
-choice
-
-	prompt "Default I/O scheduler"
-	default DEFAULT_CFQ
-	help
-	  Select the I/O scheduler which will be used by default for all
-	  block devices.
-
-	config DEFAULT_DEADLINE
-		bool "Deadline" if IOSCHED_DEADLINE=y
-
-	config DEFAULT_CFQ
-		bool "CFQ" if IOSCHED_CFQ=y
-
-	config DEFAULT_NOOP
-		bool "No-op"
-
-endchoice
-
-config DEFAULT_IOSCHED
-	string
-	default "deadline" if DEFAULT_DEADLINE
-	default "cfq" if DEFAULT_CFQ
-	default "noop" if DEFAULT_NOOP
-
 config MQ_IOSCHED_DEADLINE
 	tristate "MQ deadline I/O scheduler"
 	default y
diff --git a/block/Makefile b/block/Makefile
index 213674c8faaa..eee1b4ceecf9 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -18,9 +18,6 @@ obj-$(CONFIG_BLK_DEV_BSGLIB)	+= bsg-lib.o
 obj-$(CONFIG_BLK_CGROUP)	+= blk-cgroup.o
 obj-$(CONFIG_BLK_DEV_THROTTLING)	+= blk-throttle.o
 obj-$(CONFIG_BLK_CGROUP_IOLATENCY)	+= blk-iolatency.o
-obj-$(CONFIG_IOSCHED_NOOP)	+= noop-iosched.o
-obj-$(CONFIG_IOSCHED_DEADLINE)	+= deadline-iosched.o
-obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o
 obj-$(CONFIG_MQ_IOSCHED_DEADLINE)	+= mq-deadline.o
 obj-$(CONFIG_MQ_IOSCHED_KYBER)	+= kyber-iosched.o
 bfq-y				:= bfq-iosched.o bfq-wf2q.o bfq-cgroup.o
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
deleted file mode 100644
index 6a3d87dd3c1a..000000000000
--- a/block/cfq-iosched.c
+++ /dev/null
@@ -1,4916 +0,0 @@
-/*
- *  CFQ, or complete fairness queueing, disk scheduler.
- *
- *  Based on ideas from a previously unfinished io
- *  scheduler (round robin per-process disk scheduling) and Andrea Arcangeli.
- *
- *  Copyright (C) 2003 Jens Axboe <axboe@kernel.dk>
- */
-#include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/sched/clock.h>
-#include <linux/blkdev.h>
-#include <linux/elevator.h>
-#include <linux/ktime.h>
-#include <linux/rbtree.h>
-#include <linux/ioprio.h>
-#include <linux/blktrace_api.h>
-#include <linux/blk-cgroup.h>
-#include "blk.h"
-#include "blk-wbt.h"
-
-/*
- * tunables
- */
-/* max queue in one round of service */
-static const int cfq_quantum = 8;
-static const u64 cfq_fifo_expire[2] = { NSEC_PER_SEC / 4, NSEC_PER_SEC / 8 };
-/* maximum backwards seek, in KiB */
-static const int cfq_back_max = 16 * 1024;
-/* penalty of a backwards seek */
-static const int cfq_back_penalty = 2;
-static const u64 cfq_slice_sync = NSEC_PER_SEC / 10;
-static u64 cfq_slice_async = NSEC_PER_SEC / 25;
-static const int cfq_slice_async_rq = 2;
-static u64 cfq_slice_idle = NSEC_PER_SEC / 125;
-static u64 cfq_group_idle = NSEC_PER_SEC / 125;
-static const u64 cfq_target_latency = (u64)NSEC_PER_SEC * 3/10; /* 300 ms */
-static const int cfq_hist_divisor = 4;
-
-/*
- * offset from end of queue service tree for idle class
- */
-#define CFQ_IDLE_DELAY		(NSEC_PER_SEC / 5)
-/* offset from end of group service tree under time slice mode */
-#define CFQ_SLICE_MODE_GROUP_DELAY (NSEC_PER_SEC / 5)
-/* offset from end of group service under IOPS mode */
-#define CFQ_IOPS_MODE_GROUP_DELAY (HZ / 5)
-
-/*
- * below this threshold, we consider thinktime immediate
- */
-#define CFQ_MIN_TT		(2 * NSEC_PER_SEC / HZ)
-
-#define CFQ_SLICE_SCALE		(5)
-#define CFQ_HW_QUEUE_MIN	(5)
-#define CFQ_SERVICE_SHIFT       12
-
-#define CFQQ_SEEK_THR		(sector_t)(8 * 100)
-#define CFQQ_CLOSE_THR		(sector_t)(8 * 1024)
-#define CFQQ_SECT_THR_NONROT	(sector_t)(2 * 32)
-#define CFQQ_SEEKY(cfqq)	(hweight32(cfqq->seek_history) > 32/8)
-
-#define RQ_CIC(rq)		icq_to_cic((rq)->elv.icq)
-#define RQ_CFQQ(rq)		(struct cfq_queue *) ((rq)->elv.priv[0])
-#define RQ_CFQG(rq)		(struct cfq_group *) ((rq)->elv.priv[1])
-
-static struct kmem_cache *cfq_pool;
-
-#define CFQ_PRIO_LISTS		IOPRIO_BE_NR
-#define cfq_class_idle(cfqq)	((cfqq)->ioprio_class == IOPRIO_CLASS_IDLE)
-#define cfq_class_rt(cfqq)	((cfqq)->ioprio_class == IOPRIO_CLASS_RT)
-
-#define sample_valid(samples)	((samples) > 80)
-#define rb_entry_cfqg(node)	rb_entry((node), struct cfq_group, rb_node)
-
-/* blkio-related constants */
-#define CFQ_WEIGHT_LEGACY_MIN	10
-#define CFQ_WEIGHT_LEGACY_DFL	500
-#define CFQ_WEIGHT_LEGACY_MAX	1000
-
-struct cfq_ttime {
-	u64 last_end_request;
-
-	u64 ttime_total;
-	u64 ttime_mean;
-	unsigned long ttime_samples;
-};
-
-/*
- * Most of our rbtree usage is for sorting with min extraction, so
- * if we cache the leftmost node we don't have to walk down the tree
- * to find it. Idea borrowed from Ingo Molnars CFS scheduler. We should
- * move this into the elevator for the rq sorting as well.
- */
-struct cfq_rb_root {
-	struct rb_root_cached rb;
-	struct rb_node *rb_rightmost;
-	unsigned count;
-	u64 min_vdisktime;
-	struct cfq_ttime ttime;
-};
-#define CFQ_RB_ROOT	(struct cfq_rb_root) { .rb = RB_ROOT_CACHED, \
-			.rb_rightmost = NULL,			     \
-			.ttime = {.last_end_request = ktime_get_ns(),},}
-
-/*
- * Per process-grouping structure
- */
-struct cfq_queue {
-	/* reference count */
-	int ref;
-	/* various state flags, see below */
-	unsigned int flags;
-	/* parent cfq_data */
-	struct cfq_data *cfqd;
-	/* service_tree member */
-	struct rb_node rb_node;
-	/* service_tree key */
-	u64 rb_key;
-	/* prio tree member */
-	struct rb_node p_node;
-	/* prio tree root we belong to, if any */
-	struct rb_root *p_root;
-	/* sorted list of pending requests */
-	struct rb_root sort_list;
-	/* if fifo isn't expired, next request to serve */
-	struct request *next_rq;
-	/* requests queued in sort_list */
-	int queued[2];
-	/* currently allocated requests */
-	int allocated[2];
-	/* fifo list of requests in sort_list */
-	struct list_head fifo;
-
-	/* time when queue got scheduled in to dispatch first request. */
-	u64 dispatch_start;
-	u64 allocated_slice;
-	u64 slice_dispatch;
-	/* time when first request from queue completed and slice started. */
-	u64 slice_start;
-	u64 slice_end;
-	s64 slice_resid;
-
-	/* pending priority requests */
-	int prio_pending;
-	/* number of requests that are on the dispatch list or inside driver */
-	int dispatched;
-
-	/* io prio of this group */
-	unsigned short ioprio, org_ioprio;
-	unsigned short ioprio_class, org_ioprio_class;
-
-	pid_t pid;
-
-	u32 seek_history;
-	sector_t last_request_pos;
-
-	struct cfq_rb_root *service_tree;
-	struct cfq_queue *new_cfqq;
-	struct cfq_group *cfqg;
-	/* Number of sectors dispatched from queue in single dispatch round */
-	unsigned long nr_sectors;
-};
-
-/*
- * First index in the service_trees.
- * IDLE is handled separately, so it has negative index
- */
-enum wl_class_t {
-	BE_WORKLOAD = 0,
-	RT_WORKLOAD = 1,
-	IDLE_WORKLOAD = 2,
-	CFQ_PRIO_NR,
-};
-
-/*
- * Second index in the service_trees.
- */
-enum wl_type_t {
-	ASYNC_WORKLOAD = 0,
-	SYNC_NOIDLE_WORKLOAD = 1,
-	SYNC_WORKLOAD = 2
-};
-
-struct cfqg_stats {
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	/* number of ios merged */
-	struct blkg_rwstat		merged;
-	/* total time spent on device in ns, may not be accurate w/ queueing */
-	struct blkg_rwstat		service_time;
-	/* total time spent waiting in scheduler queue in ns */
-	struct blkg_rwstat		wait_time;
-	/* number of IOs queued up */
-	struct blkg_rwstat		queued;
-	/* total disk time and nr sectors dispatched by this group */
-	struct blkg_stat		time;
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-	/* time not charged to this cgroup */
-	struct blkg_stat		unaccounted_time;
-	/* sum of number of ios queued across all samples */
-	struct blkg_stat		avg_queue_size_sum;
-	/* count of samples taken for average */
-	struct blkg_stat		avg_queue_size_samples;
-	/* how many times this group has been removed from service tree */
-	struct blkg_stat		dequeue;
-	/* total time spent waiting for it to be assigned a timeslice. */
-	struct blkg_stat		group_wait_time;
-	/* time spent idling for this blkcg_gq */
-	struct blkg_stat		idle_time;
-	/* total time with empty current active q with other requests queued */
-	struct blkg_stat		empty_time;
-	/* fields after this shouldn't be cleared on stat reset */
-	u64				start_group_wait_time;
-	u64				start_idle_time;
-	u64				start_empty_time;
-	uint16_t			flags;
-#endif	/* CONFIG_DEBUG_BLK_CGROUP */
-#endif	/* CONFIG_CFQ_GROUP_IOSCHED */
-};
-
-/* Per-cgroup data */
-struct cfq_group_data {
-	/* must be the first member */
-	struct blkcg_policy_data cpd;
-
-	unsigned int weight;
-	unsigned int leaf_weight;
-};
-
-/* This is per cgroup per device grouping structure */
-struct cfq_group {
-	/* must be the first member */
-	struct blkg_policy_data pd;
-
-	/* group service_tree member */
-	struct rb_node rb_node;
-
-	/* group service_tree key */
-	u64 vdisktime;
-
-	/*
-	 * The number of active cfqgs and sum of their weights under this
-	 * cfqg.  This covers this cfqg's leaf_weight and all children's
-	 * weights, but does not cover weights of further descendants.
-	 *
-	 * If a cfqg is on the service tree, it's active.  An active cfqg
-	 * also activates its parent and contributes to the children_weight
-	 * of the parent.
-	 */
-	int nr_active;
-	unsigned int children_weight;
-
-	/*
-	 * vfraction is the fraction of vdisktime that the tasks in this
-	 * cfqg are entitled to.  This is determined by compounding the
-	 * ratios walking up from this cfqg to the root.
-	 *
-	 * It is in fixed point w/ CFQ_SERVICE_SHIFT and the sum of all
-	 * vfractions on a service tree is approximately 1.  The sum may
-	 * deviate a bit due to rounding errors and fluctuations caused by
-	 * cfqgs entering and leaving the service tree.
-	 */
-	unsigned int vfraction;
-
-	/*
-	 * There are two weights - (internal) weight is the weight of this
-	 * cfqg against the sibling cfqgs.  leaf_weight is the wight of
-	 * this cfqg against the child cfqgs.  For the root cfqg, both
-	 * weights are kept in sync for backward compatibility.
-	 */
-	unsigned int weight;
-	unsigned int new_weight;
-	unsigned int dev_weight;
-
-	unsigned int leaf_weight;
-	unsigned int new_leaf_weight;
-	unsigned int dev_leaf_weight;
-
-	/* number of cfqq currently on this group */
-	int nr_cfqq;
-
-	/*
-	 * Per group busy queues average. Useful for workload slice calc. We
-	 * create the array for each prio class but at run time it is used
-	 * only for RT and BE class and slot for IDLE class remains unused.
-	 * This is primarily done to avoid confusion and a gcc warning.
-	 */
-	unsigned int busy_queues_avg[CFQ_PRIO_NR];
-	/*
-	 * rr lists of queues with requests. We maintain service trees for
-	 * RT and BE classes. These trees are subdivided in subclasses
-	 * of SYNC, SYNC_NOIDLE and ASYNC based on workload type. For IDLE
-	 * class there is no subclassification and all the cfq queues go on
-	 * a single tree service_tree_idle.
-	 * Counts are embedded in the cfq_rb_root
-	 */
-	struct cfq_rb_root service_trees[2][3];
-	struct cfq_rb_root service_tree_idle;
-
-	u64 saved_wl_slice;
-	enum wl_type_t saved_wl_type;
-	enum wl_class_t saved_wl_class;
-
-	/* number of requests that are on the dispatch list or inside driver */
-	int dispatched;
-	struct cfq_ttime ttime;
-	struct cfqg_stats stats;	/* stats for this cfqg */
-
-	/* async queue for each priority case */
-	struct cfq_queue *async_cfqq[2][IOPRIO_BE_NR];
-	struct cfq_queue *async_idle_cfqq;
-
-};
-
-struct cfq_io_cq {
-	struct io_cq		icq;		/* must be the first member */
-	struct cfq_queue	*cfqq[2];
-	struct cfq_ttime	ttime;
-	int			ioprio;		/* the current ioprio */
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	uint64_t		blkcg_serial_nr; /* the current blkcg serial */
-#endif
-};
-
-/*
- * Per block device queue structure
- */
-struct cfq_data {
-	struct request_queue *queue;
-	/* Root service tree for cfq_groups */
-	struct cfq_rb_root grp_service_tree;
-	struct cfq_group *root_group;
-
-	/*
-	 * The priority currently being served
-	 */
-	enum wl_class_t serving_wl_class;
-	enum wl_type_t serving_wl_type;
-	u64 workload_expires;
-	struct cfq_group *serving_group;
-
-	/*
-	 * Each priority tree is sorted by next_request position.  These
-	 * trees are used when determining if two or more queues are
-	 * interleaving requests (see cfq_close_cooperator).
-	 */
-	struct rb_root prio_trees[CFQ_PRIO_LISTS];
-
-	unsigned int busy_queues;
-	unsigned int busy_sync_queues;
-
-	int rq_in_driver;
-	int rq_in_flight[2];
-
-	/*
-	 * queue-depth detection
-	 */
-	int rq_queued;
-	int hw_tag;
-	/*
-	 * hw_tag can be
-	 * -1 => indeterminate, (cfq will behave as if NCQ is present, to allow better detection)
-	 *  1 => NCQ is present (hw_tag_est_depth is the estimated max depth)
-	 *  0 => no NCQ
-	 */
-	int hw_tag_est_depth;
-	unsigned int hw_tag_samples;
-
-	/*
-	 * idle window management
-	 */
-	struct hrtimer idle_slice_timer;
-	struct work_struct unplug_work;
-
-	struct cfq_queue *active_queue;
-	struct cfq_io_cq *active_cic;
-
-	sector_t last_position;
-
-	/*
-	 * tunables, see top of file
-	 */
-	unsigned int cfq_quantum;
-	unsigned int cfq_back_penalty;
-	unsigned int cfq_back_max;
-	unsigned int cfq_slice_async_rq;
-	unsigned int cfq_latency;
-	u64 cfq_fifo_expire[2];
-	u64 cfq_slice[2];
-	u64 cfq_slice_idle;
-	u64 cfq_group_idle;
-	u64 cfq_target_latency;
-
-	/*
-	 * Fallback dummy cfqq for extreme OOM conditions
-	 */
-	struct cfq_queue oom_cfqq;
-
-	u64 last_delayed_sync;
-};
-
-static struct cfq_group *cfq_get_next_cfqg(struct cfq_data *cfqd);
-static void cfq_put_queue(struct cfq_queue *cfqq);
-
-static struct cfq_rb_root *st_for(struct cfq_group *cfqg,
-					    enum wl_class_t class,
-					    enum wl_type_t type)
-{
-	if (!cfqg)
-		return NULL;
-
-	if (class == IDLE_WORKLOAD)
-		return &cfqg->service_tree_idle;
-
-	return &cfqg->service_trees[class][type];
-}
-
-enum cfqq_state_flags {
-	CFQ_CFQQ_FLAG_on_rr = 0,	/* on round-robin busy list */
-	CFQ_CFQQ_FLAG_wait_request,	/* waiting for a request */
-	CFQ_CFQQ_FLAG_must_dispatch,	/* must be allowed a dispatch */
-	CFQ_CFQQ_FLAG_must_alloc_slice,	/* per-slice must_alloc flag */
-	CFQ_CFQQ_FLAG_fifo_expire,	/* FIFO checked in this slice */
-	CFQ_CFQQ_FLAG_idle_window,	/* slice idling enabled */
-	CFQ_CFQQ_FLAG_prio_changed,	/* task priority has changed */
-	CFQ_CFQQ_FLAG_slice_new,	/* no requests dispatched in slice */
-	CFQ_CFQQ_FLAG_sync,		/* synchronous queue */
-	CFQ_CFQQ_FLAG_coop,		/* cfqq is shared */
-	CFQ_CFQQ_FLAG_split_coop,	/* shared cfqq will be splitted */
-	CFQ_CFQQ_FLAG_deep,		/* sync cfqq experienced large depth */
-	CFQ_CFQQ_FLAG_wait_busy,	/* Waiting for next request */
-};
-
-#define CFQ_CFQQ_FNS(name)						\
-static inline void cfq_mark_cfqq_##name(struct cfq_queue *cfqq)		\
-{									\
-	(cfqq)->flags |= (1 << CFQ_CFQQ_FLAG_##name);			\
-}									\
-static inline void cfq_clear_cfqq_##name(struct cfq_queue *cfqq)	\
-{									\
-	(cfqq)->flags &= ~(1 << CFQ_CFQQ_FLAG_##name);			\
-}									\
-static inline int cfq_cfqq_##name(const struct cfq_queue *cfqq)		\
-{									\
-	return ((cfqq)->flags & (1 << CFQ_CFQQ_FLAG_##name)) != 0;	\
-}
-
-CFQ_CFQQ_FNS(on_rr);
-CFQ_CFQQ_FNS(wait_request);
-CFQ_CFQQ_FNS(must_dispatch);
-CFQ_CFQQ_FNS(must_alloc_slice);
-CFQ_CFQQ_FNS(fifo_expire);
-CFQ_CFQQ_FNS(idle_window);
-CFQ_CFQQ_FNS(prio_changed);
-CFQ_CFQQ_FNS(slice_new);
-CFQ_CFQQ_FNS(sync);
-CFQ_CFQQ_FNS(coop);
-CFQ_CFQQ_FNS(split_coop);
-CFQ_CFQQ_FNS(deep);
-CFQ_CFQQ_FNS(wait_busy);
-#undef CFQ_CFQQ_FNS
-
-#if defined(CONFIG_CFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
-
-/* cfqg stats flags */
-enum cfqg_stats_flags {
-	CFQG_stats_waiting = 0,
-	CFQG_stats_idling,
-	CFQG_stats_empty,
-};
-
-#define CFQG_FLAG_FNS(name)						\
-static inline void cfqg_stats_mark_##name(struct cfqg_stats *stats)	\
-{									\
-	stats->flags |= (1 << CFQG_stats_##name);			\
-}									\
-static inline void cfqg_stats_clear_##name(struct cfqg_stats *stats)	\
-{									\
-	stats->flags &= ~(1 << CFQG_stats_##name);			\
-}									\
-static inline int cfqg_stats_##name(struct cfqg_stats *stats)		\
-{									\
-	return (stats->flags & (1 << CFQG_stats_##name)) != 0;		\
-}									\
-
-CFQG_FLAG_FNS(waiting)
-CFQG_FLAG_FNS(idling)
-CFQG_FLAG_FNS(empty)
-#undef CFQG_FLAG_FNS
-
-/* This should be called with the queue_lock held. */
-static void cfqg_stats_update_group_wait_time(struct cfqg_stats *stats)
-{
-	u64 now;
-
-	if (!cfqg_stats_waiting(stats))
-		return;
-
-	now = ktime_get_ns();
-	if (now > stats->start_group_wait_time)
-		blkg_stat_add(&stats->group_wait_time,
-			      now - stats->start_group_wait_time);
-	cfqg_stats_clear_waiting(stats);
-}
-
-/* This should be called with the queue_lock held. */
-static void cfqg_stats_set_start_group_wait_time(struct cfq_group *cfqg,
-						 struct cfq_group *curr_cfqg)
-{
-	struct cfqg_stats *stats = &cfqg->stats;
-
-	if (cfqg_stats_waiting(stats))
-		return;
-	if (cfqg == curr_cfqg)
-		return;
-	stats->start_group_wait_time = ktime_get_ns();
-	cfqg_stats_mark_waiting(stats);
-}
-
-/* This should be called with the queue_lock held. */
-static void cfqg_stats_end_empty_time(struct cfqg_stats *stats)
-{
-	u64 now;
-
-	if (!cfqg_stats_empty(stats))
-		return;
-
-	now = ktime_get_ns();
-	if (now > stats->start_empty_time)
-		blkg_stat_add(&stats->empty_time,
-			      now - stats->start_empty_time);
-	cfqg_stats_clear_empty(stats);
-}
-
-static void cfqg_stats_update_dequeue(struct cfq_group *cfqg)
-{
-	blkg_stat_add(&cfqg->stats.dequeue, 1);
-}
-
-static void cfqg_stats_set_start_empty_time(struct cfq_group *cfqg)
-{
-	struct cfqg_stats *stats = &cfqg->stats;
-
-	if (blkg_rwstat_total(&stats->queued))
-		return;
-
-	/*
-	 * group is already marked empty. This can happen if cfqq got new
-	 * request in parent group and moved to this group while being added
-	 * to service tree. Just ignore the event and move on.
-	 */
-	if (cfqg_stats_empty(stats))
-		return;
-
-	stats->start_empty_time = ktime_get_ns();
-	cfqg_stats_mark_empty(stats);
-}
-
-static void cfqg_stats_update_idle_time(struct cfq_group *cfqg)
-{
-	struct cfqg_stats *stats = &cfqg->stats;
-
-	if (cfqg_stats_idling(stats)) {
-		u64 now = ktime_get_ns();
-
-		if (now > stats->start_idle_time)
-			blkg_stat_add(&stats->idle_time,
-				      now - stats->start_idle_time);
-		cfqg_stats_clear_idling(stats);
-	}
-}
-
-static void cfqg_stats_set_start_idle_time(struct cfq_group *cfqg)
-{
-	struct cfqg_stats *stats = &cfqg->stats;
-
-	BUG_ON(cfqg_stats_idling(stats));
-
-	stats->start_idle_time = ktime_get_ns();
-	cfqg_stats_mark_idling(stats);
-}
-
-static void cfqg_stats_update_avg_queue_size(struct cfq_group *cfqg)
-{
-	struct cfqg_stats *stats = &cfqg->stats;
-
-	blkg_stat_add(&stats->avg_queue_size_sum,
-		      blkg_rwstat_total(&stats->queued));
-	blkg_stat_add(&stats->avg_queue_size_samples, 1);
-	cfqg_stats_update_group_wait_time(stats);
-}
-
-#else	/* CONFIG_CFQ_GROUP_IOSCHED && CONFIG_DEBUG_BLK_CGROUP */
-
-static inline void cfqg_stats_set_start_group_wait_time(struct cfq_group *cfqg, struct cfq_group *curr_cfqg) { }
-static inline void cfqg_stats_end_empty_time(struct cfqg_stats *stats) { }
-static inline void cfqg_stats_update_dequeue(struct cfq_group *cfqg) { }
-static inline void cfqg_stats_set_start_empty_time(struct cfq_group *cfqg) { }
-static inline void cfqg_stats_update_idle_time(struct cfq_group *cfqg) { }
-static inline void cfqg_stats_set_start_idle_time(struct cfq_group *cfqg) { }
-static inline void cfqg_stats_update_avg_queue_size(struct cfq_group *cfqg) { }
-
-#endif	/* CONFIG_CFQ_GROUP_IOSCHED && CONFIG_DEBUG_BLK_CGROUP */
-
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-
-static inline struct cfq_group *pd_to_cfqg(struct blkg_policy_data *pd)
-{
-	return pd ? container_of(pd, struct cfq_group, pd) : NULL;
-}
-
-static struct cfq_group_data
-*cpd_to_cfqgd(struct blkcg_policy_data *cpd)
-{
-	return cpd ? container_of(cpd, struct cfq_group_data, cpd) : NULL;
-}
-
-static inline struct blkcg_gq *cfqg_to_blkg(struct cfq_group *cfqg)
-{
-	return pd_to_blkg(&cfqg->pd);
-}
-
-static struct blkcg_policy blkcg_policy_cfq;
-
-static inline struct cfq_group *blkg_to_cfqg(struct blkcg_gq *blkg)
-{
-	return pd_to_cfqg(blkg_to_pd(blkg, &blkcg_policy_cfq));
-}
-
-static struct cfq_group_data *blkcg_to_cfqgd(struct blkcg *blkcg)
-{
-	return cpd_to_cfqgd(blkcg_to_cpd(blkcg, &blkcg_policy_cfq));
-}
-
-static inline struct cfq_group *cfqg_parent(struct cfq_group *cfqg)
-{
-	struct blkcg_gq *pblkg = cfqg_to_blkg(cfqg)->parent;
-
-	return pblkg ? blkg_to_cfqg(pblkg) : NULL;
-}
-
-static inline bool cfqg_is_descendant(struct cfq_group *cfqg,
-				      struct cfq_group *ancestor)
-{
-	return cgroup_is_descendant(cfqg_to_blkg(cfqg)->blkcg->css.cgroup,
-				    cfqg_to_blkg(ancestor)->blkcg->css.cgroup);
-}
-
-static inline void cfqg_get(struct cfq_group *cfqg)
-{
-	return blkg_get(cfqg_to_blkg(cfqg));
-}
-
-static inline void cfqg_put(struct cfq_group *cfqg)
-{
-	return blkg_put(cfqg_to_blkg(cfqg));
-}
-
-#define cfq_log_cfqq(cfqd, cfqq, fmt, args...)	do {			\
-	blk_add_cgroup_trace_msg((cfqd)->queue,				\
-			cfqg_to_blkg((cfqq)->cfqg)->blkcg,		\
-			"cfq%d%c%c " fmt, (cfqq)->pid,			\
-			cfq_cfqq_sync((cfqq)) ? 'S' : 'A',		\
-			cfqq_type((cfqq)) == SYNC_NOIDLE_WORKLOAD ? 'N' : ' ',\
-			  ##args);					\
-} while (0)
-
-#define cfq_log_cfqg(cfqd, cfqg, fmt, args...)	do {			\
-	blk_add_cgroup_trace_msg((cfqd)->queue,				\
-			cfqg_to_blkg(cfqg)->blkcg, fmt, ##args);	\
-} while (0)
-
-static inline void cfqg_stats_update_io_add(struct cfq_group *cfqg,
-					    struct cfq_group *curr_cfqg,
-					    unsigned int op)
-{
-	blkg_rwstat_add(&cfqg->stats.queued, op, 1);
-	cfqg_stats_end_empty_time(&cfqg->stats);
-	cfqg_stats_set_start_group_wait_time(cfqg, curr_cfqg);
-}
-
-static inline void cfqg_stats_update_timeslice_used(struct cfq_group *cfqg,
-			uint64_t time, unsigned long unaccounted_time)
-{
-	blkg_stat_add(&cfqg->stats.time, time);
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-	blkg_stat_add(&cfqg->stats.unaccounted_time, unaccounted_time);
-#endif
-}
-
-static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg,
-					       unsigned int op)
-{
-	blkg_rwstat_add(&cfqg->stats.queued, op, -1);
-}
-
-static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg,
-					       unsigned int op)
-{
-	blkg_rwstat_add(&cfqg->stats.merged, op, 1);
-}
-
-static inline void cfqg_stats_update_completion(struct cfq_group *cfqg,
-						u64 start_time_ns,
-						u64 io_start_time_ns,
-						unsigned int op)
-{
-	struct cfqg_stats *stats = &cfqg->stats;
-	u64 now = ktime_get_ns();
-
-	if (now > io_start_time_ns)
-		blkg_rwstat_add(&stats->service_time, op,
-				now - io_start_time_ns);
-	if (io_start_time_ns > start_time_ns)
-		blkg_rwstat_add(&stats->wait_time, op,
-				io_start_time_ns - start_time_ns);
-}
-
-/* @stats = 0 */
-static void cfqg_stats_reset(struct cfqg_stats *stats)
-{
-	/* queued stats shouldn't be cleared */
-	blkg_rwstat_reset(&stats->merged);
-	blkg_rwstat_reset(&stats->service_time);
-	blkg_rwstat_reset(&stats->wait_time);
-	blkg_stat_reset(&stats->time);
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-	blkg_stat_reset(&stats->unaccounted_time);
-	blkg_stat_reset(&stats->avg_queue_size_sum);
-	blkg_stat_reset(&stats->avg_queue_size_samples);
-	blkg_stat_reset(&stats->dequeue);
-	blkg_stat_reset(&stats->group_wait_time);
-	blkg_stat_reset(&stats->idle_time);
-	blkg_stat_reset(&stats->empty_time);
-#endif
-}
-
-/* @to += @from */
-static void cfqg_stats_add_aux(struct cfqg_stats *to, struct cfqg_stats *from)
-{
-	/* queued stats shouldn't be cleared */
-	blkg_rwstat_add_aux(&to->merged, &from->merged);
-	blkg_rwstat_add_aux(&to->service_time, &from->service_time);
-	blkg_rwstat_add_aux(&to->wait_time, &from->wait_time);
-	blkg_stat_add_aux(&from->time, &from->time);
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-	blkg_stat_add_aux(&to->unaccounted_time, &from->unaccounted_time);
-	blkg_stat_add_aux(&to->avg_queue_size_sum, &from->avg_queue_size_sum);
-	blkg_stat_add_aux(&to->avg_queue_size_samples, &from->avg_queue_size_samples);
-	blkg_stat_add_aux(&to->dequeue, &from->dequeue);
-	blkg_stat_add_aux(&to->group_wait_time, &from->group_wait_time);
-	blkg_stat_add_aux(&to->idle_time, &from->idle_time);
-	blkg_stat_add_aux(&to->empty_time, &from->empty_time);
-#endif
-}
-
-/*
- * Transfer @cfqg's stats to its parent's aux counts so that the ancestors'
- * recursive stats can still account for the amount used by this cfqg after
- * it's gone.
- */
-static void cfqg_stats_xfer_dead(struct cfq_group *cfqg)
-{
-	struct cfq_group *parent = cfqg_parent(cfqg);
-
-	lockdep_assert_held(cfqg_to_blkg(cfqg)->q->queue_lock);
-
-	if (unlikely(!parent))
-		return;
-
-	cfqg_stats_add_aux(&parent->stats, &cfqg->stats);
-	cfqg_stats_reset(&cfqg->stats);
-}
-
-#else	/* CONFIG_CFQ_GROUP_IOSCHED */
-
-static inline struct cfq_group *cfqg_parent(struct cfq_group *cfqg) { return NULL; }
-static inline bool cfqg_is_descendant(struct cfq_group *cfqg,
-				      struct cfq_group *ancestor)
-{
-	return true;
-}
-static inline void cfqg_get(struct cfq_group *cfqg) { }
-static inline void cfqg_put(struct cfq_group *cfqg) { }
-
-#define cfq_log_cfqq(cfqd, cfqq, fmt, args...)	\
-	blk_add_trace_msg((cfqd)->queue, "cfq%d%c%c " fmt, (cfqq)->pid,	\
-			cfq_cfqq_sync((cfqq)) ? 'S' : 'A',		\
-			cfqq_type((cfqq)) == SYNC_NOIDLE_WORKLOAD ? 'N' : ' ',\
-				##args)
-#define cfq_log_cfqg(cfqd, cfqg, fmt, args...)		do {} while (0)
-
-static inline void cfqg_stats_update_io_add(struct cfq_group *cfqg,
-			struct cfq_group *curr_cfqg, unsigned int op) { }
-static inline void cfqg_stats_update_timeslice_used(struct cfq_group *cfqg,
-			uint64_t time, unsigned long unaccounted_time) { }
-static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg,
-			unsigned int op) { }
-static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg,
-			unsigned int op) { }
-static inline void cfqg_stats_update_completion(struct cfq_group *cfqg,
-						u64 start_time_ns,
-						u64 io_start_time_ns,
-						unsigned int op) { }
-
-#endif	/* CONFIG_CFQ_GROUP_IOSCHED */
-
-#define cfq_log(cfqd, fmt, args...)	\
-	blk_add_trace_msg((cfqd)->queue, "cfq " fmt, ##args)
-
-/* Traverses through cfq group service trees */
-#define for_each_cfqg_st(cfqg, i, j, st) \
-	for (i = 0; i <= IDLE_WORKLOAD; i++) \
-		for (j = 0, st = i < IDLE_WORKLOAD ? &cfqg->service_trees[i][j]\
-			: &cfqg->service_tree_idle; \
-			(i < IDLE_WORKLOAD && j <= SYNC_WORKLOAD) || \
-			(i == IDLE_WORKLOAD && j == 0); \
-			j++, st = i < IDLE_WORKLOAD ? \
-			&cfqg->service_trees[i][j]: NULL) \
-
-static inline bool cfq_io_thinktime_big(struct cfq_data *cfqd,
-	struct cfq_ttime *ttime, bool group_idle)
-{
-	u64 slice;
-	if (!sample_valid(ttime->ttime_samples))
-		return false;
-	if (group_idle)
-		slice = cfqd->cfq_group_idle;
-	else
-		slice = cfqd->cfq_slice_idle;
-	return ttime->ttime_mean > slice;
-}
-
-static inline bool iops_mode(struct cfq_data *cfqd)
-{
-	/*
-	 * If we are not idling on queues and it is a NCQ drive, parallel
-	 * execution of requests is on and measuring time is not possible
-	 * in most of the cases until and unless we drive shallower queue
-	 * depths and that becomes a performance bottleneck. In such cases
-	 * switch to start providing fairness in terms of number of IOs.
-	 */
-	if (!cfqd->cfq_slice_idle && cfqd->hw_tag)
-		return true;
-	else
-		return false;
-}
-
-static inline enum wl_class_t cfqq_class(struct cfq_queue *cfqq)
-{
-	if (cfq_class_idle(cfqq))
-		return IDLE_WORKLOAD;
-	if (cfq_class_rt(cfqq))
-		return RT_WORKLOAD;
-	return BE_WORKLOAD;
-}
-
-
-static enum wl_type_t cfqq_type(struct cfq_queue *cfqq)
-{
-	if (!cfq_cfqq_sync(cfqq))
-		return ASYNC_WORKLOAD;
-	if (!cfq_cfqq_idle_window(cfqq))
-		return SYNC_NOIDLE_WORKLOAD;
-	return SYNC_WORKLOAD;
-}
-
-static inline int cfq_group_busy_queues_wl(enum wl_class_t wl_class,
-					struct cfq_data *cfqd,
-					struct cfq_group *cfqg)
-{
-	if (wl_class == IDLE_WORKLOAD)
-		return cfqg->service_tree_idle.count;
-
-	return cfqg->service_trees[wl_class][ASYNC_WORKLOAD].count +
-		cfqg->service_trees[wl_class][SYNC_NOIDLE_WORKLOAD].count +
-		cfqg->service_trees[wl_class][SYNC_WORKLOAD].count;
-}
-
-static inline int cfqg_busy_async_queues(struct cfq_data *cfqd,
-					struct cfq_group *cfqg)
-{
-	return cfqg->service_trees[RT_WORKLOAD][ASYNC_WORKLOAD].count +
-		cfqg->service_trees[BE_WORKLOAD][ASYNC_WORKLOAD].count;
-}
-
-static void cfq_dispatch_insert(struct request_queue *, struct request *);
-static struct cfq_queue *cfq_get_queue(struct cfq_data *cfqd, bool is_sync,
-				       struct cfq_io_cq *cic, struct bio *bio);
-
-static inline struct cfq_io_cq *icq_to_cic(struct io_cq *icq)
-{
-	/* cic->icq is the first member, %NULL will convert to %NULL */
-	return container_of(icq, struct cfq_io_cq, icq);
-}
-
-static inline struct cfq_io_cq *cfq_cic_lookup(struct cfq_data *cfqd,
-					       struct io_context *ioc)
-{
-	if (ioc)
-		return icq_to_cic(ioc_lookup_icq(ioc, cfqd->queue));
-	return NULL;
-}
-
-static inline struct cfq_queue *cic_to_cfqq(struct cfq_io_cq *cic, bool is_sync)
-{
-	return cic->cfqq[is_sync];
-}
-
-static inline void cic_set_cfqq(struct cfq_io_cq *cic, struct cfq_queue *cfqq,
-				bool is_sync)
-{
-	cic->cfqq[is_sync] = cfqq;
-}
-
-static inline struct cfq_data *cic_to_cfqd(struct cfq_io_cq *cic)
-{
-	return cic->icq.q->elevator->elevator_data;
-}
-
-/*
- * scheduler run of queue, if there are requests pending and no one in the
- * driver that will restart queueing
- */
-static inline void cfq_schedule_dispatch(struct cfq_data *cfqd)
-{
-	if (cfqd->busy_queues) {
-		cfq_log(cfqd, "schedule dispatch");
-		kblockd_schedule_work(&cfqd->unplug_work);
-	}
-}
-
-/*
- * Scale schedule slice based on io priority. Use the sync time slice only
- * if a queue is marked sync and has sync io queued. A sync queue with async
- * io only, should not get full sync slice length.
- */
-static inline u64 cfq_prio_slice(struct cfq_data *cfqd, bool sync,
-				 unsigned short prio)
-{
-	u64 base_slice = cfqd->cfq_slice[sync];
-	u64 slice = div_u64(base_slice, CFQ_SLICE_SCALE);
-
-	WARN_ON(prio >= IOPRIO_BE_NR);
-
-	return base_slice + (slice * (4 - prio));
-}
-
-static inline u64
-cfq_prio_to_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	return cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio);
-}
-
-/**
- * cfqg_scale_charge - scale disk time charge according to cfqg weight
- * @charge: disk time being charged
- * @vfraction: vfraction of the cfqg, fixed point w/ CFQ_SERVICE_SHIFT
- *
- * Scale @charge according to @vfraction, which is in range (0, 1].  The
- * scaling is inversely proportional.
- *
- * scaled = charge / vfraction
- *
- * The result is also in fixed point w/ CFQ_SERVICE_SHIFT.
- */
-static inline u64 cfqg_scale_charge(u64 charge,
-				    unsigned int vfraction)
-{
-	u64 c = charge << CFQ_SERVICE_SHIFT;	/* make it fixed point */
-
-	/* charge / vfraction */
-	c <<= CFQ_SERVICE_SHIFT;
-	return div_u64(c, vfraction);
-}
-
-static inline u64 max_vdisktime(u64 min_vdisktime, u64 vdisktime)
-{
-	s64 delta = (s64)(vdisktime - min_vdisktime);
-	if (delta > 0)
-		min_vdisktime = vdisktime;
-
-	return min_vdisktime;
-}
-
-static void update_min_vdisktime(struct cfq_rb_root *st)
-{
-	if (!RB_EMPTY_ROOT(&st->rb.rb_root)) {
-		struct cfq_group *cfqg = rb_entry_cfqg(st->rb.rb_leftmost);
-
-		st->min_vdisktime = max_vdisktime(st->min_vdisktime,
-						  cfqg->vdisktime);
-	}
-}
-
-/*
- * get averaged number of queues of RT/BE priority.
- * average is updated, with a formula that gives more weight to higher numbers,
- * to quickly follows sudden increases and decrease slowly
- */
-
-static inline unsigned cfq_group_get_avg_queues(struct cfq_data *cfqd,
-					struct cfq_group *cfqg, bool rt)
-{
-	unsigned min_q, max_q;
-	unsigned mult  = cfq_hist_divisor - 1;
-	unsigned round = cfq_hist_divisor / 2;
-	unsigned busy = cfq_group_busy_queues_wl(rt, cfqd, cfqg);
-
-	min_q = min(cfqg->busy_queues_avg[rt], busy);
-	max_q = max(cfqg->busy_queues_avg[rt], busy);
-	cfqg->busy_queues_avg[rt] = (mult * max_q + min_q + round) /
-		cfq_hist_divisor;
-	return cfqg->busy_queues_avg[rt];
-}
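A quick numeric feel for that average, assuming cfq_hist_divisor is 4 (it is
defined elsewhere in this file): mult is 3 and round is 2, so the update is
avg = (3*max + min + 2)/4. A standalone sketch:

	#include <stdio.h>

	int main(void)
	{
		const unsigned int divisor = 4;		/* assumed cfq_hist_divisor */
		const unsigned int mult = divisor - 1;
		const unsigned int round = divisor / 2;
		unsigned int busy[] = { 8, 8, 0, 0, 0, 0 };
		unsigned int avg = 0;
		unsigned int i;

		for (i = 0; i < sizeof(busy) / sizeof(busy[0]); i++) {
			unsigned int lo = avg < busy[i] ? avg : busy[i];
			unsigned int hi = avg > busy[i] ? avg : busy[i];

			avg = (mult * hi + lo + round) / divisor;
			printf("busy=%u avg=%u\n", busy[i], avg);
		}
		return 0;
	}

The average climbs 0 -> 6 -> 8 within two busy samples but only decays
8 -> 6 -> 5 -> 4 -> 3 once the queues go away, which is the behaviour the
comment describes.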
-
-static inline u64
-cfq_group_slice(struct cfq_data *cfqd, struct cfq_group *cfqg)
-{
-	return cfqd->cfq_target_latency * cfqg->vfraction >> CFQ_SERVICE_SHIFT;
-}
-
-static inline u64
-cfq_scaled_cfqq_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	u64 slice = cfq_prio_to_slice(cfqd, cfqq);
-	if (cfqd->cfq_latency) {
-		/*
-		 * interested queues (we consider only the ones with the same
-		 * priority class in the cfq group)
-		 */
-		unsigned iq = cfq_group_get_avg_queues(cfqd, cfqq->cfqg,
-						cfq_class_rt(cfqq));
-		u64 sync_slice = cfqd->cfq_slice[1];
-		u64 expect_latency = sync_slice * iq;
-		u64 group_slice = cfq_group_slice(cfqd, cfqq->cfqg);
-
-		if (expect_latency > group_slice) {
-			u64 base_low_slice = 2 * cfqd->cfq_slice_idle;
-			u64 low_slice;
-
-			/* scale low_slice according to IO priority
-			 * and sync vs async */
-			low_slice = div64_u64(base_low_slice*slice, sync_slice);
-			low_slice = min(slice, low_slice);
-			/* the adapted slice value is scaled to fit all iqs
-			 * into the target latency */
-			slice = div64_u64(slice*group_slice, expect_latency);
-			slice = max(slice, low_slice);
-		}
-	}
-	return slice;
-}
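Plugging assumed defaults into the above (300ms target latency, 100ms sync
slice, 8ms slice_idle -- all set elsewhere in this file) shows how the slice
shrinks once a group has more queues than its latency budget can cover. Not
kernel code, just the same arithmetic:

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		const uint64_t sync_slice = 100, slice_idle = 8;	/* ms, assumed defaults */
		const uint64_t group_slice = 300;	/* ms: one group owning the whole device */
		const uint64_t iq = 6;			/* averaged busy sync queues in the group */
		uint64_t slice = sync_slice;		/* a prio-4 queue's unscaled slice */
		uint64_t expect_latency = sync_slice * iq;	/* 600ms: over the 300ms budget */

		if (expect_latency > group_slice) {
			uint64_t low_slice = 2 * slice_idle * slice / sync_slice;	/* 16ms floor */

			if (low_slice > slice)
				low_slice = slice;
			slice = slice * group_slice / expect_latency;	/* 100 * 300/600 = 50ms */
			if (slice < low_slice)
				slice = low_slice;
		}
		printf("scaled slice: %llu ms\n", (unsigned long long)slice);
		return 0;
	}

With six queues competing for a 300ms window, each 100ms slice is cut to
50ms, never dropping below the 16ms low_slice floor.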
-
-static inline void
-cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	u64 slice = cfq_scaled_cfqq_slice(cfqd, cfqq);
-	u64 now = ktime_get_ns();
-
-	cfqq->slice_start = now;
-	cfqq->slice_end = now + slice;
-	cfqq->allocated_slice = slice;
-	cfq_log_cfqq(cfqd, cfqq, "set_slice=%llu", cfqq->slice_end - now);
-}
-
-/*
- * We need to wrap this check in cfq_cfqq_slice_new(), since ->slice_end
- * isn't valid until the first request from the dispatch is activated
- * and the slice time set.
- */
-static inline bool cfq_slice_used(struct cfq_queue *cfqq)
-{
-	if (cfq_cfqq_slice_new(cfqq))
-		return false;
-	if (ktime_get_ns() < cfqq->slice_end)
-		return false;
-
-	return true;
-}
-
-/*
- * Lifted from AS - choose which of rq1 and rq2 is best served now.
- * We choose the request that is closest to the head right now. Distance
- * behind the head is penalized and only allowed to a certain extent.
- */
-static struct request *
-cfq_choose_req(struct cfq_data *cfqd, struct request *rq1, struct request *rq2, sector_t last)
-{
-	sector_t s1, s2, d1 = 0, d2 = 0;
-	unsigned long back_max;
-#define CFQ_RQ1_WRAP	0x01 /* request 1 wraps */
-#define CFQ_RQ2_WRAP	0x02 /* request 2 wraps */
-	unsigned wrap = 0; /* bit mask: requests behind the disk head? */
-
-	if (rq1 == NULL || rq1 == rq2)
-		return rq2;
-	if (rq2 == NULL)
-		return rq1;
-
-	if (rq_is_sync(rq1) != rq_is_sync(rq2))
-		return rq_is_sync(rq1) ? rq1 : rq2;
-
-	if ((rq1->cmd_flags ^ rq2->cmd_flags) & REQ_PRIO)
-		return rq1->cmd_flags & REQ_PRIO ? rq1 : rq2;
-
-	s1 = blk_rq_pos(rq1);
-	s2 = blk_rq_pos(rq2);
-
-	/*
-	 * by definition, 1KiB is 2 sectors
-	 */
-	back_max = cfqd->cfq_back_max * 2;
-
-	/*
-	 * Strict one way elevator _except_ in the case where we allow
-	 * short backward seeks which are biased as twice the cost of a
-	 * similar forward seek.
-	 */
-	if (s1 >= last)
-		d1 = s1 - last;
-	else if (s1 + back_max >= last)
-		d1 = (last - s1) * cfqd->cfq_back_penalty;
-	else
-		wrap |= CFQ_RQ1_WRAP;
-
-	if (s2 >= last)
-		d2 = s2 - last;
-	else if (s2 + back_max >= last)
-		d2 = (last - s2) * cfqd->cfq_back_penalty;
-	else
-		wrap |= CFQ_RQ2_WRAP;
-
-	/* Found required data */
-
-	/*
-	 * By doing switch() on the bit mask "wrap" we avoid having to
-	 * check two variables for all permutations: --> faster!
-	 */
-	switch (wrap) {
-	case 0: /* common case for CFQ: rq1 and rq2 not wrapped */
-		if (d1 < d2)
-			return rq1;
-		else if (d2 < d1)
-			return rq2;
-		else {
-			if (s1 >= s2)
-				return rq1;
-			else
-				return rq2;
-		}
-
-	case CFQ_RQ2_WRAP:
-		return rq1;
-	case CFQ_RQ1_WRAP:
-		return rq2;
-	case (CFQ_RQ1_WRAP|CFQ_RQ2_WRAP): /* both rqs wrapped */
-	default:
-		/*
-		 * Since both rqs are wrapped,
-		 * start with the one that's further behind head
-		 * (--> only *one* back seek required),
-		 * since back seek takes more time than forward.
-		 */
-		if (s1 <= s2)
-			return rq1;
-		else
-			return rq2;
-	}
-}
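A concrete case, using the default back_max of 16384 sectors and back_penalty
of 2 (both assumed here; they are set elsewhere in this file): with the head
at sector 100000, a request at 100100 costs d=100 while a request at 99980
costs (100000 - 99980) * 2 = 40, so the short backward seek still wins. Had
the second request sat more than 16384 sectors behind the head, it would have
been flagged as wrapped and lost to any non-wrapped request regardless of
distance.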
-
-static struct cfq_queue *cfq_rb_first(struct cfq_rb_root *root)
-{
-	/* Service tree is empty */
-	if (!root->count)
-		return NULL;
-
-	return rb_entry(rb_first_cached(&root->rb), struct cfq_queue, rb_node);
-}
-
-static struct cfq_group *cfq_rb_first_group(struct cfq_rb_root *root)
-{
-	return rb_entry_cfqg(rb_first_cached(&root->rb));
-}
-
-static void cfq_rb_erase(struct rb_node *n, struct cfq_rb_root *root)
-{
-	if (root->rb_rightmost == n)
-		root->rb_rightmost = rb_prev(n);
-
-	rb_erase_cached(n, &root->rb);
-	RB_CLEAR_NODE(n);
-
-	--root->count;
-}
-
-/*
- * would be nice to take fifo expire time into account as well
- */
-static struct request *
-cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-		  struct request *last)
-{
-	struct rb_node *rbnext = rb_next(&last->rb_node);
-	struct rb_node *rbprev = rb_prev(&last->rb_node);
-	struct request *next = NULL, *prev = NULL;
-
-	BUG_ON(RB_EMPTY_NODE(&last->rb_node));
-
-	if (rbprev)
-		prev = rb_entry_rq(rbprev);
-
-	if (rbnext)
-		next = rb_entry_rq(rbnext);
-	else {
-		rbnext = rb_first(&cfqq->sort_list);
-		if (rbnext && rbnext != &last->rb_node)
-			next = rb_entry_rq(rbnext);
-	}
-
-	return cfq_choose_req(cfqd, next, prev, blk_rq_pos(last));
-}
-
-static u64 cfq_slice_offset(struct cfq_data *cfqd,
-			    struct cfq_queue *cfqq)
-{
-	/*
-	 * just an approximation, should be ok.
-	 */
-	return (cfqq->cfqg->nr_cfqq - 1) * (cfq_prio_slice(cfqd, 1, 0) -
-		       cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio));
-}
-
-static inline s64
-cfqg_key(struct cfq_rb_root *st, struct cfq_group *cfqg)
-{
-	return cfqg->vdisktime - st->min_vdisktime;
-}
-
-static void
-__cfq_group_service_tree_add(struct cfq_rb_root *st, struct cfq_group *cfqg)
-{
-	struct rb_node **node = &st->rb.rb_root.rb_node;
-	struct rb_node *parent = NULL;
-	struct cfq_group *__cfqg;
-	s64 key = cfqg_key(st, cfqg);
-	bool leftmost = true, rightmost = true;
-
-	while (*node != NULL) {
-		parent = *node;
-		__cfqg = rb_entry_cfqg(parent);
-
-		if (key < cfqg_key(st, __cfqg)) {
-			node = &parent->rb_left;
-			rightmost = false;
-		} else {
-			node = &parent->rb_right;
-			leftmost = false;
-		}
-	}
-
-	if (rightmost)
-		st->rb_rightmost = &cfqg->rb_node;
-
-	rb_link_node(&cfqg->rb_node, parent, node);
-	rb_insert_color_cached(&cfqg->rb_node, &st->rb, leftmost);
-}
-
-/*
- * This has to be called only on activation of cfqg
- */
-static void
-cfq_update_group_weight(struct cfq_group *cfqg)
-{
-	if (cfqg->new_weight) {
-		cfqg->weight = cfqg->new_weight;
-		cfqg->new_weight = 0;
-	}
-}
-
-static void
-cfq_update_group_leaf_weight(struct cfq_group *cfqg)
-{
-	BUG_ON(!RB_EMPTY_NODE(&cfqg->rb_node));
-
-	if (cfqg->new_leaf_weight) {
-		cfqg->leaf_weight = cfqg->new_leaf_weight;
-		cfqg->new_leaf_weight = 0;
-	}
-}
-
-static void
-cfq_group_service_tree_add(struct cfq_rb_root *st, struct cfq_group *cfqg)
-{
-	unsigned int vfr = 1 << CFQ_SERVICE_SHIFT;	/* start with 1 */
-	struct cfq_group *pos = cfqg;
-	struct cfq_group *parent;
-	bool propagate;
-
-	/* add to the service tree */
-	BUG_ON(!RB_EMPTY_NODE(&cfqg->rb_node));
-
-	/*
-	 * Update leaf_weight.  We cannot update weight at this point
-	 * because cfqg might already have been activated and is
-	 * contributing its current weight to the parent's children_weight.
-	 */
-	cfq_update_group_leaf_weight(cfqg);
-	__cfq_group_service_tree_add(st, cfqg);
-
-	/*
-	 * Activate @cfqg and calculate the portion of vfraction @cfqg is
-	 * entitled to.  vfraction is calculated by walking the tree
-	 * towards the root calculating the fraction it has at each level.
-	 * The compounded ratio is how much vfraction @cfqg owns.
-	 *
-	 * Start with the proportion the tasks in this cfqg have against its
-	 * active children cfqgs - its leaf_weight against children_weight.
-	 */
-	propagate = !pos->nr_active++;
-	pos->children_weight += pos->leaf_weight;
-	vfr = vfr * pos->leaf_weight / pos->children_weight;
-
-	/*
-	 * Compound ->weight walking up the tree.  Both activation and
-	 * vfraction calculation are done in the same loop.  Propagation
-	 * stops once an already activated node is met.  vfraction
-	 * calculation should always continue to the root.
-	 */
-	while ((parent = cfqg_parent(pos))) {
-		if (propagate) {
-			cfq_update_group_weight(pos);
-			propagate = !parent->nr_active++;
-			parent->children_weight += pos->weight;
-		}
-		vfr = vfr * pos->weight / parent->children_weight;
-		pos = parent;
-	}
-
-	cfqg->vfraction = max_t(unsigned, vfr, 1);
-}
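A worked example of the compounding: say group A has leaf_weight 100, weight
100 and no active child groups, and the root group is already active with
children_weight 100 from its own queues. Activating A raises A's
children_weight to 100, so the first factor is 100/100 = 1; walking up, the
root's children_weight grows to 200 (its own 100 plus A's weight of 100), so
vfr ends up at 1 * 100/200 = 1/2. A is entitled to half the device, with the
root group's own queues holding the other half -- which matches the
leaf_weight semantics described in the blkcg documentation.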
-
-static inline u64 cfq_get_cfqg_vdisktime_delay(struct cfq_data *cfqd)
-{
-	if (!iops_mode(cfqd))
-		return CFQ_SLICE_MODE_GROUP_DELAY;
-	else
-		return CFQ_IOPS_MODE_GROUP_DELAY;
-}
-
-static void
-cfq_group_notify_queue_add(struct cfq_data *cfqd, struct cfq_group *cfqg)
-{
-	struct cfq_rb_root *st = &cfqd->grp_service_tree;
-	struct cfq_group *__cfqg;
-	struct rb_node *n;
-
-	cfqg->nr_cfqq++;
-	if (!RB_EMPTY_NODE(&cfqg->rb_node))
-		return;
-
-	/*
-	 * Currently put the group at the end. Later, implement something
-	 * so that groups get a smaller vtime based on their weights, so that
-	 * a group does not lose its entire share if it was not continuously
-	 * backlogged.
-	 */
-	n = st->rb_rightmost;
-	if (n) {
-		__cfqg = rb_entry_cfqg(n);
-		cfqg->vdisktime = __cfqg->vdisktime +
-			cfq_get_cfqg_vdisktime_delay(cfqd);
-	} else
-		cfqg->vdisktime = st->min_vdisktime;
-	cfq_group_service_tree_add(st, cfqg);
-}
-
-static void
-cfq_group_service_tree_del(struct cfq_rb_root *st, struct cfq_group *cfqg)
-{
-	struct cfq_group *pos = cfqg;
-	bool propagate;
-
-	/*
-	 * Undo activation from cfq_group_service_tree_add().  Deactivate
-	 * @cfqg and propagate deactivation upwards.
-	 */
-	propagate = !--pos->nr_active;
-	pos->children_weight -= pos->leaf_weight;
-
-	while (propagate) {
-		struct cfq_group *parent = cfqg_parent(pos);
-
-		/* @pos has 0 nr_active at this point */
-		WARN_ON_ONCE(pos->children_weight);
-		pos->vfraction = 0;
-
-		if (!parent)
-			break;
-
-		propagate = !--parent->nr_active;
-		parent->children_weight -= pos->weight;
-		pos = parent;
-	}
-
-	/* remove from the service tree */
-	if (!RB_EMPTY_NODE(&cfqg->rb_node))
-		cfq_rb_erase(&cfqg->rb_node, st);
-}
-
-static void
-cfq_group_notify_queue_del(struct cfq_data *cfqd, struct cfq_group *cfqg)
-{
-	struct cfq_rb_root *st = &cfqd->grp_service_tree;
-
-	BUG_ON(cfqg->nr_cfqq < 1);
-	cfqg->nr_cfqq--;
-
-	/* If there are other cfq queues under this group, don't delete it */
-	if (cfqg->nr_cfqq)
-		return;
-
-	cfq_log_cfqg(cfqd, cfqg, "del_from_rr group");
-	cfq_group_service_tree_del(st, cfqg);
-	cfqg->saved_wl_slice = 0;
-	cfqg_stats_update_dequeue(cfqg);
-}
-
-static inline u64 cfq_cfqq_slice_usage(struct cfq_queue *cfqq,
-				       u64 *unaccounted_time)
-{
-	u64 slice_used;
-	u64 now = ktime_get_ns();
-
-	/*
-	 * Queue got expired before even a single request completed or
-	 * got expired immediately after first request completion.
-	 */
-	if (!cfqq->slice_start || cfqq->slice_start == now) {
-		/*
-		 * Also charge the seek time incurred to the group, otherwise
-		 * if there are multiple queues in the group, each can dispatch
-		 * a single request on seeky media and cause lots of seek time
-		 * and the group will never know it.
-		 */
-		slice_used = max_t(u64, (now - cfqq->dispatch_start),
-					jiffies_to_nsecs(1));
-	} else {
-		slice_used = now - cfqq->slice_start;
-		if (slice_used > cfqq->allocated_slice) {
-			*unaccounted_time = slice_used - cfqq->allocated_slice;
-			slice_used = cfqq->allocated_slice;
-		}
-		if (cfqq->slice_start > cfqq->dispatch_start)
-			*unaccounted_time += cfqq->slice_start -
-					cfqq->dispatch_start;
-	}
-
-	return slice_used;
-}
-
-static void cfq_group_served(struct cfq_data *cfqd, struct cfq_group *cfqg,
-				struct cfq_queue *cfqq)
-{
-	struct cfq_rb_root *st = &cfqd->grp_service_tree;
-	u64 used_sl, charge, unaccounted_sl = 0;
-	int nr_sync = cfqg->nr_cfqq - cfqg_busy_async_queues(cfqd, cfqg)
-			- cfqg->service_tree_idle.count;
-	unsigned int vfr;
-	u64 now = ktime_get_ns();
-
-	BUG_ON(nr_sync < 0);
-	used_sl = charge = cfq_cfqq_slice_usage(cfqq, &unaccounted_sl);
-
-	if (iops_mode(cfqd))
-		charge = cfqq->slice_dispatch;
-	else if (!cfq_cfqq_sync(cfqq) && !nr_sync)
-		charge = cfqq->allocated_slice;
-
-	/*
-	 * Can't update vdisktime while on service tree and cfqg->vfraction
-	 * is valid only while on it.  Cache vfr, leave the service tree,
-	 * update vdisktime and go back on.  The re-addition to the tree
-	 * will also update the weights as necessary.
-	 */
-	vfr = cfqg->vfraction;
-	cfq_group_service_tree_del(st, cfqg);
-	cfqg->vdisktime += cfqg_scale_charge(charge, vfr);
-	cfq_group_service_tree_add(st, cfqg);
-
-	/* This group is being expired. Save the context */
-	if (cfqd->workload_expires > now) {
-		cfqg->saved_wl_slice = cfqd->workload_expires - now;
-		cfqg->saved_wl_type = cfqd->serving_wl_type;
-		cfqg->saved_wl_class = cfqd->serving_wl_class;
-	} else
-		cfqg->saved_wl_slice = 0;
-
-	cfq_log_cfqg(cfqd, cfqg, "served: vt=%llu min_vt=%llu", cfqg->vdisktime,
-					st->min_vdisktime);
-	cfq_log_cfqq(cfqq->cfqd, cfqq,
-		     "sl_used=%llu disp=%llu charge=%llu iops=%u sect=%lu",
-		     used_sl, cfqq->slice_dispatch, charge,
-		     iops_mode(cfqd), cfqq->nr_sectors);
-	cfqg_stats_update_timeslice_used(cfqg, used_sl, unaccounted_sl);
-	cfqg_stats_set_start_empty_time(cfqg);
-}
-
-/**
- * cfq_init_cfqg_base - initialize base part of a cfq_group
- * @cfqg: cfq_group to initialize
- *
- * Initialize the base part which is used whether %CONFIG_CFQ_GROUP_IOSCHED
- * is enabled or not.
- */
-static void cfq_init_cfqg_base(struct cfq_group *cfqg)
-{
-	struct cfq_rb_root *st;
-	int i, j;
-
-	for_each_cfqg_st(cfqg, i, j, st)
-		*st = CFQ_RB_ROOT;
-	RB_CLEAR_NODE(&cfqg->rb_node);
-
-	cfqg->ttime.last_end_request = ktime_get_ns();
-}
-
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-static int __cfq_set_weight(struct cgroup_subsys_state *css, u64 val,
-			    bool on_dfl, bool reset_dev, bool is_leaf_weight);
-
-static void cfqg_stats_exit(struct cfqg_stats *stats)
-{
-	blkg_rwstat_exit(&stats->merged);
-	blkg_rwstat_exit(&stats->service_time);
-	blkg_rwstat_exit(&stats->wait_time);
-	blkg_rwstat_exit(&stats->queued);
-	blkg_stat_exit(&stats->time);
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-	blkg_stat_exit(&stats->unaccounted_time);
-	blkg_stat_exit(&stats->avg_queue_size_sum);
-	blkg_stat_exit(&stats->avg_queue_size_samples);
-	blkg_stat_exit(&stats->dequeue);
-	blkg_stat_exit(&stats->group_wait_time);
-	blkg_stat_exit(&stats->idle_time);
-	blkg_stat_exit(&stats->empty_time);
-#endif
-}
-
-static int cfqg_stats_init(struct cfqg_stats *stats, gfp_t gfp)
-{
-	if (blkg_rwstat_init(&stats->merged, gfp) ||
-	    blkg_rwstat_init(&stats->service_time, gfp) ||
-	    blkg_rwstat_init(&stats->wait_time, gfp) ||
-	    blkg_rwstat_init(&stats->queued, gfp) ||
-	    blkg_stat_init(&stats->time, gfp))
-		goto err;
-
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-	if (blkg_stat_init(&stats->unaccounted_time, gfp) ||
-	    blkg_stat_init(&stats->avg_queue_size_sum, gfp) ||
-	    blkg_stat_init(&stats->avg_queue_size_samples, gfp) ||
-	    blkg_stat_init(&stats->dequeue, gfp) ||
-	    blkg_stat_init(&stats->group_wait_time, gfp) ||
-	    blkg_stat_init(&stats->idle_time, gfp) ||
-	    blkg_stat_init(&stats->empty_time, gfp))
-		goto err;
-#endif
-	return 0;
-err:
-	cfqg_stats_exit(stats);
-	return -ENOMEM;
-}
-
-static struct blkcg_policy_data *cfq_cpd_alloc(gfp_t gfp)
-{
-	struct cfq_group_data *cgd;
-
-	cgd = kzalloc(sizeof(*cgd), gfp);
-	if (!cgd)
-		return NULL;
-	return &cgd->cpd;
-}
-
-static void cfq_cpd_init(struct blkcg_policy_data *cpd)
-{
-	struct cfq_group_data *cgd = cpd_to_cfqgd(cpd);
-	unsigned int weight = cgroup_subsys_on_dfl(io_cgrp_subsys) ?
-			      CGROUP_WEIGHT_DFL : CFQ_WEIGHT_LEGACY_DFL;
-
-	if (cpd_to_blkcg(cpd) == &blkcg_root)
-		weight *= 2;
-
-	cgd->weight = weight;
-	cgd->leaf_weight = weight;
-}
-
-static void cfq_cpd_free(struct blkcg_policy_data *cpd)
-{
-	kfree(cpd_to_cfqgd(cpd));
-}
-
-static void cfq_cpd_bind(struct blkcg_policy_data *cpd)
-{
-	struct blkcg *blkcg = cpd_to_blkcg(cpd);
-	bool on_dfl = cgroup_subsys_on_dfl(io_cgrp_subsys);
-	unsigned int weight = on_dfl ? CGROUP_WEIGHT_DFL : CFQ_WEIGHT_LEGACY_DFL;
-
-	if (blkcg == &blkcg_root)
-		weight *= 2;
-
-	WARN_ON_ONCE(__cfq_set_weight(&blkcg->css, weight, on_dfl, true, false));
-	WARN_ON_ONCE(__cfq_set_weight(&blkcg->css, weight, on_dfl, true, true));
-}
-
-static struct blkg_policy_data *cfq_pd_alloc(gfp_t gfp, int node)
-{
-	struct cfq_group *cfqg;
-
-	cfqg = kzalloc_node(sizeof(*cfqg), gfp, node);
-	if (!cfqg)
-		return NULL;
-
-	cfq_init_cfqg_base(cfqg);
-	if (cfqg_stats_init(&cfqg->stats, gfp)) {
-		kfree(cfqg);
-		return NULL;
-	}
-
-	return &cfqg->pd;
-}
-
-static void cfq_pd_init(struct blkg_policy_data *pd)
-{
-	struct cfq_group *cfqg = pd_to_cfqg(pd);
-	struct cfq_group_data *cgd = blkcg_to_cfqgd(pd->blkg->blkcg);
-
-	cfqg->weight = cgd->weight;
-	cfqg->leaf_weight = cgd->leaf_weight;
-}
-
-static void cfq_pd_offline(struct blkg_policy_data *pd)
-{
-	struct cfq_group *cfqg = pd_to_cfqg(pd);
-	int i;
-
-	for (i = 0; i < IOPRIO_BE_NR; i++) {
-		if (cfqg->async_cfqq[0][i]) {
-			cfq_put_queue(cfqg->async_cfqq[0][i]);
-			cfqg->async_cfqq[0][i] = NULL;
-		}
-		if (cfqg->async_cfqq[1][i]) {
-			cfq_put_queue(cfqg->async_cfqq[1][i]);
-			cfqg->async_cfqq[1][i] = NULL;
-		}
-	}
-
-	if (cfqg->async_idle_cfqq) {
-		cfq_put_queue(cfqg->async_idle_cfqq);
-		cfqg->async_idle_cfqq = NULL;
-	}
-
-	/*
-	 * @blkg is going offline and will be ignored by
-	 * blkg_[rw]stat_recursive_sum().  Transfer stats to the parent so
-	 * that they don't get lost.  If IOs complete after this point, the
-	 * stats for them will be lost.  Oh well...
-	 */
-	cfqg_stats_xfer_dead(cfqg);
-}
-
-static void cfq_pd_free(struct blkg_policy_data *pd)
-{
-	struct cfq_group *cfqg = pd_to_cfqg(pd);
-
-	cfqg_stats_exit(&cfqg->stats);
-	return kfree(cfqg);
-}
-
-static void cfq_pd_reset_stats(struct blkg_policy_data *pd)
-{
-	struct cfq_group *cfqg = pd_to_cfqg(pd);
-
-	cfqg_stats_reset(&cfqg->stats);
-}
-
-static struct cfq_group *cfq_lookup_cfqg(struct cfq_data *cfqd,
-					 struct blkcg *blkcg)
-{
-	struct blkcg_gq *blkg;
-
-	blkg = blkg_lookup(blkcg, cfqd->queue);
-	if (likely(blkg))
-		return blkg_to_cfqg(blkg);
-	return NULL;
-}
-
-static void cfq_link_cfqq_cfqg(struct cfq_queue *cfqq, struct cfq_group *cfqg)
-{
-	cfqq->cfqg = cfqg;
-	/* cfqq reference on cfqg */
-	cfqg_get(cfqg);
-}
-
-static u64 cfqg_prfill_weight_device(struct seq_file *sf,
-				     struct blkg_policy_data *pd, int off)
-{
-	struct cfq_group *cfqg = pd_to_cfqg(pd);
-
-	if (!cfqg->dev_weight)
-		return 0;
-	return __blkg_prfill_u64(sf, pd, cfqg->dev_weight);
-}
-
-static int cfqg_print_weight_device(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  cfqg_prfill_weight_device, &blkcg_policy_cfq,
-			  0, false);
-	return 0;
-}
-
-static u64 cfqg_prfill_leaf_weight_device(struct seq_file *sf,
-					  struct blkg_policy_data *pd, int off)
-{
-	struct cfq_group *cfqg = pd_to_cfqg(pd);
-
-	if (!cfqg->dev_leaf_weight)
-		return 0;
-	return __blkg_prfill_u64(sf, pd, cfqg->dev_leaf_weight);
-}
-
-static int cfqg_print_leaf_weight_device(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  cfqg_prfill_leaf_weight_device, &blkcg_policy_cfq,
-			  0, false);
-	return 0;
-}
-
-static int cfq_print_weight(struct seq_file *sf, void *v)
-{
-	struct blkcg *blkcg = css_to_blkcg(seq_css(sf));
-	struct cfq_group_data *cgd = blkcg_to_cfqgd(blkcg);
-	unsigned int val = 0;
-
-	if (cgd)
-		val = cgd->weight;
-
-	seq_printf(sf, "%u\n", val);
-	return 0;
-}
-
-static int cfq_print_leaf_weight(struct seq_file *sf, void *v)
-{
-	struct blkcg *blkcg = css_to_blkcg(seq_css(sf));
-	struct cfq_group_data *cgd = blkcg_to_cfqgd(blkcg);
-	unsigned int val = 0;
-
-	if (cgd)
-		val = cgd->leaf_weight;
-
-	seq_printf(sf, "%u\n", val);
-	return 0;
-}
-
-static ssize_t __cfqg_set_weight_device(struct kernfs_open_file *of,
-					char *buf, size_t nbytes, loff_t off,
-					bool on_dfl, bool is_leaf_weight)
-{
-	unsigned int min = on_dfl ? CGROUP_WEIGHT_MIN : CFQ_WEIGHT_LEGACY_MIN;
-	unsigned int max = on_dfl ? CGROUP_WEIGHT_MAX : CFQ_WEIGHT_LEGACY_MAX;
-	struct blkcg *blkcg = css_to_blkcg(of_css(of));
-	struct blkg_conf_ctx ctx;
-	struct cfq_group *cfqg;
-	struct cfq_group_data *cfqgd;
-	int ret;
-	u64 v;
-
-	ret = blkg_conf_prep(blkcg, &blkcg_policy_cfq, buf, &ctx);
-	if (ret)
-		return ret;
-
-	if (sscanf(ctx.body, "%llu", &v) == 1) {
-		/* require "default" on dfl */
-		ret = -ERANGE;
-		if (!v && on_dfl)
-			goto out_finish;
-	} else if (!strcmp(strim(ctx.body), "default")) {
-		v = 0;
-	} else {
-		ret = -EINVAL;
-		goto out_finish;
-	}
-
-	cfqg = blkg_to_cfqg(ctx.blkg);
-	cfqgd = blkcg_to_cfqgd(blkcg);
-
-	ret = -ERANGE;
-	if (!v || (v >= min && v <= max)) {
-		if (!is_leaf_weight) {
-			cfqg->dev_weight = v;
-			cfqg->new_weight = v ?: cfqgd->weight;
-		} else {
-			cfqg->dev_leaf_weight = v;
-			cfqg->new_leaf_weight = v ?: cfqgd->leaf_weight;
-		}
-		ret = 0;
-	}
-out_finish:
-	blkg_conf_finish(&ctx);
-	return ret ?: nbytes;
-}
-
-static ssize_t cfqg_set_weight_device(struct kernfs_open_file *of,
-				      char *buf, size_t nbytes, loff_t off)
-{
-	return __cfqg_set_weight_device(of, buf, nbytes, off, false, false);
-}
-
-static ssize_t cfqg_set_leaf_weight_device(struct kernfs_open_file *of,
-					   char *buf, size_t nbytes, loff_t off)
-{
-	return __cfqg_set_weight_device(of, buf, nbytes, off, false, true);
-}
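For reference, the strings this parser accepts match the legacy blkio
interface: writing e.g. "8:16 600" to blkio.weight_device (or
blkio.leaf_weight_device) sets a per-device weight for that cgroup, while
"8:16 default" or "8:16 0" drops the override so the group falls back to its
blkio.weight. The 8:16 major:minor pair is only an illustrative device.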
-
-static int __cfq_set_weight(struct cgroup_subsys_state *css, u64 val,
-			    bool on_dfl, bool reset_dev, bool is_leaf_weight)
-{
-	unsigned int min = on_dfl ? CGROUP_WEIGHT_MIN : CFQ_WEIGHT_LEGACY_MIN;
-	unsigned int max = on_dfl ? CGROUP_WEIGHT_MAX : CFQ_WEIGHT_LEGACY_MAX;
-	struct blkcg *blkcg = css_to_blkcg(css);
-	struct blkcg_gq *blkg;
-	struct cfq_group_data *cfqgd;
-	int ret = 0;
-
-	if (val < min || val > max)
-		return -ERANGE;
-
-	spin_lock_irq(&blkcg->lock);
-	cfqgd = blkcg_to_cfqgd(blkcg);
-	if (!cfqgd) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	if (!is_leaf_weight)
-		cfqgd->weight = val;
-	else
-		cfqgd->leaf_weight = val;
-
-	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
-		struct cfq_group *cfqg = blkg_to_cfqg(blkg);
-
-		if (!cfqg)
-			continue;
-
-		if (!is_leaf_weight) {
-			if (reset_dev)
-				cfqg->dev_weight = 0;
-			if (!cfqg->dev_weight)
-				cfqg->new_weight = cfqgd->weight;
-		} else {
-			if (reset_dev)
-				cfqg->dev_leaf_weight = 0;
-			if (!cfqg->dev_leaf_weight)
-				cfqg->new_leaf_weight = cfqgd->leaf_weight;
-		}
-	}
-
-out:
-	spin_unlock_irq(&blkcg->lock);
-	return ret;
-}
-
-static int cfq_set_weight(struct cgroup_subsys_state *css, struct cftype *cft,
-			  u64 val)
-{
-	return __cfq_set_weight(css, val, false, false, false);
-}
-
-static int cfq_set_leaf_weight(struct cgroup_subsys_state *css,
-			       struct cftype *cft, u64 val)
-{
-	return __cfq_set_weight(css, val, false, false, true);
-}
-
-static int cfqg_print_stat(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
-			  &blkcg_policy_cfq, seq_cft(sf)->private, false);
-	return 0;
-}
-
-static int cfqg_print_rwstat(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_rwstat,
-			  &blkcg_policy_cfq, seq_cft(sf)->private, true);
-	return 0;
-}
-
-static u64 cfqg_prfill_stat_recursive(struct seq_file *sf,
-				      struct blkg_policy_data *pd, int off)
-{
-	u64 sum = blkg_stat_recursive_sum(pd_to_blkg(pd),
-					  &blkcg_policy_cfq, off);
-	return __blkg_prfill_u64(sf, pd, sum);
-}
-
-static u64 cfqg_prfill_rwstat_recursive(struct seq_file *sf,
-					struct blkg_policy_data *pd, int off)
-{
-	struct blkg_rwstat sum = blkg_rwstat_recursive_sum(pd_to_blkg(pd),
-							&blkcg_policy_cfq, off);
-	return __blkg_prfill_rwstat(sf, pd, &sum);
-}
-
-static int cfqg_print_stat_recursive(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  cfqg_prfill_stat_recursive, &blkcg_policy_cfq,
-			  seq_cft(sf)->private, false);
-	return 0;
-}
-
-static int cfqg_print_rwstat_recursive(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  cfqg_prfill_rwstat_recursive, &blkcg_policy_cfq,
-			  seq_cft(sf)->private, true);
-	return 0;
-}
-
-static u64 cfqg_prfill_sectors(struct seq_file *sf, struct blkg_policy_data *pd,
-			       int off)
-{
-	u64 sum = blkg_rwstat_total(&pd->blkg->stat_bytes);
-
-	return __blkg_prfill_u64(sf, pd, sum >> 9);
-}
-
-static int cfqg_print_stat_sectors(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  cfqg_prfill_sectors, &blkcg_policy_cfq, 0, false);
-	return 0;
-}
-
-static u64 cfqg_prfill_sectors_recursive(struct seq_file *sf,
-					 struct blkg_policy_data *pd, int off)
-{
-	struct blkg_rwstat tmp = blkg_rwstat_recursive_sum(pd->blkg, NULL,
-					offsetof(struct blkcg_gq, stat_bytes));
-	u64 sum = atomic64_read(&tmp.aux_cnt[BLKG_RWSTAT_READ]) +
-		atomic64_read(&tmp.aux_cnt[BLKG_RWSTAT_WRITE]);
-
-	return __blkg_prfill_u64(sf, pd, sum >> 9);
-}
-
-static int cfqg_print_stat_sectors_recursive(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  cfqg_prfill_sectors_recursive, &blkcg_policy_cfq, 0,
-			  false);
-	return 0;
-}
-
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-static u64 cfqg_prfill_avg_queue_size(struct seq_file *sf,
-				      struct blkg_policy_data *pd, int off)
-{
-	struct cfq_group *cfqg = pd_to_cfqg(pd);
-	u64 samples = blkg_stat_read(&cfqg->stats.avg_queue_size_samples);
-	u64 v = 0;
-
-	if (samples) {
-		v = blkg_stat_read(&cfqg->stats.avg_queue_size_sum);
-		v = div64_u64(v, samples);
-	}
-	__blkg_prfill_u64(sf, pd, v);
-	return 0;
-}
-
-/* print avg_queue_size */
-static int cfqg_print_avg_queue_size(struct seq_file *sf, void *v)
-{
-	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
-			  cfqg_prfill_avg_queue_size, &blkcg_policy_cfq,
-			  0, false);
-	return 0;
-}
-#endif	/* CONFIG_DEBUG_BLK_CGROUP */
-
-static struct cftype cfq_blkcg_legacy_files[] = {
-	/* on root, weight is mapped to leaf_weight */
-	{
-		.name = "weight_device",
-		.flags = CFTYPE_ONLY_ON_ROOT,
-		.seq_show = cfqg_print_leaf_weight_device,
-		.write = cfqg_set_leaf_weight_device,
-	},
-	{
-		.name = "weight",
-		.flags = CFTYPE_ONLY_ON_ROOT,
-		.seq_show = cfq_print_leaf_weight,
-		.write_u64 = cfq_set_leaf_weight,
-	},
-
-	/* no such mapping necessary for !roots */
-	{
-		.name = "weight_device",
-		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = cfqg_print_weight_device,
-		.write = cfqg_set_weight_device,
-	},
-	{
-		.name = "weight",
-		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = cfq_print_weight,
-		.write_u64 = cfq_set_weight,
-	},
-
-	{
-		.name = "leaf_weight_device",
-		.seq_show = cfqg_print_leaf_weight_device,
-		.write = cfqg_set_leaf_weight_device,
-	},
-	{
-		.name = "leaf_weight",
-		.seq_show = cfq_print_leaf_weight,
-		.write_u64 = cfq_set_leaf_weight,
-	},
-
-	/* statistics covering only the tasks in the cfqg */
-	{
-		.name = "time",
-		.private = offsetof(struct cfq_group, stats.time),
-		.seq_show = cfqg_print_stat,
-	},
-	{
-		.name = "sectors",
-		.seq_show = cfqg_print_stat_sectors,
-	},
-	{
-		.name = "io_service_bytes",
-		.private = (unsigned long)&blkcg_policy_cfq,
-		.seq_show = blkg_print_stat_bytes,
-	},
-	{
-		.name = "io_serviced",
-		.private = (unsigned long)&blkcg_policy_cfq,
-		.seq_show = blkg_print_stat_ios,
-	},
-	{
-		.name = "io_service_time",
-		.private = offsetof(struct cfq_group, stats.service_time),
-		.seq_show = cfqg_print_rwstat,
-	},
-	{
-		.name = "io_wait_time",
-		.private = offsetof(struct cfq_group, stats.wait_time),
-		.seq_show = cfqg_print_rwstat,
-	},
-	{
-		.name = "io_merged",
-		.private = offsetof(struct cfq_group, stats.merged),
-		.seq_show = cfqg_print_rwstat,
-	},
-	{
-		.name = "io_queued",
-		.private = offsetof(struct cfq_group, stats.queued),
-		.seq_show = cfqg_print_rwstat,
-	},
-
-	/* the same statistics, covering the cfqg and its descendants */
-	{
-		.name = "time_recursive",
-		.private = offsetof(struct cfq_group, stats.time),
-		.seq_show = cfqg_print_stat_recursive,
-	},
-	{
-		.name = "sectors_recursive",
-		.seq_show = cfqg_print_stat_sectors_recursive,
-	},
-	{
-		.name = "io_service_bytes_recursive",
-		.private = (unsigned long)&blkcg_policy_cfq,
-		.seq_show = blkg_print_stat_bytes_recursive,
-	},
-	{
-		.name = "io_serviced_recursive",
-		.private = (unsigned long)&blkcg_policy_cfq,
-		.seq_show = blkg_print_stat_ios_recursive,
-	},
-	{
-		.name = "io_service_time_recursive",
-		.private = offsetof(struct cfq_group, stats.service_time),
-		.seq_show = cfqg_print_rwstat_recursive,
-	},
-	{
-		.name = "io_wait_time_recursive",
-		.private = offsetof(struct cfq_group, stats.wait_time),
-		.seq_show = cfqg_print_rwstat_recursive,
-	},
-	{
-		.name = "io_merged_recursive",
-		.private = offsetof(struct cfq_group, stats.merged),
-		.seq_show = cfqg_print_rwstat_recursive,
-	},
-	{
-		.name = "io_queued_recursive",
-		.private = offsetof(struct cfq_group, stats.queued),
-		.seq_show = cfqg_print_rwstat_recursive,
-	},
-#ifdef CONFIG_DEBUG_BLK_CGROUP
-	{
-		.name = "avg_queue_size",
-		.seq_show = cfqg_print_avg_queue_size,
-	},
-	{
-		.name = "group_wait_time",
-		.private = offsetof(struct cfq_group, stats.group_wait_time),
-		.seq_show = cfqg_print_stat,
-	},
-	{
-		.name = "idle_time",
-		.private = offsetof(struct cfq_group, stats.idle_time),
-		.seq_show = cfqg_print_stat,
-	},
-	{
-		.name = "empty_time",
-		.private = offsetof(struct cfq_group, stats.empty_time),
-		.seq_show = cfqg_print_stat,
-	},
-	{
-		.name = "dequeue",
-		.private = offsetof(struct cfq_group, stats.dequeue),
-		.seq_show = cfqg_print_stat,
-	},
-	{
-		.name = "unaccounted_time",
-		.private = offsetof(struct cfq_group, stats.unaccounted_time),
-		.seq_show = cfqg_print_stat,
-	},
-#endif	/* CONFIG_DEBUG_BLK_CGROUP */
-	{ }	/* terminate */
-};
-
-static int cfq_print_weight_on_dfl(struct seq_file *sf, void *v)
-{
-	struct blkcg *blkcg = css_to_blkcg(seq_css(sf));
-	struct cfq_group_data *cgd = blkcg_to_cfqgd(blkcg);
-
-	seq_printf(sf, "default %u\n", cgd->weight);
-	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_weight_device,
-			  &blkcg_policy_cfq, 0, false);
-	return 0;
-}
-
-static ssize_t cfq_set_weight_on_dfl(struct kernfs_open_file *of,
-				     char *buf, size_t nbytes, loff_t off)
-{
-	char *endp;
-	int ret;
-	u64 v;
-
-	buf = strim(buf);
-
-	/* "WEIGHT" or "default WEIGHT" sets the default weight */
-	v = simple_strtoull(buf, &endp, 0);
-	if (*endp == '\0' || sscanf(buf, "default %llu", &v) == 1) {
-		ret = __cfq_set_weight(of_css(of), v, true, false, false);
-		return ret ?: nbytes;
-	}
-
-	/* "MAJ:MIN WEIGHT" */
-	return __cfqg_set_weight_device(of, buf, nbytes, off, true, false);
-}
-
-static struct cftype cfq_blkcg_files[] = {
-	{
-		.name = "weight",
-		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = cfq_print_weight_on_dfl,
-		.write = cfq_set_weight_on_dfl,
-	},
-	{ }	/* terminate */
-};
-
-#else /* GROUP_IOSCHED */
-static struct cfq_group *cfq_lookup_cfqg(struct cfq_data *cfqd,
-					 struct blkcg *blkcg)
-{
-	return cfqd->root_group;
-}
-
-static inline void
-cfq_link_cfqq_cfqg(struct cfq_queue *cfqq, struct cfq_group *cfqg) {
-	cfqq->cfqg = cfqg;
-}
-
-#endif /* GROUP_IOSCHED */
-
-/*
- * The cfqd->service_trees holds all pending cfq_queues that have
- * requests waiting to be processed. It is sorted in the order that
- * we will service the queues.
- */
-static void cfq_service_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-				 bool add_front)
-{
-	struct rb_node **p, *parent;
-	struct cfq_queue *__cfqq;
-	u64 rb_key;
-	struct cfq_rb_root *st;
-	bool leftmost = true;
-	int new_cfqq = 1;
-	u64 now = ktime_get_ns();
-
-	st = st_for(cfqq->cfqg, cfqq_class(cfqq), cfqq_type(cfqq));
-	if (cfq_class_idle(cfqq)) {
-		rb_key = CFQ_IDLE_DELAY;
-		parent = st->rb_rightmost;
-		if (parent && parent != &cfqq->rb_node) {
-			__cfqq = rb_entry(parent, struct cfq_queue, rb_node);
-			rb_key += __cfqq->rb_key;
-		} else
-			rb_key += now;
-	} else if (!add_front) {
-		/*
-		 * Get our rb key offset. Subtract any residual slice
-		 * value carried from last service. A negative resid
-		 * count indicates slice overrun, and this should position
-		 * the next service time further away in the tree.
-		 */
-		rb_key = cfq_slice_offset(cfqd, cfqq) + now;
-		rb_key -= cfqq->slice_resid;
-		cfqq->slice_resid = 0;
-	} else {
-		rb_key = -NSEC_PER_SEC;
-		__cfqq = cfq_rb_first(st);
-		rb_key += __cfqq ? __cfqq->rb_key : now;
-	}
-
-	if (!RB_EMPTY_NODE(&cfqq->rb_node)) {
-		new_cfqq = 0;
-		/*
-		 * same position, nothing more to do
-		 */
-		if (rb_key == cfqq->rb_key && cfqq->service_tree == st)
-			return;
-
-		cfq_rb_erase(&cfqq->rb_node, cfqq->service_tree);
-		cfqq->service_tree = NULL;
-	}
-
-	parent = NULL;
-	cfqq->service_tree = st;
-	p = &st->rb.rb_root.rb_node;
-	while (*p) {
-		parent = *p;
-		__cfqq = rb_entry(parent, struct cfq_queue, rb_node);
-
-		/*
-		 * sort by key, that represents service time.
-		 */
-		if (rb_key < __cfqq->rb_key)
-			p = &parent->rb_left;
-		else {
-			p = &parent->rb_right;
-			leftmost = false;
-		}
-	}
-
-	cfqq->rb_key = rb_key;
-	rb_link_node(&cfqq->rb_node, parent, p);
-	rb_insert_color_cached(&cfqq->rb_node, &st->rb, leftmost);
-	st->count++;
-	if (add_front || !new_cfqq)
-		return;
-	cfq_group_notify_queue_add(cfqd, cfqq->cfqg);
-}
-
-static struct cfq_queue *
-cfq_prio_tree_lookup(struct cfq_data *cfqd, struct rb_root *root,
-		     sector_t sector, struct rb_node **ret_parent,
-		     struct rb_node ***rb_link)
-{
-	struct rb_node **p, *parent;
-	struct cfq_queue *cfqq = NULL;
-
-	parent = NULL;
-	p = &root->rb_node;
-	while (*p) {
-		struct rb_node **n;
-
-		parent = *p;
-		cfqq = rb_entry(parent, struct cfq_queue, p_node);
-
-		/*
-		 * Sort strictly based on sector.  Smallest to the left,
-		 * largest to the right.
-		 */
-		if (sector > blk_rq_pos(cfqq->next_rq))
-			n = &(*p)->rb_right;
-		else if (sector < blk_rq_pos(cfqq->next_rq))
-			n = &(*p)->rb_left;
-		else
-			break;
-		p = n;
-		cfqq = NULL;
-	}
-
-	*ret_parent = parent;
-	if (rb_link)
-		*rb_link = p;
-	return cfqq;
-}
-
-static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	struct rb_node **p, *parent;
-	struct cfq_queue *__cfqq;
-
-	if (cfqq->p_root) {
-		rb_erase(&cfqq->p_node, cfqq->p_root);
-		cfqq->p_root = NULL;
-	}
-
-	if (cfq_class_idle(cfqq))
-		return;
-	if (!cfqq->next_rq)
-		return;
-
-	cfqq->p_root = &cfqd->prio_trees[cfqq->org_ioprio];
-	__cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root,
-				      blk_rq_pos(cfqq->next_rq), &parent, &p);
-	if (!__cfqq) {
-		rb_link_node(&cfqq->p_node, parent, p);
-		rb_insert_color(&cfqq->p_node, cfqq->p_root);
-	} else
-		cfqq->p_root = NULL;
-}
-
-/*
- * Update cfqq's position in the service tree.
- */
-static void cfq_resort_rr_list(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	/*
-	 * Resorting requires the cfqq to be on the RR list already.
-	 */
-	if (cfq_cfqq_on_rr(cfqq)) {
-		cfq_service_tree_add(cfqd, cfqq, 0);
-		cfq_prio_tree_add(cfqd, cfqq);
-	}
-}
-
-/*
- * add to busy list of queues for service, trying to be fair in ordering
- * the pending list according to last request service
- */
-static void cfq_add_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	cfq_log_cfqq(cfqd, cfqq, "add_to_rr");
-	BUG_ON(cfq_cfqq_on_rr(cfqq));
-	cfq_mark_cfqq_on_rr(cfqq);
-	cfqd->busy_queues++;
-	if (cfq_cfqq_sync(cfqq))
-		cfqd->busy_sync_queues++;
-
-	cfq_resort_rr_list(cfqd, cfqq);
-}
-
-/*
- * Called when the cfqq no longer has requests pending, remove it from
- * the service tree.
- */
-static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	cfq_log_cfqq(cfqd, cfqq, "del_from_rr");
-	BUG_ON(!cfq_cfqq_on_rr(cfqq));
-	cfq_clear_cfqq_on_rr(cfqq);
-
-	if (!RB_EMPTY_NODE(&cfqq->rb_node)) {
-		cfq_rb_erase(&cfqq->rb_node, cfqq->service_tree);
-		cfqq->service_tree = NULL;
-	}
-	if (cfqq->p_root) {
-		rb_erase(&cfqq->p_node, cfqq->p_root);
-		cfqq->p_root = NULL;
-	}
-
-	cfq_group_notify_queue_del(cfqd, cfqq->cfqg);
-	BUG_ON(!cfqd->busy_queues);
-	cfqd->busy_queues--;
-	if (cfq_cfqq_sync(cfqq))
-		cfqd->busy_sync_queues--;
-}
-
-/*
- * rb tree support functions
- */
-static void cfq_del_rq_rb(struct request *rq)
-{
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-	const int sync = rq_is_sync(rq);
-
-	BUG_ON(!cfqq->queued[sync]);
-	cfqq->queued[sync]--;
-
-	elv_rb_del(&cfqq->sort_list, rq);
-
-	if (cfq_cfqq_on_rr(cfqq) && RB_EMPTY_ROOT(&cfqq->sort_list)) {
-		/*
-		 * Queue will be deleted from service tree when we actually
-		 * expire it later. Right now just remove it from prio tree
-		 * as it is empty.
-		 */
-		if (cfqq->p_root) {
-			rb_erase(&cfqq->p_node, cfqq->p_root);
-			cfqq->p_root = NULL;
-		}
-	}
-}
-
-static void cfq_add_rq_rb(struct request *rq)
-{
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-	struct cfq_data *cfqd = cfqq->cfqd;
-	struct request *prev;
-
-	cfqq->queued[rq_is_sync(rq)]++;
-
-	elv_rb_add(&cfqq->sort_list, rq);
-
-	if (!cfq_cfqq_on_rr(cfqq))
-		cfq_add_cfqq_rr(cfqd, cfqq);
-
-	/*
-	 * check if this request is a better next-serve candidate
-	 */
-	prev = cfqq->next_rq;
-	cfqq->next_rq = cfq_choose_req(cfqd, cfqq->next_rq, rq, cfqd->last_position);
-
-	/*
-	 * adjust priority tree position, if ->next_rq changes
-	 */
-	if (prev != cfqq->next_rq)
-		cfq_prio_tree_add(cfqd, cfqq);
-
-	BUG_ON(!cfqq->next_rq);
-}
-
-static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq)
-{
-	elv_rb_del(&cfqq->sort_list, rq);
-	cfqq->queued[rq_is_sync(rq)]--;
-	cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->cmd_flags);
-	cfq_add_rq_rb(rq);
-	cfqg_stats_update_io_add(RQ_CFQG(rq), cfqq->cfqd->serving_group,
-				 rq->cmd_flags);
-}
-
-static struct request *
-cfq_find_rq_fmerge(struct cfq_data *cfqd, struct bio *bio)
-{
-	struct task_struct *tsk = current;
-	struct cfq_io_cq *cic;
-	struct cfq_queue *cfqq;
-
-	cic = cfq_cic_lookup(cfqd, tsk->io_context);
-	if (!cic)
-		return NULL;
-
-	cfqq = cic_to_cfqq(cic, op_is_sync(bio->bi_opf));
-	if (cfqq)
-		return elv_rb_find(&cfqq->sort_list, bio_end_sector(bio));
-
-	return NULL;
-}
-
-static void cfq_activate_request(struct request_queue *q, struct request *rq)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-
-	cfqd->rq_in_driver++;
-	cfq_log_cfqq(cfqd, RQ_CFQQ(rq), "activate rq, drv=%d",
-						cfqd->rq_in_driver);
-
-	cfqd->last_position = blk_rq_pos(rq) + blk_rq_sectors(rq);
-}
-
-static void cfq_deactivate_request(struct request_queue *q, struct request *rq)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-
-	WARN_ON(!cfqd->rq_in_driver);
-	cfqd->rq_in_driver--;
-	cfq_log_cfqq(cfqd, RQ_CFQQ(rq), "deactivate rq, drv=%d",
-						cfqd->rq_in_driver);
-}
-
-static void cfq_remove_request(struct request *rq)
-{
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-
-	if (cfqq->next_rq == rq)
-		cfqq->next_rq = cfq_find_next_rq(cfqq->cfqd, cfqq, rq);
-
-	list_del_init(&rq->queuelist);
-	cfq_del_rq_rb(rq);
-
-	cfqq->cfqd->rq_queued--;
-	cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->cmd_flags);
-	if (rq->cmd_flags & REQ_PRIO) {
-		WARN_ON(!cfqq->prio_pending);
-		cfqq->prio_pending--;
-	}
-}
-
-static enum elv_merge cfq_merge(struct request_queue *q, struct request **req,
-		     struct bio *bio)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-	struct request *__rq;
-
-	__rq = cfq_find_rq_fmerge(cfqd, bio);
-	if (__rq && elv_bio_merge_ok(__rq, bio)) {
-		*req = __rq;
-		return ELEVATOR_FRONT_MERGE;
-	}
-
-	return ELEVATOR_NO_MERGE;
-}
-
-static void cfq_merged_request(struct request_queue *q, struct request *req,
-			       enum elv_merge type)
-{
-	if (type == ELEVATOR_FRONT_MERGE) {
-		struct cfq_queue *cfqq = RQ_CFQQ(req);
-
-		cfq_reposition_rq_rb(cfqq, req);
-	}
-}
-
-static void cfq_bio_merged(struct request_queue *q, struct request *req,
-				struct bio *bio)
-{
-	cfqg_stats_update_io_merged(RQ_CFQG(req), bio->bi_opf);
-}
-
-static void
-cfq_merged_requests(struct request_queue *q, struct request *rq,
-		    struct request *next)
-{
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-
-	/*
-	 * reposition in fifo if next is older than rq
-	 */
-	if (!list_empty(&rq->queuelist) && !list_empty(&next->queuelist) &&
-	    next->fifo_time < rq->fifo_time &&
-	    cfqq == RQ_CFQQ(next)) {
-		list_move(&rq->queuelist, &next->queuelist);
-		rq->fifo_time = next->fifo_time;
-	}
-
-	if (cfqq->next_rq == next)
-		cfqq->next_rq = rq;
-	cfq_remove_request(next);
-	cfqg_stats_update_io_merged(RQ_CFQG(rq), next->cmd_flags);
-
-	cfqq = RQ_CFQQ(next);
-	/*
-	 * All requests of this queue have been merged into other queues, so
-	 * delete it from the service tree. If it's the active_queue,
-	 * cfq_dispatch_requests() will choose to expire it or to idle
-	 */
-	if (cfq_cfqq_on_rr(cfqq) && RB_EMPTY_ROOT(&cfqq->sort_list) &&
-	    cfqq != cfqd->active_queue)
-		cfq_del_cfqq_rr(cfqd, cfqq);
-}
-
-static int cfq_allow_bio_merge(struct request_queue *q, struct request *rq,
-			       struct bio *bio)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-	bool is_sync = op_is_sync(bio->bi_opf);
-	struct cfq_io_cq *cic;
-	struct cfq_queue *cfqq;
-
-	/*
-	 * Disallow merge of a sync bio into an async request.
-	 */
-	if (is_sync && !rq_is_sync(rq))
-		return false;
-
-	/*
-	 * Lookup the cfqq that this bio will be queued with and allow
-	 * merge only if rq is queued there.
-	 */
-	cic = cfq_cic_lookup(cfqd, current->io_context);
-	if (!cic)
-		return false;
-
-	cfqq = cic_to_cfqq(cic, is_sync);
-	return cfqq == RQ_CFQQ(rq);
-}
-
-static int cfq_allow_rq_merge(struct request_queue *q, struct request *rq,
-			      struct request *next)
-{
-	return RQ_CFQQ(rq) == RQ_CFQQ(next);
-}
-
-static inline void cfq_del_timer(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	hrtimer_try_to_cancel(&cfqd->idle_slice_timer);
-	cfqg_stats_update_idle_time(cfqq->cfqg);
-}
-
-static void __cfq_set_active_queue(struct cfq_data *cfqd,
-				   struct cfq_queue *cfqq)
-{
-	if (cfqq) {
-		cfq_log_cfqq(cfqd, cfqq, "set_active wl_class:%d wl_type:%d",
-				cfqd->serving_wl_class, cfqd->serving_wl_type);
-		cfqg_stats_update_avg_queue_size(cfqq->cfqg);
-		cfqq->slice_start = 0;
-		cfqq->dispatch_start = ktime_get_ns();
-		cfqq->allocated_slice = 0;
-		cfqq->slice_end = 0;
-		cfqq->slice_dispatch = 0;
-		cfqq->nr_sectors = 0;
-
-		cfq_clear_cfqq_wait_request(cfqq);
-		cfq_clear_cfqq_must_dispatch(cfqq);
-		cfq_clear_cfqq_must_alloc_slice(cfqq);
-		cfq_clear_cfqq_fifo_expire(cfqq);
-		cfq_mark_cfqq_slice_new(cfqq);
-
-		cfq_del_timer(cfqd, cfqq);
-	}
-
-	cfqd->active_queue = cfqq;
-}
-
-/*
- * current cfqq expired its slice (or was too idle), select new one
- */
-static void
-__cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-		    bool timed_out)
-{
-	cfq_log_cfqq(cfqd, cfqq, "slice expired t=%d", timed_out);
-
-	if (cfq_cfqq_wait_request(cfqq))
-		cfq_del_timer(cfqd, cfqq);
-
-	cfq_clear_cfqq_wait_request(cfqq);
-	cfq_clear_cfqq_wait_busy(cfqq);
-
-	/*
-	 * If this cfqq is shared between multiple processes, check to
-	 * make sure that those processes are still issuing I/Os within
-	 * the mean seek distance.  If not, it may be time to break the
-	 * queues apart again.
-	 */
-	if (cfq_cfqq_coop(cfqq) && CFQQ_SEEKY(cfqq))
-		cfq_mark_cfqq_split_coop(cfqq);
-
-	/*
-	 * store what was left of this slice, if the queue idled/timed out
-	 */
-	if (timed_out) {
-		if (cfq_cfqq_slice_new(cfqq))
-			cfqq->slice_resid = cfq_scaled_cfqq_slice(cfqd, cfqq);
-		else
-			cfqq->slice_resid = cfqq->slice_end - ktime_get_ns();
-		cfq_log_cfqq(cfqd, cfqq, "resid=%lld", cfqq->slice_resid);
-	}
-
-	cfq_group_served(cfqd, cfqq->cfqg, cfqq);
-
-	if (cfq_cfqq_on_rr(cfqq) && RB_EMPTY_ROOT(&cfqq->sort_list))
-		cfq_del_cfqq_rr(cfqd, cfqq);
-
-	cfq_resort_rr_list(cfqd, cfqq);
-
-	if (cfqq == cfqd->active_queue)
-		cfqd->active_queue = NULL;
-
-	if (cfqd->active_cic) {
-		put_io_context(cfqd->active_cic->icq.ioc);
-		cfqd->active_cic = NULL;
-	}
-}
-
-static inline void cfq_slice_expired(struct cfq_data *cfqd, bool timed_out)
-{
-	struct cfq_queue *cfqq = cfqd->active_queue;
-
-	if (cfqq)
-		__cfq_slice_expired(cfqd, cfqq, timed_out);
-}
-
-/*
- * Get next queue for service. Unless we have a queue preemption,
- * we'll simply select the first cfqq in the service tree.
- */
-static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
-{
-	struct cfq_rb_root *st = st_for(cfqd->serving_group,
-			cfqd->serving_wl_class, cfqd->serving_wl_type);
-
-	if (!cfqd->rq_queued)
-		return NULL;
-
-	/* There is nothing to dispatch */
-	if (!st)
-		return NULL;
-	if (RB_EMPTY_ROOT(&st->rb.rb_root))
-		return NULL;
-	return cfq_rb_first(st);
-}
-
-static struct cfq_queue *cfq_get_next_queue_forced(struct cfq_data *cfqd)
-{
-	struct cfq_group *cfqg;
-	struct cfq_queue *cfqq;
-	int i, j;
-	struct cfq_rb_root *st;
-
-	if (!cfqd->rq_queued)
-		return NULL;
-
-	cfqg = cfq_get_next_cfqg(cfqd);
-	if (!cfqg)
-		return NULL;
-
-	for_each_cfqg_st(cfqg, i, j, st) {
-		cfqq = cfq_rb_first(st);
-		if (cfqq)
-			return cfqq;
-	}
-	return NULL;
-}
-
-/*
- * Get and set a new active queue for service.
- */
-static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd,
-					      struct cfq_queue *cfqq)
-{
-	if (!cfqq)
-		cfqq = cfq_get_next_queue(cfqd);
-
-	__cfq_set_active_queue(cfqd, cfqq);
-	return cfqq;
-}
-
-static inline sector_t cfq_dist_from_last(struct cfq_data *cfqd,
-					  struct request *rq)
-{
-	if (blk_rq_pos(rq) >= cfqd->last_position)
-		return blk_rq_pos(rq) - cfqd->last_position;
-	else
-		return cfqd->last_position - blk_rq_pos(rq);
-}
-
-static inline int cfq_rq_close(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-			       struct request *rq)
-{
-	return cfq_dist_from_last(cfqd, rq) <= CFQQ_CLOSE_THR;
-}
-
-static struct cfq_queue *cfqq_close(struct cfq_data *cfqd,
-				    struct cfq_queue *cur_cfqq)
-{
-	struct rb_root *root = &cfqd->prio_trees[cur_cfqq->org_ioprio];
-	struct rb_node *parent, *node;
-	struct cfq_queue *__cfqq;
-	sector_t sector = cfqd->last_position;
-
-	if (RB_EMPTY_ROOT(root))
-		return NULL;
-
-	/*
-	 * First, if we find a request starting at the end of the last
-	 * request, choose it.
-	 */
-	__cfqq = cfq_prio_tree_lookup(cfqd, root, sector, &parent, NULL);
-	if (__cfqq)
-		return __cfqq;
-
-	/*
-	 * If the exact sector wasn't found, the parent of the NULL leaf
-	 * will contain the closest sector.
-	 */
-	__cfqq = rb_entry(parent, struct cfq_queue, p_node);
-	if (cfq_rq_close(cfqd, cur_cfqq, __cfqq->next_rq))
-		return __cfqq;
-
-	if (blk_rq_pos(__cfqq->next_rq) < sector)
-		node = rb_next(&__cfqq->p_node);
-	else
-		node = rb_prev(&__cfqq->p_node);
-	if (!node)
-		return NULL;
-
-	__cfqq = rb_entry(node, struct cfq_queue, p_node);
-	if (cfq_rq_close(cfqd, cur_cfqq, __cfqq->next_rq))
-		return __cfqq;
-
-	return NULL;
-}
-
-/*
- * cfqd - obvious
- * cur_cfqq - passed in so that we don't decide that the current queue is
- * 	      closely cooperating with itself.
- *
- * So, basically we're assuming that cur_cfqq has dispatched at least
- * one request, and that cfqd->last_position reflects a position on the disk
- * associated with the I/O issued by cur_cfqq.  I'm not sure this is a valid
- * assumption.
- */
-static struct cfq_queue *cfq_close_cooperator(struct cfq_data *cfqd,
-					      struct cfq_queue *cur_cfqq)
-{
-	struct cfq_queue *cfqq;
-
-	if (cfq_class_idle(cur_cfqq))
-		return NULL;
-	if (!cfq_cfqq_sync(cur_cfqq))
-		return NULL;
-	if (CFQQ_SEEKY(cur_cfqq))
-		return NULL;
-
-	/*
-	 * Don't search priority tree if it's the only queue in the group.
-	 */
-	if (cur_cfqq->cfqg->nr_cfqq == 1)
-		return NULL;
-
-	/*
-	 * We should notice if some of the queues are cooperating, e.g.
-	 * working closely on the same area of the disk. In that case,
-	 * we can group them together and avoid wasting time idling.
-	 */
-	cfqq = cfqq_close(cfqd, cur_cfqq);
-	if (!cfqq)
-		return NULL;
-
-	/* If new queue belongs to different cfq_group, don't choose it */
-	if (cur_cfqq->cfqg != cfqq->cfqg)
-		return NULL;
-
-	/*
-	 * It only makes sense to merge sync queues.
-	 */
-	if (!cfq_cfqq_sync(cfqq))
-		return NULL;
-	if (CFQQ_SEEKY(cfqq))
-		return NULL;
-
-	/*
-	 * Do not merge queues of different priority classes
-	 */
-	if (cfq_class_rt(cfqq) != cfq_class_rt(cur_cfqq))
-		return NULL;
-
-	return cfqq;
-}
-
-/*
- * Determine whether we should enforce idle window for this queue.
- */
-
-static bool cfq_should_idle(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	enum wl_class_t wl_class = cfqq_class(cfqq);
-	struct cfq_rb_root *st = cfqq->service_tree;
-
-	BUG_ON(!st);
-	BUG_ON(!st->count);
-
-	if (!cfqd->cfq_slice_idle)
-		return false;
-
-	/* We never do for idle class queues. */
-	if (wl_class == IDLE_WORKLOAD)
-		return false;
-
-	/* We do for queues that were marked with idle window flag. */
-	if (cfq_cfqq_idle_window(cfqq) &&
-	   !(blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag))
-		return true;
-
-	/*
-	 * Otherwise, we do only if they are the last ones
-	 * in their service tree.
-	 */
-	if (st->count == 1 && cfq_cfqq_sync(cfqq) &&
-	   !cfq_io_thinktime_big(cfqd, &st->ttime, false))
-		return true;
-	cfq_log_cfqq(cfqd, cfqq, "Not idling. st->count:%d", st->count);
-	return false;
-}
-
-static void cfq_arm_slice_timer(struct cfq_data *cfqd)
-{
-	struct cfq_queue *cfqq = cfqd->active_queue;
-	struct cfq_rb_root *st = cfqq->service_tree;
-	struct cfq_io_cq *cic;
-	u64 sl, group_idle = 0;
-	u64 now = ktime_get_ns();
-
-	/*
-	 * For SSD devices without a seek penalty, disable idling. But only do
-	 * so for devices that support queuing, otherwise we still have a
-	 * problem with sync vs async workloads.
-	 */
-	if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag &&
-		!cfqd->cfq_group_idle)
-		return;
-
-	WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list));
-	WARN_ON(cfq_cfqq_slice_new(cfqq));
-
-	/*
-	 * idle is disabled, either manually or by past process history
-	 */
-	if (!cfq_should_idle(cfqd, cfqq)) {
-		/* no queue idling. Check for group idling */
-		if (cfqd->cfq_group_idle)
-			group_idle = cfqd->cfq_group_idle;
-		else
-			return;
-	}
-
-	/*
-	 * still active requests from this queue, don't idle
-	 */
-	if (cfqq->dispatched)
-		return;
-
-	/*
-	 * task has exited, don't wait
-	 */
-	cic = cfqd->active_cic;
-	if (!cic || !atomic_read(&cic->icq.ioc->active_ref))
-		return;
-
-	/*
-	 * If our average think time is larger than the remaining time
-	 * slice, then don't idle. This avoids overrunning the allotted
-	 * time slice.
-	 */
-	if (sample_valid(cic->ttime.ttime_samples) &&
-	    (cfqq->slice_end - now < cic->ttime.ttime_mean)) {
-		cfq_log_cfqq(cfqd, cfqq, "Not idling. think_time:%llu",
-			     cic->ttime.ttime_mean);
-		return;
-	}
-
-	/*
-	 * If there are other queues in the group, or this is the only group
-	 * and it has too big a thinktime, don't do group idle.
-	 */
-	if (group_idle &&
-	    (cfqq->cfqg->nr_cfqq > 1 ||
-	     cfq_io_thinktime_big(cfqd, &st->ttime, true)))
-		return;
-
-	cfq_mark_cfqq_wait_request(cfqq);
-
-	if (group_idle)
-		sl = cfqd->cfq_group_idle;
-	else
-		sl = cfqd->cfq_slice_idle;
-
-	hrtimer_start(&cfqd->idle_slice_timer, ns_to_ktime(sl),
-		      HRTIMER_MODE_REL);
-	cfqg_stats_set_start_idle_time(cfqq->cfqg);
-	cfq_log_cfqq(cfqd, cfqq, "arm_idle: %llu group_idle: %d", sl,
-			group_idle ? 1 : 0);
-}
-
-/*
- * Move request from internal lists to the request queue dispatch list.
- */
-static void cfq_dispatch_insert(struct request_queue *q, struct request *rq)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-
-	cfq_log_cfqq(cfqd, cfqq, "dispatch_insert");
-
-	cfqq->next_rq = cfq_find_next_rq(cfqd, cfqq, rq);
-	cfq_remove_request(rq);
-	cfqq->dispatched++;
-	(RQ_CFQG(rq))->dispatched++;
-	elv_dispatch_sort(q, rq);
-
-	cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]++;
-	cfqq->nr_sectors += blk_rq_sectors(rq);
-}
-
-/*
- * return expired entry, or NULL to just start from scratch in rbtree
- */
-static struct request *cfq_check_fifo(struct cfq_queue *cfqq)
-{
-	struct request *rq = NULL;
-
-	if (cfq_cfqq_fifo_expire(cfqq))
-		return NULL;
-
-	cfq_mark_cfqq_fifo_expire(cfqq);
-
-	if (list_empty(&cfqq->fifo))
-		return NULL;
-
-	rq = rq_entry_fifo(cfqq->fifo.next);
-	if (ktime_get_ns() < rq->fifo_time)
-		rq = NULL;
-
-	return rq;
-}
-
-static inline int
-cfq_prio_to_maxrq(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	const int base_rq = cfqd->cfq_slice_async_rq;
-
-	WARN_ON(cfqq->ioprio >= IOPRIO_BE_NR);
-
-	return 2 * base_rq * (IOPRIO_BE_NR - cfqq->ioprio);
-}
-
-/*
- * Must be called with the queue_lock held.
- */
-static int cfqq_process_refs(struct cfq_queue *cfqq)
-{
-	int process_refs, io_refs;
-
-	io_refs = cfqq->allocated[READ] + cfqq->allocated[WRITE];
-	process_refs = cfqq->ref - io_refs;
-	BUG_ON(process_refs < 0);
-	return process_refs;
-}
-
-static void cfq_setup_merge(struct cfq_queue *cfqq, struct cfq_queue *new_cfqq)
-{
-	int process_refs, new_process_refs;
-	struct cfq_queue *__cfqq;
-
-	/*
-	 * If there are no process references on the new_cfqq, then it is
-	 * unsafe to follow the ->new_cfqq chain as other cfqq's in the
-	 * chain may have dropped their last reference (not just their
-	 * last process reference).
-	 */
-	if (!cfqq_process_refs(new_cfqq))
-		return;
-
-	/* Avoid a circular list and skip interim queue merges */
-	while ((__cfqq = new_cfqq->new_cfqq)) {
-		if (__cfqq == cfqq)
-			return;
-		new_cfqq = __cfqq;
-	}
-
-	process_refs = cfqq_process_refs(cfqq);
-	new_process_refs = cfqq_process_refs(new_cfqq);
-	/*
-	 * If the process for the cfqq has gone away, there is no
-	 * sense in merging the queues.
-	 */
-	if (process_refs == 0 || new_process_refs == 0)
-		return;
-
-	/*
-	 * Merge in the direction of the lesser amount of work.
-	 */
-	if (new_process_refs >= process_refs) {
-		cfqq->new_cfqq = new_cfqq;
-		new_cfqq->ref += process_refs;
-	} else {
-		new_cfqq->new_cfqq = cfqq;
-		cfqq->ref += new_process_refs;
-	}
-}
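In other words, the side holding fewer process references is the one that
gets redirected: if cfqq has 2 process refs and new_cfqq has 5, cfqq->new_cfqq
is pointed at new_cfqq and new_cfqq absorbs those 2 references, so the smaller
queue is the one folded in when the merge is carried out later.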
-
-static enum wl_type_t cfq_choose_wl_type(struct cfq_data *cfqd,
-			struct cfq_group *cfqg, enum wl_class_t wl_class)
-{
-	struct cfq_queue *queue;
-	int i;
-	bool key_valid = false;
-	u64 lowest_key = 0;
-	enum wl_type_t cur_best = SYNC_NOIDLE_WORKLOAD;
-
-	for (i = 0; i <= SYNC_WORKLOAD; ++i) {
-		/* select the one with lowest rb_key */
-		queue = cfq_rb_first(st_for(cfqg, wl_class, i));
-		if (queue &&
-		    (!key_valid || queue->rb_key < lowest_key)) {
-			lowest_key = queue->rb_key;
-			cur_best = i;
-			key_valid = true;
-		}
-	}
-
-	return cur_best;
-}
-
-static void
-choose_wl_class_and_type(struct cfq_data *cfqd, struct cfq_group *cfqg)
-{
-	u64 slice;
-	unsigned count;
-	struct cfq_rb_root *st;
-	u64 group_slice;
-	enum wl_class_t original_class = cfqd->serving_wl_class;
-	u64 now = ktime_get_ns();
-
-	/* Choose next priority. RT > BE > IDLE */
-	if (cfq_group_busy_queues_wl(RT_WORKLOAD, cfqd, cfqg))
-		cfqd->serving_wl_class = RT_WORKLOAD;
-	else if (cfq_group_busy_queues_wl(BE_WORKLOAD, cfqd, cfqg))
-		cfqd->serving_wl_class = BE_WORKLOAD;
-	else {
-		cfqd->serving_wl_class = IDLE_WORKLOAD;
-		cfqd->workload_expires = now + jiffies_to_nsecs(1);
-		return;
-	}
-
-	if (original_class != cfqd->serving_wl_class)
-		goto new_workload;
-
-	/*
-	 * For RT and BE, we have to choose also the type
-	 * (SYNC, SYNC_NOIDLE, ASYNC), and to compute a workload
-	 * expiration time
-	 */
-	st = st_for(cfqg, cfqd->serving_wl_class, cfqd->serving_wl_type);
-	count = st->count;
-
-	/*
-	 * check workload expiration, and that we still have other queues ready
-	 */
-	if (count && !(now > cfqd->workload_expires))
-		return;
-
-new_workload:
-	/* otherwise select new workload type */
-	cfqd->serving_wl_type = cfq_choose_wl_type(cfqd, cfqg,
-					cfqd->serving_wl_class);
-	st = st_for(cfqg, cfqd->serving_wl_class, cfqd->serving_wl_type);
-	count = st->count;
-
-	/*
-	 * the workload slice is computed as a fraction of target latency
-	 * proportional to the number of queues in that workload, over
-	 * all the queues in the same priority class
-	 */
-	group_slice = cfq_group_slice(cfqd, cfqg);
-
-	slice = div_u64(group_slice * count,
-		max_t(unsigned, cfqg->busy_queues_avg[cfqd->serving_wl_class],
-		      cfq_group_busy_queues_wl(cfqd->serving_wl_class, cfqd,
-					cfqg)));
-
-	if (cfqd->serving_wl_type == ASYNC_WORKLOAD) {
-		u64 tmp;
-
-		/*
-		 * Async queues are currently system wide. Just taking the
-		 * proportion of queues within the same group will lead to a
-		 * higher async ratio system wide, as the root group generally
-		 * has a higher weight. A more accurate approach would be to
-		 * calculate a system wide async/sync ratio.
-		 */
-		tmp = cfqd->cfq_target_latency *
-			cfqg_busy_async_queues(cfqd, cfqg);
-		tmp = div_u64(tmp, cfqd->busy_queues);
-		slice = min_t(u64, slice, tmp);
-
-		/* async workload slice is scaled down according to
-		 * the sync/async slice ratio. */
-		slice = div64_u64(slice*cfqd->cfq_slice[0], cfqd->cfq_slice[1]);
-	} else
-		/* sync workload slice is at least 2 * cfq_slice_idle */
-		slice = max(slice, 2 * cfqd->cfq_slice_idle);
-
-	slice = max_t(u64, slice, CFQ_MIN_TT);
-	cfq_log(cfqd, "workload slice:%llu", slice);
-	cfqd->workload_expires = now + slice;
-}
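
As a worked illustration of the slice formula above (numbers are hypothetical, assuming the usual 300ms cfq_target_latency): with a 300ms group slice and 2 queues of the chosen workload type out of 4 busy queues in the serving class, the slice comes out to 300 * 2 / 4 = 150ms. A sync slice is then raised to at least 2 * cfq_slice_idle and clamped to no less than CFQ_MIN_TT, while an async slice is instead capped by the system-wide async share and scaled down by the cfq_slice[0]/cfq_slice[1] (async/sync) ratio.
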
-
-static struct cfq_group *cfq_get_next_cfqg(struct cfq_data *cfqd)
-{
-	struct cfq_rb_root *st = &cfqd->grp_service_tree;
-	struct cfq_group *cfqg;
-
-	if (RB_EMPTY_ROOT(&st->rb.rb_root))
-		return NULL;
-	cfqg = cfq_rb_first_group(st);
-	update_min_vdisktime(st);
-	return cfqg;
-}
-
-static void cfq_choose_cfqg(struct cfq_data *cfqd)
-{
-	struct cfq_group *cfqg = cfq_get_next_cfqg(cfqd);
-	u64 now = ktime_get_ns();
-
-	cfqd->serving_group = cfqg;
-
-	/* Restore the workload type data */
-	if (cfqg->saved_wl_slice) {
-		cfqd->workload_expires = now + cfqg->saved_wl_slice;
-		cfqd->serving_wl_type = cfqg->saved_wl_type;
-		cfqd->serving_wl_class = cfqg->saved_wl_class;
-	} else
-		cfqd->workload_expires = now - 1;
-
-	choose_wl_class_and_type(cfqd, cfqg);
-}
-
-/*
- * Select a queue for service. If we have a current active queue,
- * check whether to continue servicing it, or retrieve and set a new one.
- */
-static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
-{
-	struct cfq_queue *cfqq, *new_cfqq = NULL;
-	u64 now = ktime_get_ns();
-
-	cfqq = cfqd->active_queue;
-	if (!cfqq)
-		goto new_queue;
-
-	if (!cfqd->rq_queued)
-		return NULL;
-
-	/*
-	 * We were waiting for group to get backlogged. Expire the queue
-	 */
-	if (cfq_cfqq_wait_busy(cfqq) && !RB_EMPTY_ROOT(&cfqq->sort_list))
-		goto expire;
-
-	/*
-	 * The active queue has run out of time, expire it and select new.
-	 */
-	if (cfq_slice_used(cfqq) && !cfq_cfqq_must_dispatch(cfqq)) {
-		/*
-		 * If slice had not expired at the completion of last request
-		 * we might not have turned on wait_busy flag. Don't expire
-		 * the queue yet. Allow the group to get backlogged.
-		 *
-		 * The very fact that we have used the slice, that means we
-		 * have been idling all along on this queue and it should be
-		 * ok to wait for this request to complete.
-		 */
-		if (cfqq->cfqg->nr_cfqq == 1 && RB_EMPTY_ROOT(&cfqq->sort_list)
-		    && cfqq->dispatched && cfq_should_idle(cfqd, cfqq)) {
-			cfqq = NULL;
-			goto keep_queue;
-		} else
-			goto check_group_idle;
-	}
-
-	/*
-	 * The active queue has requests and isn't expired, allow it to
-	 * dispatch.
-	 */
-	if (!RB_EMPTY_ROOT(&cfqq->sort_list))
-		goto keep_queue;
-
-	/*
-	 * If another queue has a request waiting within our mean seek
-	 * distance, let it run.  The expire code will check for close
-	 * cooperators and put the close queue at the front of the service
-	 * tree.  If possible, merge the expiring queue with the new cfqq.
-	 */
-	new_cfqq = cfq_close_cooperator(cfqd, cfqq);
-	if (new_cfqq) {
-		if (!cfqq->new_cfqq)
-			cfq_setup_merge(cfqq, new_cfqq);
-		goto expire;
-	}
-
-	/*
-	 * No requests pending. If the active queue still has requests in
-	 * flight or is idling for a new request, allow either of these
-	 * conditions to happen (or time out) before selecting a new queue.
-	 */
-	if (hrtimer_active(&cfqd->idle_slice_timer)) {
-		cfqq = NULL;
-		goto keep_queue;
-	}
-
-	/*
-	 * This is a deep seek queue, but the device is much faster than
-	 * the queue can deliver; don't idle.
-	 */
-	if (CFQQ_SEEKY(cfqq) && cfq_cfqq_idle_window(cfqq) &&
-	    (cfq_cfqq_slice_new(cfqq) ||
-	    (cfqq->slice_end - now > now - cfqq->slice_start))) {
-		cfq_clear_cfqq_deep(cfqq);
-		cfq_clear_cfqq_idle_window(cfqq);
-	}
-
-	if (cfqq->dispatched && cfq_should_idle(cfqd, cfqq)) {
-		cfqq = NULL;
-		goto keep_queue;
-	}
-
-	/*
-	 * If group idle is enabled and there are requests dispatched from
-	 * this group, wait for requests to complete.
-	 */
-check_group_idle:
-	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
-	    cfqq->cfqg->dispatched &&
-	    !cfq_io_thinktime_big(cfqd, &cfqq->cfqg->ttime, true)) {
-		cfqq = NULL;
-		goto keep_queue;
-	}
-
-expire:
-	cfq_slice_expired(cfqd, 0);
-new_queue:
-	/*
-	 * Current queue expired. Check if we have to switch to a new
-	 * service tree
-	 */
-	if (!new_cfqq)
-		cfq_choose_cfqg(cfqd);
-
-	cfqq = cfq_set_active_queue(cfqd, new_cfqq);
-keep_queue:
-	return cfqq;
-}
-
-static int __cfq_forced_dispatch_cfqq(struct cfq_queue *cfqq)
-{
-	int dispatched = 0;
-
-	while (cfqq->next_rq) {
-		cfq_dispatch_insert(cfqq->cfqd->queue, cfqq->next_rq);
-		dispatched++;
-	}
-
-	BUG_ON(!list_empty(&cfqq->fifo));
-
-	/* By default cfqq is not expired if it is empty. Do it explicitly */
-	__cfq_slice_expired(cfqq->cfqd, cfqq, 0);
-	return dispatched;
-}
-
-/*
- * Drain our current requests. Used for barriers and when switching
- * io schedulers on-the-fly.
- */
-static int cfq_forced_dispatch(struct cfq_data *cfqd)
-{
-	struct cfq_queue *cfqq;
-	int dispatched = 0;
-
-	/* Expire the timeslice of the current active queue first */
-	cfq_slice_expired(cfqd, 0);
-	while ((cfqq = cfq_get_next_queue_forced(cfqd)) != NULL) {
-		__cfq_set_active_queue(cfqd, cfqq);
-		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
-	}
-
-	BUG_ON(cfqd->busy_queues);
-
-	cfq_log(cfqd, "forced_dispatch=%d", dispatched);
-	return dispatched;
-}
-
-static inline bool cfq_slice_used_soon(struct cfq_data *cfqd,
-	struct cfq_queue *cfqq)
-{
-	u64 now = ktime_get_ns();
-
-	/* the queue hasn't finished any request, can't estimate */
-	if (cfq_cfqq_slice_new(cfqq))
-		return true;
-	if (now + cfqd->cfq_slice_idle * cfqq->dispatched > cfqq->slice_end)
-		return true;
-
-	return false;
-}
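
For example (hypothetical numbers, assuming the usual 8ms cfq_slice_idle): with 3 requests already dispatched, the queue is considered "used soon" once less than 3 * 8 = 24ms of its slice remains.
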
-
-static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	unsigned int max_dispatch;
-
-	if (cfq_cfqq_must_dispatch(cfqq))
-		return true;
-
-	/*
-	 * Drain async requests before we start sync IO
-	 */
-	if (cfq_should_idle(cfqd, cfqq) && cfqd->rq_in_flight[BLK_RW_ASYNC])
-		return false;
-
-	/*
-	 * If this is an async queue and we have sync IO in flight, let it wait
-	 */
-	if (cfqd->rq_in_flight[BLK_RW_SYNC] && !cfq_cfqq_sync(cfqq))
-		return false;
-
-	max_dispatch = max_t(unsigned int, cfqd->cfq_quantum / 2, 1);
-	if (cfq_class_idle(cfqq))
-		max_dispatch = 1;
-
-	/*
-	 * Does this cfqq already have too much IO in flight?
-	 */
-	if (cfqq->dispatched >= max_dispatch) {
-		bool promote_sync = false;
-		/*
-		 * idle queue must always only have a single IO in flight
-		 */
-		if (cfq_class_idle(cfqq))
-			return false;
-
-		/*
-		 * If there is only one sync queue
-		 * we can ignore async queue here and give the sync
-		 * queue no dispatch limit. The reason is a sync queue can
-		 * preempt async queue, limiting the sync queue doesn't make
-		 * sense. This is useful for aiostress test.
-		 */
-		if (cfq_cfqq_sync(cfqq) && cfqd->busy_sync_queues == 1)
-			promote_sync = true;
-
-		/*
-		 * We have other queues, don't allow more IO from this one
-		 */
-		if (cfqd->busy_queues > 1 && cfq_slice_used_soon(cfqd, cfqq) &&
-				!promote_sync)
-			return false;
-
-		/*
-		 * Sole queue user, no limit
-		 */
-		if (cfqd->busy_queues == 1 || promote_sync)
-			max_dispatch = -1;
-		else
-			/*
-			 * Normally we start throttling cfqq when cfq_quantum/2
-			 * requests have been dispatched. But we can drive
-			 * deeper queue depths at the beginning of the slice,
-			 * subject to the upper limit of cfq_quantum.
-			 */
-			max_dispatch = cfqd->cfq_quantum;
-	}
-
-	/*
-	 * Async queues must wait a bit before being allowed dispatch.
-	 * We also ramp up the dispatch depth gradually for async IO,
-	 * based on the last sync IO we serviced
-	 */
-	if (!cfq_cfqq_sync(cfqq) && cfqd->cfq_latency) {
-		u64 last_sync = ktime_get_ns() - cfqd->last_delayed_sync;
-		unsigned int depth;
-
-		depth = div64_u64(last_sync, cfqd->cfq_slice[1]);
-		if (!depth && !cfqq->dispatched)
-			depth = 1;
-		if (depth < max_dispatch)
-			max_dispatch = depth;
-	}
-
-	/*
-	 * If we're below the current max, allow a dispatch
-	 */
-	return cfqq->dispatched < max_dispatch;
-}
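
To illustrate the async ramp-up above (hypothetical numbers, assuming the usual 100ms cfq_slice[1]): if the last delayed sync completion was seen 250ms ago, the async dispatch depth is capped at 250 / 100 = 2; right after a delayed sync completion the cap drops to 1 (or to 0 if the queue already has requests in flight).
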
-
-/*
- * Dispatch a request from cfqq, moving them to the request queue
- * dispatch list.
- */
-static bool cfq_dispatch_request(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	struct request *rq;
-
-	BUG_ON(RB_EMPTY_ROOT(&cfqq->sort_list));
-
-	rq = cfq_check_fifo(cfqq);
-	if (rq)
-		cfq_mark_cfqq_must_dispatch(cfqq);
-
-	if (!cfq_may_dispatch(cfqd, cfqq))
-		return false;
-
-	/*
-	 * follow expired path, else get first next available
-	 */
-	if (!rq)
-		rq = cfqq->next_rq;
-	else
-		cfq_log_cfqq(cfqq->cfqd, cfqq, "fifo=%p", rq);
-
-	/*
-	 * insert request into driver dispatch list
-	 */
-	cfq_dispatch_insert(cfqd->queue, rq);
-
-	if (!cfqd->active_cic) {
-		struct cfq_io_cq *cic = RQ_CIC(rq);
-
-		atomic_long_inc(&cic->icq.ioc->refcount);
-		cfqd->active_cic = cic;
-	}
-
-	return true;
-}
-
-/*
- * Find the cfqq that we need to service and move a request from that to the
- * dispatch list
- */
-static int cfq_dispatch_requests(struct request_queue *q, int force)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-	struct cfq_queue *cfqq;
-
-	if (!cfqd->busy_queues)
-		return 0;
-
-	if (unlikely(force))
-		return cfq_forced_dispatch(cfqd);
-
-	cfqq = cfq_select_queue(cfqd);
-	if (!cfqq)
-		return 0;
-
-	/*
-	 * Dispatch a request from this cfqq, if it is allowed
-	 */
-	if (!cfq_dispatch_request(cfqd, cfqq))
-		return 0;
-
-	cfqq->slice_dispatch++;
-	cfq_clear_cfqq_must_dispatch(cfqq);
-
-	/*
-	 * Expire an async queue immediately if it has used up its slice. An
-	 * idle queue always expires after 1 dispatch round.
-	 */
-	if (cfqd->busy_queues > 1 && ((!cfq_cfqq_sync(cfqq) &&
-	    cfqq->slice_dispatch >= cfq_prio_to_maxrq(cfqd, cfqq)) ||
-	    cfq_class_idle(cfqq))) {
-		cfqq->slice_end = ktime_get_ns() + 1;
-		cfq_slice_expired(cfqd, 0);
-	}
-
-	cfq_log_cfqq(cfqd, cfqq, "dispatched a request");
-	return 1;
-}
-
-/*
- * task holds one reference to the queue, dropped when task exits. each rq
- * in-flight on this queue also holds a reference, dropped when rq is freed.
- *
- * Each cfq queue took a reference on the parent group. Drop it now.
- * queue lock must be held here.
- */
-static void cfq_put_queue(struct cfq_queue *cfqq)
-{
-	struct cfq_data *cfqd = cfqq->cfqd;
-	struct cfq_group *cfqg;
-
-	BUG_ON(cfqq->ref <= 0);
-
-	cfqq->ref--;
-	if (cfqq->ref)
-		return;
-
-	cfq_log_cfqq(cfqd, cfqq, "put_queue");
-	BUG_ON(rb_first(&cfqq->sort_list));
-	BUG_ON(cfqq->allocated[READ] + cfqq->allocated[WRITE]);
-	cfqg = cfqq->cfqg;
-
-	if (unlikely(cfqd->active_queue == cfqq)) {
-		__cfq_slice_expired(cfqd, cfqq, 0);
-		cfq_schedule_dispatch(cfqd);
-	}
-
-	BUG_ON(cfq_cfqq_on_rr(cfqq));
-	kmem_cache_free(cfq_pool, cfqq);
-	cfqg_put(cfqg);
-}
-
-static void cfq_put_cooperator(struct cfq_queue *cfqq)
-{
-	struct cfq_queue *__cfqq, *next;
-
-	/*
-	 * If this queue was scheduled to merge with another queue, be
-	 * sure to drop the reference taken on that queue (and others in
-	 * the merge chain).  See cfq_setup_merge and cfq_merge_cfqqs.
-	 */
-	__cfqq = cfqq->new_cfqq;
-	while (__cfqq) {
-		if (__cfqq == cfqq) {
-			WARN(1, "cfqq->new_cfqq loop detected\n");
-			break;
-		}
-		next = __cfqq->new_cfqq;
-		cfq_put_queue(__cfqq);
-		__cfqq = next;
-	}
-}
-
-static void cfq_exit_cfqq(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	if (unlikely(cfqq == cfqd->active_queue)) {
-		__cfq_slice_expired(cfqd, cfqq, 0);
-		cfq_schedule_dispatch(cfqd);
-	}
-
-	cfq_put_cooperator(cfqq);
-
-	cfq_put_queue(cfqq);
-}
-
-static void cfq_init_icq(struct io_cq *icq)
-{
-	struct cfq_io_cq *cic = icq_to_cic(icq);
-
-	cic->ttime.last_end_request = ktime_get_ns();
-}
-
-static void cfq_exit_icq(struct io_cq *icq)
-{
-	struct cfq_io_cq *cic = icq_to_cic(icq);
-	struct cfq_data *cfqd = cic_to_cfqd(cic);
-
-	if (cic_to_cfqq(cic, false)) {
-		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, false));
-		cic_set_cfqq(cic, NULL, false);
-	}
-
-	if (cic_to_cfqq(cic, true)) {
-		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, true));
-		cic_set_cfqq(cic, NULL, true);
-	}
-}
-
-static void cfq_init_prio_data(struct cfq_queue *cfqq, struct cfq_io_cq *cic)
-{
-	struct task_struct *tsk = current;
-	int ioprio_class;
-
-	if (!cfq_cfqq_prio_changed(cfqq))
-		return;
-
-	ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
-	switch (ioprio_class) {
-	default:
-		printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
-		/* fall through */
-	case IOPRIO_CLASS_NONE:
-		/*
-		 * no prio set, inherit CPU scheduling settings
-		 */
-		cfqq->ioprio = task_nice_ioprio(tsk);
-		cfqq->ioprio_class = task_nice_ioclass(tsk);
-		break;
-	case IOPRIO_CLASS_RT:
-		cfqq->ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
-		cfqq->ioprio_class = IOPRIO_CLASS_RT;
-		break;
-	case IOPRIO_CLASS_BE:
-		cfqq->ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
-		cfqq->ioprio_class = IOPRIO_CLASS_BE;
-		break;
-	case IOPRIO_CLASS_IDLE:
-		cfqq->ioprio_class = IOPRIO_CLASS_IDLE;
-		cfqq->ioprio = 7;
-		cfq_clear_cfqq_idle_window(cfqq);
-		break;
-	}
-
-	/*
-	 * keep track of original prio settings in case we have to temporarily
-	 * elevate the priority of this queue
-	 */
-	cfqq->org_ioprio = cfqq->ioprio;
-	cfqq->org_ioprio_class = cfqq->ioprio_class;
-	cfq_clear_cfqq_prio_changed(cfqq);
-}
-
-static void check_ioprio_changed(struct cfq_io_cq *cic, struct bio *bio)
-{
-	int ioprio = cic->icq.ioc->ioprio;
-	struct cfq_data *cfqd = cic_to_cfqd(cic);
-	struct cfq_queue *cfqq;
-
-	/*
-	 * Check whether ioprio has changed.  The condition may trigger
-	 * spuriously on a newly created cic but there's no harm.
-	 */
-	if (unlikely(!cfqd) || likely(cic->ioprio == ioprio))
-		return;
-
-	cfqq = cic_to_cfqq(cic, false);
-	if (cfqq) {
-		cfq_put_queue(cfqq);
-		cfqq = cfq_get_queue(cfqd, BLK_RW_ASYNC, cic, bio);
-		cic_set_cfqq(cic, cfqq, false);
-	}
-
-	cfqq = cic_to_cfqq(cic, true);
-	if (cfqq)
-		cfq_mark_cfqq_prio_changed(cfqq);
-
-	cic->ioprio = ioprio;
-}
-
-static void cfq_init_cfqq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-			  pid_t pid, bool is_sync)
-{
-	RB_CLEAR_NODE(&cfqq->rb_node);
-	RB_CLEAR_NODE(&cfqq->p_node);
-	INIT_LIST_HEAD(&cfqq->fifo);
-
-	cfqq->ref = 0;
-	cfqq->cfqd = cfqd;
-
-	cfq_mark_cfqq_prio_changed(cfqq);
-
-	if (is_sync) {
-		if (!cfq_class_idle(cfqq))
-			cfq_mark_cfqq_idle_window(cfqq);
-		cfq_mark_cfqq_sync(cfqq);
-	}
-	cfqq->pid = pid;
-}
-
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-static void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio)
-{
-	struct cfq_data *cfqd = cic_to_cfqd(cic);
-	struct cfq_queue *cfqq;
-	uint64_t serial_nr;
-
-	rcu_read_lock();
-	serial_nr = __bio_blkcg(bio)->css.serial_nr;
-	rcu_read_unlock();
-
-	/*
-	 * Check whether blkcg has changed.  The condition may trigger
-	 * spuriously on a newly created cic but there's no harm.
-	 */
-	if (unlikely(!cfqd) || likely(cic->blkcg_serial_nr == serial_nr))
-		return;
-
-	/*
-	 * Drop reference to queues.  New queues will be assigned in new
-	 * group upon arrival of fresh requests.
-	 */
-	cfqq = cic_to_cfqq(cic, false);
-	if (cfqq) {
-		cfq_log_cfqq(cfqd, cfqq, "changed cgroup");
-		cic_set_cfqq(cic, NULL, false);
-		cfq_put_queue(cfqq);
-	}
-
-	cfqq = cic_to_cfqq(cic, true);
-	if (cfqq) {
-		cfq_log_cfqq(cfqd, cfqq, "changed cgroup");
-		cic_set_cfqq(cic, NULL, true);
-		cfq_put_queue(cfqq);
-	}
-
-	cic->blkcg_serial_nr = serial_nr;
-}
-#else
-static inline void check_blkcg_changed(struct cfq_io_cq *cic, struct bio *bio)
-{
-}
-#endif  /* CONFIG_CFQ_GROUP_IOSCHED */
-
-static struct cfq_queue **
-cfq_async_queue_prio(struct cfq_group *cfqg, int ioprio_class, int ioprio)
-{
-	switch (ioprio_class) {
-	case IOPRIO_CLASS_RT:
-		return &cfqg->async_cfqq[0][ioprio];
-	case IOPRIO_CLASS_NONE:
-		ioprio = IOPRIO_NORM;
-		/* fall through */
-	case IOPRIO_CLASS_BE:
-		return &cfqg->async_cfqq[1][ioprio];
-	case IOPRIO_CLASS_IDLE:
-		return &cfqg->async_idle_cfqq;
-	default:
-		BUG();
-	}
-}
-
-static struct cfq_queue *
-cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
-	      struct bio *bio)
-{
-	int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
-	int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
-	struct cfq_queue **async_cfqq = NULL;
-	struct cfq_queue *cfqq;
-	struct cfq_group *cfqg;
-
-	rcu_read_lock();
-	cfqg = cfq_lookup_cfqg(cfqd, __bio_blkcg(bio));
-	if (!cfqg) {
-		cfqq = &cfqd->oom_cfqq;
-		goto out;
-	}
-
-	if (!is_sync) {
-		if (!ioprio_valid(cic->ioprio)) {
-			struct task_struct *tsk = current;
-			ioprio = task_nice_ioprio(tsk);
-			ioprio_class = task_nice_ioclass(tsk);
-		}
-		async_cfqq = cfq_async_queue_prio(cfqg, ioprio_class, ioprio);
-		cfqq = *async_cfqq;
-		if (cfqq)
-			goto out;
-	}
-
-	cfqq = kmem_cache_alloc_node(cfq_pool,
-				     GFP_NOWAIT | __GFP_ZERO | __GFP_NOWARN,
-				     cfqd->queue->node);
-	if (!cfqq) {
-		cfqq = &cfqd->oom_cfqq;
-		goto out;
-	}
-
-	/* cfq_init_cfqq() assumes cfqq->ioprio_class is initialized. */
-	cfqq->ioprio_class = IOPRIO_CLASS_NONE;
-	cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
-	cfq_init_prio_data(cfqq, cic);
-	cfq_link_cfqq_cfqg(cfqq, cfqg);
-	cfq_log_cfqq(cfqd, cfqq, "alloced");
-
-	if (async_cfqq) {
-		/* a new async queue is created, pin and remember */
-		cfqq->ref++;
-		*async_cfqq = cfqq;
-	}
-out:
-	cfqq->ref++;
-	rcu_read_unlock();
-	return cfqq;
-}
-
-static void
-__cfq_update_io_thinktime(struct cfq_ttime *ttime, u64 slice_idle)
-{
-	u64 elapsed = ktime_get_ns() - ttime->last_end_request;
-	elapsed = min(elapsed, 2UL * slice_idle);
-
-	ttime->ttime_samples = (7*ttime->ttime_samples + 256) / 8;
-	ttime->ttime_total = div_u64(7*ttime->ttime_total + 256*elapsed,  8);
-	ttime->ttime_mean = div64_ul(ttime->ttime_total + 128,
-				     ttime->ttime_samples);
-}
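
This is an exponentially weighted average: both the sample count and the total decay by 7/8 on each update, so ttime_samples converges toward 256 and ttime_mean tracks the recent gap between request completions. For instance, a process that consistently issues its next request 2ms after the previous completion will see ttime_mean settle at roughly 2ms; since elapsed is clamped to 2 * slice_idle, a single long pause cannot skew the mean arbitrarily.
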
-
-static void
-cfq_update_io_thinktime(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-			struct cfq_io_cq *cic)
-{
-	if (cfq_cfqq_sync(cfqq)) {
-		__cfq_update_io_thinktime(&cic->ttime, cfqd->cfq_slice_idle);
-		__cfq_update_io_thinktime(&cfqq->service_tree->ttime,
-			cfqd->cfq_slice_idle);
-	}
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	__cfq_update_io_thinktime(&cfqq->cfqg->ttime, cfqd->cfq_group_idle);
-#endif
-}
-
-static void
-cfq_update_io_seektime(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-		       struct request *rq)
-{
-	sector_t sdist = 0;
-	sector_t n_sec = blk_rq_sectors(rq);
-	if (cfqq->last_request_pos) {
-		if (cfqq->last_request_pos < blk_rq_pos(rq))
-			sdist = blk_rq_pos(rq) - cfqq->last_request_pos;
-		else
-			sdist = cfqq->last_request_pos - blk_rq_pos(rq);
-	}
-
-	cfqq->seek_history <<= 1;
-	if (blk_queue_nonrot(cfqd->queue))
-		cfqq->seek_history |= (n_sec < CFQQ_SECT_THR_NONROT);
-	else
-		cfqq->seek_history |= (sdist > CFQQ_SEEK_THR);
-}
-
-static inline bool req_noidle(struct request *req)
-{
-	return req_op(req) == REQ_OP_WRITE &&
-		(req->cmd_flags & (REQ_SYNC | REQ_IDLE)) == REQ_SYNC;
-}
-
-/*
- * Disable idle window if the process thinks too long or seeks so much that
- * it doesn't matter
- */
-static void
-cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-		       struct cfq_io_cq *cic)
-{
-	int old_idle, enable_idle;
-
-	/*
-	 * Don't idle for async or idle io prio class
-	 */
-	if (!cfq_cfqq_sync(cfqq) || cfq_class_idle(cfqq))
-		return;
-
-	enable_idle = old_idle = cfq_cfqq_idle_window(cfqq);
-
-	if (cfqq->queued[0] + cfqq->queued[1] >= 4)
-		cfq_mark_cfqq_deep(cfqq);
-
-	if (cfqq->next_rq && req_noidle(cfqq->next_rq))
-		enable_idle = 0;
-	else if (!atomic_read(&cic->icq.ioc->active_ref) ||
-		 !cfqd->cfq_slice_idle ||
-		 (!cfq_cfqq_deep(cfqq) && CFQQ_SEEKY(cfqq)))
-		enable_idle = 0;
-	else if (sample_valid(cic->ttime.ttime_samples)) {
-		if (cic->ttime.ttime_mean > cfqd->cfq_slice_idle)
-			enable_idle = 0;
-		else
-			enable_idle = 1;
-	}
-
-	if (old_idle != enable_idle) {
-		cfq_log_cfqq(cfqd, cfqq, "idle=%d", enable_idle);
-		if (enable_idle)
-			cfq_mark_cfqq_idle_window(cfqq);
-		else
-			cfq_clear_cfqq_idle_window(cfqq);
-	}
-}
-
-/*
- * Check if new_cfqq should preempt the currently active queue. Return false
- * for no (or if we aren't sure); returning true will cause a preempt.
- */
-static bool
-cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
-		   struct request *rq)
-{
-	struct cfq_queue *cfqq;
-
-	cfqq = cfqd->active_queue;
-	if (!cfqq)
-		return false;
-
-	if (cfq_class_idle(new_cfqq))
-		return false;
-
-	if (cfq_class_idle(cfqq))
-		return true;
-
-	/*
-	 * Don't allow a non-RT request to preempt an ongoing RT cfqq timeslice.
-	 */
-	if (cfq_class_rt(cfqq) && !cfq_class_rt(new_cfqq))
-		return false;
-
-	/*
-	 * if the new request is sync, but the currently running queue is
-	 * not, let the sync request have priority.
-	 */
-	if (rq_is_sync(rq) && !cfq_cfqq_sync(cfqq) && !cfq_cfqq_must_dispatch(cfqq))
-		return true;
-
-	/*
-	 * Treat ancestors of current cgroup the same way as current cgroup.
-	 * For anybody else we disallow preemption to guarantee service
-	 * fairness among cgroups.
-	 */
-	if (!cfqg_is_descendant(cfqq->cfqg, new_cfqq->cfqg))
-		return false;
-
-	if (cfq_slice_used(cfqq))
-		return true;
-
-	/*
-	 * Allow an RT request to pre-empt an ongoing non-RT cfqq timeslice.
-	 */
-	if (cfq_class_rt(new_cfqq) && !cfq_class_rt(cfqq))
-		return true;
-
-	WARN_ON_ONCE(cfqq->ioprio_class != new_cfqq->ioprio_class);
-	/* Allow preemption only if we are idling on sync-noidle tree */
-	if (cfqd->serving_wl_type == SYNC_NOIDLE_WORKLOAD &&
-	    cfqq_type(new_cfqq) == SYNC_NOIDLE_WORKLOAD &&
-	    RB_EMPTY_ROOT(&cfqq->sort_list))
-		return true;
-
-	/*
-	 * So both queues are sync. Let the new request get disk time if
-	 * it's a metadata request and the current queue is doing regular IO.
-	 */
-	if ((rq->cmd_flags & REQ_PRIO) && !cfqq->prio_pending)
-		return true;
-
-	/* An idle queue should not be idle now for some reason */
-	if (RB_EMPTY_ROOT(&cfqq->sort_list) && !cfq_should_idle(cfqd, cfqq))
-		return true;
-
-	if (!cfqd->active_cic || !cfq_cfqq_wait_request(cfqq))
-		return false;
-
-	/*
-	 * if this request is as-good as one we would expect from the
-	 * current cfqq, let it preempt
-	 */
-	if (cfq_rq_close(cfqd, cfqq, rq))
-		return true;
-
-	return false;
-}
-
-/*
- * cfqq preempts the active queue. if we allowed preempt with no slice left,
- * let it have half of its nominal slice.
- */
-static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	enum wl_type_t old_type = cfqq_type(cfqd->active_queue);
-
-	cfq_log_cfqq(cfqd, cfqq, "preempt");
-	cfq_slice_expired(cfqd, 1);
-
-	/*
-	 * workload type is changed, don't save slice, otherwise preempt
-	 * doesn't happen
-	 */
-	if (old_type != cfqq_type(cfqq))
-		cfqq->cfqg->saved_wl_slice = 0;
-
-	/*
-	 * Put the new queue at the front of the current list,
-	 * so we know that it will be selected next.
-	 */
-	BUG_ON(!cfq_cfqq_on_rr(cfqq));
-
-	cfq_service_tree_add(cfqd, cfqq, 1);
-
-	cfqq->slice_end = 0;
-	cfq_mark_cfqq_slice_new(cfqq);
-}
-
-/*
- * Called when a new fs request (rq) is added (to cfqq). Check if there's
- * something we should do about it
- */
-static void
-cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-		struct request *rq)
-{
-	struct cfq_io_cq *cic = RQ_CIC(rq);
-
-	cfqd->rq_queued++;
-	if (rq->cmd_flags & REQ_PRIO)
-		cfqq->prio_pending++;
-
-	cfq_update_io_thinktime(cfqd, cfqq, cic);
-	cfq_update_io_seektime(cfqd, cfqq, rq);
-	cfq_update_idle_window(cfqd, cfqq, cic);
-
-	cfqq->last_request_pos = blk_rq_pos(rq) + blk_rq_sectors(rq);
-
-	if (cfqq == cfqd->active_queue) {
-		/*
-		 * Remember that we saw a request from this process, but
-		 * don't start queuing just yet. Otherwise we risk seeing lots
-		 * of tiny requests, because we disrupt the normal plugging
-		 * and merging. If the request is already larger than a single
-		 * page, let it rip immediately. For that case we assume that
-		 * merging is already done. Ditto for a busy system that
-		 * has other work pending, don't risk delaying until the
-		 * idle timer unplug to continue working.
-		 */
-		if (cfq_cfqq_wait_request(cfqq)) {
-			if (blk_rq_bytes(rq) > PAGE_SIZE ||
-			    cfqd->busy_queues > 1) {
-				cfq_del_timer(cfqd, cfqq);
-				cfq_clear_cfqq_wait_request(cfqq);
-				__blk_run_queue(cfqd->queue);
-			} else {
-				cfqg_stats_update_idle_time(cfqq->cfqg);
-				cfq_mark_cfqq_must_dispatch(cfqq);
-			}
-		}
-	} else if (cfq_should_preempt(cfqd, cfqq, rq)) {
-		/*
-		 * not the active queue - expire current slice if it is
-		 * idle and has expired its mean thinktime, or this new queue
-		 * has some old slice time left and is of higher priority or
-		 * this new queue is RT and the current one is BE
-		 */
-		cfq_preempt_queue(cfqd, cfqq);
-		__blk_run_queue(cfqd->queue);
-	}
-}
-
-static void cfq_insert_request(struct request_queue *q, struct request *rq)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-
-	cfq_log_cfqq(cfqd, cfqq, "insert_request");
-	cfq_init_prio_data(cfqq, RQ_CIC(rq));
-
-	rq->fifo_time = ktime_get_ns() + cfqd->cfq_fifo_expire[rq_is_sync(rq)];
-	list_add_tail(&rq->queuelist, &cfqq->fifo);
-	cfq_add_rq_rb(rq);
-	cfqg_stats_update_io_add(RQ_CFQG(rq), cfqd->serving_group,
-				 rq->cmd_flags);
-	cfq_rq_enqueued(cfqd, cfqq, rq);
-}
-
-/*
- * Update hw_tag based on peak queue depth over 50 samples under
- * sufficient load.
- */
-static void cfq_update_hw_tag(struct cfq_data *cfqd)
-{
-	struct cfq_queue *cfqq = cfqd->active_queue;
-
-	if (cfqd->rq_in_driver > cfqd->hw_tag_est_depth)
-		cfqd->hw_tag_est_depth = cfqd->rq_in_driver;
-
-	if (cfqd->hw_tag == 1)
-		return;
-
-	if (cfqd->rq_queued <= CFQ_HW_QUEUE_MIN &&
-	    cfqd->rq_in_driver <= CFQ_HW_QUEUE_MIN)
-		return;
-
-	/*
-	 * If the active queue doesn't have enough requests and can idle, cfq
-	 * might not dispatch sufficient requests to the hardware. Don't zero
-	 * hw_tag in this case.
-	 */
-	if (cfqq && cfq_cfqq_idle_window(cfqq) &&
-	    cfqq->dispatched + cfqq->queued[0] + cfqq->queued[1] <
-	    CFQ_HW_QUEUE_MIN && cfqd->rq_in_driver < CFQ_HW_QUEUE_MIN)
-		return;
-
-	if (cfqd->hw_tag_samples++ < 50)
-		return;
-
-	if (cfqd->hw_tag_est_depth >= CFQ_HW_QUEUE_MIN)
-		cfqd->hw_tag = 1;
-	else
-		cfqd->hw_tag = 0;
-}
-
-static bool cfq_should_wait_busy(struct cfq_data *cfqd, struct cfq_queue *cfqq)
-{
-	struct cfq_io_cq *cic = cfqd->active_cic;
-	u64 now = ktime_get_ns();
-
-	/* If the queue already has requests, don't wait */
-	if (!RB_EMPTY_ROOT(&cfqq->sort_list))
-		return false;
-
-	/* If there are other queues in the group, don't wait */
-	if (cfqq->cfqg->nr_cfqq > 1)
-		return false;
-
-	/* the only queue in the group, but think time is big */
-	if (cfq_io_thinktime_big(cfqd, &cfqq->cfqg->ttime, true))
-		return false;
-
-	if (cfq_slice_used(cfqq))
-		return true;
-
-	/* if slice left is less than think time, wait busy */
-	if (cic && sample_valid(cic->ttime.ttime_samples)
-	    && (cfqq->slice_end - now < cic->ttime.ttime_mean))
-		return true;
-
-	/*
-	 * If the think time is less than a jiffy, then ttime_mean=0 and the
-	 * check above will not be true. It might happen that the slice has not
-	 * expired yet but will expire soon (4-5 ns) during select_queue(). To
-	 * cover the case where the think time is less than a jiffy, mark the
-	 * queue wait busy if only 1 jiffy is left in the slice.
-	 */
-	if (cfqq->slice_end - now <= jiffies_to_nsecs(1))
-		return true;
-
-	return false;
-}
-
-static void cfq_completed_request(struct request_queue *q, struct request *rq)
-{
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-	struct cfq_data *cfqd = cfqq->cfqd;
-	const int sync = rq_is_sync(rq);
-	u64 now = ktime_get_ns();
-
-	cfq_log_cfqq(cfqd, cfqq, "complete rqnoidle %d", req_noidle(rq));
-
-	cfq_update_hw_tag(cfqd);
-
-	WARN_ON(!cfqd->rq_in_driver);
-	WARN_ON(!cfqq->dispatched);
-	cfqd->rq_in_driver--;
-	cfqq->dispatched--;
-	(RQ_CFQG(rq))->dispatched--;
-	cfqg_stats_update_completion(cfqq->cfqg, rq->start_time_ns,
-				     rq->io_start_time_ns, rq->cmd_flags);
-
-	cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]--;
-
-	if (sync) {
-		struct cfq_rb_root *st;
-
-		RQ_CIC(rq)->ttime.last_end_request = now;
-
-		if (cfq_cfqq_on_rr(cfqq))
-			st = cfqq->service_tree;
-		else
-			st = st_for(cfqq->cfqg, cfqq_class(cfqq),
-					cfqq_type(cfqq));
-
-		st->ttime.last_end_request = now;
-		if (rq->start_time_ns + cfqd->cfq_fifo_expire[1] <= now)
-			cfqd->last_delayed_sync = now;
-	}
-
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	cfqq->cfqg->ttime.last_end_request = now;
-#endif
-
-	/*
-	 * If this is the active queue, check if it needs to be expired,
-	 * or if we want to idle in case it has no pending requests.
-	 */
-	if (cfqd->active_queue == cfqq) {
-		const bool cfqq_empty = RB_EMPTY_ROOT(&cfqq->sort_list);
-
-		if (cfq_cfqq_slice_new(cfqq)) {
-			cfq_set_prio_slice(cfqd, cfqq);
-			cfq_clear_cfqq_slice_new(cfqq);
-		}
-
-		/*
-		 * Should we wait for the next request to come in before we
-		 * expire the queue?
-		 */
-		if (cfq_should_wait_busy(cfqd, cfqq)) {
-			u64 extend_sl = cfqd->cfq_slice_idle;
-			if (!cfqd->cfq_slice_idle)
-				extend_sl = cfqd->cfq_group_idle;
-			cfqq->slice_end = now + extend_sl;
-			cfq_mark_cfqq_wait_busy(cfqq);
-			cfq_log_cfqq(cfqd, cfqq, "will busy wait");
-		}
-
-		/*
-		 * Idling is not enabled on:
-		 * - expired queues
-		 * - idle-priority queues
-		 * - async queues
-		 * - queues with still some requests queued
-		 * - when there is a close cooperator
-		 */
-		if (cfq_slice_used(cfqq) || cfq_class_idle(cfqq))
-			cfq_slice_expired(cfqd, 1);
-		else if (sync && cfqq_empty &&
-			 !cfq_close_cooperator(cfqd, cfqq)) {
-			cfq_arm_slice_timer(cfqd);
-		}
-	}
-
-	if (!cfqd->rq_in_driver)
-		cfq_schedule_dispatch(cfqd);
-}
-
-static void cfqq_boost_on_prio(struct cfq_queue *cfqq, unsigned int op)
-{
-	/*
-	 * If REQ_PRIO is set, boost class and prio level, if it's below
-	 * BE/NORM. If prio is not set, restore the potentially boosted
-	 * class/prio level.
-	 */
-	if (!(op & REQ_PRIO)) {
-		cfqq->ioprio_class = cfqq->org_ioprio_class;
-		cfqq->ioprio = cfqq->org_ioprio;
-	} else {
-		if (cfq_class_idle(cfqq))
-			cfqq->ioprio_class = IOPRIO_CLASS_BE;
-		if (cfqq->ioprio > IOPRIO_NORM)
-			cfqq->ioprio = IOPRIO_NORM;
-	}
-}
-
-static inline int __cfq_may_queue(struct cfq_queue *cfqq)
-{
-	if (cfq_cfqq_wait_request(cfqq) && !cfq_cfqq_must_alloc_slice(cfqq)) {
-		cfq_mark_cfqq_must_alloc_slice(cfqq);
-		return ELV_MQUEUE_MUST;
-	}
-
-	return ELV_MQUEUE_MAY;
-}
-
-static int cfq_may_queue(struct request_queue *q, unsigned int op)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-	struct task_struct *tsk = current;
-	struct cfq_io_cq *cic;
-	struct cfq_queue *cfqq;
-
-	/*
-	 * don't force setup of a queue from here, as a call to may_queue
-	 * does not necessarily imply that a request actually will be queued.
-	 * so just lookup a possibly existing queue, or return 'may queue'
-	 * if that fails
-	 */
-	cic = cfq_cic_lookup(cfqd, tsk->io_context);
-	if (!cic)
-		return ELV_MQUEUE_MAY;
-
-	cfqq = cic_to_cfqq(cic, op_is_sync(op));
-	if (cfqq) {
-		cfq_init_prio_data(cfqq, cic);
-		cfqq_boost_on_prio(cfqq, op);
-
-		return __cfq_may_queue(cfqq);
-	}
-
-	return ELV_MQUEUE_MAY;
-}
-
-/*
- * queue lock held here
- */
-static void cfq_put_request(struct request *rq)
-{
-	struct cfq_queue *cfqq = RQ_CFQQ(rq);
-
-	if (cfqq) {
-		const int rw = rq_data_dir(rq);
-
-		BUG_ON(!cfqq->allocated[rw]);
-		cfqq->allocated[rw]--;
-
-		/* Put down rq reference on cfqg */
-		cfqg_put(RQ_CFQG(rq));
-		rq->elv.priv[0] = NULL;
-		rq->elv.priv[1] = NULL;
-
-		cfq_put_queue(cfqq);
-	}
-}
-
-static struct cfq_queue *
-cfq_merge_cfqqs(struct cfq_data *cfqd, struct cfq_io_cq *cic,
-		struct cfq_queue *cfqq)
-{
-	cfq_log_cfqq(cfqd, cfqq, "merging with queue %p", cfqq->new_cfqq);
-	cic_set_cfqq(cic, cfqq->new_cfqq, 1);
-	cfq_mark_cfqq_coop(cfqq->new_cfqq);
-	cfq_put_queue(cfqq);
-	return cic_to_cfqq(cic, 1);
-}
-
-/*
- * Returns NULL if a new cfqq should be allocated, or the old cfqq if this
- * was the last process referring to said cfqq.
- */
-static struct cfq_queue *
-split_cfqq(struct cfq_io_cq *cic, struct cfq_queue *cfqq)
-{
-	if (cfqq_process_refs(cfqq) == 1) {
-		cfqq->pid = current->pid;
-		cfq_clear_cfqq_coop(cfqq);
-		cfq_clear_cfqq_split_coop(cfqq);
-		return cfqq;
-	}
-
-	cic_set_cfqq(cic, NULL, 1);
-
-	cfq_put_cooperator(cfqq);
-
-	cfq_put_queue(cfqq);
-	return NULL;
-}
-/*
- * Allocate cfq data structures associated with this request.
- */
-static int
-cfq_set_request(struct request_queue *q, struct request *rq, struct bio *bio,
-		gfp_t gfp_mask)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-	struct cfq_io_cq *cic = icq_to_cic(rq->elv.icq);
-	const int rw = rq_data_dir(rq);
-	const bool is_sync = rq_is_sync(rq);
-	struct cfq_queue *cfqq;
-
-	spin_lock_irq(q->queue_lock);
-
-	check_ioprio_changed(cic, bio);
-	check_blkcg_changed(cic, bio);
-new_queue:
-	cfqq = cic_to_cfqq(cic, is_sync);
-	if (!cfqq || cfqq == &cfqd->oom_cfqq) {
-		if (cfqq)
-			cfq_put_queue(cfqq);
-		cfqq = cfq_get_queue(cfqd, is_sync, cic, bio);
-		cic_set_cfqq(cic, cfqq, is_sync);
-	} else {
-		/*
-		 * If the queue was seeky for too long, break it apart.
-		 */
-		if (cfq_cfqq_coop(cfqq) && cfq_cfqq_split_coop(cfqq)) {
-			cfq_log_cfqq(cfqd, cfqq, "breaking apart cfqq");
-			cfqq = split_cfqq(cic, cfqq);
-			if (!cfqq)
-				goto new_queue;
-		}
-
-		/*
-		 * Check to see if this queue is scheduled to merge with
-		 * another, closely cooperating queue.  The merging of
-		 * queues happens here as it must be done in process context.
-		 * The reference on new_cfqq was taken in merge_cfqqs.
-		 */
-		if (cfqq->new_cfqq)
-			cfqq = cfq_merge_cfqqs(cfqd, cic, cfqq);
-	}
-
-	cfqq->allocated[rw]++;
-
-	cfqq->ref++;
-	cfqg_get(cfqq->cfqg);
-	rq->elv.priv[0] = cfqq;
-	rq->elv.priv[1] = cfqq->cfqg;
-	spin_unlock_irq(q->queue_lock);
-
-	return 0;
-}
-
-static void cfq_kick_queue(struct work_struct *work)
-{
-	struct cfq_data *cfqd =
-		container_of(work, struct cfq_data, unplug_work);
-	struct request_queue *q = cfqd->queue;
-
-	spin_lock_irq(q->queue_lock);
-	__blk_run_queue(cfqd->queue);
-	spin_unlock_irq(q->queue_lock);
-}
-
-/*
- * Timer running if the active_queue is currently idling inside its time slice
- */
-static enum hrtimer_restart cfq_idle_slice_timer(struct hrtimer *timer)
-{
-	struct cfq_data *cfqd = container_of(timer, struct cfq_data,
-					     idle_slice_timer);
-	struct cfq_queue *cfqq;
-	unsigned long flags;
-	int timed_out = 1;
-
-	cfq_log(cfqd, "idle timer fired");
-
-	spin_lock_irqsave(cfqd->queue->queue_lock, flags);
-
-	cfqq = cfqd->active_queue;
-	if (cfqq) {
-		timed_out = 0;
-
-		/*
-		 * We saw a request before the queue expired, let it through
-		 */
-		if (cfq_cfqq_must_dispatch(cfqq))
-			goto out_kick;
-
-		/*
-		 * expired
-		 */
-		if (cfq_slice_used(cfqq))
-			goto expire;
-
-		/*
-		 * only expire and reinvoke request handler, if there are
-		 * other queues with pending requests
-		 */
-		if (!cfqd->busy_queues)
-			goto out_cont;
-
-		/*
-		 * not expired and it has a request pending, let it dispatch
-		 */
-		if (!RB_EMPTY_ROOT(&cfqq->sort_list))
-			goto out_kick;
-
-		/*
-		 * The queue depth flag is reset only when idling didn't succeed
-		 */
-		cfq_clear_cfqq_deep(cfqq);
-	}
-expire:
-	cfq_slice_expired(cfqd, timed_out);
-out_kick:
-	cfq_schedule_dispatch(cfqd);
-out_cont:
-	spin_unlock_irqrestore(cfqd->queue->queue_lock, flags);
-	return HRTIMER_NORESTART;
-}
-
-static void cfq_shutdown_timer_wq(struct cfq_data *cfqd)
-{
-	hrtimer_cancel(&cfqd->idle_slice_timer);
-	cancel_work_sync(&cfqd->unplug_work);
-}
-
-static void cfq_exit_queue(struct elevator_queue *e)
-{
-	struct cfq_data *cfqd = e->elevator_data;
-	struct request_queue *q = cfqd->queue;
-
-	cfq_shutdown_timer_wq(cfqd);
-
-	spin_lock_irq(q->queue_lock);
-
-	if (cfqd->active_queue)
-		__cfq_slice_expired(cfqd, cfqd->active_queue, 0);
-
-	spin_unlock_irq(q->queue_lock);
-
-	cfq_shutdown_timer_wq(cfqd);
-
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	blkcg_deactivate_policy(q, &blkcg_policy_cfq);
-#else
-	kfree(cfqd->root_group);
-#endif
-	kfree(cfqd);
-}
-
-static int cfq_init_queue(struct request_queue *q, struct elevator_type *e)
-{
-	struct cfq_data *cfqd;
-	struct blkcg_gq *blkg __maybe_unused;
-	int i, ret;
-	struct elevator_queue *eq;
-
-	eq = elevator_alloc(q, e);
-	if (!eq)
-		return -ENOMEM;
-
-	cfqd = kzalloc_node(sizeof(*cfqd), GFP_KERNEL, q->node);
-	if (!cfqd) {
-		kobject_put(&eq->kobj);
-		return -ENOMEM;
-	}
-	eq->elevator_data = cfqd;
-
-	cfqd->queue = q;
-	spin_lock_irq(q->queue_lock);
-	q->elevator = eq;
-	spin_unlock_irq(q->queue_lock);
-
-	/* Init root service tree */
-	cfqd->grp_service_tree = CFQ_RB_ROOT;
-
-	/* Init root group and prefer root group over other groups by default */
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	ret = blkcg_activate_policy(q, &blkcg_policy_cfq);
-	if (ret)
-		goto out_free;
-
-	cfqd->root_group = blkg_to_cfqg(q->root_blkg);
-#else
-	ret = -ENOMEM;
-	cfqd->root_group = kzalloc_node(sizeof(*cfqd->root_group),
-					GFP_KERNEL, cfqd->queue->node);
-	if (!cfqd->root_group)
-		goto out_free;
-
-	cfq_init_cfqg_base(cfqd->root_group);
-	cfqd->root_group->weight = 2 * CFQ_WEIGHT_LEGACY_DFL;
-	cfqd->root_group->leaf_weight = 2 * CFQ_WEIGHT_LEGACY_DFL;
-#endif
-
-	/*
-	 * Not strictly needed (since RB_ROOT just clears the node and we
-	 * zeroed cfqd on alloc), but better be safe in case someone decides
-	 * to add magic to the rb code
-	 */
-	for (i = 0; i < CFQ_PRIO_LISTS; i++)
-		cfqd->prio_trees[i] = RB_ROOT;
-
-	/*
-	 * Our fallback cfqq if cfq_get_queue() runs into OOM issues.
-	 * Grab a permanent reference to it, so that the normal code flow
-	 * will not attempt to free it.  oom_cfqq is linked to root_group
-	 * but shouldn't hold a reference as it'll never be unlinked.  Lose
-	 * the reference from linking right away.
-	 */
-	cfq_init_cfqq(cfqd, &cfqd->oom_cfqq, 1, 0);
-	cfqd->oom_cfqq.ref++;
-
-	spin_lock_irq(q->queue_lock);
-	cfq_link_cfqq_cfqg(&cfqd->oom_cfqq, cfqd->root_group);
-	cfqg_put(cfqd->root_group);
-	spin_unlock_irq(q->queue_lock);
-
-	hrtimer_init(&cfqd->idle_slice_timer, CLOCK_MONOTONIC,
-		     HRTIMER_MODE_REL);
-	cfqd->idle_slice_timer.function = cfq_idle_slice_timer;
-
-	INIT_WORK(&cfqd->unplug_work, cfq_kick_queue);
-
-	cfqd->cfq_quantum = cfq_quantum;
-	cfqd->cfq_fifo_expire[0] = cfq_fifo_expire[0];
-	cfqd->cfq_fifo_expire[1] = cfq_fifo_expire[1];
-	cfqd->cfq_back_max = cfq_back_max;
-	cfqd->cfq_back_penalty = cfq_back_penalty;
-	cfqd->cfq_slice[0] = cfq_slice_async;
-	cfqd->cfq_slice[1] = cfq_slice_sync;
-	cfqd->cfq_target_latency = cfq_target_latency;
-	cfqd->cfq_slice_async_rq = cfq_slice_async_rq;
-	cfqd->cfq_slice_idle = cfq_slice_idle;
-	cfqd->cfq_group_idle = cfq_group_idle;
-	cfqd->cfq_latency = 1;
-	cfqd->hw_tag = -1;
-	/*
-	 * We optimistically start by assuming sync ops weren't delayed in the
-	 * last second, in order to have a larger depth for async operations.
-	 */
-	cfqd->last_delayed_sync = ktime_get_ns() - NSEC_PER_SEC;
-	return 0;
-
-out_free:
-	kfree(cfqd);
-	kobject_put(&eq->kobj);
-	return ret;
-}
-
-static void cfq_registered_queue(struct request_queue *q)
-{
-	struct elevator_queue *e = q->elevator;
-	struct cfq_data *cfqd = e->elevator_data;
-
-	/*
-	 * Default to IOPS mode with no idling for SSDs
-	 */
-	if (blk_queue_nonrot(q))
-		cfqd->cfq_slice_idle = 0;
-	wbt_disable_default(q);
-}
-
-/*
- * sysfs parts below -->
- */
-static ssize_t
-cfq_var_show(unsigned int var, char *page)
-{
-	return sprintf(page, "%u\n", var);
-}
-
-static void
-cfq_var_store(unsigned int *var, const char *page)
-{
-	char *p = (char *) page;
-
-	*var = simple_strtoul(p, &p, 10);
-}
-
-#define SHOW_FUNCTION(__FUNC, __VAR, __CONV)				\
-static ssize_t __FUNC(struct elevator_queue *e, char *page)		\
-{									\
-	struct cfq_data *cfqd = e->elevator_data;			\
-	u64 __data = __VAR;						\
-	if (__CONV)							\
-		__data = div_u64(__data, NSEC_PER_MSEC);			\
-	return cfq_var_show(__data, (page));				\
-}
-SHOW_FUNCTION(cfq_quantum_show, cfqd->cfq_quantum, 0);
-SHOW_FUNCTION(cfq_fifo_expire_sync_show, cfqd->cfq_fifo_expire[1], 1);
-SHOW_FUNCTION(cfq_fifo_expire_async_show, cfqd->cfq_fifo_expire[0], 1);
-SHOW_FUNCTION(cfq_back_seek_max_show, cfqd->cfq_back_max, 0);
-SHOW_FUNCTION(cfq_back_seek_penalty_show, cfqd->cfq_back_penalty, 0);
-SHOW_FUNCTION(cfq_slice_idle_show, cfqd->cfq_slice_idle, 1);
-SHOW_FUNCTION(cfq_group_idle_show, cfqd->cfq_group_idle, 1);
-SHOW_FUNCTION(cfq_slice_sync_show, cfqd->cfq_slice[1], 1);
-SHOW_FUNCTION(cfq_slice_async_show, cfqd->cfq_slice[0], 1);
-SHOW_FUNCTION(cfq_slice_async_rq_show, cfqd->cfq_slice_async_rq, 0);
-SHOW_FUNCTION(cfq_low_latency_show, cfqd->cfq_latency, 0);
-SHOW_FUNCTION(cfq_target_latency_show, cfqd->cfq_target_latency, 1);
-#undef SHOW_FUNCTION
-
-#define USEC_SHOW_FUNCTION(__FUNC, __VAR)				\
-static ssize_t __FUNC(struct elevator_queue *e, char *page)		\
-{									\
-	struct cfq_data *cfqd = e->elevator_data;			\
-	u64 __data = __VAR;						\
-	__data = div_u64(__data, NSEC_PER_USEC);			\
-	return cfq_var_show(__data, (page));				\
-}
-USEC_SHOW_FUNCTION(cfq_slice_idle_us_show, cfqd->cfq_slice_idle);
-USEC_SHOW_FUNCTION(cfq_group_idle_us_show, cfqd->cfq_group_idle);
-USEC_SHOW_FUNCTION(cfq_slice_sync_us_show, cfqd->cfq_slice[1]);
-USEC_SHOW_FUNCTION(cfq_slice_async_us_show, cfqd->cfq_slice[0]);
-USEC_SHOW_FUNCTION(cfq_target_latency_us_show, cfqd->cfq_target_latency);
-#undef USEC_SHOW_FUNCTION
-
-#define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV)			\
-static ssize_t __FUNC(struct elevator_queue *e, const char *page, size_t count)	\
-{									\
-	struct cfq_data *cfqd = e->elevator_data;			\
-	unsigned int __data, __min = (MIN), __max = (MAX);		\
-									\
-	cfq_var_store(&__data, (page));					\
-	if (__data < __min)						\
-		__data = __min;						\
-	else if (__data > __max)					\
-		__data = __max;						\
-	if (__CONV)							\
-		*(__PTR) = (u64)__data * NSEC_PER_MSEC;			\
-	else								\
-		*(__PTR) = __data;					\
-	return count;							\
-}
-STORE_FUNCTION(cfq_quantum_store, &cfqd->cfq_quantum, 1, UINT_MAX, 0);
-STORE_FUNCTION(cfq_fifo_expire_sync_store, &cfqd->cfq_fifo_expire[1], 1,
-		UINT_MAX, 1);
-STORE_FUNCTION(cfq_fifo_expire_async_store, &cfqd->cfq_fifo_expire[0], 1,
-		UINT_MAX, 1);
-STORE_FUNCTION(cfq_back_seek_max_store, &cfqd->cfq_back_max, 0, UINT_MAX, 0);
-STORE_FUNCTION(cfq_back_seek_penalty_store, &cfqd->cfq_back_penalty, 1,
-		UINT_MAX, 0);
-STORE_FUNCTION(cfq_slice_idle_store, &cfqd->cfq_slice_idle, 0, UINT_MAX, 1);
-STORE_FUNCTION(cfq_group_idle_store, &cfqd->cfq_group_idle, 0, UINT_MAX, 1);
-STORE_FUNCTION(cfq_slice_sync_store, &cfqd->cfq_slice[1], 1, UINT_MAX, 1);
-STORE_FUNCTION(cfq_slice_async_store, &cfqd->cfq_slice[0], 1, UINT_MAX, 1);
-STORE_FUNCTION(cfq_slice_async_rq_store, &cfqd->cfq_slice_async_rq, 1,
-		UINT_MAX, 0);
-STORE_FUNCTION(cfq_low_latency_store, &cfqd->cfq_latency, 0, 1, 0);
-STORE_FUNCTION(cfq_target_latency_store, &cfqd->cfq_target_latency, 1, UINT_MAX, 1);
-#undef STORE_FUNCTION
-
-#define USEC_STORE_FUNCTION(__FUNC, __PTR, MIN, MAX)			\
-static ssize_t __FUNC(struct elevator_queue *e, const char *page, size_t count)	\
-{									\
-	struct cfq_data *cfqd = e->elevator_data;			\
-	unsigned int __data, __min = (MIN), __max = (MAX);		\
-									\
-	cfq_var_store(&__data, (page));					\
-	if (__data < __min)						\
-		__data = __min;						\
-	else if (__data > __max)					\
-		__data = __max;						\
-	*(__PTR) = (u64)__data * NSEC_PER_USEC;				\
-	return count;							\
-}
-USEC_STORE_FUNCTION(cfq_slice_idle_us_store, &cfqd->cfq_slice_idle, 0, UINT_MAX);
-USEC_STORE_FUNCTION(cfq_group_idle_us_store, &cfqd->cfq_group_idle, 0, UINT_MAX);
-USEC_STORE_FUNCTION(cfq_slice_sync_us_store, &cfqd->cfq_slice[1], 1, UINT_MAX);
-USEC_STORE_FUNCTION(cfq_slice_async_us_store, &cfqd->cfq_slice[0], 1, UINT_MAX);
-USEC_STORE_FUNCTION(cfq_target_latency_us_store, &cfqd->cfq_target_latency, 1, UINT_MAX);
-#undef USEC_STORE_FUNCTION
-
-#define CFQ_ATTR(name) \
-	__ATTR(name, 0644, cfq_##name##_show, cfq_##name##_store)
-
-static struct elv_fs_entry cfq_attrs[] = {
-	CFQ_ATTR(quantum),
-	CFQ_ATTR(fifo_expire_sync),
-	CFQ_ATTR(fifo_expire_async),
-	CFQ_ATTR(back_seek_max),
-	CFQ_ATTR(back_seek_penalty),
-	CFQ_ATTR(slice_sync),
-	CFQ_ATTR(slice_sync_us),
-	CFQ_ATTR(slice_async),
-	CFQ_ATTR(slice_async_us),
-	CFQ_ATTR(slice_async_rq),
-	CFQ_ATTR(slice_idle),
-	CFQ_ATTR(slice_idle_us),
-	CFQ_ATTR(group_idle),
-	CFQ_ATTR(group_idle_us),
-	CFQ_ATTR(low_latency),
-	CFQ_ATTR(target_latency),
-	CFQ_ATTR(target_latency_us),
-	__ATTR_NULL
-};
-
-static struct elevator_type iosched_cfq = {
-	.ops.sq = {
-		.elevator_merge_fn = 		cfq_merge,
-		.elevator_merged_fn =		cfq_merged_request,
-		.elevator_merge_req_fn =	cfq_merged_requests,
-		.elevator_allow_bio_merge_fn =	cfq_allow_bio_merge,
-		.elevator_allow_rq_merge_fn =	cfq_allow_rq_merge,
-		.elevator_bio_merged_fn =	cfq_bio_merged,
-		.elevator_dispatch_fn =		cfq_dispatch_requests,
-		.elevator_add_req_fn =		cfq_insert_request,
-		.elevator_activate_req_fn =	cfq_activate_request,
-		.elevator_deactivate_req_fn =	cfq_deactivate_request,
-		.elevator_completed_req_fn =	cfq_completed_request,
-		.elevator_former_req_fn =	elv_rb_former_request,
-		.elevator_latter_req_fn =	elv_rb_latter_request,
-		.elevator_init_icq_fn =		cfq_init_icq,
-		.elevator_exit_icq_fn =		cfq_exit_icq,
-		.elevator_set_req_fn =		cfq_set_request,
-		.elevator_put_req_fn =		cfq_put_request,
-		.elevator_may_queue_fn =	cfq_may_queue,
-		.elevator_init_fn =		cfq_init_queue,
-		.elevator_exit_fn =		cfq_exit_queue,
-		.elevator_registered_fn =	cfq_registered_queue,
-	},
-	.icq_size	=	sizeof(struct cfq_io_cq),
-	.icq_align	=	__alignof__(struct cfq_io_cq),
-	.elevator_attrs =	cfq_attrs,
-	.elevator_name	=	"cfq",
-	.elevator_owner =	THIS_MODULE,
-};
-
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-static struct blkcg_policy blkcg_policy_cfq = {
-	.dfl_cftypes		= cfq_blkcg_files,
-	.legacy_cftypes		= cfq_blkcg_legacy_files,
-
-	.cpd_alloc_fn		= cfq_cpd_alloc,
-	.cpd_init_fn		= cfq_cpd_init,
-	.cpd_free_fn		= cfq_cpd_free,
-	.cpd_bind_fn		= cfq_cpd_bind,
-
-	.pd_alloc_fn		= cfq_pd_alloc,
-	.pd_init_fn		= cfq_pd_init,
-	.pd_offline_fn		= cfq_pd_offline,
-	.pd_free_fn		= cfq_pd_free,
-	.pd_reset_stats_fn	= cfq_pd_reset_stats,
-};
-#endif
-
-static int __init cfq_init(void)
-{
-	int ret;
-
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	ret = blkcg_policy_register(&blkcg_policy_cfq);
-	if (ret)
-		return ret;
-#else
-	cfq_group_idle = 0;
-#endif
-
-	ret = -ENOMEM;
-	cfq_pool = KMEM_CACHE(cfq_queue, 0);
-	if (!cfq_pool)
-		goto err_pol_unreg;
-
-	ret = elv_register(&iosched_cfq);
-	if (ret)
-		goto err_free_pool;
-
-	return 0;
-
-err_free_pool:
-	kmem_cache_destroy(cfq_pool);
-err_pol_unreg:
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	blkcg_policy_unregister(&blkcg_policy_cfq);
-#endif
-	return ret;
-}
-
-static void __exit cfq_exit(void)
-{
-#ifdef CONFIG_CFQ_GROUP_IOSCHED
-	blkcg_policy_unregister(&blkcg_policy_cfq);
-#endif
-	elv_unregister(&iosched_cfq);
-	kmem_cache_destroy(cfq_pool);
-}
-
-module_init(cfq_init);
-module_exit(cfq_exit);
-
-MODULE_AUTHOR("Jens Axboe");
-MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("Completely Fair Queueing IO scheduler");
diff --git a/block/deadline-iosched.c b/block/deadline-iosched.c
deleted file mode 100644
index ef2f1f09e9b3..000000000000
--- a/block/deadline-iosched.c
+++ /dev/null
@@ -1,560 +0,0 @@
-/*
- *  Deadline i/o scheduler.
- *
- *  Copyright (C) 2002 Jens Axboe <axboe@kernel.dk>
- */
-#include <linux/kernel.h>
-#include <linux/fs.h>
-#include <linux/blkdev.h>
-#include <linux/elevator.h>
-#include <linux/bio.h>
-#include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/init.h>
-#include <linux/compiler.h>
-#include <linux/rbtree.h>
-
-/*
- * See Documentation/block/deadline-iosched.txt
- */
-static const int read_expire = HZ / 2;  /* max time before a read is submitted. */
-static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
-static const int writes_starved = 2;    /* max times reads can starve a write */
-static const int fifo_batch = 16;       /* # of sequential requests treated as one
-				     by the above parameters. For throughput. */
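
In other words, with these defaults a read should be dispatched within 500ms of being queued and a write within 5 seconds (both soft limits), reads may make writes wait at most 2 extra passes, and up to 16 contiguous requests are charged as a single batch against those deadlines. All of these are runtime-tunable through the scheduler's sysfs attributes (read_expire, write_expire, writes_starved, front_merges, fifo_batch) shown further down.
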
-
-struct deadline_data {
-	/*
-	 * run time data
-	 */
-
-	/*
-	 * requests (deadline_rq s) are present on both sort_list and fifo_list
-	 */
-	struct rb_root sort_list[2];	
-	struct list_head fifo_list[2];
-
-	/*
-	 * next in sort order; read, write or both may be NULL
-	 */
-	struct request *next_rq[2];
-	unsigned int batching;		/* number of sequential requests made */
-	unsigned int starved;		/* times reads have starved writes */
-
-	/*
-	 * settings that change how the i/o scheduler behaves
-	 */
-	int fifo_expire[2];
-	int fifo_batch;
-	int writes_starved;
-	int front_merges;
-};
-
-static inline struct rb_root *
-deadline_rb_root(struct deadline_data *dd, struct request *rq)
-{
-	return &dd->sort_list[rq_data_dir(rq)];
-}
-
-/*
- * get the request after `rq' in sector-sorted order
- */
-static inline struct request *
-deadline_latter_request(struct request *rq)
-{
-	struct rb_node *node = rb_next(&rq->rb_node);
-
-	if (node)
-		return rb_entry_rq(node);
-
-	return NULL;
-}
-
-static void
-deadline_add_rq_rb(struct deadline_data *dd, struct request *rq)
-{
-	struct rb_root *root = deadline_rb_root(dd, rq);
-
-	elv_rb_add(root, rq);
-}
-
-static inline void
-deadline_del_rq_rb(struct deadline_data *dd, struct request *rq)
-{
-	const int data_dir = rq_data_dir(rq);
-
-	if (dd->next_rq[data_dir] == rq)
-		dd->next_rq[data_dir] = deadline_latter_request(rq);
-
-	elv_rb_del(deadline_rb_root(dd, rq), rq);
-}
-
-/*
- * add rq to rbtree and fifo
- */
-static void
-deadline_add_request(struct request_queue *q, struct request *rq)
-{
-	struct deadline_data *dd = q->elevator->elevator_data;
-	const int data_dir = rq_data_dir(rq);
-
-	/*
-	 * This may be a requeue of a write request that has locked its
-	 * target zone. If it is the case, this releases the zone lock.
-	 * target zone. If that is the case, this releases the zone lock.
-	blk_req_zone_write_unlock(rq);
-
-	deadline_add_rq_rb(dd, rq);
-
-	/*
-	 * set expire time and add to fifo list
-	 */
-	rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
-	list_add_tail(&rq->queuelist, &dd->fifo_list[data_dir]);
-}
-
-/*
- * remove rq from rbtree and fifo.
- */
-static void deadline_remove_request(struct request_queue *q, struct request *rq)
-{
-	struct deadline_data *dd = q->elevator->elevator_data;
-
-	rq_fifo_clear(rq);
-	deadline_del_rq_rb(dd, rq);
-}
-
-static enum elv_merge
-deadline_merge(struct request_queue *q, struct request **req, struct bio *bio)
-{
-	struct deadline_data *dd = q->elevator->elevator_data;
-	struct request *__rq;
-
-	/*
-	 * check for front merge
-	 */
-	if (dd->front_merges) {
-		sector_t sector = bio_end_sector(bio);
-
-		__rq = elv_rb_find(&dd->sort_list[bio_data_dir(bio)], sector);
-		if (__rq) {
-			BUG_ON(sector != blk_rq_pos(__rq));
-
-			if (elv_bio_merge_ok(__rq, bio)) {
-				*req = __rq;
-				return ELEVATOR_FRONT_MERGE;
-			}
-		}
-	}
-
-	return ELEVATOR_NO_MERGE;
-}
-
-static void deadline_merged_request(struct request_queue *q,
-				    struct request *req, enum elv_merge type)
-{
-	struct deadline_data *dd = q->elevator->elevator_data;
-
-	/*
-	 * if the merge was a front merge, we need to reposition request
-	 */
-	if (type == ELEVATOR_FRONT_MERGE) {
-		elv_rb_del(deadline_rb_root(dd, req), req);
-		deadline_add_rq_rb(dd, req);
-	}
-}
-
-static void
-deadline_merged_requests(struct request_queue *q, struct request *req,
-			 struct request *next)
-{
-	/*
-	 * if next expires before rq, assign its expire time to rq
-	 * and move into next position (next will be deleted) in fifo
-	 */
-	if (!list_empty(&req->queuelist) && !list_empty(&next->queuelist)) {
-		if (time_before((unsigned long)next->fifo_time,
-				(unsigned long)req->fifo_time)) {
-			list_move(&req->queuelist, &next->queuelist);
-			req->fifo_time = next->fifo_time;
-		}
-	}
-
-	/*
-	 * kill knowledge of next, this one is a goner
-	 */
-	deadline_remove_request(q, next);
-}
-
-/*
- * move request from sort list to dispatch queue.
- */
-static inline void
-deadline_move_to_dispatch(struct deadline_data *dd, struct request *rq)
-{
-	struct request_queue *q = rq->q;
-
-	/*
-	 * For a zoned block device, write requests must write lock their
-	 * target zone.
-	 */
-	blk_req_zone_write_lock(rq);
-
-	deadline_remove_request(q, rq);
-	elv_dispatch_add_tail(q, rq);
-}
-
-/*
- * move an entry to dispatch queue
- */
-static void
-deadline_move_request(struct deadline_data *dd, struct request *rq)
-{
-	const int data_dir = rq_data_dir(rq);
-
-	dd->next_rq[READ] = NULL;
-	dd->next_rq[WRITE] = NULL;
-	dd->next_rq[data_dir] = deadline_latter_request(rq);
-
-	/*
-	 * take it off the sort and fifo list, move
-	 * to dispatch queue
-	 */
-	deadline_move_to_dispatch(dd, rq);
-}
-
-/*
- * deadline_check_fifo returns 0 if there are no expired requests on the fifo,
- * 1 otherwise. Requires !list_empty(&dd->fifo_list[data_dir])
- */
-static inline int deadline_check_fifo(struct deadline_data *dd, int ddir)
-{
-	struct request *rq = rq_entry_fifo(dd->fifo_list[ddir].next);
-
-	/*
-	 * rq is expired!
-	 */
-	if (time_after_eq(jiffies, (unsigned long)rq->fifo_time))
-		return 1;
-
-	return 0;
-}
-
-/*
- * For the specified data direction, return the next request to dispatch using
- * arrival ordered lists.
- */
-static struct request *
-deadline_fifo_request(struct deadline_data *dd, int data_dir)
-{
-	struct request *rq;
-
-	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
-		return NULL;
-
-	if (list_empty(&dd->fifo_list[data_dir]))
-		return NULL;
-
-	rq = rq_entry_fifo(dd->fifo_list[data_dir].next);
-	if (data_dir == READ || !blk_queue_is_zoned(rq->q))
-		return rq;
-
-	/*
-	 * Look for a write request that can be dispatched, that is one with
-	 * an unlocked target zone.
-	 */
-	list_for_each_entry(rq, &dd->fifo_list[WRITE], queuelist) {
-		if (blk_req_can_dispatch_to_zone(rq))
-			return rq;
-	}
-
-	return NULL;
-}
-
-/*
- * For the specified data direction, return the next request to dispatch using
- * sector position sorted lists.
- */
-static struct request *
-deadline_next_request(struct deadline_data *dd, int data_dir)
-{
-	struct request *rq;
-
-	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
-		return NULL;
-
-	rq = dd->next_rq[data_dir];
-	if (!rq)
-		return NULL;
-
-	if (data_dir == READ || !blk_queue_is_zoned(rq->q))
-		return rq;
-
-	/*
-	 * Look for a write request that can be dispatched, that is one with
-	 * an unlocked target zone.
-	 */
-	while (rq) {
-		if (blk_req_can_dispatch_to_zone(rq))
-			return rq;
-		rq = deadline_latter_request(rq);
-	}
-
-	return NULL;
-}
-
-/*
- * deadline_dispatch_requests selects the best request according to
- * read/write expire, fifo_batch, etc
- */
-static int deadline_dispatch_requests(struct request_queue *q, int force)
-{
-	struct deadline_data *dd = q->elevator->elevator_data;
-	const int reads = !list_empty(&dd->fifo_list[READ]);
-	const int writes = !list_empty(&dd->fifo_list[WRITE]);
-	struct request *rq, *next_rq;
-	int data_dir;
-
-	/*
-	 * batches are currently reads XOR writes
-	 */
-	rq = deadline_next_request(dd, WRITE);
-	if (!rq)
-		rq = deadline_next_request(dd, READ);
-
-	if (rq && dd->batching < dd->fifo_batch)
-		/* we have a next request and are still entitled to batch */
-		goto dispatch_request;
-
-	/*
-	 * at this point we are not running a batch. select the appropriate
-	 * data direction (read / write)
-	 */
-
-	if (reads) {
-		BUG_ON(RB_EMPTY_ROOT(&dd->sort_list[READ]));
-
-		if (deadline_fifo_request(dd, WRITE) &&
-		    (dd->starved++ >= dd->writes_starved))
-			goto dispatch_writes;
-
-		data_dir = READ;
-
-		goto dispatch_find_request;
-	}
-
-	/*
-	 * there are either no reads or writes have been starved
-	 */
-
-	if (writes) {
-dispatch_writes:
-		BUG_ON(RB_EMPTY_ROOT(&dd->sort_list[WRITE]));
-
-		dd->starved = 0;
-
-		data_dir = WRITE;
-
-		goto dispatch_find_request;
-	}
-
-	return 0;
-
-dispatch_find_request:
-	/*
-	 * we are not running a batch, find best request for selected data_dir
-	 */
-	next_rq = deadline_next_request(dd, data_dir);
-	if (deadline_check_fifo(dd, data_dir) || !next_rq) {
-		/*
-		 * A deadline has expired, the last request was in the other
-		 * direction, or we have run out of higher-sectored requests.
-		 * Start again from the request with the earliest expiry time.
-		 */
-		rq = deadline_fifo_request(dd, data_dir);
-	} else {
-		/*
-		 * The last req was the same dir and we have a next request in
-		 * sort order. No expired requests so continue on from here.
-		 */
-		rq = next_rq;
-	}
-
-	/*
-	 * For a zoned block device, if we only have writes queued and none of
-	 * them can be dispatched, rq will be NULL.
-	 */
-	if (!rq)
-		return 0;
-
-	dd->batching = 0;
-
-dispatch_request:
-	/*
-	 * rq is the selected appropriate request.
-	 */
-	dd->batching++;
-	deadline_move_request(dd, rq);
-
-	return 1;
-}
-
-/*
- * For zoned block devices, write unlock the target zone of completed
- * write requests.
- */
-static void
-deadline_completed_request(struct request_queue *q, struct request *rq)
-{
-	blk_req_zone_write_unlock(rq);
-}
-
-static void deadline_exit_queue(struct elevator_queue *e)
-{
-	struct deadline_data *dd = e->elevator_data;
-
-	BUG_ON(!list_empty(&dd->fifo_list[READ]));
-	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
-
-	kfree(dd);
-}
-
-/*
- * initialize elevator private data (deadline_data).
- */
-static int deadline_init_queue(struct request_queue *q, struct elevator_type *e)
-{
-	struct deadline_data *dd;
-	struct elevator_queue *eq;
-
-	eq = elevator_alloc(q, e);
-	if (!eq)
-		return -ENOMEM;
-
-	dd = kzalloc_node(sizeof(*dd), GFP_KERNEL, q->node);
-	if (!dd) {
-		kobject_put(&eq->kobj);
-		return -ENOMEM;
-	}
-	eq->elevator_data = dd;
-
-	INIT_LIST_HEAD(&dd->fifo_list[READ]);
-	INIT_LIST_HEAD(&dd->fifo_list[WRITE]);
-	dd->sort_list[READ] = RB_ROOT;
-	dd->sort_list[WRITE] = RB_ROOT;
-	dd->fifo_expire[READ] = read_expire;
-	dd->fifo_expire[WRITE] = write_expire;
-	dd->writes_starved = writes_starved;
-	dd->front_merges = 1;
-	dd->fifo_batch = fifo_batch;
-
-	spin_lock_irq(q->queue_lock);
-	q->elevator = eq;
-	spin_unlock_irq(q->queue_lock);
-	return 0;
-}
-
-/*
- * sysfs parts below
- */
-
-static ssize_t
-deadline_var_show(int var, char *page)
-{
-	return sprintf(page, "%d\n", var);
-}
-
-static void
-deadline_var_store(int *var, const char *page)
-{
-	char *p = (char *) page;
-
-	*var = simple_strtol(p, &p, 10);
-}
-
-#define SHOW_FUNCTION(__FUNC, __VAR, __CONV)				\
-static ssize_t __FUNC(struct elevator_queue *e, char *page)		\
-{									\
-	struct deadline_data *dd = e->elevator_data;			\
-	int __data = __VAR;						\
-	if (__CONV)							\
-		__data = jiffies_to_msecs(__data);			\
-	return deadline_var_show(__data, (page));			\
-}
-SHOW_FUNCTION(deadline_read_expire_show, dd->fifo_expire[READ], 1);
-SHOW_FUNCTION(deadline_write_expire_show, dd->fifo_expire[WRITE], 1);
-SHOW_FUNCTION(deadline_writes_starved_show, dd->writes_starved, 0);
-SHOW_FUNCTION(deadline_front_merges_show, dd->front_merges, 0);
-SHOW_FUNCTION(deadline_fifo_batch_show, dd->fifo_batch, 0);
-#undef SHOW_FUNCTION
-
-#define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV)			\
-static ssize_t __FUNC(struct elevator_queue *e, const char *page, size_t count)	\
-{									\
-	struct deadline_data *dd = e->elevator_data;			\
-	int __data;							\
-	deadline_var_store(&__data, (page));				\
-	if (__data < (MIN))						\
-		__data = (MIN);						\
-	else if (__data > (MAX))					\
-		__data = (MAX);						\
-	if (__CONV)							\
-		*(__PTR) = msecs_to_jiffies(__data);			\
-	else								\
-		*(__PTR) = __data;					\
-	return count;							\
-}
-STORE_FUNCTION(deadline_read_expire_store, &dd->fifo_expire[READ], 0, INT_MAX, 1);
-STORE_FUNCTION(deadline_write_expire_store, &dd->fifo_expire[WRITE], 0, INT_MAX, 1);
-STORE_FUNCTION(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX, 0);
-STORE_FUNCTION(deadline_front_merges_store, &dd->front_merges, 0, 1, 0);
-STORE_FUNCTION(deadline_fifo_batch_store, &dd->fifo_batch, 0, INT_MAX, 0);
-#undef STORE_FUNCTION
-
-#define DD_ATTR(name) \
-	__ATTR(name, 0644, deadline_##name##_show, deadline_##name##_store)
-
-static struct elv_fs_entry deadline_attrs[] = {
-	DD_ATTR(read_expire),
-	DD_ATTR(write_expire),
-	DD_ATTR(writes_starved),
-	DD_ATTR(front_merges),
-	DD_ATTR(fifo_batch),
-	__ATTR_NULL
-};
-
-static struct elevator_type iosched_deadline = {
-	.ops.sq = {
-		.elevator_merge_fn = 		deadline_merge,
-		.elevator_merged_fn =		deadline_merged_request,
-		.elevator_merge_req_fn =	deadline_merged_requests,
-		.elevator_dispatch_fn =		deadline_dispatch_requests,
-		.elevator_completed_req_fn =	deadline_completed_request,
-		.elevator_add_req_fn =		deadline_add_request,
-		.elevator_former_req_fn =	elv_rb_former_request,
-		.elevator_latter_req_fn =	elv_rb_latter_request,
-		.elevator_init_fn =		deadline_init_queue,
-		.elevator_exit_fn =		deadline_exit_queue,
-	},
-
-	.elevator_attrs = deadline_attrs,
-	.elevator_name = "deadline",
-	.elevator_owner = THIS_MODULE,
-};
-
-static int __init deadline_init(void)
-{
-	return elv_register(&iosched_deadline);
-}
-
-static void __exit deadline_exit(void)
-{
-	elv_unregister(&iosched_deadline);
-}
-
-module_init(deadline_init);
-module_exit(deadline_exit);
-
-MODULE_AUTHOR("Jens Axboe");
-MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("deadline IO scheduler");
diff --git a/block/elevator.c b/block/elevator.c
index 8fdcd64ae12e..54e1adac26c5 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -225,8 +225,6 @@ int elevator_init(struct request_queue *q)
 							chosen_elevator);
 	}
 
-	if (!e)
-		e = elevator_get(q, CONFIG_DEFAULT_IOSCHED, false);
 	if (!e) {
 		printk(KERN_ERR
 			"Default I/O scheduler not found. Using noop.\n");
@@ -356,68 +354,6 @@ struct request *elv_rb_find(struct rb_root *root, sector_t sector)
 }
 EXPORT_SYMBOL(elv_rb_find);
 
-/*
- * Insert rq into dispatch queue of q.  Queue lock must be held on
- * entry.  rq is sort instead into the dispatch queue. To be used by
- * specific elevators.
- */
-void elv_dispatch_sort(struct request_queue *q, struct request *rq)
-{
-	sector_t boundary;
-	struct list_head *entry;
-
-	if (q->last_merge == rq)
-		q->last_merge = NULL;
-
-	elv_rqhash_del(q, rq);
-
-	q->nr_sorted--;
-
-	boundary = q->end_sector;
-	list_for_each_prev(entry, &q->queue_head) {
-		struct request *pos = list_entry_rq(entry);
-
-		if (req_op(rq) != req_op(pos))
-			break;
-		if (rq_data_dir(rq) != rq_data_dir(pos))
-			break;
-		if (pos->rq_flags & (RQF_STARTED | RQF_SOFTBARRIER))
-			break;
-		if (blk_rq_pos(rq) >= boundary) {
-			if (blk_rq_pos(pos) < boundary)
-				continue;
-		} else {
-			if (blk_rq_pos(pos) >= boundary)
-				break;
-		}
-		if (blk_rq_pos(rq) >= blk_rq_pos(pos))
-			break;
-	}
-
-	list_add(&rq->queuelist, entry);
-}
-EXPORT_SYMBOL(elv_dispatch_sort);
-
-/*
- * Insert rq into dispatch queue of q.  Queue lock must be held on
- * entry.  rq is added to the back of the dispatch queue. To be used by
- * specific elevators.
- */
-void elv_dispatch_add_tail(struct request_queue *q, struct request *rq)
-{
-	if (q->last_merge == rq)
-		q->last_merge = NULL;
-
-	elv_rqhash_del(q, rq);
-
-	q->nr_sorted--;
-
-	q->end_sector = rq_end_sector(rq);
-	q->boundary_rq = rq;
-	list_add_tail(&rq->queuelist, &q->queue_head);
-}
-EXPORT_SYMBOL(elv_dispatch_add_tail);
-
 enum elv_merge elv_merge(struct request_queue *q, struct request **req,
 		struct bio *bio)
 {
@@ -881,12 +817,6 @@ int elv_register(struct elevator_type *e)
 	list_add_tail(&e->list, &elv_list);
 	spin_unlock(&elv_list_lock);
 
-	/* print pretty message */
-	if (elevator_match(e, chosen_elevator) ||
-			(!*chosen_elevator &&
-			 elevator_match(e, CONFIG_DEFAULT_IOSCHED)))
-				def = " (default)";
-
 	printk(KERN_INFO "io scheduler %s registered%s\n", e->elevator_name,
 								def);
 	return 0;
diff --git a/block/noop-iosched.c b/block/noop-iosched.c
deleted file mode 100644
index 2d1b15d89b45..000000000000
--- a/block/noop-iosched.c
+++ /dev/null
@@ -1,124 +0,0 @@
-/*
- * elevator noop
- */
-#include <linux/blkdev.h>
-#include <linux/elevator.h>
-#include <linux/bio.h>
-#include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/init.h>
-
-struct noop_data {
-	struct list_head queue;
-};
-
-static void noop_merged_requests(struct request_queue *q, struct request *rq,
-				 struct request *next)
-{
-	list_del_init(&next->queuelist);
-}
-
-static int noop_dispatch(struct request_queue *q, int force)
-{
-	struct noop_data *nd = q->elevator->elevator_data;
-	struct request *rq;
-
-	rq = list_first_entry_or_null(&nd->queue, struct request, queuelist);
-	if (rq) {
-		list_del_init(&rq->queuelist);
-		elv_dispatch_sort(q, rq);
-		return 1;
-	}
-	return 0;
-}
-
-static void noop_add_request(struct request_queue *q, struct request *rq)
-{
-	struct noop_data *nd = q->elevator->elevator_data;
-
-	list_add_tail(&rq->queuelist, &nd->queue);
-}
-
-static struct request *
-noop_former_request(struct request_queue *q, struct request *rq)
-{
-	struct noop_data *nd = q->elevator->elevator_data;
-
-	if (rq->queuelist.prev == &nd->queue)
-		return NULL;
-	return list_prev_entry(rq, queuelist);
-}
-
-static struct request *
-noop_latter_request(struct request_queue *q, struct request *rq)
-{
-	struct noop_data *nd = q->elevator->elevator_data;
-
-	if (rq->queuelist.next == &nd->queue)
-		return NULL;
-	return list_next_entry(rq, queuelist);
-}
-
-static int noop_init_queue(struct request_queue *q, struct elevator_type *e)
-{
-	struct noop_data *nd;
-	struct elevator_queue *eq;
-
-	eq = elevator_alloc(q, e);
-	if (!eq)
-		return -ENOMEM;
-
-	nd = kmalloc_node(sizeof(*nd), GFP_KERNEL, q->node);
-	if (!nd) {
-		kobject_put(&eq->kobj);
-		return -ENOMEM;
-	}
-	eq->elevator_data = nd;
-
-	INIT_LIST_HEAD(&nd->queue);
-
-	spin_lock_irq(q->queue_lock);
-	q->elevator = eq;
-	spin_unlock_irq(q->queue_lock);
-	return 0;
-}
-
-static void noop_exit_queue(struct elevator_queue *e)
-{
-	struct noop_data *nd = e->elevator_data;
-
-	BUG_ON(!list_empty(&nd->queue));
-	kfree(nd);
-}
-
-static struct elevator_type elevator_noop = {
-	.ops.sq = {
-		.elevator_merge_req_fn		= noop_merged_requests,
-		.elevator_dispatch_fn		= noop_dispatch,
-		.elevator_add_req_fn		= noop_add_request,
-		.elevator_former_req_fn		= noop_former_request,
-		.elevator_latter_req_fn		= noop_latter_request,
-		.elevator_init_fn		= noop_init_queue,
-		.elevator_exit_fn		= noop_exit_queue,
-	},
-	.elevator_name = "noop",
-	.elevator_owner = THIS_MODULE,
-};
-
-static int __init noop_init(void)
-{
-	return elv_register(&elevator_noop);
-}
-
-static void __exit noop_exit(void)
-{
-	elv_unregister(&elevator_noop);
-}
-
-module_init(noop_init);
-module_exit(noop_exit);
-
-
-MODULE_AUTHOR("Jens Axboe");
-MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("No-op IO scheduler");
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 20/28] block: remove dead elevator code
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (18 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 19/28] block: remove legacy IO schedulers Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-25 21:10 ` [PATCH 21/28] block: remove __blk_put_request() Jens Axboe
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

This removes a bunch of core and elevator related code. On the core
front, we remove anything related to queue running, draining,
initialization, plugging, and congestion. We also kill anything
related to request allocation, merging, retrieval, and completion.

Remove any checking for single queue IO schedulers, as they no
longer exist. This means we can also delete a bunch of code related
to request issue, adding, completion, etc., along with all the
SQ-related ops and helpers.

Also kill load_default_modules(), as all it did was provide a way
to load the default single queue elevator.
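
For illustration, a minimal sketch of what request allocation looks
like for a driver after this change (the helper below is hypothetical
and not part of the patch); every passthrough request now comes out
of blk-mq:

#include <linux/blkdev.h>
#include <linux/err.h>

/* Hypothetical helper, illustration only. */
static int example_issue_pt(struct request_queue *q, struct gendisk *disk)
{
        struct request *rq;

        /* Always backed by a blk-mq tag; no legacy request_list anymore. */
        rq = blk_get_request(q, REQ_OP_DRV_OUT, 0);
        if (IS_ERR(rq))
                return PTR_ERR(rq);

        /* Dispatches through blk-mq unconditionally. */
        blk_execute_rq(q, disk, rq, 0);

        blk_put_request(rq);
        return 0;
}

blk_get_request() itself is now just a thin wrapper around
blk_mq_alloc_request(), as the blk-core.c hunk below shows.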

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bfq-iosched.c      |    1 -
 block/blk-core.c         | 1749 +-------------------------------------
 block/blk-exec.c         |   20 +-
 block/blk-ioc.c          |   46 +-
 block/blk-merge.c        |    5 -
 block/blk-settings.c     |   36 -
 block/blk-sysfs.c        |   36 +-
 block/blk.h              |   51 --
 block/elevator.c         |  377 +-------
 block/kyber-iosched.c    |    1 -
 block/mq-deadline.c      |    1 -
 include/linux/blkdev.h   |   93 +-
 include/linux/elevator.h |   90 +-
 include/linux/init.h     |    1 -
 init/do_mounts_initrd.c  |    3 -
 init/initramfs.c         |    6 -
 init/main.c              |   12 -
 17 files changed, 73 insertions(+), 2455 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 6075100f03a5..83b29c007f76 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5745,7 +5745,6 @@ static struct elevator_type iosched_bfq_mq = {
 		.exit_sched		= bfq_exit_queue,
 	},
 
-	.uses_mq =		true,
 	.icq_size =		sizeof(struct bfq_io_cq),
 	.icq_align =		__alignof__(struct bfq_io_cq),
 	.elevator_attrs =	bfq_attrs,
diff --git a/block/blk-core.c b/block/blk-core.c
index 1a814bce323d..009ac18cc7e1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -144,46 +144,6 @@ bool blk_queue_flag_test_and_clear(unsigned int flag, struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_queue_flag_test_and_clear);
 
-static void blk_clear_congested(struct request_list *rl, int sync)
-{
-#ifdef CONFIG_CGROUP_WRITEBACK
-	clear_wb_congested(rl->blkg->wb_congested, sync);
-#else
-	/*
-	 * If !CGROUP_WRITEBACK, all blkg's map to bdi->wb and we shouldn't
-	 * flip its congestion state for events on other blkcgs.
-	 */
-	if (rl == &rl->q->root_rl)
-		clear_wb_congested(rl->q->backing_dev_info->wb.congested, sync);
-#endif
-}
-
-static void blk_set_congested(struct request_list *rl, int sync)
-{
-#ifdef CONFIG_CGROUP_WRITEBACK
-	set_wb_congested(rl->blkg->wb_congested, sync);
-#else
-	/* see blk_clear_congested() */
-	if (rl == &rl->q->root_rl)
-		set_wb_congested(rl->q->backing_dev_info->wb.congested, sync);
-#endif
-}
-
-void blk_queue_congestion_threshold(struct request_queue *q)
-{
-	int nr;
-
-	nr = q->nr_requests - (q->nr_requests / 8) + 1;
-	if (nr > q->nr_requests)
-		nr = q->nr_requests;
-	q->nr_congestion_on = nr;
-
-	nr = q->nr_requests - (q->nr_requests / 8) - (q->nr_requests / 16) - 1;
-	if (nr < 1)
-		nr = 1;
-	q->nr_congestion_off = nr;
-}
-
 void blk_rq_init(struct request_queue *q, struct request *rq)
 {
 	memset(rq, 0, sizeof(*rq));
@@ -292,99 +252,6 @@ void blk_dump_rq_flags(struct request *rq, char *msg)
 }
 EXPORT_SYMBOL(blk_dump_rq_flags);
 
-static void blk_delay_work(struct work_struct *work)
-{
-	struct request_queue *q;
-
-	q = container_of(work, struct request_queue, delay_work.work);
-	spin_lock_irq(q->queue_lock);
-	__blk_run_queue(q);
-	spin_unlock_irq(q->queue_lock);
-}
-
-/**
- * blk_delay_queue - restart queueing after defined interval
- * @q:		The &struct request_queue in question
- * @msecs:	Delay in msecs
- *
- * Description:
- *   Sometimes queueing needs to be postponed for a little while, to allow
- *   resources to come back. This function will make sure that queueing is
- *   restarted around the specified time.
- */
-void blk_delay_queue(struct request_queue *q, unsigned long msecs)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	if (likely(!blk_queue_dead(q)))
-		queue_delayed_work(kblockd_workqueue, &q->delay_work,
-				   msecs_to_jiffies(msecs));
-}
-EXPORT_SYMBOL(blk_delay_queue);
-
-/**
- * blk_start_queue_async - asynchronously restart a previously stopped queue
- * @q:    The &struct request_queue in question
- *
- * Description:
- *   blk_start_queue_async() will clear the stop flag on the queue, and
- *   ensure that the request_fn for the queue is run from an async
- *   context.
- **/
-void blk_start_queue_async(struct request_queue *q)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	queue_flag_clear(QUEUE_FLAG_STOPPED, q);
-	blk_run_queue_async(q);
-}
-EXPORT_SYMBOL(blk_start_queue_async);
-
-/**
- * blk_start_queue - restart a previously stopped queue
- * @q:    The &struct request_queue in question
- *
- * Description:
- *   blk_start_queue() will clear the stop flag on the queue, and call
- *   the request_fn for the queue if it was in a stopped state when
- *   entered. Also see blk_stop_queue().
- **/
-void blk_start_queue(struct request_queue *q)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	queue_flag_clear(QUEUE_FLAG_STOPPED, q);
-	__blk_run_queue(q);
-}
-EXPORT_SYMBOL(blk_start_queue);
-
-/**
- * blk_stop_queue - stop a queue
- * @q:    The &struct request_queue in question
- *
- * Description:
- *   The Linux block layer assumes that a block driver will consume all
- *   entries on the request queue when the request_fn strategy is called.
- *   Often this will not happen, because of hardware limitations (queue
- *   depth settings). If a device driver gets a 'queue full' response,
- *   or if it simply chooses not to queue more I/O at one point, it can
- *   call this function to prevent the request_fn from being called until
- *   the driver has signalled it's ready to go again. This happens by calling
- *   blk_start_queue() to restart queue operations.
- **/
-void blk_stop_queue(struct request_queue *q)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	cancel_delayed_work(&q->delay_work);
-	queue_flag_set(QUEUE_FLAG_STOPPED, q);
-}
-EXPORT_SYMBOL(blk_stop_queue);
-
 /**
  * blk_sync_queue - cancel any pending callbacks on a queue
  * @q: the queue
@@ -415,8 +282,6 @@ void blk_sync_queue(struct request_queue *q)
 		cancel_delayed_work_sync(&q->requeue_work);
 		queue_for_each_hw_ctx(q, hctx, i)
 			cancel_delayed_work_sync(&hctx->run_work);
-	} else {
-		cancel_delayed_work_sync(&q->delay_work);
 	}
 }
 EXPORT_SYMBOL(blk_sync_queue);
@@ -442,250 +307,12 @@ void blk_clear_pm_only(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_clear_pm_only);
 
-/**
- * __blk_run_queue_uncond - run a queue whether or not it has been stopped
- * @q:	The queue to run
- *
- * Description:
- *    Invoke request handling on a queue if there are any pending requests.
- *    May be used to restart request handling after a request has completed.
- *    This variant runs the queue whether or not the queue has been
- *    stopped. Must be called with the queue lock held and interrupts
- *    disabled. See also @blk_run_queue.
- */
-inline void __blk_run_queue_uncond(struct request_queue *q)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	if (unlikely(blk_queue_dead(q)))
-		return;
-
-	/*
-	 * Some request_fn implementations, e.g. scsi_request_fn(), unlock
-	 * the queue lock internally. As a result multiple threads may be
-	 * running such a request function concurrently. Keep track of the
-	 * number of active request_fn invocations such that blk_drain_queue()
-	 * can wait until all these request_fn calls have finished.
-	 */
-	q->request_fn_active++;
-	q->request_fn(q);
-	q->request_fn_active--;
-}
-EXPORT_SYMBOL_GPL(__blk_run_queue_uncond);
-
-/**
- * __blk_run_queue - run a single device queue
- * @q:	The queue to run
- *
- * Description:
- *    See @blk_run_queue.
- */
-void __blk_run_queue(struct request_queue *q)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	if (unlikely(blk_queue_stopped(q)))
-		return;
-
-	__blk_run_queue_uncond(q);
-}
-EXPORT_SYMBOL(__blk_run_queue);
-
-/**
- * blk_run_queue_async - run a single device queue in workqueue context
- * @q:	The queue to run
- *
- * Description:
- *    Tells kblockd to perform the equivalent of @blk_run_queue on behalf
- *    of us.
- *
- * Note:
- *    Since it is not allowed to run q->delay_work after blk_cleanup_queue()
- *    has canceled q->delay_work, callers must hold the queue lock to avoid
- *    race conditions between blk_cleanup_queue() and blk_run_queue_async().
- */
-void blk_run_queue_async(struct request_queue *q)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	if (likely(!blk_queue_stopped(q) && !blk_queue_dead(q)))
-		mod_delayed_work(kblockd_workqueue, &q->delay_work, 0);
-}
-EXPORT_SYMBOL(blk_run_queue_async);
-
-/**
- * blk_run_queue - run a single device queue
- * @q: The queue to run
- *
- * Description:
- *    Invoke request handling on this queue, if it has pending work to do.
- *    May be used to restart queueing when a request has completed.
- */
-void blk_run_queue(struct request_queue *q)
-{
-	unsigned long flags;
-
-	WARN_ON_ONCE(q->mq_ops);
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	__blk_run_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-}
-EXPORT_SYMBOL(blk_run_queue);
-
 void blk_put_queue(struct request_queue *q)
 {
 	kobject_put(&q->kobj);
 }
 EXPORT_SYMBOL(blk_put_queue);
 
-/**
- * __blk_drain_queue - drain requests from request_queue
- * @q: queue to drain
- * @drain_all: whether to drain all requests or only the ones w/ ELVPRIV
- *
- * Drain requests from @q.  If @drain_all is set, all requests are drained.
- * If not, only ELVPRIV requests are drained.  The caller is responsible
- * for ensuring that no new requests which need to be drained are queued.
- */
-static void __blk_drain_queue(struct request_queue *q, bool drain_all)
-	__releases(q->queue_lock)
-	__acquires(q->queue_lock)
-{
-	int i;
-
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	while (true) {
-		bool drain = false;
-
-		/*
-		 * The caller might be trying to drain @q before its
-		 * elevator is initialized.
-		 */
-		if (q->elevator)
-			elv_drain_elevator(q);
-
-		blkcg_drain_queue(q);
-
-		/*
-		 * This function might be called on a queue which failed
-		 * driver init after queue creation or is not yet fully
-		 * active yet.  Some drivers (e.g. fd and loop) get unhappy
-		 * in such cases.  Kick queue iff dispatch queue has
-		 * something on it and @q has request_fn set.
-		 */
-		if (!list_empty(&q->queue_head) && q->request_fn)
-			__blk_run_queue(q);
-
-		drain |= q->nr_rqs_elvpriv;
-		drain |= q->request_fn_active;
-
-		/*
-		 * Unfortunately, requests are queued at and tracked from
-		 * multiple places and there's no single counter which can
-		 * be drained.  Check all the queues and counters.
-		 */
-		if (drain_all) {
-			struct blk_flush_queue *fq = blk_get_flush_queue(q, NULL);
-			drain |= !list_empty(&q->queue_head);
-			for (i = 0; i < 2; i++) {
-				drain |= q->nr_rqs[i];
-				drain |= q->in_flight[i];
-				if (fq)
-				    drain |= !list_empty(&fq->flush_queue[i]);
-			}
-		}
-
-		if (!drain)
-			break;
-
-		spin_unlock_irq(q->queue_lock);
-
-		msleep(10);
-
-		spin_lock_irq(q->queue_lock);
-	}
-
-	/*
-	 * With queue marked dead, any woken up waiter will fail the
-	 * allocation path, so the wakeup chaining is lost and we're
-	 * left with hung waiters. We need to wake up those waiters.
-	 */
-	if (q->request_fn) {
-		struct request_list *rl;
-
-		blk_queue_for_each_rl(rl, q)
-			for (i = 0; i < ARRAY_SIZE(rl->wait); i++)
-				wake_up_all(&rl->wait[i]);
-	}
-}
-
-void blk_drain_queue(struct request_queue *q)
-{
-	spin_lock_irq(q->queue_lock);
-	__blk_drain_queue(q, true);
-	spin_unlock_irq(q->queue_lock);
-}
-
-/**
- * blk_queue_bypass_start - enter queue bypass mode
- * @q: queue of interest
- *
- * In bypass mode, only the dispatch FIFO queue of @q is used.  This
- * function makes @q enter bypass mode and drains all requests which were
- * throttled or issued before.  On return, it's guaranteed that no request
- * is being throttled or has ELVPRIV set and blk_queue_bypass() %true
- * inside queue or RCU read lock.
- */
-void blk_queue_bypass_start(struct request_queue *q)
-{
-	WARN_ON_ONCE(q->mq_ops);
-
-	spin_lock_irq(q->queue_lock);
-	q->bypass_depth++;
-	queue_flag_set(QUEUE_FLAG_BYPASS, q);
-	spin_unlock_irq(q->queue_lock);
-
-	/*
-	 * Queues start drained.  Skip actual draining till init is
-	 * complete.  This avoids lenghty delays during queue init which
-	 * can happen many times during boot.
-	 */
-	if (blk_queue_init_done(q)) {
-		spin_lock_irq(q->queue_lock);
-		__blk_drain_queue(q, false);
-		spin_unlock_irq(q->queue_lock);
-
-		/* ensure blk_queue_bypass() is %true inside RCU read lock */
-		synchronize_rcu();
-	}
-}
-EXPORT_SYMBOL_GPL(blk_queue_bypass_start);
-
-/**
- * blk_queue_bypass_end - leave queue bypass mode
- * @q: queue of interest
- *
- * Leave bypass mode and restore the normal queueing behavior.
- *
- * Note: although blk_queue_bypass_start() is only called for blk-sq queues,
- * this function is called for both blk-sq and blk-mq queues.
- */
-void blk_queue_bypass_end(struct request_queue *q)
-{
-	spin_lock_irq(q->queue_lock);
-	if (!--q->bypass_depth)
-		queue_flag_clear(QUEUE_FLAG_BYPASS, q);
-	WARN_ON_ONCE(q->bypass_depth < 0);
-	spin_unlock_irq(q->queue_lock);
-}
-EXPORT_SYMBOL_GPL(blk_queue_bypass_end);
-
 void blk_set_queue_dying(struct request_queue *q)
 {
 	blk_queue_flag_set(QUEUE_FLAG_DYING, q);
@@ -699,18 +326,6 @@ void blk_set_queue_dying(struct request_queue *q)
 
 	if (q->mq_ops)
 		blk_mq_wake_waiters(q);
-	else {
-		struct request_list *rl;
-
-		spin_lock_irq(q->queue_lock);
-		blk_queue_for_each_rl(rl, q) {
-			if (rl->rq_pool) {
-				wake_up_all(&rl->wait[BLK_RW_SYNC]);
-				wake_up_all(&rl->wait[BLK_RW_ASYNC]);
-			}
-		}
-		spin_unlock_irq(q->queue_lock);
-	}
 
 	/* Make blk_queue_enter() reexamine the DYING flag. */
 	wake_up_all(&q->mq_freeze_wq);
@@ -819,6 +434,7 @@ void blk_cleanup_queue(struct request_queue *q)
 
 	if (q->mq_ops)
 		blk_mq_free_queue(q);
+
 	percpu_ref_exit(&q->q_usage_counter);
 
 	spin_lock_irq(lock);
@@ -1010,8 +626,6 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id,
 
 	INIT_LIST_HEAD(&q->queue_head);
 	q->last_merge = NULL;
-	q->end_sector = 0;
-	q->boundary_rq = NULL;
 
 	q->id = ida_simple_get(&blk_queue_ida, 0, 0, gfp_mask);
 	if (q->id < 0)
@@ -1044,7 +658,6 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id,
 #ifdef CONFIG_BLK_CGROUP
 	INIT_LIST_HEAD(&q->blkg_list);
 #endif
-	INIT_DELAYED_WORK(&q->delay_work, blk_delay_work);
 
 	kobject_init(&q->kobj, &blk_queue_ktype);
 
@@ -1097,105 +710,6 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id,
 }
 EXPORT_SYMBOL(blk_alloc_queue_node);
 
-/**
- * blk_init_queue  - prepare a request queue for use with a block device
- * @rfn:  The function to be called to process requests that have been
- *        placed on the queue.
- * @lock: Request queue spin lock
- *
- * Description:
- *    If a block device wishes to use the standard request handling procedures,
- *    which sorts requests and coalesces adjacent requests, then it must
- *    call blk_init_queue().  The function @rfn will be called when there
- *    are requests on the queue that need to be processed.  If the device
- *    supports plugging, then @rfn may not be called immediately when requests
- *    are available on the queue, but may be called at some time later instead.
- *    Plugged queues are generally unplugged when a buffer belonging to one
- *    of the requests on the queue is needed, or due to memory pressure.
- *
- *    @rfn is not required, or even expected, to remove all requests off the
- *    queue, but only as many as it can handle at a time.  If it does leave
- *    requests on the queue, it is responsible for arranging that the requests
- *    get dealt with eventually.
- *
- *    The queue spin lock must be held while manipulating the requests on the
- *    request queue; this lock will be taken also from interrupt context, so irq
- *    disabling is needed for it.
- *
- *    Function returns a pointer to the initialized request queue, or %NULL if
- *    it didn't succeed.
- *
- * Note:
- *    blk_init_queue() must be paired with a blk_cleanup_queue() call
- *    when the block device is deactivated (such as at module unload).
- **/
-
-struct request_queue *blk_init_queue(request_fn_proc *rfn, spinlock_t *lock)
-{
-	return blk_init_queue_node(rfn, lock, NUMA_NO_NODE);
-}
-EXPORT_SYMBOL(blk_init_queue);
-
-struct request_queue *
-blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
-{
-	struct request_queue *q;
-
-	q = blk_alloc_queue_node(GFP_KERNEL, node_id, lock);
-	if (!q)
-		return NULL;
-
-	q->request_fn = rfn;
-	if (blk_init_allocated_queue(q) < 0) {
-		blk_cleanup_queue(q);
-		return NULL;
-	}
-
-	return q;
-}
-EXPORT_SYMBOL(blk_init_queue_node);
-
-static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio);
-
-
-int blk_init_allocated_queue(struct request_queue *q)
-{
-	WARN_ON_ONCE(q->mq_ops);
-
-	q->fq = blk_alloc_flush_queue(q, NUMA_NO_NODE, q->cmd_size, GFP_KERNEL);
-	if (!q->fq)
-		return -ENOMEM;
-
-	if (q->init_rq_fn && q->init_rq_fn(q, q->fq->flush_rq, GFP_KERNEL))
-		goto out_free_flush_queue;
-
-	if (blk_init_rl(&q->root_rl, q, GFP_KERNEL))
-		goto out_exit_flush_rq;
-
-	INIT_WORK(&q->timeout_work, blk_timeout_work);
-	q->queue_flags		|= QUEUE_FLAG_DEFAULT;
-
-	/*
-	 * This also sets hw/phys segments, boundary and size
-	 */
-	blk_queue_make_request(q, blk_queue_bio);
-
-	q->sg_reserved_size = INT_MAX;
-
-	if (elevator_init(q))
-		goto out_exit_flush_rq;
-	return 0;
-
-out_exit_flush_rq:
-	if (q->exit_rq_fn)
-		q->exit_rq_fn(q, q->fq->flush_rq);
-out_free_flush_queue:
-	blk_free_flush_queue(q->fq);
-	q->fq = NULL;
-	return -ENOMEM;
-}
-EXPORT_SYMBOL(blk_init_allocated_queue);
-
 bool blk_get_queue(struct request_queue *q)
 {
 	if (likely(!blk_queue_dying(q))) {
@@ -1207,477 +721,38 @@ bool blk_get_queue(struct request_queue *q)
 }
 EXPORT_SYMBOL(blk_get_queue);
 
-static inline void blk_free_request(struct request_list *rl, struct request *rq)
-{
-	if (rq->rq_flags & RQF_ELVPRIV) {
-		elv_put_request(rl->q, rq);
-		if (rq->elv.icq)
-			put_io_context(rq->elv.icq->ioc);
-	}
-
-	mempool_free(rq, rl->rq_pool);
-}
-
-/*
- * ioc_batching returns true if the ioc is a valid batching request and
- * should be given priority access to a request.
+/**
+ * blk_get_request - allocate a request
+ * @q: request queue to allocate a request for
+ * @op: operation (REQ_OP_*) and REQ_* flags, e.g. REQ_SYNC.
+ * @flags: BLK_MQ_REQ_* flags, e.g. BLK_MQ_REQ_NOWAIT.
  */
-static inline int ioc_batching(struct request_queue *q, struct io_context *ioc)
+struct request *blk_get_request(struct request_queue *q, unsigned int op,
+				blk_mq_req_flags_t flags)
 {
-	if (!ioc)
-		return 0;
+	struct request *req;
 
-	/*
-	 * Make sure the process is able to allocate at least 1 request
-	 * even if the batch times out, otherwise we could theoretically
-	 * lose wakeups.
-	 */
-	return ioc->nr_batch_requests == q->nr_batching ||
-		(ioc->nr_batch_requests > 0
-		&& time_before(jiffies, ioc->last_waited + BLK_BATCH_TIME));
-}
+	WARN_ON_ONCE(op & REQ_NOWAIT);
+	WARN_ON_ONCE(flags & ~(BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_PREEMPT));
 
-/*
- * ioc_set_batching sets ioc to be a new "batcher" if it is not one. This
- * will cause the process to be a "batcher" on all queues in the system. This
- * is the behaviour we want though - once it gets a wakeup it should be given
- * a nice run.
- */
-static void ioc_set_batching(struct request_queue *q, struct io_context *ioc)
-{
-	if (!ioc || ioc_batching(q, ioc))
-		return;
+	req = blk_mq_alloc_request(q, op, flags);
+	if (!IS_ERR(req) && q->mq_ops->initialize_rq_fn)
+		q->mq_ops->initialize_rq_fn(req);
 
-	ioc->nr_batch_requests = q->nr_batching;
-	ioc->last_waited = jiffies;
+	return req;
 }
+EXPORT_SYMBOL(blk_get_request);
 
-static void __freed_request(struct request_list *rl, int sync)
+static void part_round_stats_single(struct request_queue *q, int cpu,
+				    struct hd_struct *part, unsigned long now,
+				    unsigned int inflight)
 {
-	struct request_queue *q = rl->q;
-
-	if (rl->count[sync] < queue_congestion_off_threshold(q))
-		blk_clear_congested(rl, sync);
-
-	if (rl->count[sync] + 1 <= q->nr_requests) {
-		if (waitqueue_active(&rl->wait[sync]))
-			wake_up(&rl->wait[sync]);
-
-		blk_clear_rl_full(rl, sync);
+	if (inflight) {
+		__part_stat_add(cpu, part, time_in_queue,
+				inflight * (now - part->stamp));
+		__part_stat_add(cpu, part, io_ticks, (now - part->stamp));
 	}
-}
-
-/*
- * A request has just been released.  Account for it, update the full and
- * congestion status, wake up any waiters.   Called under q->queue_lock.
- */
-static void freed_request(struct request_list *rl, bool sync,
-		req_flags_t rq_flags)
-{
-	struct request_queue *q = rl->q;
-
-	q->nr_rqs[sync]--;
-	rl->count[sync]--;
-	if (rq_flags & RQF_ELVPRIV)
-		q->nr_rqs_elvpriv--;
-
-	__freed_request(rl, sync);
-
-	if (unlikely(rl->starved[sync ^ 1]))
-		__freed_request(rl, sync ^ 1);
-}
-
-int blk_update_nr_requests(struct request_queue *q, unsigned int nr)
-{
-	struct request_list *rl;
-	int on_thresh, off_thresh;
-
-	WARN_ON_ONCE(q->mq_ops);
-
-	spin_lock_irq(q->queue_lock);
-	q->nr_requests = nr;
-	blk_queue_congestion_threshold(q);
-	on_thresh = queue_congestion_on_threshold(q);
-	off_thresh = queue_congestion_off_threshold(q);
-
-	blk_queue_for_each_rl(rl, q) {
-		if (rl->count[BLK_RW_SYNC] >= on_thresh)
-			blk_set_congested(rl, BLK_RW_SYNC);
-		else if (rl->count[BLK_RW_SYNC] < off_thresh)
-			blk_clear_congested(rl, BLK_RW_SYNC);
-
-		if (rl->count[BLK_RW_ASYNC] >= on_thresh)
-			blk_set_congested(rl, BLK_RW_ASYNC);
-		else if (rl->count[BLK_RW_ASYNC] < off_thresh)
-			blk_clear_congested(rl, BLK_RW_ASYNC);
-
-		if (rl->count[BLK_RW_SYNC] >= q->nr_requests) {
-			blk_set_rl_full(rl, BLK_RW_SYNC);
-		} else {
-			blk_clear_rl_full(rl, BLK_RW_SYNC);
-			wake_up(&rl->wait[BLK_RW_SYNC]);
-		}
-
-		if (rl->count[BLK_RW_ASYNC] >= q->nr_requests) {
-			blk_set_rl_full(rl, BLK_RW_ASYNC);
-		} else {
-			blk_clear_rl_full(rl, BLK_RW_ASYNC);
-			wake_up(&rl->wait[BLK_RW_ASYNC]);
-		}
-	}
-
-	spin_unlock_irq(q->queue_lock);
-	return 0;
-}
-
-/**
- * __get_request - get a free request
- * @rl: request list to allocate from
- * @op: operation and flags
- * @bio: bio to allocate request for (can be %NULL)
- * @flags: BLQ_MQ_REQ_* flags
- * @gfp_mask: allocator flags
- *
- * Get a free request from @q.  This function may fail under memory
- * pressure or if @q is dead.
- *
- * Must be called with @q->queue_lock held and,
- * Returns ERR_PTR on failure, with @q->queue_lock held.
- * Returns request pointer on success, with @q->queue_lock *not held*.
- */
-static struct request *__get_request(struct request_list *rl, unsigned int op,
-		struct bio *bio, blk_mq_req_flags_t flags, gfp_t gfp_mask)
-{
-	struct request_queue *q = rl->q;
-	struct request *rq;
-	struct elevator_type *et = q->elevator->type;
-	struct io_context *ioc = rq_ioc(bio);
-	struct io_cq *icq = NULL;
-	const bool is_sync = op_is_sync(op);
-	int may_queue;
-	req_flags_t rq_flags = RQF_ALLOCED;
-
-	lockdep_assert_held(q->queue_lock);
-
-	if (unlikely(blk_queue_dying(q)))
-		return ERR_PTR(-ENODEV);
-
-	may_queue = elv_may_queue(q, op);
-	if (may_queue == ELV_MQUEUE_NO)
-		goto rq_starved;
-
-	if (rl->count[is_sync]+1 >= queue_congestion_on_threshold(q)) {
-		if (rl->count[is_sync]+1 >= q->nr_requests) {
-			/*
-			 * The queue will fill after this allocation, so set
-			 * it as full, and mark this process as "batching".
-			 * This process will be allowed to complete a batch of
-			 * requests, others will be blocked.
-			 */
-			if (!blk_rl_full(rl, is_sync)) {
-				ioc_set_batching(q, ioc);
-				blk_set_rl_full(rl, is_sync);
-			} else {
-				if (may_queue != ELV_MQUEUE_MUST
-						&& !ioc_batching(q, ioc)) {
-					/*
-					 * The queue is full and the allocating
-					 * process is not a "batcher", and not
-					 * exempted by the IO scheduler
-					 */
-					return ERR_PTR(-ENOMEM);
-				}
-			}
-		}
-		blk_set_congested(rl, is_sync);
-	}
-
-	/*
-	 * Only allow batching queuers to allocate up to 50% over the defined
-	 * limit of requests, otherwise we could have thousands of requests
-	 * allocated with any setting of ->nr_requests
-	 */
-	if (rl->count[is_sync] >= (3 * q->nr_requests / 2))
-		return ERR_PTR(-ENOMEM);
-
-	q->nr_rqs[is_sync]++;
-	rl->count[is_sync]++;
-	rl->starved[is_sync] = 0;
-
-	/*
-	 * Decide whether the new request will be managed by elevator.  If
-	 * so, mark @rq_flags and increment elvpriv.  Non-zero elvpriv will
-	 * prevent the current elevator from being destroyed until the new
-	 * request is freed.  This guarantees icq's won't be destroyed and
-	 * makes creating new ones safe.
-	 *
-	 * Flush requests do not use the elevator so skip initialization.
-	 * This allows a request to share the flush and elevator data.
-	 *
-	 * Also, lookup icq while holding queue_lock.  If it doesn't exist,
-	 * it will be created after releasing queue_lock.
-	 */
-	if (!op_is_flush(op) && !blk_queue_bypass(q)) {
-		rq_flags |= RQF_ELVPRIV;
-		q->nr_rqs_elvpriv++;
-		if (et->icq_cache && ioc)
-			icq = ioc_lookup_icq(ioc, q);
-	}
-
-	if (blk_queue_io_stat(q))
-		rq_flags |= RQF_IO_STAT;
-	spin_unlock_irq(q->queue_lock);
-
-	/* allocate and init request */
-	rq = mempool_alloc(rl->rq_pool, gfp_mask);
-	if (!rq)
-		goto fail_alloc;
-
-	blk_rq_init(q, rq);
-	blk_rq_set_rl(rq, rl);
-	rq->cmd_flags = op;
-	rq->rq_flags = rq_flags;
-	if (flags & BLK_MQ_REQ_PREEMPT)
-		rq->rq_flags |= RQF_PREEMPT;
-
-	/* init elvpriv */
-	if (rq_flags & RQF_ELVPRIV) {
-		if (unlikely(et->icq_cache && !icq)) {
-			if (ioc)
-				icq = ioc_create_icq(ioc, q, gfp_mask);
-			if (!icq)
-				goto fail_elvpriv;
-		}
-
-		rq->elv.icq = icq;
-		if (unlikely(elv_set_request(q, rq, bio, gfp_mask)))
-			goto fail_elvpriv;
-
-		/* @rq->elv.icq holds io_context until @rq is freed */
-		if (icq)
-			get_io_context(icq->ioc);
-	}
-out:
-	/*
-	 * ioc may be NULL here, and ioc_batching will be false. That's
-	 * OK, if the queue is under the request limit then requests need
-	 * not count toward the nr_batch_requests limit. There will always
-	 * be some limit enforced by BLK_BATCH_TIME.
-	 */
-	if (ioc_batching(q, ioc))
-		ioc->nr_batch_requests--;
-
-	trace_block_getrq(q, bio, op);
-	return rq;
-
-fail_elvpriv:
-	/*
-	 * elvpriv init failed.  ioc, icq and elvpriv aren't mempool backed
-	 * and may fail indefinitely under memory pressure and thus
-	 * shouldn't stall IO.  Treat this request as !elvpriv.  This will
-	 * disturb iosched and blkcg but weird is bettern than dead.
-	 */
-	printk_ratelimited(KERN_WARNING "%s: dev %s: request aux data allocation failed, iosched may be disturbed\n",
-			   __func__, dev_name(q->backing_dev_info->dev));
-
-	rq->rq_flags &= ~RQF_ELVPRIV;
-	rq->elv.icq = NULL;
-
-	spin_lock_irq(q->queue_lock);
-	q->nr_rqs_elvpriv--;
-	spin_unlock_irq(q->queue_lock);
-	goto out;
-
-fail_alloc:
-	/*
-	 * Allocation failed presumably due to memory. Undo anything we
-	 * might have messed up.
-	 *
-	 * Allocating task should really be put onto the front of the wait
-	 * queue, but this is pretty rare.
-	 */
-	spin_lock_irq(q->queue_lock);
-	freed_request(rl, is_sync, rq_flags);
-
-	/*
-	 * in the very unlikely event that allocation failed and no
-	 * requests for this direction was pending, mark us starved so that
-	 * freeing of a request in the other direction will notice
-	 * us. another possible fix would be to split the rq mempool into
-	 * READ and WRITE
-	 */
-rq_starved:
-	if (unlikely(rl->count[is_sync] == 0))
-		rl->starved[is_sync] = 1;
-	return ERR_PTR(-ENOMEM);
-}
-
-/**
- * get_request - get a free request
- * @q: request_queue to allocate request from
- * @op: operation and flags
- * @bio: bio to allocate request for (can be %NULL)
- * @flags: BLK_MQ_REQ_* flags.
- * @gfp: allocator flags
- *
- * Get a free request from @q.  If %BLK_MQ_REQ_NOWAIT is set in @flags,
- * this function keeps retrying under memory pressure and fails iff @q is dead.
- *
- * Must be called with @q->queue_lock held and,
- * Returns ERR_PTR on failure, with @q->queue_lock held.
- * Returns request pointer on success, with @q->queue_lock *not held*.
- */
-static struct request *get_request(struct request_queue *q, unsigned int op,
-		struct bio *bio, blk_mq_req_flags_t flags, gfp_t gfp)
-{
-	const bool is_sync = op_is_sync(op);
-	DEFINE_WAIT(wait);
-	struct request_list *rl;
-	struct request *rq;
-
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	rl = blk_get_rl(q, bio);	/* transferred to @rq on success */
-retry:
-	rq = __get_request(rl, op, bio, flags, gfp);
-	if (!IS_ERR(rq))
-		return rq;
-
-	if (op & REQ_NOWAIT) {
-		blk_put_rl(rl);
-		return ERR_PTR(-EAGAIN);
-	}
-
-	if ((flags & BLK_MQ_REQ_NOWAIT) || unlikely(blk_queue_dying(q))) {
-		blk_put_rl(rl);
-		return rq;
-	}
-
-	/* wait on @rl and retry */
-	prepare_to_wait_exclusive(&rl->wait[is_sync], &wait,
-				  TASK_UNINTERRUPTIBLE);
-
-	trace_block_sleeprq(q, bio, op);
-
-	spin_unlock_irq(q->queue_lock);
-	io_schedule();
-
-	/*
-	 * After sleeping, we become a "batching" process and will be able
-	 * to allocate at least one request, and up to a big batch of them
-	 * for a small period time.  See ioc_batching, ioc_set_batching
-	 */
-	ioc_set_batching(q, current->io_context);
-
-	spin_lock_irq(q->queue_lock);
-	finish_wait(&rl->wait[is_sync], &wait);
-
-	goto retry;
-}
-
-/* flags: BLK_MQ_REQ_PREEMPT and/or BLK_MQ_REQ_NOWAIT. */
-static struct request *blk_old_get_request(struct request_queue *q,
-				unsigned int op, blk_mq_req_flags_t flags)
-{
-	struct request *rq;
-	gfp_t gfp_mask = flags & BLK_MQ_REQ_NOWAIT ? GFP_ATOMIC : GFP_NOIO;
-	int ret = 0;
-
-	WARN_ON_ONCE(q->mq_ops);
-
-	/* create ioc upfront */
-	create_io_context(gfp_mask, q->node);
-
-	ret = blk_queue_enter(q, flags);
-	if (ret)
-		return ERR_PTR(ret);
-	spin_lock_irq(q->queue_lock);
-	rq = get_request(q, op, NULL, flags, gfp_mask);
-	if (IS_ERR(rq)) {
-		spin_unlock_irq(q->queue_lock);
-		blk_queue_exit(q);
-		return rq;
-	}
-
-	/* q->queue_lock is unlocked at this point */
-	rq->__data_len = 0;
-	rq->__sector = (sector_t) -1;
-	rq->bio = rq->biotail = NULL;
-	return rq;
-}
-
-/**
- * blk_get_request - allocate a request
- * @q: request queue to allocate a request for
- * @op: operation (REQ_OP_*) and REQ_* flags, e.g. REQ_SYNC.
- * @flags: BLK_MQ_REQ_* flags, e.g. BLK_MQ_REQ_NOWAIT.
- */
-struct request *blk_get_request(struct request_queue *q, unsigned int op,
-				blk_mq_req_flags_t flags)
-{
-	struct request *req;
-
-	WARN_ON_ONCE(op & REQ_NOWAIT);
-	WARN_ON_ONCE(flags & ~(BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_PREEMPT));
-
-	if (q->mq_ops) {
-		req = blk_mq_alloc_request(q, op, flags);
-		if (!IS_ERR(req) && q->mq_ops->initialize_rq_fn)
-			q->mq_ops->initialize_rq_fn(req);
-	} else {
-		req = blk_old_get_request(q, op, flags);
-		if (!IS_ERR(req) && q->initialize_rq_fn)
-			q->initialize_rq_fn(req);
-	}
-
-	return req;
-}
-EXPORT_SYMBOL(blk_get_request);
-
-/**
- * blk_requeue_request - put a request back on queue
- * @q:		request queue where request should be inserted
- * @rq:		request to be inserted
- *
- * Description:
- *    Drivers often keep queueing requests until the hardware cannot accept
- *    more, when that condition happens we need to put the request back
- *    on the queue. Must be called with queue lock held.
- */
-void blk_requeue_request(struct request_queue *q, struct request *rq)
-{
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	blk_delete_timer(rq);
-	blk_clear_rq_complete(rq);
-	trace_block_rq_requeue(q, rq);
-	rq_qos_requeue(q, rq);
-
-	BUG_ON(blk_queued_rq(rq));
-
-	elv_requeue_request(q, rq);
-}
-EXPORT_SYMBOL(blk_requeue_request);
-
-static void add_acct_request(struct request_queue *q, struct request *rq,
-			     int where)
-{
-	blk_account_io_start(rq, true);
-	__elv_add_request(q, rq, where);
-}
-
-static void part_round_stats_single(struct request_queue *q, int cpu,
-				    struct hd_struct *part, unsigned long now,
-				    unsigned int inflight)
-{
-	if (inflight) {
-		__part_stat_add(cpu, part, time_in_queue,
-				inflight * (now - part->stamp));
-		__part_stat_add(cpu, part, io_ticks, (now - part->stamp));
-	}
-	part->stamp = now;
+	part->stamp = now;
 }
 
 /**
@@ -1727,61 +802,16 @@ EXPORT_SYMBOL_GPL(part_round_stats);
 
 void __blk_put_request(struct request_queue *q, struct request *req)
 {
-	req_flags_t rq_flags = req->rq_flags;
-
 	if (unlikely(!q))
 		return;
 
-	if (q->mq_ops) {
-		blk_mq_free_request(req);
-		return;
-	}
-
-	lockdep_assert_held(q->queue_lock);
-
-	blk_req_zone_write_unlock(req);
-	blk_pm_put_request(req);
-	blk_pm_mark_last_busy(req);
-
-	elv_completed_request(q, req);
-
-	/* this is a bio leak */
-	WARN_ON(req->bio != NULL);
-
-	rq_qos_done(q, req);
-
-	/*
-	 * Request may not have originated from ll_rw_blk. if not,
-	 * it didn't come out of our reserved rq pools
-	 */
-	if (rq_flags & RQF_ALLOCED) {
-		struct request_list *rl = blk_rq_rl(req);
-		bool sync = op_is_sync(req->cmd_flags);
-
-		BUG_ON(!list_empty(&req->queuelist));
-		BUG_ON(ELV_ON_HASH(req));
-
-		blk_free_request(rl, req);
-		freed_request(rl, sync, rq_flags);
-		blk_put_rl(rl);
-		blk_queue_exit(q);
-	}
+	blk_mq_free_request(req);
 }
 EXPORT_SYMBOL_GPL(__blk_put_request);
 
 void blk_put_request(struct request *req)
 {
-	struct request_queue *q = req->q;
-
-	if (q->mq_ops)
-		blk_mq_free_request(req);
-	else {
-		unsigned long flags;
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		__blk_put_request(q, req);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
+	blk_mq_free_request(req);
 }
 EXPORT_SYMBOL(blk_put_request);
 
@@ -1890,10 +920,7 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 		return false;
 	*request_count = 0;
 
-	if (q->mq_ops)
-		plug_list = &plug->mq_list;
-	else
-		plug_list = &plug->list;
+	plug_list = &plug->mq_list;
 
 	list_for_each_entry_reverse(rq, plug_list, queuelist) {
 		bool merged = false;
@@ -1944,11 +971,7 @@ unsigned int blk_plug_queued_count(struct request_queue *q)
 	if (!plug)
 		goto out;
 
-	if (q->mq_ops)
-		plug_list = &plug->mq_list;
-	else
-		plug_list = &plug->list;
-
+	plug_list = &plug->mq_list;
 	list_for_each_entry(rq, plug_list, queuelist) {
 		if (rq->q == q)
 			ret++;
@@ -1976,133 +999,6 @@ void blk_init_request_from_bio(struct request *req, struct bio *bio)
 }
 EXPORT_SYMBOL_GPL(blk_init_request_from_bio);
 
-static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
-{
-	struct blk_plug *plug;
-	int where = ELEVATOR_INSERT_SORT;
-	struct request *req, *free;
-	unsigned int request_count = 0;
-
-	/*
-	 * low level driver can indicate that it wants pages above a
-	 * certain limit bounced to low memory (ie for highmem, or even
-	 * ISA dma in theory)
-	 */
-	blk_queue_bounce(q, &bio);
-
-	blk_queue_split(q, &bio);
-
-	if (!bio_integrity_prep(bio))
-		return BLK_QC_T_NONE;
-
-	if (op_is_flush(bio->bi_opf)) {
-		spin_lock_irq(q->queue_lock);
-		where = ELEVATOR_INSERT_FLUSH;
-		goto get_rq;
-	}
-
-	/*
-	 * Check if we can merge with the plugged list before grabbing
-	 * any locks.
-	 */
-	if (!blk_queue_nomerges(q)) {
-		if (blk_attempt_plug_merge(q, bio, &request_count, NULL))
-			return BLK_QC_T_NONE;
-	} else
-		request_count = blk_plug_queued_count(q);
-
-	spin_lock_irq(q->queue_lock);
-
-	switch (elv_merge(q, &req, bio)) {
-	case ELEVATOR_BACK_MERGE:
-		if (!bio_attempt_back_merge(q, req, bio))
-			break;
-		elv_bio_merged(q, req, bio);
-		free = attempt_back_merge(q, req);
-		if (free)
-			__blk_put_request(q, free);
-		else
-			elv_merged_request(q, req, ELEVATOR_BACK_MERGE);
-		goto out_unlock;
-	case ELEVATOR_FRONT_MERGE:
-		if (!bio_attempt_front_merge(q, req, bio))
-			break;
-		elv_bio_merged(q, req, bio);
-		free = attempt_front_merge(q, req);
-		if (free)
-			__blk_put_request(q, free);
-		else
-			elv_merged_request(q, req, ELEVATOR_FRONT_MERGE);
-		goto out_unlock;
-	default:
-		break;
-	}
-
-get_rq:
-	rq_qos_throttle(q, bio, q->queue_lock);
-
-	/*
-	 * Grab a free request. This is might sleep but can not fail.
-	 * Returns with the queue unlocked.
-	 */
-	blk_queue_enter_live(q);
-	req = get_request(q, bio->bi_opf, bio, 0, GFP_NOIO);
-	if (IS_ERR(req)) {
-		blk_queue_exit(q);
-		rq_qos_cleanup(q, bio);
-		if (PTR_ERR(req) == -ENOMEM)
-			bio->bi_status = BLK_STS_RESOURCE;
-		else
-			bio->bi_status = BLK_STS_IOERR;
-		bio_endio(bio);
-		goto out_unlock;
-	}
-
-	rq_qos_track(q, req, bio);
-
-	/*
-	 * After dropping the lock and possibly sleeping here, our request
-	 * may now be mergeable after it had proven unmergeable (above).
-	 * We don't worry about that case for efficiency. It won't happen
-	 * often, and the elevators are able to handle it.
-	 */
-	blk_init_request_from_bio(req, bio);
-
-	if (test_bit(QUEUE_FLAG_SAME_COMP, &q->queue_flags))
-		req->cpu = raw_smp_processor_id();
-
-	plug = current->plug;
-	if (plug) {
-		/*
-		 * If this is the first request added after a plug, fire
-		 * of a plug trace.
-		 *
-		 * @request_count may become stale because of schedule
-		 * out, so check plug list again.
-		 */
-		if (!request_count || list_empty(&plug->list))
-			trace_block_plug(q);
-		else {
-			struct request *last = list_entry_rq(plug->list.prev);
-			if (request_count >= BLK_MAX_REQUEST_COUNT ||
-			    blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE) {
-				blk_flush_plug_list(plug, false);
-				trace_block_plug(q);
-			}
-		}
-		list_add_tail(&req->queuelist, &plug->list);
-		blk_account_io_start(req, true);
-	} else {
-		spin_lock_irq(q->queue_lock);
-		add_acct_request(q, req, where);
-		__blk_run_queue(q);
-out_unlock:
-		spin_unlock_irq(q->queue_lock);
-	}
-
-	return BLK_QC_T_NONE;
-}
-
 static void handle_bad_sector(struct bio *bio, sector_t maxsector)
 {
 	char b[BDEVNAME_SIZE];
@@ -2615,9 +1511,6 @@ static int blk_cloned_rq_check_limits(struct request_queue *q,
  */
 blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 {
-	unsigned long flags;
-	int where = ELEVATOR_INSERT_BACK;
-
 	if (blk_cloned_rq_check_limits(q, rq))
 		return BLK_STS_IOERR;
 
@@ -2625,38 +1518,15 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 	    should_fail_request(&rq->rq_disk->part0, blk_rq_bytes(rq)))
 		return BLK_STS_IOERR;
 
-	if (q->mq_ops) {
-		if (blk_queue_io_stat(q))
-			blk_account_io_start(rq, true);
-		/*
-		 * Since we have a scheduler attached on the top device,
-		 * bypass a potential scheduler on the bottom device for
-		 * insert.
-		 */
-		return blk_mq_request_issue_directly(rq);
-	}
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	if (unlikely(blk_queue_dying(q))) {
-		spin_unlock_irqrestore(q->queue_lock, flags);
-		return BLK_STS_IOERR;
-	}
+	if (blk_queue_io_stat(q))
+		blk_account_io_start(rq, true);
 
 	/*
-	 * Submitting request must be dequeued before calling this function
-	 * because it will be linked to another request_queue
+	 * Since we have a scheduler attached on the top device,
+	 * bypass a potential scheduler on the bottom device for
+	 * insert.
 	 */
-	BUG_ON(blk_queued_rq(rq));
-
-	if (op_is_flush(rq->cmd_flags))
-		where = ELEVATOR_INSERT_FLUSH;
-
-	add_acct_request(q, rq, where);
-	if (where == ELEVATOR_INSERT_FLUSH)
-		__blk_run_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-
-	return BLK_STS_OK;
+	return blk_mq_request_issue_directly(rq);
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
@@ -2776,225 +1646,6 @@ void blk_account_io_start(struct request *rq, bool new_io)
 	part_stat_unlock();
 }
 
-static struct request *elv_next_request(struct request_queue *q)
-{
-	struct request *rq;
-	struct blk_flush_queue *fq = blk_get_flush_queue(q, NULL);
-
-	WARN_ON_ONCE(q->mq_ops);
-
-	while (1) {
-		list_for_each_entry(rq, &q->queue_head, queuelist) {
-#ifdef CONFIG_PM
-			/*
-			 * If a request gets queued in state RPM_SUSPENDED
-			 * then that's a kernel bug.
-			 */
-			WARN_ON_ONCE(q->rpm_status == RPM_SUSPENDED);
-#endif
-			return rq;
-		}
-
-		/*
-		 * Flush request is running and flush request isn't queueable
-		 * in the drive, we can hold the queue till flush request is
-		 * finished. Even we don't do this, driver can't dispatch next
-		 * requests and will requeue them. And this can improve
-		 * throughput too. For example, we have request flush1, write1,
-		 * flush 2. flush1 is dispatched, then queue is hold, write1
-		 * isn't inserted to queue. After flush1 is finished, flush2
-		 * will be dispatched. Since disk cache is already clean,
-		 * flush2 will be finished very soon, so looks like flush2 is
-		 * folded to flush1.
-		 * Since the queue is hold, a flag is set to indicate the queue
-		 * should be restarted later. Please see flush_end_io() for
-		 * details.
-		 */
-		if (fq->flush_pending_idx != fq->flush_running_idx &&
-				!queue_flush_queueable(q)) {
-			fq->flush_queue_delayed = 1;
-			return NULL;
-		}
-		if (unlikely(blk_queue_bypass(q)) ||
-		    !q->elevator->type->ops.sq.elevator_dispatch_fn(q, 0))
-			return NULL;
-	}
-}
-
-/**
- * blk_peek_request - peek at the top of a request queue
- * @q: request queue to peek at
- *
- * Description:
- *     Return the request at the top of @q.  The returned request
- *     should be started using blk_start_request() before LLD starts
- *     processing it.
- *
- * Return:
- *     Pointer to the request at the top of @q if available.  Null
- *     otherwise.
- */
-struct request *blk_peek_request(struct request_queue *q)
-{
-	struct request *rq;
-	int ret;
-
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	while ((rq = elv_next_request(q)) != NULL) {
-		if (!(rq->rq_flags & RQF_STARTED)) {
-			/*
-			 * This is the first time the device driver
-			 * sees this request (possibly after
-			 * requeueing).  Notify IO scheduler.
-			 */
-			if (rq->rq_flags & RQF_SORTED)
-				elv_activate_rq(q, rq);
-
-			/*
-			 * just mark as started even if we don't start
-			 * it, a request that has been delayed should
-			 * not be passed by new incoming requests
-			 */
-			rq->rq_flags |= RQF_STARTED;
-			trace_block_rq_issue(q, rq);
-		}
-
-		if (!q->boundary_rq || q->boundary_rq == rq) {
-			q->end_sector = rq_end_sector(rq);
-			q->boundary_rq = NULL;
-		}
-
-		if (rq->rq_flags & RQF_DONTPREP)
-			break;
-
-		if (q->dma_drain_size && blk_rq_bytes(rq)) {
-			/*
-			 * make sure space for the drain appears we
-			 * know we can do this because max_hw_segments
-			 * has been adjusted to be one fewer than the
-			 * device can handle
-			 */
-			rq->nr_phys_segments++;
-		}
-
-		if (!q->prep_rq_fn)
-			break;
-
-		ret = q->prep_rq_fn(q, rq);
-		if (ret == BLKPREP_OK) {
-			break;
-		} else if (ret == BLKPREP_DEFER) {
-			/*
-			 * the request may have been (partially) prepped.
-			 * we need to keep this request in the front to
-			 * avoid resource deadlock.  RQF_STARTED will
-			 * prevent other fs requests from passing this one.
-			 */
-			if (q->dma_drain_size && blk_rq_bytes(rq) &&
-			    !(rq->rq_flags & RQF_DONTPREP)) {
-				/*
-				 * remove the space for the drain we added
-				 * so that we don't add it again
-				 */
-				--rq->nr_phys_segments;
-			}
-
-			rq = NULL;
-			break;
-		} else if (ret == BLKPREP_KILL || ret == BLKPREP_INVALID) {
-			rq->rq_flags |= RQF_QUIET;
-			/*
-			 * Mark this request as started so we don't trigger
-			 * any debug logic in the end I/O path.
-			 */
-			blk_start_request(rq);
-			__blk_end_request_all(rq, ret == BLKPREP_INVALID ?
-					BLK_STS_TARGET : BLK_STS_IOERR);
-		} else {
-			printk(KERN_ERR "%s: bad return=%d\n", __func__, ret);
-			break;
-		}
-	}
-
-	return rq;
-}
-EXPORT_SYMBOL(blk_peek_request);
-
-static void blk_dequeue_request(struct request *rq)
-{
-	struct request_queue *q = rq->q;
-
-	BUG_ON(list_empty(&rq->queuelist));
-	BUG_ON(ELV_ON_HASH(rq));
-
-	list_del_init(&rq->queuelist);
-
-	/*
-	 * the time frame between a request being removed from the lists
-	 * and to it is freed is accounted as io that is in progress at
-	 * the driver side.
-	 */
-	if (blk_account_rq(rq))
-		q->in_flight[rq_is_sync(rq)]++;
-}
-
-/**
- * blk_start_request - start request processing on the driver
- * @req: request to dequeue
- *
- * Description:
- *     Dequeue @req and start timeout timer on it.  This hands off the
- *     request to the driver.
- */
-void blk_start_request(struct request *req)
-{
-	lockdep_assert_held(req->q->queue_lock);
-	WARN_ON_ONCE(req->q->mq_ops);
-
-	blk_dequeue_request(req);
-
-	if (test_bit(QUEUE_FLAG_STATS, &req->q->queue_flags)) {
-		req->io_start_time_ns = ktime_get_ns();
-#ifdef CONFIG_BLK_DEV_THROTTLING_LOW
-		req->throtl_size = blk_rq_sectors(req);
-#endif
-		req->rq_flags |= RQF_STATS;
-		rq_qos_issue(req->q, req);
-	}
-
-	BUG_ON(blk_rq_is_complete(req));
-	blk_add_timer(req);
-}
-EXPORT_SYMBOL(blk_start_request);
-
-/**
- * blk_fetch_request - fetch a request from a request queue
- * @q: request queue to fetch a request from
- *
- * Description:
- *     Return the request at the top of @q.  The request is started on
- *     return and LLD can start processing it immediately.
- *
- * Return:
- *     Pointer to the request at the top of @q if available.  Null
- *     otherwise.
- */
-struct request *blk_fetch_request(struct request_queue *q)
-{
-	struct request *rq;
-
-	lockdep_assert_held(q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	rq = blk_peek_request(q);
-	if (rq)
-		blk_start_request(rq);
-	return rq;
-}
-EXPORT_SYMBOL(blk_fetch_request);
-
 /*
  * Steal bios from a request and add them to a bio list.
  * The request must not have been partially completed before.
@@ -3120,252 +1771,6 @@ bool blk_update_request(struct request *req, blk_status_t error,
 }
 EXPORT_SYMBOL_GPL(blk_update_request);
 
-static bool blk_update_bidi_request(struct request *rq, blk_status_t error,
-				    unsigned int nr_bytes,
-				    unsigned int bidi_bytes)
-{
-	if (blk_update_request(rq, error, nr_bytes))
-		return true;
-
-	/* Bidi request must be completed as a whole */
-	if (unlikely(blk_bidi_rq(rq)) &&
-	    blk_update_request(rq->next_rq, error, bidi_bytes))
-		return true;
-
-	if (blk_queue_add_random(rq->q))
-		add_disk_randomness(rq->rq_disk);
-
-	return false;
-}
-
-/**
- * blk_unprep_request - unprepare a request
- * @req:	the request
- *
- * This function makes a request ready for complete resubmission (or
- * completion).  It happens only after all error handling is complete,
- * so represents the appropriate moment to deallocate any resources
- * that were allocated to the request in the prep_rq_fn.  The queue
- * lock is held when calling this.
- */
-void blk_unprep_request(struct request *req)
-{
-	struct request_queue *q = req->q;
-
-	req->rq_flags &= ~RQF_DONTPREP;
-	if (q->unprep_rq_fn)
-		q->unprep_rq_fn(q, req);
-}
-EXPORT_SYMBOL_GPL(blk_unprep_request);
-
-void blk_finish_request(struct request *req, blk_status_t error)
-{
-	struct request_queue *q = req->q;
-	u64 now = ktime_get_ns();
-
-	lockdep_assert_held(req->q->queue_lock);
-	WARN_ON_ONCE(q->mq_ops);
-
-	if (req->rq_flags & RQF_STATS)
-		blk_stat_add(req, now);
-
-	BUG_ON(blk_queued_rq(req));
-
-	if (unlikely(laptop_mode) && !blk_rq_is_passthrough(req))
-		laptop_io_completion(req->q->backing_dev_info);
-
-	blk_delete_timer(req);
-
-	if (req->rq_flags & RQF_DONTPREP)
-		blk_unprep_request(req);
-
-	blk_account_io_done(req, now);
-
-	if (req->end_io) {
-		rq_qos_done(q, req);
-		req->end_io(req, error);
-	} else {
-		if (blk_bidi_rq(req))
-			__blk_put_request(req->next_rq->q, req->next_rq);
-
-		__blk_put_request(q, req);
-	}
-}
-EXPORT_SYMBOL(blk_finish_request);
-
-/**
- * blk_end_bidi_request - Complete a bidi request
- * @rq:         the request to complete
- * @error:      block status code
- * @nr_bytes:   number of bytes to complete @rq
- * @bidi_bytes: number of bytes to complete @rq->next_rq
- *
- * Description:
- *     Ends I/O on a number of bytes attached to @rq and @rq->next_rq.
- *     Drivers that supports bidi can safely call this member for any
- *     type of request, bidi or uni.  In the later case @bidi_bytes is
- *     just ignored.
- *
- * Return:
- *     %false - we are done with this request
- *     %true  - still buffers pending for this request
- **/
-static bool blk_end_bidi_request(struct request *rq, blk_status_t error,
-				 unsigned int nr_bytes, unsigned int bidi_bytes)
-{
-	struct request_queue *q = rq->q;
-	unsigned long flags;
-
-	WARN_ON_ONCE(q->mq_ops);
-
-	if (blk_update_bidi_request(rq, error, nr_bytes, bidi_bytes))
-		return true;
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	blk_finish_request(rq, error);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-
-	return false;
-}
-
-/**
- * __blk_end_bidi_request - Complete a bidi request with queue lock held
- * @rq:         the request to complete
- * @error:      block status code
- * @nr_bytes:   number of bytes to complete @rq
- * @bidi_bytes: number of bytes to complete @rq->next_rq
- *
- * Description:
- *     Identical to blk_end_bidi_request() except that queue lock is
- *     assumed to be locked on entry and remains so on return.
- *
- * Return:
- *     %false - we are done with this request
- *     %true  - still buffers pending for this request
- **/
-static bool __blk_end_bidi_request(struct request *rq, blk_status_t error,
-				   unsigned int nr_bytes, unsigned int bidi_bytes)
-{
-	lockdep_assert_held(rq->q->queue_lock);
-	WARN_ON_ONCE(rq->q->mq_ops);
-
-	if (blk_update_bidi_request(rq, error, nr_bytes, bidi_bytes))
-		return true;
-
-	blk_finish_request(rq, error);
-
-	return false;
-}
-
-/**
- * blk_end_request - Helper function for drivers to complete the request.
- * @rq:       the request being processed
- * @error:    block status code
- * @nr_bytes: number of bytes to complete
- *
- * Description:
- *     Ends I/O on a number of bytes attached to @rq.
- *     If @rq has leftover, sets it up for the next range of segments.
- *
- * Return:
- *     %false - we are done with this request
- *     %true  - still buffers pending for this request
- **/
-bool blk_end_request(struct request *rq, blk_status_t error,
-		unsigned int nr_bytes)
-{
-	WARN_ON_ONCE(rq->q->mq_ops);
-	return blk_end_bidi_request(rq, error, nr_bytes, 0);
-}
-EXPORT_SYMBOL(blk_end_request);
-
-/**
- * blk_end_request_all - Helper function for drives to finish the request.
- * @rq: the request to finish
- * @error: block status code
- *
- * Description:
- *     Completely finish @rq.
- */
-void blk_end_request_all(struct request *rq, blk_status_t error)
-{
-	bool pending;
-	unsigned int bidi_bytes = 0;
-
-	if (unlikely(blk_bidi_rq(rq)))
-		bidi_bytes = blk_rq_bytes(rq->next_rq);
-
-	pending = blk_end_bidi_request(rq, error, blk_rq_bytes(rq), bidi_bytes);
-	BUG_ON(pending);
-}
-EXPORT_SYMBOL(blk_end_request_all);
-
-/**
- * __blk_end_request - Helper function for drivers to complete the request.
- * @rq:       the request being processed
- * @error:    block status code
- * @nr_bytes: number of bytes to complete
- *
- * Description:
- *     Must be called with queue lock held unlike blk_end_request().
- *
- * Return:
- *     %false - we are done with this request
- *     %true  - still buffers pending for this request
- **/
-bool __blk_end_request(struct request *rq, blk_status_t error,
-		unsigned int nr_bytes)
-{
-	lockdep_assert_held(rq->q->queue_lock);
-	WARN_ON_ONCE(rq->q->mq_ops);
-
-	return __blk_end_bidi_request(rq, error, nr_bytes, 0);
-}
-EXPORT_SYMBOL(__blk_end_request);
-
-/**
- * __blk_end_request_all - Helper function for drives to finish the request.
- * @rq: the request to finish
- * @error:    block status code
- *
- * Description:
- *     Completely finish @rq.  Must be called with queue lock held.
- */
-void __blk_end_request_all(struct request *rq, blk_status_t error)
-{
-	bool pending;
-	unsigned int bidi_bytes = 0;
-
-	lockdep_assert_held(rq->q->queue_lock);
-	WARN_ON_ONCE(rq->q->mq_ops);
-
-	if (unlikely(blk_bidi_rq(rq)))
-		bidi_bytes = blk_rq_bytes(rq->next_rq);
-
-	pending = __blk_end_bidi_request(rq, error, blk_rq_bytes(rq), bidi_bytes);
-	BUG_ON(pending);
-}
-EXPORT_SYMBOL(__blk_end_request_all);
-
-/**
- * __blk_end_request_cur - Helper function to finish the current request chunk.
- * @rq: the request to finish the current chunk for
- * @error:    block status code
- *
- * Description:
- *     Complete the current consecutively mapped chunk from @rq.  Must
- *     be called with queue lock held.
- *
- * Return:
- *     %false - we are done with this request
- *     %true  - still buffers pending for this request
- */
-bool __blk_end_request_cur(struct request *rq, blk_status_t error)
-{
-	return __blk_end_request(rq, error, blk_rq_cur_bytes(rq));
-}
-EXPORT_SYMBOL(__blk_end_request_cur);
-
 void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 		     struct bio *bio)
 {
@@ -3565,7 +1970,6 @@ void blk_start_plug(struct blk_plug *plug)
 	if (tsk->plug)
 		return;
 
-	INIT_LIST_HEAD(&plug->list);
 	INIT_LIST_HEAD(&plug->mq_list);
 	INIT_LIST_HEAD(&plug->cb_list);
 	/*
@@ -3576,36 +1980,6 @@ void blk_start_plug(struct blk_plug *plug)
 }
 EXPORT_SYMBOL(blk_start_plug);
 
-static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
-{
-	struct request *rqa = container_of(a, struct request, queuelist);
-	struct request *rqb = container_of(b, struct request, queuelist);
-
-	return !(rqa->q < rqb->q ||
-		(rqa->q == rqb->q && blk_rq_pos(rqa) < blk_rq_pos(rqb)));
-}
-
-/*
- * If 'from_schedule' is true, then postpone the dispatch of requests
- * until a safe kblockd context. We due this to avoid accidental big
- * additional stack usage in driver dispatch, in places where the originally
- * plugger did not intend it.
- */
-static void queue_unplugged(struct request_queue *q, unsigned int depth,
-			    bool from_schedule)
-	__releases(q->queue_lock)
-{
-	lockdep_assert_held(q->queue_lock);
-
-	trace_block_unplug(q, depth, !from_schedule);
-
-	if (from_schedule)
-		blk_run_queue_async(q);
-	else
-		__blk_run_queue(q);
-	spin_unlock_irq(q->queue_lock);
-}
-
 static void flush_plug_callbacks(struct blk_plug *plug, bool from_schedule)
 {
 	LIST_HEAD(callbacks);
@@ -3650,65 +2024,10 @@ EXPORT_SYMBOL(blk_check_plugged);
 
 void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 {
-	struct request_queue *q;
-	struct request *rq;
-	LIST_HEAD(list);
-	unsigned int depth;
-
 	flush_plug_callbacks(plug, from_schedule);
 
 	if (!list_empty(&plug->mq_list))
 		blk_mq_flush_plug_list(plug, from_schedule);
-
-	if (list_empty(&plug->list))
-		return;
-
-	list_splice_init(&plug->list, &list);
-
-	list_sort(NULL, &list, plug_rq_cmp);
-
-	q = NULL;
-	depth = 0;
-
-	while (!list_empty(&list)) {
-		rq = list_entry_rq(list.next);
-		list_del_init(&rq->queuelist);
-		BUG_ON(!rq->q);
-		if (rq->q != q) {
-			/*
-			 * This drops the queue lock
-			 */
-			if (q)
-				queue_unplugged(q, depth, from_schedule);
-			q = rq->q;
-			depth = 0;
-			spin_lock_irq(q->queue_lock);
-		}
-
-		/*
-		 * Short-circuit if @q is dead
-		 */
-		if (unlikely(blk_queue_dying(q))) {
-			__blk_end_request_all(rq, BLK_STS_IOERR);
-			continue;
-		}
-
-		/*
-		 * rq is already accounted, so use raw insert
-		 */
-		if (op_is_flush(rq->cmd_flags))
-			__elv_add_request(q, rq, ELEVATOR_INSERT_FLUSH);
-		else
-			__elv_add_request(q, rq, ELEVATOR_INSERT_SORT_MERGE);
-
-		depth++;
-	}
-
-	/*
-	 * This drops the queue lock
-	 */
-	if (q)
-		queue_unplugged(q, depth, from_schedule);
 }
 
 void blk_finish_plug(struct blk_plug *plug)
diff --git a/block/blk-exec.c b/block/blk-exec.c
index f7b292f12449..a34b7d918742 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -48,8 +48,6 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 			   struct request *rq, int at_head,
 			   rq_end_io_fn *done)
 {
-	int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK;
-
 	WARN_ON(irqs_disabled());
 	WARN_ON(!blk_rq_is_passthrough(rq));
 
@@ -60,23 +58,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	 * don't check dying flag for MQ because the request won't
 	 * be reused after dying flag is set
 	 */
-	if (q->mq_ops) {
-		blk_mq_sched_insert_request(rq, at_head, true, false);
-		return;
-	}
-
-	spin_lock_irq(q->queue_lock);
-
-	if (unlikely(blk_queue_dying(q))) {
-		rq->rq_flags |= RQF_QUIET;
-		__blk_end_request_all(rq, BLK_STS_IOERR);
-		spin_unlock_irq(q->queue_lock);
-		return;
-	}
-
-	__elv_add_request(q, rq, where);
-	__blk_run_queue(q);
-	spin_unlock_irq(q->queue_lock);
+	blk_mq_sched_insert_request(rq, at_head, true, false);
 }
 EXPORT_SYMBOL_GPL(blk_execute_rq_nowait);
 
diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 01580f88fcb3..b8afdd6ec4b8 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -48,10 +48,8 @@ static void ioc_exit_icq(struct io_cq *icq)
 	if (icq->flags & ICQ_EXITED)
 		return;
 
-	if (et->uses_mq && et->ops.mq.exit_icq)
+	if (et->ops.mq.exit_icq)
 		et->ops.mq.exit_icq(icq);
-	else if (!et->uses_mq && et->ops.sq.elevator_exit_icq_fn)
-		et->ops.sq.elevator_exit_icq_fn(icq);
 
 	icq->flags |= ICQ_EXITED;
 }
@@ -187,25 +185,13 @@ void put_io_context_active(struct io_context *ioc)
 	 * reverse double locking.  Read comment in ioc_release_fn() for
 	 * explanation on the nested locking annotation.
 	 */
-retry:
 	spin_lock_irqsave_nested(&ioc->lock, flags, 1);
 	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
 		if (icq->flags & ICQ_EXITED)
 			continue;
 
 		et = icq->q->elevator->type;
-		if (et->uses_mq) {
-			ioc_exit_icq(icq);
-		} else {
-			if (spin_trylock(icq->q->queue_lock)) {
-				ioc_exit_icq(icq);
-				spin_unlock(icq->q->queue_lock);
-			} else {
-				spin_unlock_irqrestore(&ioc->lock, flags);
-				cpu_relax();
-				goto retry;
-			}
-		}
+		ioc_exit_icq(icq);
 	}
 	spin_unlock_irqrestore(&ioc->lock, flags);
 
@@ -226,21 +212,6 @@ void exit_io_context(struct task_struct *task)
 	put_io_context_active(ioc);
 }
 
-static void __ioc_clear_queue(struct list_head *icq_list)
-{
-	unsigned long flags;
-
-	while (!list_empty(icq_list)) {
-		struct io_cq *icq = list_entry(icq_list->next,
-					       struct io_cq, q_node);
-		struct io_context *ioc = icq->ioc;
-
-		spin_lock_irqsave(&ioc->lock, flags);
-		ioc_destroy_icq(icq);
-		spin_unlock_irqrestore(&ioc->lock, flags);
-	}
-}
-
 /**
  * ioc_clear_queue - break any ioc association with the specified queue
  * @q: request_queue being cleared
@@ -253,14 +224,7 @@ void ioc_clear_queue(struct request_queue *q)
 
 	spin_lock_irq(q->queue_lock);
 	list_splice_init(&q->icq_list, &icq_list);
-
-	if (q->mq_ops) {
-		spin_unlock_irq(q->queue_lock);
-		__ioc_clear_queue(&icq_list);
-	} else {
-		__ioc_clear_queue(&icq_list);
-		spin_unlock_irq(q->queue_lock);
-	}
+	spin_unlock_irq(q->queue_lock);
 }
 
 int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
@@ -415,10 +379,8 @@ struct io_cq *ioc_create_icq(struct io_context *ioc, struct request_queue *q,
 	if (likely(!radix_tree_insert(&ioc->icq_tree, q->id, icq))) {
 		hlist_add_head(&icq->ioc_node, &ioc->icq_list);
 		list_add(&icq->q_node, &q->icq_list);
-		if (et->uses_mq && et->ops.mq.init_icq)
+		if (et->ops.mq.init_icq)
 			et->ops.mq.init_icq(icq);
-		else if (!et->uses_mq && et->ops.sq.elevator_init_icq_fn)
-			et->ops.sq.elevator_init_icq_fn(icq);
 	} else {
 		kmem_cache_free(et->icq_cache, icq);
 		icq = ioc_lookup_icq(ioc, q);
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 42a46744c11b..5e4e30a88ced 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -835,13 +835,8 @@ struct request *attempt_front_merge(struct request_queue *q, struct request *rq)
 int blk_attempt_req_merge(struct request_queue *q, struct request *rq,
 			  struct request *next)
 {
-	struct elevator_queue *e = q->elevator;
 	struct request *free;
 
-	if (!e->uses_mq && e->type->ops.sq.elevator_allow_rq_merge_fn)
-		if (!e->type->ops.sq.elevator_allow_rq_merge_fn(q, rq, next))
-			return 0;
-
 	free = attempt_merge(q, rq, next);
 	if (free) {
 		__blk_put_request(q, free);
diff --git a/block/blk-settings.c b/block/blk-settings.c
index ffd459969689..764d0e9e092f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -20,40 +20,6 @@ EXPORT_SYMBOL(blk_max_low_pfn);
 
 unsigned long blk_max_pfn;
 
-/**
- * blk_queue_prep_rq - set a prepare_request function for queue
- * @q:		queue
- * @pfn:	prepare_request function
- *
- * It's possible for a queue to register a prepare_request callback which
- * is invoked before the request is handed to the request_fn. The goal of
- * the function is to prepare a request for I/O, it can be used to build a
- * cdb from the request data for instance.
- *
- */
-void blk_queue_prep_rq(struct request_queue *q, prep_rq_fn *pfn)
-{
-	q->prep_rq_fn = pfn;
-}
-EXPORT_SYMBOL(blk_queue_prep_rq);
-
-/**
- * blk_queue_unprep_rq - set an unprepare_request function for queue
- * @q:		queue
- * @ufn:	unprepare_request function
- *
- * It's possible for a queue to register an unprepare_request callback
- * which is invoked before the request is finally completed. The goal
- * of the function is to deallocate any data that was allocated in the
- * prepare_request callback.
- *
- */
-void blk_queue_unprep_rq(struct request_queue *q, unprep_rq_fn *ufn)
-{
-	q->unprep_rq_fn = ufn;
-}
-EXPORT_SYMBOL(blk_queue_unprep_rq);
-
 void blk_queue_softirq_done(struct request_queue *q, softirq_done_fn *fn)
 {
 	q->softirq_done_fn = fn;
@@ -169,8 +135,6 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
 
 	q->make_request_fn = mfn;
 	blk_queue_dma_alignment(q, 511);
-	blk_queue_congestion_threshold(q);
-	q->nr_batching = BLK_BATCH_REQ;
 
 	blk_set_default_limits(&q->limits);
 }
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index e4fc3bd9c32e..5edf66bd4808 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -68,7 +68,7 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)
 	unsigned long nr;
 	int ret, err;
 
-	if (!q->request_fn && !q->mq_ops)
+	if (!q->mq_ops)
 		return -EINVAL;
 
 	ret = queue_var_store(&nr, page, count);
@@ -78,11 +78,7 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)
 	if (nr < BLKDEV_MIN_RQ)
 		nr = BLKDEV_MIN_RQ;
 
-	if (q->request_fn)
-		err = blk_update_nr_requests(q, nr);
-	else
-		err = blk_mq_update_nr_requests(q, nr);
-
+	err = blk_mq_update_nr_requests(q, nr);
 	if (err)
 		return err;
 
@@ -463,20 +459,14 @@ static ssize_t queue_wb_lat_store(struct request_queue *q, const char *page,
 	 * ends up either enabling or disabling wbt completely. We can't
 	 * have IO inflight if that happens.
 	 */
-	if (q->mq_ops) {
-		blk_mq_freeze_queue(q);
-		blk_mq_quiesce_queue(q);
-	} else
-		blk_queue_bypass_start(q);
+	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	wbt_set_min_lat(q, val);
 	wbt_update_limits(q);
 
-	if (q->mq_ops) {
-		blk_mq_unquiesce_queue(q);
-		blk_mq_unfreeze_queue(q);
-	} else
-		blk_queue_bypass_end(q);
+	blk_mq_unquiesce_queue(q);
+	blk_mq_unfreeze_queue(q);
 
 	return count;
 }
@@ -847,17 +837,10 @@ static void __blk_release_queue(struct work_struct *work)
 
 	blk_free_queue_stats(q->stats);
 
-	blk_exit_rl(q, &q->root_rl);
-
 	blk_queue_free_zone_bitmaps(q);
 
-	if (!q->mq_ops) {
-		if (q->exit_rq_fn)
-			q->exit_rq_fn(q, q->fq->flush_rq);
-		blk_free_flush_queue(q->fq);
-	} else {
+	if (q->mq_ops)
 		blk_mq_release(q);
-	}
 
 	blk_trace_shutdown(q);
 
@@ -920,7 +903,6 @@ int blk_register_queue(struct gendisk *disk)
 	if (!blk_queue_init_done(q)) {
 		queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
 		percpu_ref_switch_to_percpu(&q->q_usage_counter);
-		blk_queue_bypass_end(q);
 	}
 
 	ret = blk_trace_init_sysfs(dev);
@@ -947,7 +929,7 @@ int blk_register_queue(struct gendisk *disk)
 
 	blk_throtl_register_queue(q);
 
-	if (q->request_fn || (q->mq_ops && q->elevator)) {
+	if ((q->mq_ops && q->elevator)) {
 		ret = elv_register_queue(q);
 		if (ret) {
 			mutex_unlock(&q->sysfs_lock);
@@ -1007,7 +989,7 @@ void blk_unregister_queue(struct gendisk *disk)
 	rq_qos_exit(q);
 
 	mutex_lock(&q->sysfs_lock);
-	if (q->request_fn || (q->mq_ops && q->elevator))
+	if (q->mq_ops && q->elevator)
 		elv_unregister_queue(q);
 	mutex_unlock(&q->sysfs_lock);
 
diff --git a/block/blk.h b/block/blk.h
index 57a302bf5a70..e2604ae7ddfa 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -7,12 +7,6 @@
 #include <xen/xen.h>
 #include "blk-mq.h"
 
-/* Amount of time in which a process may batch requests */
-#define BLK_BATCH_TIME	(HZ/50UL)
-
-/* Number of requests a "batching" process may submit */
-#define BLK_BATCH_REQ	32
-
 /* Max future timer expiry for timeouts */
 #define BLK_MAX_TIMEOUT		(5 * HZ)
 
@@ -132,9 +126,6 @@ void blk_exit_rl(struct request_queue *q, struct request_list *rl);
 void blk_exit_queue(struct request_queue *q);
 void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 			struct bio *bio);
-void blk_queue_bypass_start(struct request_queue *q);
-void blk_queue_bypass_end(struct request_queue *q);
-void __blk_queue_free_tags(struct request_queue *q);
 void blk_freeze_queue(struct request_queue *q);
 
 static inline void blk_queue_enter_live(struct request_queue *q)
@@ -281,23 +272,6 @@ static inline bool blk_rq_is_complete(struct request *rq)
 
 void blk_insert_flush(struct request *rq);
 
-static inline void elv_activate_rq(struct request_queue *q, struct request *rq)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (e->type->ops.sq.elevator_activate_req_fn)
-		e->type->ops.sq.elevator_activate_req_fn(q, rq);
-}
-
-static inline void elv_deactivate_rq(struct request_queue *q, struct request *rq)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (e->type->ops.sq.elevator_deactivate_req_fn)
-		e->type->ops.sq.elevator_deactivate_req_fn(q, rq);
-}
-
-int elevator_init(struct request_queue *);
 int elevator_init_mq(struct request_queue *q);
 int elevator_switch_mq(struct request_queue *q,
 			      struct elevator_type *new_e);
@@ -332,31 +306,8 @@ void blk_rq_set_mixed_merge(struct request *rq);
 bool blk_rq_merge_ok(struct request *rq, struct bio *bio);
 enum elv_merge blk_try_merge(struct request *rq, struct bio *bio);
 
-void blk_queue_congestion_threshold(struct request_queue *q);
-
 int blk_dev_init(void);
 
-
-/*
- * Return the threshold (number of used requests) at which the queue is
- * considered to be congested.  It include a little hysteresis to keep the
- * context switch rate down.
- */
-static inline int queue_congestion_on_threshold(struct request_queue *q)
-{
-	return q->nr_congestion_on;
-}
-
-/*
- * The threshold at which a queue is considered to be uncongested
- */
-static inline int queue_congestion_off_threshold(struct request_queue *q)
-{
-	return q->nr_congestion_off;
-}
-
-extern int blk_update_nr_requests(struct request_queue *, unsigned int);
-
 /*
  * Contribute to IO statistics IFF:
  *
@@ -478,8 +429,6 @@ static inline void blk_queue_bounce(struct request_queue *q, struct bio **bio)
 }
 #endif /* CONFIG_BOUNCE */
 
-extern void blk_drain_queue(struct request_queue *q);
-
 #ifdef CONFIG_BLK_CGROUP_IOLATENCY
 extern int blk_iolatency_init(struct request_queue *q);
 #else
diff --git a/block/elevator.c b/block/elevator.c
index 54e1adac26c5..334097c54b08 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -61,10 +61,8 @@ static int elv_iosched_allow_bio_merge(struct request *rq, struct bio *bio)
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
 
-	if (e->uses_mq && e->type->ops.mq.allow_merge)
+	if (e->type->ops.mq.allow_merge)
 		return e->type->ops.mq.allow_merge(q, rq, bio);
-	else if (!e->uses_mq && e->type->ops.sq.elevator_allow_bio_merge_fn)
-		return e->type->ops.sq.elevator_allow_bio_merge_fn(q, rq, bio);
 
 	return 1;
 }
@@ -95,14 +93,14 @@ static bool elevator_match(const struct elevator_type *e, const char *name)
 }
 
 /*
- * Return scheduler with name 'name' and with matching 'mq capability
+ * Return scheduler with name 'name'
  */
-static struct elevator_type *elevator_find(const char *name, bool mq)
+static struct elevator_type *elevator_find(const char *name)
 {
 	struct elevator_type *e;
 
 	list_for_each_entry(e, &elv_list, list) {
-		if (elevator_match(e, name) && (mq == e->uses_mq))
+		if (elevator_match(e, name))
 			return e;
 	}
 
@@ -121,12 +119,12 @@ static struct elevator_type *elevator_get(struct request_queue *q,
 
 	spin_lock(&elv_list_lock);
 
-	e = elevator_find(name, q->mq_ops != NULL);
+	e = elevator_find(name);
 	if (!e && try_loading) {
 		spin_unlock(&elv_list_lock);
 		request_module("%s-iosched", name);
 		spin_lock(&elv_list_lock);
-		e = elevator_find(name, q->mq_ops != NULL);
+		e = elevator_find(name);
 	}
 
 	if (e && !try_module_get(e->elevator_owner))
@@ -150,26 +148,6 @@ static int __init elevator_setup(char *str)
 
 __setup("elevator=", elevator_setup);
 
-/* called during boot to load the elevator chosen by the elevator param */
-void __init load_default_elevator_module(void)
-{
-	struct elevator_type *e;
-
-	if (!chosen_elevator[0])
-		return;
-
-	/*
-	 * Boot parameter is deprecated, we haven't supported that for MQ.
-	 * Only look for non-mq schedulers from here.
-	 */
-	spin_lock(&elv_list_lock);
-	e = elevator_find(chosen_elevator, false);
-	spin_unlock(&elv_list_lock);
-
-	if (!e)
-		request_module("%s-iosched", chosen_elevator);
-}
-
 static struct kobj_type elv_ktype;
 
 struct elevator_queue *elevator_alloc(struct request_queue *q,
@@ -185,7 +163,6 @@ struct elevator_queue *elevator_alloc(struct request_queue *q,
 	kobject_init(&eq->kobj, &elv_ktype);
 	mutex_init(&eq->sysfs_lock);
 	hash_init(eq->hash);
-	eq->uses_mq = e->uses_mq;
 
 	return eq;
 }
@@ -200,52 +177,11 @@ static void elevator_release(struct kobject *kobj)
 	kfree(e);
 }
 
-/*
- * Use the default elevator specified by config boot param for non-mq devices,
- * or by config option.  Don't try to load modules as we could be running off
- * async and request_module() isn't allowed from async.
- */
-int elevator_init(struct request_queue *q)
-{
-	struct elevator_type *e = NULL;
-	int err = 0;
-
-	/*
-	 * q->sysfs_lock must be held to provide mutual exclusion between
-	 * elevator_switch() and here.
-	 */
-	mutex_lock(&q->sysfs_lock);
-	if (unlikely(q->elevator))
-		goto out_unlock;
-
-	if (*chosen_elevator) {
-		e = elevator_get(q, chosen_elevator, false);
-		if (!e)
-			printk(KERN_ERR "I/O scheduler %s not found\n",
-							chosen_elevator);
-	}
-
-	if (!e) {
-		printk(KERN_ERR
-			"Default I/O scheduler not found. Using noop.\n");
-		e = elevator_get(q, "noop", false);
-	}
-
-	err = e->ops.sq.elevator_init_fn(q, e);
-	if (err)
-		elevator_put(e);
-out_unlock:
-	mutex_unlock(&q->sysfs_lock);
-	return err;
-}
-
 void elevator_exit(struct request_queue *q, struct elevator_queue *e)
 {
 	mutex_lock(&e->sysfs_lock);
-	if (e->uses_mq && e->type->ops.mq.exit_sched)
+	if (e->type->ops.mq.exit_sched)
 		blk_mq_exit_sched(q, e);
-	else if (!e->uses_mq && e->type->ops.sq.elevator_exit_fn)
-		e->type->ops.sq.elevator_exit_fn(e);
 	mutex_unlock(&e->sysfs_lock);
 
 	kobject_put(&e->kobj);
@@ -393,10 +329,8 @@ enum elv_merge elv_merge(struct request_queue *q, struct request **req,
 		return ELEVATOR_BACK_MERGE;
 	}
 
-	if (e->uses_mq && e->type->ops.mq.request_merge)
+	if (e->type->ops.mq.request_merge)
 		return e->type->ops.mq.request_merge(q, req, bio);
-	else if (!e->uses_mq && e->type->ops.sq.elevator_merge_fn)
-		return e->type->ops.sq.elevator_merge_fn(q, req, bio);
 
 	return ELEVATOR_NO_MERGE;
 }
@@ -447,10 +381,8 @@ void elv_merged_request(struct request_queue *q, struct request *rq,
 {
 	struct elevator_queue *e = q->elevator;
 
-	if (e->uses_mq && e->type->ops.mq.request_merged)
+	if (e->type->ops.mq.request_merged)
 		e->type->ops.mq.request_merged(q, rq, type);
-	else if (!e->uses_mq && e->type->ops.sq.elevator_merged_fn)
-		e->type->ops.sq.elevator_merged_fn(q, rq, type);
 
 	if (type == ELEVATOR_BACK_MERGE)
 		elv_rqhash_reposition(q, rq);
@@ -464,13 +396,8 @@ void elv_merge_requests(struct request_queue *q, struct request *rq,
 	struct elevator_queue *e = q->elevator;
 	bool next_sorted = false;
 
-	if (e->uses_mq && e->type->ops.mq.requests_merged)
+	if (e->type->ops.mq.requests_merged)
 		e->type->ops.mq.requests_merged(q, rq, next);
-	else if (e->type->ops.sq.elevator_merge_req_fn) {
-		next_sorted = (__force bool)(next->rq_flags & RQF_SORTED);
-		if (next_sorted)
-			e->type->ops.sq.elevator_merge_req_fn(q, rq, next);
-	}
 
 	elv_rqhash_reposition(q, rq);
 
@@ -482,156 +409,12 @@ void elv_merge_requests(struct request_queue *q, struct request *rq,
 	q->last_merge = rq;
 }
 
-void elv_bio_merged(struct request_queue *q, struct request *rq,
-			struct bio *bio)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (WARN_ON_ONCE(e->uses_mq))
-		return;
-
-	if (e->type->ops.sq.elevator_bio_merged_fn)
-		e->type->ops.sq.elevator_bio_merged_fn(q, rq, bio);
-}
-
-void elv_requeue_request(struct request_queue *q, struct request *rq)
-{
-	/*
-	 * it already went through dequeue, we need to decrement the
-	 * in_flight count again
-	 */
-	if (blk_account_rq(rq)) {
-		q->in_flight[rq_is_sync(rq)]--;
-		if (rq->rq_flags & RQF_SORTED)
-			elv_deactivate_rq(q, rq);
-	}
-
-	rq->rq_flags &= ~RQF_STARTED;
-
-	blk_pm_requeue_request(rq);
-
-	__elv_add_request(q, rq, ELEVATOR_INSERT_REQUEUE);
-}
-
-void elv_drain_elevator(struct request_queue *q)
-{
-	struct elevator_queue *e = q->elevator;
-	static int printed;
-
-	if (WARN_ON_ONCE(e->uses_mq))
-		return;
-
-	lockdep_assert_held(q->queue_lock);
-
-	while (e->type->ops.sq.elevator_dispatch_fn(q, 1))
-		;
-	if (q->nr_sorted && !blk_queue_is_zoned(q) && printed++ < 10 ) {
-		printk(KERN_ERR "%s: forced dispatching is broken "
-		       "(nr_sorted=%u), please report this\n",
-		       q->elevator->type->elevator_name, q->nr_sorted);
-	}
-}
-
-void __elv_add_request(struct request_queue *q, struct request *rq, int where)
-{
-	trace_block_rq_insert(q, rq);
-
-	blk_pm_add_request(q, rq);
-
-	rq->q = q;
-
-	if (rq->rq_flags & RQF_SOFTBARRIER) {
-		/* barriers are scheduling boundary, update end_sector */
-		if (!blk_rq_is_passthrough(rq)) {
-			q->end_sector = rq_end_sector(rq);
-			q->boundary_rq = rq;
-		}
-	} else if (!(rq->rq_flags & RQF_ELVPRIV) &&
-		    (where == ELEVATOR_INSERT_SORT ||
-		     where == ELEVATOR_INSERT_SORT_MERGE))
-		where = ELEVATOR_INSERT_BACK;
-
-	switch (where) {
-	case ELEVATOR_INSERT_REQUEUE:
-	case ELEVATOR_INSERT_FRONT:
-		rq->rq_flags |= RQF_SOFTBARRIER;
-		list_add(&rq->queuelist, &q->queue_head);
-		break;
-
-	case ELEVATOR_INSERT_BACK:
-		rq->rq_flags |= RQF_SOFTBARRIER;
-		elv_drain_elevator(q);
-		list_add_tail(&rq->queuelist, &q->queue_head);
-		/*
-		 * We kick the queue here for the following reasons.
-		 * - The elevator might have returned NULL previously
-		 *   to delay requests and returned them now.  As the
-		 *   queue wasn't empty before this request, ll_rw_blk
-		 *   won't run the queue on return, resulting in hang.
-		 * - Usually, back inserted requests won't be merged
-		 *   with anything.  There's no point in delaying queue
-		 *   processing.
-		 */
-		__blk_run_queue(q);
-		break;
-
-	case ELEVATOR_INSERT_SORT_MERGE:
-		/*
-		 * If we succeed in merging this request with one in the
-		 * queue already, we are done - rq has now been freed,
-		 * so no need to do anything further.
-		 */
-		if (elv_attempt_insert_merge(q, rq))
-			break;
-		/* fall through */
-	case ELEVATOR_INSERT_SORT:
-		BUG_ON(blk_rq_is_passthrough(rq));
-		rq->rq_flags |= RQF_SORTED;
-		q->nr_sorted++;
-		if (rq_mergeable(rq)) {
-			elv_rqhash_add(q, rq);
-			if (!q->last_merge)
-				q->last_merge = rq;
-		}
-
-		/*
-		 * Some ioscheds (cfq) run q->request_fn directly, so
-		 * rq cannot be accessed after calling
-		 * elevator_add_req_fn.
-		 */
-		q->elevator->type->ops.sq.elevator_add_req_fn(q, rq);
-		break;
-
-	case ELEVATOR_INSERT_FLUSH:
-		rq->rq_flags |= RQF_SOFTBARRIER;
-		blk_insert_flush(rq);
-		break;
-	default:
-		printk(KERN_ERR "%s: bad insertion point %d\n",
-		       __func__, where);
-		BUG();
-	}
-}
-EXPORT_SYMBOL(__elv_add_request);
-
-void elv_add_request(struct request_queue *q, struct request *rq, int where)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(q->queue_lock, flags);
-	__elv_add_request(q, rq, where);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-}
-EXPORT_SYMBOL(elv_add_request);
-
 struct request *elv_latter_request(struct request_queue *q, struct request *rq)
 {
 	struct elevator_queue *e = q->elevator;
 
-	if (e->uses_mq && e->type->ops.mq.next_request)
+	if (e->type->ops.mq.next_request)
 		return e->type->ops.mq.next_request(q, rq);
-	else if (!e->uses_mq && e->type->ops.sq.elevator_latter_req_fn)
-		return e->type->ops.sq.elevator_latter_req_fn(q, rq);
 
 	return NULL;
 }
@@ -640,66 +423,10 @@ struct request *elv_former_request(struct request_queue *q, struct request *rq)
 {
 	struct elevator_queue *e = q->elevator;
 
-	if (e->uses_mq && e->type->ops.mq.former_request)
+	if (e->type->ops.mq.former_request)
 		return e->type->ops.mq.former_request(q, rq);
-	if (!e->uses_mq && e->type->ops.sq.elevator_former_req_fn)
-		return e->type->ops.sq.elevator_former_req_fn(q, rq);
-	return NULL;
-}
-
-int elv_set_request(struct request_queue *q, struct request *rq,
-		    struct bio *bio, gfp_t gfp_mask)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (WARN_ON_ONCE(e->uses_mq))
-		return 0;
 
-	if (e->type->ops.sq.elevator_set_req_fn)
-		return e->type->ops.sq.elevator_set_req_fn(q, rq, bio, gfp_mask);
-	return 0;
-}
-
-void elv_put_request(struct request_queue *q, struct request *rq)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (WARN_ON_ONCE(e->uses_mq))
-		return;
-
-	if (e->type->ops.sq.elevator_put_req_fn)
-		e->type->ops.sq.elevator_put_req_fn(rq);
-}
-
-int elv_may_queue(struct request_queue *q, unsigned int op)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (WARN_ON_ONCE(e->uses_mq))
-		return 0;
-
-	if (e->type->ops.sq.elevator_may_queue_fn)
-		return e->type->ops.sq.elevator_may_queue_fn(q, op);
-
-	return ELV_MQUEUE_MAY;
-}
-
-void elv_completed_request(struct request_queue *q, struct request *rq)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (WARN_ON_ONCE(e->uses_mq))
-		return;
-
-	/*
-	 * request is released from the driver, io must be done
-	 */
-	if (blk_account_rq(rq)) {
-		q->in_flight[rq_is_sync(rq)]--;
-		if ((rq->rq_flags & RQF_SORTED) &&
-		    e->type->ops.sq.elevator_completed_req_fn)
-			e->type->ops.sq.elevator_completed_req_fn(q, rq);
-	}
+	return NULL;
 }
 
 #define to_elv(atr) container_of((atr), struct elv_fs_entry, attr)
@@ -768,8 +495,6 @@ int elv_register_queue(struct request_queue *q)
 		}
 		kobject_uevent(&e->kobj, KOBJ_ADD);
 		e->registered = 1;
-		if (!e->uses_mq && e->type->ops.sq.elevator_registered_fn)
-			e->type->ops.sq.elevator_registered_fn(q);
 	}
 	return error;
 }
@@ -809,7 +534,7 @@ int elv_register(struct elevator_type *e)
 
 	/* register, don't allow duplicate names */
 	spin_lock(&elv_list_lock);
-	if (elevator_find(e->elevator_name, e->uses_mq)) {
+	if (elevator_find(e->elevator_name)) {
 		spin_unlock(&elv_list_lock);
 		kmem_cache_destroy(e->icq_cache);
 		return -EBUSY;
@@ -919,71 +644,17 @@ int elevator_init_mq(struct request_queue *q)
  */
 static int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 {
-	struct elevator_queue *old = q->elevator;
-	bool old_registered = false;
 	int err;
 
 	lockdep_assert_held(&q->sysfs_lock);
 
-	if (q->mq_ops) {
-		blk_mq_freeze_queue(q);
-		blk_mq_quiesce_queue(q);
-
-		err = elevator_switch_mq(q, new_e);
-
-		blk_mq_unquiesce_queue(q);
-		blk_mq_unfreeze_queue(q);
-
-		return err;
-	}
-
-	/*
-	 * Turn on BYPASS and drain all requests w/ elevator private data.
-	 * Block layer doesn't call into a quiesced elevator - all requests
-	 * are directly put on the dispatch list without elevator data
-	 * using INSERT_BACK.  All requests have SOFTBARRIER set and no
-	 * merge happens either.
-	 */
-	if (old) {
-		old_registered = old->registered;
-
-		blk_queue_bypass_start(q);
-
-		/* unregister and clear all auxiliary data of the old elevator */
-		if (old_registered)
-			elv_unregister_queue(q);
-
-		ioc_clear_queue(q);
-	}
-
-	/* allocate, init and register new elevator */
-	err = new_e->ops.sq.elevator_init_fn(q, new_e);
-	if (err)
-		goto fail_init;
-
-	err = elv_register_queue(q);
-	if (err)
-		goto fail_register;
-
-	/* done, kill the old one and finish */
-	if (old) {
-		elevator_exit(q, old);
-		blk_queue_bypass_end(q);
-	}
-
-	blk_add_trace_msg(q, "elv switch: %s", new_e->elevator_name);
+	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
-	return 0;
+	err = elevator_switch_mq(q, new_e);
 
-fail_register:
-	elevator_exit(q, q->elevator);
-fail_init:
-	/* switch failed, restore and re-register old elevator */
-	if (old) {
-		q->elevator = old;
-		elv_register_queue(q);
-		blk_queue_bypass_end(q);
-	}
+	blk_mq_unquiesce_queue(q);
+	blk_mq_unfreeze_queue(q);
 
 	return err;
 }
@@ -1032,7 +703,7 @@ ssize_t elv_iosched_store(struct request_queue *q, const char *name,
 {
 	int ret;
 
-	if (!(q->mq_ops || q->request_fn) || !elv_support_iosched(q))
+	if (!q->mq_ops || !elv_support_iosched(q))
 		return count;
 
 	ret = __elevator_change(q, name);
@@ -1047,7 +718,6 @@ ssize_t elv_iosched_show(struct request_queue *q, char *name)
 	struct elevator_queue *e = q->elevator;
 	struct elevator_type *elv = NULL;
 	struct elevator_type *__e;
-	bool uses_mq = q->mq_ops != NULL;
 	int len = 0;
 
 	if (!queue_is_rq_based(q))
@@ -1060,14 +730,11 @@ ssize_t elv_iosched_show(struct request_queue *q, char *name)
 
 	spin_lock(&elv_list_lock);
 	list_for_each_entry(__e, &elv_list, list) {
-		if (elv && elevator_match(elv, __e->elevator_name) &&
-		    (__e->uses_mq == uses_mq)) {
+		if (elv && elevator_match(elv, __e->elevator_name)) {
 			len += sprintf(name+len, "[%s] ", elv->elevator_name);
 			continue;
 		}
-		if (__e->uses_mq && q->mq_ops && elv_support_iosched(q))
-			len += sprintf(name+len, "%s ", __e->elevator_name);
-		else if (!__e->uses_mq && !q->mq_ops)
+		if (elv_support_iosched(q))
 			len += sprintf(name+len, "%s ", __e->elevator_name);
 	}
 	spin_unlock(&elv_list_lock);
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index eccac01a10b6..728757a34fa0 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -1032,7 +1032,6 @@ static struct elevator_type kyber_sched = {
 		.dispatch_request = kyber_dispatch_request,
 		.has_work = kyber_has_work,
 	},
-	.uses_mq = true,
 #ifdef CONFIG_BLK_DEBUG_FS
 	.queue_debugfs_attrs = kyber_queue_debugfs_attrs,
 	.hctx_debugfs_attrs = kyber_hctx_debugfs_attrs,
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 099a9e05854c..513edefd10fd 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -777,7 +777,6 @@ static struct elevator_type mq_deadline = {
 		.exit_sched		= dd_exit_queue,
 	},
 
-	.uses_mq	= true,
 #ifdef CONFIG_BLK_DEBUG_FS
 	.queue_debugfs_attrs = deadline_queue_debugfs_attrs,
 #endif
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2e683b1f1ee1..e87eeb47aac3 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -58,9 +58,6 @@ struct blk_stat_callback;
 
 typedef void (rq_end_io_fn)(struct request *, blk_status_t);
 
-#define BLK_RL_SYNCFULL		(1U << 0)
-#define BLK_RL_ASYNCFULL	(1U << 1)
-
 struct request_list {
 	struct request_queue	*q;	/* the queue this rl belongs to */
 #ifdef CONFIG_BLK_CGROUP
@@ -309,11 +306,8 @@ static inline unsigned short req_get_ioprio(struct request *req)
 
 struct blk_queue_ctx;
 
-typedef void (request_fn_proc) (struct request_queue *q);
 typedef blk_qc_t (make_request_fn) (struct request_queue *q, struct bio *bio);
 typedef bool (poll_q_fn) (struct request_queue *q, blk_qc_t);
-typedef int (prep_rq_fn) (struct request_queue *, struct request *);
-typedef void (unprep_rq_fn) (struct request_queue *, struct request *);
 
 struct bio_vec;
 typedef void (softirq_done_fn)(struct request *);
@@ -433,8 +427,6 @@ struct request_queue {
 	struct list_head	queue_head;
 	struct request		*last_merge;
 	struct elevator_queue	*elevator;
-	int			nr_rqs[2];	/* # allocated [a]sync rqs */
-	int			nr_rqs_elvpriv;	/* # allocated rqs w/ elvpriv */
 
 	struct blk_queue_stats	*stats;
 	struct rq_qos		*rq_qos;
@@ -447,11 +439,8 @@ struct request_queue {
 	 */
 	struct request_list	root_rl;
 
-	request_fn_proc		*request_fn;
 	make_request_fn		*make_request_fn;
 	poll_q_fn		*poll_fn;
-	prep_rq_fn		*prep_rq_fn;
-	unprep_rq_fn		*unprep_rq_fn;
 	softirq_done_fn		*softirq_done_fn;
 	rq_timed_out_fn		*rq_timed_out_fn;
 	dma_drain_needed_fn	*dma_drain_needed;
@@ -460,8 +449,6 @@ struct request_queue {
 	init_rq_fn		*init_rq_fn;
 	/* Called just before a request is freed */
 	exit_rq_fn		*exit_rq_fn;
-	/* Called from inside blk_get_request() */
-	void (*initialize_rq_fn)(struct request *rq);
 
 	const struct blk_mq_ops	*mq_ops;
 
@@ -477,17 +464,6 @@ struct request_queue {
 	struct blk_mq_hw_ctx	**queue_hw_ctx;
 	unsigned int		nr_hw_queues;
 
-	/*
-	 * Dispatch queue sorting
-	 */
-	sector_t		end_sector;
-	struct request		*boundary_rq;
-
-	/*
-	 * Delayed queue handling
-	 */
-	struct delayed_work	delay_work;
-
 	struct backing_dev_info	*backing_dev_info;
 
 	/*
@@ -550,9 +526,6 @@ struct request_queue {
 	 * queue settings
 	 */
 	unsigned long		nr_requests;	/* Max # of requests */
-	unsigned int		nr_congestion_on;
-	unsigned int		nr_congestion_off;
-	unsigned int		nr_batching;
 
 	unsigned int		dma_drain_size;
 	void			*dma_drain_buffer;
@@ -562,13 +535,6 @@ struct request_queue {
 	unsigned int		nr_sorted;
 	unsigned int		in_flight[2];
 
-	/*
-	 * Number of active block driver functions for which blk_drain_queue()
-	 * must wait. Must be incremented around functions that unlock the
-	 * queue_lock internally, e.g. scsi_request_fn().
-	 */
-	unsigned int		request_fn_active;
-
 	unsigned int		rq_timeout;
 	int			poll_nsec;
 
@@ -742,11 +708,6 @@ bool blk_queue_flag_test_and_clear(unsigned int flag, struct request_queue *q);
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
 
-static inline int queue_in_flight(struct request_queue *q)
-{
-	return q->in_flight[0] + q->in_flight[1];
-}
-
 static inline bool blk_account_rq(struct request *rq)
 {
 	return (rq->rq_flags & RQF_STARTED) && !blk_rq_is_passthrough(rq);
@@ -767,7 +728,7 @@ static inline bool blk_account_rq(struct request *rq)
  */
 static inline bool queue_is_rq_based(struct request_queue *q)
 {
-	return q->request_fn || q->mq_ops;
+	return q->mq_ops;
 }
 
 static inline unsigned int blk_queue_cluster(struct request_queue *q)
@@ -830,27 +791,6 @@ static inline bool rq_is_sync(struct request *rq)
 	return op_is_sync(rq->cmd_flags);
 }
 
-static inline bool blk_rl_full(struct request_list *rl, bool sync)
-{
-	unsigned int flag = sync ? BLK_RL_SYNCFULL : BLK_RL_ASYNCFULL;
-
-	return rl->flags & flag;
-}
-
-static inline void blk_set_rl_full(struct request_list *rl, bool sync)
-{
-	unsigned int flag = sync ? BLK_RL_SYNCFULL : BLK_RL_ASYNCFULL;
-
-	rl->flags |= flag;
-}
-
-static inline void blk_clear_rl_full(struct request_list *rl, bool sync)
-{
-	unsigned int flag = sync ? BLK_RL_SYNCFULL : BLK_RL_ASYNCFULL;
-
-	rl->flags &= ~flag;
-}
-
 static inline bool rq_mergeable(struct request *rq)
 {
 	if (blk_rq_is_passthrough(rq))
@@ -971,7 +911,6 @@ extern void blk_put_request(struct request *);
 extern void __blk_put_request(struct request_queue *, struct request *);
 extern struct request *blk_get_request(struct request_queue *, unsigned int op,
 				       blk_mq_req_flags_t flags);
-extern void blk_requeue_request(struct request_queue *, struct request *);
 extern int blk_lld_busy(struct request_queue *q);
 extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 			     struct bio_set *bs, gfp_t gfp_mask,
@@ -981,7 +920,6 @@ extern void blk_rq_unprep_clone(struct request *rq);
 extern blk_status_t blk_insert_cloned_request(struct request_queue *q,
 				     struct request *rq);
 extern int blk_rq_append_bio(struct request *rq, struct bio **bio);
-extern void blk_delay_queue(struct request_queue *, unsigned long);
 extern void blk_queue_split(struct request_queue *, struct bio **);
 extern void blk_recount_segments(struct request_queue *, struct bio *);
 extern int scsi_verify_blk_ioctl(struct block_device *, unsigned int);
@@ -994,15 +932,7 @@ extern int sg_scsi_ioctl(struct request_queue *, struct gendisk *, fmode_t,
 
 extern int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags);
 extern void blk_queue_exit(struct request_queue *q);
-extern void blk_start_queue(struct request_queue *q);
-extern void blk_start_queue_async(struct request_queue *q);
-extern void blk_stop_queue(struct request_queue *q);
 extern void blk_sync_queue(struct request_queue *q);
-extern void __blk_stop_queue(struct request_queue *q);
-extern void __blk_run_queue(struct request_queue *q);
-extern void __blk_run_queue_uncond(struct request_queue *q);
-extern void blk_run_queue(struct request_queue *);
-extern void blk_run_queue_async(struct request_queue *q);
 extern int blk_rq_map_user(struct request_queue *, struct request *,
 			   struct rq_map_data *, void __user *, unsigned long,
 			   gfp_t);
@@ -1157,13 +1087,6 @@ static inline unsigned int blk_rq_count_bios(struct request *rq)
 	return nr_bios;
 }
 
-/*
- * Request issue related functions.
- */
-extern struct request *blk_peek_request(struct request_queue *q);
-extern void blk_start_request(struct request *rq);
-extern struct request *blk_fetch_request(struct request_queue *q);
-
 void blk_steal_bios(struct bio_list *list, struct request *rq);
 
 /*
@@ -1181,9 +1104,6 @@ void blk_steal_bios(struct bio_list *list, struct request *rq);
  */
 extern bool blk_update_request(struct request *rq, blk_status_t error,
 			       unsigned int nr_bytes);
-extern void blk_finish_request(struct request *rq, blk_status_t error);
-extern bool blk_end_request(struct request *rq, blk_status_t error,
-			    unsigned int nr_bytes);
 extern void blk_end_request_all(struct request *rq, blk_status_t error);
 extern bool __blk_end_request(struct request *rq, blk_status_t error,
 			      unsigned int nr_bytes);
@@ -1192,15 +1112,10 @@ extern bool __blk_end_request_cur(struct request *rq, blk_status_t error);
 
 extern void __blk_complete_request(struct request *);
 extern void blk_abort_request(struct request *);
-extern void blk_unprep_request(struct request *);
 
 /*
  * Access functions for manipulating queue properties
  */
-extern struct request_queue *blk_init_queue_node(request_fn_proc *rfn,
-					spinlock_t *lock, int node_id);
-extern struct request_queue *blk_init_queue(request_fn_proc *, spinlock_t *);
-extern int blk_init_allocated_queue(struct request_queue *);
 extern void blk_cleanup_queue(struct request_queue *);
 extern void blk_queue_make_request(struct request_queue *, make_request_fn *);
 extern void blk_queue_bounce_limit(struct request_queue *, u64);
@@ -1242,8 +1157,6 @@ extern int blk_queue_dma_drain(struct request_queue *q,
 extern void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn);
 extern void blk_queue_segment_boundary(struct request_queue *, unsigned long);
 extern void blk_queue_virt_boundary(struct request_queue *, unsigned long);
-extern void blk_queue_prep_rq(struct request_queue *, prep_rq_fn *pfn);
-extern void blk_queue_unprep_rq(struct request_queue *, unprep_rq_fn *ufn);
 extern void blk_queue_dma_alignment(struct request_queue *, int);
 extern void blk_queue_update_dma_alignment(struct request_queue *, int);
 extern void blk_queue_softirq_done(struct request_queue *, softirq_done_fn *);
@@ -1301,7 +1214,6 @@ extern void blk_set_queue_dying(struct request_queue *);
  * schedule() where blk_schedule_flush_plug() is called.
  */
 struct blk_plug {
-	struct list_head list; /* requests */
 	struct list_head mq_list; /* blk-mq requests */
 	struct list_head cb_list; /* md requires an unplug callback */
 };
@@ -1342,8 +1254,7 @@ static inline bool blk_needs_flush_plug(struct task_struct *tsk)
 	struct blk_plug *plug = tsk->plug;
 
 	return plug &&
-		(!list_empty(&plug->list) ||
-		 !list_empty(&plug->mq_list) ||
+		 (!list_empty(&plug->mq_list) ||
 		 !list_empty(&plug->cb_list));
 }
 
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 015bb59c0331..158004f1754d 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -23,74 +23,6 @@ enum elv_merge {
 	ELEVATOR_DISCARD_MERGE	= 3,
 };
 
-typedef enum elv_merge (elevator_merge_fn) (struct request_queue *, struct request **,
-				 struct bio *);
-
-typedef void (elevator_merge_req_fn) (struct request_queue *, struct request *, struct request *);
-
-typedef void (elevator_merged_fn) (struct request_queue *, struct request *, enum elv_merge);
-
-typedef int (elevator_allow_bio_merge_fn) (struct request_queue *,
-					   struct request *, struct bio *);
-
-typedef int (elevator_allow_rq_merge_fn) (struct request_queue *,
-					  struct request *, struct request *);
-
-typedef void (elevator_bio_merged_fn) (struct request_queue *,
-						struct request *, struct bio *);
-
-typedef int (elevator_dispatch_fn) (struct request_queue *, int);
-
-typedef void (elevator_add_req_fn) (struct request_queue *, struct request *);
-typedef struct request *(elevator_request_list_fn) (struct request_queue *, struct request *);
-typedef void (elevator_completed_req_fn) (struct request_queue *, struct request *);
-typedef int (elevator_may_queue_fn) (struct request_queue *, unsigned int);
-
-typedef void (elevator_init_icq_fn) (struct io_cq *);
-typedef void (elevator_exit_icq_fn) (struct io_cq *);
-typedef int (elevator_set_req_fn) (struct request_queue *, struct request *,
-				   struct bio *, gfp_t);
-typedef void (elevator_put_req_fn) (struct request *);
-typedef void (elevator_activate_req_fn) (struct request_queue *, struct request *);
-typedef void (elevator_deactivate_req_fn) (struct request_queue *, struct request *);
-
-typedef int (elevator_init_fn) (struct request_queue *,
-				struct elevator_type *e);
-typedef void (elevator_exit_fn) (struct elevator_queue *);
-typedef void (elevator_registered_fn) (struct request_queue *);
-
-struct elevator_ops
-{
-	elevator_merge_fn *elevator_merge_fn;
-	elevator_merged_fn *elevator_merged_fn;
-	elevator_merge_req_fn *elevator_merge_req_fn;
-	elevator_allow_bio_merge_fn *elevator_allow_bio_merge_fn;
-	elevator_allow_rq_merge_fn *elevator_allow_rq_merge_fn;
-	elevator_bio_merged_fn *elevator_bio_merged_fn;
-
-	elevator_dispatch_fn *elevator_dispatch_fn;
-	elevator_add_req_fn *elevator_add_req_fn;
-	elevator_activate_req_fn *elevator_activate_req_fn;
-	elevator_deactivate_req_fn *elevator_deactivate_req_fn;
-
-	elevator_completed_req_fn *elevator_completed_req_fn;
-
-	elevator_request_list_fn *elevator_former_req_fn;
-	elevator_request_list_fn *elevator_latter_req_fn;
-
-	elevator_init_icq_fn *elevator_init_icq_fn;	/* see iocontext.h */
-	elevator_exit_icq_fn *elevator_exit_icq_fn;	/* ditto */
-
-	elevator_set_req_fn *elevator_set_req_fn;
-	elevator_put_req_fn *elevator_put_req_fn;
-
-	elevator_may_queue_fn *elevator_may_queue_fn;
-
-	elevator_init_fn *elevator_init_fn;
-	elevator_exit_fn *elevator_exit_fn;
-	elevator_registered_fn *elevator_registered_fn;
-};
-
 struct blk_mq_alloc_data;
 struct blk_mq_hw_ctx;
 
@@ -138,16 +70,15 @@ struct elevator_type
 
 	/* fields provided by elevator implementation */
 	union {
-		struct elevator_ops sq;
 		struct elevator_mq_ops mq;
 	} ops;
+
 	size_t icq_size;	/* see iocontext.h */
 	size_t icq_align;	/* ditto */
 	struct elv_fs_entry *elevator_attrs;
 	char elevator_name[ELV_NAME_MAX];
 	const char *elevator_alias;
 	struct module *elevator_owner;
-	bool uses_mq;
 #ifdef CONFIG_BLK_DEBUG_FS
 	const struct blk_mq_debugfs_attr *queue_debugfs_attrs;
 	const struct blk_mq_debugfs_attr *hctx_debugfs_attrs;
@@ -175,40 +106,25 @@ struct elevator_queue
 	struct kobject kobj;
 	struct mutex sysfs_lock;
 	unsigned int registered:1;
-	unsigned int uses_mq:1;
 	DECLARE_HASHTABLE(hash, ELV_HASH_BITS);
 };
 
 /*
  * block elevator interface
  */
-extern void elv_dispatch_sort(struct request_queue *, struct request *);
-extern void elv_dispatch_add_tail(struct request_queue *, struct request *);
-extern void elv_add_request(struct request_queue *, struct request *, int);
-extern void __elv_add_request(struct request_queue *, struct request *, int);
 extern enum elv_merge elv_merge(struct request_queue *, struct request **,
 		struct bio *);
 extern void elv_merge_requests(struct request_queue *, struct request *,
 			       struct request *);
 extern void elv_merged_request(struct request_queue *, struct request *,
 		enum elv_merge);
-extern void elv_bio_merged(struct request_queue *q, struct request *,
-				struct bio *);
 extern bool elv_attempt_insert_merge(struct request_queue *, struct request *);
-extern void elv_requeue_request(struct request_queue *, struct request *);
 extern struct request *elv_former_request(struct request_queue *, struct request *);
 extern struct request *elv_latter_request(struct request_queue *, struct request *);
-extern int elv_may_queue(struct request_queue *, unsigned int);
-extern void elv_completed_request(struct request_queue *, struct request *);
-extern int elv_set_request(struct request_queue *q, struct request *rq,
-			   struct bio *bio, gfp_t gfp_mask);
-extern void elv_put_request(struct request_queue *, struct request *);
-extern void elv_drain_elevator(struct request_queue *);
 
 /*
  * io scheduler registration
  */
-extern void __init load_default_elevator_module(void);
 extern int elv_register(struct elevator_type *);
 extern void elv_unregister(struct elevator_type *);
 
@@ -260,9 +176,5 @@ enum {
 #define rq_entry_fifo(ptr)	list_entry((ptr), struct request, queuelist)
 #define rq_fifo_clear(rq)	list_del_init(&(rq)->queuelist)
 
-#else /* CONFIG_BLOCK */
-
-static inline void load_default_elevator_module(void) { }
-
 #endif /* CONFIG_BLOCK */
 #endif
diff --git a/include/linux/init.h b/include/linux/init.h
index 9c2aba1dbabf..5255069f5a9f 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -146,7 +146,6 @@ extern unsigned int reset_devices;
 /* used by init/main.c */
 void setup_arch(char **);
 void prepare_namespace(void);
-void __init load_default_modules(void);
 int __init init_rootfs(void);
 
 #if defined(CONFIG_STRICT_KERNEL_RWX) || defined(CONFIG_STRICT_MODULE_RWX)
diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c
index d1a5d885ce13..73e02ea5d5d1 100644
--- a/init/do_mounts_initrd.c
+++ b/init/do_mounts_initrd.c
@@ -53,9 +53,6 @@ static void __init handle_initrd(void)
 	ksys_mkdir("/old", 0700);
 	ksys_chdir("/old");
 
-	/* try loading default modules from initrd */
-	load_default_modules();
-
 	/*
 	 * In case that a resume from disk is carried out by linuxrc or one of
 	 * its children, we need to tell the freezer not to wait for us.
diff --git a/init/initramfs.c b/init/initramfs.c
index 640557788026..96af18fec4d0 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -644,12 +644,6 @@ static int __init populate_rootfs(void)
 #endif
 	}
 	flush_delayed_fput();
-	/*
-	 * Try loading default modules from initramfs.  This gives
-	 * us a chance to load before device_initcalls.
-	 */
-	load_default_modules();
-
 	return 0;
 }
 rootfs_initcall(populate_rootfs);
diff --git a/init/main.c b/init/main.c
index 1c3f90264280..1f204c578627 100644
--- a/init/main.c
+++ b/init/main.c
@@ -993,17 +993,6 @@ static void __init do_pre_smp_initcalls(void)
 		do_one_initcall(initcall_from_entry(fn));
 }
 
-/*
- * This function requests modules which should be loaded by default and is
- * called twice right after initrd is mounted and right before init is
- * exec'd.  If such modules are on either initrd or rootfs, they will be
- * loaded before control is passed to userland.
- */
-void __init load_default_modules(void)
-{
-	load_default_elevator_module();
-}
-
 static int run_init_process(const char *init_filename)
 {
 	argv_init[0] = init_filename;
@@ -1177,5 +1166,4 @@ static noinline void __init kernel_init_freeable(void)
 	 */
 
 	integrity_load_keys();
-	load_default_modules();
 }
-- 
2.17.1

* [PATCH 21/28] block: remove __blk_put_request()
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (19 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 20/28] block: remove dead elevator code Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:03   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 22/28] block: kill legacy parts of timeout handling Jens Axboe
                   ` (8 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Now that there is no difference between blk_put_request() and
__blk_put_request() anymore, get rid of the underscore version and
convert the few callers.
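
The per-caller change is mechanical. As a sketch only (the callback
name below is made up; the real conversions are in the hunks that
follow), a call site goes from passing the queue to passing just the
request, since both helpers now end up in blk_mq_free_request():

  #include <linux/blkdev.h>

  /*
   * Hypothetical end_io callback, shown purely to illustrate the
   * conversion pattern.  With the legacy path gone the queue argument
   * of the underscore variant carried no information.
   */
  static void example_end_io(struct request *req, blk_status_t status)
  {
  	/* before: __blk_put_request(req->q, req); */
  	blk_put_request(req);
  }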

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c                   | 9 ---------
 block/blk-merge.c                  | 2 +-
 drivers/scsi/osd/osd_initiator.c   | 4 ++--
 drivers/scsi/osst.c                | 2 +-
 drivers/scsi/scsi_error.c          | 2 +-
 drivers/scsi/sg.c                  | 2 +-
 drivers/scsi/st.c                  | 2 +-
 drivers/target/target_core_pscsi.c | 2 +-
 include/linux/blkdev.h             | 1 -
 9 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 009ac18cc7e1..ad79e8b09801 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -800,15 +800,6 @@ void part_round_stats(struct request_queue *q, int cpu, struct hd_struct *part)
 }
 EXPORT_SYMBOL_GPL(part_round_stats);
 
-void __blk_put_request(struct request_queue *q, struct request *req)
-{
-	if (unlikely(!q))
-		return;
-
-	blk_mq_free_request(req);
-}
-EXPORT_SYMBOL_GPL(__blk_put_request);
-
 void blk_put_request(struct request *req)
 {
 	blk_mq_free_request(req);
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 5e4e30a88ced..7fedc0391610 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -839,7 +839,7 @@ int blk_attempt_req_merge(struct request_queue *q, struct request *rq,
 
 	free = attempt_merge(q, rq, next);
 	if (free) {
-		__blk_put_request(q, free);
+		blk_put_request(free);
 		return 1;
 	}
 
diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index e19fa883376f..60cf7c5eb880 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -506,11 +506,11 @@ static void osd_request_async_done(struct request *req, blk_status_t error)
 
 	_set_error_resid(or, req, error);
 	if (req->next_rq) {
-		__blk_put_request(req->q, req->next_rq);
+		blk_put_request(req->next_rq);
 		req->next_rq = NULL;
 	}
 
-	__blk_put_request(req->q, req);
+	blk_put_request(req);
 	or->request = NULL;
 	or->in.req = NULL;
 	or->out.req = NULL;
diff --git a/drivers/scsi/osst.c b/drivers/scsi/osst.c
index 7a1a1edde35d..664c1238a87f 100644
--- a/drivers/scsi/osst.c
+++ b/drivers/scsi/osst.c
@@ -341,7 +341,7 @@ static void osst_end_async(struct request *req, blk_status_t status)
 		blk_rq_unmap_user(SRpnt->bio);
 	}
 
-	__blk_put_request(req->q, req);
+	blk_put_request(req);
 }
 
 /* osst_request memory management */
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index fff128aa9ec2..dd338a8cd275 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1932,7 +1932,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 
 static void eh_lock_door_done(struct request *req, blk_status_t status)
 {
-	__blk_put_request(req->q, req);
+	blk_put_request(req);
 }
 
 /**
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index c6ad00703c5b..4e27460ec926 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1390,7 +1390,7 @@ sg_rq_end_io(struct request *rq, blk_status_t status)
 	 */
 	srp->rq = NULL;
 	scsi_req_free_cmd(scsi_req(rq));
-	__blk_put_request(rq->q, rq);
+	blk_put_request(rq);
 
 	write_lock_irqsave(&sfp->rq_list_lock, iflags);
 	if (unlikely(srp->orphan)) {
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index 307df2fa39a3..7ff22d3f03e3 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -530,7 +530,7 @@ static void st_scsi_execute_end(struct request *req, blk_status_t status)
 		complete(SRpnt->waiting);
 
 	blk_rq_unmap_user(tmp);
-	__blk_put_request(req->q, req);
+	blk_put_request(req);
 }
 
 static int st_scsi_execute(struct st_request *SRpnt, const unsigned char *cmd,
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 47d76c862014..c062d363dce3 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -1094,7 +1094,7 @@ static void pscsi_req_done(struct request *req, blk_status_t status)
 		break;
 	}
 
-	__blk_put_request(req->q, req);
+	blk_put_request(req);
 	kfree(pt);
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e87eeb47aac3..33a81cfde6a6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -908,7 +908,6 @@ extern blk_qc_t direct_make_request(struct bio *bio);
 extern void blk_rq_init(struct request_queue *q, struct request *rq);
 extern void blk_init_request_from_bio(struct request *req, struct bio *bio);
 extern void blk_put_request(struct request *);
-extern void __blk_put_request(struct request_queue *, struct request *);
 extern struct request *blk_get_request(struct request_queue *, unsigned int op,
 				       blk_mq_req_flags_t flags);
 extern int blk_lld_busy(struct request_queue *q);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 22/28] block: kill legacy parts of timeout handling
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (20 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 21/28] block: remove __blk_put_request() Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:04   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 23/28] block: kill lld busy Jens Axboe
                   ` (7 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

The only remaining user of the legacy timeout hook is BSG, and its
handler is invoked from the blk-mq timeout handler. Kill the legacy
code, and rename q->rq_timed_out_fn to q->bsg_job_timeout_fn.
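
As a minimal sketch (hypothetical transport driver, not part of this patch),
the handler stored in q->bsg_job_timeout_fn keeps the usual rq_timed_out_fn
prototype:

    static enum blk_eh_timer_return my_bsg_job_timeout(struct request *rq)
    {
            /* abort or escalate the stalled job here */
            return BLK_EH_RESET_TIMER;      /* or BLK_EH_DONE once handled */
    }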

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c       |  1 -
 block/blk-settings.c   |  7 ---
 block/blk-timeout.c    | 99 +++---------------------------------------
 block/blk.h            |  1 -
 block/bsg-lib.c        |  6 +--
 include/linux/blkdev.h |  4 +-
 6 files changed, 11 insertions(+), 107 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ad79e8b09801..4c39c7865f9c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -653,7 +653,6 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id,
 		    laptop_mode_timer_fn, 0);
 	timer_setup(&q->timeout, blk_rq_timed_out_timer, 0);
 	INIT_WORK(&q->timeout_work, NULL);
-	INIT_LIST_HEAD(&q->timeout_list);
 	INIT_LIST_HEAD(&q->icq_list);
 #ifdef CONFIG_BLK_CGROUP
 	INIT_LIST_HEAD(&q->blkg_list);
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 764d0e9e092f..3c5da75c2def 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -32,13 +32,6 @@ void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
 }
 EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
 
-void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
-{
-	WARN_ON_ONCE(q->mq_ops);
-	q->rq_timed_out_fn = fn;
-}
-EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
-
 void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn)
 {
 	q->lld_busy_fn = fn;
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index f2cfd56e1606..6428d458072a 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -78,70 +78,6 @@ void blk_delete_timer(struct request *req)
 	list_del_init(&req->timeout_list);
 }
 
-static void blk_rq_timed_out(struct request *req)
-{
-	struct request_queue *q = req->q;
-	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
-
-	if (q->rq_timed_out_fn)
-		ret = q->rq_timed_out_fn(req);
-	switch (ret) {
-	case BLK_EH_RESET_TIMER:
-		blk_add_timer(req);
-		blk_clear_rq_complete(req);
-		break;
-	case BLK_EH_DONE:
-		/*
-		 * LLD handles this for now but in the future
-		 * we can send a request msg to abort the command
-		 * and we can move more of the generic scsi eh code to
-		 * the blk layer.
-		 */
-		break;
-	default:
-		printk(KERN_ERR "block: bad eh return: %d\n", ret);
-		break;
-	}
-}
-
-static void blk_rq_check_expired(struct request *rq, unsigned long *next_timeout,
-			  unsigned int *next_set)
-{
-	const unsigned long deadline = blk_rq_deadline(rq);
-
-	if (time_after_eq(jiffies, deadline)) {
-		list_del_init(&rq->timeout_list);
-
-		/*
-		 * Check if we raced with end io completion
-		 */
-		if (!blk_mark_rq_complete(rq))
-			blk_rq_timed_out(rq);
-	} else if (!*next_set || time_after(*next_timeout, deadline)) {
-		*next_timeout = deadline;
-		*next_set = 1;
-	}
-}
-
-void blk_timeout_work(struct work_struct *work)
-{
-	struct request_queue *q =
-		container_of(work, struct request_queue, timeout_work);
-	unsigned long flags, next = 0;
-	struct request *rq, *tmp;
-	int next_set = 0;
-
-	spin_lock_irqsave(q->queue_lock, flags);
-
-	list_for_each_entry_safe(rq, tmp, &q->timeout_list, timeout_list)
-		blk_rq_check_expired(rq, &next, &next_set);
-
-	if (next_set)
-		mod_timer(&q->timeout, round_jiffies_up(next));
-
-	spin_unlock_irqrestore(q->queue_lock, flags);
-}
-
 /**
  * blk_abort_request -- Request request recovery for the specified command
  * @req:	pointer to the request of interest
@@ -153,20 +89,13 @@ void blk_timeout_work(struct work_struct *work)
  */
 void blk_abort_request(struct request *req)
 {
-	if (req->q->mq_ops) {
-		/*
-		 * All we need to ensure is that timeout scan takes place
-		 * immediately and that scan sees the new timeout value.
-		 * No need for fancy synchronizations.
-		 */
-		blk_rq_set_deadline(req, jiffies);
-		kblockd_schedule_work(&req->q->timeout_work);
-	} else {
-		if (blk_mark_rq_complete(req))
-			return;
-		blk_delete_timer(req);
-		blk_rq_timed_out(req);
-	}
+	/*
+	 * All we need to ensure is that timeout scan takes place
+	 * immediately and that scan sees the new timeout value.
+	 * No need for fancy synchronizations.
+	 */
+	blk_rq_set_deadline(req, jiffies);
+	kblockd_schedule_work(&req->q->timeout_work);
 }
 EXPORT_SYMBOL_GPL(blk_abort_request);
 
@@ -194,13 +123,6 @@ void blk_add_timer(struct request *req)
 	struct request_queue *q = req->q;
 	unsigned long expiry;
 
-	if (!q->mq_ops)
-		lockdep_assert_held(q->queue_lock);
-
-	/* blk-mq has its own handler, so we don't need ->rq_timed_out_fn */
-	if (!q->mq_ops && !q->rq_timed_out_fn)
-		return;
-
 	BUG_ON(!list_empty(&req->timeout_list));
 
 	/*
@@ -213,13 +135,6 @@ void blk_add_timer(struct request *req)
 	req->rq_flags &= ~RQF_TIMED_OUT;
 	blk_rq_set_deadline(req, jiffies + req->timeout);
 
-	/*
-	 * Only the non-mq case needs to add the request to a protected list.
-	 * For the mq case we simply scan the tag map.
-	 */
-	if (!q->mq_ops)
-		list_add_tail(&req->timeout_list, &req->q->timeout_list);
-
 	/*
 	 * If the timer isn't already pending or this timeout is earlier
 	 * than an existing one, modify the timer. Round up to next nearest
diff --git a/block/blk.h b/block/blk.h
index e2604ae7ddfa..4ae6cacb4548 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -224,7 +224,6 @@ static inline bool bio_integrity_endio(struct bio *bio)
 }
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
-void blk_timeout_work(struct work_struct *work);
 unsigned long blk_rq_timeout(unsigned long timeout);
 void blk_add_timer(struct request *req);
 void blk_delete_timer(struct request *);
diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index ef176b472914..3a5938205825 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -305,8 +305,8 @@ static enum blk_eh_timer_return bsg_timeout(struct request *rq, bool reserved)
 	enum blk_eh_timer_return ret = BLK_EH_DONE;
 	struct request_queue *q = rq->q;
 
-	if (q->rq_timed_out_fn)
-		ret = q->rq_timed_out_fn(rq);
+	if (q->bsg_job_timeout_fn)
+		ret = q->bsg_job_timeout_fn(rq);
 
 	return ret;
 }
@@ -355,9 +355,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 
 	q->queuedata = dev;
 	q->bsg_job_fn = job_fn;
+	q->bsg_job_timeout_fn = timeout;
 	blk_queue_flag_set(QUEUE_FLAG_BIDI, q);
 	blk_queue_rq_timeout(q, BLK_DEFAULT_SG_TIMEOUT);
-	q->rq_timed_out_fn = timeout;
 
 	ret = bsg_register_queue(q, dev, name, &bsg_transport_ops);
 	if (ret) {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 33a81cfde6a6..bfe40de81c19 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -442,7 +442,6 @@ struct request_queue {
 	make_request_fn		*make_request_fn;
 	poll_q_fn		*poll_fn;
 	softirq_done_fn		*softirq_done_fn;
-	rq_timed_out_fn		*rq_timed_out_fn;
 	dma_drain_needed_fn	*dma_drain_needed;
 	lld_busy_fn		*lld_busy_fn;
 	/* Called just after a request is allocated */
@@ -543,7 +542,6 @@ struct request_queue {
 
 	struct timer_list	timeout;
 	struct work_struct	timeout_work;
-	struct list_head	timeout_list;
 
 	struct list_head	icq_list;
 #ifdef CONFIG_BLK_CGROUP
@@ -603,6 +601,7 @@ struct request_queue {
 
 #if defined(CONFIG_BLK_DEV_BSG)
 	bsg_job_fn		*bsg_job_fn;
+	rq_timed_out_fn		*bsg_job_timeout_fn;
 	struct bsg_class_device bsg_dev;
 #endif
 
@@ -1159,7 +1158,6 @@ extern void blk_queue_virt_boundary(struct request_queue *, unsigned long);
 extern void blk_queue_dma_alignment(struct request_queue *, int);
 extern void blk_queue_update_dma_alignment(struct request_queue *, int);
 extern void blk_queue_softirq_done(struct request_queue *, softirq_done_fn *);
-extern void blk_queue_rq_timed_out(struct request_queue *, rq_timed_out_fn *);
 extern void blk_queue_rq_timeout(struct request_queue *, unsigned int);
 extern void blk_queue_flush_queueable(struct request_queue *q, bool queueable);
 extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fua);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 23/28] block: kill lld busy
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (21 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 22/28] block: kill legacy parts of timeout handling Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-25 21:42   ` Bart Van Assche
  2018-10-29  7:10   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 24/28] block: remove request_list code Jens Axboe
                   ` (6 subsequent siblings)
  29 siblings, 2 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Nobody sets the helper, so we always return 0. Kill it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c       | 28 ----------------------------
 block/blk-settings.c   |  6 ------
 drivers/md/dm-mpath.c  |  4 +---
 include/linux/blkdev.h |  4 ----
 4 files changed, 1 insertion(+), 41 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4c39c7865f9c..dd1328f4dc31 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1795,34 +1795,6 @@ void rq_flush_dcache_pages(struct request *rq)
 EXPORT_SYMBOL_GPL(rq_flush_dcache_pages);
 #endif
 
-/**
- * blk_lld_busy - Check if underlying low-level drivers of a device are busy
- * @q : the queue of the device being checked
- *
- * Description:
- *    Check if underlying low-level drivers of a device are busy.
- *    If the drivers want to export their busy state, they must set own
- *    exporting function using blk_queue_lld_busy() first.
- *
- *    Basically, this function is used only by request stacking drivers
- *    to stop dispatching requests to underlying devices when underlying
- *    devices are busy.  This behavior helps more I/O merging on the queue
- *    of the request stacking driver and prevents I/O throughput regression
- *    on burst I/O load.
- *
- * Return:
- *    0 - Not busy (The request stacking driver should dispatch request)
- *    1 - Busy (The request stacking driver should stop dispatching request)
- */
-int blk_lld_busy(struct request_queue *q)
-{
-	if (q->lld_busy_fn)
-		return q->lld_busy_fn(q);
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(blk_lld_busy);
-
 /**
  * blk_rq_unprep_clone - Helper function to free all bios in a cloned request
  * @rq: the clone request to be cleaned up
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 3c5da75c2def..1895f499bbe5 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -32,12 +32,6 @@ void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
 }
 EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
 
-void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn)
-{
-	q->lld_busy_fn = fn;
-}
-EXPORT_SYMBOL_GPL(blk_queue_lld_busy);
-
 /**
  * blk_set_default_limits - reset limits to default values
  * @lim:  the queue_limits structure to reset
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index a24ed3973e7c..4d736e0fd67f 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1936,9 +1936,7 @@ static int multipath_iterate_devices(struct dm_target *ti,
 
 static int pgpath_busy(struct pgpath *pgpath)
 {
-	struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev);
-
-	return blk_lld_busy(q);
+	return 0;
 }
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bfe40de81c19..115199e7c581 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -312,7 +312,6 @@ typedef bool (poll_q_fn) (struct request_queue *q, blk_qc_t);
 struct bio_vec;
 typedef void (softirq_done_fn)(struct request *);
 typedef int (dma_drain_needed_fn)(struct request *);
-typedef int (lld_busy_fn) (struct request_queue *q);
 typedef int (bsg_job_fn) (struct bsg_job *);
 typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
 typedef void (exit_rq_fn)(struct request_queue *, struct request *);
@@ -443,7 +442,6 @@ struct request_queue {
 	poll_q_fn		*poll_fn;
 	softirq_done_fn		*softirq_done_fn;
 	dma_drain_needed_fn	*dma_drain_needed;
-	lld_busy_fn		*lld_busy_fn;
 	/* Called just after a request is allocated */
 	init_rq_fn		*init_rq_fn;
 	/* Called just before a request is freed */
@@ -909,7 +907,6 @@ extern void blk_init_request_from_bio(struct request *req, struct bio *bio);
 extern void blk_put_request(struct request *);
 extern struct request *blk_get_request(struct request_queue *, unsigned int op,
 				       blk_mq_req_flags_t flags);
-extern int blk_lld_busy(struct request_queue *q);
 extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 			     struct bio_set *bs, gfp_t gfp_mask,
 			     int (*bio_ctr)(struct bio *, struct bio *, void *),
@@ -1152,7 +1149,6 @@ extern void blk_queue_update_dma_pad(struct request_queue *, unsigned int);
 extern int blk_queue_dma_drain(struct request_queue *q,
 			       dma_drain_needed_fn *dma_drain_needed,
 			       void *buf, unsigned int size);
-extern void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn);
 extern void blk_queue_segment_boundary(struct request_queue *, unsigned long);
 extern void blk_queue_virt_boundary(struct request_queue *, unsigned long);
 extern void blk_queue_dma_alignment(struct request_queue *, int);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 24/28] block: remove request_list code
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (22 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 23/28] block: kill lld busy Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:10   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 25/28] block: kill request slab cache Jens Axboe
                   ` (5 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

It's now dead code; nobody uses it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-cgroup.c         |  47 ----------------
 block/blk-core.c           |  75 --------------------------
 block/blk-mq.c             |   4 --
 block/blk.h                |   3 --
 include/linux/blk-cgroup.h | 108 -------------------------------------
 include/linux/blkdev.h     |  34 ------------
 6 files changed, 271 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 5f10d755ec52..020869a37d11 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -76,9 +76,6 @@ static void blkg_free(struct blkcg_gq *blkg)
 		if (blkg->pd[i])
 			blkcg_policy[i]->pd_free_fn(blkg->pd[i]);
 
-	if (blkg->blkcg != &blkcg_root)
-		blk_exit_rl(blkg->q, &blkg->rl);
-
 	blkg_rwstat_exit(&blkg->stat_ios);
 	blkg_rwstat_exit(&blkg->stat_bytes);
 	kfree(blkg);
@@ -142,13 +139,6 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q,
 	INIT_LIST_HEAD(&blkg->q_node);
 	blkg->blkcg = blkcg;
 
-	/* root blkg uses @q->root_rl, init rl only for !root blkgs */
-	if (blkcg != &blkcg_root) {
-		if (blk_init_rl(&blkg->rl, q, gfp_mask))
-			goto err_free;
-		blkg->rl.blkg = blkg;
-	}
-
 	for (i = 0; i < BLKCG_MAX_POLS; i++) {
 		struct blkcg_policy *pol = blkcg_policy[i];
 		struct blkg_policy_data *pd;
@@ -448,42 +438,6 @@ static void blkg_destroy_all(struct request_queue *q)
 	}
 
 	q->root_blkg = NULL;
-	q->root_rl.blkg = NULL;
-}
-
-/*
- * The next function used by blk_queue_for_each_rl().  It's a bit tricky
- * because the root blkg uses @q->root_rl instead of its own rl.
- */
-struct request_list *__blk_queue_next_rl(struct request_list *rl,
-					 struct request_queue *q)
-{
-	struct list_head *ent;
-	struct blkcg_gq *blkg;
-
-	/*
-	 * Determine the current blkg list_head.  The first entry is
-	 * root_rl which is off @q->blkg_list and mapped to the head.
-	 */
-	if (rl == &q->root_rl) {
-		ent = &q->blkg_list;
-		/* There are no more block groups, hence no request lists */
-		if (list_empty(ent))
-			return NULL;
-	} else {
-		blkg = container_of(rl, struct blkcg_gq, rl);
-		ent = &blkg->q_node;
-	}
-
-	/* walk to the next list_head, skip root blkcg */
-	ent = ent->next;
-	if (ent == &q->root_blkg->q_node)
-		ent = ent->next;
-	if (ent == &q->blkg_list)
-		return NULL;
-
-	blkg = container_of(ent, struct blkcg_gq, q_node);
-	return &blkg->rl;
 }
 
 static int blkcg_reset_stats(struct cgroup_subsys_state *css,
@@ -1278,7 +1232,6 @@ int blkcg_init_queue(struct request_queue *q)
 	if (IS_ERR(blkg))
 		goto err_unlock;
 	q->root_blkg = blkg;
-	q->root_rl.blkg = blkg;
 	spin_unlock_irq(q->queue_lock);
 	rcu_read_unlock();
 
diff --git a/block/blk-core.c b/block/blk-core.c
index dd1328f4dc31..e8f60ed456a2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -447,81 +447,6 @@ void blk_cleanup_queue(struct request_queue *q)
 }
 EXPORT_SYMBOL(blk_cleanup_queue);
 
-/* Allocate memory local to the request queue */
-static void *alloc_request_simple(gfp_t gfp_mask, void *data)
-{
-	struct request_queue *q = data;
-
-	return kmem_cache_alloc_node(request_cachep, gfp_mask, q->node);
-}
-
-static void free_request_simple(void *element, void *data)
-{
-	kmem_cache_free(request_cachep, element);
-}
-
-static void *alloc_request_size(gfp_t gfp_mask, void *data)
-{
-	struct request_queue *q = data;
-	struct request *rq;
-
-	rq = kmalloc_node(sizeof(struct request) + q->cmd_size, gfp_mask,
-			q->node);
-	if (rq && q->init_rq_fn && q->init_rq_fn(q, rq, gfp_mask) < 0) {
-		kfree(rq);
-		rq = NULL;
-	}
-	return rq;
-}
-
-static void free_request_size(void *element, void *data)
-{
-	struct request_queue *q = data;
-
-	if (q->exit_rq_fn)
-		q->exit_rq_fn(q, element);
-	kfree(element);
-}
-
-int blk_init_rl(struct request_list *rl, struct request_queue *q,
-		gfp_t gfp_mask)
-{
-	if (unlikely(rl->rq_pool) || q->mq_ops)
-		return 0;
-
-	rl->q = q;
-	rl->count[BLK_RW_SYNC] = rl->count[BLK_RW_ASYNC] = 0;
-	rl->starved[BLK_RW_SYNC] = rl->starved[BLK_RW_ASYNC] = 0;
-	init_waitqueue_head(&rl->wait[BLK_RW_SYNC]);
-	init_waitqueue_head(&rl->wait[BLK_RW_ASYNC]);
-
-	if (q->cmd_size) {
-		rl->rq_pool = mempool_create_node(BLKDEV_MIN_RQ,
-				alloc_request_size, free_request_size,
-				q, gfp_mask, q->node);
-	} else {
-		rl->rq_pool = mempool_create_node(BLKDEV_MIN_RQ,
-				alloc_request_simple, free_request_simple,
-				q, gfp_mask, q->node);
-	}
-	if (!rl->rq_pool)
-		return -ENOMEM;
-
-	if (rl != &q->root_rl)
-		WARN_ON_ONCE(!blk_get_queue(q));
-
-	return 0;
-}
-
-void blk_exit_rl(struct request_queue *q, struct request_list *rl)
-{
-	if (rl->rq_pool) {
-		mempool_destroy(rl->rq_pool);
-		if (rl != &q->root_rl)
-			blk_put_queue(q);
-	}
-}
-
 struct request_queue *blk_alloc_queue(gfp_t gfp_mask)
 {
 	return blk_alloc_queue_node(gfp_mask, NUMA_NO_NODE, NULL);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a58d2d953876..d43c9232c77c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -326,10 +326,6 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	rq->end_io_data = NULL;
 	rq->next_rq = NULL;
 
-#ifdef CONFIG_BLK_CGROUP
-	rq->rl = NULL;
-#endif
-
 	data->ctx->rq_dispatched[op_is_sync(op)]++;
 	refcount_set(&rq->ref, 1);
 	return rq;
diff --git a/block/blk.h b/block/blk.h
index 4ae6cacb4548..e925cf4fe4de 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -120,9 +120,6 @@ struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q,
 		int node, int cmd_size, gfp_t flags);
 void blk_free_flush_queue(struct blk_flush_queue *q);
 
-int blk_init_rl(struct request_list *rl, struct request_queue *q,
-		gfp_t gfp_mask);
-void blk_exit_rl(struct request_queue *q, struct request_list *rl);
 void blk_exit_queue(struct request_queue *q);
 void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 			struct bio *bio);
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 1e76ceebeb5d..f2c067071336 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -122,9 +122,6 @@ struct blkcg_gq {
 	/* all non-root blkcg_gq's are guaranteed to have access to parent */
 	struct blkcg_gq			*parent;
 
-	/* request allocation list for this blkcg-q pair */
-	struct request_list		rl;
-
 	/* reference count */
 	struct percpu_ref		refcnt;
 
@@ -561,105 +558,6 @@ static inline void blkg_put(struct blkcg_gq *blkg)
 		if (((d_blkg) = __blkg_lookup(css_to_blkcg(pos_css),	\
 					      (p_blkg)->q, false)))
 
-/**
- * blk_get_rl - get request_list to use
- * @q: request_queue of interest
- * @bio: bio which will be attached to the allocated request (may be %NULL)
- *
- * The caller wants to allocate a request from @q to use for @bio.  Find
- * the request_list to use and obtain a reference on it.  Should be called
- * under queue_lock.  This function is guaranteed to return non-%NULL
- * request_list.
- */
-static inline struct request_list *blk_get_rl(struct request_queue *q,
-					      struct bio *bio)
-{
-	struct blkcg *blkcg;
-	struct blkcg_gq *blkg;
-
-	rcu_read_lock();
-
-	if (bio && bio->bi_blkg) {
-		blkcg = bio->bi_blkg->blkcg;
-		if (blkcg == &blkcg_root)
-			goto rl_use_root;
-
-		blkg_get(bio->bi_blkg);
-		rcu_read_unlock();
-		return &bio->bi_blkg->rl;
-	}
-
-	blkcg = css_to_blkcg(blkcg_css());
-	if (blkcg == &blkcg_root)
-		goto rl_use_root;
-
-	blkg = blkg_lookup(blkcg, q);
-	if (unlikely(!blkg))
-		blkg = __blkg_lookup_create(blkcg, q);
-
-	if (blkg->blkcg == &blkcg_root || !blkg_tryget(blkg))
-		goto rl_use_root;
-
-	rcu_read_unlock();
-	return &blkg->rl;
-
-	/*
-	 * Each blkg has its own request_list, however, the root blkcg
-	 * uses the request_queue's root_rl.  This is to avoid most
-	 * overhead for the root blkcg.
-	 */
-rl_use_root:
-	rcu_read_unlock();
-	return &q->root_rl;
-}
-
-/**
- * blk_put_rl - put request_list
- * @rl: request_list to put
- *
- * Put the reference acquired by blk_get_rl().  Should be called under
- * queue_lock.
- */
-static inline void blk_put_rl(struct request_list *rl)
-{
-	if (rl->blkg->blkcg != &blkcg_root)
-		blkg_put(rl->blkg);
-}
-
-/**
- * blk_rq_set_rl - associate a request with a request_list
- * @rq: request of interest
- * @rl: target request_list
- *
- * Associate @rq with @rl so that accounting and freeing can know the
- * request_list @rq came from.
- */
-static inline void blk_rq_set_rl(struct request *rq, struct request_list *rl)
-{
-	rq->rl = rl;
-}
-
-/**
- * blk_rq_rl - return the request_list a request came from
- * @rq: request of interest
- *
- * Return the request_list @rq is allocated from.
- */
-static inline struct request_list *blk_rq_rl(struct request *rq)
-{
-	return rq->rl;
-}
-
-struct request_list *__blk_queue_next_rl(struct request_list *rl,
-					 struct request_queue *q);
-/**
- * blk_queue_for_each_rl - iterate through all request_lists of a request_queue
- *
- * Should be used under queue_lock.
- */
-#define blk_queue_for_each_rl(rl, q)	\
-	for ((rl) = &(q)->root_rl; (rl); (rl) = __blk_queue_next_rl((rl), (q)))
-
 static inline int blkg_stat_init(struct blkg_stat *stat, gfp_t gfp)
 {
 	int ret;
@@ -993,12 +891,6 @@ static inline char *blkg_path(struct blkcg_gq *blkg) { return NULL; }
 static inline void blkg_get(struct blkcg_gq *blkg) { }
 static inline void blkg_put(struct blkcg_gq *blkg) { }
 
-static inline struct request_list *blk_get_rl(struct request_queue *q,
-					      struct bio *bio) { return &q->root_rl; }
-static inline void blk_put_rl(struct request_list *rl) { }
-static inline void blk_rq_set_rl(struct request *rq, struct request_list *rl) { }
-static inline struct request_list *blk_rq_rl(struct request *rq) { return &rq->q->root_rl; }
-
 static inline void blkcg_bio_issue_init(struct bio *bio) { }
 static inline bool blkcg_bio_issue_check(struct request_queue *q,
 					 struct bio *bio) { return true; }
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 115199e7c581..95e119409490 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -58,22 +58,6 @@ struct blk_stat_callback;
 
 typedef void (rq_end_io_fn)(struct request *, blk_status_t);
 
-struct request_list {
-	struct request_queue	*q;	/* the queue this rl belongs to */
-#ifdef CONFIG_BLK_CGROUP
-	struct blkcg_gq		*blkg;	/* blkg this request pool belongs to */
-#endif
-	/*
-	 * count[], starved[], and wait[] are indexed by
-	 * BLK_RW_SYNC/BLK_RW_ASYNC
-	 */
-	int			count[2];
-	int			starved[2];
-	mempool_t		*rq_pool;
-	wait_queue_head_t	wait[2];
-	unsigned int		flags;
-};
-
 /*
  * request flags */
 typedef __u32 __bitwise req_flags_t;
@@ -259,10 +243,6 @@ struct request {
 
 	/* for bidi */
 	struct request *next_rq;
-
-#ifdef CONFIG_BLK_CGROUP
-	struct request_list *rl;		/* rl this rq is alloced from */
-#endif
 };
 
 static inline bool blk_op_is_scsi(unsigned int op)
@@ -313,8 +293,6 @@ struct bio_vec;
 typedef void (softirq_done_fn)(struct request *);
 typedef int (dma_drain_needed_fn)(struct request *);
 typedef int (bsg_job_fn) (struct bsg_job *);
-typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
-typedef void (exit_rq_fn)(struct request_queue *, struct request *);
 
 enum blk_eh_timer_return {
 	BLK_EH_DONE,		/* drivers has completed the command */
@@ -430,22 +408,10 @@ struct request_queue {
 	struct blk_queue_stats	*stats;
 	struct rq_qos		*rq_qos;
 
-	/*
-	 * If blkcg is not used, @q->root_rl serves all requests.  If blkcg
-	 * is used, root blkg allocates from @q->root_rl and all other
-	 * blkgs from their own blkg->rl.  Which one to use should be
-	 * determined using bio_request_list().
-	 */
-	struct request_list	root_rl;
-
 	make_request_fn		*make_request_fn;
 	poll_q_fn		*poll_fn;
 	softirq_done_fn		*softirq_done_fn;
 	dma_drain_needed_fn	*dma_drain_needed;
-	/* Called just after a request is allocated */
-	init_rq_fn		*init_rq_fn;
-	/* Called just before a request is freed */
-	exit_rq_fn		*exit_rq_fn;
 
 	const struct blk_mq_ops	*mq_ops;
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 25/28] block: kill request slab cache
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (23 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 24/28] block: remove request_list code Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:11   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 26/28] block: remove req_no_special_merge() from merging code Jens Axboe
                   ` (4 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-core.c | 8 --------
 block/blk.h      | 1 -
 2 files changed, 9 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index e8f60ed456a2..6074bf2ee2e7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -57,11 +57,6 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(block_unplug);
 
 DEFINE_IDA(blk_queue_ida);
 
-/*
- * For the allocated request tables
- */
-struct kmem_cache *request_cachep;
-
 /*
  * For queue allocation
  */
@@ -1941,9 +1936,6 @@ int __init blk_dev_init(void)
 	if (!kblockd_workqueue)
 		panic("Failed to create kblockd\n");
 
-	request_cachep = kmem_cache_create("blkdev_requests",
-			sizeof(struct request), 0, SLAB_PANIC, NULL);
-
 	blk_requestq_cachep = kmem_cache_create("request_queue",
 			sizeof(struct request_queue), 0, SLAB_PANIC, NULL);
 
diff --git a/block/blk.h b/block/blk.h
index e925cf4fe4de..2bf1cfeeb9c0 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -32,7 +32,6 @@ struct blk_flush_queue {
 };
 
 extern struct kmem_cache *blk_requestq_cachep;
-extern struct kmem_cache *request_cachep;
 extern struct kobj_type blk_queue_ktype;
 extern struct ida blk_queue_ida;
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 26/28] block: remove req_no_special_merge() from merging code
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (24 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 25/28] block: kill request slab cache Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:12   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 27/28] blk-merge: kill dead queue lock held check Jens Axboe
                   ` (3 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

It'll always be false at this point, since every remaining queue has
mq_ops attached, so just remove it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-merge.c | 25 +++----------------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 7fedc0391610..3561dcce2260 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -595,17 +595,6 @@ int ll_front_merge_fn(struct request_queue *q, struct request *req,
 	return ll_new_hw_segment(q, req, bio);
 }
 
-/*
- * blk-mq uses req->special to carry normal driver per-request payload, it
- * does not indicate a prepared command that we cannot merge with.
- */
-static bool req_no_special_merge(struct request *req)
-{
-	struct request_queue *q = req->q;
-
-	return !q->mq_ops && req->special;
-}
-
 static bool req_attempt_discard_merge(struct request_queue *q, struct request *req,
 		struct request *next)
 {
@@ -631,13 +620,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 	unsigned int seg_size =
 		req->biotail->bi_seg_back_size + next->bio->bi_seg_front_size;
 
-	/*
-	 * First check if the either of the requests are re-queued
-	 * requests.  Can't merge them if they are.
-	 */
-	if (req_no_special_merge(req) || req_no_special_merge(next))
-		return 0;
-
 	if (req_gap_back_merge(req, next->bio))
 		return 0;
 
@@ -738,8 +720,7 @@ static struct request *attempt_merge(struct request_queue *q,
 		return NULL;
 
 	if (rq_data_dir(req) != rq_data_dir(next)
-	    || req->rq_disk != next->rq_disk
-	    || req_no_special_merge(next))
+	    || req->rq_disk != next->rq_disk)
 		return NULL;
 
 	if (req_op(req) == REQ_OP_WRITE_SAME &&
@@ -858,8 +839,8 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 	if (bio_data_dir(bio) != rq_data_dir(rq))
 		return false;
 
-	/* must be same device and not a special request */
-	if (rq->rq_disk != bio->bi_disk || req_no_special_merge(rq))
+	/* must be same device */
+	if (rq->rq_disk != bio->bi_disk)
 		return false;
 
 	/* only merge integrity protected bio into ditto rq */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 27/28] blk-merge: kill dead queue lock held check
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (25 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 26/28] block: remove req_no_special_merge() from merging code Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:12   ` Hannes Reinecke
  2018-10-25 21:10 ` [PATCH 28/28] block: get rid of blk_queued_rq() Jens Axboe
                   ` (2 subsequent siblings)
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

This is dead code; any queue reaching this point has mq_ops
attached.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-merge.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 3561dcce2260..0128284bded4 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -704,9 +704,6 @@ static void blk_account_io_merge(struct request *req)
 static struct request *attempt_merge(struct request_queue *q,
 				     struct request *req, struct request *next)
 {
-	if (!q->mq_ops)
-		lockdep_assert_held(q->queue_lock);
-
 	if (!rq_mergeable(req) || !rq_mergeable(next))
 		return NULL;
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 28/28] block: get rid of blk_queued_rq()
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (26 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 27/28] blk-merge: kill dead queue lock held check Jens Axboe
@ 2018-10-25 21:10 ` Jens Axboe
  2018-10-29  7:12   ` Hannes Reinecke
  2018-10-25 23:09 ` [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Bart Van Assche
  2018-10-29 12:00 ` Ming Lei
  29 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:10 UTC (permalink / raw)
  To: linux-block, linux-scsi, linux-ide; +Cc: Jens Axboe

No point in hiding what this does, so just open code it in the
one spot where we are still using it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c         | 2 +-
 include/linux/blkdev.h | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d43c9232c77c..21e4147c4810 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -692,7 +692,7 @@ void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
 	/* this request will be re-inserted to io scheduler queue */
 	blk_mq_sched_requeue_request(rq);
 
-	BUG_ON(blk_queued_rq(rq));
+	BUG_ON(!list_empty(&rq->queuelist));
 	blk_mq_add_to_requeue_list(rq, true, kick_requeue_list);
 }
 EXPORT_SYMBOL(blk_mq_requeue_request);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 95e119409490..82b6cf45c6e0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -678,8 +678,6 @@ static inline bool blk_account_rq(struct request *rq)
 
 #define blk_rq_cpu_valid(rq)	((rq)->cpu != -1)
 #define blk_bidi_rq(rq)		((rq)->next_rq != NULL)
-/* rq->queuelist of dequeued request must be list_empty() */
-#define blk_queued_rq(rq)	(!list_empty(&(rq)->queuelist))
 
 #define list_entry_rq(ptr)	list_entry((ptr), struct request, queuelist)
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: [PATCH 05/28] IB/srp: remove old request_fn_active check
  2018-10-25 21:10 ` [PATCH 05/28] IB/srp: remove old request_fn_active check Jens Axboe
@ 2018-10-25 21:23   ` Bart Van Assche
  2018-10-25 21:24     ` Jens Axboe
  2018-10-26  7:08   ` Hannes Reinecke
  1 sibling, 1 reply; 90+ messages in thread
From: Bart Van Assche @ 2018-10-25 21:23 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide; +Cc: Parav Pandit

On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
> This check is only viable for non scsi-mq. Since that is going away,
> kill this legacy check.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Parav Pandit <parav@mellanox.com>
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c | 7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 0b34e909505f..5a79444c2f3c 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport *rport)
>  	struct scsi_device *sdev;
>  	int i, j;
>  
> -	/*
> -	 * Invoking srp_terminate_io() while srp_queuecommand() is running
> -	 * is not safe. Hence the warning statement below.
> -	 */
> -	shost_for_each_device(sdev, shost)
> -		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
> -
>  	for (i = 0; i < target->ch_count; i++) {
>  		ch = &target->ch[i];

It seems like Hannes' and my Signed-off-by/Reviewed-by tags are missing?
See also https://marc.info/?l=linux-scsi&m=153970174514608.

Bart.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 05/28] IB/srp: remove old request_fn_active check
  2018-10-25 21:23   ` Bart Van Assche
@ 2018-10-25 21:24     ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 21:24 UTC (permalink / raw)
  To: Bart Van Assche, linux-block, linux-scsi, linux-ide; +Cc: Parav Pandit

On 10/25/18 3:23 PM, Bart Van Assche wrote:
> On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
>> This check is only viable for non scsi-mq. Since that is going away,
>> kill this legacy check.
>>
>> Cc: Bart Van Assche <bvanassche@acm.org>
>> Cc: Parav Pandit <parav@mellanox.com>
>> Cc: linux-scsi@vger.kernel.org
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  drivers/infiniband/ulp/srp/ib_srp.c | 7 -------
>>  1 file changed, 7 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 0b34e909505f..5a79444c2f3c 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport *rport)
>>  	struct scsi_device *sdev;
>>  	int i, j;
>>  
>> -	/*
>> -	 * Invoking srp_terminate_io() while srp_queuecommand() is running
>> -	 * is not safe. Hence the warning statement below.
>> -	 */
>> -	shost_for_each_device(sdev, shost)
>> -		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
>> -
>>  	for (i = 0; i < target->ch_count; i++) {
>>  		ch = &target->ch[i];
> 
> It seems like Hannes' and my Signed-off-by/Reviewed-by tags are missing?
> See also https://marc.info/?l=linux-scsi&m=153970174514608.

Sorry, forgot to update that one. I'll do that for v2. Thanks for letting
me know.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 08/28] scsi: kill off the legacy IO path
  2018-10-25 21:10 ` [PATCH 08/28] scsi: kill off the legacy IO path Jens Axboe
@ 2018-10-25 21:36   ` Bart Van Assche
  2018-10-25 22:18     ` Jens Axboe
  2018-10-29  6:48   ` Hannes Reinecke
  1 sibling, 1 reply; 90+ messages in thread
From: Bart Van Assche @ 2018-10-25 21:36 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide, Himanshu Madhani

On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
> @@ -3265,25 +3261,17 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
>  	    base_vha->mgmt_svr_loop_id, host->sg_tablesize);
>  
>  	if (ha->mqenable) {
> -		bool mq = false;
>  		bool startit = false;
>  
> -		if (QLA_TGT_MODE_ENABLED()) {
> -			mq = true;
> +		if (QLA_TGT_MODE_ENABLED())
>  			startit = false;
> -		}
>  
> -		if ((ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED) &&
> -		    shost_use_blk_mq(host)) {
> -			mq = true;
> +		if (ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED)
>  			startit = true;
> -		}
>  
> -		if (mq) {
> -			/* Create start of day qpairs for Block MQ */
> -			for (i = 0; i < ha->max_qpairs; i++)
> -				qla2xxx_create_qpair(base_vha, 5, 0, startit);
> -		}
> +		/* Create start of day qpairs for Block MQ */
> +		for (i = 0; i < ha->max_qpairs; i++)
> +			qla2xxx_create_qpair(base_vha, 5, 0, startit);
>  	}
>  
>  	if (ha->flags.running_gold_fw)

(+Himanshu)

Since I'm not sure that "mq" in the above code refers to "scsi-mq" nor that it
refers to "blk-mq", I'm not sure the above changes should be included in this
patch. Himanshu, can you have a look?

> +static ssize_t
> +show_use_blk_mq(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> +	return snprintf(buf, 20, "1\n");
> +}
> +static DEVICE_ATTR(use_blk_mq, S_IRUGO, show_use_blk_mq, NULL);

The pattern "snprintf(buf, 20, ...)" looks like cargo-cult programming to me.
Please consider changing "snprintf(buf, 20, " into "sprintf(buf, ".
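
Roughly (sketch only, untested), that would turn the hunk into:

	static ssize_t
	show_use_blk_mq(struct device *dev, struct device_attribute *attr, char *buf)
	{
		/* single fixed value, so a plain sprintf() is enough */
		return sprintf(buf, "1\n");
	}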

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 23/28] block: kill lld busy
  2018-10-25 21:10 ` [PATCH 23/28] block: kill lld busy Jens Axboe
@ 2018-10-25 21:42   ` Bart Van Assche
  2018-10-25 22:18     ` Jens Axboe
  2018-10-29  7:10   ` Hannes Reinecke
  1 sibling, 1 reply; 90+ messages in thread
From: Bart Van Assche @ 2018-10-25 21:42 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
> Nobody sets the helper, so we always return 0. Kill it.

How about changing the description of this patch into something
like "Now that the blk_lld_busy() and blk_queue_lld_busy() callers
have been removed, also remove the implementations of these functions."?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 08/28] scsi: kill off the legacy IO path
  2018-10-25 21:36   ` Bart Van Assche
@ 2018-10-25 22:18     ` Jens Axboe
  2018-10-25 22:44       ` Madhani, Himanshu
  0 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 22:18 UTC (permalink / raw)
  To: Bart Van Assche, linux-block, linux-scsi, linux-ide, Himanshu Madhani

On 10/25/18 3:36 PM, Bart Van Assche wrote:
> On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
>> @@ -3265,25 +3261,17 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
>>  	    base_vha->mgmt_svr_loop_id, host->sg_tablesize);
>>  
>>  	if (ha->mqenable) {
>> -		bool mq = false;
>>  		bool startit = false;
>>  
>> -		if (QLA_TGT_MODE_ENABLED()) {
>> -			mq = true;
>> +		if (QLA_TGT_MODE_ENABLED())
>>  			startit = false;
>> -		}
>>  
>> -		if ((ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED) &&
>> -		    shost_use_blk_mq(host)) {
>> -			mq = true;
>> +		if (ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED)
>>  			startit = true;
>> -		}
>>  
>> -		if (mq) {
>> -			/* Create start of day qpairs for Block MQ */
>> -			for (i = 0; i < ha->max_qpairs; i++)
>> -				qla2xxx_create_qpair(base_vha, 5, 0, startit);
>> -		}
>> +		/* Create start of day qpairs for Block MQ */
>> +		for (i = 0; i < ha->max_qpairs; i++)
>> +			qla2xxx_create_qpair(base_vha, 5, 0, startit);
>>  	}
>>  
>>  	if (ha->flags.running_gold_fw)
> 
> (+Himanshu)
> 
> Since I'm not sure that "mq" in the above code refers to "scsi-mq" nor that it
> refers to "blk-mq", I'm not sure the above changes should be included in this
> patch. Himanshu, can you have a look?

There's literally a comment there that says "for Block MQ" :-)

>> +static ssize_t
>> +show_use_blk_mq(struct device *dev, struct device_attribute *attr, char *buf)
>> +{
>> +	return snprintf(buf, 20, "1\n");
>> +}
>> +static DEVICE_ATTR(use_blk_mq, S_IRUGO, show_use_blk_mq, NULL);
> 
> The pattern "snprintf(buf, 20, ...)" looks like cargo-cult programming to me.
> Please consider changing "snprintf(buf, 20, " into "sprintf(buf, ".

Good point, I'll do that.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 23/28] block: kill lld busy
  2018-10-25 21:42   ` Bart Van Assche
@ 2018-10-25 22:18     ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 22:18 UTC (permalink / raw)
  To: Bart Van Assche, linux-block, linux-scsi, linux-ide

On 10/25/18 3:42 PM, Bart Van Assche wrote:
> On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
>> Nobody sets the helper, so we always return 0. Kill it.
> 
> How about changing the description of this patch into something
> like "Now that the blk_lld_busy() and blk_queue_lld_busy() callers
> have been removed, also remove the implementations of these functions."?

Sure, I can improve the commit message.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 08/28] scsi: kill off the legacy IO path
  2018-10-25 22:18     ` Jens Axboe
@ 2018-10-25 22:44       ` Madhani, Himanshu
  2018-10-25 23:00         ` Jens Axboe
  0 siblings, 1 reply; 90+ messages in thread
From: Madhani, Himanshu @ 2018-10-25 22:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bart Van Assche, linux-block, linux-scsi, linux-ide

Jens,

> On Oct 25, 2018, at 3:18 PM, Jens Axboe <axboe@kernel.dk> wrote:
> 
> External Email
> 
> On 10/25/18 3:36 PM, Bart Van Assche wrote:
>> On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
>>> @@ -3265,25 +3261,17 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
>>>         base_vha->mgmt_svr_loop_id, host->sg_tablesize);
>>> 
>>>     if (ha->mqenable) {
>>> -            bool mq = false;
>>>             bool startit = false;
>>> 
>>> -            if (QLA_TGT_MODE_ENABLED()) {
>>> -                    mq = true;
>>> +            if (QLA_TGT_MODE_ENABLED())
>>>                     startit = false;
>>> -            }
>>> 
>>> -            if ((ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED) &&
>>> -                shost_use_blk_mq(host)) {
>>> -                    mq = true;
>>> +            if (ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED)
>>>                     startit = true;
>>> -            }
>>> 
>>> -            if (mq) {
>>> -                    /* Create start of day qpairs for Block MQ */
>>> -                    for (i = 0; i < ha->max_qpairs; i++)
>>> -                            qla2xxx_create_qpair(base_vha, 5, 0, startit);
>>> -            }
>>> +            /* Create start of day qpairs for Block MQ */
>>> +            for (i = 0; i < ha->max_qpairs; i++)
>>> +                    qla2xxx_create_qpair(base_vha, 5, 0, startit);
>>>     }
>>> 
>>>     if (ha->flags.running_gold_fw)
>> 
>> (+Himanshu)
>> 
>> Since I'm not sure that "mq" in the above code refers to "scsi-mq" nor that it
>> refers to "blk-mq", I'm not sure the above changes should be included in this
>> patch. Himanshu, can you have a look?
> 
> There's literally a comment there that says "for Block MQ" :-)

This change is Good.

We were using mq boolean to determine if we are in Target mode or Initiator with BLK-MQ on
to create qpair in firmware.

Since this patch is removing shost_use_blk_mq(), the mq boolean becomes a no-op.

>>> +static ssize_t
>>> +show_use_blk_mq(struct device *dev, struct device_attribute *attr, char *buf)
>>> +{
>>> +    return snprintf(buf, 20, "1\n");
>>> +}
>>> +static DEVICE_ATTR(use_blk_mq, S_IRUGO, show_use_blk_mq, NULL);
>> 
>> The pattern "snprintf(buf, 20, ...)" looks like cargo-cult programming to me.
>> Please consider changing "snprintf(buf, 20, " into "sprintf(buf, ".
> 
> Good point, I'll do that.
> 
> --
> Jens Axboe
> 

Thanks,
- Himanshu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 08/28] scsi: kill off the legacy IO path
  2018-10-25 22:44       ` Madhani, Himanshu
@ 2018-10-25 23:00         ` Jens Axboe
  2018-10-25 23:06           ` Madhani, Himanshu
  0 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 23:00 UTC (permalink / raw)
  To: Madhani, Himanshu; +Cc: Bart Van Assche, linux-block, linux-scsi, linux-ide

On 10/25/18 4:44 PM, Madhani, Himanshu wrote:
> Jens, 
> 
>> On Oct 25, 2018, at 3:18 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> External Email
>>
>> On 10/25/18 3:36 PM, Bart Van Assche wrote:
>>> On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
>>>> @@ -3265,25 +3261,17 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
>>>>         base_vha->mgmt_svr_loop_id, host->sg_tablesize);
>>>>
>>>>     if (ha->mqenable) {
>>>> -            bool mq = false;
>>>>             bool startit = false;
>>>>
>>>> -            if (QLA_TGT_MODE_ENABLED()) {
>>>> -                    mq = true;
>>>> +            if (QLA_TGT_MODE_ENABLED())
>>>>                     startit = false;
>>>> -            }
>>>>
>>>> -            if ((ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED) &&
>>>> -                shost_use_blk_mq(host)) {
>>>> -                    mq = true;
>>>> +            if (ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED)
>>>>                     startit = true;
>>>> -            }
>>>>
>>>> -            if (mq) {
>>>> -                    /* Create start of day qpairs for Block MQ */
>>>> -                    for (i = 0; i < ha->max_qpairs; i++)
>>>> -                            qla2xxx_create_qpair(base_vha, 5, 0, startit);
>>>> -            }
>>>> +            /* Create start of day qpairs for Block MQ */
>>>> +            for (i = 0; i < ha->max_qpairs; i++)
>>>> +                    qla2xxx_create_qpair(base_vha, 5, 0, startit);
>>>>     }
>>>>
>>>>     if (ha->flags.running_gold_fw)
>>>
>>> (+Himanshu)
>>>
>>> Since I'm not sure that "mq" in the above code refers to "scsi-mq" nor that it
>>> refers to "blk-mq", I'm not sure the above changes should be included in this
>>> patch. Himanshu, can you have a look?
>>
>> There's literally a comment there that says "for Block MQ" :-)
> 
> This change is Good.
> 
> We were using mq boolean to determine if we are in Target mode or Initiator with BLK-MQ on
> to create qpair in firmware.
> 
> Since this patch is removing shost_use_blk_mq(), the mq boolean becomes a no-op.

Thanks for checking. Can I add your acked-by or reviewed-by to the patch?

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 08/28] scsi: kill off the legacy IO path
  2018-10-25 23:00         ` Jens Axboe
@ 2018-10-25 23:06           ` Madhani, Himanshu
  0 siblings, 0 replies; 90+ messages in thread
From: Madhani, Himanshu @ 2018-10-25 23:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bart Van Assche, linux-block, linux-scsi, linux-ide


> On Oct 25, 2018, at 4:00 PM, Jens Axboe <axboe@kernel.dk> wrote:
> 
> External Email
> 
> On 10/25/18 4:44 PM, Madhani, Himanshu wrote:
>> Jens,
>> 
>>> On Oct 25, 2018, at 3:18 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>> 
>>> External Email
>>> 
>>> On 10/25/18 3:36 PM, Bart Van Assche wrote:
>>>> On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
>>>>> @@ -3265,25 +3261,17 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
>>>>>        base_vha->mgmt_svr_loop_id, host->sg_tablesize);
>>>>> 
>>>>>    if (ha->mqenable) {
>>>>> -            bool mq = false;
>>>>>            bool startit = false;
>>>>> 
>>>>> -            if (QLA_TGT_MODE_ENABLED()) {
>>>>> -                    mq = true;
>>>>> +            if (QLA_TGT_MODE_ENABLED())
>>>>>                    startit = false;
>>>>> -            }
>>>>> 
>>>>> -            if ((ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED) &&
>>>>> -                shost_use_blk_mq(host)) {
>>>>> -                    mq = true;
>>>>> +            if (ql2x_ini_mode == QLA2XXX_INI_MODE_ENABLED)
>>>>>                    startit = true;
>>>>> -            }
>>>>> 
>>>>> -            if (mq) {
>>>>> -                    /* Create start of day qpairs for Block MQ */
>>>>> -                    for (i = 0; i < ha->max_qpairs; i++)
>>>>> -                            qla2xxx_create_qpair(base_vha, 5, 0, startit);
>>>>> -            }
>>>>> +            /* Create start of day qpairs for Block MQ */
>>>>> +            for (i = 0; i < ha->max_qpairs; i++)
>>>>> +                    qla2xxx_create_qpair(base_vha, 5, 0, startit);
>>>>>    }
>>>>> 
>>>>>    if (ha->flags.running_gold_fw)
>>>> 
>>>> (+Himanshu)
>>>> 
>>>> Since I'm not sure that "mq" in the above code refers to "scsi-mq" nor that it
>>>> refers to "blk-mq", I'm not sure the above changes should be included in this
>>>> patch. Himanshu, can you have a look?
>>> 
>>> There's literally a comment there that says "for Block MQ" :-)
>> 
>> This change is Good.
>> 
>> We were using mq boolean to determine if we are in Target mode or Initiator with BLK-MQ on
>> to create qpair in firmware.
>> 
>> Since this patch is removing shost_use_blk_mq(), the mq boolean becomes a no-op.
> 
> Thanks for checking. Can I add your acked-by or reviewed-by to the patch?
> 

Yes, for qla2xxx changes

Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>

> --
> Jens Axboe

Thanks,
- Himanshu

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (27 preceding siblings ...)
  2018-10-25 21:10 ` [PATCH 28/28] block: get rid of blk_queued_rq() Jens Axboe
@ 2018-10-25 23:09 ` Bart Van Assche
  2018-10-25 23:11   ` Jens Axboe
  2018-10-29 12:00 ` Ming Lei
  29 siblings, 1 reply; 90+ messages in thread
From: Bart Van Assche @ 2018-10-25 23:09 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
> The first round of this went into 4.20-rc, but we've still some of
> them pending. This patch series converts the remaining drivers to
> blk-mq. The ones that support dual paths (like SCSI and DM) have
> the non-mq path removed. At the end, legacy IO code and schedulers
> are killed off.
> 
> This patch series is on top of my for-linus branch. It can also
> be bound in my mq-conversions branch.

Hi Jens,

I cloned your mq-conversions branch and ran the block, srp and nvmeof-mp
test groups from the blktests project. The result of these tests on my
setup is the same as with kernel v4.19: except for a few false positive
lockdep complaints, all tests pass. The most recent commit that is
involved in the false positive lockdep complaints I encountered is
87915adc3f0a ("workqueue: re-add lockdep dependencies for flushing").
Patches are being discussed to address these false positives.

Bart.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-25 23:09 ` [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Bart Van Assche
@ 2018-10-25 23:11   ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-25 23:11 UTC (permalink / raw)
  To: Bart Van Assche, linux-block, linux-scsi, linux-ide

On 10/25/18 5:09 PM, Bart Van Assche wrote:
> On Thu, 2018-10-25 at 15:10 -0600, Jens Axboe wrote:
>> The first round of this went into 4.20-rc, but we've still some of
>> them pending. This patch series converts the remaining drivers to
>> blk-mq. The ones that support dual paths (like SCSI and DM) have
>> the non-mq path removed. At the end, legacy IO code and schedulers
>> are killed off.
>>
>> This patch series is on top of my for-linus branch. It can also
>> be bound in my mq-conversions branch.
> 
> Hi Jens,
> 
> I cloned your mq-conversions branch and ran the block, srp and nvmeof-mp
> test groups from the blktests project. The result of these tests on my
> setup is the same as with kernel v4.19: except for a few false positive
> lockdep complaints, all tests pass. The most recent commit that is
> involved in the false positive lockdep complaints I encountered is
> 87915adc3f0a ("workqueue: re-add lockdep dependencies for flushing").
> Patches are being discussed to address these false positives.

Great, thanks for doing this testing, Bart!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 05/28] IB/srp: remove old request_fn_active check
  2018-10-25 21:10 ` [PATCH 05/28] IB/srp: remove old request_fn_active check Jens Axboe
  2018-10-25 21:23   ` Bart Van Assche
@ 2018-10-26  7:08   ` Hannes Reinecke
  2018-10-26 14:32     ` Jens Axboe
  1 sibling, 1 reply; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-26  7:08 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide
  Cc: Bart Van Assche, Parav Pandit

On 10/25/18 11:10 PM, Jens Axboe wrote:
> This check is only viable for non scsi-mq. Since that is going away,
> kill this legacy check.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Parav Pandit <parav@mellanox.com>
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/infiniband/ulp/srp/ib_srp.c | 7 -------
>   1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 0b34e909505f..5a79444c2f3c 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport *rport)
>   	struct scsi_device *sdev;
>   	int i, j;
>   
> -	/*
> -	 * Invoking srp_terminate_io() while srp_queuecommand() is running
> -	 * is not safe. Hence the warning statement below.
> -	 */
> -	shost_for_each_device(sdev, shost)
> -		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
> -
>   	for (i = 0; i < target->ch_count; i++) {
>   		ch = &target->ch[i];
>   
> 
I _think_ I've already killed that one; please do check the latest tree 
from Doug Ledford.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 05/28] IB/srp: remove old request_fn_active check
  2018-10-26  7:08   ` Hannes Reinecke
@ 2018-10-26 14:32     ` Jens Axboe
  2018-10-26 15:03       ` Bart Van Assche
  0 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-26 14:32 UTC (permalink / raw)
  To: Hannes Reinecke, linux-block, linux-scsi, linux-ide
  Cc: Bart Van Assche, Parav Pandit

On 10/26/18 1:08 AM, Hannes Reinecke wrote:
> On 10/25/18 11:10 PM, Jens Axboe wrote:
>> This check is only viable for non scsi-mq. Since that is going away,
>> kill this legacy check.
>>
>> Cc: Bart Van Assche <bvanassche@acm.org>
>> Cc: Parav Pandit <parav@mellanox.com>
>> Cc: linux-scsi@vger.kernel.org
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>   drivers/infiniband/ulp/srp/ib_srp.c | 7 -------
>>   1 file changed, 7 deletions(-)
>>
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 0b34e909505f..5a79444c2f3c 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport *rport)
>>   	struct scsi_device *sdev;
>>   	int i, j;
>>   
>> -	/*
>> -	 * Invoking srp_terminate_io() while srp_queuecommand() is running
>> -	 * is not safe. Hence the warning statement below.
>> -	 */
>> -	shost_for_each_device(sdev, shost)
>> -		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
>> -
>>   	for (i = 0; i < target->ch_count; i++) {
>>   		ch = &target->ch[i];
>>   
>>
> I _think_ I've already killed that one; please do check the latest tree 
> from Doug Ledford.

I don't know which tree is Doug's, but as long as it's queued up
for mainline, that's all I care about. I'm still rebasing this
branch, so it'll just get dropped when it shows up in mainline.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 05/28] IB/srp: remove old request_fn_active check
  2018-10-26 14:32     ` Jens Axboe
@ 2018-10-26 15:03       ` Bart Van Assche
  0 siblings, 0 replies; 90+ messages in thread
From: Bart Van Assche @ 2018-10-26 15:03 UTC (permalink / raw)
  To: Jens Axboe, Hannes Reinecke, linux-block, linux-scsi, linux-ide
  Cc: Parav Pandit

On Fri, 2018-10-26 at 08:32 -0600, Jens Axboe wrote:
> On 10/26/18 1:08 AM, Hannes Reinecke wrote:
> > On 10/25/18 11:10 PM, Jens Axboe wrote:
> > > This check is only viable for non scsi-mq. Since that is going away,
> > > kill this legacy check.
> > > 
> > > Cc: Bart Van Assche <bvanassche@acm.org>
> > > Cc: Parav Pandit <parav@mellanox.com>
> > > Cc: linux-scsi@vger.kernel.org
> > > Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > ---
> > >   drivers/infiniband/ulp/srp/ib_srp.c | 7 -------
> > >   1 file changed, 7 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> > > index 0b34e909505f..5a79444c2f3c 100644
> > > --- a/drivers/infiniband/ulp/srp/ib_srp.c
> > > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> > > @@ -1334,13 +1334,6 @@ static void srp_terminate_io(struct srp_rport *rport)
> > >   	struct scsi_device *sdev;
> > >   	int i, j;
> > >   
> > > -	/*
> > > -	 * Invoking srp_terminate_io() while srp_queuecommand() is running
> > > -	 * is not safe. Hence the warning statement below.
> > > -	 */
> > > -	shost_for_each_device(sdev, shost)
> > > -		WARN_ON_ONCE(sdev->request_queue->request_fn_active);
> > > -
> > >   	for (i = 0; i < target->ch_count; i++) {
> > >   		ch = &target->ch[i];
> > >   
> > > 
> > 
> > I _think_ I've already killed that one; please do check the latest tree 
> > from Doug Ledford.
> 
> I don't know which tree is Doug's, but as long as it's queued up
> for mainline, that's all I care about. I'm still rebasing this
> branch, so it'll just get dropped when it shows up in mainline.

Doug and Jason share a git repository. I think this is the git repository they
ask Linus to pull from: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git.
Although Doug reported having queued the above patch, I haven't found it
in the RDMA git repository.

Bart.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 01/28] sunvdc: convert to blk-mq
  2018-10-25 21:10 ` [PATCH 01/28] sunvdc: convert to blk-mq Jens Axboe
@ 2018-10-27 10:42   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-27 10:42 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide; +Cc: David Miller

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Convert from the old request_fn style driver to blk-mq.
> 
> Cc: David Miller <davem@davemloft.net>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/block/sunvdc.c | 149 +++++++++++++++++++++++++++--------------
>   1 file changed, 98 insertions(+), 51 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 02/28] ms_block: convert to blk-mq
  2018-10-25 21:10 ` [PATCH 02/28] ms_block: " Jens Axboe
@ 2018-10-27 10:43   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-27 10:43 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide; +Cc: Maxim Levitsky

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Straightforward conversion; there's room for optimization in how
> everything is punted to a work queue. It also looks plenty racy all
> over the map with the state changes. I fixed a bunch of them up while
> doing the conversion, but there are surely more.
> 
> Cc: Maxim Levitsky <maximlevitsky@gmail.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/memstick/core/ms_block.c | 110 +++++++++++++++++--------------
>   drivers/memstick/core/ms_block.h |   1 +
>   2 files changed, 62 insertions(+), 49 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 03/28] mspro_block: convert to blk-mq
  2018-10-25 21:10 ` [PATCH 03/28] mspro_block: " Jens Axboe
@ 2018-10-27 10:44   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-27 10:44 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Straightforward conversion; there's room for improvement.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/memstick/core/mspro_block.c | 121 +++++++++++++++-------------
>   1 file changed, 66 insertions(+), 55 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 04/28] ide: convert to blk-mq
  2018-10-25 21:10 ` [PATCH 04/28] ide: " Jens Axboe
@ 2018-10-27 10:51   ` Hannes Reinecke
  2018-10-27 16:51     ` Jens Axboe
  0 siblings, 1 reply; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-27 10:51 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide; +Cc: David Miller

On 10/25/18 11:10 PM, Jens Axboe wrote:
> ide-disk and ide-cd tested as working just fine, ide-tape and
> ide-floppy haven't. But the latter don't require changes, so they
> should work without issue.
> 
> Add helper function to insert a request from a work queue, since we
> cannot invoke the blk-mq request insertion from IRQ context.
> 
> Cc: David Miller <davem@davemloft.net>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/ide/ide-atapi.c |  25 ++++--
>   drivers/ide/ide-cd.c    | 175 +++++++++++++++++++++-------------------
>   drivers/ide/ide-disk.c  |   5 +-
>   drivers/ide/ide-io.c    | 101 +++++++++++++----------
>   drivers/ide/ide-park.c  |   4 +-
>   drivers/ide/ide-pm.c    |  28 ++-----
>   drivers/ide/ide-probe.c |  68 +++++++++++-----
>   include/linux/ide.h     |  13 ++-
>   8 files changed, 240 insertions(+), 179 deletions(-)
> 
There are some things here which are curious (like always setting the 
queue depth to 32, and the insert_head thingie), but as this is legacy 
anyway:

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 06/28] blk-mq: remove the request_list usage
  2018-10-25 21:10 ` [PATCH 06/28] blk-mq: remove the request_list usage Jens Axboe
@ 2018-10-27 10:52   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-27 10:52 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> We don't do anything with it, that's just the legacy path.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-mq.c | 5 -----
>   1 file changed, 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 07/28] blk-mq: remove legacy check in queue blk_freeze_queue()
  2018-10-25 21:10 ` [PATCH 07/28] blk-mq: remove legacy check in queue blk_freeze_queue() Jens Axboe
@ 2018-10-27 10:52   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-27 10:52 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-mq.c | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4c82dc44d4d8..a58d2d953876 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -177,8 +177,6 @@ void blk_freeze_queue(struct request_queue *q)
>   	 * exported to drivers as the only user for unfreeze is blk_mq.
>   	 */
>   	blk_freeze_queue_start(q);
> -	if (!q->mq_ops)
> -		blk_drain_queue(q);
>   	blk_mq_freeze_queue_wait(q);
>   }
>   
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 04/28] ide: convert to blk-mq
  2018-10-27 10:51   ` Hannes Reinecke
@ 2018-10-27 16:51     ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-27 16:51 UTC (permalink / raw)
  To: Hannes Reinecke, linux-block, linux-scsi, linux-ide; +Cc: David Miller

On 10/27/18 4:51 AM, Hannes Reinecke wrote:
> On 10/25/18 11:10 PM, Jens Axboe wrote:
>> ide-disk and ide-cd tested as working just fine, ide-tape and
>> ide-floppy haven't. But the latter don't require changes, so they
>> should work without issue.
>>
>> Add helper function to insert a request from a work queue, since we
>> cannot invoke the blk-mq request insertion from IRQ context.
>>
>> Cc: David Miller <davem@davemloft.net>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>   drivers/ide/ide-atapi.c |  25 ++++--
>>   drivers/ide/ide-cd.c    | 175 +++++++++++++++++++++-------------------
>>   drivers/ide/ide-disk.c  |   5 +-
>>   drivers/ide/ide-io.c    | 101 +++++++++++++----------
>>   drivers/ide/ide-park.c  |   4 +-
>>   drivers/ide/ide-pm.c    |  28 ++-----
>>   drivers/ide/ide-probe.c |  68 +++++++++++-----
>>   include/linux/ide.h     |  13 ++-
>>   8 files changed, 240 insertions(+), 179 deletions(-)
>>
> There are some things here which are curious (like always setting the 
> queue depth to 32, and the insert_head thingie), but as this is legacy 
> anyway:

The insert head addition is needed; the alternative would be making
the blk-mq locks IRQ safe, and that would be a performance hit that
isn't acceptable. Perhaps I should make that clearer
in the commit message...

The hardcoded 32 is a bit strange, but I doubt it matters that much.
It's enough to saturate the drive for sure, and it'll be twice that
number if a scheduler is attached, ensuring we have plenty of
requests to get the merging we want.
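
Roughly, the punt-to-workqueue pattern looks like the sketch below; the
struct and function names (including driver_requeue_request()) are
illustrative only, not the actual ide helpers from the patch:

struct punt_ctx {
	spinlock_t		lock;
	struct list_head	rq_list;
	struct work_struct	work;
	struct request_queue	*q;
};

/* Runs in process context, so re-inserting into blk-mq is safe here. */
static void punt_work_fn(struct work_struct *work)
{
	struct punt_ctx *ctx = container_of(work, struct punt_ctx, work);
	struct request *rq;
	unsigned long flags;

	spin_lock_irqsave(&ctx->lock, flags);
	while ((rq = list_first_entry_or_null(&ctx->rq_list,
					      struct request, queuelist))) {
		list_del_init(&rq->queuelist);
		spin_unlock_irqrestore(&ctx->lock, flags);
		/*
		 * driver_requeue_request() stands in for whatever the
		 * driver uses to (re)issue the request now that we are
		 * out of IRQ context; for ide this ends up in blk-mq
		 * insertion.
		 */
		driver_requeue_request(ctx->q, rq);
		spin_lock_irqsave(&ctx->lock, flags);
	}
	spin_unlock_irqrestore(&ctx->lock, flags);
}

/* Called from IRQ context: only queue the request and kick the work. */
static void punt_insert_head(struct punt_ctx *ctx, struct request *rq)
{
	unsigned long flags;

	spin_lock_irqsave(&ctx->lock, flags);
	list_add(&rq->queuelist, &ctx->rq_list);
	spin_unlock_irqrestore(&ctx->lock, flags);
	kblockd_schedule_work(&ctx->work);
}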

> Reviewed-by: Hannes Reinecke <hare@suse.com>

Thanks!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 11/28] bsg: pass in desired timeout handler
  2018-10-25 21:10 ` [PATCH 11/28] bsg: pass in desired timeout handler Jens Axboe
@ 2018-10-28 15:53   ` Christoph Hellwig
  2018-10-28 23:05     ` Jens Axboe
  2018-10-29  6:55   ` Hannes Reinecke
  1 sibling, 1 reply; 90+ messages in thread
From: Christoph Hellwig @ 2018-10-28 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-ide, Johannes Thumshirn, Benjamin Block

On Thu, Oct 25, 2018 at 03:10:22PM -0600, Jens Axboe wrote:
> This will ease in the conversion to blk-mq, where we can't set
> a timeout handler after queue init.

One thing I hoped to achieve with the last round of bsg cleanups is
to isolate the users of bsg from even knowing there is a struct
request underneath. Based on that, it would be nice to pass
the struct bsg_job to the actual timeout handlers instead of
struct request.
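
Concretely, the prototype change being suggested might look something
like this (just a sketch of the idea, not an actual patch; the per-job
timeout_fn pointer is an assumption made for illustration):

/* transports implement the timeout against the bsg_job ... */
typedef enum blk_eh_timer_return (bsg_timeout_fn)(struct bsg_job *);

/* ... and bsg-lib's blk_mq_ops->timeout hook does the one translation */
static enum blk_eh_timer_return bsg_timeout(struct request *rq, bool reserved)
{
	struct bsg_job *job = blk_mq_rq_to_pdu(rq);

	if (!job->timeout_fn)
		return BLK_EH_DONE;
	return job->timeout_fn(job);
}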

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 12/28] bsg: provide bsg_remove_queue() helper
  2018-10-25 21:10 ` [PATCH 12/28] bsg: provide bsg_remove_queue() helper Jens Axboe
@ 2018-10-28 15:53   ` Christoph Hellwig
  2018-10-29  6:55   ` Hannes Reinecke
  2018-10-29 10:16   ` Johannes Thumshirn
  2 siblings, 0 replies; 90+ messages in thread
From: Christoph Hellwig @ 2018-10-28 15:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-ide, Johannes Thumshirn, Benjamin Block

On Thu, Oct 25, 2018 at 03:10:23PM -0600, Jens Axboe wrote:
> All drivers do unregister + cleanup, provide a helper for that.

Nice cleanup,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 13/28] bsg: convert to use blk-mq
  2018-10-25 21:10 ` [PATCH 13/28] bsg: convert to use blk-mq Jens Axboe
@ 2018-10-28 16:07   ` Christoph Hellwig
  2018-10-28 23:25     ` Jens Axboe
  2018-10-29  6:57   ` Hannes Reinecke
  1 sibling, 1 reply; 90+ messages in thread
From: Christoph Hellwig @ 2018-10-28 16:07 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, linux-ide, Johannes Thumshirn, Benjamin Block

> +static enum blk_eh_timer_return bsg_timeout(struct request *rq, bool reserved)
> +{
> +	enum blk_eh_timer_return ret = BLK_EH_DONE;
> +	struct request_queue *q = rq->q;
> +
> +	if (q->rq_timed_out_fn)
> +		ret = q->rq_timed_out_fn(rq);
> +
> +	return ret;

This is pretty awkward.  I guess it is ok as an intermediate step,
but I really don't want to keep rq_timed_out_fn around just for bsg.

I think we should do something like this ontop of your series to move
all the bsg-lib bits outside struct request_queue:

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index 3a5938205825..21dccaf8399e 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -31,6 +31,12 @@
 
 #define uptr64(val) ((void __user *)(uintptr_t)(val))
 
+struct bsg_set {
+	struct blk_mq_tag_set	tag_set;
+	bsg_job_fn		*job_fn;
+	bsg_timeout_fn		*timeout_fn;
+};
+
 static int bsg_transport_check_proto(struct sg_io_v4 *hdr)
 {
 	if (hdr->protocol != BSG_PROTOCOL_SCSI  ||
@@ -239,6 +245,8 @@ static blk_status_t bsg_queue_rq(struct blk_mq_hw_ctx *hctx,
 	struct request_queue *q = hctx->queue;
 	struct device *dev = q->queuedata;
 	struct request *req = bd->rq;
+	struct bsg_set *bset =
+		container_of(q->tag_set, struct bsg_set, tag_set);
 	int ret;
 
 	blk_mq_start_request(req);
@@ -249,7 +257,7 @@ static blk_status_t bsg_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (!bsg_prepare_job(dev, req))
 		return BLK_STS_IOERR;
 
-	ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
+	ret = bset->job_fn(blk_mq_rq_to_pdu(req));
 	if (ret)
 		return BLK_STS_IOERR;
 
@@ -302,13 +310,12 @@ EXPORT_SYMBOL_GPL(bsg_remove_queue);
 
 static enum blk_eh_timer_return bsg_timeout(struct request *rq, bool reserved)
 {
-	enum blk_eh_timer_return ret = BLK_EH_DONE;
-	struct request_queue *q = rq->q;
-
-	if (q->bsg_job_timeout_fn)
-		ret = q->bsg_job_timeout_fn(rq);
+	struct bsg_set *bset =
+		container_of(rq->q->tag_set, struct bsg_set, tag_set);
 
-	return ret;
+	if (!bset->timeout_fn)
+		return BLK_EH_DONE;
+	return bset->timeout_fn(rq);
 }
 
 static const struct blk_mq_ops bsg_mq_ops = {
@@ -328,16 +335,21 @@ static const struct blk_mq_ops bsg_mq_ops = {
  * @dd_job_size: size of LLD data needed for each job
  */
 struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
-		bsg_job_fn *job_fn, rq_timed_out_fn *timeout, int dd_job_size)
+		bsg_job_fn *job_fn, bsg_timeout_fn *timeout, int dd_job_size)
 {
+	struct bsg_set *bset;
 	struct blk_mq_tag_set *set;
 	struct request_queue *q;
 	int ret = -ENOMEM;
 
-	set = kzalloc(sizeof(*set), GFP_KERNEL);
-	if (!set)
+	bset = kzalloc(sizeof(*bset), GFP_KERNEL);
+	if (!bset)
 		return ERR_PTR(-ENOMEM);
 
+	bset->job_fn = job_fn;
+	bset->timeout_fn = timeout;
+
+	set = &bset->tag_set;
 	set->ops = &bsg_mq_ops,
 	set->nr_hw_queues = 1;
 	set->queue_depth = 128;
@@ -354,8 +366,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 	}
 
 	q->queuedata = dev;
-	q->bsg_job_fn = job_fn;
-	q->bsg_job_timeout_fn = timeout;
 	blk_queue_flag_set(QUEUE_FLAG_BIDI, q);
 	blk_queue_rq_timeout(q, BLK_DEFAULT_SG_TIMEOUT);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 82b6cf45c6e0..eb7399f76b42 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -292,15 +292,12 @@ typedef bool (poll_q_fn) (struct request_queue *q, blk_qc_t);
 struct bio_vec;
 typedef void (softirq_done_fn)(struct request *);
 typedef int (dma_drain_needed_fn)(struct request *);
-typedef int (bsg_job_fn) (struct bsg_job *);
 
 enum blk_eh_timer_return {
 	BLK_EH_DONE,		/* drivers has completed the command */
 	BLK_EH_RESET_TIMER,	/* reset timer and try again */
 };
 
-typedef enum blk_eh_timer_return (rq_timed_out_fn)(struct request *);
-
 enum blk_queue_state {
 	Queue_down,
 	Queue_up,
@@ -564,8 +561,6 @@ struct request_queue {
 	atomic_t		mq_freeze_depth;
 
 #if defined(CONFIG_BLK_DEV_BSG)
-	bsg_job_fn		*bsg_job_fn;
-	rq_timed_out_fn		*bsg_job_timeout_fn;
 	struct bsg_class_device bsg_dev;
 #endif
 
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index 9c9b134b1fa5..b356e0006731 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -31,6 +31,9 @@ struct device;
 struct scatterlist;
 struct request_queue;
 
+typedef int (bsg_job_fn) (struct bsg_job *);
+typedef enum blk_eh_timer_return (bsg_timeout_fn)(struct request *);
+
 struct bsg_buffer {
 	unsigned int payload_len;
 	int sg_cnt;
@@ -72,7 +75,7 @@ struct bsg_job {
 void bsg_job_done(struct bsg_job *job, int result,
 		  unsigned int reply_payload_rcv_len);
 struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
-		bsg_job_fn *job_fn, rq_timed_out_fn *timeout, int dd_job_size);
+		bsg_job_fn *job_fn, bsg_timeout_fn *timeout, int dd_job_size);
 void bsg_remove_queue(struct request_queue *q);
 void bsg_job_put(struct bsg_job *job);
 int __must_check bsg_job_get(struct bsg_job *job);

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: [PATCH 11/28] bsg: pass in desired timeout handler
  2018-10-28 15:53   ` Christoph Hellwig
@ 2018-10-28 23:05     ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-28 23:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, linux-scsi, linux-ide, Johannes Thumshirn, Benjamin Block

On 10/28/18 9:53 AM, Christoph Hellwig wrote:
> On Thu, Oct 25, 2018 at 03:10:22PM -0600, Jens Axboe wrote:
>> This will ease in the conversion to blk-mq, where we can't set
>> a timeout handler after queue init.
> 
> One thing I hoped to achieve with the last round of bsg cleanups is
> to isolate the users of bsg from even knowing there is a struct
> request underneath. Based on that, it would be nice to pass
> the struct bsg_job to the actual timeout handlers instead of
> struct request.

Makes sense, I can make that change.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 13/28] bsg: convert to use blk-mq
  2018-10-28 16:07   ` Christoph Hellwig
@ 2018-10-28 23:25     ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-28 23:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, linux-scsi, linux-ide, Johannes Thumshirn, Benjamin Block

On 10/28/18 10:07 AM, Christoph Hellwig wrote:
>> +static enum blk_eh_timer_return bsg_timeout(struct request *rq, bool reserved)
>> +{
>> +	enum blk_eh_timer_return ret = BLK_EH_DONE;
>> +	struct request_queue *q = rq->q;
>> +
>> +	if (q->rq_timed_out_fn)
>> +		ret = q->rq_timed_out_fn(rq);
>> +
>> +	return ret;
> 
> This is pretty awkward.  I guess it is ok as an intermediate step,
> but I really don't want to keep rq_timed_out_fn to just for bsg.

Me neither.

> I think we should do something like this ontop of your series to move
> all the bsg-lib bits outside struct request_queue:

Looks good to me, I'll queue this up after this patch to clean things
up nicely.

I added this incremental on top; you missed some error handling and
cleanup bits. Care to send it as a proper patch?


diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index 21dccaf8399e..f01f11048d5e 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -299,12 +299,13 @@ static void bsg_exit_rq(struct blk_mq_tag_set *set, struct request *req,
 
 void bsg_remove_queue(struct request_queue *q)
 {
-	struct blk_mq_tag_set *set = q->tag_set;
+	struct bsg_set *bset =
+		container_of(q->tag_set, struct bsg_set, tag_set);
 
 	bsg_unregister_queue(q);
 	blk_cleanup_queue(q);
-	blk_mq_free_tag_set(set);
-	kfree(set);
+	blk_mq_free_tag_set(&bset->tag_set);
+	kfree(bset);
 }
 EXPORT_SYMBOL_GPL(bsg_remove_queue);
 
@@ -382,7 +383,7 @@ struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 out_queue:
 	blk_mq_free_tag_set(set);
 out_tag_set:
-	kfree(set);
+	kfree(bset);
 	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL_GPL(bsg_setup_queue);

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: [PATCH 08/28] scsi: kill off the legacy IO path
  2018-10-25 21:10 ` [PATCH 08/28] scsi: kill off the legacy IO path Jens Axboe
  2018-10-25 21:36   ` Bart Van Assche
@ 2018-10-29  6:48   ` Hannes Reinecke
  1 sibling, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:48 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   Documentation/scsi/scsi-parameters.txt |   5 -
>   drivers/scsi/Kconfig                   |  12 -
>   drivers/scsi/cxlflash/main.c           |   6 -
>   drivers/scsi/hosts.c                   |  29 +-
>   drivers/scsi/lpfc/lpfc_scsi.c          |   2 +-
>   drivers/scsi/qedi/qedi_main.c          |   3 +-
>   drivers/scsi/qla2xxx/qla_os.c          |  30 +-
>   drivers/scsi/scsi.c                    |   5 +-
>   drivers/scsi/scsi_debug.c              |   3 +-
>   drivers/scsi/scsi_error.c              |   2 +-
>   drivers/scsi/scsi_lib.c                | 624 ++-----------------------
>   drivers/scsi/scsi_priv.h               |   1 -
>   drivers/scsi/scsi_scan.c               |  10 +-
>   drivers/scsi/scsi_sysfs.c              |   8 +-
>   drivers/scsi/ufs/ufshcd.c              |   6 -
>   include/scsi/scsi_host.h               |  18 +-
>   include/scsi/scsi_tcq.h                |  14 +-
>   17 files changed, 73 insertions(+), 705 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 09/28] dm: remove legacy IO path
  2018-10-25 21:10 ` [PATCH 09/28] dm: remove " Jens Axboe
@ 2018-10-29  6:53   ` Hannes Reinecke
  2018-10-29 14:17     ` Jens Axboe
  0 siblings, 1 reply; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:53 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> dm supports both, and since we're killing off the legacy path
> in general, get rid of it in dm as well.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/md/Kconfig    |  11 --
>   drivers/md/dm-core.h  |  10 --
>   drivers/md/dm-mpath.c |  14 +-
>   drivers/md/dm-rq.c    | 293 ++++--------------------------------------
>   drivers/md/dm-rq.h    |   4 -
>   drivers/md/dm-sysfs.c |   3 +-
>   drivers/md/dm-table.c |  36 +-----
>   drivers/md/dm.c       |  21 +--
>   drivers/md/dm.h       |   1 -
>   9 files changed, 35 insertions(+), 358 deletions(-)
> 
[ .. ]

> @@ -790,11 +550,6 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
>   	struct dm_target *immutable_tgt;
>   	int err;
>   
> -	if (!dm_table_all_blk_mq_devices(t)) {
> -		DMERR("request-based dm-mq may only be stacked on blk-mq device(s)");
> -		return -EINVAL;
> -	}
> -
>   	md->tag_set = kzalloc_node(sizeof(struct blk_mq_tag_set), GFP_KERNEL, md->numa_node_id);
>   	if (!md->tag_set)
>   		return -ENOMEM;
That warning is still valid, no?

[ .. ]
> @@ -2217,13 +2211,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>   
>   	switch (type) {
>   	case DM_TYPE_REQUEST_BASED:
> -		dm_init_normal_md_queue(md);
> -		r = dm_old_init_request_queue(md, t);
> -		if (r) {
> -			DMERR("Cannot initialize queue for request-based mapped device");
> -			return r;
> -		}
> -		break;
>   	case DM_TYPE_MQ_REQUEST_BASED:
>   		r = dm_mq_init_request_queue(md, t);
>   		if (r) {
I'd love to kill DM_TYPE_REQUEST_BASED completely, seeing that it's 
referring to the now-defunct legacy I/O path.
Mike?

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 10/28] dasd: remove dead code
  2018-10-25 21:10 ` [PATCH 10/28] dasd: remove dead code Jens Axboe
@ 2018-10-29  6:54   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:54 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Since e443343e509a we haven't had a request_fn attached to
> this driver, hence any code inside an if (q->request_fn) is
> unreachable.
> 
> Fixes: e443343e509a ("s390/dasd: blk-mq conversion")
> [sth: Keep and fix the dasd_info->chanq_len counter.]
> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
> Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/s390/block/dasd_ioctl.c | 22 +++++-----------------
>   1 file changed, 5 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
> index 2016e0ed5865..8e26001dc11c 100644
> --- a/drivers/s390/block/dasd_ioctl.c
> +++ b/drivers/s390/block/dasd_ioctl.c
> @@ -412,6 +412,7 @@ static int dasd_ioctl_information(struct dasd_block *block,
>   	struct ccw_dev_id dev_id;
>   	struct dasd_device *base;
>   	struct ccw_device *cdev;
> +	struct list_head *l;
>   	unsigned long flags;
>   	int rc;
>   
> @@ -462,23 +463,10 @@ static int dasd_ioctl_information(struct dasd_block *block,
>   
>   	memcpy(dasd_info->type, base->discipline->name, 4);
>   
> -	if (block->request_queue->request_fn) {
> -		struct list_head *l;
> -#ifdef DASD_EXTENDED_PROFILING
> -		{
> -			struct list_head *l;
> -			spin_lock_irqsave(&block->lock, flags);
> -			list_for_each(l, &block->request_queue->queue_head)
> -				dasd_info->req_queue_len++;
> -			spin_unlock_irqrestore(&block->lock, flags);
> -		}
> -#endif				/* DASD_EXTENDED_PROFILING */
> -		spin_lock_irqsave(get_ccwdev_lock(base->cdev), flags);
> -		list_for_each(l, &base->ccw_queue)
> -			dasd_info->chanq_len++;
> -		spin_unlock_irqrestore(get_ccwdev_lock(base->cdev),
> -				       flags);
> -	}
> +	spin_lock_irqsave(&block->queue_lock, flags);
> +	list_for_each(l, &base->ccw_queue)
> +		dasd_info->chanq_len++;
> +	spin_unlock_irqrestore(&block->queue_lock, flags);
>   
>   	rc = 0;
>   	if (copy_to_user(argp, dasd_info,
> 

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 11/28] bsg: pass in desired timeout handler
  2018-10-25 21:10 ` [PATCH 11/28] bsg: pass in desired timeout handler Jens Axboe
  2018-10-28 15:53   ` Christoph Hellwig
@ 2018-10-29  6:55   ` Hannes Reinecke
  1 sibling, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:55 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide
  Cc: Johannes Thumshirn, Benjamin Block

On 10/25/18 11:10 PM, Jens Axboe wrote:
> This will ease in the conversion to blk-mq, where we can't set
> a timeout handler after queue init.
> 
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/bsg-lib.c                     | 3 ++-
>   drivers/scsi/scsi_transport_fc.c    | 7 +++----
>   drivers/scsi/scsi_transport_iscsi.c | 2 +-
>   drivers/scsi/scsi_transport_sas.c   | 4 ++--
>   include/linux/bsg-lib.h             | 2 +-
>   5 files changed, 9 insertions(+), 9 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 12/28] bsg: provide bsg_remove_queue() helper
  2018-10-25 21:10 ` [PATCH 12/28] bsg: provide bsg_remove_queue() helper Jens Axboe
  2018-10-28 15:53   ` Christoph Hellwig
@ 2018-10-29  6:55   ` Hannes Reinecke
  2018-10-29 10:16   ` Johannes Thumshirn
  2 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:55 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide
  Cc: Johannes Thumshirn, Benjamin Block

On 10/25/18 11:10 PM, Jens Axboe wrote:
> All drivers do unregister + cleanup, provide a helper for that.
> 
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/bsg-lib.c                     | 7 +++++++
>   drivers/scsi/scsi_transport_fc.c    | 6 ++----
>   drivers/scsi/scsi_transport_iscsi.c | 7 +++----
>   drivers/scsi/scsi_transport_sas.c   | 6 ++----
>   include/linux/bsg-lib.h             | 1 +
>   5 files changed, 15 insertions(+), 12 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 13/28] bsg: convert to use blk-mq
  2018-10-25 21:10 ` [PATCH 13/28] bsg: convert to use blk-mq Jens Axboe
  2018-10-28 16:07   ` Christoph Hellwig
@ 2018-10-29  6:57   ` Hannes Reinecke
  1 sibling, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:57 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide
  Cc: Johannes Thumshirn, Benjamin Block

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Requires a few changes to the FC transport class as well.
> 
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/bsg-lib.c                  | 123 +++++++++++++++++++------------
>   drivers/scsi/scsi_transport_fc.c |  59 +++++++++------
>   2 files changed, 110 insertions(+), 72 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 14/28] block: remove blk_complete_request()
  2018-10-25 21:10 ` [PATCH 14/28] block: remove blk_complete_request() Jens Axboe
@ 2018-10-29  6:59   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:59 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> It's now unused.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-softirq.c    | 20 --------------------
>   include/linux/blkdev.h |  1 -
>   2 files changed, 21 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 15/28] blk-wbt: kill check for legacy queue type
  2018-10-25 21:10 ` [PATCH 15/28] blk-wbt: kill check for legacy queue type Jens Axboe
@ 2018-10-29  6:59   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  6:59 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Everything is blk-mq at this point, so it doesn't make any sense
> to have this option available as it does nothing.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/Kconfig   | 6 ------
>   block/blk-wbt.c | 3 +--
>   2 files changed, 1 insertion(+), 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 16/28] blk-cgroup: remove legacy queue bypassing
  2018-10-25 21:10 ` [PATCH 16/28] blk-cgroup: remove legacy queue bypassing Jens Axboe
@ 2018-10-29  7:00   ` Hannes Reinecke
  2018-10-29 11:00   ` Johannes Thumshirn
  1 sibling, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:00 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> We only support mq devices now.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-cgroup.c | 8 --------
>   1 file changed, 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 17/28] block: remove legacy rq tagging
  2018-10-25 21:10 ` [PATCH 17/28] block: remove legacy rq tagging Jens Axboe
@ 2018-10-29  7:01   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:01 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> It's now unused, kill it.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   Documentation/block/biodoc.txt |  88 --------
>   block/Makefile                 |   2 +-
>   block/blk-core.c               |   6 -
>   block/blk-mq-debugfs.c         |   2 -
>   block/blk-mq-tag.c             |   6 +-
>   block/blk-sysfs.c              |   3 -
>   block/blk-tag.c                | 378 ---------------------------------
>   include/linux/blkdev.h         |  35 ---
>   8 files changed, 3 insertions(+), 517 deletions(-)
>   delete mode 100644 block/blk-tag.c
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 18/28] block: remove non mq parts from the flush code
  2018-10-25 21:10 ` [PATCH 18/28] block: remove non mq parts from the flush code Jens Axboe
@ 2018-10-29  7:02   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:02 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-flush.c | 154 +++++++++-------------------------------------
>   block/blk.h       |   4 +-
>   2 files changed, 31 insertions(+), 127 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 21/28] block: remove __blk_put_request()
  2018-10-25 21:10 ` [PATCH 21/28] block: remove __blk_put_request() Jens Axboe
@ 2018-10-29  7:03   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:03 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Now there's no difference between blk_put_request() and
> __blk_put_request() anymore, get rid of the underscore version and
> convert the few callers.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-core.c                   | 9 ---------
>   block/blk-merge.c                  | 2 +-
>   drivers/scsi/osd/osd_initiator.c   | 4 ++--
>   drivers/scsi/osst.c                | 2 +-
>   drivers/scsi/scsi_error.c          | 2 +-
>   drivers/scsi/sg.c                  | 2 +-
>   drivers/scsi/st.c                  | 2 +-
>   drivers/target/target_core_pscsi.c | 2 +-
>   include/linux/blkdev.h             | 1 -
>   9 files changed, 8 insertions(+), 18 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 22/28] block: kill legacy parts of timeout handling
  2018-10-25 21:10 ` [PATCH 22/28] block: kill legacy parts of timeout handling Jens Axboe
@ 2018-10-29  7:04   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:04 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> The only user of legacy timing now is BSG, which is invoked
> from the mq timeout handler. Kill the legacy code, and rename
> the q->rq_timed_out_fn to q->bsg_job_timeout_fn.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-core.c       |  1 -
>   block/blk-settings.c   |  7 ---
>   block/blk-timeout.c    | 99 +++---------------------------------------
>   block/blk.h            |  1 -
>   block/bsg-lib.c        |  6 +--
>   include/linux/blkdev.h |  4 +-
>   6 files changed, 11 insertions(+), 107 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 23/28] block: kill lld busy
  2018-10-25 21:10 ` [PATCH 23/28] block: kill lld busy Jens Axboe
  2018-10-25 21:42   ` Bart Van Assche
@ 2018-10-29  7:10   ` Hannes Reinecke
  2018-10-29 14:25     ` Jens Axboe
  1 sibling, 1 reply; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Nobody sets the helper, so we always return 0. Kill it.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-core.c       | 28 ----------------------------
>   block/blk-settings.c   |  6 ------
>   drivers/md/dm-mpath.c  |  4 +---
>   include/linux/blkdev.h |  4 ----
>   4 files changed, 1 insertion(+), 41 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 4c39c7865f9c..dd1328f4dc31 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1795,34 +1795,6 @@ void rq_flush_dcache_pages(struct request *rq)
>   EXPORT_SYMBOL_GPL(rq_flush_dcache_pages);
>   #endif
>   
> -/**
> - * blk_lld_busy - Check if underlying low-level drivers of a device are busy
> - * @q : the queue of the device being checked
> - *
> - * Description:
> - *    Check if underlying low-level drivers of a device are busy.
> - *    If the drivers want to export their busy state, they must set own
> - *    exporting function using blk_queue_lld_busy() first.
> - *
> - *    Basically, this function is used only by request stacking drivers
> - *    to stop dispatching requests to underlying devices when underlying
> - *    devices are busy.  This behavior helps more I/O merging on the queue
> - *    of the request stacking driver and prevents I/O throughput regression
> - *    on burst I/O load.
> - *
> - * Return:
> - *    0 - Not busy (The request stacking driver should dispatch request)
> - *    1 - Busy (The request stacking driver should stop dispatching request)
> - */
> -int blk_lld_busy(struct request_queue *q)
> -{
> -	if (q->lld_busy_fn)
> -		return q->lld_busy_fn(q);
> -
> -	return 0;
> -}
> -EXPORT_SYMBOL_GPL(blk_lld_busy);
> -
>   /**
>    * blk_rq_unprep_clone - Helper function to free all bios in a cloned request
>    * @rq: the clone request to be cleaned up
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 3c5da75c2def..1895f499bbe5 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -32,12 +32,6 @@ void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
>   }
>   EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
>   
> -void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn)
> -{
> -	q->lld_busy_fn = fn;
> -}
> -EXPORT_SYMBOL_GPL(blk_queue_lld_busy);
> -
>   /**
>    * blk_set_default_limits - reset limits to default values
>    * @lim:  the queue_limits structure to reset
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index a24ed3973e7c..4d736e0fd67f 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1936,9 +1936,7 @@ static int multipath_iterate_devices(struct dm_target *ti,
>   
>   static int pgpath_busy(struct pgpath *pgpath)
>   {
> -	struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev);
> -
> -	return blk_lld_busy(q);
> +	return 0;
>   }
>   
>   /*
Actually, I'm not quite sure this is correct; dm-mpath needs to return a
busy status (via the '->busy' callback) to allow for back-pressure on
the upper layers. Having pgpath_busy() always return 0 effectively
disables that back-pressure.
Shouldn't we check whether the underlying queue's tag map is busy here?
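
A minimal sketch of what such a check could look like; the helper below
is hypothetical, not an existing blk-mq interface:

static int pgpath_busy(struct pgpath *pgpath)
{
	struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev);

	/*
	 * blk_mq_queue_tags_exhausted() is assumed here for illustration:
	 * report busy when the underlying blk-mq queue has no free tags,
	 * so dm-mpath keeps exerting back-pressure on the upper layers.
	 */
	return blk_mq_queue_tags_exhausted(q);
}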

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 24/28] block: remove request_list code
  2018-10-25 21:10 ` [PATCH 24/28] block: remove request_list code Jens Axboe
@ 2018-10-29  7:10   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:10 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> It's now dead code, nobody uses it.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-cgroup.c         |  47 ----------------
>   block/blk-core.c           |  75 --------------------------
>   block/blk-mq.c             |   4 --
>   block/blk.h                |   3 --
>   include/linux/blk-cgroup.h | 108 -------------------------------------
>   include/linux/blkdev.h     |  34 ------------
>   6 files changed, 271 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 25/28] block: kill request slab cache
  2018-10-25 21:10 ` [PATCH 25/28] block: kill request slab cache Jens Axboe
@ 2018-10-29  7:11   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:11 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-core.c | 8 --------
>   block/blk.h      | 1 -
>   2 files changed, 9 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 26/28] block: remove req_no_special_merge() from merging code
  2018-10-25 21:10 ` [PATCH 26/28] block: remove req_no_special_merge() from merging code Jens Axboe
@ 2018-10-29  7:12   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:12 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> It'll always be false at this point, just remove it.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-merge.c | 25 +++----------------------
>   1 file changed, 3 insertions(+), 22 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 27/28] blk-merge: kill dead queue lock held check
  2018-10-25 21:10 ` [PATCH 27/28] blk-merge: kill dead queue lock held check Jens Axboe
@ 2018-10-29  7:12   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:12 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> This is dead code, any queue reaching this part has mq_ops
> attached.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-merge.c | 3 ---
>   1 file changed, 3 deletions(-)
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 3561dcce2260..0128284bded4 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -704,9 +704,6 @@ static void blk_account_io_merge(struct request *req)
>   static struct request *attempt_merge(struct request_queue *q,
>   				     struct request *req, struct request *next)
>   {
> -	if (!q->mq_ops)
> -		lockdep_assert_held(q->queue_lock);
> -
>   	if (!rq_mergeable(req) || !rq_mergeable(next))
>   		return NULL;
>   
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 28/28] block: get rid of blk_queued_rq()
  2018-10-25 21:10 ` [PATCH 28/28] block: get rid of blk_queued_rq() Jens Axboe
@ 2018-10-29  7:12   ` Hannes Reinecke
  0 siblings, 0 replies; 90+ messages in thread
From: Hannes Reinecke @ 2018-10-29  7:12 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 10/25/18 11:10 PM, Jens Axboe wrote:
> No point in hiding what this does, just open code it in the
> one spot where we are still using it.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-mq.c         | 2 +-
>   include/linux/blkdev.h | 2 --
>   2 files changed, 1 insertion(+), 3 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 12/28] bsg: provide bsg_remove_queue() helper
  2018-10-25 21:10 ` [PATCH 12/28] bsg: provide bsg_remove_queue() helper Jens Axboe
  2018-10-28 15:53   ` Christoph Hellwig
  2018-10-29  6:55   ` Hannes Reinecke
@ 2018-10-29 10:16   ` Johannes Thumshirn
  2018-10-29 14:15     ` Jens Axboe
  2 siblings, 1 reply; 90+ messages in thread
From: Johannes Thumshirn @ 2018-10-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide; +Cc: Benjamin Block

On 25/10/18 23:10, Jens Axboe wrote:
> All drivers do unregister + cleanup, provide a helper for that.
> 
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  block/bsg-lib.c                     | 7 +++++++
>  drivers/scsi/scsi_transport_fc.c    | 6 ++----
>  drivers/scsi/scsi_transport_iscsi.c | 7 +++----
>  drivers/scsi/scsi_transport_sas.c   | 6 ++----
>  include/linux/bsg-lib.h             | 1 +
>  5 files changed, 15 insertions(+), 12 deletions(-)
> 
> diff --git a/block/bsg-lib.c b/block/bsg-lib.c
> index 1da011ec04e6..267f965af77a 100644
> --- a/block/bsg-lib.c
> +++ b/block/bsg-lib.c
> @@ -296,6 +296,13 @@ static void bsg_exit_rq(struct request_queue *q, struct request *req)
>  	kfree(job->reply);
>  }
>  
> +void bsg_remove_queue(struct request_queue *q)
> +{
> +	bsg_unregister_queue(q);
> +	blk_cleanup_queue(q);
> +}
> +EXPORT_SYMBOL_GPL(bsg_remove_queue);
> +
>  /**
>   * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
>   * @dev: device to attach bsg device to
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index 98aaffb4c715..4d64956bb5d3 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -3851,10 +3851,8 @@ fc_bsg_rportadd(struct Scsi_Host *shost, struct fc_rport *rport)
>  static void
>  fc_bsg_remove(struct request_queue *q)
>  {
> -	if (q) {
> -		bsg_unregister_queue(q);
> -		blk_cleanup_queue(q);
> -	}
> +	if (q)
> +		bsg_remove_queue(q);
>  }

Not sure if I'm too late to the game, but as all callers do an
"if (q)" check, can't we just move it into bsg_remove_queue()?

Otherwise:
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                        SUSE Labs
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 16/28] blk-cgroup: remove legacy queue bypassing
  2018-10-25 21:10 ` [PATCH 16/28] blk-cgroup: remove legacy queue bypassing Jens Axboe
  2018-10-29  7:00   ` Hannes Reinecke
@ 2018-10-29 11:00   ` Johannes Thumshirn
  2018-10-29 14:23     ` Jens Axboe
  1 sibling, 1 reply; 90+ messages in thread
From: Johannes Thumshirn @ 2018-10-29 11:00 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

Hi Jens,

On 25/10/18 23:10, Jens Axboe wrote:
[...]
> @@ -1487,8 +1485,6 @@ int blkcg_activate_policy(struct request_queue *q,
>  out_bypass_end:
>  	if (q->mq_ops)
>  		blk_mq_unfreeze_queue(q);
> -	else
> -		blk_queue_bypass_end(q);

>  
> 

Now that we only have mq, do we still need all these checks for q->mq_ops?
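
E.g. the hunk above could presumably be reduced further, along these
lines (sketch, assuming every remaining request_queue is blk-mq):

 out_bypass_end:
-	if (q->mq_ops)
-		blk_mq_unfreeze_queue(q);
+	blk_mq_unfreeze_queue(q);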

-- 
Johannes Thumshirn                                        SUSE Labs
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
                   ` (28 preceding siblings ...)
  2018-10-25 23:09 ` [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Bart Van Assche
@ 2018-10-29 12:00 ` Ming Lei
  2018-10-29 14:50   ` Jens Axboe
  29 siblings, 1 reply; 90+ messages in thread
From: Ming Lei @ 2018-10-29 12:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-ide

On Thu, Oct 25, 2018 at 03:10:11PM -0600, Jens Axboe wrote:
> The first round of this went into 4.20-rc, but we've still some of
> them pending. This patch series converts the remaining drivers to
> blk-mq. The ones that support dual paths (like SCSI and DM) have
> the non-mq path removed. At the end, legacy IO code and schedulers
> are killed off.
> 
> This patch series is on top of my for-linus branch. It can also
> be bound in my mq-conversions branch.
> 
>  Documentation/block/biodoc.txt         |   88 -
>  Documentation/block/cfq-iosched.txt    |  291 --
>  Documentation/scsi/scsi-parameters.txt |    5 -
>  block/Kconfig                          |    6 -
>  block/Kconfig.iosched                  |   61 -
>  block/Makefile                         |    5 +-
>  block/bfq-iosched.c                    |    1 -
>  block/blk-cgroup.c                     |   55 -
>  block/blk-core.c                       | 1860 +-----------
>  block/blk-exec.c                       |   20 +-
>  block/blk-flush.c                      |  154 +-
>  block/blk-ioc.c                        |   46 +-
>  block/blk-merge.c                      |   35 +-
>  block/blk-mq-debugfs.c                 |    2 -
>  block/blk-mq-tag.c                     |    6 +-
>  block/blk-mq.c                         |   13 +-
>  block/blk-settings.c                   |   49 -
>  block/blk-softirq.c                    |   20 -
>  block/blk-sysfs.c                      |   39 +-
>  block/blk-tag.c                        |  378 ---
>  block/blk-timeout.c                    |   99 +-
>  block/blk-wbt.c                        |    3 +-
>  block/blk.h                            |   60 +-
>  block/bsg-lib.c                        |  131 +-
>  block/cfq-iosched.c                    | 4916 --------------------------------
>  block/deadline-iosched.c               |  560 ----
>  block/elevator.c                       |  447 +--
>  block/kyber-iosched.c                  |    1 -
>  block/mq-deadline.c                    |    1 -
>  block/noop-iosched.c                   |  124 -
>  drivers/block/sunvdc.c                 |  149 +-
>  drivers/ide/ide-atapi.c                |   25 +-
>  drivers/ide/ide-cd.c                   |  175 +-
>  drivers/ide/ide-disk.c                 |    5 +-
>  drivers/ide/ide-io.c                   |  101 +-
>  drivers/ide/ide-park.c                 |    4 +-
>  drivers/ide/ide-pm.c                   |   28 +-
>  drivers/ide/ide-probe.c                |   68 +-
>  drivers/infiniband/ulp/srp/ib_srp.c    |    7 -
>  drivers/md/Kconfig                     |   11 -
>  drivers/md/dm-core.h                   |   10 -
>  drivers/md/dm-mpath.c                  |   18 +-
>  drivers/md/dm-rq.c                     |  293 +-
>  drivers/md/dm-rq.h                     |    4 -
>  drivers/md/dm-sysfs.c                  |    3 +-
>  drivers/md/dm-table.c                  |   36 +-
>  drivers/md/dm.c                        |   21 +-
>  drivers/md/dm.h                        |    1 -
>  drivers/memstick/core/ms_block.c       |  110 +-
>  drivers/memstick/core/ms_block.h       |    1 +
>  drivers/memstick/core/mspro_block.c    |  121 +-
>  drivers/s390/block/dasd_ioctl.c        |   22 +-
>  drivers/scsi/Kconfig                   |   12 -
>  drivers/scsi/cxlflash/main.c           |    6 -
>  drivers/scsi/hosts.c                   |   29 +-
>  drivers/scsi/lpfc/lpfc_scsi.c          |    2 +-
>  drivers/scsi/osd/osd_initiator.c       |    4 +-
>  drivers/scsi/osst.c                    |    2 +-
>  drivers/scsi/qedi/qedi_main.c          |    3 +-
>  drivers/scsi/qla2xxx/qla_os.c          |   30 +-
>  drivers/scsi/scsi.c                    |    5 +-
>  drivers/scsi/scsi_debug.c              |    3 +-
>  drivers/scsi/scsi_error.c              |    4 +-
>  drivers/scsi/scsi_lib.c                |  624 +---
>  drivers/scsi/scsi_priv.h               |    1 -
>  drivers/scsi/scsi_scan.c               |   10 +-
>  drivers/scsi/scsi_sysfs.c              |    8 +-
>  drivers/scsi/scsi_transport_fc.c       |   72 +-
>  drivers/scsi/scsi_transport_iscsi.c    |    9 +-
>  drivers/scsi/scsi_transport_sas.c      |   10 +-
>  drivers/scsi/sg.c                      |    2 +-
>  drivers/scsi/st.c                      |    2 +-
>  drivers/scsi/ufs/ufshcd.c              |    6 -
>  drivers/target/target_core_pscsi.c     |    2 +-
>  include/linux/blk-cgroup.h             |  108 -
>  include/linux/blkdev.h                 |  174 +-
>  include/linux/bsg-lib.h                |    3 +-
>  include/linux/elevator.h               |   90 +-
>  include/linux/ide.h                    |   13 +-
>  include/linux/init.h                   |    1 -
>  include/scsi/scsi_host.h               |   18 +-
>  include/scsi/scsi_tcq.h                |   14 +-
>  init/do_mounts_initrd.c                |    3 -
>  init/initramfs.c                       |    6 -
>  init/main.c                            |   12 -
>  85 files changed, 833 insertions(+), 11144 deletions(-)

Hi Jens,

A kernel oops[2] is triggered when running the elevator switch stress test[1].

[1] https://people.redhat.com/~minlei/tests/tools/elv-switch

[2] kernel oops

[  399.702365] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
[  399.703603] sd 8:0:0:0: Power-on or device reset occurred
[  399.709833] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
[  399.711745] sd 8:0:0:0: [sdd] Write Protect is off
[  399.712395] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
[  399.715119] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
[  399.803561] sd 8:0:0:0: [sdd] Attached SCSI disk
[  399.844825] sd 8:0:0:0: [sdd] Synchronizing SCSI cache
[  399.968945] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
[  399.970077] sd 8:0:0:0: Power-on or device reset occurred
[  399.972904] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
[  399.974823] sd 8:0:0:0: [sdd] Write Protect is off
[  399.975411] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
[  399.978052] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
[  400.030588] BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
[  400.032195] PGD 0 P4D 0
[  400.032482] Oops: 0000 [#1] PREEMPT SMP PTI
[  400.032945] CPU: 2 PID: 318 Comm: kworker/u8:5 Not tainted 4.19.0_0ffdea6679a9_mq-conversions+ #1
[  400.033927] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
[  400.034915] Workqueue: events_unbound async_run_entry_fn
[  400.035536] RIP: 0010:bfq_reset_rate_computation+0x52/0x85
[  400.036177] Code: 00 48 89 83 b0 00 00 00 8b 45 20 c1 e8 09 89 83 d8 00 00 00 48 89 83 d0 00 00 00 eb 0a c7 87 c8 00 00 00 00 00 00 00 48 8b 03 <48> 8b b8 00 05 00 00 48 85 ff 74 24 8b 8b c8 00 00 00 4c 8b 8b d0
[  400.038300] RSP: 0018:ffffc90000a47c50 EFLAGS: 00010046
[  400.038915] RAX: 0000000000000000 RBX: ffff88017ac2bc00 RCX: 0000000017d6067a
[  400.039746] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88017ac2bc00
[  400.040573] RBP: ffff88017ac2bc00 R08: 00000000f461e6ac R09: 0000000000000006
[  400.041361] R10: 000000000000020f R11: 0000000000000000 R12: ffff88016c130000
[  400.042127] R13: 0000005d1c095013 R14: ffff88017ac2bf78 R15: 0000000000000020
[  400.042957] FS:  0000000000000000(0000) GS:ffff88017bb00000(0000) knlGS:0000000000000000
[  400.043895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  400.044518] CR2: 0000000000000500 CR3: 0000000179c32004 CR4: 0000000000760ee0
[  400.045294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  400.046070] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  400.046843] PKRU: 55555554
[  400.047144] Call Trace:
[  400.047438]  bfq_finish_requeue_request+0x145/0x2fd
[  400.047977]  blk_mq_free_request+0x49/0xea
[  400.048429]  __scsi_execute+0x16c/0x17b
[  400.048885]  scsi_mode_sense+0x113/0x265
[  400.049356]  sd_revalidate_disk+0xb5b/0x109a [sd_mod]
[  400.049947]  ? __device_add_disk+0x3d2/0x3f6
[  400.050415]  sd_probe_async+0x10d/0x18b [sd_mod]
[  400.050924]  async_run_entry_fn+0x6d/0x12b
[  400.051374]  process_one_work+0x1da/0x313
[  400.051841]  ? rescuer_thread+0x282/0x282
[  400.052313]  worker_thread+0x1ca/0x295
[  400.052769]  kthread+0x115/0x11d
[  400.053127]  ? kthread_park+0x76/0x76
[  400.053531]  ret_from_fork+0x35/0x40
[  400.053929] Modules linked in: scsi_debug null_blk isofs iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core lpc_ich mfd_core ip_tables sr_mod cdrom usb_storage sd_mod ahci libahci crc32c_intel libata virtio_scsi qemu_fw_cfg dm_mirror dm_region_hash dm_log dm_mod [last unloaded: null_blk]
[  400.056640] Dumping ftrace buffer:
[  400.057022]    (ftrace buffer empty)
[  400.057418] CR2: 0000000000000500
[  400.057820] ---[ end trace 812cbc0fe8802fdd ]---

-- 
Ming

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 12/28] bsg: provide bsg_remove_queue() helper
  2018-10-29 10:16   ` Johannes Thumshirn
@ 2018-10-29 14:15     ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 14:15 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-block, linux-scsi, linux-ide; +Cc: Benjamin Block

On 10/29/18 4:16 AM, Johannes Thumshirn wrote:
> On 25/10/18 23:10, Jens Axboe wrote:
>> All drivers do unregister + cleanup, provide a helper for that.
>>
>> Cc: Johannes Thumshirn <jthumshirn@suse.de>
>> Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
>> Cc: linux-scsi@vger.kernel.org
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  block/bsg-lib.c                     | 7 +++++++
>>  drivers/scsi/scsi_transport_fc.c    | 6 ++----
>>  drivers/scsi/scsi_transport_iscsi.c | 7 +++----
>>  drivers/scsi/scsi_transport_sas.c   | 6 ++----
>>  include/linux/bsg-lib.h             | 1 +
>>  5 files changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/block/bsg-lib.c b/block/bsg-lib.c
>> index 1da011ec04e6..267f965af77a 100644
>> --- a/block/bsg-lib.c
>> +++ b/block/bsg-lib.c
>> @@ -296,6 +296,13 @@ static void bsg_exit_rq(struct request_queue *q, struct request *req)
>>  	kfree(job->reply);
>>  }
>>  
>> +void bsg_remove_queue(struct request_queue *q)
>> +{
>> +	bsg_unregister_queue(q);
>> +	blk_cleanup_queue(q);
>> +}
>> +EXPORT_SYMBOL_GPL(bsg_remove_queue);
>> +
>>  /**
>>   * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
>>   * @dev: device to attach bsg device to
>> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
>> index 98aaffb4c715..4d64956bb5d3 100644
>> --- a/drivers/scsi/scsi_transport_fc.c
>> +++ b/drivers/scsi/scsi_transport_fc.c
>> @@ -3851,10 +3851,8 @@ fc_bsg_rportadd(struct Scsi_Host *shost, struct fc_rport *rport)
>>  static void
>>  fc_bsg_remove(struct request_queue *q)
>>  {
>> -	if (q) {
>> -		bsg_unregister_queue(q);
>> -		blk_cleanup_queue(q);
>> -	}
>> +	if (q)
>> +		bsg_remove_queue(q);
>>  }
> 
> Not sure if I'm too late to the game, but as all callers do an
> "if (q)" check, can't we just move it into bsg_remove_queue()?

Not too late, I'll make that change.
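
Roughly like the below, i.e. fold the NULL check into the helper so the
callers can drop their own guards (untested sketch of the change, not
the final patch):

void bsg_remove_queue(struct request_queue *q)
{
	/* accept a NULL queue so callers don't need an "if (q)" guard */
	if (q) {
		bsg_unregister_queue(q);
		blk_cleanup_queue(q);
	}
}
EXPORT_SYMBOL_GPL(bsg_remove_queue);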

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 09/28] dm: remove legacy IO path
  2018-10-29  6:53   ` Hannes Reinecke
@ 2018-10-29 14:17     ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 14:17 UTC (permalink / raw)
  To: Hannes Reinecke, linux-block, linux-scsi, linux-ide

On 10/29/18 12:53 AM, Hannes Reinecke wrote:
> On 10/25/18 11:10 PM, Jens Axboe wrote:
>> dm supports both, and since we're killing off the legacy path
>> in general, get rid of it in dm as well.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>   drivers/md/Kconfig    |  11 --
>>   drivers/md/dm-core.h  |  10 --
>>   drivers/md/dm-mpath.c |  14 +-
>>   drivers/md/dm-rq.c    | 293 ++++--------------------------------------
>>   drivers/md/dm-rq.h    |   4 -
>>   drivers/md/dm-sysfs.c |   3 +-
>>   drivers/md/dm-table.c |  36 +-----
>>   drivers/md/dm.c       |  21 +--
>>   drivers/md/dm.h       |   1 -
>>   9 files changed, 35 insertions(+), 358 deletions(-)
>>
> [ .. ]
> 
>> @@ -790,11 +550,6 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
>>   	struct dm_target *immutable_tgt;
>>   	int err;
>>   
>> -	if (!dm_table_all_blk_mq_devices(t)) {
>> -		DMERR("request-based dm-mq may only be stacked on blk-mq device(s)");
>> -		return -EINVAL;
>> -	}
>> -
>>   	md->tag_set = kzalloc_node(sizeof(struct blk_mq_tag_set), GFP_KERNEL, md->numa_node_id);
>>   	if (!md->tag_set)
>>   		return -ENOMEM;
> That warning is still valid, no?
> 
> [ .. ]
>> @@ -2217,13 +2211,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>>   
>>   	switch (type) {
>>   	case DM_TYPE_REQUEST_BASED:
>> -		dm_init_normal_md_queue(md);
>> -		r = dm_old_init_request_queue(md, t);
>> -		if (r) {
>> -			DMERR("Cannot initialize queue for request-based mapped device");
>> -			return r;
>> -		}
>> -		break;
>>   	case DM_TYPE_MQ_REQUEST_BASED:
>>   		r = dm_mq_init_request_queue(md, t);
>>   		if (r) {
> I'd love to kill DM_TYPE_REQUEST_BASED completely, seeing that it's 
> referring to the now-defunct legacy I/O path.
> Mike?

For both of these, Mike carried a version that had some of those bits
unified, and it's now in mainline. I have just dropped this version; I
was only carrying it to resolve that dependency until mainline had it.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 16/28] blk-cgroup: remove legacy queue bypassing
  2018-10-29 11:00   ` Johannes Thumshirn
@ 2018-10-29 14:23     ` Jens Axboe
  2018-10-29 14:25       ` Johannes Thumshirn
  0 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 14:23 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-block, linux-scsi, linux-ide

On 10/29/18 5:00 AM, Johannes Thumshirn wrote:
> Hi Jens,
> 
> On 25/10/18 23:10, Jens Axboe wrote:
> [...]
>> @@ -1487,8 +1485,6 @@ int blkcg_activate_policy(struct request_queue *q,
>>  out_bypass_end:
>>  	if (q->mq_ops)
>>  		blk_mq_unfreeze_queue(q);
>> -	else
>> -		blk_queue_bypass_end(q);
> 
>>  
>>
> 
> Now that we only have mq, do we still need all these checks for q->mq_ops?

We need it for the cases where we can be passing in a stacked driver
or an mq driver, or for cases where we explicitly register mq attributes,
and that sort of thing.
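
As an illustration (hypothetical call site, not a hunk from this
series): code that can be handed either a bio-based stacked queue or a
blk-mq queue still has to gate the request-path work on q->mq_ops, e.g.

static void example_freeze(struct request_queue *q)
{
	/*
	 * Bio-based stacked queues have no mq_ops and no requests to
	 * freeze; only blk-mq queues take the request-path branch.
	 */
	if (q->mq_ops)
		blk_mq_freeze_queue(q);
}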

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 16/28] blk-cgroup: remove legacy queue bypassing
  2018-10-29 14:23     ` Jens Axboe
@ 2018-10-29 14:25       ` Johannes Thumshirn
  2018-10-29 14:59         ` Jens Axboe
  0 siblings, 1 reply; 90+ messages in thread
From: Johannes Thumshirn @ 2018-10-29 14:25 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-scsi, linux-ide

On 29/10/18 15:23, Jens Axboe wrote:
> On 10/29/18 5:00 AM, Johannes Thumshirn wrote:
>> Hi Jens,
>>
>> On 25/10/18 23:10, Jens Axboe wrote:
>> [...]
>>> @@ -1487,8 +1485,6 @@ int blkcg_activate_policy(struct request_queue *q,
>>>  out_bypass_end:
>>>  	if (q->mq_ops)
>>>  		blk_mq_unfreeze_queue(q);
>>> -	else
>>> -		blk_queue_bypass_end(q);
>>
>>>  
>>>
>>
>> Now that we only have mq, do we still need all these checks for q->mq_ops?
> 
> We need it for the cases where we can be passing in a stacked driver
> or an mq driver, or for cases where we explicitly register mq attributes,
> and that sort of thing.
> 

Ah ok. Guess we'll have to live with it then.

	Johannes

-- 
Johannes Thumshirn                                        SUSE Labs
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 23/28] block: kill lld busy
  2018-10-29  7:10   ` Hannes Reinecke
@ 2018-10-29 14:25     ` Jens Axboe
  2018-10-29 15:51       ` Mike Snitzer
  0 siblings, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 14:25 UTC (permalink / raw)
  To: Hannes Reinecke, linux-block, linux-scsi, linux-ide, Mike Snitzer

On 10/29/18 1:10 AM, Hannes Reinecke wrote:
> On 10/25/18 11:10 PM, Jens Axboe wrote:
>> Nobody sets the helper, so we always return 0. Kill it.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>   block/blk-core.c       | 28 ----------------------------
>>   block/blk-settings.c   |  6 ------
>>   drivers/md/dm-mpath.c  |  4 +---
>>   include/linux/blkdev.h |  4 ----
>>   4 files changed, 1 insertion(+), 41 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 4c39c7865f9c..dd1328f4dc31 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1795,34 +1795,6 @@ void rq_flush_dcache_pages(struct request *rq)
>>   EXPORT_SYMBOL_GPL(rq_flush_dcache_pages);
>>   #endif
>>   
>> -/**
>> - * blk_lld_busy - Check if underlying low-level drivers of a device are busy
>> - * @q : the queue of the device being checked
>> - *
>> - * Description:
>> - *    Check if underlying low-level drivers of a device are busy.
>> - *    If the drivers want to export their busy state, they must set own
>> - *    exporting function using blk_queue_lld_busy() first.
>> - *
>> - *    Basically, this function is used only by request stacking drivers
>> - *    to stop dispatching requests to underlying devices when underlying
>> - *    devices are busy.  This behavior helps more I/O merging on the queue
>> - *    of the request stacking driver and prevents I/O throughput regression
>> - *    on burst I/O load.
>> - *
>> - * Return:
>> - *    0 - Not busy (The request stacking driver should dispatch request)
>> - *    1 - Busy (The request stacking driver should stop dispatching request)
>> - */
>> -int blk_lld_busy(struct request_queue *q)
>> -{
>> -	if (q->lld_busy_fn)
>> -		return q->lld_busy_fn(q);
>> -
>> -	return 0;
>> -}
>> -EXPORT_SYMBOL_GPL(blk_lld_busy);
>> -
>>   /**
>>    * blk_rq_unprep_clone - Helper function to free all bios in a cloned request
>>    * @rq: the clone request to be cleaned up
>> diff --git a/block/blk-settings.c b/block/blk-settings.c
>> index 3c5da75c2def..1895f499bbe5 100644
>> --- a/block/blk-settings.c
>> +++ b/block/blk-settings.c
>> @@ -32,12 +32,6 @@ void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
>>   }
>>   EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
>>   
>> -void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn)
>> -{
>> -	q->lld_busy_fn = fn;
>> -}
>> -EXPORT_SYMBOL_GPL(blk_queue_lld_busy);
>> -
>>   /**
>>    * blk_set_default_limits - reset limits to default values
>>    * @lim:  the queue_limits structure to reset
>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>> index a24ed3973e7c..4d736e0fd67f 100644
>> --- a/drivers/md/dm-mpath.c
>> +++ b/drivers/md/dm-mpath.c
>> @@ -1936,9 +1936,7 @@ static int multipath_iterate_devices(struct dm_target *ti,
>>   
>>   static int pgpath_busy(struct pgpath *pgpath)
>>   {
>> -	struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev);
>> -
>> -	return blk_lld_busy(q);
>> +	return 0;
>>   }
>>   
>>   /*
> Actually, I'm not quite sure this is correct; dm-mpath needs to return a 
> busy status (via the '->busy' callback) to allow for back-pressure on 
> the upper layers.
> Just disabling the callback will disable this.
> Shouldn't we check if the tagmap is busy here?

Mike? We can pretty easily make it just check for busy tags.
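
Something along these lines, as a rough sketch only (helper names are
made up here, a tag set shared across queues would need per-queue
filtering, and this assumes the void busy_tag_iter_fn callback of this
kernel generation):

static void count_inflight(struct request *rq, void *data, bool reserved)
{
	unsigned int *inflight = data;

	(*inflight)++;
}

static bool example_tags_busy(struct request_queue *q)
{
	unsigned int inflight = 0;

	/* any allocated tag on this queue's tag set counts as busy */
	blk_mq_tagset_busy_iter(q->tag_set, count_inflight, &inflight);
	return inflight > 0;
}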

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-29 12:00 ` Ming Lei
@ 2018-10-29 14:50   ` Jens Axboe
  2018-10-29 15:04     ` Jens Axboe
  2018-10-29 15:05     ` Ming Lei
  0 siblings, 2 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 14:50 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, linux-scsi, linux-ide, Paolo Valente

On 10/29/18 6:00 AM, Ming Lei wrote:
> On Thu, Oct 25, 2018 at 03:10:11PM -0600, Jens Axboe wrote:
>> The first round of this went into 4.20-rc, but we've still some of
>> them pending. This patch series converts the remaining drivers to
>> blk-mq. The ones that support dual paths (like SCSI and DM) have
>> the non-mq path removed. At the end, legacy IO code and schedulers
>> are killed off.
>>
>> This patch series is on top of my for-linus branch. It can also
>> be bound in my mq-conversions branch.
>>
>>  Documentation/block/biodoc.txt         |   88 -
>>  Documentation/block/cfq-iosched.txt    |  291 --
>>  Documentation/scsi/scsi-parameters.txt |    5 -
>>  block/Kconfig                          |    6 -
>>  block/Kconfig.iosched                  |   61 -
>>  block/Makefile                         |    5 +-
>>  block/bfq-iosched.c                    |    1 -
>>  block/blk-cgroup.c                     |   55 -
>>  block/blk-core.c                       | 1860 +-----------
>>  block/blk-exec.c                       |   20 +-
>>  block/blk-flush.c                      |  154 +-
>>  block/blk-ioc.c                        |   46 +-
>>  block/blk-merge.c                      |   35 +-
>>  block/blk-mq-debugfs.c                 |    2 -
>>  block/blk-mq-tag.c                     |    6 +-
>>  block/blk-mq.c                         |   13 +-
>>  block/blk-settings.c                   |   49 -
>>  block/blk-softirq.c                    |   20 -
>>  block/blk-sysfs.c                      |   39 +-
>>  block/blk-tag.c                        |  378 ---
>>  block/blk-timeout.c                    |   99 +-
>>  block/blk-wbt.c                        |    3 +-
>>  block/blk.h                            |   60 +-
>>  block/bsg-lib.c                        |  131 +-
>>  block/cfq-iosched.c                    | 4916 --------------------------------
>>  block/deadline-iosched.c               |  560 ----
>>  block/elevator.c                       |  447 +--
>>  block/kyber-iosched.c                  |    1 -
>>  block/mq-deadline.c                    |    1 -
>>  block/noop-iosched.c                   |  124 -
>>  drivers/block/sunvdc.c                 |  149 +-
>>  drivers/ide/ide-atapi.c                |   25 +-
>>  drivers/ide/ide-cd.c                   |  175 +-
>>  drivers/ide/ide-disk.c                 |    5 +-
>>  drivers/ide/ide-io.c                   |  101 +-
>>  drivers/ide/ide-park.c                 |    4 +-
>>  drivers/ide/ide-pm.c                   |   28 +-
>>  drivers/ide/ide-probe.c                |   68 +-
>>  drivers/infiniband/ulp/srp/ib_srp.c    |    7 -
>>  drivers/md/Kconfig                     |   11 -
>>  drivers/md/dm-core.h                   |   10 -
>>  drivers/md/dm-mpath.c                  |   18 +-
>>  drivers/md/dm-rq.c                     |  293 +-
>>  drivers/md/dm-rq.h                     |    4 -
>>  drivers/md/dm-sysfs.c                  |    3 +-
>>  drivers/md/dm-table.c                  |   36 +-
>>  drivers/md/dm.c                        |   21 +-
>>  drivers/md/dm.h                        |    1 -
>>  drivers/memstick/core/ms_block.c       |  110 +-
>>  drivers/memstick/core/ms_block.h       |    1 +
>>  drivers/memstick/core/mspro_block.c    |  121 +-
>>  drivers/s390/block/dasd_ioctl.c        |   22 +-
>>  drivers/scsi/Kconfig                   |   12 -
>>  drivers/scsi/cxlflash/main.c           |    6 -
>>  drivers/scsi/hosts.c                   |   29 +-
>>  drivers/scsi/lpfc/lpfc_scsi.c          |    2 +-
>>  drivers/scsi/osd/osd_initiator.c       |    4 +-
>>  drivers/scsi/osst.c                    |    2 +-
>>  drivers/scsi/qedi/qedi_main.c          |    3 +-
>>  drivers/scsi/qla2xxx/qla_os.c          |   30 +-
>>  drivers/scsi/scsi.c                    |    5 +-
>>  drivers/scsi/scsi_debug.c              |    3 +-
>>  drivers/scsi/scsi_error.c              |    4 +-
>>  drivers/scsi/scsi_lib.c                |  624 +---
>>  drivers/scsi/scsi_priv.h               |    1 -
>>  drivers/scsi/scsi_scan.c               |   10 +-
>>  drivers/scsi/scsi_sysfs.c              |    8 +-
>>  drivers/scsi/scsi_transport_fc.c       |   72 +-
>>  drivers/scsi/scsi_transport_iscsi.c    |    9 +-
>>  drivers/scsi/scsi_transport_sas.c      |   10 +-
>>  drivers/scsi/sg.c                      |    2 +-
>>  drivers/scsi/st.c                      |    2 +-
>>  drivers/scsi/ufs/ufshcd.c              |    6 -
>>  drivers/target/target_core_pscsi.c     |    2 +-
>>  include/linux/blk-cgroup.h             |  108 -
>>  include/linux/blkdev.h                 |  174 +-
>>  include/linux/bsg-lib.h                |    3 +-
>>  include/linux/elevator.h               |   90 +-
>>  include/linux/ide.h                    |   13 +-
>>  include/linux/init.h                   |    1 -
>>  include/scsi/scsi_host.h               |   18 +-
>>  include/scsi/scsi_tcq.h                |   14 +-
>>  init/do_mounts_initrd.c                |    3 -
>>  init/initramfs.c                       |    6 -
>>  init/main.c                            |   12 -
>>  85 files changed, 833 insertions(+), 11144 deletions(-)
> 
> Hi Jens,
> 
> Kernel oops[2] is triggered when running elevator switch stress test[1].
> 
> [1] https://people.redhat.com/~minlei/tests/tools/elv-switch
> 
> [2] kernel oops
> 
> [  399.702365] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
> [  399.703603] sd 8:0:0:0: Power-on or device reset occurred
> [  399.709833] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
> [  399.711745] sd 8:0:0:0: [sdd] Write Protect is off
> [  399.712395] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
> [  399.715119] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [  399.803561] sd 8:0:0:0: [sdd] Attached SCSI disk
> [  399.844825] sd 8:0:0:0: [sdd] Synchronizing SCSI cache
> [  399.968945] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
> [  399.970077] sd 8:0:0:0: Power-on or device reset occurred
> [  399.972904] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
> [  399.974823] sd 8:0:0:0: [sdd] Write Protect is off
> [  399.975411] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
> [  399.978052] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [  400.030588] BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
> [  400.032195] PGD 0 P4D 0
> [  400.032482] Oops: 0000 [#1] PREEMPT SMP PTI
> [  400.032945] CPU: 2 PID: 318 Comm: kworker/u8:5 Not tainted 4.19.0_0ffdea6679a9_mq-conversions+ #1
> [  400.033927] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
> [  400.034915] Workqueue: events_unbound async_run_entry_fn
> [  400.035536] RIP: 0010:bfq_reset_rate_computation+0x52/0x85
> [  400.036177] Code: 00 48 89 83 b0 00 00 00 8b 45 20 c1 e8 09 89 83 d8 00 00 00 48 89 83 d0 00 00 00 eb 0a c7 87 c8 00 00 00 00 00 00 00 48 8b 03 <48> 8b b8 00 05 00 00 48 85 ff 74 24 8b 8b c8 00 00 00 4c 8b 8b d0
> [  400.038300] RSP: 0018:ffffc90000a47c50 EFLAGS: 00010046
> [  400.038915] RAX: 0000000000000000 RBX: ffff88017ac2bc00 RCX: 0000000017d6067a
> [  400.039746] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88017ac2bc00
> [  400.040573] RBP: ffff88017ac2bc00 R08: 00000000f461e6ac R09: 0000000000000006
> [  400.041361] R10: 000000000000020f R11: 0000000000000000 R12: ffff88016c130000
> [  400.042127] R13: 0000005d1c095013 R14: ffff88017ac2bf78 R15: 0000000000000020
> [  400.042957] FS:  0000000000000000(0000) GS:ffff88017bb00000(0000) knlGS:0000000000000000
> [  400.043895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  400.044518] CR2: 0000000000000500 CR3: 0000000179c32004 CR4: 0000000000760ee0
> [  400.045294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  400.046070] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  400.046843] PKRU: 55555554
> [  400.047144] Call Trace:
> [  400.047438]  bfq_finish_requeue_request+0x145/0x2fd
> [  400.047977]  blk_mq_free_request+0x49/0xea
> [  400.048429]  __scsi_execute+0x16c/0x17b
> [  400.048885]  scsi_mode_sense+0x113/0x265
> [  400.049356]  sd_revalidate_disk+0xb5b/0x109a [sd_mod]
> [  400.049947]  ? __device_add_disk+0x3d2/0x3f6
> [  400.050415]  sd_probe_async+0x10d/0x18b [sd_mod]
> [  400.050924]  async_run_entry_fn+0x6d/0x12b
> [  400.051374]  process_one_work+0x1da/0x313
> [  400.051841]  ? rescuer_thread+0x282/0x282
> [  400.052313]  worker_thread+0x1ca/0x295
> [  400.052769]  kthread+0x115/0x11d
> [  400.053127]  ? kthread_park+0x76/0x76
> [  400.053531]  ret_from_fork+0x35/0x40
> [  400.053929] Modules linked in: scsi_debug null_blk isofs iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core lpc_ich mfd_core ip_tables sr_mod cdrom usb_storage sd_mod ahci libahci crc32c_intel libata virtio_scsi qemu_fw_cfg dm_mirror dm_region_hash dm_log dm_mod [last unloaded: null_blk]
> [  400.056640] Dumping ftrace buffer:
> [  400.057022]    (ftrace buffer empty)
> [  400.057418] CR2: 0000000000000500
> [  400.057820] ---[ end trace 812cbc0fe8802fdd ]---

Doesn't look related to the series, more like a bug in BFQ. Are you
sure it passes on current -git? Or maybe the bug is exposed by the
series.

I'll try and reproduce it here and see what happens.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 16/28] blk-cgroup: remove legacy queue bypassing
  2018-10-29 14:25       ` Johannes Thumshirn
@ 2018-10-29 14:59         ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 14:59 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-block, linux-scsi, linux-ide

On 10/29/18 8:25 AM, Johannes Thumshirn wrote:
> On 29/10/18 15:23, Jens Axboe wrote:
>> On 10/29/18 5:00 AM, Johannes Thumshirn wrote:
>>> Hi Jens,
>>>
>>> On 25/10/18 23:10, Jens Axboe wrote:
>>> [...]
>>>> @@ -1487,8 +1485,6 @@ int blkcg_activate_policy(struct request_queue *q,
>>>>  out_bypass_end:
>>>>  	if (q->mq_ops)
>>>>  		blk_mq_unfreeze_queue(q);
>>>> -	else
>>>> -		blk_queue_bypass_end(q);
>>>
>>>>  
>>>>
>>>
>>> Now that we only have mq, do we still need all these checks for q->mq_ops?
>>
>> We need it for the cases where we can be passing in a stacked driver
>> or an mq driver, or for cases where we explicitly register mq attributes,
>> and that sort of thing.
>>
> 
> Ah ok. Guess we'll have to live with it then.

At this point the check should be equivalent to queue_is_rq_based(), fwiw.
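
For reference, the helper currently reads roughly as below, so once
->request_fn is gone it does collapse to a plain mq_ops test
(approximate, from memory):

static inline bool queue_is_rq_based(struct request_queue *q)
{
	return q->request_fn || q->mq_ops;
}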

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-29 14:50   ` Jens Axboe
@ 2018-10-29 15:04     ` Jens Axboe
  2018-10-30  9:41       ` Ming Lei
  2018-10-29 15:05     ` Ming Lei
  1 sibling, 1 reply; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 15:04 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, linux-scsi, linux-ide, Paolo Valente

On 10/29/18 8:50 AM, Jens Axboe wrote:
> On 10/29/18 6:00 AM, Ming Lei wrote:
>> On Thu, Oct 25, 2018 at 03:10:11PM -0600, Jens Axboe wrote:
>>> The first round of this went into 4.20-rc, but we've still some of
>>> them pending. This patch series converts the remaining drivers to
>>> blk-mq. The ones that support dual paths (like SCSI and DM) have
>>> the non-mq path removed. At the end, legacy IO code and schedulers
>>> are killed off.
>>>
>>> This patch series is on top of my for-linus branch. It can also
>>> be bound in my mq-conversions branch.
>>>
>>>  Documentation/block/biodoc.txt         |   88 -
>>>  Documentation/block/cfq-iosched.txt    |  291 --
>>>  Documentation/scsi/scsi-parameters.txt |    5 -
>>>  block/Kconfig                          |    6 -
>>>  block/Kconfig.iosched                  |   61 -
>>>  block/Makefile                         |    5 +-
>>>  block/bfq-iosched.c                    |    1 -
>>>  block/blk-cgroup.c                     |   55 -
>>>  block/blk-core.c                       | 1860 +-----------
>>>  block/blk-exec.c                       |   20 +-
>>>  block/blk-flush.c                      |  154 +-
>>>  block/blk-ioc.c                        |   46 +-
>>>  block/blk-merge.c                      |   35 +-
>>>  block/blk-mq-debugfs.c                 |    2 -
>>>  block/blk-mq-tag.c                     |    6 +-
>>>  block/blk-mq.c                         |   13 +-
>>>  block/blk-settings.c                   |   49 -
>>>  block/blk-softirq.c                    |   20 -
>>>  block/blk-sysfs.c                      |   39 +-
>>>  block/blk-tag.c                        |  378 ---
>>>  block/blk-timeout.c                    |   99 +-
>>>  block/blk-wbt.c                        |    3 +-
>>>  block/blk.h                            |   60 +-
>>>  block/bsg-lib.c                        |  131 +-
>>>  block/cfq-iosched.c                    | 4916 --------------------------------
>>>  block/deadline-iosched.c               |  560 ----
>>>  block/elevator.c                       |  447 +--
>>>  block/kyber-iosched.c                  |    1 -
>>>  block/mq-deadline.c                    |    1 -
>>>  block/noop-iosched.c                   |  124 -
>>>  drivers/block/sunvdc.c                 |  149 +-
>>>  drivers/ide/ide-atapi.c                |   25 +-
>>>  drivers/ide/ide-cd.c                   |  175 +-
>>>  drivers/ide/ide-disk.c                 |    5 +-
>>>  drivers/ide/ide-io.c                   |  101 +-
>>>  drivers/ide/ide-park.c                 |    4 +-
>>>  drivers/ide/ide-pm.c                   |   28 +-
>>>  drivers/ide/ide-probe.c                |   68 +-
>>>  drivers/infiniband/ulp/srp/ib_srp.c    |    7 -
>>>  drivers/md/Kconfig                     |   11 -
>>>  drivers/md/dm-core.h                   |   10 -
>>>  drivers/md/dm-mpath.c                  |   18 +-
>>>  drivers/md/dm-rq.c                     |  293 +-
>>>  drivers/md/dm-rq.h                     |    4 -
>>>  drivers/md/dm-sysfs.c                  |    3 +-
>>>  drivers/md/dm-table.c                  |   36 +-
>>>  drivers/md/dm.c                        |   21 +-
>>>  drivers/md/dm.h                        |    1 -
>>>  drivers/memstick/core/ms_block.c       |  110 +-
>>>  drivers/memstick/core/ms_block.h       |    1 +
>>>  drivers/memstick/core/mspro_block.c    |  121 +-
>>>  drivers/s390/block/dasd_ioctl.c        |   22 +-
>>>  drivers/scsi/Kconfig                   |   12 -
>>>  drivers/scsi/cxlflash/main.c           |    6 -
>>>  drivers/scsi/hosts.c                   |   29 +-
>>>  drivers/scsi/lpfc/lpfc_scsi.c          |    2 +-
>>>  drivers/scsi/osd/osd_initiator.c       |    4 +-
>>>  drivers/scsi/osst.c                    |    2 +-
>>>  drivers/scsi/qedi/qedi_main.c          |    3 +-
>>>  drivers/scsi/qla2xxx/qla_os.c          |   30 +-
>>>  drivers/scsi/scsi.c                    |    5 +-
>>>  drivers/scsi/scsi_debug.c              |    3 +-
>>>  drivers/scsi/scsi_error.c              |    4 +-
>>>  drivers/scsi/scsi_lib.c                |  624 +---
>>>  drivers/scsi/scsi_priv.h               |    1 -
>>>  drivers/scsi/scsi_scan.c               |   10 +-
>>>  drivers/scsi/scsi_sysfs.c              |    8 +-
>>>  drivers/scsi/scsi_transport_fc.c       |   72 +-
>>>  drivers/scsi/scsi_transport_iscsi.c    |    9 +-
>>>  drivers/scsi/scsi_transport_sas.c      |   10 +-
>>>  drivers/scsi/sg.c                      |    2 +-
>>>  drivers/scsi/st.c                      |    2 +-
>>>  drivers/scsi/ufs/ufshcd.c              |    6 -
>>>  drivers/target/target_core_pscsi.c     |    2 +-
>>>  include/linux/blk-cgroup.h             |  108 -
>>>  include/linux/blkdev.h                 |  174 +-
>>>  include/linux/bsg-lib.h                |    3 +-
>>>  include/linux/elevator.h               |   90 +-
>>>  include/linux/ide.h                    |   13 +-
>>>  include/linux/init.h                   |    1 -
>>>  include/scsi/scsi_host.h               |   18 +-
>>>  include/scsi/scsi_tcq.h                |   14 +-
>>>  init/do_mounts_initrd.c                |    3 -
>>>  init/initramfs.c                       |    6 -
>>>  init/main.c                            |   12 -
>>>  85 files changed, 833 insertions(+), 11144 deletions(-)
>>
>> Hi Jens,
>>
>> Kernel oops[2] is triggered when running elevator switch stress test[1].
>>
>> [1] https://people.redhat.com/~minlei/tests/tools/elv-switch
>>
>> [2] kernel oops
>>
>> [  399.702365] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
>> [  399.703603] sd 8:0:0:0: Power-on or device reset occurred
>> [  399.709833] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
>> [  399.711745] sd 8:0:0:0: [sdd] Write Protect is off
>> [  399.712395] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
>> [  399.715119] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
>> [  399.803561] sd 8:0:0:0: [sdd] Attached SCSI disk
>> [  399.844825] sd 8:0:0:0: [sdd] Synchronizing SCSI cache
>> [  399.968945] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
>> [  399.970077] sd 8:0:0:0: Power-on or device reset occurred
>> [  399.972904] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
>> [  399.974823] sd 8:0:0:0: [sdd] Write Protect is off
>> [  399.975411] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
>> [  399.978052] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
>> [  400.030588] BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
>> [  400.032195] PGD 0 P4D 0
>> [  400.032482] Oops: 0000 [#1] PREEMPT SMP PTI
>> [  400.032945] CPU: 2 PID: 318 Comm: kworker/u8:5 Not tainted 4.19.0_0ffdea6679a9_mq-conversions+ #1
>> [  400.033927] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
>> [  400.034915] Workqueue: events_unbound async_run_entry_fn
>> [  400.035536] RIP: 0010:bfq_reset_rate_computation+0x52/0x85
>> [  400.036177] Code: 00 48 89 83 b0 00 00 00 8b 45 20 c1 e8 09 89 83 d8 00 00 00 48 89 83 d0 00 00 00 eb 0a c7 87 c8 00 00 00 00 00 00 00 48 8b 03 <48> 8b b8 00 05 00 00 48 85 ff 74 24 8b 8b c8 00 00 00 4c 8b 8b d0
>> [  400.038300] RSP: 0018:ffffc90000a47c50 EFLAGS: 00010046
>> [  400.038915] RAX: 0000000000000000 RBX: ffff88017ac2bc00 RCX: 0000000017d6067a
>> [  400.039746] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88017ac2bc00
>> [  400.040573] RBP: ffff88017ac2bc00 R08: 00000000f461e6ac R09: 0000000000000006
>> [  400.041361] R10: 000000000000020f R11: 0000000000000000 R12: ffff88016c130000
>> [  400.042127] R13: 0000005d1c095013 R14: ffff88017ac2bf78 R15: 0000000000000020
>> [  400.042957] FS:  0000000000000000(0000) GS:ffff88017bb00000(0000) knlGS:0000000000000000
>> [  400.043895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  400.044518] CR2: 0000000000000500 CR3: 0000000179c32004 CR4: 0000000000760ee0
>> [  400.045294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [  400.046070] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [  400.046843] PKRU: 55555554
>> [  400.047144] Call Trace:
>> [  400.047438]  bfq_finish_requeue_request+0x145/0x2fd
>> [  400.047977]  blk_mq_free_request+0x49/0xea
>> [  400.048429]  __scsi_execute+0x16c/0x17b
>> [  400.048885]  scsi_mode_sense+0x113/0x265
>> [  400.049356]  sd_revalidate_disk+0xb5b/0x109a [sd_mod]
>> [  400.049947]  ? __device_add_disk+0x3d2/0x3f6
>> [  400.050415]  sd_probe_async+0x10d/0x18b [sd_mod]
>> [  400.050924]  async_run_entry_fn+0x6d/0x12b
>> [  400.051374]  process_one_work+0x1da/0x313
>> [  400.051841]  ? rescuer_thread+0x282/0x282
>> [  400.052313]  worker_thread+0x1ca/0x295
>> [  400.052769]  kthread+0x115/0x11d
>> [  400.053127]  ? kthread_park+0x76/0x76
>> [  400.053531]  ret_from_fork+0x35/0x40
>> [  400.053929] Modules linked in: scsi_debug null_blk isofs iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core lpc_ich mfd_core ip_tables sr_mod cdrom usb_storage sd_mod ahci libahci crc32c_intel libata virtio_scsi qemu_fw_cfg dm_mirror dm_region_hash dm_log dm_mod [last unloaded: null_blk]
>> [  400.056640] Dumping ftrace buffer:
>> [  400.057022]    (ftrace buffer empty)
>> [  400.057418] CR2: 0000000000000500
>> [  400.057820] ---[ end trace 812cbc0fe8802fdd ]---
> 
> Doesn't look related to the series, more like a bug in BFQ. Are you
> sure it passes on current -git? Or maybe the bug is exposed by the
> series.
> 
> I'll try and reproduce it here and see what happens.

I think I see the error; it is introduced by my series. This
incremental should help; I'm going to test it now.


diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index b8afdd6ec4b8..391128456aec 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -212,6 +212,21 @@ void exit_io_context(struct task_struct *task)
 	put_io_context_active(ioc);
 }
 
+static void __ioc_clear_queue(struct list_head *icq_list)
+{
+	unsigned long flags;
+
+	while (!list_empty(icq_list)) {
+		struct io_cq *icq = list_entry(icq_list->next,
+						struct io_cq, q_node);
+		struct io_context *ioc = icq->ioc;
+
+		spin_lock_irqsave(&ioc->lock, flags);
+		ioc_destroy_icq(icq);
+		spin_unlock_irqrestore(&ioc->lock, flags);
+	}
+}
+
 /**
  * ioc_clear_queue - break any ioc association with the specified queue
  * @q: request_queue being cleared
@@ -225,6 +240,8 @@ void ioc_clear_queue(struct request_queue *q)
 	spin_lock_irq(q->queue_lock);
 	list_splice_init(&q->icq_list, &icq_list);
 	spin_unlock_irq(q->queue_lock);
+
+	__ioc_clear_queue(&icq_list);
 }
 
 int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-29 14:50   ` Jens Axboe
  2018-10-29 15:04     ` Jens Axboe
@ 2018-10-29 15:05     ` Ming Lei
  1 sibling, 0 replies; 90+ messages in thread
From: Ming Lei @ 2018-10-29 15:05 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-ide, Paolo Valente

On Mon, Oct 29, 2018 at 08:50:25AM -0600, Jens Axboe wrote:
> On 10/29/18 6:00 AM, Ming Lei wrote:
> > On Thu, Oct 25, 2018 at 03:10:11PM -0600, Jens Axboe wrote:
> >> The first round of this went into 4.20-rc, but we've still some of
> >> them pending. This patch series converts the remaining drivers to
> >> blk-mq. The ones that support dual paths (like SCSI and DM) have
> >> the non-mq path removed. At the end, legacy IO code and schedulers
> >> are killed off.
> >>
> >> This patch series is on top of my for-linus branch. It can also
> >> be bound in my mq-conversions branch.
> >>
> >>  Documentation/block/biodoc.txt         |   88 -
> >>  Documentation/block/cfq-iosched.txt    |  291 --
> >>  Documentation/scsi/scsi-parameters.txt |    5 -
> >>  block/Kconfig                          |    6 -
> >>  block/Kconfig.iosched                  |   61 -
> >>  block/Makefile                         |    5 +-
> >>  block/bfq-iosched.c                    |    1 -
> >>  block/blk-cgroup.c                     |   55 -
> >>  block/blk-core.c                       | 1860 +-----------
> >>  block/blk-exec.c                       |   20 +-
> >>  block/blk-flush.c                      |  154 +-
> >>  block/blk-ioc.c                        |   46 +-
> >>  block/blk-merge.c                      |   35 +-
> >>  block/blk-mq-debugfs.c                 |    2 -
> >>  block/blk-mq-tag.c                     |    6 +-
> >>  block/blk-mq.c                         |   13 +-
> >>  block/blk-settings.c                   |   49 -
> >>  block/blk-softirq.c                    |   20 -
> >>  block/blk-sysfs.c                      |   39 +-
> >>  block/blk-tag.c                        |  378 ---
> >>  block/blk-timeout.c                    |   99 +-
> >>  block/blk-wbt.c                        |    3 +-
> >>  block/blk.h                            |   60 +-
> >>  block/bsg-lib.c                        |  131 +-
> >>  block/cfq-iosched.c                    | 4916 --------------------------------
> >>  block/deadline-iosched.c               |  560 ----
> >>  block/elevator.c                       |  447 +--
> >>  block/kyber-iosched.c                  |    1 -
> >>  block/mq-deadline.c                    |    1 -
> >>  block/noop-iosched.c                   |  124 -
> >>  drivers/block/sunvdc.c                 |  149 +-
> >>  drivers/ide/ide-atapi.c                |   25 +-
> >>  drivers/ide/ide-cd.c                   |  175 +-
> >>  drivers/ide/ide-disk.c                 |    5 +-
> >>  drivers/ide/ide-io.c                   |  101 +-
> >>  drivers/ide/ide-park.c                 |    4 +-
> >>  drivers/ide/ide-pm.c                   |   28 +-
> >>  drivers/ide/ide-probe.c                |   68 +-
> >>  drivers/infiniband/ulp/srp/ib_srp.c    |    7 -
> >>  drivers/md/Kconfig                     |   11 -
> >>  drivers/md/dm-core.h                   |   10 -
> >>  drivers/md/dm-mpath.c                  |   18 +-
> >>  drivers/md/dm-rq.c                     |  293 +-
> >>  drivers/md/dm-rq.h                     |    4 -
> >>  drivers/md/dm-sysfs.c                  |    3 +-
> >>  drivers/md/dm-table.c                  |   36 +-
> >>  drivers/md/dm.c                        |   21 +-
> >>  drivers/md/dm.h                        |    1 -
> >>  drivers/memstick/core/ms_block.c       |  110 +-
> >>  drivers/memstick/core/ms_block.h       |    1 +
> >>  drivers/memstick/core/mspro_block.c    |  121 +-
> >>  drivers/s390/block/dasd_ioctl.c        |   22 +-
> >>  drivers/scsi/Kconfig                   |   12 -
> >>  drivers/scsi/cxlflash/main.c           |    6 -
> >>  drivers/scsi/hosts.c                   |   29 +-
> >>  drivers/scsi/lpfc/lpfc_scsi.c          |    2 +-
> >>  drivers/scsi/osd/osd_initiator.c       |    4 +-
> >>  drivers/scsi/osst.c                    |    2 +-
> >>  drivers/scsi/qedi/qedi_main.c          |    3 +-
> >>  drivers/scsi/qla2xxx/qla_os.c          |   30 +-
> >>  drivers/scsi/scsi.c                    |    5 +-
> >>  drivers/scsi/scsi_debug.c              |    3 +-
> >>  drivers/scsi/scsi_error.c              |    4 +-
> >>  drivers/scsi/scsi_lib.c                |  624 +---
> >>  drivers/scsi/scsi_priv.h               |    1 -
> >>  drivers/scsi/scsi_scan.c               |   10 +-
> >>  drivers/scsi/scsi_sysfs.c              |    8 +-
> >>  drivers/scsi/scsi_transport_fc.c       |   72 +-
> >>  drivers/scsi/scsi_transport_iscsi.c    |    9 +-
> >>  drivers/scsi/scsi_transport_sas.c      |   10 +-
> >>  drivers/scsi/sg.c                      |    2 +-
> >>  drivers/scsi/st.c                      |    2 +-
> >>  drivers/scsi/ufs/ufshcd.c              |    6 -
> >>  drivers/target/target_core_pscsi.c     |    2 +-
> >>  include/linux/blk-cgroup.h             |  108 -
> >>  include/linux/blkdev.h                 |  174 +-
> >>  include/linux/bsg-lib.h                |    3 +-
> >>  include/linux/elevator.h               |   90 +-
> >>  include/linux/ide.h                    |   13 +-
> >>  include/linux/init.h                   |    1 -
> >>  include/scsi/scsi_host.h               |   18 +-
> >>  include/scsi/scsi_tcq.h                |   14 +-
> >>  init/do_mounts_initrd.c                |    3 -
> >>  init/initramfs.c                       |    6 -
> >>  init/main.c                            |   12 -
> >>  85 files changed, 833 insertions(+), 11144 deletions(-)
> > 
> > Hi Jens,
> > 
> > Kernel oops[2] is triggered when running elevator switch stress test[1].
> > 
> > [1] https://people.redhat.com/~minlei/tests/tools/elv-switch
> > 
> > [2] kernel oops
> > 
> > [  399.702365] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
> > [  399.703603] sd 8:0:0:0: Power-on or device reset occurred
> > [  399.709833] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
> > [  399.711745] sd 8:0:0:0: [sdd] Write Protect is off
> > [  399.712395] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
> > [  399.715119] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> > [  399.803561] sd 8:0:0:0: [sdd] Attached SCSI disk
> > [  399.844825] sd 8:0:0:0: [sdd] Synchronizing SCSI cache
> > [  399.968945] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
> > [  399.970077] sd 8:0:0:0: Power-on or device reset occurred
> > [  399.972904] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
> > [  399.974823] sd 8:0:0:0: [sdd] Write Protect is off
> > [  399.975411] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
> > [  399.978052] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> > [  400.030588] BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
> > [  400.032195] PGD 0 P4D 0
> > [  400.032482] Oops: 0000 [#1] PREEMPT SMP PTI
> > [  400.032945] CPU: 2 PID: 318 Comm: kworker/u8:5 Not tainted 4.19.0_0ffdea6679a9_mq-conversions+ #1
> > [  400.033927] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
> > [  400.034915] Workqueue: events_unbound async_run_entry_fn
> > [  400.035536] RIP: 0010:bfq_reset_rate_computation+0x52/0x85
> > [  400.036177] Code: 00 48 89 83 b0 00 00 00 8b 45 20 c1 e8 09 89 83 d8 00 00 00 48 89 83 d0 00 00 00 eb 0a c7 87 c8 00 00 00 00 00 00 00 48 8b 03 <48> 8b b8 00 05 00 00 48 85 ff 74 24 8b 8b c8 00 00 00 4c 8b 8b d0
> > [  400.038300] RSP: 0018:ffffc90000a47c50 EFLAGS: 00010046
> > [  400.038915] RAX: 0000000000000000 RBX: ffff88017ac2bc00 RCX: 0000000017d6067a
> > [  400.039746] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88017ac2bc00
> > [  400.040573] RBP: ffff88017ac2bc00 R08: 00000000f461e6ac R09: 0000000000000006
> > [  400.041361] R10: 000000000000020f R11: 0000000000000000 R12: ffff88016c130000
> > [  400.042127] R13: 0000005d1c095013 R14: ffff88017ac2bf78 R15: 0000000000000020
> > [  400.042957] FS:  0000000000000000(0000) GS:ffff88017bb00000(0000) knlGS:0000000000000000
> > [  400.043895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  400.044518] CR2: 0000000000000500 CR3: 0000000179c32004 CR4: 0000000000760ee0
> > [  400.045294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  400.046070] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  400.046843] PKRU: 55555554
> > [  400.047144] Call Trace:
> > [  400.047438]  bfq_finish_requeue_request+0x145/0x2fd
> > [  400.047977]  blk_mq_free_request+0x49/0xea
> > [  400.048429]  __scsi_execute+0x16c/0x17b
> > [  400.048885]  scsi_mode_sense+0x113/0x265
> > [  400.049356]  sd_revalidate_disk+0xb5b/0x109a [sd_mod]
> > [  400.049947]  ? __device_add_disk+0x3d2/0x3f6
> > [  400.050415]  sd_probe_async+0x10d/0x18b [sd_mod]
> > [  400.050924]  async_run_entry_fn+0x6d/0x12b
> > [  400.051374]  process_one_work+0x1da/0x313
> > [  400.051841]  ? rescuer_thread+0x282/0x282
> > [  400.052313]  worker_thread+0x1ca/0x295
> > [  400.052769]  kthread+0x115/0x11d
> > [  400.053127]  ? kthread_park+0x76/0x76
> > [  400.053531]  ret_from_fork+0x35/0x40
> > [  400.053929] Modules linked in: scsi_debug null_blk isofs iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core lpc_ich mfd_core ip_tables sr_mod cdrom usb_storage sd_mod ahci libahci crc32c_intel libata virtio_scsi qemu_fw_cfg dm_mirror dm_region_hash dm_log dm_mod [last unloaded: null_blk]
> > [  400.056640] Dumping ftrace buffer:
> > [  400.057022]    (ftrace buffer empty)
> > [  400.057418] CR2: 0000000000000500
> > [  400.057820] ---[ end trace 812cbc0fe8802fdd ]---
> 
> Doesn't look related to the series, more like a bug in BFQ. Are you
> sure it passes on current -git? Or maybe the bug is exposed by the
> series.
> 
> I'll try and reproduce it here and see what happens.

It looks like it is only observed on mq-conversions, and I don't see it
on for-linus; I will double check it tomorrow.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 23/28] block: kill lld busy
  2018-10-29 14:25     ` Jens Axboe
@ 2018-10-29 15:51       ` Mike Snitzer
  2018-10-29 16:26         ` Jens Axboe
  0 siblings, 1 reply; 90+ messages in thread
From: Mike Snitzer @ 2018-10-29 15:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Hannes Reinecke, linux-block, linux-scsi, linux-ide

On Mon, Oct 29 2018 at 10:25am -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 10/29/18 1:10 AM, Hannes Reinecke wrote:
> > On 10/25/18 11:10 PM, Jens Axboe wrote:
> >> Nobody sets the helper, so we always return 0. Kill it.
> >>
> >> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> >> ---
> >>   block/blk-core.c       | 28 ----------------------------
> >>   block/blk-settings.c   |  6 ------
> >>   drivers/md/dm-mpath.c  |  4 +---
> >>   include/linux/blkdev.h |  4 ----
> >>   4 files changed, 1 insertion(+), 41 deletions(-)
> >>
> >> diff --git a/block/blk-core.c b/block/blk-core.c
> >> index 4c39c7865f9c..dd1328f4dc31 100644
> >> --- a/block/blk-core.c
> >> +++ b/block/blk-core.c
> >> @@ -1795,34 +1795,6 @@ void rq_flush_dcache_pages(struct request *rq)
> >>   EXPORT_SYMBOL_GPL(rq_flush_dcache_pages);
> >>   #endif
> >>   
> >> -/**
> >> - * blk_lld_busy - Check if underlying low-level drivers of a device are busy
> >> - * @q : the queue of the device being checked
> >> - *
> >> - * Description:
> >> - *    Check if underlying low-level drivers of a device are busy.
> >> - *    If the drivers want to export their busy state, they must set own
> >> - *    exporting function using blk_queue_lld_busy() first.
> >> - *
> >> - *    Basically, this function is used only by request stacking drivers
> >> - *    to stop dispatching requests to underlying devices when underlying
> >> - *    devices are busy.  This behavior helps more I/O merging on the queue
> >> - *    of the request stacking driver and prevents I/O throughput regression
> >> - *    on burst I/O load.
> >> - *
> >> - * Return:
> >> - *    0 - Not busy (The request stacking driver should dispatch request)
> >> - *    1 - Busy (The request stacking driver should stop dispatching request)
> >> - */
> >> -int blk_lld_busy(struct request_queue *q)
> >> -{
> >> -	if (q->lld_busy_fn)
> >> -		return q->lld_busy_fn(q);
> >> -
> >> -	return 0;
> >> -}
> >> -EXPORT_SYMBOL_GPL(blk_lld_busy);
> >> -
> >>   /**
> >>    * blk_rq_unprep_clone - Helper function to free all bios in a cloned request
> >>    * @rq: the clone request to be cleaned up
> >> diff --git a/block/blk-settings.c b/block/blk-settings.c
> >> index 3c5da75c2def..1895f499bbe5 100644
> >> --- a/block/blk-settings.c
> >> +++ b/block/blk-settings.c
> >> @@ -32,12 +32,6 @@ void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
> >>   }
> >>   EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
> >>   
> >> -void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn)
> >> -{
> >> -	q->lld_busy_fn = fn;
> >> -}
> >> -EXPORT_SYMBOL_GPL(blk_queue_lld_busy);
> >> -
> >>   /**
> >>    * blk_set_default_limits - reset limits to default values
> >>    * @lim:  the queue_limits structure to reset
> >> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> >> index a24ed3973e7c..4d736e0fd67f 100644
> >> --- a/drivers/md/dm-mpath.c
> >> +++ b/drivers/md/dm-mpath.c
> >> @@ -1936,9 +1936,7 @@ static int multipath_iterate_devices(struct dm_target *ti,
> >>   
> >>   static int pgpath_busy(struct pgpath *pgpath)
> >>   {
> >> -	struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev);
> >> -
> >> -	return blk_lld_busy(q);
> >> +	return 0;
> >>   }
> >>   
> >>   /*
> > Actually, I'm not quite sure this is correct; dm-mpath needs to return a 
> > busy status (via the '->busy' callback) to allow for back-pressure on 
> > the upper layers.
> > Just disabling the callback will disable this.
> > Shouldn't we check if the tagmap is busy here?
> 
> Mike? We can pretty easily make it just check for busy tags.

OK, please feel free to do so.

Happy to pick the patch up or provide review so you can stage it in your
4.21 oriented tree.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 23/28] block: kill lld busy
  2018-10-29 15:51       ` Mike Snitzer
@ 2018-10-29 16:26         ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-29 16:26 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Hannes Reinecke, linux-block, linux-scsi, linux-ide

On 10/29/18 9:51 AM, Mike Snitzer wrote:
> On Mon, Oct 29 2018 at 10:25am -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 10/29/18 1:10 AM, Hannes Reinecke wrote:
>>> On 10/25/18 11:10 PM, Jens Axboe wrote:
>>>> Nobody sets the helper, so we always return 0. Kill it.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>> ---
>>>>   block/blk-core.c       | 28 ----------------------------
>>>>   block/blk-settings.c   |  6 ------
>>>>   drivers/md/dm-mpath.c  |  4 +---
>>>>   include/linux/blkdev.h |  4 ----
>>>>   4 files changed, 1 insertion(+), 41 deletions(-)
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 4c39c7865f9c..dd1328f4dc31 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -1795,34 +1795,6 @@ void rq_flush_dcache_pages(struct request *rq)
>>>>   EXPORT_SYMBOL_GPL(rq_flush_dcache_pages);
>>>>   #endif
>>>>   
>>>> -/**
>>>> - * blk_lld_busy - Check if underlying low-level drivers of a device are busy
>>>> - * @q : the queue of the device being checked
>>>> - *
>>>> - * Description:
>>>> - *    Check if underlying low-level drivers of a device are busy.
>>>> - *    If the drivers want to export their busy state, they must set own
>>>> - *    exporting function using blk_queue_lld_busy() first.
>>>> - *
>>>> - *    Basically, this function is used only by request stacking drivers
>>>> - *    to stop dispatching requests to underlying devices when underlying
>>>> - *    devices are busy.  This behavior helps more I/O merging on the queue
>>>> - *    of the request stacking driver and prevents I/O throughput regression
>>>> - *    on burst I/O load.
>>>> - *
>>>> - * Return:
>>>> - *    0 - Not busy (The request stacking driver should dispatch request)
>>>> - *    1 - Busy (The request stacking driver should stop dispatching request)
>>>> - */
>>>> -int blk_lld_busy(struct request_queue *q)
>>>> -{
>>>> -	if (q->lld_busy_fn)
>>>> -		return q->lld_busy_fn(q);
>>>> -
>>>> -	return 0;
>>>> -}
>>>> -EXPORT_SYMBOL_GPL(blk_lld_busy);
>>>> -
>>>>   /**
>>>>    * blk_rq_unprep_clone - Helper function to free all bios in a cloned request
>>>>    * @rq: the clone request to be cleaned up
>>>> diff --git a/block/blk-settings.c b/block/blk-settings.c
>>>> index 3c5da75c2def..1895f499bbe5 100644
>>>> --- a/block/blk-settings.c
>>>> +++ b/block/blk-settings.c
>>>> @@ -32,12 +32,6 @@ void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
>>>>   
>>>> -void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn)
>>>> -{
>>>> -	q->lld_busy_fn = fn;
>>>> -}
>>>> -EXPORT_SYMBOL_GPL(blk_queue_lld_busy);
>>>> -
>>>>   /**
>>>>    * blk_set_default_limits - reset limits to default values
>>>>    * @lim:  the queue_limits structure to reset
>>>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>>>> index a24ed3973e7c..4d736e0fd67f 100644
>>>> --- a/drivers/md/dm-mpath.c
>>>> +++ b/drivers/md/dm-mpath.c
>>>> @@ -1936,9 +1936,7 @@ static int multipath_iterate_devices(struct dm_target *ti,
>>>>   
>>>>   static int pgpath_busy(struct pgpath *pgpath)
>>>>   {
>>>> -	struct request_queue *q = bdev_get_queue(pgpath->path.dev->bdev);
>>>> -
>>>> -	return blk_lld_busy(q);
>>>> +	return 0;
>>>>   }
>>>>   
>>>>   /*
>>> Actually, I'm not quite sure this is correct; dm-mpath needs to return a 
>>> busy status (via the '->busy' callback) to allow for back-pressure on 
>>> the upper layers.
>>> Just disabling the callback will disable this.
>>> Shouldn't we check if the tagmap is busy here?
>>
>> Mike? We can pretty easily make it just check for busy tags.
> 
> OK, please feel free to do so.
> 
> Happy to pick the patch up or provide review so you can stage it in your
> 4.21-oriented tree.

I reshuffled things a bit and made SCSI provide an mq variant of the
target busy check, so blk_lld_busy() remains, just with an mq handler
behind it instead. This means dm-mpath won't need any changes; I think
we're good.
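Roughly, that shape ends up looking like the sketch below. This is
illustrative only: the ->busy member in blk_mq_ops and the SCSI-side
check are assumptions made for the discussion, not the code that was
actually merged.

#include <linux/blkdev.h>
#include <scsi/scsi_device.h>
#include <scsi/scsi_host.h>

/*
 * Sketch: blk_lld_busy() stays, but instead of the removed
 * q->lld_busy_fn it dispatches to an (assumed) optional ->busy
 * callback in struct blk_mq_ops.
 */
int blk_lld_busy(struct request_queue *q)
{
	if (q->mq_ops && q->mq_ops->busy)	/* ->busy is the assumed hook */
		return q->mq_ops->busy(q);

	return 0;
}

/* SCSI could then export back-pressure from its own state, e.g.: */
static bool scsi_mq_lld_busy(struct request_queue *q)
{
	struct scsi_device *sdev = q->queuedata;

	/* assumed policy: report busy while the host is in recovery */
	return sdev && scsi_host_in_recovery(sdev->host);
}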

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-29 15:04     ` Jens Axboe
@ 2018-10-30  9:41       ` Ming Lei
  2018-10-30 14:13         ` Jens Axboe
  0 siblings, 1 reply; 90+ messages in thread
From: Ming Lei @ 2018-10-30  9:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-scsi, linux-ide, Paolo Valente

On Mon, Oct 29, 2018 at 09:04:33AM -0600, Jens Axboe wrote:
> On 10/29/18 8:50 AM, Jens Axboe wrote:
> > On 10/29/18 6:00 AM, Ming Lei wrote:
> >> On Thu, Oct 25, 2018 at 03:10:11PM -0600, Jens Axboe wrote:
> >>> The first round of this went into 4.20-rc, but we've still some of
> >>> them pending. This patch series converts the remaining drivers to
> >>> blk-mq. The ones that support dual paths (like SCSI and DM) have
> >>> the non-mq path removed. At the end, legacy IO code and schedulers
> >>> are killed off.
> >>>
> >>> This patch series is on top of my for-linus branch. It can also
> >>> be bound in my mq-conversions branch.
> >>>
> >>>  Documentation/block/biodoc.txt         |   88 -
> >>>  Documentation/block/cfq-iosched.txt    |  291 --
> >>>  Documentation/scsi/scsi-parameters.txt |    5 -
> >>>  block/Kconfig                          |    6 -
> >>>  block/Kconfig.iosched                  |   61 -
> >>>  block/Makefile                         |    5 +-
> >>>  block/bfq-iosched.c                    |    1 -
> >>>  block/blk-cgroup.c                     |   55 -
> >>>  block/blk-core.c                       | 1860 +-----------
> >>>  block/blk-exec.c                       |   20 +-
> >>>  block/blk-flush.c                      |  154 +-
> >>>  block/blk-ioc.c                        |   46 +-
> >>>  block/blk-merge.c                      |   35 +-
> >>>  block/blk-mq-debugfs.c                 |    2 -
> >>>  block/blk-mq-tag.c                     |    6 +-
> >>>  block/blk-mq.c                         |   13 +-
> >>>  block/blk-settings.c                   |   49 -
> >>>  block/blk-softirq.c                    |   20 -
> >>>  block/blk-sysfs.c                      |   39 +-
> >>>  block/blk-tag.c                        |  378 ---
> >>>  block/blk-timeout.c                    |   99 +-
> >>>  block/blk-wbt.c                        |    3 +-
> >>>  block/blk.h                            |   60 +-
> >>>  block/bsg-lib.c                        |  131 +-
> >>>  block/cfq-iosched.c                    | 4916 --------------------------------
> >>>  block/deadline-iosched.c               |  560 ----
> >>>  block/elevator.c                       |  447 +--
> >>>  block/kyber-iosched.c                  |    1 -
> >>>  block/mq-deadline.c                    |    1 -
> >>>  block/noop-iosched.c                   |  124 -
> >>>  drivers/block/sunvdc.c                 |  149 +-
> >>>  drivers/ide/ide-atapi.c                |   25 +-
> >>>  drivers/ide/ide-cd.c                   |  175 +-
> >>>  drivers/ide/ide-disk.c                 |    5 +-
> >>>  drivers/ide/ide-io.c                   |  101 +-
> >>>  drivers/ide/ide-park.c                 |    4 +-
> >>>  drivers/ide/ide-pm.c                   |   28 +-
> >>>  drivers/ide/ide-probe.c                |   68 +-
> >>>  drivers/infiniband/ulp/srp/ib_srp.c    |    7 -
> >>>  drivers/md/Kconfig                     |   11 -
> >>>  drivers/md/dm-core.h                   |   10 -
> >>>  drivers/md/dm-mpath.c                  |   18 +-
> >>>  drivers/md/dm-rq.c                     |  293 +-
> >>>  drivers/md/dm-rq.h                     |    4 -
> >>>  drivers/md/dm-sysfs.c                  |    3 +-
> >>>  drivers/md/dm-table.c                  |   36 +-
> >>>  drivers/md/dm.c                        |   21 +-
> >>>  drivers/md/dm.h                        |    1 -
> >>>  drivers/memstick/core/ms_block.c       |  110 +-
> >>>  drivers/memstick/core/ms_block.h       |    1 +
> >>>  drivers/memstick/core/mspro_block.c    |  121 +-
> >>>  drivers/s390/block/dasd_ioctl.c        |   22 +-
> >>>  drivers/scsi/Kconfig                   |   12 -
> >>>  drivers/scsi/cxlflash/main.c           |    6 -
> >>>  drivers/scsi/hosts.c                   |   29 +-
> >>>  drivers/scsi/lpfc/lpfc_scsi.c          |    2 +-
> >>>  drivers/scsi/osd/osd_initiator.c       |    4 +-
> >>>  drivers/scsi/osst.c                    |    2 +-
> >>>  drivers/scsi/qedi/qedi_main.c          |    3 +-
> >>>  drivers/scsi/qla2xxx/qla_os.c          |   30 +-
> >>>  drivers/scsi/scsi.c                    |    5 +-
> >>>  drivers/scsi/scsi_debug.c              |    3 +-
> >>>  drivers/scsi/scsi_error.c              |    4 +-
> >>>  drivers/scsi/scsi_lib.c                |  624 +---
> >>>  drivers/scsi/scsi_priv.h               |    1 -
> >>>  drivers/scsi/scsi_scan.c               |   10 +-
> >>>  drivers/scsi/scsi_sysfs.c              |    8 +-
> >>>  drivers/scsi/scsi_transport_fc.c       |   72 +-
> >>>  drivers/scsi/scsi_transport_iscsi.c    |    9 +-
> >>>  drivers/scsi/scsi_transport_sas.c      |   10 +-
> >>>  drivers/scsi/sg.c                      |    2 +-
> >>>  drivers/scsi/st.c                      |    2 +-
> >>>  drivers/scsi/ufs/ufshcd.c              |    6 -
> >>>  drivers/target/target_core_pscsi.c     |    2 +-
> >>>  include/linux/blk-cgroup.h             |  108 -
> >>>  include/linux/blkdev.h                 |  174 +-
> >>>  include/linux/bsg-lib.h                |    3 +-
> >>>  include/linux/elevator.h               |   90 +-
> >>>  include/linux/ide.h                    |   13 +-
> >>>  include/linux/init.h                   |    1 -
> >>>  include/scsi/scsi_host.h               |   18 +-
> >>>  include/scsi/scsi_tcq.h                |   14 +-
> >>>  init/do_mounts_initrd.c                |    3 -
> >>>  init/initramfs.c                       |    6 -
> >>>  init/main.c                            |   12 -
> >>>  85 files changed, 833 insertions(+), 11144 deletions(-)
> >>
> >> Hi Jens,
> >>
> >> Kernel oops[2] is triggered when running elevator switch stress test[1].
> >>
> >> [1] https://people.redhat.com/~minlei/tests/tools/elv-switch
> >>
> >> [2] kernel oops
> >>
> >> [  399.702365] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
> >> [  399.703603] sd 8:0:0:0: Power-on or device reset occurred
> >> [  399.709833] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
> >> [  399.711745] sd 8:0:0:0: [sdd] Write Protect is off
> >> [  399.712395] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
> >> [  399.715119] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> >> [  399.803561] sd 8:0:0:0: [sdd] Attached SCSI disk
> >> [  399.844825] sd 8:0:0:0: [sdd] Synchronizing SCSI cache
> >> [  399.968945] scsi 8:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
> >> [  399.970077] sd 8:0:0:0: Power-on or device reset occurred
> >> [  399.972904] sd 8:0:0:0: [sdd] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
> >> [  399.974823] sd 8:0:0:0: [sdd] Write Protect is off
> >> [  399.975411] sd 8:0:0:0: [sdd] Mode Sense: 73 00 10 08
> >> [  399.978052] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> >> [  400.030588] BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
> >> [  400.032195] PGD 0 P4D 0
> >> [  400.032482] Oops: 0000 [#1] PREEMPT SMP PTI
> >> [  400.032945] CPU: 2 PID: 318 Comm: kworker/u8:5 Not tainted 4.19.0_0ffdea6679a9_mq-conversions+ #1
> >> [  400.033927] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
> >> [  400.034915] Workqueue: events_unbound async_run_entry_fn
> >> [  400.035536] RIP: 0010:bfq_reset_rate_computation+0x52/0x85
> >> [  400.036177] Code: 00 48 89 83 b0 00 00 00 8b 45 20 c1 e8 09 89 83 d8 00 00 00 48 89 83 d0 00 00 00 eb 0a c7 87 c8 00 00 00 00 00 00 00 48 8b 03 <48> 8b b8 00 05 00 00 48 85 ff 74 24 8b 8b c8 00 00 00 4c 8b 8b d0
> >> [  400.038300] RSP: 0018:ffffc90000a47c50 EFLAGS: 00010046
> >> [  400.038915] RAX: 0000000000000000 RBX: ffff88017ac2bc00 RCX: 0000000017d6067a
> >> [  400.039746] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88017ac2bc00
> >> [  400.040573] RBP: ffff88017ac2bc00 R08: 00000000f461e6ac R09: 0000000000000006
> >> [  400.041361] R10: 000000000000020f R11: 0000000000000000 R12: ffff88016c130000
> >> [  400.042127] R13: 0000005d1c095013 R14: ffff88017ac2bf78 R15: 0000000000000020
> >> [  400.042957] FS:  0000000000000000(0000) GS:ffff88017bb00000(0000) knlGS:0000000000000000
> >> [  400.043895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [  400.044518] CR2: 0000000000000500 CR3: 0000000179c32004 CR4: 0000000000760ee0
> >> [  400.045294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [  400.046070] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> [  400.046843] PKRU: 55555554
> >> [  400.047144] Call Trace:
> >> [  400.047438]  bfq_finish_requeue_request+0x145/0x2fd
> >> [  400.047977]  blk_mq_free_request+0x49/0xea
> >> [  400.048429]  __scsi_execute+0x16c/0x17b
> >> [  400.048885]  scsi_mode_sense+0x113/0x265
> >> [  400.049356]  sd_revalidate_disk+0xb5b/0x109a [sd_mod]
> >> [  400.049947]  ? __device_add_disk+0x3d2/0x3f6
> >> [  400.050415]  sd_probe_async+0x10d/0x18b [sd_mod]
> >> [  400.050924]  async_run_entry_fn+0x6d/0x12b
> >> [  400.051374]  process_one_work+0x1da/0x313
> >> [  400.051841]  ? rescuer_thread+0x282/0x282
> >> [  400.052313]  worker_thread+0x1ca/0x295
> >> [  400.052769]  kthread+0x115/0x11d
> >> [  400.053127]  ? kthread_park+0x76/0x76
> >> [  400.053531]  ret_from_fork+0x35/0x40
> >> [  400.053929] Modules linked in: scsi_debug null_blk isofs iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core lpc_ich mfd_core ip_tables sr_mod cdrom usb_storage sd_mod ahci libahci crc32c_intel libata virtio_scsi qemu_fw_cfg dm_mirror dm_region_hash dm_log dm_mod [last unloaded: null_blk]
> >> [  400.056640] Dumping ftrace buffer:
> >> [  400.057022]    (ftrace buffer empty)
> >> [  400.057418] CR2: 0000000000000500
> >> [  400.057820] ---[ end trace 812cbc0fe8802fdd ]---
> > 
> > Doesn't look related to the series, more like a bug in BFQ. Are you
> > sure it passes on current -git? Or maybe the bug is exposed by the
> > series.
> > 
> > I'll try and reproduce it here and see what happens.
> 
> I think I see the error, it is introduced by my series. This incremental
> should help, I'm going to test it now.
> 
> 
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index b8afdd6ec4b8..391128456aec 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -212,6 +212,21 @@ void exit_io_context(struct task_struct *task)
>  	put_io_context_active(ioc);
>  }
>  
> +static void __ioc_clear_queue(struct list_head *icq_list)
> +{
> +	unsigned long flags;
> +
> +	while (!list_empty(icq_list)) {
> +		struct io_cq *icq = list_entry(icq_list->next,
> +						struct io_cq, q_node);
> +		struct io_context *ioc = icq->ioc;
> +
> +		spin_lock_irqsave(&ioc->lock, flags);
> +		ioc_destroy_icq(icq);
> +		spin_unlock_irqrestore(&ioc->lock, flags);
> +	}
> +}
> +
>  /**
>   * ioc_clear_queue - break any ioc association with the specified queue
>   * @q: request_queue being cleared
> @@ -225,6 +240,8 @@ void ioc_clear_queue(struct request_queue *q)
>  	spin_lock_irq(q->queue_lock);
>  	list_splice_init(&q->icq_list, &icq_list);
>  	spin_unlock_irq(q->queue_lock);
> +
> +	__ioc_clear_queue(&icq_list);
>  }
>  
>  int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)

I don't see this issue anymore on today's mq-conversions.

-- 
Ming
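
For anyone wanting to exercise the same code path locally, the elv-switch
test referenced above essentially hammers the sysfs scheduler attribute
while devices are probed and I/O is running. A minimal sketch of that
switching loop is below; the device name, scheduler list, and iteration
count are assumptions, and the linked script remains the authoritative
test.

/*
 * Repeatedly switch the I/O scheduler of one block device via sysfs.
 * Assumes the named schedulers are built in or their modules are
 * loaded, that /dev/sdd exists, and that this runs as root.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/sys/block/sdd/queue/scheduler";
	const char *elv[] = { "none", "mq-deadline", "kyber", "bfq" };
	unsigned int i;

	for (i = 0; i < 1000; i++) {
		const char *e = elv[i % 4];
		int fd = open(path, O_WRONLY);

		if (fd < 0) {
			perror("open scheduler attribute");
			return 1;
		}
		if (write(fd, e, strlen(e)) < 0)
			fprintf(stderr, "switch to %s failed\n", e);
		close(fd);
		usleep(10000);	/* leave some room for I/O in between */
	}
	return 0;
}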

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCHSET 0/28] blk-mq driver conversions and legacy path removal
  2018-10-30  9:41       ` Ming Lei
@ 2018-10-30 14:13         ` Jens Axboe
  0 siblings, 0 replies; 90+ messages in thread
From: Jens Axboe @ 2018-10-30 14:13 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, linux-scsi, linux-ide, Paolo Valente

On 10/30/18 3:41 AM, Ming Lei wrote:
> On Mon, Oct 29, 2018 at 09:04:33AM -0600, Jens Axboe wrote:
>> On 10/29/18 8:50 AM, Jens Axboe wrote:
>>> On 10/29/18 6:00 AM, Ming Lei wrote:
>>>> On Thu, Oct 25, 2018 at 03:10:11PM -0600, Jens Axboe wrote:
>>>>> The first round of this went into 4.20-rc, but we've still some of
>>>>> them pending. This patch series converts the remaining drivers to
>>>>> blk-mq. The ones that support dual paths (like SCSI and DM) have
>>>>> the non-mq path removed. At the end, legacy IO code and schedulers
>>>>> are killed off.
>>>>>
>>>>> This patch series is on top of my for-linus branch. It can also
>>>>> be bound in my mq-conversions branch.
>>>>>
> >>>>> [ diffstat snipped; identical to the one quoted earlier in the thread ]
>>>>
>>>> Hi Jens,
>>>>
>>>> Kernel oops[2] is triggered when running elevator switch stress test[1].
>>>>
>>>> [1] https://people.redhat.com/~minlei/tests/tools/elv-switch
>>>>
>>>> [2] kernel oops
>>>>
> >>>> [ oops trace snipped; identical to the log quoted earlier in the thread ]
>>>
>>> Doesn't look related to the series, more like a bug in BFQ. Are you
>>> sure it passes on current -git? Or maybe the bug is exposed by the
>>> series.
>>>
>>> I'll try and reproduce it here and see what happens.
>>
>> I think I see the error, it is introduced by my series. This incremental
>> should help, I'm going to test it now.
>>
>>
>> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
>> index b8afdd6ec4b8..391128456aec 100644
>> --- a/block/blk-ioc.c
>> +++ b/block/blk-ioc.c
>> @@ -212,6 +212,21 @@ void exit_io_context(struct task_struct *task)
>>  	put_io_context_active(ioc);
>>  }
>>  
>> +static void __ioc_clear_queue(struct list_head *icq_list)
>> +{
>> +	unsigned long flags;
>> +
>> +	while (!list_empty(icq_list)) {
>> +		struct io_cq *icq = list_entry(icq_list->next,
>> +						struct io_cq, q_node);
>> +		struct io_context *ioc = icq->ioc;
>> +
>> +		spin_lock_irqsave(&ioc->lock, flags);
>> +		ioc_destroy_icq(icq);
>> +		spin_unlock_irqrestore(&ioc->lock, flags);
>> +	}
>> +}
>> +
>>  /**
>>   * ioc_clear_queue - break any ioc association with the specified queue
>>   * @q: request_queue being cleared
>> @@ -225,6 +240,8 @@ void ioc_clear_queue(struct request_queue *q)
>>  	spin_lock_irq(q->queue_lock);
>>  	list_splice_init(&q->icq_list, &icq_list);
>>  	spin_unlock_irq(q->queue_lock);
>> +
>> +	__ioc_clear_queue(&icq_list);
>>  }
>>  
>>  int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
> 
> I don't see this issue anymore on today's mq-conversions.

Great, thanks for retesting.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 90+ messages in thread

end of thread, other threads:[~2018-10-30 14:13 UTC | newest]

Thread overview: 90+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-25 21:10 [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Jens Axboe
2018-10-25 21:10 ` [PATCH 01/28] sunvdc: convert to blk-mq Jens Axboe
2018-10-27 10:42   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 02/28] ms_block: " Jens Axboe
2018-10-27 10:43   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 03/28] mspro_block: " Jens Axboe
2018-10-27 10:44   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 04/28] ide: " Jens Axboe
2018-10-27 10:51   ` Hannes Reinecke
2018-10-27 16:51     ` Jens Axboe
2018-10-25 21:10 ` [PATCH 05/28] IB/srp: remove old request_fn_active check Jens Axboe
2018-10-25 21:23   ` Bart Van Assche
2018-10-25 21:24     ` Jens Axboe
2018-10-26  7:08   ` Hannes Reinecke
2018-10-26 14:32     ` Jens Axboe
2018-10-26 15:03       ` Bart Van Assche
2018-10-25 21:10 ` [PATCH 06/28] blk-mq: remove the request_list usage Jens Axboe
2018-10-27 10:52   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 07/28] blk-mq: remove legacy check in queue blk_freeze_queue() Jens Axboe
2018-10-27 10:52   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 08/28] scsi: kill off the legacy IO path Jens Axboe
2018-10-25 21:36   ` Bart Van Assche
2018-10-25 22:18     ` Jens Axboe
2018-10-25 22:44       ` Madhani, Himanshu
2018-10-25 23:00         ` Jens Axboe
2018-10-25 23:06           ` Madhani, Himanshu
2018-10-29  6:48   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 09/28] dm: remove " Jens Axboe
2018-10-29  6:53   ` Hannes Reinecke
2018-10-29 14:17     ` Jens Axboe
2018-10-25 21:10 ` [PATCH 10/28] dasd: remove dead code Jens Axboe
2018-10-29  6:54   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 11/28] bsg: pass in desired timeout handler Jens Axboe
2018-10-28 15:53   ` Christoph Hellwig
2018-10-28 23:05     ` Jens Axboe
2018-10-29  6:55   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 12/28] bsg: provide bsg_remove_queue() helper Jens Axboe
2018-10-28 15:53   ` Christoph Hellwig
2018-10-29  6:55   ` Hannes Reinecke
2018-10-29 10:16   ` Johannes Thumshirn
2018-10-29 14:15     ` Jens Axboe
2018-10-25 21:10 ` [PATCH 13/28] bsg: convert to use blk-mq Jens Axboe
2018-10-28 16:07   ` Christoph Hellwig
2018-10-28 23:25     ` Jens Axboe
2018-10-29  6:57   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 14/28] block: remove blk_complete_request() Jens Axboe
2018-10-29  6:59   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 15/28] blk-wbt: kill check for legacy queue type Jens Axboe
2018-10-29  6:59   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 16/28] blk-cgroup: remove legacy queue bypassing Jens Axboe
2018-10-29  7:00   ` Hannes Reinecke
2018-10-29 11:00   ` Johannes Thumshirn
2018-10-29 14:23     ` Jens Axboe
2018-10-29 14:25       ` Johannes Thumshirn
2018-10-29 14:59         ` Jens Axboe
2018-10-25 21:10 ` [PATCH 17/28] block: remove legacy rq tagging Jens Axboe
2018-10-29  7:01   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 18/28] block: remove non mq parts from the flush code Jens Axboe
2018-10-29  7:02   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 19/28] block: remove legacy IO schedulers Jens Axboe
2018-10-25 21:10 ` [PATCH 20/28] block: remove dead elevator code Jens Axboe
2018-10-25 21:10 ` [PATCH 21/28] block: remove __blk_put_request() Jens Axboe
2018-10-29  7:03   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 22/28] block: kill legacy parts of timeout handling Jens Axboe
2018-10-29  7:04   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 23/28] block: kill lld busy Jens Axboe
2018-10-25 21:42   ` Bart Van Assche
2018-10-25 22:18     ` Jens Axboe
2018-10-29  7:10   ` Hannes Reinecke
2018-10-29 14:25     ` Jens Axboe
2018-10-29 15:51       ` Mike Snitzer
2018-10-29 16:26         ` Jens Axboe
2018-10-25 21:10 ` [PATCH 24/28] block: remove request_list code Jens Axboe
2018-10-29  7:10   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 25/28] block: kill request slab cache Jens Axboe
2018-10-29  7:11   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 26/28] block: remove req_no_special_merge() from merging code Jens Axboe
2018-10-29  7:12   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 27/28] blk-merge: kill dead queue lock held check Jens Axboe
2018-10-29  7:12   ` Hannes Reinecke
2018-10-25 21:10 ` [PATCH 28/28] block: get rid of blk_queued_rq() Jens Axboe
2018-10-29  7:12   ` Hannes Reinecke
2018-10-25 23:09 ` [PATCHSET 0/28] blk-mq driver conversions and legacy path removal Bart Van Assche
2018-10-25 23:11   ` Jens Axboe
2018-10-29 12:00 ` Ming Lei
2018-10-29 14:50   ` Jens Axboe
2018-10-29 15:04     ` Jens Axboe
2018-10-30  9:41       ` Ming Lei
2018-10-30 14:13         ` Jens Axboe
2018-10-29 15:05     ` Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.