linux-mmc.vger.kernel.org archive mirror
* [PATCH v7 0/4] Add MMC software queue support
@ 2019-11-18 10:43 Baolin Wang
  2019-11-18 10:43 ` [PATCH v7 1/4] mmc: Add MMC host " Baolin Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Baolin Wang @ 2019-11-18 10:43 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, baolin.wang7, linux-mmc, linux-kernel

Hi All,

Currently the MMC read/write stack always waits for the previous request to
complete, via mmc_blk_rw_wait(), before sending a new request to the hardware,
or it queues a work item to complete the request. Both paths add context
switching overhead, which hurts IO performance, especially at high I/O per
second rates.

Thus this patch set introduces MMC software command queue support based on
the command queue engine's interfaces. It sets the queue depth to 32 so that
more requests can be prepared, merged and inserted into the IO scheduler, but
it allows only 2 requests in flight. That is enough to let the irq handler
always trigger the next request without a context switch, while avoiding a
long latency.
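
To make the depth-32 versus 2-in-flight split concrete, here is a minimal
user-space sketch. It is only a toy model under my own assumptions: the names
swq_model, swq_prepare(), swq_issue() and swq_complete() are hypothetical and
do not exist in the kernel, and the model ignores locking and error recovery.

```c
#include <assert.h>
#include <stdbool.h>

#define SWQ_DEPTH      32  /* requests the block layer may prepare */
#define SWQ_MAX_FLIGHT 2   /* requests handed to the host at once */

/* Hypothetical toy model; none of these names exist in the kernel. */
struct swq_model {
	int prepared;   /* requests holding a tag, in-flight ones included */
	int in_flight;  /* requests already issued to the host */
};

/* Take a tag for a new request; fails when all 32 tags are in use. */
static bool swq_prepare(struct swq_model *q)
{
	if (q->prepared >= SWQ_DEPTH)
		return false;
	q->prepared++;
	return true;
}

/* Issue one waiting request; fails at the in-flight limit or if none waits. */
static bool swq_issue(struct swq_model *q)
{
	if (q->in_flight >= SWQ_MAX_FLIGHT || q->prepared == q->in_flight)
		return false;
	q->in_flight++;
	return true;
}

/* Completion: release the tag, then try to issue the next request at once. */
static bool swq_complete(struct swq_model *q)
{
	assert(q->in_flight > 0);
	q->in_flight--;
	q->prepared--;
	return swq_issue(q);
}
```

In this model a completion always finds a second request already waiting, so
the next one can be started from the completion path itself, which is the
effect the 2-in-flight limit is meant to achieve.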

Moreover, we can expand the MMC software queue interface to support
MMC packed requests or packed commands instead of adding new interfaces,
according to the previous discussion.

Below is some comparison data collected with the fio tool. The fio command I
used is shown below; I changed the '--rw' parameter for each case and enabled
the direct IO flag to measure the actual hardware transfer speed with a 4K
block size.

./fio --filename=/dev/mmcblk0p30 --direct=1 --iodepth=20 --rw=read --bs=4K --size=1G --group_reporting --numjobs=20 --name=test_read

My eMMC card is working in HS400 Enhanced strobe mode:
[    2.229856] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[    2.237566] mmcblk0: mmc0:0001 HBG4a2 29.1 GiB 
[    2.242621] mmcblk0boot0: mmc0:0001 HBG4a2 partition 1 4.00 MiB
[    2.249110] mmcblk0boot1: mmc0:0001 HBG4a2 partition 2 4.00 MiB
[    2.255307] mmcblk0rpmb: mmc0:0001 HBG4a2 partition 3 4.00 MiB, chardev (248:0)

1. Without MMC software queue
I tested each case 5 times and report the average speed.

1) Sequential read:
Speed: 59.4MiB/s, 63.4MiB/s, 57.5MiB/s, 57.2MiB/s, 60.8MiB/s
Average speed: 59.66MiB/s

2) Random read:
Speed: 26.9MiB/s, 26.9MiB/s, 27.1MiB/s, 27.1MiB/s, 27.2MiB/s
Average speed: 27.04MiB/s

3) Sequential write:
Speed: 71.6MiB/s, 72.5MiB/s, 72.2MiB/s, 64.6MiB/s, 67.5MiB/s
Average speed: 69.68MiB/s

4) Random write:
Speed: 36.3MiB/s, 35.4MiB/s, 38.6MiB/s, 34MiB/s, 35.5MiB/s
Average speed: 35.96MiB/s

2. With MMC software queue
I tested each case 5 times and report the average speed.

1) Sequential read:
Speed: 59.2MiB/s, 60.4MiB/s, 63.6MiB/s, 60.3MiB/s, 59.9MiB/s
Average speed: 60.68MiB/s

2) Random read:
Speed: 31.3MiB/s, 31.4MiB/s, 31.5MiB/s, 31.3MiB/s, 31.3MiB/s
Average speed: 31.36MiB/s

3) Sequential write:
Speed: 71MiB/s, 71.8MiB/s, 72.3MiB/s, 72.2MiB/s, 71MiB/s
Average speed: 71.66MiB/s

4) Random write:
Speed: 68.9MiB/s, 68.7MiB/s, 68.8MiB/s, 68.6MiB/s, 68.8MiB/s
Average speed: 68.76MiB/s

From the above data, we can see that the MMC software queue clearly improves
performance for random read and write, though there is no obvious improvement
for sequential read and write.
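
For anyone who wants to re-check the arithmetic, the averages and the relative
gains can be recomputed with a few lines of C. This is just a sketch: the
helper names mean() and gain_pct() are mine, and the sample arrays are copied
from the random read/write runs listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples (MiB/s). */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change from before to after, in percent. */
static double gain_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Samples copied from the runs listed above. */
static const double rand_read_base[5]  = { 26.9, 26.9, 27.1, 27.1, 27.2 };
static const double rand_read_swq[5]   = { 31.3, 31.4, 31.5, 31.3, 31.3 };
static const double rand_write_base[5] = { 36.3, 35.4, 38.6, 34.0, 35.5 };
static const double rand_write_swq[5]  = { 68.9, 68.7, 68.8, 68.6, 68.8 };
```

Feeding the averages into gain_pct() gives roughly 16% for random read
(27.04 to 31.36 MiB/s) and roughly 91% for random write (35.96 to
68.76 MiB/s).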

Any comments are welcome. Thanks a lot.

Changes from v6:
 - Change the patch order and set host->always_defer_done = true for the
 Spreadtrum host driver.

Changes from v5:
 - Modify the condition for deferring request completion, as suggested by Adrian.

Changes from v4:
 - Add a separate patch to introduce a variable to defer completing
 data requests for some host drivers, when using host software queue.

Changes from v3:
 - Use host software queue instead of sqhci.
 - Fix random config building issue.
 - Change queue depth to 32, but still only allow 2 requests in flight.
 - Update the testing data.

Changes from v2:
 - Remove reference to 'struct cqhci_host' and 'struct cqhci_slot',
 instead adding 'struct sqhci_host', which is only used by software queue.

Changes from v1:
 - Add request_done ops for sdhci_ops.
 - Replace virtual command queue with software queue for functions and
 variables.
 - Rename the software queue file and add sqhci.h header file.

Baolin Wang (4):
  mmc: Add MMC host software queue support
  mmc: host: sdhci: Add request_done ops for struct sdhci_ops
  mmc: host: sdhci: Add a variable to defer to complete requests if
    needed
  mmc: host: sdhci-sprd: Add software queue support

 drivers/mmc/core/block.c      |   61 ++++++++
 drivers/mmc/core/mmc.c        |   13 +-
 drivers/mmc/core/queue.c      |   33 +++-
 drivers/mmc/host/Kconfig      |    8 +
 drivers/mmc/host/Makefile     |    1 +
 drivers/mmc/host/mmc_hsq.c    |  344 +++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/mmc_hsq.h    |   30 ++++
 drivers/mmc/host/sdhci-sprd.c |   28 ++++
 drivers/mmc/host/sdhci.c      |   14 +-
 drivers/mmc/host/sdhci.h      |    3 +
 include/linux/mmc/host.h      |    3 +
 11 files changed, 525 insertions(+), 13 deletions(-)
 create mode 100644 drivers/mmc/host/mmc_hsq.c
 create mode 100644 drivers/mmc/host/mmc_hsq.h

-- 
1.7.9.5


* [PATCH v7 1/4] mmc: Add MMC host software queue support
  2019-11-18 10:43 [PATCH v7 0/4] Add MMC software queue support Baolin Wang
@ 2019-11-18 10:43 ` Baolin Wang
  2019-11-22 10:32   ` Arnd Bergmann
  2020-01-21 11:52   ` Ulf Hansson
  2019-11-18 10:43 ` [PATCH v7 2/4] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Baolin Wang @ 2019-11-18 10:43 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, baolin.wang7, linux-mmc, linux-kernel

From: Baolin Wang <baolin.wang@linaro.org>

Currently the MMC read/write stack always waits for the previous request to
complete, via mmc_blk_rw_wait(), before sending a new request to the hardware,
or it queues a work item to complete the request. Both paths add context
switching overhead, which hurts IO performance, especially at high I/O per
second rates.

Thus this patch introduces an MMC software queue interface based on the
hardware command queue engine's interfaces. It follows the same idea as the
hardware command queue engine and can remove the context switching. Moreover,
we set the default queue depth to 32 for the software queue, which allows more
requests to be prepared, merged and inserted into the IO scheduler to improve
performance, but we only allow 2 requests in flight. That is enough to let the
irq handler always trigger the next request without a context switch, while
avoiding a long latency.
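
The slot and next-tag bookkeeping behind this can be sketched in user space as
below. This is a simplified model under my own assumptions, not the driver
code: the toy_* names are hypothetical, locking is omitted, and issuing and
completing a request are folded into a single dequeue step.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS   32
#define INVALID_TAG NUM_SLOTS

/* Toy stand-in for the software queue; hypothetical, not a kernel type. */
struct toy_queue {
	const void *slot[NUM_SLOTS]; /* non-NULL means a request is queued */
	int next_tag;                /* tag to hand out next, or INVALID_TAG */
	int qcnt;                    /* number of queued requests */
};

static void toy_init(struct toy_queue *q)
{
	for (int i = 0; i < NUM_SLOTS; i++)
		q->slot[i] = NULL;
	q->next_tag = INVALID_TAG;
	q->qcnt = 0;
}

/* Store a request in the slot selected by its tag. */
static void toy_enqueue(struct toy_queue *q, int tag, const void *req)
{
	assert(tag >= 0 && tag < NUM_SLOTS && req != NULL);
	assert(q->slot[tag] == NULL);
	q->slot[tag] = req;
	if (q->next_tag == INVALID_TAG)
		q->next_tag = tag;
	q->qcnt++;
}

/* Hand out the request at next_tag, then choose the following tag. */
static const void *toy_dequeue(struct toy_queue *q)
{
	const void *req;
	int tag;

	if (q->qcnt == 0)
		return NULL;

	req = q->slot[q->next_tag];
	q->slot[q->next_tag] = NULL;
	if (--q->qcnt == 0) {
		q->next_tag = INVALID_TAG;
		return req;
	}

	/* Prefer the neighbouring tag, otherwise scan every slot from 0. */
	tag = q->next_tag + 1;
	if (tag == NUM_SLOTS || !q->slot[tag]) {
		for (tag = 0; tag < NUM_SLOTS; tag++)
			if (q->slot[tag])
				break;
	}
	q->next_tag = tag; /* qcnt > 0, so the scan found a used slot */
	return req;
}
```

Note that with this selection rule requests are not strictly FIFO: after
tags 5 and 6 are served, a request queued under tag 2 is picked up by the
fallback scan.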

From the fio testing data in the cover letter, we can see that the software
queue improves performance with a 4K block size: about 16% for random read
and about 90% for random write, though there is no obvious improvement for
sequential read and write.

Moreover, we can expand the software queue interface to support MMC
packed requests or packed commands in the future.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
---
 drivers/mmc/core/block.c   |   61 ++++++++
 drivers/mmc/core/mmc.c     |   13 +-
 drivers/mmc/core/queue.c   |   33 ++++-
 drivers/mmc/host/Kconfig   |    7 +
 drivers/mmc/host/Makefile  |    1 +
 drivers/mmc/host/mmc_hsq.c |  344 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/mmc_hsq.h |   30 ++++
 include/linux/mmc/host.h   |    3 +
 8 files changed, 482 insertions(+), 10 deletions(-)
 create mode 100644 drivers/mmc/host/mmc_hsq.c
 create mode 100644 drivers/mmc/host/mmc_hsq.h

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 2c71a43..870462c 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -168,6 +168,11 @@ struct mmc_rpmb_data {
 
 static inline int mmc_blk_part_switch(struct mmc_card *card,
 				      unsigned int part_type);
+static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+			       struct mmc_card *card,
+			       int disable_multi,
+			       struct mmc_queue *mq);
+static void mmc_blk_swq_req_done(struct mmc_request *mrq);
 
 static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
 {
@@ -1569,9 +1574,30 @@ static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
 	return mmc_blk_cqe_start_req(mq->card->host, mrq);
 }
 
+static int mmc_blk_swq_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+	int err;
+
+	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
+	mqrq->brq.mrq.done = mmc_blk_swq_req_done;
+	mmc_pre_req(host, &mqrq->brq.mrq);
+
+	err = mmc_cqe_start_req(host, &mqrq->brq.mrq);
+	if (err)
+		mmc_post_req(host, &mqrq->brq.mrq, err);
+
+	return err;
+}
+
 static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+
+	if (host->swq_enabled)
+		return mmc_blk_swq_issue_rw_rq(mq, req);
 
 	mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
 
@@ -1957,6 +1983,41 @@ static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
 		mmc_run_bkops(mq->card);
 }
 
+static void mmc_blk_swq_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq =
+		container_of(mrq, struct mmc_queue_req, brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	unsigned long flags;
+
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_urgent_bkops_needed(mq, mqrq)) {
+		spin_lock_irqsave(&mq->lock, flags);
+		mq->recovery_needed = true;
+		mq->recovery_req = req;
+		spin_unlock_irqrestore(&mq->lock, flags);
+
+		host->cqe_ops->cqe_recovery_start(host);
+
+		schedule_work(&mq->recovery_work);
+		return;
+	}
+
+	mmc_blk_rw_reset_success(mq, req);
+
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
+}
+
 void mmc_blk_mq_complete(struct request *req)
 {
 	struct mmc_queue *mq = req->q->queuedata;
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index c880489..8eac1a2 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1852,15 +1852,22 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
 	 */
 	card->reenable_cmdq = card->ext_csd.cmdq_en;
 
-	if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
+	if (host->cqe_ops && !host->cqe_enabled) {
 		err = host->cqe_ops->cqe_enable(host, card);
 		if (err) {
 			pr_err("%s: Failed to enable CQE, error %d\n",
 				mmc_hostname(host), err);
 		} else {
 			host->cqe_enabled = true;
-			pr_info("%s: Command Queue Engine enabled\n",
-				mmc_hostname(host));
+
+			if (card->ext_csd.cmdq_en) {
+				pr_info("%s: Command Queue Engine enabled\n",
+					mmc_hostname(host));
+			} else {
+				host->swq_enabled = true;
+				pr_info("%s: Software Queue enabled\n",
+					mmc_hostname(host));
+			}
 		}
 	}
 
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 9edc086..d9086c1 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -62,7 +62,7 @@ enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_host *host = mq->card->host;
 
-	if (mq->use_cqe)
+	if (mq->use_cqe && !host->swq_enabled)
 		return mmc_cqe_issue_type(host, req);
 
 	if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
@@ -124,12 +124,14 @@ static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
 {
 	struct request_queue *q = req->q;
 	struct mmc_queue *mq = q->queuedata;
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
 	unsigned long flags;
 	int ret;
 
 	spin_lock_irqsave(&mq->lock, flags);
 
-	if (mq->recovery_needed || !mq->use_cqe)
+	if (mq->recovery_needed || !mq->use_cqe || host->swq_enabled)
 		ret = BLK_EH_RESET_TIMER;
 	else
 		ret = mmc_cqe_timed_out(req);
@@ -144,12 +146,13 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 	struct mmc_queue *mq = container_of(work, struct mmc_queue,
 					    recovery_work);
 	struct request_queue *q = mq->queue;
+	struct mmc_host *host = mq->card->host;
 
 	mmc_get_card(mq->card, &mq->ctx);
 
 	mq->in_recovery = true;
 
-	if (mq->use_cqe)
+	if (mq->use_cqe && !host->swq_enabled)
 		mmc_blk_cqe_recovery(mq);
 	else
 		mmc_blk_mq_recovery(mq);
@@ -160,6 +163,9 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 	mq->recovery_needed = false;
 	spin_unlock_irq(&mq->lock);
 
+	if (host->swq_enabled)
+		host->cqe_ops->cqe_recovery_finish(host);
+
 	mmc_put_card(mq->card, &mq->ctx);
 
 	blk_mq_run_hw_queues(q, true);
@@ -279,6 +285,14 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		}
 		break;
 	case MMC_ISSUE_ASYNC:
+		/*
+		 * For MMC host software queue, we only allow 2 requests in
+		 * flight to avoid a long latency.
+		 */
+		if (host->swq_enabled && mq->in_flight[issue_type] > 2) {
+			spin_unlock_irq(&mq->lock);
+			return BLK_STS_RESOURCE;
+		}
 		break;
 	default:
 		/*
@@ -430,11 +444,16 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card)
 	 * The queue depth for CQE must match the hardware because the request
 	 * tag is used to index the hardware queue.
 	 */
-	if (mq->use_cqe)
-		mq->tag_set.queue_depth =
-			min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
-	else
+	if (mq->use_cqe) {
+		if (host->swq_enabled)
+			mq->tag_set.queue_depth = host->cqe_qdepth;
+		else
+			mq->tag_set.queue_depth =
+				min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
+	} else {
 		mq->tag_set.queue_depth = MMC_QUEUE_DEPTH;
+	}
+
 	mq->tag_set.numa_node = NUMA_NO_NODE;
 	mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
 	mq->tag_set.nr_hw_queues = 1;
diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 49ea02c..efa4019 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -936,6 +936,13 @@ config MMC_CQHCI
 
 	  If unsure, say N.
 
+config MMC_HSQ
+	tristate "MMC Host Software Queue support"
+	help
+	  This selects the Software Queue support.
+
+	  If unsure, say N.
+
 config MMC_TOSHIBA_PCI
 	tristate "Toshiba Type A SD/MMC Card Interface Driver"
 	depends on PCI
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 11c4598..c14b439 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -98,6 +98,7 @@ obj-$(CONFIG_MMC_SDHCI_BRCMSTB)		+= sdhci-brcmstb.o
 obj-$(CONFIG_MMC_SDHCI_OMAP)		+= sdhci-omap.o
 obj-$(CONFIG_MMC_SDHCI_SPRD)		+= sdhci-sprd.o
 obj-$(CONFIG_MMC_CQHCI)			+= cqhci.o
+obj-$(CONFIG_MMC_HSQ)			+= mmc_hsq.o
 
 ifeq ($(CONFIG_CB710_DEBUG),y)
 	CFLAGS-cb710-mmc	+= -DDEBUG
diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
new file mode 100644
index 0000000..f5a4f93
--- /dev/null
+++ b/drivers/mmc/host/mmc_hsq.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * MMC software queue support based on command queue interfaces
+ *
+ * Copyright (C) 2019 Linaro, Inc.
+ * Author: Baolin Wang <baolin.wang@linaro.org>
+ */
+
+#include <linux/mmc/card.h>
+#include <linux/mmc/host.h>
+
+#include "mmc_hsq.h"
+
+#define HSQ_NUM_SLOTS	32
+#define HSQ_INVALID_TAG	HSQ_NUM_SLOTS
+
+static void mmc_hsq_pump_requests(struct mmc_hsq *hsq)
+{
+	struct mmc_host *mmc = hsq->mmc;
+	struct hsq_slot *slot;
+	unsigned long flags;
+
+	spin_lock_irqsave(&hsq->lock, flags);
+
+	/* Make sure we are not already running a request now */
+	if (hsq->mrq) {
+		spin_unlock_irqrestore(&hsq->lock, flags);
+		return;
+	}
+
+	/* Make sure there are remaining requests to pump */
+	if (!hsq->qcnt || !hsq->enabled) {
+		spin_unlock_irqrestore(&hsq->lock, flags);
+		return;
+	}
+
+	slot = &hsq->slot[hsq->next_tag];
+	hsq->mrq = slot->mrq;
+	hsq->qcnt--;
+
+	spin_unlock_irqrestore(&hsq->lock, flags);
+
+	mmc->ops->request(mmc, hsq->mrq);
+}
+
+static void mmc_hsq_update_next_tag(struct mmc_hsq *hsq, int remains)
+{
+	struct hsq_slot *slot;
+	int tag;
+
+	/*
+	 * If there are no remaining requests in the software queue, then set
+	 * an invalid tag.
+	 */
+	if (!remains) {
+		hsq->next_tag = HSQ_INVALID_TAG;
+		return;
+	}
+
+	/*
+	 * Increase the next tag and check whether the corresponding request
+	 * is available; if so, we have found a candidate request.
+	 */
+	if (++hsq->next_tag != HSQ_INVALID_TAG) {
+		slot = &hsq->slot[hsq->next_tag];
+		if (slot->mrq)
+			return;
+	}
+
+	/* Otherwise we should iterate over all slots to find an available tag. */
+	for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
+		slot = &hsq->slot[tag];
+		if (slot->mrq)
+			break;
+	}
+
+	if (tag == HSQ_NUM_SLOTS)
+		tag = HSQ_INVALID_TAG;
+
+	hsq->next_tag = tag;
+}
+
+static void mmc_hsq_post_request(struct mmc_hsq *hsq)
+{
+	unsigned long flags;
+	int remains;
+
+	spin_lock_irqsave(&hsq->lock, flags);
+
+	remains = hsq->qcnt;
+	hsq->mrq = NULL;
+
+	/* Update the next available tag to be queued. */
+	mmc_hsq_update_next_tag(hsq, remains);
+
+	if (hsq->waiting_for_idle && !remains) {
+		hsq->waiting_for_idle = false;
+		wake_up(&hsq->wait_queue);
+	}
+
+	/* Do not pump new request in recovery mode. */
+	if (hsq->recovery_halt) {
+		spin_unlock_irqrestore(&hsq->lock, flags);
+		return;
+	}
+
+	spin_unlock_irqrestore(&hsq->lock, flags);
+
+	 /*
+	  * Try to pump new request to host controller as fast as possible,
+	  * after completing previous request.
+	  */
+	if (remains > 0)
+		mmc_hsq_pump_requests(hsq);
+}
+
+/**
+ * mmc_hsq_finalize_request - finalize one request if the request is done
+ * @mmc: the host controller
+ * @mrq: the request that needs to be finalized
+ *
+ * Return true if we finalized the corresponding request in software queue,
+ * otherwise return false.
+ */
+bool mmc_hsq_finalize_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	struct mmc_hsq *hsq = mmc->cqe_private;
+	unsigned long flags;
+
+	spin_lock_irqsave(&hsq->lock, flags);
+
+	if (!hsq->enabled || !hsq->mrq || hsq->mrq != mrq) {
+		spin_unlock_irqrestore(&hsq->lock, flags);
+		return false;
+	}
+
+	/*
+	 * Clear current completed slot request to make a room for new request.
+	 */
+	hsq->slot[hsq->next_tag].mrq = NULL;
+
+	spin_unlock_irqrestore(&hsq->lock, flags);
+
+	mmc_cqe_request_done(mmc, hsq->mrq);
+
+	mmc_hsq_post_request(hsq);
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(mmc_hsq_finalize_request);
+
+static void mmc_hsq_recovery_start(struct mmc_host *mmc)
+{
+	struct mmc_hsq *hsq = mmc->cqe_private;
+	unsigned long flags;
+
+	spin_lock_irqsave(&hsq->lock, flags);
+
+	hsq->recovery_halt = true;
+
+	spin_unlock_irqrestore(&hsq->lock, flags);
+}
+
+static void mmc_hsq_recovery_finish(struct mmc_host *mmc)
+{
+	struct mmc_hsq *hsq = mmc->cqe_private;
+	int remains;
+
+	spin_lock_irq(&hsq->lock);
+
+	hsq->recovery_halt = false;
+	remains = hsq->qcnt;
+
+	spin_unlock_irq(&hsq->lock);
+
+	/*
+	 * Try to pump a new request if there are requests pending in the
+	 * software queue after finishing recovery.
+	 */
+	if (remains > 0)
+		mmc_hsq_pump_requests(hsq);
+}
+
+static int mmc_hsq_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	struct mmc_hsq *hsq = mmc->cqe_private;
+	int tag = mrq->tag;
+
+	spin_lock_irq(&hsq->lock);
+
+	if (!hsq->enabled) {
+		spin_unlock_irq(&hsq->lock);
+		return -ESHUTDOWN;
+	}
+
+	/* Do not queue any new requests in recovery mode. */
+	if (hsq->recovery_halt) {
+		spin_unlock_irq(&hsq->lock);
+		return -EBUSY;
+	}
+
+	hsq->slot[tag].mrq = mrq;
+
+	/*
+	 * Set the next tag as current request tag if no available
+	 * next tag.
+	 */
+	if (hsq->next_tag == HSQ_INVALID_TAG)
+		hsq->next_tag = tag;
+
+	hsq->qcnt++;
+
+	spin_unlock_irq(&hsq->lock);
+
+	mmc_hsq_pump_requests(hsq);
+
+	return 0;
+}
+
+static void mmc_hsq_post_req(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	if (mmc->ops->post_req)
+		mmc->ops->post_req(mmc, mrq, 0);
+}
+
+static bool mmc_hsq_queue_is_idle(struct mmc_hsq *hsq, int *ret)
+{
+	bool is_idle;
+
+	spin_lock_irq(&hsq->lock);
+
+	is_idle = (!hsq->mrq && !hsq->qcnt) ||
+		hsq->recovery_halt;
+
+	*ret = hsq->recovery_halt ? -EBUSY : 0;
+	hsq->waiting_for_idle = !is_idle;
+
+	spin_unlock_irq(&hsq->lock);
+
+	return is_idle;
+}
+
+static int mmc_hsq_wait_for_idle(struct mmc_host *mmc)
+{
+	struct mmc_hsq *hsq = mmc->cqe_private;
+	int ret;
+
+	wait_event(hsq->wait_queue,
+		   mmc_hsq_queue_is_idle(hsq, &ret));
+
+	return ret;
+}
+
+static void mmc_hsq_disable(struct mmc_host *mmc)
+{
+	struct mmc_hsq *hsq = mmc->cqe_private;
+	u32 timeout = 500;
+	int ret;
+
+	spin_lock_irq(&hsq->lock);
+
+	if (!hsq->enabled) {
+		spin_unlock_irq(&hsq->lock);
+		return;
+	}
+
+	spin_unlock_irq(&hsq->lock);
+
+	ret = wait_event_timeout(hsq->wait_queue,
+				 mmc_hsq_queue_is_idle(hsq, &ret),
+				 msecs_to_jiffies(timeout));
+	if (ret == 0) {
+		pr_warn("could not stop mmc software queue\n");
+		return;
+	}
+
+	spin_lock_irq(&hsq->lock);
+
+	hsq->enabled = false;
+
+	spin_unlock_irq(&hsq->lock);
+}
+
+static int mmc_hsq_enable(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct mmc_hsq *hsq = mmc->cqe_private;
+
+	spin_lock_irq(&hsq->lock);
+
+	if (hsq->enabled) {
+		spin_unlock_irq(&hsq->lock);
+		return -EBUSY;
+	}
+
+	hsq->enabled = true;
+
+	spin_unlock_irq(&hsq->lock);
+
+	return 0;
+}
+
+static const struct mmc_cqe_ops mmc_hsq_ops = {
+	.cqe_enable = mmc_hsq_enable,
+	.cqe_disable = mmc_hsq_disable,
+	.cqe_request = mmc_hsq_request,
+	.cqe_post_req = mmc_hsq_post_req,
+	.cqe_wait_for_idle = mmc_hsq_wait_for_idle,
+	.cqe_recovery_start = mmc_hsq_recovery_start,
+	.cqe_recovery_finish = mmc_hsq_recovery_finish,
+};
+
+int mmc_hsq_init(struct mmc_hsq *hsq, struct mmc_host *mmc)
+{
+	hsq->num_slots = HSQ_NUM_SLOTS;
+	hsq->next_tag = HSQ_INVALID_TAG;
+	mmc->cqe_qdepth = HSQ_NUM_SLOTS;
+
+	hsq->slot = devm_kcalloc(mmc_dev(mmc), hsq->num_slots,
+				 sizeof(struct hsq_slot), GFP_KERNEL);
+	if (!hsq->slot)
+		return -ENOMEM;
+
+	hsq->mmc = mmc;
+	hsq->mmc->cqe_private = hsq;
+	mmc->cqe_ops = &mmc_hsq_ops;
+
+	spin_lock_init(&hsq->lock);
+	init_waitqueue_head(&hsq->wait_queue);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mmc_hsq_init);
+
+void mmc_hsq_suspend(struct mmc_host *mmc)
+{
+	mmc_hsq_disable(mmc);
+}
+EXPORT_SYMBOL_GPL(mmc_hsq_suspend);
+
+int mmc_hsq_resume(struct mmc_host *mmc)
+{
+	return mmc_hsq_enable(mmc, NULL);
+}
+EXPORT_SYMBOL_GPL(mmc_hsq_resume);
diff --git a/drivers/mmc/host/mmc_hsq.h b/drivers/mmc/host/mmc_hsq.h
new file mode 100644
index 0000000..d51beb7
--- /dev/null
+++ b/drivers/mmc/host/mmc_hsq.h
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef LINUX_MMC_HSQ_H
+#define LINUX_MMC_HSQ_H
+
+struct hsq_slot {
+	struct mmc_request *mrq;
+};
+
+struct mmc_hsq {
+	struct mmc_host *mmc;
+	struct mmc_request *mrq;
+	wait_queue_head_t wait_queue;
+	struct hsq_slot *slot;
+	spinlock_t lock;
+
+	int next_tag;
+	int num_slots;
+	int qcnt;
+
+	bool enabled;
+	bool waiting_for_idle;
+	bool recovery_halt;
+};
+
+int mmc_hsq_init(struct mmc_hsq *hsq, struct mmc_host *mmc);
+void mmc_hsq_suspend(struct mmc_host *mmc);
+int mmc_hsq_resume(struct mmc_host *mmc);
+bool mmc_hsq_finalize_request(struct mmc_host *mmc, struct mmc_request *mrq);
+
+#endif
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index ba70338..3931aa3 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -462,6 +462,9 @@ struct mmc_host {
 	bool			cqe_enabled;
 	bool			cqe_on;
 
+	/* Software Queue support */
+	bool			swq_enabled;
+
 	unsigned long		private[0] ____cacheline_aligned;
 };
 
-- 
1.7.9.5


* [PATCH v7 2/4] mmc: host: sdhci: Add request_done ops for struct sdhci_ops
  2019-11-18 10:43 [PATCH v7 0/4] Add MMC software queue support Baolin Wang
  2019-11-18 10:43 ` [PATCH v7 1/4] mmc: Add MMC host " Baolin Wang
@ 2019-11-18 10:43 ` Baolin Wang
  2019-11-22 12:13   ` Adrian Hunter
  2019-11-18 10:43 ` [PATCH v7 3/4] mmc: host: sdhci: Add a variable to defer to complete requests if needed Baolin Wang
  2019-11-18 10:43 ` [PATCH v7 4/4] mmc: host: sdhci-sprd: Add software queue support Baolin Wang
  3 siblings, 1 reply; 14+ messages in thread
From: Baolin Wang @ 2019-11-18 10:43 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, baolin.wang7, linux-mmc, linux-kernel

From: Baolin Wang <baolin.wang@linaro.org>

Add a request_done op to struct sdhci_ops as preparation for host controllers
that complete a request in a different way, such as supporting request
completion for the MMC software queue.
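
The pattern is an optional hook with a fallback to the generic path. Below is
a minimal user-space sketch of that pattern only; the toy_* types and
functions are hypothetical stand-ins, with toy_default_done() playing the
role of mmc_request_done().

```c
#include <assert.h>
#include <stddef.h>

struct toy_host;

/* Hypothetical ops table with one optional hook. */
struct toy_ops {
	void (*request_done)(struct toy_host *host, int req_id);
};

struct toy_host {
	const struct toy_ops *ops;
	int default_done;  /* completions handled by the generic path */
	int custom_done;   /* completions handled by the driver hook */
};

/* Generic completion path, standing in for mmc_request_done(). */
static void toy_default_done(struct toy_host *host, int req_id)
{
	(void)req_id;
	host->default_done++;
}

/* Call the driver's hook when it provides one, else the generic path. */
static void toy_complete(struct toy_host *host, int req_id)
{
	assert(host->ops != NULL);
	if (host->ops->request_done)
		host->ops->request_done(host, req_id);
	else
		toy_default_done(host, req_id);
}

/* Example hook: claim even-numbered requests, pass the rest on. */
static void toy_custom_done(struct toy_host *host, int req_id)
{
	if (req_id % 2 == 0)
		host->custom_done++;
	else
		toy_default_done(host, req_id);
}
```

Drivers that leave the hook unset keep the existing behaviour, so the change
is opt-in per host driver.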

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
---
 drivers/mmc/host/sdhci.c |   12 ++++++++++--
 drivers/mmc/host/sdhci.h |    2 ++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index b056400..850241f 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2729,7 +2729,10 @@ static bool sdhci_request_done(struct sdhci_host *host)
 
 	spin_unlock_irqrestore(&host->lock, flags);
 
-	mmc_request_done(host->mmc, mrq);
+	if (host->ops->request_done)
+		host->ops->request_done(host, mrq);
+	else
+		mmc_request_done(host->mmc, mrq);
 
 	return false;
 }
@@ -3157,7 +3160,12 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
 
 	/* Process mrqs ready for immediate completion */
 	for (i = 0; i < SDHCI_MAX_MRQS; i++) {
-		if (mrqs_done[i])
+		if (!mrqs_done[i])
+			continue;
+
+		if (host->ops->request_done)
+			host->ops->request_done(host, mrqs_done[i]);
+		else
 			mmc_request_done(host->mmc, mrqs_done[i]);
 	}
 
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 0ed3e0e..d89cdb9 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -644,6 +644,8 @@ struct sdhci_ops {
 	void	(*voltage_switch)(struct sdhci_host *host);
 	void	(*adma_write_desc)(struct sdhci_host *host, void **desc,
 				   dma_addr_t addr, int len, unsigned int cmd);
+	void	(*request_done)(struct sdhci_host *host,
+				struct mmc_request *mrq);
 };
 
 #ifdef CONFIG_MMC_SDHCI_IO_ACCESSORS
-- 
1.7.9.5


* [PATCH v7 3/4] mmc: host: sdhci: Add a variable to defer to complete requests if needed
  2019-11-18 10:43 [PATCH v7 0/4] Add MMC software queue support Baolin Wang
  2019-11-18 10:43 ` [PATCH v7 1/4] mmc: Add MMC host " Baolin Wang
  2019-11-18 10:43 ` [PATCH v7 2/4] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
@ 2019-11-18 10:43 ` Baolin Wang
  2019-11-22 12:14   ` Adrian Hunter
  2019-11-18 10:43 ` [PATCH v7 4/4] mmc: host: sdhci-sprd: Add software queue support Baolin Wang
  3 siblings, 1 reply; 14+ messages in thread
From: Baolin Wang @ 2019-11-18 10:43 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, baolin.wang7, linux-mmc, linux-kernel

From: Baolin Wang <baolin.wang@linaro.org>

When using the host software queue, the next request is triggered in the
irq handler without a context switch. But for some host drivers,
sdhci_request() cannot be called in interrupt context when using the host
software queue, because the get_cd() op may sleep.

But for some host drivers, such as the Spreadtrum host driver, the card is
nonremovable, so the get_cd() op does not sleep, which means we can
complete the data request and trigger the next request in the irq handler
to remove the context switch for the Spreadtrum host driver.

As suggested by Adrian, we should introduce a request_atomic() API to
indicate that a request can be issued in interrupt context, to remove
the context switch when using the mmc host software queue. But this should
be done in another thread to convert the users of the mmc host software
queue. Thus we introduce a variable in struct sdhci_host to indicate that
we always defer completing requests when using the host software queue.
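
The resulting decision can be modelled in user space as a plain boolean
predicate. This is only an illustrative sketch: struct toy_state and
toy_defer_done() are hypothetical names, and the real condition is evaluated
on struct sdhci_host and the request's data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the state the decision depends on. */
struct toy_state {
	bool pending_reset;      /* a cmd/data reset is pending */
	bool always_defer_done;  /* the new per-host switch */
	bool uses_dma;           /* host transfers data with DMA */
	bool has_mapped_data;    /* request carries DMA-mapped data */
};

/* True when completion must be left to the deferred (non-irq) path. */
static bool toy_defer_done(const struct toy_state *s)
{
	assert(s != NULL);
	return s->pending_reset || s->always_defer_done ||
	       (s->uses_dma && s->has_mapped_data);
}
```

Setting the switch forces the deferred path regardless of the other
conditions, while hosts that leave it false see no behaviour change.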

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
---
 drivers/mmc/host/sdhci.c |    2 +-
 drivers/mmc/host/sdhci.h |    1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 850241f..4bef066 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -3035,7 +3035,7 @@ static inline bool sdhci_defer_done(struct sdhci_host *host,
 {
 	struct mmc_data *data = mrq->data;
 
-	return host->pending_reset ||
+	return host->pending_reset || host->always_defer_done ||
 	       ((host->flags & SDHCI_REQ_USE_DMA) && data &&
 		data->host_cookie == COOKIE_MAPPED);
 }
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index d89cdb9..a73ce89 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -533,6 +533,7 @@ struct sdhci_host {
 	bool pending_reset;	/* Cmd/data reset is pending */
 	bool irq_wake_enabled;	/* IRQ wakeup is enabled */
 	bool v4_mode;		/* Host Version 4 Enable */
+	bool always_defer_done;	/* Always defer to complete requests */
 
 	struct mmc_request *mrqs_done[SDHCI_MAX_MRQS];	/* Requests done */
 	struct mmc_command *cmd;	/* Current command */
-- 
1.7.9.5


* [PATCH v7 4/4] mmc: host: sdhci-sprd: Add software queue support
  2019-11-18 10:43 [PATCH v7 0/4] Add MMC software queue support Baolin Wang
                   ` (2 preceding siblings ...)
  2019-11-18 10:43 ` [PATCH v7 3/4] mmc: host: sdhci: Add a variable to defer to complete requests if needed Baolin Wang
@ 2019-11-18 10:43 ` Baolin Wang
  3 siblings, 0 replies; 14+ messages in thread
From: Baolin Wang @ 2019-11-18 10:43 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, baolin.wang7, linux-mmc, linux-kernel

From: Baolin Wang <baolin.wang@linaro.org>

Add software queue support to improve the performance.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
---
 drivers/mmc/host/Kconfig      |    1 +
 drivers/mmc/host/sdhci-sprd.c |   28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index efa4019..54b86f6 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -632,6 +632,7 @@ config MMC_SDHCI_SPRD
 	depends on ARCH_SPRD
 	depends on MMC_SDHCI_PLTFM
 	select MMC_SDHCI_IO_ACCESSORS
+	select MMC_HSQ
 	help
 	  This selects the SDIO Host Controller in Spreadtrum
 	  SoCs, this driver supports R11(IP version: R11P0).
diff --git a/drivers/mmc/host/sdhci-sprd.c b/drivers/mmc/host/sdhci-sprd.c
index d07b979..d346223 100644
--- a/drivers/mmc/host/sdhci-sprd.c
+++ b/drivers/mmc/host/sdhci-sprd.c
@@ -19,6 +19,7 @@
 #include <linux/slab.h>
 
 #include "sdhci-pltfm.h"
+#include "mmc_hsq.h"
 
 /* SDHCI_ARGUMENT2 register high 16bit */
 #define SDHCI_SPRD_ARG2_STUFF		GENMASK(31, 16)
@@ -379,6 +380,16 @@ static unsigned int sdhci_sprd_get_ro(struct sdhci_host *host)
 	return 0;
 }
 
+static void sdhci_sprd_request_done(struct sdhci_host *host,
+				    struct mmc_request *mrq)
+{
+	/* First check whether the request came from the software queue. */
+	if (mmc_hsq_finalize_request(host->mmc, mrq))
+		return;
+
+	mmc_request_done(host->mmc, mrq);
+}
+
 static struct sdhci_ops sdhci_sprd_ops = {
 	.read_l = sdhci_sprd_readl,
 	.write_l = sdhci_sprd_writel,
@@ -392,6 +403,7 @@ static unsigned int sdhci_sprd_get_ro(struct sdhci_host *host)
 	.hw_reset = sdhci_sprd_hw_reset,
 	.get_max_timeout_count = sdhci_sprd_get_max_timeout_count,
 	.get_ro = sdhci_sprd_get_ro,
+	.request_done = sdhci_sprd_request_done,
 };
 
 static void sdhci_sprd_request(struct mmc_host *mmc, struct mmc_request *mrq)
@@ -521,6 +533,7 @@ static int sdhci_sprd_probe(struct platform_device *pdev)
 {
 	struct sdhci_host *host;
 	struct sdhci_sprd_host *sprd_host;
+	struct mmc_hsq *hsq;
 	struct clk *clk;
 	int ret = 0;
 
@@ -631,6 +644,18 @@ static int sdhci_sprd_probe(struct platform_device *pdev)
 
 	sprd_host->flags = host->flags;
 
+	hsq = devm_kzalloc(&pdev->dev, sizeof(*hsq), GFP_KERNEL);
+	if (!hsq) {
+		ret = -ENOMEM;
+		goto err_cleanup_host;
+	}
+
+	ret = mmc_hsq_init(hsq, host->mmc);
+	if (ret)
+		goto err_cleanup_host;
+
+	host->always_defer_done = true;
+
 	ret = __sdhci_add_host(host);
 	if (ret)
 		goto err_cleanup_host;
@@ -689,6 +714,7 @@ static int sdhci_sprd_runtime_suspend(struct device *dev)
 	struct sdhci_host *host = dev_get_drvdata(dev);
 	struct sdhci_sprd_host *sprd_host = TO_SPRD_HOST(host);
 
+	mmc_hsq_suspend(host->mmc);
 	sdhci_runtime_suspend_host(host);
 
 	clk_disable_unprepare(sprd_host->clk_sdio);
@@ -717,6 +743,8 @@ static int sdhci_sprd_runtime_resume(struct device *dev)
 		goto clk_disable;
 
 	sdhci_runtime_resume_host(host, 1);
+	mmc_hsq_resume(host->mmc);
+
 	return 0;
 
 clk_disable:
-- 
1.7.9.5
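
For readers following along, the completion hook added above can be
modelled outside the kernel. This is only a sketch under stated
assumptions: the mock_* types and functions are hypothetical stand-ins,
not the real mmc_hsq or SDHCI definitions, and the rule that the finalize
call returns true only for requests the software queue issued is inferred
from the comment in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_request {
	bool from_hsq;		/* issued through the software queue (assumed flag) */
	bool completed;		/* set by the generic completion path */
};

struct mock_host {
	int hsq_finalized;	/* requests consumed by the software queue */
	int core_completed;	/* requests completed by the generic path */
};

/* Plays the role of mmc_hsq_finalize_request(): true if the queue owned it. */
static bool mock_hsq_finalize_request(struct mock_host *host,
				      struct mock_request *mrq)
{
	if (!mrq->from_hsq)
		return false;

	host->hsq_finalized++;
	return true;
}

/* Plays the role of mmc_request_done(). */
static void mock_request_done(struct mock_host *host,
			      struct mock_request *mrq)
{
	mrq->completed = true;
	host->core_completed++;
}

/* Same shape as sdhci_sprd_request_done() in the patch. */
static void mock_sprd_request_done(struct mock_host *host,
				   struct mock_request *mrq)
{
	/* Software-queue requests are finished by the queue itself. */
	if (mock_hsq_finalize_request(host, mrq))
		return;

	mock_request_done(host, mrq);
}
```

Every request takes exactly one of the two completion paths, so a request
that did not come from the software queue is still completed as before.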

* Re: [PATCH v7 1/4] mmc: Add MMC host software queue support
  2019-11-18 10:43 ` [PATCH v7 1/4] mmc: Add MMC host " Baolin Wang
@ 2019-11-22 10:32   ` Arnd Bergmann
  2019-11-22 10:42     ` Baolin Wang
  2020-01-21 11:52   ` Ulf Hansson
  1 sibling, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2019-11-22 10:32 UTC (permalink / raw)
  To: Baolin Wang
  Cc: Adrian Hunter, Ulf Hansson, asutoshd, Orson Zhai, Lyra Zhang,
	Linus Walleij, Vincent Guittot, Baolin Wang, linux-mmc,
	linux-kernel, Paolo Valente

On Mon, Nov 18, 2019 at 11:43 AM Baolin Wang <baolin.wang7@gmail.com> wrote:
>
> From: Baolin Wang <baolin.wang@linaro.org>
>
> Now the MMC read/write stack will always wait for previous request is
> completed by mmc_blk_rw_wait(), before sending a new request to hardware,
> or queue a work to complete request, that will bring context switching
> overhead, especially for high I/O per second rates, to affect the IO
> performance.
>
> Thus this patch introduces MMC software queue interface based on the
> hardware command queue engine's interfaces, which is similar with the
> hardware command queue engine's idea, that can remove the context
> switching. Moreover we set the default queue depth as 32 for software
> queue, which allows more requests to be prepared, merged and inserted
> into IO scheduler to improve performance, but we only allow 2 requests
> in flight, that is enough to let the irq handler always trigger the
> next request without a context switch, as well as avoiding a long latency.
>
> From the fio testing data in cover letter, we can see the software
> queue can improve some performance with 4K block size, increasing
> about 16% for random read, increasing about 90% for random write,
> though no obvious improvement for sequential read and write.
>
> Moreover we can expand the software queue interface to support MMC
> packed request or packed command in future.
>
> Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>

Overall, this looks like enough of a win that I think we should just
use the current version for the moment, while still working on all the
other improvements.

My biggest concern is the naming of "software queue", which is
a concept that runs against the idea of doing all the heavy lifting,
in particular the queueing in bfq.

Then again, it does not /actually/ do much queuing at all, beyond
preparing a single request so it can fire it off early. Even with the
packed command support added in, there is not really any queuing
beyond what it has to do anyway.
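
To make the "one running, one prepared" behaviour concrete, here is a
minimal sketch of a two-in-flight limit in plain C. It is an illustration
only, not the mmc_hsq implementation: the sketch_* names are hypothetical,
and the hardware is reduced to a busy flag.

```c
#include <stdbool.h>

#define SKETCH_MAX_IN_FLIGHT 2	/* one on the hardware, one prepared */

struct sketch_queue {
	int in_flight;	/* requests accepted but not yet completed */
	bool hw_busy;	/* a request is currently on the hardware */
	int started;	/* requests handed to the hardware so far */
};

/* Start the next request if the hardware is idle and one is waiting. */
static void sketch_pump(struct sketch_queue *q)
{
	if (!q->hw_busy && q->in_flight > 0) {
		q->hw_busy = true;
		q->started++;
	}
}

/* Accept a request unless the in-flight limit is already reached. */
static bool sketch_submit(struct sketch_queue *q)
{
	if (q->in_flight >= SKETCH_MAX_IN_FLIGHT)
		return false;	/* caller has to retry later */

	q->in_flight++;
	sketch_pump(q);
	return true;
}

/* Completion path: fire the prepared request straight away. */
static void sketch_complete(struct sketch_queue *q)
{
	q->hw_busy = false;
	q->in_flight--;
	sketch_pump(q);
}
```

The point of the second slot is that sketch_complete() always finds a
prepared request to start, which is what lets the interrupt handler
trigger the next transfer without a context switch.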

Using the infrastructure that was added for cqe seems like a good
compromise, as this already has a way to hand down multiple
requests to the hardware and is overall more modern than the
existing support.

I still think we should do all the other things I mentioned in my
earlier reply today, but they can be done as add-ons:

- remove all blocking calls from the queue_rq() function:
  partition-change, retune, etc should become non-blocking
  operations that return busy in the queue_rq function.

- get bfq to send down multiple requests all the way into
  the device driver, so we don't have to actually queue them
  here at all to do packed commands

- add packed command support

- submit cmds from hardirq context if this is advantageous,
  and move everything else in the irq handler into irqthread
  context in order to remove all other workqueue and softirq
  processing from the request processing path.
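
As a rough illustration of the first item, a non-blocking dispatch
function might look something like the sketch below. All names here are
hypothetical and the retune is reduced to two flags; it only shows the
idea of returning "busy" instead of sleeping, not how the MMC block
driver would actually do it.

```c
#include <stdbool.h>

enum sketch_status { SKETCH_OK, SKETCH_BUSY };

struct sketch_dev {
	bool retune_pending;	/* a retune has been requested */
	bool retune_running;	/* a retune was started and has not finished */
	int dispatched;		/* requests handed to the hardware */
};

/* Dispatch one request without ever sleeping. */
static enum sketch_status sketch_queue_rq(struct sketch_dev *dev)
{
	if (dev->retune_pending) {
		/* Kick off the retune asynchronously instead of waiting. */
		dev->retune_pending = false;
		dev->retune_running = true;
		return SKETCH_BUSY;
	}

	if (dev->retune_running)
		return SKETCH_BUSY;	/* caller requeues and retries */

	dev->dispatched++;
	return SKETCH_OK;
}

/* Called when the asynchronous retune finishes. */
static void sketch_retune_done(struct sketch_dev *dev)
{
	dev->retune_running = false;
}
```

The caller is expected to requeue the request when it sees the busy
status and try again once the retune has completed.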

If we can agree on this as the rough plan for the future,
feel free to add my

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

* Re: [PATCH v7 1/4] mmc: Add MMC host software queue support
  2019-11-22 10:32   ` Arnd Bergmann
@ 2019-11-22 10:42     ` Baolin Wang
  2019-12-09  9:07       ` Baolin Wang
  0 siblings, 1 reply; 14+ messages in thread
From: Baolin Wang @ 2019-11-22 10:42 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Adrian Hunter, Ulf Hansson, asutoshd, Orson Zhai, Lyra Zhang,
	Linus Walleij, Vincent Guittot, Baolin Wang, linux-mmc,
	linux-kernel, Paolo Valente

Hi Arnd,

On Fri, Nov 22, 2019 at 6:32 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Mon, Nov 18, 2019 at 11:43 AM Baolin Wang <baolin.wang7@gmail.com> wrote:
> >
> > From: Baolin Wang <baolin.wang@linaro.org>
> >
> > Now the MMC read/write stack will always wait for previous request is
> > completed by mmc_blk_rw_wait(), before sending a new request to hardware,
> > or queue a work to complete request, that will bring context switching
> > overhead, especially for high I/O per second rates, to affect the IO
> > performance.
> >
> > Thus this patch introduces MMC software queue interface based on the
> > hardware command queue engine's interfaces, which is similar with the
> > hardware command queue engine's idea, that can remove the context
> > switching. Moreover we set the default queue depth as 32 for software
> > queue, which allows more requests to be prepared, merged and inserted
> > into IO scheduler to improve performance, but we only allow 2 requests
> > in flight, that is enough to let the irq handler always trigger the
> > next request without a context switch, as well as avoiding a long latency.
> >
> > From the fio testing data in cover letter, we can see the software
> > queue can improve some performance with 4K block size, increasing
> > about 16% for random read, increasing about 90% for random write,
> > though no obvious improvement for sequential read and write.
> >
> > Moreover we can expand the software queue interface to support MMC
> > packed request or packed command in future.
> >
> > Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
> > Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
>
> Overall, this looks like enough of a win that I think we should just
> use the current version for the moment, while still working on all the
> other improvements.
>
> My biggest concern is the naming of "software queue", which is
> a concept that runs against the idea of doing all the heavy lifting,
> in particular the queueing in bfq.
>
> Then again, it does not /actually/ do much queuing at all, beyond
> preparing a single request so it can fire it off early. Even with the
> packed command support added in, there is not really any queuing
> beyond what it has to do anyway.

Yes, but I have not found a better name so far, and 'software queue'
was suggested by Adrian.

>
> Using the infrastructure that was added for cqe seems like a good
> compromise, as this already has a way to hand down multiple
> requests to the hardware and is overall more modern than the
> existing support.
>
> I still think we should do all the other things I mentioned in my
> earlier reply today, but they can be done as add-ons:
>
> - remove all blocking calls from the queue_rq() function:
>   partition-change, retune, etc should become non-blocking
>   operations that return busy in the queue_rq function.
>
> - get bfq to send down multiple requests all the way into
>   the device driver, so we don't have to actually queue them
>   here at all to do packed commands
>
> - add packed command support
>
> - submit cmds from hardirq context if this is advantageous,
>   and move everything else in the irq handler into irqthread
>   context in order to remove all other workqueue and softirq
>   processing from the request processing path.
>
> If we can agree on this as the rough plan for the future,
> feel free to add my

Yes, I agree with your plan. That's what we should do in the future.

>
> Reviewed-by: Arnd Bergmann <arnd@arndb.de>

Thanks for your review and good suggestions.

Ulf,

I am not sure whether there is any chance to merge this patch set into
v5.5. I've tested it for a long time and did not find any regressions.
Thanks.

* Re: [PATCH v7 2/4] mmc: host: sdhci: Add request_done ops for struct sdhci_ops
  2019-11-18 10:43 ` [PATCH v7 2/4] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
@ 2019-11-22 12:13   ` Adrian Hunter
  0 siblings, 0 replies; 14+ messages in thread
From: Adrian Hunter @ 2019-11-22 12:13 UTC (permalink / raw)
  To: Baolin Wang, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, linux-mmc, linux-kernel

On 18/11/19 12:43 PM, Baolin Wang wrote:
> From: Baolin Wang <baolin.wang@linaro.org>
> 
> Add request_done ops for struct sdhci_ops as a preparation in case some
> host controllers have different method to complete one request, such as
> supporting request completion of MMC software queue.
> 
> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/mmc/host/sdhci.c |   12 ++++++++++--
>  drivers/mmc/host/sdhci.h |    2 ++
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index b056400..850241f 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -2729,7 +2729,10 @@ static bool sdhci_request_done(struct sdhci_host *host)
>  
>  	spin_unlock_irqrestore(&host->lock, flags);
>  
> -	mmc_request_done(host->mmc, mrq);
> +	if (host->ops->request_done)
> +		host->ops->request_done(host, mrq);
> +	else
> +		mmc_request_done(host->mmc, mrq);
>  
>  	return false;
>  }
> @@ -3157,7 +3160,12 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
>  
>  	/* Process mrqs ready for immediate completion */
>  	for (i = 0; i < SDHCI_MAX_MRQS; i++) {
> -		if (mrqs_done[i])
> +		if (!mrqs_done[i])
> +			continue;
> +
> +		if (host->ops->request_done)
> +			host->ops->request_done(host, mrqs_done[i]);
> +		else
>  			mmc_request_done(host->mmc, mrqs_done[i]);
>  	}
>  
> diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
> index 0ed3e0e..d89cdb9 100644
> --- a/drivers/mmc/host/sdhci.h
> +++ b/drivers/mmc/host/sdhci.h
> @@ -644,6 +644,8 @@ struct sdhci_ops {
>  	void	(*voltage_switch)(struct sdhci_host *host);
>  	void	(*adma_write_desc)(struct sdhci_host *host, void **desc,
>  				   dma_addr_t addr, int len, unsigned int cmd);
> +	void	(*request_done)(struct sdhci_host *host,
> +				struct mmc_request *mrq);
>  };
>  
>  #ifdef CONFIG_MMC_SDHCI_IO_ACCESSORS
> 
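
The optional-hook pattern in the quoted diff can be tried in isolation.
The following is a minimal sketch with hypothetical sketch_* names and an
integer standing in for the request state; it only mirrors the "call the
driver hook if set, otherwise fall back to the core" shape and is not the
SDHCI code itself.

```c
#include <stddef.h>

struct sketch_host;

/* Hypothetical ops table with one optional hook, like request_done. */
struct sketch_ops {
	void (*request_done)(struct sketch_host *host, int *mrq_state);
};

struct sketch_host {
	const struct sketch_ops *ops;
};

enum {
	SKETCH_PENDING = 0,
	SKETCH_DONE_BY_CORE = 1,
	SKETCH_DONE_BY_DRIVER = 2,
};

/* Default completion, in the role of mmc_request_done(). */
static void sketch_core_request_done(struct sketch_host *host, int *mrq_state)
{
	(void)host;
	*mrq_state = SKETCH_DONE_BY_CORE;
}

/* A driver-specific completion a host driver could install. */
static void sketch_driver_request_done(struct sketch_host *host, int *mrq_state)
{
	(void)host;
	*mrq_state = SKETCH_DONE_BY_DRIVER;
}

/* Use the hook when the driver provides one, else the default path. */
static void sketch_finish(struct sketch_host *host, int *mrq_state)
{
	if (host->ops->request_done)
		host->ops->request_done(host, mrq_state);
	else
		sketch_core_request_done(host, mrq_state);
}
```

Drivers that leave the hook NULL keep the existing behaviour, which is why
the change does not affect hosts that do not opt in.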

* Re: [PATCH v7 3/4] mmc: host: sdhci: Add a variable to defer to complete requests if needed
  2019-11-18 10:43 ` [PATCH v7 3/4] mmc: host: sdhci: Add a variable to defer to complete requests if needed Baolin Wang
@ 2019-11-22 12:14   ` Adrian Hunter
  0 siblings, 0 replies; 14+ messages in thread
From: Adrian Hunter @ 2019-11-22 12:14 UTC (permalink / raw)
  To: Baolin Wang, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, linux-mmc, linux-kernel

On 18/11/19 12:43 PM, Baolin Wang wrote:
> From: Baolin Wang <baolin.wang@linaro.org>
> 
> When using the host software queue, it will trigger the next request in
> irq handler without a context switch. But the sdhci_request() can not be
> called in interrupt context when using host software queue for some host
> drivers, due to the get_cd() ops can be sleepable.
> 
> But for some host drivers, such as Spreadtrum host driver, the card is
> nonremovable, so the get_cd() ops is not sleepable, which means we can
> complete the data request and trigger the next request in irq handler
> to remove the context switch for the Spreadtrum host driver.
> 
> As suggested by Adrian, we should introduce a request_atomic() API to
> indicate that a request can be called in interrupt context to remove
> the context switch when using mmc host software queue. But this should
> be done in another thread to convert the users of mmc host software queue.
> Thus we can introduce a variable in struct sdhci_host to indicate that
> we will always to defer to complete requests when using the host software
> queue.
> 
> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/mmc/host/sdhci.c |    2 +-
>  drivers/mmc/host/sdhci.h |    1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index 850241f..4bef066 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -3035,7 +3035,7 @@ static inline bool sdhci_defer_done(struct sdhci_host *host,
>  {
>  	struct mmc_data *data = mrq->data;
>  
> -	return host->pending_reset ||
> +	return host->pending_reset || host->always_defer_done ||
>  	       ((host->flags & SDHCI_REQ_USE_DMA) && data &&
>  		data->host_cookie == COOKIE_MAPPED);
>  }
> diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
> index d89cdb9..a73ce89 100644
> --- a/drivers/mmc/host/sdhci.h
> +++ b/drivers/mmc/host/sdhci.h
> @@ -533,6 +533,7 @@ struct sdhci_host {
>  	bool pending_reset;	/* Cmd/data reset is pending */
>  	bool irq_wake_enabled;	/* IRQ wakeup is enabled */
>  	bool v4_mode;		/* Host Version 4 Enable */
> +	bool always_defer_done;	/* Always defer to complete requests */
>  
>  	struct mmc_request *mrqs_done[SDHCI_MAX_MRQS];	/* Requests done */
>  	struct mmc_command *cmd;	/* Current command */
> 
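
The effect of the new flag on the deferral decision can be checked with a
small stand-alone model. This is a sketch only: the struct and function
names are hypothetical, and the DMA and cookie test from the quoted diff
is collapsed into two booleans.

```c
#include <stdbool.h>

/* Hypothetical subset of the host state used by the deferral check. */
struct sketch_host_flags {
	bool pending_reset;	/* a cmd/data reset is pending */
	bool always_defer_done;	/* the flag added by this patch */
	bool use_dma;		/* stands in for SDHCI_REQ_USE_DMA */
};

/*
 * Same boolean structure as sdhci_defer_done() after the patch;
 * has_mapped_data stands in for "data && host_cookie == COOKIE_MAPPED".
 */
static bool sketch_defer_done(const struct sketch_host_flags *host,
			      bool has_mapped_data)
{
	return host->pending_reset || host->always_defer_done ||
	       (host->use_dma && has_mapped_data);
}
```

With always_defer_done set, completion is deferred for every request,
whatever the reset and DMA state.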

* Re: [PATCH v7 1/4] mmc: Add MMC host software queue support
  2019-11-22 10:42     ` Baolin Wang
@ 2019-12-09  9:07       ` Baolin Wang
       [not found]         ` <CAMz4kuJ2q_=kEcpz2+GJANdPm5DmwWMLbqBmZHGgtBiEhNFqzw@mail.gmail.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Baolin Wang @ 2019-12-09  9:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Adrian Hunter, Ulf Hansson, asutoshd, Orson Zhai, Lyra Zhang,
	Linus Walleij, Vincent Guittot, Baolin Wang, linux-mmc,
	linux-kernel, Paolo Valente

Hi Ulf,

On Fri, Nov 22, 2019 at 6:42 PM Baolin Wang <baolin.wang7@gmail.com> wrote:
>
> Hi Arnd,
>
> On Fri, Nov 22, 2019 at 6:32 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Mon, Nov 18, 2019 at 11:43 AM Baolin Wang <baolin.wang7@gmail.com> wrote:
> > >
> > > From: Baolin Wang <baolin.wang@linaro.org>
> > >
> > > Now the MMC read/write stack will always wait for previous request is
> > > completed by mmc_blk_rw_wait(), before sending a new request to hardware,
> > > or queue a work to complete request, that will bring context switching
> > > overhead, especially for high I/O per second rates, to affect the IO
> > > performance.
> > >
> > > Thus this patch introduces MMC software queue interface based on the
> > > hardware command queue engine's interfaces, which is similar with the
> > > hardware command queue engine's idea, that can remove the context
> > > switching. Moreover we set the default queue depth as 32 for software
> > > queue, which allows more requests to be prepared, merged and inserted
> > > into IO scheduler to improve performance, but we only allow 2 requests
> > > in flight, that is enough to let the irq handler always trigger the
> > > next request without a context switch, as well as avoiding a long latency.
> > >
> > > From the fio testing data in cover letter, we can see the software
> > > queue can improve some performance with 4K block size, increasing
> > > about 16% for random read, increasing about 90% for random write,
> > > though no obvious improvement for sequential read and write.
> > >
> > > Moreover we can expand the software queue interface to support MMC
> > > packed request or packed command in future.
> > >
> > > Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
> > > Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
> >
> > Overall, this looks like enough of a win that I think we should just
> > use the current version for the moment, while still working on all the
> > other improvements.
> >
> > My biggest concern is the naming of "software queue", which is
> > a concept that runs against the idea of doing all the heavy lifting,
> > in particular the queueing in bfq.
> >
> > Then again, it does not /actually/ do much queuing at all, beyond
> > preparing a single request so it can fire it off early. Even with the
> > packed command support added in, there is not really any queuing
> > beyond what it has to do anyway.
>
> Yes, but I have not found a better name so far, and 'software queue'
> was suggested by Adrian.
>
> >
> > Using the infrastructure that was added for cqe seems like a good
> > compromise, as this already has a way to hand down multiple
> > requests to the hardware and is overall more modern than the
> > existing support.
> >
> > I still think we should do all the other things I mentioned in my
> > earlier reply today, but they can be done as add-ons:
> >
> > - remove all blocking calls from the queue_rq() function:
> >   partition-change, retune, etc should become non-blocking
> >   operations that return busy in the queue_rq function.
> >
> > - get bfq to send down multiple requests all the way into
> >   the device driver, so we don't have to actually queue them
> >   here at all to do packed commands
> >
> > - add packed command support
> >
> > - submit cmds from hardirq context if this is advantageous,
> >   and move everything else in the irq handler into irqthread
> >   context in order to remove all other workqueue and softirq
> >   processing from the request processing path.
> >
> > If we can agree on this as the rough plan for the future,
> > feel free to add my
>
> Yes, I agree with your plan. That's what we should do in the future.
>
> >
> > Reviewed-by: Arnd Bergmann <arnd@arndb.de>
>
> Thanks for your review and good suggestions.
>
> Ulf,
>
> I am not sure whether there is any chance to merge this patch set into
> v5.5. I've tested it for a long time and did not find any regressions.
> Thanks.

Could you apply this patch set if there is no objection from your side?
Or do you need me to rebase and resend? Thanks.

* Re: [PATCH v7 1/4] mmc: Add MMC host software queue support
       [not found]                   ` <CAMz4kuKe+Xg=-N2e7V0_GBcddKzfRkt7zRG_j-vjGyFvkXcTMA@mail.gmail.com>
@ 2019-12-19 15:21                     ` Ulf Hansson
  2019-12-20  3:50                       ` (Exiting) Baolin Wang
  0 siblings, 1 reply; 14+ messages in thread
From: Ulf Hansson @ 2019-12-19 15:21 UTC (permalink / raw)
  To: (Exiting) Baolin Wang; +Cc: linux-mmc

+ linux-mmc

On Thu, 19 Dec 2019 at 08:26, (Exiting) Baolin Wang
<baolin.wang@linaro.org> wrote:
>
> Hi Ulf,
>
> On Fri, 13 Dec 2019 at 09:53, (Exiting) Baolin Wang
> <baolin.wang@linaro.org> wrote:
> >
> > Hi Ulf,
> >
> > On Thu, 12 Dec 2019 at 23:30, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > >
> > > [...]
> > >
> > > > > > > > Ulf,
> > > > > > > >
> > > > > > > > I am not sure if there is any chance to merge this patch set into
> > > > > > > > > V5.5, I've tested for a long time and did not find any regression.
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > Could you apply this patchset if no objection from your side? Or do
> > > > > > > you need me to rebase and resend? Thanks.
> > > > > >
> > > > > > Sorry for troubling you in this way. Just want to make sure you did
> > > > > > not miss my V7 patchset for the MMC software queue, since it was
> > > > > > pending for a while, and I got a consensus with Arnd and Adrian
> > > > > > finally. Could you apply them if no objection from your side? As we
> > > > > > talked before, there are some packed request support patches will
> > > > > > depend on the MMC software queue. Thanks a lot.
> > > > >
> > > > > Thanks for reminding me! Apologize for the delays, just been too busy!
> > > >
> > > > No worries, I understood :)
> > > >
> > > > >
> > > > > Sounds promising! Let me have a closer look, by the end of this week.
> > > >
> > > > OK. Thank you very much.
> > >
> > > Baolin, I am looking at your series, but I need some more time. Yes,
> > > even more, sorry.
> > >
> > > I am out most of tomorrow and the entire weekend, so it seems like I
> > > will have to continue reviewing on Monday.
> >
> > Thanks for letting me know the patches' status. OK, no problem.
> >
>
> Apologize for reminding you again. :)
>
> I know next week will be your holiday, not sure if this patch set will
> be still pending for another long time. And the idea of the solution
> was discussed with Arnd and you, so I thought we all got a consensus
> about how to add the packed request support step by step. Moreover
> this patch set will not impact the normal routine without enabling MMC
> software queue and I already did lots of stable testing (including
> request handling, tuning and recovering).

Did you test system suspend/resume while also having ongoing
file/dd operations towards the mmc/sd card?

In any case, I am aware of the consensus - it looks promising. More
importantly, I appreciate the work you are doing here. Don't get me
wrong on that, even if I am causing these long and unacceptable delays
- sorry about that!

I have spent most of my reviewing time this week, looking at your
series, but it's not a trivial review and I want to take my time to
review it thoroughly. And for sure, I fully respect the reviews that
Arnd and Adrian have already made.

That said, I am sorry to disappoint you, but I am just not ready to
apply it yet.

In regards to the holidays, don't worry, I will be working. Well,
except for those days that are public holidays in Sweden. :-)

>
> We really want to use the packed request support with adding ADMA3
> transfer mode to improve the IO performance on our platform ASAP, and
> I think we still have more work and potential discussion to add the
> packed request support (maybe need optimize the blk-mq to support
> batch requests handling), but as we discussed before, we should
> introduce the MMC software queue first, then I can move on to the next
> step. I am sorry again to troubling you.

To help you out a bit in regards to this, I have hosted a separate
branch in my git tree that has the series applied (based on today's
"next" branch). The branch is named "next_host_sq". I may decide to
merge it to next, to get some test coverage, but let's see about that.

In any case, feel free to base ADMA3 (packed command) support on the
new branch, at least for now.

Kind regards
Uffe

* Re: [PATCH v7 1/4] mmc: Add MMC host software queue support
  2019-12-19 15:21                     ` Ulf Hansson
@ 2019-12-20  3:50                       ` (Exiting) Baolin Wang
  0 siblings, 0 replies; 14+ messages in thread
From: (Exiting) Baolin Wang @ 2019-12-20  3:50 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: linux-mmc

Hi Ulf,

On Thu, 19 Dec 2019 at 23:22, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> + linux-mmc
>
> On Thu, 19 Dec 2019 at 08:26, (Exiting) Baolin Wang
> <baolin.wang@linaro.org> wrote:
> >
> > Hi Ulf,
> >
> > On Fri, 13 Dec 2019 at 09:53, (Exiting) Baolin Wang
> > <baolin.wang@linaro.org> wrote:
> > >
> > > Hi Ulf,
> > >
> > > On Thu, 12 Dec 2019 at 23:30, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > >
> > > > [...]
> > > >
> > > > > > > > > Ulf,
> > > > > > > > >
> > > > > > > > > I am not sure if there is any chance to merge this patch set into
> > > > > > > > > > V5.5, I've tested for a long time and did not find any regression.
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Could you apply this patchset if no objection from your side? Or do
> > > > > > > > you need me to rebase and resend? Thanks.
> > > > > > >
> > > > > > > Sorry for troubling you in this way. Just want to make sure you did
> > > > > > > not miss my V7 patchset for the MMC software queue, since it was
> > > > > > > pending for a while, and I got a consensus with Arnd and Adrian
> > > > > > > finally. Could you apply them if no objection from your side? As we
> > > > > > > talked before, there are some packed request support patches will
> > > > > > > depend on the MMC software queue. Thanks a lot.
> > > > > >
> > > > > > Thanks for reminding me! Apologize for the delays, just been too busy!
> > > > >
> > > > > No worries, I understood :)
> > > > >
> > > > > >
> > > > > > Sounds promising! Let me have a closer look, by the end of this week.
> > > > >
> > > > > OK. Thank you very much.
> > > >
> > > > Baolin, I am looking at your series, but I need some more time. Yes,
> > > > even more, sorry.
> > > >
> > > > I am out most of tomorrow and the entire weekend, so it seems like I
> > > > will have to continue reviewing on Monday.
> > >
> > > Thanks for letting me know the patches' status. OK, no problem.
> > >
> >
> > Apologize for reminding you again. :)
> >
> > I know next week will be your holiday, not sure if this patch set will
> > be still pending for another long time. And the idea of the solution
> > was discussed with Arnd and you, so I thought we all got a consensus
> > about how to add the packed request support step by step. Moreover
> > this patch set will not impact the normal routine without enabling MMC
> > software queue and I already did lots of stable testing (including
> > request handling, tuning and recovering).
>
> Did you test system suspend/resume while also having ongoing
> file/dd operations towards the mmc/sd card?

Yes, I did, and it works.

>
> In any case, I am aware of the consensus - it looks promising. More
> importantly, I appreciate the work you are doing here. Don't get me
> wrong on that, even if I am causing these long and unacceptable delays
> - sorry about that!
>
> I have spent most of my reviewing time this week, looking at your
> series, but it's not a trivial review and I want to take my time to

Thanks for spending time on this patch set.

> review it thoroughly. And for sure, I fully respect the reviews that
> Arnd and Adrian have already made.
>
> That said, I am sorry to disappoint you, but I am just not ready to
> apply it yet.
>
> In regards to the holidays, don't worry, I will be working. Well,
> except for those days that are public holidays in Sweden. :-)
>
> >
> > We really want to use the packed request support with adding ADMA3
> > transfer mode to improve the IO performance on our platform ASAP, and
> > I think we still have more work and potential discussion to add the
> > packed request support (maybe need optimize the blk-mq to support
> > batch requests handling), but as we discussed before, we should
> > introduce the MMC software queue first, then I can move on to the next
> > step. I am sorry again to troubling you.
>
> To help you out a bit in regards to this, I have hosted a separate
> branch in my git tree that has the series applied (based on today's
> "next" branch). The branch is named "next_host_sq". I may decide to
> merge it to next, to get some test coverage, but let's see about that.
>
> In any case, feel free to base ADMA3 (packed command) support on the
> new branch, at least for now.

Great, thanks for your help.


--
Baolin Wang
Best Regards

* Re: [PATCH v7 1/4] mmc: Add MMC host software queue support
  2019-11-18 10:43 ` [PATCH v7 1/4] mmc: Add MMC host " Baolin Wang
  2019-11-22 10:32   ` Arnd Bergmann
@ 2020-01-21 11:52   ` Ulf Hansson
  2020-01-27 10:23     ` (Exiting) Baolin Wang
  1 sibling, 1 reply; 14+ messages in thread
From: Ulf Hansson @ 2020-01-21 11:52 UTC (permalink / raw)
  To: Baolin Wang
  Cc: Adrian Hunter, Asutosh Das, Orson Zhai, Chunyan Zhang,
	Arnd Bergmann, Linus Walleij, Vincent Guittot, Baolin Wang,
	linux-mmc, Linux Kernel Mailing List

On Mon, 18 Nov 2019 at 11:43, Baolin Wang <baolin.wang7@gmail.com> wrote:
>
> From: Baolin Wang <baolin.wang@linaro.org>
>
> Now the MMC read/write stack will always wait for previous request is
> completed by mmc_blk_rw_wait(), before sending a new request to hardware,
> or queue a work to complete request, that will bring context switching
> overhead, especially for high I/O per second rates, to affect the IO
> performance.
>
> Thus this patch introduces MMC software queue interface based on the
> hardware command queue engine's interfaces, which is similar with the
> hardware command queue engine's idea, that can remove the context
> switching. Moreover we set the default queue depth as 32 for software
> queue, which allows more requests to be prepared, merged and inserted
> into IO scheduler to improve performance, but we only allow 2 requests
> in flight, that is enough to let the irq handler always trigger the
> next request without a context switch, as well as avoiding a long latency.
>
> From the fio testing data in cover letter, we can see the software
> queue can improve some performance with 4K block size, increasing
> about 16% for random read, increasing about 90% for random write,
> though no obvious improvement for sequential read and write.
>
> Moreover we can expand the software queue interface to support MMC
> packed request or packed command in future.
>
> Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
> Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
> ---
>  drivers/mmc/core/block.c   |   61 ++++++++
>  drivers/mmc/core/mmc.c     |   13 +-
>  drivers/mmc/core/queue.c   |   33 ++++-
>  drivers/mmc/host/Kconfig   |    7 +
>  drivers/mmc/host/Makefile  |    1 +
>  drivers/mmc/host/mmc_hsq.c |  344 ++++++++++++++++++++++++++++++++++++++++++++
>  drivers/mmc/host/mmc_hsq.h |   30 ++++
>  include/linux/mmc/host.h   |    3 +
>  8 files changed, 482 insertions(+), 10 deletions(-)
>  create mode 100644 drivers/mmc/host/mmc_hsq.c
>  create mode 100644 drivers/mmc/host/mmc_hsq.h
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index 2c71a43..870462c 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -168,6 +168,11 @@ struct mmc_rpmb_data {
>
>  static inline int mmc_blk_part_switch(struct mmc_card *card,
>                                       unsigned int part_type);
> +static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
> +                              struct mmc_card *card,
> +                              int disable_multi,
> +                              struct mmc_queue *mq);
> +static void mmc_blk_swq_req_done(struct mmc_request *mrq);

We have debated whether swq ("software queue") is a good name, and
just to confirm, I don't have a much better suggestion either.

However, I do like the name of the new host interface, host
software queue, "hsq". That makes sense to me.

One option, to possibly make the core code more aligned to the hsq
interface, could be to stick with the "hsq" acronym for the core layer
as well.

In other words, the above function (and its new friends introduced in
the series) could be renamed to mmc_blk_hsq_req_done(). What do you
think about that?

>
>  static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
>  {
> @@ -1569,9 +1574,30 @@ static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
>         return mmc_blk_cqe_start_req(mq->card->host, mrq);
>  }
>
> +static int mmc_blk_swq_issue_rw_rq(struct mmc_queue *mq, struct request *req)
> +{
> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> +       struct mmc_host *host = mq->card->host;
> +       int err;
> +
> +       mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
> +       mqrq->brq.mrq.done = mmc_blk_swq_req_done;
> +       mmc_pre_req(host, &mqrq->brq.mrq);
> +
> +       err = mmc_cqe_start_req(host, &mqrq->brq.mrq);
> +       if (err)
> +               mmc_post_req(host, &mqrq->brq.mrq, err);
> +
> +       return err;
> +}
> +
>  static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
>  {
>         struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> +       struct mmc_host *host = mq->card->host;
> +
> +       if (host->swq_enabled)

If we switch to use "hsq", this would then be "hsq_enabled".

> +               return mmc_blk_swq_issue_rw_rq(mq, req);
>
>         mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
>
> @@ -1957,6 +1983,41 @@ static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
>                 mmc_run_bkops(mq->card);
>  }
>
> +static void mmc_blk_swq_req_done(struct mmc_request *mrq)
> +{
> +       struct mmc_queue_req *mqrq =
> +               container_of(mrq, struct mmc_queue_req, brq.mrq);
> +       struct request *req = mmc_queue_req_to_req(mqrq);
> +       struct request_queue *q = req->q;
> +       struct mmc_queue *mq = q->queuedata;
> +       struct mmc_host *host = mq->card->host;
> +       unsigned long flags;
> +
> +       if (mmc_blk_rq_error(&mqrq->brq) ||
> +           mmc_blk_urgent_bkops_needed(mq, mqrq)) {
> +               spin_lock_irqsave(&mq->lock, flags);
> +               mq->recovery_needed = true;
> +               mq->recovery_req = req;
> +               spin_unlock_irqrestore(&mq->lock, flags);
> +
> +               host->cqe_ops->cqe_recovery_start(host);
> +
> +               schedule_work(&mq->recovery_work);
> +               return;
> +       }
> +
> +       mmc_blk_rw_reset_success(mq, req);
> +
> +       /*
> +        * Block layer timeouts race with completions which means the normal
> +        * completion path cannot be used during recovery.
> +        */
> +       if (mq->in_recovery)
> +               mmc_blk_cqe_complete_rq(mq, req);
> +       else
> +               blk_mq_complete_request(req);
> +}
> +
>  void mmc_blk_mq_complete(struct request *req)
>  {
>         struct mmc_queue *mq = req->q->queuedata;
> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> index c880489..8eac1a2 100644
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -1852,15 +1852,22 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
>          */
>         card->reenable_cmdq = card->ext_csd.cmdq_en;
>
> -       if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
> +       if (host->cqe_ops && !host->cqe_enabled) {

This doesn't look entirely correct to me, as it means enabling the CQE
hardware for hosts with MMC_CAP2_CQE set, regardless of whether the
eMMC card really supports CMDQ (or whether we failed to enable CMDQ for
the card). More comments below.

>                 err = host->cqe_ops->cqe_enable(host, card);
>                 if (err) {
>                         pr_err("%s: Failed to enable CQE, error %d\n",
>                                 mmc_hostname(host), err);
>                 } else {
>                         host->cqe_enabled = true;
> -                       pr_info("%s: Command Queue Engine enabled\n",
> -                               mmc_hostname(host));
> +
> +                       if (card->ext_csd.cmdq_en) {
> +                               pr_info("%s: Command Queue Engine enabled\n",
> +                                       mmc_hostname(host));
> +                       } else {
> +                               host->swq_enabled = true;
> +                               pr_info("%s: Software Queue enabled\n",
> +                                       mmc_hostname(host));

A few questions around the above code.

1.
Let's assume the host supports MMC_CAP2_CQE, but the eMMC card doesn't
support CMDQ.

In this case, we still want to allow the host to use the software
variant (the hsq) of the interface. In principle that is what the code
above already tries to implement, but then you also need to update the
support in drivers/mmc/host/cqhci.[ch] to support that dynamically.
For example, the ->cqe_enable() callback should check
"card->ext_csd.cmdq_en" and adjust its behavior accordingly, depending
on whether the flag has been set.

2.
I also notice that you are enabling the hsq path solely for eMMC
cards. I am guessing hsq would be beneficial for SD cards as well,
don't you think?

Of course, I am fine with enabling that in a step-by-step approach, so
there is no need to add it as part of the $subject patch. Please at
least make it part of the series, though.
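
To make point 1 concrete, here is a minimal standalone sketch of the decision being asked for. It is not the cqhci code: `sketch_card`, `sketch_queue_mode` and `sketch_pick_mode()` are hypothetical stand-ins, and only the `cmdq_en` flag corresponds to something in the patch (`card->ext_csd.cmdq_en`).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the card state. */
struct sketch_card {
	bool cmdq_en;		/* mirrors card->ext_csd.cmdq_en */
};

enum sketch_queue_mode {
	SKETCH_MODE_NONE,	/* enable failed: stay on the legacy path */
	SKETCH_MODE_HW_CQE,	/* card has CMDQ enabled: use the hardware engine */
	SKETCH_MODE_HSQ,	/* card lacks CMDQ: use the host software queue */
};

/*
 * Pick the queue flavour based on whether the card really has CMDQ
 * enabled. enable_err models the return value of the ->cqe_enable()
 * callback (0 on success).
 */
static enum sketch_queue_mode sketch_pick_mode(const struct sketch_card *card,
					       int enable_err)
{
	if (enable_err)
		return SKETCH_MODE_NONE;

	return card->cmdq_en ? SKETCH_MODE_HW_CQE : SKETCH_MODE_HSQ;
}
```

The point of the sketch is only that the hardware engine is never selected for a card without CMDQ enabled; how a real host driver would act on each mode is left open.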

> +                       }
>                 }
>         }
>
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 9edc086..d9086c1 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -62,7 +62,7 @@ enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
>  {
>         struct mmc_host *host = mq->card->host;
>
> -       if (mq->use_cqe)
> +       if (mq->use_cqe && !host->swq_enabled)
>                 return mmc_cqe_issue_type(host, req);
>
>         if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
> @@ -124,12 +124,14 @@ static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
>  {
>         struct request_queue *q = req->q;
>         struct mmc_queue *mq = q->queuedata;
> +       struct mmc_card *card = mq->card;
> +       struct mmc_host *host = card->host;
>         unsigned long flags;
>         int ret;
>
>         spin_lock_irqsave(&mq->lock, flags);
>
> -       if (mq->recovery_needed || !mq->use_cqe)
> +       if (mq->recovery_needed || !mq->use_cqe || host->swq_enabled)
>                 ret = BLK_EH_RESET_TIMER;
>         else
>                 ret = mmc_cqe_timed_out(req);
> @@ -144,12 +146,13 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
>         struct mmc_queue *mq = container_of(work, struct mmc_queue,
>                                             recovery_work);
>         struct request_queue *q = mq->queue;
> +       struct mmc_host *host = mq->card->host;
>
>         mmc_get_card(mq->card, &mq->ctx);
>
>         mq->in_recovery = true;
>
> -       if (mq->use_cqe)
> +       if (mq->use_cqe && !host->swq_enabled)
>                 mmc_blk_cqe_recovery(mq);
>         else
>                 mmc_blk_mq_recovery(mq);
> @@ -160,6 +163,9 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
>         mq->recovery_needed = false;
>         spin_unlock_irq(&mq->lock);
>
> +       if (host->swq_enabled)
> +               host->cqe_ops->cqe_recovery_finish(host);
> +
>         mmc_put_card(mq->card, &mq->ctx);
>
>         blk_mq_run_hw_queues(q, true);
> @@ -279,6 +285,14 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>                 }
>                 break;
>         case MMC_ISSUE_ASYNC:
> +               /*
> +                * For MMC host software queue, we only allow 2 requests in
> +                * flight to avoid a long latency.
> +                */
> +               if (host->swq_enabled && mq->in_flight[issue_type] > 2) {
> +                       spin_unlock_irq(&mq->lock);
> +                       return BLK_STS_RESOURCE;
> +               }
>                 break;
>         default:
>                 /*
> @@ -430,11 +444,16 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card)
>          * The queue depth for CQE must match the hardware because the request
>          * tag is used to index the hardware queue.
>          */
> -       if (mq->use_cqe)
> -               mq->tag_set.queue_depth =
> -                       min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
> -       else
> +       if (mq->use_cqe) {
> +               if (host->swq_enabled)
> +                       mq->tag_set.queue_depth = host->cqe_qdepth;

I don't think we need to treat the hsq as a special case with regard to
the .queue_depth.

It should be fine to use the default MMC_QUEUE_DEPTH (64), don't you think?
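
A minimal standalone sketch of what that could look like follows. The names are hypothetical, and it assumes the hsq slot array would be sized to match the larger depth, since this patch indexes hsq->slot[] by request tag with HSQ_NUM_SLOTS set to 32.

```c
#include <stdbool.h>

#define SKETCH_MMC_QUEUE_DEPTH	64	/* default depth mentioned above */

/* Hypothetical flattened view of the inputs the depth choice depends on. */
struct sketch_depth_inputs {
	bool use_cqe;
	bool hsq_enabled;
	int card_cmdq_depth;	/* mirrors card->ext_csd.cmdq_depth */
	int host_cqe_qdepth;	/* mirrors host->cqe_qdepth */
};

static int sketch_queue_depth(const struct sketch_depth_inputs *in)
{
	/* Only a real hardware CQE needs tags to match hardware slots. */
	if (in->use_cqe && !in->hsq_enabled)
		return in->card_cmdq_depth < in->host_cqe_qdepth ?
		       in->card_cmdq_depth : in->host_cqe_qdepth;

	/* hsq and the legacy path both take the default depth. */
	return SKETCH_MMC_QUEUE_DEPTH;
}
```

With this shape, hsq needs no branch of its own: it falls through to the same default as the non-CQE path.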

> +               else
> +                       mq->tag_set.queue_depth =
> +                               min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
> +       } else {
>                 mq->tag_set.queue_depth = MMC_QUEUE_DEPTH;
> +       }
> +
>         mq->tag_set.numa_node = NUMA_NO_NODE;
>         mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
>         mq->tag_set.nr_hw_queues = 1;
> diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> index 49ea02c..efa4019 100644
> --- a/drivers/mmc/host/Kconfig
> +++ b/drivers/mmc/host/Kconfig
> @@ -936,6 +936,13 @@ config MMC_CQHCI
>
>           If unsure, say N.
>
> +config MMC_HSQ
> +       tristate "MMC Host Software Queue support"
> +       help
> +         This selects the Software Queue support.
> +
> +         If unsure, say N.
> +
>  config MMC_TOSHIBA_PCI
>         tristate "Toshiba Type A SD/MMC Card Interface Driver"
>         depends on PCI
> diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
> index 11c4598..c14b439 100644
> --- a/drivers/mmc/host/Makefile
> +++ b/drivers/mmc/host/Makefile
> @@ -98,6 +98,7 @@ obj-$(CONFIG_MMC_SDHCI_BRCMSTB)               += sdhci-brcmstb.o
>  obj-$(CONFIG_MMC_SDHCI_OMAP)           += sdhci-omap.o
>  obj-$(CONFIG_MMC_SDHCI_SPRD)           += sdhci-sprd.o
>  obj-$(CONFIG_MMC_CQHCI)                        += cqhci.o
> +obj-$(CONFIG_MMC_HSQ)                  += mmc_hsq.o
>
>  ifeq ($(CONFIG_CB710_DEBUG),y)
>         CFLAGS-cb710-mmc        += -DDEBUG
> diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
> new file mode 100644
> index 0000000..f5a4f93
> --- /dev/null
> +++ b/drivers/mmc/host/mmc_hsq.c
> @@ -0,0 +1,344 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * MMC software queue support based on command queue interfaces
> + *
> + * Copyright (C) 2019 Linaro, Inc.
> + * Author: Baolin Wang <baolin.wang@linaro.org>
> + */
> +
> +#include <linux/mmc/card.h>
> +#include <linux/mmc/host.h>
> +
> +#include "mmc_hsq.h"
> +
> +#define HSQ_NUM_SLOTS  32
> +#define HSQ_INVALID_TAG        HSQ_NUM_SLOTS
> +
> +static void mmc_hsq_pump_requests(struct mmc_hsq *hsq)
> +{
> +       struct mmc_host *mmc = hsq->mmc;
> +       struct hsq_slot *slot;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&hsq->lock, flags);
> +
> +       /* Make sure we are not already running a request now */
> +       if (hsq->mrq) {
> +               spin_unlock_irqrestore(&hsq->lock, flags);
> +               return;
> +       }
> +
> +       /* Make sure there are remaining requests to pump */
> +       if (!hsq->qcnt || !hsq->enabled) {
> +               spin_unlock_irqrestore(&hsq->lock, flags);
> +               return;
> +       }
> +
> +       slot = &hsq->slot[hsq->next_tag];
> +       hsq->mrq = slot->mrq;
> +       hsq->qcnt--;
> +
> +       spin_unlock_irqrestore(&hsq->lock, flags);
> +
> +       mmc->ops->request(mmc, hsq->mrq);
> +}
> +
> +static void mmc_hsq_update_next_tag(struct mmc_hsq *hsq, int remains)
> +{
> +       struct hsq_slot *slot;
> +       int tag;
> +
> +       /*
> +        * If there are no remaining requests in software queue, then set an invalid
> +        * tag.
> +        */
> +       if (!remains) {
> +               hsq->next_tag = HSQ_INVALID_TAG;
> +               return;
> +       }
> +
> +       /*
> +        * Increasing the next tag and check if the corresponding request is
> +        * available, if yes, then we found a candidate request.
> +        */
> +       if (++hsq->next_tag != HSQ_INVALID_TAG) {
> +               slot = &hsq->slot[hsq->next_tag];
> +               if (slot->mrq)
> +                       return;
> +       }
> +
> +       /* Otherwise we should iterate all slots to find an available tag. */
> +       for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
> +               slot = &hsq->slot[tag];
> +               if (slot->mrq)
> +                       break;
> +       }
> +
> +       if (tag == HSQ_NUM_SLOTS)
> +               tag = HSQ_INVALID_TAG;
> +
> +       hsq->next_tag = tag;
> +}
> +
> +static void mmc_hsq_post_request(struct mmc_hsq *hsq)
> +{
> +       unsigned long flags;
> +       int remains;
> +
> +       spin_lock_irqsave(&hsq->lock, flags);
> +
> +       remains = hsq->qcnt;
> +       hsq->mrq = NULL;
> +
> +       /* Update the next available tag to be queued. */
> +       mmc_hsq_update_next_tag(hsq, remains);
> +
> +       if (hsq->waiting_for_idle && !remains) {
> +               hsq->waiting_for_idle = false;
> +               wake_up(&hsq->wait_queue);
> +       }
> +
> +       /* Do not pump new request in recovery mode. */
> +       if (hsq->recovery_halt) {
> +               spin_unlock_irqrestore(&hsq->lock, flags);
> +               return;
> +       }
> +
> +       spin_unlock_irqrestore(&hsq->lock, flags);
> +
> +        /*
> +         * Try to pump new request to host controller as fast as possible,
> +         * after completing previous request.
> +         */
> +       if (remains > 0)
> +               mmc_hsq_pump_requests(hsq);
> +}
> +
> +/**
> + * mmc_hsq_finalize_request - finalize one request if the request is done
> + * @mmc: the host controller
> + * @mrq: the request need to be finalized
> + *
> + * Return true if we finalized the corresponding request in software queue,
> + * otherwise return false.
> + */
> +bool mmc_hsq_finalize_request(struct mmc_host *mmc, struct mmc_request *mrq)
> +{
> +       struct mmc_hsq *hsq = mmc->cqe_private;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&hsq->lock, flags);
> +
> +       if (!hsq->enabled || !hsq->mrq || hsq->mrq != mrq) {
> +               spin_unlock_irqrestore(&hsq->lock, flags);
> +               return false;
> +       }
> +
> +       /*
> +        * Clear current completed slot request to make a room for new request.
> +        */
> +       hsq->slot[hsq->next_tag].mrq = NULL;
> +
> +       spin_unlock_irqrestore(&hsq->lock, flags);
> +
> +       mmc_cqe_request_done(mmc, hsq->mrq);
> +
> +       mmc_hsq_post_request(hsq);
> +
> +       return true;
> +}
> +EXPORT_SYMBOL_GPL(mmc_hsq_finalize_request);
> +
> +static void mmc_hsq_recovery_start(struct mmc_host *mmc)
> +{
> +       struct mmc_hsq *hsq = mmc->cqe_private;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&hsq->lock, flags);
> +
> +       hsq->recovery_halt = true;
> +
> +       spin_unlock_irqrestore(&hsq->lock, flags);
> +}
> +
> +static void mmc_hsq_recovery_finish(struct mmc_host *mmc)
> +{
> +       struct mmc_hsq *hsq = mmc->cqe_private;
> +       int remains;
> +
> +       spin_lock_irq(&hsq->lock);
> +
> +       hsq->recovery_halt = false;
> +       remains = hsq->qcnt;
> +
> +       spin_unlock_irq(&hsq->lock);
> +
> +       /*
> +        * Try to pump new request if there are request pending in software
> +        * queue after finishing recovery.
> +        */
> +       if (remains > 0)
> +               mmc_hsq_pump_requests(hsq);
> +}
> +
> +static int mmc_hsq_request(struct mmc_host *mmc, struct mmc_request *mrq)
> +{
> +       struct mmc_hsq *hsq = mmc->cqe_private;
> +       int tag = mrq->tag;
> +
> +       spin_lock_irq(&hsq->lock);
> +
> +       if (!hsq->enabled) {
> +               spin_unlock_irq(&hsq->lock);
> +               return -ESHUTDOWN;
> +       }
> +
> +       /* Do not queue any new requests in recovery mode. */
> +       if (hsq->recovery_halt) {
> +               spin_unlock_irq(&hsq->lock);
> +               return -EBUSY;
> +       }
> +
> +       hsq->slot[tag].mrq = mrq;
> +
> +       /*
> +        * Set the next tag as current request tag if no available
> +        * next tag.
> +        */
> +       if (hsq->next_tag == HSQ_INVALID_TAG)
> +               hsq->next_tag = tag;
> +
> +       hsq->qcnt++;
> +
> +       spin_unlock_irq(&hsq->lock);
> +
> +       mmc_hsq_pump_requests(hsq);
> +
> +       return 0;
> +}
> +
> +static void mmc_hsq_post_req(struct mmc_host *mmc, struct mmc_request *mrq)
> +{
> +       if (mmc->ops->post_req)
> +               mmc->ops->post_req(mmc, mrq, 0);
> +}
> +
> +static bool mmc_hsq_queue_is_idle(struct mmc_hsq *hsq, int *ret)
> +{
> +       bool is_idle;
> +
> +       spin_lock_irq(&hsq->lock);
> +
> +       is_idle = (!hsq->mrq && !hsq->qcnt) ||
> +               hsq->recovery_halt;
> +
> +       *ret = hsq->recovery_halt ? -EBUSY : 0;
> +       hsq->waiting_for_idle = !is_idle;
> +
> +       spin_unlock_irq(&hsq->lock);
> +
> +       return is_idle;
> +}
> +
> +static int mmc_hsq_wait_for_idle(struct mmc_host *mmc)
> +{
> +       struct mmc_hsq *hsq = mmc->cqe_private;
> +       int ret;
> +
> +       wait_event(hsq->wait_queue,
> +                  mmc_hsq_queue_is_idle(hsq, &ret));
> +
> +       return ret;
> +}
> +
> +static void mmc_hsq_disable(struct mmc_host *mmc)
> +{
> +       struct mmc_hsq *hsq = mmc->cqe_private;
> +       u32 timeout = 500;
> +       int ret;
> +
> +       spin_lock_irq(&hsq->lock);
> +
> +       if (!hsq->enabled) {
> +               spin_unlock_irq(&hsq->lock);
> +               return;
> +       }
> +
> +       spin_unlock_irq(&hsq->lock);
> +
> +       ret = wait_event_timeout(hsq->wait_queue,
> +                                mmc_hsq_queue_is_idle(hsq, &ret),
> +                                msecs_to_jiffies(timeout));
> +       if (ret == 0) {
> +               pr_warn("could not stop mmc software queue\n");
> +               return;
> +       }
> +
> +       spin_lock_irq(&hsq->lock);
> +
> +       hsq->enabled = false;
> +
> +       spin_unlock_irq(&hsq->lock);
> +}
> +
> +static int mmc_hsq_enable(struct mmc_host *mmc, struct mmc_card *card)
> +{
> +       struct mmc_hsq *hsq = mmc->cqe_private;
> +
> +       spin_lock_irq(&hsq->lock);
> +
> +       if (hsq->enabled) {
> +               spin_unlock_irq(&hsq->lock);
> +               return -EBUSY;
> +       }
> +
> +       hsq->enabled = true;
> +
> +       spin_unlock_irq(&hsq->lock);
> +
> +       return 0;
> +}
> +
> +static const struct mmc_cqe_ops mmc_hsq_ops = {
> +       .cqe_enable = mmc_hsq_enable,
> +       .cqe_disable = mmc_hsq_disable,
> +       .cqe_request = mmc_hsq_request,
> +       .cqe_post_req = mmc_hsq_post_req,
> +       .cqe_wait_for_idle = mmc_hsq_wait_for_idle,
> +       .cqe_recovery_start = mmc_hsq_recovery_start,
> +       .cqe_recovery_finish = mmc_hsq_recovery_finish,
> +};
> +
> +int mmc_hsq_init(struct mmc_hsq *hsq, struct mmc_host *mmc)
> +{
> +       hsq->num_slots = HSQ_NUM_SLOTS;
> +       hsq->next_tag = HSQ_INVALID_TAG;
> +       mmc->cqe_qdepth = HSQ_NUM_SLOTS;
> +
> +       hsq->slot = devm_kcalloc(mmc_dev(mmc), hsq->num_slots,
> +                                sizeof(struct hsq_slot), GFP_KERNEL);
> +       if (!hsq->slot)
> +               return -ENOMEM;
> +
> +       hsq->mmc = mmc;
> +       hsq->mmc->cqe_private = hsq;
> +       mmc->cqe_ops = &mmc_hsq_ops;
> +
> +       spin_lock_init(&hsq->lock);
> +       init_waitqueue_head(&hsq->wait_queue);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL_GPL(mmc_hsq_init);
> +
> +void mmc_hsq_suspend(struct mmc_host *mmc)
> +{
> +       mmc_hsq_disable(mmc);
> +}
> +EXPORT_SYMBOL_GPL(mmc_hsq_suspend);
> +
> +int mmc_hsq_resume(struct mmc_host *mmc)
> +{
> +       return mmc_hsq_enable(mmc, NULL);
> +}
> +EXPORT_SYMBOL_GPL(mmc_hsq_resume);
> diff --git a/drivers/mmc/host/mmc_hsq.h b/drivers/mmc/host/mmc_hsq.h
> new file mode 100644
> index 0000000..d51beb7
> --- /dev/null
> +++ b/drivers/mmc/host/mmc_hsq.h
> @@ -0,0 +1,30 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#ifndef LINUX_MMC_HSQ_H
> +#define LINUX_MMC_HSQ_H
> +
> +struct hsq_slot {
> +       struct mmc_request *mrq;
> +};
> +
> +struct mmc_hsq {
> +       struct mmc_host *mmc;
> +       struct mmc_request *mrq;
> +       wait_queue_head_t wait_queue;
> +       struct hsq_slot *slot;
> +       spinlock_t lock;
> +
> +       int next_tag;
> +       int num_slots;
> +       int qcnt;
> +
> +       bool enabled;
> +       bool waiting_for_idle;
> +       bool recovery_halt;
> +};
> +
> +int mmc_hsq_init(struct mmc_hsq *hsq, struct mmc_host *mmc);
> +void mmc_hsq_suspend(struct mmc_host *mmc);
> +int mmc_hsq_resume(struct mmc_host *mmc);
> +bool mmc_hsq_finalize_request(struct mmc_host *mmc, struct mmc_request *mrq);
> +
> +#endif
> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> index ba70338..3931aa3 100644
> --- a/include/linux/mmc/host.h
> +++ b/include/linux/mmc/host.h
> @@ -462,6 +462,9 @@ struct mmc_host {
>         bool                    cqe_enabled;
>         bool                    cqe_on;
>
> +       /* Software Queue support */
> +       bool                    swq_enabled;
> +
>         unsigned long           private[0] ____cacheline_aligned;
>  };
>
> --
> 1.7.9.5
>

Other than the above, this indeed looks very promising! I have no
further comments on the rest of the patches in the series.

And again, apologies for the delays!

Kind regards
Uffe

* Re: [PATCH v7 1/4] mmc: Add MMC host software queue support
  2020-01-21 11:52   ` Ulf Hansson
@ 2020-01-27 10:23     ` (Exiting) Baolin Wang
  0 siblings, 0 replies; 14+ messages in thread
From: (Exiting) Baolin Wang @ 2020-01-27 10:23 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Baolin Wang, Adrian Hunter, Asutosh Das, Orson Zhai,
	Chunyan Zhang, Arnd Bergmann, Linus Walleij, Vincent Guittot,
	linux-mmc, Linux Kernel Mailing List

Hi Ulf,

(Sorry for the late reply; I was on holiday.)

On Tue, 21 Jan 2020 at 19:52, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Mon, 18 Nov 2019 at 11:43, Baolin Wang <baolin.wang7@gmail.com> wrote:
> >
> > From: Baolin Wang <baolin.wang@linaro.org>
> >
> > Now the MMC read/write stack will always wait for previous request is
> > completed by mmc_blk_rw_wait(), before sending a new request to hardware,
> > or queue a work to complete request, that will bring context switching
> > overhead, especially for high I/O per second rates, to affect the IO
> > performance.
> >
> > Thus this patch introduces MMC software queue interface based on the
> > hardware command queue engine's interfaces, which is similar with the
> > hardware command queue engine's idea, that can remove the context
> > switching. Moreover we set the default queue depth as 32 for software
> > queue, which allows more requests to be prepared, merged and inserted
> > into IO scheduler to improve performance, but we only allow 2 requests
> > in flight, that is enough to let the irq handler always trigger the
> > next request without a context switch, as well as avoiding a long latency.
> >
> > From the fio testing data in cover letter, we can see the software
> > queue can improve some performance with 4K block size, increasing
> > about 16% for random read, increasing about 90% for random write,
> > though no obvious improvement for sequential read and write.
> >
> > Moreover we can expand the software queue interface to support MMC
> > packed request or packed command in future.
> >
> > Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
> > Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
> > ---
> >  drivers/mmc/core/block.c   |   61 ++++++++
> >  drivers/mmc/core/mmc.c     |   13 +-
> >  drivers/mmc/core/queue.c   |   33 ++++-
> >  drivers/mmc/host/Kconfig   |    7 +
> >  drivers/mmc/host/Makefile  |    1 +
> >  drivers/mmc/host/mmc_hsq.c |  344 ++++++++++++++++++++++++++++++++++++++++++++
> >  drivers/mmc/host/mmc_hsq.h |   30 ++++
> >  include/linux/mmc/host.h   |    3 +
> >  8 files changed, 482 insertions(+), 10 deletions(-)
> >  create mode 100644 drivers/mmc/host/mmc_hsq.c
> >  create mode 100644 drivers/mmc/host/mmc_hsq.h
> >
> > diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> > index 2c71a43..870462c 100644
> > --- a/drivers/mmc/core/block.c
> > +++ b/drivers/mmc/core/block.c
> > @@ -168,6 +168,11 @@ struct mmc_rpmb_data {
> >
> >  static inline int mmc_blk_part_switch(struct mmc_card *card,
> >                                       unsigned int part_type);
> > +static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
> > +                              struct mmc_card *card,
> > +                              int disable_multi,
> > +                              struct mmc_queue *mq);
> > +static void mmc_blk_swq_req_done(struct mmc_request *mrq);
>
> We have debated whether swq ("software queue") is a good name, and
> just to confirm, I don't have a much better suggestion either.
>
> However, I do like the name of the new host interface, host
> software queue, "hsq". That makes sense to me.
>
> One option, to possibly make the core code more aligned to the hsq
> interface, could be to stick with the "hsq" acronym for the core layer
> as well.

Sure.

>
> In other words, the above function (and its new friends introduced in
> the series) could be renamed to mmc_blk_hsq_req_done(). What do you
> think about that?

Yes, totally agree with you. Will change in the next version.

> >
> >  static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
> >  {
> > @@ -1569,9 +1574,30 @@ static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
> >         return mmc_blk_cqe_start_req(mq->card->host, mrq);
> >  }
> >
> > +static int mmc_blk_swq_issue_rw_rq(struct mmc_queue *mq, struct request *req)
> > +{
> > +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> > +       struct mmc_host *host = mq->card->host;
> > +       int err;
> > +
> > +       mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
> > +       mqrq->brq.mrq.done = mmc_blk_swq_req_done;
> > +       mmc_pre_req(host, &mqrq->brq.mrq);
> > +
> > +       err = mmc_cqe_start_req(host, &mqrq->brq.mrq);
> > +       if (err)
> > +               mmc_post_req(host, &mqrq->brq.mrq, err);
> > +
> > +       return err;
> > +}
> > +
> >  static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
> >  {
> >         struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> > +       struct mmc_host *host = mq->card->host;
> > +
> > +       if (host->swq_enabled)
>
> If we switch to use "hsq", this would then be "hsq_enabled".

Sure.

>
> > +               return mmc_blk_swq_issue_rw_rq(mq, req);
> >
> >         mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
> >
> > @@ -1957,6 +1983,41 @@ static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
> >                 mmc_run_bkops(mq->card);
> >  }
> >
> > +static void mmc_blk_swq_req_done(struct mmc_request *mrq)
> > +{
> > +       struct mmc_queue_req *mqrq =
> > +               container_of(mrq, struct mmc_queue_req, brq.mrq);
> > +       struct request *req = mmc_queue_req_to_req(mqrq);
> > +       struct request_queue *q = req->q;
> > +       struct mmc_queue *mq = q->queuedata;
> > +       struct mmc_host *host = mq->card->host;
> > +       unsigned long flags;
> > +
> > +       if (mmc_blk_rq_error(&mqrq->brq) ||
> > +           mmc_blk_urgent_bkops_needed(mq, mqrq)) {
> > +               spin_lock_irqsave(&mq->lock, flags);
> > +               mq->recovery_needed = true;
> > +               mq->recovery_req = req;
> > +               spin_unlock_irqrestore(&mq->lock, flags);
> > +
> > +               host->cqe_ops->cqe_recovery_start(host);
> > +
> > +               schedule_work(&mq->recovery_work);
> > +               return;
> > +       }
> > +
> > +       mmc_blk_rw_reset_success(mq, req);
> > +
> > +       /*
> > +        * Block layer timeouts race with completions which means the normal
> > +        * completion path cannot be used during recovery.
> > +        */
> > +       if (mq->in_recovery)
> > +               mmc_blk_cqe_complete_rq(mq, req);
> > +       else
> > +               blk_mq_complete_request(req);
> > +}
> > +
> >  void mmc_blk_mq_complete(struct request *req)
> >  {
> >         struct mmc_queue *mq = req->q->queuedata;
> > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> > index c880489..8eac1a2 100644
> > --- a/drivers/mmc/core/mmc.c
> > +++ b/drivers/mmc/core/mmc.c
> > @@ -1852,15 +1852,22 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> >          */
> >         card->reenable_cmdq = card->ext_csd.cmdq_en;
> >
> > -       if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
> > +       if (host->cqe_ops && !host->cqe_enabled) {
>
> This doesn't look entirely correct to me, as it means enabling the CQE
> hardware for hosts with MMC_CAP2_CQE set, regardless of whether the
> eMMC card really supports CMDQ (or whether we failed to enable CMDQ for
> the card). More comments below.
>
> >                 err = host->cqe_ops->cqe_enable(host, card);
> >                 if (err) {
> >                         pr_err("%s: Failed to enable CQE, error %d\n",
> >                                 mmc_hostname(host), err);
> >                 } else {
> >                         host->cqe_enabled = true;
> > -                       pr_info("%s: Command Queue Engine enabled\n",
> > -                               mmc_hostname(host));
> > +
> > +                       if (card->ext_csd.cmdq_en) {
> > +                               pr_info("%s: Command Queue Engine enabled\n",
> > +                                       mmc_hostname(host));
> > +                       } else {
> > +                               host->swq_enabled = true;
> > +                               pr_info("%s: Software Queue enabled\n",
> > +                                       mmc_hostname(host));
>
> A few questions around the above code.
>
> 1.
> Let's assume the host supports MMC_CAP2_CQE, but the eMMC card doesn't
> support CMDQ.
>
> In this case, we still want to allow the host to use the software
> variant (the hsq) of the interface. In principle that is what the code
> above already tries to implement, but then you also need to update the
> support in drivers/mmc/host/cqhci.[ch] to support that dynamically.
> For example, the ->cqe_enable() callback should check
> "card->ext_csd.cmdq_en" and adjust its behavior accordingly, depending
> on whether the flag has been set.

Right. I will add a "card->ext_csd.cmdq_en" check in
drivers/mmc/host/cqhci.c to fix this issue.


> 2.
> I also notice that you are enabling the hsq path solely for eMMC
> cards. I am guessing hsq would be beneficial for SD cards as well,
> don't you think?

Right.

>
> Of course, I am fine with enabling that in a step-by-step approach, so
> there is no need to add it as part of the $subject patch. Please at
> least make it part of the series, though.

Yes, that's what I thought. The SD card related patch is already in
my local tree, and I will post it once this patch set is accepted.

>
> > +                       }
> >                 }
> >         }
> >
> > diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> > index 9edc086..d9086c1 100644
> > --- a/drivers/mmc/core/queue.c
> > +++ b/drivers/mmc/core/queue.c
> > @@ -62,7 +62,7 @@ enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
> >  {
> >         struct mmc_host *host = mq->card->host;
> >
> > -       if (mq->use_cqe)
> > +       if (mq->use_cqe && !host->swq_enabled)
> >                 return mmc_cqe_issue_type(host, req);
> >
> >         if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
> > @@ -124,12 +124,14 @@ static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
> >  {
> >         struct request_queue *q = req->q;
> >         struct mmc_queue *mq = q->queuedata;
> > +       struct mmc_card *card = mq->card;
> > +       struct mmc_host *host = card->host;
> >         unsigned long flags;
> >         int ret;
> >
> >         spin_lock_irqsave(&mq->lock, flags);
> >
> > -       if (mq->recovery_needed || !mq->use_cqe)
> > +       if (mq->recovery_needed || !mq->use_cqe || host->swq_enabled)
> >                 ret = BLK_EH_RESET_TIMER;
> >         else
> >                 ret = mmc_cqe_timed_out(req);
> > @@ -144,12 +146,13 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
> >         struct mmc_queue *mq = container_of(work, struct mmc_queue,
> >                                             recovery_work);
> >         struct request_queue *q = mq->queue;
> > +       struct mmc_host *host = mq->card->host;
> >
> >         mmc_get_card(mq->card, &mq->ctx);
> >
> >         mq->in_recovery = true;
> >
> > -       if (mq->use_cqe)
> > +       if (mq->use_cqe && !host->swq_enabled)
> >                 mmc_blk_cqe_recovery(mq);
> >         else
> >                 mmc_blk_mq_recovery(mq);
> > @@ -160,6 +163,9 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
> >         mq->recovery_needed = false;
> >         spin_unlock_irq(&mq->lock);
> >
> > +       if (host->swq_enabled)
> > +               host->cqe_ops->cqe_recovery_finish(host);
> > +
> >         mmc_put_card(mq->card, &mq->ctx);
> >
> >         blk_mq_run_hw_queues(q, true);
> > @@ -279,6 +285,14 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> >                 }
> >                 break;
> >         case MMC_ISSUE_ASYNC:
> > +               /*
> > +                * For MMC host software queue, we only allow 2 requests in
> > +                * flight to avoid a long latency.
> > +                */
> > +               if (host->swq_enabled && mq->in_flight[issue_type] > 2) {
> > +                       spin_unlock_irq(&mq->lock);
> > +                       return BLK_STS_RESOURCE;
> > +               }
> >                 break;
> >         default:
> >                 /*
> > @@ -430,11 +444,16 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card)
> >          * The queue depth for CQE must match the hardware because the request
> >          * tag is used to index the hardware queue.
> >          */
> > -       if (mq->use_cqe)
> > -               mq->tag_set.queue_depth =
> > -                       min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
> > -       else
> > +       if (mq->use_cqe) {
> > +               if (host->swq_enabled)
> > +                       mq->tag_set.queue_depth = host->cqe_qdepth;
>
> I don't think we need to treat the hsq as a special case with regard to
> the .queue_depth.
>
> It should be fine to use the default MMC_QUEUE_DEPTH (64), don't you think?

Yes, I agree.
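
A minimal sketch of how the depth selection might then look, assuming
mock parameters in place of the real mmc_queue/mmc_host fields (the
struct and the mock_pick_queue_depth() helper are hypothetical; the
value 64 is the MMC_QUEUE_DEPTH default mentioned above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the default MMC_QUEUE_DEPTH (64) mentioned in the review. */
#define MOCK_MMC_QUEUE_DEPTH 64

/* Hypothetical bundle of the inputs that influence the queue depth. */
struct mock_depth_params {
	bool use_cqe;		/* queue uses the CQE interface */
	bool swq_enabled;	/* software queue is active */
	int card_cmdq_depth;	/* depth reported by the card */
	int host_cqe_qdepth;	/* depth supported by the host */
};

/*
 * Only the hardware CQE path has to match the hardware depth, because
 * the request tag indexes the hardware queue. The software queue just
 * uses the default depth, like the non-CQE path.
 */
static int mock_pick_queue_depth(const struct mock_depth_params *p)
{
	assert(p != NULL);

	if (p->use_cqe && !p->swq_enabled)
		return p->card_cmdq_depth < p->host_cqe_qdepth ?
		       p->card_cmdq_depth : p->host_cqe_qdepth;

	return MOCK_MMC_QUEUE_DEPTH;
}
```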

> > +               else
> > +                       mq->tag_set.queue_depth =
> > +                               min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
> > +       } else {
> >                 mq->tag_set.queue_depth = MMC_QUEUE_DEPTH;
> > +       }
> > +
> >         mq->tag_set.numa_node = NUMA_NO_NODE;
> >         mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
> >         mq->tag_set.nr_hw_queues = 1;
> > diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> > index 49ea02c..efa4019 100644
> > --- a/drivers/mmc/host/Kconfig
> > +++ b/drivers/mmc/host/Kconfig
> > @@ -936,6 +936,13 @@ config MMC_CQHCI
> >
> >           If unsure, say N.
> >
> > +config MMC_HSQ
> > +       tristate "MMC Host Software Queue support"
> > +       help
> > +         This selects the Software Queue support.
> > +
> > +         If unsure, say N.
> > +
> >  config MMC_TOSHIBA_PCI
> >         tristate "Toshiba Type A SD/MMC Card Interface Driver"
> >         depends on PCI
> > diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
> > index 11c4598..c14b439 100644
> > --- a/drivers/mmc/host/Makefile
> > +++ b/drivers/mmc/host/Makefile
> > @@ -98,6 +98,7 @@ obj-$(CONFIG_MMC_SDHCI_BRCMSTB)               += sdhci-brcmstb.o
> >  obj-$(CONFIG_MMC_SDHCI_OMAP)           += sdhci-omap.o
> >  obj-$(CONFIG_MMC_SDHCI_SPRD)           += sdhci-sprd.o
> >  obj-$(CONFIG_MMC_CQHCI)                        += cqhci.o
> > +obj-$(CONFIG_MMC_HSQ)                  += mmc_hsq.o
> >
> >  ifeq ($(CONFIG_CB710_DEBUG),y)
> >         CFLAGS-cb710-mmc        += -DDEBUG
> > diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
> > new file mode 100644
> > index 0000000..f5a4f93
> > --- /dev/null
> > +++ b/drivers/mmc/host/mmc_hsq.c
> > @@ -0,0 +1,344 @@
>
> Other than the above, this looks indeed very promising! I have no
> further comment for the rest of the patches in the series.

Much appreciated, thanks for your good suggestions. I will send out the
next version after my holidays, addressing your comments. Thanks.

-- 
Baolin Wang
Best Regards


end of thread, other threads:[~2020-01-27 10:23 UTC | newest]

Thread overview: 14+ messages
2019-11-18 10:43 [PATCH v7 0/4] Add MMC software queue support Baolin Wang
2019-11-18 10:43 ` [PATCH v7 1/4] mmc: Add MMC host " Baolin Wang
2019-11-22 10:32   ` Arnd Bergmann
2019-11-22 10:42     ` Baolin Wang
2019-12-09  9:07       ` Baolin Wang
     [not found]         ` <CAMz4kuJ2q_=kEcpz2+GJANdPm5DmwWMLbqBmZHGgtBiEhNFqzw@mail.gmail.com>
     [not found]           ` <CAPDyKFp95H4KVrhiMD9H-C9iZHzEHufNPP95_X7DroYiR+nhHg@mail.gmail.com>
     [not found]             ` <CAMz4kuKRna4s1g3pbw=kCuEnX2voFSh+cQ-mHkrWUoXF9p21XA@mail.gmail.com>
     [not found]               ` <CAPDyKFo3ysxbJr=3fpaEq0rM0qSeCCkLcfA+7mcANQVXYoQ9oA@mail.gmail.com>
     [not found]                 ` <CAMz4kuLQLWYGKTKcycDqWXFPt-aXZvV=geQWbF_aEoh9PE37Yw@mail.gmail.com>
     [not found]                   ` <CAMz4kuKe+Xg=-N2e7V0_GBcddKzfRkt7zRG_j-vjGyFvkXcTMA@mail.gmail.com>
2019-12-19 15:21                     ` Ulf Hansson
2019-12-20  3:50                       ` (Exiting) Baolin Wang
2020-01-21 11:52   ` Ulf Hansson
2020-01-27 10:23     ` (Exiting) Baolin Wang
2019-11-18 10:43 ` [PATCH v7 2/4] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
2019-11-22 12:13   ` Adrian Hunter
2019-11-18 10:43 ` [PATCH v7 3/4] mmc: host: sdhci: Add a variable to defer to complete requests if needed Baolin Wang
2019-11-22 12:14   ` Adrian Hunter
2019-11-18 10:43 ` [PATCH v7 4/4] mmc: host: sdhci-sprd: Add software queue support Baolin Wang
