linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] Add MMC software queue support
@ 2019-09-19  5:58 Baolin Wang
  2019-09-19  5:58 ` [PATCH v3 1/3] mmc: " Baolin Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Baolin Wang @ 2019-09-19  5:58 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, linux-mmc, linux-kernel

Hi All,

Currently the MMC read/write stack always waits for the previous request
to complete, via mmc_blk_rw_wait(), before sending a new request to the
hardware, or it queues a work item to complete the request. Both paths add
context switching overhead, which hurts IO performance, especially at high
I/O per second rates.

Thus this patch set introduces MMC software command queue support based
on the command queue engine's interfaces, and sets the queue depth to 2.
That means we do not need to wait for the previous request to complete and
can queue 2 requests in flight. This is enough to let the irq handler always
trigger the next request without a context switch and then ask the blk_mq
layer for the next one to queue, which also avoids a long latency.
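
To illustrate the idea, below is a minimal userspace sketch of a two-slot
software queue. It is only a model and not the code in this series: the
names (struct swq, swq_queue(), swq_complete()) are hypothetical, there is
no locking, and "issuing" a request just records it as active. It shows how
the completion path can start the next queued request directly.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS	2		/* queue depth used by this patch set */
#define INVALID_TAG	NUM_SLOTS	/* sentinel: nothing to issue */

/* Hypothetical stand-in for a request; not struct mmc_request. */
struct req {
	int id;
};

struct swq {
	struct req *slot[NUM_SLOTS];	/* requests indexed by tag */
	struct req *active;		/* request owned by the "hardware" */
	int next_tag;			/* tag to issue next; stays on the
					 * active request once it is issued */
	int qcnt;			/* queued but not yet issued */
};

static void swq_init(struct swq *q)
{
	int tag;

	for (tag = 0; tag < NUM_SLOTS; tag++)
		q->slot[tag] = NULL;
	q->active = NULL;
	q->next_tag = INVALID_TAG;
	q->qcnt = 0;
}

/* Issue the next queued request if the "hardware" is free. */
static void swq_pump(struct swq *q)
{
	if (q->active || !q->qcnt)
		return;

	assert(q->next_tag != INVALID_TAG);
	q->active = q->slot[q->next_tag];
	q->qcnt--;
}

/* Queue a request under a tag; returns -1 if the tag is bad or busy. */
static int swq_queue(struct swq *q, int tag, struct req *r)
{
	if (tag < 0 || tag >= NUM_SLOTS || q->slot[tag])
		return -1;

	q->slot[tag] = r;
	if (q->next_tag == INVALID_TAG)
		q->next_tag = tag;
	q->qcnt++;

	swq_pump(q);
	return 0;
}

/*
 * Model of the completion interrupt: release the finished slot, pick the
 * next occupied slot and issue it at once, with no worker thread involved.
 */
static struct req *swq_complete(struct swq *q)
{
	struct req *done = q->active;
	int tag;

	if (!done)
		return NULL;

	q->slot[q->next_tag] = NULL;
	q->active = NULL;

	q->next_tag = INVALID_TAG;
	for (tag = 0; tag < NUM_SLOTS; tag++) {
		if (q->slot[tag]) {
			q->next_tag = tag;
			break;
		}
	}

	swq_pump(q);
	return done;
}
```

With a depth of 2, one request can be active while a second one is already
queued, so swq_complete() has something to issue as long as the block layer
keeps the queue fed.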

Moreover we can extend the MMC software queue interface to support
MMC packed requests or packed commands instead of adding new interfaces,
according to the previous discussion.

Below is some comparison data from the fio tool. The fio command I used
is shown below; I changed the '--rw' parameter for each case and enabled the
direct IO flag to measure the actual hardware transfer speed with a 4K block
size.

./fio --filename=/dev/mmcblk0p30 --direct=1 --iodepth=20 --rw=read --bs=4K --size=512M --group_reporting --numjobs=20 --name=test_read

My eMMC card is working in HS400 Enhanced strobe mode:
[    2.229856] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[    2.237566] mmcblk0: mmc0:0001 HBG4a2 29.1 GiB 
[    2.242621] mmcblk0boot0: mmc0:0001 HBG4a2 partition 1 4.00 MiB
[    2.249110] mmcblk0boot1: mmc0:0001 HBG4a2 partition 2 4.00 MiB
[    2.255307] mmcblk0rpmb: mmc0:0001 HBG4a2 partition 3 4.00 MiB, chardev (248:0)

1. Without MMC software queue
I tested each case 3 times and report the average speed.

1) Sequential read:
Speed: 28.9MiB/s, 26.4MiB/s, 30.9MiB/s
Average speed: 28.7MiB/s

2) Random read:
Speed: 18.2MiB/s, 8.9MiB/s, 15.8MiB/s
Average speed: 14.3MiB/s

3) Sequential write:
Speed: 21.1MiB/s, 27.9MiB/s, 25MiB/s
Average speed: 24.7MiB/s

4) Random write:
Speed: 21.5MiB/s, 18.1MiB/s, 18.1MiB/s
Average speed: 19.2MiB/s

2. With MMC software queue
I tested each case 3 times and report the average speed.

1) Sequential read:
Speed: 44.1MiB/s, 42.3MiB/s, 44.4MiB/s
Average speed: 43.6MiB/s

2) Random read:
Speed: 30.6MiB/s, 30.9MiB/s, 30.5MiB/s
Average speed: 30.6MiB/s

3) Sequential write:
Speed: 44.1MiB/s, 45.9MiB/s, 44.2MiB/s
Average speed: 44.7MiB/s

4) Random write:
Speed: 45.1MiB/s, 43.3MiB/s, 42.4MiB/s
Average speed: 43.6MiB/s

From the above data, we can see that the MMC software queue clearly
improves performance.
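
The relative improvement can be worked out from the averages above. A small
sketch follows; the table layout and the helper names gain_percent() and
round_percent() are only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Average speeds in MiB/s, copied from the two result lists above. */
struct result {
	const char *name;
	double before;	/* without MMC software queue */
	double after;	/* with MMC software queue */
};

static const struct result results[] = {
	{ "sequential read",  28.7, 43.6 },
	{ "random read",      14.3, 30.6 },
	{ "sequential write", 24.7, 44.7 },
	{ "random write",     19.2, 43.6 },
};

/* Relative improvement of 'after' over 'before', in percent. */
static double gain_percent(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Round a non-negative percentage to the nearest integer. */
static int round_percent(double value)
{
	return (int)(value + 0.5);
}
```

That gives about 52% for sequential read, 114% for random read, 81% for
sequential write and 127% for random write.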

Any comments are welcome. Thanks a lot.

Changes from v2:
 - Remove reference to 'struct cqhci_host' and 'struct cqhci_slot',
 instead adding 'struct sqhci_host', which is only used by software queue.

Changes from v1:
 - Add request_done ops for sdhci_ops.
 - Replace virtual command queue with software queue for functions and
 variables.
 - Rename the software queue file and add sqhci.h header file.

Baolin Wang (3):
  mmc: Add MMC software queue support
  mmc: host: sdhci: Add request_done ops for struct sdhci_ops
  mmc: host: sdhci-sprd: Add software queue support

 drivers/mmc/core/block.c      |   61 ++++++++
 drivers/mmc/core/mmc.c        |   13 +-
 drivers/mmc/core/queue.c      |   25 ++-
 drivers/mmc/host/Kconfig      |    9 ++
 drivers/mmc/host/Makefile     |    1 +
 drivers/mmc/host/sdhci-sprd.c |   26 ++++
 drivers/mmc/host/sdhci.c      |   12 +-
 drivers/mmc/host/sdhci.h      |    2 +
 drivers/mmc/host/sqhci.c      |  344 +++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/sqhci.h      |   53 +++++++
 include/linux/mmc/host.h      |    3 +
 11 files changed, 537 insertions(+), 12 deletions(-)
 create mode 100644 drivers/mmc/host/sqhci.c
 create mode 100644 drivers/mmc/host/sqhci.h

-- 
1.7.9.5



* [PATCH v3 1/3] mmc: Add MMC software queue support
  2019-09-19  5:58 [PATCH v3 0/3] Add MMC software queue support Baolin Wang
@ 2019-09-19  5:58 ` Baolin Wang
  2019-09-21 14:49   ` kbuild test robot
  2019-09-19  5:58 ` [PATCH v3 2/3] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Baolin Wang @ 2019-09-19  5:58 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, linux-mmc, linux-kernel

Currently the MMC read/write stack always waits for the previous request
to complete, via mmc_blk_rw_wait(), before sending a new request to the
hardware, or it queues a work item to complete the request. Both paths add
context switching overhead, which hurts IO performance, especially at high
I/O per second rates.

Thus this patch introduces an MMC software queue interface based on the
hardware command queue engine's interfaces. It follows the same idea as
the hardware command queue engine and can remove the context switching.
Moreover we set the queue depth to 2 for the software queue, which is
enough to let the irq handler always trigger the next request without a
context switch and then ask the blk_mq layer for the next one to queue,
which also avoids a long latency.

From the fio testing data in the cover letter, we can see that the software
queue clearly improves performance with a 4K block size: about 52% for
sequential read, about 114% for random read, about 81% for sequential
write, and about 127% for random write.

Moreover we can extend the software queue interface to support MMC
packed requests or packed commands in the future.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/mmc/core/block.c  |   61 ++++++++
 drivers/mmc/core/mmc.c    |   13 +-
 drivers/mmc/core/queue.c  |   25 +++-
 drivers/mmc/host/Kconfig  |    8 ++
 drivers/mmc/host/Makefile |    1 +
 drivers/mmc/host/sqhci.c  |  344 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/sqhci.h  |   53 +++++++
 include/linux/mmc/host.h  |    3 +
 8 files changed, 498 insertions(+), 10 deletions(-)
 create mode 100644 drivers/mmc/host/sqhci.c
 create mode 100644 drivers/mmc/host/sqhci.h

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 2c71a43..870462c 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -168,6 +168,11 @@ struct mmc_rpmb_data {
 
 static inline int mmc_blk_part_switch(struct mmc_card *card,
 				      unsigned int part_type);
+static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+			       struct mmc_card *card,
+			       int disable_multi,
+			       struct mmc_queue *mq);
+static void mmc_blk_swq_req_done(struct mmc_request *mrq);
 
 static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
 {
@@ -1569,9 +1574,30 @@ static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
 	return mmc_blk_cqe_start_req(mq->card->host, mrq);
 }
 
+static int mmc_blk_swq_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+	int err;
+
+	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
+	mqrq->brq.mrq.done = mmc_blk_swq_req_done;
+	mmc_pre_req(host, &mqrq->brq.mrq);
+
+	err = mmc_cqe_start_req(host, &mqrq->brq.mrq);
+	if (err)
+		mmc_post_req(host, &mqrq->brq.mrq, err);
+
+	return err;
+}
+
 static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+
+	if (host->swq_enabled)
+		return mmc_blk_swq_issue_rw_rq(mq, req);
 
 	mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
 
@@ -1957,6 +1983,41 @@ static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
 		mmc_run_bkops(mq->card);
 }
 
+static void mmc_blk_swq_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq =
+		container_of(mrq, struct mmc_queue_req, brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	unsigned long flags;
+
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_urgent_bkops_needed(mq, mqrq)) {
+		spin_lock_irqsave(&mq->lock, flags);
+		mq->recovery_needed = true;
+		mq->recovery_req = req;
+		spin_unlock_irqrestore(&mq->lock, flags);
+
+		host->cqe_ops->cqe_recovery_start(host);
+
+		schedule_work(&mq->recovery_work);
+		return;
+	}
+
+	mmc_blk_rw_reset_success(mq, req);
+
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
+}
+
 void mmc_blk_mq_complete(struct request *req)
 {
 	struct mmc_queue *mq = req->q->queuedata;
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index c880489..8eac1a2 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1852,15 +1852,22 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
 	 */
 	card->reenable_cmdq = card->ext_csd.cmdq_en;
 
-	if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
+	if (host->cqe_ops && !host->cqe_enabled) {
 		err = host->cqe_ops->cqe_enable(host, card);
 		if (err) {
 			pr_err("%s: Failed to enable CQE, error %d\n",
 				mmc_hostname(host), err);
 		} else {
 			host->cqe_enabled = true;
-			pr_info("%s: Command Queue Engine enabled\n",
-				mmc_hostname(host));
+
+			if (card->ext_csd.cmdq_en) {
+				pr_info("%s: Command Queue Engine enabled\n",
+					mmc_hostname(host));
+			} else {
+				host->swq_enabled = true;
+				pr_info("%s: Software Queue enabled\n",
+					mmc_hostname(host));
+			}
 		}
 	}
 
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 7102e2e..2c93c29 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -60,7 +60,7 @@ enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_host *host = mq->card->host;
 
-	if (mq->use_cqe)
+	if (mq->use_cqe && !host->swq_enabled)
 		return mmc_cqe_issue_type(host, req);
 
 	if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
@@ -122,12 +122,14 @@ static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
 {
 	struct request_queue *q = req->q;
 	struct mmc_queue *mq = q->queuedata;
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
 	unsigned long flags;
 	int ret;
 
 	spin_lock_irqsave(&mq->lock, flags);
 
-	if (mq->recovery_needed || !mq->use_cqe)
+	if (mq->recovery_needed || !mq->use_cqe || host->swq_enabled)
 		ret = BLK_EH_RESET_TIMER;
 	else
 		ret = mmc_cqe_timed_out(req);
@@ -142,12 +144,13 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 	struct mmc_queue *mq = container_of(work, struct mmc_queue,
 					    recovery_work);
 	struct request_queue *q = mq->queue;
+	struct mmc_host *host = mq->card->host;
 
 	mmc_get_card(mq->card, &mq->ctx);
 
 	mq->in_recovery = true;
 
-	if (mq->use_cqe)
+	if (mq->use_cqe && !host->swq_enabled)
 		mmc_blk_cqe_recovery(mq);
 	else
 		mmc_blk_mq_recovery(mq);
@@ -158,6 +161,9 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 	mq->recovery_needed = false;
 	spin_unlock_irq(&mq->lock);
 
+	if (host->swq_enabled)
+		host->cqe_ops->cqe_recovery_finish(host);
+
 	mmc_put_card(mq->card, &mq->ctx);
 
 	blk_mq_run_hw_queues(q, true);
@@ -407,11 +413,16 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card)
 	 * The queue depth for CQE must match the hardware because the request
 	 * tag is used to index the hardware queue.
 	 */
-	if (mq->use_cqe)
-		mq->tag_set.queue_depth =
-			min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
-	else
+	if (mq->use_cqe) {
+		if (host->swq_enabled)
+			mq->tag_set.queue_depth = host->cqe_qdepth;
+		else
+			mq->tag_set.queue_depth =
+				min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
+	} else {
 		mq->tag_set.queue_depth = MMC_QUEUE_DEPTH;
+	}
+
 	mq->tag_set.numa_node = NUMA_NO_NODE;
 	mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
 	mq->tag_set.nr_hw_queues = 1;
diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 14d89a1..d117f18 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -923,6 +923,14 @@ config MMC_CQHCI
 
 	  If unsure, say N.
 
+config MMC_SQHCI
+	bool "Software Queue Host Controller Interface support"
+	help
+	  This selects the Software Queue Host Controller Interface (SQHCI)
+	  support.
+
+	  If unsure, say N.
+
 config MMC_TOSHIBA_PCI
 	tristate "Toshiba Type A SD/MMC Card Interface Driver"
 	depends on PCI
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 7357871..a3588d4 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -97,6 +97,7 @@ obj-$(CONFIG_MMC_SDHCI_BRCMSTB)		+= sdhci-brcmstb.o
 obj-$(CONFIG_MMC_SDHCI_OMAP)		+= sdhci-omap.o
 obj-$(CONFIG_MMC_SDHCI_SPRD)		+= sdhci-sprd.o
 obj-$(CONFIG_MMC_CQHCI)			+= cqhci.o
+obj-$(CONFIG_MMC_SQHCI)			+= sqhci.o
 
 ifeq ($(CONFIG_CB710_DEBUG),y)
 	CFLAGS-cb710-mmc	+= -DDEBUG
diff --git a/drivers/mmc/host/sqhci.c b/drivers/mmc/host/sqhci.c
new file mode 100644
index 0000000..c172bc1
--- /dev/null
+++ b/drivers/mmc/host/sqhci.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * MMC software queue support based on command queue interfaces
+ *
+ * Copyright (C) 2019 Linaro, Inc.
+ * Author: Baolin Wang <baolin.wang@linaro.org>
+ */
+
+#include <linux/mmc/card.h>
+#include <linux/mmc/host.h>
+
+#include "sqhci.h"
+
+#define SQHCI_NUM_SLOTS		2
+#define SQHCI_INVALID_TAG	SQHCI_NUM_SLOTS
+
+static void sqhci_pump_requests(struct sqhci_host *sq_host)
+{
+	struct mmc_host *mmc = sq_host->mmc;
+	struct sqhci_slot *slot;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sq_host->lock, flags);
+
+	/* Make sure we are not already running a request now */
+	if (sq_host->mrq) {
+		spin_unlock_irqrestore(&sq_host->lock, flags);
+		return;
+	}
+
+	/* Make sure there are remaining requests to pump */
+	if (!sq_host->qcnt || !sq_host->enabled) {
+		spin_unlock_irqrestore(&sq_host->lock, flags);
+		return;
+	}
+
+	slot = &sq_host->slot[sq_host->next_tag];
+	sq_host->mrq = slot->mrq;
+	sq_host->qcnt--;
+
+	spin_unlock_irqrestore(&sq_host->lock, flags);
+
+	mmc->ops->request(mmc, sq_host->mrq);
+}
+
+static void sqhci_update_next_tag(struct sqhci_host *sq_host, int remains)
+{
+	struct sqhci_slot *slot;
+	int tag;
+
+	/*
+	 * If there are no remaining requests in the software queue, then set an
+	 * invalid tag.
+	 */
+	if (!remains) {
+		sq_host->next_tag = SQHCI_INVALID_TAG;
+		return;
+	}
+
+	/*
+	 * Increase the next tag and check if the corresponding request is
+	 * available; if yes, then we have found a candidate request.
+	 */
+	if (++sq_host->next_tag != SQHCI_INVALID_TAG) {
+		slot = &sq_host->slot[sq_host->next_tag];
+		if (slot->mrq)
+			return;
+	}
+
+	/* Otherwise we should iterate all slots to find an available tag. */
+	for (tag = 0; tag < SQHCI_NUM_SLOTS; tag++) {
+		slot = &sq_host->slot[tag];
+		if (slot->mrq)
+			break;
+	}
+
+	if (tag == SQHCI_NUM_SLOTS)
+		tag = SQHCI_INVALID_TAG;
+
+	sq_host->next_tag = tag;
+}
+
+static void sqhci_post_request(struct sqhci_host *sq_host)
+{
+	unsigned long flags;
+	int remains;
+
+	spin_lock_irqsave(&sq_host->lock, flags);
+
+	remains = sq_host->qcnt;
+	sq_host->mrq = NULL;
+
+	/* Update the next available tag to be queued. */
+	sqhci_update_next_tag(sq_host, remains);
+
+	if (sq_host->waiting_for_idle && !remains) {
+		sq_host->waiting_for_idle = false;
+		wake_up(&sq_host->wait_queue);
+	}
+
+	/* Do not pump new request in recovery mode. */
+	if (sq_host->recovery_halt) {
+		spin_unlock_irqrestore(&sq_host->lock, flags);
+		return;
+	}
+
+	spin_unlock_irqrestore(&sq_host->lock, flags);
+
+	/*
+	 * Try to pump a new request to the host controller as fast as
+	 * possible after completing the previous request.
+	 */
+	if (remains > 0)
+		sqhci_pump_requests(sq_host);
+}
+
+/**
+ * sqhci_finalize_request - finalize one request if the request is done
+ * @mmc: the host controller
+ * @mrq: the request to be finalized
+ *
+ * Return true if we finalized the corresponding request in software queue,
+ * otherwise return false.
+ */
+bool sqhci_finalize_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	struct sqhci_host *sq_host = mmc->cqe_private;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sq_host->lock, flags);
+
+	if (!sq_host->enabled || !sq_host->mrq || sq_host->mrq != mrq) {
+		spin_unlock_irqrestore(&sq_host->lock, flags);
+		return false;
+	}
+
+	/*
+	 * Clear the completed request's slot to make room for a new request.
+	 */
+	sq_host->slot[sq_host->next_tag].mrq = NULL;
+
+	spin_unlock_irqrestore(&sq_host->lock, flags);
+
+	mmc_cqe_request_done(mmc, sq_host->mrq);
+
+	sqhci_post_request(sq_host);
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(sqhci_finalize_request);
+
+static void sqhci_recovery_start(struct mmc_host *mmc)
+{
+	struct sqhci_host *sq_host = mmc->cqe_private;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sq_host->lock, flags);
+
+	sq_host->recovery_halt = true;
+
+	spin_unlock_irqrestore(&sq_host->lock, flags);
+}
+
+static void sqhci_recovery_finish(struct mmc_host *mmc)
+{
+	struct sqhci_host *sq_host = mmc->cqe_private;
+	int remains;
+
+	spin_lock_irq(&sq_host->lock);
+
+	sq_host->recovery_halt = false;
+	remains = sq_host->qcnt;
+
+	spin_unlock_irq(&sq_host->lock);
+
+	/*
+	 * Try to pump a new request if there are requests pending in the
+	 * software queue after finishing recovery.
+	 */
+	if (remains > 0)
+		sqhci_pump_requests(sq_host);
+}
+
+static int sqhci_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	struct sqhci_host *sq_host = mmc->cqe_private;
+	int tag = mrq->tag;
+
+	spin_lock_irq(&sq_host->lock);
+
+	if (!sq_host->enabled) {
+		spin_unlock_irq(&sq_host->lock);
+		return -ESHUTDOWN;
+	}
+
+	/* Do not queue any new requests in recovery mode. */
+	if (sq_host->recovery_halt) {
+		spin_unlock_irq(&sq_host->lock);
+		return -EBUSY;
+	}
+
+	sq_host->slot[tag].mrq = mrq;
+
+	/*
+	 * Set the next tag to the current request's tag if there is no
+	 * available next tag.
+	 */
+	if (sq_host->next_tag == SQHCI_INVALID_TAG)
+		sq_host->next_tag = tag;
+
+	sq_host->qcnt++;
+
+	spin_unlock_irq(&sq_host->lock);
+
+	sqhci_pump_requests(sq_host);
+
+	return 0;
+}
+
+static void sqhci_post_req(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	if (mmc->ops->post_req)
+		mmc->ops->post_req(mmc, mrq, 0);
+}
+
+static bool sqhci_queue_is_idle(struct sqhci_host *sq_host, int *ret)
+{
+	bool is_idle;
+
+	spin_lock_irq(&sq_host->lock);
+
+	is_idle = (!sq_host->mrq && !sq_host->qcnt) ||
+		sq_host->recovery_halt;
+
+	*ret = sq_host->recovery_halt ? -EBUSY : 0;
+	sq_host->waiting_for_idle = !is_idle;
+
+	spin_unlock_irq(&sq_host->lock);
+
+	return is_idle;
+}
+
+static int sqhci_wait_for_idle(struct mmc_host *mmc)
+{
+	struct sqhci_host *sq_host = mmc->cqe_private;
+	int ret;
+
+	wait_event(sq_host->wait_queue,
+		   sqhci_queue_is_idle(sq_host, &ret));
+
+	return ret;
+}
+
+static void sqhci_disable(struct mmc_host *mmc)
+{
+	struct sqhci_host *sq_host = mmc->cqe_private;
+	u32 timeout = 500;
+	int ret;
+
+	spin_lock_irq(&sq_host->lock);
+
+	if (!sq_host->enabled) {
+		spin_unlock_irq(&sq_host->lock);
+		return;
+	}
+
+	spin_unlock_irq(&sq_host->lock);
+
+	ret = wait_event_timeout(sq_host->wait_queue,
+				 sqhci_queue_is_idle(sq_host, &ret),
+				 msecs_to_jiffies(timeout));
+	if (ret == 0) {
+		pr_warn("could not stop mmc software queue\n");
+		return;
+	}
+
+	spin_lock_irq(&sq_host->lock);
+
+	sq_host->enabled = false;
+
+	spin_unlock_irq(&sq_host->lock);
+}
+
+static int sqhci_enable(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct sqhci_host *sq_host = mmc->cqe_private;
+
+	spin_lock_irq(&sq_host->lock);
+
+	if (sq_host->enabled) {
+		spin_unlock_irq(&sq_host->lock);
+		return -EBUSY;
+	}
+
+	sq_host->enabled = true;
+
+	spin_unlock_irq(&sq_host->lock);
+
+	return 0;
+}
+
+static const struct mmc_cqe_ops sqhci_ops = {
+	.cqe_enable = sqhci_enable,
+	.cqe_disable = sqhci_disable,
+	.cqe_request = sqhci_request,
+	.cqe_post_req = sqhci_post_req,
+	.cqe_wait_for_idle = sqhci_wait_for_idle,
+	.cqe_recovery_start = sqhci_recovery_start,
+	.cqe_recovery_finish = sqhci_recovery_finish,
+};
+
+int sqhci_init(struct sqhci_host *sq_host, struct mmc_host *mmc)
+{
+	sq_host->num_slots = SQHCI_NUM_SLOTS;
+	sq_host->next_tag = SQHCI_INVALID_TAG;
+	mmc->cqe_qdepth = SQHCI_NUM_SLOTS;
+
+	sq_host->slot = devm_kcalloc(mmc_dev(mmc), sq_host->num_slots,
+				     sizeof(struct sqhci_slot), GFP_KERNEL);
+	if (!sq_host->slot)
+		return -ENOMEM;
+
+	sq_host->mmc = mmc;
+	sq_host->mmc->cqe_private = sq_host;
+	mmc->cqe_ops = &sqhci_ops;
+
+	spin_lock_init(&sq_host->lock);
+	init_waitqueue_head(&sq_host->wait_queue);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(sqhci_init);
+
+void sqhci_suspend(struct mmc_host *mmc)
+{
+	sqhci_disable(mmc);
+}
+EXPORT_SYMBOL_GPL(sqhci_suspend);
+
+int sqhci_resume(struct mmc_host *mmc)
+{
+	return sqhci_enable(mmc, NULL);
+}
+EXPORT_SYMBOL_GPL(sqhci_resume);
diff --git a/drivers/mmc/host/sqhci.h b/drivers/mmc/host/sqhci.h
new file mode 100644
index 0000000..517e0a5
--- /dev/null
+++ b/drivers/mmc/host/sqhci.h
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef LINUX_MMC_SQHCI_H
+#define LINUX_MMC_SQHCI_H
+
+struct sqhci_slot {
+	struct mmc_request *mrq;
+};
+
+struct sqhci_host {
+	struct mmc_host *mmc;
+	struct mmc_request *mrq;
+	wait_queue_head_t wait_queue;
+	struct sqhci_slot *slot;
+	spinlock_t lock;
+
+	int next_tag;
+	int num_slots;
+	int qcnt;
+
+	bool enabled;
+	bool waiting_for_idle;
+	bool recovery_halt;
+};
+
+#ifdef CONFIG_MMC_SQHCI
+int sqhci_init(struct sqhci_host *sq_host, struct mmc_host *mmc);
+void sqhci_suspend(struct mmc_host *mmc);
+int sqhci_resume(struct mmc_host *mmc);
+bool sqhci_finalize_request(struct mmc_host *mmc, struct mmc_request *mrq);
+#else
+static inline int sqhci_init(struct sqhci_host *sq_host, struct mmc_host *mmc)
+{
+	return -EINVAL;
+}
+
+static inline void sqhci_suspend(struct mmc_host *mmc)
+{
+}
+
+static inline int sqhci_resume(struct mmc_host *mmc)
+{
+	return 0;
+}
+
+static inline bool sqhci_finalize_request(struct mmc_host *mmc,
+					  struct mmc_request *mrq)
+{
+	return false;
+}
+
+#endif
+
+#endif
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 4a351cb..706e2bc 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -459,6 +459,9 @@ struct mmc_host {
 	bool			cqe_enabled;
 	bool			cqe_on;
 
+	/* Software Queue support */
+	bool			swq_enabled;
+
 	unsigned long		private[0] ____cacheline_aligned;
 };
 
-- 
1.7.9.5



* [PATCH v3 2/3] mmc: host: sdhci: Add request_done ops for struct sdhci_ops
  2019-09-19  5:58 [PATCH v3 0/3] Add MMC software queue support Baolin Wang
  2019-09-19  5:58 ` [PATCH v3 1/3] mmc: " Baolin Wang
@ 2019-09-19  5:58 ` Baolin Wang
  2019-09-19  5:58 ` [PATCH v3 3/3] mmc: host: sdhci-sprd: Add software queue support Baolin Wang
  2019-09-26  9:43 ` [PATCH v3 0/3] Add MMC " Baolin Wang
  3 siblings, 0 replies; 9+ messages in thread
From: Baolin Wang @ 2019-09-19  5:58 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, linux-mmc, linux-kernel

Add a request_done op to struct sdhci_ops as preparation, in case some
host controllers need a different method to complete a request, such as
supporting request completion for the MMC software queue.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/mmc/host/sdhci.c |   12 ++++++++++--
 drivers/mmc/host/sdhci.h |    2 ++
 2 files changed, 12 insertions(+), 2 deletions(-)
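
For reviewers, the dispatch added by this patch can be modelled in plain C
as below. This is a simplified sketch and not the sdhci code: struct host,
struct host_ops and the counters are made up for illustration, and unlike
sdhci.c the model also tolerates a NULL ops pointer.

```c
#include <assert.h>
#include <stddef.h>

struct host;

struct host_ops {
	/* Optional driver-specific completion hook; may be NULL. */
	void (*request_done)(struct host *host, int req_id);
};

struct host {
	const struct host_ops *ops;
	int default_done;	/* completions handled by the default path */
	int custom_done;	/* completions handled by the driver hook */
};

/* Stand-in for the generic completion path. */
static void default_request_done(struct host *host, int req_id)
{
	(void)req_id;
	host->default_done++;
}

/* Stand-in for a driver hook such as a software queue completion. */
static void custom_request_done(struct host *host, int req_id)
{
	(void)req_id;
	host->custom_done++;
}

/* Prefer the driver hook when it is set, else use the default path. */
static void complete_request(struct host *host, int req_id)
{
	assert(host != NULL);

	if (host->ops && host->ops->request_done)
		host->ops->request_done(host, req_id);
	else
		default_request_done(host, req_id);
}
```

A driver that leaves request_done unset keeps the old behavior.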

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index a5dc5aa..b2c8695 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2710,7 +2710,10 @@ static bool sdhci_request_done(struct sdhci_host *host)
 
 	spin_unlock_irqrestore(&host->lock, flags);
 
-	mmc_request_done(host->mmc, mrq);
+	if (host->ops->request_done)
+		host->ops->request_done(host, mrq);
+	else
+		mmc_request_done(host->mmc, mrq);
 
 	return false;
 }
@@ -3133,7 +3136,12 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
 
 	/* Process mrqs ready for immediate completion */
 	for (i = 0; i < SDHCI_MAX_MRQS; i++) {
-		if (mrqs_done[i])
+		if (!mrqs_done[i])
+			continue;
+
+		if (host->ops->request_done)
+			host->ops->request_done(host, mrqs_done[i]);
+		else
 			mmc_request_done(host->mmc, mrqs_done[i]);
 	}
 
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 902f855..e9a476f 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -643,6 +643,8 @@ struct sdhci_ops {
 	void	(*voltage_switch)(struct sdhci_host *host);
 	void	(*adma_write_desc)(struct sdhci_host *host, void **desc,
 				   dma_addr_t addr, int len, unsigned int cmd);
+	void	(*request_done)(struct sdhci_host *host,
+				struct mmc_request *mrq);
 };
 
 #ifdef CONFIG_MMC_SDHCI_IO_ACCESSORS
-- 
1.7.9.5



* [PATCH v3 3/3] mmc: host: sdhci-sprd: Add software queue support
  2019-09-19  5:58 [PATCH v3 0/3] Add MMC software queue support Baolin Wang
  2019-09-19  5:58 ` [PATCH v3 1/3] mmc: " Baolin Wang
  2019-09-19  5:58 ` [PATCH v3 2/3] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
@ 2019-09-19  5:58 ` Baolin Wang
  2019-09-26  9:43 ` [PATCH v3 0/3] Add MMC " Baolin Wang
  3 siblings, 0 replies; 9+ messages in thread
From: Baolin Wang @ 2019-09-19  5:58 UTC (permalink / raw)
  To: adrian.hunter, ulf.hansson, asutoshd
  Cc: orsonzhai, zhang.lyra, arnd, linus.walleij, vincent.guittot,
	baolin.wang, linux-mmc, linux-kernel

Add software queue support to improve the performance.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/mmc/host/Kconfig      |    1 +
 drivers/mmc/host/sdhci-sprd.c |   26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index d117f18..862e8e9 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -619,6 +619,7 @@ config MMC_SDHCI_SPRD
 	depends on ARCH_SPRD
 	depends on MMC_SDHCI_PLTFM
 	select MMC_SDHCI_IO_ACCESSORS
+	select MMC_SQHCI
 	help
 	  This selects the SDIO Host Controller in Spreadtrum
 	  SoCs, this driver supports R11(IP version: R11P0).
diff --git a/drivers/mmc/host/sdhci-sprd.c b/drivers/mmc/host/sdhci-sprd.c
index d07b979..4dec0b3 100644
--- a/drivers/mmc/host/sdhci-sprd.c
+++ b/drivers/mmc/host/sdhci-sprd.c
@@ -19,6 +19,7 @@
 #include <linux/slab.h>
 
 #include "sdhci-pltfm.h"
+#include "sqhci.h"
 
 /* SDHCI_ARGUMENT2 register high 16bit */
 #define SDHCI_SPRD_ARG2_STUFF		GENMASK(31, 16)
@@ -379,6 +380,16 @@ static unsigned int sdhci_sprd_get_ro(struct sdhci_host *host)
 	return 0;
 }
 
+static void sdhci_sprd_request_done(struct sdhci_host *host,
+				    struct mmc_request *mrq)
+{
+	/* First check whether the request came from the software queue. */
+	if (sqhci_finalize_request(host->mmc, mrq))
+		return;
+
+	mmc_request_done(host->mmc, mrq);
+}
+
 static struct sdhci_ops sdhci_sprd_ops = {
 	.read_l = sdhci_sprd_readl,
 	.write_l = sdhci_sprd_writel,
@@ -392,6 +403,7 @@ static unsigned int sdhci_sprd_get_ro(struct sdhci_host *host)
 	.hw_reset = sdhci_sprd_hw_reset,
 	.get_max_timeout_count = sdhci_sprd_get_max_timeout_count,
 	.get_ro = sdhci_sprd_get_ro,
+	.request_done = sdhci_sprd_request_done,
 };
 
 static void sdhci_sprd_request(struct mmc_host *mmc, struct mmc_request *mrq)
@@ -521,6 +533,7 @@ static int sdhci_sprd_probe(struct platform_device *pdev)
 {
 	struct sdhci_host *host;
 	struct sdhci_sprd_host *sprd_host;
+	struct sqhci_host *sq_host;
 	struct clk *clk;
 	int ret = 0;
 
@@ -631,6 +644,16 @@ static int sdhci_sprd_probe(struct platform_device *pdev)
 
 	sprd_host->flags = host->flags;
 
+	sq_host = devm_kzalloc(&pdev->dev, sizeof(*sq_host), GFP_KERNEL);
+	if (!sq_host) {
+		ret = -ENOMEM;
+		goto err_cleanup_host;
+	}
+
+	ret = sqhci_init(sq_host, host->mmc);
+	if (ret)
+		goto err_cleanup_host;
+
 	ret = __sdhci_add_host(host);
 	if (ret)
 		goto err_cleanup_host;
@@ -689,6 +712,7 @@ static int sdhci_sprd_runtime_suspend(struct device *dev)
 	struct sdhci_host *host = dev_get_drvdata(dev);
 	struct sdhci_sprd_host *sprd_host = TO_SPRD_HOST(host);
 
+	sqhci_suspend(host->mmc);
 	sdhci_runtime_suspend_host(host);
 
 	clk_disable_unprepare(sprd_host->clk_sdio);
@@ -717,6 +741,8 @@ static int sdhci_sprd_runtime_resume(struct device *dev)
 		goto clk_disable;
 
 	sdhci_runtime_resume_host(host, 1);
+	sqhci_resume(host->mmc);
+
 	return 0;
 
 clk_disable:
-- 
1.7.9.5



* Re: [PATCH v3 1/3] mmc: Add MMC software queue support
  2019-09-19  5:58 ` [PATCH v3 1/3] mmc: " Baolin Wang
@ 2019-09-21 14:49   ` kbuild test robot
  2019-09-23  7:53     ` Baolin Wang
  0 siblings, 1 reply; 9+ messages in thread
From: kbuild test robot @ 2019-09-21 14:49 UTC (permalink / raw)
  To: Baolin Wang
  Cc: kbuild-all, adrian.hunter, ulf.hansson, asutoshd, orsonzhai,
	zhang.lyra, arnd, linus.walleij, vincent.guittot, baolin.wang,
	linux-mmc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1160 bytes --]

Hi Baolin,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3 next-20190918]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Baolin-Wang/mmc-Add-MMC-software-queue-support/20190919-140107
config: i386-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-13) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 
:::::: branch date: 7 hours ago
:::::: commit date: 7 hours ago

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   ld: drivers/mmc/host/sqhci.o: in function `sqhci_finalize_request':
>> (.text+0x644): undefined reference to `mmc_cqe_request_done'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 69916 bytes --]


* Re: [PATCH v3 1/3] mmc: Add MMC software queue support
  2019-09-21 14:49   ` kbuild test robot
@ 2019-09-23  7:53     ` Baolin Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Baolin Wang @ 2019-09-23  7:53 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, Adrian Hunter, Ulf Hansson, asutoshd, Orson Zhai,
	Chunyan Zhang, Arnd Bergmann, Linus Walleij, Vincent Guittot,
	linux-mmc, LKML

Hi,

On Sat, 21 Sep 2019 at 22:43, kbuild test robot <lkp@intel.com> wrote:
>
> Hi Baolin,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on linus/master]
> [cannot apply to v5.3 next-20190918]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
>
> url:    https://github.com/0day-ci/linux/commits/Baolin-Wang/mmc-Add-MMC-software-queue-support/20190919-140107
> config: i386-allmodconfig (attached as .config)
> compiler: gcc-7 (Debian 7.4.0-13) 7.4.0
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=i386
> :::::: branch date: 7 hours ago
> :::::: commit date: 7 hours ago
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
>    ld: drivers/mmc/host/sqhci.o: in function `sqhci_finalize_request':
> >> (.text+0x644): undefined reference to `mmc_cqe_request_done'

OK. I will fix this issue in the next version, after waiting a while to
see whether there are any further comments on this patch set. Thanks.

-- 
Baolin Wang
Best Regards

* Re: [PATCH v3 0/3] Add MMC software queue support
  2019-09-19  5:58 [PATCH v3 0/3] Add MMC software queue support Baolin Wang
                   ` (2 preceding siblings ...)
  2019-09-19  5:58 ` [PATCH v3 3/3] mmc: host: sdhci-sprd: Add software queue support Baolin Wang
@ 2019-09-26  9:43 ` Baolin Wang
  2019-09-26 12:07   ` Adrian Hunter
  3 siblings, 1 reply; 9+ messages in thread
From: Baolin Wang @ 2019-09-26  9:43 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson, asutoshd
  Cc: Orson Zhai, Chunyan Zhang, Arnd Bergmann, Linus Walleij,
	Vincent Guittot, linux-mmc, LKML

Hi Adrian and Ulf,

On Thu, 19 Sep 2019 at 13:59, Baolin Wang <baolin.wang@linaro.org> wrote:
>
> Hi All,
>
> Currently the MMC read/write stack always waits in mmc_blk_rw_wait() for
> the previous request to complete before sending a new request to the
> hardware, or queues a work item to complete the request. This adds
> context-switching overhead, especially at high I/O-per-second rates, and
> hurts IO performance.
>
> This patch set therefore introduces MMC software command queue support
> based on the command queue engine's interfaces, and sets the queue depth
> to 2. That means we do not need to wait for the previous request to
> complete and can keep 2 requests in flight. This is enough to let the irq
> handler always trigger the next request without a context switch and then
> ask the blk_mq layer for the next one to queue, while also avoiding long
> latency.
>
> Moreover, we can extend the MMC software queue interface to support
> MMC packed requests or packed commands instead of adding new interfaces,
> according to the previous discussion.
>
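
A minimal sketch of the depth-2 idea described above, written as a
user-space C model rather than kernel code. Every name here (swq_model,
swq_queue_request, swq_irq_complete) is hypothetical and does not come
from the patch set. The sketch only assumes what the cover letter states:
at most 2 requests are accepted, and the completion interrupt starts the
waiting one directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SWQ_DEPTH 2	/* queue depth stated in the cover letter */

/* Hypothetical user-space model; none of these names are from the patch. */
struct swq_model {
	int slot[SWQ_DEPTH];	/* request ids; slot[0] is on the hardware */
	size_t count;		/* accepted requests, 0..SWQ_DEPTH */
	int issued;		/* how many requests were started on hardware */
};

/* Pretend to start the request in slot[0] on the hardware. */
static void swq_start(struct swq_model *q)
{
	q->issued++;
}

/*
 * Block layer side: accept a request if a slot is free. It is started
 * at once only when the hardware is idle; otherwise it waits in slot[1].
 */
static bool swq_queue_request(struct swq_model *q, int id)
{
	if (q->count == SWQ_DEPTH)
		return false;	/* full, the caller has to retry later */
	q->slot[q->count++] = id;
	if (q->count == 1)
		swq_start(q);
	return true;
}

/*
 * Completion interrupt side: retire the running request and, if another
 * one is already waiting, start it from here without a context switch.
 * Returns the id of the completed request.
 */
static int swq_irq_complete(struct swq_model *q)
{
	int done;

	assert(q->count > 0);
	done = q->slot[0];
	q->count--;
	if (q->count > 0) {
		q->slot[0] = q->slot[1];
		swq_start(q);
	}
	return done;
}
```

In this model, queuing a second request while the first is running does
not touch the hardware; the swq_irq_complete() call for the first request
is what starts the second one.
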
> Below is some comparison data from the fio tool. I used the fio command
> below, varying the '--rw' parameter and enabling the direct IO flag, to
> measure the actual hardware transfer speed with a 4K block size.
>
> ./fio --filename=/dev/mmcblk0p30 --direct=1 --iodepth=20 --rw=read --bs=4K --size=512M --group_reporting --numjobs=20 --name=test_read
>
> My eMMC card working at HS400 Enhanced strobe mode:
> [    2.229856] mmc0: new HS400 Enhanced strobe MMC card at address 0001
> [    2.237566] mmcblk0: mmc0:0001 HBG4a2 29.1 GiB
> [    2.242621] mmcblk0boot0: mmc0:0001 HBG4a2 partition 1 4.00 MiB
> [    2.249110] mmcblk0boot1: mmc0:0001 HBG4a2 partition 2 4.00 MiB
> [    2.255307] mmcblk0rpmb: mmc0:0001 HBG4a2 partition 3 4.00 MiB, chardev (248:0)
>
> 1. Without MMC software queue
> I ran each case 3 times and report the average speed.
>
> 1) Sequential read:
> Speed: 28.9MiB/s, 26.4MiB/s, 30.9MiB/s
> Average speed: 28.7MiB/s
>
> 2) Random read:
> Speed: 18.2MiB/s, 8.9MiB/s, 15.8MiB/s
> Average speed: 14.3MiB/s
>
> 3) Sequential write:
> Speed: 21.1MiB/s, 27.9MiB/s, 25MiB/s
> Average speed: 24.7MiB/s
>
> 4) Random write:
> Speed: 21.5MiB/s, 18.1MiB/s, 18.1MiB/s
> Average speed: 19.2MiB/s
>
> 2. With MMC software queue
> I ran each case 3 times and report the average speed.
>
> 1) Sequential read:
> Speed: 44.1MiB/s, 42.3MiB/s, 44.4MiB/s
> Average speed: 43.6MiB/s
>
> 2) Random read:
> Speed: 30.6MiB/s, 30.9MiB/s, 30.5MiB/s
> Average speed: 30.6MiB/s
>
> 3) Sequential write:
> Speed: 44.1MiB/s, 45.9MiB/s, 44.2MiB/s
> Average speed: 44.7MiB/s
>
> 4) Random write:
> Speed: 45.1MiB/s, 43.3MiB/s, 42.4MiB/s
> Average speed: 43.6MiB/s
>
> From the above data, we can see that the MMC software queue clearly
> improves performance.
>
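
As a rough cross-check of the figures quoted above, the per-case means and
the resulting speedups can be recomputed with a small C helper. The
function and array names are made up for illustration; only the sample
values are taken from the results. Under that arithmetic the gain is
roughly 1.5x for sequential read, 2.1x for random read, 1.8x for
sequential write and 2.3x for random write.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n throughput samples, in MiB/s. */
static double mean_mibs(const double *samples, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/* Ratio of the mean with the software queue to the mean without it. */
static double speedup(const double *without, const double *with, size_t n)
{
	return mean_mibs(with, n) / mean_mibs(without, n);
}

/* The three fio runs per case, copied from the results quoted above. */
static const double seq_read_without[]   = { 28.9, 26.4, 30.9 };
static const double seq_read_with[]      = { 44.1, 42.3, 44.4 };
static const double rand_read_without[]  = { 18.2,  8.9, 15.8 };
static const double rand_read_with[]     = { 30.6, 30.9, 30.5 };
static const double seq_write_without[]  = { 21.1, 27.9, 25.0 };
static const double seq_write_with[]     = { 44.1, 45.9, 44.2 };
static const double rand_write_without[] = { 21.5, 18.1, 18.1 };
static const double rand_write_with[]    = { 45.1, 43.3, 42.4 };
```

For example, speedup(rand_read_without, rand_read_with, 3) gives the
random read ratio.
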
> Any comments are welcome. Thanks a lot.
>
> Changes from v2:
>  - Remove reference to 'struct cqhci_host' and 'struct cqhci_slot',
>  instead adding 'struct sqhci_host', which is only used by software queue.
>
> Changes from v1:
>  - Add request_done ops for sdhci_ops.
>  - Replace virtual command queue with software queue for functions and
>  variables.
>  - Rename the software queue file and add sqhci.h header file.

Do you have any comments on this patch set, other than the config build
issue that will be fixed in the next version? Thanks.

>
> Baolin Wang (3):
>   mmc: Add MMC software queue support
>   mmc: host: sdhci: Add request_done ops for struct sdhci_ops
>   mmc: host: sdhci-sprd: Add software queue support
>
>  drivers/mmc/core/block.c      |   61 ++++++++
>  drivers/mmc/core/mmc.c        |   13 +-
>  drivers/mmc/core/queue.c      |   25 ++-
>  drivers/mmc/host/Kconfig      |    9 ++
>  drivers/mmc/host/Makefile     |    1 +
>  drivers/mmc/host/sdhci-sprd.c |   26 ++++
>  drivers/mmc/host/sdhci.c      |   12 +-
>  drivers/mmc/host/sdhci.h      |    2 +
>  drivers/mmc/host/sqhci.c      |  344 +++++++++++++++++++++++++++++++++++++++++
>  drivers/mmc/host/sqhci.h      |   53 +++++++
>  include/linux/mmc/host.h      |    3 +
>  11 files changed, 537 insertions(+), 12 deletions(-)
>  create mode 100644 drivers/mmc/host/sqhci.c
>  create mode 100644 drivers/mmc/host/sqhci.h
>
> --
> 1.7.9.5
>


-- 
Baolin Wang
Best Regards

* Re: [PATCH v3 0/3] Add MMC software queue support
  2019-09-26  9:43 ` [PATCH v3 0/3] Add MMC " Baolin Wang
@ 2019-09-26 12:07   ` Adrian Hunter
  2019-09-27  9:33     ` Baolin Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian Hunter @ 2019-09-26 12:07 UTC (permalink / raw)
  To: Baolin Wang, Ulf Hansson, asutoshd
  Cc: Orson Zhai, Chunyan Zhang, Arnd Bergmann, Linus Walleij,
	Vincent Guittot, linux-mmc, LKML

On 26/09/19 12:43 PM, Baolin Wang wrote:
> Hi Adrian and Ulf,
> 
> Do you have any comments on this patch set, other than the config build
> issue that will be fixed in the next version? Thanks.

Pedantically, sqhci is not a host controller interface, so the name still
seems inappropriate. Otherwise I haven't had time to look at it, sorry.

* Re: [PATCH v3 0/3] Add MMC software queue support
  2019-09-26 12:07   ` Adrian Hunter
@ 2019-09-27  9:33     ` Baolin Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Baolin Wang @ 2019-09-27  9:33 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, asutoshd, Orson Zhai, Chunyan Zhang, Arnd Bergmann,
	Linus Walleij, Vincent Guittot, linux-mmc, LKML

On Thu, 26 Sep 2019 at 20:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 26/09/19 12:43 PM, Baolin Wang wrote:
> > Hi Adrian and Ulf,
> >
> > Do you have any comments on this patch set, other than the config build
> > issue that will be fixed in the next version? Thanks.
>
> Pedantically, sqhci is not a host controller interface, so the name still
> seems inappropriate. Otherwise I haven't had time to look at it, sorry.

OK. I will discuss a better name with Ulf. Thanks.

-- 
Baolin Wang
Best Regards

Thread overview: 9+ messages
2019-09-19  5:58 [PATCH v3 0/3] Add MMC software queue support Baolin Wang
2019-09-19  5:58 ` [PATCH v3 1/3] mmc: " Baolin Wang
2019-09-21 14:49   ` kbuild test robot
2019-09-23  7:53     ` Baolin Wang
2019-09-19  5:58 ` [PATCH v3 2/3] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
2019-09-19  5:58 ` [PATCH v3 3/3] mmc: host: sdhci-sprd: Add software queue support Baolin Wang
2019-09-26  9:43 ` [PATCH v3 0/3] Add MMC " Baolin Wang
2019-09-26 12:07   ` Adrian Hunter
2019-09-27  9:33     ` Baolin Wang
