* [PATCH V5 00/13] mmc: Add Command Queue support
@ 2017-08-10 12:08 Adrian Hunter
  2017-08-10 12:08 ` [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now() Adrian Hunter
                   ` (16 more replies)
  0 siblings, 17 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Hi

Here is V5 of the hardware command queue patches without the software
command queue patches.

HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
2% drop in sequential read speed but no change to sequential write.


Changes since V4:
      mmc: core: Add mmc_retune_hold_now()
	Add explanation to commit message.
      mmc: host: Add CQE interface
	Add comments to callback declarations.
      mmc: core: Turn off CQE before sending commands
	Add explanation to commit message.
      mmc: core: Add support for handling CQE requests
	Add comments as requested by Ulf.
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
	New patch.
      mmc: mmc: Enable Command Queuing
	Adjust for removal of MMC_CAP2_PACKED_CMD.
	Add a comment about Packed Commands.
      mmc: mmc: Enable CQE's
	Remove unnecessary check for MMC_CAP2_CQE
      mmc: block: Use local variables in mmc_blk_data_prep()
	New patch.
      mmc: block: Prepare CQE data
	Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
	Remove priority setting.
	Add explanation to commit message.
      mmc: cqhci: support for command queue enabled host
	Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA

Changes since V3:
	Adjusted ...blk_end_request...() for new block status codes
	Fixed CQHCI transaction descriptor for "no DCMD" case

Changes since V2:
	Dropped patches that have been applied.
	Re-based
	Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"

Changes since V1:

	"Share mmc request array between partitions" is dependent
	on changes in "Introduce queue semantics", so added that
	and block fixes:

	Added "Fix is_waiting_last_req set incorrectly"
	Added "Fix cmd error reset failure path"
	Added "Use local var for mqrq_cur"
	Added "Introduce queue semantics"

Changes since RFC:

	Re-based on next.
	Added comment about command queue priority.
	Added some acks and reviews.


Adrian Hunter (12):
      mmc: core: Add mmc_retune_hold_now()
      mmc: core: Add members to mmc_request and mmc_data for CQE's
      mmc: host: Add CQE interface
      mmc: core: Turn off CQE before sending commands
      mmc: core: Add support for handling CQE requests
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
      mmc: mmc: Enable Command Queuing
      mmc: mmc: Enable CQE's
      mmc: block: Use local variables in mmc_blk_data_prep()
      mmc: block: Prepare CQE data
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/core/block.c          |  238 +++++++-
 drivers/mmc/core/block.h          |    7 +
 drivers/mmc/core/bus.c            |    7 +
 drivers/mmc/core/core.c           |  178 +++++-
 drivers/mmc/core/host.c           |    6 +
 drivers/mmc/core/host.h           |    1 +
 drivers/mmc/core/mmc.c            |   42 +-
 drivers/mmc/core/queue.c          |  273 ++++++++-
 drivers/mmc/core/queue.h          |   42 +-
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1154 +++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 ++++++++
 drivers/mmc/host/sdhci-pci-core.c |  153 ++++-
 include/linux/mmc/core.h          |   18 +-
 include/linux/mmc/host.h          |   63 +-
 include/trace/events/mmc.h        |   36 +-
 17 files changed, 2412 insertions(+), 61 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h


Regards
Adrian


* [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now()
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:26   ` Linus Walleij
  2017-08-22 11:12   ` Ulf Hansson
  2017-08-10 12:08 ` [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's Adrian Hunter
                   ` (15 subsequent siblings)
  16 siblings, 2 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

mmc_retune_hold() / mmc_retune_release() are used around a group of
commands to prevent re-tuning between the commands. Re-tuning can still
happen before the first command. In some cases, re-tuning must be
prevented entirely. Add mmc_retune_hold_now() for that purpose. It is
added in preparation for CQE support, where it will be used by CQE recovery.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
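
For illustration (not part of the patch), a minimal sketch of how the new
helper is intended to be used, e.g. around CQE recovery where re-tuning must
not happen even before the first command; the surrounding function name is
hypothetical:

    static void example_cqe_recovery(struct mmc_host *host)
    {
            /* Block re-tuning entirely, including before the first command */
            mmc_retune_hold_now(host);

            /* ... send recovery commands that must not be interrupted ... */

            /* Allow re-tuning again once recovery is complete */
            mmc_retune_release(host);
    }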
 drivers/mmc/core/host.c | 6 ++++++
 drivers/mmc/core/host.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 1503412f826c..ad88deb2e8f3 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -111,6 +111,12 @@ void mmc_retune_hold(struct mmc_host *host)
 	host->hold_retune += 1;
 }
 
+void mmc_retune_hold_now(struct mmc_host *host)
+{
+	host->retune_now = 0;
+	host->hold_retune += 1;
+}
+
 void mmc_retune_release(struct mmc_host *host)
 {
 	if (host->hold_retune)
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index fb6a76a03833..77d6f60d1bf9 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -19,6 +19,7 @@
 void mmc_retune_enable(struct mmc_host *host);
 void mmc_retune_disable(struct mmc_host *host);
 void mmc_retune_hold(struct mmc_host *host);
+void mmc_retune_hold_now(struct mmc_host *host);
 void mmc_retune_release(struct mmc_host *host);
 int mmc_retune(struct mmc_host *host);
 void mmc_retune_pause(struct mmc_host *host);
-- 
1.9.1



* [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
  2017-08-10 12:08 ` [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now() Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:29   ` Linus Walleij
  2017-08-22 11:12   ` Ulf Hansson
  2017-08-10 12:08 ` [PATCH V5 03/13] mmc: host: Add CQE interface Adrian Hunter
                   ` (14 subsequent siblings)
  16 siblings, 2 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Most of the information needed to issue requests to a CQE is already in
struct mmc_request and struct mmc_data. Add the data block address, some
flags, and the task id (tag), and allow cmd to be NULL, as it is for CQE
tasks.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
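
For illustration (not part of the patch), a rough sketch of how an issuer
might describe a CQE read task using the new members; the values are
hypothetical:

    struct mmc_request mrq = {};
    struct mmc_data data = {};

    data.blksz    = 512;
    data.blocks   = 8;
    data.blk_addr = 4096;               /* new: data block address */
    data.flags    = MMC_DATA_READ;      /* plus new CQE flags, e.g. MMC_DATA_PRIO */

    mrq.data = &data;
    mrq.tag  = 3;                       /* new: CQE task id */
    mrq.cmd  = NULL;                    /* no command for a CQE data task */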
 include/linux/mmc/core.h   | 13 +++++++++++--
 include/trace/events/mmc.h | 36 +++++++++++++++++++++++-------------
 2 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
index a0c63ea28796..bf1788a224e6 100644
--- a/include/linux/mmc/core.h
+++ b/include/linux/mmc/core.h
@@ -122,11 +122,18 @@ struct mmc_data {
 	unsigned int		timeout_clks;	/* data timeout (in clocks) */
 	unsigned int		blksz;		/* data block size */
 	unsigned int		blocks;		/* number of blocks */
+	unsigned int		blk_addr;	/* block address */
 	int			error;		/* data error */
 	unsigned int		flags;
 
-#define MMC_DATA_WRITE	(1 << 8)
-#define MMC_DATA_READ	(1 << 9)
+#define MMC_DATA_WRITE		BIT(8)
+#define MMC_DATA_READ		BIT(9)
+/* Extra flags used by CQE */
+#define MMC_DATA_QBR		BIT(10)		/* CQE queue barrier */
+#define MMC_DATA_PRIO		BIT(11)		/* CQE high priority */
+#define MMC_DATA_REL_WR		BIT(12)		/* Reliable write */
+#define MMC_DATA_DAT_TAG	BIT(13)		/* Tag request */
+#define MMC_DATA_FORCED_PRG	BIT(14)		/* Forced programming */
 
 	unsigned int		bytes_xfered;
 
@@ -153,6 +160,8 @@ struct mmc_request {
 
 	/* Allow other commands during this ongoing data transfer or busy wait */
 	bool			cap_cmd_during_tfr;
+
+	int			tag;
 };
 
 struct mmc_card;
diff --git a/include/trace/events/mmc.h b/include/trace/events/mmc.h
index a72f9b94c80b..f30a99ac65b6 100644
--- a/include/trace/events/mmc.h
+++ b/include/trace/events/mmc.h
@@ -29,8 +29,10 @@
 		__field(unsigned int,		sbc_flags)
 		__field(unsigned int,		sbc_retries)
 		__field(unsigned int,		blocks)
+		__field(unsigned int,		blk_addr)
 		__field(unsigned int,		blksz)
 		__field(unsigned int,		data_flags)
+		__field(int,			tag)
 		__field(unsigned int,		can_retune)
 		__field(unsigned int,		doing_retune)
 		__field(unsigned int,		retune_now)
@@ -42,10 +44,10 @@
 	),
 
 	TP_fast_assign(
-		__entry->cmd_opcode = mrq->cmd->opcode;
-		__entry->cmd_arg = mrq->cmd->arg;
-		__entry->cmd_flags = mrq->cmd->flags;
-		__entry->cmd_retries = mrq->cmd->retries;
+		__entry->cmd_opcode = mrq->cmd ? mrq->cmd->opcode : 0;
+		__entry->cmd_arg = mrq->cmd ? mrq->cmd->arg : 0;
+		__entry->cmd_flags = mrq->cmd ? mrq->cmd->flags : 0;
+		__entry->cmd_retries = mrq->cmd ? mrq->cmd->retries : 0;
 		__entry->stop_opcode = mrq->stop ? mrq->stop->opcode : 0;
 		__entry->stop_arg = mrq->stop ? mrq->stop->arg : 0;
 		__entry->stop_flags = mrq->stop ? mrq->stop->flags : 0;
@@ -56,7 +58,9 @@
 		__entry->sbc_retries = mrq->sbc ? mrq->sbc->retries : 0;
 		__entry->blksz = mrq->data ? mrq->data->blksz : 0;
 		__entry->blocks = mrq->data ? mrq->data->blocks : 0;
+		__entry->blk_addr = mrq->data ? mrq->data->blk_addr : 0;
 		__entry->data_flags = mrq->data ? mrq->data->flags : 0;
+		__entry->tag = mrq->tag;
 		__entry->can_retune = host->can_retune;
 		__entry->doing_retune = host->doing_retune;
 		__entry->retune_now = host->retune_now;
@@ -71,8 +75,8 @@
 		  "cmd_opcode=%u cmd_arg=0x%x cmd_flags=0x%x cmd_retries=%u "
 		  "stop_opcode=%u stop_arg=0x%x stop_flags=0x%x stop_retries=%u "
 		  "sbc_opcode=%u sbc_arg=0x%x sbc_flags=0x%x sbc_retires=%u "
-		  "blocks=%u block_size=%u data_flags=0x%x "
-		  "can_retune=%u doing_retune=%u retune_now=%u "
+		  "blocks=%u block_size=%u blk_addr=%u data_flags=0x%x "
+		  "tag=%d can_retune=%u doing_retune=%u retune_now=%u "
 		  "need_retune=%d hold_retune=%d retune_period=%u",
 		  __get_str(name), __entry->mrq,
 		  __entry->cmd_opcode, __entry->cmd_arg,
@@ -81,7 +85,8 @@
 		  __entry->stop_flags, __entry->stop_retries,
 		  __entry->sbc_opcode, __entry->sbc_arg,
 		  __entry->sbc_flags, __entry->sbc_retries,
-		  __entry->blocks, __entry->blksz, __entry->data_flags,
+		  __entry->blocks, __entry->blksz,
+		  __entry->blk_addr, __entry->data_flags, __entry->tag,
 		  __entry->can_retune, __entry->doing_retune,
 		  __entry->retune_now, __entry->need_retune,
 		  __entry->hold_retune, __entry->retune_period)
@@ -108,6 +113,7 @@
 		__field(unsigned int,		sbc_retries)
 		__field(unsigned int,		bytes_xfered)
 		__field(int,			data_err)
+		__field(int,			tag)
 		__field(unsigned int,		can_retune)
 		__field(unsigned int,		doing_retune)
 		__field(unsigned int,		retune_now)
@@ -119,10 +125,13 @@
 	),
 
 	TP_fast_assign(
-		__entry->cmd_opcode = mrq->cmd->opcode;
-		__entry->cmd_err = mrq->cmd->error;
-		memcpy(__entry->cmd_resp, mrq->cmd->resp, 4);
-		__entry->cmd_retries = mrq->cmd->retries;
+		__entry->cmd_opcode = mrq->cmd ? mrq->cmd->opcode : 0;
+		__entry->cmd_err = mrq->cmd ? mrq->cmd->error : 0;
+		__entry->cmd_resp[0] = mrq->cmd ? mrq->cmd->resp[0] : 0;
+		__entry->cmd_resp[1] = mrq->cmd ? mrq->cmd->resp[1] : 0;
+		__entry->cmd_resp[2] = mrq->cmd ? mrq->cmd->resp[2] : 0;
+		__entry->cmd_resp[3] = mrq->cmd ? mrq->cmd->resp[3] : 0;
+		__entry->cmd_retries = mrq->cmd ? mrq->cmd->retries : 0;
 		__entry->stop_opcode = mrq->stop ? mrq->stop->opcode : 0;
 		__entry->stop_err = mrq->stop ? mrq->stop->error : 0;
 		__entry->stop_resp[0] = mrq->stop ? mrq->stop->resp[0] : 0;
@@ -139,6 +148,7 @@
 		__entry->sbc_retries = mrq->sbc ? mrq->sbc->retries : 0;
 		__entry->bytes_xfered = mrq->data ? mrq->data->bytes_xfered : 0;
 		__entry->data_err = mrq->data ? mrq->data->error : 0;
+		__entry->tag = mrq->tag;
 		__entry->can_retune = host->can_retune;
 		__entry->doing_retune = host->doing_retune;
 		__entry->retune_now = host->retune_now;
@@ -154,7 +164,7 @@
 		  "cmd_retries=%u stop_opcode=%u stop_err=%d "
 		  "stop_resp=0x%x 0x%x 0x%x 0x%x stop_retries=%u "
 		  "sbc_opcode=%u sbc_err=%d sbc_resp=0x%x 0x%x 0x%x 0x%x "
-		  "sbc_retries=%u bytes_xfered=%u data_err=%d "
+		  "sbc_retries=%u bytes_xfered=%u data_err=%d tag=%d "
 		  "can_retune=%u doing_retune=%u retune_now=%u need_retune=%d "
 		  "hold_retune=%d retune_period=%u",
 		  __get_str(name), __entry->mrq,
@@ -170,7 +180,7 @@
 		  __entry->sbc_resp[0], __entry->sbc_resp[1],
 		  __entry->sbc_resp[2], __entry->sbc_resp[3],
 		  __entry->sbc_retries,
-		  __entry->bytes_xfered, __entry->data_err,
+		  __entry->bytes_xfered, __entry->data_err, __entry->tag,
 		  __entry->can_retune, __entry->doing_retune,
 		  __entry->retune_now, __entry->need_retune,
 		  __entry->hold_retune, __entry->retune_period)
-- 
1.9.1



* [PATCH V5 03/13] mmc: host: Add CQE interface
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
  2017-08-10 12:08 ` [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now() Adrian Hunter
  2017-08-10 12:08 ` [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:31   ` Linus Walleij
  2017-08-22 11:13   ` Ulf Hansson
  2017-08-10 12:08 ` [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands Adrian Hunter
                   ` (13 subsequent siblings)
  16 siblings, 2 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Add CQE host operations, capabilities, and host members.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
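
For illustration (not part of the patch), a sketch of how a host controller
driver might wire up the new interface in its probe path; the my_* callback
names and the queue depth are hypothetical:

    static const struct mmc_cqe_ops my_cqe_ops = {
            .cqe_enable             = my_cqe_enable,
            .cqe_disable            = my_cqe_disable,
            .cqe_request            = my_cqe_request,
            .cqe_post_req           = my_cqe_post_req,
            .cqe_off                = my_cqe_off,
            .cqe_wait_for_idle      = my_cqe_wait_for_idle,
            .cqe_timeout            = my_cqe_timeout,
            .cqe_recovery_start     = my_cqe_recovery_start,
            .cqe_recovery_finish    = my_cqe_recovery_finish,
    };

    host->cqe_ops = &my_cqe_ops;
    host->cqe_qdepth = 32;
    host->caps2 |= MMC_CAP2_CQE | MMC_CAP2_CQE_DCMD;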
 include/linux/mmc/host.h | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 4617c21f97f7..89e1d7e2953e 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -162,6 +162,50 @@ struct mmc_host_ops {
 				  unsigned int direction, int blk_size);
 };
 
+struct mmc_cqe_ops {
+	/* Allocate resources, and make the CQE operational */
+	int	(*cqe_enable)(struct mmc_host *host, struct mmc_card *card);
+	/* Free resources, and make the CQE non-operational */
+	void	(*cqe_disable)(struct mmc_host *host);
+	/*
+	 * Issue a read, write or DCMD request to the CQE. Also deal with the
+	 * effect of ->cqe_off().
+	 */
+	int	(*cqe_request)(struct mmc_host *host, struct mmc_request *mrq);
+	/* Free resources (e.g. DMA mapping) associated with the request */
+	void	(*cqe_post_req)(struct mmc_host *host, struct mmc_request *mrq);
+	/*
+	 * Prepare the CQE and host controller to accept non-CQ commands. There
+	 * is no corresponding ->cqe_on(), instead ->cqe_request() is required
+	 * to deal with that.
+	 */
+	void	(*cqe_off)(struct mmc_host *host);
+	/*
+	 * Wait for all CQE tasks to complete. Return an error if recovery
+	 * becomes necessary.
+	 */
+	int	(*cqe_wait_for_idle)(struct mmc_host *host);
+	/*
+	 * Notify CQE that a request has timed out. Return false if the request
+	 * completed or true if a timeout happened in which case indicate if
+	 * recovery is needed.
+	 */
+	bool	(*cqe_timeout)(struct mmc_host *host, struct mmc_request *mrq,
+			       bool *recovery_needed);
+	/*
+	 * Stop all CQE activity and prepare the CQE and host controller to
+	 * accept recovery commands.
+	 */
+	void	(*cqe_recovery_start)(struct mmc_host *host);
+	/*
+	 * Clear the queue and call mmc_cqe_request_done() on all requests.
+	 * Requests that errored will have the error set on the mmc_request
+	 * (data->error or cmd->error for DCMD).  Requests that did not error
+	 * will have zero data bytes transferred.
+	 */
+	void	(*cqe_recovery_finish)(struct mmc_host *host);
+};
+
 struct mmc_async_req {
 	/* active mmc request */
 	struct mmc_request	*mrq;
@@ -307,6 +351,8 @@ struct mmc_host {
 #define MMC_CAP2_HS400_ES	(1 << 20)	/* Host supports enhanced strobe */
 #define MMC_CAP2_NO_SD		(1 << 21)	/* Do not send SD commands during initialization */
 #define MMC_CAP2_NO_MMC		(1 << 22)	/* Do not send (e)MMC commands during initialization */
+#define MMC_CAP2_CQE		(1 << 23)	/* Has eMMC command queue engine */
+#define MMC_CAP2_CQE_DCMD	(1 << 24)	/* CQE can issue a direct command */
 
 	mmc_pm_flag_t		pm_caps;	/* supported pm features */
 
@@ -390,6 +436,19 @@ struct mmc_host {
 	int			dsr_req;	/* DSR value is valid */
 	u32			dsr;	/* optional driver stage (DSR) value */
 
+	/* Command Queue Engine (CQE) support */
+	const struct mmc_cqe_ops *cqe_ops;
+	void			*cqe_private;
+	/*
+	 * Notify upper layers (e.g. mmc block driver) that CQE needs recovery
+	 * due to an error associated with the mmc_request.
+	 */
+	void			(*cqe_recovery_notifier)(struct mmc_host *,
+							 struct mmc_request *);
+	int			cqe_qdepth;
+	bool			cqe_enabled;
+	bool			cqe_on;
+
 	unsigned long		private[0] ____cacheline_aligned;
 };
 
-- 
1.9.1



* [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (2 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 03/13] mmc: host: Add CQE interface Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:32   ` Linus Walleij
  2017-08-22 11:13   ` Ulf Hansson
  2017-08-10 12:08 ` [PATCH V5 05/13] mmc: core: Add support for handling CQE requests Adrian Hunter
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

CQE needs to be off for the host controller to accept non-CQ commands. Turn
off the CQE before sending commands, and ensure it is off in any reset or
power-management paths, and before re-tuning.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 6177eb09bf1b..29544aa2824a 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -260,6 +260,9 @@ static void __mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 
 	trace_mmc_request_start(host, mrq);
 
+	if (host->cqe_on)
+		host->cqe_ops->cqe_off(host);
+
 	host->ops->request(host, mrq);
 }
 
@@ -979,6 +982,9 @@ int mmc_execute_tuning(struct mmc_card *card)
 	if (!host->ops->execute_tuning)
 		return 0;
 
+	if (host->cqe_on)
+		host->cqe_ops->cqe_off(host);
+
 	if (mmc_card_mmc(card))
 		opcode = MMC_SEND_TUNING_BLOCK_HS200;
 	else
@@ -1018,6 +1024,9 @@ void mmc_set_bus_width(struct mmc_host *host, unsigned int width)
  */
 void mmc_set_initial_state(struct mmc_host *host)
 {
+	if (host->cqe_on)
+		host->cqe_ops->cqe_off(host);
+
 	mmc_retune_disable(host);
 
 	if (mmc_host_is_spi(host))
-- 
1.9.1



* [PATCH V5 05/13] mmc: core: Add support for handling CQE requests
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (3 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:39   ` Linus Walleij
  2017-08-22  8:06   ` Ulf Hansson
  2017-08-10 12:08 ` [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD Adrian Hunter
                   ` (11 subsequent siblings)
  16 siblings, 2 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Add core support for handling CQE requests, including starting, completing
and recovering.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
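
For illustration (not part of the patch), a rough sketch of the expected
issue/completion flow using the new exports; the surrounding context (issuer,
host driver interrupt handler) is assumed, not shown:

    /* Issuer (e.g. block driver): start a CQE request */
    mrq->done = my_done_callback;           /* hypothetical completion callback */
    err = mmc_cqe_start_req(host, mrq);     /* re-tunes if needed, calls ->cqe_request() */

    /* CQE host driver, when the task completes (e.g. in its IRQ handler) */
    mmc_cqe_request_done(host, mrq);        /* flags re-tuning on CRC errors, calls mrq->done() */

    /* Issuer, after completion */
    mmc_cqe_post_req(host, mrq);            /* lets the host driver free e.g. DMA mappings */

    /* Issuer, if an error requires recovery */
    err = mmc_cqe_recovery(host);           /* stops CQE and discards the eMMC queue */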
 drivers/mmc/core/core.c  | 169 +++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/mmc/core.h |   5 ++
 2 files changed, 169 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 29544aa2824a..9483d0bf39bf 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -266,7 +266,8 @@ static void __mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 	host->ops->request(host, mrq);
 }
 
-static void mmc_mrq_pr_debug(struct mmc_host *host, struct mmc_request *mrq)
+static void mmc_mrq_pr_debug(struct mmc_host *host, struct mmc_request *mrq,
+			     bool cqe)
 {
 	if (mrq->sbc) {
 		pr_debug("<%s: starting CMD%u arg %08x flags %08x>\n",
@@ -275,9 +276,12 @@ static void mmc_mrq_pr_debug(struct mmc_host *host, struct mmc_request *mrq)
 	}
 
 	if (mrq->cmd) {
-		pr_debug("%s: starting CMD%u arg %08x flags %08x\n",
-			 mmc_hostname(host), mrq->cmd->opcode, mrq->cmd->arg,
-			 mrq->cmd->flags);
+		pr_debug("%s: starting %sCMD%u arg %08x flags %08x\n",
+			 mmc_hostname(host), cqe ? "CQE direct " : "",
+			 mrq->cmd->opcode, mrq->cmd->arg, mrq->cmd->flags);
+	} else if (cqe) {
+		pr_debug("%s: starting CQE transfer for tag %d blkaddr %u\n",
+			 mmc_hostname(host), mrq->tag, mrq->data->blk_addr);
 	}
 
 	if (mrq->data) {
@@ -342,7 +346,7 @@ static int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 	if (mmc_card_removed(host->card))
 		return -ENOMEDIUM;
 
-	mmc_mrq_pr_debug(host, mrq);
+	mmc_mrq_pr_debug(host, mrq, false);
 
 	WARN_ON(!host->claimed);
 
@@ -482,6 +486,161 @@ void mmc_wait_for_req_done(struct mmc_host *host, struct mmc_request *mrq)
 }
 EXPORT_SYMBOL(mmc_wait_for_req_done);
 
+/*
+ * mmc_cqe_start_req - Start a CQE request.
+ * @host: MMC host to start the request
+ * @mrq: request to start
+ *
+ * Start the request, re-tuning if needed and it is possible. Returns an error
+ * code if the request fails to start or -EBUSY if CQE is busy.
+ */
+int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	int err;
+
+	/*
+	 * CQE cannot process re-tuning commands. Caller must hold retuning
+	 * while CQE is in use.  Re-tuning can happen here only when CQE has no
+	 * active requests i.e. this is the first.  Note, re-tuning will call
+	 * ->cqe_off().
+	 */
+	err = mmc_retune(host);
+	if (err)
+		goto out_err;
+
+	mrq->host = host;
+
+	mmc_mrq_pr_debug(host, mrq, true);
+
+	err = mmc_mrq_prep(host, mrq);
+	if (err)
+		goto out_err;
+
+	err = host->cqe_ops->cqe_request(host, mrq);
+	if (err)
+		goto out_err;
+
+	trace_mmc_request_start(host, mrq);
+
+	return 0;
+
+out_err:
+	if (mrq->cmd) {
+		pr_debug("%s: failed to start CQE direct CMD%u, error %d\n",
+			 mmc_hostname(host), mrq->cmd->opcode, err);
+	} else {
+		pr_debug("%s: failed to start CQE transfer for tag %d, error %d\n",
+			 mmc_hostname(host), mrq->tag, err);
+	}
+	return err;
+}
+EXPORT_SYMBOL(mmc_cqe_start_req);
+
+static void __mmc_cqe_request_done(struct mmc_host *host,
+				   struct mmc_request *mrq)
+{
+	mmc_should_fail_request(host, mrq);
+
+	/* Flag re-tuning needed on CRC errors */
+	if ((mrq->cmd && mrq->cmd->error == -EILSEQ) ||
+	    (mrq->data && mrq->data->error == -EILSEQ))
+		mmc_retune_needed(host);
+
+	trace_mmc_request_done(host, mrq);
+
+	if (mrq->cmd) {
+		pr_debug("%s: CQE req done (direct CMD%u): %d\n",
+			 mmc_hostname(host), mrq->cmd->opcode, mrq->cmd->error);
+	} else {
+		pr_debug("%s: CQE transfer done tag %d\n",
+			 mmc_hostname(host), mrq->tag);
+	}
+
+	if (mrq->data) {
+		pr_debug("%s:     %d bytes transferred: %d\n",
+			 mmc_hostname(host),
+			 mrq->data->bytes_xfered, mrq->data->error);
+	}
+}
+
+/**
+ *	mmc_cqe_request_done - CQE has finished processing an MMC request
+ *	@host: MMC host which completed request
+ *	@mrq: MMC request which completed
+ *
+ *	CQE drivers should call this function when they have completed
+ *	their processing of a request.
+ */
+void mmc_cqe_request_done(struct mmc_host *host, struct mmc_request *mrq)
+{
+	__mmc_cqe_request_done(host, mrq);
+
+	mrq->done(mrq);
+}
+EXPORT_SYMBOL(mmc_cqe_request_done);
+
+/**
+ *	mmc_cqe_post_req - CQE post process of a completed MMC request
+ *	@host: MMC host
+ *	@mrq: MMC request to be processed
+ */
+void mmc_cqe_post_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	if (host->cqe_ops->cqe_post_req)
+		host->cqe_ops->cqe_post_req(host, mrq);
+}
+EXPORT_SYMBOL(mmc_cqe_post_req);
+
+/* Arbitrary 1 second timeout */
+#define MMC_CQE_RECOVERY_TIMEOUT	1000
+
+/*
+ * mmc_cqe_recovery - Recover from CQE errors.
+ * @host: MMC host to recover
+ *
+ * Recovery consists of stopping CQE, stopping eMMC, discarding the queue in
+	 * eMMC, and discarding the queue in CQE. CQE must call
+ * mmc_cqe_request_done() on all requests. An error is returned if the eMMC
+ * fails to discard its queue.
+ */
+int mmc_cqe_recovery(struct mmc_host *host)
+{
+	struct mmc_command cmd;
+	int err;
+
+	mmc_retune_hold_now(host);
+
+	/*
+	 * Recovery is expected seldom, if at all, but it reduces performance,
+	 * so make sure it is not completely silent.
+	 */
+	pr_warn("%s: running CQE recovery\n", mmc_hostname(host));
+
+	host->cqe_ops->cqe_recovery_start(host);
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.opcode       = MMC_STOP_TRANSMISSION,
+	cmd.flags        = MMC_RSP_R1B | MMC_CMD_AC,
+	cmd.flags       &= ~MMC_RSP_CRC; /* Ignore CRC */
+	cmd.busy_timeout = MMC_CQE_RECOVERY_TIMEOUT,
+	mmc_wait_for_cmd(host, &cmd, 0);
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.opcode       = MMC_CMDQ_TASK_MGMT;
+	cmd.arg          = 1; /* Discard entire queue */
+	cmd.flags        = MMC_RSP_R1B | MMC_CMD_AC;
+	cmd.flags       &= ~MMC_RSP_CRC; /* Ignore CRC */
+	cmd.busy_timeout = MMC_CQE_RECOVERY_TIMEOUT,
+	err = mmc_wait_for_cmd(host, &cmd, 0);
+
+	host->cqe_ops->cqe_recovery_finish(host);
+
+	mmc_retune_release(host);
+
+	return err;
+}
+EXPORT_SYMBOL(mmc_cqe_recovery);
+
 /**
  *	mmc_is_req_done - Determine if a 'cap_cmd_during_tfr' request is done
  *	@host: MMC host
diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
index bf1788a224e6..1974fcfd4284 100644
--- a/include/linux/mmc/core.h
+++ b/include/linux/mmc/core.h
@@ -174,6 +174,11 @@ struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 int mmc_wait_for_cmd(struct mmc_host *host, struct mmc_command *cmd,
 		int retries);
 
+int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq);
+void mmc_cqe_request_done(struct mmc_host *host, struct mmc_request *mrq);
+void mmc_cqe_post_req(struct mmc_host *host, struct mmc_request *mrq);
+int mmc_cqe_recovery(struct mmc_host *host);
+
 int mmc_hw_reset(struct mmc_host *host);
 void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card);
 
-- 
1.9.1



* [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (4 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 05/13] mmc: core: Add support for handling CQE requests Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:33   ` Linus Walleij
  2017-08-22 11:12   ` Ulf Hansson
  2017-08-10 12:08 ` [PATCH V5 07/13] mmc: mmc: Enable Command Queuing Adrian Hunter
                   ` (10 subsequent siblings)
  16 siblings, 2 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Packed command support was removed, but some bits got left behind. Remove
them.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/mmc.c   | 23 -----------------------
 include/linux/mmc/host.h |  4 ----
 2 files changed, 27 deletions(-)

diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index adb5c88fd054..5fcb8d721224 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1790,29 +1790,6 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
 	 */
 	card->reenable_cmdq = card->ext_csd.cmdq_en;
 
-	/*
-	 * The mandatory minimum values are defined for packed command.
-	 * read: 5, write: 3
-	 */
-	if (card->ext_csd.max_packed_writes >= 3 &&
-	    card->ext_csd.max_packed_reads >= 5 &&
-	    host->caps2 & MMC_CAP2_PACKED_CMD) {
-		err = mmc_switch(card, EXT_CSD_CMD_SET_NORMAL,
-				EXT_CSD_EXP_EVENTS_CTRL,
-				EXT_CSD_PACKED_EVENT_EN,
-				card->ext_csd.generic_cmd6_time);
-		if (err && err != -EBADMSG)
-			goto free_card;
-		if (err) {
-			pr_warn("%s: Enabling packed event failed\n",
-				mmc_hostname(card->host));
-			card->ext_csd.packed_event_en = 0;
-			err = 0;
-		} else {
-			card->ext_csd.packed_event_en = 1;
-		}
-	}
-
 	if (!oldcard)
 		host->card = card;
 
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 89e1d7e2953e..5f3eb4a0373f 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -335,10 +335,6 @@ struct mmc_host {
 				 MMC_CAP2_HS200_1_2V_SDR)
 #define MMC_CAP2_CD_ACTIVE_HIGH	(1 << 10)	/* Card-detect signal active high */
 #define MMC_CAP2_RO_ACTIVE_HIGH	(1 << 11)	/* Write-protect signal active high */
-#define MMC_CAP2_PACKED_RD	(1 << 12)	/* Allow packed read */
-#define MMC_CAP2_PACKED_WR	(1 << 13)	/* Allow packed write */
-#define MMC_CAP2_PACKED_CMD	(MMC_CAP2_PACKED_RD | \
-				 MMC_CAP2_PACKED_WR)
 #define MMC_CAP2_NO_PRESCAN_POWERUP (1 << 14)	/* Don't power up before scan */
 #define MMC_CAP2_HS400_1_8V	(1 << 15)	/* Can support HS400 1.8V */
 #define MMC_CAP2_HS400_1_2V	(1 << 16)	/* Can support HS400 1.2V */
-- 
1.9.1



* [PATCH V5 07/13] mmc: mmc: Enable Command Queuing
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (5 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:33   ` Linus Walleij
  2017-08-10 12:08 ` [PATCH V5 08/13] mmc: mmc: Enable CQE's Adrian Hunter
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Enable the Command Queue if the host controller supports a command queue
engine. It is not compatible with Packed Commands, so make a note of that in the
comment.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/mmc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index 5fcb8d721224..6f6e57d04971 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1784,6 +1784,23 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
 	}
 
 	/*
+	 * Enable Command Queue if supported. Note that Packed Commands cannot
+	 * be used with Command Queue.
+	 */
+	card->ext_csd.cmdq_en = false;
+	if (card->ext_csd.cmdq_support && host->caps2 & MMC_CAP2_CQE) {
+		err = mmc_cmdq_enable(card);
+		if (err && err != -EBADMSG)
+			goto free_card;
+		if (err) {
+			pr_warn("%s: Enabling CMDQ failed\n",
+				mmc_hostname(card->host));
+			card->ext_csd.cmdq_support = false;
+			card->ext_csd.cmdq_depth = 0;
+			err = 0;
+		}
+	}
+	/*
 	 * In some cases (e.g. RPMB or mmc_test), the Command Queue must be
 	 * disabled for a time, so a flag is needed to indicate to re-enable the
 	 * Command Queue.
-- 
1.9.1



* [PATCH V5 08/13] mmc: mmc: Enable CQE's
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (6 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 07/13] mmc: mmc: Enable Command Queuing Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:41   ` Linus Walleij
  2017-08-10 12:08 ` [PATCH V5 09/13] mmc: block: Use local variables in mmc_blk_data_prep() Adrian Hunter
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Enable or disable CQE when a card is added or removed respectively.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/bus.c |  7 +++++++
 drivers/mmc/core/mmc.c | 12 ++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index 301246513a37..a4b49e25fe96 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -369,10 +369,17 @@ int mmc_add_card(struct mmc_card *card)
  */
 void mmc_remove_card(struct mmc_card *card)
 {
+	struct mmc_host *host = card->host;
+
 #ifdef CONFIG_DEBUG_FS
 	mmc_remove_card_debugfs(card);
 #endif
 
+	if (host->cqe_enabled) {
+		host->cqe_ops->cqe_disable(host);
+		host->cqe_enabled = false;
+	}
+
 	if (mmc_card_present(card)) {
 		if (mmc_host_is_spi(card->host)) {
 			pr_info("%s: SPI card removed\n",
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index 6f6e57d04971..93c61c1fce10 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1807,6 +1807,18 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
 	 */
 	card->reenable_cmdq = card->ext_csd.cmdq_en;
 
+	if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
+		err = host->cqe_ops->cqe_enable(host, card);
+		if (err) {
+			pr_err("%s: Failed to enable CQE, error %d\n",
+				mmc_hostname(host), err);
+		} else {
+			host->cqe_enabled = true;
+			pr_info("%s: Command Queue Engine enabled\n",
+				mmc_hostname(host));
+		}
+	}
+
 	if (!oldcard)
 		host->card = card;
 
-- 
1.9.1



* [PATCH V5 09/13] mmc: block: Use local variables in mmc_blk_data_prep()
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (7 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 08/13] mmc: mmc: Enable CQE's Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 11:43   ` Linus Walleij
  2017-08-10 12:08 ` [PATCH V5 10/13] mmc: block: Prepare CQE data Adrian Hunter
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Use local variables in mmc_blk_data_prep() in preparation for adding CQE
support which doesn't use the output variables.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index a11beadda26f..bd7a899979d4 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1485,21 +1485,22 @@ static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
 }
 
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
-			      int disable_multi, bool *do_rel_wr,
-			      bool *do_data_tag)
+			      int disable_multi, bool *do_rel_wr_p,
+			      bool *do_data_tag_p)
 {
 	struct mmc_blk_data *md = mq->blkdata;
 	struct mmc_card *card = md->queue.card;
 	struct mmc_blk_request *brq = &mqrq->brq;
 	struct request *req = mmc_queue_req_to_req(mqrq);
+	bool do_rel_wr, do_data_tag;
 
 	/*
 	 * Reliable writes are used to implement Forced Unit Access and
 	 * are supported only on MMCs.
 	 */
-	*do_rel_wr = (req->cmd_flags & REQ_FUA) &&
-		     rq_data_dir(req) == WRITE &&
-		     (md->flags & MMC_BLK_REL_WR);
+	do_rel_wr = (req->cmd_flags & REQ_FUA) &&
+		    rq_data_dir(req) == WRITE &&
+		    (md->flags & MMC_BLK_REL_WR);
 
 	memset(brq, 0, sizeof(struct mmc_blk_request));
 
@@ -1547,18 +1548,18 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 						brq->data.blocks);
 	}
 
-	if (*do_rel_wr)
+	if (do_rel_wr)
 		mmc_apply_rel_rw(brq, card, req);
 
 	/*
 	 * Data tag is used only during writing meta data to speed
 	 * up write and any subsequent read of this meta data
 	 */
-	*do_data_tag = card->ext_csd.data_tag_unit_size &&
-		       (req->cmd_flags & REQ_META) &&
-		       (rq_data_dir(req) == WRITE) &&
-		       ((brq->data.blocks * brq->data.blksz) >=
-			card->ext_csd.data_tag_unit_size);
+	do_data_tag = card->ext_csd.data_tag_unit_size &&
+		      (req->cmd_flags & REQ_META) &&
+		      (rq_data_dir(req) == WRITE) &&
+		      ((brq->data.blocks * brq->data.blksz) >=
+		       card->ext_csd.data_tag_unit_size);
 
 	mmc_set_data_timeout(&brq->data, card);
 
@@ -1587,6 +1588,12 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 	mqrq->areq.mrq = &brq->mrq;
 
 	mmc_queue_bounce_pre(mqrq);
+
+	if (do_rel_wr_p)
+		*do_rel_wr_p = do_rel_wr;
+
+	if (do_data_tag_p)
+		*do_data_tag_p = do_data_tag;
 }
 
 static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
-- 
1.9.1



* [PATCH V5 10/13] mmc: block: Prepare CQE data
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (8 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 09/13] mmc: block: Use local variables in mmc_blk_data_prep() Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-10 12:08 ` [PATCH V5 11/13] mmc: block: Add CQE support Adrian Hunter
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Enhance mmc_blk_data_prep() to support CQE requests. That means adding
some information that, for non-CQE requests, would be encoded into the
command arguments - such as the block address, reliable-write flag, and
data tag flag. The request tag is also needed to provide the command queue
task id, and a comment is added to explain the future possibility of
defining a priority.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
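
For illustration (not part of the patch), the fields that mmc_blk_data_prep()
would now fill in for a hypothetical 8-sector FUA write at sector 4096 with
block-layer tag 5, on a card with reliable writes enabled:

    brq->mrq.tag       = 5;      /* from req->tag, becomes the CQE task id */
    brq->data.blk_addr = 4096;   /* from blk_rq_pos(req) */
    brq->data.blksz    = 512;
    brq->data.blocks   = 8;
    brq->data.flags    = MMC_DATA_WRITE | MMC_DATA_REL_WR;
    /* MMC_DATA_PRIO is never set here, i.e. "simple" priority */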
 drivers/mmc/core/block.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index bd7a899979d4..d8677b28af5c 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1505,6 +1505,7 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 	memset(brq, 0, sizeof(struct mmc_blk_request));
 
 	brq->mrq.data = &brq->data;
+	brq->mrq.tag = req->tag;
 
 	brq->stop.opcode = MMC_STOP_TRANSMISSION;
 	brq->stop.arg = 0;
@@ -1519,6 +1520,14 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 
 	brq->data.blksz = 512;
 	brq->data.blocks = blk_rq_sectors(req);
+	brq->data.blk_addr = blk_rq_pos(req);
+
+	/*
+	 * The command queue supports 2 priorities: "high" (1) and "simple" (0).
+	 * The eMMC will give "high" priority tasks priority over "simple"
+	 * priority tasks. Here we always set "simple" priority by not setting
+	 * MMC_DATA_PRIO.
+	 */
 
 	/*
 	 * The block layer doesn't support all sector count
@@ -1548,8 +1557,10 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 						brq->data.blocks);
 	}
 
-	if (do_rel_wr)
+	if (do_rel_wr) {
 		mmc_apply_rel_rw(brq, card, req);
+		brq->data.flags |= MMC_DATA_REL_WR;
+	}
 
 	/*
 	 * Data tag is used only during writing meta data to speed
@@ -1561,6 +1572,9 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		      ((brq->data.blocks * brq->data.blksz) >=
 		       card->ext_csd.data_tag_unit_size);
 
+	if (do_data_tag)
+		brq->data.flags |= MMC_DATA_DAT_TAG;
+
 	mmc_set_data_timeout(&brq->data, card);
 
 	brq->data.sg = mqrq->sg;
-- 
1.9.1



* [PATCH V5 11/13] mmc: block: Add CQE support
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (9 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 10/13] mmc: block: Prepare CQE data Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-20 12:13   ` Linus Walleij
  2017-08-10 12:08 ` [PATCH V5 12/13] mmc: cqhci: support for command queue enabled host Adrian Hunter
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Add CQE support to the block driver, including:
	- optionally using DCMD for flush requests
	- manually issuing discard requests
	- issuing read / write requests to the CQE
	- supporting block-layer timeouts
	- handling recovery
	- supporting re-tuning

CQE offers 25% - 50% better random multi-threaded I/O.  There is a slight
(e.g. 2%) drop in sequential read speed but no observable change to sequential
write.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 195 ++++++++++++++++++++++++++++++++-
 drivers/mmc/core/block.h |   7 ++
 drivers/mmc/core/queue.c | 273 ++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/mmc/core/queue.h |  42 +++++++-
 4 files changed, 510 insertions(+), 7 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index d8677b28af5c..f1fc9a3b8b53 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -108,6 +108,7 @@ struct mmc_blk_data {
 #define MMC_BLK_WRITE		BIT(1)
 #define MMC_BLK_DISCARD		BIT(2)
 #define MMC_BLK_SECDISCARD	BIT(3)
+#define MMC_BLK_CQE_RECOVERY	BIT(4)
 
 	/*
 	 * Only set in main mmc_blk_data associated
@@ -1610,6 +1611,198 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		*do_data_tag_p = do_data_tag;
 }
 
+#define MMC_CQE_RETRIES 2
+
+void mmc_blk_cqe_complete_rq(struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	unsigned long flags;
+	bool put_card;
+	int err;
+
+	mmc_cqe_post_req(host, mrq);
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	mq->cqe_in_flight[mmc_cqe_issue_type(host, req)] -= 1;
+
+	put_card = mmc_cqe_tot_in_flight(mq) == 0;
+
+	if (mrq->cmd && mrq->cmd->error)
+		err = mrq->cmd->error;
+	else if (mrq->data && mrq->data->error)
+		err = mrq->data->error;
+	else
+		err = 0;
+
+	if (err) {
+		if (mqrq->retries++ < MMC_CQE_RETRIES)
+			blk_requeue_request(q, req);
+		else
+			__blk_end_request_all(req, BLK_STS_IOERR);
+	} else if (mrq->data) {
+		if (__blk_end_request(req, BLK_STS_OK, mrq->data->bytes_xfered))
+			blk_requeue_request(q, req);
+	} else {
+		__blk_end_request_all(req, BLK_STS_OK);
+	}
+
+	mmc_cqe_kick_queue(mq);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (put_card)
+		mmc_put_card(mq->card);
+}
+
+void mmc_blk_cqe_recovery(struct mmc_queue *mq)
+{
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	int err;
+
+	mmc_get_card(card);
+
+	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
+
+	mq->cqe_in_recovery = true;
+
+	err = mmc_cqe_recovery(host);
+	if (err)
+		mmc_blk_reset(mq->blkdata, host, MMC_BLK_CQE_RECOVERY);
+	else
+		mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
+
+	mq->cqe_in_recovery = false;
+
+	pr_debug("%s: CQE recovery done\n", mmc_hostname(host));
+
+	mmc_put_card(card);
+}
+
+static void mmc_blk_cqe_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->cqe_in_recovery)
+		mmc_blk_cqe_complete_rq(req);
+	else
+		blk_complete_request(req);
+}
+
+static int mmc_blk_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	mrq->done = mmc_blk_cqe_req_done;
+	return mmc_cqe_start_req(host, mrq);
+}
+
+static struct mmc_request *mmc_blk_cqe_prep_dcmd(struct mmc_queue_req *mqrq,
+						 struct request *req)
+{
+	struct mmc_blk_request *brq = &mqrq->brq;
+
+	memset(brq, 0, sizeof(*brq));
+
+	brq->mrq.cmd = &brq->cmd;
+	brq->mrq.tag = req->tag;
+
+	return &brq->mrq;
+}
+
+static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = mmc_blk_cqe_prep_dcmd(mqrq, req);
+
+	mrq->cmd->opcode = MMC_SWITCH;
+	mrq->cmd->arg = (MMC_SWITCH_MODE_WRITE_BYTE << 24) |
+			(EXT_CSD_FLUSH_CACHE << 16) |
+			(1 << 8) |
+			EXT_CSD_CMD_SET_NORMAL;
+	mrq->cmd->flags = MMC_CMD_AC | MMC_RSP_R1B;
+
+	return mmc_blk_cqe_start_req(mq->card->host, mrq);
+}
+
+static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
+
+	return mmc_blk_cqe_start_req(mq->card->host, &mqrq->brq.mrq);
+}
+
+enum mmc_issued mmc_blk_cqe_issue_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_card *card = md->queue.card;
+	struct mmc_host *host = card->host;
+	int ret;
+
+	ret = mmc_blk_part_switch(card, md);
+	if (ret)
+		return MMC_REQ_FAILED_TO_START;
+
+	switch (mmc_cqe_issue_type(host, req)) {
+	case MMC_ISSUE_SYNC:
+		ret = host->cqe_ops->cqe_wait_for_idle(host);
+		if (ret)
+			return MMC_REQ_BUSY;
+		switch (req_op(req)) {
+		case REQ_OP_DRV_IN:
+		case REQ_OP_DRV_OUT:
+			mmc_blk_issue_drv_op(mq, req);
+			break;
+		case REQ_OP_DISCARD:
+			mmc_blk_issue_discard_rq(mq, req);
+			break;
+		case REQ_OP_SECURE_ERASE:
+			mmc_blk_issue_secdiscard_rq(mq, req);
+			break;
+		case REQ_OP_FLUSH:
+			mmc_blk_issue_flush(mq, req);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			return MMC_REQ_FAILED_TO_START;
+		}
+		return MMC_REQ_FINISHED;
+	case MMC_ISSUE_DCMD:
+	case MMC_ISSUE_ASYNC:
+		switch (req_op(req)) {
+		case REQ_OP_FLUSH:
+			ret = mmc_blk_cqe_issue_flush(mq, req);
+			break;
+		case REQ_OP_READ:
+		case REQ_OP_WRITE:
+			ret = mmc_blk_cqe_issue_rw_rq(mq, req);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			ret = -EINVAL;
+		}
+		if (!ret)
+			return MMC_REQ_STARTED;
+		return ret == -EBUSY ? MMC_REQ_BUSY : MMC_REQ_FAILED_TO_START;
+	default:
+		WARN_ON_ONCE(1);
+		return MMC_REQ_FAILED_TO_START;
+	}
+}
+
 static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 			       struct mmc_card *card,
 			       int disable_multi,
@@ -2033,7 +2226,7 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 	INIT_LIST_HEAD(&md->part);
 	md->usage = 1;
 
-	ret = mmc_init_queue(&md->queue, card, &md->lock, subname);
+	ret = mmc_init_queue(&md->queue, card, &md->lock, subname, area_type);
 	if (ret)
 		goto err_putdisk;
 
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index 860ca7c8df86..d7b3d7008b00 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -6,4 +6,11 @@
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
 
+enum mmc_issued;
+
+enum mmc_issued mmc_blk_cqe_issue_rq(struct mmc_queue *mq,
+				     struct request *req);
+void mmc_blk_cqe_complete_rq(struct request *rq);
+void mmc_blk_cqe_recovery(struct mmc_queue *mq);
+
 #endif
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index affa7370ba82..0cb7b0e8ee58 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -36,10 +36,254 @@ static int mmc_prep_request(struct request_queue *q, struct request *req)
 		return BLKPREP_KILL;
 
 	req->rq_flags |= RQF_DONTPREP;
+	req_to_mmc_queue_req(req)->retries = 0;
 
 	return BLKPREP_OK;
 }
 
+static void mmc_cqe_request_fn(struct request_queue *q)
+{
+	struct mmc_queue *mq = q->queuedata;
+	struct request *req;
+
+	if (!mq) {
+		while ((req = blk_fetch_request(q)) != NULL) {
+			req->rq_flags |= RQF_QUIET;
+			__blk_end_request_all(req, BLK_STS_IOERR);
+		}
+		return;
+	}
+
+	if (mq->asleep && !mq->cqe_busy)
+		wake_up_process(mq->thread);
+}
+
+static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
+{
+	/* Allow only 1 DCMD at a time */
+	return mq->cqe_in_flight[MMC_ISSUE_DCMD];
+}
+
+void mmc_cqe_kick_queue(struct mmc_queue *mq)
+{
+	if ((mq->cqe_busy & MMC_CQE_DCMD_BUSY) && !mmc_cqe_dcmd_busy(mq))
+		mq->cqe_busy &= ~MMC_CQE_DCMD_BUSY;
+
+	mq->cqe_busy &= ~MMC_CQE_QUEUE_FULL;
+
+	if (mq->asleep && !mq->cqe_busy)
+		__blk_run_queue(mq->queue);
+}
+
+static inline bool mmc_cqe_can_dcmd(struct mmc_host *host)
+{
+	return host->caps2 & MMC_CAP2_CQE_DCMD;
+}
+
+enum mmc_issue_type mmc_cqe_issue_type(struct mmc_host *host,
+				       struct request *req)
+{
+	switch (req_op(req)) {
+	case REQ_OP_DRV_IN:
+	case REQ_OP_DRV_OUT:
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+		return MMC_ISSUE_SYNC;
+	case REQ_OP_FLUSH:
+		return mmc_cqe_can_dcmd(host) ? MMC_ISSUE_DCMD : MMC_ISSUE_SYNC;
+	default:
+		return MMC_ISSUE_ASYNC;
+	}
+}
+
+static void __mmc_cqe_recovery_notifier(struct mmc_queue *mq)
+{
+	if (!mq->cqe_recovery_needed) {
+		mq->cqe_recovery_needed = true;
+		wake_up_process(mq->thread);
+	}
+}
+
+static void mmc_cqe_recovery_notifier(struct mmc_host *host,
+				      struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	__mmc_cqe_recovery_notifier(mq);
+	spin_unlock_irqrestore(q->queue_lock, flags);
+}
+
+static int mmc_cqe_thread(void *d)
+{
+	struct mmc_queue *mq = d;
+	struct request_queue *q = mq->queue;
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	unsigned long flags;
+	int get_put = 0;
+
+	current->flags |= PF_MEMALLOC;
+
+	down(&mq->thread_sem);
+	spin_lock_irqsave(q->queue_lock, flags);
+	while (1) {
+		struct request *req = NULL;
+		enum mmc_issue_type issue_type;
+		bool retune_ok = false;
+
+		if (mq->cqe_recovery_needed) {
+			spin_unlock_irqrestore(q->queue_lock, flags);
+			mmc_blk_cqe_recovery(mq);
+			spin_lock_irqsave(q->queue_lock, flags);
+			mq->cqe_recovery_needed = false;
+		}
+
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		if (!kthread_should_stop())
+			req = blk_peek_request(q);
+
+		if (req) {
+			issue_type = mmc_cqe_issue_type(host, req);
+			switch (issue_type) {
+			case MMC_ISSUE_DCMD:
+				if (mmc_cqe_dcmd_busy(mq)) {
+					mq->cqe_busy |= MMC_CQE_DCMD_BUSY;
+					req = NULL;
+					break;
+				}
+				/* Fall through */
+			case MMC_ISSUE_ASYNC:
+				if (blk_queue_start_tag(q, req)) {
+					mq->cqe_busy |= MMC_CQE_QUEUE_FULL;
+					req = NULL;
+				}
+				break;
+			default:
+				/*
+				 * Timeouts are handled by mmc core, so set a
+				 * large value to avoid races.
+				 */
+				req->timeout = 600 * HZ;
+				blk_start_request(req);
+				break;
+			}
+			if (req) {
+				mq->cqe_in_flight[issue_type] += 1;
+				if (mmc_cqe_tot_in_flight(mq) == 1)
+					get_put += 1;
+				if (mmc_cqe_qcnt(mq) == 1)
+					retune_ok = true;
+			}
+		}
+
+		mq->asleep = !req;
+
+		spin_unlock_irqrestore(q->queue_lock, flags);
+
+		if (req) {
+			enum mmc_issued issued;
+
+			set_current_state(TASK_RUNNING);
+
+			if (get_put) {
+				get_put = 0;
+				mmc_get_card(card);
+			}
+
+			if (host->need_retune && retune_ok &&
+			    !host->hold_retune)
+				host->retune_now = true;
+			else
+				host->retune_now = false;
+
+			issued = mmc_blk_cqe_issue_rq(mq, req);
+
+			cond_resched();
+
+			spin_lock_irqsave(q->queue_lock, flags);
+
+			switch (issued) {
+			case MMC_REQ_STARTED:
+				break;
+			case MMC_REQ_BUSY:
+				blk_requeue_request(q, req);
+				goto finished;
+			case MMC_REQ_FAILED_TO_START:
+				__blk_end_request_all(req, BLK_STS_IOERR);
+				/* Fall through */
+			case MMC_REQ_FINISHED:
+finished:
+				mq->cqe_in_flight[issue_type] -= 1;
+				if (mmc_cqe_tot_in_flight(mq) == 0)
+					get_put = -1;
+			}
+		} else {
+			if (get_put < 0) {
+				get_put = 0;
+				mmc_put_card(card);
+			}
+			/*
+			 * Do not stop with requests in flight in case recovery
+			 * is needed.
+			 */
+			if (kthread_should_stop() &&
+			    !mmc_cqe_tot_in_flight(mq)) {
+				set_current_state(TASK_RUNNING);
+				break;
+			}
+			up(&mq->thread_sem);
+			schedule();
+			down(&mq->thread_sem);
+			spin_lock_irqsave(q->queue_lock, flags);
+		}
+	} /* loop */
+	up(&mq->thread_sem);
+
+	return 0;
+}
+
+static enum blk_eh_timer_return __mmc_cqe_timed_out(struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_queue *mq = req->q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	enum mmc_issue_type issue_type = mmc_cqe_issue_type(host, req);
+	bool recovery_needed = false;
+
+	switch (issue_type) {
+	case MMC_ISSUE_ASYNC:
+	case MMC_ISSUE_DCMD:
+		if (host->cqe_ops->cqe_timeout(host, mrq, &recovery_needed)) {
+			if (recovery_needed)
+				__mmc_cqe_recovery_notifier(mq);
+			return BLK_EH_RESET_TIMER;
+		}
+		/* No timeout */
+		return BLK_EH_HANDLED;
+	default:
+		/* Timeout is handled by mmc core */
+		return BLK_EH_RESET_TIMER;
+	}
+}
+
+static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
+{
+	struct mmc_queue *mq = req->q->queuedata;
+
+	if (mq->cqe_recovery_needed)
+		return BLK_EH_RESET_TIMER;
+
+	return __mmc_cqe_timed_out(req);
+}
+
 static int mmc_queue_thread(void *d)
 {
 	struct mmc_queue *mq = d;
@@ -233,11 +477,12 @@ static void mmc_exit_request(struct request_queue *q, struct request *req)
  * Initialise a MMC card request queue.
  */
 int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
-		   spinlock_t *lock, const char *subname)
+		   spinlock_t *lock, const char *subname, int area_type)
 {
 	struct mmc_host *host = card->host;
 	u64 limit = BLK_BOUNCE_HIGH;
 	int ret = -ENOMEM;
+	bool use_cqe = host->cqe_enabled && area_type != MMC_BLK_DATA_AREA_RPMB;
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = (u64)dma_max_pfn(mmc_dev(host)) << PAGE_SHIFT;
@@ -247,7 +492,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	if (!mq->queue)
 		return -ENOMEM;
 	mq->queue->queue_lock = lock;
-	mq->queue->request_fn = mmc_request_fn;
+	mq->queue->request_fn = use_cqe ? mmc_cqe_request_fn : mmc_request_fn;
 	mq->queue->init_rq_fn = mmc_init_request;
 	mq->queue->exit_rq_fn = mmc_exit_request;
 	mq->queue->cmd_size = sizeof(struct mmc_queue_req);
@@ -259,6 +504,24 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 		return ret;
 	}
 
+	if (use_cqe) {
+		int q_depth = card->ext_csd.cmdq_depth;
+
+		if (q_depth > host->cqe_qdepth)
+			q_depth = host->cqe_qdepth;
+
+		ret = blk_queue_init_tags(mq->queue, q_depth, NULL,
+					  BLK_TAG_ALLOC_FIFO);
+		if (ret)
+			goto cleanup_queue;
+
+		blk_queue_softirq_done(mq->queue, mmc_blk_cqe_complete_rq);
+		blk_queue_rq_timed_out(mq->queue, mmc_cqe_timed_out);
+		blk_queue_rq_timeout(mq->queue, 60 * HZ);
+
+		host->cqe_recovery_notifier = mmc_cqe_recovery_notifier;
+	}
+
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);
 	queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, mq->queue);
@@ -280,9 +543,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 
 	sema_init(&mq->thread_sem, 1);
 
-	mq->thread = kthread_run(mmc_queue_thread, mq, "mmcqd/%d%s",
-		host->index, subname ? subname : "");
-
+	mq->thread = kthread_run(use_cqe ? mmc_cqe_thread : mmc_queue_thread,
+				 mq, "mmcqd/%d%s", host->index,
+				 subname ? subname : "");
 	if (IS_ERR(mq->thread)) {
 		ret = PTR_ERR(mq->thread);
 		goto cleanup_queue;
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 361b46408e0f..8e9273d977c0 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -7,6 +7,20 @@
 #include <linux/mmc/core.h>
 #include <linux/mmc/host.h>
 
+enum mmc_issued {
+	MMC_REQ_STARTED,
+	MMC_REQ_BUSY,
+	MMC_REQ_FAILED_TO_START,
+	MMC_REQ_FINISHED,
+};
+
+enum mmc_issue_type {
+	MMC_ISSUE_SYNC,
+	MMC_ISSUE_DCMD,
+	MMC_ISSUE_ASYNC,
+	MMC_ISSUE_MAX,
+};
+
 static inline struct mmc_queue_req *req_to_mmc_queue_req(struct request *rq)
 {
 	return blk_mq_rq_to_pdu(rq);
@@ -53,6 +67,7 @@ struct mmc_queue_req {
 	int			drv_op_result;
 	struct mmc_blk_ioc_data	**idata;
 	unsigned int		ioc_count;
+	int			retries;
 };
 
 struct mmc_queue {
@@ -70,10 +85,17 @@ struct mmc_queue {
 	 * associated mmc_queue_req data.
 	 */
 	int			qcnt;
+	/* Following are defined for a Command Queue Engine */
+	int			cqe_in_flight[MMC_ISSUE_MAX];
+	unsigned int		cqe_busy;
+	bool			cqe_recovery_needed;
+	bool			cqe_in_recovery;
+#define MMC_CQE_DCMD_BUSY	BIT(0)
+#define MMC_CQE_QUEUE_FULL	BIT(1)
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
-			  const char *);
+			  const char *, int);
 extern void mmc_cleanup_queue(struct mmc_queue *);
 extern void mmc_queue_suspend(struct mmc_queue *);
 extern void mmc_queue_resume(struct mmc_queue *);
@@ -85,4 +107,22 @@ extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
 
 extern int mmc_access_rpmb(struct mmc_queue *);
 
+void mmc_cqe_kick_queue(struct mmc_queue *mq);
+
+enum mmc_issue_type mmc_cqe_issue_type(struct mmc_host *host,
+				       struct request *req);
+
+static inline int mmc_cqe_tot_in_flight(struct mmc_queue *mq)
+{
+	return mq->cqe_in_flight[MMC_ISSUE_SYNC] +
+	       mq->cqe_in_flight[MMC_ISSUE_DCMD] +
+	       mq->cqe_in_flight[MMC_ISSUE_ASYNC];
+}
+
+static inline int mmc_cqe_qcnt(struct mmc_queue *mq)
+{
+	return mq->cqe_in_flight[MMC_ISSUE_DCMD] +
+	       mq->cqe_in_flight[MMC_ISSUE_ASYNC];
+}
+
 #endif
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH V5 12/13] mmc: cqhci: support for command queue enabled host
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (10 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 11/13] mmc: block: Add CQE support Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-10 12:08 ` [PATCH V5 13/13] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

From: Venkat Gopalakrishnan <venkatg@codeaurora.org>

This patch adds CMDQ support for command-queue compatible
hosts.

Command queueing was added in the eMMC 5.1 specification. It
enables the controller to process up to 32 requests at
a time.

Adrian Hunter contributed renaming to cqhci, recovery, suspend
and resume, cqhci_off, cqhci_wait_for_idle, and external timeout
handling.
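
For a host controller driver, the hookup looks roughly like the sketch
below.  This is only an outline, not part of the patch: it is based on
the exported cqhci API added here and mirrors the Intel GLK glue in
patch 13/13, and the my_* names are hypothetical placeholders.

#include <linux/err.h>
#include <linux/platform_device.h>
#include <linux/mmc/host.h>
#include "cqhci.h"

/* Host-specific hooks (.enable, .disable, .dumpregs, ...) are optional */
static const struct cqhci_host_ops my_cqhci_ops = { };

static int my_add_cqhci(struct platform_device *pdev, struct mmc_host *mmc,
			bool dma64)
{
	struct cqhci_host *cq_host;

	/* Maps the "cqhci_mem" resource and allocates the cqhci_host */
	cq_host = cqhci_pltfm_init(pdev);
	if (IS_ERR(cq_host))
		return PTR_ERR(cq_host);

	cq_host->ops = &my_cqhci_ops;
	mmc->caps2 |= MMC_CAP2_CQE;	/* opt in to command queueing */

	/* Sets mmc->cqe_ops and cqe_qdepth, allocates the slot array */
	return cqhci_init(cq_host, mmc, dma64);
}

The host driver's interrupt handler then forwards CQE interrupts with
cqhci_irq(mmc, intmask, cmd_error, data_error), and its system/runtime
PM callbacks chain cqhci_suspend() / cqhci_resume() with the host's own
suspend/resume, as patch 13/13 does for GLK.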

Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org>
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/Kconfig  |   13 +
 drivers/mmc/host/Makefile |    1 +
 drivers/mmc/host/cqhci.c  | 1154 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h  |  240 ++++++++++
 4 files changed, 1408 insertions(+)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 60f90d49e7a9..ccf7dab4a9f8 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -843,6 +843,19 @@ config MMC_SUNXI
 	  This selects support for the SD/MMC Host Controller on
 	  Allwinner sunxi SoCs.
 
+config MMC_CQHCI
+	tristate "Command Queue Host Controller Interface support"
+	depends on HAS_DMA
+	help
+	  This selects the Command Queue Host Controller Interface (CQHCI)
+	  support present in host controllers of Qualcomm Technologies, Inc.,
+	  amongst others.
+	  This controller supports eMMC devices with command queue support.
+
+	  If you have a controller with this interface, say Y or M here.
+
+	  If unsure, say N.
+
 config MMC_TOSHIBA_PCI
 	tristate "Toshiba Type A SD/MMC Card Interface Driver"
 	depends on PCI
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 8c46766c000c..3ae71e006890 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -88,6 +88,7 @@ obj-$(CONFIG_MMC_SDHCI_MSM)		+= sdhci-msm.o
 obj-$(CONFIG_MMC_SDHCI_ST)		+= sdhci-st.o
 obj-$(CONFIG_MMC_SDHCI_MICROCHIP_PIC32)	+= sdhci-pic32.o
 obj-$(CONFIG_MMC_SDHCI_BRCMSTB)		+= sdhci-brcmstb.o
+obj-$(CONFIG_MMC_CQHCI)			+= cqhci.o
 
 ifeq ($(CONFIG_CB710_DEBUG),y)
 	CFLAGS-cb710-mmc	+= -DDEBUG
diff --git a/drivers/mmc/host/cqhci.c b/drivers/mmc/host/cqhci.c
new file mode 100644
index 000000000000..8650a13ef5cd
--- /dev/null
+++ b/drivers/mmc/host/cqhci.c
@@ -0,0 +1,1154 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/delay.h>
+#include <linux/highmem.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/dma-mapping.h>
+#include <linux/slab.h>
+#include <linux/scatterlist.h>
+#include <linux/platform_device.h>
+#include <linux/ktime.h>
+
+#include <linux/mmc/mmc.h>
+#include <linux/mmc/host.h>
+#include <linux/mmc/card.h>
+
+#include "cqhci.h"
+
+#define DCMD_SLOT 31
+#define NUM_SLOTS 32
+
+struct cqhci_slot {
+	struct mmc_request *mrq;
+	unsigned int flags;
+#define CQHCI_EXTERNAL_TIMEOUT	BIT(0)
+#define CQHCI_COMPLETED		BIT(1)
+#define CQHCI_HOST_CRC		BIT(2)
+#define CQHCI_HOST_TIMEOUT	BIT(3)
+#define CQHCI_HOST_OTHER	BIT(4)
+};
+
+static inline u8 *get_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->desc_base + (tag * cq_host->slot_sz);
+}
+
+static inline u8 *get_link_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *desc = get_desc(cq_host, tag);
+
+	return desc + cq_host->task_desc_len;
+}
+
+static inline dma_addr_t get_trans_desc_dma(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_dma_base +
+		(cq_host->mmc->max_segs * tag *
+		 cq_host->trans_desc_len);
+}
+
+static inline u8 *get_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_base +
+		(cq_host->trans_desc_len * cq_host->mmc->max_segs * tag);
+}
+
+static void setup_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *link_temp;
+	dma_addr_t trans_temp;
+
+	link_temp = get_link_desc(cq_host, tag);
+	trans_temp = get_trans_desc_dma(cq_host, tag);
+
+	memset(link_temp, 0, cq_host->link_desc_len);
+	if (cq_host->link_desc_len > 8)
+		*(link_temp + 8) = 0;
+
+	if (tag == DCMD_SLOT && (cq_host->mmc->caps2 & MMC_CAP2_CQE_DCMD)) {
+		*link_temp = CQHCI_VALID(0) | CQHCI_ACT(0) | CQHCI_END(1);
+		return;
+	}
+
+	*link_temp = CQHCI_VALID(1) | CQHCI_ACT(0x6) | CQHCI_END(0);
+
+	if (cq_host->dma64) {
+		__le64 *data_addr = (__le64 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le64(trans_temp);
+	} else {
+		__le32 *data_addr = (__le32 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le32(trans_temp);
+	}
+}
+
+static void cqhci_set_irqs(struct cqhci_host *cq_host, u32 set)
+{
+	u32 ier;
+
+	ier = cqhci_readl(cq_host, CQHCI_ISTE);
+	ier |= set;
+	cqhci_writel(cq_host, ier, CQHCI_ISTE);
+	cqhci_writel(cq_host, ier, CQHCI_ISGE);
+}
+
+#define DRV_NAME "cqhci"
+
+#define CQHCI_DUMP(f, x...) \
+	pr_err("%s: " DRV_NAME ": " f, mmc_hostname(mmc), ## x)
+
+static void cqhci_dumpregs(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+
+	CQHCI_DUMP("============ CQHCI REGISTER DUMP ===========\n");
+
+	CQHCI_DUMP("Caps:      0x%08x | Version:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CAP),
+		   cqhci_readl(cq_host, CQHCI_VER));
+	CQHCI_DUMP("Config:    0x%08x | Control:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CFG),
+		   cqhci_readl(cq_host, CQHCI_CTL));
+	CQHCI_DUMP("Int stat:  0x%08x | Int enab: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_IS),
+		   cqhci_readl(cq_host, CQHCI_ISTE));
+	CQHCI_DUMP("Int sig:   0x%08x | Int Coal: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_ISGE),
+		   cqhci_readl(cq_host, CQHCI_IC));
+	CQHCI_DUMP("TDL base:  0x%08x | TDL up32: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDLBA),
+		   cqhci_readl(cq_host, CQHCI_TDLBAU));
+	CQHCI_DUMP("Doorbell:  0x%08x | TCN:      0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDBR),
+		   cqhci_readl(cq_host, CQHCI_TCN));
+	CQHCI_DUMP("Dev queue: 0x%08x | Dev Pend: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_DQS),
+		   cqhci_readl(cq_host, CQHCI_DPT));
+	CQHCI_DUMP("Task clr:  0x%08x | SSC1:     0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TCLR),
+		   cqhci_readl(cq_host, CQHCI_SSC1));
+	CQHCI_DUMP("SSC2:      0x%08x | DCMD rsp: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_SSC2),
+		   cqhci_readl(cq_host, CQHCI_CRDCT));
+	CQHCI_DUMP("RED mask:  0x%08x | TERRI:    0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_RMEM),
+		   cqhci_readl(cq_host, CQHCI_TERRI));
+	CQHCI_DUMP("Resp idx:  0x%08x | Resp arg: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CRI),
+		   cqhci_readl(cq_host, CQHCI_CRA));
+
+	if (cq_host->ops->dumpregs)
+		cq_host->ops->dumpregs(mmc);
+	else
+		CQHCI_DUMP(": ===========================================\n");
+}
+
+/**
+ * The allocated descriptor table for task, link & transfer descriptors
+ * looks like:
+ * |----------|
+ * |task desc |  |->|----------|
+ * |----------|  |  |trans desc|
+ * |link desc-|->|  |----------|
+ * |----------|          .
+ *      .                .
+ *  no. of slots      max-segs
+ *      .           |----------|
+ * |----------|
+ * The idea here is to create the [task+trans] table and mark & point the
+ * link desc to the transfer desc table on a per slot basis.
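+ *
+ * Illustrative sizing (not in the original comment): assuming 64-bit DMA
+ * and 128-bit task descriptors, slot_sz = 16 + 16 = 32 bytes, so with 32
+ * slots desc_size = 1024 bytes, and the transfer descriptor area is
+ * trans_desc_len * max_segs * (num_slots - 1) bytes.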
+ */
+static int cqhci_host_alloc_tdl(struct cqhci_host *cq_host)
+{
+	int i = 0;
+
+	/* task descriptor can be 64/128 bit irrespective of arch */
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128) {
+		cqhci_writel(cq_host, cqhci_readl(cq_host, CQHCI_CFG) |
+			       CQHCI_TASK_DESC_SZ, CQHCI_CFG);
+		cq_host->task_desc_len = 16;
+	} else {
+		cq_host->task_desc_len = 8;
+	}
+
+	/*
+	 * The transfer descriptor may be 96 bits long instead of 128 bits,
+	 * which means ADMA would expect the next valid descriptor at the
+	 * 96th bit or the 128th bit.
+	 */
+	if (cq_host->dma64) {
+		if (cq_host->quirks & CQHCI_QUIRK_SHORT_TXFR_DESC_SZ)
+			cq_host->trans_desc_len = 12;
+		else
+			cq_host->trans_desc_len = 16;
+		cq_host->link_desc_len = 16;
+	} else {
+		cq_host->trans_desc_len = 8;
+		cq_host->link_desc_len = 8;
+	}
+
+	/* total size of a slot: 1 task & 1 transfer (link) */
+	cq_host->slot_sz = cq_host->task_desc_len + cq_host->link_desc_len;
+
+	cq_host->desc_size = cq_host->slot_sz * cq_host->num_slots;
+
+	cq_host->data_size = cq_host->trans_desc_len * cq_host->mmc->max_segs *
+		(cq_host->num_slots - 1);
+
+	pr_debug("%s: cqhci: desc_size: %zu data_sz: %zu slot-sz: %d\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_size, cq_host->data_size,
+		 cq_host->slot_sz);
+
+	/*
+	 * allocate a dma-mapped chunk of memory for the descriptors
+	 * allocate a dma-mapped chunk of memory for link descriptors
+	 * setup each link-desc memory offset per slot-number to
+	 * the descriptor table.
+	 */
+	cq_host->desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+						 cq_host->desc_size,
+						 &cq_host->desc_dma_base,
+						 GFP_KERNEL);
+	cq_host->trans_desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+					      cq_host->data_size,
+					      &cq_host->trans_desc_dma_base,
+					      GFP_KERNEL);
+	if (!cq_host->desc_base || !cq_host->trans_desc_base)
+		return -ENOMEM;
+
+	pr_debug("%s: cqhci: desc-base: 0x%p trans-base: 0x%p\n desc_dma 0x%llx trans_dma: 0x%llx\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_base, cq_host->trans_desc_base,
+		(unsigned long long)cq_host->desc_dma_base,
+		(unsigned long long)cq_host->trans_desc_dma_base);
+
+	for (; i < (cq_host->num_slots); i++)
+		setup_trans_desc(cq_host, i);
+
+	return 0;
+}
+
+static void __cqhci_enable(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+
+	/* Configuration must not be changed while enabled */
+	if (cqcfg & CQHCI_ENABLE) {
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+	}
+
+	cqcfg &= ~(CQHCI_DCMD | CQHCI_TASK_DESC_SZ);
+
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		cqcfg |= CQHCI_DCMD;
+
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128)
+		cqcfg |= CQHCI_TASK_DESC_SZ;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cqhci_writel(cq_host, lower_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBA);
+	cqhci_writel(cq_host, upper_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBAU);
+
+	cqhci_writel(cq_host, cq_host->rca, CQHCI_SSC2);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	cqcfg |= CQHCI_ENABLE;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	mmc->cqe_on = true;
+
+	if (cq_host->ops->enable)
+		cq_host->ops->enable(mmc);
+
+	/* Ensure all writes are done before interrupts are enabled */
+	wmb();
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	cq_host->activated = true;
+}
+
+static void __cqhci_disable(struct cqhci_host *cq_host)
+{
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+	cqcfg &= ~CQHCI_ENABLE;
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cq_host->mmc->cqe_on = false;
+
+	cq_host->activated = false;
+}
+
+int cqhci_suspend(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (cq_host->enabled)
+		__cqhci_disable(cq_host);
+
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_suspend);
+
+int cqhci_resume(struct mmc_host *mmc)
+{
+	/* Re-enable is done upon first request */
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_resume);
+
+static int cqhci_enable(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int err;
+
+	if (cq_host->enabled)
+		return 0;
+
+	cq_host->rca = card->rca;
+
+	err = cqhci_host_alloc_tdl(cq_host);
+	if (err)
+		return err;
+
+	__cqhci_enable(cq_host);
+
+	cq_host->enabled = true;
+
+#ifdef DEBUG
+	cqhci_dumpregs(cq_host);
+#endif
+	return 0;
+}
+
+/* CQHCI is idle and should halt immediately, so set a small timeout */
+#define CQHCI_OFF_TIMEOUT 100
+
+static void cqhci_off(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	ktime_t timeout;
+	bool timed_out;
+	u32 reg;
+
+	if (!cq_host->enabled || !mmc->cqe_on || cq_host->recovery_halt)
+		return;
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, false);
+
+	cqhci_writel(cq_host, CQHCI_HALT, CQHCI_CTL);
+
+	timeout = ktime_add_us(ktime_get(), CQHCI_OFF_TIMEOUT);
+	while (1) {
+		timed_out = ktime_compare(ktime_get(), timeout) > 0;
+		reg = cqhci_readl(cq_host, CQHCI_CTL);
+		if ((reg & CQHCI_HALT) || timed_out)
+			break;
+	}
+
+	if (timed_out)
+		pr_err("%s: cqhci: CQE stuck on\n", mmc_hostname(mmc));
+	else
+		pr_debug("%s: cqhci: CQE off\n", mmc_hostname(mmc));
+
+	mmc->cqe_on = false;
+}
+
+static void cqhci_disable(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->enabled)
+		return;
+
+	cqhci_off(mmc);
+
+	__cqhci_disable(cq_host);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->data_size,
+			   cq_host->trans_desc_base,
+			   cq_host->trans_desc_dma_base);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->desc_size,
+			   cq_host->desc_base,
+			   cq_host->desc_dma_base);
+
+	cq_host->trans_desc_base = NULL;
+	cq_host->desc_base = NULL;
+
+	cq_host->enabled = false;
+}
+
+static void cqhci_prep_task_desc(struct mmc_request *mrq,
+					u64 *data, bool intr)
+{
+	u32 req_flags = mrq->data->flags;
+
+	*data = CQHCI_VALID(1) |
+		CQHCI_END(1) |
+		CQHCI_INT(intr) |
+		CQHCI_ACT(0x5) |
+		CQHCI_FORCED_PROG(!!(req_flags & MMC_DATA_FORCED_PRG)) |
+		CQHCI_DATA_TAG(!!(req_flags & MMC_DATA_DAT_TAG)) |
+		CQHCI_DATA_DIR(!!(req_flags & MMC_DATA_READ)) |
+		CQHCI_PRIORITY(!!(req_flags & MMC_DATA_PRIO)) |
+		CQHCI_QBAR(!!(req_flags & MMC_DATA_QBR)) |
+		CQHCI_REL_WRITE(!!(req_flags & MMC_DATA_REL_WR)) |
+		CQHCI_BLK_COUNT(mrq->data->blocks) |
+		CQHCI_BLK_ADDR((u64)mrq->data->blk_addr);
+
+	pr_debug("%s: cqhci: tag %d task descriptor 0x016%llx\n",
+		 mmc_hostname(mrq->host), mrq->tag, (unsigned long long)*data);
+}
+
+static int cqhci_dma_map(struct mmc_host *host, struct mmc_request *mrq)
+{
+	int sg_count;
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return -EINVAL;
+
+	sg_count = dma_map_sg(mmc_dev(host), data->sg,
+			      data->sg_len,
+			      (data->flags & MMC_DATA_WRITE) ?
+			      DMA_TO_DEVICE : DMA_FROM_DEVICE);
+	if (!sg_count) {
+		pr_err("%s: sg-len: %d\n", __func__, data->sg_len);
+		return -ENOMEM;
+	}
+
+	return sg_count;
+}
+
+static void cqhci_set_tran_desc(u8 *desc, dma_addr_t addr, int len, bool end,
+				bool dma64)
+{
+	__le32 *attr = (__le32 __force *)desc;
+
+	*attr = (CQHCI_VALID(1) |
+		 CQHCI_END(end ? 1 : 0) |
+		 CQHCI_INT(0) |
+		 CQHCI_ACT(0x4) |
+		 CQHCI_DAT_LENGTH(len));
+
+	if (dma64) {
+		__le64 *dataddr = (__le64 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le64(addr);
+	} else {
+		__le32 *dataddr = (__le32 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le32(addr);
+	}
+}
+
+static int cqhci_prep_tran_desc(struct mmc_request *mrq,
+			       struct cqhci_host *cq_host, int tag)
+{
+	struct mmc_data *data = mrq->data;
+	int i, sg_count, len;
+	bool end = false;
+	bool dma64 = cq_host->dma64;
+	dma_addr_t addr;
+	u8 *desc;
+	struct scatterlist *sg;
+
+	sg_count = cqhci_dma_map(mrq->host, mrq);
+	if (sg_count < 0) {
+		pr_err("%s: %s: unable to map sg lists, %d\n",
+				mmc_hostname(mrq->host), __func__, sg_count);
+		return sg_count;
+	}
+
+	desc = get_trans_desc(cq_host, tag);
+
+	for_each_sg(data->sg, sg, sg_count, i) {
+		addr = sg_dma_address(sg);
+		len = sg_dma_len(sg);
+
+		if ((i+1) == sg_count)
+			end = true;
+		cqhci_set_tran_desc(desc, addr, len, end, dma64);
+		desc += cq_host->trans_desc_len;
+	}
+
+	return 0;
+}
+
+static void cqhci_prep_dcmd_desc(struct mmc_host *mmc,
+				   struct mmc_request *mrq)
+{
+	u64 *task_desc = NULL;
+	u64 data = 0;
+	u8 resp_type;
+	u8 *desc;
+	__le64 *dataddr;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	u8 timing;
+
+	if (!(mrq->cmd->flags & MMC_RSP_PRESENT)) {
+		resp_type = 0x0;
+		timing = 0x1;
+	} else {
+		if (mrq->cmd->flags & MMC_RSP_R1B) {
+			resp_type = 0x3;
+			timing = 0x0;
+		} else {
+			resp_type = 0x2;
+			timing = 0x1;
+		}
+	}
+
+	task_desc = (__le64 __force *)get_desc(cq_host, cq_host->dcmd_slot);
+	memset(task_desc, 0, cq_host->task_desc_len);
+	data |= (CQHCI_VALID(1) |
+		 CQHCI_END(1) |
+		 CQHCI_INT(1) |
+		 CQHCI_QBAR(1) |
+		 CQHCI_ACT(0x5) |
+		 CQHCI_CMD_INDEX(mrq->cmd->opcode) |
+		 CQHCI_CMD_TIMING(timing) | CQHCI_RESP_TYPE(resp_type));
+	*task_desc |= data;
+	desc = (u8 *)task_desc;
+	pr_debug("%s: cqhci: dcmd: cmd: %d timing: %d resp: %d\n",
+		 mmc_hostname(mmc), mrq->cmd->opcode, timing, resp_type);
+	dataddr = (__le64 __force *)(desc + 4);
+	dataddr[0] = cpu_to_le64((u64)mrq->cmd->arg);
+
+}
+
+static void cqhci_post_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	struct mmc_data *data = mrq->data;
+
+	if (data) {
+		dma_unmap_sg(mmc_dev(host), data->sg, data->sg_len,
+			     (data->flags & MMC_DATA_READ) ?
+			     DMA_FROM_DEVICE : DMA_TO_DEVICE);
+	}
+}
+
+static inline int cqhci_tag(struct mmc_request *mrq)
+{
+	return mrq->cmd ? DCMD_SLOT : mrq->tag;
+}
+
+static int cqhci_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	int err = 0;
+	u64 data = 0;
+	u64 *task_desc = NULL;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+
+	if (!cq_host->enabled) {
+		pr_err("%s: cqhci: not enabled\n", mmc_hostname(mmc));
+		return -EINVAL;
+	}
+
+	/* First request after resume has to re-enable */
+	if (!cq_host->activated)
+		__cqhci_enable(cq_host);
+
+	if (!mmc->cqe_on) {
+		cqhci_writel(cq_host, 0, CQHCI_CTL);
+		mmc->cqe_on = true;
+		pr_debug("%s: cqhci: CQE on\n", mmc_hostname(mmc));
+		if (cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT) {
+			pr_err("%s: cqhci: CQE failed to exit halt state\n",
+			       mmc_hostname(mmc));
+		}
+		if (cq_host->ops->enable)
+			cq_host->ops->enable(mmc);
+	}
+
+	if (mrq->data) {
+		task_desc = (__le64 __force *)get_desc(cq_host, tag);
+		cqhci_prep_task_desc(mrq, &data, 1);
+		*task_desc = cpu_to_le64(data);
+		err = cqhci_prep_tran_desc(mrq, cq_host, tag);
+		if (err) {
+			pr_err("%s: cqhci: failed to setup tx desc: %d\n",
+			       mmc_hostname(mmc), err);
+			return err;
+		}
+	} else {
+		cqhci_prep_dcmd_desc(mmc, mrq);
+	}
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+
+	if (cq_host->recovery_halt) {
+		err = -EBUSY;
+		goto out_unlock;
+	}
+
+	cq_host->slot[tag].mrq = mrq;
+	cq_host->slot[tag].flags = 0;
+
+	cq_host->qcnt += 1;
+
+	cqhci_writel(cq_host, 1 << tag, CQHCI_TDBR);
+	if (!(cqhci_readl(cq_host, CQHCI_TDBR) & (1 << tag)))
+		pr_debug("%s: cqhci: doorbell not set for tag %d\n",
+			 mmc_hostname(mmc), tag);
+out_unlock:
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (err)
+		cqhci_post_req(mmc, mrq);
+
+	return err;
+}
+
+static void cqhci_recovery_needed(struct mmc_host *mmc, struct mmc_request *mrq,
+				  bool notify)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->recovery_halt) {
+		cq_host->recovery_halt = true;
+		pr_debug("%s: cqhci: recovery needed\n", mmc_hostname(mmc));
+		wake_up(&cq_host->wait_queue);
+		if (notify && mmc->cqe_recovery_notifier)
+			mmc->cqe_recovery_notifier(mmc, mrq);
+	}
+}
+
+static unsigned int cqhci_error_flags(int error1, int error2)
+{
+	int error = error1 ? error1 : error2;
+
+	switch (error) {
+	case -EILSEQ:
+		return CQHCI_HOST_CRC;
+	case -ETIMEDOUT:
+		return CQHCI_HOST_TIMEOUT;
+	default:
+		return CQHCI_HOST_OTHER;
+	}
+}
+
+static void cqhci_error_irq(struct mmc_host *mmc, u32 status, int cmd_error,
+			    int data_error)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot;
+	u32 terri;
+	int tag;
+
+	spin_lock(&cq_host->lock);
+
+	terri = cqhci_readl(cq_host, CQHCI_TERRI);
+
+	pr_debug("%s: cqhci: error IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+		 mmc_hostname(mmc), status, cmd_error, data_error, terri);
+
+	/* Forget about errors when recovery has already been triggered */
+	if (cq_host->recovery_halt)
+		goto out_unlock;
+
+	if (!cq_host->qcnt) {
+		WARN_ONCE(1, "%s: cqhci: error when idle. IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+			  mmc_hostname(mmc), status, cmd_error, data_error,
+			  terri);
+		goto out_unlock;
+	}
+
+	if (CQHCI_TERRI_C_VALID(terri)) {
+		tag = CQHCI_TERRI_C_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(cmd_error, data_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (CQHCI_TERRI_D_VALID(terri)) {
+		tag = CQHCI_TERRI_D_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (!cq_host->recovery_halt) {
+		/*
+		 * The only way to guarantee forward progress is to mark at
+		 * least one task in error, so if none is indicated, pick one.
+		 */
+		for (tag = 0; tag < NUM_SLOTS; tag++) {
+			slot = &cq_host->slot[tag];
+			if (!slot->mrq)
+				continue;
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+			break;
+		}
+	}
+
+out_unlock:
+	spin_unlock(&cq_host->lock);
+}
+
+static void cqhci_finish_mrq(struct mmc_host *mmc, unsigned int tag)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq) {
+		WARN_ONCE(1, "%s: cqhci: spurious TCN for tag %d\n",
+			  mmc_hostname(mmc), tag);
+		return;
+	}
+
+	/* No completions allowed during recovery */
+	if (cq_host->recovery_halt) {
+		slot->flags |= CQHCI_COMPLETED;
+		return;
+	}
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		if (data->error)
+			data->bytes_xfered = 0;
+		else
+			data->bytes_xfered = data->blksz * data->blocks;
+	}
+
+	mmc_cqe_request_done(mmc, mrq);
+}
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error)
+{
+	u32 status;
+	unsigned long tag = 0, comp_status;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	status = cqhci_readl(cq_host, CQHCI_IS);
+	cqhci_writel(cq_host, status, CQHCI_IS);
+
+	pr_debug("%s: cqhci: IRQ status: 0x%08x\n", mmc_hostname(mmc), status);
+
+	if ((status & CQHCI_IS_RED) || cmd_error || data_error)
+		cqhci_error_irq(mmc, status, cmd_error, data_error);
+
+	if (status & CQHCI_IS_TCC) {
+		/* read TCN and complete the request */
+		comp_status = cqhci_readl(cq_host, CQHCI_TCN);
+		cqhci_writel(cq_host, comp_status, CQHCI_TCN);
+		pr_debug("%s: cqhci: TCN: 0x%08lx\n",
+			 mmc_hostname(mmc), comp_status);
+
+		spin_lock(&cq_host->lock);
+
+		for_each_set_bit(tag, &comp_status, cq_host->num_slots) {
+			/* complete the corresponding mrq */
+			pr_debug("%s: cqhci: completing tag %lu\n",
+				 mmc_hostname(mmc), tag);
+			cqhci_finish_mrq(mmc, tag);
+		}
+
+		if (cq_host->waiting_for_idle && !cq_host->qcnt) {
+			cq_host->waiting_for_idle = false;
+			wake_up(&cq_host->wait_queue);
+		}
+
+		spin_unlock(&cq_host->lock);
+	}
+
+	if (status & CQHCI_IS_TCL)
+		wake_up(&cq_host->wait_queue);
+
+	if (status & CQHCI_IS_HAC)
+		wake_up(&cq_host->wait_queue);
+
+	return IRQ_HANDLED;
+}
+EXPORT_SYMBOL(cqhci_irq);
+
+static bool cqhci_is_idle(struct cqhci_host *cq_host, int *ret)
+{
+	unsigned long flags;
+	bool is_idle;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	is_idle = !cq_host->qcnt || cq_host->recovery_halt;
+	*ret = cq_host->recovery_halt ? -EBUSY : 0;
+	cq_host->waiting_for_idle = !is_idle;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	return is_idle;
+}
+
+static int cqhci_wait_for_idle(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int ret;
+
+	wait_event(cq_host->wait_queue, cqhci_is_idle(cq_host, &ret));
+
+	return ret;
+}
+
+static bool cqhci_timeout(struct mmc_host *mmc, struct mmc_request *mrq,
+			  bool *recovery_needed)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	unsigned long flags;
+	bool timed_out;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	timed_out = slot->mrq == mrq;
+	if (timed_out) {
+		slot->flags |= CQHCI_EXTERNAL_TIMEOUT;
+		cqhci_recovery_needed(mmc, mrq, false);
+		*recovery_needed = cq_host->recovery_halt;
+	}
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (timed_out) {
+		pr_err("%s: cqhci: timeout for tag %d\n",
+		       mmc_hostname(mmc), tag);
+		cqhci_dumpregs(cq_host);
+	}
+
+	return timed_out;
+}
+
+static bool cqhci_tasks_cleared(struct cqhci_host *cq_host)
+{
+	return !(cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_CLEAR_ALL_TASKS);
+}
+
+static bool cqhci_clear_all_tasks(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_TCL);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_CLEAR_ALL_TASKS;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_tasks_cleared(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_tasks_cleared(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to clear tasks\n",
+			 mmc_hostname(mmc));
+
+	return ret;
+}
+
+static bool cqhci_halted(struct cqhci_host *cq_host)
+{
+	return cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT;
+}
+
+static bool cqhci_halt(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	if (cqhci_halted(cq_host))
+		return true;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_HAC);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_HALT;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_halted(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_halted(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to halt\n", mmc_hostname(mmc));
+
+	return ret;
+}
+
+/*
+ * After halting we expect to be able to use the command line. We interpret the
+ * failure to halt to mean the data lines might still be in use (and the upper
+ * layers will need to send a STOP command), so we set the timeout based on a
+ * generous command timeout.
+ */
+#define CQHCI_START_HALT_TIMEOUT	5
+
+static void cqhci_recovery_start(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	cqhci_halt(mmc, CQHCI_START_HALT_TIMEOUT);
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, true);
+
+	mmc->cqe_on = false;
+}
+
+static int cqhci_error_from_flags(unsigned int flags)
+{
+	if (!flags)
+		return 0;
+
+	/* CRC errors might indicate re-tuning so prefer to report that */
+	if (flags & CQHCI_HOST_CRC)
+		return -EILSEQ;
+
+	if (flags & (CQHCI_EXTERNAL_TIMEOUT | CQHCI_HOST_TIMEOUT))
+		return -ETIMEDOUT;
+
+	return -EIO;
+}
+
+static void cqhci_recover_mrq(struct cqhci_host *cq_host, unsigned int tag)
+{
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq)
+		return;
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		data->bytes_xfered = 0;
+		data->error = cqhci_error_from_flags(slot->flags);
+	} else {
+		mrq->cmd->error = cqhci_error_from_flags(slot->flags);
+	}
+
+	mmc_cqe_request_done(cq_host->mmc, mrq);
+}
+
+static void cqhci_recover_mrqs(struct cqhci_host *cq_host)
+{
+	int i;
+
+	for (i = 0; i < cq_host->num_slots; i++)
+		cqhci_recover_mrq(cq_host, i);
+}
+
+/*
+ * By now the command and data lines should be unused so there is no reason for
+ * CQHCI to take a long time to halt, but if it doesn't halt there could be
+ * problems clearing tasks, so be generous.
+ */
+#define CQHCI_FINISH_HALT_TIMEOUT	20
+
+/* CQHCI could be expected to clear its internal state pretty quickly */
+#define CQHCI_CLEAR_TIMEOUT		20
+
+static void cqhci_recovery_finish(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+	u32 cqcfg;
+	bool ok;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+
+	if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+		ok = false;
+
+	/*
+	 * The specification contradicts itself, by saying that tasks cannot be
+	 * cleared if CQHCI does not halt, but if CQHCI does not halt, it should
+	 * be disabled/re-enabled, but not to disable before clearing tasks.
+	 * Have a go anyway.
+	 */
+	if (!ok) {
+		pr_debug("%s: cqhci: disable / re-enable\n", mmc_hostname(mmc));
+		cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		cqcfg |= CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		/* Be sure that there are no tasks */
+		ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+		if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+			ok = false;
+		WARN_ON(!ok);
+	}
+
+	cqhci_recover_mrqs(cq_host);
+
+	WARN_ON(cq_host->qcnt);
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	cq_host->qcnt = 0;
+	cq_host->recovery_halt = false;
+	mmc->cqe_on = false;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	/* Ensure all writes are done before interrupts are re-enabled */
+	wmb();
+
+	cqhci_writel(cq_host, CQHCI_IS_HAC | CQHCI_IS_TCL, CQHCI_IS);
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	pr_debug("%s: cqhci: recovery done\n", mmc_hostname(mmc));
+}
+
+static const struct mmc_cqe_ops cqhci_cqe_ops = {
+	.cqe_enable = cqhci_enable,
+	.cqe_disable = cqhci_disable,
+	.cqe_request = cqhci_request,
+	.cqe_post_req = cqhci_post_req,
+	.cqe_off = cqhci_off,
+	.cqe_wait_for_idle = cqhci_wait_for_idle,
+	.cqe_timeout = cqhci_timeout,
+	.cqe_recovery_start = cqhci_recovery_start,
+	.cqe_recovery_finish = cqhci_recovery_finish,
+};
+
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev)
+{
+	struct cqhci_host *cq_host;
+	struct resource *cqhci_memres = NULL;
+
+	/* check and setup CMDQ interface */
+	cqhci_memres = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+						   "cqhci_mem");
+	if (!cqhci_memres) {
+		dev_dbg(&pdev->dev, "CMDQ not supported\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	cq_host = devm_kzalloc(&pdev->dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host)
+		return ERR_PTR(-ENOMEM);
+	cq_host->mmio = devm_ioremap(&pdev->dev,
+				     cqhci_memres->start,
+				     resource_size(cqhci_memres));
+	if (!cq_host->mmio) {
+		dev_err(&pdev->dev, "failed to remap cqhci regs\n");
+		return ERR_PTR(-EBUSY);
+	}
+	dev_dbg(&pdev->dev, "CMDQ ioremap: done\n");
+
+	return cq_host;
+}
+EXPORT_SYMBOL(cqhci_pltfm_init);
+
+static unsigned int cqhci_ver_major(struct cqhci_host *cq_host)
+{
+	return CQHCI_VER_MAJOR(cqhci_readl(cq_host, CQHCI_VER));
+}
+
+static unsigned int cqhci_ver_minor(struct cqhci_host *cq_host)
+{
+	u32 ver = cqhci_readl(cq_host, CQHCI_VER);
+
+	return CQHCI_VER_MINOR1(ver) * 10 + CQHCI_VER_MINOR2(ver);
+}
+
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc,
+	      bool dma64)
+{
+	int err;
+
+	cq_host->dma64 = dma64;
+	cq_host->mmc = mmc;
+	cq_host->mmc->cqe_private = cq_host;
+
+	cq_host->num_slots = NUM_SLOTS;
+	cq_host->dcmd_slot = DCMD_SLOT;
+
+	mmc->cqe_ops = &cqhci_cqe_ops;
+
+	mmc->cqe_qdepth = NUM_SLOTS;
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		mmc->cqe_qdepth -= 1;
+
+	cq_host->slot = devm_kcalloc(mmc_dev(mmc), cq_host->num_slots,
+				     sizeof(*cq_host->slot), GFP_KERNEL);
+	if (!cq_host->slot) {
+		err = -ENOMEM;
+		goto out_err;
+	}
+
+	spin_lock_init(&cq_host->lock);
+
+	init_completion(&cq_host->halt_comp);
+	init_waitqueue_head(&cq_host->wait_queue);
+
+	pr_info("%s: CQHCI version %u.%02u\n",
+		mmc_hostname(mmc), cqhci_ver_major(cq_host),
+		cqhci_ver_minor(cq_host));
+
+	return 0;
+
+out_err:
+	pr_err("%s: CQHCI version %u.%02u failed to initialize, error %d\n",
+	       mmc_hostname(mmc), cqhci_ver_major(cq_host),
+	       cqhci_ver_minor(cq_host), err);
+	return err;
+}
+EXPORT_SYMBOL(cqhci_init);
+
+MODULE_AUTHOR("Venkat Gopalakrishnan <venkatg@codeaurora.org>");
+MODULE_DESCRIPTION("Command Queue Host Controller Interface driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/mmc/host/cqhci.h b/drivers/mmc/host/cqhci.h
new file mode 100644
index 000000000000..2d39d361b322
--- /dev/null
+++ b/drivers/mmc/host/cqhci.h
@@ -0,0 +1,240 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#ifndef LINUX_MMC_CQHCI_H
+#define LINUX_MMC_CQHCI_H
+
+#include <linux/compiler.h>
+#include <linux/bitops.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/completion.h>
+#include <linux/wait.h>
+#include <linux/irqreturn.h>
+#include <asm/io.h>
+
+/* registers */
+/* version */
+#define CQHCI_VER			0x00
+#define CQHCI_VER_MAJOR(x)		(((x) & GENMASK(11, 8)) >> 8)
+#define CQHCI_VER_MINOR1(x)		(((x) & GENMASK(7, 4)) >> 4)
+#define CQHCI_VER_MINOR2(x)		((x) & GENMASK(3, 0))
+
+/* capabilities */
+#define CQHCI_CAP			0x04
+/* configuration */
+#define CQHCI_CFG			0x08
+#define CQHCI_DCMD			0x00001000
+#define CQHCI_TASK_DESC_SZ		0x00000100
+#define CQHCI_ENABLE			0x00000001
+
+/* control */
+#define CQHCI_CTL			0x0C
+#define CQHCI_CLEAR_ALL_TASKS		0x00000100
+#define CQHCI_HALT			0x00000001
+
+/* interrupt status */
+#define CQHCI_IS			0x10
+#define CQHCI_IS_HAC			BIT(0)
+#define CQHCI_IS_TCC			BIT(1)
+#define CQHCI_IS_RED			BIT(2)
+#define CQHCI_IS_TCL			BIT(3)
+
+#define CQHCI_IS_MASK (CQHCI_IS_TCC | CQHCI_IS_RED)
+
+/* interrupt status enable */
+#define CQHCI_ISTE			0x14
+
+/* interrupt signal enable */
+#define CQHCI_ISGE			0x18
+
+/* interrupt coalescing */
+#define CQHCI_IC			0x1C
+#define CQHCI_IC_ENABLE			BIT(31)
+#define CQHCI_IC_RESET			BIT(16)
+#define CQHCI_IC_ICCTHWEN		BIT(15)
+#define CQHCI_IC_ICCTH(x)		((x & 0x1F) << 8)
+#define CQHCI_IC_ICTOVALWEN		BIT(7)
+#define CQHCI_IC_ICTOVAL(x)		(x & 0x7F)
+
+/* task list base address */
+#define CQHCI_TDLBA			0x20
+
+/* task list base address upper */
+#define CQHCI_TDLBAU			0x24
+
+/* door-bell */
+#define CQHCI_TDBR			0x28
+
+/* task completion notification */
+#define CQHCI_TCN			0x2C
+
+/* device queue status */
+#define CQHCI_DQS			0x30
+
+/* device pending tasks */
+#define CQHCI_DPT			0x34
+
+/* task clear */
+#define CQHCI_TCLR			0x38
+
+/* send status config 1 */
+#define CQHCI_SSC1			0x40
+
+/* send status config 2 */
+#define CQHCI_SSC2			0x44
+
+/* response for dcmd */
+#define CQHCI_CRDCT			0x48
+
+/* response mode error mask */
+#define CQHCI_RMEM			0x50
+
+/* task error info */
+#define CQHCI_TERRI			0x54
+
+#define CQHCI_TERRI_C_INDEX(x)		((x) & GENMASK(5, 0))
+#define CQHCI_TERRI_C_TASK(x)		(((x) & GENMASK(12, 8)) >> 8)
+#define CQHCI_TERRI_C_VALID(x)		((x) & BIT(15))
+#define CQHCI_TERRI_D_INDEX(x)		(((x) & GENMASK(21, 16)) >> 16)
+#define CQHCI_TERRI_D_TASK(x)		(((x) & GENMASK(28, 24)) >> 24)
+#define CQHCI_TERRI_D_VALID(x)		((x) & BIT(31))
+
+/* command response index */
+#define CQHCI_CRI			0x58
+
+/* command response argument */
+#define CQHCI_CRA			0x5C
+
+#define CQHCI_INT_ALL			0xF
+#define CQHCI_IC_DEFAULT_ICCTH		31
+#define CQHCI_IC_DEFAULT_ICTOVAL	1
+
+/* attribute fields */
+#define CQHCI_VALID(x)			((x & 1) << 0)
+#define CQHCI_END(x)			((x & 1) << 1)
+#define CQHCI_INT(x)			((x & 1) << 2)
+#define CQHCI_ACT(x)			((x & 0x7) << 3)
+
+/* data command task descriptor fields */
+#define CQHCI_FORCED_PROG(x)		((x & 1) << 6)
+#define CQHCI_CONTEXT(x)		((x & 0xF) << 7)
+#define CQHCI_DATA_TAG(x)		((x & 1) << 11)
+#define CQHCI_DATA_DIR(x)		((x & 1) << 12)
+#define CQHCI_PRIORITY(x)		((x & 1) << 13)
+#define CQHCI_QBAR(x)			((x & 1) << 14)
+#define CQHCI_REL_WRITE(x)		((x & 1) << 15)
+#define CQHCI_BLK_COUNT(x)		((x & 0xFFFF) << 16)
+#define CQHCI_BLK_ADDR(x)		((x & 0xFFFFFFFF) << 32)
+
+/* direct command task descriptor fields */
+#define CQHCI_CMD_INDEX(x)		((x & 0x3F) << 16)
+#define CQHCI_CMD_TIMING(x)		((x & 1) << 22)
+#define CQHCI_RESP_TYPE(x)		((x & 0x3) << 23)
+
+/* transfer descriptor fields */
+#define CQHCI_DAT_LENGTH(x)		((x & 0xFFFF) << 16)
+#define CQHCI_DAT_ADDR_LO(x)		((x & 0xFFFFFFFF) << 32)
+#define CQHCI_DAT_ADDR_HI(x)		((x & 0xFFFFFFFF) << 0)
+
+struct cqhci_host_ops;
+struct mmc_host;
+struct cqhci_slot;
+
+struct cqhci_host {
+	const struct cqhci_host_ops *ops;
+	void __iomem *mmio;
+	struct mmc_host *mmc;
+
+	spinlock_t lock;
+
+	/* relative card address of device */
+	unsigned int rca;
+
+	/* 64 bit DMA */
+	bool dma64;
+	int num_slots;
+	int qcnt;
+
+	u32 dcmd_slot;
+	u32 caps;
+#define CQHCI_TASK_DESC_SZ_128		0x1
+
+	u32 quirks;
+#define CQHCI_QUIRK_SHORT_TXFR_DESC_SZ	0x1
+
+	bool enabled;
+	bool halted;
+	bool init_done;
+	bool activated;
+	bool waiting_for_idle;
+	bool recovery_halt;
+
+	size_t desc_size;
+	size_t data_size;
+
+	u8 *desc_base;
+
+	/* total descriptor size */
+	u8 slot_sz;
+
+	/* 64/128 bit depends on CQHCI_CFG */
+	u8 task_desc_len;
+
+	/* 64 bit on 32-bit arch, 128 bit on 64-bit */
+	u8 link_desc_len;
+
+	u8 *trans_desc_base;
+	/* same length as transfer descriptor */
+	u8 trans_desc_len;
+
+	dma_addr_t desc_dma_base;
+	dma_addr_t trans_desc_dma_base;
+
+	struct completion halt_comp;
+	wait_queue_head_t wait_queue;
+	struct cqhci_slot *slot;
+};
+
+struct cqhci_host_ops {
+	void (*dumpregs)(struct mmc_host *mmc);
+	void (*write_l)(struct cqhci_host *host, u32 val, int reg);
+	u32 (*read_l)(struct cqhci_host *host, int reg);
+	void (*enable)(struct mmc_host *mmc);
+	void (*disable)(struct mmc_host *mmc, bool recovery);
+};
+
+static inline void cqhci_writel(struct cqhci_host *host, u32 val, int reg)
+{
+	if (unlikely(host->ops->write_l))
+		host->ops->write_l(host, val, reg);
+	else
+		writel_relaxed(val, host->mmio + reg);
+}
+
+static inline u32 cqhci_readl(struct cqhci_host *host, int reg)
+{
+	if (unlikely(host->ops->read_l))
+		return host->ops->read_l(host, reg);
+	else
+		return readl_relaxed(host->mmio + reg);
+}
+
+struct platform_device;
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error);
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc, bool dma64);
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev);
+int cqhci_suspend(struct mmc_host *mmc);
+int cqhci_resume(struct mmc_host *mmc);
+
+#endif
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH V5 13/13] mmc: sdhci-pci: Add CQHCI support for Intel GLK
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (11 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 12/13] mmc: cqhci: support for command queue enabled host Adrian Hunter
@ 2017-08-10 12:08 ` Adrian Hunter
  2017-08-11 10:33 ` [PATCH V5 00/13] mmc: Add Command Queue support Bough Chen
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-10 12:08 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

Add CQHCI initialization and implement CQHCI operations for Intel GLK.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/Kconfig          |   1 +
 drivers/mmc/host/sdhci-pci-core.c | 153 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index ccf7dab4a9f8..b8c9ea5cb8cd 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -81,6 +81,7 @@ config MMC_SDHCI_BIG_ENDIAN_32BIT_BYTE_SWAPPER
 config MMC_SDHCI_PCI
 	tristate "SDHCI support on PCI bus"
 	depends on MMC_SDHCI && PCI
+	select MMC_CQHCI
 	help
 	  This selects the PCI Secure Digital Host Controller Interface.
 	  Most controllers found today are PCI devices.
diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 2c853cfa8389..ffabddeedc1a 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -30,6 +30,8 @@
 #include <linux/mmc/sdhci-pci-data.h>
 #include <linux/acpi.h>
 
+#include "cqhci.h"
+
 #include "sdhci.h"
 #include "sdhci-pci.h"
 #include "sdhci-pci-o2micro.h"
@@ -118,6 +120,28 @@ int sdhci_pci_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_suspend_host(chip);
+}
+
+static int sdhci_cqhci_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
 #ifdef CONFIG_PM
@@ -168,8 +192,48 @@ static int sdhci_pci_runtime_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_runtime_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_runtime_suspend_host(chip);
+}
+
+static int sdhci_cqhci_runtime_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_runtime_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
+static u32 sdhci_cqhci_irq(struct sdhci_host *host, u32 intmask)
+{
+	int cmd_error = 0;
+	int data_error = 0;
+
+	if (!sdhci_cqe_irq(host, intmask, &cmd_error, &data_error))
+		return intmask;
+
+	cqhci_irq(host->mmc, intmask, cmd_error, data_error);
+
+	return 0;
+}
+
+static void sdhci_pci_dumpregs(struct mmc_host *mmc)
+{
+	sdhci_dumpregs(mmc_priv(mmc));
+}
+
 /*****************************************************************************\
  *                                                                           *
  * Hardware specific quirk handling                                          *
@@ -568,6 +632,17 @@ static void intel_hs400_enhanced_strobe(struct mmc_host *mmc,
 	.hw_reset		= sdhci_pci_hw_reset,
 };
 
+static const struct sdhci_ops sdhci_intel_glk_ops = {
+	.set_clock		= sdhci_set_clock,
+	.set_power		= sdhci_intel_set_power,
+	.enable_dma		= sdhci_pci_enable_dma,
+	.set_bus_width		= sdhci_pci_set_bus_width,
+	.reset			= sdhci_reset,
+	.set_uhs_signaling	= sdhci_set_uhs_signaling,
+	.hw_reset		= sdhci_pci_hw_reset,
+	.irq			= sdhci_cqhci_irq,
+};
+
 static void byt_read_dsm(struct sdhci_pci_slot *slot)
 {
 	struct intel_host *intel_host = sdhci_pci_priv(slot);
@@ -597,6 +672,8 @@ static int glk_emmc_probe_slot(struct sdhci_pci_slot *slot)
 {
 	int ret = byt_emmc_probe_slot(slot);
 
+	slot->host->mmc->caps2 |= MMC_CAP2_CQE;
+
 	if (slot->chip->pdev->device != PCI_DEVICE_ID_INTEL_GLK_EMMC) {
 		slot->host->mmc->caps2 |= MMC_CAP2_HS400_ES,
 		slot->host->mmc_host_ops.hs400_enhanced_strobe =
@@ -606,6 +683,71 @@ static int glk_emmc_probe_slot(struct sdhci_pci_slot *slot)
 	return ret;
 }
 
+static void glk_cqe_enable(struct mmc_host *mmc)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	u32 reg;
+
+	/*
+	 * CQE gets stuck if it sees Buffer Read Enable bit set, which can be
+	 * the case after tuning, so ensure the buffer is drained.
+	 */
+	reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	while (reg & SDHCI_DATA_AVAILABLE) {
+		sdhci_readl(host, SDHCI_BUFFER);
+		reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	}
+
+	sdhci_cqe_enable(mmc);
+}
+
+static const struct cqhci_host_ops glk_cqhci_ops = {
+	.enable		= glk_cqe_enable,
+	.disable	= sdhci_cqe_disable,
+	.dumpregs	= sdhci_pci_dumpregs,
+};
+
+static int glk_emmc_add_host(struct sdhci_pci_slot *slot)
+{
+	struct device *dev = &slot->chip->pdev->dev;
+	struct sdhci_host *host = slot->host;
+	struct cqhci_host *cq_host;
+	bool dma64;
+	int ret;
+
+	ret = sdhci_setup_host(host);
+	if (ret)
+		return ret;
+
+	cq_host = devm_kzalloc(dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	cq_host->mmio = host->ioaddr + 0x200;
+	cq_host->quirks |= CQHCI_QUIRK_SHORT_TXFR_DESC_SZ;
+	cq_host->ops = &glk_cqhci_ops;
+
+	dma64 = host->flags & SDHCI_USE_64_BIT_DMA;
+	if (dma64)
+		cq_host->caps |= CQHCI_TASK_DESC_SZ_128;
+
+	ret = cqhci_init(cq_host, host->mmc, dma64);
+	if (ret)
+		goto cleanup;
+
+	ret = __sdhci_add_host(host);
+	if (ret)
+		goto cleanup;
+
+	return 0;
+
+cleanup:
+	sdhci_cleanup_host(host);
+	return ret;
+}
+
 #ifdef CONFIG_ACPI
 static int ni_set_max_freq(struct sdhci_pci_slot *slot)
 {
@@ -684,11 +826,20 @@ static int byt_sd_probe_slot(struct sdhci_pci_slot *slot)
 static const struct sdhci_pci_fixes sdhci_intel_glk_emmc = {
 	.allow_runtime_pm	= true,
 	.probe_slot		= glk_emmc_probe_slot,
+	.add_host		= glk_emmc_add_host,
+#ifdef CONFIG_PM_SLEEP
+	.suspend		= sdhci_cqhci_suspend,
+	.resume			= sdhci_cqhci_resume,
+#endif
+#ifdef CONFIG_PM
+	.runtime_suspend	= sdhci_cqhci_runtime_suspend,
+	.runtime_resume		= sdhci_cqhci_runtime_resume,
+#endif
 	.quirks			= SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC,
 	.quirks2		= SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
 				  SDHCI_QUIRK2_CAPS_BIT63_FOR_HS400 |
 				  SDHCI_QUIRK2_STOP_WITH_TC,
-	.ops			= &sdhci_intel_byt_ops,
+	.ops			= &sdhci_intel_glk_ops,
 	.priv_size		= sizeof(struct intel_host),
 };
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* RE: [PATCH V5 00/13] mmc: Add Command Queue support
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (12 preceding siblings ...)
  2017-08-10 12:08 ` [PATCH V5 13/13] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
@ 2017-08-11 10:33 ` Bough Chen
  2017-08-17  7:45 ` Bough Chen
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 49+ messages in thread
From: Bough Chen @ 2017-08-11 10:33 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson
  Cc: linux-mmc, Alex Lemberg, Mateusz Nowak, Yuliy Izrailov,
	Jaehoon Chung, Dong Aisheng, Das Asutosh, Zhangfei Gao,
	Sahitya Tummala, Harjani Ritesh, Venu Byravarasu, Linus Walleij,
	Shawn Lin

> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
> owner@vger.kernel.org] On Behalf Of Adrian Hunter
> Sent: Thursday, August 10, 2017 8:08 PM
> To: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: linux-mmc <linux-mmc@vger.kernel.org>; Bough Chen
> <haibo.chen@nxp.com>; Alex Lemberg <alex.lemberg@sandisk.com>;
> Mateusz Nowak <mateusz.nowak@intel.com>; Yuliy Izrailov
> <Yuliy.Izrailov@sandisk.com>; Jaehoon Chung <jh80.chung@samsung.com>;
> Dong Aisheng <dongas86@gmail.com>; Das Asutosh
> <asutoshd@codeaurora.org>; Zhangfei Gao <zhangfei.gao@gmail.com>;
> Sahitya Tummala <stummala@codeaurora.org>; Harjani Ritesh
> <riteshh@codeaurora.org>; Venu Byravarasu <vbyravarasu@nvidia.com>;
> Linus Walleij <linus.walleij@linaro.org>; Shawn Lin <shawn.lin@rock-chips.com>
> Subject: [PATCH V5 00/13] mmc: Add Command Queue support
> 
> Hi
> 
> Here is V5 of the hardware command queue patches without the software
> command queue patches.
> 
> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight 2%
> drop in sequential read speed but no change to sequential write.
> 
> 

Hi Adrian,

My work is based on the 4.9 branch. I back-ported most mmc-related patches, but not that many block layer patches, so when I applied your V5 patches I removed some code, such as "BLK_STS_IOERR", "REQ_OP_DRV_IN" and "REQ_OP_DRV_OUT", and changed to the old usage. No other changes to your V5 patches.

These V5 patches indeed fix the 32-bit DMA issue; now my i.MX8 CMDQ passes all the stress tests.
 
So I would like to add:
 
Tested-by: Haibo Chen <haibo.chen@nxp.com>;

Best Regards
Haibo chen

> Changes since V4:
>       mmc: core: Add mmc_retune_hold_now()
> 	Add explanation to commit message.
>       mmc: host: Add CQE interface
> 	Add comments to callback declarations.
>       mmc: core: Turn off CQE before sending commands
> 	Add explanation to commit message.
>       mmc: core: Add support for handling CQE requests
> 	Add comments as requested by Ulf.
>       mmc: core: Remove unused MMC_CAP2_PACKED_CMD
> 	New patch.
>       mmc: mmc: Enable Command Queuing
> 	Adjust for removal of MMC_CAP2_PACKED_CMD.
> 	Add a comment about Packed Commands.
>       mmc: mmc: Enable CQE's
> 	Remove un-necessary check for MMC_CAP2_CQE
>       mmc: block: Use local variables in mmc_blk_data_prep()
> 	New patch.
>       mmc: block: Prepare CQE data
> 	Adjust due to "mmc: block: Use local variables in
> mmc_blk_data_prep()"
> 	Remove priority setting.
> 	Add explanation to commit message.
>       mmc: cqhci: support for command queue enabled host
> 	Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA
> 
> Changes since V3:
> 	Adjusted ...blk_end_request...() for new block status codes
> 	Fixed CQHCI transaction descriptor for "no DCMD" case
> 
> Changes since V2:
> 	Dropped patches that have been applied.
> 	Re-based
> 	Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"
> 
> Changes since V1:
> 
> 	"Share mmc request array between partitions" is dependent
> 	on changes in "Introduce queue semantics", so added that
> 	and block fixes:
> 
> 	Added "Fix is_waiting_last_req set incorrectly"
> 	Added "Fix cmd error reset failure path"
> 	Added "Use local var for mqrq_cur"
> 	Added "Introduce queue semantics"
> 
> Changes since RFC:
> 
> 	Re-based on next.
> 	Added comment about command queue priority.
> 	Added some acks and reviews.
> 
> 
> Adrian Hunter (12):
>       mmc: core: Add mmc_retune_hold_now()
>       mmc: core: Add members to mmc_request and mmc_data for CQE's
>       mmc: host: Add CQE interface
>       mmc: core: Turn off CQE before sending commands
>       mmc: core: Add support for handling CQE requests
>       mmc: core: Remove unused MMC_CAP2_PACKED_CMD
>       mmc: mmc: Enable Command Queuing
>       mmc: mmc: Enable CQE's
>       mmc: block: Use local variables in mmc_blk_data_prep()
>       mmc: block: Prepare CQE data
>       mmc: block: Add CQE support
>       mmc: sdhci-pci: Add CQHCI support for Intel GLK
> 
> Venkat Gopalakrishnan (1):
>       mmc: cqhci: support for command queue enabled host
> 
>  drivers/mmc/core/block.c          |  238 +++++++-
>  drivers/mmc/core/block.h          |    7 +
>  drivers/mmc/core/bus.c            |    7 +
>  drivers/mmc/core/core.c           |  178 +++++-
>  drivers/mmc/core/host.c           |    6 +
>  drivers/mmc/core/host.h           |    1 +
>  drivers/mmc/core/mmc.c            |   42 +-
>  drivers/mmc/core/queue.c          |  273 ++++++++-
>  drivers/mmc/core/queue.h          |   42 +-
>  drivers/mmc/host/Kconfig          |   14 +
>  drivers/mmc/host/Makefile         |    1 +
>  drivers/mmc/host/cqhci.c          | 1154
> +++++++++++++++++++++++++++++++++++++
>  drivers/mmc/host/cqhci.h          |  240 ++++++++
>  drivers/mmc/host/sdhci-pci-core.c |  153 ++++-
>  include/linux/mmc/core.h          |   18 +-
>  include/linux/mmc/host.h          |   63 +-
>  include/trace/events/mmc.h        |   36 +-
>  17 files changed, 2412 insertions(+), 61 deletions(-)  create mode 100644
> drivers/mmc/host/cqhci.c  create mode 100644 drivers/mmc/host/cqhci.h
> 
> 
> Regards
> Adrian
> --

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH V5 00/13] mmc: Add Command Queue support
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (13 preceding siblings ...)
  2017-08-11 10:33 ` [PATCH V5 00/13] mmc: Add Command Queue support Bough Chen
@ 2017-08-17  7:45 ` Bough Chen
  2017-08-17  8:56   ` Shawn Lin
  2017-08-18 11:03   ` Adrian Hunter
  2017-08-18 11:06 ` Adrian Hunter
  2017-08-22 11:22 ` Ulf Hansson
  16 siblings, 2 replies; 49+ messages in thread
From: Bough Chen @ 2017-08-17  7:45 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson, Shawn Lin
  Cc: linux-mmc, Alex Lemberg, Mateusz Nowak, Yuliy Izrailov,
	Jaehoon Chung, Dong Aisheng, Das Asutosh, Zhangfei Gao,
	Sahitya Tummala, Harjani Ritesh, Venu Byravarasu, Linus Walleij


> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
> owner@vger.kernel.org] On Behalf Of Adrian Hunter
> Sent: Thursday, August 10, 2017 8:08 PM
> To: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: linux-mmc <linux-mmc@vger.kernel.org>; Bough Chen
> <haibo.chen@nxp.com>; Alex Lemberg <alex.lemberg@sandisk.com>;
> Mateusz Nowak <mateusz.nowak@intel.com>; Yuliy Izrailov
> <Yuliy.Izrailov@sandisk.com>; Jaehoon Chung <jh80.chung@samsung.com>;
> Dong Aisheng <dongas86@gmail.com>; Das Asutosh
> <asutoshd@codeaurora.org>; Zhangfei Gao <zhangfei.gao@gmail.com>;
> Sahitya Tummala <stummala@codeaurora.org>; Harjani Ritesh
> <riteshh@codeaurora.org>; Venu Byravarasu <vbyravarasu@nvidia.com>;
> Linus Walleij <linus.walleij@linaro.org>; Shawn Lin <shawn.lin@rock-chips.com>
> Subject: [PATCH V5 00/13] mmc: Add Command Queue support
> 
> Hi
> 
> Here is V5 of the hardware command queue patches without the software
> command queue patches.
> 
> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight 2%
> drop in sequential read speed but no change to sequential write.
> 
> 
Hi Adrian,

I tested the performance on the i.MX8.  On my side, I use 'dd' to test the
sequential read/write speed, and I see a slight 3% drop for both read and write.

    ------------------------------------------------
    |                 | read (KB/s) | write (KB/s) |
    ------------------------------------------------
    | CMDQ in HS400ES |     257     |     94.5     |
    ------------------------------------------------
    |     HS400ES     |     265     |     96.6     |
    ------------------------------------------------

For random multi-threaded I/O, I use 'fio' to test, with this test command:
fio -filename=/mnt/test -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=4k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=mytest

I ran the test 5 times and took the average value.
For CMDQ in HS400ES
    ----------------------------------------------------
    | block size   |    4KB    |    8KB    |    16KB   |
    ----------------------------------------------------
    | random read  | 26340KB/s | 51844KB/s | 85738KB/s |
    ----------------------------------------------------
    | random write | 12691KB/s | 15879KB/s | 17535KB/s |
    ----------------------------------------------------

HS400ES without CMDQ
    ----------------------------------------------------
    | block size   |    4KB    |    8KB    |    16KB   |
    ----------------------------------------------------
    | random read  | 18585KB/s | 35041KB/s | 63880KB/s |
    ----------------------------------------------------
    | random write | 16465KB/s | 19210KB/s | 22672KB/s |
    ----------------------------------------------------

For random write, I find the individual test values differ greatly whether CMDQ is enabled or not.

From the test, I see CMDQ random read speed increases 34%~48%, but for random write the speed
drops 17.4%~33%.

When you sent the software cmdq V5 patch, you gave some explanation for the random write. But the
average random write speed also drops a lot; I think that is not good news and needs attention!


Maybe Shawn Lin can double test this random write performance.

Best Regards
Haibo Chen


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 00/13] mmc: Add Command Queue support
  2017-08-17  7:45 ` Bough Chen
@ 2017-08-17  8:56   ` Shawn Lin
  2017-08-18 11:03   ` Adrian Hunter
  1 sibling, 0 replies; 49+ messages in thread
From: Shawn Lin @ 2017-08-17  8:56 UTC (permalink / raw)
  To: Bough Chen, Adrian Hunter, Ulf Hansson
  Cc: shawn.lin, linux-mmc, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij

Hi

On 2017/8/17 15:45, Bough Chen wrote:
> 
>> -----Original Message-----
>> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
>> owner@vger.kernel.org] On Behalf Of Adrian Hunter
>> Sent: Thursday, August 10, 2017 8:08 PM
>> To: Ulf Hansson <ulf.hansson@linaro.org>
>> Cc: linux-mmc <linux-mmc@vger.kernel.org>; Bough Chen
>> <haibo.chen@nxp.com>; Alex Lemberg <alex.lemberg@sandisk.com>;
>> Mateusz Nowak <mateusz.nowak@intel.com>; Yuliy Izrailov
>> <Yuliy.Izrailov@sandisk.com>; Jaehoon Chung <jh80.chung@samsung.com>;
>> Dong Aisheng <dongas86@gmail.com>; Das Asutosh
>> <asutoshd@codeaurora.org>; Zhangfei Gao <zhangfei.gao@gmail.com>;
>> Sahitya Tummala <stummala@codeaurora.org>; Harjani Ritesh
>> <riteshh@codeaurora.org>; Venu Byravarasu <vbyravarasu@nvidia.com>;
>> Linus Walleij <linus.walleij@linaro.org>; Shawn Lin <shawn.lin@rock-chips.com>
>> Subject: [PATCH V5 00/13] mmc: Add Command Queue support
>>
>> Hi
>>
>> Here is V5 of the hardware command queue patches without the software
>> command queue patches.
>>
>> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight 2%
>> drop in sequential read speed but no change to sequential write.
>>
>>
> Hi Adrian,
> 
> I tested the performance on the i.MX8.  On my side, I use 'dd' to test the
> sequential read/write speed, and I see a slight 3% drop for both read and write.
> 
>      ---------------------------------------------
>     |               | read (KB/s)| write (KB/s) |
>     ---------------------------------------------
>      |CMDQ in HS400ES|    257     |     94.5     |
>      ---------------------------------------------
>      |    HS400ES    |    265     |     96.6     |
>      ---------------------------------------------
> 
> For random multi-threaded I/O, I use 'fio' to test, with this test command:
> fio -filename=/mnt/test -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=4k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=mytest
> 
> I ran the test 5 times and took the average value.
> For CMDQ in HS400ES
>      ----------------------------------------------------
>      | block size   |    4KB    |    8KB    |    16KB   |
>      ----------------------------------------------------
>      | random read  | 26340KB/s | 51844KB/s | 85738KB/s |
>      ----------------------------------------------------
>      | random write | 12691KB/s | 15879KB/s | 17535KB/s |
>      ----------------------------------------------------
> 
> HS400ES without CMDQ
>      ----------------------------------------------------
>      | block size   |    4KB    |    8KB    |    16KB   |
>      ----------------------------------------------------
>      | random read  | 18585KB/s | 35041KB/s | 63880KB/s |
>      ----------------------------------------------------
>      | random write | 16465KB/s | 19210KB/s | 22672KB/s |
>      ----------------------------------------------------
> 
> For random write, I find the individual test values differ greatly whether CMDQ is enabled or not.
> 
> From the test, I see CMDQ random read speed increases 34%~48%, but for random write the speed
> drops 17.4%~33%.
> 
> When you sent the software cmdq V5 patch, you gave some explanation for the random write. But the
> average random write speed also drops a lot; I think that is not good news and needs attention!
> 
> 
> Maybe Shawn Lin can double test this random write performance.

Yes, I also find the performance fluctuates a lot for different
eMMC chips when enabling CMDQ. I will need more data to check that.
Will update it later.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 00/13] mmc: Add Command Queue support
  2017-08-17  7:45 ` Bough Chen
  2017-08-17  8:56   ` Shawn Lin
@ 2017-08-18 11:03   ` Adrian Hunter
  1 sibling, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-18 11:03 UTC (permalink / raw)
  To: Bough Chen, Ulf Hansson, Shawn Lin
  Cc: linux-mmc, Alex Lemberg, Mateusz Nowak, Yuliy Izrailov,
	Jaehoon Chung, Dong Aisheng, Das Asutosh, Zhangfei Gao,
	Sahitya Tummala, Harjani Ritesh, Venu Byravarasu, Linus Walleij

On 17/08/17 10:45, Bough Chen wrote:
> 
>> -----Original Message-----
>> From: linux-mmc-owner@vger.kernel.org [mailto:linux-mmc-
>> owner@vger.kernel.org] On Behalf Of Adrian Hunter
>> Sent: Thursday, August 10, 2017 8:08 PM
>> To: Ulf Hansson <ulf.hansson@linaro.org>
>> Cc: linux-mmc <linux-mmc@vger.kernel.org>; Bough Chen
>> <haibo.chen@nxp.com>; Alex Lemberg <alex.lemberg@sandisk.com>;
>> Mateusz Nowak <mateusz.nowak@intel.com>; Yuliy Izrailov
>> <Yuliy.Izrailov@sandisk.com>; Jaehoon Chung <jh80.chung@samsung.com>;
>> Dong Aisheng <dongas86@gmail.com>; Das Asutosh
>> <asutoshd@codeaurora.org>; Zhangfei Gao <zhangfei.gao@gmail.com>;
>> Sahitya Tummala <stummala@codeaurora.org>; Harjani Ritesh
>> <riteshh@codeaurora.org>; Venu Byravarasu <vbyravarasu@nvidia.com>;
>> Linus Walleij <linus.walleij@linaro.org>; Shawn Lin <shawn.lin@rock-chips.com>
>> Subject: [PATCH V5 00/13] mmc: Add Command Queue support
>>
>> Hi
>>
>> Here is V5 of the hardware command queue patches without the software
>> command queue patches.
>>
>> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight 2%
>> drop in sequential read speed but no change to sequential write.
>>
>>
> Hi Adrian,
> 
> I tested the performance on the i.MX8.  On my side, I use 'dd' to test the
> sequential read/write speed, and I see a slight 3% drop for both read and write.
> 
>     ---------------------------------------------
>    |               | read (KB/s)| write (KB/s) |
>    ---------------------------------------------
>     |CMDQ in HS400ES|    257     |     94.5     |
>     ---------------------------------------------
>     |    HS400ES    |    265     |     96.6     |
>     --------------------------------------------- 
> 
> For random multi-threaded I/O, I use 'fio' to test, with this test command:
> fio -filename=/mnt/test -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=4k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=mytest
> 
> I ran the test 5 times and took the average value.
> For CMDQ in HS400ES
>     ----------------------------------------------------
>     | block size   |    4KB    |    8KB    |    16KB   |
>     ----------------------------------------------------
>     | random read  | 26340KB/s | 51844KB/s | 85738KB/s |
>     ----------------------------------------------------
>     | random write | 12691KB/s | 15879KB/s | 17535KB/s |
>     ----------------------------------------------------
> 
> HS400ES without CMDQ
>     ----------------------------------------------------
>     | block size   |    4KB    |    8KB    |    16KB   |
>     ----------------------------------------------------
>     | random read  | 18585KB/s | 35041KB/s | 63880KB/s |
>     ----------------------------------------------------
>     | random write | 16465KB/s | 19210KB/s | 22672KB/s |
>     ----------------------------------------------------
> 
> For random write, I find the individual test values differ greatly whether CMDQ is enabled or not.
> 
> From the test, I see CMDQ random read speed increases 34%~48%, but for random write the speed
> drops 17.4%~33%.
> 
> When you sent the software cmdq V5 patch, you gave some explanation for the random write. But the
> average random write speed also drops a lot; I think that is not good news and needs attention!

I tried your test and got quite variable results but did not see a drop:

CQ
                    4K           8K            16K
Random Write    10,374 KB/s   9,669 KB/s    10,712 KB/s


Non-CQ
                    4K           8K            16K
Random Write     9,898 KB/s   8,747 KB/s     9,098 KB/s


But given the variability of results I would not conclude CQ was faster.

I would suggest you run more tests and see if the trend continues.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 00/13] mmc: Add Command Queue support
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (14 preceding siblings ...)
  2017-08-17  7:45 ` Bough Chen
@ 2017-08-18 11:06 ` Adrian Hunter
  2017-08-22 11:22 ` Ulf Hansson
  16 siblings, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-18 11:06 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 10/08/17 15:08, Adrian Hunter wrote:
> Here is V5 of the hardware command queue patches without the software
> command queue patches.
> 
> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
> 2% drop in sequential read speed but no change to sequential write.
> 

Are there any more comments?  If not, shouldn't this be applied?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now()
  2017-08-10 12:08 ` [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now() Adrian Hunter
@ 2017-08-20 11:26   ` Linus Walleij
  2017-08-22 11:12   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:26 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> mmc_retune_hold() / mmc_retune_release() are used around a group of
> commands to prevent re-tuning between the commands. Re-tuning can still
> happen before the first command. In some cases, re-tuning must be
> prevented entirely. Add mmc_retune_hold_now() for that purpose. It is
> added in preparation for CQE support where it will be used by CQE recovery.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

> +void mmc_retune_hold_now(struct mmc_host *host)
>  void mmc_retune_release(struct mmc_host *host)

If the only hint we have about how to use these functions is through the
commit log we are in trouble, but I would say in this case the function
name is kind of half-obvious at least.

I think what troubles me is Rusty's API maturity guideline #6
"6. The name tells you how to use it."

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's
  2017-08-10 12:08 ` [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's Adrian Hunter
@ 2017-08-20 11:29   ` Linus Walleij
  2017-08-21  9:26     ` Adrian Hunter
  2017-08-22 11:12   ` Ulf Hansson
  1 sibling, 1 reply; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:29 UTC (permalink / raw)
  To: Adrian Hunter, linux-block
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Most of the information needed to issue requests to a CQE is already in
> struct mmc_request and struct mmc_data. Add data block address, some flags,
> and the task id (tag), and allow for cmd being NULL which it is for CQE
> tasks.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

> +       int                     tag;

Is this consistent with the block layer idea of "tag"?

I am asking because I get confused.

I thought the block layer's idea of a "tag" was some metadata
associated with a request. Not that I am a block layer expert.
Why can't we just name this "task_id" if that is what it is in
Linux terms? Does the specification call it "tag"?

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 03/13] mmc: host: Add CQE interface
  2017-08-10 12:08 ` [PATCH V5 03/13] mmc: host: Add CQE interface Adrian Hunter
@ 2017-08-20 11:31   ` Linus Walleij
  2017-08-22 11:13   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:31 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Add CQE host operations, capabilities, and host members.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

This looks like a clean and nice host-API to me.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands
  2017-08-10 12:08 ` [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands Adrian Hunter
@ 2017-08-20 11:32   ` Linus Walleij
  2017-08-22 11:13   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:32 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> CQE needs to be off for the host controller to accept non-CQ commands. Turn
> off the CQE before sending commands, and ensure it is off in any reset or
> power management paths, or re-tuning.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD
  2017-08-10 12:08 ` [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD Adrian Hunter
@ 2017-08-20 11:33   ` Linus Walleij
  2017-08-22 11:12   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:33 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Packed commands support was removed but some bits got left behind. Remove
> them.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

I would even apply this as patch #1 in the series (I guess that works).
Sorry for leaving this cruft behind.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 07/13] mmc: mmc: Enable Command Queuing
  2017-08-10 12:08 ` [PATCH V5 07/13] mmc: mmc: Enable Command Queuing Adrian Hunter
@ 2017-08-20 11:33   ` Linus Walleij
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:33 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Enable the Command Queue if the host controller supports a command queue
> engine. It is not compatible with Packed Commands, so make a note of that in the
> comment.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 05/13] mmc: core: Add support for handling CQE requests
  2017-08-10 12:08 ` [PATCH V5 05/13] mmc: core: Add support for handling CQE requests Adrian Hunter
@ 2017-08-20 11:39   ` Linus Walleij
  2017-08-21  9:26     ` Adrian Hunter
  2017-08-22  8:06   ` Ulf Hansson
  1 sibling, 1 reply; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:39 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Add core support for handling CQE requests, including starting, completing
> and recovering.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>


> +static void __mmc_cqe_request_done(struct mmc_host *host,
> +                                  struct mmc_request *mrq)

We are naming too much stuff __foo now, it gets really hard to figure
out from the function name and the surrounding code what is going on.

I guess people are using this like "do parts of what mmc_cqe_request_done()
is doing" but it'd be nice if we could be specific.

mmc_cqe_request_finalize() could work?

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 08/13] mmc: mmc: Enable CQE's
  2017-08-10 12:08 ` [PATCH V5 08/13] mmc: mmc: Enable CQE's Adrian Hunter
@ 2017-08-20 11:41   ` Linus Walleij
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:41 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Enable or disable CQE when a card is added or removed respectively.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

pr_info("%s: Command Queue Engine enabled\n",

Kudos for making this pr_info() as we're introducing it to everyone
so that users can see what is going on. Good!

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 09/13] mmc: block: Use local variables in mmc_blk_data_prep()
  2017-08-10 12:08 ` [PATCH V5 09/13] mmc: block: Use local variables in mmc_blk_data_prep() Adrian Hunter
@ 2017-08-20 11:43   ` Linus Walleij
  2017-08-21  9:27     ` Adrian Hunter
  0 siblings, 1 reply; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 11:43 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Use local variables in mmc_blk_data_prep() in preparation for adding CQE
> support which doesn't use the output variables.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
on the patch as such but:

>  static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
> -                             int disable_multi, bool *do_rel_wr,
> -                             bool *do_data_tag)
> +                             int disable_multi, bool *do_rel_wr_p,
> +                             bool *do_data_tag_p)

We need to figure out what we mean by "tag" in eMMC.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 11/13] mmc: block: Add CQE support
  2017-08-10 12:08 ` [PATCH V5 11/13] mmc: block: Add CQE support Adrian Hunter
@ 2017-08-20 12:13   ` Linus Walleij
  2017-08-21  9:27     ` Adrian Hunter
  2017-08-31 10:23     ` Christoph Hellwig
  0 siblings, 2 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-20 12:13 UTC (permalink / raw)
  To: Adrian Hunter, linux-block
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Add CQE support to the block driver, including:
>         - optionally using DCMD for flush requests
>         - manually issuing discard requests

"Manually" has no place in computer code IMO since there is no
man there. But I may be especially grumpy. But I don't understand the
comment anyway.

>         - issuing read / write requests to the CQE
>         - supporting block-layer timeouts
>         - handling recovery
>         - supporting re-tuning
>
> CQE offers 25% - 50% better random multi-threaded I/O.  There is a slight
> (e.g. 2%) drop in sequential read speed but no observable change to sequential
> write.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

This commit message seriously needs to detail the design of the
request handling thread/engine.

So adding a new (in effect) invasive block driver needs to at least
be CC:ed to the block maintainers so we don't sneak anything like
that in under the radar.

This duplicates the "thread that sucks out requests" from the
MMC core, and doubles the problem of replacing it with
something like blk-mq.

Especially these two snippets I would personally NOT merge
without the explicit consent of a block layer maintainer:

> +static void mmc_cqe_request_fn(struct request_queue *q)
> +{
> +       struct mmc_queue *mq = q->queuedata;
> +       struct request *req;
> +
> +       if (!mq) {
> +               while ((req = blk_fetch_request(q)) != NULL) {
> +                       req->rq_flags |= RQF_QUIET;
> +                       __blk_end_request_all(req, BLK_STS_IOERR);
> +               }
> +               return;
> +       }
> +
> +       if (mq->asleep && !mq->cqe_busy)
> +               wake_up_process(mq->thread);
> +}
(...)
> +static int mmc_cqe_thread(void *d)
> +{
> +       struct mmc_queue *mq = d;
> +       struct request_queue *q = mq->queue;
> +       struct mmc_card *card = mq->card;
> +       struct mmc_host *host = card->host;
> +       unsigned long flags;
> +       int get_put = 0;
> +
> +       current->flags |= PF_MEMALLOC;
> +
> +       down(&mq->thread_sem);
> +       spin_lock_irqsave(q->queue_lock, flags);
> +       while (1) {
> +               struct request *req = NULL;
> +               enum mmc_issue_type issue_type;
> +               bool retune_ok = false;
> +
> +               if (mq->cqe_recovery_needed) {
> +                       spin_unlock_irqrestore(q->queue_lock, flags);
> +                       mmc_blk_cqe_recovery(mq);
> +                       spin_lock_irqsave(q->queue_lock, flags);
> +                       mq->cqe_recovery_needed = false;
> +               }
> +
> +               set_current_state(TASK_INTERRUPTIBLE);
> +
> +               if (!kthread_should_stop())
> +                       req = blk_peek_request(q);

Why are you using blk_peek_request() instead of blk_fetch_request()
when the other thread just goes with fetch?

> Am I right in assuming that this request queue also relies on the
> implicit semantic that blk_peek_request(q) shall return NULL
if there are no requests in the queue, and that it is pulling out a
few NULLs to flush the request-handling machinery? Or did it
fix this? (Put that in the commit message in that case.)

> +
> +               if (req) {
> +                       issue_type = mmc_cqe_issue_type(host, req);
> +                       switch (issue_type) {
> +                       case MMC_ISSUE_DCMD:
> +                               if (mmc_cqe_dcmd_busy(mq)) {
> +                                       mq->cqe_busy |= MMC_CQE_DCMD_BUSY;
> +                                       req = NULL;
> +                                       break;
> +                               }

> I guess peek is used because you want to have the option to give it up here.

> +                               /* Fall through */
> +                       case MMC_ISSUE_ASYNC:
> +                               if (blk_queue_start_tag(q, req)) {
> +                                       mq->cqe_busy |= MMC_CQE_QUEUE_FULL;
> +                                       req = NULL;
> +                               }

> Then you start it with blk_queue_start_tag(); now, does "tag" in this
function call mean the same as the "tag" you just added to the
MMC request struct?

> +                               break;
> +                       default:
> +                               /*
> +                                * Timeouts are handled by mmc core, so set a
> +                                * large value to avoid races.
> +                                */
> +                               req->timeout = 600 * HZ;
> +                               blk_start_request(req);

And here it just starts.

> +                               break;
> +                       }
> +                       if (req) {
> +                               mq->cqe_in_flight[issue_type] += 1;
> +                               if (mmc_cqe_tot_in_flight(mq) == 1)
> +                                       get_put += 1;
> +                               if (mmc_cqe_qcnt(mq) == 1)
> +                                       retune_ok = true;
> +                       }
> +               }
> +
> +               mq->asleep = !req;
> +
> +               spin_unlock_irqrestore(q->queue_lock, flags);
> +
> +               if (req) {
> +                       enum mmc_issued issued;
> +
> +                       set_current_state(TASK_RUNNING);
> +
> +                       if (get_put) {
> +                               get_put = 0;
> +                               mmc_get_card(card);
> +                       }
> +
> +                       if (host->need_retune && retune_ok &&
> +                           !host->hold_retune)
> +                               host->retune_now = true;
> +                       else
> +                               host->retune_now = false;
> +
> +                       issued = mmc_blk_cqe_issue_rq(mq, req);
> +
> +                       cond_resched();
> +
> +                       spin_lock_irqsave(q->queue_lock, flags);
> +
> +                       switch (issued) {
> +                       case MMC_REQ_STARTED:
> +                               break;
> +                       case MMC_REQ_BUSY:
> +                               blk_requeue_request(q, req);

Here it is requeued.

> +                               goto finished;
> +                       case MMC_REQ_FAILED_TO_START:
> +                               __blk_end_request_all(req, BLK_STS_IOERR);

Or ended.

> +                               /* Fall through */
> +                       case MMC_REQ_FINISHED:
> +finished:
> +                               mq->cqe_in_flight[issue_type] -= 1;
> +                               if (mmc_cqe_tot_in_flight(mq) == 0)
> +                                       get_put = -1;
> +                       }
> +               } else {
> +                       if (get_put < 0) {
> +                               get_put = 0;
> +                               mmc_put_card(card);
> +                       }
> +                       /*
> +                        * Do not stop with requests in flight in case recovery
> +                        * is needed.
> +                        */
> +                       if (kthread_should_stop() &&
> +                           !mmc_cqe_tot_in_flight(mq)) {
> +                               set_current_state(TASK_RUNNING);
> +                               break;
> +                       }
> +                       up(&mq->thread_sem);
> +                       schedule();
> +                       down(&mq->thread_sem);
> +                       spin_lock_irqsave(q->queue_lock, flags);

And this semaphoring and threading is just as confusing as ever and now
we have two of them. (Sorry, I'm grumpy.)

I think I said in the beginning, maybe not in response to this series but
another, that I don't like the idea of making a second copy of the
whole request thread. I would much rather see that ONE request thread
is used for both regular requests and CQE.

At least I want to see a serious rebuttal of why my claim that we should
keep this code down is such a pipe dream that we just have to go ahead
and make a second instance of all the core request mechanics.
And that this should be part of the commit message: "I'm duplicating
the queue thread because (...)" etc.

I'm sorry for being so conservative, but I am plain scared, I had serious
trouble refactoring this code already, and I got the feeling that several
people think the MMC stack has grown unapproachable for average
developers (my Googledoc where I tried to pry it apart got very popular
with developers I respect), and this is making that situation worse. Soon
only you and Ulf will understand this piece of code.

I do not know if I'm especially stupid when I feel the MMC stack code
is already hard to grasp, if that is the case I need to be told.

What we need is an MMC stack where it is clear where blocks come in
and out and how they are processed by the block layer, but now we
already have a scary Rube Goldberg-machine and it is not getting better.
If people have different feelings they can tell me off right now.

I have my hopes up that we can get the code lesser and more readable
with MQ, as I tried to illustrate in my attempts, which are indeed lame
because they don't work because of misc and SDIO use cases, but
I'm honestly doing my best. Currently with other clean-ups to get a
clean surface to do that.

As it stands, the MQ migration work size is in some spots doubled or
more than doubled after this commit :(

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's
  2017-08-20 11:29   ` Linus Walleij
@ 2017-08-21  9:26     ` Adrian Hunter
  0 siblings, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-21  9:26 UTC (permalink / raw)
  To: Linus Walleij, linux-block
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On 20/08/17 14:29, Linus Walleij wrote:
> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> 
>> Most of the information needed to issue requests to a CQE is already in
>> struct mmc_request and struct mmc_data. Add data block address, some flags,
>> and the task id (tag), and allow for cmd being NULL which it is for CQE
>> tasks.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> 
>> +       int                     tag;
> 
> Is this consistent with the block layer idea of "tag"?

It is named "tag" because it is the block layer tag.

> 
> I am asking because I get confused.
> 
> I thought the block layer's idea of a "tag" was some metadata
> associated with a request. Not that I am a block layer expert.

The block layer tag is a unique number to identify the task to the hardware
queue.  It is typically a number from 0 up to queue depth - 1.

> Why can't we just name this "task_id" if that is what it is in
> Linux terms? Does the specification call it "tag"?

The eMMC specification calls it "task id" but the block layer calls it
"tag".  I went with "tag" to be easier for block layer people to understand.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 05/13] mmc: core: Add support for handling CQE requests
  2017-08-20 11:39   ` Linus Walleij
@ 2017-08-21  9:26     ` Adrian Hunter
  2017-08-31 11:32       ` Linus Walleij
  0 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2017-08-21  9:26 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On 20/08/17 14:39, Linus Walleij wrote:
> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> 
>> Add core support for handling CQE requests, including starting, completing
>> and recovering.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> 
> 
>> +static void __mmc_cqe_request_done(struct mmc_host *host,
>> +                                  struct mmc_request *mrq)
> 
> We are naming too much stuff __foo now, it gets really hard to figure
> out from the function name and the surrounding code what is going on.

You have written several times that you don't like __foo() names; however, it
is a normal kernel paradigm.

> I guess people are using this like "do parts of what mmc_cqe_request_done()
> is doing" but it'd be nice if we could be specific.
> 
> mmc_cqe_request_finalize() could work?

It can be rolled into mmc_cqe_request_done().

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 09/13] mmc: block: Use local variables in mmc_blk_data_prep()
  2017-08-20 11:43   ` Linus Walleij
@ 2017-08-21  9:27     ` Adrian Hunter
  0 siblings, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-21  9:27 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On 20/08/17 14:43, Linus Walleij wrote:
> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> 
>> Use local variables in mmc_blk_data_prep() in preparation for adding CQE
>> support which doesn't use the output variables.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> 
> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
> on the patch as such but:
> 
>>  static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
>> -                             int disable_multi, bool *do_rel_wr,
>> -                             bool *do_data_tag)
>> +                             int disable_multi, bool *do_rel_wr_p,
>> +                             bool *do_data_tag_p)
> 
> We need to figure out what we mean by "tag" in eMMC.

Data tags are an eMMC invention to categorize the type of data.  They have
nothing to do with block layer tags.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 11/13] mmc: block: Add CQE support
  2017-08-20 12:13   ` Linus Walleij
@ 2017-08-21  9:27     ` Adrian Hunter
  2017-08-31 10:05       ` Linus Walleij
  2017-08-31 10:25       ` Christoph Hellwig
  2017-08-31 10:23     ` Christoph Hellwig
  1 sibling, 2 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-21  9:27 UTC (permalink / raw)
  To: Linus Walleij, linux-block
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On 20/08/17 15:13, Linus Walleij wrote:
> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> 
>> Add CQE support to the block driver, including:
>>         - optionally using DCMD for flush requests
>>         - manually issuing discard requests
> 
> "Manually" has no place in computer code IMO since there is no
> man there. But I may be especially grumpy. But I don't understand the
> comment anyway.

CQE automatically sends the commands to complete requests.  However it only
supports reads / writes and so-called "direct commands" (DCMD).  Furthermore
DCMD is limited to one command at a time, but discards require 3 commands.
That makes issuing discards through CQE very awkward, but some CQE's don't
support DCMD anyway.  So for discards, the existing non-CQE approach is
taken, where the mmc core code issues the 3 commands one at a time i.e.
mmc_erase().
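
Roughly, the sequence looks like this (a simplified sketch of what
mmc_erase() boils down to, without the argument validation and timeout
handling, so take the details with a grain of salt):

#include <linux/types.h>
#include <linux/string.h>
#include <linux/mmc/card.h>
#include <linux/mmc/core.h>
#include <linux/mmc/mmc.h>

/* Simplified: a discard is 3 commands, so it cannot be a single DCMD */
static int erase_sketch(struct mmc_card *card, u32 from, u32 to, u32 arg)
{
	struct mmc_command cmd = {};
	int err;

	cmd.opcode = MMC_ERASE_GROUP_START;	/* CMD35: first address */
	cmd.arg = from;
	cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
	err = mmc_wait_for_cmd(card->host, &cmd, 0);
	if (err)
		return err;

	memset(&cmd, 0, sizeof(cmd));
	cmd.opcode = MMC_ERASE_GROUP_END;	/* CMD36: last address */
	cmd.arg = to;
	cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
	err = mmc_wait_for_cmd(card->host, &cmd, 0);
	if (err)
		return err;

	memset(&cmd, 0, sizeof(cmd));
	cmd.opcode = MMC_ERASE;			/* CMD38: arg selects erase/trim/discard */
	cmd.arg = arg;
	cmd.flags = MMC_RSP_R1B | MMC_CMD_AC;
	return mmc_wait_for_cmd(card->host, &cmd, 0);
}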

> 
>>         - issuing read / write requests to the CQE
>>         - supporting block-layer timeouts
>>         - handling recovery
>>         - supporting re-tuning
>>
>> CQE offers 25% - 50% better random multi-threaded I/O.  There is a slight
>> (e.g. 2%) drop in sequential read speed but no observable change to sequential
>> write.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> 
> This commit message seriously needs to detail the design of the
> request handling thread/engine.
> 
> So adding a new (in effect) invasive block driver needs to at least
> be CC:ed to the block maintainers so we don't sneak anything like
> that in under the radar.
> 
> This duplicates the "thread that sucks out requests" from the
> MMC core, and doubles the problem of replacing it with
> something like blk-mq.

We need to start with a legacy API because people want to backport CQ to
earlier kernels (we really need to get features upstream more quickly), but
blk-mq has been evolving a lot (e.g. elevator support), so backporters face
having either something quite different from upstream or trying to backport
great chunks of the block layer.

We also don't know how blk-mq will perform so it would be prudent to start
with support for both the legacy API and blk-mq (as scsi does) so that we
can find out first.

A loop to remove requests from the queue is normal, e.g. scsi_request_fn(),
and taking a consistent approach with the existing mmc thread does not
double the problem of blk-mq implementation, since the same approach can
be used for both.

> 
> Especially these two snippets I would personally NOT merge
> without the explicit consent of a block layer maintainer:
> 
>> +static void mmc_cqe_request_fn(struct request_queue *q)
>> +{
>> +       struct mmc_queue *mq = q->queuedata;
>> +       struct request *req;
>> +
>> +       if (!mq) {
>> +               while ((req = blk_fetch_request(q)) != NULL) {
>> +                       req->rq_flags |= RQF_QUIET;
>> +                       __blk_end_request_all(req, BLK_STS_IOERR);
>> +               }
>> +               return;
>> +       }
>> +
>> +       if (mq->asleep && !mq->cqe_busy)
>> +               wake_up_process(mq->thread);
>> +}
> (...)
>> +static int mmc_cqe_thread(void *d)
>> +{
>> +       struct mmc_queue *mq = d;
>> +       struct request_queue *q = mq->queue;
>> +       struct mmc_card *card = mq->card;
>> +       struct mmc_host *host = card->host;
>> +       unsigned long flags;
>> +       int get_put = 0;
>> +
>> +       current->flags |= PF_MEMALLOC;
>> +
>> +       down(&mq->thread_sem);
>> +       spin_lock_irqsave(q->queue_lock, flags);
>> +       while (1) {
>> +               struct request *req = NULL;
>> +               enum mmc_issue_type issue_type;
>> +               bool retune_ok = false;
>> +
>> +               if (mq->cqe_recovery_needed) {
>> +                       spin_unlock_irqrestore(q->queue_lock, flags);
>> +                       mmc_blk_cqe_recovery(mq);
>> +                       spin_lock_irqsave(q->queue_lock, flags);
>> +                       mq->cqe_recovery_needed = false;
>> +               }
>> +
>> +               set_current_state(TASK_INTERRUPTIBLE);
>> +
>> +               if (!kthread_should_stop())
>> +                       req = blk_peek_request(q);
> 
> Why are you using blk_peek_request() instead of blk_fetch_request()
> when the other thread just goes with fetch?

Because we are not always able to start the request.

> Am I right in assuming that this request queue also relies on the
> implicit semantic that blk_peek_request(q) shall return NULL
> if there are no requests in the queue, and that it is pulling out a
> few NULLs to flush the request-handling machinery? Or did it
> fix this? (Put that in the commit message in that case.)

CQ is different.  Because read / write requests are processed asynchronously
we can keep preparing and issuing more requests until the hardware queue is
full.  That is, we don't have to wait for the first request to complete
before preparing and issuing the second request, and so on.
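
Schematically it is something like this (a sketch only - mmc_cqe_issue()
here is a made-up name for "prepare and start the task in the CQE and
return without waiting"):

#include <linux/blkdev.h>
#include "queue.h"	/* struct mmc_queue, from drivers/mmc/core */

static void cqe_issue_until_full(struct mmc_queue *mq)
{
	struct request_queue *q = mq->queue;
	struct request *req;

	while ((req = blk_peek_request(q))) {
		if (blk_queue_start_tag(q, req))
			break;			/* no free tag: hardware queue is full */
		mmc_cqe_issue(mq, req);		/* hypothetical: starts the task, does not wait */
	}
}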

However, the existing thread's need to issue a "null" is not because it is a
thread.  It is because of the convoluted design of mmc_start_areq().
Even so, you still need a way to issue the prepared request when the current
request completes: either wake up the thread to do it, or, as I have suggested
in the past, do it in the completion path, but that needs changes to the host
controller API.

> 
>> +
>> +               if (req) {
>> +                       issue_type = mmc_cqe_issue_type(host, req);
>> +                       switch (issue_type) {
>> +                       case MMC_ISSUE_DCMD:
>> +                               if (mmc_cqe_dcmd_busy(mq)) {
>> +                                       mq->cqe_busy |= MMC_CQE_DCMD_BUSY;
>> +                                       req = NULL;
>> +                                       break;
>> +                               }
> 
> I guess peek is used because you want to have the option to give it up here.

Yes

> 
>> +                               /* Fall through */
>> +                       case MMC_ISSUE_ASYNC:
>> +                               if (blk_queue_start_tag(q, req)) {
>> +                                       mq->cqe_busy |= MMC_CQE_QUEUE_FULL;
>> +                                       req = NULL;
>> +                               }
> 
> Then you start it with blk_queue_start_tag(); now, does "tag" in this
> function call mean the same as the "tag" you just added to the
> MMC request struct?

Yes

> 
>> +                               break;
>> +                       default:
>> +                               /*
>> +                                * Timeouts are handled by mmc core, so set a
>> +                                * large value to avoid races.
>> +                                */
>> +                               req->timeout = 600 * HZ;
>> +                               blk_start_request(req);
> 
> And here it just starts.

This is the path for requests that the CQE cannot process.  They don't need
to be tagged because there is no command queue - no way to do more than one
request at a time.

> 
>> +                               break;
>> +                       }
>> +                       if (req) {
>> +                               mq->cqe_in_flight[issue_type] += 1;
>> +                               if (mmc_cqe_tot_in_flight(mq) == 1)
>> +                                       get_put += 1;
>> +                               if (mmc_cqe_qcnt(mq) == 1)
>> +                                       retune_ok = true;
>> +                       }
>> +               }
>> +
>> +               mq->asleep = !req;
>> +
>> +               spin_unlock_irqrestore(q->queue_lock, flags);
>> +
>> +               if (req) {
>> +                       enum mmc_issued issued;
>> +
>> +                       set_current_state(TASK_RUNNING);
>> +
>> +                       if (get_put) {
>> +                               get_put = 0;
>> +                               mmc_get_card(card);
>> +                       }
>> +
>> +                       if (host->need_retune && retune_ok &&
>> +                           !host->hold_retune)
>> +                               host->retune_now = true;
>> +                       else
>> +                               host->retune_now = false;
>> +
>> +                       issued = mmc_blk_cqe_issue_rq(mq, req);
>> +
>> +                       cond_resched();
>> +
>> +                       spin_lock_irqsave(q->queue_lock, flags);
>> +
>> +                       switch (issued) {
>> +                       case MMC_REQ_STARTED:
>> +                               break;
>> +                       case MMC_REQ_BUSY:
>> +                               blk_requeue_request(q, req);
> 
> Here it is requeued.

Any request that is started has to be either re-queued or completed.

> 
>> +                               goto finished;
>> +                       case MMC_REQ_FAILED_TO_START:
>> +                               __blk_end_request_all(req, BLK_STS_IOERR);
> 
> Or ended.

Any request that is started has to be either re-queued or completed.

> 
>> +                               /* Fall through */
>> +                       case MMC_REQ_FINISHED:
>> +finished:
>> +                               mq->cqe_in_flight[issue_type] -= 1;
>> +                               if (mmc_cqe_tot_in_flight(mq) == 0)
>> +                                       get_put = -1;
>> +                       }
>> +               } else {
>> +                       if (get_put < 0) {
>> +                               get_put = 0;
>> +                               mmc_put_card(card);
>> +                       }
>> +                       /*
>> +                        * Do not stop with requests in flight in case recovery
>> +                        * is needed.
>> +                        */
>> +                       if (kthread_should_stop() &&
>> +                           !mmc_cqe_tot_in_flight(mq)) {
>> +                               set_current_state(TASK_RUNNING);
>> +                               break;
>> +                       }
>> +                       up(&mq->thread_sem);
>> +                       schedule();
>> +                       down(&mq->thread_sem);
>> +                       spin_lock_irqsave(q->queue_lock, flags);
> 
> And this semaphoring and threading is just as confusing as ever and now
> we have two of them. (Sorry, I'm grumpy.)

Sure, we have two, but they follow the same approach, i.e. if you can get rid
of one, then you can use the same method to get rid of the other.

> 
> I think I said in the beginning maybe not in response to this series but
> another that I don't like the idea of making a second copy of the
> whole request thread. I would much rather see that ONE request thread
> is used for both regular requests and CQE.

They issue requests in completely different ways.  There is essentially no
common code.

> 
> Atleast I wanna see a serious rebuttal of why my claim that we should
> keep this code down is such a pipe dream that we just have to go ahead
> and make a second instance of all the core request mechanics.
> And that this should be part of the commit message: "I'm duplicating
> the queue thread because (...)" etc.

As I wrote above, we need to start with the legacy API, because of
backporting and because we don't know how blk-mq will perform.

> 
> I'm sorry for being so conservative, but I am plain scared, I had serious
> trouble refactoring this code already, and I got the feeling that several
> people think the MMC stack has grown unapproachable for average
> developers (my Googledoc where I tried to pry it apart got very popular
> with developers I respect), and this is making that situation worse. Soon
> only you and Ulf will understand this piece of code.

The mmc block driver has problems, but the thread isn't one of them.

> 
> I do not know if I'm especially stupid when I feel the MMC stack code
> is already hard to grasp, if that is the case I need to be told.

mmc_start_areq() is very convoluted.  However, getting rid of
mmc_start_areq() is only the first step.  Step two: have the host controller
complete requests without the need of block driver polling i.e.
card_busy_detect() must not be called in the normal completion path.  Step
three: make it possible to issue the prepared request in the completion path
of the current request.  Steps two and three need host controller driver
changes.

> 
> What we need is an MMC stack where it is clear where blocks come in
> and out and how they are processed by the block layer, but now we
> already have a scary Rube Goldberg-machine and it is not getting better.
> If people have different feelings they can tell me off right now.
> 
> I have my hopes up that we can get the code lesser and more readable
> with MQ, as I tried to illustrate in my attempts, which are indeed lame
> because they don't work because of misc and SDIO use cases, but
> I'm honestly doing my best. Currently with other clean-ups to get a
> clean surface to do that.
> 
> As it stands, the MQ migration work size is in some spots doubled or
> more than doubled after this commit :(

Not true.  I have begun work on blk-mq for CQE and I will send an RFC in a
day or two.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 05/13] mmc: core: Add support for handling CQE requests
  2017-08-10 12:08 ` [PATCH V5 05/13] mmc: core: Add support for handling CQE requests Adrian Hunter
  2017-08-20 11:39   ` Linus Walleij
@ 2017-08-22  8:06   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Ulf Hansson @ 2017-08-22  8:06 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

[...]

> diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
> index bf1788a224e6..1974fcfd4284 100644
> --- a/include/linux/mmc/core.h
> +++ b/include/linux/mmc/core.h
> @@ -174,6 +174,11 @@ struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
>  int mmc_wait_for_cmd(struct mmc_host *host, struct mmc_command *cmd,
>                 int retries);
>
> +int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq);
> +void mmc_cqe_request_done(struct mmc_host *host, struct mmc_request *mrq);
> +void mmc_cqe_post_req(struct mmc_host *host, struct mmc_request *mrq);
> +int mmc_cqe_recovery(struct mmc_host *host);
> +

Please move mmc_cqe_start_req(), mmc_cqe_post_req(),
mmc_cqe_recovery() to drivers/mmc/core/core.h. This makes sure they
don't become abused.

Also, please move mmc_cqe_request_done() to include/linux/mmc/host.h.
Then it becomes consistent with where mmc_request_done() is declared.

Feel free to also fold in a patch moving the existing mmc_start_areq()
to drivers/mmc/core/core.h. Unfortunately, the other functions closely
related to these are being abused by SDIO func drivers and need more
work before they can be moved.
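
Concretely, the placement being asked for would look roughly like the sketch
below (the declarations are taken as-is from the patch; only the file they
live in changes):

	/* drivers/mmc/core/core.h */
	int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq);
	void mmc_cqe_post_req(struct mmc_host *host, struct mmc_request *mrq);
	int mmc_cqe_recovery(struct mmc_host *host);

	/* include/linux/mmc/host.h, next to mmc_request_done() */
	void mmc_cqe_request_done(struct mmc_host *host, struct mmc_request *mrq);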

>  int mmc_hw_reset(struct mmc_host *host);
>  void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card);
>
> --
> 1.9.1
>

Sorry for not noticing this in v4.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD
  2017-08-10 12:08 ` [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD Adrian Hunter
  2017-08-20 11:33   ` Linus Walleij
@ 2017-08-22 11:12   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Ulf Hansson @ 2017-08-22 11:12 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
> Packed commands support was removed but some bits got left behind. Remove
> them.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for next!

Kind regards
Uffe

> ---
>  drivers/mmc/core/mmc.c   | 23 -----------------------
>  include/linux/mmc/host.h |  4 ----
>  2 files changed, 27 deletions(-)
>
> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> index adb5c88fd054..5fcb8d721224 100644
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -1790,29 +1790,6 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
>          */
>         card->reenable_cmdq = card->ext_csd.cmdq_en;
>
> -       /*
> -        * The mandatory minimum values are defined for packed command.
> -        * read: 5, write: 3
> -        */
> -       if (card->ext_csd.max_packed_writes >= 3 &&
> -           card->ext_csd.max_packed_reads >= 5 &&
> -           host->caps2 & MMC_CAP2_PACKED_CMD) {
> -               err = mmc_switch(card, EXT_CSD_CMD_SET_NORMAL,
> -                               EXT_CSD_EXP_EVENTS_CTRL,
> -                               EXT_CSD_PACKED_EVENT_EN,
> -                               card->ext_csd.generic_cmd6_time);
> -               if (err && err != -EBADMSG)
> -                       goto free_card;
> -               if (err) {
> -                       pr_warn("%s: Enabling packed event failed\n",
> -                               mmc_hostname(card->host));
> -                       card->ext_csd.packed_event_en = 0;
> -                       err = 0;
> -               } else {
> -                       card->ext_csd.packed_event_en = 1;
> -               }
> -       }
> -
>         if (!oldcard)
>                 host->card = card;
>
> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> index 89e1d7e2953e..5f3eb4a0373f 100644
> --- a/include/linux/mmc/host.h
> +++ b/include/linux/mmc/host.h
> @@ -335,10 +335,6 @@ struct mmc_host {
>                                  MMC_CAP2_HS200_1_2V_SDR)
>  #define MMC_CAP2_CD_ACTIVE_HIGH        (1 << 10)       /* Card-detect signal active high */
>  #define MMC_CAP2_RO_ACTIVE_HIGH        (1 << 11)       /* Write-protect signal active high */
> -#define MMC_CAP2_PACKED_RD     (1 << 12)       /* Allow packed read */
> -#define MMC_CAP2_PACKED_WR     (1 << 13)       /* Allow packed write */
> -#define MMC_CAP2_PACKED_CMD    (MMC_CAP2_PACKED_RD | \
> -                                MMC_CAP2_PACKED_WR)
>  #define MMC_CAP2_NO_PRESCAN_POWERUP (1 << 14)  /* Don't power up before scan */
>  #define MMC_CAP2_HS400_1_8V    (1 << 15)       /* Can support HS400 1.8V */
>  #define MMC_CAP2_HS400_1_2V    (1 << 16)       /* Can support HS400 1.2V */
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now()
  2017-08-10 12:08 ` [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now() Adrian Hunter
  2017-08-20 11:26   ` Linus Walleij
@ 2017-08-22 11:12   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Ulf Hansson @ 2017-08-22 11:12 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
> mmc_retune_hold() / mmc_retune_release() are used around a group of
> commands to prevent re-tuning between the commands. Re-tuning can still
> happen before the first command. In some cases, re-tuning must be
> prevented entirely. Add mmc_retune_hold_now() for that purpose. It is
> added in preparation for CQE support where it will be used by CQE recovery.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for next!

Kind regards
Uffe

> ---
>  drivers/mmc/core/host.c | 6 ++++++
>  drivers/mmc/core/host.h | 1 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
> index 1503412f826c..ad88deb2e8f3 100644
> --- a/drivers/mmc/core/host.c
> +++ b/drivers/mmc/core/host.c
> @@ -111,6 +111,12 @@ void mmc_retune_hold(struct mmc_host *host)
>         host->hold_retune += 1;
>  }
>
> +void mmc_retune_hold_now(struct mmc_host *host)
> +{
> +       host->retune_now = 0;
> +       host->hold_retune += 1;
> +}
> +
>  void mmc_retune_release(struct mmc_host *host)
>  {
>         if (host->hold_retune)
> diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
> index fb6a76a03833..77d6f60d1bf9 100644
> --- a/drivers/mmc/core/host.h
> +++ b/drivers/mmc/core/host.h
> @@ -19,6 +19,7 @@
>  void mmc_retune_enable(struct mmc_host *host);
>  void mmc_retune_disable(struct mmc_host *host);
>  void mmc_retune_hold(struct mmc_host *host);
> +void mmc_retune_hold_now(struct mmc_host *host);
>  void mmc_retune_release(struct mmc_host *host);
>  int mmc_retune(struct mmc_host *host);
>  void mmc_retune_pause(struct mmc_host *host);
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's
  2017-08-10 12:08 ` [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's Adrian Hunter
  2017-08-20 11:29   ` Linus Walleij
@ 2017-08-22 11:12   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Ulf Hansson @ 2017-08-22 11:12 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
> Most of the information needed to issue requests to a CQE is already in
> struct mmc_request and struct mmc_data. Add data block address, some flags,
> and the task id (tag), and allow for cmd being NULL which it is for CQE
> tasks.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for next!

Kind regards
Uffe

> ---
>  include/linux/mmc/core.h   | 13 +++++++++++--
>  include/trace/events/mmc.h | 36 +++++++++++++++++++++++-------------
>  2 files changed, 34 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
> index a0c63ea28796..bf1788a224e6 100644
> --- a/include/linux/mmc/core.h
> +++ b/include/linux/mmc/core.h
> @@ -122,11 +122,18 @@ struct mmc_data {
>         unsigned int            timeout_clks;   /* data timeout (in clocks) */
>         unsigned int            blksz;          /* data block size */
>         unsigned int            blocks;         /* number of blocks */
> +       unsigned int            blk_addr;       /* block address */
>         int                     error;          /* data error */
>         unsigned int            flags;
>
> -#define MMC_DATA_WRITE (1 << 8)
> -#define MMC_DATA_READ  (1 << 9)
> +#define MMC_DATA_WRITE         BIT(8)
> +#define MMC_DATA_READ          BIT(9)
> +/* Extra flags used by CQE */
> +#define MMC_DATA_QBR           BIT(10)         /* CQE queue barrier*/
> +#define MMC_DATA_PRIO          BIT(11)         /* CQE high priority */
> +#define MMC_DATA_REL_WR                BIT(12)         /* Reliable write */
> +#define MMC_DATA_DAT_TAG       BIT(13)         /* Tag request */
> +#define MMC_DATA_FORCED_PRG    BIT(14)         /* Forced programming */
>
>         unsigned int            bytes_xfered;
>
> @@ -153,6 +160,8 @@ struct mmc_request {
>
>         /* Allow other commands during this ongoing data transfer or busy wait */
>         bool                    cap_cmd_during_tfr;
> +
> +       int                     tag;
>  };
>
>  struct mmc_card;
> diff --git a/include/trace/events/mmc.h b/include/trace/events/mmc.h
> index a72f9b94c80b..f30a99ac65b6 100644
> --- a/include/trace/events/mmc.h
> +++ b/include/trace/events/mmc.h
> @@ -29,8 +29,10 @@
>                 __field(unsigned int,           sbc_flags)
>                 __field(unsigned int,           sbc_retries)
>                 __field(unsigned int,           blocks)
> +               __field(unsigned int,           blk_addr)
>                 __field(unsigned int,           blksz)
>                 __field(unsigned int,           data_flags)
> +               __field(int,                    tag)
>                 __field(unsigned int,           can_retune)
>                 __field(unsigned int,           doing_retune)
>                 __field(unsigned int,           retune_now)
> @@ -42,10 +44,10 @@
>         ),
>
>         TP_fast_assign(
> -               __entry->cmd_opcode = mrq->cmd->opcode;
> -               __entry->cmd_arg = mrq->cmd->arg;
> -               __entry->cmd_flags = mrq->cmd->flags;
> -               __entry->cmd_retries = mrq->cmd->retries;
> +               __entry->cmd_opcode = mrq->cmd ? mrq->cmd->opcode : 0;
> +               __entry->cmd_arg = mrq->cmd ? mrq->cmd->arg : 0;
> +               __entry->cmd_flags = mrq->cmd ? mrq->cmd->flags : 0;
> +               __entry->cmd_retries = mrq->cmd ? mrq->cmd->retries : 0;
>                 __entry->stop_opcode = mrq->stop ? mrq->stop->opcode : 0;
>                 __entry->stop_arg = mrq->stop ? mrq->stop->arg : 0;
>                 __entry->stop_flags = mrq->stop ? mrq->stop->flags : 0;
> @@ -56,7 +58,9 @@
>                 __entry->sbc_retries = mrq->sbc ? mrq->sbc->retries : 0;
>                 __entry->blksz = mrq->data ? mrq->data->blksz : 0;
>                 __entry->blocks = mrq->data ? mrq->data->blocks : 0;
> +               __entry->blk_addr = mrq->data ? mrq->data->blk_addr : 0;
>                 __entry->data_flags = mrq->data ? mrq->data->flags : 0;
> +               __entry->tag = mrq->tag;
>                 __entry->can_retune = host->can_retune;
>                 __entry->doing_retune = host->doing_retune;
>                 __entry->retune_now = host->retune_now;
> @@ -71,8 +75,8 @@
>                   "cmd_opcode=%u cmd_arg=0x%x cmd_flags=0x%x cmd_retries=%u "
>                   "stop_opcode=%u stop_arg=0x%x stop_flags=0x%x stop_retries=%u "
>                   "sbc_opcode=%u sbc_arg=0x%x sbc_flags=0x%x sbc_retires=%u "
> -                 "blocks=%u block_size=%u data_flags=0x%x "
> -                 "can_retune=%u doing_retune=%u retune_now=%u "
> +                 "blocks=%u block_size=%u blk_addr=%u data_flags=0x%x "
> +                 "tag=%d can_retune=%u doing_retune=%u retune_now=%u "
>                   "need_retune=%d hold_retune=%d retune_period=%u",
>                   __get_str(name), __entry->mrq,
>                   __entry->cmd_opcode, __entry->cmd_arg,
> @@ -81,7 +85,8 @@
>                   __entry->stop_flags, __entry->stop_retries,
>                   __entry->sbc_opcode, __entry->sbc_arg,
>                   __entry->sbc_flags, __entry->sbc_retries,
> -                 __entry->blocks, __entry->blksz, __entry->data_flags,
> +                 __entry->blocks, __entry->blk_addr,
> +                 __entry->blksz, __entry->data_flags, __entry->tag,
>                   __entry->can_retune, __entry->doing_retune,
>                   __entry->retune_now, __entry->need_retune,
>                   __entry->hold_retune, __entry->retune_period)
> @@ -108,6 +113,7 @@
>                 __field(unsigned int,           sbc_retries)
>                 __field(unsigned int,           bytes_xfered)
>                 __field(int,                    data_err)
> +               __field(int,                    tag)
>                 __field(unsigned int,           can_retune)
>                 __field(unsigned int,           doing_retune)
>                 __field(unsigned int,           retune_now)
> @@ -119,10 +125,13 @@
>         ),
>
>         TP_fast_assign(
> -               __entry->cmd_opcode = mrq->cmd->opcode;
> -               __entry->cmd_err = mrq->cmd->error;
> -               memcpy(__entry->cmd_resp, mrq->cmd->resp, 4);
> -               __entry->cmd_retries = mrq->cmd->retries;
> +               __entry->cmd_opcode = mrq->cmd ? mrq->cmd->opcode : 0;
> +               __entry->cmd_err = mrq->cmd ? mrq->cmd->error : 0;
> +               __entry->cmd_resp[0] = mrq->cmd ? mrq->cmd->resp[0] : 0;
> +               __entry->cmd_resp[1] = mrq->cmd ? mrq->cmd->resp[1] : 0;
> +               __entry->cmd_resp[2] = mrq->cmd ? mrq->cmd->resp[2] : 0;
> +               __entry->cmd_resp[3] = mrq->cmd ? mrq->cmd->resp[3] : 0;
> +               __entry->cmd_retries = mrq->cmd ? mrq->cmd->retries : 0;
>                 __entry->stop_opcode = mrq->stop ? mrq->stop->opcode : 0;
>                 __entry->stop_err = mrq->stop ? mrq->stop->error : 0;
>                 __entry->stop_resp[0] = mrq->stop ? mrq->stop->resp[0] : 0;
> @@ -139,6 +148,7 @@
>                 __entry->sbc_retries = mrq->sbc ? mrq->sbc->retries : 0;
>                 __entry->bytes_xfered = mrq->data ? mrq->data->bytes_xfered : 0;
>                 __entry->data_err = mrq->data ? mrq->data->error : 0;
> +               __entry->tag = mrq->tag;
>                 __entry->can_retune = host->can_retune;
>                 __entry->doing_retune = host->doing_retune;
>                 __entry->retune_now = host->retune_now;
> @@ -154,7 +164,7 @@
>                   "cmd_retries=%u stop_opcode=%u stop_err=%d "
>                   "stop_resp=0x%x 0x%x 0x%x 0x%x stop_retries=%u "
>                   "sbc_opcode=%u sbc_err=%d sbc_resp=0x%x 0x%x 0x%x 0x%x "
> -                 "sbc_retries=%u bytes_xfered=%u data_err=%d "
> +                 "sbc_retries=%u bytes_xfered=%u data_err=%d tag=%d "
>                   "can_retune=%u doing_retune=%u retune_now=%u need_retune=%d "
>                   "hold_retune=%d retune_period=%u",
>                   __get_str(name), __entry->mrq,
> @@ -170,7 +180,7 @@
>                   __entry->sbc_resp[0], __entry->sbc_resp[1],
>                   __entry->sbc_resp[2], __entry->sbc_resp[3],
>                   __entry->sbc_retries,
> -                 __entry->bytes_xfered, __entry->data_err,
> +                 __entry->bytes_xfered, __entry->data_err, __entry->tag,
>                   __entry->can_retune, __entry->doing_retune,
>                   __entry->retune_now, __entry->need_retune,
>                   __entry->hold_retune, __entry->retune_period)
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 03/13] mmc: host: Add CQE interface
  2017-08-10 12:08 ` [PATCH V5 03/13] mmc: host: Add CQE interface Adrian Hunter
  2017-08-20 11:31   ` Linus Walleij
@ 2017-08-22 11:13   ` Ulf Hansson
  2017-08-23  6:54     ` Adrian Hunter
  1 sibling, 1 reply; 49+ messages in thread
From: Ulf Hansson @ 2017-08-22 11:13 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
> Add CQE host operations, capabilities, and host members.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for next! However with some minor comments, explained below.

[...]

>
> +       /* Command Queue Engine (CQE) support */
> +       const struct mmc_cqe_ops *cqe_ops;
> +       void                    *cqe_private;
> +       /*
> +        * Notify uppers layers (e.g. mmc block driver) that CQE needs recovery
> +        * due to an error associated with the mmc_request.
> +        */

Thanks for adding the explanation.

> +       void                    (*cqe_recovery_notifier)(struct mmc_host *,
> +                                                        struct mmc_request *);

Normally we don't put callbacks in the struct mmc_host that someone
else than the host driver should assign - so this feels a bit upside
down.

Is there any reason to why you didn't want to add a new API? Something
like mmc_cqe_recover(), which the host driver could call.

> +       int                     cqe_qdepth;
> +       bool                    cqe_enabled;
> +       bool                    cqe_on;
> +
>         unsigned long           private[0] ____cacheline_aligned;
>  };
>
> --
> 1.9.1
>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands
  2017-08-10 12:08 ` [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands Adrian Hunter
  2017-08-20 11:32   ` Linus Walleij
@ 2017-08-22 11:13   ` Ulf Hansson
  1 sibling, 0 replies; 49+ messages in thread
From: Ulf Hansson @ 2017-08-22 11:13 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
> CQE needs to be off for the host controller to accept non-CQ commands. Turn
> off the CQE before sending commands, and ensure it is off in any reset or
> power management paths, or re-tuning.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for next!

Kind regards
Uffe

> ---
>  drivers/mmc/core/core.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 6177eb09bf1b..29544aa2824a 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -260,6 +260,9 @@ static void __mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
>
>         trace_mmc_request_start(host, mrq);
>
> +       if (host->cqe_on)
> +               host->cqe_ops->cqe_off(host);
> +
>         host->ops->request(host, mrq);
>  }
>
> @@ -979,6 +982,9 @@ int mmc_execute_tuning(struct mmc_card *card)
>         if (!host->ops->execute_tuning)
>                 return 0;
>
> +       if (host->cqe_on)
> +               host->cqe_ops->cqe_off(host);
> +
>         if (mmc_card_mmc(card))
>                 opcode = MMC_SEND_TUNING_BLOCK_HS200;
>         else
> @@ -1018,6 +1024,9 @@ void mmc_set_bus_width(struct mmc_host *host, unsigned int width)
>   */
>  void mmc_set_initial_state(struct mmc_host *host)
>  {
> +       if (host->cqe_on)
> +               host->cqe_ops->cqe_off(host);
> +
>         mmc_retune_disable(host);
>
>         if (mmc_host_is_spi(host))
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 00/13] mmc: Add Command Queue support
  2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
                   ` (15 preceding siblings ...)
  2017-08-18 11:06 ` Adrian Hunter
@ 2017-08-22 11:22 ` Ulf Hansson
  16 siblings, 0 replies; 49+ messages in thread
From: Ulf Hansson @ 2017-08-22 11:22 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
> Hi
>
> Here is V5 of the hardware command queue patches without the software
> command queue patches.

I have queued up a subset of this series for next; for those changes I
have replied separately.

However, to also make it simpler for people to test and to get some build
reports from 0-build, I have published a cmdq branch in my mmc tree.
The branch is not intended to stay long-term; it is just an
intermediate step to help find any further issues.

Kind regards
Uffe

>
> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
> 2% drop in sequential read speed but no change to sequential write.
>
>
> Changes since V4:
>       mmc: core: Add mmc_retune_hold_now()
>         Add explanation to commit message.
>       mmc: host: Add CQE interface
>         Add comments to callback declarations.
>       mmc: core: Turn off CQE before sending commands
>         Add explanation to commit message.
>       mmc: core: Add support for handling CQE requests
>         Add comments as requested by Ulf.
>       mmc: core: Remove unused MMC_CAP2_PACKED_CMD
>         New patch.
>       mmc: mmc: Enable Command Queuing
>         Adjust for removal of MMC_CAP2_PACKED_CMD.
>         Add a comment about Packed Commands.
>       mmc: mmc: Enable CQE's
>         Remove un-necessary check for MMC_CAP2_CQE
>       mmc: block: Use local variables in mmc_blk_data_prep()
>         New patch.
>       mmc: block: Prepare CQE data
>         Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
>         Remove priority setting.
>         Add explanation to commit message.
>       mmc: cqhci: support for command queue enabled host
>         Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA
>
> Changes since V3:
>         Adjusted ...blk_end_request...() for new block status codes
>         Fixed CQHCI transaction descriptor for "no DCMD" case
>
> Changes since V2:
>         Dropped patches that have been applied.
>         Re-based
>         Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"
>
> Changes since V1:
>
>         "Share mmc request array between partitions" is dependent
>         on changes in "Introduce queue semantics", so added that
>         and block fixes:
>
>         Added "Fix is_waiting_last_req set incorrectly"
>         Added "Fix cmd error reset failure path"
>         Added "Use local var for mqrq_cur"
>         Added "Introduce queue semantics"
>
> Changes since RFC:
>
>         Re-based on next.
>         Added comment about command queue priority.
>         Added some acks and reviews.
>
>
> Adrian Hunter (12):
>       mmc: core: Add mmc_retune_hold_now()
>       mmc: core: Add members to mmc_request and mmc_data for CQE's
>       mmc: host: Add CQE interface
>       mmc: core: Turn off CQE before sending commands
>       mmc: core: Add support for handling CQE requests
>       mmc: core: Remove unused MMC_CAP2_PACKED_CMD
>       mmc: mmc: Enable Command Queuing
>       mmc: mmc: Enable CQE's
>       mmc: block: Use local variables in mmc_blk_data_prep()
>       mmc: block: Prepare CQE data
>       mmc: block: Add CQE support
>       mmc: sdhci-pci: Add CQHCI support for Intel GLK
>
> Venkat Gopalakrishnan (1):
>       mmc: cqhci: support for command queue enabled host
>
>  drivers/mmc/core/block.c          |  238 +++++++-
>  drivers/mmc/core/block.h          |    7 +
>  drivers/mmc/core/bus.c            |    7 +
>  drivers/mmc/core/core.c           |  178 +++++-
>  drivers/mmc/core/host.c           |    6 +
>  drivers/mmc/core/host.h           |    1 +
>  drivers/mmc/core/mmc.c            |   42 +-
>  drivers/mmc/core/queue.c          |  273 ++++++++-
>  drivers/mmc/core/queue.h          |   42 +-
>  drivers/mmc/host/Kconfig          |   14 +
>  drivers/mmc/host/Makefile         |    1 +
>  drivers/mmc/host/cqhci.c          | 1154 +++++++++++++++++++++++++++++++++++++
>  drivers/mmc/host/cqhci.h          |  240 ++++++++
>  drivers/mmc/host/sdhci-pci-core.c |  153 ++++-
>  include/linux/mmc/core.h          |   18 +-
>  include/linux/mmc/host.h          |   63 +-
>  include/trace/events/mmc.h        |   36 +-
>  17 files changed, 2412 insertions(+), 61 deletions(-)
>  create mode 100644 drivers/mmc/host/cqhci.c
>  create mode 100644 drivers/mmc/host/cqhci.h
>
>
> Regards
> Adrian
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 03/13] mmc: host: Add CQE interface
  2017-08-22 11:13   ` Ulf Hansson
@ 2017-08-23  6:54     ` Adrian Hunter
  2017-08-23 12:48       ` Ulf Hansson
  0 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2017-08-23  6:54 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 22/08/17 14:13, Ulf Hansson wrote:
> On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> +       void                    (*cqe_recovery_notifier)(struct mmc_host *,
>> +                                                        struct mmc_request *);
> 
> Normally we don't put callbacks in the struct mmc_host that someone
> else than the host driver should assign - so this feels a bit upside
> down.
> 
> Is there any reason to why you didn't want to add a new API? Something
> like mmc_cqe_recover(), which the host driver could call.

That would make the host driver dependent on the block driver.  There needs
to be a function pointer, even if we wrap it in an API.
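
For illustration, a wrapped API would still reduce to something like the
sketch below; mmc_cqe_recovery_notify() is a hypothetical name, not part of
this series:

	/*
	 * Hypothetical core-side wrapper: the host driver calls this instead
	 * of dereferencing host->cqe_recovery_notifier directly, but the
	 * function pointer assigned by the block driver still does the work.
	 */
	void mmc_cqe_recovery_notify(struct mmc_host *host, struct mmc_request *mrq)
	{
		if (host->cqe_recovery_notifier)
			host->cqe_recovery_notifier(host, mrq);
	}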

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 03/13] mmc: host: Add CQE interface
  2017-08-23  6:54     ` Adrian Hunter
@ 2017-08-23 12:48       ` Ulf Hansson
  2017-08-24  6:53         ` Adrian Hunter
  0 siblings, 1 reply; 49+ messages in thread
From: Ulf Hansson @ 2017-08-23 12:48 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 23 August 2017 at 08:54, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 22/08/17 14:13, Ulf Hansson wrote:
>> On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> +       void                    (*cqe_recovery_notifier)(struct mmc_host *,
>>> +                                                        struct mmc_request *);
>>
>> Normally we don't put callbacks in the struct mmc_host that someone
>> else than the host driver should assign - so this feels a bit upside
>> down.
>>
>> Is there any reason to why you didn't want to add a new API? Something
>> like mmc_cqe_recover(), which the host driver could call.
>
> That would make the host driver dependent on the block driver.  There needs
> to be a function pointer, even if we wrap it in an API.

Okay, I see. I guess we could put such pointer somewhere closer to the
mmc request queue.

Anyway, I didn't find out how this pointer was being protected from
concurrent access or perhaps that is managed via
mmc_claim|release_host()? Moreover, I couldn't find it ever being
reset to NULL.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 03/13] mmc: host: Add CQE interface
  2017-08-23 12:48       ` Ulf Hansson
@ 2017-08-24  6:53         ` Adrian Hunter
  2017-08-24  9:24           ` Ulf Hansson
  0 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2017-08-24  6:53 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 23/08/17 15:48, Ulf Hansson wrote:
> On 23 August 2017 at 08:54, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 22/08/17 14:13, Ulf Hansson wrote:
>>> On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>> +       void                    (*cqe_recovery_notifier)(struct mmc_host *,
>>>> +                                                        struct mmc_request *);
>>>
>>> Normally we don't put callbacks in the struct mmc_host that someone
>>> else than the host driver should assign - so this feels a bit upside
>>> down.
>>>
>>> Is there any reason to why you didn't want to add a new API? Something
>>> like mmc_cqe_recover(), which the host driver could call.
>>
>> That would make the host driver dependent on the block driver.  There needs
>> to be a function pointer, even if we wrap it in an API.
> 
> Okay, I see. I guess we could put such pointer somewhere closer to the
> mmc request queue.

Another possibility is to put it into struct mmc_request, like the "done"
callback.
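
A pared-down sketch of that alternative (only the relevant members are shown;
the recovery_notifier field is the hypothetical addition):

	struct mmc_request {
		struct mmc_command	*sbc;
		struct mmc_command	*cmd;
		struct mmc_data		*data;
		struct mmc_command	*stop;

		void (*done)(struct mmc_request *mrq);	/* existing completion callback */
		/* hypothetical per-request CQE recovery hook, set by the block driver */
		void (*recovery_notifier)(struct mmc_request *mrq);

		struct mmc_host		*host;
		int			tag;	/* CQE task id added by this series */
		/* other members elided */
	};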

> 
> Anyway, I didn't find out how this pointer was being protected from
> concurrent access or perhaps that is managed via
> mmc_claim|release_host()?

It is assumed CQE is only used by the card driver so the callback can be set
/ reset in the card driver's probe / remove.

>                           Moreover, I couldn't find it ever being
> reset to NULL.

Yeah, doesn't look like it gets reset.
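
To make the life-cycle point concrete, here is a sketch of tying the
assignment and the missing reset to the card driver's probe/remove
(mmc_blk_cqe_recovery_notifier() stands in for whatever the block driver
actually provides):

	static void mmc_blk_cqe_recovery_notifier(struct mmc_host *host,
						  struct mmc_request *mrq)
	{
		/* wake the queue thread so it can run CQE recovery; body omitted */
	}

	static int mmc_blk_probe(struct mmc_card *card)
	{
		card->host->cqe_recovery_notifier = mmc_blk_cqe_recovery_notifier;
		/* ... rest of probe ... */
		return 0;
	}

	static void mmc_blk_remove(struct mmc_card *card)
	{
		/* reset the pointer, addressing the "never reset to NULL" point */
		card->host->cqe_recovery_notifier = NULL;
		/* ... rest of remove ... */
	}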

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 03/13] mmc: host: Add CQE interface
  2017-08-24  6:53         ` Adrian Hunter
@ 2017-08-24  9:24           ` Ulf Hansson
  0 siblings, 0 replies; 49+ messages in thread
From: Ulf Hansson @ 2017-08-24  9:24 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin

On 24 August 2017 at 08:53, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 23/08/17 15:48, Ulf Hansson wrote:
>> On 23 August 2017 at 08:54, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> On 22/08/17 14:13, Ulf Hansson wrote:
>>>> On 10 August 2017 at 14:08, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>> +       void                    (*cqe_recovery_notifier)(struct mmc_host *,
>>>>> +                                                        struct mmc_request *);
>>>>
>>>> Normally we don't put callbacks in the struct mmc_host that someone
>>>> else than the host driver should assign - so this feels a bit upside
>>>> down.
>>>>
>>>> Is there any reason to why you didn't want to add a new API? Something
>>>> like mmc_cqe_recover(), which the host driver could call.
>>>
>>> That would make the host driver dependent on the block driver.  There needs
>>> to be a function pointer, even if we wrap it in an API.
>>
>> Okay, I see. I guess we could put such pointer somewhere closer to the
>> mmc request queue.
>
> Another possibility is to put it into struct mmc_request, like the "done"
> callback.

Yes, I think like that.

That would also solve the problem of having to protect the pointer, as
the request's life cycle would then correspond to that of the block
device driver.

>
>>
>> Anyway, I didn't find out how this pointer was being protected from
>> concurrent access or perhaps that is managed via
>> mmc_claim|release_host()?
>
> It is assumed CQE is only used by the card driver so the callback can be set
> / reset in the card driver's probe / remove.
>
>>                           Moreover, I couldn't find it ever being
>> reset to NULL.
>
> Yeah, doesn't look like it gets reset.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 11/13] mmc: block: Add CQE support
  2017-08-21  9:27     ` Adrian Hunter
@ 2017-08-31 10:05       ` Linus Walleij
  2017-08-31 10:25       ` Christoph Hellwig
  1 sibling, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-31 10:05 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-block, Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Shawn Lin

On Mon, Aug 21, 2017 at 11:27 AM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 20/08/17 15:13, Linus Walleij wrote:
>> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>>> Add CQE support to the block driver, including:
>>>         - optionally using DCMD for flush requests
>>>         - manually issuing discard requests
>>
>> "Manually" has no place in computer code IMO since there is no
>> man there. But I may be especially grumpy. But I don't understand the
>> comment anyway.
>
> CQE automatically sends the commands to complete requests.  However it only
> supports reads / writes and so-called "direct commands" (DCMD).  Furthermore
> DCMD is limited to one command at a time, but discards require 3 commands.
> That makes issuing discards through CQE very awkward, but some CQE's don't
> support DCMD anyway.  So for discards, the existing non-CQE approach is
> taken, where the mmc core code issues the 3 commands one at a time i.e.
> mmc_erase().
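
For reference, a minimal sketch of the three-command sequence a discard
reduces to (this is what mmc_erase() drives in the core; argument encoding,
R1b/busy handling and most error handling are omitted):

	static int discard_sketch(struct mmc_card *card, unsigned int from,
				  unsigned int to, unsigned int arg)
	{
		struct mmc_command cmd = {};
		int err;

		cmd.opcode = MMC_ERASE_GROUP_START;	/* CMD35: first address */
		cmd.arg = from;
		cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_AC;
		err = mmc_wait_for_cmd(card->host, &cmd, 0);
		if (err)
			return err;

		cmd.opcode = MMC_ERASE_GROUP_END;	/* CMD36: last address */
		cmd.arg = to;
		err = mmc_wait_for_cmd(card->host, &cmd, 0);
		if (err)
			return err;

		cmd.opcode = MMC_ERASE;			/* CMD38: erase/trim/discard */
		cmd.arg = arg;
		return mmc_wait_for_cmd(card->host, &cmd, 0);
	}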

Ah, that's a great deal of good technical detail (I think I might even have
understood what is going on). Can we please make sure this text gets into
the commit message?

>> So adding a new (in effect) invasive block driver needs to at least
>> be CC:ed to the block maintainers so we don't sneak anything like
>> that in under the radar.
>>
>> This duplicates the "thread that sucks out requests" from the
>> MMC core, and doubles the problem of replacing it with
>> something like blk-mq.
>
> We need to start with a legacy API because people want to backport CQ to
> earlier kernels (we really need to get features upstream more quickly), but
> blk-mq has been evolving a lot (e.g. elevator support), so backporters face
> having either something quite different from upstream or trying to backport
> great chunks of the block layer.

I hope I do not come across as rude to a senior kernel developer if I just
reference this:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-api-nonsense.rst

My position is that any attempt to push the community around because of
internal product development falls under the "argument to moderation"
fallacy, even for huge enterprises.
https://en.wikipedia.org/wiki/Argument_to_moderation

> We also don't know how blk-mq will perform so it would be prudent to start
> with support for both the legacy API and blk-mq (as scsi does) so that we
> can find out first.

As far as I have understood, that is not the position of the block layer
maintainers, who are quickly phasing out the old block layer in favor
of the new mq scheduler.

Example: they did not accept the new BFQ scheduling policy until it was
rewritten as a plugin to mq.

> A loop to remove requests from the queue is normal e.g. scsi_request_fn()
> and taking a consistent approach with the existing mmc thread does not
> double the problem of blk-mq implementation, since the same approach can
> be used for both.

If the two threads are so similar, why do we need two threads?

My general feeling is that this part of the code is duct-taped on the side
of the old request poll loop instead of refactoring and integrating the new
code with the existing poll loop.

I need a clear argument as to why that is not possible if this solution
should persist.

>>> +               if (!kthread_should_stop())
>>> +                       req = blk_peek_request(q);
>>
>> Why are you using blk_peek_request() instead of blk_fetch_request()
>> when the other thread just goes with fetch?
>
> Because we are not always able to start the request.

OK I understand this now, sorry for my ignorance.

>> Am I right in assuming that also this request queue replies on the
>> implicit semantic that blk_peek_request(q) shall return NULL
>> if there are no requests in the queue, and that it is pulling out a
>> few NULLs to flush the request-handling machinery? Or did it
>> fix this? (Put that in the commit message in that case.)
>
> CQ is different.  Because read / write requests are processed asynchronously
> we can keep preparing and issuing more requests until the hardware queue is
> full.  That is, we don't have to wait for the first request to complete
> before preparing and issuing the second request, and so on.
>
> However, the existing thread's need to issue a "null" is not because it is a
> thread.  It is because of the convoluted design of mmc_start_areq().

Yeah that is true :(

> However, you still need a way to issue the prepared request when the current
> request completes.  Either wake-up the thread to do it, or in the past I
> have suggested that should be done in the completion path, but that needs
> changes to the host controller API.

We might have to do that then. It is anyway necessary for MQ,
which (in my implementation) sets the queue depth to 2 and gets the
same issue with two outstanding requests.

>>> +                               break;
>>> +                       default:
>>> +                               /*
>>> +                                * Timeouts are handled by mmc core, so set a
>>> +                                * large value to avoid races.
>>> +                                */
>>> +                               req->timeout = 600 * HZ;
>>> +                               blk_start_request(req);
>>
>> And here it just starts.
>
> This is the path for requests that the CQE cannot process.  They don't need
> to be tagged because there is no command queue - no way to do more than one
> request at a time.

OK I get it.

>>> +                       switch (issued) {
>>> +                       case MMC_REQ_STARTED:
>>> +                               break;
>>> +                       case MMC_REQ_BUSY:
>>> +                               blk_requeue_request(q, req);
>>
>> Here it is requeued.
>
> Any request that is started has to be either re-queued or completed.

Yup I get it.

>>> +                               goto finished;
>>> +                       case MMC_REQ_FAILED_TO_START:
>>> +                               __blk_end_request_all(req, BLK_STS_IOERR);
>>
>> Or ended.
>
> Any request that is started has to be either re-queued or completed.

Yup.

So why can't we do all this in the existing thread instead of a new
thread?

>> I think I said in the beginning maybe not in response to this series but
>> another that I don't like the idea of making a second copy of the
>> whole request thread. I would much rather see that ONE request thread
>> is used for both regular requests and CQE.
>
> They issue requests in completely different ways.  There is essentially no
> common code.

That sounds a bit handwavy, sorry.

I need to know exactly why.

Will it also have to be two separate MQ paths when we migrate
to that?

>> Atleast I wanna see a serious rebuttal of why my claim that we should
>> keep this code down is such a pipe dream that we just have to go ahead
>> and make a second instance of all the core request mechanics.
>> And that this should be part of the commit message: "I'm duplicating
>> the queue thread because (...)" etc.
>
> As I wrote above, we need to start with the legacy API, because of
> backporting and because we don't know how blk-mq will perform.

Sorry I do not buy it. (See the above references.)

We need to agree to disagree and leave this to
the higher authority then.

>> I'm sorry for being so conservative, but I am plain scared, I had serious
>> trouble refactoring this code already, and I got the feeling that several
>> people think the MMC stack has grown unapproachable for average
>> developers (my Googledoc where I tried to pry it apart got very popular
>> with developers I respect), and this is making that situation worse. Soon
>> only you and Ulf will understand this piece of code.
>
> The mmc block driver has problems, but the thread isn't one of them.

The pulling out of NULLs from the end of the request queue to
flush the request pipeline is definitely an abuse of the API and
that is a big problem.

> mmc_start_areq() is very convoluted.  However, getting rid of
> mmc_start_areq() is only the first step.

We are in violent agreement, at last.

> Step two: have the host controller
> complete requests without the need of block driver polling i.e.
> card_busy_detect() must not be called in the normal completion path.

I agree on this too.

> Step
> three: make it possible to issue the prepared request in the completion path
> of the current request.

I actually did patches to do that.

OK the patches suck for several reasons. But I have a proof of concept:
https://marc.info/?l=linux-mmc&m=148665460925587&w=2

Despite the convoluted subject this is what that patch does and
what the series try to build up to.

> Steps two and three need host controller driver
> changes.

I don't know about that. I think I kind of pulled off three.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 11/13] mmc: block: Add CQE support
  2017-08-20 12:13   ` Linus Walleij
  2017-08-21  9:27     ` Adrian Hunter
@ 2017-08-31 10:23     ` Christoph Hellwig
  2017-08-31 12:00       ` Adrian Hunter
  1 sibling, 1 reply; 49+ messages in thread
From: Christoph Hellwig @ 2017-08-31 10:23 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Adrian Hunter, linux-block, Ulf Hansson, linux-mmc, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin

On Sun, Aug 20, 2017 at 02:13:03PM +0200, Linus Walleij wrote:
> So adding a new (in effect) invasive block driver needs to at least
> be CC:ed to the block maintainers so we don't sneak anything like
> that in under the radar.

Yes.

> And this semaphoring and threading is just as confusing as ever and now
> we have two of them. (Sorry, I'm grumpy.)

But you are grumpy for a good reason.  The MMC driver is a pain
to understand even for a seasoned block layer developer.

> What we need is an MMC stack where it is clear where blocks come in
> and out and how they are processed by the block layer, but now we
> already have a scary Rube Goldberg-machine and it is not getting better.
> If people have different feelings they can tell me off right now.

Agreed.

> 
> I have my hopes up that we can get the code lesser and more readable
> with MQ, as I tried to illustrate in my attempts, which are indeed lame
> because they don't work because of misc and SDIO use cases, but
> I'm honestly doing my best. Currently with other clean-ups to get a
> clean surface to do that.
> 
> As it stands, the MQ migration work size is in some spots doubled or
> more than doubled after this commit :(

I don't think we should merge this.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 11/13] mmc: block: Add CQE support
  2017-08-21  9:27     ` Adrian Hunter
  2017-08-31 10:05       ` Linus Walleij
@ 2017-08-31 10:25       ` Christoph Hellwig
  1 sibling, 0 replies; 49+ messages in thread
From: Christoph Hellwig @ 2017-08-31 10:25 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Linus Walleij, linux-block, Ulf Hansson, linux-mmc, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin

On Mon, Aug 21, 2017 at 12:27:30PM +0300, Adrian Hunter wrote:
> We need to start with a legacy API because people want to backport CQ to
> earlier kernels (we really need to get features upstream more quickly), but
> blk-mq has been evolving a lot (e.g. elevator support), so backporters face
> having either something quite different from upstream or trying to backport
> great chunks of the block layer.

Hell no - the point of Linux mainline development is to move forward.
We'll need to do the right thing for long term Linux development FIRST
and then worry about backporting SECOND.

Stop thinking about all kernels while you wear your mainline kernel
developer hat.  Yes, occasionally we can do a small amount of reordering
to keep a small fix first for stable backports, but not to the extent
of using obsolete infrastructure and making an already messy driver
worse just to please backports to long-obsolete kernels.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 05/13] mmc: core: Add support for handling CQE requests
  2017-08-21  9:26     ` Adrian Hunter
@ 2017-08-31 11:32       ` Linus Walleij
  0 siblings, 0 replies; 49+ messages in thread
From: Linus Walleij @ 2017-08-31 11:32 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Shawn Lin

On Mon, Aug 21, 2017 at 11:26 AM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 20/08/17 14:39, Linus Walleij wrote:
>> On Thu, Aug 10, 2017 at 2:08 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>>> Add core support for handling CQE requests, including starting, completing
>>> and recovering.
>>>
>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>>
>>
>>> +static void __mmc_cqe_request_done(struct mmc_host *host,
>>> +                                  struct mmc_request *mrq)
>>
>> We are naming too much stuff __foo now, it gets really hard to figure
>> out from the function name and the surrounding code what is going on.
>
You have written several times that you don't like __foo() names; however,
it is a normal kernel paradigm.

Normal doesn't mean "good".

I am negative to it because it has very unclear semantics. What is the
semantic meaning of prefixing a function with __* really?

I have referred to Rusty Russell's API levels:
http://sweng.the-davies.net/Home/rustys-api-design-manifesto

This is on level 3-4 and definately not at 6.

So, in my opinion, I have informed, theoretically founded, and valid
reasons to dislike it, and I don't think it is a matter of taste or
opinion.

>> I guess people are using this like "do parts of what mmc_cqe_request_done()
>> is doing" but it'd be nice if we could be specific.
>>
>> mmc_cqe_request_finalize() could work?
>
> It can be rolled into mmc_cqe_request_done().

OK!

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH V5 11/13] mmc: block: Add CQE support
  2017-08-31 10:23     ` Christoph Hellwig
@ 2017-08-31 12:00       ` Adrian Hunter
  0 siblings, 0 replies; 49+ messages in thread
From: Adrian Hunter @ 2017-08-31 12:00 UTC (permalink / raw)
  To: Christoph Hellwig, Linus Walleij
  Cc: linux-block, Ulf Hansson, linux-mmc, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Shawn Lin

On 31/08/17 13:23, Christoph Hellwig wrote:
> On Sun, Aug 20, 2017 at 02:13:03PM +0200, Linus Walleij wrote:
>> So adding a new (in effect) invasive block driver needs to at least
>> be CC:ed to the block maintainers so we don't sneak anything like
>> that in under the radar.
> 
> Yes.
> 
>> And this semaphoring and threading is just as confusing as ever and now
>> we have two of them. (Sorry, I'm grumpy.)
> 
> But you are grumpy for a good reason.  The MMC driver is a pain
> to understand even for a seasoned block layer developer.
> 
>> What we need is an MMC stack where it is clear where blocks come in
>> and out and how they are processed by the block layer, but now we
>> already have a scary Rube Goldberg-machine and it is not getting better.
>> If people have different feelings they can tell me off right now.
> 
> Agreed.
> 
>>
>> I have my hopes up that we can get the code lesser and more readable
>> with MQ, as I tried to illustrate in my attempts, which are indeed lame
>> because they don't work because of misc and SDIO use cases, but
>> I'm honestly doing my best. Currently with other clean-ups to get a
>> clean surface to do that.
>>
>> As it stands, the MQ migration work size is in some spots doubled or
>> more than doubled after this commit :(
> 
> I don't think we should merge this.
> 

OK, let's merge the blk-mq version then.  Here it is in V7:

	https://marc.info/?l=linux-mmc&m=150418101630159&w=2

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2017-08-31 12:07 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-10 12:08 [PATCH V5 00/13] mmc: Add Command Queue support Adrian Hunter
2017-08-10 12:08 ` [PATCH V5 01/13] mmc: core: Add mmc_retune_hold_now() Adrian Hunter
2017-08-20 11:26   ` Linus Walleij
2017-08-22 11:12   ` Ulf Hansson
2017-08-10 12:08 ` [PATCH V5 02/13] mmc: core: Add members to mmc_request and mmc_data for CQE's Adrian Hunter
2017-08-20 11:29   ` Linus Walleij
2017-08-21  9:26     ` Adrian Hunter
2017-08-22 11:12   ` Ulf Hansson
2017-08-10 12:08 ` [PATCH V5 03/13] mmc: host: Add CQE interface Adrian Hunter
2017-08-20 11:31   ` Linus Walleij
2017-08-22 11:13   ` Ulf Hansson
2017-08-23  6:54     ` Adrian Hunter
2017-08-23 12:48       ` Ulf Hansson
2017-08-24  6:53         ` Adrian Hunter
2017-08-24  9:24           ` Ulf Hansson
2017-08-10 12:08 ` [PATCH V5 04/13] mmc: core: Turn off CQE before sending commands Adrian Hunter
2017-08-20 11:32   ` Linus Walleij
2017-08-22 11:13   ` Ulf Hansson
2017-08-10 12:08 ` [PATCH V5 05/13] mmc: core: Add support for handling CQE requests Adrian Hunter
2017-08-20 11:39   ` Linus Walleij
2017-08-21  9:26     ` Adrian Hunter
2017-08-31 11:32       ` Linus Walleij
2017-08-22  8:06   ` Ulf Hansson
2017-08-10 12:08 ` [PATCH V5 06/13] mmc: core: Remove unused MMC_CAP2_PACKED_CMD Adrian Hunter
2017-08-20 11:33   ` Linus Walleij
2017-08-22 11:12   ` Ulf Hansson
2017-08-10 12:08 ` [PATCH V5 07/13] mmc: mmc: Enable Command Queuing Adrian Hunter
2017-08-20 11:33   ` Linus Walleij
2017-08-10 12:08 ` [PATCH V5 08/13] mmc: mmc: Enable CQE's Adrian Hunter
2017-08-20 11:41   ` Linus Walleij
2017-08-10 12:08 ` [PATCH V5 09/13] mmc: block: Use local variables in mmc_blk_data_prep() Adrian Hunter
2017-08-20 11:43   ` Linus Walleij
2017-08-21  9:27     ` Adrian Hunter
2017-08-10 12:08 ` [PATCH V5 10/13] mmc: block: Prepare CQE data Adrian Hunter
2017-08-10 12:08 ` [PATCH V5 11/13] mmc: block: Add CQE support Adrian Hunter
2017-08-20 12:13   ` Linus Walleij
2017-08-21  9:27     ` Adrian Hunter
2017-08-31 10:05       ` Linus Walleij
2017-08-31 10:25       ` Christoph Hellwig
2017-08-31 10:23     ` Christoph Hellwig
2017-08-31 12:00       ` Adrian Hunter
2017-08-10 12:08 ` [PATCH V5 12/13] mmc: cqhci: support for command queue enabled host Adrian Hunter
2017-08-10 12:08 ` [PATCH V5 13/13] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
2017-08-11 10:33 ` [PATCH V5 00/13] mmc: Add Command Queue support Bough Chen
2017-08-17  7:45 ` Bough Chen
2017-08-17  8:56   ` Shawn Lin
2017-08-18 11:03   ` Adrian Hunter
2017-08-18 11:06 ` Adrian Hunter
2017-08-22 11:22 ` Ulf Hansson
