* [PATCH V14 00/24] mmc: Add Command Queue support
@ 2017-11-21 13:42 Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 01/24] mmc: block: Fix missing blk_put_request() Adrian Hunter
                   ` (24 more replies)
  0 siblings, 25 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Hi

Here is V14 of the hardware command queue patches without the software
command queue patches, now using blk-mq and now with blk-mq support for
non-CQE I/O.

V14 includes a number of fixes to existing code, changes to default to
blk-mq, and adds patches to remove legacy code.

HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
2% drop in sequential read speed but no change to sequential write.

Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
seemed to come from the higher latency of running work items compared with
a dedicated thread.  Hacking the blk-mq workqueue to be unbound reduced the
performance degradation from 3% to 1%.

While we should look at changing blk-mq to give better workqueue performance,
a bigger gain is likely to be made by adding a new host API to enable the
next already-prepared request to be issued directly from within the ->done()
callback of the current request.
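
To illustrate the idea, here is a hypothetical sketch only - the callback
hook and the prepared_mrq field are assumptions, not something this series
adds:

	/* Hypothetical: called from the host driver's ->done() path */
	static void mmc_issue_prepared_mrq(struct mmc_host *host)
	{
		struct mmc_request *next = host->prepared_mrq; /* assumed field */

		if (next) {
			host->prepared_mrq = NULL;
			/* Issue immediately, without waiting for a work item */
			mmc_start_request(host, next);
		}
	}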


Changes since V13:
      mmc: block: Fix missing blk_put_request()
	New patch.
      mmc: block: Check return value of blk_get_request()
	New patch.
      mmc: core: Do not leave the block driver in a suspended state
	New patch.
      mmc: block: Ensure that debugfs files are removed
	New patch.
      mmc: block: No need to export mmc_cleanup_queue()
	New patch.
      mmc: block: Simplify cleaning up the queue
	New patch.
      mmc: block: Use data timeout in card_busy_detect()
	New patch.
      mmc: block: Check for transfer state in card_busy_detect()
	New patch.
      mmc: block: Make card_busy_detect() accumulate all response error bits
	New patch.
      mmc: core: Make mmc_pre_req() and mmc_post_req() available
	New patch.
      mmc: core: Add parameter use_blk_mq
	Default to y
      mmc: block: Add blk-mq support
	Wrap blk_mq_end_request / blk_end_request_all
	Rename mmc_blk_rw_recovery -> mmc_blk_mq_rw_recovery
	Additional parentheses to '==' expressions
	Use mmc_pre_req() / mmc_post_req()
	Fix missing tuning release on error after mmc_start_request()
	Expand comment about timeouts
	Allow for possibility that the queue is quiesced when removing
	Ensure complete_work is flushed when removing
      mmc: block: Add CQE support
	Additional parentheses to '==' expressions
      mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
	Replaces patch "Stop using card_busy_detect()" retaining card_busy_detect()
      mmc: block: blk-mq: Stop using legacy recovery
	Allow for SPI
      mmc: mmc_test: Do not use mmc_start_areq() anymore
	New patch.
      mmc: core: Remove option not to use blk-mq
	New patch.
      mmc: block: Remove code no longer needed after the switch to blk-mq
	New patch.
      mmc: core: Remove code no longer needed after the switch to blk-mq
	New patch.

Changes since V12:
      mmc: block: Add error-handling comments
	New patch.
      mmc: block: Add blk-mq support
	Use legacy error handling
      mmc: block: Add CQE support
	Re-base
      mmc: block: blk-mq: Add support for direct completion
	New patch.
      mmc: block: blk-mq: Separate card polling from recovery
	New patch.
      mmc: block: blk-mq: Stop using card_busy_detect()
	New patch.
      mmc: block: blk-mq: Stop using legacy recovery
	New patch.

Changes since V11:
      Split "mmc: block: Add CQE and blk-mq support" into 2 patches

Changes since V10:
      mmc: core: Remove unnecessary host claim
      mmc: core: Introduce host claiming by context
      mmc: core: Add support for handling CQE requests
      mmc: mmc: Enable Command Queuing
      mmc: mmc: Enable CQE's
      mmc: block: Use local variables in mmc_blk_data_prep()
      mmc: block: Prepare CQE data
      mmc: block: Factor out mmc_setup_queue()
      mmc: core: Add parameter use_blk_mq
      mmc: core: Export mmc_start_bkops()
      mmc: core: Export mmc_start_request()
      mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
	Dropped because they have been applied
      mmc: block: Add CQE and blk-mq support
	Extend blk-mq support for asynchronous read / writes to all host
	controllers including those that require polling. The direct
	completion path is still available but depends on a new capability
	flag.
	Drop blk-mq support for synchronous read / writes.

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

Changes since V9:
      mmc: block: Add CQE and blk-mq support
	- reinstate mq support for REQ_OP_DRV_IN/OUT that was removed because
	it was incorrectly assumed to be handled by the rpmb character device
	- don't check for rpmb block device anymore
      mmc: cqhci: support for command queue enabled host
	Fix cqhci_set_irqs() as per Haibo Chen

Changes since V8:
	Re-based
      mmc: core: Introduce host claiming by context
	Slightly simplified as per Ulf
      mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
	New patch.
      mmc: block: Add CQE and blk-mq support
	Fix missing ->post_req() on the error path

Changes since V7:
	Re-based
      mmc: core: Introduce host claiming by context
	Slightly simplified
      mmc: core: Add parameter use_blk_mq
	New patch.
      mmc: core: Remove unnecessary host claim
	New patch.
      mmc: core: Export mmc_start_bkops()
	New patch.
      mmc: core: Export mmc_start_request()
	New patch.
      mmc: block: Add CQE and blk-mq support
	Add blk-mq support for non-CQE requests

Changes since V6:
      mmc: core: Introduce host claiming by context
	New patch.
      mmc: core: Move mmc_start_areq() declaration
	Dropped because it has been applied
      mmc: block: Fix block status codes
	Dropped because it has been applied
      mmc: host: Add CQE interface
	Dropped because it has been applied
      mmc: core: Turn off CQE before sending commands
	Dropped because it has been applied
      mmc: block: Factor out mmc_setup_queue()
	New patch.
      mmc: block: Add CQE support
	Drop legacy support and add blk-mq support

Changes since V5:
	Re-based
      mmc: core: Add mmc_retune_hold_now()
	Dropped because it has been applied
      mmc: core: Add members to mmc_request and mmc_data for CQE's
	Dropped because it has been applied
      mmc: core: Move mmc_start_areq() declaration
	New patch at Ulf's request
      mmc: block: Fix block status codes
	Another unrelated patch
      mmc: host: Add CQE interface
	Move recovery_notifier() callback to struct mmc_request
      mmc: core: Add support for handling CQE requests
	Roll __mmc_cqe_request_done() into mmc_cqe_request_done()
	Move function declarations requested by Ulf
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
	Dropped because it has been applied
      mmc: block: Add CQE support
	Add explanation to commit message
	Adjustment for changed recovery_notifier() callback
      mmc: cqhci: support for command queue enabled host
	Adjustment for changed recovery_notifier() callback
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
	Add DCMD capability for Intel controllers except GLK

Changes since V4:
      mmc: core: Add mmc_retune_hold_now()
	Add explanation to commit message.
      mmc: host: Add CQE interface
	Add comments to callback declarations.
      mmc: core: Turn off CQE before sending commands
	Add explanation to commit message.
      mmc: core: Add support for handling CQE requests
	Add comments as requested by Ulf.
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
	New patch.
      mmc: mmc: Enable Command Queuing
	Adjust for removal of MMC_CAP2_PACKED_CMD.
	Add a comment about Packed Commands.
      mmc: mmc: Enable CQE's
	Remove unnecessary check for MMC_CAP2_CQE
      mmc: block: Use local variables in mmc_blk_data_prep()
	New patch.
      mmc: block: Prepare CQE data
	Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
	Remove priority setting.
	Add explanation to commit message.
      mmc: cqhci: support for command queue enabled host
	Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA

Changes since V3:
	Adjusted ...blk_end_request...() for new block status codes
	Fixed CQHCI transaction descriptor for "no DCMD" case

Changes since V2:
	Dropped patches that have been applied.
	Re-based
	Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"

Changes since V1:

	"Share mmc request array between partitions" is dependent
	on changes in "Introduce queue semantics", so added that
	and block fixes:

	Added "Fix is_waiting_last_req set incorrectly"
	Added "Fix cmd error reset failure path"
	Added "Use local var for mqrq_cur"
	Added "Introduce queue semantics"

Changes since RFC:

	Re-based on next.
	Added comment about command queue priority.
	Added some acks and reviews.


Adrian Hunter (9):
      mmc: core: Add parameter use_blk_mq
      mmc: block: Add error-handling comments
      mmc: block: Add blk-mq support
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
      mmc: block: blk-mq: Add support for direct completion
      mmc: block: blk-mq: Separate card polling from recovery
      mmc: block: blk-mq: Stop using card_busy_detect()
      mmc: block: blk-mq: Stop using legacy recovery

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/Kconfig               |   11 +
 drivers/mmc/core/block.c          |  850 ++++++++++++++++++++++++++-
 drivers/mmc/core/block.h          |   12 +
 drivers/mmc/core/core.c           |    7 +
 drivers/mmc/core/core.h           |    2 +
 drivers/mmc/core/host.c           |    2 +
 drivers/mmc/core/host.h           |    4 +
 drivers/mmc/core/queue.c          |  426 +++++++++++++-
 drivers/mmc/core/queue.h          |   56 ++
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1150 +++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 ++++++++
 drivers/mmc/host/sdhci-pci-core.c |  155 ++++-
 include/linux/mmc/host.h          |    2 +
 15 files changed, 2900 insertions(+), 32 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h

 
Adrian Hunter (4):
      mmc: core: Add parameter use_blk_mq
      mmc: block: Add blk-mq support
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/Kconfig               |   11 +
 drivers/mmc/core/block.c          |  801 +++++++++++++++++++++++++-
 drivers/mmc/core/block.h          |   12 +
 drivers/mmc/core/core.c           |    7 +
 drivers/mmc/core/core.h           |    2 +
 drivers/mmc/core/host.c           |    2 +
 drivers/mmc/core/host.h           |    4 +
 drivers/mmc/core/queue.c          |  426 +++++++++++++-
 drivers/mmc/core/queue.h          |   56 ++
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1150 +++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 ++++++++
 drivers/mmc/host/sdhci-pci-core.c |  155 ++++-
 include/linux/mmc/host.h          |    2 +
 15 files changed, 2852 insertions(+), 31 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h

Adrian Hunter (23):
      mmc: block: Fix missing blk_put_request()
      mmc: block: Check return value of blk_get_request()
      mmc: core: Do not leave the block driver in a suspended state
      mmc: block: Ensure that debugfs files are removed
      mmc: block: No need to export mmc_cleanup_queue()
      mmc: block: Simplify cleaning up the queue
      mmc: block: Use data timeout in card_busy_detect()
      mmc: block: Check for transfer state in card_busy_detect()
      mmc: block: Make card_busy_detect() accumulate all response error bits
      mmc: core: Make mmc_pre_req() and mmc_post_req() available
      mmc: block: Add error-handling comments
      mmc: core: Add parameter use_blk_mq
      mmc: block: Add blk-mq support
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
      mmc: block: blk-mq: Add support for direct completion
      mmc: block: blk-mq: Separate card polling from recovery
      mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
      mmc: block: blk-mq: Stop using legacy recovery
      mmc: mmc_test: Do not use mmc_start_areq() anymore
      mmc: core: Remove option not to use blk-mq
      mmc: block: Remove code no longer needed after the switch to blk-mq
      mmc: core: Remove code no longer needed after the switch to blk-mq

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/core/block.c          | 1396 +++++++++++++++++++++----------------
 drivers/mmc/core/block.h          |   12 +-
 drivers/mmc/core/bus.c            |    5 +-
 drivers/mmc/core/core.c           |  216 +-----
 drivers/mmc/core/core.h           |   39 +-
 drivers/mmc/core/debugfs.c        |    1 +
 drivers/mmc/core/host.h           |    4 +
 drivers/mmc/core/mmc_test.c       |  122 ++--
 drivers/mmc/core/queue.c          |  493 ++++++++-----
 drivers/mmc/core/queue.h          |   69 +-
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1150 ++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 +++++++
 drivers/mmc/host/sdhci-pci-core.c |  155 +++-
 include/linux/mmc/host.h          |    5 +-
 16 files changed, 2840 insertions(+), 1082 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h


Regards
Adrian


* [PATCH V14 01/24] mmc: block: Fix missing blk_put_request()
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-23  9:41   ` Linus Walleij
  2017-11-23 18:12   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 02/24] mmc: block: Check return value of blk_get_request() Adrian Hunter
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Ensure blk_get_request() is paired with blk_put_request().

Fixes: 0493f6fe5bde ("mmc: block: Move boot partition locking into a driver op")
Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index ea80ff4cd7f9..f60939858586 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -236,6 +236,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
 	req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
 	blk_execute_rq(mq->queue, NULL, req, 0);
 	ret = req_to_mmc_queue_req(req)->drv_op_result;
+	blk_put_request(req);
 
 	if (!ret) {
 		pr_info("%s: Locking boot partition ro until next power on\n",
@@ -2557,6 +2558,7 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
 		*val = ret;
 		ret = 0;
 	}
+	blk_put_request(req);
 
 	return ret;
 }
@@ -2587,6 +2589,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
 	req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
 	blk_execute_rq(mq->queue, NULL, req, 0);
 	err = req_to_mmc_queue_req(req)->drv_op_result;
+	blk_put_request(req);
 	if (err) {
 		pr_err("FAILED %d\n", err);
 		goto out_free;
-- 
1.9.1


* [PATCH V14 02/24] mmc: block: Check return value of blk_get_request()
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 01/24] mmc: block: Fix missing blk_put_request() Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-23  9:56   ` Linus Walleij
  2017-11-23 18:12   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state Adrian Hunter
                   ` (22 subsequent siblings)
  24 siblings, 2 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

blk_get_request() can fail, so always check the return value.

Fixes: 0493f6fe5bde ("mmc: block: Move boot partition locking into a driver op")
Fixes: 3ecd8cf23f88 ("mmc: block: move multi-ioctl() to use block layer")
Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index f60939858586..4a319ddbd956 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -233,6 +233,10 @@ static ssize_t power_ro_lock_store(struct device *dev,
 
 	/* Dispatch locking to the block layer */
 	req = blk_get_request(mq->queue, REQ_OP_DRV_OUT, __GFP_RECLAIM);
+	if (IS_ERR(req)) {
+		count = PTR_ERR(req);
+		goto out_put;
+	}
 	req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
 	blk_execute_rq(mq->queue, NULL, req, 0);
 	ret = req_to_mmc_queue_req(req)->drv_op_result;
@@ -249,7 +253,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
 				set_disk_ro(part_md->disk, 1);
 			}
 	}
-
+out_put:
 	mmc_blk_put(md);
 	return count;
 }
@@ -625,6 +629,10 @@ static int mmc_blk_ioctl_cmd(struct mmc_blk_data *md,
 	req = blk_get_request(mq->queue,
 		idata->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
 		__GFP_RECLAIM);
+	if (IS_ERR(req)) {
+		err = PTR_ERR(req);
+		goto cmd_done;
+	}
 	idatas[0] = idata;
 	req_to_mmc_queue_req(req)->drv_op =
 		rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
@@ -692,6 +700,10 @@ static int mmc_blk_ioctl_multi_cmd(struct mmc_blk_data *md,
 	req = blk_get_request(mq->queue,
 		idata[0]->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
 		__GFP_RECLAIM);
+	if (IS_ERR(req)) {
+		err = PTR_ERR(req);
+		goto cmd_err;
+	}
 	req_to_mmc_queue_req(req)->drv_op =
 		rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
 	req_to_mmc_queue_req(req)->drv_op_data = idata;
@@ -2551,6 +2563,8 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
 
 	/* Ask the block layer about the card status */
 	req = blk_get_request(mq->queue, REQ_OP_DRV_IN, __GFP_RECLAIM);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
 	req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_CARD_STATUS;
 	blk_execute_rq(mq->queue, NULL, req, 0);
 	ret = req_to_mmc_queue_req(req)->drv_op_result;
@@ -2585,6 +2599,10 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
 
 	/* Ask the block layer for the EXT CSD */
 	req = blk_get_request(mq->queue, REQ_OP_DRV_IN, __GFP_RECLAIM);
+	if (IS_ERR(req)) {
+		err = PTR_ERR(req);
+		goto out_free;
+	}
 	req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_EXT_CSD;
 	req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
 	blk_execute_rq(mq->queue, NULL, req, 0);
-- 
1.9.1


* [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 01/24] mmc: block: Fix missing blk_put_request() Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 02/24] mmc: block: Check return value of blk_get_request() Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-23  9:58   ` Linus Walleij
  2017-11-23 18:12   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed Adrian Hunter
                   ` (21 subsequent siblings)
  24 siblings, 2 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

The block driver must be resumed if the mmc bus fails to suspend the card.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/bus.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index a4b49e25fe96..7586ff2ad1f1 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -157,6 +157,9 @@ static int mmc_bus_suspend(struct device *dev)
 		return ret;
 
 	ret = host->bus_ops->suspend(host);
+	if (ret)
+		pm_generic_resume(dev);
+
 	return ret;
 }
 
-- 
1.9.1


* [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (2 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-23 13:22   ` Linus Walleij
  2017-11-23 18:13   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 05/24] mmc: block: No need to export mmc_cleanup_queue() Adrian Hunter
                   ` (20 subsequent siblings)
  24 siblings, 2 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

The card is not necessarily being removed, but the debugfs files must be
removed when the driver is removed, otherwise they will continue to exist
after unbinding the card from the driver, e.g.:

  # echo "mmc1:0001" > /sys/bus/mmc/drivers/mmcblk/unbind
  # cat /sys/kernel/debug/mmc1/mmc1\:0001/ext_csd
  [  173.634584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
  [  173.643356] IP: mmc_ext_csd_open+0x5e/0x170

A complication is that the debugfs_root may have already been removed, so
check for that too.

Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c   | 44 +++++++++++++++++++++++++++++++++++++-------
 drivers/mmc/core/debugfs.c |  1 +
 2 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 4a319ddbd956..ccfa98af1dd3 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -122,6 +122,10 @@ struct mmc_blk_data {
 	struct device_attribute force_ro;
 	struct device_attribute power_ro_lock;
 	int	area_type;
+
+	/* debugfs files (only in main mmc_blk_data) */
+	struct dentry *status_dentry;
+	struct dentry *ext_csd_dentry;
 };
 
 /* Device type for RPMB character devices */
@@ -2653,7 +2657,7 @@ static int mmc_ext_csd_release(struct inode *inode, struct file *file)
 	.llseek		= default_llseek,
 };
 
-static int mmc_blk_add_debugfs(struct mmc_card *card)
+static int mmc_blk_add_debugfs(struct mmc_card *card, struct mmc_blk_data *md)
 {
 	struct dentry *root;
 
@@ -2663,28 +2667,53 @@ static int mmc_blk_add_debugfs(struct mmc_card *card)
 	root = card->debugfs_root;
 
 	if (mmc_card_mmc(card) || mmc_card_sd(card)) {
-		if (!debugfs_create_file("status", S_IRUSR, root, card,
-					 &mmc_dbg_card_status_fops))
+		md->status_dentry =
+			debugfs_create_file("status", S_IRUSR, root, card,
+					    &mmc_dbg_card_status_fops);
+		if (!md->status_dentry)
 			return -EIO;
 	}
 
 	if (mmc_card_mmc(card)) {
-		if (!debugfs_create_file("ext_csd", S_IRUSR, root, card,
-					 &mmc_dbg_ext_csd_fops))
+		md->ext_csd_dentry =
+			debugfs_create_file("ext_csd", S_IRUSR, root, card,
+					    &mmc_dbg_ext_csd_fops);
+		if (!md->ext_csd_dentry)
 			return -EIO;
 	}
 
 	return 0;
 }
 
+static void mmc_blk_remove_debugfs(struct mmc_card *card,
+				   struct mmc_blk_data *md)
+{
+	if (!card->debugfs_root)
+		return;
+
+	if (!IS_ERR_OR_NULL(md->status_dentry)) {
+		debugfs_remove(md->status_dentry);
+		md->status_dentry = NULL;
+	}
+
+	if (!IS_ERR_OR_NULL(md->ext_csd_dentry)) {
+		debugfs_remove(md->ext_csd_dentry);
+		md->ext_csd_dentry = NULL;
+	}
+}
 
 #else
 
-static int mmc_blk_add_debugfs(struct mmc_card *card)
+static int mmc_blk_add_debugfs(struct mmc_card *card, struct mmc_blk_data *md)
 {
 	return 0;
 }
 
+static void mmc_blk_remove_debugfs(struct mmc_card *card,
+				   struct mmc_blk_data *md)
+{
+}
+
 #endif /* CONFIG_DEBUG_FS */
 
 static int mmc_blk_probe(struct mmc_card *card)
@@ -2724,7 +2753,7 @@ static int mmc_blk_probe(struct mmc_card *card)
 	}
 
 	/* Add two debugfs entries */
-	mmc_blk_add_debugfs(card);
+	mmc_blk_add_debugfs(card, md);
 
 	pm_runtime_set_autosuspend_delay(&card->dev, 3000);
 	pm_runtime_use_autosuspend(&card->dev);
@@ -2750,6 +2779,7 @@ static void mmc_blk_remove(struct mmc_card *card)
 {
 	struct mmc_blk_data *md = dev_get_drvdata(&card->dev);
 
+	mmc_blk_remove_debugfs(card, md);
 	mmc_blk_remove_parts(card, md);
 	pm_runtime_get_sync(&card->dev);
 	mmc_claim_host(card->host);
diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
index 01e459a34f33..0f4a7d7b2626 100644
--- a/drivers/mmc/core/debugfs.c
+++ b/drivers/mmc/core/debugfs.c
@@ -314,4 +314,5 @@ void mmc_add_card_debugfs(struct mmc_card *card)
 void mmc_remove_card_debugfs(struct mmc_card *card)
 {
 	debugfs_remove_recursive(card->debugfs_root);
+	card->debugfs_root = NULL;
 }
-- 
1.9.1


* [PATCH V14 05/24] mmc: block: No need to export mmc_cleanup_queue()
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (3 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-23 13:23   ` Linus Walleij
  2017-11-21 13:42 ` [PATCH V14 06/24] mmc: block: Simplify cleaning up the queue Adrian Hunter
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

mmc_cleanup_queue() is not used by any other module, so do not export it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/queue.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 4f33d277b125..26f8da30ebe5 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -270,7 +270,6 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 
 	mq->card = NULL;
 }
-EXPORT_SYMBOL(mmc_cleanup_queue);
 
 /**
  * mmc_queue_suspend - suspend a MMC request queue
-- 
1.9.1


* [PATCH V14 06/24] mmc: block: Simplify cleaning up the queue
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (4 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 05/24] mmc: block: No need to export mmc_cleanup_queue() Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-23 13:27   ` Linus Walleij
  2017-11-21 13:42 ` [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect() Adrian Hunter
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Use blk_cleanup_queue() to shut down the queue when the driver is removed,
and take an extra reference to the queue to prevent it being freed before
the final mmc_blk_put().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 17 ++++++++++++-----
 drivers/mmc/core/queue.c |  2 ++
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index ccfa98af1dd3..e44f6d90aeb4 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -189,7 +189,7 @@ static void mmc_blk_put(struct mmc_blk_data *md)
 	md->usage--;
 	if (md->usage == 0) {
 		int devidx = mmc_get_devidx(md->disk);
-		blk_cleanup_queue(md->queue.queue);
+		blk_put_queue(md->queue.queue);
 		ida_simple_remove(&mmc_blk_ida, devidx);
 		put_disk(md->disk);
 		kfree(md);
@@ -2156,6 +2156,17 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 
 	md->queue.blkdata = md;
 
+	/*
+	 * Keep an extra reference to the queue so that we can shutdown the
+	 * queue (i.e. call blk_cleanup_queue()) while there are still
+	 * references to the 'md'. The corresponding blk_put_queue() is in
+	 * mmc_blk_put().
+	 */
+	if (!blk_get_queue(md->queue.queue)) {
+		mmc_cleanup_queue(&md->queue);
+		goto err_putdisk;
+	}
+
 	md->disk->major	= MMC_BLOCK_MAJOR;
 	md->disk->first_minor = devidx * perdev_minors;
 	md->disk->fops = &mmc_bdops;
@@ -2471,10 +2482,6 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
 		 * from being accepted.
 		 */
 		card = md->queue.card;
-		spin_lock_irq(md->queue.queue->queue_lock);
-		queue_flag_set(QUEUE_FLAG_BYPASS, md->queue.queue);
-		spin_unlock_irq(md->queue.queue->queue_lock);
-		blk_set_queue_dying(md->queue.queue);
 		mmc_cleanup_queue(&md->queue);
 		if (md->disk->flags & GENHD_FL_UP) {
 			device_remove_file(disk_to_dev(md->disk), &md->force_ro);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 26f8da30ebe5..ae6d9da68735 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -268,6 +268,8 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
+	blk_cleanup_queue(q);
+
 	mq->card = NULL;
 }
 
-- 
1.9.1


* [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect()
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (5 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 06/24] mmc: block: Simplify cleaning up the queue Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 15:39   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 08/24] mmc: block: Check for transfer state " Adrian Hunter
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

card_busy_detect() has a 10 minute timeout. However, the correct timeout is
the data timeout. Change card_busy_detect() to use the data timeout.
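
For illustration, the data timeout is the card's timeout_ns plus
timeout_clks converted at the current clock rate, as implemented below in
mmc_blk_data_timeout_jiffies() (the numbers here are made up):

	/* timeout_ns = 150000000 (150 ms), timeout_clks = 100000,
	 * actual_clock = 50 MHz, i.e. khz = 50000
	 */
	ms = DIV_ROUND_UP(150000000, 1000000);	/* 150 ms */
	ms += DIV_ROUND_UP(100000, 50000);	/* + 2 ms -> 152 ms */
	timeout = jiffies + msecs_to_jiffies(ms);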

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 48 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index e44f6d90aeb4..0874ab3e5c92 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -63,7 +63,6 @@
 #endif
 #define MODULE_PARAM_PREFIX "mmcblk."
 
-#define MMC_BLK_TIMEOUT_MS  (10 * 60 * 1000)        /* 10 minute timeout */
 #define MMC_SANITIZE_REQ_TIMEOUT 240000
 #define MMC_EXTRACT_INDEX_FROM_ARG(x) ((x & 0x00FF0000) >> 16)
 
@@ -921,14 +920,48 @@ static int mmc_sd_num_wr_blocks(struct mmc_card *card, u32 *written_blocks)
 	return 0;
 }
 
-static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
-		bool hw_busy_detect, struct request *req, bool *gen_err)
+static unsigned int mmc_blk_clock_khz(struct mmc_host *host)
 {
-	unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms);
+	if (host->actual_clock)
+		return host->actual_clock / 1000;
+
+	/* Clock may be subject to a divisor, fudge it by a factor of 2. */
+	if (host->ios.clock)
+		return host->ios.clock / 2000;
+
+	/* How can there be no clock */
+	WARN_ON_ONCE(1);
+	return 100; /* 100 kHz is minimum possible value */
+}
+
+static unsigned long mmc_blk_data_timeout_jiffies(struct mmc_host *host,
+						  struct mmc_data *data)
+{
+	unsigned int ms = DIV_ROUND_UP(data->timeout_ns, 1000000);
+	unsigned int khz;
+
+	if (data->timeout_clks) {
+		khz = mmc_blk_clock_khz(host);
+		ms += DIV_ROUND_UP(data->timeout_clks, khz);
+	}
+
+	return msecs_to_jiffies(ms);
+}
+
+static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
+			    struct request *req, bool *gen_err)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_data *data = &mqrq->brq.data;
+	unsigned long timeout;
 	int err = 0;
 	u32 status;
 
+	timeout = jiffies + mmc_blk_data_timeout_jiffies(card->host, data);
+
 	do {
+		bool done = time_after(jiffies, timeout);
+
 		err = __mmc_send_status(card, &status, 5);
 		if (err) {
 			pr_err("%s: error %d requesting status\n",
@@ -951,7 +984,7 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 		 * Timeout if the device never becomes ready for data and never
 		 * leaves the program state.
 		 */
-		if (time_after(jiffies, timeout)) {
+		if (done) {
 			pr_err("%s: Card stuck in programming state! %s %s\n",
 				mmc_hostname(card->host),
 				req->rq_disk->disk_name, __func__);
@@ -1011,7 +1044,7 @@ static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
 		*gen_err = true;
 	}
 
-	return card_busy_detect(card, timeout_ms, use_r1b_resp, req, gen_err);
+	return card_busy_detect(card, use_r1b_resp, req, gen_err);
 }
 
 #define ERR_NOMEDIUM	3
@@ -1546,8 +1579,7 @@ static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
 			gen_err = true;
 		}
 
-		err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req,
-					&gen_err);
+		err = card_busy_detect(card, false, req, &gen_err);
 		if (err)
 			return MMC_BLK_CMD_ERR;
 	}
-- 
1.9.1


* [PATCH V14 08/24] mmc: block: Check for transfer state in card_busy_detect()
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (6 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect() Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 09/24] mmc: block: Make card_busy_detect() accumulate all response error bits Adrian Hunter
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

The card is required to return to transfer state. Since that is the state
required to start another transfer, check for that state instead of
programming state.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 0874ab3e5c92..130ce94fdaf2 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -948,6 +948,16 @@ static unsigned long mmc_blk_data_timeout_jiffies(struct mmc_host *host,
 	return msecs_to_jiffies(ms);
 }
 
+static inline bool mmc_blk_in_tran_state(u32 status)
+{
+	/*
+	 * Some cards mishandle the status bits, so make sure to check both the
+	 * busy indication and the card state.
+	 */
+	return status & R1_READY_FOR_DATA &&
+	       (R1_CURRENT_STATE(status) == R1_STATE_TRAN);
+}
+
 static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
 			    struct request *req, bool *gen_err)
 {
@@ -985,9 +995,9 @@ static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
 		 * leaves the program state.
 		 */
 		if (done) {
-			pr_err("%s: Card stuck in programming state! %s %s\n",
+			pr_err("%s: Card stuck in wrong state! %s %s status: %#x\n",
 				mmc_hostname(card->host),
-				req->rq_disk->disk_name, __func__);
+				req->rq_disk->disk_name, __func__, status);
 			return -ETIMEDOUT;
 		}
 
@@ -996,8 +1006,7 @@ static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
 		 * so make sure to check both the busy
 		 * indication and the card state.
 		 */
-	} while (!(status & R1_READY_FOR_DATA) ||
-		 (R1_CURRENT_STATE(status) == R1_STATE_PRG));
+	} while (!mmc_blk_in_tran_state(status));
 
 	return err;
 }
-- 
1.9.1


* [PATCH V14 09/24] mmc: block: Make card_busy_detect() accumulate all response error bits
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (7 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 08/24] mmc: block: Check for transfer state " Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 10/24] mmc: core: Make mmc_pre_req() and mmc_post_req() available Adrian Hunter
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Make card_busy_detect() accumulate all response error bits. Later patches
will make use of this.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 130ce94fdaf2..607774fcb9f5 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -959,7 +959,7 @@ static inline bool mmc_blk_in_tran_state(u32 status)
 }
 
 static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
-			    struct request *req, bool *gen_err)
+			    struct request *req, u32 *resp_errs)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
 	struct mmc_data *data = &mqrq->brq.data;
@@ -979,11 +979,9 @@ static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
 			return err;
 		}
 
-		if (status & R1_ERROR) {
-			pr_err("%s: %s: error sending status cmd, status %#x\n",
-				req->rq_disk->disk_name, __func__, status);
-			*gen_err = true;
-		}
+		/* Accumulate any response error bits seen */
+		if (resp_errs)
+			*resp_errs |= status;
 
 		/* We may rely on the host hw to handle busy detection.*/
 		if ((card->host->caps & MMC_CAP_WAIT_WHILE_BUSY) &&
@@ -1011,6 +1009,22 @@ static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
 	return err;
 }
 
+static int card_busy_detect_err(struct mmc_card *card, bool hw_busy_detect,
+				struct request *req, bool *gen_err)
+{
+	u32 resp_errs = 0;
+	int err;
+
+	err = card_busy_detect(card, hw_busy_detect, req, &resp_errs);
+	if (resp_errs & R1_ERROR) {
+		pr_err("%s: %s: error sending status cmd, status %#x\n",
+		       req->rq_disk->disk_name, __func__, resp_errs);
+		*gen_err = true;
+	}
+
+	return err;
+}
+
 static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
 		struct request *req, bool *gen_err, u32 *stop_status)
 {
@@ -1053,7 +1067,7 @@ static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
 		*gen_err = true;
 	}
 
-	return card_busy_detect(card, use_r1b_resp, req, gen_err);
+	return card_busy_detect_err(card, use_r1b_resp, req, gen_err);
 }
 
 #define ERR_NOMEDIUM	3
@@ -1588,7 +1602,7 @@ static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
 			gen_err = true;
 		}
 
-		err = card_busy_detect(card, false, req, &gen_err);
+		err = card_busy_detect_err(card, false, req, &gen_err);
 		if (err)
 			return MMC_BLK_CMD_ERR;
 	}
-- 
1.9.1


* [PATCH V14 10/24] mmc: core: Make mmc_pre_req() and mmc_post_req() available
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (8 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 09/24] mmc: block: Make card_busy_detect() accumulate all response error bits Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 11/24] mmc: block: Add error-handling comments Adrian Hunter
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Make mmc_pre_req() and mmc_post_req() available to the card drivers. Later
patches will make use of this.
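
For example, a card driver can then prepare the next request while the
current one is still in flight. An illustrative sketch; the surrounding
driver logic and the variable names are assumed:

	mmc_pre_req(host, next_mrq);	/* e.g. DMA map; current still running */
	/* ... current request completes ... */
	err = mmc_start_request(host, next_mrq);
	/* ... next request completes ... */
	mmc_post_req(host, next_mrq, err);	/* e.g. DMA unmap */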

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/core.c | 31 -------------------------------
 drivers/mmc/core/core.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 1f0f44f4dd5f..7ca6e4866a8b 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -658,37 +658,6 @@ bool mmc_is_req_done(struct mmc_host *host, struct mmc_request *mrq)
 EXPORT_SYMBOL(mmc_is_req_done);
 
 /**
- *	mmc_pre_req - Prepare for a new request
- *	@host: MMC host to prepare command
- *	@mrq: MMC request to prepare for
- *
- *	mmc_pre_req() is called in prior to mmc_start_req() to let
- *	host prepare for the new request. Preparation of a request may be
- *	performed while another request is running on the host.
- */
-static void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq)
-{
-	if (host->ops->pre_req)
-		host->ops->pre_req(host, mrq);
-}
-
-/**
- *	mmc_post_req - Post process a completed request
- *	@host: MMC host to post process command
- *	@mrq: MMC request to post process for
- *	@err: Error, if non zero, clean up any resources made in pre_req
- *
- *	Let the host post process a completed request. Post processing of
- *	a request may be performed while another reuqest is running.
- */
-static void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
-			 int err)
-{
-	if (host->ops->post_req)
-		host->ops->post_req(host, mrq, err);
-}
-
-/**
  * mmc_finalize_areq() - finalize an asynchronous request
  * @host: MMC host to finalize any ongoing request on
  *
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 71e6c6d7ceb7..f564ddfbe070 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -152,4 +152,35 @@ static inline void mmc_claim_host(struct mmc_host *host)
 void mmc_cqe_post_req(struct mmc_host *host, struct mmc_request *mrq);
 int mmc_cqe_recovery(struct mmc_host *host);
 
+/**
+ *	mmc_pre_req - Prepare for a new request
+ *	@host: MMC host to prepare command
+ *	@mrq: MMC request to prepare for
+ *
+ *	mmc_pre_req() is called in prior to mmc_start_req() to let
+ *	host prepare for the new request. Preparation of a request may be
+ *	performed while another request is running on the host.
+ */
+static inline void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	if (host->ops->pre_req)
+		host->ops->pre_req(host, mrq);
+}
+
+/**
+ *	mmc_post_req - Post process a completed request
+ *	@host: MMC host to post process command
+ *	@mrq: MMC request to post process for
+ *	@err: Error, if non zero, clean up any resources made in pre_req
+ *
+ *	Let the host post process a completed request. Post processing of
+ *	a request may be performed while another request is running.
+ */
+static inline void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
+				int err)
+{
+	if (host->ops->post_req)
+		host->ops->post_req(host, mrq, err);
+}
+
 #endif
-- 
1.9.1


* [PATCH V14 11/24] mmc: block: Add error-handling comments
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (9 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 10/24] mmc: core: Make mmc_pre_req() and mmc_post_req() available Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 12/24] mmc: core: Add parameter use_blk_mq Adrian Hunter
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Add error-handling comments to explain what would also be done for blk-mq
if it used the legacy error-handling.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 607774fcb9f5..56624853d3d3 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1966,7 +1966,11 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 		case MMC_BLK_SUCCESS:
 		case MMC_BLK_PARTIAL:
 			/*
-			 * A block was successfully transferred.
+			 * Reset success, and accept bytes_xfered. For
+			 * MMC_BLK_PARTIAL re-submit the remaining request. For
+			 * MMC_BLK_SUCCESS error out the remaining request (it
+			 * could not be re-submitted anyway if a next request
+			 * had already begun).
 			 */
 			mmc_blk_reset_success(md, type);
 
@@ -1986,6 +1990,14 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			}
 			break;
 		case MMC_BLK_CMD_ERR:
+			/*
+			 * For SD cards, get bytes written, but do not accept
+			 * bytes_xfered if that fails. For MMC cards accept
+			 * bytes_xfered. Then try to reset. If reset fails then
+			 * error out the remaining request, otherwise retry
+			 * once (N.B mmc_blk_reset() will not succeed twice in a
+			 * row).
+			 */
 			req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
 			if (mmc_blk_reset(md, card->host, type)) {
 				if (req_pending)
@@ -2002,11 +2014,20 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			}
 			break;
 		case MMC_BLK_RETRY:
+			/*
+			 * Do not accept bytes_xfered, but retry up to 5 times,
+			 * otherwise same as abort.
+			 */
 			retune_retry_done = brq->retune_retry_done;
 			if (retry++ < 5)
 				break;
 			/* Fall through */
 		case MMC_BLK_ABORT:
+			/*
+			 * Do not accept bytes_xfered, but try to reset. If
+			 * reset succeeds, try once more, otherwise error out
+			 * the request.
+			 */
 			if (!mmc_blk_reset(md, card->host, type))
 				break;
 			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
@@ -2015,6 +2036,13 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 		case MMC_BLK_DATA_ERR: {
 			int err;
 
+			/*
+			 * Do not accept bytes_xfered, but try to reset. If
+			 * reset succeeds, try once more. If reset fails with
+			 * ENODEV which means the partition is wrong, then error
+			 * out the request. Otherwise attempt to read one sector
+			 * at a time.
+			 */
 			err = mmc_blk_reset(md, card->host, type);
 			if (!err)
 				break;
@@ -2026,6 +2054,10 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			/* Fall through */
 		}
 		case MMC_BLK_ECC_ERR:
+			/*
+			 * Do not accept bytes_xfered. If reading more than one
+			 * sector, try reading one sector at a time.
+			 */
 			if (brq->data.blocks > 1) {
 				/* Redo read one sector at a time */
 				pr_warn("%s: retrying using single block read\n",
@@ -2047,10 +2079,12 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			}
 			break;
 		case MMC_BLK_NOMEDIUM:
+			/* Do not accept bytes_xfered. Error out the request */
 			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
 			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
 			return;
 		default:
+			/* Do not accept bytes_xfered. Error out the request */
 			pr_err("%s: Unhandled return value (%d)",
 					old_req->rq_disk->disk_name, status);
 			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-- 
1.9.1


* [PATCH V14 12/24] mmc: core: Add parameter use_blk_mq
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (10 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 11/24] mmc: block: Add error-handling comments Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 13/24] mmc: block: Add blk-mq support Adrian Hunter
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Until blk-mq support in mmc is fully implemented and tested, add a parameter
use_blk_mq, which is set to true if the config option MMC_MQ_DEFAULT is
selected, as it is by default.
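
With MMC_MQ_DEFAULT=y, blk-mq can still be disabled with the boot option
mmc_core.use_blk_mq=0, or at runtime, e.g.

  # echo 0 > /sys/module/mmc_core/parameters/use_blk_mq

Note that host->use_blk_mq is set in mmc_alloc_host(), so a runtime change
only affects hosts allocated afterwards.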

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/Kconfig      | 10 ++++++++++
 drivers/mmc/core/core.c  |  7 +++++++
 drivers/mmc/core/core.h  |  2 ++
 drivers/mmc/core/host.c  |  2 ++
 drivers/mmc/core/host.h  |  4 ++++
 include/linux/mmc/host.h |  1 +
 6 files changed, 26 insertions(+)

diff --git a/drivers/mmc/Kconfig b/drivers/mmc/Kconfig
index ec21388311db..42565562577c 100644
--- a/drivers/mmc/Kconfig
+++ b/drivers/mmc/Kconfig
@@ -12,6 +12,16 @@ menuconfig MMC
 	  If you want MMC/SD/SDIO support, you should say Y here and
 	  also to your specific host controller driver.
 
+config MMC_MQ_DEFAULT
+	bool "MMC: use blk-mq I/O path by default"
+	depends on MMC && BLOCK
+	default y
+	---help---
+	  This option enables the new blk-mq based I/O path for MMC block
+	  devices by default.  With the option the mmc_core.use_blk_mq
+	  module/boot option defaults to Y, without it to N, but it can
+	  still be overridden either way.
+
 if MMC
 
 source "drivers/mmc/core/Kconfig"
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 7ca6e4866a8b..617802f45386 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -66,6 +66,13 @@
 bool use_spi_crc = 1;
 module_param(use_spi_crc, bool, 0);
 
+#ifdef CONFIG_MMC_MQ_DEFAULT
+bool mmc_use_blk_mq = true;
+#else
+bool mmc_use_blk_mq = false;
+#endif
+module_param_named(use_blk_mq, mmc_use_blk_mq, bool, S_IWUSR | S_IRUGO);
+
 static int mmc_schedule_delayed_work(struct delayed_work *work,
 				     unsigned long delay)
 {
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index f564ddfbe070..aa87cd8d14c6 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -35,6 +35,8 @@ struct mmc_bus_ops {
 	int (*reset)(struct mmc_host *);
 };
 
+extern bool mmc_use_blk_mq;
+
 void mmc_attach_bus(struct mmc_host *host, const struct mmc_bus_ops *ops);
 void mmc_detach_bus(struct mmc_host *host);
 
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 35a9e4fd1a9f..62ef6cb0ece4 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -404,6 +404,8 @@ struct mmc_host *mmc_alloc_host(int extra, struct device *dev)
 
 	host->fixed_drv_type = -EINVAL;
 
+	host->use_blk_mq = mmc_use_blk_mq;
+
 	return host;
 }
 
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index fb689a1065ed..6eaf558e62d6 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -74,6 +74,10 @@ static inline bool mmc_card_hs400es(struct mmc_card *card)
 	return card->host->ios.enhanced_strobe;
 }
 
+static inline bool mmc_host_use_blk_mq(struct mmc_host *host)
+{
+	return host->use_blk_mq;
+}
 
 #endif
 
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index e7743eca1021..ce2075d6f429 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -380,6 +380,7 @@ struct mmc_host {
 	unsigned int		doing_retune:1;	/* re-tuning in progress */
 	unsigned int		retune_now:1;	/* do re-tuning at next req */
 	unsigned int		retune_paused:1; /* re-tuning is temporarily disabled */
+	unsigned int		use_blk_mq:1;	/* use blk-mq */
 
 	int			rescan_disable;	/* disable card detection */
 	int			rescan_entered;	/* used with nonremovable devices */
-- 
1.9.1


* [PATCH V14 13/24] mmc: block: Add blk-mq support
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (11 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 12/24] mmc: core: Add parameter use_blk_mq Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-24 10:12   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 14/24] mmc: block: Add CQE support Adrian Hunter
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Define and use a blk-mq queue. Discards and flushes are processed
synchronously, but reads and writes asynchronously. In order to support
slow DMA unmapping, DMA unmapping is not done until after the next request
is started. That means the request is not completed until then. If there is
no next request then the completion is done by queued work.
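
In outline, the issue path for reads and writes then looks like this
(illustrative pseudocode - the names are not the exact ones in the patch):

	mmc_pre_req(host, mrq);			/* e.g. DMA map */
	mmc_start_request(host, mrq);
	if (prev_req) {
		/* Complete the previous request only now */
		mmc_post_req(host, prev_mrq, 0);	/* deferred DMA unmap */
		blk_mq_end_request(prev_req, status);
	}

	/* If the request finishes and no next request has been started: */
	queue_work(wq, &mq->complete_work);	/* complete from a work item */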

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 453 ++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/mmc/core/block.h |   9 +
 drivers/mmc/core/queue.c | 291 +++++++++++++++++++++++++++---
 drivers/mmc/core/queue.h |  32 ++++
 4 files changed, 754 insertions(+), 31 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 56624853d3d3..a08d727d100b 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1276,6 +1276,14 @@ static inline void mmc_blk_reset_success(struct mmc_blk_data *md, int type)
 	md->reset_done &= ~type;
 }
 
+static void mmc_blk_end_request(struct request *req, blk_status_t error)
+{
+	if (req->mq_ctx)
+		blk_mq_end_request(req, error);
+	else
+		blk_end_request_all(req, error);
+}
+
 /*
  * The non-block commands come back from the block layer after it queued it and
  * processed it with all other requests and then they get issued in this
@@ -1337,7 +1345,7 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 		break;
 	}
 	mq_rq->drv_op_result = ret;
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
@@ -1380,7 +1388,7 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
 	else
 		mmc_blk_reset_success(md, type);
 fail:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	mmc_blk_end_request(req, status);
 }
 
 static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
@@ -1450,7 +1458,7 @@ static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
 	if (!err)
 		mmc_blk_reset_success(md, type);
 out:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	mmc_blk_end_request(req, status);
 }
 
 static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
@@ -1460,7 +1468,7 @@ static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
 	int ret = 0;
 
 	ret = mmc_flush_cache(card);
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 /*
@@ -1537,11 +1545,9 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 	}
 }
 
-static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
-					     struct mmc_async_req *areq)
+static enum mmc_blk_status __mmc_blk_err_check(struct mmc_card *card,
+					       struct mmc_queue_req *mq_mrq)
 {
-	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
-						    areq);
 	struct mmc_blk_request *brq = &mq_mrq->brq;
 	struct request *req = mmc_queue_req_to_req(mq_mrq);
 	int need_retune = card->host->need_retune;
@@ -1646,6 +1652,15 @@ static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
 	return MMC_BLK_SUCCESS;
 }
 
+static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
+					     struct mmc_async_req *areq)
+{
+	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
+						    areq);
+
+	return __mmc_blk_err_check(card, mq_mrq);
+}
+
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 			      int disable_multi, bool *do_rel_wr_p,
 			      bool *do_data_tag_p)
@@ -1838,6 +1853,428 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 	mqrq->areq.err_check = mmc_blk_err_check;
 }
 
+#define MMC_MAX_RETRIES		5
+#define MMC_NO_RETRIES		(MMC_MAX_RETRIES + 1)
+
+/* Single sector read during recovery */
+static void mmc_blk_ss_read(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	blk_status_t status;
+
+	while (1) {
+		mmc_blk_rw_rq_prep(mqrq, mq->card, 1, mq);
+
+		mmc_wait_for_req(mq->card->host, &mqrq->brq.mrq);
+
+		/*
+		 * Not expecting command errors, so just give up in that case.
+		 * If there are retries remaining, the request will get
+		 * requeued.
+		 */
+		if (mqrq->brq.cmd.error)
+			return;
+
+		if (blk_rq_bytes(req) <= 512)
+			break;
+
+		status = mqrq->brq.data.error ? BLK_STS_IOERR : BLK_STS_OK;
+
+		blk_update_request(req, status, 512);
+	}
+
+	mqrq->retries = MMC_NO_RETRIES;
+}
+
+static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
+{
+	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
+	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_card *card = mq->card;
+	enum mmc_blk_status status;
+
+	brq->retune_retry_done = mqrq->retries;
+
+	status = __mmc_blk_err_check(card, mqrq);
+
+	mmc_retune_release(card->host);
+
+	/*
+	 * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
+	 * policy:
+	 * 1. A request that has transferred at least some data is considered
+	 * successful and will be requeued if there is remaining data to
+	 * transfer.
+	 * 2. Otherwise the number of retries is incremented and the request
+	 * will be requeued if there are remaining retries.
+	 * 3. Otherwise the request will be errored out.
+	 * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
+	 * mqrq->retries. So there are only 4 possible actions here:
+	 *	1. do not accept the bytes_xfered value i.e. set it to zero
+	 *	2. change mqrq->retries to determine the number of retries
+	 *	3. try to reset the card
+	 *	4. read one sector at a time
+	 */
+	switch (status) {
+	case MMC_BLK_SUCCESS:
+	case MMC_BLK_PARTIAL:
+		/* Reset success, and accept bytes_xfered */
+		mmc_blk_reset_success(md, type);
+		break;
+	case MMC_BLK_CMD_ERR:
+		/*
+		 * For SD cards, get bytes written, but do not accept
+		 * bytes_xfered if that fails. For MMC cards accept
+		 * bytes_xfered. Then try to reset. If reset fails then
+		 * error out the remaining request, otherwise retry
+		 * once (N.B mmc_blk_reset() will not succeed twice in a
+		 * row).
+		 */
+		if (mmc_card_sd(card)) {
+			u32 blocks;
+			int err;
+
+			err = mmc_sd_num_wr_blocks(card, &blocks);
+			if (err)
+				brq->data.bytes_xfered = 0;
+			else
+				brq->data.bytes_xfered = blocks << 9;
+		}
+		if (mmc_blk_reset(md, card->host, type))
+			mqrq->retries = MMC_NO_RETRIES;
+		else
+			mqrq->retries = MMC_MAX_RETRIES - 1;
+		break;
+	case MMC_BLK_RETRY:
+		/*
+		 * Do not accept bytes_xfered, but retry up to 5 times,
+		 * otherwise same as abort.
+		 */
+		brq->data.bytes_xfered = 0;
+		if (mqrq->retries < MMC_MAX_RETRIES)
+			break;
+		/* Fall through */
+	case MMC_BLK_ABORT:
+		/*
+		 * Do not accept bytes_xfered, but try to reset. If
+		 * reset succeeds, try once more, otherwise error out
+		 * the request.
+		 */
+		brq->data.bytes_xfered = 0;
+		if (mmc_blk_reset(md, card->host, type))
+			mqrq->retries = MMC_NO_RETRIES;
+		else
+			mqrq->retries = MMC_MAX_RETRIES - 1;
+		break;
+	case MMC_BLK_DATA_ERR: {
+		int err;
+
+		/*
+		 * Do not accept bytes_xfered, but try to reset. If
+		 * reset succeeds, try once more. If reset fails with
+		 * ENODEV which means the partition is wrong, then error
+		 * out the request. Otherwise attempt to read one sector
+		 * at a time.
+		 */
+		brq->data.bytes_xfered = 0;
+		err = mmc_blk_reset(md, card->host, type);
+		if (!err) {
+			mqrq->retries = MMC_MAX_RETRIES - 1;
+			break;
+		}
+		if (err == -ENODEV) {
+			mqrq->retries = MMC_NO_RETRIES;
+			break;
+		}
+		/* Fall through */
+	}
+	case MMC_BLK_ECC_ERR:
+		/*
+		 * Do not accept bytes_xfered. If reading more than one
+		 * sector, try reading one sector at a time.
+		 */
+		brq->data.bytes_xfered = 0;
+		/* FIXME: Missing single sector read for large sector size */
+		if (brq->data.blocks > 1 && !mmc_large_sector(card)) {
+			/* Redo read one sector at a time */
+			pr_warn("%s: retrying using single block read\n",
+				req->rq_disk->disk_name);
+			mmc_blk_ss_read(mq, req);
+		} else {
+			mqrq->retries = MMC_NO_RETRIES;
+		}
+		break;
+	case MMC_BLK_NOMEDIUM:
+		/* Do not accept bytes_xfered. Error out the request */
+		brq->data.bytes_xfered = 0;
+		mqrq->retries = MMC_NO_RETRIES;
+		break;
+	default:
+		/* Do not accept bytes_xfered. Error out the request */
+		brq->data.bytes_xfered = 0;
+		mqrq->retries = MMC_NO_RETRIES;
+		pr_err("%s: Unhandled return value (%d)\n",
+		       req->rq_disk->disk_name, status);
+		break;
+	}
+}
+
+static void mmc_blk_mq_complete_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	unsigned int nr_bytes = mqrq->brq.data.bytes_xfered;
+
+	if (nr_bytes) {
+		if (blk_update_request(req, BLK_STS_OK, nr_bytes))
+			blk_mq_requeue_request(req, true);
+		else
+			__blk_mq_end_request(req, BLK_STS_OK);
+	} else if (mqrq->retries++ < MMC_MAX_RETRIES) {
+		blk_mq_requeue_request(req, true);
+	} else {
+		if (mmc_card_removed(mq->card))
+			req->rq_flags |= RQF_QUIET;
+		blk_mq_end_request(req, BLK_STS_IOERR);
+	}
+}
+
+static bool mmc_blk_urgent_bkops_needed(struct mmc_queue *mq,
+					struct mmc_queue_req *mqrq)
+{
+	return mmc_card_mmc(mq->card) && !mmc_host_is_spi(mq->card->host) &&
+	       (mqrq->brq.cmd.resp[0] & R1_EXCEPTION_EVENT ||
+		mqrq->brq.stop.resp[0] & R1_EXCEPTION_EVENT);
+}
+
+static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
+				 struct mmc_queue_req *mqrq)
+{
+	if (mmc_blk_urgent_bkops_needed(mq, mqrq))
+		mmc_start_bkops(mq->card, true);
+}
+
+void mmc_blk_mq_complete(struct request *req)
+{
+	struct mmc_queue *mq = req->q->queuedata;
+
+	mmc_blk_mq_complete_rq(mq, req);
+}
+
+static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
+				       struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mmc_blk_mq_rw_recovery(mq, req);
+
+	mmc_blk_urgent_bkops(mq, mqrq);
+}
+
+static void mmc_blk_mq_acct_req_done(struct mmc_queue *mq, struct request *req)
+{
+	struct request_queue *q = req->q;
+	unsigned long flags;
+	bool put_card;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	mq->in_flight[mmc_issue_type(mq, req)] -= 1;
+
+	put_card = (mmc_tot_in_flight(mq) == 0);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (put_card)
+		mmc_put_card(mq->card, &mq->ctx);
+}
+
+static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_host *host = mq->card->host;
+
+	mmc_post_req(host, mrq, 0);
+
+	blk_mq_complete_request(req);
+
+	mmc_blk_mq_acct_req_done(mq, req);
+}
+
+static void mmc_blk_mq_complete_prev_req(struct mmc_queue *mq,
+					 struct request **prev_req)
+{
+	mutex_lock(&mq->complete_lock);
+
+	if (!mq->complete_req)
+		goto out_unlock;
+
+	mmc_blk_mq_poll_completion(mq, mq->complete_req);
+
+	if (prev_req)
+		*prev_req = mq->complete_req;
+	else
+		mmc_blk_mq_post_req(mq, mq->complete_req);
+
+	mq->complete_req = NULL;
+
+out_unlock:
+	mutex_unlock(&mq->complete_lock);
+}
+
+void mmc_blk_mq_complete_work(struct work_struct *work)
+{
+	struct mmc_queue *mq = container_of(work, struct mmc_queue,
+					    complete_work);
+
+	mmc_blk_mq_complete_prev_req(mq, NULL);
+}
+
+static void mmc_blk_mq_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+	bool waiting;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	mq->complete_req = req;
+	mq->rw_wait = false;
+	waiting = mq->waiting;
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (waiting)
+		wake_up(&mq->wait);
+	else
+		kblockd_schedule_work(&mq->complete_work);
+}
+
+static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
+{
+	struct request_queue *q = mq->queue;
+	unsigned long flags;
+	bool done;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	done = !mq->rw_wait;
+	mq->waiting = !done;
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	return done;
+}
+
+static int mmc_blk_rw_wait(struct mmc_queue *mq, struct request **prev_req)
+{
+	int err = 0;
+
+	wait_event(mq->wait, mmc_blk_rw_wait_cond(mq, &err));
+
+	mmc_blk_mq_complete_prev_req(mq, prev_req);
+
+	return err;
+}
+
+static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
+				  struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+	struct request *prev_req = NULL;
+	int err = 0;
+
+	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
+
+	mqrq->brq.mrq.done = mmc_blk_mq_req_done;
+
+	mmc_pre_req(host, &mqrq->brq.mrq);
+
+	err = mmc_blk_rw_wait(mq, &prev_req);
+	if (err)
+		goto out_post_req;
+
+	mq->rw_wait = true;
+
+	err = mmc_start_request(host, &mqrq->brq.mrq);
+
+	if (prev_req)
+		mmc_blk_mq_post_req(mq, prev_req);
+
+	if (err) {
+		mq->rw_wait = false;
+		mmc_retune_release(host);
+	}
+
+out_post_req:
+	if (err)
+		mmc_post_req(host, &mqrq->brq.mrq, err);
+
+	return err;
+}
+
+static int mmc_blk_wait_for_idle(struct mmc_queue *mq, struct mmc_host *host)
+{
+	return mmc_blk_rw_wait(mq, NULL);
+}
+
+enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_card *card = md->queue.card;
+	struct mmc_host *host = card->host;
+	int ret;
+
+	ret = mmc_blk_part_switch(card, md->part_type);
+	if (ret)
+		return MMC_REQ_FAILED_TO_START;
+
+	switch (mmc_issue_type(mq, req)) {
+	case MMC_ISSUE_SYNC:
+		ret = mmc_blk_wait_for_idle(mq, host);
+		if (ret)
+			return MMC_REQ_BUSY;
+		switch (req_op(req)) {
+		case REQ_OP_DRV_IN:
+		case REQ_OP_DRV_OUT:
+			mmc_blk_issue_drv_op(mq, req);
+			break;
+		case REQ_OP_DISCARD:
+			mmc_blk_issue_discard_rq(mq, req);
+			break;
+		case REQ_OP_SECURE_ERASE:
+			mmc_blk_issue_secdiscard_rq(mq, req);
+			break;
+		case REQ_OP_FLUSH:
+			mmc_blk_issue_flush(mq, req);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			return MMC_REQ_FAILED_TO_START;
+		}
+		return MMC_REQ_FINISHED;
+	case MMC_ISSUE_ASYNC:
+		switch (req_op(req)) {
+		case REQ_OP_READ:
+		case REQ_OP_WRITE:
+			ret = mmc_blk_mq_issue_rw_rq(mq, req);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			ret = -EINVAL;
+		}
+		if (!ret)
+			return MMC_REQ_STARTED;
+		return ret == -EBUSY ? MMC_REQ_BUSY : MMC_REQ_FAILED_TO_START;
+	default:
+		WARN_ON_ONCE(1);
+		return MMC_REQ_FAILED_TO_START;
+	}
+}
+
 static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
 			       struct mmc_blk_request *brq, struct request *req,
 			       bool old_req_pending)
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index 5946636101ef..6d34e87b18f6 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -7,4 +7,13 @@
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
 
+enum mmc_issued;
+
+enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req);
+void mmc_blk_mq_complete(struct request *req);
+
+struct work_struct;
+
+void mmc_blk_mq_complete_work(struct work_struct *work);
+
 #endif
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index ae6d9da68735..b9c2430e9292 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -22,6 +22,7 @@
 #include "block.h"
 #include "core.h"
 #include "card.h"
+#include "host.h"
 
 /*
  * Prepare a MMC request. This just filters out odd stuff.
@@ -34,10 +35,25 @@ static int mmc_prep_request(struct request_queue *q, struct request *req)
 		return BLKPREP_KILL;
 
 	req->rq_flags |= RQF_DONTPREP;
+	req_to_mmc_queue_req(req)->retries = 0;
 
 	return BLKPREP_OK;
 }
 
+enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
+{
+	if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
+		return MMC_ISSUE_ASYNC;
+
+	return MMC_ISSUE_SYNC;
+}
+
+static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
+						 bool reserved)
+{
+	return BLK_EH_RESET_TIMER;
+}
+
 static int mmc_queue_thread(void *d)
 {
 	struct mmc_queue *mq = d;
@@ -154,11 +170,10 @@ static void mmc_queue_setup_discard(struct request_queue *q,
  * @req: the request
  * @gfp: memory allocation policy
  */
-static int mmc_init_request(struct request_queue *q, struct request *req,
-			    gfp_t gfp)
+static int __mmc_init_request(struct mmc_queue *mq, struct request *req,
+			      gfp_t gfp)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
-	struct mmc_queue *mq = q->queuedata;
 	struct mmc_card *card = mq->card;
 	struct mmc_host *host = card->host;
 
@@ -169,6 +184,12 @@ static int mmc_init_request(struct request_queue *q, struct request *req,
 	return 0;
 }
 
+static int mmc_init_request(struct request_queue *q, struct request *req,
+			    gfp_t gfp)
+{
+	return __mmc_init_request(q->queuedata, req, gfp);
+}
+
 static void mmc_exit_request(struct request_queue *q, struct request *req)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
@@ -177,6 +198,108 @@ static void mmc_exit_request(struct request_queue *q, struct request *req)
 	mq_rq->sg = NULL;
 }
 
+static int mmc_mq_init_request(struct blk_mq_tag_set *set, struct request *req,
+			       unsigned int hctx_idx, unsigned int numa_node)
+{
+	return __mmc_init_request(set->driver_data, req, GFP_KERNEL);
+}
+
+static void mmc_mq_exit_request(struct blk_mq_tag_set *set, struct request *req,
+				unsigned int hctx_idx)
+{
+	struct mmc_queue *mq = set->driver_data;
+
+	mmc_exit_request(mq->queue, req);
+}
+
+static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
+				    const struct blk_mq_queue_data *bd)
+{
+	struct request *req = bd->rq;
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_card *card = mq->card;
+	enum mmc_issue_type issue_type;
+	enum mmc_issued issued;
+	bool get_card;
+	int ret;
+
+	if (mmc_card_removed(mq->card)) {
+		req->rq_flags |= RQF_QUIET;
+		return BLK_STS_IOERR;
+	}
+
+	issue_type = mmc_issue_type(mq, req);
+
+	spin_lock_irq(q->queue_lock);
+
+	switch (issue_type) {
+	case MMC_ISSUE_ASYNC:
+		break;
+	default:
+		/*
+		 * Timeouts are handled by mmc core, and we don't have a host
+		 * API to abort requests, so we can't handle the timeout anyway.
+		 * However, when the timeout happens, blk_mq_complete_request()
+		 * no longer works (to stop the request disappearing under us).
+		 * To avoid racing with that, set a large timeout.
+		 */
+		req->timeout = 600 * HZ;
+		break;
+	}
+
+	mq->in_flight[issue_type] += 1;
+	get_card = (mmc_tot_in_flight(mq) == 1);
+
+	spin_unlock_irq(q->queue_lock);
+
+	if (!(req->rq_flags & RQF_DONTPREP)) {
+		req_to_mmc_queue_req(req)->retries = 0;
+		req->rq_flags |= RQF_DONTPREP;
+	}
+
+	if (get_card)
+		mmc_get_card(card, &mq->ctx);
+
+	blk_mq_start_request(req);
+
+	issued = mmc_blk_mq_issue_rq(mq, req);
+
+	switch (issued) {
+	case MMC_REQ_BUSY:
+		ret = BLK_STS_RESOURCE;
+		break;
+	case MMC_REQ_FAILED_TO_START:
+		ret = BLK_STS_IOERR;
+		break;
+	default:
+		ret = BLK_STS_OK;
+		break;
+	}
+
+	if (issued != MMC_REQ_STARTED) {
+		bool put_card = false;
+
+		spin_lock_irq(q->queue_lock);
+		mq->in_flight[issue_type] -= 1;
+		if (mmc_tot_in_flight(mq) == 0)
+			put_card = true;
+		spin_unlock_irq(q->queue_lock);
+		if (put_card)
+			mmc_put_card(card, &mq->ctx);
+	}
+
+	return ret;
+}
+
+static const struct blk_mq_ops mmc_mq_ops = {
+	.queue_rq	= mmc_mq_queue_rq,
+	.init_request	= mmc_mq_init_request,
+	.exit_request	= mmc_mq_exit_request,
+	.complete	= mmc_blk_mq_complete,
+	.timeout	= mmc_mq_timed_out,
+};
+
 static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 {
 	struct mmc_host *host = card->host;
@@ -198,6 +321,69 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 
 	/* Initialize thread_sem even if it is not used */
 	sema_init(&mq->thread_sem, 1);
+
+	INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work);
+
+	mutex_init(&mq->complete_lock);
+
+	init_waitqueue_head(&mq->wait);
+}
+
+static int mmc_mq_init_queue(struct mmc_queue *mq, int q_depth,
+			     const struct blk_mq_ops *mq_ops, spinlock_t *lock)
+{
+	int ret;
+
+	memset(&mq->tag_set, 0, sizeof(mq->tag_set));
+	mq->tag_set.ops = mq_ops;
+	mq->tag_set.queue_depth = q_depth;
+	mq->tag_set.numa_node = NUMA_NO_NODE;
+	mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE |
+			    BLK_MQ_F_BLOCKING;
+	mq->tag_set.nr_hw_queues = 1;
+	mq->tag_set.cmd_size = sizeof(struct mmc_queue_req);
+	mq->tag_set.driver_data = mq;
+
+	ret = blk_mq_alloc_tag_set(&mq->tag_set);
+	if (ret)
+		return ret;
+
+	mq->queue = blk_mq_init_queue(&mq->tag_set);
+	if (IS_ERR(mq->queue)) {
+		ret = PTR_ERR(mq->queue);
+		goto free_tag_set;
+	}
+
+	mq->queue->queue_lock = lock;
+	mq->queue->queuedata = mq;
+
+	return 0;
+
+free_tag_set:
+	blk_mq_free_tag_set(&mq->tag_set);
+
+	return ret;
+}
+
+#define MMC_QUEUE_DEPTH 64
+
+static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
+			 spinlock_t *lock)
+{
+	int q_depth;
+	int ret;
+
+	q_depth = MMC_QUEUE_DEPTH;
+
+	ret = mmc_mq_init_queue(mq, q_depth, &mmc_mq_ops, lock);
+	if (ret)
+		return ret;
+
+	blk_queue_rq_timeout(mq->queue, 60 * HZ);
+
+	mmc_setup_queue(mq, card);
+
+	return 0;
 }
 
 /**
@@ -216,6 +402,10 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	int ret = -ENOMEM;
 
 	mq->card = card;
+
+	if (mmc_host_use_blk_mq(host))
+		return mmc_mq_init(mq, card, lock);
+
 	mq->queue = blk_alloc_queue(GFP_KERNEL);
 	if (!mq->queue)
 		return -ENOMEM;
@@ -251,11 +441,70 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	return ret;
 }
 
+static void mmc_mq_queue_suspend(struct mmc_queue *mq)
+{
+	blk_mq_quiesce_queue(mq->queue);
+
+	/*
+	 * The host remains claimed while there are outstanding requests, so
+	 * simply claiming and releasing here ensures there are none.
+	 */
+	mmc_claim_host(mq->card->host);
+	mmc_release_host(mq->card->host);
+}
+
+static void mmc_mq_queue_resume(struct mmc_queue *mq)
+{
+	blk_mq_unquiesce_queue(mq->queue);
+}
+
+static void __mmc_queue_suspend(struct mmc_queue *mq)
+{
+	struct request_queue *q = mq->queue;
+	unsigned long flags;
+
+	if (!mq->suspended) {
+		mq->suspended |= true;
+
+		spin_lock_irqsave(q->queue_lock, flags);
+		blk_stop_queue(q);
+		spin_unlock_irqrestore(q->queue_lock, flags);
+
+		down(&mq->thread_sem);
+	}
+}
+
+static void __mmc_queue_resume(struct mmc_queue *mq)
+{
+	struct request_queue *q = mq->queue;
+	unsigned long flags;
+
+	if (mq->suspended) {
+		mq->suspended = false;
+
+		up(&mq->thread_sem);
+
+		spin_lock_irqsave(q->queue_lock, flags);
+		blk_start_queue(q);
+		spin_unlock_irqrestore(q->queue_lock, flags);
+	}
+}
+
 void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
 	unsigned long flags;
 
+	if (q->mq_ops) {
+		/*
+		 * The legacy code handled the possibility of being suspended,
+		 * so do that here too.
+		 */
+		if (blk_queue_quiesced(q))
+			blk_mq_unquiesce_queue(q);
+		goto out_cleanup;
+	}
+
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
 
@@ -268,8 +517,16 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
+out_cleanup:
 	blk_cleanup_queue(q);
 
+	/*
+	 * A request can be completed before the next request, potentially
+	 * leaving a complete_work with nothing to do. Such a work item might
+	 * still be queued at this point. Flush it.
+	 */
+	flush_work(&mq->complete_work);
+
 	mq->card = NULL;
 }
 
@@ -284,17 +541,11 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 void mmc_queue_suspend(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (!mq->suspended) {
-		mq->suspended |= true;
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
 
-		down(&mq->thread_sem);
-	}
+	if (q->mq_ops)
+		mmc_mq_queue_suspend(mq);
+	else
+		__mmc_queue_suspend(mq);
 }
 
 /**
@@ -304,17 +555,11 @@ void mmc_queue_suspend(struct mmc_queue *mq)
 void mmc_queue_resume(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
-	if (mq->suspended) {
-		mq->suspended = false;
-
-		up(&mq->thread_sem);
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
+	if (q->mq_ops)
+		mmc_mq_queue_resume(mq);
+	else
+		__mmc_queue_resume(mq);
 }
 
 /*
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 547b457c4251..ce9249852f26 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -8,6 +8,19 @@
 #include <linux/mmc/core.h>
 #include <linux/mmc/host.h>
 
+enum mmc_issued {
+	MMC_REQ_STARTED,
+	MMC_REQ_BUSY,
+	MMC_REQ_FAILED_TO_START,
+	MMC_REQ_FINISHED,
+};
+
+enum mmc_issue_type {
+	MMC_ISSUE_SYNC,
+	MMC_ISSUE_ASYNC,
+	MMC_ISSUE_MAX,
+};
+
 static inline struct mmc_queue_req *req_to_mmc_queue_req(struct request *rq)
 {
 	return blk_mq_rq_to_pdu(rq);
@@ -57,12 +70,15 @@ struct mmc_queue_req {
 	int			drv_op_result;
 	void			*drv_op_data;
 	unsigned int		ioc_count;
+	int			retries;
 };
 
 struct mmc_queue {
 	struct mmc_card		*card;
 	struct task_struct	*thread;
 	struct semaphore	thread_sem;
+	struct mmc_ctx		ctx;
+	struct blk_mq_tag_set	tag_set;
 	bool			suspended;
 	bool			asleep;
 	struct mmc_blk_data	*blkdata;
@@ -74,6 +90,14 @@ struct mmc_queue {
 	 * associated mmc_queue_req data.
 	 */
 	int			qcnt;
+
+	int			in_flight[MMC_ISSUE_MAX];
+	bool			rw_wait;
+	bool			waiting;
+	wait_queue_head_t	wait;
+	struct request		*complete_req;
+	struct mutex		complete_lock;
+	struct work_struct	complete_work;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
@@ -84,4 +108,12 @@ extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
 extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
 				     struct mmc_queue_req *);
 
+enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req);
+
+static inline int mmc_tot_in_flight(struct mmc_queue *mq)
+{
+	return mq->in_flight[MMC_ISSUE_SYNC] +
+	       mq->in_flight[MMC_ISSUE_ASYNC];
+}
+
 #endif
-- 
1.9.1


* [PATCH V14 14/24] mmc: block: Add CQE support
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (12 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 13/24] mmc: block: Add blk-mq support Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-28 11:20   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 15/24] mmc: cqhci: support for command queue enabled host Adrian Hunter
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Add CQE support to the block driver, including:
    - optionally using DCMD for flush requests
    - "manually" issuing discard requests
    - issuing read / write requests to the CQE
    - supporting block-layer timeouts
    - handling recovery
    - supporting re-tuning

CQE offers 25% - 50% better random multi-threaded I/O.  There is a slight
(e.g. 2%) drop in sequential read speed but no observable change to sequential
write.

CQE automatically sends the commands to complete requests.  However it only
supports reads / writes and so-called "direct commands" (DCMD).  Furthermore,
DCMD is limited to one command at a time, but discards require 3 commands.
That makes issuing discards through CQE very awkward, and some CQEs don't
support DCMD anyway.  So for discards, the existing non-CQE approach is
taken, where the mmc core code issues the 3 commands one at a time i.e.
mmc_erase().  DCMD is used for issuing flushes.
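
For reference, this is roughly the three-command sequence that mmc_erase()
ends up issuing for a discard, modelled as a standalone sketch (the opcodes
are the standard eMMC erase commands; send_cmd() and the argument value are
illustrative assumptions, not the core's actual API):

#include <stdio.h>

#define MMC_ERASE_GROUP_START	35	/* CMD35: first address */
#define MMC_ERASE_GROUP_END	36	/* CMD36: last address */
#define MMC_ERASE		38	/* CMD38: perform the operation */
#define DISCARD_ARG		0x3	/* assumed discard argument */

static void send_cmd(unsigned int opcode, unsigned int arg)
{
	printf("CMD%u arg=0x%08x\n", opcode, arg);
}

static void discard(unsigned int from, unsigned int to)
{
	send_cmd(MMC_ERASE_GROUP_START, from);
	send_cmd(MMC_ERASE_GROUP_END, to);
	send_cmd(MMC_ERASE, DISCARD_ARG);
}

int main(void)
{
	discard(0, 1023);
	return 0;
}

That is why a discard cannot fit in one DCMD slot, whereas a flush is a
single SWITCH command and can, as mmc_blk_cqe_issue_flush() below shows.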

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 150 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/mmc/core/block.h |   2 +
 drivers/mmc/core/queue.c | 158 +++++++++++++++++++++++++++++++++++++++++++++--
 drivers/mmc/core/queue.h |  18 ++++++
 4 files changed, 322 insertions(+), 6 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index a08d727d100b..2aacd3fa0d1a 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -111,6 +111,7 @@ struct mmc_blk_data {
 #define MMC_BLK_WRITE		BIT(1)
 #define MMC_BLK_DISCARD		BIT(2)
 #define MMC_BLK_SECDISCARD	BIT(3)
+#define MMC_BLK_CQE_RECOVERY	BIT(4)
 
 	/*
 	 * Only set in main mmc_blk_data associated
@@ -1785,6 +1786,138 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		*do_data_tag_p = do_data_tag;
 }
 
+#define MMC_CQE_RETRIES 2
+
+static void mmc_blk_cqe_complete_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct request_queue *q = req->q;
+	struct mmc_host *host = mq->card->host;
+	unsigned long flags;
+	bool put_card;
+	int err;
+
+	mmc_cqe_post_req(host, mrq);
+
+	if (mrq->cmd && mrq->cmd->error)
+		err = mrq->cmd->error;
+	else if (mrq->data && mrq->data->error)
+		err = mrq->data->error;
+	else
+		err = 0;
+
+	if (err) {
+		if (mqrq->retries++ < MMC_CQE_RETRIES)
+			blk_mq_requeue_request(req, true);
+		else
+			blk_mq_end_request(req, BLK_STS_IOERR);
+	} else if (mrq->data) {
+		if (blk_update_request(req, BLK_STS_OK, mrq->data->bytes_xfered))
+			blk_mq_requeue_request(req, true);
+		else
+			__blk_mq_end_request(req, BLK_STS_OK);
+	} else {
+		blk_mq_end_request(req, BLK_STS_OK);
+	}
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	mq->in_flight[mmc_issue_type(mq, req)] -= 1;
+
+	put_card = (mmc_tot_in_flight(mq) == 0);
+
+	mmc_cqe_check_busy(mq);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (!mq->cqe_busy)
+		blk_mq_run_hw_queues(q, true);
+
+	if (put_card)
+		mmc_put_card(mq->card, &mq->ctx);
+}
+
+void mmc_blk_cqe_recovery(struct mmc_queue *mq)
+{
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	int err;
+
+	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
+
+	err = mmc_cqe_recovery(host);
+	if (err)
+		mmc_blk_reset(mq->blkdata, host, MMC_BLK_CQE_RECOVERY);
+	else
+		mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
+
+	pr_debug("%s: CQE recovery done\n", mmc_hostname(host));
+}
+
+static void mmc_blk_cqe_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
+}
+
+static int mmc_blk_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	mrq->done		= mmc_blk_cqe_req_done;
+	mrq->recovery_notifier	= mmc_cqe_recovery_notifier;
+
+	return mmc_cqe_start_req(host, mrq);
+}
+
+static struct mmc_request *mmc_blk_cqe_prep_dcmd(struct mmc_queue_req *mqrq,
+						 struct request *req)
+{
+	struct mmc_blk_request *brq = &mqrq->brq;
+
+	memset(brq, 0, sizeof(*brq));
+
+	brq->mrq.cmd = &brq->cmd;
+	brq->mrq.tag = req->tag;
+
+	return &brq->mrq;
+}
+
+static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = mmc_blk_cqe_prep_dcmd(mqrq, req);
+
+	mrq->cmd->opcode = MMC_SWITCH;
+	mrq->cmd->arg = (MMC_SWITCH_MODE_WRITE_BYTE << 24) |
+			(EXT_CSD_FLUSH_CACHE << 16) |
+			(1 << 8) |
+			EXT_CSD_CMD_SET_NORMAL;
+	mrq->cmd->flags = MMC_CMD_AC | MMC_RSP_R1B;
+
+	return mmc_blk_cqe_start_req(mq->card->host, mrq);
+}
+
+static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
+
+	return mmc_blk_cqe_start_req(mq->card->host, &mqrq->brq.mrq);
+}
+
 static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 			       struct mmc_card *card,
 			       int disable_multi,
@@ -2059,7 +2192,10 @@ void mmc_blk_mq_complete(struct request *req)
 {
 	struct mmc_queue *mq = req->q->queuedata;
 
-	mmc_blk_mq_complete_rq(mq, req);
+	if (mq->use_cqe)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		mmc_blk_mq_complete_rq(mq, req);
 }
 
 static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
@@ -2218,6 +2354,9 @@ static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
 
 static int mmc_blk_wait_for_idle(struct mmc_queue *mq, struct mmc_host *host)
 {
+	if (mq->use_cqe)
+		return host->cqe_ops->cqe_wait_for_idle(host);
+
 	return mmc_blk_rw_wait(mq, NULL);
 }
 
@@ -2256,11 +2395,18 @@ enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
 			return MMC_REQ_FAILED_TO_START;
 		}
 		return MMC_REQ_FINISHED;
+	case MMC_ISSUE_DCMD:
 	case MMC_ISSUE_ASYNC:
 		switch (req_op(req)) {
+		case REQ_OP_FLUSH:
+			ret = mmc_blk_cqe_issue_flush(mq, req);
+			break;
 		case REQ_OP_READ:
 		case REQ_OP_WRITE:
-			ret = mmc_blk_mq_issue_rw_rq(mq, req);
+			if (mq->use_cqe)
+				ret = mmc_blk_cqe_issue_rw_rq(mq, req);
+			else
+				ret = mmc_blk_mq_issue_rw_rq(mq, req);
 			break;
 		default:
 			WARN_ON_ONCE(1);
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index 6d34e87b18f6..f472ce5d5647 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -7,6 +7,8 @@
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
 
+void mmc_blk_cqe_recovery(struct mmc_queue *mq);
+
 enum mmc_issued;
 
 enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index b9c2430e9292..d0eae15261d7 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -40,18 +40,142 @@ static int mmc_prep_request(struct request_queue *q, struct request *req)
 	return BLKPREP_OK;
 }
 
+static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
+{
+	/* Allow only 1 DCMD at a time */
+	return mq->in_flight[MMC_ISSUE_DCMD];
+}
+
+void mmc_cqe_check_busy(struct mmc_queue *mq)
+{
+	if ((mq->cqe_busy & MMC_CQE_DCMD_BUSY) && !mmc_cqe_dcmd_busy(mq))
+		mq->cqe_busy &= ~MMC_CQE_DCMD_BUSY;
+
+	mq->cqe_busy &= ~MMC_CQE_QUEUE_FULL;
+}
+
+static inline bool mmc_cqe_can_dcmd(struct mmc_host *host)
+{
+	return host->caps2 & MMC_CAP2_CQE_DCMD;
+}
+
+enum mmc_issue_type mmc_cqe_issue_type(struct mmc_host *host,
+				       struct request *req)
+{
+	switch (req_op(req)) {
+	case REQ_OP_DRV_IN:
+	case REQ_OP_DRV_OUT:
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+		return MMC_ISSUE_SYNC;
+	case REQ_OP_FLUSH:
+		return mmc_cqe_can_dcmd(host) ? MMC_ISSUE_DCMD : MMC_ISSUE_SYNC;
+	default:
+		return MMC_ISSUE_ASYNC;
+	}
+}
+
 enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
 {
+	struct mmc_host *host = mq->card->host;
+
+	if (mq->use_cqe)
+		return mmc_cqe_issue_type(host, req);
+
 	if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
 		return MMC_ISSUE_ASYNC;
 
 	return MMC_ISSUE_SYNC;
 }
 
+static void __mmc_cqe_recovery_notifier(struct mmc_queue *mq)
+{
+	if (!mq->recovery_needed) {
+		mq->recovery_needed = true;
+		schedule_work(&mq->recovery_work);
+	}
+}
+
+void mmc_cqe_recovery_notifier(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	__mmc_cqe_recovery_notifier(mq);
+	spin_unlock_irqrestore(q->queue_lock, flags);
+}
+
+static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_queue *mq = req->q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	enum mmc_issue_type issue_type = mmc_issue_type(mq, req);
+	bool recovery_needed = false;
+
+	switch (issue_type) {
+	case MMC_ISSUE_ASYNC:
+	case MMC_ISSUE_DCMD:
+		if (host->cqe_ops->cqe_timeout(host, mrq, &recovery_needed)) {
+			if (recovery_needed)
+				__mmc_cqe_recovery_notifier(mq);
+			return BLK_EH_RESET_TIMER;
+		}
+		/* No timeout */
+		return BLK_EH_HANDLED;
+	default:
+		/* Timeout is handled by mmc core */
+		return BLK_EH_RESET_TIMER;
+	}
+}
+
 static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
 						 bool reserved)
 {
-	return BLK_EH_RESET_TIMER;
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	if (mq->recovery_needed || !mq->use_cqe)
+		ret = BLK_EH_RESET_TIMER;
+	else
+		ret = mmc_cqe_timed_out(req);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	return ret;
+}
+
+static void mmc_mq_recovery_handler(struct work_struct *work)
+{
+	struct mmc_queue *mq = container_of(work, struct mmc_queue,
+					    recovery_work);
+	struct request_queue *q = mq->queue;
+
+	mmc_get_card(mq->card, &mq->ctx);
+
+	mq->in_recovery = true;
+
+	mmc_blk_cqe_recovery(mq);
+
+	mq->in_recovery = false;
+
+	spin_lock_irq(q->queue_lock);
+	mq->recovery_needed = false;
+	spin_unlock_irq(q->queue_lock);
+
+	mmc_put_card(mq->card, &mq->ctx);
+
+	blk_mq_run_hw_queues(q, true);
 }
 
 static int mmc_queue_thread(void *d)
@@ -219,9 +343,10 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 	struct request_queue *q = req->q;
 	struct mmc_queue *mq = q->queuedata;
 	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
 	enum mmc_issue_type issue_type;
 	enum mmc_issued issued;
-	bool get_card;
+	bool get_card, cqe_retune_ok;
 	int ret;
 
 	if (mmc_card_removed(mq->card)) {
@@ -233,7 +358,19 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	spin_lock_irq(q->queue_lock);
 
+	if (mq->recovery_needed) {
+		spin_unlock_irq(q->queue_lock);
+		return BLK_STS_RESOURCE;
+	}
+
 	switch (issue_type) {
+	case MMC_ISSUE_DCMD:
+		if (mmc_cqe_dcmd_busy(mq)) {
+			mq->cqe_busy |= MMC_CQE_DCMD_BUSY;
+			spin_unlock_irq(q->queue_lock);
+			return BLK_STS_RESOURCE;
+		}
+		break;
 	case MMC_ISSUE_ASYNC:
 		break;
 	default:
@@ -250,6 +387,7 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	mq->in_flight[issue_type] += 1;
 	get_card = (mmc_tot_in_flight(mq) == 1);
+	cqe_retune_ok = (mmc_cqe_qcnt(mq) == 1);
 
 	spin_unlock_irq(q->queue_lock);
 
@@ -261,6 +399,11 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (get_card)
 		mmc_get_card(card, &mq->ctx);
 
+	if (mq->use_cqe) {
+		host->retune_now = host->need_retune && cqe_retune_ok &&
+				   !host->hold_retune;
+	}
+
 	blk_mq_start_request(req);
 
 	issued = mmc_blk_mq_issue_rq(mq, req);
@@ -322,6 +465,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 	/* Initialize thread_sem even if it is not used */
 	sema_init(&mq->thread_sem, 1);
 
+	INIT_WORK(&mq->recovery_work, mmc_mq_recovery_handler);
 	INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work);
 
 	mutex_init(&mq->complete_lock);
@@ -370,10 +514,14 @@ static int mmc_mq_init_queue(struct mmc_queue *mq, int q_depth,
 static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
 			 spinlock_t *lock)
 {
+	struct mmc_host *host = card->host;
 	int q_depth;
 	int ret;
 
-	q_depth = MMC_QUEUE_DEPTH;
+	if (mq->use_cqe)
+		q_depth = min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
+	else
+		q_depth = MMC_QUEUE_DEPTH;
 
 	ret = mmc_mq_init_queue(mq, q_depth, &mmc_mq_ops, lock);
 	if (ret)
@@ -403,7 +551,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 
 	mq->card = card;
 
-	if (mmc_host_use_blk_mq(host))
+	mq->use_cqe = host->cqe_enabled;
+
+	if (mq->use_cqe || mmc_host_use_blk_mq(host))
 		return mmc_mq_init(mq, card, lock);
 
 	mq->queue = blk_alloc_queue(GFP_KERNEL);
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index ce9249852f26..1d7d3b0afff8 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -17,6 +17,7 @@ enum mmc_issued {
 
 enum mmc_issue_type {
 	MMC_ISSUE_SYNC,
+	MMC_ISSUE_DCMD,
 	MMC_ISSUE_ASYNC,
 	MMC_ISSUE_MAX,
 };
@@ -92,8 +93,15 @@ struct mmc_queue {
 	int			qcnt;
 
 	int			in_flight[MMC_ISSUE_MAX];
+	unsigned int		cqe_busy;
+#define MMC_CQE_DCMD_BUSY	BIT(0)
+#define MMC_CQE_QUEUE_FULL	BIT(1)
+	bool			use_cqe;
+	bool			recovery_needed;
+	bool			in_recovery;
 	bool			rw_wait;
 	bool			waiting;
+	struct work_struct	recovery_work;
 	wait_queue_head_t	wait;
 	struct request		*complete_req;
 	struct mutex		complete_lock;
@@ -108,11 +116,21 @@ extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
 extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
 				     struct mmc_queue_req *);
 
+void mmc_cqe_check_busy(struct mmc_queue *mq);
+void mmc_cqe_recovery_notifier(struct mmc_request *mrq);
+
 enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req);
 
 static inline int mmc_tot_in_flight(struct mmc_queue *mq)
 {
 	return mq->in_flight[MMC_ISSUE_SYNC] +
+	       mq->in_flight[MMC_ISSUE_DCMD] +
+	       mq->in_flight[MMC_ISSUE_ASYNC];
+}
+
+static inline int mmc_cqe_qcnt(struct mmc_queue *mq)
+{
+	return mq->in_flight[MMC_ISSUE_DCMD] +
 	       mq->in_flight[MMC_ISSUE_ASYNC];
 }
 
-- 
1.9.1


* [PATCH V14 15/24] mmc: cqhci: support for command queue enabled host
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (13 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 14/24] mmc: block: Add CQE support Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 16/24] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

From: Venkat Gopalakrishnan <venkatg@codeaurora.org>

This patch adds CMDQ support for command-queue compatible
hosts.

Command queueing was added in the eMMC 5.1 specification.  It
enables the controller to process up to 32 requests at a time.
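
As a rough model of what "up to 32 requests at a time" means at the
interface level: each request occupies one of 32 slots, software writes a
task descriptor for the slot and rings that slot's doorbell bit, and the
controller reports completion per slot. A standalone C sketch under those
assumptions (register semantics are simplified; only the slot / doorbell
idea is taken from the interface):

#include <stdio.h>
#include <stdint.h>

#define NUM_SLOTS	32
#define DCMD_SLOT	31	/* last slot reserved for direct commands */

static uint64_t task_desc[NUM_SLOTS];	/* one task descriptor per slot */
static uint32_t doorbell;		/* models the doorbell: bit per slot */

static void submit(int slot, uint64_t desc)
{
	task_desc[slot] = desc;		/* descriptor must be visible first */
	doorbell |= (uint32_t)1 << slot; /* then ring that slot's bit */
	printf("rang doorbell for slot %d\n", slot);
}

int main(void)
{
	submit(0, 0x1);			/* read / write, tag 0 */
	submit(DCMD_SLOT, 0x2);		/* DCMD, e.g. a cache flush */
	return 0;
}

In the driver below, cqhci_tag() maps direct commands to the reserved
DCMD_SLOT (31) and everything else to its block-layer tag.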

Adrian Hunter contributed renaming to cqhci, recovery, suspend
and resume, cqhci_off, cqhci_wait_for_idle, and external timeout
handling.

Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org>
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/Kconfig  |   13 +
 drivers/mmc/host/Makefile |    1 +
 drivers/mmc/host/cqhci.c  | 1150 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h  |  240 ++++++++++
 4 files changed, 1404 insertions(+)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 567028c9219a..3092b7085cb5 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -857,6 +857,19 @@ config MMC_SUNXI
 	  This selects support for the SD/MMC Host Controller on
 	  Allwinner sunxi SoCs.
 
+config MMC_CQHCI
+	tristate "Command Queue Host Controller Interface support"
+	depends on HAS_DMA
+	help
+	  This selects the Command Queue Host Controller Interface (CQHCI)
+	  support present in host controllers of Qualcomm Technologies, Inc
+	  amongst others.
+	  This controller supports eMMC devices with command queue support.
+
+	  If you have a controller with this interface, say Y or M here.
+
+	  If unsure, say N.
+
 config MMC_TOSHIBA_PCI
 	tristate "Toshiba Type A SD/MMC Card Interface Driver"
 	depends on PCI
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index a43cf0d5a5d3..407a011026cd 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -92,6 +92,7 @@ obj-$(CONFIG_MMC_SDHCI_ST)		+= sdhci-st.o
 obj-$(CONFIG_MMC_SDHCI_MICROCHIP_PIC32)	+= sdhci-pic32.o
 obj-$(CONFIG_MMC_SDHCI_BRCMSTB)		+= sdhci-brcmstb.o
 obj-$(CONFIG_MMC_SDHCI_OMAP)		+= sdhci-omap.o
+obj-$(CONFIG_MMC_CQHCI)			+= cqhci.o
 
 ifeq ($(CONFIG_CB710_DEBUG),y)
 	CFLAGS-cb710-mmc	+= -DDEBUG
diff --git a/drivers/mmc/host/cqhci.c b/drivers/mmc/host/cqhci.c
new file mode 100644
index 000000000000..159270e947cf
--- /dev/null
+++ b/drivers/mmc/host/cqhci.c
@@ -0,0 +1,1150 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/delay.h>
+#include <linux/highmem.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/dma-mapping.h>
+#include <linux/slab.h>
+#include <linux/scatterlist.h>
+#include <linux/platform_device.h>
+#include <linux/ktime.h>
+
+#include <linux/mmc/mmc.h>
+#include <linux/mmc/host.h>
+#include <linux/mmc/card.h>
+
+#include "cqhci.h"
+
+#define DCMD_SLOT 31
+#define NUM_SLOTS 32
+
+struct cqhci_slot {
+	struct mmc_request *mrq;
+	unsigned int flags;
+#define CQHCI_EXTERNAL_TIMEOUT	BIT(0)
+#define CQHCI_COMPLETED		BIT(1)
+#define CQHCI_HOST_CRC		BIT(2)
+#define CQHCI_HOST_TIMEOUT	BIT(3)
+#define CQHCI_HOST_OTHER	BIT(4)
+};
+
+static inline u8 *get_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->desc_base + (tag * cq_host->slot_sz);
+}
+
+static inline u8 *get_link_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *desc = get_desc(cq_host, tag);
+
+	return desc + cq_host->task_desc_len;
+}
+
+static inline dma_addr_t get_trans_desc_dma(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_dma_base +
+		(cq_host->mmc->max_segs * tag *
+		 cq_host->trans_desc_len);
+}
+
+static inline u8 *get_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_base +
+		(cq_host->trans_desc_len * cq_host->mmc->max_segs * tag);
+}
+
+static void setup_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *link_temp;
+	dma_addr_t trans_temp;
+
+	link_temp = get_link_desc(cq_host, tag);
+	trans_temp = get_trans_desc_dma(cq_host, tag);
+
+	memset(link_temp, 0, cq_host->link_desc_len);
+	if (cq_host->link_desc_len > 8)
+		*(link_temp + 8) = 0;
+
+	if (tag == DCMD_SLOT && (cq_host->mmc->caps2 & MMC_CAP2_CQE_DCMD)) {
+		*link_temp = CQHCI_VALID(0) | CQHCI_ACT(0) | CQHCI_END(1);
+		return;
+	}
+
+	*link_temp = CQHCI_VALID(1) | CQHCI_ACT(0x6) | CQHCI_END(0);
+
+	if (cq_host->dma64) {
+		__le64 *data_addr = (__le64 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le64(trans_temp);
+	} else {
+		__le32 *data_addr = (__le32 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le32(trans_temp);
+	}
+}
+
+static void cqhci_set_irqs(struct cqhci_host *cq_host, u32 set)
+{
+	cqhci_writel(cq_host, set, CQHCI_ISTE);
+	cqhci_writel(cq_host, set, CQHCI_ISGE);
+}
+
+#define DRV_NAME "cqhci"
+
+#define CQHCI_DUMP(f, x...) \
+	pr_err("%s: " DRV_NAME ": " f, mmc_hostname(mmc), ## x)
+
+static void cqhci_dumpregs(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+
+	CQHCI_DUMP("============ CQHCI REGISTER DUMP ===========\n");
+
+	CQHCI_DUMP("Caps:      0x%08x | Version:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CAP),
+		   cqhci_readl(cq_host, CQHCI_VER));
+	CQHCI_DUMP("Config:    0x%08x | Control:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CFG),
+		   cqhci_readl(cq_host, CQHCI_CTL));
+	CQHCI_DUMP("Int stat:  0x%08x | Int enab: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_IS),
+		   cqhci_readl(cq_host, CQHCI_ISTE));
+	CQHCI_DUMP("Int sig:   0x%08x | Int Coal: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_ISGE),
+		   cqhci_readl(cq_host, CQHCI_IC));
+	CQHCI_DUMP("TDL base:  0x%08x | TDL up32: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDLBA),
+		   cqhci_readl(cq_host, CQHCI_TDLBAU));
+	CQHCI_DUMP("Doorbell:  0x%08x | TCN:      0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDBR),
+		   cqhci_readl(cq_host, CQHCI_TCN));
+	CQHCI_DUMP("Dev queue: 0x%08x | Dev Pend: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_DQS),
+		   cqhci_readl(cq_host, CQHCI_DPT));
+	CQHCI_DUMP("Task clr:  0x%08x | SSC1:     0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TCLR),
+		   cqhci_readl(cq_host, CQHCI_SSC1));
+	CQHCI_DUMP("SSC2:      0x%08x | DCMD rsp: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_SSC2),
+		   cqhci_readl(cq_host, CQHCI_CRDCT));
+	CQHCI_DUMP("RED mask:  0x%08x | TERRI:    0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_RMEM),
+		   cqhci_readl(cq_host, CQHCI_TERRI));
+	CQHCI_DUMP("Resp idx:  0x%08x | Resp arg: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CRI),
+		   cqhci_readl(cq_host, CQHCI_CRA));
+
+	if (cq_host->ops->dumpregs)
+		cq_host->ops->dumpregs(mmc);
+	else
+		CQHCI_DUMP(": ===========================================\n");
+}
+
+/**
+ * The allocated descriptor table for task, link & transfer descriptors
+ * looks like:
+ * |----------|
+ * |task desc |  |->|----------|
+ * |----------|  |  |trans desc|
+ * |link desc-|->|  |----------|
+ * |----------|          .
+ *      .                .
+ *  no. of slots      max-segs
+ *      .           |----------|
+ * |----------|     |----------|
+ * The idea here is to create the [task+trans] table and mark & point the
+ * link desc to the transfer desc table on a per slot basis.
+ */
+static int cqhci_host_alloc_tdl(struct cqhci_host *cq_host)
+{
+	int i = 0;
+
+	/* task descriptor can be 64/128 bit irrespective of arch */
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128) {
+		cqhci_writel(cq_host, cqhci_readl(cq_host, CQHCI_CFG) |
+			       CQHCI_TASK_DESC_SZ, CQHCI_CFG);
+		cq_host->task_desc_len = 16;
+	} else {
+		cq_host->task_desc_len = 8;
+	}
+
+	/*
+	 * A transfer descriptor can be 96 bits long instead of 128 bits,
+	 * which means ADMA would expect the next valid descriptor at the
+	 * 96th bit or 128th bit
+	 */
+	if (cq_host->dma64) {
+		if (cq_host->quirks & CQHCI_QUIRK_SHORT_TXFR_DESC_SZ)
+			cq_host->trans_desc_len = 12;
+		else
+			cq_host->trans_desc_len = 16;
+		cq_host->link_desc_len = 16;
+	} else {
+		cq_host->trans_desc_len = 8;
+		cq_host->link_desc_len = 8;
+	}
+
+	/* total size of a slot: 1 task & 1 transfer (link) */
+	cq_host->slot_sz = cq_host->task_desc_len + cq_host->link_desc_len;
+
+	cq_host->desc_size = cq_host->slot_sz * cq_host->num_slots;
+
+	cq_host->data_size = cq_host->trans_desc_len * cq_host->mmc->max_segs *
+		(cq_host->num_slots - 1);
+
+	pr_debug("%s: cqhci: desc_size: %zu data_sz: %zu slot-sz: %d\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_size, cq_host->data_size,
+		 cq_host->slot_sz);
+
+	/*
+	 * allocate a dma-mapped chunk of memory for the descriptors
+	 * allocate a dma-mapped chunk of memory for link descriptors
+	 * setup each link-desc memory offset per slot-number to
+	 * the descriptor table.
+	 */
+	cq_host->desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+						 cq_host->desc_size,
+						 &cq_host->desc_dma_base,
+						 GFP_KERNEL);
+	cq_host->trans_desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+					      cq_host->data_size,
+					      &cq_host->trans_desc_dma_base,
+					      GFP_KERNEL);
+	if (!cq_host->desc_base || !cq_host->trans_desc_base)
+		return -ENOMEM;
+
+	pr_debug("%s: cqhci: desc-base: 0x%p trans-base: 0x%p\n desc_dma 0x%llx trans_dma: 0x%llx\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_base, cq_host->trans_desc_base,
+		(unsigned long long)cq_host->desc_dma_base,
+		(unsigned long long)cq_host->trans_desc_dma_base);
+
+	for (; i < (cq_host->num_slots); i++)
+		setup_trans_desc(cq_host, i);
+
+	return 0;
+}
+
+static void __cqhci_enable(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+
+	/* Configuration must not be changed while enabled */
+	if (cqcfg & CQHCI_ENABLE) {
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+	}
+
+	cqcfg &= ~(CQHCI_DCMD | CQHCI_TASK_DESC_SZ);
+
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		cqcfg |= CQHCI_DCMD;
+
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128)
+		cqcfg |= CQHCI_TASK_DESC_SZ;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cqhci_writel(cq_host, lower_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBA);
+	cqhci_writel(cq_host, upper_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBAU);
+
+	cqhci_writel(cq_host, cq_host->rca, CQHCI_SSC2);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	cqcfg |= CQHCI_ENABLE;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	mmc->cqe_on = true;
+
+	if (cq_host->ops->enable)
+		cq_host->ops->enable(mmc);
+
+	/* Ensure all writes are done before interrupts are enabled */
+	wmb();
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	cq_host->activated = true;
+}
+
+static void __cqhci_disable(struct cqhci_host *cq_host)
+{
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+	cqcfg &= ~CQHCI_ENABLE;
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cq_host->mmc->cqe_on = false;
+
+	cq_host->activated = false;
+}
+
+int cqhci_suspend(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (cq_host->enabled)
+		__cqhci_disable(cq_host);
+
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_suspend);
+
+int cqhci_resume(struct mmc_host *mmc)
+{
+	/* Re-enable is done upon first request */
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_resume);
+
+static int cqhci_enable(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int err;
+
+	if (cq_host->enabled)
+		return 0;
+
+	cq_host->rca = card->rca;
+
+	err = cqhci_host_alloc_tdl(cq_host);
+	if (err)
+		return err;
+
+	__cqhci_enable(cq_host);
+
+	cq_host->enabled = true;
+
+#ifdef DEBUG
+	cqhci_dumpregs(cq_host);
+#endif
+	return 0;
+}
+
+/* CQHCI is idle and should halt immediately, so set a small timeout */
+#define CQHCI_OFF_TIMEOUT 100
+
+static void cqhci_off(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	ktime_t timeout;
+	bool timed_out;
+	u32 reg;
+
+	if (!cq_host->enabled || !mmc->cqe_on || cq_host->recovery_halt)
+		return;
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, false);
+
+	cqhci_writel(cq_host, CQHCI_HALT, CQHCI_CTL);
+
+	timeout = ktime_add_us(ktime_get(), CQHCI_OFF_TIMEOUT);
+	while (1) {
+		timed_out = ktime_compare(ktime_get(), timeout) > 0;
+		reg = cqhci_readl(cq_host, CQHCI_CTL);
+		if ((reg & CQHCI_HALT) || timed_out)
+			break;
+	}
+
+	if (timed_out)
+		pr_err("%s: cqhci: CQE stuck on\n", mmc_hostname(mmc));
+	else
+		pr_debug("%s: cqhci: CQE off\n", mmc_hostname(mmc));
+
+	mmc->cqe_on = false;
+}
+
+static void cqhci_disable(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->enabled)
+		return;
+
+	cqhci_off(mmc);
+
+	__cqhci_disable(cq_host);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->data_size,
+			   cq_host->trans_desc_base,
+			   cq_host->trans_desc_dma_base);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->desc_size,
+			   cq_host->desc_base,
+			   cq_host->desc_dma_base);
+
+	cq_host->trans_desc_base = NULL;
+	cq_host->desc_base = NULL;
+
+	cq_host->enabled = false;
+}
+
+static void cqhci_prep_task_desc(struct mmc_request *mrq,
+					u64 *data, bool intr)
+{
+	u32 req_flags = mrq->data->flags;
+
+	*data = CQHCI_VALID(1) |
+		CQHCI_END(1) |
+		CQHCI_INT(intr) |
+		CQHCI_ACT(0x5) |
+		CQHCI_FORCED_PROG(!!(req_flags & MMC_DATA_FORCED_PRG)) |
+		CQHCI_DATA_TAG(!!(req_flags & MMC_DATA_DAT_TAG)) |
+		CQHCI_DATA_DIR(!!(req_flags & MMC_DATA_READ)) |
+		CQHCI_PRIORITY(!!(req_flags & MMC_DATA_PRIO)) |
+		CQHCI_QBAR(!!(req_flags & MMC_DATA_QBR)) |
+		CQHCI_REL_WRITE(!!(req_flags & MMC_DATA_REL_WR)) |
+		CQHCI_BLK_COUNT(mrq->data->blocks) |
+		CQHCI_BLK_ADDR((u64)mrq->data->blk_addr);
+
+	pr_debug("%s: cqhci: tag %d task descriptor 0x%016llx\n",
+		 mmc_hostname(mrq->host), mrq->tag, (unsigned long long)*data);
+}
+
+static int cqhci_dma_map(struct mmc_host *host, struct mmc_request *mrq)
+{
+	int sg_count;
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return -EINVAL;
+
+	sg_count = dma_map_sg(mmc_dev(host), data->sg,
+			      data->sg_len,
+			      (data->flags & MMC_DATA_WRITE) ?
+			      DMA_TO_DEVICE : DMA_FROM_DEVICE);
+	if (!sg_count) {
+		pr_err("%s: sg-len: %d\n", __func__, data->sg_len);
+		return -ENOMEM;
+	}
+
+	return sg_count;
+}
+
+static void cqhci_set_tran_desc(u8 *desc, dma_addr_t addr, int len, bool end,
+				bool dma64)
+{
+	__le32 *attr = (__le32 __force *)desc;
+
+	*attr = (CQHCI_VALID(1) |
+		 CQHCI_END(end ? 1 : 0) |
+		 CQHCI_INT(0) |
+		 CQHCI_ACT(0x4) |
+		 CQHCI_DAT_LENGTH(len));
+
+	if (dma64) {
+		__le64 *dataddr = (__le64 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le64(addr);
+	} else {
+		__le32 *dataddr = (__le32 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le32(addr);
+	}
+}
+
+static int cqhci_prep_tran_desc(struct mmc_request *mrq,
+			       struct cqhci_host *cq_host, int tag)
+{
+	struct mmc_data *data = mrq->data;
+	int i, sg_count, len;
+	bool end = false;
+	bool dma64 = cq_host->dma64;
+	dma_addr_t addr;
+	u8 *desc;
+	struct scatterlist *sg;
+
+	sg_count = cqhci_dma_map(mrq->host, mrq);
+	if (sg_count < 0) {
+		pr_err("%s: %s: unable to map sg lists, %d\n",
+				mmc_hostname(mrq->host), __func__, sg_count);
+		return sg_count;
+	}
+
+	desc = get_trans_desc(cq_host, tag);
+
+	for_each_sg(data->sg, sg, sg_count, i) {
+		addr = sg_dma_address(sg);
+		len = sg_dma_len(sg);
+
+		if ((i+1) == sg_count)
+			end = true;
+		cqhci_set_tran_desc(desc, addr, len, end, dma64);
+		desc += cq_host->trans_desc_len;
+	}
+
+	return 0;
+}
+
+static void cqhci_prep_dcmd_desc(struct mmc_host *mmc,
+				   struct mmc_request *mrq)
+{
+	u64 *task_desc = NULL;
+	u64 data = 0;
+	u8 resp_type;
+	u8 *desc;
+	__le64 *dataddr;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	u8 timing;
+
+	if (!(mrq->cmd->flags & MMC_RSP_PRESENT)) {
+		resp_type = 0x0;
+		timing = 0x1;
+	} else {
+		if (mrq->cmd->flags & MMC_RSP_R1B) {
+			resp_type = 0x3;
+			timing = 0x0;
+		} else {
+			resp_type = 0x2;
+			timing = 0x1;
+		}
+	}
+
+	task_desc = (__le64 __force *)get_desc(cq_host, cq_host->dcmd_slot);
+	memset(task_desc, 0, cq_host->task_desc_len);
+	data |= (CQHCI_VALID(1) |
+		 CQHCI_END(1) |
+		 CQHCI_INT(1) |
+		 CQHCI_QBAR(1) |
+		 CQHCI_ACT(0x5) |
+		 CQHCI_CMD_INDEX(mrq->cmd->opcode) |
+		 CQHCI_CMD_TIMING(timing) | CQHCI_RESP_TYPE(resp_type));
+	*task_desc |= data;
+	desc = (u8 *)task_desc;
+	pr_debug("%s: cqhci: dcmd: cmd: %d timing: %d resp: %d\n",
+		 mmc_hostname(mmc), mrq->cmd->opcode, timing, resp_type);
+	dataddr = (__le64 __force *)(desc + 4);
+	dataddr[0] = cpu_to_le64((u64)mrq->cmd->arg);
+}
+
+static void cqhci_post_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	struct mmc_data *data = mrq->data;
+
+	if (data) {
+		dma_unmap_sg(mmc_dev(host), data->sg, data->sg_len,
+			     (data->flags & MMC_DATA_READ) ?
+			     DMA_FROM_DEVICE : DMA_TO_DEVICE);
+	}
+}
+
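+/*
+ * In the CQE path, only DCMDs (direct commands) carry a command structure;
+ * they use the reserved DCMD slot, while data requests use their blk-mq tag.
+ */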
+static inline int cqhci_tag(struct mmc_request *mrq)
+{
+	return mrq->cmd ? DCMD_SLOT : mrq->tag;
+}
+
+static int cqhci_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	int err = 0;
+	u64 data = 0;
+	u64 *task_desc = NULL;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+
+	if (!cq_host->enabled) {
+		pr_err("%s: cqhci: not enabled\n", mmc_hostname(mmc));
+		return -EINVAL;
+	}
+
+	/* First request after resume has to re-enable */
+	if (!cq_host->activated)
+		__cqhci_enable(cq_host);
+
+	if (!mmc->cqe_on) {
+		cqhci_writel(cq_host, 0, CQHCI_CTL);
+		mmc->cqe_on = true;
+		pr_debug("%s: cqhci: CQE on\n", mmc_hostname(mmc));
+		if (cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT) {
+			pr_err("%s: cqhci: CQE failed to exit halt state\n",
+			       mmc_hostname(mmc));
+		}
+		if (cq_host->ops->enable)
+			cq_host->ops->enable(mmc);
+	}
+
+	if (mrq->data) {
+		task_desc = (__le64 __force *)get_desc(cq_host, tag);
+		cqhci_prep_task_desc(mrq, &data, 1);
+		*task_desc = cpu_to_le64(data);
+		err = cqhci_prep_tran_desc(mrq, cq_host, tag);
+		if (err) {
+			pr_err("%s: cqhci: failed to setup tx desc: %d\n",
+			       mmc_hostname(mmc), err);
+			return err;
+		}
+	} else {
+		cqhci_prep_dcmd_desc(mmc, mrq);
+	}
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+
+	if (cq_host->recovery_halt) {
+		err = -EBUSY;
+		goto out_unlock;
+	}
+
+	cq_host->slot[tag].mrq = mrq;
+	cq_host->slot[tag].flags = 0;
+
+	cq_host->qcnt += 1;
+
+	cqhci_writel(cq_host, 1 << tag, CQHCI_TDBR);
+	if (!(cqhci_readl(cq_host, CQHCI_TDBR) & (1 << tag)))
+		pr_debug("%s: cqhci: doorbell not set for tag %d\n",
+			 mmc_hostname(mmc), tag);
+out_unlock:
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (err)
+		cqhci_post_req(mmc, mrq);
+
+	return err;
+}
+
+static void cqhci_recovery_needed(struct mmc_host *mmc, struct mmc_request *mrq,
+				  bool notify)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->recovery_halt) {
+		cq_host->recovery_halt = true;
+		pr_debug("%s: cqhci: recovery needed\n", mmc_hostname(mmc));
+		wake_up(&cq_host->wait_queue);
+		if (notify && mrq->recovery_notifier)
+			mrq->recovery_notifier(mrq);
+	}
+}
+
+static unsigned int cqhci_error_flags(int error1, int error2)
+{
+	int error = error1 ? error1 : error2;
+
+	switch (error) {
+	case -EILSEQ:
+		return CQHCI_HOST_CRC;
+	case -ETIMEDOUT:
+		return CQHCI_HOST_TIMEOUT;
+	default:
+		return CQHCI_HOST_OTHER;
+	}
+}
+
+static void cqhci_error_irq(struct mmc_host *mmc, u32 status, int cmd_error,
+			    int data_error)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot;
+	u32 terri;
+	int tag;
+
+	spin_lock(&cq_host->lock);
+
+	terri = cqhci_readl(cq_host, CQHCI_TERRI);
+
+	pr_debug("%s: cqhci: error IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+		 mmc_hostname(mmc), status, cmd_error, data_error, terri);
+
+	/* Forget about errors when recovery has already been triggered */
+	if (cq_host->recovery_halt)
+		goto out_unlock;
+
+	if (!cq_host->qcnt) {
+		WARN_ONCE(1, "%s: cqhci: error when idle. IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+			  mmc_hostname(mmc), status, cmd_error, data_error,
+			  terri);
+		goto out_unlock;
+	}
+
+	if (CQHCI_TERRI_C_VALID(terri)) {
+		tag = CQHCI_TERRI_C_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(cmd_error, data_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (CQHCI_TERRI_D_VALID(terri)) {
+		tag = CQHCI_TERRI_D_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (!cq_host->recovery_halt) {
+		/*
+		 * The only way to guarantee forward progress is to mark at
+		 * least one task in error, so if none is indicated, pick one.
+		 */
+		for (tag = 0; tag < NUM_SLOTS; tag++) {
+			slot = &cq_host->slot[tag];
+			if (!slot->mrq)
+				continue;
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+			break;
+		}
+	}
+
+out_unlock:
+	spin_unlock(&cq_host->lock);
+}
+
+static void cqhci_finish_mrq(struct mmc_host *mmc, unsigned int tag)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq) {
+		WARN_ONCE(1, "%s: cqhci: spurious TCN for tag %d\n",
+			  mmc_hostname(mmc), tag);
+		return;
+	}
+
+	/* No completions allowed during recovery */
+	if (cq_host->recovery_halt) {
+		slot->flags |= CQHCI_COMPLETED;
+		return;
+	}
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		if (data->error)
+			data->bytes_xfered = 0;
+		else
+			data->bytes_xfered = data->blksz * data->blocks;
+	}
+
+	mmc_cqe_request_done(mmc, mrq);
+}
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error)
+{
+	u32 status;
+	unsigned long tag = 0, comp_status;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	status = cqhci_readl(cq_host, CQHCI_IS);
+	cqhci_writel(cq_host, status, CQHCI_IS);
+
+	pr_debug("%s: cqhci: IRQ status: 0x%08x\n", mmc_hostname(mmc), status);
+
+	if ((status & CQHCI_IS_RED) || cmd_error || data_error)
+		cqhci_error_irq(mmc, status, cmd_error, data_error);
+
+	if (status & CQHCI_IS_TCC) {
+		/* read TCN and complete the request */
+		comp_status = cqhci_readl(cq_host, CQHCI_TCN);
+		cqhci_writel(cq_host, comp_status, CQHCI_TCN);
+		pr_debug("%s: cqhci: TCN: 0x%08lx\n",
+			 mmc_hostname(mmc), comp_status);
+
+		spin_lock(&cq_host->lock);
+
+		for_each_set_bit(tag, &comp_status, cq_host->num_slots) {
+			/* complete the corresponding mrq */
+			pr_debug("%s: cqhci: completing tag %lu\n",
+				 mmc_hostname(mmc), tag);
+			cqhci_finish_mrq(mmc, tag);
+		}
+
+		if (cq_host->waiting_for_idle && !cq_host->qcnt) {
+			cq_host->waiting_for_idle = false;
+			wake_up(&cq_host->wait_queue);
+		}
+
+		spin_unlock(&cq_host->lock);
+	}
+
+	if (status & CQHCI_IS_TCL)
+		wake_up(&cq_host->wait_queue);
+
+	if (status & CQHCI_IS_HAC)
+		wake_up(&cq_host->wait_queue);
+
+	return IRQ_HANDLED;
+}
+EXPORT_SYMBOL(cqhci_irq);
+
+static bool cqhci_is_idle(struct cqhci_host *cq_host, int *ret)
+{
+	unsigned long flags;
+	bool is_idle;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	is_idle = !cq_host->qcnt || cq_host->recovery_halt;
+	*ret = cq_host->recovery_halt ? -EBUSY : 0;
+	cq_host->waiting_for_idle = !is_idle;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	return is_idle;
+}
+
+static int cqhci_wait_for_idle(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int ret;
+
+	wait_event(cq_host->wait_queue, cqhci_is_idle(cq_host, &ret));
+
+	return ret;
+}
+
+static bool cqhci_timeout(struct mmc_host *mmc, struct mmc_request *mrq,
+			  bool *recovery_needed)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	unsigned long flags;
+	bool timed_out;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	timed_out = slot->mrq == mrq;
+	if (timed_out) {
+		slot->flags |= CQHCI_EXTERNAL_TIMEOUT;
+		cqhci_recovery_needed(mmc, mrq, false);
+		*recovery_needed = cq_host->recovery_halt;
+	}
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (timed_out) {
+		pr_err("%s: cqhci: timeout for tag %d\n",
+		       mmc_hostname(mmc), tag);
+		cqhci_dumpregs(cq_host);
+	}
+
+	return timed_out;
+}
+
+static bool cqhci_tasks_cleared(struct cqhci_host *cq_host)
+{
+	return !(cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_CLEAR_ALL_TASKS);
+}
+
+static bool cqhci_clear_all_tasks(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_TCL);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_CLEAR_ALL_TASKS;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_tasks_cleared(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_tasks_cleared(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to clear tasks\n",
+			 mmc_hostname(mmc));
+
+	return ret;
+}
+
+static bool cqhci_halted(struct cqhci_host *cq_host)
+{
+	return cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT;
+}
+
+static bool cqhci_halt(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	if (cqhci_halted(cq_host))
+		return true;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_HAC);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_HALT;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_halted(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_halted(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to halt\n", mmc_hostname(mmc));
+
+	return ret;
+}
+
+/*
+ * After halting we expect to be able to use the command (CMD) line. We
+ * interpret a failure to halt to mean the data lines might still be in use
+ * (and the upper layers will need to send a STOP command), so we set the
+ * timeout based on a generous command timeout.
+ */
+#define CQHCI_START_HALT_TIMEOUT	5
+
+static void cqhci_recovery_start(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	cqhci_halt(mmc, CQHCI_START_HALT_TIMEOUT);
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, true);
+
+	mmc->cqe_on = false;
+}
+
+static int cqhci_error_from_flags(unsigned int flags)
+{
+	if (!flags)
+		return 0;
+
+	/* CRC errors might indicate re-tuning so prefer to report that */
+	if (flags & CQHCI_HOST_CRC)
+		return -EILSEQ;
+
+	if (flags & (CQHCI_EXTERNAL_TIMEOUT | CQHCI_HOST_TIMEOUT))
+		return -ETIMEDOUT;
+
+	return -EIO;
+}
+
+static void cqhci_recover_mrq(struct cqhci_host *cq_host, unsigned int tag)
+{
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq)
+		return;
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		data->bytes_xfered = 0;
+		data->error = cqhci_error_from_flags(slot->flags);
+	} else {
+		mrq->cmd->error = cqhci_error_from_flags(slot->flags);
+	}
+
+	mmc_cqe_request_done(cq_host->mmc, mrq);
+}
+
+static void cqhci_recover_mrqs(struct cqhci_host *cq_host)
+{
+	int i;
+
+	for (i = 0; i < cq_host->num_slots; i++)
+		cqhci_recover_mrq(cq_host, i);
+}
+
+/*
+ * By now the command and data lines should be unused so there is no reason for
+ * CQHCI to take a long time to halt, but if it doesn't halt there could be
+ * problems clearing tasks, so be generous.
+ */
+#define CQHCI_FINISH_HALT_TIMEOUT	20
+
+/* CQHCI could be expected to clear its internal state pretty quickly */
+#define CQHCI_CLEAR_TIMEOUT		20
+
+static void cqhci_recovery_finish(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+	u32 cqcfg;
+	bool ok;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+
+	if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+		ok = false;
+
+	/*
+	 * The specification contradicts itself, by saying that tasks cannot be
+	 * cleared if CQHCI does not halt, but if CQHCI does not halt, it should
+	 * be disabled/re-enabled, but not to disable before clearing tasks.
+	 * Have a go anyway.
+	 */
+	if (!ok) {
+		pr_debug("%s: cqhci: disable / re-enable\n", mmc_hostname(mmc));
+		cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		cqcfg |= CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		/* Be sure that there are no tasks */
+		ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+		if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+			ok = false;
+		WARN_ON(!ok);
+	}
+
+	cqhci_recover_mrqs(cq_host);
+
+	WARN_ON(cq_host->qcnt);
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	cq_host->qcnt = 0;
+	cq_host->recovery_halt = false;
+	mmc->cqe_on = false;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	/* Ensure all writes are done before interrupts are re-enabled */
+	wmb();
+
+	cqhci_writel(cq_host, CQHCI_IS_HAC | CQHCI_IS_TCL, CQHCI_IS);
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	pr_debug("%s: cqhci: recovery done\n", mmc_hostname(mmc));
+}
+
+static const struct mmc_cqe_ops cqhci_cqe_ops = {
+	.cqe_enable = cqhci_enable,
+	.cqe_disable = cqhci_disable,
+	.cqe_request = cqhci_request,
+	.cqe_post_req = cqhci_post_req,
+	.cqe_off = cqhci_off,
+	.cqe_wait_for_idle = cqhci_wait_for_idle,
+	.cqe_timeout = cqhci_timeout,
+	.cqe_recovery_start = cqhci_recovery_start,
+	.cqe_recovery_finish = cqhci_recovery_finish,
+};
+
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev)
+{
+	struct cqhci_host *cq_host;
+	struct resource *cqhci_memres = NULL;
+
+	/* check and setup CMDQ interface */
+	cqhci_memres = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+						   "cqhci_mem");
+	if (!cqhci_memres) {
+		dev_dbg(&pdev->dev, "CMDQ not supported\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	cq_host = devm_kzalloc(&pdev->dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host)
+		return ERR_PTR(-ENOMEM);
+	cq_host->mmio = devm_ioremap(&pdev->dev,
+				     cqhci_memres->start,
+				     resource_size(cqhci_memres));
+	if (!cq_host->mmio) {
+		dev_err(&pdev->dev, "failed to remap cqhci regs\n");
+		return ERR_PTR(-EBUSY);
+	}
+	dev_dbg(&pdev->dev, "CMDQ ioremap: done\n");
+
+	return cq_host;
+}
+EXPORT_SYMBOL(cqhci_pltfm_init);
+
+static unsigned int cqhci_ver_major(struct cqhci_host *cq_host)
+{
+	return CQHCI_VER_MAJOR(cqhci_readl(cq_host, CQHCI_VER));
+}
+
+static unsigned int cqhci_ver_minor(struct cqhci_host *cq_host)
+{
+	u32 ver = cqhci_readl(cq_host, CQHCI_VER);
+
+	return CQHCI_VER_MINOR1(ver) * 10 + CQHCI_VER_MINOR2(ver);
+}
+
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc,
+	      bool dma64)
+{
+	int err;
+
+	cq_host->dma64 = dma64;
+	cq_host->mmc = mmc;
+	cq_host->mmc->cqe_private = cq_host;
+
+	cq_host->num_slots = NUM_SLOTS;
+	cq_host->dcmd_slot = DCMD_SLOT;
+
+	mmc->cqe_ops = &cqhci_cqe_ops;
+
+	mmc->cqe_qdepth = NUM_SLOTS;
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		mmc->cqe_qdepth -= 1;
+
+	cq_host->slot = devm_kcalloc(mmc_dev(mmc), cq_host->num_slots,
+				     sizeof(*cq_host->slot), GFP_KERNEL);
+	if (!cq_host->slot) {
+		err = -ENOMEM;
+		goto out_err;
+	}
+
+	spin_lock_init(&cq_host->lock);
+
+	init_completion(&cq_host->halt_comp);
+	init_waitqueue_head(&cq_host->wait_queue);
+
+	pr_info("%s: CQHCI version %u.%02u\n",
+		mmc_hostname(mmc), cqhci_ver_major(cq_host),
+		cqhci_ver_minor(cq_host));
+
+	return 0;
+
+out_err:
+	pr_err("%s: CQHCI version %u.%02u failed to initialize, error %d\n",
+	       mmc_hostname(mmc), cqhci_ver_major(cq_host),
+	       cqhci_ver_minor(cq_host), err);
+	return err;
+}
+EXPORT_SYMBOL(cqhci_init);
+
+MODULE_AUTHOR("Venkat Gopalakrishnan <venkatg@codeaurora.org>");
+MODULE_DESCRIPTION("Command Queue Host Controller Interface driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/mmc/host/cqhci.h b/drivers/mmc/host/cqhci.h
new file mode 100644
index 000000000000..2d39d361b322
--- /dev/null
+++ b/drivers/mmc/host/cqhci.h
@@ -0,0 +1,240 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#ifndef LINUX_MMC_CQHCI_H
+#define LINUX_MMC_CQHCI_H
+
+#include <linux/compiler.h>
+#include <linux/bitops.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/completion.h>
+#include <linux/wait.h>
+#include <linux/irqreturn.h>
+#include <asm/io.h>
+
+/* registers */
+/* version */
+#define CQHCI_VER			0x00
+#define CQHCI_VER_MAJOR(x)		(((x) & GENMASK(11, 8)) >> 8)
+#define CQHCI_VER_MINOR1(x)		(((x) & GENMASK(7, 4)) >> 4)
+#define CQHCI_VER_MINOR2(x)		((x) & GENMASK(3, 0))
+
+/* capabilities */
+#define CQHCI_CAP			0x04
+/* configuration */
+#define CQHCI_CFG			0x08
+#define CQHCI_DCMD			0x00001000
+#define CQHCI_TASK_DESC_SZ		0x00000100
+#define CQHCI_ENABLE			0x00000001
+
+/* control */
+#define CQHCI_CTL			0x0C
+#define CQHCI_CLEAR_ALL_TASKS		0x00000100
+#define CQHCI_HALT			0x00000001
+
+/* interrupt status */
+#define CQHCI_IS			0x10
+#define CQHCI_IS_HAC			BIT(0)
+#define CQHCI_IS_TCC			BIT(1)
+#define CQHCI_IS_RED			BIT(2)
+#define CQHCI_IS_TCL			BIT(3)
+
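+/*
+ * Normal operation leaves only TCC and RED enabled; HAC and TCL are enabled
+ * on demand while waiting for a halt or a clear-tasks operation to complete.
+ */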
+#define CQHCI_IS_MASK (CQHCI_IS_TCC | CQHCI_IS_RED)
+
+/* interrupt status enable */
+#define CQHCI_ISTE			0x14
+
+/* interrupt signal enable */
+#define CQHCI_ISGE			0x18
+
+/* interrupt coalescing */
+#define CQHCI_IC			0x1C
+#define CQHCI_IC_ENABLE			BIT(31)
+#define CQHCI_IC_RESET			BIT(16)
+#define CQHCI_IC_ICCTHWEN		BIT(15)
+#define CQHCI_IC_ICCTH(x)		((x & 0x1F) << 8)
+#define CQHCI_IC_ICTOVALWEN		BIT(7)
+#define CQHCI_IC_ICTOVAL(x)		(x & 0x7F)
+
+/* task list base address */
+#define CQHCI_TDLBA			0x20
+
+/* task list base address upper */
+#define CQHCI_TDLBAU			0x24
+
+/* door-bell */
+#define CQHCI_TDBR			0x28
+
+/* task completion notification */
+#define CQHCI_TCN			0x2C
+
+/* device queue status */
+#define CQHCI_DQS			0x30
+
+/* device pending tasks */
+#define CQHCI_DPT			0x34
+
+/* task clear */
+#define CQHCI_TCLR			0x38
+
+/* send status config 1 */
+#define CQHCI_SSC1			0x40
+
+/* send status config 2 */
+#define CQHCI_SSC2			0x44
+
+/* response for dcmd */
+#define CQHCI_CRDCT			0x48
+
+/* response mode error mask */
+#define CQHCI_RMEM			0x50
+
+/* task error info */
+#define CQHCI_TERRI			0x54
+
+#define CQHCI_TERRI_C_INDEX(x)		((x) & GENMASK(5, 0))
+#define CQHCI_TERRI_C_TASK(x)		(((x) & GENMASK(12, 8)) >> 8)
+#define CQHCI_TERRI_C_VALID(x)		((x) & BIT(15))
+#define CQHCI_TERRI_D_INDEX(x)		(((x) & GENMASK(21, 16)) >> 16)
+#define CQHCI_TERRI_D_TASK(x)		(((x) & GENMASK(28, 24)) >> 24)
+#define CQHCI_TERRI_D_VALID(x)		((x) & BIT(31))
+
+/* command response index */
+#define CQHCI_CRI			0x58
+
+/* command response argument */
+#define CQHCI_CRA			0x5C
+
+#define CQHCI_INT_ALL			0xF
+#define CQHCI_IC_DEFAULT_ICCTH		31
+#define CQHCI_IC_DEFAULT_ICTOVAL	1
+
+/* attribute fields */
+#define CQHCI_VALID(x)			((x & 1) << 0)
+#define CQHCI_END(x)			((x & 1) << 1)
+#define CQHCI_INT(x)			((x & 1) << 2)
+#define CQHCI_ACT(x)			((x & 0x7) << 3)
+
+/* data command task descriptor fields */
+#define CQHCI_FORCED_PROG(x)		((x & 1) << 6)
+#define CQHCI_CONTEXT(x)		((x & 0xF) << 7)
+#define CQHCI_DATA_TAG(x)		((x & 1) << 11)
+#define CQHCI_DATA_DIR(x)		((x & 1) << 12)
+#define CQHCI_PRIORITY(x)		((x & 1) << 13)
+#define CQHCI_QBAR(x)			((x & 1) << 14)
+#define CQHCI_REL_WRITE(x)		((x & 1) << 15)
+#define CQHCI_BLK_COUNT(x)		((x & 0xFFFF) << 16)
+#define CQHCI_BLK_ADDR(x)		((x & 0xFFFFFFFF) << 32)
+
+/* direct command task descriptor fields */
+#define CQHCI_CMD_INDEX(x)		((x & 0x3F) << 16)
+#define CQHCI_CMD_TIMING(x)		((x & 1) << 22)
+#define CQHCI_RESP_TYPE(x)		((x & 0x3) << 23)
+
+/* transfer descriptor fields */
+#define CQHCI_DAT_LENGTH(x)		((x & 0xFFFF) << 16)
+#define CQHCI_DAT_ADDR_LO(x)		((x & 0xFFFFFFFF) << 32)
+#define CQHCI_DAT_ADDR_HI(x)		((x & 0xFFFFFFFF) << 0)
+
+struct cqhci_host_ops;
+struct mmc_host;
+struct cqhci_slot;
+
+struct cqhci_host {
+	const struct cqhci_host_ops *ops;
+	void __iomem *mmio;
+	struct mmc_host *mmc;
+
+	spinlock_t lock;
+
+	/* relative card address of device */
+	unsigned int rca;
+
+	/* 64 bit DMA */
+	bool dma64;
+	int num_slots;
+	int qcnt;
+
+	u32 dcmd_slot;
+	u32 caps;
+#define CQHCI_TASK_DESC_SZ_128		0x1
+
+	u32 quirks;
+#define CQHCI_QUIRK_SHORT_TXFR_DESC_SZ	0x1
+
+	bool enabled;
+	bool halted;
+	bool init_done;
+	bool activated;
+	bool waiting_for_idle;
+	bool recovery_halt;
+
+	size_t desc_size;
+	size_t data_size;
+
+	u8 *desc_base;
+
+	/* per-slot size: task descriptor plus link descriptor */
+	u8 slot_sz;
+
+	/* 64/128 bit depends on CQHCI_CFG */
+	u8 task_desc_len;
+
+	/* 64 bit on 32-bit arch, 128 bit on 64-bit */
+	u8 link_desc_len;
+
+	u8 *trans_desc_base;
+	/* same length as transfer descriptor */
+	u8 trans_desc_len;
+
+	dma_addr_t desc_dma_base;
+	dma_addr_t trans_desc_dma_base;
+
+	struct completion halt_comp;
+	wait_queue_head_t wait_queue;
+	struct cqhci_slot *slot;
+};
+
+struct cqhci_host_ops {
+	void (*dumpregs)(struct mmc_host *mmc);
+	void (*write_l)(struct cqhci_host *host, u32 val, int reg);
+	u32 (*read_l)(struct cqhci_host *host, int reg);
+	void (*enable)(struct mmc_host *mmc);
+	void (*disable)(struct mmc_host *mmc, bool recovery);
+};
+
+static inline void cqhci_writel(struct cqhci_host *host, u32 val, int reg)
+{
+	if (unlikely(host->ops->write_l))
+		host->ops->write_l(host, val, reg);
+	else
+		writel_relaxed(val, host->mmio + reg);
+}
+
+static inline u32 cqhci_readl(struct cqhci_host *host, int reg)
+{
+	if (unlikely(host->ops->read_l))
+		return host->ops->read_l(host, reg);
+	else
+		return readl_relaxed(host->mmio + reg);
+}
+
+struct platform_device;
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error);
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc, bool dma64);
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev);
+int cqhci_suspend(struct mmc_host *mmc);
+int cqhci_resume(struct mmc_host *mmc);
+
+#endif
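
As a worked example of how the attribute macros compose (values chosen purely
for illustration): a task descriptor for an 8-block read at block address
0x1000, with an interrupt on completion, would be built as

	u64 desc = CQHCI_VALID(1) | CQHCI_END(1) | CQHCI_INT(1) |
		   CQHCI_ACT(0x5) |		/* 101b = data transfer task */
		   CQHCI_DATA_DIR(1) |		/* 1 = read */
		   CQHCI_BLK_COUNT(8) |		/* block count in bits 31:16 */
		   CQHCI_BLK_ADDR((u64)0x1000);	/* block address in bits 63:32 */
	/* desc == 0x000010000008102f */

Note the (u64) cast: CQHCI_BLK_ADDR() shifts by 32, so the argument must
already be 64 bits wide, exactly as cqhci_prep_task_desc() does with
mrq->data->blk_addr.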
-- 
1.9.1

* [PATCH V14 16/24] mmc: sdhci-pci: Add CQHCI support for Intel GLK
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (14 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 15/24] mmc: cqhci: support for command queue enabled host Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 17/24] mmc: block: blk-mq: Add support for direct completion Adrian Hunter
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Add CQHCI initialization and implement CQHCI operations for Intel GLK.

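The key ordering constraint in the new ->add_host() hook is that CQHCI is
initialized after sdhci_setup_host(), once the host's DMA capabilities are
known, but before __sdhci_add_host() exposes the host. Condensed from the
diff below (error handling elided):

	ret = sdhci_setup_host(host);		/* probes caps, incl. 64-bit DMA */
	cq_host->mmio = host->ioaddr + 0x200;	/* CQE registers at fixed offset */
	dma64 = host->flags & SDHCI_USE_64_BIT_DMA;
	ret = cqhci_init(cq_host, host->mmc, dma64);
	ret = __sdhci_add_host(host);		/* only now expose the host */
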
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/Kconfig          |   1 +
 drivers/mmc/host/sdhci-pci-core.c | 155 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 155 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 3092b7085cb5..2b02a9788bb6 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -81,6 +81,7 @@ config MMC_SDHCI_BIG_ENDIAN_32BIT_BYTE_SWAPPER
 config MMC_SDHCI_PCI
 	tristate "SDHCI support on PCI bus"
 	depends on MMC_SDHCI && PCI
+	select MMC_CQHCI
 	help
 	  This selects the PCI Secure Digital Host Controller Interface.
 	  Most controllers found today are PCI devices.
diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 3e4f04fd5175..110c634cfb43 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -30,6 +30,8 @@
 #include <linux/mmc/sdhci-pci-data.h>
 #include <linux/acpi.h>
 
+#include "cqhci.h"
+
 #include "sdhci.h"
 #include "sdhci-pci.h"
 
@@ -116,6 +118,28 @@ int sdhci_pci_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_suspend_host(chip);
+}
+
+static int sdhci_cqhci_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
 #ifdef CONFIG_PM
@@ -166,8 +190,48 @@ static int sdhci_pci_runtime_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_runtime_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_runtime_suspend_host(chip);
+}
+
+static int sdhci_cqhci_runtime_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_runtime_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
+static u32 sdhci_cqhci_irq(struct sdhci_host *host, u32 intmask)
+{
+	int cmd_error = 0;
+	int data_error = 0;
+
+	if (!sdhci_cqe_irq(host, intmask, &cmd_error, &data_error))
+		return intmask;
+
+	cqhci_irq(host->mmc, intmask, cmd_error, data_error);
+
+	return 0;
+}
+
+static void sdhci_pci_dumpregs(struct mmc_host *mmc)
+{
+	sdhci_dumpregs(mmc_priv(mmc));
+}
+
 /*****************************************************************************\
  *                                                                           *
  * Hardware specific quirk handling                                          *
@@ -583,6 +647,18 @@ static void sdhci_intel_voltage_switch(struct sdhci_host *host)
 	.voltage_switch		= sdhci_intel_voltage_switch,
 };
 
+static const struct sdhci_ops sdhci_intel_glk_ops = {
+	.set_clock		= sdhci_set_clock,
+	.set_power		= sdhci_intel_set_power,
+	.enable_dma		= sdhci_pci_enable_dma,
+	.set_bus_width		= sdhci_set_bus_width,
+	.reset			= sdhci_reset,
+	.set_uhs_signaling	= sdhci_set_uhs_signaling,
+	.hw_reset		= sdhci_pci_hw_reset,
+	.voltage_switch		= sdhci_intel_voltage_switch,
+	.irq			= sdhci_cqhci_irq,
+};
+
 static void byt_read_dsm(struct sdhci_pci_slot *slot)
 {
 	struct intel_host *intel_host = sdhci_pci_priv(slot);
@@ -612,12 +688,80 @@ static int glk_emmc_probe_slot(struct sdhci_pci_slot *slot)
 {
 	int ret = byt_emmc_probe_slot(slot);
 
+	slot->host->mmc->caps2 |= MMC_CAP2_CQE;
+
 	if (slot->chip->pdev->device != PCI_DEVICE_ID_INTEL_GLK_EMMC) {
 		slot->host->mmc->caps2 |= MMC_CAP2_HS400_ES,
 		slot->host->mmc_host_ops.hs400_enhanced_strobe =
 						intel_hs400_enhanced_strobe;
+		slot->host->mmc->caps2 |= MMC_CAP2_CQE_DCMD;
+	}
+
+	return ret;
+}
+
+static void glk_cqe_enable(struct mmc_host *mmc)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	u32 reg;
+
+	/*
+	 * CQE gets stuck if it sees Buffer Read Enable bit set, which can be
+	 * the case after tuning, so ensure the buffer is drained.
+	 */
+	reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	while (reg & SDHCI_DATA_AVAILABLE) {
+		sdhci_readl(host, SDHCI_BUFFER);
+		reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	}
+
+	sdhci_cqe_enable(mmc);
+}
+
+static const struct cqhci_host_ops glk_cqhci_ops = {
+	.enable		= glk_cqe_enable,
+	.disable	= sdhci_cqe_disable,
+	.dumpregs	= sdhci_pci_dumpregs,
+};
+
+static int glk_emmc_add_host(struct sdhci_pci_slot *slot)
+{
+	struct device *dev = &slot->chip->pdev->dev;
+	struct sdhci_host *host = slot->host;
+	struct cqhci_host *cq_host;
+	bool dma64;
+	int ret;
+
+	ret = sdhci_setup_host(host);
+	if (ret)
+		return ret;
+
+	cq_host = devm_kzalloc(dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host) {
+		ret = -ENOMEM;
+		goto cleanup;
 	}
 
+	cq_host->mmio = host->ioaddr + 0x200;
+	cq_host->quirks |= CQHCI_QUIRK_SHORT_TXFR_DESC_SZ;
+	cq_host->ops = &glk_cqhci_ops;
+
+	dma64 = host->flags & SDHCI_USE_64_BIT_DMA;
+	if (dma64)
+		cq_host->caps |= CQHCI_TASK_DESC_SZ_128;
+
+	ret = cqhci_init(cq_host, host->mmc, dma64);
+	if (ret)
+		goto cleanup;
+
+	ret = __sdhci_add_host(host);
+	if (ret)
+		goto cleanup;
+
+	return 0;
+
+cleanup:
+	sdhci_cleanup_host(host);
 	return ret;
 }
 
@@ -699,11 +843,20 @@ static int byt_sd_probe_slot(struct sdhci_pci_slot *slot)
 static const struct sdhci_pci_fixes sdhci_intel_glk_emmc = {
 	.allow_runtime_pm	= true,
 	.probe_slot		= glk_emmc_probe_slot,
+	.add_host		= glk_emmc_add_host,
+#ifdef CONFIG_PM_SLEEP
+	.suspend		= sdhci_cqhci_suspend,
+	.resume			= sdhci_cqhci_resume,
+#endif
+#ifdef CONFIG_PM
+	.runtime_suspend	= sdhci_cqhci_runtime_suspend,
+	.runtime_resume		= sdhci_cqhci_runtime_resume,
+#endif
 	.quirks			= SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC,
 	.quirks2		= SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
 				  SDHCI_QUIRK2_CAPS_BIT63_FOR_HS400 |
 				  SDHCI_QUIRK2_STOP_WITH_TC,
-	.ops			= &sdhci_intel_byt_ops,
+	.ops			= &sdhci_intel_glk_ops,
 	.priv_size		= sizeof(struct intel_host),
 };
 
-- 
1.9.1

* [PATCH V14 17/24] mmc: block: blk-mq: Add support for direct completion
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (15 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 16/24] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-28 19:02   ` Ulf Hansson
  2017-11-21 13:42 ` [PATCH V14 18/24] mmc: block: blk-mq: Separate card polling from recovery Adrian Hunter
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

For blk-mq, add support for completing requests directly in the ->done
callback. That means that error handling and urgent background operations
must be handled by recovery_work in that case.

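With MMC_CAP_DIRECT_COMPLETE set, the ->done handler either completes the
request in place or defers it to recovery. The decision, condensed from the
diff below (locking elided):

	/* in mmc_blk_mq_req_done(), when MMC_CAP_DIRECT_COMPLETE is set */
	if (mmc_blk_rq_error(&mqrq->brq) ||
	    mmc_blk_urgent_bkops_needed(mq, mqrq)) {
		mq->recovery_needed = true;
		mq->recovery_req = req;
		schedule_work(&mq->recovery_work);	/* handle in process context */
		return;
	}

	mmc_blk_rw_reset_success(mq, req);
	mq->rw_wait = false;
	wake_up(&mq->wait);			/* unblock the issuing thread */
	mmc_blk_mq_post_req(mq, req);		/* unmap and complete directly */
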
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 102 +++++++++++++++++++++++++++++++++++++++++------
 drivers/mmc/core/block.h |   1 +
 drivers/mmc/core/queue.c |   5 ++-
 drivers/mmc/core/queue.h |   6 +++
 include/linux/mmc/host.h |   1 +
 5 files changed, 101 insertions(+), 14 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 2aacd3fa0d1a..b1857eee46fd 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -2154,6 +2154,22 @@ static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
 	}
 }
 
+static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
+{
+	mmc_blk_eval_resp_error(brq);
+
+	return brq->sbc.error || brq->cmd.error || brq->stop.error ||
+	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
+}
+
+static inline void mmc_blk_rw_reset_success(struct mmc_queue *mq,
+					    struct request *req)
+{
+	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+
+	mmc_blk_reset_success(mq->blkdata, type);
+}
+
 static void mmc_blk_mq_complete_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
@@ -2234,14 +2250,43 @@ static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req)
 
 	mmc_post_req(host, mrq, 0);
 
-	blk_mq_complete_request(req);
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_mq_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
 
 	mmc_blk_mq_acct_req_done(mq, req);
 }
 
+void mmc_blk_mq_recovery(struct mmc_queue *mq)
+{
+	struct request *req = mq->recovery_req;
+	struct mmc_host *host = mq->card->host;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mq->recovery_req = NULL;
+	mq->rw_wait = false;
+
+	if (mmc_blk_rq_error(&mqrq->brq)) {
+		mmc_retune_hold_now(host);
+		mmc_blk_mq_rw_recovery(mq, req);
+	}
+
+	mmc_blk_urgent_bkops(mq, mqrq);
+
+	mmc_blk_mq_post_req(mq, req);
+}
+
 static void mmc_blk_mq_complete_prev_req(struct mmc_queue *mq,
 					 struct request **prev_req)
 {
+	if (mmc_queue_direct_complete(mq->card->host))
+		return;
+
 	mutex_lock(&mq->complete_lock);
 
 	if (!mq->complete_req)
@@ -2275,19 +2320,43 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
 	struct request *req = mmc_queue_req_to_req(mqrq);
 	struct request_queue *q = req->q;
 	struct mmc_queue *mq = q->queuedata;
+	struct mmc_host *host = mq->card->host;
 	unsigned long flags;
-	bool waiting;
 
-	spin_lock_irqsave(q->queue_lock, flags);
-	mq->complete_req = req;
-	mq->rw_wait = false;
-	waiting = mq->waiting;
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	if (!mmc_queue_direct_complete(host)) {
+		bool waiting;
+
+		spin_lock_irqsave(q->queue_lock, flags);
+		mq->complete_req = req;
+		mq->rw_wait = false;
+		waiting = mq->waiting;
+		spin_unlock_irqrestore(q->queue_lock, flags);
+
+		if (waiting)
+			wake_up(&mq->wait);
+		else
+			kblockd_schedule_work(&mq->complete_work);
+
+		return;
+	}
 
-	if (waiting)
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_urgent_bkops_needed(mq, mqrq)) {
+		spin_lock_irqsave(q->queue_lock, flags);
+		mq->recovery_needed = true;
+		mq->recovery_req = req;
+		spin_unlock_irqrestore(q->queue_lock, flags);
 		wake_up(&mq->wait);
-	else
-		kblockd_schedule_work(&mq->complete_work);
+		schedule_work(&mq->recovery_work);
+		return;
+	}
+
+	mmc_blk_rw_reset_success(mq, req);
+
+	mq->rw_wait = false;
+	wake_up(&mq->wait);
+
+	mmc_blk_mq_post_req(mq, req);
 }
 
 static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
@@ -2297,7 +2366,12 @@ static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
 	bool done;
 
 	spin_lock_irqsave(q->queue_lock, flags);
-	done = !mq->rw_wait;
+	if (mq->recovery_needed) {
+		*err = -EBUSY;
+		done = true;
+	} else {
+		done = !mq->rw_wait;
+	}
 	mq->waiting = !done;
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
@@ -2340,10 +2414,12 @@ static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
 	if (prev_req)
 		mmc_blk_mq_post_req(mq, prev_req);
 
-	if (err) {
+	if (err)
 		mq->rw_wait = false;
+
+	/* Release re-tuning here, where no synchronization is required */
+	if (err || mmc_queue_direct_complete(host))
 		mmc_retune_release(host);
-	}
 
 out_post_req:
 	if (err)
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index f472ce5d5647..b126418fd163 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -13,6 +13,7 @@
 
 enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req);
 void mmc_blk_mq_complete(struct request *req);
+void mmc_blk_mq_recovery(struct mmc_queue *mq);
 
 struct work_struct;
 
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index d0eae15261d7..148a13d9adf0 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -165,7 +165,10 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 
 	mq->in_recovery = true;
 
-	mmc_blk_cqe_recovery(mq);
+	if (mq->use_cqe)
+		mmc_blk_cqe_recovery(mq);
+	else
+		mmc_blk_mq_recovery(mq);
 
 	mq->in_recovery = false;
 
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 1d7d3b0afff8..c4271fa54f1a 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -103,6 +103,7 @@ struct mmc_queue {
 	bool			waiting;
 	struct work_struct	recovery_work;
 	wait_queue_head_t	wait;
+	struct request		*recovery_req;
 	struct request		*complete_req;
 	struct mutex		complete_lock;
 	struct work_struct	complete_work;
@@ -134,4 +135,9 @@ static inline int mmc_cqe_qcnt(struct mmc_queue *mq)
 	       mq->in_flight[MMC_ISSUE_ASYNC];
 }
 
+static inline bool mmc_queue_direct_complete(struct mmc_host *host)
+{
+	return host->caps & MMC_CAP_DIRECT_COMPLETE;
+}
+
 #endif
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index ce2075d6f429..4b68a95a8818 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -324,6 +324,7 @@ struct mmc_host {
 #define MMC_CAP_DRIVER_TYPE_A	(1 << 23)	/* Host supports Driver Type A */
 #define MMC_CAP_DRIVER_TYPE_C	(1 << 24)	/* Host supports Driver Type C */
 #define MMC_CAP_DRIVER_TYPE_D	(1 << 25)	/* Host supports Driver Type D */
+#define MMC_CAP_DIRECT_COMPLETE	(1 << 27)	/* RW reqs can be completed within mmc_request_done() */
 #define MMC_CAP_CD_WAKE		(1 << 28)	/* Enable card detect wake */
 #define MMC_CAP_CMD_DURING_TFR	(1 << 29)	/* Commands during data transfer */
 #define MMC_CAP_CMD23		(1 << 30)	/* CMD23 supported. */
-- 
1.9.1

* [PATCH V14 18/24] mmc: block: blk-mq: Separate card polling from recovery
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (16 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 17/24] mmc: block: blk-mq: Add support for direct completion Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 19/24] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy Adrian Hunter
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Recovery is simpler to understand if it is only used for errors. Create a
separate function for card polling.

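The resulting split in mmc_blk_mq_poll_completion(), condensed from the diff
below, makes recovery strictly an error path:

	if (mmc_blk_rq_error(&mqrq->brq) ||
	    mmc_blk_card_busy(mq->card, req)) {
		mmc_blk_mq_rw_recovery(mq, req);	/* errors only */
	} else {
		mmc_blk_rw_reset_success(mq, req);	/* normal completion */
		mmc_retune_release(host);
	}
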
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index b1857eee46fd..7eb55f2dffa9 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -2162,6 +2162,24 @@ static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
 	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
 }
 
+static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	bool gen_err = false;
+	int err;
+
+	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
+		return 0;
+
+	err = card_busy_detect_err(card, true, req, &gen_err);
+
+	/* Copy the general error bit so it will be seen later on */
+	if (gen_err)
+		mqrq->brq.stop.resp[0] |= R1_ERROR;
+
+	return err;
+}
+
 static inline void mmc_blk_rw_reset_success(struct mmc_queue *mq,
 					    struct request *req)
 {
@@ -2218,8 +2236,15 @@ static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
 				       struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
 
-	mmc_blk_mq_rw_recovery(mq, req);
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_card_busy(mq->card, req)) {
+		mmc_blk_mq_rw_recovery(mq, req);
+	} else {
+		mmc_blk_rw_reset_success(mq, req);
+		mmc_retune_release(host);
+	}
 
 	mmc_blk_urgent_bkops(mq, mqrq);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH V14 19/24] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (17 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 18/24] mmc: block: blk-mq: Separate card polling from recovery Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 20/24] mmc: block: blk-mq: Stop using legacy recovery Adrian Hunter
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Check error bits and save the exception bit when polling card busy.

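The OUT_OF_RANGE handling is split out because an open-ended multi-block read
that runs up to the card's last block may legitimately set OUT_OF_RANGE in the
stop response; the bit is a reliable error indication only when CMD23 (set
block count) defined the transfer length up front. Condensed from the diff
below:

	/* OUT_OF_RANGE is trustworthy only when CMD23 (sbc) was used */
	static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
	{
		return brq->mrq.sbc ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
	}
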
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 38 ++++++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 7eb55f2dffa9..ac976c84571f 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1495,15 +1495,18 @@ static inline void mmc_apply_rel_rw(struct mmc_blk_request *brq,
 	}
 }
 
-#define CMD_ERRORS							\
-	(R1_OUT_OF_RANGE |	/* Command argument out of range */	\
-	 R1_ADDRESS_ERROR |	/* Misaligned address */		\
+#define CMD_ERRORS_EXCL_OOR						\
+	(R1_ADDRESS_ERROR |	/* Misaligned address */		\
 	 R1_BLOCK_LEN_ERROR |	/* Transferred block length incorrect */\
 	 R1_WP_VIOLATION |	/* Tried to write to protected block */	\
 	 R1_CARD_ECC_FAILED |	/* Card ECC failed */			\
 	 R1_CC_ERROR |		/* Card controller error */		\
 	 R1_ERROR)		/* General/unknown error */
 
+#define CMD_ERRORS							\
+	(CMD_ERRORS_EXCL_OOR |						\
+	 R1_OUT_OF_RANGE)	/* Command argument out of range */
+
 static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 {
 	u32 val;
@@ -2162,20 +2165,39 @@ static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
 	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
 }
 
+static inline bool mmc_blk_oor_valid(struct mmc_blk_request *brq)
+{
+	return !!brq->mrq.sbc;
+}
+
+static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
+{
+	return mmc_blk_oor_valid(brq) ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
+}
+
 static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
-	bool gen_err = false;
+	u32 status = 0;
 	int err;
 
 	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
 		return 0;
 
-	err = card_busy_detect_err(card, true, req, &gen_err);
+	err = card_busy_detect(card, true, req, &status);
+
+	/*
+	 * Do not assume data transferred correctly if there are any error bits
+	 * set.
+	 */
+	if (!err && status & mmc_blk_stop_err_bits(&mqrq->brq)) {
+		mqrq->brq.data.bytes_xfered = 0;
+		err = -EIO;
+	}
 
-	/* Copy the general error bit so it will be seen later on */
-	if (gen_err)
-		mqrq->brq.stop.resp[0] |= R1_ERROR;
+	/* Copy the exception bit so it will be seen later on */
+	if (mmc_card_mmc(card) && status & R1_EXCEPTION_EVENT)
+		mqrq->brq.cmd.resp[0] |= R1_EXCEPTION_EVENT;
 
 	return err;
 }
-- 
1.9.1

* [PATCH V14 20/24] mmc: block: blk-mq: Stop using legacy recovery
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (18 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 19/24] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 21/24] mmc: mmc_test: Do not use mmc_start_areq() anymore Adrian Hunter
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

There are only a few things the recovery needs to do. Primarily, it just
needs to:
	Determine the number of bytes transferred
	Get the card back to transfer state
	Determine whether to retry

There are also a couple of additional features:
	Reset the card before the last retry
	Read one sector at a time

The legacy code spent much effort analyzing command errors, but commands
fail fast, so it is simpler just to give all command errors the same number
of retries.

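The shape of the new recovery path, reduced to pseudocode (error handling,
SPI and locking details elided; see the diff below):

	__mmc_send_status(card, &status, 0);		/* how far did we get? */
	if (status_failed_or_error_bits_set)
		brq->data.bytes_xfered = 0;		/* don't trust the data */
	if (!mmc_blk_in_tran_state(status))
		mmc_blk_fix_state(card, req);		/* CMD12 + busy poll */
	if (brq->data.bytes_xfered)
		return;			/* partial success: requeue the rest */
	if (mqrq->retries + 1 == MMC_MAX_RETRIES)
		mmc_blk_reset(md, card->host, type);	/* reset before last retry */
	/*
	 * Command errors keep all MMC_MAX_RETRIES since commands fail fast;
	 * data errors get only MMC_DATA_RETRIES; multi-block read errors fall
	 * back to reading a single sector at a time.
	 */
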
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 289 +++++++++++++++++++++++++----------------------
 1 file changed, 153 insertions(+), 136 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index ac976c84571f..af445e405488 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1549,9 +1549,11 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 	}
 }
 
-static enum mmc_blk_status __mmc_blk_err_check(struct mmc_card *card,
-					       struct mmc_queue_req *mq_mrq)
+static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
+					     struct mmc_async_req *areq)
 {
+	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
+						    areq);
 	struct mmc_blk_request *brq = &mq_mrq->brq;
 	struct request *req = mmc_queue_req_to_req(mq_mrq);
 	int need_retune = card->host->need_retune;
@@ -1656,15 +1658,6 @@ static enum mmc_blk_status __mmc_blk_err_check(struct mmc_card *card,
 	return MMC_BLK_SUCCESS;
 }
 
-static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
-					     struct mmc_async_req *areq)
-{
-	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
-						    areq);
-
-	return __mmc_blk_err_check(card, mq_mrq);
-}
-
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 			      int disable_multi, bool *do_rel_wr_p,
 			      bool *do_data_tag_p)
@@ -1990,6 +1983,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 }
 
 #define MMC_MAX_RETRIES		5
+#define MMC_DATA_RETRIES	2
 #define MMC_NO_RETRIES		(MMC_MAX_RETRIES + 1)
 
 /* Single sector read during recovery */
@@ -2022,6 +2016,85 @@ static void mmc_blk_ss_read(struct mmc_queue *mq, struct request *req)
 	mqrq->retries = MMC_NO_RETRIES;
 }
 
+static inline bool mmc_blk_oor_valid(struct mmc_blk_request *brq)
+{
+	return !!brq->mrq.sbc;
+}
+
+static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
+{
+	return mmc_blk_oor_valid(brq) ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
+}
+
+/*
+ * Check for errors the host controller driver might not have seen such as
+ * response mode errors or invalid card state.
+ */
+static bool mmc_blk_status_error(struct request *req, u32 status)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
+	struct mmc_queue *mq = req->q->queuedata;
+	u32 stop_err_bits;
+
+	if (mmc_host_is_spi(mq->card->host))
+		return false;
+
+	stop_err_bits = mmc_blk_stop_err_bits(brq);
+
+	return brq->cmd.resp[0]  & CMD_ERRORS    ||
+	       brq->stop.resp[0] & stop_err_bits ||
+	       status            & stop_err_bits ||
+	       (rq_data_dir(req) == WRITE && !mmc_blk_in_tran_state(status));
+}
+
+static inline bool mmc_blk_cmd_started(struct mmc_blk_request *brq)
+{
+	return !brq->sbc.error && !brq->cmd.error &&
+	       !(brq->cmd.resp[0] & CMD_ERRORS);
+}
+
+static int mmc_blk_send_stop(struct mmc_card *card)
+{
+	struct mmc_command cmd = {
+		.opcode = MMC_STOP_TRANSMISSION,
+		.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_AC,
+	};
+
+	return mmc_wait_for_cmd(card->host, &cmd, 5);
+}
+
+static int mmc_blk_fix_state(struct mmc_card *card, struct request *req)
+{
+	int err;
+
+	mmc_retune_hold_now(card->host);
+
+	mmc_blk_send_stop(card);
+
+	err = card_busy_detect(card, false, req, NULL);
+
+	mmc_retune_release(card->host);
+
+	return err;
+}
+
+/*
+ * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
+ * policy:
+ * 1. A request that has transferred at least some data is considered
+ * successful and will be requeued if there is remaining data to
+ * transfer.
+ * 2. Otherwise the number of retries is incremented and the request
+ * will be requeued if there are remaining retries.
+ * 3. Otherwise the request will be errored out.
+ * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
+ * mqrq->retries. So there are only 4 possible actions here:
+ *	1. do not accept the bytes_xfered value i.e. set it to zero
+ *	2. change mqrq->retries to determine the number of retries
+ *	3. try to reset the card
+ *	4. read one sector at a time
+ */
 static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
 {
 	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
@@ -2029,131 +2102,85 @@ static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
 	struct mmc_blk_request *brq = &mqrq->brq;
 	struct mmc_blk_data *md = mq->blkdata;
 	struct mmc_card *card = mq->card;
-	static enum mmc_blk_status status;
-
-	brq->retune_retry_done = mqrq->retries;
+	u32 status;
+	u32 blocks;
+	int err;
 
-	status = __mmc_blk_err_check(card, mqrq);
+	/*
+	 * Some errors the host driver might not have seen. Set the number of
+	 * bytes transferred to zero in that case.
+	 */
+	err = __mmc_send_status(card, &status, 0);
+	if (err || mmc_blk_status_error(req, status))
+		brq->data.bytes_xfered = 0;
 
 	mmc_retune_release(card->host);
 
 	/*
-	 * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
-	 * policy:
-	 * 1. A request that has transferred at least some data is considered
-	 * successful and will be requeued if there is remaining data to
-	 * transfer.
-	 * 2. Otherwise the number of retries is incremented and the request
-	 * will be requeued if there are remaining retries.
-	 * 3. Otherwise the request will be errored out.
-	 * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
-	 * mqrq->retries. So there are only 4 possible actions here:
-	 *	1. do not accept the bytes_xfered value i.e. set it to zero
-	 *	2. change mqrq->retries to determine the number of retries
-	 *	3. try to reset the card
-	 *	4. read one sector at a time
+	 * Try again to get the status. This also provides an opportunity for
+	 * re-tuning.
 	 */
-	switch (status) {
-	case MMC_BLK_SUCCESS:
-	case MMC_BLK_PARTIAL:
-		/* Reset success, and accept bytes_xfered */
-		mmc_blk_reset_success(md, type);
-		break;
-	case MMC_BLK_CMD_ERR:
-		/*
-		 * For SD cards, get bytes written, but do not accept
-		 * bytes_xfered if that fails. For MMC cards accept
-		 * bytes_xfered. Then try to reset. If reset fails then
-		 * error out the remaining request, otherwise retry
-		 * once (N.B mmc_blk_reset() will not succeed twice in a
-		 * row).
-		 */
-		if (mmc_card_sd(card)) {
-			u32 blocks;
-			int err;
+	if (err)
+		err = __mmc_send_status(card, &status, 0);
 
-			err = mmc_sd_num_wr_blocks(card, &blocks);
-			if (err)
-				brq->data.bytes_xfered = 0;
-			else
-				brq->data.bytes_xfered = blocks << 9;
-		}
-		if (mmc_blk_reset(md, card->host, type))
-			mqrq->retries = MMC_NO_RETRIES;
-		else
-			mqrq->retries = MMC_MAX_RETRIES - 1;
-		break;
-	case MMC_BLK_RETRY:
-		/*
-		 * Do not accept bytes_xfered, but retry up to 5 times,
-		 * otherwise same as abort.
-		 */
-		brq->data.bytes_xfered = 0;
-		if (mqrq->retries < MMC_MAX_RETRIES)
-			break;
-		/* Fall through */
-	case MMC_BLK_ABORT:
-		/*
-		 * Do not accept bytes_xfered, but try to reset. If
-		 * reset succeeds, try once more, otherwise error out
-		 * the request.
-		 */
-		brq->data.bytes_xfered = 0;
-		if (mmc_blk_reset(md, card->host, type))
-			mqrq->retries = MMC_NO_RETRIES;
-		else
-			mqrq->retries = MMC_MAX_RETRIES - 1;
-		break;
-	case MMC_BLK_DATA_ERR: {
-		int err;
+	/*
+	 * If there is no card, there is nothing more to do once the number
+	 * of bytes transferred has been updated.
+	 */
+	if (err && mmc_detect_card_removed(card->host))
+		return;
 
-		/*
-		 * Do not accept bytes_xfered, but try to reset. If
-		 * reset succeeds, try once more. If reset fails with
-		 * ENODEV which means the partition is wrong, then error
-		 * out the request. Otherwise attempt to read one sector
-		 * at a time.
-		 */
-		brq->data.bytes_xfered = 0;
-		err = mmc_blk_reset(md, card->host, type);
-		if (!err) {
-			mqrq->retries = MMC_MAX_RETRIES - 1;
-			break;
-		}
-		if (err == -ENODEV) {
-			mqrq->retries = MMC_NO_RETRIES;
-			break;
-		}
-		/* Fall through */
+	/* Try to get back to "tran" state */
+	if (!mmc_host_is_spi(mq->card->host) &&
+	    (err || !mmc_blk_in_tran_state(status)))
+		err = mmc_blk_fix_state(mq->card, req);
+
+	/*
+	 * Special case for SD cards where the card might record the number of
+	 * blocks written.
+	 */
+	if (!err && mmc_blk_cmd_started(brq) && mmc_card_sd(card) &&
+	    rq_data_dir(req) == WRITE) {
+		if (mmc_sd_num_wr_blocks(card, &blocks))
+			brq->data.bytes_xfered = 0;
+		else
+			brq->data.bytes_xfered = blocks << 9;
 	}
-	case MMC_BLK_ECC_ERR:
-		/*
-		 * Do not accept bytes_xfered. If reading more than one
-		 * sector, try reading one sector at a time.
-		 */
-		brq->data.bytes_xfered = 0;
-		/* FIXME: Missing single sector read for large sector size */
-		if (brq->data.blocks > 1 && !mmc_large_sector(card)) {
-			/* Redo read one sector at a time */
-			pr_warn("%s: retrying using single block read\n",
-				req->rq_disk->disk_name);
-			mmc_blk_ss_read(mq, req);
-		} else {
-			mqrq->retries = MMC_NO_RETRIES;
-		}
-		break;
-	case MMC_BLK_NOMEDIUM:
-		/* Do not accept bytes_xfered. Error out the request */
-		brq->data.bytes_xfered = 0;
-		mqrq->retries = MMC_NO_RETRIES;
-		break;
-	default:
-		/* Do not accept bytes_xfered. Error out the request */
-		brq->data.bytes_xfered = 0;
+
+	/* Reset if the card is in a bad state */
+	if (!mmc_host_is_spi(mq->card->host) &&
+	    err && mmc_blk_reset(md, card->host, type)) {
+		pr_err("%s: recovery failed!\n", req->rq_disk->disk_name);
 		mqrq->retries = MMC_NO_RETRIES;
-		pr_err("%s: Unhandled return value (%d)",
-		       req->rq_disk->disk_name, status);
-		break;
+		return;
+	}
+
+	/*
+	 * If anything was done, just return and if there is anything remaining
+	 * on the request it will get requeued.
+	 */
+	if (brq->data.bytes_xfered)
+		return;
+
+	/* Reset before last retry */
+	if (mqrq->retries + 1 == MMC_MAX_RETRIES)
+		mmc_blk_reset(md, card->host, type);
+
+	/* Command errors fail fast, so use all MMC_MAX_RETRIES */
+	if (brq->sbc.error || brq->cmd.error)
+		return;
+
+	/* Reduce the remaining retries for data errors */
+	if (mqrq->retries < MMC_MAX_RETRIES - MMC_DATA_RETRIES) {
+		mqrq->retries = MMC_MAX_RETRIES - MMC_DATA_RETRIES;
+		return;
+	}
+
+	/* FIXME: Missing single sector read for large sector size */
+	if (rq_data_dir(req) == READ && !mmc_large_sector(card)) {
+		/* Read one sector at a time */
+		mmc_blk_ss_read(mq, req);
+		return;
 	}
 }
 
@@ -2165,16 +2192,6 @@ static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
 	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
 }
 
-static inline bool mmc_blk_oor_valid(struct mmc_blk_request *brq)
-{
-	return !!brq->mrq.sbc;
-}
-
-static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
-{
-	return mmc_blk_oor_valid(brq) ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
-}
-
 static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread
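
The retry accounting in the recovery hunk above is compact, so here is a
stand-alone sketch of the same budget as a reading aid. It is illustrative
only and not part of the patch: MMC_MAX_RETRIES is 5 as defined elsewhere in
the series, while the value of MMC_DATA_RETRIES is assumed here to be 2.

	#define EX_MAX_RETRIES	5	/* MMC_MAX_RETRIES in the series */
	#define EX_DATA_RETRIES	2	/* assumed MMC_DATA_RETRIES */

	/*
	 * Command errors fail fast, so they may consume the whole retry
	 * budget one attempt at a time.  A data error instead advances
	 * the counter, so that no matter how early it happens, at most
	 * EX_DATA_RETRIES further attempts remain.
	 */
	static void ex_account_error(int *retries, bool cmd_error)
	{
		if (cmd_error)
			return;	/* cheap to retry: keep the full budget */

		if (*retries < EX_MAX_RETRIES - EX_DATA_RETRIES)
			*retries = EX_MAX_RETRIES - EX_DATA_RETRIES;
	}

For example, a data error on the very first attempt leaves two further
attempts, rather than the five a command error could get.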

* [PATCH V14 21/24] mmc: mmc_test: Do not use mmc_start_areq() anymore
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (19 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 20/24] mmc: block: blk-mq: Stop using legacy recovery Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 22/24] mmc: core: Remove option not to use blk-mq Adrian Hunter
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

The block driver's blk-mq paths do not use mmc_start_areq(). In order to
remove mmc_start_areq() entirely, start by removing it from mmc_test.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/mmc_test.c | 122 ++++++++++++++++++++------------------------
 1 file changed, 54 insertions(+), 68 deletions(-)

diff --git a/drivers/mmc/core/mmc_test.c b/drivers/mmc/core/mmc_test.c
index 478869805b96..9311c8de2061 100644
--- a/drivers/mmc/core/mmc_test.c
+++ b/drivers/mmc/core/mmc_test.c
@@ -171,11 +171,6 @@ struct mmc_test_multiple_rw {
 	enum mmc_test_prep_media prepare;
 };
 
-struct mmc_test_async_req {
-	struct mmc_async_req areq;
-	struct mmc_test_card *test;
-};
-
 /*******************************************************************/
 /*  General helper functions                                       */
 /*******************************************************************/
@@ -741,30 +736,6 @@ static int mmc_test_check_result(struct mmc_test_card *test,
 	return ret;
 }
 
-static enum mmc_blk_status mmc_test_check_result_async(struct mmc_card *card,
-				       struct mmc_async_req *areq)
-{
-	struct mmc_test_async_req *test_async =
-		container_of(areq, struct mmc_test_async_req, areq);
-	int ret;
-
-	mmc_test_wait_busy(test_async->test);
-
-	/*
-	 * FIXME: this would earlier just casts a regular error code,
-	 * either of the kernel type -ERRORCODE or the local test framework
-	 * RESULT_* errorcode, into an enum mmc_blk_status and return as
-	 * result check. Instead, convert it to some reasonable type by just
-	 * returning either MMC_BLK_SUCCESS or MMC_BLK_CMD_ERR.
-	 * If possible, a reasonable error code should be returned.
-	 */
-	ret = mmc_test_check_result(test_async->test, areq->mrq);
-	if (ret)
-		return MMC_BLK_CMD_ERR;
-
-	return MMC_BLK_SUCCESS;
-}
-
 /*
  * Checks that a "short transfer" behaved as expected
  */
@@ -831,6 +802,45 @@ static struct mmc_test_req *mmc_test_req_alloc(void)
 	return rq;
 }
 
+static void mmc_test_wait_done(struct mmc_request *mrq)
+{
+	complete(&mrq->completion);
+}
+
+static int mmc_test_start_areq(struct mmc_test_card *test,
+			       struct mmc_request *mrq,
+			       struct mmc_request *prev_mrq)
+{
+	struct mmc_host *host = test->card->host;
+	int err = 0;
+
+	if (mrq) {
+		init_completion(&mrq->completion);
+		mrq->done = mmc_test_wait_done;
+		mmc_pre_req(host, mrq);
+	}
+
+	if (prev_mrq) {
+		wait_for_completion(&prev_mrq->completion);
+		err = mmc_test_wait_busy(test);
+		if (!err)
+			err = mmc_test_check_result(test, prev_mrq);
+	}
+
+	if (!err && mrq) {
+		err = mmc_start_request(host, mrq);
+		if (err)
+			mmc_retune_release(host);
+	}
+
+	if (prev_mrq)
+		mmc_post_req(host, prev_mrq, 0);
+
+	if (err && mrq)
+		mmc_post_req(host, mrq, err);
+
+	return err;
+}
 
 static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 				      struct scatterlist *sg, unsigned sg_len,
@@ -838,17 +848,10 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 				      unsigned blksz, int write, int count)
 {
 	struct mmc_test_req *rq1, *rq2;
-	struct mmc_test_async_req test_areq[2];
-	struct mmc_async_req *done_areq;
-	struct mmc_async_req *cur_areq = &test_areq[0].areq;
-	struct mmc_async_req *other_areq = &test_areq[1].areq;
-	enum mmc_blk_status status;
+	struct mmc_request *mrq, *prev_mrq;
 	int i;
 	int ret = RESULT_OK;
 
-	test_areq[0].test = test;
-	test_areq[1].test = test;
-
 	rq1 = mmc_test_req_alloc();
 	rq2 = mmc_test_req_alloc();
 	if (!rq1 || !rq2) {
@@ -856,33 +859,25 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 		goto err;
 	}
 
-	cur_areq->mrq = &rq1->mrq;
-	cur_areq->err_check = mmc_test_check_result_async;
-	other_areq->mrq = &rq2->mrq;
-	other_areq->err_check = mmc_test_check_result_async;
+	mrq = &rq1->mrq;
+	prev_mrq = NULL;
 
 	for (i = 0; i < count; i++) {
-		mmc_test_prepare_mrq(test, cur_areq->mrq, sg, sg_len, dev_addr,
-				     blocks, blksz, write);
-		done_areq = mmc_start_areq(test->card->host, cur_areq, &status);
-
-		if (status != MMC_BLK_SUCCESS || (!done_areq && i > 0)) {
-			ret = RESULT_FAIL;
+		mmc_test_req_reset(container_of(mrq, struct mmc_test_req, mrq));
+		mmc_test_prepare_mrq(test, mrq, sg, sg_len, dev_addr, blocks,
+				     blksz, write);
+		ret = mmc_test_start_areq(test, mrq, prev_mrq);
+		if (ret)
 			goto err;
-		}
 
-		if (done_areq)
-			mmc_test_req_reset(container_of(done_areq->mrq,
-						struct mmc_test_req, mrq));
+		if (!prev_mrq)
+			prev_mrq = &rq2->mrq;
 
-		swap(cur_areq, other_areq);
+		swap(mrq, prev_mrq);
 		dev_addr += blocks;
 	}
 
-	done_areq = mmc_start_areq(test->card->host, NULL, &status);
-	if (status != MMC_BLK_SUCCESS)
-		ret = RESULT_FAIL;
-
+	ret = mmc_test_start_areq(test, NULL, prev_mrq);
 err:
 	kfree(rq1);
 	kfree(rq2);
@@ -2356,11 +2351,9 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 	struct mmc_test_req *rq = mmc_test_req_alloc();
 	struct mmc_host *host = test->card->host;
 	struct mmc_test_area *t = &test->area;
-	struct mmc_test_async_req test_areq = { .test = test };
 	struct mmc_request *mrq;
 	unsigned long timeout;
 	bool expired = false;
-	enum mmc_blk_status blkstat = MMC_BLK_SUCCESS;
 	int ret = 0, cmd_ret;
 	u32 status = 0;
 	int count = 0;
@@ -2373,9 +2366,6 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 		mrq->sbc = &rq->sbc;
 	mrq->cap_cmd_during_tfr = true;
 
-	test_areq.areq.mrq = mrq;
-	test_areq.areq.err_check = mmc_test_check_result_async;
-
 	mmc_test_prepare_mrq(test, mrq, t->sg, t->sg_len, dev_addr, t->blocks,
 			     512, write);
 
@@ -2388,11 +2378,9 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 
 	/* Start ongoing data request */
 	if (use_areq) {
-		mmc_start_areq(host, &test_areq.areq, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS) {
-			ret = RESULT_FAIL;
+		ret = mmc_test_start_areq(test, mrq, NULL);
+		if (ret)
 			goto out_free;
-		}
 	} else {
 		mmc_wait_for_req(host, mrq);
 	}
@@ -2426,9 +2414,7 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 
 	/* Wait for data request to complete */
 	if (use_areq) {
-		mmc_start_areq(host, NULL, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS)
-			ret = RESULT_FAIL;
+		ret = mmc_test_start_areq(test, NULL, mrq);
 	} else {
 		mmc_wait_for_req_done(test->card->host, mrq);
 	}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread
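
As a reading aid for the conversion above: mmc_test_start_areq() both issues
a new request and reaps the previously issued one, so a transfer of two
requests follows the pattern below. The ex_pipeline() helper is illustrative
only and not part of the patch.

	static int ex_pipeline(struct mmc_test_card *test,
			       struct mmc_request *a, struct mmc_request *b)
	{
		int ret;

		ret = mmc_test_start_areq(test, a, NULL);	  /* issue a */
		if (!ret)
			ret = mmc_test_start_areq(test, b, a);	  /* prep+issue b, reap a */
		if (!ret)
			ret = mmc_test_start_areq(test, NULL, b); /* reap b */
		return ret;
	}

Note that mmc_pre_req() for the new request is called before waiting for the
previous one to complete, so request preparation still overlaps the transfer
in flight, as mmc_start_areq() used to arrange.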

* [PATCH V14 22/24] mmc: core: Remove option not to use blk-mq
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (20 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 21/24] mmc: mmc_test: Do not use mmc_start_areq() anymore Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 23/24] mmc: block: Remove code no longer needed after the switch to blk-mq Adrian Hunter
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Remove config option MMC_MQ_DEFAULT and parameter mmc_use_blk_mq, so that
blk-mq must be used always.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/Kconfig     | 10 ----------
 drivers/mmc/core/core.c |  7 -------
 drivers/mmc/core/core.h |  2 --
 drivers/mmc/core/host.c |  2 --
 drivers/mmc/core/host.h |  2 +-
 5 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/drivers/mmc/Kconfig b/drivers/mmc/Kconfig
index 42565562577c..ec21388311db 100644
--- a/drivers/mmc/Kconfig
+++ b/drivers/mmc/Kconfig
@@ -12,16 +12,6 @@ menuconfig MMC
 	  If you want MMC/SD/SDIO support, you should say Y here and
 	  also to your specific host controller driver.
 
-config MMC_MQ_DEFAULT
-	bool "MMC: use blk-mq I/O path by default"
-	depends on MMC && BLOCK
-	default y
-	---help---
-	  This option enables the new blk-mq based I/O path for MMC block
-	  devices by default.  With the option the mmc_core.use_blk_mq
-	  module/boot option defaults to Y, without it to N, but it can
-	  still be overridden either way.
-
 if MMC
 
 source "drivers/mmc/core/Kconfig"
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 617802f45386..7ca6e4866a8b 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -66,13 +66,6 @@
 bool use_spi_crc = 1;
 module_param(use_spi_crc, bool, 0);
 
-#ifdef CONFIG_MMC_MQ_DEFAULT
-bool mmc_use_blk_mq = true;
-#else
-bool mmc_use_blk_mq = false;
-#endif
-module_param_named(use_blk_mq, mmc_use_blk_mq, bool, S_IWUSR | S_IRUGO);
-
 static int mmc_schedule_delayed_work(struct delayed_work *work,
 				     unsigned long delay)
 {
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index aa87cd8d14c6..f564ddfbe070 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -35,8 +35,6 @@ struct mmc_bus_ops {
 	int (*reset)(struct mmc_host *);
 };
 
-extern bool mmc_use_blk_mq;
-
 void mmc_attach_bus(struct mmc_host *host, const struct mmc_bus_ops *ops);
 void mmc_detach_bus(struct mmc_host *host);
 
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 62ef6cb0ece4..35a9e4fd1a9f 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -404,8 +404,6 @@ struct mmc_host *mmc_alloc_host(int extra, struct device *dev)
 
 	host->fixed_drv_type = -EINVAL;
 
-	host->use_blk_mq = mmc_use_blk_mq;
-
 	return host;
 }
 
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index 6eaf558e62d6..a33903a2ea96 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -76,7 +76,7 @@ static inline bool mmc_card_hs400es(struct mmc_card *card)
 
 static inline bool mmc_host_use_blk_mq(struct mmc_host *host)
 {
-	return host->use_blk_mq;
+	return true;
 }
 
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread
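
Note how mmc_host_use_blk_mq() is kept as a stub that now always returns
true: every remaining caller reduces to a constant, which is what lets the
next patch delete the legacy path wholesale. The check in mmc_init_queue(),
for instance, becomes (sketch only, not from this patch):

	if (mq->use_cqe || mmc_host_use_blk_mq(host))	/* always true now */
		return mmc_mq_init(mq, card, lock);
	/* ...the legacy request_fn path below it is dead code from here on */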

* [PATCH V14 23/24] mmc: block: Remove code no longer needed after the switch to blk-mq
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (21 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 22/24] mmc: core: Remove option not to use blk-mq Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-21 13:42 ` [PATCH V14 24/24] mmc: core: " Adrian Hunter
  2017-11-28  9:42 ` [PATCH V14 00/24] mmc: Add Command Queue support Linus Walleij
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Remove code no longer needed after the switch to blk-mq.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 706 +----------------------------------------------
 drivers/mmc/core/block.h |   2 -
 drivers/mmc/core/queue.c | 240 +---------------
 drivers/mmc/core/queue.h |  15 -
 4 files changed, 13 insertions(+), 950 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index af445e405488..02dbc25d097b 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1010,239 +1010,6 @@ static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
 	return err;
 }
 
-static int card_busy_detect_err(struct mmc_card *card, bool hw_busy_detect,
-				struct request *req, bool *gen_err)
-{
-	u32 resp_errs = 0;
-	int err;
-
-	err = card_busy_detect(card, hw_busy_detect, req, &resp_errs);
-	if (resp_errs & R1_ERROR) {
-		pr_err("%s: %s: error sending status cmd, status %#x\n",
-		       req->rq_disk->disk_name, __func__, resp_errs);
-		*gen_err = true;
-	}
-
-	return err;
-}
-
-static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
-		struct request *req, bool *gen_err, u32 *stop_status)
-{
-	struct mmc_host *host = card->host;
-	struct mmc_command cmd = {};
-	int err;
-	bool use_r1b_resp = rq_data_dir(req) == WRITE;
-
-	/*
-	 * Normally we use R1B responses for WRITE, but in cases where the host
-	 * has specified a max_busy_timeout we need to validate it. A failure
-	 * means we need to prevent the host from doing hw busy detection, which
-	 * is done by converting to a R1 response instead.
-	 */
-	if (host->max_busy_timeout && (timeout_ms > host->max_busy_timeout))
-		use_r1b_resp = false;
-
-	cmd.opcode = MMC_STOP_TRANSMISSION;
-	if (use_r1b_resp) {
-		cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
-		cmd.busy_timeout = timeout_ms;
-	} else {
-		cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_AC;
-	}
-
-	err = mmc_wait_for_cmd(host, &cmd, 5);
-	if (err)
-		return err;
-
-	*stop_status = cmd.resp[0];
-
-	/* No need to check card status in case of READ. */
-	if (rq_data_dir(req) == READ)
-		return 0;
-
-	if (!mmc_host_is_spi(host) &&
-		(*stop_status & R1_ERROR)) {
-		pr_err("%s: %s: general error sending stop command, resp %#x\n",
-			req->rq_disk->disk_name, __func__, *stop_status);
-		*gen_err = true;
-	}
-
-	return card_busy_detect_err(card, use_r1b_resp, req, gen_err);
-}
-
-#define ERR_NOMEDIUM	3
-#define ERR_RETRY	2
-#define ERR_ABORT	1
-#define ERR_CONTINUE	0
-
-static int mmc_blk_cmd_error(struct request *req, const char *name, int error,
-	bool status_valid, u32 status)
-{
-	switch (error) {
-	case -EILSEQ:
-		/* response crc error, retry the r/w cmd */
-		pr_err("%s: %s sending %s command, card status %#x\n",
-			req->rq_disk->disk_name, "response CRC error",
-			name, status);
-		return ERR_RETRY;
-
-	case -ETIMEDOUT:
-		pr_err("%s: %s sending %s command, card status %#x\n",
-			req->rq_disk->disk_name, "timed out", name, status);
-
-		/* If the status cmd initially failed, retry the r/w cmd */
-		if (!status_valid) {
-			pr_err("%s: status not valid, retrying timeout\n",
-				req->rq_disk->disk_name);
-			return ERR_RETRY;
-		}
-
-		/*
-		 * If it was a r/w cmd crc error, or illegal command
-		 * (eg, issued in wrong state) then retry - we should
-		 * have corrected the state problem above.
-		 */
-		if (status & (R1_COM_CRC_ERROR | R1_ILLEGAL_COMMAND)) {
-			pr_err("%s: command error, retrying timeout\n",
-				req->rq_disk->disk_name);
-			return ERR_RETRY;
-		}
-
-		/* Otherwise abort the command */
-		return ERR_ABORT;
-
-	default:
-		/* We don't understand the error code the driver gave us */
-		pr_err("%s: unknown error %d sending read/write command, card status %#x\n",
-		       req->rq_disk->disk_name, error, status);
-		return ERR_ABORT;
-	}
-}
-
-/*
- * Initial r/w and stop cmd error recovery.
- * We don't know whether the card received the r/w cmd or not, so try to
- * restore things back to a sane state.  Essentially, we do this as follows:
- * - Obtain card status.  If the first attempt to obtain card status fails,
- *   the status word will reflect the failed status cmd, not the failed
- *   r/w cmd.  If we fail to obtain card status, it suggests we can no
- *   longer communicate with the card.
- * - Check the card state.  If the card received the cmd but there was a
- *   transient problem with the response, it might still be in a data transfer
- *   mode.  Try to send it a stop command.  If this fails, we can't recover.
- * - If the r/w cmd failed due to a response CRC error, it was probably
- *   transient, so retry the cmd.
- * - If the r/w cmd timed out, but we didn't get the r/w cmd status, retry.
- * - If the r/w cmd timed out, and the r/w cmd failed due to CRC error or
- *   illegal cmd, retry.
- * Otherwise we don't understand what happened, so abort.
- */
-static int mmc_blk_cmd_recovery(struct mmc_card *card, struct request *req,
-	struct mmc_blk_request *brq, bool *ecc_err, bool *gen_err)
-{
-	bool prev_cmd_status_valid = true;
-	u32 status, stop_status = 0;
-	int err, retry;
-
-	if (mmc_card_removed(card))
-		return ERR_NOMEDIUM;
-
-	/*
-	 * Try to get card status which indicates both the card state
-	 * and why there was no response.  If the first attempt fails,
-	 * we can't be sure the returned status is for the r/w command.
-	 */
-	for (retry = 2; retry >= 0; retry--) {
-		err = __mmc_send_status(card, &status, 0);
-		if (!err)
-			break;
-
-		/* Re-tune if needed */
-		mmc_retune_recheck(card->host);
-
-		prev_cmd_status_valid = false;
-		pr_err("%s: error %d sending status command, %sing\n",
-		       req->rq_disk->disk_name, err, retry ? "retry" : "abort");
-	}
-
-	/* We couldn't get a response from the card.  Give up. */
-	if (err) {
-		/* Check if the card is removed */
-		if (mmc_detect_card_removed(card->host))
-			return ERR_NOMEDIUM;
-		return ERR_ABORT;
-	}
-
-	/* Flag ECC errors */
-	if ((status & R1_CARD_ECC_FAILED) ||
-	    (brq->stop.resp[0] & R1_CARD_ECC_FAILED) ||
-	    (brq->cmd.resp[0] & R1_CARD_ECC_FAILED))
-		*ecc_err = true;
-
-	/* Flag General errors */
-	if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ)
-		if ((status & R1_ERROR) ||
-			(brq->stop.resp[0] & R1_ERROR)) {
-			pr_err("%s: %s: general error sending stop or status command, stop cmd response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, __func__,
-			       brq->stop.resp[0], status);
-			*gen_err = true;
-		}
-
-	/*
-	 * Check the current card state.  If it is in some data transfer
-	 * mode, tell it to stop (and hopefully transition back to TRAN.)
-	 */
-	if (R1_CURRENT_STATE(status) == R1_STATE_DATA ||
-	    R1_CURRENT_STATE(status) == R1_STATE_RCV) {
-		err = send_stop(card,
-			DIV_ROUND_UP(brq->data.timeout_ns, 1000000),
-			req, gen_err, &stop_status);
-		if (err) {
-			pr_err("%s: error %d sending stop command\n",
-			       req->rq_disk->disk_name, err);
-			/*
-			 * If the stop cmd also timed out, the card is probably
-			 * not present, so abort. Other errors are bad news too.
-			 */
-			return ERR_ABORT;
-		}
-
-		if (stop_status & R1_CARD_ECC_FAILED)
-			*ecc_err = true;
-	}
-
-	/* Check for set block count errors */
-	if (brq->sbc.error)
-		return mmc_blk_cmd_error(req, "SET_BLOCK_COUNT", brq->sbc.error,
-				prev_cmd_status_valid, status);
-
-	/* Check for r/w command errors */
-	if (brq->cmd.error)
-		return mmc_blk_cmd_error(req, "r/w cmd", brq->cmd.error,
-				prev_cmd_status_valid, status);
-
-	/* Data errors */
-	if (!brq->stop.error)
-		return ERR_CONTINUE;
-
-	/* Now for stop errors.  These aren't fatal to the transfer. */
-	pr_info("%s: error %d sending stop command, original cmd response %#x, card status %#x\n",
-	       req->rq_disk->disk_name, brq->stop.error,
-	       brq->cmd.resp[0], status);
-
-	/*
-	 * Substitute in our own stop status as this will give the error
-	 * state which happened during the execution of the r/w command.
-	 */
-	if (stop_status) {
-		brq->stop.resp[0] = stop_status;
-		brq->stop.error = 0;
-	}
-	return ERR_CONTINUE;
-}
-
 static int mmc_blk_reset(struct mmc_blk_data *md, struct mmc_host *host,
 			 int type)
 {
@@ -1277,14 +1044,6 @@ static inline void mmc_blk_reset_success(struct mmc_blk_data *md, int type)
 	md->reset_done &= ~type;
 }
 
-static void mmc_blk_end_request(struct request *req, blk_status_t error)
-{
-	if (req->mq_ctx)
-		blk_mq_end_request(req, error);
-	else
-		blk_end_request_all(req, error);
-}
-
 /*
  * The non-block commands come back from the block layer after it queued it and
  * processed it with all other requests and then they get issued in this
@@ -1346,7 +1105,7 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 		break;
 	}
 	mq_rq->drv_op_result = ret;
-	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_mq_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
@@ -1389,7 +1148,7 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
 	else
 		mmc_blk_reset_success(md, type);
 fail:
-	mmc_blk_end_request(req, status);
+	blk_mq_end_request(req, status);
 }
 
 static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
@@ -1459,7 +1218,7 @@ static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
 	if (!err)
 		mmc_blk_reset_success(md, type);
 out:
-	mmc_blk_end_request(req, status);
+	blk_mq_end_request(req, status);
 }
 
 static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
@@ -1469,7 +1228,7 @@ static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
 	int ret = 0;
 
 	ret = mmc_flush_cache(card);
-	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_mq_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 /*
@@ -1549,115 +1308,6 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 	}
 }
 
-static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
-					     struct mmc_async_req *areq)
-{
-	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
-						    areq);
-	struct mmc_blk_request *brq = &mq_mrq->brq;
-	struct request *req = mmc_queue_req_to_req(mq_mrq);
-	int need_retune = card->host->need_retune;
-	bool ecc_err = false;
-	bool gen_err = false;
-
-	/*
-	 * sbc.error indicates a problem with the set block count
-	 * command.  No data will have been transferred.
-	 *
-	 * cmd.error indicates a problem with the r/w command.  No
-	 * data will have been transferred.
-	 *
-	 * stop.error indicates a problem with the stop command.  Data
-	 * may have been transferred, or may still be transferring.
-	 */
-
-	mmc_blk_eval_resp_error(brq);
-
-	if (brq->sbc.error || brq->cmd.error ||
-	    brq->stop.error || brq->data.error) {
-		switch (mmc_blk_cmd_recovery(card, req, brq, &ecc_err, &gen_err)) {
-		case ERR_RETRY:
-			return MMC_BLK_RETRY;
-		case ERR_ABORT:
-			return MMC_BLK_ABORT;
-		case ERR_NOMEDIUM:
-			return MMC_BLK_NOMEDIUM;
-		case ERR_CONTINUE:
-			break;
-		}
-	}
-
-	/*
-	 * Check for errors relating to the execution of the
-	 * initial command - such as address errors.  No data
-	 * has been transferred.
-	 */
-	if (brq->cmd.resp[0] & CMD_ERRORS) {
-		pr_err("%s: r/w command failed, status = %#x\n",
-		       req->rq_disk->disk_name, brq->cmd.resp[0]);
-		return MMC_BLK_ABORT;
-	}
-
-	/*
-	 * Everything else is either success, or a data error of some
-	 * kind.  If it was a write, we may have transitioned to
-	 * program mode, which we have to wait for it to complete.
-	 */
-	if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
-		int err;
-
-		/* Check stop command response */
-		if (brq->stop.resp[0] & R1_ERROR) {
-			pr_err("%s: %s: general error sending stop command, stop cmd response %#x\n",
-			       req->rq_disk->disk_name, __func__,
-			       brq->stop.resp[0]);
-			gen_err = true;
-		}
-
-		err = card_busy_detect_err(card, false, req, &gen_err);
-		if (err)
-			return MMC_BLK_CMD_ERR;
-	}
-
-	/* if general error occurs, retry the write operation. */
-	if (gen_err) {
-		pr_warn("%s: retrying write for general error\n",
-				req->rq_disk->disk_name);
-		return MMC_BLK_RETRY;
-	}
-
-	/* Some errors (ECC) are flagged on the next command, so check stop, too */
-	if (brq->data.error || brq->stop.error) {
-		if (need_retune && !brq->retune_retry_done) {
-			pr_debug("%s: retrying because a re-tune was needed\n",
-				 req->rq_disk->disk_name);
-			brq->retune_retry_done = 1;
-			return MMC_BLK_RETRY;
-		}
-		pr_err("%s: error %d transferring data, sector %u, nr %u, cmd response %#x, card status %#x\n",
-		       req->rq_disk->disk_name, brq->data.error ?: brq->stop.error,
-		       (unsigned)blk_rq_pos(req),
-		       (unsigned)blk_rq_sectors(req),
-		       brq->cmd.resp[0], brq->stop.resp[0]);
-
-		if (rq_data_dir(req) == READ) {
-			if (ecc_err)
-				return MMC_BLK_ECC_ERR;
-			return MMC_BLK_DATA_ERR;
-		} else {
-			return MMC_BLK_CMD_ERR;
-		}
-	}
-
-	if (!brq->data.bytes_xfered)
-		return MMC_BLK_RETRY;
-
-	if (blk_rq_bytes(req) != brq->data.bytes_xfered)
-		return MMC_BLK_PARTIAL;
-
-	return MMC_BLK_SUCCESS;
-}
-
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 			      int disable_multi, bool *do_rel_wr_p,
 			      bool *do_data_tag_p)
@@ -1773,8 +1423,6 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		brq->data.sg_len = i;
 	}
 
-	mqrq->areq.mrq = &brq->mrq;
-
 	if (do_rel_wr_p)
 		*do_rel_wr_p = do_rel_wr;
 
@@ -1978,8 +1626,6 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 		brq->sbc.flags = MMC_RSP_R1 | MMC_CMD_AC;
 		brq->mrq.sbc = &brq->sbc;
 	}
-
-	mqrq->areq.err_check = mmc_blk_err_check;
 }
 
 #define MMC_MAX_RETRIES		5
@@ -2561,350 +2207,6 @@ enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
 	}
 }
 
-static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
-			       struct mmc_blk_request *brq, struct request *req,
-			       bool old_req_pending)
-{
-	bool req_pending;
-
-	/*
-	 * If this is an SD card and we're writing, we can first
-	 * mark the known good sectors as ok.
-	 *
-	 * If the card is not SD, we can still ok written sectors
-	 * as reported by the controller (which might be less than
-	 * the real number of written sectors, but never more).
-	 */
-	if (mmc_card_sd(card)) {
-		u32 blocks;
-		int err;
-
-		err = mmc_sd_num_wr_blocks(card, &blocks);
-		if (err)
-			req_pending = old_req_pending;
-		else
-			req_pending = blk_end_request(req, BLK_STS_OK, blocks << 9);
-	} else {
-		req_pending = blk_end_request(req, BLK_STS_OK, brq->data.bytes_xfered);
-	}
-	return req_pending;
-}
-
-static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
-				 struct request *req,
-				 struct mmc_queue_req *mqrq)
-{
-	if (mmc_card_removed(card))
-		req->rq_flags |= RQF_QUIET;
-	while (blk_end_request(req, BLK_STS_IOERR, blk_rq_cur_bytes(req)));
-	mq->qcnt--;
-}
-
-/**
- * mmc_blk_rw_try_restart() - tries to restart the current async request
- * @mq: the queue with the card and host to restart
- * @req: a new request that want to be started after the current one
- */
-static void mmc_blk_rw_try_restart(struct mmc_queue *mq, struct request *req,
-				   struct mmc_queue_req *mqrq)
-{
-	if (!req)
-		return;
-
-	/*
-	 * If the card was removed, just cancel everything and return.
-	 */
-	if (mmc_card_removed(mq->card)) {
-		req->rq_flags |= RQF_QUIET;
-		blk_end_request_all(req, BLK_STS_IOERR);
-		mq->qcnt--; /* FIXME: just set to 0? */
-		return;
-	}
-	/* Else proceed and try to restart the current async request */
-	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
-	mmc_start_areq(mq->card->host, &mqrq->areq, NULL);
-}
-
-static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
-{
-	struct mmc_blk_data *md = mq->blkdata;
-	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request *brq;
-	int disable_multi = 0, retry = 0, type, retune_retry_done = 0;
-	enum mmc_blk_status status;
-	struct mmc_queue_req *mqrq_cur = NULL;
-	struct mmc_queue_req *mq_rq;
-	struct request *old_req;
-	struct mmc_async_req *new_areq;
-	struct mmc_async_req *old_areq;
-	bool req_pending = true;
-
-	if (new_req) {
-		mqrq_cur = req_to_mmc_queue_req(new_req);
-		mq->qcnt++;
-	}
-
-	if (!mq->qcnt)
-		return;
-
-	do {
-		if (new_req) {
-			/*
-			 * When 4KB native sector is enabled, only 8 blocks
-			 * multiple read or write is allowed
-			 */
-			if (mmc_large_sector(card) &&
-				!IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
-				pr_err("%s: Transfer size is not 4KB sector size aligned\n",
-					new_req->rq_disk->disk_name);
-				mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
-				return;
-			}
-
-			mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
-			new_areq = &mqrq_cur->areq;
-		} else
-			new_areq = NULL;
-
-		old_areq = mmc_start_areq(card->host, new_areq, &status);
-		if (!old_areq) {
-			/*
-			 * We have just put the first request into the pipeline
-			 * and there is nothing more to do until it is
-			 * complete.
-			 */
-			return;
-		}
-
-		/*
-		 * An asynchronous request has been completed and we proceed
-		 * to handle the result of it.
-		 */
-		mq_rq =	container_of(old_areq, struct mmc_queue_req, areq);
-		brq = &mq_rq->brq;
-		old_req = mmc_queue_req_to_req(mq_rq);
-		type = rq_data_dir(old_req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
-
-		switch (status) {
-		case MMC_BLK_SUCCESS:
-		case MMC_BLK_PARTIAL:
-			/*
-			 * Reset success, and accept bytes_xfered. For
-			 * MMC_BLK_PARTIAL re-submit the remaining request. For
-			 * MMC_BLK_SUCCESS error out the remaining request (it
-			 * could not be re-submitted anyway if a next request
-			 * had already begun).
-			 */
-			mmc_blk_reset_success(md, type);
-
-			req_pending = blk_end_request(old_req, BLK_STS_OK,
-						      brq->data.bytes_xfered);
-			/*
-			 * If the blk_end_request function returns non-zero even
-			 * though all data has been transferred and no errors
-			 * were returned by the host controller, it's a bug.
-			 */
-			if (status == MMC_BLK_SUCCESS && req_pending) {
-				pr_err("%s BUG rq_tot %d d_xfer %d\n",
-				       __func__, blk_rq_bytes(old_req),
-				       brq->data.bytes_xfered);
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				return;
-			}
-			break;
-		case MMC_BLK_CMD_ERR:
-			/*
-			 * For SD cards, get bytes written, but do not accept
-			 * bytes_xfered if that fails. For MMC cards accept
-			 * bytes_xfered. Then try to reset. If reset fails then
-			 * error out the remaining request, otherwise retry
-			 * once (N.B mmc_blk_reset() will not succeed twice in a
-			 * row).
-			 */
-			req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
-			if (mmc_blk_reset(md, card->host, type)) {
-				if (req_pending)
-					mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				else
-					mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_RETRY:
-			/*
-			 * Do not accept bytes_xfered, but retry up to 5 times,
-			 * otherwise same as abort.
-			 */
-			retune_retry_done = brq->retune_retry_done;
-			if (retry++ < 5)
-				break;
-			/* Fall through */
-		case MMC_BLK_ABORT:
-			/*
-			 * Do not accept bytes_xfered, but try to reset. If
-			 * reset succeeds, try once more, otherwise error out
-			 * the request.
-			 */
-			if (!mmc_blk_reset(md, card->host, type))
-				break;
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		case MMC_BLK_DATA_ERR: {
-			int err;
-
-			/*
-			 * Do not accept bytes_xfered, but try to reset. If
-			 * reset succeeds, try once more. If reset fails with
-			 * ENODEV which means the partition is wrong, then error
-			 * out the request. Otherwise attempt to read one sector
-			 * at a time.
-			 */
-			err = mmc_blk_reset(md, card->host, type);
-			if (!err)
-				break;
-			if (err == -ENODEV) {
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			/* Fall through */
-		}
-		case MMC_BLK_ECC_ERR:
-			/*
-			 * Do not accept bytes_xfered. If reading more than one
-			 * sector, try reading one sector at a time.
-			 */
-			if (brq->data.blocks > 1) {
-				/* Redo read one sector at a time */
-				pr_warn("%s: retrying using single block read\n",
-					old_req->rq_disk->disk_name);
-				disable_multi = 1;
-				break;
-			}
-			/*
-			 * After an error, we redo I/O one sector at a
-			 * time, so we only reach here after trying to
-			 * read a single sector.
-			 */
-			req_pending = blk_end_request(old_req, BLK_STS_IOERR,
-						      brq->data.blksz);
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_NOMEDIUM:
-			/* Do not accept bytes_xfered. Error out the request */
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		default:
-			/* Do not accept bytes_xfered. Error out the request */
-			pr_err("%s: Unhandled return value (%d)",
-					old_req->rq_disk->disk_name, status);
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		}
-
-		if (req_pending) {
-			/*
-			 * In case of an incomplete request
-			 * prepare it again and resend.
-			 */
-			mmc_blk_rw_rq_prep(mq_rq, card,
-					disable_multi, mq);
-			mmc_start_areq(card->host,
-					&mq_rq->areq, NULL);
-			mq_rq->brq.retune_retry_done = retune_retry_done;
-		}
-	} while (req_pending);
-
-	mq->qcnt--;
-}
-
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
-{
-	int ret;
-	struct mmc_blk_data *md = mq->blkdata;
-	struct mmc_card *card = md->queue.card;
-
-	if (req && !mq->qcnt)
-		/* claim host only for the first request */
-		mmc_get_card(card, NULL);
-
-	ret = mmc_blk_part_switch(card, md->part_type);
-	if (ret) {
-		if (req) {
-			blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		goto out;
-	}
-
-	if (req) {
-		switch (req_op(req)) {
-		case REQ_OP_DRV_IN:
-		case REQ_OP_DRV_OUT:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * ioctl()s
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_drv_op(mq, req);
-			break;
-		case REQ_OP_DISCARD:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * discard.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_discard_rq(mq, req);
-			break;
-		case REQ_OP_SECURE_ERASE:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * secure erase.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_secdiscard_rq(mq, req);
-			break;
-		case REQ_OP_FLUSH:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * flush.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_flush(mq, req);
-			break;
-		default:
-			/* Normal request, just issue it */
-			mmc_blk_issue_rw_rq(mq, req);
-			card->host->context_info.is_waiting_last_req = false;
-			break;
-		}
-	} else {
-		/* No request, flushing the pipeline with NULL */
-		mmc_blk_issue_rw_rq(mq, NULL);
-		card->host->context_info.is_waiting_last_req = false;
-	}
-
-out:
-	if (!mq->qcnt)
-		mmc_put_card(card, NULL);
-}
-
 static inline int mmc_blk_readonly(struct mmc_card *card)
 {
 	return mmc_card_readonly(card) ||
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index b126418fd163..31153f656f41 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -5,8 +5,6 @@
 struct mmc_queue;
 struct request;
 
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
-
 void mmc_blk_cqe_recovery(struct mmc_queue *mq);
 
 enum mmc_issued;
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 148a13d9adf0..aa591015412a 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -24,22 +24,6 @@
 #include "card.h"
 #include "host.h"
 
-/*
- * Prepare a MMC request. This just filters out odd stuff.
- */
-static int mmc_prep_request(struct request_queue *q, struct request *req)
-{
-	struct mmc_queue *mq = q->queuedata;
-
-	if (mq && mmc_card_removed(mq->card))
-		return BLKPREP_KILL;
-
-	req->rq_flags |= RQF_DONTPREP;
-	req_to_mmc_queue_req(req)->retries = 0;
-
-	return BLKPREP_OK;
-}
-
 static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
 {
 	/* Allow only 1 DCMD at a time */
@@ -181,86 +165,6 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 	blk_mq_run_hw_queues(q, true);
 }
 
-static int mmc_queue_thread(void *d)
-{
-	struct mmc_queue *mq = d;
-	struct request_queue *q = mq->queue;
-	struct mmc_context_info *cntx = &mq->card->host->context_info;
-
-	current->flags |= PF_MEMALLOC;
-
-	down(&mq->thread_sem);
-	do {
-		struct request *req;
-
-		spin_lock_irq(q->queue_lock);
-		set_current_state(TASK_INTERRUPTIBLE);
-		req = blk_fetch_request(q);
-		mq->asleep = false;
-		cntx->is_waiting_last_req = false;
-		cntx->is_new_req = false;
-		if (!req) {
-			/*
-			 * Dispatch queue is empty so set flags for
-			 * mmc_request_fn() to wake us up.
-			 */
-			if (mq->qcnt)
-				cntx->is_waiting_last_req = true;
-			else
-				mq->asleep = true;
-		}
-		spin_unlock_irq(q->queue_lock);
-
-		if (req || mq->qcnt) {
-			set_current_state(TASK_RUNNING);
-			mmc_blk_issue_rq(mq, req);
-			cond_resched();
-		} else {
-			if (kthread_should_stop()) {
-				set_current_state(TASK_RUNNING);
-				break;
-			}
-			up(&mq->thread_sem);
-			schedule();
-			down(&mq->thread_sem);
-		}
-	} while (1);
-	up(&mq->thread_sem);
-
-	return 0;
-}
-
-/*
- * Generic MMC request handler.  This is called for any queue on a
- * particular host.  When the host is not busy, we look for a request
- * on any queue on this host, and attempt to issue it.  This may
- * not be the queue we were asked to process.
- */
-static void mmc_request_fn(struct request_queue *q)
-{
-	struct mmc_queue *mq = q->queuedata;
-	struct request *req;
-	struct mmc_context_info *cntx;
-
-	if (!mq) {
-		while ((req = blk_fetch_request(q)) != NULL) {
-			req->rq_flags |= RQF_QUIET;
-			__blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		return;
-	}
-
-	cntx = &mq->card->host->context_info;
-
-	if (cntx->is_waiting_last_req) {
-		cntx->is_new_req = true;
-		wake_up_interruptible(&cntx->wait);
-	}
-
-	if (mq->asleep)
-		wake_up_process(mq->thread);
-}
-
 static struct scatterlist *mmc_alloc_sg(int sg_len, gfp_t gfp)
 {
 	struct scatterlist *sg;
@@ -311,12 +215,6 @@ static int __mmc_init_request(struct mmc_queue *mq, struct request *req,
 	return 0;
 }
 
-static int mmc_init_request(struct request_queue *q, struct request *req,
-			    gfp_t gfp)
-{
-	return __mmc_init_request(q->queuedata, req, gfp);
-}
-
 static void mmc_exit_request(struct request_queue *q, struct request *req)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
@@ -465,9 +363,6 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 	blk_queue_max_segments(mq->queue, host->max_segs);
 	blk_queue_max_segment_size(mq->queue, host->max_seg_size);
 
-	/* Initialize thread_sem even if it is not used */
-	sema_init(&mq->thread_sem, 1);
-
 	INIT_WORK(&mq->recovery_work, mmc_mq_recovery_handler);
 	INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work);
 
@@ -550,51 +445,15 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 		   spinlock_t *lock, const char *subname)
 {
 	struct mmc_host *host = card->host;
-	int ret = -ENOMEM;
 
 	mq->card = card;
 
 	mq->use_cqe = host->cqe_enabled;
 
-	if (mq->use_cqe || mmc_host_use_blk_mq(host))
-		return mmc_mq_init(mq, card, lock);
-
-	mq->queue = blk_alloc_queue(GFP_KERNEL);
-	if (!mq->queue)
-		return -ENOMEM;
-	mq->queue->queue_lock = lock;
-	mq->queue->request_fn = mmc_request_fn;
-	mq->queue->init_rq_fn = mmc_init_request;
-	mq->queue->exit_rq_fn = mmc_exit_request;
-	mq->queue->cmd_size = sizeof(struct mmc_queue_req);
-	mq->queue->queuedata = mq;
-	mq->qcnt = 0;
-	ret = blk_init_allocated_queue(mq->queue);
-	if (ret) {
-		blk_cleanup_queue(mq->queue);
-		return ret;
-	}
-
-	blk_queue_prep_rq(mq->queue, mmc_prep_request);
-
-	mmc_setup_queue(mq, card);
-
-	mq->thread = kthread_run(mmc_queue_thread, mq, "mmcqd/%d%s",
-		host->index, subname ? subname : "");
-
-	if (IS_ERR(mq->thread)) {
-		ret = PTR_ERR(mq->thread);
-		goto cleanup_queue;
-	}
-
-	return 0;
-
-cleanup_queue:
-	blk_cleanup_queue(mq->queue);
-	return ret;
+	return mmc_mq_init(mq, card, lock);
 }
 
-static void mmc_mq_queue_suspend(struct mmc_queue *mq)
+void mmc_queue_suspend(struct mmc_queue *mq)
 {
 	blk_mq_quiesce_queue(mq->queue);
 
@@ -606,71 +465,22 @@ static void mmc_mq_queue_suspend(struct mmc_queue *mq)
 	mmc_release_host(mq->card->host);
 }
 
-static void mmc_mq_queue_resume(struct mmc_queue *mq)
+void mmc_queue_resume(struct mmc_queue *mq)
 {
 	blk_mq_unquiesce_queue(mq->queue);
 }
 
-static void __mmc_queue_suspend(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (!mq->suspended) {
-		mq->suspended |= true;
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-
-		down(&mq->thread_sem);
-	}
-}
-
-static void __mmc_queue_resume(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (mq->suspended) {
-		mq->suspended = false;
-
-		up(&mq->thread_sem);
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
-}
-
 void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
-	if (q->mq_ops) {
-		/*
-		 * The legacy code handled the possibility of being suspended,
-		 * so do that here too.
-		 */
-		if (blk_queue_quiesced(q))
-			blk_mq_unquiesce_queue(q);
-		goto out_cleanup;
-	}
-
-	/* Make sure the queue isn't suspended, as that will deadlock */
-	mmc_queue_resume(mq);
-
-	/* Then terminate our worker thread */
-	kthread_stop(mq->thread);
-
-	/* Empty the queue */
-	spin_lock_irqsave(q->queue_lock, flags);
-	q->queuedata = NULL;
-	blk_start_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	/*
+	 * The legacy code handled the possibility of being suspended,
+	 * so do that here too.
+	 */
+	if (blk_queue_quiesced(q))
+		blk_mq_unquiesce_queue(q);
 
-out_cleanup:
 	blk_cleanup_queue(q);
 
 	/*
@@ -683,38 +493,6 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	mq->card = NULL;
 }
 
-/**
- * mmc_queue_suspend - suspend a MMC request queue
- * @mq: MMC queue to suspend
- *
- * Stop the block request queue, and wait for our thread to
- * complete any outstanding requests.  This ensures that we
- * won't suspend while a request is being processed.
- */
-void mmc_queue_suspend(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-
-	if (q->mq_ops)
-		mmc_mq_queue_suspend(mq);
-	else
-		__mmc_queue_suspend(mq);
-}
-
-/**
- * mmc_queue_resume - resume a previously suspended MMC request queue
- * @mq: MMC queue to resume
- */
-void mmc_queue_resume(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-
-	if (q->mq_ops)
-		mmc_mq_queue_resume(mq);
-	else
-		__mmc_queue_resume(mq);
-}
-
 /*
 * Prepare the sg list(s) to be handed off to the host driver
  */
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index c4271fa54f1a..9214b1cdc8ab 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -34,7 +34,6 @@ static inline struct request *mmc_queue_req_to_req(struct mmc_queue_req *mqr)
 	return blk_mq_rq_from_pdu(mqr);
 }
 
-struct task_struct;
 struct mmc_blk_data;
 struct mmc_blk_ioc_data;
 
@@ -44,7 +43,6 @@ struct mmc_blk_request {
 	struct mmc_command	cmd;
 	struct mmc_command	stop;
 	struct mmc_data		data;
-	int			retune_retry_done;
 };
 
 /**
@@ -66,7 +64,6 @@ enum mmc_drv_op {
 struct mmc_queue_req {
 	struct mmc_blk_request	brq;
 	struct scatterlist	*sg;
-	struct mmc_async_req	areq;
 	enum mmc_drv_op		drv_op;
 	int			drv_op_result;
 	void			*drv_op_data;
@@ -76,22 +73,10 @@ struct mmc_queue_req {
 
 struct mmc_queue {
 	struct mmc_card		*card;
-	struct task_struct	*thread;
-	struct semaphore	thread_sem;
 	struct mmc_ctx		ctx;
 	struct blk_mq_tag_set	tag_set;
-	bool			suspended;
-	bool			asleep;
 	struct mmc_blk_data	*blkdata;
 	struct request_queue	*queue;
-	/*
-	 * FIXME: this counter is not a very reliable way of keeping
-	 * track of how many requests that are ongoing. Switch to just
-	 * letting the block core keep track of requests and per-request
-	 * associated mmc_queue_req data.
-	 */
-	int			qcnt;
-
 	int			in_flight[MMC_ISSUE_MAX];
 	unsigned int		cqe_busy;
 #define MMC_CQE_DCMD_BUSY	BIT(0)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH V14 24/24] mmc: core: Remove code no longer needed after the switch to blk-mq
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (22 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 23/24] mmc: block: Remove code no longer needed after the switch to blk-mq Adrian Hunter
@ 2017-11-21 13:42 ` Adrian Hunter
  2017-11-28  9:42 ` [PATCH V14 00/24] mmc: Add Command Queue support Linus Walleij
  24 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-21 13:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Remove code no longer needed after the switch to blk-mq.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/bus.c   |   2 -
 drivers/mmc/core/core.c  | 185 +----------------------------------------------
 drivers/mmc/core/core.h  |   8 --
 include/linux/mmc/host.h |   3 -
 4 files changed, 1 insertion(+), 197 deletions(-)

diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index 7586ff2ad1f1..fc92c6c1c9a4 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -351,8 +351,6 @@ int mmc_add_card(struct mmc_card *card)
 #ifdef CONFIG_DEBUG_FS
 	mmc_add_card_debugfs(card);
 #endif
-	mmc_init_context_info(card->host);
-
 	card->dev.of_node = mmc_of_find_child_device(card->host, 0);
 
 	device_enable_async_suspend(&card->dev);
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 7ca6e4866a8b..e5c8727c16ad 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -361,20 +361,6 @@ int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 }
 EXPORT_SYMBOL(mmc_start_request);
 
-/*
- * mmc_wait_data_done() - done callback for data request
- * @mrq: done data request
- *
- * Wakes up mmc context, passed as a callback to host controller driver
- */
-static void mmc_wait_data_done(struct mmc_request *mrq)
-{
-	struct mmc_context_info *context_info = &mrq->host->context_info;
-
-	context_info->is_done_rcv = true;
-	wake_up_interruptible(&context_info->wait);
-}
-
 static void mmc_wait_done(struct mmc_request *mrq)
 {
 	complete(&mrq->completion);
@@ -392,37 +378,6 @@ static inline void mmc_wait_ongoing_tfr_cmd(struct mmc_host *host)
 		wait_for_completion(&ongoing_mrq->cmd_completion);
 }
 
-/*
- *__mmc_start_data_req() - starts data request
- * @host: MMC host to start the request
- * @mrq: data request to start
- *
- * Sets the done callback to be called when request is completed by the card.
- * Starts data mmc request execution
- * If an ongoing transfer is already in progress, wait for the command line
- * to become available before sending another command.
- */
-static int __mmc_start_data_req(struct mmc_host *host, struct mmc_request *mrq)
-{
-	int err;
-
-	mmc_wait_ongoing_tfr_cmd(host);
-
-	mrq->done = mmc_wait_data_done;
-	mrq->host = host;
-
-	init_completion(&mrq->cmd_completion);
-
-	err = mmc_start_request(host, mrq);
-	if (err) {
-		mrq->cmd->error = err;
-		mmc_complete_cmd(mrq);
-		mmc_wait_data_done(mrq);
-	}
-
-	return err;
-}
-
 static int __mmc_start_req(struct mmc_host *host, struct mmc_request *mrq)
 {
 	int err;
@@ -650,133 +605,11 @@ int mmc_cqe_recovery(struct mmc_host *host)
  */
 bool mmc_is_req_done(struct mmc_host *host, struct mmc_request *mrq)
 {
-	if (host->areq)
-		return host->context_info.is_done_rcv;
-	else
-		return completion_done(&mrq->completion);
+	return completion_done(&mrq->completion);
 }
 EXPORT_SYMBOL(mmc_is_req_done);
 
 /**
- * mmc_finalize_areq() - finalize an asynchronous request
- * @host: MMC host to finalize any ongoing request on
- *
- * Returns the status of the ongoing asynchronous request, but
- * MMC_BLK_SUCCESS if no request was going on.
- */
-static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
-{
-	struct mmc_context_info *context_info = &host->context_info;
-	enum mmc_blk_status status;
-
-	if (!host->areq)
-		return MMC_BLK_SUCCESS;
-
-	while (1) {
-		wait_event_interruptible(context_info->wait,
-				(context_info->is_done_rcv ||
-				 context_info->is_new_req));
-
-		if (context_info->is_done_rcv) {
-			struct mmc_command *cmd;
-
-			context_info->is_done_rcv = false;
-			cmd = host->areq->mrq->cmd;
-
-			if (!cmd->error || !cmd->retries ||
-			    mmc_card_removed(host->card)) {
-				status = host->areq->err_check(host->card,
-							       host->areq);
-				break; /* return status */
-			} else {
-				mmc_retune_recheck(host);
-				pr_info("%s: req failed (CMD%u): %d, retrying...\n",
-					mmc_hostname(host),
-					cmd->opcode, cmd->error);
-				cmd->retries--;
-				cmd->error = 0;
-				__mmc_start_request(host, host->areq->mrq);
-				continue; /* wait for done/new event again */
-			}
-		}
-
-		return MMC_BLK_NEW_REQUEST;
-	}
-
-	mmc_retune_release(host);
-
-	/*
-	 * Check BKOPS urgency for each R1 response
-	 */
-	if (host->card && mmc_card_mmc(host->card) &&
-	    ((mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1) ||
-	     (mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1B)) &&
-	    (host->areq->mrq->cmd->resp[0] & R1_EXCEPTION_EVENT)) {
-		mmc_start_bkops(host->card, true);
-	}
-
-	return status;
-}
-
-/**
- *	mmc_start_areq - start an asynchronous request
- *	@host: MMC host to start command
- *	@areq: asynchronous request to start
- *	@ret_stat: out parameter for status
- *
- *	Start a new MMC custom command request for a host.
- *	If there is an ongoing async request, wait for completion
- *	of that request, then start the new one and return.
- *	Does not wait for the new request to complete.
- *
- *      Returns the completed request, NULL in case of none completed.
- *	Wait for an ongoing request (previously started) to complete and
- *	return the completed request. If there is no ongoing request, NULL
- *	is returned without waiting. NULL is not an error condition.
- */
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat)
-{
-	enum mmc_blk_status status;
-	int start_err = 0;
-	struct mmc_async_req *previous = host->areq;
-
-	/* Prepare a new request */
-	if (areq)
-		mmc_pre_req(host, areq->mrq);
-
-	/* Finalize previous request */
-	status = mmc_finalize_areq(host);
-	if (ret_stat)
-		*ret_stat = status;
-
-	/* The previous request is still going on... */
-	if (status == MMC_BLK_NEW_REQUEST)
-		return NULL;
-
-	/* Fine so far, start the new request! */
-	if (status == MMC_BLK_SUCCESS && areq)
-		start_err = __mmc_start_data_req(host, areq->mrq);
-
-	/* Postprocess the old request at this point */
-	if (host->areq)
-		mmc_post_req(host, host->areq->mrq, 0);
-
-	/* Cancel a prepared request if it was not started. */
-	if ((status != MMC_BLK_SUCCESS || start_err) && areq)
-		mmc_post_req(host, areq->mrq, -EINVAL);
-
-	if (status != MMC_BLK_SUCCESS)
-		host->areq = NULL;
-	else
-		host->areq = areq;
-
-	return previous;
-}
-EXPORT_SYMBOL(mmc_start_areq);
-
-/**
  *	mmc_wait_for_req - start a request and wait for completion
  *	@host: MMC host to start command
  *	@mrq: MMC request to start
@@ -2963,22 +2796,6 @@ void mmc_unregister_pm_notifier(struct mmc_host *host)
 }
 #endif
 
-/**
- * mmc_init_context_info() - init synchronization context
- * @host: mmc host
- *
- * Init struct context_info needed to implement asynchronous
- * request mechanism, used by mmc core, host driver and mmc requests
- * supplier.
- */
-void mmc_init_context_info(struct mmc_host *host)
-{
-	host->context_info.is_new_req = false;
-	host->context_info.is_done_rcv = false;
-	host->context_info.is_waiting_last_req = false;
-	init_waitqueue_head(&host->context_info.wait);
-}
-
 static int __init mmc_init(void)
 {
 	int ret;
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index f564ddfbe070..26f375df6ec8 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -91,8 +91,6 @@ static inline void mmc_delay(unsigned int ms)
 void mmc_add_card_debugfs(struct mmc_card *card);
 void mmc_remove_card_debugfs(struct mmc_card *card);
 
-void mmc_init_context_info(struct mmc_host *host);
-
 int mmc_execute_tuning(struct mmc_card *card);
 int mmc_hs200_to_hs400(struct mmc_card *card);
 int mmc_hs400_to_hs200(struct mmc_card *card);
@@ -110,12 +108,6 @@ static inline void mmc_unregister_pm_notifier(struct mmc_host *host) { }
 
 int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq);
 
-struct mmc_async_req;
-
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat);
-
 int mmc_erase(struct mmc_card *card, unsigned int from, unsigned int nr,
 		unsigned int arg);
 int mmc_can_erase(struct mmc_card *card);
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 4b68a95a8818..b92fc04a0bcc 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -424,9 +424,6 @@ struct mmc_host {
 
 	struct dentry		*debugfs_root;
 
-	struct mmc_async_req	*areq;		/* active async req */
-	struct mmc_context_info	context_info;	/* async synchronization info */
-
 	/* Ongoing data transfer that allows commands during transfer */
 	struct mmc_request	*ongoing_mrq;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect()
  2017-11-21 13:42 ` [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect() Adrian Hunter
@ 2017-11-21 15:39   ` Ulf Hansson
  2017-11-22  7:40     ` Adrian Hunter
  0 siblings, 1 reply; 50+ messages in thread
From: Ulf Hansson @ 2017-11-21 15:39 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
> card_busy_detect() has a 10 minute timeout. However the correct timeout is
> the data timeout. Change card_busy_detect() to use the data timeout.

Unfortunately, I don't think there is a "correct" timeout for this case.

The data->timeout_ns indicates to the host the maximum time it's
allowed to take between blocks that are written to the data lines.

I haven't found a definition of the busy timeout after the data write
has completed. The spec only mentions that the device moves to
programming state and pulls DAT0 to indicate busy.

Sure, 10 min seems crazy, perhaps something along the lines of 10-20 s
is more reasonable. What do you think?

Br
Uffe

>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  drivers/mmc/core/block.c | 48 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index e44f6d90aeb4..0874ab3e5c92 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -63,7 +63,6 @@
>  #endif
>  #define MODULE_PARAM_PREFIX "mmcblk."
>
> -#define MMC_BLK_TIMEOUT_MS  (10 * 60 * 1000)        /* 10 minute timeout */
>  #define MMC_SANITIZE_REQ_TIMEOUT 240000
>  #define MMC_EXTRACT_INDEX_FROM_ARG(x) ((x & 0x00FF0000) >> 16)
>
> @@ -921,14 +920,48 @@ static int mmc_sd_num_wr_blocks(struct mmc_card *card, u32 *written_blocks)
>         return 0;
>  }
>
> -static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
> -               bool hw_busy_detect, struct request *req, bool *gen_err)
> +static unsigned int mmc_blk_clock_khz(struct mmc_host *host)
>  {
> -       unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms);
> +       if (host->actual_clock)
> +               return host->actual_clock / 1000;
> +
> +       /* Clock may be subject to a divisor, fudge it by a factor of 2. */
> +       if (host->ios.clock)
> +               return host->ios.clock / 2000;
> +
> +       /* How can there be no clock */
> +       WARN_ON_ONCE(1);
> +       return 100; /* 100 kHz is minimum possible value */
> +}
> +
> +static unsigned long mmc_blk_data_timeout_jiffies(struct mmc_host *host,
> +                                                 struct mmc_data *data)
> +{
> +       unsigned int ms = DIV_ROUND_UP(data->timeout_ns, 1000000);
> +       unsigned int khz;
> +
> +       if (data->timeout_clks) {
> +               khz = mmc_blk_clock_khz(host);
> +               ms += DIV_ROUND_UP(data->timeout_clks, khz);
> +       }
> +
> +       return msecs_to_jiffies(ms);
> +}
> +
> +static int card_busy_detect(struct mmc_card *card, bool hw_busy_detect,
> +                           struct request *req, bool *gen_err)
> +{
> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> +       struct mmc_data *data = &mqrq->brq.data;
> +       unsigned long timeout;
>         int err = 0;
>         u32 status;
>
> +       timeout = jiffies + mmc_blk_data_timeout_jiffies(card->host, data);
> +
>         do {
> +               bool done = time_after(jiffies, timeout);
> +
>                 err = __mmc_send_status(card, &status, 5);
>                 if (err) {
>                         pr_err("%s: error %d requesting status\n",
> @@ -951,7 +984,7 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
>                  * Timeout if the device never becomes ready for data and never
>                  * leaves the program state.
>                  */
> -               if (time_after(jiffies, timeout)) {
> +               if (done) {
>                         pr_err("%s: Card stuck in programming state! %s %s\n",
>                                 mmc_hostname(card->host),
>                                 req->rq_disk->disk_name, __func__);
> @@ -1011,7 +1044,7 @@ static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
>                 *gen_err = true;
>         }
>
> -       return card_busy_detect(card, timeout_ms, use_r1b_resp, req, gen_err);
> +       return card_busy_detect(card, use_r1b_resp, req, gen_err);
>  }
>
>  #define ERR_NOMEDIUM   3
> @@ -1546,8 +1579,7 @@ static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
>                         gen_err = true;
>                 }
>
> -               err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req,
> -                                       &gen_err);
> +               err = card_busy_detect(card, false, req, &gen_err);
>                 if (err)
>                         return MMC_BLK_CMD_ERR;
>         }
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect()
  2017-11-21 15:39   ` Ulf Hansson
@ 2017-11-22  7:40     ` Adrian Hunter
  2017-11-22 14:43       ` Ulf Hansson
  0 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-22  7:40 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 21/11/17 17:39, Ulf Hansson wrote:
> On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> card_busy_detect() has a 10 minute timeout. However the correct timeout is
>> the data timeout. Change card_busy_detect() to use the data timeout.
> 
> Unfortunately, I don't think there is a "correct" timeout for this case.
> 
> The data->timeout_ns indicates to the host the maximum time it's
> allowed to take between blocks that are written to the data lines.
> 
> I haven't found a definition of the busy timeout after the data write
> has completed. The spec only mentions that the device moves to
> programming state and pulls DAT0 to indicate busy.

To me it reads more like the timeout is for each block, including the last,
i.e. the same timeout for "busy".  Note the card is also busy between blocks.

Equally it is the timeout we give the host controller.  So either the host
controller does not have a timeout for "busy" - which begs the question why
it has a timeout at all - or it invents its own "busy" timeout - which begs
the question why it isn't in the spec.
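
For illustration, here is what the patch's mmc_blk_data_timeout_jiffies()
works out to for some made-up card values (not taken from any particular
card):

  /*
   * Assume data->timeout_ns   = 300000000 (300 ms),
   *        data->timeout_clks = 50000,
   *        host->actual_clock = 50000000 (50 MHz).
   *
   * ms  = DIV_ROUND_UP(300000000, 1000000) = 300
   * khz = 50000000 / 1000                  = 50000
   * ms += DIV_ROUND_UP(50000, 50000)       = 301
   *
   * i.e. card_busy_detect() would poll for ~301 ms instead of 10 minutes.
   */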

> 
> Sure, 10 min seems crazy, perhaps something along the lines of 10-20 s
> is more reasonable. What do you think?

We give SD cards a generous 3 seconds for writes.  SDHCI has long had a 10
second software timer for the whole request, which strongly suggests that
requests have always completed within 10 seconds.  So that puts the range of
an arbitrary timeout at 3-10 s.
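
If we do pick an arbitrary number, it could simply bound the data timeout,
something like this (a sketch only; the helper name is made up and not part
of this patch):

  static unsigned long mmc_blk_busy_timeout_jiffies(struct mmc_host *host,
                                                    struct mmc_data *data)
  {
          unsigned long t = mmc_blk_data_timeout_jiffies(host, data);

          /* Keep the busy poll within the 3-10 s range discussed above */
          return clamp(t, msecs_to_jiffies(3 * 1000),
                       msecs_to_jiffies(10 * 1000));
  }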

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect()
  2017-11-22  7:40     ` Adrian Hunter
@ 2017-11-22 14:43       ` Ulf Hansson
  2017-11-23 11:37         ` Adrian Hunter
  0 siblings, 1 reply; 50+ messages in thread
From: Ulf Hansson @ 2017-11-22 14:43 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 22 November 2017 at 08:40, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 21/11/17 17:39, Ulf Hansson wrote:
>> On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> card_busy_detect() has a 10 minute timeout. However the correct timeout is
>>> the data timeout. Change card_busy_detect() to use the data timeout.
>>
>> Unfortunately, I don't think there is a "correct" timeout for this case.
>>
>> The data->timeout_ns indicates to the host the maximum time it's
>> allowed to take between blocks that are written to the data lines.
>>
>> I haven't found a definition of the busy timeout after the data write
>> has completed. The spec only mentions that the device moves to
>> programming state and pulls DAT0 to indicate busy.
>
>> To me it reads more like the timeout is for each block, including the last,
>> i.e. the same timeout for "busy".  Note the card is also busy between blocks.

I don't think that is the same timeout. Or maybe it is.

In the eMMC 5.1 spec, there are mentions of "buffer busy signal"
and "programming busy signal"; see section 6.15.3 (Timings - Data
Write).

Anyway, whether either of them is specified is unclear to me.

>
> Equally it is the timeout we give the host controller.  So either the host
> controller does not have a timeout for "busy" - which begs the question why
> it has a timeout at all - or it invents its own "busy" timeout - which begs
> the question why it isn't in the spec.

Well, there are some vague hints in section 6.8.2 (Time-out
conditions), but I don't find these timeout values being referred to
in 6.15 (Timings). And that puzzles me.

Moreover, the below is quoted from section 6.6.8.1 (Block write):
------
Some Devices may require long and unpredictable times to write a block
of data. After receiving a block of data and completing the CRC check,
the Device will begin writing and hold the DAT0 line low. The host may
poll the status of the Device with a SEND_STATUS command (CMD13) at
any time, and the Device will respond with its status (except in Sleep
state). The status bit READY_FOR_DATA indicates whether the Device can
accept new data or not. The host may deselect the Device by issuing
CMD7 that will then displace the Device into the Disconnect State and
release the DAT0 line without interrupting the write operation. When
reselecting the Device, it will reactivate busy indication by pulling
DAT0 to low. See 6.15 for details of busy indication.
------

>
>>
>> Sure, 10 min seems crazy, perhaps something along the lines of 10-20 s
>> is more reasonable. What do you think?
>
>> We give SD cards a generous 3 seconds for writes.  SDHCI has long had a 10
>> second software timer for the whole request, which strongly suggests that
>> requests have always completed within 10 seconds.  So that puts the range of
>> an arbitrary timeout at 3-10 s.

From the reasoning above, I guess we could try out 10 s. That is at
least a lot better than 10 minutes.

I also see that we have three different places (switch, erase,
block data transfers) implementing busy signal detection. It would be
nice to try to align those pieces of code, because they are quite
similar. Of course, this deserves its own separate task to try to fix
up.
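
I am thinking of a single helper where the caller-specific policy is
passed in, roughly like this (names invented, only a sketch):

  /*
   * Hypothetical unified busy-wait helper: poll CMD13 (or the host's
   * ->card_busy() when available) until busy_cb() reports ready or
   * timeout_ms expires.
   */
  int mmc_poll_for_busy(struct mmc_card *card, unsigned int timeout_ms,
                        bool (*busy_cb)(void *cb_ctx, u32 status),
                        void *cb_ctx);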

BTW, perhaps we should move this to a separate change on top of the
series? Or is there a specific need for this in regard to blk-mq and
CQE?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 01/24] mmc: block: Fix missing blk_put_request()
  2017-11-21 13:42 ` [PATCH V14 01/24] mmc: block: Fix missing blk_put_request() Adrian Hunter
@ 2017-11-23  9:41   ` Linus Walleij
  2017-11-23 18:12   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Linus Walleij @ 2017-11-23  9:41 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Ensure blk_get_request() is paired with blk_put_request().
>
> Fixes: 0493f6fe5bde ("mmc: block: Move boot partition locking into a driver op")
> Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Ah, me abusing the APIs, sorry for sloppiness.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Ulf I think this should be queued for fixes.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 02/24] mmc: block: Check return value of blk_get_request()
  2017-11-21 13:42 ` [PATCH V14 02/24] mmc: block: Check return value of blk_get_request() Adrian Hunter
@ 2017-11-23  9:56   ` Linus Walleij
  2017-11-23 18:12   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Linus Walleij @ 2017-11-23  9:56 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> blk_get_request() can fail, always check the return value.
>
> Fixes: 0493f6fe5bde ("mmc: block: Move boot partition locking into a driver op")
> Fixes: 3ecd8cf23f88 ("mmc: block: move multi-ioctl() to use block layer")
> Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
> Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
This should also go into fixes I think.

I guess this particular sloppiness was due to modeling the code on
drivers/ide/* which doesn't check the return value either.

I should go and fix that :/

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state
  2017-11-21 13:42 ` [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state Adrian Hunter
@ 2017-11-23  9:58   ` Linus Walleij
  2017-11-23 18:12   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Linus Walleij @ 2017-11-23  9:58 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> The block driver must be resumed if the mmc bus fails to suspend the card.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Also looks like a clear candidate for fixes.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect()
  2017-11-22 14:43       ` Ulf Hansson
@ 2017-11-23 11:37         ` Adrian Hunter
  0 siblings, 0 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-23 11:37 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 22/11/17 16:43, Ulf Hansson wrote:
> On 22 November 2017 at 08:40, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 21/11/17 17:39, Ulf Hansson wrote:
>>> On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>> card_busy_detect() has a 10 minute timeout. However the correct timeout is
>>>> the data timeout. Change card_busy_detect() to use the data timeout.
>>>
>>> Unfortunately, I don't think there is a "correct" timeout for this case.
>>>
>>> The data->timeout_ns indicates to the host the maximum time it's
>>> allowed to take between blocks that are written to the data lines.
>>>
>>> I haven't found a definition of the busy timeout after the data write
>>> has completed. The spec only mentions that the device moves to
>>> programming state and pulls DAT0 to indicate busy.
>>
>> To me it reads more like the timeout is for each block, including the last,
>> i.e. the same timeout for "busy".  Note the card is also busy between blocks.
> 
> I don't think that is the same timeout. Or maybe it is.
> 
> In the eMMC 5.1 spec, there are mentions of "buffer busy signal"
> and "programming busy signal"; see section 6.15.3 (Timings - Data
> Write).
> 
> Anyway, whether either of them is specified is unclear to me.
> 
>>
>> Equally it is the timeout we give the host controller.  So either the host
>> controller does not have a timeout for "busy" - which begs the question why
>> it has a timeout at all - or it invents its own "busy" timeout - which begs
>> the question why it isn't in the spec.
> 
> Well, there are some vague hints in section 6.8.2 (Time-out
> conditions), but I don't find these timeout values being referred to
> in 6.15 (Timings). And that puzzles me.
> 
> Moreover, the below is quoted from section 6.6.8.1 (Block write):
> ------
> Some Devices may require long and unpredictable times to write a block
> of data. After receiving a block of data and completing the CRC check,
> the Device will begin writing and hold the DAT0 line low. The host may
> poll the status of the Device with a SEND_STATUS command (CMD13) at
> any time, and the Device will respond with its status (except in Sleep
> state). The status bit READY_FOR_DATA indicates whether the Device can
> accept new data or not. The host may deselect the Device by issuing
> CMD7 that will then displace the Device into the Disconnect State and
> release the DAT0 line without interrupting the write operation. When
> reselecting the Device, it will reactivate busy indication by pulling
> DAT0 to low. See 6.15 for details of busy indication.
> ------
> 
>>
>>>
>>> Sure, 10 min seems crazy, perhaps something along the lines of 10-20 s
>>> is more reasonable. What do you think?
>>
>> We give SD cards a generous 3 seconds for writes.  SDHCI has long had a 10
>> second software timer for the whole request, which strongly suggests that
>> requests have always completed within 10 seconds.  So that puts the range of
>> an arbitrary timeout at 3-10 s.
> 
> From the reasoning above, I guess we could try out 10 s. That is at
> least a lot better than 10 minutes.
> 
> I also see that we have three different places (switch, erase,
> block data transfers) implementing busy signal detection. It would be
> nice to try to align those pieces of code, because they are quite
> similar. Of course, this deserves its own separate task to try to fix
> up.
> 
> BTW, perhaps we should move this to a separate change on top of the
> series? Or is there a specific need for this in regard to blk-mq and
> CQE?

It is related to the recovery changes, so it can be moved later in the patch set.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed
  2017-11-21 13:42 ` [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed Adrian Hunter
@ 2017-11-23 13:22   ` Linus Walleij
  2017-11-23 18:13   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Linus Walleij @ 2017-11-23 13:22 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> The card is not necessarily being removed, but the debugfs files must be
> removed when the driver is removed, otherwise they will continue to exist
> after unbinding the card from the driver. e.g.
>
>   # echo "mmc1:0001" > /sys/bus/mmc/drivers/mmcblk/unbind
>   # cat /sys/kernel/debug/mmc1/mmc1\:0001/ext_csd
>   [  173.634584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
>   [  173.643356] IP: mmc_ext_csd_open+0x5e/0x170
>
> A complication is that the debugfs_root may have already been removed, so
> check for that too.
>
> Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

I was assuming debugfs files would always be removed from debugfs.c
using debugfs_remove_recursive(card->debugfs_root), but that
doesn't work for this case where we bind/unbind the block layer
interactively, sorry for missing it :(

Ulf: I think this can go in as an *early* fix as well, say after -rc1.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 05/24] mmc: block: No need to export mmc_cleanup_queue()
  2017-11-21 13:42 ` [PATCH V14 05/24] mmc: block: No need to export mmc_cleanup_queue() Adrian Hunter
@ 2017-11-23 13:23   ` Linus Walleij
  0 siblings, 0 replies; 50+ messages in thread
From: Linus Walleij @ 2017-11-23 13:23 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> mmc_cleanup_queue() is not used by a different module. Do not export it.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 06/24] mmc: block: Simplify cleaning up the queue
  2017-11-21 13:42 ` [PATCH V14 06/24] mmc: block: Simplify cleaning up the queue Adrian Hunter
@ 2017-11-23 13:27   ` Linus Walleij
  0 siblings, 0 replies; 50+ messages in thread
From: Linus Walleij @ 2017-11-23 13:27 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Use blk_cleanup_queue() to shutdown the queue when the driver is removed,
> and instead get an extra reference to the queue to prevent the queue being
> freed before the final mmc_blk_put().
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

This is way more elegant.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 01/24] mmc: block: Fix missing blk_put_request()
  2017-11-21 13:42 ` [PATCH V14 01/24] mmc: block: Fix missing blk_put_request() Adrian Hunter
  2017-11-23  9:41   ` Linus Walleij
@ 2017-11-23 18:12   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-23 18:12 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
> Ensure blk_get_request() is paired with blk_put_request().
>
> Fixes: 0493f6fe5bde ("mmc: block: Move boot partition locking into a driver op")
> Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for fixes and added a stable tag!

Kind regards
Uffe

> ---
>  drivers/mmc/core/block.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index ea80ff4cd7f9..f60939858586 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -236,6 +236,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
>         req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
>         blk_execute_rq(mq->queue, NULL, req, 0);
>         ret = req_to_mmc_queue_req(req)->drv_op_result;
> +       blk_put_request(req);
>
>         if (!ret) {
>                 pr_info("%s: Locking boot partition ro until next power on\n",
> @@ -2557,6 +2558,7 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
>                 *val = ret;
>                 ret = 0;
>         }
> +       blk_put_request(req);
>
>         return ret;
>  }
> @@ -2587,6 +2589,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
>         req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
>         blk_execute_rq(mq->queue, NULL, req, 0);
>         err = req_to_mmc_queue_req(req)->drv_op_result;
> +       blk_put_request(req);
>         if (err) {
>                 pr_err("FAILED %d\n", err);
>                 goto out_free;
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 02/24] mmc: block: Check return value of blk_get_request()
  2017-11-21 13:42 ` [PATCH V14 02/24] mmc: block: Check return value of blk_get_request() Adrian Hunter
  2017-11-23  9:56   ` Linus Walleij
@ 2017-11-23 18:12   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-23 18:12 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
> blk_get_request() can fail, always check the return value.
>
> Fixes: 0493f6fe5bde ("mmc: block: Move boot partition locking into a driver op")
> Fixes: 3ecd8cf23f88 ("mmc: block: move multi-ioctl() to use block layer")
> Fixes: 614f0388f580 ("mmc: block: move single ioctl() commands to block requests")
> Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for fixes and added a stable tag!

Kind regards
Uffe

> ---
>  drivers/mmc/core/block.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index f60939858586..4a319ddbd956 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -233,6 +233,10 @@ static ssize_t power_ro_lock_store(struct device *dev,
>
>         /* Dispatch locking to the block layer */
>         req = blk_get_request(mq->queue, REQ_OP_DRV_OUT, __GFP_RECLAIM);
> +       if (IS_ERR(req)) {
> +               count = PTR_ERR(req);
> +               goto out_put;
> +       }
>         req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_BOOT_WP;
>         blk_execute_rq(mq->queue, NULL, req, 0);
>         ret = req_to_mmc_queue_req(req)->drv_op_result;
> @@ -249,7 +253,7 @@ static ssize_t power_ro_lock_store(struct device *dev,
>                                 set_disk_ro(part_md->disk, 1);
>                         }
>         }
> -
> +out_put:
>         mmc_blk_put(md);
>         return count;
>  }
> @@ -625,6 +629,10 @@ static int mmc_blk_ioctl_cmd(struct mmc_blk_data *md,
>         req = blk_get_request(mq->queue,
>                 idata->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
>                 __GFP_RECLAIM);
> +       if (IS_ERR(req)) {
> +               err = PTR_ERR(req);
> +               goto cmd_done;
> +       }
>         idatas[0] = idata;
>         req_to_mmc_queue_req(req)->drv_op =
>                 rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
> @@ -692,6 +700,10 @@ static int mmc_blk_ioctl_multi_cmd(struct mmc_blk_data *md,
>         req = blk_get_request(mq->queue,
>                 idata[0]->ic.write_flag ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN,
>                 __GFP_RECLAIM);
> +       if (IS_ERR(req)) {
> +               err = PTR_ERR(req);
> +               goto cmd_err;
> +       }
>         req_to_mmc_queue_req(req)->drv_op =
>                 rpmb ? MMC_DRV_OP_IOCTL_RPMB : MMC_DRV_OP_IOCTL;
>         req_to_mmc_queue_req(req)->drv_op_data = idata;
> @@ -2551,6 +2563,8 @@ static int mmc_dbg_card_status_get(void *data, u64 *val)
>
>         /* Ask the block layer about the card status */
>         req = blk_get_request(mq->queue, REQ_OP_DRV_IN, __GFP_RECLAIM);
> +       if (IS_ERR(req))
> +               return PTR_ERR(req);
>         req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_CARD_STATUS;
>         blk_execute_rq(mq->queue, NULL, req, 0);
>         ret = req_to_mmc_queue_req(req)->drv_op_result;
> @@ -2585,6 +2599,10 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
>
>         /* Ask the block layer for the EXT CSD */
>         req = blk_get_request(mq->queue, REQ_OP_DRV_IN, __GFP_RECLAIM);
> +       if (IS_ERR(req)) {
> +               err = PTR_ERR(req);
> +               goto out_free;
> +       }
>         req_to_mmc_queue_req(req)->drv_op = MMC_DRV_OP_GET_EXT_CSD;
>         req_to_mmc_queue_req(req)->drv_op_data = &ext_csd;
>         blk_execute_rq(mq->queue, NULL, req, 0);
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state
  2017-11-21 13:42 ` [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state Adrian Hunter
  2017-11-23  9:58   ` Linus Walleij
@ 2017-11-23 18:12   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-23 18:12 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
> The block driver must be resumed if the mmc bus fails to suspend the card.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for fixes and added a stable tag (I think v3.19+ is
the first one we can pick, else some other manual backporting is
needed).

Kind regards
Uffe

> ---
>  drivers/mmc/core/bus.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
> index a4b49e25fe96..7586ff2ad1f1 100644
> --- a/drivers/mmc/core/bus.c
> +++ b/drivers/mmc/core/bus.c
> @@ -157,6 +157,9 @@ static int mmc_bus_suspend(struct device *dev)
>                 return ret;
>
>         ret = host->bus_ops->suspend(host);
> +       if (ret)
> +               pm_generic_resume(dev);
> +
>         return ret;
>  }
>
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed
  2017-11-21 13:42 ` [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed Adrian Hunter
  2017-11-23 13:22   ` Linus Walleij
@ 2017-11-23 18:13   ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-23 18:13 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
> The card is not necessarily being removed, but the debugfs files must be
> removed when the driver is removed, otherwise they will continue to exist
> after unbinding the card from the driver. e.g.
>
>   # echo "mmc1:0001" > /sys/bus/mmc/drivers/mmcblk/unbind
>   # cat /sys/kernel/debug/mmc1/mmc1\:0001/ext_csd
>   [  173.634584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
>   [  173.643356] IP: mmc_ext_csd_open+0x5e/0x170
>
> A complication is that the debugfs_root may have already been removed, so
> check for that too.
>
> Fixes: 627c3ccfb46a ("mmc: debugfs: Move block debugfs into block module")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Thanks, applied for fixes and added a stable tag!

Kind regards
Uffe

> ---
>  drivers/mmc/core/block.c   | 44 +++++++++++++++++++++++++++++++++++++-------
>  drivers/mmc/core/debugfs.c |  1 +
>  2 files changed, 38 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index 4a319ddbd956..ccfa98af1dd3 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -122,6 +122,10 @@ struct mmc_blk_data {
>         struct device_attribute force_ro;
>         struct device_attribute power_ro_lock;
>         int     area_type;
> +
> +       /* debugfs files (only in main mmc_blk_data) */
> +       struct dentry *status_dentry;
> +       struct dentry *ext_csd_dentry;
>  };
>
>  /* Device type for RPMB character devices */
> @@ -2653,7 +2657,7 @@ static int mmc_ext_csd_release(struct inode *inode, struct file *file)
>         .llseek         = default_llseek,
>  };
>
> -static int mmc_blk_add_debugfs(struct mmc_card *card)
> +static int mmc_blk_add_debugfs(struct mmc_card *card, struct mmc_blk_data *md)
>  {
>         struct dentry *root;
>
> @@ -2663,28 +2667,53 @@ static int mmc_blk_add_debugfs(struct mmc_card *card)
>         root = card->debugfs_root;
>
>         if (mmc_card_mmc(card) || mmc_card_sd(card)) {
> -               if (!debugfs_create_file("status", S_IRUSR, root, card,
> -                                        &mmc_dbg_card_status_fops))
> +               md->status_dentry =
> +                       debugfs_create_file("status", S_IRUSR, root, card,
> +                                           &mmc_dbg_card_status_fops);
> +               if (!md->status_dentry)
>                         return -EIO;
>         }
>
>         if (mmc_card_mmc(card)) {
> -               if (!debugfs_create_file("ext_csd", S_IRUSR, root, card,
> -                                        &mmc_dbg_ext_csd_fops))
> +               md->ext_csd_dentry =
> +                       debugfs_create_file("ext_csd", S_IRUSR, root, card,
> +                                           &mmc_dbg_ext_csd_fops);
> +               if (!md->ext_csd_dentry)
>                         return -EIO;
>         }
>
>         return 0;
>  }
>
> +static void mmc_blk_remove_debugfs(struct mmc_card *card,
> +                                  struct mmc_blk_data *md)
> +{
> +       if (!card->debugfs_root)
> +               return;
> +
> +       if (!IS_ERR_OR_NULL(md->status_dentry)) {
> +               debugfs_remove(md->status_dentry);
> +               md->status_dentry = NULL;
> +       }
> +
> +       if (!IS_ERR_OR_NULL(md->ext_csd_dentry)) {
> +               debugfs_remove(md->ext_csd_dentry);
> +               md->ext_csd_dentry = NULL;
> +       }
> +}
>
>  #else
>
> -static int mmc_blk_add_debugfs(struct mmc_card *card)
> +static int mmc_blk_add_debugfs(struct mmc_card *card, struct mmc_blk_data *md)
>  {
>         return 0;
>  }
>
> +static void mmc_blk_remove_debugfs(struct mmc_card *card,
> +                                  struct mmc_blk_data *md)
> +{
> +}
> +
>  #endif /* CONFIG_DEBUG_FS */
>
>  static int mmc_blk_probe(struct mmc_card *card)
> @@ -2724,7 +2753,7 @@ static int mmc_blk_probe(struct mmc_card *card)
>         }
>
>         /* Add two debugfs entries */
> -       mmc_blk_add_debugfs(card);
> +       mmc_blk_add_debugfs(card, md);
>
>         pm_runtime_set_autosuspend_delay(&card->dev, 3000);
>         pm_runtime_use_autosuspend(&card->dev);
> @@ -2750,6 +2779,7 @@ static void mmc_blk_remove(struct mmc_card *card)
>  {
>         struct mmc_blk_data *md = dev_get_drvdata(&card->dev);
>
> +       mmc_blk_remove_debugfs(card, md);
>         mmc_blk_remove_parts(card, md);
>         pm_runtime_get_sync(&card->dev);
>         mmc_claim_host(card->host);
> diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
> index 01e459a34f33..0f4a7d7b2626 100644
> --- a/drivers/mmc/core/debugfs.c
> +++ b/drivers/mmc/core/debugfs.c
> @@ -314,4 +314,5 @@ void mmc_add_card_debugfs(struct mmc_card *card)
>  void mmc_remove_card_debugfs(struct mmc_card *card)
>  {
>         debugfs_remove_recursive(card->debugfs_root);
> +       card->debugfs_root = NULL;
>  }
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 13/24] mmc: block: Add blk-mq support
  2017-11-21 13:42 ` [PATCH V14 13/24] mmc: block: Add blk-mq support Adrian Hunter
@ 2017-11-24 10:12   ` Ulf Hansson
  2017-11-27 10:20     ` Adrian Hunter
  0 siblings, 1 reply; 50+ messages in thread
From: Ulf Hansson @ 2017-11-24 10:12 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

[...]

> +/* Single sector read during recovery */
> +static void mmc_blk_ss_read(struct mmc_queue *mq, struct request *req)

Nitpick: I think mmc_blk_read_single() would be better as it is a
clearer name. Would you mind changing it?

> +{
> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> +       blk_status_t status;
> +
> +       while (1) {
> +               mmc_blk_rw_rq_prep(mqrq, mq->card, 1, mq);
> +
> +               mmc_wait_for_req(mq->card->host, &mqrq->brq.mrq);
> +
> +               /*
> +                * Not expecting command errors, so just give up in that case.
> +                * If there are retries remaining, the request will get
> +                * requeued.
> +                */
> +               if (mqrq->brq.cmd.error)
> +                       return;

What happens here if the reason for the error is that the card was removed?

I guess next time __blk_err_check() is called from the
mmc_blk_mq_rw_recovery(), this will be detected and managed?

> +
> +               if (blk_rq_bytes(req) <= 512)

Shouldn't you check "if (blk_rq_bytes(req) <  512)"? How would you
otherwise read the last 512-byte block?

> +                       break;
> +
> +               status = mqrq->brq.data.error ? BLK_STS_IOERR : BLK_STS_OK;
> +
> +               blk_update_request(req, status, 512);

Shouldn't we actually bail out, unless the error is a data ECC error?
On the other hand, I guess if it is a more severe error, cmd.error will
anyway be set above!?

One more question: if there is a data error, we may want to try to
recover by sending a stop command? How do we manage that?
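
E.g., something along these lines after a failed transfer (only a sketch;
flags as for a normal R1B stop command):

  struct mmc_command stop = {
          .opcode = MMC_STOP_TRANSMISSION,
          .flags  = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC,
  };

  /* Ask the card to leave the data/rcv state before retrying */
  mmc_wait_for_cmd(mq->card->host, &stop, 0);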

> +       }
> +
> +       mqrq->retries = MMC_NO_RETRIES;
> +}
> +
> +static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
> +{
> +       int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> +       struct mmc_blk_request *brq = &mqrq->brq;
> +       struct mmc_blk_data *md = mq->blkdata;
> +       struct mmc_card *card = mq->card;
> +       static enum mmc_blk_status status;
> +
> +       brq->retune_retry_done = mqrq->retries;
> +
> +       status = __mmc_blk_err_check(card, mqrq);
> +
> +       mmc_retune_release(card->host);
> +
> +       /*
> +        * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
> +        * policy:
> +        * 1. A request that has transferred at least some data is considered
> +        * successful and will be requeued if there is remaining data to
> +        * transfer.
> +        * 2. Otherwise the number of retries is incremented and the request
> +        * will be requeued if there are remaining retries.
> +        * 3. Otherwise the request will be errored out.
> +        * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
> +        * mqrq->retries. So there are only 4 possible actions here:
> +        *      1. do not accept the bytes_xfered value i.e. set it to zero
> +        *      2. change mqrq->retries to determine the number of retries
> +        *      3. try to reset the card
> +        *      4. read one sector at a time
> +        */
> +       switch (status) {
> +       case MMC_BLK_SUCCESS:
> +       case MMC_BLK_PARTIAL:
> +               /* Reset success, and accept bytes_xfered */
> +               mmc_blk_reset_success(md, type);
> +               break;
> +       case MMC_BLK_CMD_ERR:
> +               /*
> +                * For SD cards, get bytes written, but do not accept
> +                * bytes_xfered if that fails. For MMC cards accept
> +                * bytes_xfered. Then try to reset. If reset fails then
> +                * error out the remaining request, otherwise retry
> +                * once (N.B mmc_blk_reset() will not succeed twice in a
> +                * row).
> +                */
> +               if (mmc_card_sd(card)) {
> +                       u32 blocks;
> +                       int err;
> +
> +                       err = mmc_sd_num_wr_blocks(card, &blocks);
> +                       if (err)
> +                               brq->data.bytes_xfered = 0;
> +                       else
> +                               brq->data.bytes_xfered = blocks << 9;
> +               }
> +               if (mmc_blk_reset(md, card->host, type))
> +                       mqrq->retries = MMC_NO_RETRIES;
> +               else
> +                       mqrq->retries = MMC_MAX_RETRIES - 1;
> +               break;
> +       case MMC_BLK_RETRY:
> +               /*
> +                * Do not accept bytes_xfered, but retry up to 5 times,
> +                * otherwise same as abort.
> +                */
> +               brq->data.bytes_xfered = 0;
> +               if (mqrq->retries < MMC_MAX_RETRIES)
> +                       break;
> +               /* Fall through */
> +       case MMC_BLK_ABORT:
> +               /*
> +                * Do not accept bytes_xfered, but try to reset. If
> +                * reset succeeds, try once more, otherwise error out
> +                * the request.
> +                */
> +               brq->data.bytes_xfered = 0;
> +               if (mmc_blk_reset(md, card->host, type))
> +                       mqrq->retries = MMC_NO_RETRIES;
> +               else
> +                       mqrq->retries = MMC_MAX_RETRIES - 1;
> +               break;
> +       case MMC_BLK_DATA_ERR: {
> +               int err;
> +
> +               /*
> +                * Do not accept bytes_xfered, but try to reset. If
> +                * reset succeeds, try once more. If reset fails with
> +                * ENODEV which means the partition is wrong, then error
> +                * out the request. Otherwise attempt to read one sector
> +                * at a time.
> +                */
> +               brq->data.bytes_xfered = 0;
> +               err = mmc_blk_reset(md, card->host, type);
> +               if (!err) {
> +                       mqrq->retries = MMC_MAX_RETRIES - 1;
> +                       break;
> +               }
> +               if (err == -ENODEV) {
> +                       mqrq->retries = MMC_NO_RETRIES;
> +                       break;
> +               }
> +               /* Fall through */
> +       }
> +       case MMC_BLK_ECC_ERR:
> +               /*
> +                * Do not accept bytes_xfered. If reading more than one
> +                * sector, try reading one sector at a time.
> +                */
> +               brq->data.bytes_xfered = 0;
> +               /* FIXME: Missing single sector read for large sector size */
> +               if (brq->data.blocks > 1 && !mmc_large_sector(card)) {
> +                       /* Redo read one sector at a time */
> +                       pr_warn("%s: retrying using single block read\n",
> +                               req->rq_disk->disk_name);
> +                       mmc_blk_ss_read(mq, req);
> +               } else {
> +                       mqrq->retries = MMC_NO_RETRIES;
> +               }
> +               break;
> +       case MMC_BLK_NOMEDIUM:
> +               /* Do not accept bytes_xfered. Error out the request */
> +               brq->data.bytes_xfered = 0;
> +               mqrq->retries = MMC_NO_RETRIES;
> +               break;
> +       default:
> +               /* Do not accept bytes_xfered. Error out the request */
> +               brq->data.bytes_xfered = 0;
> +               mqrq->retries = MMC_NO_RETRIES;
> +               pr_err("%s: Unhandled return value (%d)",
> +                      req->rq_disk->disk_name, status);
> +               break;
> +       }
> +}
> +

[...]

> +
> +static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
> +                                      struct request *req)
> +{
> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> +
> +       mmc_blk_mq_rw_recovery(mq, req);
> +
> +       mmc_blk_urgent_bkops(mq, mqrq);
> +}
> +
> +static void mmc_blk_mq_acct_req_done(struct mmc_queue *mq, struct request *req)

Nitpick: Can we please try to find a better name for this function? I
don't think "acct" is a good abbreviation because, to me, it's not
self-explanatory.

> +{
> +       struct request_queue *q = req->q;
> +       unsigned long flags;
> +       bool put_card;
> +
> +       spin_lock_irqsave(q->queue_lock, flags);
> +
> +       mq->in_flight[mmc_issue_type(mq, req)] -= 1;
> +
> +       put_card = (mmc_tot_in_flight(mq) == 0);
> +
> +       spin_unlock_irqrestore(q->queue_lock, flags);
> +
> +       if (put_card)
> +               mmc_put_card(mq->card, &mq->ctx);

I have tried to convince myself that the protection of calling
mmc_get|put_card() is safe, but I am not sure.

I am wondering whether there could be races for mmc_get|put_card().
Please see some more related comments below.

[...]

> +static void mmc_blk_mq_req_done(struct mmc_request *mrq)
> +{
> +       struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
> +                                                 brq.mrq);
> +       struct request *req = mmc_queue_req_to_req(mqrq);
> +       struct request_queue *q = req->q;
> +       struct mmc_queue *mq = q->queuedata;
> +       unsigned long flags;
> +       bool waiting;
> +
> +       spin_lock_irqsave(q->queue_lock, flags);
> +       mq->complete_req = req;
> +       mq->rw_wait = false;
> +       waiting = mq->waiting;

The synchronization method is one of the most complex parts of the
$subject patch, yet also very elegant!

Anyway, would you mind adding some more comments here and there in the
code, to explain a bit about what goes on and why?

> +       spin_unlock_irqrestore(q->queue_lock, flags);
> +
> +       if (waiting)
> +               wake_up(&mq->wait);

For example, a comment here about: "In case there is a new request
prepared, leave it to that, to deal with finishing and post processing
of the request, else schedule a work to do it - and see what comes
first."

> +       else
> +               kblockd_schedule_work(&mq->complete_work);
> +}
> +
> +static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
> +{
> +       struct request_queue *q = mq->queue;
> +       unsigned long flags;
> +       bool done;
> +
> +       spin_lock_irqsave(q->queue_lock, flags);
> +       done = !mq->rw_wait;
> +       mq->waiting = !done;

It would also be nice with some comments here about the
synchronization. I will stop mentioning this from now on, because it
applies to a couple more places.

I leave it to you to decide how and where it makes sense to add this
kind of comment.

> +       spin_unlock_irqrestore(q->queue_lock, flags);
> +
> +       return done;
> +}
> +
> +static int mmc_blk_rw_wait(struct mmc_queue *mq, struct request **prev_req)
> +{
> +       int err = 0;
> +
> +       wait_event(mq->wait, mmc_blk_rw_wait_cond(mq, &err));
> +
> +       mmc_blk_mq_complete_prev_req(mq, prev_req);
> +
> +       return err;
> +}
> +
> +static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
> +                                 struct request *req)
> +{
> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
> +       struct mmc_host *host = mq->card->host;
> +       struct request *prev_req = NULL;
> +       int err = 0;
> +
> +       mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
> +
> +       mqrq->brq.mrq.done = mmc_blk_mq_req_done;
> +
> +       mmc_pre_req(host, &mqrq->brq.mrq);

To be honest, using a queue_depth of 64 puzzles me! According to my
understanding, we should use a queue_depth of 2 in case the host
implements the ->pre|post_req() callbacks, else we should set it to 1.

Although I may be missing some information about how to really use
this, because for example UBI (mtd) also uses 64 as queue depth!?

My interpretation of the queue_depth is that the blk-mq layer will use
it to understand the maximum number of requests a block device is able
to operate on simultaneously (when having one HW queue); thus the
number of outstanding dispatched requests for the block device driver
may be as close as possible to the queue_depth, but never above. I may
be totally wrong about this. :-)
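
In code terms, with only the ->pre|post_req() double buffering, I would
have expected the tag set to be set up something like this (just a sketch
of my understanding, not tested):

  struct blk_mq_tag_set *set = &mq->tag_set;

  set->ops = &mmc_mq_ops;
  set->nr_hw_queues = 1;
  /* one request in flight + one being prepared */
  set->queue_depth = 2;
  set->cmd_size = sizeof(struct mmc_queue_req);
  set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;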

Anyway, then if using a queue_depth of 64, how will you make sure that
you do not end up having > 1 request being prepared at the same time
(not counting the one that may be in transfer)?

>
> +       err = mmc_blk_rw_wait(mq, &prev_req);
> +       if (err)
> +               goto out_post_req;
> +
> +       mq->rw_wait = true;
> +
> +       err = mmc_start_request(host, &mqrq->brq.mrq);
> +
> +       if (prev_req)
> +               mmc_blk_mq_post_req(mq, prev_req);
> +
> +       if (err) {
> +               mq->rw_wait = false;
> +               mmc_retune_release(host);
> +       }
> +
> +out_post_req:
> +       if (err)
> +               mmc_post_req(host, &mqrq->brq.mrq, err);
> +
> +       return err;
> +}
> +
> +static int mmc_blk_wait_for_idle(struct mmc_queue *mq, struct mmc_host *host)
> +{
> +       return mmc_blk_rw_wait(mq, NULL);
> +}
> +
> +enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
> +{
> +       struct mmc_blk_data *md = mq->blkdata;
> +       struct mmc_card *card = md->queue.card;
> +       struct mmc_host *host = card->host;
> +       int ret;
> +
> +       ret = mmc_blk_part_switch(card, md->part_type);

What if there is an ongoing request? Shouldn't you wait for that to
complete before switching the partition?

> +       if (ret)
> +               return MMC_REQ_FAILED_TO_START;
> +
> +       switch (mmc_issue_type(mq, req)) {
> +       case MMC_ISSUE_SYNC:
> +               ret = mmc_blk_wait_for_idle(mq, host);
> +               if (ret)
> +                       return MMC_REQ_BUSY;

Wouldn't it be possible that yet another SYNC request becomes queued in
parallel with this current one? Then, when reaching this point, how do
you make sure that the new request waits for the current "SYNC" request?

I mean, is the above mmc_blk_wait_for_idle() really sufficient to deal
with the synchronization?

I guess we could use mmc_claim_host(no-ctx) in some clever way to deal
with this, or perhaps there is a better option?
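
E.g., roughly (just a sketch of the idea; I have not checked whether the
new ctx rules allow it):

  /* Claim with no ctx, so we wait for any other claimer to finish */
  mmc_claim_host(card->host);
  /* ... issue the SYNC request ... */
  mmc_release_host(card->host);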

BTW, I guess the problem is also present if there is a SYNC request
ongoing and then a new ASYNC request comes in. Is the ASYNC
request really waiting for the SYNC request to finish?

Or maybe I just missed these parts in $subject patch.

> +               switch (req_op(req)) {
> +               case REQ_OP_DRV_IN:
> +               case REQ_OP_DRV_OUT:
> +                       mmc_blk_issue_drv_op(mq, req);
> +                       break;
> +               case REQ_OP_DISCARD:
> +                       mmc_blk_issue_discard_rq(mq, req);
> +                       break;
> +               case REQ_OP_SECURE_ERASE:
> +                       mmc_blk_issue_secdiscard_rq(mq, req);
> +                       break;
> +               case REQ_OP_FLUSH:
> +                       mmc_blk_issue_flush(mq, req);
> +                       break;
> +               default:
> +                       WARN_ON_ONCE(1);
> +                       return MMC_REQ_FAILED_TO_START;
> +               }
> +               return MMC_REQ_FINISHED;
> +       case MMC_ISSUE_ASYNC:
> +               switch (req_op(req)) {
> +               case REQ_OP_READ:
> +               case REQ_OP_WRITE:
> +                       ret = mmc_blk_mq_issue_rw_rq(mq, req);
> +                       break;
> +               default:
> +                       WARN_ON_ONCE(1);
> +                       ret = -EINVAL;
> +               }
> +               if (!ret)
> +                       return MMC_REQ_STARTED;
> +               return ret == -EBUSY ? MMC_REQ_BUSY : MMC_REQ_FAILED_TO_START;
> +       default:
> +               WARN_ON_ONCE(1);
> +               return MMC_REQ_FAILED_TO_START;
> +       }
> +}
> +

[...]

> +static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
> +                                   const struct blk_mq_queue_data *bd)
> +{
> +       struct request *req = bd->rq;
> +       struct request_queue *q = req->q;
> +       struct mmc_queue *mq = q->queuedata;
> +       struct mmc_card *card = mq->card;
> +       enum mmc_issue_type issue_type;
> +       enum mmc_issued issued;
> +       bool get_card;
> +       int ret;
> +
> +       if (mmc_card_removed(mq->card)) {
> +               req->rq_flags |= RQF_QUIET;
> +               return BLK_STS_IOERR;
> +       }
> +
> +       issue_type = mmc_issue_type(mq, req);
> +
> +       spin_lock_irq(q->queue_lock);
> +
> +       switch (issue_type) {
> +       case MMC_ISSUE_ASYNC:
> +               break;
> +       default:
> +               /*
> +                * Timeouts are handled by mmc core, and we don't have a host
> +                * API to abort requests, so we can't handle the timeout anyway.
> +                * However, when the timeout happens, blk_mq_complete_request()
> +                * no longer works (to stop the request disappearing under us).
> +                * To avoid racing with that, set a large timeout.
> +                */
> +               req->timeout = 600 * HZ;
> +               break;
> +       }
> +
> +       mq->in_flight[issue_type] += 1;
> +       get_card = (mmc_tot_in_flight(mq) == 1);
> +
> +       spin_unlock_irq(q->queue_lock);
> +
> +       if (!(req->rq_flags & RQF_DONTPREP)) {
> +               req_to_mmc_queue_req(req)->retries = 0;
> +               req->rq_flags |= RQF_DONTPREP;
> +       }
> +
> +       if (get_card)

Coming back to the get_card() thingy, I wonder if it's fragile.

A request that finds get_card == true here doesn't necessarily have
to reach this point first (the task may be preempted), in case
there is another request being queued in parallel (or can't that
happen?).

That could then lead to the following steps being executed for
the other requests prior to anybody calling mmc_get_card().

> +               mmc_get_card(card, &mq->ctx);
> +
> +       blk_mq_start_request(req);
> +
> +       issued = mmc_blk_mq_issue_rq(mq, req);
> +
> +       switch (issued) {
> +       case MMC_REQ_BUSY:
> +               ret = BLK_STS_RESOURCE;
> +               break;
> +       case MMC_REQ_FAILED_TO_START:
> +               ret = BLK_STS_IOERR;
> +               break;
> +       default:
> +               ret = BLK_STS_OK;
> +               break;
> +       }
> +
> +       if (issued != MMC_REQ_STARTED) {
> +               bool put_card = false;
> +
> +               spin_lock_irq(q->queue_lock);
> +               mq->in_flight[issue_type] -= 1;
> +               if (mmc_tot_in_flight(mq) == 0)
> +                       put_card = true;
> +               spin_unlock_irq(q->queue_lock);
> +               if (put_card)
> +                       mmc_put_card(card, &mq->ctx);

For the similar reasons as above, this also looks fragile.

> +       }
> +
> +       return ret;
> +}
> +
> +static const struct blk_mq_ops mmc_mq_ops = {
> +       .queue_rq       = mmc_mq_queue_rq,
> +       .init_request   = mmc_mq_init_request,
> +       .exit_request   = mmc_mq_exit_request,
> +       .complete       = mmc_blk_mq_complete,
> +       .timeout        = mmc_mq_timed_out,
> +};

[...]

> +#define MMC_QUEUE_DEPTH 64
> +
> +static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
> +                        spinlock_t *lock)
> +{
> +       int q_depth;
> +       int ret;
> +
> +       q_depth = MMC_QUEUE_DEPTH;

I already mentioned my thoughts around the queue_depth... and again I
may be totally wrong. :-)

> +
> +       ret = mmc_mq_init_queue(mq, q_depth, &mmc_mq_ops, lock);
> +       if (ret)
> +               return ret;
> +
> +       blk_queue_rq_timeout(mq->queue, 60 * HZ);
> +
> +       mmc_setup_queue(mq, card);
> +
> +       return 0;
>  }

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 13/24] mmc: block: Add blk-mq support
  2017-11-24 10:12   ` Ulf Hansson
@ 2017-11-27 10:20     ` Adrian Hunter
  2017-11-27 11:23       ` Ulf Hansson
  2017-11-27 11:36       ` Ulf Hansson
  0 siblings, 2 replies; 50+ messages in thread
From: Adrian Hunter @ 2017-11-27 10:20 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 24/11/17 12:12, Ulf Hansson wrote:
> [...]
> 
>> +/* Single sector read during recovery */
>> +static void mmc_blk_ss_read(struct mmc_queue *mq, struct request *req)
> 
> Nitpick: I think mmc_blk_read_single() would be better as it is a
> clearer name. Would you mind changing it?
> 
>> +{
>> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
>> +       blk_status_t status;
>> +
>> +       while (1) {
>> +               mmc_blk_rw_rq_prep(mqrq, mq->card, 1, mq);
>> +
>> +               mmc_wait_for_req(mq->card->host, &mqrq->brq.mrq);
>> +
>> +               /*
>> +                * Not expecting command errors, so just give up in that case.
>> +                * If there are retries remaining, the request will get
>> +                * requeued.
>> +                */
>> +               if (mqrq->brq.cmd.error)
>> +                       return;
> 
> What happens here if the reason for the error is that the card was removed?

Assuming the rescan is waiting for the host claim, the next read / write
request will end up calling mmc_detect_card_removed() in the recovery.
After that all following requests will error immediately because
mmc_mq_queue_rq() calls mmc_card_removed().
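
That is, the fast-fail path is just the check already at the top of
mmc_mq_queue_rq(), condensed here for illustration:

        /* Recovery marks the card gone via mmc_detect_card_removed() */
        if (mmc_card_removed(mq->card)) {
                req->rq_flags |= RQF_QUIET;     /* suppress error logging */
                return BLK_STS_IOERR;           /* later requests fail fast */
        }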

> 
> I guess next time __blk_err_check() is called from the
> mmc_blk_mq_rw_recovery(), this will be detected and managed?
> 
>> +
>> +               if (blk_rq_bytes(req) <= 512)
> 
> Shouldn't you check "if (blk_rq_bytes(req) < 512)"? How would you
> otherwise read the last 512-byte block?

At this point we have read the last sector but not updated the request, so
the number of bytes left should be 512.  The reason we don't update the
request is so that the logic in mmc_blk_mq_complete_rq() will work.  I will
add a comment.
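
Something along these lines (draft wording):

        /*
         * The request is not updated for the sector just read, so that
         * the accounting in mmc_blk_mq_complete_rq() works out.  Hence,
         * when only the final sector remains, the residual byte count
         * is still 512 rather than 0.
         */
        if (blk_rq_bytes(req) <= 512)
                break;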

> 
>> +                       break;
>> +
>> +               status = mqrq->brq.data.error ? BLK_STS_IOERR : BLK_STS_OK;
>> +
>> +               blk_update_request(req, status, 512);
> 
> Shouldn't we actually bail out, unless the error is a data ECC error?
> On the other hand, I guess if it is a more severe error, cmd.error will
> anyway be set above!?
> 
> One more question, if there is a data error, we may want to try to
> recover by sending a stop command? How do we manage that?

I was thinking a single-block read would not need a stop.  I will think
some more about error handling here.

> 
>> +       }
>> +
>> +       mqrq->retries = MMC_NO_RETRIES;
>> +}
>> +
>> +static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
>> +{
>> +       int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
>> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
>> +       struct mmc_blk_request *brq = &mqrq->brq;
>> +       struct mmc_blk_data *md = mq->blkdata;
>> +       struct mmc_card *card = mq->card;
>> +       static enum mmc_blk_status status;
>> +
>> +       brq->retune_retry_done = mqrq->retries;
>> +
>> +       status = __mmc_blk_err_check(card, mqrq);
>> +
>> +       mmc_retune_release(card->host);
>> +
>> +       /*
>> +        * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
>> +        * policy:
>> +        * 1. A request that has transferred at least some data is considered
>> +        * successful and will be requeued if there is remaining data to
>> +        * transfer.
>> +        * 2. Otherwise the number of retries is incremented and the request
>> +        * will be requeued if there are remaining retries.
>> +        * 3. Otherwise the request will be errored out.
>> +        * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
>> +        * mqrq->retries. So there are only 4 possible actions here:
>> +        *      1. do not accept the bytes_xfered value i.e. set it to zero
>> +        *      2. change mqrq->retries to determine the number of retries
>> +        *      3. try to reset the card
>> +        *      4. read one sector at a time
>> +        */
>> +       switch (status) {
>> +       case MMC_BLK_SUCCESS:
>> +       case MMC_BLK_PARTIAL:
>> +               /* Reset success, and accept bytes_xfered */
>> +               mmc_blk_reset_success(md, type);
>> +               break;
>> +       case MMC_BLK_CMD_ERR:
>> +               /*
>> +                * For SD cards, get bytes written, but do not accept
>> +                * bytes_xfered if that fails. For MMC cards accept
>> +                * bytes_xfered. Then try to reset. If reset fails then
>> +                * error out the remaining request, otherwise retry
>> +                * once (N.B mmc_blk_reset() will not succeed twice in a
>> +                * row).
>> +                */
>> +               if (mmc_card_sd(card)) {
>> +                       u32 blocks;
>> +                       int err;
>> +
>> +                       err = mmc_sd_num_wr_blocks(card, &blocks);
>> +                       if (err)
>> +                               brq->data.bytes_xfered = 0;
>> +                       else
>> +                               brq->data.bytes_xfered = blocks << 9;
>> +               }
>> +               if (mmc_blk_reset(md, card->host, type))
>> +                       mqrq->retries = MMC_NO_RETRIES;
>> +               else
>> +                       mqrq->retries = MMC_MAX_RETRIES - 1;
>> +               break;
>> +       case MMC_BLK_RETRY:
>> +               /*
>> +                * Do not accept bytes_xfered, but retry up to 5 times,
>> +                * otherwise same as abort.
>> +                */
>> +               brq->data.bytes_xfered = 0;
>> +               if (mqrq->retries < MMC_MAX_RETRIES)
>> +                       break;
>> +               /* Fall through */
>> +       case MMC_BLK_ABORT:
>> +               /*
>> +                * Do not accept bytes_xfered, but try to reset. If
>> +                * reset succeeds, try once more, otherwise error out
>> +                * the request.
>> +                */
>> +               brq->data.bytes_xfered = 0;
>> +               if (mmc_blk_reset(md, card->host, type))
>> +                       mqrq->retries = MMC_NO_RETRIES;
>> +               else
>> +                       mqrq->retries = MMC_MAX_RETRIES - 1;
>> +               break;
>> +       case MMC_BLK_DATA_ERR: {
>> +               int err;
>> +
>> +               /*
>> +                * Do not accept bytes_xfered, but try to reset. If
>> +                * reset succeeds, try once more. If reset fails with
>> +                * ENODEV which means the partition is wrong, then error
>> +                * out the request. Otherwise attempt to read one sector
>> +                * at a time.
>> +                */
>> +               brq->data.bytes_xfered = 0;
>> +               err = mmc_blk_reset(md, card->host, type);
>> +               if (!err) {
>> +                       mqrq->retries = MMC_MAX_RETRIES - 1;
>> +                       break;
>> +               }
>> +               if (err == -ENODEV) {
>> +                       mqrq->retries = MMC_NO_RETRIES;
>> +                       break;
>> +               }
>> +               /* Fall through */
>> +       }
>> +       case MMC_BLK_ECC_ERR:
>> +               /*
>> +                * Do not accept bytes_xfered. If reading more than one
>> +                * sector, try reading one sector at a time.
>> +                */
>> +               brq->data.bytes_xfered = 0;
>> +               /* FIXME: Missing single sector read for large sector size */
>> +               if (brq->data.blocks > 1 && !mmc_large_sector(card)) {
>> +                       /* Redo read one sector at a time */
>> +                       pr_warn("%s: retrying using single block read\n",
>> +                               req->rq_disk->disk_name);
>> +                       mmc_blk_ss_read(mq, req);
>> +               } else {
>> +                       mqrq->retries = MMC_NO_RETRIES;
>> +               }
>> +               break;
>> +       case MMC_BLK_NOMEDIUM:
>> +               /* Do not accept bytes_xfered. Error out the request */
>> +               brq->data.bytes_xfered = 0;
>> +               mqrq->retries = MMC_NO_RETRIES;
>> +               break;
>> +       default:
>> +               /* Do not accept bytes_xfered. Error out the request */
>> +               brq->data.bytes_xfered = 0;
>> +               mqrq->retries = MMC_NO_RETRIES;
>> +               pr_err("%s: Unhandled return value (%d)",
>> +                      req->rq_disk->disk_name, status);
>> +               break;
>> +       }
>> +}
>> +
> 
> [...]
> 
>> +
>> +static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
>> +                                      struct request *req)
>> +{
>> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
>> +
>> +       mmc_blk_mq_rw_recovery(mq, req);
>> +
>> +       mmc_blk_urgent_bkops(mq, mqrq);
>> +}
>> +
>> +static void mmc_blk_mq_acct_req_done(struct mmc_queue *mq, struct request *req)
> 
> Nitpick: Can we please try to find a better name for this function? I
> don't think "acct" is a good abbreviation because, to me, it's not
> self-explanatory.

What about mmc_blk_mq_decrement_in_flight() ?

> 
>> +{
>> +       struct request_queue *q = req->q;
>> +       unsigned long flags;
>> +       bool put_card;
>> +
>> +       spin_lock_irqsave(q->queue_lock, flags);
>> +
>> +       mq->in_flight[mmc_issue_type(mq, req)] -= 1;
>> +
>> +       put_card = (mmc_tot_in_flight(mq) == 0);
>> +
>> +       spin_unlock_irqrestore(q->queue_lock, flags);
>> +
>> +       if (put_card)
>> +               mmc_put_card(mq->card, &mq->ctx);
> 
> I have tried to convince myself that the protection of calling
> mmc_get|put_card() is safe, but I am not sure.
> 
> I am wondering whether there could be races for mmc_get|put_card().
> Please see some more related comments below.

mmc_put_card() is safe and necessary if we have seen mmc_tot_in_flight(mq)
== 0.  When the next request arrives it will have to do a mmc_get_card()
because it is changing the number of requests in flight from 0 to 1.  It
doesn't matter if that mmc_get_card() comes before or after or during this
mmc_put_card().
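
Condensed, the two halves of that invariant look like this (abridged
from the patch for illustration):

        /* Issue side: only the 0 -> 1 transition claims the host */
        spin_lock_irq(q->queue_lock);
        mq->in_flight[issue_type] += 1;
        get_card = (mmc_tot_in_flight(mq) == 1);
        spin_unlock_irq(q->queue_lock);
        if (get_card)
                mmc_get_card(card, &mq->ctx);

        /* Completion side: only the last request out releases the host */
        spin_lock_irqsave(q->queue_lock, flags);
        mq->in_flight[mmc_issue_type(mq, req)] -= 1;
        put_card = (mmc_tot_in_flight(mq) == 0);
        spin_unlock_irqrestore(q->queue_lock, flags);
        if (put_card)
                mmc_put_card(mq->card, &mq->ctx);

Because the counter is only changed under q->queue_lock, exactly one
task can observe each 0 -> 1 and 1 -> 0 transition, and
mmc_get|put_card() themselves serialize on the host claim.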

> 
> [...]
> 
>> +static void mmc_blk_mq_req_done(struct mmc_request *mrq)
>> +{
>> +       struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
>> +                                                 brq.mrq);
>> +       struct request *req = mmc_queue_req_to_req(mqrq);
>> +       struct request_queue *q = req->q;
>> +       struct mmc_queue *mq = q->queuedata;
>> +       unsigned long flags;
>> +       bool waiting;
>> +
>> +       spin_lock_irqsave(q->queue_lock, flags);
>> +       mq->complete_req = req;
>> +       mq->rw_wait = false;
>> +       waiting = mq->waiting;
> 
> The synchronization method is one of the most complex parts of
> $subject patch, yet also very elegant!
> 
> Anyway, would you mind adding some more comments here and there in the
> code, to explain a bit about what goes on and why?

Ok

> 
>> +       spin_unlock_irqrestore(q->queue_lock, flags);
>> +
>> +       if (waiting)
>> +               wake_up(&mq->wait);
> 
> For example, a comment here about: "In case there is a new request
> prepared, leave it to that, to deal with finishing and post processing
> of the request, else schedule a work to do it - and see what comes
> first."

Ok
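
For example (draft comment wording only, the code is unchanged):

        if (waiting)
                /*
                 * A dispatch is waiting in mmc_blk_rw_wait() to issue
                 * the next request; wake it to finish this one.
                 */
                wake_up(&mq->wait);
        else
                /* Nobody is waiting, so complete from a work item */
                kblockd_schedule_work(&mq->complete_work);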

> 
>> +       else
>> +               kblockd_schedule_work(&mq->complete_work);
>> +}
>> +
>> +static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
>> +{
>> +       struct request_queue *q = mq->queue;
>> +       unsigned long flags;
>> +       bool done;
>> +
>> +       spin_lock_irqsave(q->queue_lock, flags);
>> +       done = !mq->rw_wait;
>> +       mq->waiting = !done;
> 
> Here too it would be nice with some comments about the
> synchronization. I'll stop mentioning this from now on, because it
> applies to a couple more places.
> 
> I leave it to you to decide how and where it makes sense to add these
> kind of comments.
> 
>> +       spin_unlock_irqrestore(q->queue_lock, flags);
>> +
>> +       return done;
>> +}
>> +
>> +static int mmc_blk_rw_wait(struct mmc_queue *mq, struct request **prev_req)
>> +{
>> +       int err = 0;
>> +
>> +       wait_event(mq->wait, mmc_blk_rw_wait_cond(mq, &err));
>> +
>> +       mmc_blk_mq_complete_prev_req(mq, prev_req);
>> +
>> +       return err;
>> +}
>> +
>> +static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
>> +                                 struct request *req)
>> +{
>> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
>> +       struct mmc_host *host = mq->card->host;
>> +       struct request *prev_req = NULL;
>> +       int err = 0;
>> +
>> +       mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
>> +
>> +       mqrq->brq.mrq.done = mmc_blk_mq_req_done;
>> +
>> +       mmc_pre_req(host, &mqrq->brq.mrq);
> 
> To be honest, using a queue_depth of 64 puzzles me! According to my
> understanding we should use a queue_depth of 2, in case the host
> implements the ->pre|post_req() callbacks, else we should set it to 1.
> 
> Although I may be missing some information about how to really use
> this, because for example UBI (mtd) also uses 64 as queue depth!?
> 
> My interpretation of the queue_depth is that the blk-mq layer will use
> it to understand the maximum number of requests a block device is able
> to operate on simultaneously (when having one HW queue); thus the
> number of outstanding dispatched requests for the block device driver
> may be as close as possible to the queue_depth, but never above. I may
> be totally wrong about this. :-)

For blk-mq, the queue_depth also defines the default nr_requests, which will
be 2 times the queue_depth if there is an elevator.  The old nr_requests was
128, so setting 64 gives the same nr_requests as before.

Otherwise the queue_depth is the size of the tag set.

A very low queue_depth might be a problem for I/O schedulers like kyber
which seems to try to limit the number of tags available for asynchronous
requests.
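
For reference, a sketch of what the tag set setup amounts to (the exact
flags shown are an assumption on my part):

        memset(&mq->tag_set, 0, sizeof(mq->tag_set));
        mq->tag_set.ops = &mmc_mq_ops;
        mq->tag_set.queue_depth = MMC_QUEUE_DEPTH;      /* 64 tags */
        mq->tag_set.nr_hw_queues = 1;
        mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
        ret = blk_mq_alloc_tag_set(&mq->tag_set);
        /*
         * With an I/O scheduler attached, blk-mq defaults q->nr_requests
         * to 2 * queue_depth, i.e. 128 here, matching the legacy default.
         */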

> 
> Anyway, then if using a queue_depth of 64, how will you make sure that
> you do not end up having > 1 requests being prepared at the same time
> (not counting the one that may be in transfer)?

We are currently single-threaded since every request goes through
hctx->run_work when BLK_MQ_F_BLOCKING and nr_hw_queues == 1.  It might be
worth adding a mutex to ensure that never changes.

This point also answers some of the questions below, since there can be no
parallel dispatches.
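
If that ever changed, the guard could be as simple as this sketch
(dispatch_lock is a hypothetical member; BLK_MQ_F_BLOCKING means
->queue_rq() is allowed to sleep here):

        static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
                                            const struct blk_mq_queue_data *bd)
        {
                struct mmc_queue *mq = bd->rq->q->queuedata;
                blk_status_t ret;

                mutex_lock(&mq->dispatch_lock);         /* hypothetical */
                ret = __mmc_mq_queue_rq(hctx, bd);      /* existing body */
                mutex_unlock(&mq->dispatch_lock);

                return ret;
        }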

> 
>>
>> +       err = mmc_blk_rw_wait(mq, &prev_req);
>> +       if (err)
>> +               goto out_post_req;
>> +
>> +       mq->rw_wait = true;
>> +
>> +       err = mmc_start_request(host, &mqrq->brq.mrq);
>> +
>> +       if (prev_req)
>> +               mmc_blk_mq_post_req(mq, prev_req);
>> +
>> +       if (err) {
>> +               mq->rw_wait = false;
>> +               mmc_retune_release(host);
>> +       }
>> +
>> +out_post_req:
>> +       if (err)
>> +               mmc_post_req(host, &mqrq->brq.mrq, err);
>> +
>> +       return err;
>> +}
>> +
>> +static int mmc_blk_wait_for_idle(struct mmc_queue *mq, struct mmc_host *host)
>> +{
>> +       return mmc_blk_rw_wait(mq, NULL);
>> +}
>> +
>> +enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
>> +{
>> +       struct mmc_blk_data *md = mq->blkdata;
>> +       struct mmc_card *card = md->queue.card;
>> +       struct mmc_host *host = card->host;
>> +       int ret;
>> +
>> +       ret = mmc_blk_part_switch(card, md->part_type);
> 
> What if there is an ongoing request? Shouldn't you wait for that to
> complete before switching partition?

Two requests on the same queue cannot be on different partitions because we
have a different queue (and block device) for each partition.

> 
>> +       if (ret)
>> +               return MMC_REQ_FAILED_TO_START;
>> +
>> +       switch (mmc_issue_type(mq, req)) {
>> +       case MMC_ISSUE_SYNC:
>> +               ret = mmc_blk_wait_for_idle(mq, host);
>> +               if (ret)
>> +                       return MMC_REQ_BUSY;
> 
> Wouldn't it be possible that yet another SYNC request becomes queued in
> parallel with this current one? Then, when reaching this point, how do
> you make sure that the new request waits for the current "SYNC" request?

As mentioned above, there are no parallel dispatches.

> 
> I mean, is the above mmc_blk_wait_for_idle() really sufficient to deal
> with synchronization?

So long as there are no parallel dispatches.

> 
> I guess we could use mmc_claim_host(no-ctx) in some clever way to deal
> with this, or perhaps there is a better option?

We are relying on there being no parallel dispatches.  That is the case now,
but if it weren't we could use a mutex in mmc_mq_queue_rq().

> 
> BTW, I guess the problem is also present if there is a SYNC request
> ongoing and then a new ASYNC request comes in. Is the ASYNC
> request really waiting for the SYNC request to finish?

With no parallel dispatches, the SYNC request runs to completion before
another request can be dispatched.

> 
> Or maybe I just missed these parts in $subject patch.
> 
>> +               switch (req_op(req)) {
>> +               case REQ_OP_DRV_IN:
>> +               case REQ_OP_DRV_OUT:
>> +                       mmc_blk_issue_drv_op(mq, req);
>> +                       break;
>> +               case REQ_OP_DISCARD:
>> +                       mmc_blk_issue_discard_rq(mq, req);
>> +                       break;
>> +               case REQ_OP_SECURE_ERASE:
>> +                       mmc_blk_issue_secdiscard_rq(mq, req);
>> +                       break;
>> +               case REQ_OP_FLUSH:
>> +                       mmc_blk_issue_flush(mq, req);
>> +                       break;
>> +               default:
>> +                       WARN_ON_ONCE(1);
>> +                       return MMC_REQ_FAILED_TO_START;
>> +               }
>> +               return MMC_REQ_FINISHED;
>> +       case MMC_ISSUE_ASYNC:
>> +               switch (req_op(req)) {
>> +               case REQ_OP_READ:
>> +               case REQ_OP_WRITE:
>> +                       ret = mmc_blk_mq_issue_rw_rq(mq, req);
>> +                       break;
>> +               default:
>> +                       WARN_ON_ONCE(1);
>> +                       ret = -EINVAL;
>> +               }
>> +               if (!ret)
>> +                       return MMC_REQ_STARTED;
>> +               return ret == -EBUSY ? MMC_REQ_BUSY : MMC_REQ_FAILED_TO_START;
>> +       default:
>> +               WARN_ON_ONCE(1);
>> +               return MMC_REQ_FAILED_TO_START;
>> +       }
>> +}
>> +
> 
> [...]
> 
>> +static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>> +                                   const struct blk_mq_queue_data *bd)
>> +{
>> +       struct request *req = bd->rq;
>> +       struct request_queue *q = req->q;
>> +       struct mmc_queue *mq = q->queuedata;
>> +       struct mmc_card *card = mq->card;
>> +       enum mmc_issue_type issue_type;
>> +       enum mmc_issued issued;
>> +       bool get_card;
>> +       int ret;
>> +
>> +       if (mmc_card_removed(mq->card)) {
>> +               req->rq_flags |= RQF_QUIET;
>> +               return BLK_STS_IOERR;
>> +       }
>> +
>> +       issue_type = mmc_issue_type(mq, req);
>> +
>> +       spin_lock_irq(q->queue_lock);
>> +
>> +       switch (issue_type) {
>> +       case MMC_ISSUE_ASYNC:
>> +               break;
>> +       default:
>> +               /*
>> +                * Timeouts are handled by mmc core, and we don't have a host
>> +                * API to abort requests, so we can't handle the timeout anyway.
>> +                * However, when the timeout happens, blk_mq_complete_request()
>> +                * no longer works (to stop the request disappearing under us).
>> +                * To avoid racing with that, set a large timeout.
>> +                */
>> +               req->timeout = 600 * HZ;
>> +               break;
>> +       }
>> +
>> +       mq->in_flight[issue_type] += 1;
>> +       get_card = (mmc_tot_in_flight(mq) == 1);
>> +
>> +       spin_unlock_irq(q->queue_lock);
>> +
>> +       if (!(req->rq_flags & RQF_DONTPREP)) {
>> +               req_to_mmc_queue_req(req)->retries = 0;
>> +               req->rq_flags |= RQF_DONTPREP;
>> +       }
>> +
>> +       if (get_card)
> 
> Coming back to the get_card() thingy: I wonder whether it's fragile.
>
> A request that finds get_card == true here doesn't necessarily have
> to reach this point first (the task may be preempted), in case there
> is another request being queued in parallel (or can that not happen?).
>
> That could then lead to the following steps being executed for the
> other requests before anybody has called mmc_get_card().

You are right, this logic does not support parallel dispatches.

> 
>> +               mmc_get_card(card, &mq->ctx);
>> +
>> +       blk_mq_start_request(req);
>> +
>> +       issued = mmc_blk_mq_issue_rq(mq, req);
>> +
>> +       switch (issued) {
>> +       case MMC_REQ_BUSY:
>> +               ret = BLK_STS_RESOURCE;
>> +               break;
>> +       case MMC_REQ_FAILED_TO_START:
>> +               ret = BLK_STS_IOERR;
>> +               break;
>> +       default:
>> +               ret = BLK_STS_OK;
>> +               break;
>> +       }
>> +
>> +       if (issued != MMC_REQ_STARTED) {
>> +               bool put_card = false;
>> +
>> +               spin_lock_irq(q->queue_lock);
>> +               mq->in_flight[issue_type] -= 1;
>> +               if (mmc_tot_in_flight(mq) == 0)
>> +                       put_card = true;
>> +               spin_unlock_irq(q->queue_lock);
>> +               if (put_card)
>> +                       mmc_put_card(card, &mq->ctx);
> 
> For similar reasons to the above, this also looks fragile.
> 
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>> +static const struct blk_mq_ops mmc_mq_ops = {
>> +       .queue_rq       = mmc_mq_queue_rq,
>> +       .init_request   = mmc_mq_init_request,
>> +       .exit_request   = mmc_mq_exit_request,
>> +       .complete       = mmc_blk_mq_complete,
>> +       .timeout        = mmc_mq_timed_out,
>> +};
> 
> [...]
> 
>> +#define MMC_QUEUE_DEPTH 64
>> +
>> +static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
>> +                        spinlock_t *lock)
>> +{
>> +       int q_depth;
>> +       int ret;
>> +
>> +       q_depth = MMC_QUEUE_DEPTH;
> 
> I already mentioned my thoughts around the queue_depth... and again I
> may be totally wrong. :-)
> 
>> +
>> +       ret = mmc_mq_init_queue(mq, q_depth, &mmc_mq_ops, lock);
>> +       if (ret)
>> +               return ret;
>> +
>> +       blk_queue_rq_timeout(mq->queue, 60 * HZ);
>> +
>> +       mmc_setup_queue(mq, card);
>> +
>> +       return 0;
>>  }
> 
> [...]
> 
> Kind regards
> Uffe
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 13/24] mmc: block: Add blk-mq support
  2017-11-27 10:20     ` Adrian Hunter
@ 2017-11-27 11:23       ` Ulf Hansson
  2017-11-27 14:15         ` Adrian Hunter
  2017-11-27 11:36       ` Ulf Hansson
  1 sibling, 1 reply; 50+ messages in thread
From: Ulf Hansson @ 2017-11-27 11:23 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 27 November 2017 at 11:20, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 24/11/17 12:12, Ulf Hansson wrote:
>> [...]
>>
>>> +/* Single sector read during recovery */
>>> +static void mmc_blk_ss_read(struct mmc_queue *mq, struct request *req)
>>
>> Nitpick: I think mmc_blk_read_single() would be better as it is a
>> clearer name. Would you mind changing it?
>>
>>> +{
>>> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
>>> +       blk_status_t status;
>>> +
>>> +       while (1) {
>>> +               mmc_blk_rw_rq_prep(mqrq, mq->card, 1, mq);
>>> +
>>> +               mmc_wait_for_req(mq->card->host, &mqrq->brq.mrq);
>>> +
>>> +               /*
>>> +                * Not expecting command errors, so just give up in that case.
>>> +                * If there are retries remaining, the request will get
>>> +                * requeued.
>>> +                */
>>> +               if (mqrq->brq.cmd.error)
>>> +                       return;
>>
>> What happens here if the reason for the error is that the card was removed?
>
> Assuming the rescan is waiting for the host claim, the next read / write
> request will end up calling mmc_detect_card_removed() in the recovery.
> After that all following requests will error immediately because
> mmc_mq_queue_rq() calls mmc_card_removed().

Yep, that seems reasonable. I have also tested this, and it seems to
work as expected, just as before.

>
>>
>> I guess next time __blk_err_check() is called from the
>> mmc_blk_mq_rw_recovery(), this will be detected and managed?
>>
>>> +
>>> +               if (blk_rq_bytes(req) <= 512)
>>
>> Shouldn't you check "if (blk_rq_bytes(req) < 512)"? How would you
>> otherwise read the last 512-byte block?
>
> At this point we have read the last sector but not updated the request, so
> the number of bytes left should be 512.  The reason we don't update the
> request is so that the logic in mmc_blk_mq_complete_rq() will work.  I will
> add a comment.

Not sure I get that, but I assume the comment will help me understand. :-)

>
>>
>>> +                       break;
>>> +
>>> +               status = mqrq->brq.data.error ? BLK_STS_IOERR : BLK_STS_OK;
>>> +
>>> +               blk_update_request(req, status, 512);
>>
>> Shouldn't we actually bail out, unless the error is a data ECC error?
>> On the other hand, I guess if it is a more severe error, cmd.error will
>> anyway be set above!?
>>
>> One more question, if there is a data error, we may want to try to
>> recover by sending a stop command? How do we manage that?
>
> I was thinking a single-block read would not need a stop.  I will think
> some more about error handling here.

Great!

Anyway, you may be right, and perhaps it may not be worth adding
error handling, especially if it complicates the code a lot.

[...]

>>> +static void mmc_blk_mq_acct_req_done(struct mmc_queue *mq, struct request *req)
>>
>> Nitpick: Can we please try to find a better name for this function? I
>> don't think "acct" is a good abbreviation because, to me, it's not
>> self-explanatory.
>
> What about mmc_blk_mq_decrement_in_flight() ?

Looks good, or perhaps even: mmc_blk_mq_dec_in_flight().

>
>>
>>> +{
>>> +       struct request_queue *q = req->q;
>>> +       unsigned long flags;
>>> +       bool put_card;
>>> +
>>> +       spin_lock_irqsave(q->queue_lock, flags);
>>> +
>>> +       mq->in_flight[mmc_issue_type(mq, req)] -= 1;
>>> +
>>> +       put_card = (mmc_tot_in_flight(mq) == 0);
>>> +
>>> +       spin_unlock_irqrestore(q->queue_lock, flags);
>>> +
>>> +       if (put_card)
>>> +               mmc_put_card(mq->card, &mq->ctx);
>>
>> I have tried to convince myself that the protection of calling
>> mmc_get|put_card() is safe, but I am not sure.
>>
>> I am wondering whether there could be races for mmc_get|put_card().
>> Please see some more related comments below.
>
> mmc_put_card() is safe and necessary if we have seen mmc_tot_in_flight(mq)
> == 0.  When the next request arrives it will have to do a mmc_get_card()
> because it is changing the number of requests in flight from 0 to 1.  It
> doesn't matter if that mmc_get_card() comes before or after or during this
> mmc_put_card().
>
>>
>> [...]

[...]

>>
>> Anyway, then if using a queue_depth of 64, how will you make sure that
>> you do not end up having > 1 requests being prepared at the same time
>> (not counting the one that may be in transfer)?
>
> We are currently single-threaded since every request goes through
> hctx->run_work when BLK_MQ_F_BLOCKING and nr_hw_queues == 1.  It might be
> worth adding a mutex to ensure that never changes.
>
> This point also answers some of the questions below, since there can be no
> parallel dispatches.

Yeah, it clearly does. Thanks!

>>> +
>>> +enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
>>> +{
>>> +       struct mmc_blk_data *md = mq->blkdata;
>>> +       struct mmc_card *card = md->queue.card;
>>> +       struct mmc_host *host = card->host;
>>> +       int ret;
>>> +
>>> +       ret = mmc_blk_part_switch(card, md->part_type);
>>
>> What if there is an ongoing request? Shouldn't you wait for that to
>> complete before switching partition?
>
> Two requests on the same queue cannot be on different partitions because we
> have a different queue (and block device) for each partition.

That's not true for RPMB anymore I am afraid.

RPMB shares the same queue as the main eMMC partition, because we
strive towards fair I/O scheduling across the whole device.

>
>>
>>> +       if (ret)
>>> +               return MMC_REQ_FAILED_TO_START;
>>> +
>>> +       switch (mmc_issue_type(mq, req)) {
>>> +       case MMC_ISSUE_SYNC:
>>> +               ret = mmc_blk_wait_for_idle(mq, host);
>>> +               if (ret)
>>> +                       return MMC_REQ_BUSY;
>>
>> Wouldn't it be possible that yet another SYNC request becomes queued in
>> parallel with this current one? Then, when reaching this point, how do
>> you make sure that the new request waits for the current "SYNC" request?
>
> As mentioned above, there are no parallel dispatches.
>
>>
>> I mean, is the above mmc_blk_wait_for_idle() really sufficient to deal
>> with synchronization?
>
> So long as there are no parallel dispatches.
>
>>
>> I guess we could use mmc_claim_host(no-ctx) in some clever way to deal
>> with this, or perhaps there is a better option?
>
> We are relying on there being no parallel dispatches.  That is the case now,
> but if it weren't we could use a mutex in mmc_mq_queue_rq().
>

Yeah, but then leave that until needed.

>>
>> BTW, I guess the problem is also present if there is a SYNC request
>> ongoing and then a new ASYNC request comes in. Is the ASYNC
>> request really waiting for the SYNC request to finish?
>
> With no parallel dispatches, the SYNC request runs to completion before
> another request can be dispatched.

Yes, I get it now. Thanks for clarifying this!

[...]

>>> +static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>> +                                   const struct blk_mq_queue_data *bd)
>>> +{
>>> +       struct request *req = bd->rq;
>>> +       struct request_queue *q = req->q;
>>> +       struct mmc_queue *mq = q->queuedata;
>>> +       struct mmc_card *card = mq->card;
>>> +       enum mmc_issue_type issue_type;
>>> +       enum mmc_issued issued;
>>> +       bool get_card;
>>> +       int ret;
>>> +
>>> +       if (mmc_card_removed(mq->card)) {
>>> +               req->rq_flags |= RQF_QUIET;
>>> +               return BLK_STS_IOERR;
>>> +       }
>>> +
>>> +       issue_type = mmc_issue_type(mq, req);
>>> +
>>> +       spin_lock_irq(q->queue_lock);
>>> +
>>> +       switch (issue_type) {
>>> +       case MMC_ISSUE_ASYNC:
>>> +               break;
>>> +       default:
>>> +               /*
>>> +                * Timeouts are handled by mmc core, and we don't have a host
>>> +                * API to abort requests, so we can't handle the timeout anyway.
>>> +                * However, when the timeout happens, blk_mq_complete_request()
>>> +                * no longer works (to stop the request disappearing under us).
>>> +                * To avoid racing with that, set a large timeout.
>>> +                */
>>> +               req->timeout = 600 * HZ;
>>> +               break;
>>> +       }
>>> +
>>> +       mq->in_flight[issue_type] += 1;
>>> +       get_card = (mmc_tot_in_flight(mq) == 1);
>>> +
>>> +       spin_unlock_irq(q->queue_lock);
>>> +
>>> +       if (!(req->rq_flags & RQF_DONTPREP)) {
>>> +               req_to_mmc_queue_req(req)->retries = 0;
>>> +               req->rq_flags |= RQF_DONTPREP;
>>> +       }
>>> +
>>> +       if (get_card)
>>
>> Coming back to the get_card() thingy: I wonder whether it's fragile.
>>
>> A request that finds get_card == true here doesn't necessarily have
>> to reach this point first (the task may be preempted), in case there
>> is another request being queued in parallel (or can that not happen?).
>>
>> That could then lead to the following steps being executed for the
>> other requests before anybody has called mmc_get_card().
>
> You are right, this logic does not support parallel dispatches.
>

This does raise a question: don't you think it would be beneficial,
especially for CQE, to allow parallel dispatches?

I am not saying we should change this at this point, just that we may
consider changing this for future improvements.

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 13/24] mmc: block: Add blk-mq support
  2017-11-27 10:20     ` Adrian Hunter
  2017-11-27 11:23       ` Ulf Hansson
@ 2017-11-27 11:36       ` Ulf Hansson
  1 sibling, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-27 11:36 UTC (permalink / raw)
  To: Adrian Hunter, Jens Axboe, Paolo Valente
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

+ Jens, Paolo

[...]

>>> +static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
>>> +                                 struct request *req)
>>> +{
>>> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
>>> +       struct mmc_host *host = mq->card->host;
>>> +       struct request *prev_req = NULL;
>>> +       int err = 0;
>>> +
>>> +       mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
>>> +
>>> +       mqrq->brq.mrq.done = mmc_blk_mq_req_done;
>>> +
>>> +       mmc_pre_req(host, &mqrq->brq.mrq);
>>
>> To be honest, using a queue_depth of 64 puzzles me! According to my
>> understanding we should use a queue_depth of 2, in case the host
>> implements the ->pre|post_req() callbacks, else we should set it to 1.
>>
>> Although I may be missing some information about how to really use
>> this, because for example UBI (mtd) also uses 64 as queue depth!?
>>
>> My interpretation of the queue_depth is that the blk-mq layer will use
>> it to understand the maximum number of requests a block device is able
>> to operate on simultaneously (when having one HW queue); thus the
>> number of outstanding dispatched requests for the block device driver
>> may be as close as possible to the queue_depth, but never above. I may
>> be totally wrong about this. :-)
>
> For blk-mq, the queue_depth also defines the default nr_requests, which will
> be 2 times the queue_depth if there is an elevator.  The old nr_requests was
> 128, so setting 64 gives the same nr_requests as before.
>
> Otherwise the queue_depth is the size of the tag set.
>
> A very low queue_depth might be a problem for I/O schedulers like kyber
> which seems to try to limit the number of tags available for asynchronous
> requests.

You are probably right about this, but it makes no sense to me.

I don't understand what the queue_depth, stated by the storage device,
has to do with the number of requests available for I/O scheduling.

I have looped in Jens and Paolo (BFQ), perhaps they can help to shed
some more light on this.

>
>>
>> Anyway, then if using a queue_depth of 64, how will you make sure that
>> you do not end up having > 1 requests being prepared at the same time
>> (not counting the one that may be in transfer)?
>
> We are currently single-threaded since every request goes through
> hctx->run_work when BLK_MQ_F_BLOCKING and nr_hw_queues == 1.  It might be
> worth adding a mutex to ensure that never changes.
>
> This point also answers some of the questions below, since there can be no
> parallel dispatches.
>

Yeah it does, again thanks!

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 13/24] mmc: block: Add blk-mq support
  2017-11-27 11:23       ` Ulf Hansson
@ 2017-11-27 14:15         ` Adrian Hunter
  2017-11-28 10:58           ` Ulf Hansson
  0 siblings, 1 reply; 50+ messages in thread
From: Adrian Hunter @ 2017-11-27 14:15 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 27/11/17 13:23, Ulf Hansson wrote:
> On 27 November 2017 at 11:20, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 24/11/17 12:12, Ulf Hansson wrote:
>>> [...]
>>>
>>>> +/* Single sector read during recovery */
>>>> +static void mmc_blk_ss_read(struct mmc_queue *mq, struct request *req)
>>>
>>> Nitpick: I think mmc_blk_read_single() would be better as it is a
>>> clearer name. Would you mind changing it?
>>>
>>>> +{
>>>> +       struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
>>>> +       blk_status_t status;
>>>> +
>>>> +       while (1) {
>>>> +               mmc_blk_rw_rq_prep(mqrq, mq->card, 1, mq);
>>>> +
>>>> +               mmc_wait_for_req(mq->card->host, &mqrq->brq.mrq);
>>>> +
>>>> +               /*
>>>> +                * Not expecting command errors, so just give up in that case.
>>>> +                * If there are retries remaining, the request will get
>>>> +                * requeued.
>>>> +                */
>>>> +               if (mqrq->brq.cmd.error)
>>>> +                       return;
>>>
>>> What happens here if the reason for the error is that the card was removed?
>>
>> Assuming the rescan is waiting for the host claim, the next read / write
>> request will end up calling mmc_detect_card_removed() in the recovery.
>> After that all following requests will error immediately because
>> mmc_mq_queue_rq() calls mmc_card_removed().
> 
> Yep, that seems reasonable. I have also tested this, and it seems to
> work as expected, just as before.
> 
>>
>>>
>>> I guess next time __blk_err_check() is called from the
>>> mmc_blk_mq_rw_recovery(), this will be detected and managed?
>>>
>>>> +
>>>> +               if (blk_rq_bytes(req) <= 512)
>>>
>>> Shouldn't you check "if (blk_rq_bytes(req) < 512)"? How would you
>>> otherwise read the last 512-byte block?
>>
>> At this point we have read the last sector but not updated the request, so
>> the number of bytes left should be 512.  The reason we don't update the
>> request is so that the logic in mmc_blk_mq_complete_rq() will work.  I will
>> add a comment.
> 
> Not sure I get that, but I assume the comment will help me understand. :-)
> 
>>
>>>
>>>> +                       break;
>>>> +
>>>> +               status = mqrq->brq.data.error ? BLK_STS_IOERR : BLK_STS_OK;
>>>> +
>>>> +               blk_update_request(req, status, 512);
>>>
>>> Shouldn't we actually bail out, unless the error is a data ECC error?
>>> On the other hand, I guess if it is a more severe error, cmd.error will
>>> anyway be set above!?
>>>
>>> One more question, if there is a data error, we may want to try to
>>> recover by sending a stop command? How do we manage that?
>>
>> I was thinking a single-block read would not need a stop.  I will think
>> some more about error handling here.
> 
> Great!
> 
> Anyway, you may be right, and perhaps it may not be worth adding
> error handling, especially if it complicates the code a lot.
> 
> [...]
> 
>>>> +static void mmc_blk_mq_acct_req_done(struct mmc_queue *mq, struct request *req)
>>>
>>> Nitpick: Can we please try to find a better name for this function? I
>>> don't think "acct" is a good abbreviation because, to me, it's not
>>> self-explanatory.
>>
>> What about mmc_blk_mq_decrement_in_flight() ?
> 
> Looks good, or perhaps even: mmc_blk_mq_dec_in_flight().
> 
>>
>>>
>>>> +{
>>>> +       struct request_queue *q = req->q;
>>>> +       unsigned long flags;
>>>> +       bool put_card;
>>>> +
>>>> +       spin_lock_irqsave(q->queue_lock, flags);
>>>> +
>>>> +       mq->in_flight[mmc_issue_type(mq, req)] -= 1;
>>>> +
>>>> +       put_card = (mmc_tot_in_flight(mq) == 0);
>>>> +
>>>> +       spin_unlock_irqrestore(q->queue_lock, flags);
>>>> +
>>>> +       if (put_card)
>>>> +               mmc_put_card(mq->card, &mq->ctx);
>>>
>>> I have tried to convince myself that the protection of calling
>>> mmc_get|put_card() is safe, but I am not sure.
>>>
>>> I am wondering whether there could be races for mmc_get|put_card().
>>> Please see some more related comments below.
>>
>> mmc_put_card() is safe and necessary if we have seen mmc_tot_in_flight(mq)
>> == 0.  When the next request arrives it will have to do a mmc_get_card()
>> because it is changing the number of requests in flight from 0 to 1.  It
>> doesn't matter if that mmc_get_card() comes before or after or during this
>> mmc_put_card().
>>
>>>
>>> [...]
> 
> [...]
> 
>>>
>>> Anyway, then if using a queue_depth of 64, how will you make sure that
>>> you do not end up having > 1 requests being prepared at the same time
>>> (not counting the one that may be in transfer)?
>>
>> We are currently single-threaded since every request goes through
>> hctx->run_work when BLK_MQ_F_BLOCKING and nr_hw_queues == 1.  It might be
>> worth adding a mutex to ensure that never changes.
>>
>> This point also answers some of the questions below, since there can be no
>> parallel dispatches.
> 
> Yeah, it clearly does. Thanks!
> 
>>>> +
>>>> +enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
>>>> +{
>>>> +       struct mmc_blk_data *md = mq->blkdata;
>>>> +       struct mmc_card *card = md->queue.card;
>>>> +       struct mmc_host *host = card->host;
>>>> +       int ret;
>>>> +
>>>> +       ret = mmc_blk_part_switch(card, md->part_type);
>>>
>>> What if there is an ongoing request? Shouldn't you wait for that to
>>> complete before switching partition?
>>
>> Two requests on the same queue cannot be on different partitions because we
>> have a different queue (and block device) for each partition.
> 
> That's not true for RPMB anymore I am afraid.
> 
>> RPMB shares the same queue as the main eMMC partition, because we
>> strive towards fair I/O scheduling across the whole device.

I hadn't thought of RPMB, but I think the logic is OK, which is good because
it is the same as we presently have.  Here the md->part_type will be the
main area even for RPMB.  So this switch won't do anything if we have a
request in flight.  Then inside __mmc_blk_ioctl_cmd() the switch to RPMB is
done, and afterwards mmc_blk_issue_drv_op() switches it back again.
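
Condensed, the sequence being described is (flow abridged, "..." elides
the surrounding code):

        mmc_blk_part_switch(card, md->part_type);  /* main area: no-op here */
        ...
        __mmc_blk_ioctl_cmd(card, md, idata);      /* switches to RPMB itself */
        ...
        mmc_blk_part_switch(card, md->part_type);  /* drv_op restores main */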

> 
>>
>>>
>>>> +       if (ret)
>>>> +               return MMC_REQ_FAILED_TO_START;
>>>> +
>>>> +       switch (mmc_issue_type(mq, req)) {
>>>> +       case MMC_ISSUE_SYNC:
>>>> +               ret = mmc_blk_wait_for_idle(mq, host);
>>>> +               if (ret)
>>>> +                       return MMC_REQ_BUSY;
>>>
>>> Wouldn't it be possible that yet another SYNC request becomes queued in
>>> parallel with this current one? Then, when reaching this point, how do
>>> you make sure that the new request waits for the current "SYNC" request?
>>
>> As mentioned above, there are no parallel dispatches.
>>
>>>
>>> I mean, is the above mmc_blk_wait_for_idle() really sufficient to deal
>>> with synchronization?
>>
>> So long as there are no parallel dispatches.
>>
>>>
>>> I guess we could use mmc_claim_host(no-ctx) in some clever way to deal
>>> with this, or perhaps there is a better option?
>>
>> We are relying on there being no parallel dispatches.  That is the case now,
>> but if it weren't we could use a mutex in mmc_mq_queue_rq().
>>
> 
> Yeah, but then leave that until needed.
> 
>>>
>>> BTW, I guess the problem is also present if there is a SYNC request
>>> ongoing and then a new ASYNC request comes in. Is the ASYNC
>>> request really waiting for the SYNC request to finish?
>>
>> With no parallel dispatches, the SYNC request runs to completion before
>> another request can be dispatched.
> 
> Yes, I get it now. Thanks for clarifying this!
> 
> [...]
> 
>>>> +static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>>> +                                   const struct blk_mq_queue_data *bd)
>>>> +{
>>>> +       struct request *req = bd->rq;
>>>> +       struct request_queue *q = req->q;
>>>> +       struct mmc_queue *mq = q->queuedata;
>>>> +       struct mmc_card *card = mq->card;
>>>> +       enum mmc_issue_type issue_type;
>>>> +       enum mmc_issued issued;
>>>> +       bool get_card;
>>>> +       int ret;
>>>> +
>>>> +       if (mmc_card_removed(mq->card)) {
>>>> +               req->rq_flags |= RQF_QUIET;
>>>> +               return BLK_STS_IOERR;
>>>> +       }
>>>> +
>>>> +       issue_type = mmc_issue_type(mq, req);
>>>> +
>>>> +       spin_lock_irq(q->queue_lock);
>>>> +
>>>> +       switch (issue_type) {
>>>> +       case MMC_ISSUE_ASYNC:
>>>> +               break;
>>>> +       default:
>>>> +               /*
>>>> +                * Timeouts are handled by mmc core, and we don't have a host
>>>> +                * API to abort requests, so we can't handle the timeout anyway.
>>>> +                * However, when the timeout happens, blk_mq_complete_request()
>>>> +                * no longer works (to stop the request disappearing under us).
>>>> +                * To avoid racing with that, set a large timeout.
>>>> +                */
>>>> +               req->timeout = 600 * HZ;
>>>> +               break;
>>>> +       }
>>>> +
>>>> +       mq->in_flight[issue_type] += 1;
>>>> +       get_card = (mmc_tot_in_flight(mq) == 1);
>>>> +
>>>> +       spin_unlock_irq(q->queue_lock);
>>>> +
>>>> +       if (!(req->rq_flags & RQF_DONTPREP)) {
>>>> +               req_to_mmc_queue_req(req)->retries = 0;
>>>> +               req->rq_flags |= RQF_DONTPREP;
>>>> +       }
>>>> +
>>>> +       if (get_card)
>>>
>>> Coming back to the get_card() thingy: I wonder whether it's fragile.
>>>
>>> A request that finds get_card == true here doesn't necessarily have
>>> to reach this point first (the task may be preempted), in case there
>>> is another request being queued in parallel (or can that not happen?).
>>>
>>> That could then lead to the following steps being executed for the
>>> other requests before anybody has called mmc_get_card().
>>
>> You are right, this logic does not support parallel dispatches.
>>
> 
>> This does raise a question: don't you think it would be beneficial,
>> especially for CQE, to allow parallel dispatches?
> 
> I am not saying we should change this at this point, just that we may
> consider changing this for future improvements.

I think the benefit is limited because the time to dispatch a request is
small compared with the time to complete a request. i.e. a number of
requests can be queued before the first one has completed.  But yes, it is
something to keep in mind.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 00/24] mmc: Add Command Queue support
  2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
                   ` (23 preceding siblings ...)
  2017-11-21 13:42 ` [PATCH V14 24/24] mmc: core: " Adrian Hunter
@ 2017-11-28  9:42 ` Linus Walleij
  2017-11-28 19:15   ` Ulf Hansson
  24 siblings, 1 reply; 50+ messages in thread
From: Linus Walleij @ 2017-11-28  9:42 UTC (permalink / raw)
  To: Adrian Hunter, Paolo Valente
  Cc: Ulf Hansson, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:

> Here is V14 of the hardware command queue patches without the software
> command queue patches, now using blk-mq and now with blk-mq support for
> non-CQE I/O.
>
> V14 includes a number of fixes to existing code, changes to default to
> blk-mq, and adds patches to remove legacy code.

I have looked over the code. I was unable to find a good merge base to apply
it on (I guess it is based on linux-next at some date in the past), so mostly
I just looked at it overall, and I can solidly say that this patch series:

Acked-by: Linus Walleij <linus.walleij@linaro.org>

I gave some more explicit review on some initial patches that I think
should go in as fixes.

I do not expect it to perform any worse than the previous iteration on my
systems, where it was already performing well, and Bartlomiej has also
confirmed that the patch set works for him.

Ulf: I suggest this be applied (+/- some rebasing) early for v4.15.

I am positively convinced that we can make things work on top of this.

> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
> 2% drop in sequential read speed but no change to sequential write.

Fully acceptable I think.

> Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
> seemed to be coming from the inferior latency of running work items compared
> with a dedicated thread.  Hacking blk-mq workqueue to be unbound reduced the
> performance degradation from 3% to 1%.

Also acceptable I think.

> While we should look at changing blk-mq to give better workqueue performance,
> a bigger gain is likely to be made by adding a new host API to enable the
> next already-prepared request to be issued directly from within ->done()
> callback of the current request.

I agree.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 13/24] mmc: block: Add blk-mq support
  2017-11-27 14:15         ` Adrian Hunter
@ 2017-11-28 10:58           ` Ulf Hansson
  0 siblings, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-28 10:58 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

[...]

>>>>> +
>>>>> +enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
>>>>> +{
>>>>> +       struct mmc_blk_data *md = mq->blkdata;
>>>>> +       struct mmc_card *card = md->queue.card;
>>>>> +       struct mmc_host *host = card->host;
>>>>> +       int ret;
>>>>> +
>>>>> +       ret = mmc_blk_part_switch(card, md->part_type);
>>>>
>>>> What if there is an ongoing request? Shouldn't you wait for that to
>>>> complete before switching partition?
>>>
>>> Two requests on the same queue cannot be on different partitions because we
>>> have a different queue (and block device) for each partition.
>>
>> That's not true for RPMB anymore I am afraid.
>>
>> RPMB shares the same queue as the main eMMC partition, because we
>> strive towards fair I/O scheduling across the whole device.
>
> I hadn't thought of RPMB, but I think the logic is OK, which is good because
> it is the same as we presently have.  Here the md->part_type will be the
> main area even for RPMB.  So this switch won't do anything if we have a
> request in flight.  Then inside __mmc_blk_ioctl_cmd() the switch to RPMB is
> done, and afterwards mmc_blk_issue_drv_op() switches it back again.

Yes, you are right! No worries then!

[...]

>>>
>>> You are right, this logic does not support parallel dispatches.
>>>
>>
>> This does raise a question: don't you think it would be beneficial,
>> especially for CQE, to allow parallel dispatches?
>>
>> I am not saying we should change this at this point, just that we may
>> consider changing this for future improvements.
>
> I think the benefit is limited because the time to dispatch a request is
> small compared with the time to complete a request. i.e. a number of
> requests can be queued before the first one has completed.  But yes, it is
> something to keep in mind.

Yeah, let's leave this for future consideration.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 14/24] mmc: block: Add CQE support
  2017-11-21 13:42 ` [PATCH V14 14/24] mmc: block: Add CQE support Adrian Hunter
@ 2017-11-28 11:20   ` Ulf Hansson
  0 siblings, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-28 11:20 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 21 November 2017 at 14:42, Adrian Hunter <adrian.hunter@intel.com> wrote:
> Add CQE support to the block driver, including:
>     - optionally using DCMD for flush requests
>     - "manually" issuing discard requests
>     - issuing read / write requests to the CQE
>     - supporting block-layer timeouts
>     - handling recovery
>     - supporting re-tuning
>
> CQE offers 25% - 50% better random multi-threaded I/O.  There is a slight
> (e.g. 2%) drop in sequential read speed but no observable change to sequential
> write.
>
> CQE automatically sends the commands to complete requests.  However it only
> supports reads / writes and so-called "direct commands" (DCMD).  Furthermore
> DCMD is limited to one command at a time, but discards require 3 commands.
> That makes issuing discards through CQE very awkward, but some CQEs don't
> support DCMD anyway.  So for discards, the existing non-CQE approach is
> taken, where the mmc core code issues the 3 commands one at a time, i.e.
> mmc_erase().  Where DCMD is used is for issuing flushes.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

This looks good to me!

I only have one, very minor comment.

[...]

> @@ -370,10 +514,14 @@ static int mmc_mq_init_queue(struct mmc_queue *mq, int q_depth,
>  static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
>                          spinlock_t *lock)
>  {
> +       struct mmc_host *host = card->host;
>         int q_depth;
>         int ret;
>
> -       q_depth = MMC_QUEUE_DEPTH;
> +       if (mq->use_cqe)
> +               q_depth = min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);

To make it clear why this is needed, could you please add a comment
in the code?
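
For example, something along these lines (the comment wording is only a
suggestion):

        if (mq->use_cqe)
                /*
                 * The CQE queue depth is bounded both by the number of
                 * tasks the card supports (ext_csd.cmdq_depth) and by how
                 * many requests the host controller's CQE can track
                 * (cqe_qdepth), so use the smaller of the two.
                 */
                q_depth = min_t(int, card->ext_csd.cmdq_depth,
                                host->cqe_qdepth);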

As I was trying to point out in the other reply about queue depth, for
patch 13, this is weird to me.
This may mean that we end up using a queue_depth less than
MMC_QUEUE_DEPTH (64) for the CQE case, while in fact, in the CQE case,
the HW actually supports a bigger queue depth compared to when not
using CQE.

Anyway, it seems like that will have to be a separate topic to discuss
with the blk-mq experts.

> +       else
> +               q_depth = MMC_QUEUE_DEPTH;
>

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 17/24] mmc: block: blk-mq: Add support for direct completion
  2017-11-21 13:42 ` [PATCH V14 17/24] mmc: block: blk-mq: Add support for direct completion Adrian Hunter
@ 2017-11-28 19:02   ` Ulf Hansson
  0 siblings, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-28 19:02 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

[...]

>
> diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
> index 1d7d3b0afff8..c4271fa54f1a 100644
> --- a/drivers/mmc/core/queue.h
> +++ b/drivers/mmc/core/queue.h
> @@ -103,6 +103,7 @@ struct mmc_queue {
>         bool                    waiting;
>         struct work_struct      recovery_work;
>         wait_queue_head_t       wait;
> +       struct request          *recovery_req;
>         struct request          *complete_req;
>         struct mutex            complete_lock;
>         struct work_struct      complete_work;
> @@ -134,4 +135,9 @@ static inline int mmc_cqe_qcnt(struct mmc_queue *mq)
>                mq->in_flight[MMC_ISSUE_ASYNC];
>  }
>
> +static inline bool mmc_queue_direct_complete(struct mmc_host *host)

Nitpick 1) I would like to make it clear that this is a feature that
depends on the behavior of the mmc host. Thus, I suggest renaming the
function to something along the lines of mmc_host_*()

Nitpick 2) The system-wide PM core has already "reserved" the
"direct_complete" terminology. To avoid mixing them up (terminology-
wise), may I suggest renaming the mmc cap to MMC_CAP_DONE_COMPLETE,
and thus perhaps the function above to mmc_host_done_complete()? Or
pick a better option, if you find one. :-)

> +{
> +       return host->caps & MMC_CAP_DIRECT_COMPLETE;
> +}
> +
>  #endif
> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> index ce2075d6f429..4b68a95a8818 100644
> --- a/include/linux/mmc/host.h
> +++ b/include/linux/mmc/host.h
> @@ -324,6 +324,7 @@ struct mmc_host {
>  #define MMC_CAP_DRIVER_TYPE_A  (1 << 23)       /* Host supports Driver Type A */
>  #define MMC_CAP_DRIVER_TYPE_C  (1 << 24)       /* Host supports Driver Type C */
>  #define MMC_CAP_DRIVER_TYPE_D  (1 << 25)       /* Host supports Driver Type D */
> +#define MMC_CAP_DIRECT_COMPLETE        (1 << 27)       /* RW reqs can be completed within mmc_request_done() */
>  #define MMC_CAP_CD_WAKE                (1 << 28)       /* Enable card detect wake */
>  #define MMC_CAP_CMD_DURING_TFR (1 << 29)       /* Commands during data transfer */
>  #define MMC_CAP_CMD23          (1 << 30)       /* CMD23 supported. */
> --
> 1.9.1
>

Besides the nitpicks, this looks good to me!
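
Concretely, the rename suggested in the nitpicks could end up as the
sketch below (MMC_CAP_DONE_COMPLETE is only a proposed name at this
point in the thread):

static inline bool mmc_host_done_complete(struct mmc_host *host)
{
	return host->caps & MMC_CAP_DONE_COMPLETE;
}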

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 00/24] mmc: Add Command Queue support
  2017-11-28  9:42 ` [PATCH V14 00/24] mmc: Add Command Queue support Linus Walleij
@ 2017-11-28 19:15   ` Ulf Hansson
  2017-11-28 19:23     ` Ulf Hansson
  0 siblings, 1 reply; 50+ messages in thread
From: Ulf Hansson @ 2017-11-28 19:15 UTC (permalink / raw)
  To: Linus Walleij, Adrian Hunter
  Cc: Paolo Valente, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Linus, Adrian,

On 28 November 2017 at 10:42, Linus Walleij <linus.walleij@linaro.org> wrote:
> On Tue, Nov 21, 2017 at 2:42 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>
>> Here is V14 of the hardware command queue patches without the software
>> command queue patches, now using blk-mq and now with blk-mq support for
>> non-CQE I/O.
>>
>> V14 includes a number of fixes to existing code, changes to default to
>> blk-mq, and adds patches to remove legacy code.
>
> I have looked over the code.  I was unable to find a good merge base to apply
> it on (I guess it is based on linux-next at some date in the past), so mostly
> I just looked at it overall, and I can solidly say that this patch series:
>
> Acked-by: Linus Walleij <linus.walleij@linaro.org>

Great, thanks!

>
> I gave some more explicit review on some initial patches that I think
> should go in as fixes.

Thanks, and already taken care of.

>
> I do not expect it to perform any worse than the previous iteration on my
> systems, where it was already performing well, and Bartlomiej has also
> confirmed that the patch set works for him.
>
> Ulf: I suggest this be applied (+/- some rebasing) early for v4.15.

Yes, I am up for that!

I have now also completed my review of the series and, in the end, most
of my comments turned out to be about minor issues, hopefully easily
addressed.

>
> I am positively convinced that we can make things work on top of this.
>
>> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
>> 2% drop in sequential read speed but no change to sequential write.
>
> Fully acceptable I think.
>
>> Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
>> seemed to be coming from the inferior latency of running work items compared
>> with a dedicated thread.  Hacking blk-mq workqueue to be unbound reduced the
>> performance degradation from 3% to 1%.
>
> Also acceptable I think.
>
>> While we should look at changing blk-mq to give better workqueue performance,
>> a bigger gain is likely to be made by adding a new host API to enable the
>> next already-prepared request to be issued directly from within ->done()
>> callback of the current request.

I assume that is taken care of by adding the new host cap
(MMC_CAP_DIRECT_COMPLETE).

So, then it's just a matter of adopting all host drivers, and when we
are done with that, we can remove that cap. :-)

>
> I agree.
>
> Yours,
> Linus Walleij

I am awaiting a rebased v15 version - and am eager to apply it! :-)

Thanks and kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH V14 00/24] mmc: Add Command Queue support
  2017-11-28 19:15   ` Ulf Hansson
@ 2017-11-28 19:23     ` Ulf Hansson
  0 siblings, 0 replies; 50+ messages in thread
From: Ulf Hansson @ 2017-11-28 19:23 UTC (permalink / raw)
  To: Linus Walleij, Adrian Hunter
  Cc: Paolo Valente, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

[...]

>>> While we should look at changing blk-mq to give better workqueue performance,
>>> a bigger gain is likely to be made by adding a new host API to enable the
>>> next already-prepared request to be issued directly from within ->done()
>>> callback of the current request.
>
> I assume that is taken care of by adding the new host cap
> (MMC_CAP_DIRECT_COMPLETE).
>
> So, then it's just a matter of adopting all host drivers, and when we
> are done with that, we can remove that cap. :-)

Ehh, I realize that this is not what you really propose, but rather a
next step of improvement. Anyway.
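
To illustrate the distinction, a sketch under assumed names follows:
the mmc_queue_direct_complete() helper is from patch 17, and the rest
of the plumbing is simplified.  With the cap set, the block driver
completes a finished request straight from the host's ->done() context;
issuing the *next* already-prepared request from that context would be
the further step.

static void mmc_blk_mq_req_done(struct mmc_request *mrq)
{
	struct mmc_queue_req *mqrq =
		container_of(mrq, struct mmc_queue_req, brq.mrq);
	struct request *req = mmc_queue_req_to_req(mqrq);
	struct mmc_queue *mq = req->q->queuedata;

	if (mmc_queue_direct_complete(mq->card->host))
		blk_mq_complete_request(req);	/* complete right here */
	else
		schedule_work(&mq->complete_work);	/* defer to a work item */
}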

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2017-11-28 19:23 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-21 13:42 [PATCH V14 00/24] mmc: Add Command Queue support Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 01/24] mmc: block: Fix missing blk_put_request() Adrian Hunter
2017-11-23  9:41   ` Linus Walleij
2017-11-23 18:12   ` Ulf Hansson
2017-11-21 13:42 ` [PATCH V14 02/24] mmc: block: Check return value of blk_get_request() Adrian Hunter
2017-11-23  9:56   ` Linus Walleij
2017-11-23 18:12   ` Ulf Hansson
2017-11-21 13:42 ` [PATCH V14 03/24] mmc: core: Do not leave the block driver in a suspended state Adrian Hunter
2017-11-23  9:58   ` Linus Walleij
2017-11-23 18:12   ` Ulf Hansson
2017-11-21 13:42 ` [PATCH V14 04/24] mmc: block: Ensure that debugfs files are removed Adrian Hunter
2017-11-23 13:22   ` Linus Walleij
2017-11-23 18:13   ` Ulf Hansson
2017-11-21 13:42 ` [PATCH V14 05/24] mmc: block: No need to export mmc_cleanup_queue() Adrian Hunter
2017-11-23 13:23   ` Linus Walleij
2017-11-21 13:42 ` [PATCH V14 06/24] mmc: block: Simplify cleaning up the queue Adrian Hunter
2017-11-23 13:27   ` Linus Walleij
2017-11-21 13:42 ` [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect() Adrian Hunter
2017-11-21 15:39   ` Ulf Hansson
2017-11-22  7:40     ` Adrian Hunter
2017-11-22 14:43       ` Ulf Hansson
2017-11-23 11:37         ` Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 08/24] mmc: block: Check for transfer state " Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 09/24] mmc: block: Make card_busy_detect() accumulate all response error bits Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 10/24] mmc: core: Make mmc_pre_req() and mmc_post_req() available Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 11/24] mmc: block: Add error-handling comments Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 12/24] mmc: core: Add parameter use_blk_mq Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 13/24] mmc: block: Add blk-mq support Adrian Hunter
2017-11-24 10:12   ` Ulf Hansson
2017-11-27 10:20     ` Adrian Hunter
2017-11-27 11:23       ` Ulf Hansson
2017-11-27 14:15         ` Adrian Hunter
2017-11-28 10:58           ` Ulf Hansson
2017-11-27 11:36       ` Ulf Hansson
2017-11-21 13:42 ` [PATCH V14 14/24] mmc: block: Add CQE support Adrian Hunter
2017-11-28 11:20   ` Ulf Hansson
2017-11-21 13:42 ` [PATCH V14 15/24] mmc: cqhci: support for command queue enabled host Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 16/24] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 17/24] mmc: block: blk-mq: Add support for direct completion Adrian Hunter
2017-11-28 19:02   ` Ulf Hansson
2017-11-21 13:42 ` [PATCH V14 18/24] mmc: block: blk-mq: Separate card polling from recovery Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 19/24] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 20/24] mmc: block: blk-mq: Stop using legacy recovery Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 21/24] mmc: mmc_test: Do not use mmc_start_areq() anymore Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 22/24] mmc: core: Remove option not to use blk-mq Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 23/24] mmc: block: Remove code no longer needed after the switch to blk-mq Adrian Hunter
2017-11-21 13:42 ` [PATCH V14 24/24] mmc: core: " Adrian Hunter
2017-11-28  9:42 ` [PATCH V14 00/24] mmc: Add Command Queue support Linus Walleij
2017-11-28 19:15   ` Ulf Hansson
2017-11-28 19:23     ` Ulf Hansson

This is a public inbox; see mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for NNTP newsgroup(s).