* [PATCH V15 00/22] mmc: Add Command Queue support
From: Adrian Hunter @ 2017-11-29 13:40 UTC
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Hi

Here is V15 of the hardware command queue patches, without the software
command queue patches, now using blk-mq and now with blk-mq support for
non-CQE I/O.

V14 included a number of fixes to existing code, changed the default to
blk-mq, and added patches to remove legacy code.

HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
2% drop in sequential read speed but no change to sequential write.

Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
seemed to come from the inferior latency of running work items compared
with a dedicated thread.  Hacking the blk-mq workqueue to be unbound
reduced the performance degradation from 3% to 1%.
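
For reference, the hack was along these lines (a sketch only, assuming
kblockd is the workqueue in question - this is not a patch from this
series):

	/* block/blk-core.c: allow blk-mq run work to migrate off the
	 * submitting CPU (WQ_UNBOUND added for the experiment)
	 */
	kblockd_workqueue = alloc_workqueue("kblockd",
					    WQ_MEM_RECLAIM | WQ_HIGHPRI |
					    WQ_UNBOUND, 0);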

While we should look at changing blk-mq to give better workqueue
performance, a bigger gain is likely to be made by adding a new host API
to enable the next already-prepared request to be issued directly from
within the ->done() callback of the current request.
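
A purely illustrative sketch of such an API (the next_mrq field and the
exact call flow are invented here, not part of this series):

	/* Hypothetical: issue the next already-prepared request straight
	 * from the completion path instead of waking a task or work item.
	 */
	void mmc_request_done(struct mmc_host *host, struct mmc_request *mrq)
	{
		struct mmc_request *next = host->next_mrq; /* pre-prepared */

		if (next) {
			host->next_mrq = NULL;
			host->ops->request(host, next);	/* no context switch */
		}

		mrq->done(mrq);		/* now complete the old request */
	}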

Changes since V14:
      mmc: block: Fix missing blk_put_request()
      mmc: block: Check return value of blk_get_request()
      mmc: core: Do not leave the block driver in a suspended state
      mmc: block: Ensure that debugfs files are removed
	Dropped because they have been applied
      mmc: block: Use data timeout in card_busy_detect()
	Replaced by other patches
      mmc: block: Add blk-mq support
	Rename mmc_blk_ss_read() to mmc_blk_read_single()
	Add more error handling to single sector read
	Let mmc_blk_mq_complete_rq() cater for requests already "updated" by recovery
	Rename mmc_blk_mq_acct_req_done() to mmc_blk_mq_dec_in_flight()
	Add comments about synchronization
	Add comment about not dispatching in parallel
	Add comment about the queue depth
      mmc: block: Add CQE support
	Add comment about CQE queue depth
      mmc: block: blk-mq: Add support for direct completion
	Rename mmc_queue_direct_complete() to mmc_host_done_complete()
	Rename MMC_CAP_DIRECT_COMPLETE to MMC_CAP_DONE_COMPLETE
      mmc: block: blk-mq: Separate card polling from recovery
	Ensure that gen_err is reported as an error
      mmc: block: Make card_busy_detect() accumulate all response error bits
	Patch moved later in the patch set and adjusted accordingly
      mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
	Adjusted due to patch re-ordering
      mmc: block: Check the timeout correctly in card_busy_detect()
	New patch.
      mmc: block: Add timeout_clks when calculating timeout
	New patch.
      mmc: block: Reduce polling timeout from 10 minutes to 10 seconds
	New patch.

Changes since V13:
      mmc: block: Fix missing blk_put_request()
	New patch.
      mmc: block: Check return value of blk_get_request()
	New patch.
      mmc: core: Do not leave the block driver in a suspended state
	New patch.
      mmc: block: Ensure that debugfs files are removed
	New patch.
      mmc: block: No need to export mmc_cleanup_queue()
	New patch.
      mmc: block: Simplify cleaning up the queue
	New patch.
      mmc: block: Use data timeout in card_busy_detect()
	New patch.
      mmc: block: Check for transfer state in card_busy_detect()
	New patch.
      mmc: block: Make card_busy_detect() accumulate all response error bits
	New patch.
      mmc: core: Make mmc_pre_req() and mmc_post_req() available
	New patch.
      mmc: core: Add parameter use_blk_mq
	Default to y
      mmc: block: Add blk-mq support
	Wrap blk_mq_end_request / blk_end_request_all
	Rename mmc_blk_rw_recovery -> mmc_blk_mq_rw_recovery
	Additional parentheses to '==' expressions
	Use mmc_pre_req() / mmc_post_req()
	Fix missing tuning release on error after mmc_start_request()
	Expand comment about timeouts
	Allow for possibility that the queue is quiesced when removing
	Ensure complete_work is flushed when removing
      mmc: block: Add CQE support
	Additional parentheses to '==' expressions
      mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
	Replaces patch "Stop using card_busy_detect()", retaining card_busy_detect()
      mmc: block: blk-mq: Stop using legacy recovery
	Allow for SPI
      mmc: mmc_test: Do not use mmc_start_areq() anymore
	New patch.
      mmc: core: Remove option not to use blk-mq
	New patch.
      mmc: block: Remove code no longer needed after the switch to blk-mq
	New patch.
      mmc: core: Remove code no longer needed after the switch to blk-mq
	New patch.

Changes since V12:
      mmc: block: Add error-handling comments
	New patch.
      mmc: block: Add blk-mq support
	Use legacy error handling
      mmc: block: Add CQE support
	Re-base
      mmc: block: blk-mq: Add support for direct completion
	New patch.
      mmc: block: blk-mq: Separate card polling from recovery
	New patch.
      mmc: block: blk-mq: Stop using card_busy_detect()
	New patch.
      mmc: block: blk-mq: Stop using legacy recovery
	New patch.

Changes since V11:
      Split "mmc: block: Add CQE and blk-mq support" into 2 patches

Changes since V10:
      mmc: core: Remove unnecessary host claim
      mmc: core: Introduce host claiming by context
      mmc: core: Add support for handling CQE requests
      mmc: mmc: Enable Command Queuing
      mmc: mmc: Enable CQE's
      mmc: block: Use local variables in mmc_blk_data_prep()
      mmc: block: Prepare CQE data
      mmc: block: Factor out mmc_setup_queue()
      mmc: core: Add parameter use_blk_mq
      mmc: core: Export mmc_start_bkops()
      mmc: core: Export mmc_start_request()
      mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
	Dropped because they have been applied
      mmc: block: Add CQE and blk-mq support
	Extend blk-mq support for asynchronous reads / writes to all host
	controllers including those that require polling. The direct
	completion path is still available but depends on a new capability
	flag.
	Drop blk-mq support for synchronous reads / writes.

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

Changes since V9:
      mmc: block: Add CQE and blk-mq support
	- reinstate mq support for REQ_OP_DRV_IN/OUT that was removed because
	it was incorrectly assumed to be handled by the rpmb character device
	- don't check for rpmb block device anymore
      mmc: cqhci: support for command queue enabled host
	Fix cqhci_set_irqs() as per Haibo Chen

Changes since V8:
	Re-based
      mmc: core: Introduce host claiming by context
	Slightly simplified as per Ulf
      mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
	New patch.
      mmc: block: Add CQE and blk-mq support
	Fix missing ->post_req() on the error path

Changes since V7:
	Re-based
      mmc: core: Introduce host claiming by context
	Slightly simplified
      mmc: core: Add parameter use_blk_mq
	New patch.
      mmc: core: Remove unnecessary host claim
	New patch.
      mmc: core: Export mmc_start_bkops()
	New patch.
      mmc: core: Export mmc_start_request()
	New patch.
      mmc: block: Add CQE and blk-mq support
	Add blk-mq support for non-CQE requests

Changes since V6:
      mmc: core: Introduce host claiming by context
	New patch.
      mmc: core: Move mmc_start_areq() declaration
	Dropped because it has been applied
      mmc: block: Fix block status codes
	Dropped because it has been applied
      mmc: host: Add CQE interface
	Dropped because it has been applied
      mmc: core: Turn off CQE before sending commands
	Dropped because it has been applied
      mmc: block: Factor out mmc_setup_queue()
	New patch.
      mmc: block: Add CQE support
	Drop legacy support and add blk-mq support

Changes since V5:
	Re-based
      mmc: core: Add mmc_retune_hold_now()
	Dropped because it has been applied
      mmc: core: Add members to mmc_request and mmc_data for CQE's
	Dropped because it has been applied
      mmc: core: Move mmc_start_areq() declaration
	New patch at Ulf's request
      mmc: block: Fix block status codes
	Another unrelated patch
      mmc: host: Add CQE interface
	Move recovery_notifier() callback to struct mmc_request
      mmc: core: Add support for handling CQE requests
	Roll __mmc_cqe_request_done() into mmc_cqe_request_done()
	Move function declarations requested by Ulf
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
	Dropped because it has been applied
      mmc: block: Add CQE support
	Add explanation to commit message
	Adjustment for changed recovery_notifier() callback
      mmc: cqhci: support for command queue enabled host
	Adjustment for changed recovery_notifier() callback
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
	Add DCMD capability for Intel controllers except GLK

Changes since V4:
      mmc: core: Add mmc_retune_hold_now()
	Add explanation to commit message.
      mmc: host: Add CQE interface
	Add comments to callback declarations.
      mmc: core: Turn off CQE before sending commands
	Add explanation to commit message.
      mmc: core: Add support for handling CQE requests
	Add comments as requested by Ulf.
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
	New patch.
      mmc: mmc: Enable Command Queuing
	Adjust for removal of MMC_CAP2_PACKED_CMD.
	Add a comment about Packed Commands.
      mmc: mmc: Enable CQE's
	Remove unnecessary check for MMC_CAP2_CQE
      mmc: block: Use local variables in mmc_blk_data_prep()
	New patch.
      mmc: block: Prepare CQE data
	Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
	Remove priority setting.
	Add explanation to commit message.
      mmc: cqhci: support for command queue enabled host
	Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA

Changes since V3:
	Adjusted ...blk_end_request...() for new block status codes
	Fixed CQHCI transaction descriptor for "no DCMD" case

Changes since V2:
	Dropped patches that have been applied.
	Re-based
	Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"

Changes since V1:

	"Share mmc request array between partitions" is dependent
	on changes in "Introduce queue semantics", so added that
	and block fixes:

	Added "Fix is_waiting_last_req set incorrectly"
	Added "Fix cmd error reset failure path"
	Added "Use local var for mqrq_cur"
	Added "Introduce queue semantics"

Changes since RFC:

	Re-based on next.
	Added comment about command queue priority.
	Added some acks and reviews.


Adrian Hunter (21):
      mmc: block: No need to export mmc_cleanup_queue()
      mmc: block: Simplify cleaning up the queue
      mmc: core: Make mmc_pre_req() and mmc_post_req() available
      mmc: block: Add error-handling comments
      mmc: core: Add parameter use_blk_mq
      mmc: block: Add blk-mq support
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
      mmc: block: blk-mq: Add support for direct completion
      mmc: block: blk-mq: Separate card polling from recovery
      mmc: block: Make card_busy_detect() accumulate all response error bits
      mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
      mmc: block: Check the timeout correctly in card_busy_detect()
      mmc: block: Check for transfer state in card_busy_detect()
      mmc: block: Add timeout_clks when calculating timeout
      mmc: block: Reduce polling timeout from 10 minutes to 10 seconds
      mmc: block: blk-mq: Stop using legacy recovery
      mmc: mmc_test: Do not use mmc_start_areq() anymore
      mmc: core: Remove option not to use blk-mq
      mmc: block: Remove code no longer needed after the switch to blk-mq
      mmc: core: Remove code no longer needed after the switch to blk-mq

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/core/block.c          | 1383 +++++++++++++++++++++----------------
 drivers/mmc/core/block.h          |   12 +-
 drivers/mmc/core/bus.c            |    2 -
 drivers/mmc/core/core.c           |  216 +-----
 drivers/mmc/core/core.h           |   39 +-
 drivers/mmc/core/host.h           |    6 +-
 drivers/mmc/core/mmc_test.c       |  122 ++--
 drivers/mmc/core/queue.c          |  504 +++++++++-----
 drivers/mmc/core/queue.h          |   64 +-
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1150 ++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 +++++++
 drivers/mmc/host/sdhci-pci-core.c |  155 ++++-
 include/linux/mmc/host.h          |    5 +-
 15 files changed, 2835 insertions(+), 1078 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h


Regards
Adrian


* [PATCH V15 01/22] mmc: block: No need to export mmc_cleanup_queue()
From: Adrian Hunter @ 2017-11-29 13:40 UTC
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

mmc_cleanup_queue() is not used by any other module. Do not export it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/queue.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 4f33d277b125..26f8da30ebe5 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -270,7 +270,6 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 
 	mq->card = NULL;
 }
-EXPORT_SYMBOL(mmc_cleanup_queue);
 
 /**
  * mmc_queue_suspend - suspend a MMC request queue
-- 
1.9.1


* [PATCH V15 02/22] mmc: block: Simplify cleaning up the queue
From: Adrian Hunter @ 2017-11-29 13:40 UTC
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Use blk_cleanup_queue() to shut down the queue when the driver is removed,
and instead take an extra reference to the queue to prevent it being freed
before the final mmc_blk_put().
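
In other words, the intended lifetime is (simplified from the diff below):

	/* probe: take an extra reference so the queue outlives remove */
	if (!blk_get_queue(md->queue.queue))
		goto err_putdisk;

	/* driver remove: shut the queue down */
	blk_cleanup_queue(md->queue.queue);

	/* final mmc_blk_put(): drop the last reference, freeing the queue */
	blk_put_queue(md->queue.queue);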

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 17 ++++++++++++-----
 drivers/mmc/core/queue.c |  2 ++
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index ccfa98af1dd3..e44f6d90aeb4 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -189,7 +189,7 @@ static void mmc_blk_put(struct mmc_blk_data *md)
 	md->usage--;
 	if (md->usage == 0) {
 		int devidx = mmc_get_devidx(md->disk);
-		blk_cleanup_queue(md->queue.queue);
+		blk_put_queue(md->queue.queue);
 		ida_simple_remove(&mmc_blk_ida, devidx);
 		put_disk(md->disk);
 		kfree(md);
@@ -2156,6 +2156,17 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 
 	md->queue.blkdata = md;
 
+	/*
+	 * Keep an extra reference to the queue so that we can shutdown the
+	 * queue (i.e. call blk_cleanup_queue()) while there are still
+	 * references to the 'md'. The corresponding blk_put_queue() is in
+	 * mmc_blk_put().
+	 */
+	if (!blk_get_queue(md->queue.queue)) {
+		mmc_cleanup_queue(&md->queue);
+		goto err_putdisk;
+	}
+
 	md->disk->major	= MMC_BLOCK_MAJOR;
 	md->disk->first_minor = devidx * perdev_minors;
 	md->disk->fops = &mmc_bdops;
@@ -2471,10 +2482,6 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
 		 * from being accepted.
 		 */
 		card = md->queue.card;
-		spin_lock_irq(md->queue.queue->queue_lock);
-		queue_flag_set(QUEUE_FLAG_BYPASS, md->queue.queue);
-		spin_unlock_irq(md->queue.queue->queue_lock);
-		blk_set_queue_dying(md->queue.queue);
 		mmc_cleanup_queue(&md->queue);
 		if (md->disk->flags & GENHD_FL_UP) {
 			device_remove_file(disk_to_dev(md->disk), &md->force_ro);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 26f8da30ebe5..ae6d9da68735 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -268,6 +268,8 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
+	blk_cleanup_queue(q);
+
 	mq->card = NULL;
 }
 
-- 
1.9.1


* [PATCH V15 03/22] mmc: core: Make mmc_pre_req() and mmc_post_req() available
From: Adrian Hunter @ 2017-11-29 13:41 UTC
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Make mmc_pre_req() and mmc_post_req() available to the card drivers. Later
patches will make use of them.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/core.c | 31 -------------------------------
 drivers/mmc/core/core.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 1f0f44f4dd5f..7ca6e4866a8b 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -658,37 +658,6 @@ bool mmc_is_req_done(struct mmc_host *host, struct mmc_request *mrq)
 EXPORT_SYMBOL(mmc_is_req_done);
 
 /**
- *	mmc_pre_req - Prepare for a new request
- *	@host: MMC host to prepare command
- *	@mrq: MMC request to prepare for
- *
- *	mmc_pre_req() is called in prior to mmc_start_req() to let
- *	host prepare for the new request. Preparation of a request may be
- *	performed while another request is running on the host.
- */
-static void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq)
-{
-	if (host->ops->pre_req)
-		host->ops->pre_req(host, mrq);
-}
-
-/**
- *	mmc_post_req - Post process a completed request
- *	@host: MMC host to post process command
- *	@mrq: MMC request to post process for
- *	@err: Error, if non zero, clean up any resources made in pre_req
- *
- *	Let the host post process a completed request. Post processing of
- *	a request may be performed while another reuqest is running.
- */
-static void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
-			 int err)
-{
-	if (host->ops->post_req)
-		host->ops->post_req(host, mrq, err);
-}
-
-/**
  * mmc_finalize_areq() - finalize an asynchronous request
  * @host: MMC host to finalize any ongoing request on
  *
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index b2877e2d740f..3e3d21304e5f 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -150,4 +150,35 @@ static inline void mmc_claim_host(struct mmc_host *host)
 void mmc_cqe_post_req(struct mmc_host *host, struct mmc_request *mrq);
 int mmc_cqe_recovery(struct mmc_host *host);
 
+/**
+ *	mmc_pre_req - Prepare for a new request
+ *	@host: MMC host to prepare command
+ *	@mrq: MMC request to prepare for
+ *
+ *	mmc_pre_req() is called in prior to mmc_start_req() to let
+ *	host prepare for the new request. Preparation of a request may be
+ *	performed while another request is running on the host.
+ */
+static inline void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	if (host->ops->pre_req)
+		host->ops->pre_req(host, mrq);
+}
+
+/**
+ *	mmc_post_req - Post process a completed request
+ *	@host: MMC host to post process command
+ *	@mrq: MMC request to post process for
+ *	@err: Error, if non zero, clean up any resources made in pre_req
+ *
+ *	Let the host post process a completed request. Post processing of
+ *	a request may be performed while another request is running.
+ */
+static inline void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
+				int err)
+{
+	if (host->ops->post_req)
+		host->ops->post_req(host, mrq, err);
+}
+
 #endif
-- 
1.9.1


* [PATCH V15 04/22] mmc: block: Add error-handling comments
From: Adrian Hunter @ 2017-11-29 13:41 UTC
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Add error-handling comments to explain what would also be done for blk-mq
if it used the legacy error-handling.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index e44f6d90aeb4..7dcd5d5b203b 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1911,7 +1911,11 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 		case MMC_BLK_SUCCESS:
 		case MMC_BLK_PARTIAL:
 			/*
-			 * A block was successfully transferred.
+			 * Reset success, and accept bytes_xfered. For
+			 * MMC_BLK_PARTIAL re-submit the remaining request. For
+			 * MMC_BLK_SUCCESS error out the remaining request (it
+			 * could not be re-submitted anyway if a next request
+			 * had already begun).
 			 */
 			mmc_blk_reset_success(md, type);
 
@@ -1931,6 +1935,14 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			}
 			break;
 		case MMC_BLK_CMD_ERR:
+			/*
+			 * For SD cards, get bytes written, but do not accept
+			 * bytes_xfered if that fails. For MMC cards accept
+			 * bytes_xfered. Then try to reset. If reset fails then
+			 * error out the remaining request, otherwise retry
+			 * once (N.B mmc_blk_reset() will not succeed twice in a
+			 * row).
+			 */
 			req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
 			if (mmc_blk_reset(md, card->host, type)) {
 				if (req_pending)
@@ -1947,11 +1959,20 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			}
 			break;
 		case MMC_BLK_RETRY:
+			/*
+			 * Do not accept bytes_xfered, but retry up to 5 times,
+			 * otherwise same as abort.
+			 */
 			retune_retry_done = brq->retune_retry_done;
 			if (retry++ < 5)
 				break;
 			/* Fall through */
 		case MMC_BLK_ABORT:
+			/*
+			 * Do not accept bytes_xfered, but try to reset. If
+			 * reset succeeds, try once more, otherwise error out
+			 * the request.
+			 */
 			if (!mmc_blk_reset(md, card->host, type))
 				break;
 			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
@@ -1960,6 +1981,13 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 		case MMC_BLK_DATA_ERR: {
 			int err;
 
+			/*
+			 * Do not accept bytes_xfered, but try to reset. If
+			 * reset succeeds, try once more. If reset fails with
+			 * ENODEV which means the partition is wrong, then error
+			 * out the request. Otherwise attempt to read one sector
+			 * at a time.
+			 */
 			err = mmc_blk_reset(md, card->host, type);
 			if (!err)
 				break;
@@ -1971,6 +1999,10 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			/* Fall through */
 		}
 		case MMC_BLK_ECC_ERR:
+			/*
+			 * Do not accept bytes_xfered. If reading more than one
+			 * sector, try reading one sector at a time.
+			 */
 			if (brq->data.blocks > 1) {
 				/* Redo read one sector at a time */
 				pr_warn("%s: retrying using single block read\n",
@@ -1992,10 +2024,12 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 			}
 			break;
 		case MMC_BLK_NOMEDIUM:
+			/* Do not accept bytes_xfered. Error out the request */
 			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
 			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
 			return;
 		default:
+			/* Do not accept bytes_xfered. Error out the request */
 			pr_err("%s: Unhandled return value (%d)",
 					old_req->rq_disk->disk_name, status);
 			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-- 
1.9.1


* [PATCH V15 05/22] mmc: core: Add parameter use_blk_mq
From: Adrian Hunter @ 2017-11-29 13:41 UTC
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Until mmc has blk-mq support fully implemented and tested, add a parameter
use_blk_mq, which is set to true if the config option MMC_MQ_DEFAULT is
selected, as it is by default.
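
For example, with MMC_MQ_DEFAULT=y the legacy path can still be selected
on the kernel command line:

	mmc_core.use_blk_mq=0

Since the parameter is writable (S_IWUSR), it can also be flipped at
runtime via /sys/module/mmc_core/parameters/use_blk_mq, although the value
is sampled in mmc_alloc_host() so it only affects hosts added afterwards.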

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/Kconfig      | 10 ++++++++++
 drivers/mmc/core/core.c  |  7 +++++++
 drivers/mmc/core/core.h  |  2 ++
 drivers/mmc/core/host.c  |  2 ++
 drivers/mmc/core/host.h  |  4 ++++
 include/linux/mmc/host.h |  1 +
 6 files changed, 26 insertions(+)

diff --git a/drivers/mmc/Kconfig b/drivers/mmc/Kconfig
index ec21388311db..42565562577c 100644
--- a/drivers/mmc/Kconfig
+++ b/drivers/mmc/Kconfig
@@ -12,6 +12,16 @@ menuconfig MMC
 	  If you want MMC/SD/SDIO support, you should say Y here and
 	  also to your specific host controller driver.
 
+config MMC_MQ_DEFAULT
+	bool "MMC: use blk-mq I/O path by default"
+	depends on MMC && BLOCK
+	default y
+	---help---
+	  This option enables the new blk-mq based I/O path for MMC block
+	  devices by default.  With the option the mmc_core.use_blk_mq
+	  module/boot option defaults to Y, without it to N, but it can
+	  still be overridden either way.
+
 if MMC
 
 source "drivers/mmc/core/Kconfig"
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 7ca6e4866a8b..617802f45386 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -66,6 +66,13 @@
 bool use_spi_crc = 1;
 module_param(use_spi_crc, bool, 0);
 
+#ifdef CONFIG_MMC_MQ_DEFAULT
+bool mmc_use_blk_mq = true;
+#else
+bool mmc_use_blk_mq = false;
+#endif
+module_param_named(use_blk_mq, mmc_use_blk_mq, bool, S_IWUSR | S_IRUGO);
+
 static int mmc_schedule_delayed_work(struct delayed_work *work,
 				     unsigned long delay)
 {
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 3e3d21304e5f..136617d2f971 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -35,6 +35,8 @@ struct mmc_bus_ops {
 	int (*reset)(struct mmc_host *);
 };
 
+extern bool mmc_use_blk_mq;
+
 void mmc_attach_bus(struct mmc_host *host, const struct mmc_bus_ops *ops);
 void mmc_detach_bus(struct mmc_host *host);
 
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 64b03d6eaf18..409a68a96a0a 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -404,6 +404,8 @@ struct mmc_host *mmc_alloc_host(int extra, struct device *dev)
 
 	host->fixed_drv_type = -EINVAL;
 
+	host->use_blk_mq = mmc_use_blk_mq;
+
 	return host;
 }
 
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index fb689a1065ed..6eaf558e62d6 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -74,6 +74,10 @@ static inline bool mmc_card_hs400es(struct mmc_card *card)
 	return card->host->ios.enhanced_strobe;
 }
 
+static inline bool mmc_host_use_blk_mq(struct mmc_host *host)
+{
+	return host->use_blk_mq;
+}
 
 #endif
 
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index e7743eca1021..ce2075d6f429 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -380,6 +380,7 @@ struct mmc_host {
 	unsigned int		doing_retune:1;	/* re-tuning in progress */
 	unsigned int		retune_now:1;	/* do re-tuning at next req */
 	unsigned int		retune_paused:1; /* re-tuning is temporarily disabled */
+	unsigned int		use_blk_mq:1;	/* use blk-mq */
 
 	int			rescan_disable;	/* disable card detection */
 	int			rescan_entered;	/* used with nonremovable devices */
-- 
1.9.1


* [PATCH V15 06/22] mmc: block: Add blk-mq support
From: Adrian Hunter @ 2017-11-29 13:41 UTC
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Define and use a blk-mq queue. Discards and flushes are processed
synchronously, but reads and writes are processed asynchronously. In order
to support slow DMA unmapping, DMA unmapping is not done until after the
next request is started. That means the request is not completed until
then. If there is no next request, then the completion is done by queued
work.
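
The resulting issue path looks roughly like this (condensed from
mmc_blk_mq_issue_rw_rq() in the diff below):

	mmc_pre_req(host, &mqrq->brq.mrq);	/* DMA map the new request */

	err = mmc_blk_rw_wait(mq, &prev_req);	/* wait for the previous one */

	err = mmc_start_request(host, &mqrq->brq.mrq); /* start new request */

	if (prev_req)				   /* only now DMA unmap and */
		mmc_blk_mq_post_req(mq, prev_req); /* complete the previous */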

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 502 ++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/mmc/core/block.h |   9 +
 drivers/mmc/core/queue.c | 296 +++++++++++++++++++++++++---
 drivers/mmc/core/queue.h |  32 +++
 4 files changed, 808 insertions(+), 31 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 7dcd5d5b203b..7874c3bbf6b5 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1220,6 +1220,14 @@ static inline void mmc_blk_reset_success(struct mmc_blk_data *md, int type)
 	md->reset_done &= ~type;
 }
 
+static void mmc_blk_end_request(struct request *req, blk_status_t error)
+{
+	if (req->mq_ctx)
+		blk_mq_end_request(req, error);
+	else
+		blk_end_request_all(req, error);
+}
+
 /*
  * The non-block commands come back from the block layer after it queued it and
  * processed it with all other requests and then they get issued in this
@@ -1281,7 +1289,7 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 		break;
 	}
 	mq_rq->drv_op_result = ret;
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
@@ -1324,7 +1332,7 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
 	else
 		mmc_blk_reset_success(md, type);
 fail:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	mmc_blk_end_request(req, status);
 }
 
 static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
@@ -1394,7 +1402,7 @@ static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
 	if (!err)
 		mmc_blk_reset_success(md, type);
 out:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	mmc_blk_end_request(req, status);
 }
 
 static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
@@ -1404,7 +1412,7 @@ static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
 	int ret = 0;
 
 	ret = mmc_flush_cache(card);
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 /*
@@ -1481,11 +1489,9 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 	}
 }
 
-static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
-					     struct mmc_async_req *areq)
+static enum mmc_blk_status __mmc_blk_err_check(struct mmc_card *card,
+					       struct mmc_queue_req *mq_mrq)
 {
-	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
-						    areq);
 	struct mmc_blk_request *brq = &mq_mrq->brq;
 	struct request *req = mmc_queue_req_to_req(mq_mrq);
 	int need_retune = card->host->need_retune;
@@ -1591,6 +1597,15 @@ static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
 	return MMC_BLK_SUCCESS;
 }
 
+static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
+					     struct mmc_async_req *areq)
+{
+	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
+						    areq);
+
+	return __mmc_blk_err_check(card, mq_mrq);
+}
+
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 			      int disable_multi, bool *do_rel_wr_p,
 			      bool *do_data_tag_p)
@@ -1783,6 +1798,477 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 	mqrq->areq.err_check = mmc_blk_err_check;
 }
 
+#define MMC_MAX_RETRIES		5
+#define MMC_NO_RETRIES		(MMC_MAX_RETRIES + 1)
+
+#define MMC_READ_SINGLE_RETRIES	2
+
+/* Single sector read during recovery */
+static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	blk_status_t error = BLK_STS_OK;
+	int retries = 0;
+
+	do {
+		u32 status;
+		int err;
+
+		mmc_blk_rw_rq_prep(mqrq, card, 1, mq);
+
+		mmc_wait_for_req(host, mrq);
+
+		err = mmc_send_status(card, &status);
+		if (err)
+			goto error_exit;
+
+		if (!mmc_host_is_spi(host) &&
+		    R1_CURRENT_STATE(status) != R1_STATE_TRAN) {
+			u32 stop_status = 0;
+			bool gen_err = false;
+
+			err = send_stop(card,
+					DIV_ROUND_UP(mrq->data->timeout_ns,
+						     1000000),
+					req, &gen_err, &stop_status);
+			if (err)
+				goto error_exit;
+		}
+
+		if (mrq->cmd->error && retries++ < MMC_READ_SINGLE_RETRIES)
+			continue;
+
+		retries = 0;
+
+		if (mrq->cmd->error ||
+		    mrq->data->error ||
+		    (!mmc_host_is_spi(host) &&
+		     (mrq->cmd->resp[0] & CMD_ERRORS || status & CMD_ERRORS)))
+			error = BLK_STS_IOERR;
+		else
+			error = BLK_STS_OK;
+
+	} while (blk_update_request(req, error, 512));
+
+	return;
+
+error_exit:
+	mrq->data->bytes_xfered = 0;
+	blk_update_request(req, BLK_STS_IOERR, 512);
+	/* Let it try the remaining request again */
+	if (mqrq->retries > MMC_MAX_RETRIES - 1)
+		mqrq->retries = MMC_MAX_RETRIES - 1;
+}
+
+static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
+{
+	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
+	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_card *card = mq->card;
+	static enum mmc_blk_status status;
+
+	brq->retune_retry_done = mqrq->retries;
+
+	status = __mmc_blk_err_check(card, mqrq);
+
+	mmc_retune_release(card->host);
+
+	/*
+	 * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
+	 * policy:
+	 * 1. A request that has transferred at least some data is considered
+	 * successful and will be requeued if there is remaining data to
+	 * transfer.
+	 * 2. Otherwise the number of retries is incremented and the request
+	 * will be requeued if there are remaining retries.
+	 * 3. Otherwise the request will be errored out.
+	 * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
+	 * mqrq->retries. So there are only 4 possible actions here:
+	 *	1. do not accept the bytes_xfered value i.e. set it to zero
+	 *	2. change mqrq->retries to determine the number of retries
+	 *	3. try to reset the card
+	 *	4. read one sector at a time
+	 */
+	switch (status) {
+	case MMC_BLK_SUCCESS:
+	case MMC_BLK_PARTIAL:
+		/* Reset success, and accept bytes_xfered */
+		mmc_blk_reset_success(md, type);
+		break;
+	case MMC_BLK_CMD_ERR:
+		/*
+		 * For SD cards, get bytes written, but do not accept
+		 * bytes_xfered if that fails. For MMC cards accept
+		 * bytes_xfered. Then try to reset. If reset fails then
+		 * error out the remaining request, otherwise retry
+		 * once (N.B mmc_blk_reset() will not succeed twice in a
+		 * row).
+		 */
+		if (mmc_card_sd(card)) {
+			u32 blocks;
+			int err;
+
+			err = mmc_sd_num_wr_blocks(card, &blocks);
+			if (err)
+				brq->data.bytes_xfered = 0;
+			else
+				brq->data.bytes_xfered = blocks << 9;
+		}
+		if (mmc_blk_reset(md, card->host, type))
+			mqrq->retries = MMC_NO_RETRIES;
+		else
+			mqrq->retries = MMC_MAX_RETRIES - 1;
+		break;
+	case MMC_BLK_RETRY:
+		/*
+		 * Do not accept bytes_xfered, but retry up to 5 times,
+		 * otherwise same as abort.
+		 */
+		brq->data.bytes_xfered = 0;
+		if (mqrq->retries < MMC_MAX_RETRIES)
+			break;
+		/* Fall through */
+	case MMC_BLK_ABORT:
+		/*
+		 * Do not accept bytes_xfered, but try to reset. If
+		 * reset succeeds, try once more, otherwise error out
+		 * the request.
+		 */
+		brq->data.bytes_xfered = 0;
+		if (mmc_blk_reset(md, card->host, type))
+			mqrq->retries = MMC_NO_RETRIES;
+		else
+			mqrq->retries = MMC_MAX_RETRIES - 1;
+		break;
+	case MMC_BLK_DATA_ERR: {
+		int err;
+
+		/*
+		 * Do not accept bytes_xfered, but try to reset. If
+		 * reset succeeds, try once more. If reset fails with
+		 * ENODEV which means the partition is wrong, then error
+		 * out the request. Otherwise attempt to read one sector
+		 * at a time.
+		 */
+		brq->data.bytes_xfered = 0;
+		err = mmc_blk_reset(md, card->host, type);
+		if (!err) {
+			mqrq->retries = MMC_MAX_RETRIES - 1;
+			break;
+		}
+		if (err == -ENODEV) {
+			mqrq->retries = MMC_NO_RETRIES;
+			break;
+		}
+		/* Fall through */
+	}
+	case MMC_BLK_ECC_ERR:
+		/*
+		 * Do not accept bytes_xfered. If reading more than one
+		 * sector, try reading one sector at a time.
+		 */
+		brq->data.bytes_xfered = 0;
+		/* FIXME: Missing single sector read for large sector size */
+		if (brq->data.blocks > 1 && !mmc_large_sector(card)) {
+			/* Redo read one sector at a time */
+			pr_warn("%s: retrying using single block read\n",
+				req->rq_disk->disk_name);
+			mmc_blk_read_single(mq, req);
+		} else {
+			mqrq->retries = MMC_NO_RETRIES;
+		}
+		break;
+	case MMC_BLK_NOMEDIUM:
+		/* Do not accept bytes_xfered. Error out the request */
+		brq->data.bytes_xfered = 0;
+		mqrq->retries = MMC_NO_RETRIES;
+		break;
+	default:
+		/* Do not accept bytes_xfered. Error out the request */
+		brq->data.bytes_xfered = 0;
+		mqrq->retries = MMC_NO_RETRIES;
+		pr_err("%s: Unhandled return value (%d)",
+		       req->rq_disk->disk_name, status);
+		break;
+	}
+}
+
+static void mmc_blk_mq_complete_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	unsigned int nr_bytes = mqrq->brq.data.bytes_xfered;
+
+	if (nr_bytes) {
+		if (blk_update_request(req, BLK_STS_OK, nr_bytes))
+			blk_mq_requeue_request(req, true);
+		else
+			__blk_mq_end_request(req, BLK_STS_OK);
+	} else if (!blk_rq_bytes(req)) {
+		__blk_mq_end_request(req, BLK_STS_IOERR);
+	} else if (mqrq->retries++ < MMC_MAX_RETRIES) {
+		blk_mq_requeue_request(req, true);
+	} else {
+		if (mmc_card_removed(mq->card))
+			req->rq_flags |= RQF_QUIET;
+		blk_mq_end_request(req, BLK_STS_IOERR);
+	}
+}
+
+static bool mmc_blk_urgent_bkops_needed(struct mmc_queue *mq,
+					struct mmc_queue_req *mqrq)
+{
+	return mmc_card_mmc(mq->card) && !mmc_host_is_spi(mq->card->host) &&
+	       (mqrq->brq.cmd.resp[0] & R1_EXCEPTION_EVENT ||
+		mqrq->brq.stop.resp[0] & R1_EXCEPTION_EVENT);
+}
+
+static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
+				 struct mmc_queue_req *mqrq)
+{
+	if (mmc_blk_urgent_bkops_needed(mq, mqrq))
+		mmc_start_bkops(mq->card, true);
+}
+
+void mmc_blk_mq_complete(struct request *req)
+{
+	struct mmc_queue *mq = req->q->queuedata;
+
+	mmc_blk_mq_complete_rq(mq, req);
+}
+
+static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
+				       struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mmc_blk_mq_rw_recovery(mq, req);
+
+	mmc_blk_urgent_bkops(mq, mqrq);
+}
+
+static void mmc_blk_mq_dec_in_flight(struct mmc_queue *mq, struct request *req)
+{
+	struct request_queue *q = req->q;
+	unsigned long flags;
+	bool put_card;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	mq->in_flight[mmc_issue_type(mq, req)] -= 1;
+
+	put_card = (mmc_tot_in_flight(mq) == 0);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (put_card)
+		mmc_put_card(mq->card, &mq->ctx);
+}
+
+static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_host *host = mq->card->host;
+
+	mmc_post_req(host, mrq, 0);
+
+	blk_mq_complete_request(req);
+
+	mmc_blk_mq_dec_in_flight(mq, req);
+}
+
+static void mmc_blk_mq_complete_prev_req(struct mmc_queue *mq,
+					 struct request **prev_req)
+{
+	mutex_lock(&mq->complete_lock);
+
+	if (!mq->complete_req)
+		goto out_unlock;
+
+	mmc_blk_mq_poll_completion(mq, mq->complete_req);
+
+	if (prev_req)
+		*prev_req = mq->complete_req;
+	else
+		mmc_blk_mq_post_req(mq, mq->complete_req);
+
+	mq->complete_req = NULL;
+
+out_unlock:
+	mutex_unlock(&mq->complete_lock);
+}
+
+void mmc_blk_mq_complete_work(struct work_struct *work)
+{
+	struct mmc_queue *mq = container_of(work, struct mmc_queue,
+					    complete_work);
+
+	mmc_blk_mq_complete_prev_req(mq, NULL);
+}
+
+static void mmc_blk_mq_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+	bool waiting;
+
+	/*
+	 * We cannot complete the request in this context, so record that there
+	 * is a request to complete, and that a following request does not need
+	 * to wait (although it does need to complete complete_req first).
+	 */
+	spin_lock_irqsave(q->queue_lock, flags);
+	mq->complete_req = req;
+	mq->rw_wait = false;
+	waiting = mq->waiting;
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	/*
+	 * If 'waiting' then the waiting task will complete this request,
+	 * otherwise queue a work to do it. Note that complete_work may still
+	 * race with the dispatch of a following request.
+	 */
+	if (waiting)
+		wake_up(&mq->wait);
+	else
+		kblockd_schedule_work(&mq->complete_work);
+}
+
+static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
+{
+	struct request_queue *q = mq->queue;
+	unsigned long flags;
+	bool done;
+
+	/*
+	 * Wait while there is another request in progress. Also indicate that
+	 * there is a request waiting to start.
+	 */
+	spin_lock_irqsave(q->queue_lock, flags);
+	done = !mq->rw_wait;
+	mq->waiting = !done;
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	return done;
+}
+
+static int mmc_blk_rw_wait(struct mmc_queue *mq, struct request **prev_req)
+{
+	int err = 0;
+
+	wait_event(mq->wait, mmc_blk_rw_wait_cond(mq, &err));
+
+	/* Always complete the previous request if there is one */
+	mmc_blk_mq_complete_prev_req(mq, prev_req);
+
+	return err;
+}
+
+static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
+				  struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+	struct request *prev_req = NULL;
+	int err = 0;
+
+	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
+
+	mqrq->brq.mrq.done = mmc_blk_mq_req_done;
+
+	mmc_pre_req(host, &mqrq->brq.mrq);
+
+	err = mmc_blk_rw_wait(mq, &prev_req);
+	if (err)
+		goto out_post_req;
+
+	mq->rw_wait = true;
+
+	err = mmc_start_request(host, &mqrq->brq.mrq);
+
+	if (prev_req)
+		mmc_blk_mq_post_req(mq, prev_req);
+
+	if (err) {
+		mq->rw_wait = false;
+		mmc_retune_release(host);
+	}
+
+out_post_req:
+	if (err)
+		mmc_post_req(host, &mqrq->brq.mrq, err);
+
+	return err;
+}
+
+static int mmc_blk_wait_for_idle(struct mmc_queue *mq, struct mmc_host *host)
+{
+	return mmc_blk_rw_wait(mq, NULL);
+}
+
+enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_card *card = md->queue.card;
+	struct mmc_host *host = card->host;
+	int ret;
+
+	ret = mmc_blk_part_switch(card, md->part_type);
+	if (ret)
+		return MMC_REQ_FAILED_TO_START;
+
+	switch (mmc_issue_type(mq, req)) {
+	case MMC_ISSUE_SYNC:
+		ret = mmc_blk_wait_for_idle(mq, host);
+		if (ret)
+			return MMC_REQ_BUSY;
+		switch (req_op(req)) {
+		case REQ_OP_DRV_IN:
+		case REQ_OP_DRV_OUT:
+			mmc_blk_issue_drv_op(mq, req);
+			break;
+		case REQ_OP_DISCARD:
+			mmc_blk_issue_discard_rq(mq, req);
+			break;
+		case REQ_OP_SECURE_ERASE:
+			mmc_blk_issue_secdiscard_rq(mq, req);
+			break;
+		case REQ_OP_FLUSH:
+			mmc_blk_issue_flush(mq, req);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			return MMC_REQ_FAILED_TO_START;
+		}
+		return MMC_REQ_FINISHED;
+	case MMC_ISSUE_ASYNC:
+		switch (req_op(req)) {
+		case REQ_OP_READ:
+		case REQ_OP_WRITE:
+			ret = mmc_blk_mq_issue_rw_rq(mq, req);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			ret = -EINVAL;
+		}
+		if (!ret)
+			return MMC_REQ_STARTED;
+		return ret == -EBUSY ? MMC_REQ_BUSY : MMC_REQ_FAILED_TO_START;
+	default:
+		WARN_ON_ONCE(1);
+		return MMC_REQ_FAILED_TO_START;
+	}
+}
+
 static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
 			       struct mmc_blk_request *brq, struct request *req,
 			       bool old_req_pending)
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index 5946636101ef..6d34e87b18f6 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -7,4 +7,13 @@
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
 
+enum mmc_issued;
+
+enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req);
+void mmc_blk_mq_complete(struct request *req);
+
+struct work_struct;
+
+void mmc_blk_mq_complete_work(struct work_struct *work);
+
 #endif
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index ae6d9da68735..54bec4c6c9bd 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -22,6 +22,7 @@
 #include "block.h"
 #include "core.h"
 #include "card.h"
+#include "host.h"
 
 /*
  * Prepare a MMC request. This just filters out odd stuff.
@@ -34,10 +35,25 @@ static int mmc_prep_request(struct request_queue *q, struct request *req)
 		return BLKPREP_KILL;
 
 	req->rq_flags |= RQF_DONTPREP;
+	req_to_mmc_queue_req(req)->retries = 0;
 
 	return BLKPREP_OK;
 }
 
+enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
+{
+	if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
+		return MMC_ISSUE_ASYNC;
+
+	return MMC_ISSUE_SYNC;
+}
+
+static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
+						 bool reserved)
+{
+	return BLK_EH_RESET_TIMER;
+}
+
 static int mmc_queue_thread(void *d)
 {
 	struct mmc_queue *mq = d;
@@ -154,11 +170,10 @@ static void mmc_queue_setup_discard(struct request_queue *q,
  * @req: the request
  * @gfp: memory allocation policy
  */
-static int mmc_init_request(struct request_queue *q, struct request *req,
-			    gfp_t gfp)
+static int __mmc_init_request(struct mmc_queue *mq, struct request *req,
+			      gfp_t gfp)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
-	struct mmc_queue *mq = q->queuedata;
 	struct mmc_card *card = mq->card;
 	struct mmc_host *host = card->host;
 
@@ -169,6 +184,12 @@ static int mmc_init_request(struct request_queue *q, struct request *req,
 	return 0;
 }
 
+static int mmc_init_request(struct request_queue *q, struct request *req,
+			    gfp_t gfp)
+{
+	return __mmc_init_request(q->queuedata, req, gfp);
+}
+
 static void mmc_exit_request(struct request_queue *q, struct request *req)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
@@ -177,6 +198,112 @@ static void mmc_exit_request(struct request_queue *q, struct request *req)
 	mq_rq->sg = NULL;
 }
 
+static int mmc_mq_init_request(struct blk_mq_tag_set *set, struct request *req,
+			       unsigned int hctx_idx, unsigned int numa_node)
+{
+	return __mmc_init_request(set->driver_data, req, GFP_KERNEL);
+}
+
+static void mmc_mq_exit_request(struct blk_mq_tag_set *set, struct request *req,
+				unsigned int hctx_idx)
+{
+	struct mmc_queue *mq = set->driver_data;
+
+	mmc_exit_request(mq->queue, req);
+}
+
+/*
+ * We use BLK_MQ_F_BLOCKING and have only 1 hardware queue, which means requests
+ * will not be dispatched in parallel.
+ */
+static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
+				    const struct blk_mq_queue_data *bd)
+{
+	struct request *req = bd->rq;
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_card *card = mq->card;
+	enum mmc_issue_type issue_type;
+	enum mmc_issued issued;
+	bool get_card;
+	int ret;
+
+	if (mmc_card_removed(mq->card)) {
+		req->rq_flags |= RQF_QUIET;
+		return BLK_STS_IOERR;
+	}
+
+	issue_type = mmc_issue_type(mq, req);
+
+	spin_lock_irq(q->queue_lock);
+
+	switch (issue_type) {
+	case MMC_ISSUE_ASYNC:
+		break;
+	default:
+		/*
+		 * Timeouts are handled by mmc core, and we don't have a host
+		 * API to abort requests, so we can't handle the timeout anyway.
+		 * However, when the timeout happens, blk_mq_complete_request()
+		 * no longer works (to stop the request disappearing under us).
+		 * To avoid racing with that, set a large timeout.
+		 */
+		req->timeout = 600 * HZ;
+		break;
+	}
+
+	mq->in_flight[issue_type] += 1;
+	get_card = (mmc_tot_in_flight(mq) == 1);
+
+	spin_unlock_irq(q->queue_lock);
+
+	if (!(req->rq_flags & RQF_DONTPREP)) {
+		req_to_mmc_queue_req(req)->retries = 0;
+		req->rq_flags |= RQF_DONTPREP;
+	}
+
+	if (get_card)
+		mmc_get_card(card, &mq->ctx);
+
+	blk_mq_start_request(req);
+
+	issued = mmc_blk_mq_issue_rq(mq, req);
+
+	switch (issued) {
+	case MMC_REQ_BUSY:
+		ret = BLK_STS_RESOURCE;
+		break;
+	case MMC_REQ_FAILED_TO_START:
+		ret = BLK_STS_IOERR;
+		break;
+	default:
+		ret = BLK_STS_OK;
+		break;
+	}
+
+	if (issued != MMC_REQ_STARTED) {
+		bool put_card = false;
+
+		spin_lock_irq(q->queue_lock);
+		mq->in_flight[issue_type] -= 1;
+		if (mmc_tot_in_flight(mq) == 0)
+			put_card = true;
+		spin_unlock_irq(q->queue_lock);
+		if (put_card)
+			mmc_put_card(card, &mq->ctx);
+	}
+
+	return ret;
+}
+
+static const struct blk_mq_ops mmc_mq_ops = {
+	.queue_rq	= mmc_mq_queue_rq,
+	.init_request	= mmc_mq_init_request,
+	.exit_request	= mmc_mq_exit_request,
+	.complete	= mmc_blk_mq_complete,
+	.timeout	= mmc_mq_timed_out,
+};
+
 static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 {
 	struct mmc_host *host = card->host;
@@ -198,6 +325,70 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 
 	/* Initialize thread_sem even if it is not used */
 	sema_init(&mq->thread_sem, 1);
+
+	INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work);
+
+	mutex_init(&mq->complete_lock);
+
+	init_waitqueue_head(&mq->wait);
+}
+
+static int mmc_mq_init_queue(struct mmc_queue *mq, int q_depth,
+			     const struct blk_mq_ops *mq_ops, spinlock_t *lock)
+{
+	int ret;
+
+	memset(&mq->tag_set, 0, sizeof(mq->tag_set));
+	mq->tag_set.ops = mq_ops;
+	mq->tag_set.queue_depth = q_depth;
+	mq->tag_set.numa_node = NUMA_NO_NODE;
+	mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE |
+			    BLK_MQ_F_BLOCKING;
+	mq->tag_set.nr_hw_queues = 1;
+	mq->tag_set.cmd_size = sizeof(struct mmc_queue_req);
+	mq->tag_set.driver_data = mq;
+
+	ret = blk_mq_alloc_tag_set(&mq->tag_set);
+	if (ret)
+		return ret;
+
+	mq->queue = blk_mq_init_queue(&mq->tag_set);
+	if (IS_ERR(mq->queue)) {
+		ret = PTR_ERR(mq->queue);
+		goto free_tag_set;
+	}
+
+	mq->queue->queue_lock = lock;
+	mq->queue->queuedata = mq;
+
+	return 0;
+
+free_tag_set:
+	blk_mq_free_tag_set(&mq->tag_set);
+
+	return ret;
+}
+
+/* Set queue depth to get a reasonable value for q->nr_requests */
+#define MMC_QUEUE_DEPTH 64
+
+static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
+			 spinlock_t *lock)
+{
+	int q_depth;
+	int ret;
+
+	q_depth = MMC_QUEUE_DEPTH;
+
+	ret = mmc_mq_init_queue(mq, q_depth, &mmc_mq_ops, lock);
+	if (ret)
+		return ret;
+
+	blk_queue_rq_timeout(mq->queue, 60 * HZ);
+
+	mmc_setup_queue(mq, card);
+
+	return 0;
 }
 
 /**
@@ -216,6 +407,10 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	int ret = -ENOMEM;
 
 	mq->card = card;
+
+	if (mmc_host_use_blk_mq(host))
+		return mmc_mq_init(mq, card, lock);
+
 	mq->queue = blk_alloc_queue(GFP_KERNEL);
 	if (!mq->queue)
 		return -ENOMEM;
@@ -251,11 +446,70 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	return ret;
 }
 
+static void mmc_mq_queue_suspend(struct mmc_queue *mq)
+{
+	blk_mq_quiesce_queue(mq->queue);
+
+	/*
+	 * The host remains claimed while there are outstanding requests, so
+	 * simply claiming and releasing here ensures there are none.
+	 */
+	mmc_claim_host(mq->card->host);
+	mmc_release_host(mq->card->host);
+}
+
+static void mmc_mq_queue_resume(struct mmc_queue *mq)
+{
+	blk_mq_unquiesce_queue(mq->queue);
+}
+
+static void __mmc_queue_suspend(struct mmc_queue *mq)
+{
+	struct request_queue *q = mq->queue;
+	unsigned long flags;
+
+	if (!mq->suspended) {
+		mq->suspended |= true;
+
+		spin_lock_irqsave(q->queue_lock, flags);
+		blk_stop_queue(q);
+		spin_unlock_irqrestore(q->queue_lock, flags);
+
+		down(&mq->thread_sem);
+	}
+}
+
+static void __mmc_queue_resume(struct mmc_queue *mq)
+{
+	struct request_queue *q = mq->queue;
+	unsigned long flags;
+
+	if (mq->suspended) {
+		mq->suspended = false;
+
+		up(&mq->thread_sem);
+
+		spin_lock_irqsave(q->queue_lock, flags);
+		blk_start_queue(q);
+		spin_unlock_irqrestore(q->queue_lock, flags);
+	}
+}
+
 void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
 	unsigned long flags;
 
+	if (q->mq_ops) {
+		/*
+		 * The legacy code handled the possibility of being suspended,
+		 * so do that here too.
+		 */
+		if (blk_queue_quiesced(q))
+			blk_mq_unquiesce_queue(q);
+		goto out_cleanup;
+	}
+
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
 
@@ -268,8 +522,16 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	blk_start_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
+out_cleanup:
 	blk_cleanup_queue(q);
 
+	/*
+	 * A request can be completed before the next request, potentially
+	 * leaving a complete_work with nothing to do. Such a work item might
+	 * still be queued at this point. Flush it.
+	 */
+	flush_work(&mq->complete_work);
+
 	mq->card = NULL;
 }
 
@@ -284,17 +546,11 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 void mmc_queue_suspend(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (!mq->suspended) {
-		mq->suspended |= true;
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
 
-		down(&mq->thread_sem);
-	}
+	if (q->mq_ops)
+		mmc_mq_queue_suspend(mq);
+	else
+		__mmc_queue_suspend(mq);
 }
 
 /**
@@ -304,17 +560,11 @@ void mmc_queue_suspend(struct mmc_queue *mq)
 void mmc_queue_resume(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
-	if (mq->suspended) {
-		mq->suspended = false;
-
-		up(&mq->thread_sem);
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
+	if (q->mq_ops)
+		mmc_mq_queue_resume(mq);
+	else
+		__mmc_queue_resume(mq);
 }
 
 /*
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 547b457c4251..ce9249852f26 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -8,6 +8,19 @@
 #include <linux/mmc/core.h>
 #include <linux/mmc/host.h>
 
+enum mmc_issued {
+	MMC_REQ_STARTED,
+	MMC_REQ_BUSY,
+	MMC_REQ_FAILED_TO_START,
+	MMC_REQ_FINISHED,
+};
+
+enum mmc_issue_type {
+	MMC_ISSUE_SYNC,
+	MMC_ISSUE_ASYNC,
+	MMC_ISSUE_MAX,
+};
+
 static inline struct mmc_queue_req *req_to_mmc_queue_req(struct request *rq)
 {
 	return blk_mq_rq_to_pdu(rq);
@@ -57,12 +70,15 @@ struct mmc_queue_req {
 	int			drv_op_result;
 	void			*drv_op_data;
 	unsigned int		ioc_count;
+	int			retries;
 };
 
 struct mmc_queue {
 	struct mmc_card		*card;
 	struct task_struct	*thread;
 	struct semaphore	thread_sem;
+	struct mmc_ctx		ctx;
+	struct blk_mq_tag_set	tag_set;
 	bool			suspended;
 	bool			asleep;
 	struct mmc_blk_data	*blkdata;
@@ -74,6 +90,14 @@ struct mmc_queue {
 	 * associated mmc_queue_req data.
 	 */
 	int			qcnt;
+
+	int			in_flight[MMC_ISSUE_MAX];
+	bool			rw_wait;
+	bool			waiting;
+	wait_queue_head_t	wait;
+	struct request		*complete_req;
+	struct mutex		complete_lock;
+	struct work_struct	complete_work;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
@@ -84,4 +108,12 @@ extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
 extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
 				     struct mmc_queue_req *);
 
+enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req);
+
+static inline int mmc_tot_in_flight(struct mmc_queue *mq)
+{
+	return mq->in_flight[MMC_ISSUE_SYNC] +
+	       mq->in_flight[MMC_ISSUE_ASYNC];
+}
+
 #endif
-- 
1.9.1


* [PATCH V15 07/22] mmc: block: Add CQE support
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (5 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 06/22] mmc: block: Add blk-mq support Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 08/22] mmc: cqhci: support for command queue enabled host Adrian Hunter
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Add CQE support to the block driver, including:
    - optionally using DCMD for flush requests
    - "manually" issuing discard requests
    - issuing read / write requests to the CQE
    - supporting block-layer timeouts
    - handling recovery
    - supporting re-tuning

CQE offers 25% - 50% better random multi-threaded I/O.  There is a slight
(e.g. 2%) drop in sequential read speed but no observable change to sequential
write.

CQE automatically sends the commands to complete requests.  However, it only
supports reads / writes and so-called "direct commands" (DCMD).  Furthermore,
DCMD is limited to one command at a time, but discards require 3 commands.
That makes issuing discards through CQE very awkward, and some CQEs don't
support DCMD anyway.  So discards take the existing non-CQE approach, where
the mmc core code issues the 3 commands one at a time, i.e. mmc_erase()
(sketched below).  DCMD is used only for issuing flushes.
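
A condensed sketch of why that is: the 3-command sequence below is what
mmc_erase() drives synchronously.  It is illustrative only (the real core
code also polls card status between steps), but the opcodes and flags are
the real ones from linux/mmc/mmc.h:

static int discard_sketch(struct mmc_card *card, u32 from, u32 to)
{
	struct mmc_command cmd = {};
	int err;

	cmd.opcode = MMC_ERASE_GROUP_START;	/* CMD35: first address */
	cmd.arg = from;
	cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
	err = mmc_wait_for_cmd(card->host, &cmd, 0);
	if (err)
		return err;

	cmd.opcode = MMC_ERASE_GROUP_END;	/* CMD36: last address */
	cmd.arg = to;
	err = mmc_wait_for_cmd(card->host, &cmd, 0);
	if (err)
		return err;

	cmd.opcode = MMC_ERASE;			/* CMD38: do the erase */
	cmd.arg = MMC_TRIM_ARG;			/* or MMC_ERASE_ARG */
	cmd.flags = MMC_RSP_R1B | MMC_CMD_AC;
	return mmc_wait_for_cmd(card->host, &cmd, 0);
}

A DCMD slot carries exactly one command, so this sequence cannot be
expressed through the CQE.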

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 150 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/mmc/core/block.h |   2 +
 drivers/mmc/core/queue.c | 162 +++++++++++++++++++++++++++++++++++++++++++++--
 drivers/mmc/core/queue.h |  18 ++++++
 4 files changed, 326 insertions(+), 6 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 7874c3bbf6b5..7275ac5d6799 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -112,6 +112,7 @@ struct mmc_blk_data {
 #define MMC_BLK_WRITE		BIT(1)
 #define MMC_BLK_DISCARD		BIT(2)
 #define MMC_BLK_SECDISCARD	BIT(3)
+#define MMC_BLK_CQE_RECOVERY	BIT(4)
 
 	/*
 	 * Only set in main mmc_blk_data associated
@@ -1730,6 +1731,138 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		*do_data_tag_p = do_data_tag;
 }
 
+#define MMC_CQE_RETRIES 2
+
+static void mmc_blk_cqe_complete_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct request_queue *q = req->q;
+	struct mmc_host *host = mq->card->host;
+	unsigned long flags;
+	bool put_card;
+	int err;
+
+	mmc_cqe_post_req(host, mrq);
+
+	if (mrq->cmd && mrq->cmd->error)
+		err = mrq->cmd->error;
+	else if (mrq->data && mrq->data->error)
+		err = mrq->data->error;
+	else
+		err = 0;
+
+	if (err) {
+		if (mqrq->retries++ < MMC_CQE_RETRIES)
+			blk_mq_requeue_request(req, true);
+		else
+			blk_mq_end_request(req, BLK_STS_IOERR);
+	} else if (mrq->data) {
+		if (blk_update_request(req, BLK_STS_OK, mrq->data->bytes_xfered))
+			blk_mq_requeue_request(req, true);
+		else
+			__blk_mq_end_request(req, BLK_STS_OK);
+	} else {
+		blk_mq_end_request(req, BLK_STS_OK);
+	}
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	mq->in_flight[mmc_issue_type(mq, req)] -= 1;
+
+	put_card = (mmc_tot_in_flight(mq) == 0);
+
+	mmc_cqe_check_busy(mq);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (!mq->cqe_busy)
+		blk_mq_run_hw_queues(q, true);
+
+	if (put_card)
+		mmc_put_card(mq->card, &mq->ctx);
+}
+
+void mmc_blk_cqe_recovery(struct mmc_queue *mq)
+{
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	int err;
+
+	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
+
+	err = mmc_cqe_recovery(host);
+	if (err)
+		mmc_blk_reset(mq->blkdata, host, MMC_BLK_CQE_RECOVERY);
+	else
+		mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
+
+	pr_debug("%s: CQE recovery done\n", mmc_hostname(host));
+}
+
+static void mmc_blk_cqe_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
+}
+
+static int mmc_blk_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	mrq->done		= mmc_blk_cqe_req_done;
+	mrq->recovery_notifier	= mmc_cqe_recovery_notifier;
+
+	return mmc_cqe_start_req(host, mrq);
+}
+
+static struct mmc_request *mmc_blk_cqe_prep_dcmd(struct mmc_queue_req *mqrq,
+						 struct request *req)
+{
+	struct mmc_blk_request *brq = &mqrq->brq;
+
+	memset(brq, 0, sizeof(*brq));
+
+	brq->mrq.cmd = &brq->cmd;
+	brq->mrq.tag = req->tag;
+
+	return &brq->mrq;
+}
+
+static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = mmc_blk_cqe_prep_dcmd(mqrq, req);
+
+	mrq->cmd->opcode = MMC_SWITCH;
+	mrq->cmd->arg = (MMC_SWITCH_MODE_WRITE_BYTE << 24) |
+			(EXT_CSD_FLUSH_CACHE << 16) |
+			(1 << 8) |
+			EXT_CSD_CMD_SET_NORMAL;
+	mrq->cmd->flags = MMC_CMD_AC | MMC_RSP_R1B;
+
+	return mmc_blk_cqe_start_req(mq->card->host, mrq);
+}
+
+static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
+
+	return mmc_blk_cqe_start_req(mq->card->host, &mqrq->brq.mrq);
+}
+
 static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 			       struct mmc_card *card,
 			       int disable_multi,
@@ -2038,7 +2171,10 @@ void mmc_blk_mq_complete(struct request *req)
 {
 	struct mmc_queue *mq = req->q->queuedata;
 
-	mmc_blk_mq_complete_rq(mq, req);
+	if (mq->use_cqe)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		mmc_blk_mq_complete_rq(mq, req);
 }
 
 static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
@@ -2212,6 +2348,9 @@ static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
 
 static int mmc_blk_wait_for_idle(struct mmc_queue *mq, struct mmc_host *host)
 {
+	if (mq->use_cqe)
+		return host->cqe_ops->cqe_wait_for_idle(host);
+
 	return mmc_blk_rw_wait(mq, NULL);
 }
 
@@ -2250,11 +2389,18 @@ enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
 			return MMC_REQ_FAILED_TO_START;
 		}
 		return MMC_REQ_FINISHED;
+	case MMC_ISSUE_DCMD:
 	case MMC_ISSUE_ASYNC:
 		switch (req_op(req)) {
+		case REQ_OP_FLUSH:
+			ret = mmc_blk_cqe_issue_flush(mq, req);
+			break;
 		case REQ_OP_READ:
 		case REQ_OP_WRITE:
-			ret = mmc_blk_mq_issue_rw_rq(mq, req);
+			if (mq->use_cqe)
+				ret = mmc_blk_cqe_issue_rw_rq(mq, req);
+			else
+				ret = mmc_blk_mq_issue_rw_rq(mq, req);
 			break;
 		default:
 			WARN_ON_ONCE(1);
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index 6d34e87b18f6..f472ce5d5647 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -7,6 +7,8 @@
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
 
+void mmc_blk_cqe_recovery(struct mmc_queue *mq);
+
 enum mmc_issued;
 
 enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 54bec4c6c9bd..8d632d2f5199 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -40,18 +40,142 @@ static int mmc_prep_request(struct request_queue *q, struct request *req)
 	return BLKPREP_OK;
 }
 
+static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
+{
+	/* Allow only 1 DCMD at a time */
+	return mq->in_flight[MMC_ISSUE_DCMD];
+}
+
+void mmc_cqe_check_busy(struct mmc_queue *mq)
+{
+	if ((mq->cqe_busy & MMC_CQE_DCMD_BUSY) && !mmc_cqe_dcmd_busy(mq))
+		mq->cqe_busy &= ~MMC_CQE_DCMD_BUSY;
+
+	mq->cqe_busy &= ~MMC_CQE_QUEUE_FULL;
+}
+
+static inline bool mmc_cqe_can_dcmd(struct mmc_host *host)
+{
+	return host->caps2 & MMC_CAP2_CQE_DCMD;
+}
+
+enum mmc_issue_type mmc_cqe_issue_type(struct mmc_host *host,
+				       struct request *req)
+{
+	switch (req_op(req)) {
+	case REQ_OP_DRV_IN:
+	case REQ_OP_DRV_OUT:
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+		return MMC_ISSUE_SYNC;
+	case REQ_OP_FLUSH:
+		return mmc_cqe_can_dcmd(host) ? MMC_ISSUE_DCMD : MMC_ISSUE_SYNC;
+	default:
+		return MMC_ISSUE_ASYNC;
+	}
+}
+
 enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
 {
+	struct mmc_host *host = mq->card->host;
+
+	if (mq->use_cqe)
+		return mmc_cqe_issue_type(host, req);
+
 	if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
 		return MMC_ISSUE_ASYNC;
 
 	return MMC_ISSUE_SYNC;
 }
 
+static void __mmc_cqe_recovery_notifier(struct mmc_queue *mq)
+{
+	if (!mq->recovery_needed) {
+		mq->recovery_needed = true;
+		schedule_work(&mq->recovery_work);
+	}
+}
+
+void mmc_cqe_recovery_notifier(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	__mmc_cqe_recovery_notifier(mq);
+	spin_unlock_irqrestore(q->queue_lock, flags);
+}
+
+static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_queue *mq = req->q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	enum mmc_issue_type issue_type = mmc_issue_type(mq, req);
+	bool recovery_needed = false;
+
+	switch (issue_type) {
+	case MMC_ISSUE_ASYNC:
+	case MMC_ISSUE_DCMD:
+		if (host->cqe_ops->cqe_timeout(host, mrq, &recovery_needed)) {
+			if (recovery_needed)
+				__mmc_cqe_recovery_notifier(mq);
+			return BLK_EH_RESET_TIMER;
+		}
+		/* No timeout */
+		return BLK_EH_HANDLED;
+	default:
+		/* Timeout is handled by mmc core */
+		return BLK_EH_RESET_TIMER;
+	}
+}
+
 static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
 						 bool reserved)
 {
-	return BLK_EH_RESET_TIMER;
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	if (mq->recovery_needed || !mq->use_cqe)
+		ret = BLK_EH_RESET_TIMER;
+	else
+		ret = mmc_cqe_timed_out(req);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	return ret;
+}
+
+static void mmc_mq_recovery_handler(struct work_struct *work)
+{
+	struct mmc_queue *mq = container_of(work, struct mmc_queue,
+					    recovery_work);
+	struct request_queue *q = mq->queue;
+
+	mmc_get_card(mq->card, &mq->ctx);
+
+	mq->in_recovery = true;
+
+	mmc_blk_cqe_recovery(mq);
+
+	mq->in_recovery = false;
+
+	spin_lock_irq(q->queue_lock);
+	mq->recovery_needed = false;
+	spin_unlock_irq(q->queue_lock);
+
+	mmc_put_card(mq->card, &mq->ctx);
+
+	blk_mq_run_hw_queues(q, true);
 }
 
 static int mmc_queue_thread(void *d)
@@ -223,9 +347,10 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 	struct request_queue *q = req->q;
 	struct mmc_queue *mq = q->queuedata;
 	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
 	enum mmc_issue_type issue_type;
 	enum mmc_issued issued;
-	bool get_card;
+	bool get_card, cqe_retune_ok;
 	int ret;
 
 	if (mmc_card_removed(mq->card)) {
@@ -237,7 +362,19 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	spin_lock_irq(q->queue_lock);
 
+	if (mq->recovery_needed) {
+		spin_unlock_irq(q->queue_lock);
+		return BLK_STS_RESOURCE;
+	}
+
 	switch (issue_type) {
+	case MMC_ISSUE_DCMD:
+		if (mmc_cqe_dcmd_busy(mq)) {
+			mq->cqe_busy |= MMC_CQE_DCMD_BUSY;
+			spin_unlock_irq(q->queue_lock);
+			return BLK_STS_RESOURCE;
+		}
+		break;
 	case MMC_ISSUE_ASYNC:
 		break;
 	default:
@@ -254,6 +391,7 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	mq->in_flight[issue_type] += 1;
 	get_card = (mmc_tot_in_flight(mq) == 1);
+	cqe_retune_ok = (mmc_cqe_qcnt(mq) == 1);
 
 	spin_unlock_irq(q->queue_lock);
 
@@ -265,6 +403,11 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (get_card)
 		mmc_get_card(card, &mq->ctx);
 
+	if (mq->use_cqe) {
+		host->retune_now = host->need_retune && cqe_retune_ok &&
+				   !host->hold_retune;
+	}
+
 	blk_mq_start_request(req);
 
 	issued = mmc_blk_mq_issue_rq(mq, req);
@@ -326,6 +469,7 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 	/* Initialize thread_sem even if it is not used */
 	sema_init(&mq->thread_sem, 1);
 
+	INIT_WORK(&mq->recovery_work, mmc_mq_recovery_handler);
 	INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work);
 
 	mutex_init(&mq->complete_lock);
@@ -375,10 +519,18 @@ static int mmc_mq_init_queue(struct mmc_queue *mq, int q_depth,
 static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
 			 spinlock_t *lock)
 {
+	struct mmc_host *host = card->host;
 	int q_depth;
 	int ret;
 
-	q_depth = MMC_QUEUE_DEPTH;
+	/*
+	 * The queue depth for CQE must match the hardware because the request
+	 * tag is used to index the hardware queue.
+	 */
+	if (mq->use_cqe)
+		q_depth = min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
+	else
+		q_depth = MMC_QUEUE_DEPTH;
 
 	ret = mmc_mq_init_queue(mq, q_depth, &mmc_mq_ops, lock);
 	if (ret)
@@ -408,7 +560,9 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 
 	mq->card = card;
 
-	if (mmc_host_use_blk_mq(host))
+	mq->use_cqe = host->cqe_enabled;
+
+	if (mq->use_cqe || mmc_host_use_blk_mq(host))
 		return mmc_mq_init(mq, card, lock);
 
 	mq->queue = blk_alloc_queue(GFP_KERNEL);
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index ce9249852f26..1d7d3b0afff8 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -17,6 +17,7 @@ enum mmc_issued {
 
 enum mmc_issue_type {
 	MMC_ISSUE_SYNC,
+	MMC_ISSUE_DCMD,
 	MMC_ISSUE_ASYNC,
 	MMC_ISSUE_MAX,
 };
@@ -92,8 +93,15 @@ struct mmc_queue {
 	int			qcnt;
 
 	int			in_flight[MMC_ISSUE_MAX];
+	unsigned int		cqe_busy;
+#define MMC_CQE_DCMD_BUSY	BIT(0)
+#define MMC_CQE_QUEUE_FULL	BIT(1)
+	bool			use_cqe;
+	bool			recovery_needed;
+	bool			in_recovery;
 	bool			rw_wait;
 	bool			waiting;
+	struct work_struct	recovery_work;
 	wait_queue_head_t	wait;
 	struct request		*complete_req;
 	struct mutex		complete_lock;
@@ -108,11 +116,21 @@ extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
 extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
 				     struct mmc_queue_req *);
 
+void mmc_cqe_check_busy(struct mmc_queue *mq);
+void mmc_cqe_recovery_notifier(struct mmc_request *mrq);
+
 enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req);
 
 static inline int mmc_tot_in_flight(struct mmc_queue *mq)
 {
 	return mq->in_flight[MMC_ISSUE_SYNC] +
+	       mq->in_flight[MMC_ISSUE_DCMD] +
+	       mq->in_flight[MMC_ISSUE_ASYNC];
+}
+
+static inline int mmc_cqe_qcnt(struct mmc_queue *mq)
+{
+	return mq->in_flight[MMC_ISSUE_DCMD] +
 	       mq->in_flight[MMC_ISSUE_ASYNC];
 }
 
-- 
1.9.1


* [PATCH V15 08/22] mmc: cqhci: support for command queue enabled host
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (6 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 07/22] mmc: block: Add CQE support Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 09/22] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

From: Venkat Gopalakrishnan <venkatg@codeaurora.org>

This patch adds CMDQ support for command-queue compatible
hosts.

Command queueing was added in the eMMC 5.1 specification.  It
enables the controller to process up to 32 requests at a time.

Adrian Hunter contributed renaming to cqhci, recovery, suspend
and resume, cqhci_off, cqhci_wait_for_idle, and external timeout
handling.
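
As a rough sketch of the glue a host driver supplies (the my_* names here
are hypothetical; the cqhci_* calls and capability flags are the API this
patch adds):

static irqreturn_t my_host_irq(int irq, void *dev_id)
{
	struct mmc_host *mmc = dev_id;
	u32 intmask = my_read_irq_status(mmc);	/* hypothetical accessor */

	/* While the CQE owns the controller, route interrupts to CQHCI */
	if (mmc->cqe_on)
		return cqhci_irq(mmc, intmask, 0, 0);

	return my_legacy_irq(mmc, intmask);	/* hypothetical */
}

static int my_host_probe(struct platform_device *pdev)
{
	struct mmc_host *mmc = my_mmc_alloc(pdev);	/* hypothetical */
	struct cqhci_host *cq_host;

	cq_host = cqhci_pltfm_init(pdev); /* maps the "cqhci_mem" resource */
	if (IS_ERR(cq_host))
		return PTR_ERR(cq_host);

	mmc->caps2 |= MMC_CAP2_CQE | MMC_CAP2_CQE_DCMD;

	return cqhci_init(cq_host, mmc, true);	/* true = 64-bit DMA */
}

The next patch does this for real for Intel GLK.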

Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org>
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/Kconfig  |   13 +
 drivers/mmc/host/Makefile |    1 +
 drivers/mmc/host/cqhci.c  | 1150 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h  |  240 ++++++++++
 4 files changed, 1404 insertions(+)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 567028c9219a..3092b7085cb5 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -857,6 +857,19 @@ config MMC_SUNXI
 	  This selects support for the SD/MMC Host Controller on
 	  Allwinner sunxi SoCs.
 
+config MMC_CQHCI
+	tristate "Command Queue Host Controller Interface support"
+	depends on HAS_DMA
+	help
+	  This selects the Command Queue Host Controller Interface (CQHCI)
+	  support present in host controllers of Qualcomm Technologies, Inc.,
+	  amongst others.
+	  This controller supports eMMC devices with command queue support.
+
+	  If you have a controller with this interface, say Y or M here.
+
+	  If unsure, say N.
+
 config MMC_TOSHIBA_PCI
 	tristate "Toshiba Type A SD/MMC Card Interface Driver"
 	depends on PCI
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index a43cf0d5a5d3..407a011026cd 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -92,6 +92,7 @@ obj-$(CONFIG_MMC_SDHCI_ST)		+= sdhci-st.o
 obj-$(CONFIG_MMC_SDHCI_MICROCHIP_PIC32)	+= sdhci-pic32.o
 obj-$(CONFIG_MMC_SDHCI_BRCMSTB)		+= sdhci-brcmstb.o
 obj-$(CONFIG_MMC_SDHCI_OMAP)		+= sdhci-omap.o
+obj-$(CONFIG_MMC_CQHCI)			+= cqhci.o
 
 ifeq ($(CONFIG_CB710_DEBUG),y)
 	CFLAGS-cb710-mmc	+= -DDEBUG
diff --git a/drivers/mmc/host/cqhci.c b/drivers/mmc/host/cqhci.c
new file mode 100644
index 000000000000..159270e947cf
--- /dev/null
+++ b/drivers/mmc/host/cqhci.c
@@ -0,0 +1,1150 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/delay.h>
+#include <linux/highmem.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/dma-mapping.h>
+#include <linux/slab.h>
+#include <linux/scatterlist.h>
+#include <linux/platform_device.h>
+#include <linux/ktime.h>
+
+#include <linux/mmc/mmc.h>
+#include <linux/mmc/host.h>
+#include <linux/mmc/card.h>
+
+#include "cqhci.h"
+
+#define DCMD_SLOT 31
+#define NUM_SLOTS 32
+
+struct cqhci_slot {
+	struct mmc_request *mrq;
+	unsigned int flags;
+#define CQHCI_EXTERNAL_TIMEOUT	BIT(0)
+#define CQHCI_COMPLETED		BIT(1)
+#define CQHCI_HOST_CRC		BIT(2)
+#define CQHCI_HOST_TIMEOUT	BIT(3)
+#define CQHCI_HOST_OTHER	BIT(4)
+};
+
+static inline u8 *get_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->desc_base + (tag * cq_host->slot_sz);
+}
+
+static inline u8 *get_link_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *desc = get_desc(cq_host, tag);
+
+	return desc + cq_host->task_desc_len;
+}
+
+static inline dma_addr_t get_trans_desc_dma(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_dma_base +
+		(cq_host->mmc->max_segs * tag *
+		 cq_host->trans_desc_len);
+}
+
+static inline u8 *get_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_base +
+		(cq_host->trans_desc_len * cq_host->mmc->max_segs * tag);
+}
+
+static void setup_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *link_temp;
+	dma_addr_t trans_temp;
+
+	link_temp = get_link_desc(cq_host, tag);
+	trans_temp = get_trans_desc_dma(cq_host, tag);
+
+	memset(link_temp, 0, cq_host->link_desc_len);
+	if (cq_host->link_desc_len > 8)
+		*(link_temp + 8) = 0;
+
+	if (tag == DCMD_SLOT && (cq_host->mmc->caps2 & MMC_CAP2_CQE_DCMD)) {
+		*link_temp = CQHCI_VALID(0) | CQHCI_ACT(0) | CQHCI_END(1);
+		return;
+	}
+
+	*link_temp = CQHCI_VALID(1) | CQHCI_ACT(0x6) | CQHCI_END(0);
+
+	if (cq_host->dma64) {
+		__le64 *data_addr = (__le64 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le64(trans_temp);
+	} else {
+		__le32 *data_addr = (__le32 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le32(trans_temp);
+	}
+}
+
+static void cqhci_set_irqs(struct cqhci_host *cq_host, u32 set)
+{
+	cqhci_writel(cq_host, set, CQHCI_ISTE);
+	cqhci_writel(cq_host, set, CQHCI_ISGE);
+}
+
+#define DRV_NAME "cqhci"
+
+#define CQHCI_DUMP(f, x...) \
+	pr_err("%s: " DRV_NAME ": " f, mmc_hostname(mmc), ## x)
+
+static void cqhci_dumpregs(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+
+	CQHCI_DUMP("============ CQHCI REGISTER DUMP ===========\n");
+
+	CQHCI_DUMP("Caps:      0x%08x | Version:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CAP),
+		   cqhci_readl(cq_host, CQHCI_VER));
+	CQHCI_DUMP("Config:    0x%08x | Control:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CFG),
+		   cqhci_readl(cq_host, CQHCI_CTL));
+	CQHCI_DUMP("Int stat:  0x%08x | Int enab: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_IS),
+		   cqhci_readl(cq_host, CQHCI_ISTE));
+	CQHCI_DUMP("Int sig:   0x%08x | Int Coal: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_ISGE),
+		   cqhci_readl(cq_host, CQHCI_IC));
+	CQHCI_DUMP("TDL base:  0x%08x | TDL up32: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDLBA),
+		   cqhci_readl(cq_host, CQHCI_TDLBAU));
+	CQHCI_DUMP("Doorbell:  0x%08x | TCN:      0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDBR),
+		   cqhci_readl(cq_host, CQHCI_TCN));
+	CQHCI_DUMP("Dev queue: 0x%08x | Dev Pend: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_DQS),
+		   cqhci_readl(cq_host, CQHCI_DPT));
+	CQHCI_DUMP("Task clr:  0x%08x | SSC1:     0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TCLR),
+		   cqhci_readl(cq_host, CQHCI_SSC1));
+	CQHCI_DUMP("SSC2:      0x%08x | DCMD rsp: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_SSC2),
+		   cqhci_readl(cq_host, CQHCI_CRDCT));
+	CQHCI_DUMP("RED mask:  0x%08x | TERRI:    0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_RMEM),
+		   cqhci_readl(cq_host, CQHCI_TERRI));
+	CQHCI_DUMP("Resp idx:  0x%08x | Resp arg: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CRI),
+		   cqhci_readl(cq_host, CQHCI_CRA));
+
+	if (cq_host->ops->dumpregs)
+		cq_host->ops->dumpregs(mmc);
+	else
+		CQHCI_DUMP(": ===========================================\n");
+}
+
+/**
+ * The allocated descriptor table for task, link & transfer descriptors
+ * looks like:
+ * |----------|
+ * |task desc |  |->|----------|
+ * |----------|  |  |trans desc|
+ * |link desc-|->|  |----------|
+ * |----------|          .
+ *      .                .
+ *  no. of slots      max-segs
+ *      .           |----------|
+ * |----------|
+ * The idea here is to create the [task+trans] table and mark & point the
+ * link desc to the transfer desc table on a per slot basis.
+ */
+static int cqhci_host_alloc_tdl(struct cqhci_host *cq_host)
+{
+	int i;
+
+	/* task descriptor can be 64/128 bit irrespective of arch */
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128) {
+		cqhci_writel(cq_host, cqhci_readl(cq_host, CQHCI_CFG) |
+			       CQHCI_TASK_DESC_SZ, CQHCI_CFG);
+		cq_host->task_desc_len = 16;
+	} else {
+		cq_host->task_desc_len = 8;
+	}
+
+	/*
+	 * The transfer descriptor can be 96 bits instead of 128 bits, which
+	 * means ADMA would expect the next valid descriptor at the 96th bit
+	 * or the 128th bit.
+	 */
+	if (cq_host->dma64) {
+		if (cq_host->quirks & CQHCI_QUIRK_SHORT_TXFR_DESC_SZ)
+			cq_host->trans_desc_len = 12;
+		else
+			cq_host->trans_desc_len = 16;
+		cq_host->link_desc_len = 16;
+	} else {
+		cq_host->trans_desc_len = 8;
+		cq_host->link_desc_len = 8;
+	}
+
+	/* total size of a slot: 1 task & 1 transfer (link) */
+	cq_host->slot_sz = cq_host->task_desc_len + cq_host->link_desc_len;
+
+	cq_host->desc_size = cq_host->slot_sz * cq_host->num_slots;
+
+	cq_host->data_size = cq_host->trans_desc_len * cq_host->mmc->max_segs *
+		(cq_host->num_slots - 1);
+
+	pr_debug("%s: cqhci: desc_size: %zu data_sz: %zu slot-sz: %d\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_size, cq_host->data_size,
+		 cq_host->slot_sz);
+
+	/*
+	 * allocate a dma-mapped chunk of memory for the descriptors
+	 * allocate a dma-mapped chunk of memory for link descriptors
+	 * setup each link-desc memory offset per slot-number to
+	 * the descriptor table.
+	 */
+	cq_host->desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+						 cq_host->desc_size,
+						 &cq_host->desc_dma_base,
+						 GFP_KERNEL);
+	cq_host->trans_desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+					      cq_host->data_size,
+					      &cq_host->trans_desc_dma_base,
+					      GFP_KERNEL);
+	if (!cq_host->desc_base || !cq_host->trans_desc_base)
+		return -ENOMEM;
+
+	pr_debug("%s: cqhci: desc-base: 0x%p trans-base: 0x%p\n desc_dma 0x%llx trans_dma: 0x%llx\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_base, cq_host->trans_desc_base,
+		(unsigned long long)cq_host->desc_dma_base,
+		(unsigned long long)cq_host->trans_desc_dma_base);
+
+	for (i = 0; i < cq_host->num_slots; i++)
+		setup_trans_desc(cq_host, i);
+
+	return 0;
+}
+
+static void __cqhci_enable(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+
+	/* Configuration must not be changed while enabled */
+	if (cqcfg & CQHCI_ENABLE) {
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+	}
+
+	cqcfg &= ~(CQHCI_DCMD | CQHCI_TASK_DESC_SZ);
+
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		cqcfg |= CQHCI_DCMD;
+
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128)
+		cqcfg |= CQHCI_TASK_DESC_SZ;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cqhci_writel(cq_host, lower_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBA);
+	cqhci_writel(cq_host, upper_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBAU);
+
+	cqhci_writel(cq_host, cq_host->rca, CQHCI_SSC2);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	cqcfg |= CQHCI_ENABLE;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	mmc->cqe_on = true;
+
+	if (cq_host->ops->enable)
+		cq_host->ops->enable(mmc);
+
+	/* Ensure all writes are done before interrupts are enabled */
+	wmb();
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	cq_host->activated = true;
+}
+
+static void __cqhci_disable(struct cqhci_host *cq_host)
+{
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+	cqcfg &= ~CQHCI_ENABLE;
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cq_host->mmc->cqe_on = false;
+
+	cq_host->activated = false;
+}
+
+int cqhci_suspend(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (cq_host->enabled)
+		__cqhci_disable(cq_host);
+
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_suspend);
+
+int cqhci_resume(struct mmc_host *mmc)
+{
+	/* Re-enable is done upon first request */
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_resume);
+
+static int cqhci_enable(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int err;
+
+	if (cq_host->enabled)
+		return 0;
+
+	cq_host->rca = card->rca;
+
+	err = cqhci_host_alloc_tdl(cq_host);
+	if (err)
+		return err;
+
+	__cqhci_enable(cq_host);
+
+	cq_host->enabled = true;
+
+#ifdef DEBUG
+	cqhci_dumpregs(cq_host);
+#endif
+	return 0;
+}
+
+/* CQHCI is idle and should halt immediately, so set a small timeout */
+#define CQHCI_OFF_TIMEOUT 100
+
+static void cqhci_off(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	ktime_t timeout;
+	bool timed_out;
+	u32 reg;
+
+	if (!cq_host->enabled || !mmc->cqe_on || cq_host->recovery_halt)
+		return;
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, false);
+
+	cqhci_writel(cq_host, CQHCI_HALT, CQHCI_CTL);
+
+	timeout = ktime_add_us(ktime_get(), CQHCI_OFF_TIMEOUT);
+	while (1) {
+		timed_out = ktime_compare(ktime_get(), timeout) > 0;
+		reg = cqhci_readl(cq_host, CQHCI_CTL);
+		if ((reg & CQHCI_HALT) || timed_out)
+			break;
+	}
+
+	if (timed_out)
+		pr_err("%s: cqhci: CQE stuck on\n", mmc_hostname(mmc));
+	else
+		pr_debug("%s: cqhci: CQE off\n", mmc_hostname(mmc));
+
+	mmc->cqe_on = false;
+}
+
+static void cqhci_disable(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->enabled)
+		return;
+
+	cqhci_off(mmc);
+
+	__cqhci_disable(cq_host);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->data_size,
+			   cq_host->trans_desc_base,
+			   cq_host->trans_desc_dma_base);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->desc_size,
+			   cq_host->desc_base,
+			   cq_host->desc_dma_base);
+
+	cq_host->trans_desc_base = NULL;
+	cq_host->desc_base = NULL;
+
+	cq_host->enabled = false;
+}
+
+static void cqhci_prep_task_desc(struct mmc_request *mrq,
+					u64 *data, bool intr)
+{
+	u32 req_flags = mrq->data->flags;
+
+	*data = CQHCI_VALID(1) |
+		CQHCI_END(1) |
+		CQHCI_INT(intr) |
+		CQHCI_ACT(0x5) |
+		CQHCI_FORCED_PROG(!!(req_flags & MMC_DATA_FORCED_PRG)) |
+		CQHCI_DATA_TAG(!!(req_flags & MMC_DATA_DAT_TAG)) |
+		CQHCI_DATA_DIR(!!(req_flags & MMC_DATA_READ)) |
+		CQHCI_PRIORITY(!!(req_flags & MMC_DATA_PRIO)) |
+		CQHCI_QBAR(!!(req_flags & MMC_DATA_QBR)) |
+		CQHCI_REL_WRITE(!!(req_flags & MMC_DATA_REL_WR)) |
+		CQHCI_BLK_COUNT(mrq->data->blocks) |
+		CQHCI_BLK_ADDR((u64)mrq->data->blk_addr);
+
+	pr_debug("%s: cqhci: tag %d task descriptor 0x%016llx\n",
+		 mmc_hostname(mrq->host), mrq->tag, (unsigned long long)*data);
+}
+
+static int cqhci_dma_map(struct mmc_host *host, struct mmc_request *mrq)
+{
+	int sg_count;
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return -EINVAL;
+
+	sg_count = dma_map_sg(mmc_dev(host), data->sg,
+			      data->sg_len,
+			      (data->flags & MMC_DATA_WRITE) ?
+			      DMA_TO_DEVICE : DMA_FROM_DEVICE);
+	if (!sg_count) {
+		pr_err("%s: sg-len: %d\n", __func__, data->sg_len);
+		return -ENOMEM;
+	}
+
+	return sg_count;
+}
+
+static void cqhci_set_tran_desc(u8 *desc, dma_addr_t addr, int len, bool end,
+				bool dma64)
+{
+	__le32 *attr = (__le32 __force *)desc;
+
+	*attr = (CQHCI_VALID(1) |
+		 CQHCI_END(end ? 1 : 0) |
+		 CQHCI_INT(0) |
+		 CQHCI_ACT(0x4) |
+		 CQHCI_DAT_LENGTH(len));
+
+	if (dma64) {
+		__le64 *dataddr = (__le64 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le64(addr);
+	} else {
+		__le32 *dataddr = (__le32 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le32(addr);
+	}
+}
+
+static int cqhci_prep_tran_desc(struct mmc_request *mrq,
+			       struct cqhci_host *cq_host, int tag)
+{
+	struct mmc_data *data = mrq->data;
+	int i, sg_count, len;
+	bool end = false;
+	bool dma64 = cq_host->dma64;
+	dma_addr_t addr;
+	u8 *desc;
+	struct scatterlist *sg;
+
+	sg_count = cqhci_dma_map(mrq->host, mrq);
+	if (sg_count < 0) {
+		pr_err("%s: %s: unable to map sg lists, %d\n",
+				mmc_hostname(mrq->host), __func__, sg_count);
+		return sg_count;
+	}
+
+	desc = get_trans_desc(cq_host, tag);
+
+	for_each_sg(data->sg, sg, sg_count, i) {
+		addr = sg_dma_address(sg);
+		len = sg_dma_len(sg);
+
+		if ((i+1) == sg_count)
+			end = true;
+		cqhci_set_tran_desc(desc, addr, len, end, dma64);
+		desc += cq_host->trans_desc_len;
+	}
+
+	return 0;
+}
+
+static void cqhci_prep_dcmd_desc(struct mmc_host *mmc,
+				   struct mmc_request *mrq)
+{
+	u64 *task_desc = NULL;
+	u64 data = 0;
+	u8 resp_type;
+	u8 *desc;
+	__le64 *dataddr;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	u8 timing;
+
+	if (!(mrq->cmd->flags & MMC_RSP_PRESENT)) {
+		resp_type = 0x0;
+		timing = 0x1;
+	} else {
+		if (mrq->cmd->flags & MMC_RSP_R1B) {
+			resp_type = 0x3;
+			timing = 0x0;
+		} else {
+			resp_type = 0x2;
+			timing = 0x1;
+		}
+	}
+
+	task_desc = (__le64 __force *)get_desc(cq_host, cq_host->dcmd_slot);
+	memset(task_desc, 0, cq_host->task_desc_len);
+	data |= (CQHCI_VALID(1) |
+		 CQHCI_END(1) |
+		 CQHCI_INT(1) |
+		 CQHCI_QBAR(1) |
+		 CQHCI_ACT(0x5) |
+		 CQHCI_CMD_INDEX(mrq->cmd->opcode) |
+		 CQHCI_CMD_TIMING(timing) | CQHCI_RESP_TYPE(resp_type));
+	*task_desc |= data;
+	desc = (u8 *)task_desc;
+	pr_debug("%s: cqhci: dcmd: cmd: %d timing: %d resp: %d\n",
+		 mmc_hostname(mmc), mrq->cmd->opcode, timing, resp_type);
+	dataddr = (__le64 __force *)(desc + 4);
+	dataddr[0] = cpu_to_le64((u64)mrq->cmd->arg);
+}
+
+static void cqhci_post_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	struct mmc_data *data = mrq->data;
+
+	if (data) {
+		dma_unmap_sg(mmc_dev(host), data->sg, data->sg_len,
+			     (data->flags & MMC_DATA_READ) ?
+			     DMA_FROM_DEVICE : DMA_TO_DEVICE);
+	}
+}
+
+static inline int cqhci_tag(struct mmc_request *mrq)
+{
+	return mrq->cmd ? DCMD_SLOT : mrq->tag;
+}
+
+static int cqhci_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	int err = 0;
+	u64 data = 0;
+	u64 *task_desc = NULL;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+
+	if (!cq_host->enabled) {
+		pr_err("%s: cqhci: not enabled\n", mmc_hostname(mmc));
+		return -EINVAL;
+	}
+
+	/* First request after resume has to re-enable */
+	if (!cq_host->activated)
+		__cqhci_enable(cq_host);
+
+	if (!mmc->cqe_on) {
+		cqhci_writel(cq_host, 0, CQHCI_CTL);
+		mmc->cqe_on = true;
+		pr_debug("%s: cqhci: CQE on\n", mmc_hostname(mmc));
+		if (cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT) {
+			pr_err("%s: cqhci: CQE failed to exit halt state\n",
+			       mmc_hostname(mmc));
+		}
+		if (cq_host->ops->enable)
+			cq_host->ops->enable(mmc);
+	}
+
+	if (mrq->data) {
+		task_desc = (__le64 __force *)get_desc(cq_host, tag);
+		cqhci_prep_task_desc(mrq, &data, 1);
+		*task_desc = cpu_to_le64(data);
+		err = cqhci_prep_tran_desc(mrq, cq_host, tag);
+		if (err) {
+			pr_err("%s: cqhci: failed to setup tx desc: %d\n",
+			       mmc_hostname(mmc), err);
+			return err;
+		}
+	} else {
+		cqhci_prep_dcmd_desc(mmc, mrq);
+	}
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+
+	if (cq_host->recovery_halt) {
+		err = -EBUSY;
+		goto out_unlock;
+	}
+
+	cq_host->slot[tag].mrq = mrq;
+	cq_host->slot[tag].flags = 0;
+
+	cq_host->qcnt += 1;
+
+	cqhci_writel(cq_host, 1 << tag, CQHCI_TDBR);
+	if (!(cqhci_readl(cq_host, CQHCI_TDBR) & (1 << tag)))
+		pr_debug("%s: cqhci: doorbell not set for tag %d\n",
+			 mmc_hostname(mmc), tag);
+out_unlock:
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (err)
+		cqhci_post_req(mmc, mrq);
+
+	return err;
+}
+
+static void cqhci_recovery_needed(struct mmc_host *mmc, struct mmc_request *mrq,
+				  bool notify)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->recovery_halt) {
+		cq_host->recovery_halt = true;
+		pr_debug("%s: cqhci: recovery needed\n", mmc_hostname(mmc));
+		wake_up(&cq_host->wait_queue);
+		if (notify && mrq->recovery_notifier)
+			mrq->recovery_notifier(mrq);
+	}
+}
+
+static unsigned int cqhci_error_flags(int error1, int error2)
+{
+	int error = error1 ? error1 : error2;
+
+	switch (error) {
+	case -EILSEQ:
+		return CQHCI_HOST_CRC;
+	case -ETIMEDOUT:
+		return CQHCI_HOST_TIMEOUT;
+	default:
+		return CQHCI_HOST_OTHER;
+	}
+}
+
+static void cqhci_error_irq(struct mmc_host *mmc, u32 status, int cmd_error,
+			    int data_error)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot;
+	u32 terri;
+	int tag;
+
+	spin_lock(&cq_host->lock);
+
+	terri = cqhci_readl(cq_host, CQHCI_TERRI);
+
+	pr_debug("%s: cqhci: error IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+		 mmc_hostname(mmc), status, cmd_error, data_error, terri);
+
+	/* Forget about errors when recovery has already been triggered */
+	if (cq_host->recovery_halt)
+		goto out_unlock;
+
+	if (!cq_host->qcnt) {
+		WARN_ONCE(1, "%s: cqhci: error when idle. IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+			  mmc_hostname(mmc), status, cmd_error, data_error,
+			  terri);
+		goto out_unlock;
+	}
+
+	if (CQHCI_TERRI_C_VALID(terri)) {
+		tag = CQHCI_TERRI_C_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(cmd_error, data_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (CQHCI_TERRI_D_VALID(terri)) {
+		tag = CQHCI_TERRI_D_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (!cq_host->recovery_halt) {
+		/*
+		 * The only way to guarantee forward progress is to mark at
+		 * least one task in error, so if none is indicated, pick one.
+		 */
+		for (tag = 0; tag < NUM_SLOTS; tag++) {
+			slot = &cq_host->slot[tag];
+			if (!slot->mrq)
+				continue;
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+			break;
+		}
+	}
+
+out_unlock:
+	spin_unlock(&cq_host->lock);
+}
+
+static void cqhci_finish_mrq(struct mmc_host *mmc, unsigned int tag)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq) {
+		WARN_ONCE(1, "%s: cqhci: spurious TCN for tag %d\n",
+			  mmc_hostname(mmc), tag);
+		return;
+	}
+
+	/* No completions allowed during recovery */
+	if (cq_host->recovery_halt) {
+		slot->flags |= CQHCI_COMPLETED;
+		return;
+	}
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		if (data->error)
+			data->bytes_xfered = 0;
+		else
+			data->bytes_xfered = data->blksz * data->blocks;
+	}
+
+	mmc_cqe_request_done(mmc, mrq);
+}
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error)
+{
+	u32 status;
+	unsigned long tag = 0, comp_status;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	status = cqhci_readl(cq_host, CQHCI_IS);
+	cqhci_writel(cq_host, status, CQHCI_IS);
+
+	pr_debug("%s: cqhci: IRQ status: 0x%08x\n", mmc_hostname(mmc), status);
+
+	if ((status & CQHCI_IS_RED) || cmd_error || data_error)
+		cqhci_error_irq(mmc, status, cmd_error, data_error);
+
+	if (status & CQHCI_IS_TCC) {
+		/* read TCN and complete the request */
+		comp_status = cqhci_readl(cq_host, CQHCI_TCN);
+		cqhci_writel(cq_host, comp_status, CQHCI_TCN);
+		pr_debug("%s: cqhci: TCN: 0x%08lx\n",
+			 mmc_hostname(mmc), comp_status);
+
+		spin_lock(&cq_host->lock);
+
+		for_each_set_bit(tag, &comp_status, cq_host->num_slots) {
+			/* complete the corresponding mrq */
+			pr_debug("%s: cqhci: completing tag %lu\n",
+				 mmc_hostname(mmc), tag);
+			cqhci_finish_mrq(mmc, tag);
+		}
+
+		if (cq_host->waiting_for_idle && !cq_host->qcnt) {
+			cq_host->waiting_for_idle = false;
+			wake_up(&cq_host->wait_queue);
+		}
+
+		spin_unlock(&cq_host->lock);
+	}
+
+	if (status & CQHCI_IS_TCL)
+		wake_up(&cq_host->wait_queue);
+
+	if (status & CQHCI_IS_HAC)
+		wake_up(&cq_host->wait_queue);
+
+	return IRQ_HANDLED;
+}
+EXPORT_SYMBOL(cqhci_irq);
+
+static bool cqhci_is_idle(struct cqhci_host *cq_host, int *ret)
+{
+	unsigned long flags;
+	bool is_idle;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	is_idle = !cq_host->qcnt || cq_host->recovery_halt;
+	*ret = cq_host->recovery_halt ? -EBUSY : 0;
+	cq_host->waiting_for_idle = !is_idle;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	return is_idle;
+}
+
+static int cqhci_wait_for_idle(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int ret;
+
+	wait_event(cq_host->wait_queue, cqhci_is_idle(cq_host, &ret));
+
+	return ret;
+}
+
+static bool cqhci_timeout(struct mmc_host *mmc, struct mmc_request *mrq,
+			  bool *recovery_needed)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	unsigned long flags;
+	bool timed_out;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	timed_out = slot->mrq == mrq;
+	if (timed_out) {
+		slot->flags |= CQHCI_EXTERNAL_TIMEOUT;
+		cqhci_recovery_needed(mmc, mrq, false);
+		*recovery_needed = cq_host->recovery_halt;
+	}
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (timed_out) {
+		pr_err("%s: cqhci: timeout for tag %d\n",
+		       mmc_hostname(mmc), tag);
+		cqhci_dumpregs(cq_host);
+	}
+
+	return timed_out;
+}
+
+static bool cqhci_tasks_cleared(struct cqhci_host *cq_host)
+{
+	return !(cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_CLEAR_ALL_TASKS);
+}
+
+static bool cqhci_clear_all_tasks(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_TCL);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_CLEAR_ALL_TASKS;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_tasks_cleared(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_tasks_cleared(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to clear tasks\n",
+			 mmc_hostname(mmc));
+
+	return ret;
+}
+
+static bool cqhci_halted(struct cqhci_host *cq_host)
+{
+	return cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT;
+}
+
+static bool cqhci_halt(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	if (cqhci_halted(cq_host))
+		return true;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_HAC);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_HALT;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_halted(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_halted(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to halt\n", mmc_hostname(mmc));
+
+	return ret;
+}
+
+/*
+ * After halting we expect to be able to use the command line. We interpret the
+ * failure to halt to mean the data lines might still be in use (and the upper
+ * layers will need to send a STOP command), so we set the timeout based on a
+ * generous command timeout.
+ */
+#define CQHCI_START_HALT_TIMEOUT	5
+
+static void cqhci_recovery_start(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	cqhci_halt(mmc, CQHCI_START_HALT_TIMEOUT);
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, true);
+
+	mmc->cqe_on = false;
+}
+
+static int cqhci_error_from_flags(unsigned int flags)
+{
+	if (!flags)
+		return 0;
+
+	/* CRC errors might indicate re-tuning so prefer to report that */
+	if (flags & CQHCI_HOST_CRC)
+		return -EILSEQ;
+
+	if (flags & (CQHCI_EXTERNAL_TIMEOUT | CQHCI_HOST_TIMEOUT))
+		return -ETIMEDOUT;
+
+	return -EIO;
+}
+
+static void cqhci_recover_mrq(struct cqhci_host *cq_host, unsigned int tag)
+{
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq)
+		return;
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		data->bytes_xfered = 0;
+		data->error = cqhci_error_from_flags(slot->flags);
+	} else {
+		mrq->cmd->error = cqhci_error_from_flags(slot->flags);
+	}
+
+	mmc_cqe_request_done(cq_host->mmc, mrq);
+}
+
+static void cqhci_recover_mrqs(struct cqhci_host *cq_host)
+{
+	int i;
+
+	for (i = 0; i < cq_host->num_slots; i++)
+		cqhci_recover_mrq(cq_host, i);
+}
+
+/*
+ * By now the command and data lines should be unused so there is no reason for
+ * CQHCI to take a long time to halt, but if it doesn't halt there could be
+ * problems clearing tasks, so be generous.
+ */
+#define CQHCI_FINISH_HALT_TIMEOUT	20
+
+/* CQHCI could be expected to clear its internal state pretty quickly */
+#define CQHCI_CLEAR_TIMEOUT		20
+
+static void cqhci_recovery_finish(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+	u32 cqcfg;
+	bool ok;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+
+	if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+		ok = false;
+
+	/*
+	 * The specification contradicts itself: tasks cannot be cleared if
+	 * CQHCI does not halt, a CQHCI that does not halt should be
+	 * disabled/re-enabled, yet it must not be disabled before tasks are
+	 * cleared.  Have a go anyway.
+	 */
+	if (!ok) {
+		pr_debug("%s: cqhci: disable / re-enable\n", mmc_hostname(mmc));
+		cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		cqcfg |= CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		/* Be sure that there are no tasks */
+		ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+		if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+			ok = false;
+		WARN_ON(!ok);
+	}
+
+	cqhci_recover_mrqs(cq_host);
+
+	WARN_ON(cq_host->qcnt);
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	cq_host->qcnt = 0;
+	cq_host->recovery_halt = false;
+	mmc->cqe_on = false;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	/* Ensure all writes are done before interrupts are re-enabled */
+	wmb();
+
+	cqhci_writel(cq_host, CQHCI_IS_HAC | CQHCI_IS_TCL, CQHCI_IS);
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	pr_debug("%s: cqhci: recovery done\n", mmc_hostname(mmc));
+}
+
+static const struct mmc_cqe_ops cqhci_cqe_ops = {
+	.cqe_enable = cqhci_enable,
+	.cqe_disable = cqhci_disable,
+	.cqe_request = cqhci_request,
+	.cqe_post_req = cqhci_post_req,
+	.cqe_off = cqhci_off,
+	.cqe_wait_for_idle = cqhci_wait_for_idle,
+	.cqe_timeout = cqhci_timeout,
+	.cqe_recovery_start = cqhci_recovery_start,
+	.cqe_recovery_finish = cqhci_recovery_finish,
+};
+
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev)
+{
+	struct cqhci_host *cq_host;
+	struct resource *cqhci_memres = NULL;
+
+	/* check and setup CMDQ interface */
+	cqhci_memres = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+						   "cqhci_mem");
+	if (!cqhci_memres) {
+		dev_dbg(&pdev->dev, "CMDQ not supported\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	cq_host = devm_kzalloc(&pdev->dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host)
+		return ERR_PTR(-ENOMEM);
+	cq_host->mmio = devm_ioremap(&pdev->dev,
+				     cqhci_memres->start,
+				     resource_size(cqhci_memres));
+	if (!cq_host->mmio) {
+		dev_err(&pdev->dev, "failed to remap cqhci regs\n");
+		return ERR_PTR(-EBUSY);
+	}
+	dev_dbg(&pdev->dev, "CMDQ ioremap: done\n");
+
+	return cq_host;
+}
+EXPORT_SYMBOL(cqhci_pltfm_init);
+
+static unsigned int cqhci_ver_major(struct cqhci_host *cq_host)
+{
+	return CQHCI_VER_MAJOR(cqhci_readl(cq_host, CQHCI_VER));
+}
+
+static unsigned int cqhci_ver_minor(struct cqhci_host *cq_host)
+{
+	u32 ver = cqhci_readl(cq_host, CQHCI_VER);
+
+	return CQHCI_VER_MINOR1(ver) * 10 + CQHCI_VER_MINOR2(ver);
+}
+
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc,
+	      bool dma64)
+{
+	int err;
+
+	cq_host->dma64 = dma64;
+	cq_host->mmc = mmc;
+	cq_host->mmc->cqe_private = cq_host;
+
+	cq_host->num_slots = NUM_SLOTS;
+	cq_host->dcmd_slot = DCMD_SLOT;
+
+	mmc->cqe_ops = &cqhci_cqe_ops;
+
+	mmc->cqe_qdepth = NUM_SLOTS;
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		mmc->cqe_qdepth -= 1;
+
+	cq_host->slot = devm_kcalloc(mmc_dev(mmc), cq_host->num_slots,
+				     sizeof(*cq_host->slot), GFP_KERNEL);
+	if (!cq_host->slot) {
+		err = -ENOMEM;
+		goto out_err;
+	}
+
+	spin_lock_init(&cq_host->lock);
+
+	init_completion(&cq_host->halt_comp);
+	init_waitqueue_head(&cq_host->wait_queue);
+
+	pr_info("%s: CQHCI version %u.%02u\n",
+		mmc_hostname(mmc), cqhci_ver_major(cq_host),
+		cqhci_ver_minor(cq_host));
+
+	return 0;
+
+out_err:
+	pr_err("%s: CQHCI version %u.%02u failed to initialize, error %d\n",
+	       mmc_hostname(mmc), cqhci_ver_major(cq_host),
+	       cqhci_ver_minor(cq_host), err);
+	return err;
+}
+EXPORT_SYMBOL(cqhci_init);
+
+MODULE_AUTHOR("Venkat Gopalakrishnan <venkatg@codeaurora.org>");
+MODULE_DESCRIPTION("Command Queue Host Controller Interface driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/mmc/host/cqhci.h b/drivers/mmc/host/cqhci.h
new file mode 100644
index 000000000000..2d39d361b322
--- /dev/null
+++ b/drivers/mmc/host/cqhci.h
@@ -0,0 +1,240 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#ifndef LINUX_MMC_CQHCI_H
+#define LINUX_MMC_CQHCI_H
+
+#include <linux/compiler.h>
+#include <linux/bitops.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/completion.h>
+#include <linux/wait.h>
+#include <linux/irqreturn.h>
+#include <asm/io.h>
+
+/* registers */
+/* version */
+#define CQHCI_VER			0x00
+#define CQHCI_VER_MAJOR(x)		(((x) & GENMASK(11, 8)) >> 8)
+#define CQHCI_VER_MINOR1(x)		(((x) & GENMASK(7, 4)) >> 4)
+#define CQHCI_VER_MINOR2(x)		((x) & GENMASK(3, 0))
+
+/* capabilities */
+#define CQHCI_CAP			0x04
+/* configuration */
+#define CQHCI_CFG			0x08
+#define CQHCI_DCMD			0x00001000
+#define CQHCI_TASK_DESC_SZ		0x00000100
+#define CQHCI_ENABLE			0x00000001
+
+/* control */
+#define CQHCI_CTL			0x0C
+#define CQHCI_CLEAR_ALL_TASKS		0x00000100
+#define CQHCI_HALT			0x00000001
+
+/* interrupt status */
+#define CQHCI_IS			0x10
+#define CQHCI_IS_HAC			BIT(0)
+#define CQHCI_IS_TCC			BIT(1)
+#define CQHCI_IS_RED			BIT(2)
+#define CQHCI_IS_TCL			BIT(3)
+
+#define CQHCI_IS_MASK (CQHCI_IS_TCC | CQHCI_IS_RED)
+
+/* interrupt status enable */
+#define CQHCI_ISTE			0x14
+
+/* interrupt signal enable */
+#define CQHCI_ISGE			0x18
+
+/* interrupt coalescing */
+#define CQHCI_IC			0x1C
+#define CQHCI_IC_ENABLE			BIT(31)
+#define CQHCI_IC_RESET			BIT(16)
+#define CQHCI_IC_ICCTHWEN		BIT(15)
+#define CQHCI_IC_ICCTH(x)		((x & 0x1F) << 8)
+#define CQHCI_IC_ICTOVALWEN		BIT(7)
+#define CQHCI_IC_ICTOVAL(x)		(x & 0x7F)
+
+/* task list base address */
+#define CQHCI_TDLBA			0x20
+
+/* task list base address upper */
+#define CQHCI_TDLBAU			0x24
+
+/* door-bell */
+#define CQHCI_TDBR			0x28
+
+/* task completion notification */
+#define CQHCI_TCN			0x2C
+
+/* device queue status */
+#define CQHCI_DQS			0x30
+
+/* device pending tasks */
+#define CQHCI_DPT			0x34
+
+/* task clear */
+#define CQHCI_TCLR			0x38
+
+/* send status config 1 */
+#define CQHCI_SSC1			0x40
+
+/* send status config 2 */
+#define CQHCI_SSC2			0x44
+
+/* response for dcmd */
+#define CQHCI_CRDCT			0x48
+
+/* response mode error mask */
+#define CQHCI_RMEM			0x50
+
+/* task error info */
+#define CQHCI_TERRI			0x54
+
+#define CQHCI_TERRI_C_INDEX(x)		((x) & GENMASK(5, 0))
+#define CQHCI_TERRI_C_TASK(x)		(((x) & GENMASK(12, 8)) >> 8)
+#define CQHCI_TERRI_C_VALID(x)		((x) & BIT(15))
+#define CQHCI_TERRI_D_INDEX(x)		(((x) & GENMASK(21, 16)) >> 16)
+#define CQHCI_TERRI_D_TASK(x)		(((x) & GENMASK(28, 24)) >> 24)
+#define CQHCI_TERRI_D_VALID(x)		((x) & BIT(31))
+
+/* command response index */
+#define CQHCI_CRI			0x58
+
+/* command response argument */
+#define CQHCI_CRA			0x5C
+
+#define CQHCI_INT_ALL			0xF
+#define CQHCI_IC_DEFAULT_ICCTH		31
+#define CQHCI_IC_DEFAULT_ICTOVAL	1
+
+/* attribute fields */
+#define CQHCI_VALID(x)			((x & 1) << 0)
+#define CQHCI_END(x)			((x & 1) << 1)
+#define CQHCI_INT(x)			((x & 1) << 2)
+#define CQHCI_ACT(x)			((x & 0x7) << 3)
+
+/* data command task descriptor fields */
+#define CQHCI_FORCED_PROG(x)		((x & 1) << 6)
+#define CQHCI_CONTEXT(x)		((x & 0xF) << 7)
+#define CQHCI_DATA_TAG(x)		((x & 1) << 11)
+#define CQHCI_DATA_DIR(x)		((x & 1) << 12)
+#define CQHCI_PRIORITY(x)		((x & 1) << 13)
+#define CQHCI_QBAR(x)			((x & 1) << 14)
+#define CQHCI_REL_WRITE(x)		((x & 1) << 15)
+#define CQHCI_BLK_COUNT(x)		((x & 0xFFFF) << 16)
+#define CQHCI_BLK_ADDR(x)		((x & 0xFFFFFFFF) << 32)
+
+/* direct command task descriptor fields */
+#define CQHCI_CMD_INDEX(x)		((x & 0x3F) << 16)
+#define CQHCI_CMD_TIMING(x)		((x & 1) << 22)
+#define CQHCI_RESP_TYPE(x)		((x & 0x3) << 23)
+
+/* transfer descriptor fields */
+#define CQHCI_DAT_LENGTH(x)		((x & 0xFFFF) << 16)
+#define CQHCI_DAT_ADDR_LO(x)		((x & 0xFFFFFFFF) << 32)
+#define CQHCI_DAT_ADDR_HI(x)		((x & 0xFFFFFFFF) << 0)
+
+struct cqhci_host_ops;
+struct mmc_host;
+struct cqhci_slot;
+
+struct cqhci_host {
+	const struct cqhci_host_ops *ops;
+	void __iomem *mmio;
+	struct mmc_host *mmc;
+
+	spinlock_t lock;
+
+	/* relative card address of device */
+	unsigned int rca;
+
+	/* 64 bit DMA */
+	bool dma64;
+	int num_slots;
+	int qcnt;
+
+	u32 dcmd_slot;
+	u32 caps;
+#define CQHCI_TASK_DESC_SZ_128		0x1
+
+	u32 quirks;
+#define CQHCI_QUIRK_SHORT_TXFR_DESC_SZ	0x1
+
+	bool enabled;
+	bool halted;
+	bool init_done;
+	bool activated;
+	bool waiting_for_idle;
+	bool recovery_halt;
+
+	size_t desc_size;
+	size_t data_size;
+
+	u8 *desc_base;
+
+	/* total descriptor size */
+	u8 slot_sz;
+
+	/* 64/128 bit depends on CQHCI_CFG */
+	u8 task_desc_len;
+
+	/* 64 bit on 32-bit arch, 128 bit on 64-bit */
+	u8 link_desc_len;
+
+	u8 *trans_desc_base;
+	/* same length as transfer descriptor */
+	u8 trans_desc_len;
+
+	dma_addr_t desc_dma_base;
+	dma_addr_t trans_desc_dma_base;
+
+	struct completion halt_comp;
+	wait_queue_head_t wait_queue;
+	struct cqhci_slot *slot;
+};
+
+struct cqhci_host_ops {
+	void (*dumpregs)(struct mmc_host *mmc);
+	void (*write_l)(struct cqhci_host *host, u32 val, int reg);
+	u32 (*read_l)(struct cqhci_host *host, int reg);
+	void (*enable)(struct mmc_host *mmc);
+	void (*disable)(struct mmc_host *mmc, bool recovery);
+};
+
+static inline void cqhci_writel(struct cqhci_host *host, u32 val, int reg)
+{
+	if (unlikely(host->ops->write_l))
+		host->ops->write_l(host, val, reg);
+	else
+		writel_relaxed(val, host->mmio + reg);
+}
+
+static inline u32 cqhci_readl(struct cqhci_host *host, int reg)
+{
+	if (unlikely(host->ops->read_l))
+		return host->ops->read_l(host, reg);
+	else
+		return readl_relaxed(host->mmio + reg);
+}
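+
+/*
+ * Hosts with non-standard register access (e.g. byte swapping or an offset
+ * MMIO window) can override ->read_l/->write_l in cqhci_host_ops; otherwise
+ * the relaxed accessors above operate on host->mmio directly.
+ */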
+
+struct platform_device;
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error);
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc, bool dma64);
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev);
+int cqhci_suspend(struct mmc_host *mmc);
+int cqhci_resume(struct mmc_host *mmc);
+
+#endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 09/22] mmc: sdhci-pci: Add CQHCI support for Intel GLK
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (7 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 08/22] mmc: cqhci: support for command queue enabled host Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 10/22] mmc: block: blk-mq: Add support for direct completion Adrian Hunter
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Add CQHCI initialization and implement CQHCI operations for Intel GLK.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/Kconfig          |   1 +
 drivers/mmc/host/sdhci-pci-core.c | 155 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 155 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 3092b7085cb5..2b02a9788bb6 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -81,6 +81,7 @@ config MMC_SDHCI_BIG_ENDIAN_32BIT_BYTE_SWAPPER
 config MMC_SDHCI_PCI
 	tristate "SDHCI support on PCI bus"
 	depends on MMC_SDHCI && PCI
+	select MMC_CQHCI
 	help
 	  This selects the PCI Secure Digital Host Controller Interface.
 	  Most controllers found today are PCI devices.
diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 3e4f04fd5175..110c634cfb43 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -30,6 +30,8 @@
 #include <linux/mmc/sdhci-pci-data.h>
 #include <linux/acpi.h>
 
+#include "cqhci.h"
+
 #include "sdhci.h"
 #include "sdhci-pci.h"
 
@@ -116,6 +118,28 @@ int sdhci_pci_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_suspend_host(chip);
+}
+
+static int sdhci_cqhci_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
 #ifdef CONFIG_PM
@@ -166,8 +190,48 @@ static int sdhci_pci_runtime_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_runtime_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_runtime_suspend_host(chip);
+}
+
+static int sdhci_cqhci_runtime_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_runtime_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
+static u32 sdhci_cqhci_irq(struct sdhci_host *host, u32 intmask)
+{
+	int cmd_error = 0;
+	int data_error = 0;
+
+	if (!sdhci_cqe_irq(host, intmask, &cmd_error, &data_error))
+		return intmask;
+
+	cqhci_irq(host->mmc, intmask, cmd_error, data_error);
+
+	return 0;
+}
+
+static void sdhci_pci_dumpregs(struct mmc_host *mmc)
+{
+	sdhci_dumpregs(mmc_priv(mmc));
+}
+
 /*****************************************************************************\
  *                                                                           *
  * Hardware specific quirk handling                                          *
@@ -583,6 +647,18 @@ static void sdhci_intel_voltage_switch(struct sdhci_host *host)
 	.voltage_switch		= sdhci_intel_voltage_switch,
 };
 
+static const struct sdhci_ops sdhci_intel_glk_ops = {
+	.set_clock		= sdhci_set_clock,
+	.set_power		= sdhci_intel_set_power,
+	.enable_dma		= sdhci_pci_enable_dma,
+	.set_bus_width		= sdhci_set_bus_width,
+	.reset			= sdhci_reset,
+	.set_uhs_signaling	= sdhci_set_uhs_signaling,
+	.hw_reset		= sdhci_pci_hw_reset,
+	.voltage_switch		= sdhci_intel_voltage_switch,
+	.irq			= sdhci_cqhci_irq,
+};
+
 static void byt_read_dsm(struct sdhci_pci_slot *slot)
 {
 	struct intel_host *intel_host = sdhci_pci_priv(slot);
@@ -612,12 +688,80 @@ static int glk_emmc_probe_slot(struct sdhci_pci_slot *slot)
 {
 	int ret = byt_emmc_probe_slot(slot);
 
+	slot->host->mmc->caps2 |= MMC_CAP2_CQE;
+
 	if (slot->chip->pdev->device != PCI_DEVICE_ID_INTEL_GLK_EMMC) {
 		slot->host->mmc->caps2 |= MMC_CAP2_HS400_ES;
 		slot->host->mmc_host_ops.hs400_enhanced_strobe =
 						intel_hs400_enhanced_strobe;
+		slot->host->mmc->caps2 |= MMC_CAP2_CQE_DCMD;
+	}
+
+	return ret;
+}
+
+static void glk_cqe_enable(struct mmc_host *mmc)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	u32 reg;
+
+	/*
+	 * CQE gets stuck if it sees Buffer Read Enable bit set, which can be
+	 * the case after tuning, so ensure the buffer is drained.
+	 */
+	reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	while (reg & SDHCI_DATA_AVAILABLE) {
+		sdhci_readl(host, SDHCI_BUFFER);
+		reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	}
+
+	sdhci_cqe_enable(mmc);
+}
+
+static const struct cqhci_host_ops glk_cqhci_ops = {
+	.enable		= glk_cqe_enable,
+	.disable	= sdhci_cqe_disable,
+	.dumpregs	= sdhci_pci_dumpregs,
+};
+
+static int glk_emmc_add_host(struct sdhci_pci_slot *slot)
+{
+	struct device *dev = &slot->chip->pdev->dev;
+	struct sdhci_host *host = slot->host;
+	struct cqhci_host *cq_host;
+	bool dma64;
+	int ret;
+
+	ret = sdhci_setup_host(host);
+	if (ret)
+		return ret;
+
+	cq_host = devm_kzalloc(dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host) {
+		ret = -ENOMEM;
+		goto cleanup;
 	}
 
+	cq_host->mmio = host->ioaddr + 0x200;
+	cq_host->quirks |= CQHCI_QUIRK_SHORT_TXFR_DESC_SZ;
+	cq_host->ops = &glk_cqhci_ops;
+
+	dma64 = host->flags & SDHCI_USE_64_BIT_DMA;
+	if (dma64)
+		cq_host->caps |= CQHCI_TASK_DESC_SZ_128;
+
+	ret = cqhci_init(cq_host, host->mmc, dma64);
+	if (ret)
+		goto cleanup;
+
+	ret = __sdhci_add_host(host);
+	if (ret)
+		goto cleanup;
+
+	return 0;
+
+cleanup:
+	sdhci_cleanup_host(host);
 	return ret;
 }
 
@@ -699,11 +843,20 @@ static int byt_sd_probe_slot(struct sdhci_pci_slot *slot)
 static const struct sdhci_pci_fixes sdhci_intel_glk_emmc = {
 	.allow_runtime_pm	= true,
 	.probe_slot		= glk_emmc_probe_slot,
+	.add_host		= glk_emmc_add_host,
+#ifdef CONFIG_PM_SLEEP
+	.suspend		= sdhci_cqhci_suspend,
+	.resume			= sdhci_cqhci_resume,
+#endif
+#ifdef CONFIG_PM
+	.runtime_suspend	= sdhci_cqhci_runtime_suspend,
+	.runtime_resume		= sdhci_cqhci_runtime_resume,
+#endif
 	.quirks			= SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC,
 	.quirks2		= SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
 				  SDHCI_QUIRK2_CAPS_BIT63_FOR_HS400 |
 				  SDHCI_QUIRK2_STOP_WITH_TC,
-	.ops			= &sdhci_intel_byt_ops,
+	.ops			= &sdhci_intel_glk_ops,
 	.priv_size		= sizeof(struct intel_host),
 };
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 10/22] mmc: block: blk-mq: Add support for direct completion
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (8 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 09/22] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 11/22] mmc: block: blk-mq: Separate card polling from recovery Adrian Hunter
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

For blk-mq, add support for completing requests directly in the ->done
callback. That means error handling and urgent background operations must
then be handled by recovery_work.
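
A host that can safely complete requests from its ->done() path would
advertise the capability, e.g. from its probe (illustrative sketch, not a
host in this patch set):

	mmc->caps |= MMC_CAP_DONE_COMPLETE;

mmc_blk_mq_req_done() then completes the request inline, deferring only
errors and urgent background operations to recovery_work.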

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 129 ++++++++++++++++++++++++++++++++++++++---------
 drivers/mmc/core/block.h |   1 +
 drivers/mmc/core/host.h  |   5 ++
 drivers/mmc/core/queue.c |   5 +-
 drivers/mmc/core/queue.h |   1 +
 include/linux/mmc/host.h |   1 +
 6 files changed, 116 insertions(+), 26 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 7275ac5d6799..a710a6e95307 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -2131,6 +2131,22 @@ static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
 	}
 }
 
+static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
+{
+	mmc_blk_eval_resp_error(brq);
+
+	return brq->sbc.error || brq->cmd.error || brq->stop.error ||
+	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
+}
+
+static inline void mmc_blk_rw_reset_success(struct mmc_queue *mq,
+					    struct request *req)
+{
+	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+
+	mmc_blk_reset_success(mq->blkdata, type);
+}
+
 static void mmc_blk_mq_complete_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
@@ -2213,14 +2229,43 @@ static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req)
 
 	mmc_post_req(host, mrq, 0);
 
-	blk_mq_complete_request(req);
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_mq_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
 
 	mmc_blk_mq_dec_in_flight(mq, req);
 }
 
+void mmc_blk_mq_recovery(struct mmc_queue *mq)
+{
+	struct request *req = mq->recovery_req;
+	struct mmc_host *host = mq->card->host;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mq->recovery_req = NULL;
+	mq->rw_wait = false;
+
+	if (mmc_blk_rq_error(&mqrq->brq)) {
+		mmc_retune_hold_now(host);
+		mmc_blk_mq_rw_recovery(mq, req);
+	}
+
+	mmc_blk_urgent_bkops(mq, mqrq);
+
+	mmc_blk_mq_post_req(mq, req);
+}
+
 static void mmc_blk_mq_complete_prev_req(struct mmc_queue *mq,
 					 struct request **prev_req)
 {
+	if (mmc_host_done_complete(mq->card->host))
+		return;
+
 	mutex_lock(&mq->complete_lock);
 
 	if (!mq->complete_req)
@@ -2254,29 +2299,56 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
 	struct request *req = mmc_queue_req_to_req(mqrq);
 	struct request_queue *q = req->q;
 	struct mmc_queue *mq = q->queuedata;
+	struct mmc_host *host = mq->card->host;
 	unsigned long flags;
-	bool waiting;
 
-	/*
-	 * We cannot complete the request in this context, so record that there
-	 * is a request to complete, and that a following request does not need
-	 * to wait (although it does need to complete complete_req first).
-	 */
-	spin_lock_irqsave(q->queue_lock, flags);
-	mq->complete_req = req;
-	mq->rw_wait = false;
-	waiting = mq->waiting;
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	if (!mmc_host_done_complete(host)) {
+		bool waiting;
 
-	/*
-	 * If 'waiting' then the waiting task will complete this request,
-	 * otherwise queue a work to do it. Note that complete_work may still
-	 * race with the dispatch of a following request.
-	 */
-	if (waiting)
+		/*
+		 * We cannot complete the request in this context, so record
+		 * that there is a request to complete, and that a following
+		 * request does not need to wait (although it does need to
+		 * complete complete_req first).
+		 */
+		spin_lock_irqsave(q->queue_lock, flags);
+		mq->complete_req = req;
+		mq->rw_wait = false;
+		waiting = mq->waiting;
+		spin_unlock_irqrestore(q->queue_lock, flags);
+
+		/*
+		 * If 'waiting' then the waiting task will complete this
+		 * request, otherwise queue a work to do it. Note that
+		 * complete_work may still race with the dispatch of a following
+		 * request.
+		 */
+		if (waiting)
+			wake_up(&mq->wait);
+		else
+			kblockd_schedule_work(&mq->complete_work);
+
+		return;
+	}
+
+	/* Take the recovery path for errors or urgent background operations */
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_urgent_bkops_needed(mq, mqrq)) {
+		spin_lock_irqsave(q->queue_lock, flags);
+		mq->recovery_needed = true;
+		mq->recovery_req = req;
+		spin_unlock_irqrestore(q->queue_lock, flags);
 		wake_up(&mq->wait);
-	else
-		kblockd_schedule_work(&mq->complete_work);
+		schedule_work(&mq->recovery_work);
+		return;
+	}
+
+	mmc_blk_rw_reset_success(mq, req);
+
+	mq->rw_wait = false;
+	wake_up(&mq->wait);
+
+	mmc_blk_mq_post_req(mq, req);
 }
 
 static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
@@ -2286,11 +2358,16 @@ static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
 	bool done;
 
 	/*
-	 * Wait while there is another request in progress. Also indicate that
-	 * there is a request waiting to start.
+	 * Wait while there is another request in progress, but not if recovery
+	 * is needed. Also indicate whether there is a request waiting to start.
 	 */
 	spin_lock_irqsave(q->queue_lock, flags);
-	done = !mq->rw_wait;
+	if (mq->recovery_needed) {
+		*err = -EBUSY;
+		done = true;
+	} else {
+		done = !mq->rw_wait;
+	}
 	mq->waiting = !done;
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
@@ -2334,10 +2411,12 @@ static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
 	if (prev_req)
 		mmc_blk_mq_post_req(mq, prev_req);
 
-	if (err) {
+	if (err)
 		mq->rw_wait = false;
+
+	/* Release re-tuning here where there is no synchronization required */
+	if (err || mmc_host_done_complete(host))
 		mmc_retune_release(host);
-	}
 
 out_post_req:
 	if (err)
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index f472ce5d5647..b126418fd163 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -13,6 +13,7 @@
 
 enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req);
 void mmc_blk_mq_complete(struct request *req);
+void mmc_blk_mq_recovery(struct mmc_queue *mq);
 
 struct work_struct;
 
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index 6eaf558e62d6..8ca284e079e3 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -41,6 +41,11 @@ static inline int mmc_host_cmd23(struct mmc_host *host)
 	return host->caps & MMC_CAP_CMD23;
 }
 
+static inline bool mmc_host_done_complete(struct mmc_host *host)
+{
+	return host->caps & MMC_CAP_DONE_COMPLETE;
+}
+
 static inline int mmc_boot_partition_access(struct mmc_host *host)
 {
 	return !(host->caps2 & MMC_CAP2_BOOTPART_NOACC);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 8d632d2f5199..d8394007bc99 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -165,7 +165,10 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 
 	mq->in_recovery = true;
 
-	mmc_blk_cqe_recovery(mq);
+	if (mq->use_cqe)
+		mmc_blk_cqe_recovery(mq);
+	else
+		mmc_blk_mq_recovery(mq);
 
 	mq->in_recovery = false;
 
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 1d7d3b0afff8..34f601c6dd39 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -103,6 +103,7 @@ struct mmc_queue {
 	bool			waiting;
 	struct work_struct	recovery_work;
 	wait_queue_head_t	wait;
+	struct request		*recovery_req;
 	struct request		*complete_req;
 	struct mutex		complete_lock;
 	struct work_struct	complete_work;
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index ce2075d6f429..f3e13c50f6b0 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -324,6 +324,7 @@ struct mmc_host {
 #define MMC_CAP_DRIVER_TYPE_A	(1 << 23)	/* Host supports Driver Type A */
 #define MMC_CAP_DRIVER_TYPE_C	(1 << 24)	/* Host supports Driver Type C */
 #define MMC_CAP_DRIVER_TYPE_D	(1 << 25)	/* Host supports Driver Type D */
+#define MMC_CAP_DONE_COMPLETE	(1 << 27)	/* RW reqs can be completed within mmc_request_done() */
 #define MMC_CAP_CD_WAKE		(1 << 28)	/* Enable card detect wake */
 #define MMC_CAP_CMD_DURING_TFR	(1 << 29)	/* Commands during data transfer */
 #define MMC_CAP_CMD23		(1 << 30)	/* CMD23 supported. */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 11/22] mmc: block: blk-mq: Separate card polling from recovery
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (9 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 10/22] mmc: block: blk-mq: Add support for direct completion Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 12/22] mmc: block: Make card_busy_detect() accumulate all response error bits Adrian Hunter
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Recovery is simpler to understand if it is only used for errors. Create a
separate function for card polling.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index a710a6e95307..6d2c42c1c33a 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -2139,6 +2139,26 @@ static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
 	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
 }
 
+static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	bool gen_err = false;
+	int err;
+
+	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
+		return 0;
+
+	err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req, &gen_err);
+
+	/* Copy the general error bit so it will be seen later on */
+	if (gen_err) {
+		mqrq->brq.stop.resp[0] |= R1_ERROR;
+		err = err ? err : -EIO;
+	}
+
+	return err;
+}
+
 static inline void mmc_blk_rw_reset_success(struct mmc_queue *mq,
 					    struct request *req)
 {
@@ -2197,8 +2217,15 @@ static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
 				       struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
 
-	mmc_blk_mq_rw_recovery(mq, req);
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_card_busy(mq->card, req)) {
+		mmc_blk_mq_rw_recovery(mq, req);
+	} else {
+		mmc_blk_rw_reset_success(mq, req);
+		mmc_retune_release(host);
+	}
 
 	mmc_blk_urgent_bkops(mq, mqrq);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 12/22] mmc: block: Make card_busy_detect() accumulate all response error bits
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (10 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 11/22] mmc: block: blk-mq: Separate card polling from recovery Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 13/22] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy Adrian Hunter
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Make card_busy_detect() accumulate all response error bits. Later patches
will make use of this.
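
After this change a caller that wants the raw response bits does, in
outline:

	u32 status = 0;
	int err;

	err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req, &status);
	/* status now holds the OR of every R1 response seen while polling */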

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 6d2c42c1c33a..30fc012353ae 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -923,7 +923,8 @@ static int mmc_sd_num_wr_blocks(struct mmc_card *card, u32 *written_blocks)
 }
 
 static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
-		bool hw_busy_detect, struct request *req, bool *gen_err)
+			    bool hw_busy_detect, struct request *req,
+			    u32 *resp_errs)
 {
 	unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms);
 	int err = 0;
@@ -937,11 +938,9 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 			return err;
 		}
 
-		if (status & R1_ERROR) {
-			pr_err("%s: %s: error sending status cmd, status %#x\n",
-				req->rq_disk->disk_name, __func__, status);
-			*gen_err = true;
-		}
+		/* Accumulate any response error bits seen */
+		if (resp_errs)
+			*resp_errs |= status;
 
 		/* We may rely on the host hw to handle busy detection.*/
 		if ((card->host->caps & MMC_CAP_WAIT_WHILE_BUSY) &&
@@ -970,6 +969,24 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 	return err;
 }
 
+static int card_busy_detect_err(struct mmc_card *card, unsigned int timeout_ms,
+				bool hw_busy_detect, struct request *req,
+				bool *gen_err)
+{
+	u32 resp_errs = 0;
+	int err;
+
+	err = card_busy_detect(card, timeout_ms, hw_busy_detect, req,
+			       &resp_errs);
+	if (resp_errs & R1_ERROR) {
+		pr_err("%s: %s: error sending status cmd, status %#x\n",
+		       req->rq_disk->disk_name, __func__, resp_errs);
+		*gen_err = true;
+	}
+
+	return err;
+}
+
 static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
 		struct request *req, bool *gen_err, u32 *stop_status)
 {
@@ -1012,7 +1029,8 @@ static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
 		*gen_err = true;
 	}
 
-	return card_busy_detect(card, timeout_ms, use_r1b_resp, req, gen_err);
+	return card_busy_detect_err(card, timeout_ms, use_r1b_resp, req,
+				    gen_err);
 }
 
 #define ERR_NOMEDIUM	3
@@ -1553,8 +1571,8 @@ static enum mmc_blk_status __mmc_blk_err_check(struct mmc_card *card,
 			gen_err = true;
 		}
 
-		err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req,
-					&gen_err);
+		err = card_busy_detect_err(card, MMC_BLK_TIMEOUT_MS, false, req,
+					   &gen_err);
 		if (err)
 			return MMC_BLK_CMD_ERR;
 	}
@@ -2148,7 +2166,8 @@ static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
 	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
 		return 0;
 
-	err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req, &gen_err);
+	err = card_busy_detect_err(card, MMC_BLK_TIMEOUT_MS, false, req,
+				   &gen_err);
 
 	/* Copy the general error bit so it will be seen later on */
 	if (gen_err) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 13/22] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (11 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 12/22] mmc: block: Make card_busy_detect() accumulate all response error bits Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 14/22] mmc: block: Check the timeout correctly in card_busy_detect() Adrian Hunter
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Check error bits and save the exception bit when polling card busy. Count
out-of-range errors only when CMD23 set an exact block count: an open-ended
transfer terminated by CMD12 can legitimately run past the addressed area,
so R1_OUT_OF_RANGE is excluded in that case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 30fc012353ae..c446d17b48c4 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1457,15 +1457,18 @@ static inline void mmc_apply_rel_rw(struct mmc_blk_request *brq,
 	}
 }
 
-#define CMD_ERRORS							\
-	(R1_OUT_OF_RANGE |	/* Command argument out of range */	\
-	 R1_ADDRESS_ERROR |	/* Misaligned address */		\
+#define CMD_ERRORS_EXCL_OOR						\
+	(R1_ADDRESS_ERROR |	/* Misaligned address */		\
 	 R1_BLOCK_LEN_ERROR |	/* Transferred block length incorrect */\
 	 R1_WP_VIOLATION |	/* Tried to write to protected block */	\
 	 R1_CARD_ECC_FAILED |	/* Card ECC failed */			\
 	 R1_CC_ERROR |		/* Card controller error */		\
 	 R1_ERROR)		/* General/unknown error */
 
+#define CMD_ERRORS							\
+	(CMD_ERRORS_EXCL_OOR |						\
+	 R1_OUT_OF_RANGE)	/* Command argument out of range */
+
 static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 {
 	u32 val;
@@ -2157,24 +2160,40 @@ static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
 	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
 }
 
+static inline bool mmc_blk_oor_valid(struct mmc_blk_request *brq)
+{
+	return !!brq->mrq.sbc;
+}
+
+static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
+{
+	return mmc_blk_oor_valid(brq) ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
+}
+
 static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
-	bool gen_err = false;
+	u32 status = 0;
 	int err;
 
 	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
 		return 0;
 
-	err = card_busy_detect_err(card, MMC_BLK_TIMEOUT_MS, false, req,
-				   &gen_err);
+	err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req, &status);
 
-	/* Copy the general error bit so it will be seen later on */
-	if (gen_err) {
-		mqrq->brq.stop.resp[0] |= R1_ERROR;
+	/*
+	 * Do not assume data transferred correctly if there are any error bits
+	 * set.
+	 */
+	if (status & mmc_blk_stop_err_bits(&mqrq->brq)) {
+		mqrq->brq.data.bytes_xfered = 0;
 		err = err ? err : -EIO;
 	}
 
+	/* Copy the exception bit so it will be seen later on */
+	if (mmc_card_mmc(card) && status & R1_EXCEPTION_EVENT)
+		mqrq->brq.cmd.resp[0] |= R1_EXCEPTION_EVENT;
+
 	return err;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 14/22] mmc: block: Check the timeout correctly in card_busy_detect()
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (12 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 13/22] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 15/22] mmc: block: Check for transfer state " Adrian Hunter
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Pedantically, ensure the status is checked for the last time after the full
timeout has passed. Sampling the timeout before issuing the status command
guarantees one final status check after the timeout boundary, rather than
giving up without a last look at the card state.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index c446d17b48c4..f7c387c27ac0 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -931,6 +931,8 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 	u32 status;
 
 	do {
+		bool done = time_after(jiffies, timeout);
+
 		err = __mmc_send_status(card, &status, 5);
 		if (err) {
 			pr_err("%s: error %d requesting status\n",
@@ -951,7 +953,7 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 		 * Timeout if the device never becomes ready for data and never
 		 * leaves the program state.
 		 */
-		if (time_after(jiffies, timeout)) {
+		if (done) {
 			pr_err("%s: Card stuck in programming state! %s %s\n",
 				mmc_hostname(card->host),
 				req->rq_disk->disk_name, __func__);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 15/22] mmc: block: Check for transfer state in card_busy_detect()
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (13 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 14/22] mmc: block: Check the timeout correctly in card_busy_detect() Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 16/22] mmc: block: Add timeout_clks when calculating timeout Adrian Hunter
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

The card is required to return to transfer state. Since that is the state
required to start another transfer, check for that state instead of
programming state.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index f7c387c27ac0..0b40fc2ebf77 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -922,6 +922,16 @@ static int mmc_sd_num_wr_blocks(struct mmc_card *card, u32 *written_blocks)
 	return 0;
 }
 
+static inline bool mmc_blk_in_tran_state(u32 status)
+{
+	/*
+	 * Some cards mishandle the status bits, so make sure to check both the
+	 * busy indication and the card state.
+	 */
+	return status & R1_READY_FOR_DATA &&
+	       (R1_CURRENT_STATE(status) == R1_STATE_TRAN);
+}
+
 static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 			    bool hw_busy_detect, struct request *req,
 			    u32 *resp_errs)
@@ -954,9 +964,9 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 		 * leaves the program state.
 		 */
 		if (done) {
-			pr_err("%s: Card stuck in programming state! %s %s\n",
+			pr_err("%s: Card stuck in wrong state! %s %s status: %#x\n",
 				mmc_hostname(card->host),
-				req->rq_disk->disk_name, __func__);
+				req->rq_disk->disk_name, __func__, status);
 			return -ETIMEDOUT;
 		}
 
@@ -965,8 +975,7 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 		 * so make sure to check both the busy
 		 * indication and the card state.
 		 */
-	} while (!(status & R1_READY_FOR_DATA) ||
-		 (R1_CURRENT_STATE(status) == R1_STATE_PRG));
+	} while (!mmc_blk_in_tran_state(status));
 
 	return err;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 16/22] mmc: block: Add timeout_clks when calculating timeout
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (14 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 15/22] mmc: block: Check for transfer state " Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 17/22] mmc: block: Reduce polling timeout from 10 minutes to 10 seconds Adrian Hunter
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

According to the specification, total access time is derived from both TAAC
and NSAC, which means the timeout should add both timeout_ns and
timeout_clks. Host drivers do that, so make the block driver do that too.
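
For example (illustrative numbers): with timeout_ns = 80000000 (80 ms),
timeout_clks = 100000 and an actual clock of 50 MHz (50000 kHz), the clock
part adds DIV_ROUND_UP(100000, 50000) = 2 ms, for a total of 82 ms, whereas
previously only the 80 ms would have been used.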

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 42 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 35 insertions(+), 7 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 0b40fc2ebf77..46e63aec1fcb 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -922,6 +922,34 @@ static int mmc_sd_num_wr_blocks(struct mmc_card *card, u32 *written_blocks)
 	return 0;
 }
 
+static unsigned int mmc_blk_clock_khz(struct mmc_host *host)
+{
+	if (host->actual_clock)
+		return host->actual_clock / 1000;
+
+	/* Clock may be subject to a divisor, fudge it by a factor of 2. */
+	if (host->ios.clock)
+		return host->ios.clock / 2000;
+
+	/* How can there be no clock? */
+	WARN_ON_ONCE(1);
+	return 100; /* 100 kHz is minimum possible value */
+}
+
+static unsigned int mmc_blk_data_timeout_ms(struct mmc_host *host,
+					    struct mmc_data *data)
+{
+	unsigned int ms = DIV_ROUND_UP(data->timeout_ns, 1000000);
+	unsigned int khz;
+
+	if (data->timeout_clks) {
+		khz = mmc_blk_clock_khz(host);
+		ms += DIV_ROUND_UP(data->timeout_clks, khz);
+	}
+
+	return ms;
+}
+
 static inline bool mmc_blk_in_tran_state(u32 status)
 {
 	/*
@@ -1169,9 +1197,10 @@ static int mmc_blk_cmd_recovery(struct mmc_card *card, struct request *req,
 	 */
 	if (R1_CURRENT_STATE(status) == R1_STATE_DATA ||
 	    R1_CURRENT_STATE(status) == R1_STATE_RCV) {
-		err = send_stop(card,
-			DIV_ROUND_UP(brq->data.timeout_ns, 1000000),
-			req, gen_err, &stop_status);
+		unsigned int timeout;
+
+		timeout = mmc_blk_data_timeout_ms(card->host, &brq->data);
+		err = send_stop(card, timeout, req, gen_err, &stop_status);
 		if (err) {
 			pr_err("%s: error %d sending stop command\n",
 			       req->rq_disk->disk_name, err);
@@ -1977,6 +2006,7 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
 	struct mmc_host *host = card->host;
 	blk_status_t error = BLK_STS_OK;
 	int retries = 0;
+	unsigned int timeout = mmc_blk_data_timeout_ms(host, mrq->data);
 
 	do {
 		u32 status;
@@ -1995,10 +2025,8 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
 			u32 stop_status = 0;
 			bool gen_err = false;
 
-			err = send_stop(card,
-					DIV_ROUND_UP(mrq->data->timeout_ns,
-						     1000000),
-					req, &gen_err, &stop_status);
+			err = send_stop(card, timeout, req, &gen_err,
+					&stop_status);
 			if (err)
 				goto error_exit;
 		}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 17/22] mmc: block: Reduce polling timeout from 10 minutes to 10 seconds
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (15 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 16/22] mmc: block: Add timeout_clks when calculating timeout Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 18/22] mmc: block: blk-mq: Stop using legacy recovery Adrian Hunter
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Set a 10 second timeout for polling write request busy state. Note, mmc
core sets a 3 second timeout for SD cards, and SDHCI has long had a 10
second software timer to time out the whole request, so 10 seconds should
be ample.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 46e63aec1fcb..9d323ed34f82 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -63,7 +63,13 @@
 #endif
 #define MODULE_PARAM_PREFIX "mmcblk."
 
-#define MMC_BLK_TIMEOUT_MS  (10 * 60 * 1000)        /* 10 minute timeout */
+/*
+ * Set a 10 second timeout for polling write request busy state. Note, mmc core
+ * sets a 3 second timeout for SD cards, and SDHCI has long had a 10 second
+ * software timer to time out the whole request, so 10 seconds should be
+ * ample.
+ */
+#define MMC_BLK_TIMEOUT_MS  (10 * 1000)
 #define MMC_SANITIZE_REQ_TIMEOUT 240000
 #define MMC_EXTRACT_INDEX_FROM_ARG(x) ((x & 0x00FF0000) >> 16)
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 18/22] mmc: block: blk-mq: Stop using legacy recovery
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (16 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 17/22] mmc: block: Reduce polling timeout from 10 minutes to 10 seconds Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 19/22] mmc: mmc_test: Do not use mmc_start_areq() anymore Adrian Hunter
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

There are only a few things the recovery needs to do. Primarily, it just
needs to:
	Determine the number of bytes transferred
	Get the card back to transfer state
	Determine whether to retry

There are also a couple of additional features:
	Reset the card before the last retry
	Read one sector at a time

The legacy code spent much effort analyzing command errors, but commands
fail fast, so it is simpler just to give all command errors the same number
of retries.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 304 +++++++++++++++++++++++++----------------------
 1 file changed, 161 insertions(+), 143 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 9d323ed34f82..bd7ead343500 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1557,9 +1557,11 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 	}
 }
 
-static enum mmc_blk_status __mmc_blk_err_check(struct mmc_card *card,
-					       struct mmc_queue_req *mq_mrq)
+static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
+					     struct mmc_async_req *areq)
 {
+	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
+						    areq);
 	struct mmc_blk_request *brq = &mq_mrq->brq;
 	struct request *req = mmc_queue_req_to_req(mq_mrq);
 	int need_retune = card->host->need_retune;
@@ -1665,15 +1667,6 @@ static enum mmc_blk_status __mmc_blk_err_check(struct mmc_card *card,
 	return MMC_BLK_SUCCESS;
 }
 
-static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
-					     struct mmc_async_req *areq)
-{
-	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
-						    areq);
-
-	return __mmc_blk_err_check(card, mq_mrq);
-}
-
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 			      int disable_multi, bool *do_rel_wr_p,
 			      bool *do_data_tag_p)
@@ -1999,8 +1992,39 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 }
 
 #define MMC_MAX_RETRIES		5
+#define MMC_DATA_RETRIES	2
 #define MMC_NO_RETRIES		(MMC_MAX_RETRIES + 1)
 
+static int mmc_blk_send_stop(struct mmc_card *card, unsigned int timeout)
+{
+	struct mmc_command cmd = {
+		.opcode = MMC_STOP_TRANSMISSION,
+		.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_AC,
+		/* Some hosts wait for busy anyway, so provide a busy timeout */
+		.busy_timeout = timeout,
+	};
+
+	return mmc_wait_for_cmd(card->host, &cmd, 5);
+}
+
+static int mmc_blk_fix_state(struct mmc_card *card, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
+	unsigned int timeout = mmc_blk_data_timeout_ms(card->host, &brq->data);
+	int err;
+
+	mmc_retune_hold_now(card->host);
+
+	mmc_blk_send_stop(card, timeout);
+
+	err = card_busy_detect(card, timeout, false, req, NULL);
+
+	mmc_retune_release(card->host);
+
+	return err;
+}
+
 #define MMC_READ_SINGLE_RETRIES	2
 
 /* Single sector read during recovery */
@@ -2012,7 +2036,6 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
 	struct mmc_host *host = card->host;
 	blk_status_t error = BLK_STS_OK;
 	int retries = 0;
-	unsigned int timeout = mmc_blk_data_timeout_ms(host, mrq->data);
 
 	do {
 		u32 status;
@@ -2027,12 +2050,8 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
 			goto error_exit;
 
 		if (!mmc_host_is_spi(host) &&
-		    R1_CURRENT_STATE(status) != R1_STATE_TRAN) {
-			u32 stop_status = 0;
-			bool gen_err = false;
-
-			err = send_stop(card, timeout, req, &gen_err,
-					&stop_status);
+		    !mmc_blk_in_tran_state(status)) {
+			err = mmc_blk_fix_state(card, req);
 			if (err)
 				goto error_exit;
 		}
@@ -2062,6 +2081,60 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
 		mqrq->retries = MMC_MAX_RETRIES - 1;
 }
 
+static inline bool mmc_blk_oor_valid(struct mmc_blk_request *brq)
+{
+	return !!brq->mrq.sbc;
+}
+
+static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
+{
+	return mmc_blk_oor_valid(brq) ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
+}
+
+/*
+ * Check for errors the host controller driver might not have seen such as
+ * response mode errors or invalid card state.
+ */
+static bool mmc_blk_status_error(struct request *req, u32 status)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
+	struct mmc_queue *mq = req->q->queuedata;
+	u32 stop_err_bits;
+
+	if (mmc_host_is_spi(mq->card->host))
+		return 0;
+
+	stop_err_bits = mmc_blk_stop_err_bits(brq);
+
+	return brq->cmd.resp[0]  & CMD_ERRORS    ||
+	       brq->stop.resp[0] & stop_err_bits ||
+	       status            & stop_err_bits ||
+	       (rq_data_dir(req) == WRITE && !mmc_blk_in_tran_state(status));
+}
+
+static inline bool mmc_blk_cmd_started(struct mmc_blk_request *brq)
+{
+	return !brq->sbc.error && !brq->cmd.error &&
+	       !(brq->cmd.resp[0] & CMD_ERRORS);
+}
+
+/*
+ * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
+ * policy:
+ * 1. A request that has transferred at least some data is considered
+ * successful and will be requeued if there is remaining data to
+ * transfer.
+ * 2. Otherwise the number of retries is incremented and the request
+ * will be requeued if there are remaining retries.
+ * 3. Otherwise the request will be errored out.
+ * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
+ * mqrq->retries. So there are only 4 possible actions here:
+ *	1. do not accept the bytes_xfered value i.e. set it to zero
+ *	2. change mqrq->retries to determine the number of retries
+ *	3. try to reset the card
+ *	4. read one sector at a time
+ */
 static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
 {
 	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
@@ -2069,131 +2142,86 @@ static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
 	struct mmc_blk_request *brq = &mqrq->brq;
 	struct mmc_blk_data *md = mq->blkdata;
 	struct mmc_card *card = mq->card;
-	static enum mmc_blk_status status;
-
-	brq->retune_retry_done = mqrq->retries;
+	u32 status;
+	u32 blocks;
+	int err;
 
-	status = __mmc_blk_err_check(card, mqrq);
+	/*
+	 * Some errors the host driver might not have seen. Set the number of
+	 * bytes transferred to zero in that case.
+	 */
+	err = __mmc_send_status(card, &status, 0);
+	if (err || mmc_blk_status_error(req, status))
+		brq->data.bytes_xfered = 0;
 
 	mmc_retune_release(card->host);
 
 	/*
-	 * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
-	 * policy:
-	 * 1. A request that has transferred at least some data is considered
-	 * successful and will be requeued if there is remaining data to
-	 * transfer.
-	 * 2. Otherwise the number of retries is incremented and the request
-	 * will be requeued if there are remaining retries.
-	 * 3. Otherwise the request will be errored out.
-	 * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
-	 * mqrq->retries. So there are only 4 possible actions here:
-	 *	1. do not accept the bytes_xfered value i.e. set it to zero
-	 *	2. change mqrq->retries to determine the number of retries
-	 *	3. try to reset the card
-	 *	4. read one sector at a time
+	 * Try again to get the status. This also provides an opportunity for
+	 * re-tuning.
 	 */
-	switch (status) {
-	case MMC_BLK_SUCCESS:
-	case MMC_BLK_PARTIAL:
-		/* Reset success, and accept bytes_xfered */
-		mmc_blk_reset_success(md, type);
-		break;
-	case MMC_BLK_CMD_ERR:
-		/*
-		 * For SD cards, get bytes written, but do not accept
-		 * bytes_xfered if that fails. For MMC cards accept
-		 * bytes_xfered. Then try to reset. If reset fails then
-		 * error out the remaining request, otherwise retry
-		 * once (N.B mmc_blk_reset() will not succeed twice in a
-		 * row).
-		 */
-		if (mmc_card_sd(card)) {
-			u32 blocks;
-			int err;
+	if (err)
+		err = __mmc_send_status(card, &status, 0);
 
-			err = mmc_sd_num_wr_blocks(card, &blocks);
-			if (err)
-				brq->data.bytes_xfered = 0;
-			else
-				brq->data.bytes_xfered = blocks << 9;
-		}
-		if (mmc_blk_reset(md, card->host, type))
-			mqrq->retries = MMC_NO_RETRIES;
-		else
-			mqrq->retries = MMC_MAX_RETRIES - 1;
-		break;
-	case MMC_BLK_RETRY:
-		/*
-		 * Do not accept bytes_xfered, but retry up to 5 times,
-		 * otherwise same as abort.
-		 */
-		brq->data.bytes_xfered = 0;
-		if (mqrq->retries < MMC_MAX_RETRIES)
-			break;
-		/* Fall through */
-	case MMC_BLK_ABORT:
-		/*
-		 * Do not accept bytes_xfered, but try to reset. If
-		 * reset succeeds, try once more, otherwise error out
-		 * the request.
-		 */
-		brq->data.bytes_xfered = 0;
-		if (mmc_blk_reset(md, card->host, type))
-			mqrq->retries = MMC_NO_RETRIES;
-		else
-			mqrq->retries = MMC_MAX_RETRIES - 1;
-		break;
-	case MMC_BLK_DATA_ERR: {
-		int err;
+	/*
+	 * Nothing more to do after the number of bytes transferred has been
+	 * updated and there is no card.
+	 */
+	if (err && mmc_detect_card_removed(card->host))
+		return;
 
-		/*
-		 * Do not accept bytes_xfered, but try to reset. If
-		 * reset succeeds, try once more. If reset fails with
-		 * ENODEV which means the partition is wrong, then error
-		 * out the request. Otherwise attempt to read one sector
-		 * at a time.
-		 */
-		brq->data.bytes_xfered = 0;
-		err = mmc_blk_reset(md, card->host, type);
-		if (!err) {
-			mqrq->retries = MMC_MAX_RETRIES - 1;
-			break;
-		}
-		if (err == -ENODEV) {
-			mqrq->retries = MMC_NO_RETRIES;
-			break;
-		}
-		/* Fall through */
+	/* Try to get back to "tran" state */
+	if (!mmc_host_is_spi(mq->card->host) &&
+	    (err || !mmc_blk_in_tran_state(status)))
+		err = mmc_blk_fix_state(mq->card, req);
+
+	/*
+	 * Special case for SD cards where the card might record the number of
+	 * blocks written.
+	 */
+	if (!err && mmc_blk_cmd_started(brq) && mmc_card_sd(card) &&
+	    rq_data_dir(req) == WRITE) {
+		if (mmc_sd_num_wr_blocks(card, &blocks))
+			brq->data.bytes_xfered = 0;
+		else
+			brq->data.bytes_xfered = blocks << 9;
 	}
-	case MMC_BLK_ECC_ERR:
-		/*
-		 * Do not accept bytes_xfered. If reading more than one
-		 * sector, try reading one sector at a time.
-		 */
-		brq->data.bytes_xfered = 0;
-		/* FIXME: Missing single sector read for large sector size */
-		if (brq->data.blocks > 1 && !mmc_large_sector(card)) {
-			/* Redo read one sector at a time */
-			pr_warn("%s: retrying using single block read\n",
-				req->rq_disk->disk_name);
-			mmc_blk_read_single(mq, req);
-		} else {
-			mqrq->retries = MMC_NO_RETRIES;
-		}
-		break;
-	case MMC_BLK_NOMEDIUM:
-		/* Do not accept bytes_xfered. Error out the request */
-		brq->data.bytes_xfered = 0;
-		mqrq->retries = MMC_NO_RETRIES;
-		break;
-	default:
-		/* Do not accept bytes_xfered. Error out the request */
-		brq->data.bytes_xfered = 0;
+
+	/* Reset if the card is in a bad state */
+	if (!mmc_host_is_spi(mq->card->host) &&
+	    err && mmc_blk_reset(md, card->host, type)) {
+		pr_err("%s: recovery failed!\n", req->rq_disk->disk_name);
 		mqrq->retries = MMC_NO_RETRIES;
-		pr_err("%s: Unhandled return value (%d)",
-		       req->rq_disk->disk_name, status);
-		break;
+		return;
+	}
+
+	/*
+	 * If anything was done, just return and if there is anything remaining
+	 * on the request it will get requeued.
+	 */
+	if (brq->data.bytes_xfered)
+		return;
+
+	/* Reset before last retry */
+	if (mqrq->retries + 1 == MMC_MAX_RETRIES)
+		mmc_blk_reset(md, card->host, type);
+
+	/* Command errors fail fast, so use all MMC_MAX_RETRIES */
+	if (brq->sbc.error || brq->cmd.error)
+		return;
+
+	/* Reduce the remaining retries for data errors */
+	if (mqrq->retries < MMC_MAX_RETRIES - MMC_DATA_RETRIES) {
+		mqrq->retries = MMC_MAX_RETRIES - MMC_DATA_RETRIES;
+		return;
+	}
+
+	/* FIXME: Missing single sector read for large sector size */
+	if (!mmc_large_sector(card) && rq_data_dir(req) == READ &&
+	    brq->data.blocks > 1) {
+		/* Read one sector at a time */
+		mmc_blk_read_single(mq, req);
+		return;
 	}
 }
 
@@ -2205,16 +2233,6 @@ static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
 	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
 }
 
-static inline bool mmc_blk_oor_valid(struct mmc_blk_request *brq)
-{
-	return !!brq->mrq.sbc;
-}
-
-static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
-{
-	return mmc_blk_oor_valid(brq) ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
-}
-
 static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
 {
 	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 19/22] mmc: mmc_test: Do not use mmc_start_areq() anymore
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (17 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 18/22] mmc: block: blk-mq: Stop using legacy recovery Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 20/22] mmc: core: Remove option not to use blk-mq Adrian Hunter
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

The block driver's blk-mq paths do not use mmc_start_areq(). In order to
remove mmc_start_areq() entirely, start by removing it from mmc_test.
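
The replacement helper overlaps issuing one request with completion of the
previous one; sketched usage (mirroring mmc_test_nonblock_transfer() in the
diff below, names illustrative):

	ret = mmc_test_start_areq(test, mrq, NULL);       /* start first */
	...
	ret = mmc_test_start_areq(test, next_mrq, mrq);   /* start next, wait previous */
	...
	ret = mmc_test_start_areq(test, NULL, last_mrq);  /* wait for the last */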

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/mmc_test.c | 122 ++++++++++++++++++++------------------------
 1 file changed, 54 insertions(+), 68 deletions(-)

diff --git a/drivers/mmc/core/mmc_test.c b/drivers/mmc/core/mmc_test.c
index 478869805b96..9311c8de2061 100644
--- a/drivers/mmc/core/mmc_test.c
+++ b/drivers/mmc/core/mmc_test.c
@@ -171,11 +171,6 @@ struct mmc_test_multiple_rw {
 	enum mmc_test_prep_media prepare;
 };
 
-struct mmc_test_async_req {
-	struct mmc_async_req areq;
-	struct mmc_test_card *test;
-};
-
 /*******************************************************************/
 /*  General helper functions                                       */
 /*******************************************************************/
@@ -741,30 +736,6 @@ static int mmc_test_check_result(struct mmc_test_card *test,
 	return ret;
 }
 
-static enum mmc_blk_status mmc_test_check_result_async(struct mmc_card *card,
-				       struct mmc_async_req *areq)
-{
-	struct mmc_test_async_req *test_async =
-		container_of(areq, struct mmc_test_async_req, areq);
-	int ret;
-
-	mmc_test_wait_busy(test_async->test);
-
-	/*
-	 * FIXME: this would earlier just casts a regular error code,
-	 * either of the kernel type -ERRORCODE or the local test framework
-	 * RESULT_* errorcode, into an enum mmc_blk_status and return as
-	 * result check. Instead, convert it to some reasonable type by just
-	 * returning either MMC_BLK_SUCCESS or MMC_BLK_CMD_ERR.
-	 * If possible, a reasonable error code should be returned.
-	 */
-	ret = mmc_test_check_result(test_async->test, areq->mrq);
-	if (ret)
-		return MMC_BLK_CMD_ERR;
-
-	return MMC_BLK_SUCCESS;
-}
-
 /*
  * Checks that a "short transfer" behaved as expected
  */
@@ -831,6 +802,45 @@ static struct mmc_test_req *mmc_test_req_alloc(void)
 	return rq;
 }
 
+static void mmc_test_wait_done(struct mmc_request *mrq)
+{
+	complete(&mrq->completion);
+}
+
+static int mmc_test_start_areq(struct mmc_test_card *test,
+			       struct mmc_request *mrq,
+			       struct mmc_request *prev_mrq)
+{
+	struct mmc_host *host = test->card->host;
+	int err = 0;
+
+	if (mrq) {
+		init_completion(&mrq->completion);
+		mrq->done = mmc_test_wait_done;
+		mmc_pre_req(host, mrq);
+	}
+
+	if (prev_mrq) {
+		wait_for_completion(&prev_mrq->completion);
+		err = mmc_test_wait_busy(test);
+		if (!err)
+			err = mmc_test_check_result(test, prev_mrq);
+	}
+
+	if (!err && mrq) {
+		err = mmc_start_request(host, mrq);
+		if (err)
+			mmc_retune_release(host);
+	}
+
+	if (prev_mrq)
+		mmc_post_req(host, prev_mrq, 0);
+
+	if (err && mrq)
+		mmc_post_req(host, mrq, err);
+
+	return err;
+}
 
 static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 				      struct scatterlist *sg, unsigned sg_len,
@@ -838,17 +848,10 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 				      unsigned blksz, int write, int count)
 {
 	struct mmc_test_req *rq1, *rq2;
-	struct mmc_test_async_req test_areq[2];
-	struct mmc_async_req *done_areq;
-	struct mmc_async_req *cur_areq = &test_areq[0].areq;
-	struct mmc_async_req *other_areq = &test_areq[1].areq;
-	enum mmc_blk_status status;
+	struct mmc_request *mrq, *prev_mrq;
 	int i;
 	int ret = RESULT_OK;
 
-	test_areq[0].test = test;
-	test_areq[1].test = test;
-
 	rq1 = mmc_test_req_alloc();
 	rq2 = mmc_test_req_alloc();
 	if (!rq1 || !rq2) {
@@ -856,33 +859,25 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 		goto err;
 	}
 
-	cur_areq->mrq = &rq1->mrq;
-	cur_areq->err_check = mmc_test_check_result_async;
-	other_areq->mrq = &rq2->mrq;
-	other_areq->err_check = mmc_test_check_result_async;
+	mrq = &rq1->mrq;
+	prev_mrq = NULL;
 
 	for (i = 0; i < count; i++) {
-		mmc_test_prepare_mrq(test, cur_areq->mrq, sg, sg_len, dev_addr,
-				     blocks, blksz, write);
-		done_areq = mmc_start_areq(test->card->host, cur_areq, &status);
-
-		if (status != MMC_BLK_SUCCESS || (!done_areq && i > 0)) {
-			ret = RESULT_FAIL;
+		mmc_test_req_reset(container_of(mrq, struct mmc_test_req, mrq));
+		mmc_test_prepare_mrq(test, mrq, sg, sg_len, dev_addr, blocks,
+				     blksz, write);
+		ret = mmc_test_start_areq(test, mrq, prev_mrq);
+		if (ret)
 			goto err;
-		}
 
-		if (done_areq)
-			mmc_test_req_reset(container_of(done_areq->mrq,
-						struct mmc_test_req, mrq));
+		if (!prev_mrq)
+			prev_mrq = &rq2->mrq;
 
-		swap(cur_areq, other_areq);
+		swap(mrq, prev_mrq);
 		dev_addr += blocks;
 	}
 
-	done_areq = mmc_start_areq(test->card->host, NULL, &status);
-	if (status != MMC_BLK_SUCCESS)
-		ret = RESULT_FAIL;
-
+	ret = mmc_test_start_areq(test, NULL, prev_mrq);
 err:
 	kfree(rq1);
 	kfree(rq2);
@@ -2356,11 +2351,9 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 	struct mmc_test_req *rq = mmc_test_req_alloc();
 	struct mmc_host *host = test->card->host;
 	struct mmc_test_area *t = &test->area;
-	struct mmc_test_async_req test_areq = { .test = test };
 	struct mmc_request *mrq;
 	unsigned long timeout;
 	bool expired = false;
-	enum mmc_blk_status blkstat = MMC_BLK_SUCCESS;
 	int ret = 0, cmd_ret;
 	u32 status = 0;
 	int count = 0;
@@ -2373,9 +2366,6 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 		mrq->sbc = &rq->sbc;
 	mrq->cap_cmd_during_tfr = true;
 
-	test_areq.areq.mrq = mrq;
-	test_areq.areq.err_check = mmc_test_check_result_async;
-
 	mmc_test_prepare_mrq(test, mrq, t->sg, t->sg_len, dev_addr, t->blocks,
 			     512, write);
 
@@ -2388,11 +2378,9 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 
 	/* Start ongoing data request */
 	if (use_areq) {
-		mmc_start_areq(host, &test_areq.areq, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS) {
-			ret = RESULT_FAIL;
+		ret = mmc_test_start_areq(test, mrq, NULL);
+		if (ret)
 			goto out_free;
-		}
 	} else {
 		mmc_wait_for_req(host, mrq);
 	}
@@ -2426,9 +2414,7 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 
 	/* Wait for data request to complete */
 	if (use_areq) {
-		mmc_start_areq(host, NULL, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS)
-			ret = RESULT_FAIL;
+		ret = mmc_test_start_areq(test, NULL, mrq);
 	} else {
 		mmc_wait_for_req_done(test->card->host, mrq);
 	}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 20/22] mmc: core: Remove option not to use blk-mq
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (18 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 19/22] mmc: mmc_test: Do not use mmc_start_areq() anymore Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 21/22] mmc: block: Remove code no longer needed after the switch to blk-mq Adrian Hunter
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Remove the config option MMC_MQ_DEFAULT and the module parameter
mmc_use_blk_mq, so that blk-mq is always used.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/Kconfig     | 10 ----------
 drivers/mmc/core/core.c |  7 -------
 drivers/mmc/core/core.h |  2 --
 drivers/mmc/core/host.c |  2 --
 drivers/mmc/core/host.h |  2 +-
 5 files changed, 1 insertion(+), 22 deletions(-)
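
Note for anyone scripting around this: the mmc_core.use_blk_mq module/boot
option disappears with this patch, so for example a kernel command line such
as the following, which previously forced the legacy I/O path, no longer has
any effect:

	mmc_core.use_blk_mq=0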

diff --git a/drivers/mmc/Kconfig b/drivers/mmc/Kconfig
index 42565562577c..ec21388311db 100644
--- a/drivers/mmc/Kconfig
+++ b/drivers/mmc/Kconfig
@@ -12,16 +12,6 @@ menuconfig MMC
 	  If you want MMC/SD/SDIO support, you should say Y here and
 	  also to your specific host controller driver.
 
-config MMC_MQ_DEFAULT
-	bool "MMC: use blk-mq I/O path by default"
-	depends on MMC && BLOCK
-	default y
-	---help---
-	  This option enables the new blk-mq based I/O path for MMC block
-	  devices by default.  With the option the mmc_core.use_blk_mq
-	  module/boot option defaults to Y, without it to N, but it can
-	  still be overridden either way.
-
 if MMC
 
 source "drivers/mmc/core/Kconfig"
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 617802f45386..7ca6e4866a8b 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -66,13 +66,6 @@
 bool use_spi_crc = 1;
 module_param(use_spi_crc, bool, 0);
 
-#ifdef CONFIG_MMC_MQ_DEFAULT
-bool mmc_use_blk_mq = true;
-#else
-bool mmc_use_blk_mq = false;
-#endif
-module_param_named(use_blk_mq, mmc_use_blk_mq, bool, S_IWUSR | S_IRUGO);
-
 static int mmc_schedule_delayed_work(struct delayed_work *work,
 				     unsigned long delay)
 {
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 136617d2f971..3e3d21304e5f 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -35,8 +35,6 @@ struct mmc_bus_ops {
 	int (*reset)(struct mmc_host *);
 };
 
-extern bool mmc_use_blk_mq;
-
 void mmc_attach_bus(struct mmc_host *host, const struct mmc_bus_ops *ops);
 void mmc_detach_bus(struct mmc_host *host);
 
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 409a68a96a0a..64b03d6eaf18 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -404,8 +404,6 @@ struct mmc_host *mmc_alloc_host(int extra, struct device *dev)
 
 	host->fixed_drv_type = -EINVAL;
 
-	host->use_blk_mq = mmc_use_blk_mq;
-
 	return host;
 }
 
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index 8ca284e079e3..6d896869e5c6 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -81,7 +81,7 @@ static inline bool mmc_card_hs400es(struct mmc_card *card)
 
 static inline bool mmc_host_use_blk_mq(struct mmc_host *host)
 {
-	return host->use_blk_mq;
+	return true;
 }
 
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 21/22] mmc: block: Remove code no longer needed after the switch to blk-mq
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (19 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 20/22] mmc: core: Remove option not to use blk-mq Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 13:41 ` [PATCH V15 22/22] mmc: core: " Adrian Hunter
  2017-11-29 15:47 ` [PATCH V15 00/22] mmc: Add Command Queue support Ulf Hansson
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Remove code no longer needed after the switch to blk-mq.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/block.c | 723 +----------------------------------------------
 drivers/mmc/core/block.h |   2 -
 drivers/mmc/core/queue.c | 240 +---------------
 drivers/mmc/core/queue.h |  15 -
 4 files changed, 16 insertions(+), 964 deletions(-)
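
One consequence worth spelling out: with the hw_busy_detect argument gone,
card_busy_detect() always polls the card with CMD13 until it is back in the
transfer state or the timeout expires. As a rough sketch (simplified; the
real function also accumulates response bits into *resp_errs and logs
errors):

	unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms);
	u32 status;
	int err;

	do {
		bool done = time_after(jiffies, timeout);

		err = __mmc_send_status(card, &status, 5);
		if (err)
			return err;
		/*
		 * Time out if the card never becomes ready for data and
		 * never leaves the program state.
		 */
		if (done && !mmc_blk_in_tran_state(status))
			return -ETIMEDOUT;
	} while (!mmc_blk_in_tran_state(status));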

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index bd7ead343500..a1fca9748898 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -967,8 +967,7 @@ static inline bool mmc_blk_in_tran_state(u32 status)
 }
 
 static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
-			    bool hw_busy_detect, struct request *req,
-			    u32 *resp_errs)
+			    struct request *req, u32 *resp_errs)
 {
 	unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms);
 	int err = 0;
@@ -988,11 +987,6 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 		if (resp_errs)
 			*resp_errs |= status;
 
-		/* We may rely on the host hw to handle busy detection.*/
-		if ((card->host->caps & MMC_CAP_WAIT_WHILE_BUSY) &&
-			hw_busy_detect)
-			break;
-
 		/*
 		 * Timeout if the device never becomes ready for data and never
 		 * leaves the program state.
@@ -1014,243 +1008,6 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 	return err;
 }
 
-static int card_busy_detect_err(struct mmc_card *card, unsigned int timeout_ms,
-				bool hw_busy_detect, struct request *req,
-				bool *gen_err)
-{
-	u32 resp_errs = 0;
-	int err;
-
-	err = card_busy_detect(card, timeout_ms, hw_busy_detect, req,
-			       &resp_errs);
-	if (resp_errs & R1_ERROR) {
-		pr_err("%s: %s: error sending status cmd, status %#x\n",
-		       req->rq_disk->disk_name, __func__, resp_errs);
-		*gen_err = true;
-	}
-
-	return err;
-}
-
-static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
-		struct request *req, bool *gen_err, u32 *stop_status)
-{
-	struct mmc_host *host = card->host;
-	struct mmc_command cmd = {};
-	int err;
-	bool use_r1b_resp = rq_data_dir(req) == WRITE;
-
-	/*
-	 * Normally we use R1B responses for WRITE, but in cases where the host
-	 * has specified a max_busy_timeout we need to validate it. A failure
-	 * means we need to prevent the host from doing hw busy detection, which
-	 * is done by converting to a R1 response instead.
-	 */
-	if (host->max_busy_timeout && (timeout_ms > host->max_busy_timeout))
-		use_r1b_resp = false;
-
-	cmd.opcode = MMC_STOP_TRANSMISSION;
-	if (use_r1b_resp) {
-		cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
-		cmd.busy_timeout = timeout_ms;
-	} else {
-		cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_AC;
-	}
-
-	err = mmc_wait_for_cmd(host, &cmd, 5);
-	if (err)
-		return err;
-
-	*stop_status = cmd.resp[0];
-
-	/* No need to check card status in case of READ. */
-	if (rq_data_dir(req) == READ)
-		return 0;
-
-	if (!mmc_host_is_spi(host) &&
-		(*stop_status & R1_ERROR)) {
-		pr_err("%s: %s: general error sending stop command, resp %#x\n",
-			req->rq_disk->disk_name, __func__, *stop_status);
-		*gen_err = true;
-	}
-
-	return card_busy_detect_err(card, timeout_ms, use_r1b_resp, req,
-				    gen_err);
-}
-
-#define ERR_NOMEDIUM	3
-#define ERR_RETRY	2
-#define ERR_ABORT	1
-#define ERR_CONTINUE	0
-
-static int mmc_blk_cmd_error(struct request *req, const char *name, int error,
-	bool status_valid, u32 status)
-{
-	switch (error) {
-	case -EILSEQ:
-		/* response crc error, retry the r/w cmd */
-		pr_err("%s: %s sending %s command, card status %#x\n",
-			req->rq_disk->disk_name, "response CRC error",
-			name, status);
-		return ERR_RETRY;
-
-	case -ETIMEDOUT:
-		pr_err("%s: %s sending %s command, card status %#x\n",
-			req->rq_disk->disk_name, "timed out", name, status);
-
-		/* If the status cmd initially failed, retry the r/w cmd */
-		if (!status_valid) {
-			pr_err("%s: status not valid, retrying timeout\n",
-				req->rq_disk->disk_name);
-			return ERR_RETRY;
-		}
-
-		/*
-		 * If it was a r/w cmd crc error, or illegal command
-		 * (eg, issued in wrong state) then retry - we should
-		 * have corrected the state problem above.
-		 */
-		if (status & (R1_COM_CRC_ERROR | R1_ILLEGAL_COMMAND)) {
-			pr_err("%s: command error, retrying timeout\n",
-				req->rq_disk->disk_name);
-			return ERR_RETRY;
-		}
-
-		/* Otherwise abort the command */
-		return ERR_ABORT;
-
-	default:
-		/* We don't understand the error code the driver gave us */
-		pr_err("%s: unknown error %d sending read/write command, card status %#x\n",
-		       req->rq_disk->disk_name, error, status);
-		return ERR_ABORT;
-	}
-}
-
-/*
- * Initial r/w and stop cmd error recovery.
- * We don't know whether the card received the r/w cmd or not, so try to
- * restore things back to a sane state.  Essentially, we do this as follows:
- * - Obtain card status.  If the first attempt to obtain card status fails,
- *   the status word will reflect the failed status cmd, not the failed
- *   r/w cmd.  If we fail to obtain card status, it suggests we can no
- *   longer communicate with the card.
- * - Check the card state.  If the card received the cmd but there was a
- *   transient problem with the response, it might still be in a data transfer
- *   mode.  Try to send it a stop command.  If this fails, we can't recover.
- * - If the r/w cmd failed due to a response CRC error, it was probably
- *   transient, so retry the cmd.
- * - If the r/w cmd timed out, but we didn't get the r/w cmd status, retry.
- * - If the r/w cmd timed out, and the r/w cmd failed due to CRC error or
- *   illegal cmd, retry.
- * Otherwise we don't understand what happened, so abort.
- */
-static int mmc_blk_cmd_recovery(struct mmc_card *card, struct request *req,
-	struct mmc_blk_request *brq, bool *ecc_err, bool *gen_err)
-{
-	bool prev_cmd_status_valid = true;
-	u32 status, stop_status = 0;
-	int err, retry;
-
-	if (mmc_card_removed(card))
-		return ERR_NOMEDIUM;
-
-	/*
-	 * Try to get card status which indicates both the card state
-	 * and why there was no response.  If the first attempt fails,
-	 * we can't be sure the returned status is for the r/w command.
-	 */
-	for (retry = 2; retry >= 0; retry--) {
-		err = __mmc_send_status(card, &status, 0);
-		if (!err)
-			break;
-
-		/* Re-tune if needed */
-		mmc_retune_recheck(card->host);
-
-		prev_cmd_status_valid = false;
-		pr_err("%s: error %d sending status command, %sing\n",
-		       req->rq_disk->disk_name, err, retry ? "retry" : "abort");
-	}
-
-	/* We couldn't get a response from the card.  Give up. */
-	if (err) {
-		/* Check if the card is removed */
-		if (mmc_detect_card_removed(card->host))
-			return ERR_NOMEDIUM;
-		return ERR_ABORT;
-	}
-
-	/* Flag ECC errors */
-	if ((status & R1_CARD_ECC_FAILED) ||
-	    (brq->stop.resp[0] & R1_CARD_ECC_FAILED) ||
-	    (brq->cmd.resp[0] & R1_CARD_ECC_FAILED))
-		*ecc_err = true;
-
-	/* Flag General errors */
-	if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ)
-		if ((status & R1_ERROR) ||
-			(brq->stop.resp[0] & R1_ERROR)) {
-			pr_err("%s: %s: general error sending stop or status command, stop cmd response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, __func__,
-			       brq->stop.resp[0], status);
-			*gen_err = true;
-		}
-
-	/*
-	 * Check the current card state.  If it is in some data transfer
-	 * mode, tell it to stop (and hopefully transition back to TRAN.)
-	 */
-	if (R1_CURRENT_STATE(status) == R1_STATE_DATA ||
-	    R1_CURRENT_STATE(status) == R1_STATE_RCV) {
-		unsigned int timeout;
-
-		timeout = mmc_blk_data_timeout_ms(card->host, &brq->data);
-		err = send_stop(card, timeout, req, gen_err, &stop_status);
-		if (err) {
-			pr_err("%s: error %d sending stop command\n",
-			       req->rq_disk->disk_name, err);
-			/*
-			 * If the stop cmd also timed out, the card is probably
-			 * not present, so abort. Other errors are bad news too.
-			 */
-			return ERR_ABORT;
-		}
-
-		if (stop_status & R1_CARD_ECC_FAILED)
-			*ecc_err = true;
-	}
-
-	/* Check for set block count errors */
-	if (brq->sbc.error)
-		return mmc_blk_cmd_error(req, "SET_BLOCK_COUNT", brq->sbc.error,
-				prev_cmd_status_valid, status);
-
-	/* Check for r/w command errors */
-	if (brq->cmd.error)
-		return mmc_blk_cmd_error(req, "r/w cmd", brq->cmd.error,
-				prev_cmd_status_valid, status);
-
-	/* Data errors */
-	if (!brq->stop.error)
-		return ERR_CONTINUE;
-
-	/* Now for stop errors.  These aren't fatal to the transfer. */
-	pr_info("%s: error %d sending stop command, original cmd response %#x, card status %#x\n",
-	       req->rq_disk->disk_name, brq->stop.error,
-	       brq->cmd.resp[0], status);
-
-	/*
-	 * Subsitute in our own stop status as this will give the error
-	 * state which happened during the execution of the r/w command.
-	 */
-	if (stop_status) {
-		brq->stop.resp[0] = stop_status;
-		brq->stop.error = 0;
-	}
-	return ERR_CONTINUE;
-}
-
 static int mmc_blk_reset(struct mmc_blk_data *md, struct mmc_host *host,
 			 int type)
 {
@@ -1285,14 +1042,6 @@ static inline void mmc_blk_reset_success(struct mmc_blk_data *md, int type)
 	md->reset_done &= ~type;
 }
 
-static void mmc_blk_end_request(struct request *req, blk_status_t error)
-{
-	if (req->mq_ctx)
-		blk_mq_end_request(req, error);
-	else
-		blk_end_request_all(req, error);
-}
-
 /*
  * The non-block commands come back from the block layer after it queued it and
  * processed it with all other requests and then they get issued in this
@@ -1354,7 +1103,7 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 		break;
 	}
 	mq_rq->drv_op_result = ret;
-	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_mq_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
@@ -1397,7 +1146,7 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
 	else
 		mmc_blk_reset_success(md, type);
 fail:
-	mmc_blk_end_request(req, status);
+	blk_mq_end_request(req, status);
 }
 
 static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
@@ -1467,7 +1216,7 @@ static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
 	if (!err)
 		mmc_blk_reset_success(md, type);
 out:
-	mmc_blk_end_request(req, status);
+	blk_mq_end_request(req, status);
 }
 
 static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
@@ -1477,7 +1226,7 @@ static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
 	int ret = 0;
 
 	ret = mmc_flush_cache(card);
-	mmc_blk_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_mq_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 /*
@@ -1557,116 +1306,6 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 	}
 }
 
-static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
-					     struct mmc_async_req *areq)
-{
-	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
-						    areq);
-	struct mmc_blk_request *brq = &mq_mrq->brq;
-	struct request *req = mmc_queue_req_to_req(mq_mrq);
-	int need_retune = card->host->need_retune;
-	bool ecc_err = false;
-	bool gen_err = false;
-
-	/*
-	 * sbc.error indicates a problem with the set block count
-	 * command.  No data will have been transferred.
-	 *
-	 * cmd.error indicates a problem with the r/w command.  No
-	 * data will have been transferred.
-	 *
-	 * stop.error indicates a problem with the stop command.  Data
-	 * may have been transferred, or may still be transferring.
-	 */
-
-	mmc_blk_eval_resp_error(brq);
-
-	if (brq->sbc.error || brq->cmd.error ||
-	    brq->stop.error || brq->data.error) {
-		switch (mmc_blk_cmd_recovery(card, req, brq, &ecc_err, &gen_err)) {
-		case ERR_RETRY:
-			return MMC_BLK_RETRY;
-		case ERR_ABORT:
-			return MMC_BLK_ABORT;
-		case ERR_NOMEDIUM:
-			return MMC_BLK_NOMEDIUM;
-		case ERR_CONTINUE:
-			break;
-		}
-	}
-
-	/*
-	 * Check for errors relating to the execution of the
-	 * initial command - such as address errors.  No data
-	 * has been transferred.
-	 */
-	if (brq->cmd.resp[0] & CMD_ERRORS) {
-		pr_err("%s: r/w command failed, status = %#x\n",
-		       req->rq_disk->disk_name, brq->cmd.resp[0]);
-		return MMC_BLK_ABORT;
-	}
-
-	/*
-	 * Everything else is either success, or a data error of some
-	 * kind.  If it was a write, we may have transitioned to
-	 * program mode, which we have to wait for it to complete.
-	 */
-	if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
-		int err;
-
-		/* Check stop command response */
-		if (brq->stop.resp[0] & R1_ERROR) {
-			pr_err("%s: %s: general error sending stop command, stop cmd response %#x\n",
-			       req->rq_disk->disk_name, __func__,
-			       brq->stop.resp[0]);
-			gen_err = true;
-		}
-
-		err = card_busy_detect_err(card, MMC_BLK_TIMEOUT_MS, false, req,
-					   &gen_err);
-		if (err)
-			return MMC_BLK_CMD_ERR;
-	}
-
-	/* if general error occurs, retry the write operation. */
-	if (gen_err) {
-		pr_warn("%s: retrying write for general error\n",
-				req->rq_disk->disk_name);
-		return MMC_BLK_RETRY;
-	}
-
-	/* Some errors (ECC) are flagged on the next commmand, so check stop, too */
-	if (brq->data.error || brq->stop.error) {
-		if (need_retune && !brq->retune_retry_done) {
-			pr_debug("%s: retrying because a re-tune was needed\n",
-				 req->rq_disk->disk_name);
-			brq->retune_retry_done = 1;
-			return MMC_BLK_RETRY;
-		}
-		pr_err("%s: error %d transferring data, sector %u, nr %u, cmd response %#x, card status %#x\n",
-		       req->rq_disk->disk_name, brq->data.error ?: brq->stop.error,
-		       (unsigned)blk_rq_pos(req),
-		       (unsigned)blk_rq_sectors(req),
-		       brq->cmd.resp[0], brq->stop.resp[0]);
-
-		if (rq_data_dir(req) == READ) {
-			if (ecc_err)
-				return MMC_BLK_ECC_ERR;
-			return MMC_BLK_DATA_ERR;
-		} else {
-			return MMC_BLK_CMD_ERR;
-		}
-	}
-
-	if (!brq->data.bytes_xfered)
-		return MMC_BLK_RETRY;
-
-	if (blk_rq_bytes(req) != brq->data.bytes_xfered)
-		return MMC_BLK_PARTIAL;
-
-	return MMC_BLK_SUCCESS;
-}
-
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 			      int disable_multi, bool *do_rel_wr_p,
 			      bool *do_data_tag_p)
@@ -1782,8 +1421,6 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		brq->data.sg_len = i;
 	}
 
-	mqrq->areq.mrq = &brq->mrq;
-
 	if (do_rel_wr_p)
 		*do_rel_wr_p = do_rel_wr;
 
@@ -1987,8 +1624,6 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 		brq->sbc.flags = MMC_RSP_R1 | MMC_CMD_AC;
 		brq->mrq.sbc = &brq->sbc;
 	}
-
-	mqrq->areq.err_check = mmc_blk_err_check;
 }
 
 #define MMC_MAX_RETRIES		5
@@ -2018,7 +1653,7 @@ static int mmc_blk_fix_state(struct mmc_card *card, struct request *req)
 
 	mmc_blk_send_stop(card, timeout);
 
-	err = card_busy_detect(card, timeout, false, req, NULL);
+	err = card_busy_detect(card, timeout, req, NULL);
 
 	mmc_retune_release(card->host);
 
@@ -2242,7 +1877,7 @@ static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
 	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
 		return 0;
 
-	err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req, &status);
+	err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, req, &status);
 
 	/*
 	 * Do not assume data transferred correctly if there are any error bits
@@ -2622,350 +2257,6 @@ enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
 	}
 }
 
-static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
-			       struct mmc_blk_request *brq, struct request *req,
-			       bool old_req_pending)
-{
-	bool req_pending;
-
-	/*
-	 * If this is an SD card and we're writing, we can first
-	 * mark the known good sectors as ok.
-	 *
-	 * If the card is not SD, we can still ok written sectors
-	 * as reported by the controller (which might be less than
-	 * the real number of written sectors, but never more).
-	 */
-	if (mmc_card_sd(card)) {
-		u32 blocks;
-		int err;
-
-		err = mmc_sd_num_wr_blocks(card, &blocks);
-		if (err)
-			req_pending = old_req_pending;
-		else
-			req_pending = blk_end_request(req, BLK_STS_OK, blocks << 9);
-	} else {
-		req_pending = blk_end_request(req, BLK_STS_OK, brq->data.bytes_xfered);
-	}
-	return req_pending;
-}
-
-static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
-				 struct request *req,
-				 struct mmc_queue_req *mqrq)
-{
-	if (mmc_card_removed(card))
-		req->rq_flags |= RQF_QUIET;
-	while (blk_end_request(req, BLK_STS_IOERR, blk_rq_cur_bytes(req)));
-	mq->qcnt--;
-}
-
-/**
- * mmc_blk_rw_try_restart() - tries to restart the current async request
- * @mq: the queue with the card and host to restart
- * @req: a new request that want to be started after the current one
- */
-static void mmc_blk_rw_try_restart(struct mmc_queue *mq, struct request *req,
-				   struct mmc_queue_req *mqrq)
-{
-	if (!req)
-		return;
-
-	/*
-	 * If the card was removed, just cancel everything and return.
-	 */
-	if (mmc_card_removed(mq->card)) {
-		req->rq_flags |= RQF_QUIET;
-		blk_end_request_all(req, BLK_STS_IOERR);
-		mq->qcnt--; /* FIXME: just set to 0? */
-		return;
-	}
-	/* Else proceed and try to restart the current async request */
-	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
-	mmc_start_areq(mq->card->host, &mqrq->areq, NULL);
-}
-
-static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
-{
-	struct mmc_blk_data *md = mq->blkdata;
-	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request *brq;
-	int disable_multi = 0, retry = 0, type, retune_retry_done = 0;
-	enum mmc_blk_status status;
-	struct mmc_queue_req *mqrq_cur = NULL;
-	struct mmc_queue_req *mq_rq;
-	struct request *old_req;
-	struct mmc_async_req *new_areq;
-	struct mmc_async_req *old_areq;
-	bool req_pending = true;
-
-	if (new_req) {
-		mqrq_cur = req_to_mmc_queue_req(new_req);
-		mq->qcnt++;
-	}
-
-	if (!mq->qcnt)
-		return;
-
-	do {
-		if (new_req) {
-			/*
-			 * When 4KB native sector is enabled, only 8 blocks
-			 * multiple read or write is allowed
-			 */
-			if (mmc_large_sector(card) &&
-				!IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
-				pr_err("%s: Transfer size is not 4KB sector size aligned\n",
-					new_req->rq_disk->disk_name);
-				mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
-				return;
-			}
-
-			mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
-			new_areq = &mqrq_cur->areq;
-		} else
-			new_areq = NULL;
-
-		old_areq = mmc_start_areq(card->host, new_areq, &status);
-		if (!old_areq) {
-			/*
-			 * We have just put the first request into the pipeline
-			 * and there is nothing more to do until it is
-			 * complete.
-			 */
-			return;
-		}
-
-		/*
-		 * An asynchronous request has been completed and we proceed
-		 * to handle the result of it.
-		 */
-		mq_rq =	container_of(old_areq, struct mmc_queue_req, areq);
-		brq = &mq_rq->brq;
-		old_req = mmc_queue_req_to_req(mq_rq);
-		type = rq_data_dir(old_req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
-
-		switch (status) {
-		case MMC_BLK_SUCCESS:
-		case MMC_BLK_PARTIAL:
-			/*
-			 * Reset success, and accept bytes_xfered. For
-			 * MMC_BLK_PARTIAL re-submit the remaining request. For
-			 * MMC_BLK_SUCCESS error out the remaining request (it
-			 * could not be re-submitted anyway if a next request
-			 * had already begun).
-			 */
-			mmc_blk_reset_success(md, type);
-
-			req_pending = blk_end_request(old_req, BLK_STS_OK,
-						      brq->data.bytes_xfered);
-			/*
-			 * If the blk_end_request function returns non-zero even
-			 * though all data has been transferred and no errors
-			 * were returned by the host controller, it's a bug.
-			 */
-			if (status == MMC_BLK_SUCCESS && req_pending) {
-				pr_err("%s BUG rq_tot %d d_xfer %d\n",
-				       __func__, blk_rq_bytes(old_req),
-				       brq->data.bytes_xfered);
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				return;
-			}
-			break;
-		case MMC_BLK_CMD_ERR:
-			/*
-			 * For SD cards, get bytes written, but do not accept
-			 * bytes_xfered if that fails. For MMC cards accept
-			 * bytes_xfered. Then try to reset. If reset fails then
-			 * error out the remaining request, otherwise retry
-			 * once (N.B mmc_blk_reset() will not succeed twice in a
-			 * row).
-			 */
-			req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
-			if (mmc_blk_reset(md, card->host, type)) {
-				if (req_pending)
-					mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				else
-					mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_RETRY:
-			/*
-			 * Do not accept bytes_xfered, but retry up to 5 times,
-			 * otherwise same as abort.
-			 */
-			retune_retry_done = brq->retune_retry_done;
-			if (retry++ < 5)
-				break;
-			/* Fall through */
-		case MMC_BLK_ABORT:
-			/*
-			 * Do not accept bytes_xfered, but try to reset. If
-			 * reset succeeds, try once more, otherwise error out
-			 * the request.
-			 */
-			if (!mmc_blk_reset(md, card->host, type))
-				break;
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		case MMC_BLK_DATA_ERR: {
-			int err;
-
-			/*
-			 * Do not accept bytes_xfered, but try to reset. If
-			 * reset succeeds, try once more. If reset fails with
-			 * ENODEV which means the partition is wrong, then error
-			 * out the request. Otherwise attempt to read one sector
-			 * at a time.
-			 */
-			err = mmc_blk_reset(md, card->host, type);
-			if (!err)
-				break;
-			if (err == -ENODEV) {
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			/* Fall through */
-		}
-		case MMC_BLK_ECC_ERR:
-			/*
-			 * Do not accept bytes_xfered. If reading more than one
-			 * sector, try reading one sector at a time.
-			 */
-			if (brq->data.blocks > 1) {
-				/* Redo read one sector at a time */
-				pr_warn("%s: retrying using single block read\n",
-					old_req->rq_disk->disk_name);
-				disable_multi = 1;
-				break;
-			}
-			/*
-			 * After an error, we redo I/O one sector at a
-			 * time, so we only reach here after trying to
-			 * read a single sector.
-			 */
-			req_pending = blk_end_request(old_req, BLK_STS_IOERR,
-						      brq->data.blksz);
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_NOMEDIUM:
-			/* Do not accept bytes_xfered. Error out the request */
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		default:
-			/* Do not accept bytes_xfered. Error out the request */
-			pr_err("%s: Unhandled return value (%d)",
-					old_req->rq_disk->disk_name, status);
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		}
-
-		if (req_pending) {
-			/*
-			 * In case of a incomplete request
-			 * prepare it again and resend.
-			 */
-			mmc_blk_rw_rq_prep(mq_rq, card,
-					disable_multi, mq);
-			mmc_start_areq(card->host,
-					&mq_rq->areq, NULL);
-			mq_rq->brq.retune_retry_done = retune_retry_done;
-		}
-	} while (req_pending);
-
-	mq->qcnt--;
-}
-
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
-{
-	int ret;
-	struct mmc_blk_data *md = mq->blkdata;
-	struct mmc_card *card = md->queue.card;
-
-	if (req && !mq->qcnt)
-		/* claim host only for the first request */
-		mmc_get_card(card, NULL);
-
-	ret = mmc_blk_part_switch(card, md->part_type);
-	if (ret) {
-		if (req) {
-			blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		goto out;
-	}
-
-	if (req) {
-		switch (req_op(req)) {
-		case REQ_OP_DRV_IN:
-		case REQ_OP_DRV_OUT:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * ioctl()s
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_drv_op(mq, req);
-			break;
-		case REQ_OP_DISCARD:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * discard.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_discard_rq(mq, req);
-			break;
-		case REQ_OP_SECURE_ERASE:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * secure erase.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_secdiscard_rq(mq, req);
-			break;
-		case REQ_OP_FLUSH:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * flush.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_flush(mq, req);
-			break;
-		default:
-			/* Normal request, just issue it */
-			mmc_blk_issue_rw_rq(mq, req);
-			card->host->context_info.is_waiting_last_req = false;
-			break;
-		}
-	} else {
-		/* No request, flushing the pipeline with NULL */
-		mmc_blk_issue_rw_rq(mq, NULL);
-		card->host->context_info.is_waiting_last_req = false;
-	}
-
-out:
-	if (!mq->qcnt)
-		mmc_put_card(card, NULL);
-}
-
 static inline int mmc_blk_readonly(struct mmc_card *card)
 {
 	return mmc_card_readonly(card) ||
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index b126418fd163..31153f656f41 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -5,8 +5,6 @@
 struct mmc_queue;
 struct request;
 
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
-
 void mmc_blk_cqe_recovery(struct mmc_queue *mq);
 
 enum mmc_issued;
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index d8394007bc99..e03e36ea333a 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -24,22 +24,6 @@
 #include "card.h"
 #include "host.h"
 
-/*
- * Prepare a MMC request. This just filters out odd stuff.
- */
-static int mmc_prep_request(struct request_queue *q, struct request *req)
-{
-	struct mmc_queue *mq = q->queuedata;
-
-	if (mq && mmc_card_removed(mq->card))
-		return BLKPREP_KILL;
-
-	req->rq_flags |= RQF_DONTPREP;
-	req_to_mmc_queue_req(req)->retries = 0;
-
-	return BLKPREP_OK;
-}
-
 static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
 {
 	/* Allow only 1 DCMD at a time */
@@ -181,86 +165,6 @@ static void mmc_mq_recovery_handler(struct work_struct *work)
 	blk_mq_run_hw_queues(q, true);
 }
 
-static int mmc_queue_thread(void *d)
-{
-	struct mmc_queue *mq = d;
-	struct request_queue *q = mq->queue;
-	struct mmc_context_info *cntx = &mq->card->host->context_info;
-
-	current->flags |= PF_MEMALLOC;
-
-	down(&mq->thread_sem);
-	do {
-		struct request *req;
-
-		spin_lock_irq(q->queue_lock);
-		set_current_state(TASK_INTERRUPTIBLE);
-		req = blk_fetch_request(q);
-		mq->asleep = false;
-		cntx->is_waiting_last_req = false;
-		cntx->is_new_req = false;
-		if (!req) {
-			/*
-			 * Dispatch queue is empty so set flags for
-			 * mmc_request_fn() to wake us up.
-			 */
-			if (mq->qcnt)
-				cntx->is_waiting_last_req = true;
-			else
-				mq->asleep = true;
-		}
-		spin_unlock_irq(q->queue_lock);
-
-		if (req || mq->qcnt) {
-			set_current_state(TASK_RUNNING);
-			mmc_blk_issue_rq(mq, req);
-			cond_resched();
-		} else {
-			if (kthread_should_stop()) {
-				set_current_state(TASK_RUNNING);
-				break;
-			}
-			up(&mq->thread_sem);
-			schedule();
-			down(&mq->thread_sem);
-		}
-	} while (1);
-	up(&mq->thread_sem);
-
-	return 0;
-}
-
-/*
- * Generic MMC request handler.  This is called for any queue on a
- * particular host.  When the host is not busy, we look for a request
- * on any queue on this host, and attempt to issue it.  This may
- * not be the queue we were asked to process.
- */
-static void mmc_request_fn(struct request_queue *q)
-{
-	struct mmc_queue *mq = q->queuedata;
-	struct request *req;
-	struct mmc_context_info *cntx;
-
-	if (!mq) {
-		while ((req = blk_fetch_request(q)) != NULL) {
-			req->rq_flags |= RQF_QUIET;
-			__blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		return;
-	}
-
-	cntx = &mq->card->host->context_info;
-
-	if (cntx->is_waiting_last_req) {
-		cntx->is_new_req = true;
-		wake_up_interruptible(&cntx->wait);
-	}
-
-	if (mq->asleep)
-		wake_up_process(mq->thread);
-}
-
 static struct scatterlist *mmc_alloc_sg(int sg_len, gfp_t gfp)
 {
 	struct scatterlist *sg;
@@ -311,12 +215,6 @@ static int __mmc_init_request(struct mmc_queue *mq, struct request *req,
 	return 0;
 }
 
-static int mmc_init_request(struct request_queue *q, struct request *req,
-			    gfp_t gfp)
-{
-	return __mmc_init_request(q->queuedata, req, gfp);
-}
-
 static void mmc_exit_request(struct request_queue *q, struct request *req)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
@@ -469,9 +367,6 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 	blk_queue_max_segments(mq->queue, host->max_segs);
 	blk_queue_max_segment_size(mq->queue, host->max_seg_size);
 
-	/* Initialize thread_sem even if it is not used */
-	sema_init(&mq->thread_sem, 1);
-
 	INIT_WORK(&mq->recovery_work, mmc_mq_recovery_handler);
 	INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work);
 
@@ -559,51 +454,15 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 		   spinlock_t *lock, const char *subname)
 {
 	struct mmc_host *host = card->host;
-	int ret = -ENOMEM;
 
 	mq->card = card;
 
 	mq->use_cqe = host->cqe_enabled;
 
-	if (mq->use_cqe || mmc_host_use_blk_mq(host))
-		return mmc_mq_init(mq, card, lock);
-
-	mq->queue = blk_alloc_queue(GFP_KERNEL);
-	if (!mq->queue)
-		return -ENOMEM;
-	mq->queue->queue_lock = lock;
-	mq->queue->request_fn = mmc_request_fn;
-	mq->queue->init_rq_fn = mmc_init_request;
-	mq->queue->exit_rq_fn = mmc_exit_request;
-	mq->queue->cmd_size = sizeof(struct mmc_queue_req);
-	mq->queue->queuedata = mq;
-	mq->qcnt = 0;
-	ret = blk_init_allocated_queue(mq->queue);
-	if (ret) {
-		blk_cleanup_queue(mq->queue);
-		return ret;
-	}
-
-	blk_queue_prep_rq(mq->queue, mmc_prep_request);
-
-	mmc_setup_queue(mq, card);
-
-	mq->thread = kthread_run(mmc_queue_thread, mq, "mmcqd/%d%s",
-		host->index, subname ? subname : "");
-
-	if (IS_ERR(mq->thread)) {
-		ret = PTR_ERR(mq->thread);
-		goto cleanup_queue;
-	}
-
-	return 0;
-
-cleanup_queue:
-	blk_cleanup_queue(mq->queue);
-	return ret;
+	return mmc_mq_init(mq, card, lock);
 }
 
-static void mmc_mq_queue_suspend(struct mmc_queue *mq)
+void mmc_queue_suspend(struct mmc_queue *mq)
 {
 	blk_mq_quiesce_queue(mq->queue);
 
@@ -615,71 +474,22 @@ static void mmc_mq_queue_suspend(struct mmc_queue *mq)
 	mmc_release_host(mq->card->host);
 }
 
-static void mmc_mq_queue_resume(struct mmc_queue *mq)
+void mmc_queue_resume(struct mmc_queue *mq)
 {
 	blk_mq_unquiesce_queue(mq->queue);
 }
 
-static void __mmc_queue_suspend(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (!mq->suspended) {
-		mq->suspended |= true;
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-
-		down(&mq->thread_sem);
-	}
-}
-
-static void __mmc_queue_resume(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (mq->suspended) {
-		mq->suspended = false;
-
-		up(&mq->thread_sem);
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
-}
-
 void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
-	if (q->mq_ops) {
-		/*
-		 * The legacy code handled the possibility of being suspended,
-		 * so do that here too.
-		 */
-		if (blk_queue_quiesced(q))
-			blk_mq_unquiesce_queue(q);
-		goto out_cleanup;
-	}
-
-	/* Make sure the queue isn't suspended, as that will deadlock */
-	mmc_queue_resume(mq);
-
-	/* Then terminate our worker thread */
-	kthread_stop(mq->thread);
-
-	/* Empty the queue */
-	spin_lock_irqsave(q->queue_lock, flags);
-	q->queuedata = NULL;
-	blk_start_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	/*
+	 * The legacy code handled the possibility of being suspended,
+	 * so do that here too.
+	 */
+	if (blk_queue_quiesced(q))
+		blk_mq_unquiesce_queue(q);
 
-out_cleanup:
 	blk_cleanup_queue(q);
 
 	/*
@@ -692,38 +502,6 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
 	mq->card = NULL;
 }
 
-/**
- * mmc_queue_suspend - suspend a MMC request queue
- * @mq: MMC queue to suspend
- *
- * Stop the block request queue, and wait for our thread to
- * complete any outstanding requests.  This ensures that we
- * won't suspend while a request is being processed.
- */
-void mmc_queue_suspend(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-
-	if (q->mq_ops)
-		mmc_mq_queue_suspend(mq);
-	else
-		__mmc_queue_suspend(mq);
-}
-
-/**
- * mmc_queue_resume - resume a previously suspended MMC request queue
- * @mq: MMC queue to resume
- */
-void mmc_queue_resume(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-
-	if (q->mq_ops)
-		mmc_mq_queue_resume(mq);
-	else
-		__mmc_queue_resume(mq);
-}
-
 /*
  * Prepare the sg list(s) to be handed of to the host driver
  */
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 34f601c6dd39..17e59d50b496 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -34,7 +34,6 @@ static inline struct request *mmc_queue_req_to_req(struct mmc_queue_req *mqr)
 	return blk_mq_rq_from_pdu(mqr);
 }
 
-struct task_struct;
 struct mmc_blk_data;
 struct mmc_blk_ioc_data;
 
@@ -44,7 +43,6 @@ struct mmc_blk_request {
 	struct mmc_command	cmd;
 	struct mmc_command	stop;
 	struct mmc_data		data;
-	int			retune_retry_done;
 };
 
 /**
@@ -66,7 +64,6 @@ enum mmc_drv_op {
 struct mmc_queue_req {
 	struct mmc_blk_request	brq;
 	struct scatterlist	*sg;
-	struct mmc_async_req	areq;
 	enum mmc_drv_op		drv_op;
 	int			drv_op_result;
 	void			*drv_op_data;
@@ -76,22 +73,10 @@ struct mmc_queue_req {
 
 struct mmc_queue {
 	struct mmc_card		*card;
-	struct task_struct	*thread;
-	struct semaphore	thread_sem;
 	struct mmc_ctx		ctx;
 	struct blk_mq_tag_set	tag_set;
-	bool			suspended;
-	bool			asleep;
 	struct mmc_blk_data	*blkdata;
 	struct request_queue	*queue;
-	/*
-	 * FIXME: this counter is not a very reliable way of keeping
-	 * track of how many requests that are ongoing. Switch to just
-	 * letting the block core keep track of requests and per-request
-	 * associated mmc_queue_req data.
-	 */
-	int			qcnt;
-
 	int			in_flight[MMC_ISSUE_MAX];
 	unsigned int		cqe_busy;
 #define MMC_CQE_DCMD_BUSY	BIT(0)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V15 22/22] mmc: core: Remove code no longer needed after the switch to blk-mq
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (20 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 21/22] mmc: block: Remove code no longer needed after the switch to blk-mq Adrian Hunter
@ 2017-11-29 13:41 ` Adrian Hunter
  2017-11-29 15:47 ` [PATCH V15 00/22] mmc: Add Command Queue support Ulf Hansson
  22 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-11-29 13:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Remove code no longer needed after the switch to blk-mq.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/core/bus.c   |   2 -
 drivers/mmc/core/core.c  | 185 +----------------------------------------------
 drivers/mmc/core/core.h  |   8 --
 drivers/mmc/core/host.h  |   5 --
 include/linux/mmc/host.h |   3 -
 5 files changed, 1 insertion(+), 202 deletions(-)
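
For context: with context_info and mmc_start_areq() gone, request completion
synchronizes only through the standard completion API. The surviving
synchronous path boils down to roughly this (sketch, not the verbatim code):

	/* __mmc_start_req(): arm the completion and start the request */
	init_completion(&mrq->completion);
	mrq->done = mmc_wait_done;	/* does complete(&mrq->completion) */
	err = mmc_start_request(host, mrq);

	/* mmc_wait_for_req_done(): block until ->done() fires */
	wait_for_completion(&mrq->completion);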

diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index 7586ff2ad1f1..fc92c6c1c9a4 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -351,8 +351,6 @@ int mmc_add_card(struct mmc_card *card)
 #ifdef CONFIG_DEBUG_FS
 	mmc_add_card_debugfs(card);
 #endif
-	mmc_init_context_info(card->host);
-
 	card->dev.of_node = mmc_of_find_child_device(card->host, 0);
 
 	device_enable_async_suspend(&card->dev);
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 7ca6e4866a8b..e5c8727c16ad 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -361,20 +361,6 @@ int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 }
 EXPORT_SYMBOL(mmc_start_request);
 
-/*
- * mmc_wait_data_done() - done callback for data request
- * @mrq: done data request
- *
- * Wakes up mmc context, passed as a callback to host controller driver
- */
-static void mmc_wait_data_done(struct mmc_request *mrq)
-{
-	struct mmc_context_info *context_info = &mrq->host->context_info;
-
-	context_info->is_done_rcv = true;
-	wake_up_interruptible(&context_info->wait);
-}
-
 static void mmc_wait_done(struct mmc_request *mrq)
 {
 	complete(&mrq->completion);
@@ -392,37 +378,6 @@ static inline void mmc_wait_ongoing_tfr_cmd(struct mmc_host *host)
 		wait_for_completion(&ongoing_mrq->cmd_completion);
 }
 
-/*
- *__mmc_start_data_req() - starts data request
- * @host: MMC host to start the request
- * @mrq: data request to start
- *
- * Sets the done callback to be called when request is completed by the card.
- * Starts data mmc request execution
- * If an ongoing transfer is already in progress, wait for the command line
- * to become available before sending another command.
- */
-static int __mmc_start_data_req(struct mmc_host *host, struct mmc_request *mrq)
-{
-	int err;
-
-	mmc_wait_ongoing_tfr_cmd(host);
-
-	mrq->done = mmc_wait_data_done;
-	mrq->host = host;
-
-	init_completion(&mrq->cmd_completion);
-
-	err = mmc_start_request(host, mrq);
-	if (err) {
-		mrq->cmd->error = err;
-		mmc_complete_cmd(mrq);
-		mmc_wait_data_done(mrq);
-	}
-
-	return err;
-}
-
 static int __mmc_start_req(struct mmc_host *host, struct mmc_request *mrq)
 {
 	int err;
@@ -650,133 +605,11 @@ int mmc_cqe_recovery(struct mmc_host *host)
  */
 bool mmc_is_req_done(struct mmc_host *host, struct mmc_request *mrq)
 {
-	if (host->areq)
-		return host->context_info.is_done_rcv;
-	else
-		return completion_done(&mrq->completion);
+	return completion_done(&mrq->completion);
 }
 EXPORT_SYMBOL(mmc_is_req_done);
 
 /**
- * mmc_finalize_areq() - finalize an asynchronous request
- * @host: MMC host to finalize any ongoing request on
- *
- * Returns the status of the ongoing asynchronous request, but
- * MMC_BLK_SUCCESS if no request was going on.
- */
-static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
-{
-	struct mmc_context_info *context_info = &host->context_info;
-	enum mmc_blk_status status;
-
-	if (!host->areq)
-		return MMC_BLK_SUCCESS;
-
-	while (1) {
-		wait_event_interruptible(context_info->wait,
-				(context_info->is_done_rcv ||
-				 context_info->is_new_req));
-
-		if (context_info->is_done_rcv) {
-			struct mmc_command *cmd;
-
-			context_info->is_done_rcv = false;
-			cmd = host->areq->mrq->cmd;
-
-			if (!cmd->error || !cmd->retries ||
-			    mmc_card_removed(host->card)) {
-				status = host->areq->err_check(host->card,
-							       host->areq);
-				break; /* return status */
-			} else {
-				mmc_retune_recheck(host);
-				pr_info("%s: req failed (CMD%u): %d, retrying...\n",
-					mmc_hostname(host),
-					cmd->opcode, cmd->error);
-				cmd->retries--;
-				cmd->error = 0;
-				__mmc_start_request(host, host->areq->mrq);
-				continue; /* wait for done/new event again */
-			}
-		}
-
-		return MMC_BLK_NEW_REQUEST;
-	}
-
-	mmc_retune_release(host);
-
-	/*
-	 * Check BKOPS urgency for each R1 response
-	 */
-	if (host->card && mmc_card_mmc(host->card) &&
-	    ((mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1) ||
-	     (mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1B)) &&
-	    (host->areq->mrq->cmd->resp[0] & R1_EXCEPTION_EVENT)) {
-		mmc_start_bkops(host->card, true);
-	}
-
-	return status;
-}
-
-/**
- *	mmc_start_areq - start an asynchronous request
- *	@host: MMC host to start command
- *	@areq: asynchronous request to start
- *	@ret_stat: out parameter for status
- *
- *	Start a new MMC custom command request for a host.
- *	If there is on ongoing async request wait for completion
- *	of that request and start the new one and return.
- *	Does not wait for the new request to complete.
- *
- *      Returns the completed request, NULL in case of none completed.
- *	Wait for the an ongoing request (previoulsy started) to complete and
- *	return the completed request. If there is no ongoing request, NULL
- *	is returned without waiting. NULL is not an error condition.
- */
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat)
-{
-	enum mmc_blk_status status;
-	int start_err = 0;
-	struct mmc_async_req *previous = host->areq;
-
-	/* Prepare a new request */
-	if (areq)
-		mmc_pre_req(host, areq->mrq);
-
-	/* Finalize previous request */
-	status = mmc_finalize_areq(host);
-	if (ret_stat)
-		*ret_stat = status;
-
-	/* The previous request is still going on... */
-	if (status == MMC_BLK_NEW_REQUEST)
-		return NULL;
-
-	/* Fine so far, start the new request! */
-	if (status == MMC_BLK_SUCCESS && areq)
-		start_err = __mmc_start_data_req(host, areq->mrq);
-
-	/* Postprocess the old request at this point */
-	if (host->areq)
-		mmc_post_req(host, host->areq->mrq, 0);
-
-	/* Cancel a prepared request if it was not started. */
-	if ((status != MMC_BLK_SUCCESS || start_err) && areq)
-		mmc_post_req(host, areq->mrq, -EINVAL);
-
-	if (status != MMC_BLK_SUCCESS)
-		host->areq = NULL;
-	else
-		host->areq = areq;
-
-	return previous;
-}
-EXPORT_SYMBOL(mmc_start_areq);
-
-/**
  *	mmc_wait_for_req - start a request and wait for completion
  *	@host: MMC host to start command
  *	@mrq: MMC request to start
@@ -2963,22 +2796,6 @@ void mmc_unregister_pm_notifier(struct mmc_host *host)
 }
 #endif
 
-/**
- * mmc_init_context_info() - init synchronization context
- * @host: mmc host
- *
- * Init struct context_info needed to implement asynchronous
- * request mechanism, used by mmc core, host driver and mmc requests
- * supplier.
- */
-void mmc_init_context_info(struct mmc_host *host)
-{
-	host->context_info.is_new_req = false;
-	host->context_info.is_done_rcv = false;
-	host->context_info.is_waiting_last_req = false;
-	init_waitqueue_head(&host->context_info.wait);
-}
-
 static int __init mmc_init(void)
 {
 	int ret;
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 3e3d21304e5f..d6303d69071b 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -89,8 +89,6 @@ static inline void mmc_delay(unsigned int ms)
 void mmc_add_card_debugfs(struct mmc_card *card);
 void mmc_remove_card_debugfs(struct mmc_card *card);
 
-void mmc_init_context_info(struct mmc_host *host);
-
 int mmc_execute_tuning(struct mmc_card *card);
 int mmc_hs200_to_hs400(struct mmc_card *card);
 int mmc_hs400_to_hs200(struct mmc_card *card);
@@ -108,12 +106,6 @@ static inline void mmc_unregister_pm_notifier(struct mmc_host *host) { }
 
 int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq);
 
-struct mmc_async_req;
-
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat);
-
 int mmc_erase(struct mmc_card *card, unsigned int from, unsigned int nr,
 		unsigned int arg);
 int mmc_can_erase(struct mmc_card *card);
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index 6d896869e5c6..06ec19b5bf9f 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -79,10 +79,5 @@ static inline bool mmc_card_hs400es(struct mmc_card *card)
 	return card->host->ios.enhanced_strobe;
 }
 
-static inline bool mmc_host_use_blk_mq(struct mmc_host *host)
-{
-	return true;
-}
-
 #endif
 
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index f3e13c50f6b0..85146235231e 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -424,9 +424,6 @@ struct mmc_host {
 
 	struct dentry		*debugfs_root;
 
-	struct mmc_async_req	*areq;		/* active async req */
-	struct mmc_context_info	context_info;	/* async synchronization info */
-
 	/* Ongoing data transfer that allows commands during transfer */
 	struct mmc_request	*ongoing_mrq;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 00/22] mmc: Add Command Queue support
  2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
                   ` (21 preceding siblings ...)
  2017-11-29 13:41 ` [PATCH V15 22/22] mmc: core: " Adrian Hunter
@ 2017-11-29 15:47 ` Ulf Hansson
  2017-12-01 13:13   ` Adrian Hunter
                     ` (2 more replies)
  22 siblings, 3 replies; 42+ messages in thread
From: Ulf Hansson @ 2017-11-29 15:47 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

Hi Adrian,

On 29 November 2017 at 14:40, Adrian Hunter <adrian.hunter@intel.com> wrote:
> Hi
>
> Here is V15 of the hardware command queue patches without the software
> command queue patches, now using blk-mq and now with blk-mq support for
> non-CQE I/O.

I have applied patches 1->19 for next.  Deferring patches 20->22 for a while.

For those patches that were more or less the same as in v14, I added Linus' ack.

Hopefully we get some help from the community to test this series on
different HW (and I will be checking kernelci's boot reports). I
haven't added Bartlomiej's tested-by, nor Linus' (because of the
changes that have been made), so I'm hoping that will happen sooner
or later.

Moreover, I will gladly add more people's acks/reviewed-by and
tested-by tags at any point during this release cycle.

Thanks and kind regards
Uffe

>
> V14 included a number of fixes to existing code, changes to default to
> blk-mq, and adds patches to remove legacy code.
>
> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
> 2% drop in sequential read speed but no change to sequential write.
>
> Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
> seemed to be coming from the inferior latency of running work items compared
> with a dedicated thread.  Hacking blk-mq workqueue to be unbound reduced the
> performance degradation from 3% to 1%.
>
> While we should look at changing blk-mq to give better workqueue performance,
> a bigger gain is likely to be made by adding a new host API to enable the
> next already-prepared request to be issued directly from within ->done()
> callback of the current request.
>
> Changes since V14:
>       mmc: block: Fix missing blk_put_request()
>       mmc: block: Check return value of blk_get_request()
>       mmc: core: Do not leave the block driver in a suspended state
>       mmc: block: Ensure that debugfs files are removed
>         Dropped because they have been applied
>       mmc: block: Use data timeout in card_busy_detect()
>         Replaced by other patches
>       mmc: block: Add blk-mq support
>         Rename mmc_blk_ss_read() to mmc_blk_read_single()
>         Add more error handling to single sector read
>         Let mmc_blk_mq_complete_rq() cater for requests already "updated" by recovery
>         Rename mmc_blk_mq_acct_req_done() to mmc_blk_mq_dec_in_flight()
>         Add comments about synchronization
>         Add comment about not dispatching in parallel
>         Add comment about the queue depth
>       mmc: block: Add CQE support
>         Add coment about CQE queue depth
>       mmc: block: blk-mq: Add support for direct completion
>         Rename mmc_queue_direct_complete() to mmc_host_done_complete()
>         Rename MMC_CAP_DIRECT_COMPLETE to MMC_CAP_DONE_COMPLETE
>       mmc: block: blk-mq: Separate card polling from recovery
>         Ensure to report gen_err as an error
>       mmc: block: Make card_busy_detect() accumulate all response error bits
>         Patch moved later in the patch set and adjusted accordingly
>       mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
>         Adjusted due to patch re-ordering
>       mmc: block: Check the timeout correctly in card_busy_detect()
>         New patch.
>       mmc: block: Add timeout_clks when calculating timeout
>         New patch.
>       mmc: block: Reduce polling timeout from 10 minutes to 10 seconds
>         New patch.
>
> Changes since V13:
>       mmc: block: Fix missing blk_put_request()
>         New patch.
>       mmc: block: Check return value of blk_get_request()
>         New patch.
>       mmc: core: Do not leave the block driver in a suspended state
>         New patch.
>       mmc: block: Ensure that debugfs files are removed
>         New patch.
>       mmc: block: No need to export mmc_cleanup_queue()
>         New patch.
>       mmc: block: Simplify cleaning up the queue
>         New patch.
>       mmc: block: Use data timeout in card_busy_detect()
>         New patch.
>       mmc: block: Check for transfer state in card_busy_detect()
>         New patch.
>       mmc: block: Make card_busy_detect() accumulate all response error bits
>         New patch.
>       mmc: core: Make mmc_pre_req() and mmc_post_req() available
>         New patch.
>       mmc: core: Add parameter use_blk_mq
>         Default to y
>       mmc: block: Add blk-mq support
>         Wrap blk_mq_end_request / blk_end_request_all
>         Rename mmc_blk_rw_recovery -> mmc_blk_mq_rw_recovery
>         Additional parentheses to '==' expressions
>         Use mmc_pre_req() / mmc_post_req()
>         Fix missing tuning release on error after mmc_start_request()
>         Expand comment about timeouts
>         Allow for possibility that the queue is quiesced when removing
>         Ensure complete_work is flushed when removing
>       mmc: block: Add CQE support
>         Additional parentheses to '==' expressions
>       mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
>         Replaces patch "Stop using card_busy_detect()" retaining card_busy_detect()
>       mmc: block: blk-mq: Stop using legacy recovery
>         Allow for SPI
>       mmc: mmc_test: Do not use mmc_start_areq() anymore
>         New patch.
>       mmc: core: Remove option not to use blk-mq
>         New patch.
>       mmc: block: Remove code no longer needed after the switch to blk-mq
>         New patch.
>       mmc: core: Remove code no longer needed after the switch to blk-mq
>         New patch.
>
> Changes since V12:
>       mmc: block: Add error-handling comments
>         New patch.
>       mmc: block: Add blk-mq support
>         Use legacy error handling
>       mmc: block: Add CQE support
>         Re-base
>       mmc: block: blk-mq: Add support for direct completion
>         New patch.
>       mmc: block: blk-mq: Separate card polling from recovery
>         New patch.
>       mmc: block: blk-mq: Stop using card_busy_detect()
>         New patch.
>       mmc: block: blk-mq: Stop using legacy recovery
>         New patch.
>
> Changes since V11:
>       Split "mmc: block: Add CQE and blk-mq support" into 2 patches
>
> Changes since V10:
>       mmc: core: Remove unnecessary host claim
>       mmc: core: Introduce host claiming by context
>       mmc: core: Add support for handling CQE requests
>       mmc: mmc: Enable Command Queuing
>       mmc: mmc: Enable CQE's
>       mmc: block: Use local variables in mmc_blk_data_prep()
>       mmc: block: Prepare CQE data
>       mmc: block: Factor out mmc_setup_queue()
>       mmc: core: Add parameter use_blk_mq
>       mmc: core: Export mmc_start_bkops()
>       mmc: core: Export mmc_start_request()
>       mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
>         Dropped because they have been applied
>       mmc: block: Add CQE and blk-mq support
>         Extend blk-mq support for asynchronous read / writes to all host
>         controllers including those that require polling. The direct
>         completion path is still available but depends on a new capability
>         flag.
>         Drop blk-mq support for synchronous read / writes.
>
> Venkat Gopalakrishnan (1):
>       mmc: cqhci: support for command queue enabled host
>
> Changes since V9:
>       mmc: block: Add CQE and blk-mq support
>         - reinstate mq support for REQ_OP_DRV_IN/OUT that was removed because
>         it was incorrectly assumed to be handled by the rpmb character device
>         - don't check for rpmb block device anymore
>       mmc: cqhci: support for command queue enabled host
>         Fix cqhci_set_irqs() as per Haibo Chen
>
> Changes since V8:
>         Re-based
>       mmc: core: Introduce host claiming by context
>         Slightly simplified as per Ulf
>       mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
>         New patch.
>       mmc: block: Add CQE and blk-mq support
>         Fix missing ->post_req() on the error path
>
> Changes since V7:
>         Re-based
>       mmc: core: Introduce host claiming by context
>         Slightly simplified
>       mmc: core: Add parameter use_blk_mq
>         New patch.
>       mmc: core: Remove unnecessary host claim
>         New patch.
>       mmc: core: Export mmc_start_bkops()
>         New patch.
>       mmc: core: Export mmc_start_request()
>         New patch.
>       mmc: block: Add CQE and blk-mq support
>         Add blk-mq support for non_CQE requests
>
> Changes since V6:
>       mmc: core: Introduce host claiming by context
>         New patch.
>       mmc: core: Move mmc_start_areq() declaration
>         Dropped because it has been applied
>       mmc: block: Fix block status codes
>         Dropped because it has been applied
>       mmc: host: Add CQE interface
>         Dropped because it has been applied
>       mmc: core: Turn off CQE before sending commands
>         Dropped because it has been applied
>       mmc: block: Factor out mmc_setup_queue()
>         New patch.
>       mmc: block: Add CQE support
>         Drop legacy support and add blk-mq support
>
> Changes since V5:
>         Re-based
>       mmc: core: Add mmc_retune_hold_now()
>         Dropped because it has been applied
>       mmc: core: Add members to mmc_request and mmc_data for CQE's
>         Dropped because it has been applied
>       mmc: core: Move mmc_start_areq() declaration
>         New patch at Ulf's request
>       mmc: block: Fix block status codes
>         Another un-related patch
>       mmc: host: Add CQE interface
>         Move recovery_notifier() callback to struct mmc_request
>       mmc: core: Add support for handling CQE requests
>         Roll __mmc_cqe_request_done() into mmc_cqe_request_done()
>         Move function declarations requested by Ulf
>       mmc: core: Remove unused MMC_CAP2_PACKED_CMD
>         Dropped because it has been applied
>       mmc: block: Add CQE support
>         Add explanation to commit message
>         Adjustment for changed recovery_notifier() callback
>       mmc: cqhci: support for command queue enabled host
>         Adjustment for changed recovery_notifier() callback
>       mmc: sdhci-pci: Add CQHCI support for Intel GLK
>         Add DCMD capability for Intel controllers except GLK
>
> Changes since V4:
>       mmc: core: Add mmc_retune_hold_now()
>         Add explanation to commit message.
>       mmc: host: Add CQE interface
>         Add comments to callback declarations.
>       mmc: core: Turn off CQE before sending commands
>         Add explanation to commit message.
>       mmc: core: Add support for handling CQE requests
>         Add comments as requested by Ulf.
>       mmc: core: Remove unused MMC_CAP2_PACKED_CMD
>         New patch.
>       mmc: mmc: Enable Command Queuing
>         Adjust for removal of MMC_CAP2_PACKED_CMD.
>         Add a comment about Packed Commands.
>       mmc: mmc: Enable CQE's
>         Remove un-necessary check for MMC_CAP2_CQE
>       mmc: block: Use local variables in mmc_blk_data_prep()
>         New patch.
>       mmc: block: Prepare CQE data
>         Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
>         Remove priority setting.
>         Add explanation to commit message.
>       mmc: cqhci: support for command queue enabled host
>         Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA
>
> Changes since V3:
>         Adjusted ...blk_end_request...() for new block status codes
>         Fixed CQHCI transaction descriptor for "no DCMD" case
>
> Changes since V2:
>         Dropped patches that have been applied.
>         Re-based
>         Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"
>
> Changes since V1:
>
>         "Share mmc request array between partitions" is dependent
>         on changes in "Introduce queue semantics", so added that
>         and block fixes:
>
>         Added "Fix is_waiting_last_req set incorrectly"
>         Added "Fix cmd error reset failure path"
>         Added "Use local var for mqrq_cur"
>         Added "Introduce queue semantics"
>
> Changes since RFC:
>
>         Re-based on next.
>         Added comment about command queue priority.
>         Added some acks and reviews.
>
>
> Adrian Hunter (21):
>       mmc: block: No need to export mmc_cleanup_queue()
>       mmc: block: Simplify cleaning up the queue
>       mmc: core: Make mmc_pre_req() and mmc_post_req() available
>       mmc: block: Add error-handling comments
>       mmc: core: Add parameter use_blk_mq
>       mmc: block: Add blk-mq support
>       mmc: block: Add CQE support
>       mmc: sdhci-pci: Add CQHCI support for Intel GLK
>       mmc: block: blk-mq: Add support for direct completion
>       mmc: block: blk-mq: Separate card polling from recovery
>       mmc: block: Make card_busy_detect() accumulate all response error bits
>       mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
>       mmc: block: Check the timeout correctly in card_busy_detect()
>       mmc: block: Check for transfer state in card_busy_detect()
>       mmc: block: Add timeout_clks when calculating timeout
>       mmc: block: Reduce polling timeout from 10 minutes to 10 seconds
>       mmc: block: blk-mq: Stop using legacy recovery
>       mmc: mmc_test: Do not use mmc_start_areq() anymore
>       mmc: core: Remove option not to use blk-mq
>       mmc: block: Remove code no longer needed after the switch to blk-mq
>       mmc: core: Remove code no longer needed after the switch to blk-mq
>
> Venkat Gopalakrishnan (1):
>       mmc: cqhci: support for command queue enabled host
>
>  drivers/mmc/core/block.c          | 1383 +++++++++++++++++++++----------------
>  drivers/mmc/core/block.h          |   12 +-
>  drivers/mmc/core/bus.c            |    2 -
>  drivers/mmc/core/core.c           |  216 +-----
>  drivers/mmc/core/core.h           |   39 +-
>  drivers/mmc/core/host.h           |    6 +-
>  drivers/mmc/core/mmc_test.c       |  122 ++--
>  drivers/mmc/core/queue.c          |  504 +++++++++-----
>  drivers/mmc/core/queue.h          |   64 +-
>  drivers/mmc/host/Kconfig          |   14 +
>  drivers/mmc/host/Makefile         |    1 +
>  drivers/mmc/host/cqhci.c          | 1150 ++++++++++++++++++++++++++++++
>  drivers/mmc/host/cqhci.h          |  240 +++++++
>  drivers/mmc/host/sdhci-pci-core.c |  155 ++++-
>  include/linux/mmc/host.h          |    5 +-
>  15 files changed, 2835 insertions(+), 1078 deletions(-)
>  create mode 100644 drivers/mmc/host/cqhci.c
>  create mode 100644 drivers/mmc/host/cqhci.h
>
>
> Regards
> Adrian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 00/22] mmc: Add Command Queue support
  2017-11-29 15:47 ` [PATCH V15 00/22] mmc: Add Command Queue support Ulf Hansson
@ 2017-12-01 13:13   ` Adrian Hunter
  2017-12-05 10:10   ` Linus Walleij
  2017-12-11 12:28   ` Ulf Hansson
  2 siblings, 0 replies; 42+ messages in thread
From: Adrian Hunter @ 2017-12-01 13:13 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 29/11/17 17:47, Ulf Hansson wrote:
> Hi Adrian,
> 
> On 29 November 2017 at 14:40, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> Hi
>>
>> Here is V15 of the hardware command queue patches without the software
>> command queue patches, now using blk-mq and now with blk-mq support for
>> non-CQE I/O.
> 
> I have applied patches 1->19 for next.  Deferring patches 21->23 for a while.

Thank you!

> 
> For those patches that were more or less the same as in v14, I added Linus' ack.
> 
> Hopefully we get some help from the community to test this series on
> different HW (and I will be checking kernelci's boot reports). I
> haven't added Bartlomiej's tested-by nor Linus' (because of the
> changes that have been made), so I'm hoping that will happen sooner
> or later.
> 
> Moreover, I will gladly add more people's acks/reviewed-by and
> tested-by tags at any point during this release cycle.

I also encourage anyone testing to report their results even if they don't
want to have a tested-by tag.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 00/22] mmc: Add Command Queue support
  2017-11-29 15:47 ` [PATCH V15 00/22] mmc: Add Command Queue support Ulf Hansson
  2017-12-01 13:13   ` Adrian Hunter
@ 2017-12-05 10:10   ` Linus Walleij
  2017-12-05 15:53     ` Ulf Hansson
  2017-12-11 12:28   ` Ulf Hansson
  2 siblings, 1 reply; 42+ messages in thread
From: Linus Walleij @ 2017-12-05 10:10 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Adrian Hunter, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On Wed, Nov 29, 2017 at 4:47 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:

> I have applied patches 1->19 for next.  Deferring patches 21->23 for a while.
>
> For those patches that were more or less the same as in v14, I added Linus' ack.

My ACK goes for the new set as well.

> Hopefully we get some help from the community to test this series on
> different HW (and I will be checking kernelci's boot reports). I
> haven't added Bartlomiej's tested-by nor Linus' (because of the
> changes that have been made), so I'm hoping that will happen sooner
> or later.

I have run some tests yesterday and today using dd, find and
iozone, vanilla and with fault injection of 1% errors and 10%
errors, then stressing it additionally by ejecting the card
in flight randomly a few times. Everything survived; it's rock solid
from what I can tell!
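
For reference, this kind of error injection can be done with the generic
fault injection framework: with CONFIG_FAIL_MMC_REQUEST=y and debugfs
mounted, something along these lines (mmc0 is just an example host;
probability is a percentage, so 1 or 10 for the two cases above):

  echo 10 > /sys/kernel/debug/mmc0/fail_mmc_request/probability
  echo -1 > /sys/kernel/debug/mmc0/fail_mmc_request/times   # -1 = no limit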

Tested-by: Linus Walleij <linus.walleij@linaro.org>

I am eager to see 21->23 applied too so I can see the end result
and figure out if there is anything left in my patches that needs
to be catered for or if I can just focus on other stuff.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 00/22] mmc: Add Command Queue support
  2017-12-05 10:10   ` Linus Walleij
@ 2017-12-05 15:53     ` Ulf Hansson
  0 siblings, 0 replies; 42+ messages in thread
From: Ulf Hansson @ 2017-12-05 15:53 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Adrian Hunter, linux-mmc, linux-block, linux-kernel, Bough Chen,
	Alex Lemberg, Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung,
	Dong Aisheng, Das Asutosh, Zhangfei Gao, Sahitya Tummala,
	Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 5 December 2017 at 11:10, Linus Walleij <linus.walleij@linaro.org> wrote:
> On Wed, Nov 29, 2017 at 4:47 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
>> I have applied patches 1->19 for next.  Deferring patches 21->23 for a while.
>>
>> For those patches that were more or less the same as in v14, I added Linus' ack.
>
> My ACK goes for the new set as well.

Great, I'll add it!

>
>> Hopefully we get some help from the community to test this series on
>> different HW (and I will be checking kernelci's boot reports). I
>> haven't added Bartlomiej's tested-by nor Linus' (because of the
>> changes that have been made), so I'm hoping that will happen sooner
>> or later.
>
> I have run some tests yesterday and today using dd, find and
> iozone, vanilla and with fault injection of 1% errors and 10%
> errors, then stressing it additionally by ejecting the card
> in flight randomly a few times. Everything survived; it's rock solid
> from what I can tell!
>
> Tested-by: Linus Walleij <linus.walleij@linaro.org>

Great, I'll add this as well.

>
> I am eager to see 21->23 applied too so I can see the end result
> and figure out if there is anything left in my patches that needs
> to be catered for or if I can just focus on other stuff.

Yeah, if nothing happens, I may apply them early next week.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 00/22] mmc: Add Command Queue support
  2017-11-29 15:47 ` [PATCH V15 00/22] mmc: Add Command Queue support Ulf Hansson
  2017-12-01 13:13   ` Adrian Hunter
  2017-12-05 10:10   ` Linus Walleij
@ 2017-12-11 12:28   ` Ulf Hansson
  2 siblings, 0 replies; 42+ messages in thread
From: Ulf Hansson @ 2017-12-11 12:28 UTC (permalink / raw)
  To: Adrian Hunter, Linus Walleij
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Shawn Lin, Bartlomiej Zolnierkiewicz,
	Christoph Hellwig

On 29 November 2017 at 16:47, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> Hi Adrian,
>
> On 29 November 2017 at 14:40, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> Hi
>>
>> Here is V15 of the hardware command queue patches without the software
>> command queue patches, now using blk-mq and now with blk-mq support for
>> non-CQE I/O.
>
> I have applied patches 1->19 for next.  Deferring patches 21->23 for a while.

We haven't got any reports about big regressions, so I think this looks solid!

So, I have decided to apply 21->23 for next as well.

[...]

Thanks and kind regards
Uffe

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 06/22] mmc: block: Add blk-mq support
  2017-11-29 13:41 ` [PATCH V15 06/22] mmc: block: Add blk-mq support Adrian Hunter
@ 2018-02-21 20:50   ` Dmitry Osipenko
  2018-02-22  7:42     ` Adrian Hunter
  0 siblings, 1 reply; 42+ messages in thread
From: Dmitry Osipenko @ 2018-02-21 20:50 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig

On 29.11.2017 16:41, Adrian Hunter wrote:
> Define and use a blk-mq queue. Discards and flushes are processed
> synchronously, but reads and writes asynchronously. In order to support
> slow DMA unmapping, DMA unmapping is not done until after the next request
> is started. That means the request is not completed until then. If there is
> no next request then the completion is done by queued work.

Hello,

I'm using (running linux-next and doing some upstream development for) an
old NVIDIA Tegra tablet that has built-in (internal) and external MMCs,
and with blk-mq enabled I'm observing a soft lockup. The lockup is
reproducible quite reliably by running fsck on any MMC partition;
sometimes the kernel locks up on boot while probing the partition table
(weirdly, only when both SDHCIs are present, i.e. internal storage
enabled in DT and external SD inserted/enabled), and it also locks up
pretty quickly under just general use. Reverting the mmc/ commits up to
1bec43a3b18 ("Remove option not to use blk-mq") and disabling
CONFIG_MMC_MQ_DEFAULT makes everything work fine again. There is also a
third SDHCI populated with a built-in WiFi/Bluetooth SDIO card, and I'm
observing odd MMC timeouts with blk-mq enabled; disabling
CONFIG_MMC_MQ_DEFAULT fixes these timeouts as well.

Any thoughts?

WiFi issue
========================

[   38.247006] mmc2: Timeout waiting for hardware interrupt.
[   38.247027] brcmfmac: brcmf_escan_timeout: timer expired
[   38.247036] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[   38.247047] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
[   38.247055] mmc2: sdhci: Blk size:  0x00007008 | Blk cnt:  0x00000000
[   38.247062] mmc2: sdhci: Argument:  0x21000008 | Trn mode: 0x00000013
[   38.247070] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
[   38.247077] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
[   38.247084] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
[   38.247091] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[   38.247098] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
[   38.247105] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
[   38.247112] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
[   38.247119] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
[   38.247126] mmc2: sdhci: Resp[0]:   0x00001800 | Resp[1]:  0x08002db5
[   38.247133] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
[   38.247139] mmc2: sdhci: Host ctl2: 0x00000000
[   38.247146] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
[   38.247152] mmc2: sdhci: ============================================
[   38.247250] brcmfmac: brcmf_sdio_readframes: read 520 bytes from channel 1
failed: -84
[   38.247274] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
[   40.807019] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[   40.807042] brcmfmac: brcmf_notify_escan_complete: Scan abort failed
[   48.487007] mmc2: Timeout waiting for hardware interrupt.
[   48.487057] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[   48.487096] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
[   48.487128] mmc2: sdhci: Blk size:  0x00007040 | Blk cnt:  0x00000001
[   48.487160] mmc2: sdhci: Argument:  0x21000040 | Trn mode: 0x00000013
[   48.487191] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
[   48.487221] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
[   48.487251] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
[   48.487281] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[   48.487313] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
[   48.487343] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
[   48.487374] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
[   48.487404] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
[   48.487435] mmc2: sdhci: Resp[0]:   0x00001000 | Resp[1]:  0x08002db5
[   48.487466] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
[   48.487493] mmc2: sdhci: Host ctl2: 0x00000000
[   48.487525] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
[   48.487552] mmc2: sdhci: ============================================
[   48.487749] brcmfmac: brcmf_sdio_readframes: read 480 bytes from channel 1
failed: -84
[   48.487822] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK


Soft lockup issue
========================

# fsck -f /dev/disk/by-uuid/6768309f-3545-49d5-9ac7-d5be24d35ef2
fsck from util-linux 2.30.2
e2fsck 1.43.9 (8-Feb-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
....

[  125.925436] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
[  125.925496]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  125.925530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  125.925562] kworker/0:3H    D    0   263      2 0x00000000
[  125.925653] Workqueue: kblockd mmc_blk_mq_complete_work
[  125.925747] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  125.925805] [<c0b998f0>] (schedule) from [<c086c870>]
(__mmc_claim_host+0xdc/0x24c)
[  125.925849] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]
(mmc_start_bkops+0x4c/0x190)
[  125.925895] [<c08750bc>] (mmc_start_bkops) from [<c087d254>]
(mmc_blk_urgent_bkops+0x48/0x5c)
[  125.925945] [<c087d254>] (mmc_blk_urgent_bkops) from [<c087f2e0>]
(mmc_blk_mq_complete_prev_req.part.5+0x74/0x210)
[  125.925995] [<c087f2e0>] (mmc_blk_mq_complete_prev_req.part.5) from
[<c08813d0>] (mmc_blk_mq_complete_work+0x30/0x34)
[  125.926049] [<c08813d0>] (mmc_blk_mq_complete_work) from [<c014306c>]
(process_one_work+0x1f8/0x584)
[  125.926093] [<c014306c>] (process_one_work) from [<c014402c>]
(worker_thread+0x68/0x5d4)
[  125.926144] [<c014402c>] (worker_thread) from [<c014965c>] (kthread+0x178/0x184)
[  125.926188] [<c014965c>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  125.926209] Exception stack(0xd579ffb0 to 0xd579fff8)
[  125.926239] ffa0:                                     00000000 00000000
00000000 00000000
[  125.926272] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
[  125.926301] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  125.926361] INFO: task fsck.ext4:471 blocked for more than 60 seconds.
[  125.926399]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  125.926427] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  125.926455] fsck.ext4       D    0   471    470 0x00000000
[  125.926536] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  125.926593] [<c0b998f0>] (schedule) from [<c015823c>] (io_schedule+0x20/0x40)
[  125.926654] [<c015823c>] (io_schedule) from [<c0220acc>]
(wait_on_page_bit+0x120/0x144)
[  125.926705] [<c0220acc>] (wait_on_page_bit) from [<c0220bc4>]
(__filemap_fdatawait_range+0xd4/0x114)
[  125.926757] [<c0220bc4>] (__filemap_fdatawait_range) from [<c0223784>]
(file_write_and_wait_range+0x98/0xb4)
[  125.926810] [<c0223784>] (file_write_and_wait_range) from [<c02c0300>]
(blkdev_fsync+0x2c/0x5c)
[  125.926867] [<c02c0300>] (blkdev_fsync) from [<c02b7ce8>]
(vfs_fsync_range+0x4c/0xb0)
[  125.926912] [<c02b7ce8>] (vfs_fsync_range) from [<c02b7dd4>] (do_fsync+0x4c/0x74)
[  125.926954] [<c02b7dd4>] (do_fsync) from [<c02b80a0>] (SyS_fsync+0x1c/0x20)
[  125.926996] [<c02b80a0>] (SyS_fsync) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[  125.927016] Exception stack(0xd5631fa8 to 0xd5631ff0)
[  125.927050] 1fa0:                   00480fc8 00481108 00000004 00481108
00000000 00000000
[  125.927086] 1fc0: 00480fc8 00481108 00000000 00000076 00000000 7f2bb750
00483268 bec602e8
[  125.927113] 1fe0: 00000076 bec60288 b6de5e8b b6d67cf6
[  177.015618] random: crng init done
[  187.365434] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
[  187.365491]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  187.365524] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  187.365556] kworker/0:3H    D    0   263      2 0x00000000
[  187.365647] Workqueue: kblockd mmc_blk_mq_complete_work
[  187.365741] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  187.365798] [<c0b998f0>] (schedule) from [<c086c870>]
(__mmc_claim_host+0xdc/0x24c)
[  187.365842] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]
(mmc_start_bkops+0x4c/0x190)
[  187.365887] [<c08750bc>] (mmc_start_bkops) from [<c087d254>]
(mmc_blk_urgent_bkops+0x48/0x5c)
[  187.365936] [<c087d254>] (mmc_blk_urgent_bkops) from [<c087f2e0>]
(mmc_blk_mq_complete_prev_req.part.5+0x74/0x210)
[  187.365986] [<c087f2e0>] (mmc_blk_mq_complete_prev_req.part.5) from
[<c08813d0>] (mmc_blk_mq_complete_work+0x30/0x34)
[  187.366039] [<c08813d0>] (mmc_blk_mq_complete_work) from [<c014306c>]
(process_one_work+0x1f8/0x584)
[  187.366083] [<c014306c>] (process_one_work) from [<c014402c>]
(worker_thread+0x68/0x5d4)
[  187.366134] [<c014402c>] (worker_thread) from [<c014965c>] (kthread+0x178/0x184)
[  187.366178] [<c014965c>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  187.366200] Exception stack(0xd579ffb0 to 0xd579fff8)
[  187.366229] ffa0:                                     00000000 00000000
00000000 00000000
[  187.366262] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
[  187.366291] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  187.366350] INFO: task fsck.ext4:471 blocked for more than 60 seconds.
[  187.366388]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  187.366416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  187.366445] fsck.ext4       D    0   471    470 0x00000000
[  187.366526] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  187.366582] [<c0b998f0>] (schedule) from [<c015823c>] (io_schedule+0x20/0x40)
[  187.366644] [<c015823c>] (io_schedule) from [<c0220acc>]
(wait_on_page_bit+0x120/0x144)
[  187.366693] [<c0220acc>] (wait_on_page_bit) from [<c0220bc4>]
(__filemap_fdatawait_range+0xd4/0x114)
[  187.366745] [<c0220bc4>] (__filemap_fdatawait_range) from [<c0223784>]
(file_write_and_wait_range+0x98/0xb4)
[  187.366799] [<c0223784>] (file_write_and_wait_range) from [<c02c0300>]
(blkdev_fsync+0x2c/0x5c)
[  187.366857] [<c02c0300>] (blkdev_fsync) from [<c02b7ce8>]
(vfs_fsync_range+0x4c/0xb0)
[  187.366902] [<c02b7ce8>] (vfs_fsync_range) from [<c02b7dd4>] (do_fsync+0x4c/0x74)
[  187.366944] [<c02b7dd4>] (do_fsync) from [<c02b80a0>] (SyS_fsync+0x1c/0x20)
[  187.366987] [<c02b80a0>] (SyS_fsync) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[  187.367008] Exception stack(0xd5631fa8 to 0xd5631ff0)
[  187.367041] 1fa0:                   00480fc8 00481108 00000004 00481108
00000000 00000000
[  187.367078] 1fc0: 00480fc8 00481108 00000000 00000076 00000000 7f2bb750
00483268 bec602e8
[  187.367103] 1fe0: 00000076 bec60288 b6de5e8b b6d67cf6

-- 
Dmitry

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 06/22] mmc: block: Add blk-mq support
  2018-02-21 20:50   ` Dmitry Osipenko
@ 2018-02-22  7:42     ` Adrian Hunter
  2018-02-22 17:54       ` Dmitry Osipenko
  0 siblings, 1 reply; 42+ messages in thread
From: Adrian Hunter @ 2018-02-22  7:42 UTC (permalink / raw)
  To: Dmitry Osipenko, Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy

On 21/02/18 22:50, Dmitry Osipenko wrote:
> On 29.11.2017 16:41, Adrian Hunter wrote:
>> Define and use a blk-mq queue. Discards and flushes are processed
>> synchronously, but reads and writes asynchronously. In order to support
>> slow DMA unmapping, DMA unmapping is not done until after the next request
>> is started. That means the request is not completed until then. If there is
>> no next request then the completion is done by queued work.
> 
> Hello,
> 
> I'm using (running linux-next and doing some upstream development for) an
> old NVIDIA Tegra tablet that has built-in (internal) and external MMCs,
> and with blk-mq enabled I'm observing a soft lockup. The lockup is
> reproducible quite reliably by running fsck on any MMC partition;
> sometimes the kernel locks up on boot while probing the partition table
> (weirdly, only when both SDHCIs are present, i.e. internal storage
> enabled in DT and external SD inserted/enabled), and it also locks up
> pretty quickly under just general use. Reverting the mmc/ commits up to
> 1bec43a3b18 ("Remove option not to use blk-mq") and disabling
> CONFIG_MMC_MQ_DEFAULT makes everything work fine again. There is also a
> third SDHCI populated with a built-in WiFi/Bluetooth SDIO card, and I'm
> observing odd MMC timeouts with blk-mq enabled; disabling
> CONFIG_MMC_MQ_DEFAULT fixes these timeouts as well.
> 
> Any thoughts?

SDIO (unless it is a combo card) should be unaffected by changes to the
block driver.

I don't have any ideas.  Adding more NVIDIA people.

> 
> [...]
> 
> Soft lockup issue
> ========================
> 
> # fsck -f /dev/disk/by-uuid/6768309f-3545-49d5-9ac7-d5be24d35ef2
> [...]
> 
> [  125.925436] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
> [  125.925496]       Not tainted
> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
> [  125.925530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> [  125.925562] kworker/0:3H    D    0   263      2 0x00000000
> [  125.925653] Workqueue: kblockd mmc_blk_mq_complete_work
> [  125.925747] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
> [  125.925805] [<c0b998f0>] (schedule) from [<c086c870>]
> (__mmc_claim_host+0xdc/0x24c)
> [  125.925849] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]

That claim host should not be there.  Here is a fix for that:

diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
index 908e4db03535..62049f95116b 100644
--- a/drivers/mmc/core/mmc_ops.c
+++ b/drivers/mmc/core/mmc_ops.c
@@ -932,9 +932,7 @@ static int mmc_read_bkops_status(struct mmc_card *card)
 	int err;
 	u8 *ext_csd;

-	mmc_claim_host(card->host);
 	err = mmc_get_ext_csd(card, &ext_csd);
-	mmc_release_host(card->host);
 	if (err)
 		return err;

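To spell out why that deadlocks, as I read the trace: the completion work
runs while the host is still claimed by the block queue's context, so the
nested mmc_claim_host() in mmc_read_bkops_status() is a different claimer
waiting for a release that can only happen after this very completion has
finished.  The call chain from the trace boils down to:

  mmc_blk_mq_complete_work()
    mmc_blk_mq_complete_prev_req()    /* host claimed by the queue context */
      mmc_blk_urgent_bkops()
        mmc_start_bkops()
          mmc_read_bkops_status()
            mmc_claim_host()          /* different claimer -> waits forever */
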
> [...]

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 06/22] mmc: block: Add blk-mq support
  2018-02-22  7:42     ` Adrian Hunter
@ 2018-02-22 17:54       ` Dmitry Osipenko
  2018-02-26 21:48         ` Dmitry Osipenko
  0 siblings, 1 reply; 42+ messages in thread
From: Dmitry Osipenko @ 2018-02-22 17:54 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy

On 22.02.2018 10:42, Adrian Hunter wrote:
> On 21/02/18 22:50, Dmitry Osipenko wrote:
>> On 29.11.2017 16:41, Adrian Hunter wrote:
>>> Define and use a blk-mq queue. Discards and flushes are processed
>>> synchronously, but reads and writes asynchronously. In order to support
>>> slow DMA unmapping, DMA unmapping is not done until after the next request
>>> is started. That means the request is not completed until then. If there is
>>> no next request then the completion is done by queued work.
>>
>> Hello,
>>
>> I'm using (running linux-next and doing some upstream development for) an
>> old NVIDIA Tegra tablet that has built-in (internal) and external MMCs,
>> and with blk-mq enabled I'm observing a soft lockup. The lockup is
>> reproducible quite reliably by running fsck on any MMC partition;
>> sometimes the kernel locks up on boot while probing the partition table
>> (weirdly, only when both SDHCIs are present, i.e. internal storage
>> enabled in DT and external SD inserted/enabled), and it also locks up
>> pretty quickly under just general use. Reverting the mmc/ commits up to
>> 1bec43a3b18 ("Remove option not to use blk-mq") and disabling
>> CONFIG_MMC_MQ_DEFAULT makes everything work fine again. There is also a
>> third SDHCI populated with a built-in WiFi/Bluetooth SDIO card, and I'm
>> observing odd MMC timeouts with blk-mq enabled; disabling
>> CONFIG_MMC_MQ_DEFAULT fixes these timeouts as well.
>>
>> Any thoughts?
> 
> SDIO (unless it is a combo card) should be unaffected by changes to the
> block driver.
> 
> I don't have any ideas.  Adding more NVIDIA people.
> 
>>
>> [...]
>>
>> Soft lockup issue
>> ========================
>>
>> # fsck -f /dev/disk/by-uuid/6768309f-3545-49d5-9ac7-d5be24d35ef2
>> [...]
>>
>> [  125.925436] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
>> [  125.925496]       Not tainted
>> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
>> [  125.925530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
>> message.
>> [  125.925562] kworker/0:3H    D    0   263      2 0x00000000
>> [  125.925653] Workqueue: kblockd mmc_blk_mq_complete_work
>> [  125.925747] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
>> [  125.925805] [<c0b998f0>] (schedule) from [<c086c870>]
>> (__mmc_claim_host+0xdc/0x24c)
>> [  125.925849] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]
> 
> That claim host should not be there.  Here is a fix for that:
> 
> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> index 908e4db03535..62049f95116b 100644
> --- a/drivers/mmc/core/mmc_ops.c
> +++ b/drivers/mmc/core/mmc_ops.c
> @@ -932,9 +932,7 @@ static int mmc_read_bkops_status(struct mmc_card *card)
>  	int err;
>  	u8 *ext_csd;
> 
> -	mmc_claim_host(card->host);
>  	err = mmc_get_ext_csd(card, &ext_csd);
> -	mmc_release_host(card->host);
>  	if (err)
>  		return err;

Looks like this patch fixes all the problems. I'll keep testing it for a couple
of days and then report back the final result. Thank you very much.

>> [...]
>>
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V15 06/22] mmc: block: Add blk-mq support
  2018-02-22 17:54       ` Dmitry Osipenko
@ 2018-02-26 21:48         ` Dmitry Osipenko
  2018-02-27  8:57           ` Linus Walleij
  2018-02-27  9:28           ` Adrian Hunter
  0 siblings, 2 replies; 42+ messages in thread
From: Dmitry Osipenko @ 2018-02-26 21:48 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy

On 22.02.2018 20:54, Dmitry Osipenko wrote:
> On 22.02.2018 10:42, Adrian Hunter wrote:
>> On 21/02/18 22:50, Dmitry Osipenko wrote:
>>> On 29.11.2017 16:41, Adrian Hunter wrote:
>>>> Define and use a blk-mq queue. Discards and flushes are processed
>>>> synchronously, but reads and writes asynchronously. In order to support
>>>> slow DMA unmapping, DMA unmapping is not done until after the next request
>>>> is started. That means the request is not completed until then. If there is
>>>> no next request then the completion is done by queued work.
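
A rough sketch of the deferred-completion scheme described in the quoted
text (hedged: the sketch_* helper names are hypothetical, not the driver's
actual code). The point is that the next transfer is started before the
previous request pays its DMA-unmap cost:

#include <linux/blk-mq.h>

/* hypothetical helpers, standing in for host start and DMA unmap */
static void sketch_start_transfer(struct request *rq);
static void sketch_dma_unmap(struct request *rq);

struct sketch_mq {
	struct request *complete_req;	/* done on the card, DMA still mapped */
};

static void sketch_issue_rq(struct sketch_mq *mq, struct request *next)
{
	sketch_start_transfer(next);	/* kick off the new I/O first */

	if (mq->complete_req) {
		/* the slow unmap now overlaps the transfer just started */
		sketch_dma_unmap(mq->complete_req);
		blk_mq_end_request(mq->complete_req, BLK_STS_OK);
	}
	mq->complete_req = next;
}

/* If no next request arrives, queued work performs the same
 * unmap-and-complete step for the last outstanding request. */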
>>>
>>> Hello,
>>>
>>> I'm using (running linux-next and doing some upstream development for) an old
>>> NVIDIA Tegra tablet that has built-in (internal) and external MMCs, and with
>>> blk-mq enabled I'm observing a soft lockup. The lockup is reproducible quite
>>> reliably by running fsck on any MMC partition; sometimes the kernel locks up
>>> on boot while probing the partition table (weirdly, only when both SDHCIs are
>>> present, i.e. internal storage enabled in DT and an external SD
>>> inserted/enabled), and it also locks up pretty quickly under general use.
>>> Reverting mmc/ commits up to 1bec43a3b18 ("Remove option not to use
>>> blk-mq") and disabling CONFIG_MMC_MQ_DEFAULT makes everything work fine
>>> again. There is also a third SDHCI populated with a built-in WiFi/Bluetooth
>>> SDIO card, and I'm observing odd MMC timeouts with blk-mq enabled; disabling
>>> CONFIG_MMC_MQ_DEFAULT fixes these timeouts as well.
>>>
>>> Any thoughts?
>>
>> SDIO (unless it is a combo card) should be unaffected by changes to the
>> block driver.

I don't know whether it's a combo card or not. Where can I find info about
that? Is it mentioned in sysfs somewhere? Alternatively, you could take a brief
look at what the brcmfmac driver does; maybe that will tell you immediately
whether blk-mq affects it or not. And if it's not affected, then it could be
that there is some other issue that is masked by a properly working block driver.

>> I don't have any ideas.  Adding more NVIDIA people.
>>>
>>> WiFi issue
>>> ========================
>>>
>>> [   38.247006] mmc2: Timeout waiting for hardware interrupt.
>>> [   38.247027] brcmfmac: brcmf_escan_timeout: timer expired
>>> [   38.247036] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
>>> [   38.247047] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
>>> [   38.247055] mmc2: sdhci: Blk size:  0x00007008 | Blk cnt:  0x00000000
>>> [   38.247062] mmc2: sdhci: Argument:  0x21000008 | Trn mode: 0x00000013
>>> [   38.247070] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
>>> [   38.247077] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
>>> [   38.247084] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
>>> [   38.247091] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
>>> [   38.247098] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
>>> [   38.247105] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
>>> [   38.247112] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
>>> [   38.247119] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
>>> [   38.247126] mmc2: sdhci: Resp[0]:   0x00001800 | Resp[1]:  0x08002db5
>>> [   38.247133] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
>>> [   38.247139] mmc2: sdhci: Host ctl2: 0x00000000
>>> [   38.247146] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
>>> [   38.247152] mmc2: sdhci: ============================================
>>> [   38.247250] brcmfmac: brcmf_sdio_readframes: read 520 bytes from channel 1
>>> failed: -84
>>> [   38.247274] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
>>> [   40.807019] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
>>> [   40.807042] brcmfmac: brcmf_notify_escan_complete: Scan abort failed
>>> [   48.487007] mmc2: Timeout waiting for hardware interrupt.
>>> [   48.487057] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
>>> [   48.487096] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
>>> [   48.487128] mmc2: sdhci: Blk size:  0x00007040 | Blk cnt:  0x00000001
>>> [   48.487160] mmc2: sdhci: Argument:  0x21000040 | Trn mode: 0x00000013
>>> [   48.487191] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
>>> [   48.487221] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
>>> [   48.487251] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
>>> [   48.487281] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
>>> [   48.487313] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
>>> [   48.487343] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
>>> [   48.487374] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
>>> [   48.487404] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
>>> [   48.487435] mmc2: sdhci: Resp[0]:   0x00001000 | Resp[1]:  0x08002db5
>>> [   48.487466] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
>>> [   48.487493] mmc2: sdhci: Host ctl2: 0x00000000
>>> [   48.487525] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
>>> [   48.487552] mmc2: sdhci: ============================================
>>> [   48.487749] brcmfmac: brcmf_sdio_readframes: read 480 bytes from channel 1
>>> failed: -84
>>> [   48.487822] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
>>>
>>>
>>> Soft lockup issue
>>> ========================
>>>
>>> # fsck -f /dev/disk/by-uuid/6768309f-3545-49d5-9ac7-d5be24d35ef2
>>> fsck from util-linux 2.30.2
>>> e2fsck 1.43.9 (8-Feb-2018)
>>> Pass 1: Checking inodes, blocks, and sizes
>>> Pass 2: Checking directory structure
>>> Pass 3: Checking directory connectivity
>>> Pass 4: Checking reference counts
>>> Pass 5: Checking group summary information
>>> ....
>>>
>>> [  125.925436] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
>>> [  125.925496]       Not tainted
>>> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
>>> [  125.925530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
>>> message.
>>> [  125.925562] kworker/0:3H    D    0   263      2 0x00000000
>>> [  125.925653] Workqueue: kblockd mmc_blk_mq_complete_work
>>> [  125.925747] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
>>> [  125.925805] [<c0b998f0>] (schedule) from [<c086c870>]
>>> (__mmc_claim_host+0xdc/0x24c)
>>> [  125.925849] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]
>>
>> That claim host should not be there.  Here is a fix for that:
>>
>> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
>> index 908e4db03535..62049f95116b 100644
>> --- a/drivers/mmc/core/mmc_ops.c
>> +++ b/drivers/mmc/core/mmc_ops.c
>> @@ -932,9 +932,7 @@ static int mmc_read_bkops_status(struct mmc_card *card)
>>  	int err;
>>  	u8 *ext_csd;
>>
>> -	mmc_claim_host(card->host);
>>  	err = mmc_get_ext_csd(card, &ext_csd);
>> -	mmc_release_host(card->host);
>>  	if (err)
>>  		return err;
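
Why that nested claim deadlocks, as a simplified userspace analogy (hedged;
the real __mmc_claim_host() logic is more involved): with blk-mq the host is
already claimed on behalf of the request being completed, so claiming it
again from the completion path can never succeed, the classic self-deadlock
of a non-recursive lock:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t host_claim = PTHREAD_MUTEX_INITIALIZER;

/* stands in for mmc_read_bkops_status() before the fix */
static void read_bkops_status(void)
{
	pthread_mutex_lock(&host_claim);	/* the claim the patch removes */
	/* ... read the EXT_CSD here ... */
	pthread_mutex_unlock(&host_claim);
}

/* stands in for the completion work, which already holds the claim */
static void complete_work(void)
{
	pthread_mutex_lock(&host_claim);
	read_bkops_status();			/* blocks forever, as in the trace */
	pthread_mutex_unlock(&host_claim);
}

int main(void)
{
	complete_work();			/* never returns */
	puts("unreachable");
	return 0;
}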
> 
> Looks like this patch fixes all the problems. I'll keep testing it for a couple
> of days and then report back the final result. Thank you very much.

This patch fixes the lockup (and the WiFi MMC timeouts); for that fix:

Tested-by: Dmitry Osipenko <digetx@gmail.com>


But still something is wrong... I've been getting occasional EXT4 Oopses,
like the one below, and __wait_on_bit() always figures in the stacktrace.
It never happened with blk-mq disabled, though it could be a coincidence
and actually be unrelated to the blk-mq patches.


[ 6625.992337] Unable to handle kernel NULL pointer dereference at virtual
address 0000001c
[ 6625.993004] pgd = 00b30c03
[ 6625.993257] [0000001c] *pgd=00000000
[ 6625.993594] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 6625.994022] Modules linked in:
[ 6625.994326] CPU: 1 PID: 19355 Comm: dpkg Not tainted
4.16.0-rc2-next-20180220-00095-ge9c9f5689a84-dirty #2090
[ 6625.995078] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[ 6625.995595] PC is at dx_probe+0x68/0x684
[ 6625.995947] LR is at __wait_on_bit+0xac/0xc8
[ 6625.996307] pc : [<c033b960>]    lr : [<c0bfbfd4>]    psr: 800f0013
[ 6625.996806] sp : d55e3df0  ip : c0170e88  fp : d55e3e44
[ 6625.997227] r10: d55e3f4c  r9 : d55e3e70  r8 : 00000000
[ 6625.997650] r7 : c4e13240  r6 : 00000000  r5 : d657db18  r4 : d55e3e8c
[ 6625.998165] r3 : 0000007b  r2 : d5830800  r1 : d5831000  r0 : c4e13240
[ 6625.998686] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 6625.999246] Control: 10c5387d  Table: 0a63004a  DAC: 00000051
[ 6625.999710] Process dpkg (pid: 19355, stack limit = 0x139a48b6)
[ 6626.000184] Stack: (0xd55e3df0 to 0xd55e4000)
[ 6626.000560] 3de0:                                     000002e9 d55e3e00
c0c01964 c0278c70
[ 6626.001209] 3e00: d55e3e24 014000c0 c04f3580 c0c01958 d55e3e90 801a001a
d55e3e3c 00000012
[ 6626.001854] 3e20: d5830800 00000000 d657db18 c24b0000 d55e3e70 d55e3f4c
d55e3edc d55e3e48
[ 6626.002502] 3e40: c033d568 c033b904 00000000 600f0013 c029e640 d6cf3540
ffffe000 00000000
[ 6626.003150] 3e60: 00076e99 d55e3ef4 d55e3e8c d5830800 d409c440 d409c454
00000012 c029e640
[ 6626.003795] 3e80: d55e3ec4 d55e3e90 c02797b4 c4e13240 00000000 00000000
00000000 00000000
[ 6626.004442] 3ea0: 00000000 00000000 00000000 00000000 d409c428 d409c428
d657db18 d409c428
[ 6626.005088] 3ec0: 00000000 c24b0000 ffffff9c d55e3f4c d55e3f14 d55e3ee0
c033d7b0 c033d1b8
[ 6626.005732] 3ee0: c0c01964 c0180050 d55e3f14 d55e3ef8 c029e870 00000000
d409c428 d6546558
[ 6626.006382] 3f00: d55e3f58 00000000 d55e3f34 d55e3f18 c0291f04 c033d764
00000000 00000001
[ 6626.007032] 3f20: 00000000 d55e3f58 d55e3f94 d55e3f38 c0293d70 c0291ea0
d55e3f58 d55e3f4c
[ 6626.007679] 3f40: 00000000 0090abb0 d5467800 00000000 d6dd0110 d6546558
f3bc423c 00000012
[ 6626.008326] 3f60: c24b0019 80808080 00000000 015ce1b0 0090abb0 00d8d670
00000028 c01011e4
[ 6626.008971] 3f80: d55e2000 00000000 d55e3fa4 d55e3f98 c0294544 c0293c44
00000000 d55e3fa8
[ 6626.009620] 3fa0: c0101000 c0294530 015ce1b0 0090abb0 0090abb0 000002a8
7d5a8800 7d5a8800
[ 6626.010264] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
015eb160 004a6c10
[ 6626.010912] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8 600f0030 0090abb0
00000000 00000000
[ 6626.011577] [<c033b960>] (dx_probe) from [<c033d568>]
(ext4_find_entry+0x3bc/0x5ac)
[ 6626.012198] [<c033d568>] (ext4_find_entry) from [<c033d7b0>]
(ext4_lookup+0x58/0x1f4)
[ 6626.012844] [<c033d7b0>] (ext4_lookup) from [<c0291f04>]
(__lookup_hash+0x70/0x9c)
[ 6626.013468] [<c0291f04>] (__lookup_hash) from [<c0293d70>] (do_rmdir+0x138/0x1b8)
[ 6626.014071] [<c0293d70>] (do_rmdir) from [<c0294544>] (SyS_rmdir+0x20/0x24)
[ 6626.014642] [<c0294544>] (SyS_rmdir) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[ 6626.015231] Exception stack(0xd55e3fa8 to 0xd55e3ff0)
[ 6626.015656] 3fa0:                   015ce1b0 0090abb0 0090abb0 000002a8
7d5a8800 7d5a8800
[ 6626.016302] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
015eb160 004a6c10
[ 6626.035930] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8
[ 6626.055341] Code: e1a07000 e5840000 8a000078 e590601c (e5d6301c)
[ 6626.075632] ---[ end trace 034f3552437a92bc ]---


* Re: [PATCH V15 06/22] mmc: block: Add blk-mq support
  2018-02-26 21:48         ` Dmitry Osipenko
@ 2018-02-27  8:57           ` Linus Walleij
  2018-02-27 12:04             ` Dmitry Osipenko
  2018-02-27  9:28           ` Adrian Hunter
  1 sibling, 1 reply; 42+ messages in thread
From: Linus Walleij @ 2018-02-27  8:57 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Adrian Hunter, Ulf Hansson, linux-mmc, linux-block, linux-kernel,
	Bough Chen, Alex Lemberg, Mateusz Nowak, Yuliy Izrailov,
	Jaehoon Chung, Dong Aisheng, Das Asutosh, Zhangfei Gao,
	Sahitya Tummala, Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy

On Mon, Feb 26, 2018 at 10:48 PM, Dmitry Osipenko <digetx@gmail.com> wrote:
> On 22.02.2018 20:54, Dmitry Osipenko wrote:
>> On 22.02.2018 10:42, Adrian Hunter wrote:

>>> SDIO (unless it is a combo card) should be unaffected by changes to the
>>> block driver.
>
> I don't know whether it's a combo card or not. Where I can find info about that?
> Is it mentioned in sysfs somewhere?

Combo cards were used with very old (2000s) PDAs, which had only
one SD card slot that they wanted to use for WiFi and storage
at the same time.

They are very uncommon and I haven't been able to locate any
even for testing.

It is very unlikely that you have one.

However, you would notice one from seeing a partition attachment
message (like with an ordinary SD card) when you plug in your
card.

Yours,
Linus Walleij


* Re: [PATCH V15 06/22] mmc: block: Add blk-mq support
  2018-02-26 21:48         ` Dmitry Osipenko
  2018-02-27  8:57           ` Linus Walleij
@ 2018-02-27  9:28           ` Adrian Hunter
  2018-03-01  8:55             ` EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support) Adrian Hunter
  1 sibling, 1 reply; 42+ messages in thread
From: Adrian Hunter @ 2018-02-27  9:28 UTC (permalink / raw)
  To: Dmitry Osipenko, Ulf Hansson
  Cc: linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy, linux-ext4

On 26/02/18 23:48, Dmitry Osipenko wrote:
> On 22.02.2018 20:54, Dmitry Osipenko wrote:
>> On 22.02.2018 10:42, Adrian Hunter wrote:
>>> On 21/02/18 22:50, Dmitry Osipenko wrote:
>>>> On 29.11.2017 16:41, Adrian Hunter wrote:
>>>>> Define and use a blk-mq queue. Discards and flushes are processed
>>>>> synchronously, but reads and writes asynchronously. In order to support
>>>>> slow DMA unmapping, DMA unmapping is not done until after the next request
>>>>> is started. That means the request is not completed until then. If there is
>>>>> no next request then the completion is done by queued work.
>>>>
>>>> Hello,
>>>>
>>>> I'm using (running linux-next and doing some upstream development for) an old
>>>> NVIDIA Tegra tablet that has built-in (internal) and external MMCs, and with
>>>> blk-mq enabled I'm observing a soft lockup. The lockup is reproducible quite
>>>> reliably by running fsck on any MMC partition; sometimes the kernel locks up
>>>> on boot while probing the partition table (weirdly, only when both SDHCIs are
>>>> present, i.e. internal storage enabled in DT and an external SD
>>>> inserted/enabled), and it also locks up pretty quickly under general use.
>>>> Reverting mmc/ commits up to 1bec43a3b18 ("Remove option not to use
>>>> blk-mq") and disabling CONFIG_MMC_MQ_DEFAULT makes everything work fine
>>>> again. There is also a third SDHCI populated with a built-in WiFi/Bluetooth
>>>> SDIO card, and I'm observing odd MMC timeouts with blk-mq enabled; disabling
>>>> CONFIG_MMC_MQ_DEFAULT fixes these timeouts as well.
>>>>
>>>> Any thoughts?
>>>
>>> SDIO (unless it is a combo card) should be unaffected by changes to the
>>> block driver.
> 
> I don't know whether it's a combo card or not. Where can I find info about
> that? Is it mentioned in sysfs somewhere? Alternatively, you could take a brief
> look at what the brcmfmac driver does; maybe that will tell you immediately
> whether blk-mq affects it or not. And if it's not affected, then it could be
> that there is some other issue that is masked by a properly working block driver.

As Linus wrote, if you had a combo card it would also show up as a block
device, i.e. for mmc2 it would probably be /dev/mmcblk2 if it existed.
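
On the sysfs question: the core does expose a card "type" attribute, and a
combo card reads back as "SDcombo", e.g. under
/sys/bus/mmc/devices/mmc2:<rca>/type. A sketch modeled on mainline
drivers/mmc/core/bus.c (the exact code may differ by kernel version):

static ssize_t type_show(struct device *dev, struct device_attribute *attr,
			 char *buf)
{
	struct mmc_card *card = mmc_dev_to_card(dev);

	switch (card->type) {
	case MMC_TYPE_MMC:
		return sprintf(buf, "MMC\n");
	case MMC_TYPE_SD:
		return sprintf(buf, "SD\n");
	case MMC_TYPE_SDIO:
		return sprintf(buf, "SDIO\n");
	case MMC_TYPE_SD_COMBO:
		return sprintf(buf, "SDcombo\n");
	default:
		return -EFAULT;
	}
}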

> 
>>> I don't have any ideas.  Adding more NVIDIA people.
>>>>
>>>> WiFi issue
>>>> ========================
>>>>
>>>> [   38.247006] mmc2: Timeout waiting for hardware interrupt.
>>>> [   38.247027] brcmfmac: brcmf_escan_timeout: timer expired
>>>> [   38.247036] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
>>>> [   38.247047] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
>>>> [   38.247055] mmc2: sdhci: Blk size:  0x00007008 | Blk cnt:  0x00000000
>>>> [   38.247062] mmc2: sdhci: Argument:  0x21000008 | Trn mode: 0x00000013
>>>> [   38.247070] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
>>>> [   38.247077] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
>>>> [   38.247084] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
>>>> [   38.247091] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
>>>> [   38.247098] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
>>>> [   38.247105] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
>>>> [   38.247112] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
>>>> [   38.247119] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
>>>> [   38.247126] mmc2: sdhci: Resp[0]:   0x00001800 | Resp[1]:  0x08002db5
>>>> [   38.247133] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
>>>> [   38.247139] mmc2: sdhci: Host ctl2: 0x00000000
>>>> [   38.247146] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
>>>> [   38.247152] mmc2: sdhci: ============================================
>>>> [   38.247250] brcmfmac: brcmf_sdio_readframes: read 520 bytes from channel 1
>>>> failed: -84
>>>> [   38.247274] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
>>>> [   40.807019] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
>>>> [   40.807042] brcmfmac: brcmf_notify_escan_complete: Scan abort failed
>>>> [   48.487007] mmc2: Timeout waiting for hardware interrupt.
>>>> [   48.487057] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
>>>> [   48.487096] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
>>>> [   48.487128] mmc2: sdhci: Blk size:  0x00007040 | Blk cnt:  0x00000001
>>>> [   48.487160] mmc2: sdhci: Argument:  0x21000040 | Trn mode: 0x00000013
>>>> [   48.487191] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
>>>> [   48.487221] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
>>>> [   48.487251] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
>>>> [   48.487281] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
>>>> [   48.487313] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
>>>> [   48.487343] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
>>>> [   48.487374] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
>>>> [   48.487404] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
>>>> [   48.487435] mmc2: sdhci: Resp[0]:   0x00001000 | Resp[1]:  0x08002db5
>>>> [   48.487466] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
>>>> [   48.487493] mmc2: sdhci: Host ctl2: 0x00000000
>>>> [   48.487525] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
>>>> [   48.487552] mmc2: sdhci: ============================================
>>>> [   48.487749] brcmfmac: brcmf_sdio_readframes: read 480 bytes from channel 1
>>>> failed: -84
>>>> [   48.487822] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
>>>>
>>>>
>>>> Soft lockup issue
>>>> ========================
>>>>
>>>> # fsck -f /dev/disk/by-uuid/6768309f-3545-49d5-9ac7-d5be24d35ef2
>>>> fsck from util-linux 2.30.2
>>>> e2fsck 1.43.9 (8-Feb-2018)
>>>> Pass 1: Checking inodes, blocks, and sizes
>>>> Pass 2: Checking directory structure
>>>> Pass 3: Checking directory connectivity
>>>> Pass 4: Checking reference counts
>>>> Pass 5: Checking group summary information
>>>> ....
>>>>
>>>> [  125.925436] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
>>>> [  125.925496]       Not tainted
>>>> 4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
>>>> [  125.925530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
>>>> message.
>>>> [  125.925562] kworker/0:3H    D    0   263      2 0x00000000
>>>> [  125.925653] Workqueue: kblockd mmc_blk_mq_complete_work
>>>> [  125.925747] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
>>>> [  125.925805] [<c0b998f0>] (schedule) from [<c086c870>]
>>>> (__mmc_claim_host+0xdc/0x24c)
>>>> [  125.925849] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]
>>>
>>> That claim host should not be there.  Here is a fix for that:
>>>
>>> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
>>> index 908e4db03535..62049f95116b 100644
>>> --- a/drivers/mmc/core/mmc_ops.c
>>> +++ b/drivers/mmc/core/mmc_ops.c
>>> @@ -932,9 +932,7 @@ static int mmc_read_bkops_status(struct mmc_card *card)
>>>  	int err;
>>>  	u8 *ext_csd;
>>>
>>> -	mmc_claim_host(card->host);
>>>  	err = mmc_get_ext_csd(card, &ext_csd);
>>> -	mmc_release_host(card->host);
>>>  	if (err)
>>>  		return err;
>>
>> Looks like this patch fixes all the problems. I'll keep testing it for a couple
>> of days and then report back the final result. Thank you very much.
> 
> This patch fixes the lockup (and the WiFi MMC timeouts); for that fix:
> 
> Tested-by: Dmitry Osipenko <digetx@gmail.com>
> 
> 
> But still something is wrong... I've been getting occasional EXT4 Oopses,
> like the one below, and __wait_on_bit() always figures in the stacktrace.
> It never happened with blk-mq disabled, though it could be a coincidence
> and actually be unrelated to the blk-mq patches.

I can't think how an IO driver could cause that.

cc'ing ext4 mailing list for more advice.

> 
> 
> [ 6625.992337] Unable to handle kernel NULL pointer dereference at virtual
> address 0000001c
> [ 6625.993004] pgd = 00b30c03
> [ 6625.993257] [0000001c] *pgd=00000000
> [ 6625.993594] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
> [ 6625.994022] Modules linked in:
> [ 6625.994326] CPU: 1 PID: 19355 Comm: dpkg Not tainted
> 4.16.0-rc2-next-20180220-00095-ge9c9f5689a84-dirty #2090
> [ 6625.995078] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> [ 6625.995595] PC is at dx_probe+0x68/0x684
> [ 6625.995947] LR is at __wait_on_bit+0xac/0xc8
> [ 6625.996307] pc : [<c033b960>]    lr : [<c0bfbfd4>]    psr: 800f0013
> [ 6625.996806] sp : d55e3df0  ip : c0170e88  fp : d55e3e44
> [ 6625.997227] r10: d55e3f4c  r9 : d55e3e70  r8 : 00000000
> [ 6625.997650] r7 : c4e13240  r6 : 00000000  r5 : d657db18  r4 : d55e3e8c
> [ 6625.998165] r3 : 0000007b  r2 : d5830800  r1 : d5831000  r0 : c4e13240
> [ 6625.998686] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [ 6625.999246] Control: 10c5387d  Table: 0a63004a  DAC: 00000051
> [ 6625.999710] Process dpkg (pid: 19355, stack limit = 0x139a48b6)
> [ 6626.000184] Stack: (0xd55e3df0 to 0xd55e4000)
> [ 6626.000560] 3de0:                                     000002e9 d55e3e00
> c0c01964 c0278c70
> [ 6626.001209] 3e00: d55e3e24 014000c0 c04f3580 c0c01958 d55e3e90 801a001a
> d55e3e3c 00000012
> [ 6626.001854] 3e20: d5830800 00000000 d657db18 c24b0000 d55e3e70 d55e3f4c
> d55e3edc d55e3e48
> [ 6626.002502] 3e40: c033d568 c033b904 00000000 600f0013 c029e640 d6cf3540
> ffffe000 00000000
> [ 6626.003150] 3e60: 00076e99 d55e3ef4 d55e3e8c d5830800 d409c440 d409c454
> 00000012 c029e640
> [ 6626.003795] 3e80: d55e3ec4 d55e3e90 c02797b4 c4e13240 00000000 00000000
> 00000000 00000000
> [ 6626.004442] 3ea0: 00000000 00000000 00000000 00000000 d409c428 d409c428
> d657db18 d409c428
> [ 6626.005088] 3ec0: 00000000 c24b0000 ffffff9c d55e3f4c d55e3f14 d55e3ee0
> c033d7b0 c033d1b8
> [ 6626.005732] 3ee0: c0c01964 c0180050 d55e3f14 d55e3ef8 c029e870 00000000
> d409c428 d6546558
> [ 6626.006382] 3f00: d55e3f58 00000000 d55e3f34 d55e3f18 c0291f04 c033d764
> 00000000 00000001
> [ 6626.007032] 3f20: 00000000 d55e3f58 d55e3f94 d55e3f38 c0293d70 c0291ea0
> d55e3f58 d55e3f4c
> [ 6626.007679] 3f40: 00000000 0090abb0 d5467800 00000000 d6dd0110 d6546558
> f3bc423c 00000012
> [ 6626.008326] 3f60: c24b0019 80808080 00000000 015ce1b0 0090abb0 00d8d670
> 00000028 c01011e4
> [ 6626.008971] 3f80: d55e2000 00000000 d55e3fa4 d55e3f98 c0294544 c0293c44
> 00000000 d55e3fa8
> [ 6626.009620] 3fa0: c0101000 c0294530 015ce1b0 0090abb0 0090abb0 000002a8
> 7d5a8800 7d5a8800
> [ 6626.010264] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
> 015eb160 004a6c10
> [ 6626.010912] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8 600f0030 0090abb0
> 00000000 00000000
> [ 6626.011577] [<c033b960>] (dx_probe) from [<c033d568>]
> (ext4_find_entry+0x3bc/0x5ac)
> [ 6626.012198] [<c033d568>] (ext4_find_entry) from [<c033d7b0>]
> (ext4_lookup+0x58/0x1f4)
> [ 6626.012844] [<c033d7b0>] (ext4_lookup) from [<c0291f04>]
> (__lookup_hash+0x70/0x9c)
> [ 6626.013468] [<c0291f04>] (__lookup_hash) from [<c0293d70>] (do_rmdir+0x138/0x1b8)
> [ 6626.014071] [<c0293d70>] (do_rmdir) from [<c0294544>] (SyS_rmdir+0x20/0x24)
> [ 6626.014642] [<c0294544>] (SyS_rmdir) from [<c0101000>]
> (ret_fast_syscall+0x0/0x54)
> [ 6626.015231] Exception stack(0xd55e3fa8 to 0xd55e3ff0)
> [ 6626.015656] 3fa0:                   015ce1b0 0090abb0 0090abb0 000002a8
> 7d5a8800 7d5a8800
> [ 6626.016302] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
> 015eb160 004a6c10
> [ 6626.035930] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8
> [ 6626.055341] Code: e1a07000 e5840000 8a000078 e590601c (e5d6301c)
> [ 6626.075632] ---[ end trace 034f3552437a92bc ]---
> 


* Re: [PATCH V15 06/22] mmc: block: Add blk-mq support
  2018-02-27  8:57           ` Linus Walleij
@ 2018-02-27 12:04             ` Dmitry Osipenko
  0 siblings, 0 replies; 42+ messages in thread
From: Dmitry Osipenko @ 2018-02-27 12:04 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Adrian Hunter, Ulf Hansson, linux-mmc, linux-block, linux-kernel,
	Bough Chen, Alex Lemberg, Mateusz Nowak, Yuliy Izrailov,
	Jaehoon Chung, Dong Aisheng, Das Asutosh, Zhangfei Gao,
	Sahitya Tummala, Harjani Ritesh, Venu Byravarasu, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy

On 27.02.2018 11:57, Linus Walleij wrote:
> On Mon, Feb 26, 2018 at 10:48 PM, Dmitry Osipenko <digetx@gmail.com> wrote:
>> On 22.02.2018 20:54, Dmitry Osipenko wrote:
>>> On 22.02.2018 10:42, Adrian Hunter wrote:
> 
>>>> SDIO (unless it is a combo card) should be unaffected by changes to the
>>>> block driver.
>>
>> I don't know whether it's a combo card or not. Where can I find info about that?
>> Is it mentioned in sysfs somewhere?
> 
> Combo cards were used with very old (2000s) PDAs, which had only
> one SD card slot that they wanted to use for WiFi and storage
> at the same time.
> 
> They are very uncommon and I haven't been able to locate any
> even for testing.
> 
> It is very unlikely that you have one.
> 
> However, you would notice one from seeing a partition attachment
> message (like with an ordinary SD card) when you plug in your
> card.

Thank you very much for the explanation. It's not a combo card.


* EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
  2018-02-27  9:28           ` Adrian Hunter
@ 2018-03-01  8:55             ` Adrian Hunter
  2018-03-01  9:15               ` Jose R R
  2018-03-01 16:04               ` Theodore Ts'o
  0 siblings, 2 replies; 42+ messages in thread
From: Adrian Hunter @ 2018-03-01  8:55 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: Dmitry Osipenko, Ulf Hansson, linux-mmc, linux-block,
	linux-kernel, Bough Chen, Alex Lemberg, Mateusz Nowak,
	Yuliy Izrailov, Jaehoon Chung, Dong Aisheng, Das Asutosh,
	Zhangfei Gao, Sahitya Tummala, Harjani Ritesh, Venu Byravarasu,
	Linus Walleij, Shawn Lin, Bartlomiej Zolnierkiewicz,
	Christoph Hellwig, Thierry Reding, Krishna Reddy, linux-ext4

On 27/02/18 11:28, Adrian Hunter wrote:
> On 26/02/18 23:48, Dmitry Osipenko wrote:
>> But still something is wrong... I've been getting occasional EXT4 Oopses,
>> like the one below, and __wait_on_bit() always figures in the stacktrace.
>> It never happened with blk-mq disabled, though it could be a coincidence
>> and actually be unrelated to the blk-mq patches.
> 
> I can't think how an IO driver could cause that.
> 
> cc'ing ext4 mailing list for more advice.

+ Ted and Andreas

> 
>>
>>
>> [ 6625.992337] Unable to handle kernel NULL pointer dereference at virtual
>> address 0000001c
>> [ 6625.993004] pgd = 00b30c03
>> [ 6625.993257] [0000001c] *pgd=00000000
>> [ 6625.993594] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
>> [ 6625.994022] Modules linked in:
>> [ 6625.994326] CPU: 1 PID: 19355 Comm: dpkg Not tainted
>> 4.16.0-rc2-next-20180220-00095-ge9c9f5689a84-dirty #2090
>> [ 6625.995078] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>> [ 6625.995595] PC is at dx_probe+0x68/0x684
>> [ 6625.995947] LR is at __wait_on_bit+0xac/0xc8
>> [ 6625.996307] pc : [<c033b960>]    lr : [<c0bfbfd4>]    psr: 800f0013
>> [ 6625.996806] sp : d55e3df0  ip : c0170e88  fp : d55e3e44
>> [ 6625.997227] r10: d55e3f4c  r9 : d55e3e70  r8 : 00000000
>> [ 6625.997650] r7 : c4e13240  r6 : 00000000  r5 : d657db18  r4 : d55e3e8c
>> [ 6625.998165] r3 : 0000007b  r2 : d5830800  r1 : d5831000  r0 : c4e13240
>> [ 6625.998686] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>> [ 6625.999246] Control: 10c5387d  Table: 0a63004a  DAC: 00000051
>> [ 6625.999710] Process dpkg (pid: 19355, stack limit = 0x139a48b6)
>> [ 6626.000184] Stack: (0xd55e3df0 to 0xd55e4000)
>> [ 6626.000560] 3de0:                                     000002e9 d55e3e00
>> c0c01964 c0278c70
>> [ 6626.001209] 3e00: d55e3e24 014000c0 c04f3580 c0c01958 d55e3e90 801a001a
>> d55e3e3c 00000012
>> [ 6626.001854] 3e20: d5830800 00000000 d657db18 c24b0000 d55e3e70 d55e3f4c
>> d55e3edc d55e3e48
>> [ 6626.002502] 3e40: c033d568 c033b904 00000000 600f0013 c029e640 d6cf3540
>> ffffe000 00000000
>> [ 6626.003150] 3e60: 00076e99 d55e3ef4 d55e3e8c d5830800 d409c440 d409c454
>> 00000012 c029e640
>> [ 6626.003795] 3e80: d55e3ec4 d55e3e90 c02797b4 c4e13240 00000000 00000000
>> 00000000 00000000
>> [ 6626.004442] 3ea0: 00000000 00000000 00000000 00000000 d409c428 d409c428
>> d657db18 d409c428
>> [ 6626.005088] 3ec0: 00000000 c24b0000 ffffff9c d55e3f4c d55e3f14 d55e3ee0
>> c033d7b0 c033d1b8
>> [ 6626.005732] 3ee0: c0c01964 c0180050 d55e3f14 d55e3ef8 c029e870 00000000
>> d409c428 d6546558
>> [ 6626.006382] 3f00: d55e3f58 00000000 d55e3f34 d55e3f18 c0291f04 c033d764
>> 00000000 00000001
>> [ 6626.007032] 3f20: 00000000 d55e3f58 d55e3f94 d55e3f38 c0293d70 c0291ea0
>> d55e3f58 d55e3f4c
>> [ 6626.007679] 3f40: 00000000 0090abb0 d5467800 00000000 d6dd0110 d6546558
>> f3bc423c 00000012
>> [ 6626.008326] 3f60: c24b0019 80808080 00000000 015ce1b0 0090abb0 00d8d670
>> 00000028 c01011e4
>> [ 6626.008971] 3f80: d55e2000 00000000 d55e3fa4 d55e3f98 c0294544 c0293c44
>> 00000000 d55e3fa8
>> [ 6626.009620] 3fa0: c0101000 c0294530 015ce1b0 0090abb0 0090abb0 000002a8
>> 7d5a8800 7d5a8800
>> [ 6626.010264] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
>> 015eb160 004a6c10
>> [ 6626.010912] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8 600f0030 0090abb0
>> 00000000 00000000
>> [ 6626.011577] [<c033b960>] (dx_probe) from [<c033d568>]
>> (ext4_find_entry+0x3bc/0x5ac)
>> [ 6626.012198] [<c033d568>] (ext4_find_entry) from [<c033d7b0>]
>> (ext4_lookup+0x58/0x1f4)
>> [ 6626.012844] [<c033d7b0>] (ext4_lookup) from [<c0291f04>]
>> (__lookup_hash+0x70/0x9c)
>> [ 6626.013468] [<c0291f04>] (__lookup_hash) from [<c0293d70>] (do_rmdir+0x138/0x1b8)
>> [ 6626.014071] [<c0293d70>] (do_rmdir) from [<c0294544>] (SyS_rmdir+0x20/0x24)
>> [ 6626.014642] [<c0294544>] (SyS_rmdir) from [<c0101000>]
>> (ret_fast_syscall+0x0/0x54)
>> [ 6626.015231] Exception stack(0xd55e3fa8 to 0xd55e3ff0)
>> [ 6626.015656] 3fa0:                   015ce1b0 0090abb0 0090abb0 000002a8
>> 7d5a8800 7d5a8800
>> [ 6626.016302] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
>> 015eb160 004a6c10
>> [ 6626.035930] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8
>> [ 6626.055341] Code: e1a07000 e5840000 8a000078 e590601c (e5d6301c)
>> [ 6626.075632] ---[ end trace 034f3552437a92bc ]---
>>
> 
> 


* Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
  2018-03-01  8:55             ` EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support) Adrian Hunter
@ 2018-03-01  9:15               ` Jose R R
  2018-03-01 16:07                 ` Theodore Ts'o
  2018-03-01 16:04               ` Theodore Ts'o
  1 sibling, 1 reply; 42+ messages in thread
From: Jose R R @ 2018-03-01  9:15 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Theodore Ts'o, Andreas Dilger, Dmitry Osipenko, Ulf Hansson,
	linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy, linux-ext4

On Thu, Mar 1, 2018 at 12:55 AM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 27/02/18 11:28, Adrian Hunter wrote:
>> On 26/02/18 23:48, Dmitry Osipenko wrote:
>>> But still something is wrong... I've been getting occasional EXT4 Oopses,
>>> like the one below, and __wait_on_bit() always figures in the stacktrace.
>>> It never happened with blk-mq disabled, though it could be a coincidence
>>> and actually be unrelated to the blk-mq patches.
>>
>> I can't think how an IO driver could cause that.

Probably it is not wise to place all your eggs (data) in one basket
(ext4); why not diversify to viable alternatives which won't be affected
by the UNIX year-2038 date problem?
< https://metztli.it/blog/index.php/amatl/reiser-nahui/reiser4-filesystem-and-the-unix
>

>>
>> cc'ing ext4 mailing list for more advice.
>
> + Ted and Andreas
>
>>
>>>
>>>
>>> [ 6625.992337] Unable to handle kernel NULL pointer dereference at virtual
>>> address 0000001c
>>> [ 6625.993004] pgd = 00b30c03
>>> [ 6625.993257] [0000001c] *pgd=00000000
>>> [ 6625.993594] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
>>> [ 6625.994022] Modules linked in:
>>> [ 6625.994326] CPU: 1 PID: 19355 Comm: dpkg Not tainted
>>> 4.16.0-rc2-next-20180220-00095-ge9c9f5689a84-dirty #2090
>>> [ 6625.995078] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>>> [ 6625.995595] PC is at dx_probe+0x68/0x684
>>> [ 6625.995947] LR is at __wait_on_bit+0xac/0xc8
>>> [ 6625.996307] pc : [<c033b960>]    lr : [<c0bfbfd4>]    psr: 800f0013
>>> [ 6625.996806] sp : d55e3df0  ip : c0170e88  fp : d55e3e44
>>> [ 6625.997227] r10: d55e3f4c  r9 : d55e3e70  r8 : 00000000
>>> [ 6625.997650] r7 : c4e13240  r6 : 00000000  r5 : d657db18  r4 : d55e3e8c
>>> [ 6625.998165] r3 : 0000007b  r2 : d5830800  r1 : d5831000  r0 : c4e13240
>>> [ 6625.998686] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>>> [ 6625.999246] Control: 10c5387d  Table: 0a63004a  DAC: 00000051
>>> [ 6625.999710] Process dpkg (pid: 19355, stack limit = 0x139a48b6)
>>> [ 6626.000184] Stack: (0xd55e3df0 to 0xd55e4000)
>>> [ 6626.000560] 3de0:                                     000002e9 d55e3e00
>>> c0c01964 c0278c70
>>> [ 6626.001209] 3e00: d55e3e24 014000c0 c04f3580 c0c01958 d55e3e90 801a001a
>>> d55e3e3c 00000012
>>> [ 6626.001854] 3e20: d5830800 00000000 d657db18 c24b0000 d55e3e70 d55e3f4c
>>> d55e3edc d55e3e48
>>> [ 6626.002502] 3e40: c033d568 c033b904 00000000 600f0013 c029e640 d6cf3540
>>> ffffe000 00000000
>>> [ 6626.003150] 3e60: 00076e99 d55e3ef4 d55e3e8c d5830800 d409c440 d409c454
>>> 00000012 c029e640
>>> [ 6626.003795] 3e80: d55e3ec4 d55e3e90 c02797b4 c4e13240 00000000 00000000
>>> 00000000 00000000
>>> [ 6626.004442] 3ea0: 00000000 00000000 00000000 00000000 d409c428 d409c428
>>> d657db18 d409c428
>>> [ 6626.005088] 3ec0: 00000000 c24b0000 ffffff9c d55e3f4c d55e3f14 d55e3ee0
>>> c033d7b0 c033d1b8
>>> [ 6626.005732] 3ee0: c0c01964 c0180050 d55e3f14 d55e3ef8 c029e870 00000000
>>> d409c428 d6546558
>>> [ 6626.006382] 3f00: d55e3f58 00000000 d55e3f34 d55e3f18 c0291f04 c033d764
>>> 00000000 00000001
>>> [ 6626.007032] 3f20: 00000000 d55e3f58 d55e3f94 d55e3f38 c0293d70 c0291ea0
>>> d55e3f58 d55e3f4c
>>> [ 6626.007679] 3f40: 00000000 0090abb0 d5467800 00000000 d6dd0110 d6546558
>>> f3bc423c 00000012
>>> [ 6626.008326] 3f60: c24b0019 80808080 00000000 015ce1b0 0090abb0 00d8d670
>>> 00000028 c01011e4
>>> [ 6626.008971] 3f80: d55e2000 00000000 d55e3fa4 d55e3f98 c0294544 c0293c44
>>> 00000000 d55e3fa8
>>> [ 6626.009620] 3fa0: c0101000 c0294530 015ce1b0 0090abb0 0090abb0 000002a8
>>> 7d5a8800 7d5a8800
>>> [ 6626.010264] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
>>> 015eb160 004a6c10
>>> [ 6626.010912] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8 600f0030 0090abb0
>>> 00000000 00000000
>>> [ 6626.011577] [<c033b960>] (dx_probe) from [<c033d568>]
>>> (ext4_find_entry+0x3bc/0x5ac)
>>> [ 6626.012198] [<c033d568>] (ext4_find_entry) from [<c033d7b0>]
>>> (ext4_lookup+0x58/0x1f4)
>>> [ 6626.012844] [<c033d7b0>] (ext4_lookup) from [<c0291f04>]
>>> (__lookup_hash+0x70/0x9c)
>>> [ 6626.013468] [<c0291f04>] (__lookup_hash) from [<c0293d70>] (do_rmdir+0x138/0x1b8)
>>> [ 6626.014071] [<c0293d70>] (do_rmdir) from [<c0294544>] (SyS_rmdir+0x20/0x24)
>>> [ 6626.014642] [<c0294544>] (SyS_rmdir) from [<c0101000>]
>>> (ret_fast_syscall+0x0/0x54)
>>> [ 6626.015231] Exception stack(0xd55e3fa8 to 0xd55e3ff0)
>>> [ 6626.015656] 3fa0:                   015ce1b0 0090abb0 0090abb0 000002a8
>>> 7d5a8800 7d5a8800
>>> [ 6626.016302] 3fc0: 015ce1b0 0090abb0 00d8d670 00000028 0048eb80 00487344
>>> 015eb160 004a6c10
>>> [ 6626.035930] 3fe0: 004a6c8c bede3c0c 0048149d b6ecc6b8
>>> [ 6626.055341] Code: e1a07000 e5840000 8a000078 e590601c (e5d6301c)
>>> [ 6626.075632] ---[ end trace 034f3552437a92bc ]---
>>>
>>
>>
>

 Sorry if I intrude, but just my 2¢.


Best Professional Regards.

-- 
Jose R R
http://metztli.it
---------------------------------------------------------------------------------------------
Download Metztli Reiser4: Debian Stretch w/ Linux 4.14 AMD64
---------------------------------------------------------------------------------------------
feats ZSTD compression https://sf.net/projects/metztli-reiser4/
-------------------------------------------------------------------------------------------
Official current Reiser4 resources: https://reiser4.wiki.kernel.org/


* Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
  2018-03-01  8:55             ` EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support) Adrian Hunter
  2018-03-01  9:15               ` Jose R R
@ 2018-03-01 16:04               ` Theodore Ts'o
  2018-03-01 20:20                 ` Andreas Dilger
  2018-03-06  0:48                 ` Dmitry Osipenko
  1 sibling, 2 replies; 42+ messages in thread
From: Theodore Ts'o @ 2018-03-01 16:04 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Andreas Dilger, Dmitry Osipenko, Ulf Hansson, linux-mmc,
	linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy, linux-ext4

On Thu, Mar 01, 2018 at 10:55:37AM +0200, Adrian Hunter wrote:
> On 27/02/18 11:28, Adrian Hunter wrote:
> > On 26/02/18 23:48, Dmitry Osipenko wrote:
> >> But still something is wrong... I've been getting occasional EXT4 Oopses,
> >> like the one below, and __wait_on_bit() always figures in the stacktrace.
> >> It never happened with blk-mq disabled, though it could be a coincidence
> >> and actually be unrelated to the blk-mq patches.
> > 
> >> [ 6625.992337] Unable to handle kernel NULL pointer dereference at virtual
> >> address 0000001c
> >> [ 6625.993004] pgd = 00b30c03
> >> [ 6625.993257] [0000001c] *pgd=00000000
> >> [ 6625.993594] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
> >> [ 6625.994022] Modules linked in:
> >> [ 6625.994326] CPU: 1 PID: 19355 Comm: dpkg Not tainted
> >> 4.16.0-rc2-next-20180220-00095-ge9c9f5689a84-dirty #2090
> >> [ 6625.995078] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> >> [ 6625.995595] PC is at dx_probe+0x68/0x684
> >> [ 6625.995947] LR is at __wait_on_bit+0xac/0xc8

This doesn't seem to make sense; the PC is where we are currently
executing, and LR is the "Link Register" where the flow of control
will be returning after the current function returns, right?  Well,
dx_probe should *not* be returning to __wait_on_bit().  So this just
seems... weird.

Ignoring the LR register, this stack trace looks sane...  I can't see
which pointer could be NULL and getting dereferenced, though.  How
easily can you reproduce the problem?  Can you either (a) translate
the PC into a line number, or (b) better yet, if you can reproduce, add
a series of BUG_ON's so we can see what's going on?

+	BUG_ON(frame);
	memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0]));
	frame->bh = ext4_read_dirblock(dir, 0, INDEX);
	if (IS_ERR(frame->bh))
		return (struct dx_frame *) frame->bh;

+	BUG_ON(frame->bh);
+	BUG_ON(frame->bh->b_data);
	root = (struct dx_root *) frame->bh->b_data;
	if (root->info.hash_version != DX_HASH_TEA &&
	    root->info.hash_version != DX_HASH_HALF_MD4 &&
	    root->info.hash_version != DX_HASH_LEGACY) {

These are "could never happen" scenarios from looking at the code, but
that will help explain what is going on.

If this is reliably only happening with mq, the only way I could see
that is if something is returning an error when it previously wasn't.
This isn't a problem we're seeing with any of our testing, though.

Cheers,

						- Ted


* Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
  2018-03-01  9:15               ` Jose R R
@ 2018-03-01 16:07                 ` Theodore Ts'o
  0 siblings, 0 replies; 42+ messages in thread
From: Theodore Ts'o @ 2018-03-01 16:07 UTC (permalink / raw)
  To: Jose R R
  Cc: Adrian Hunter, Andreas Dilger, Dmitry Osipenko, Ulf Hansson,
	linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy, linux-ext4

On Thu, Mar 01, 2018 at 01:15:24AM -0800, Jose R R wrote:
> Probably it is not wise to place all your eggs (data) in one basket
> (ext4); why not diversify to viable alternatives which won't be affected
> by the UNIX year-2038 date problem?
> < https://metztli.it/blog/index.php/amatl8/reiser-nahui/reiser4-filesystem-and-the-unix
> >

All of the modern file systems (btrfs, ext4, f2fs, xfs, etc.) are fine
with respect to the 2038 problem.

     	   	 				- Ted
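
For reference, ext4 with 256-byte inodes gets past 2038 by packing two extra
epoch bits into each timestamp's 32-bit "extra" field. A simplified, hedged
sketch of the decoding (modeled on fs/ext4/ext4.h; the sign handling in the
real code is more careful):

#include <stdint.h>

/* extra: bits 1:0 extend the epoch, bits 31:2 carry nanoseconds */
static int64_t ext4_sketch_seconds(int32_t seconds, uint32_t extra)
{
	int64_t t = seconds;			/* classic signed 32-bit time */

	t += (int64_t)(extra & 0x3) << 32;	/* pushes the range past 2038 */
	return t;
}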


* Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
  2018-03-01 16:04               ` Theodore Ts'o
@ 2018-03-01 20:20                 ` Andreas Dilger
  2018-03-02 16:39                   ` Dmitry Osipenko
  2018-03-06  0:48                 ` Dmitry Osipenko
  1 sibling, 1 reply; 42+ messages in thread
From: Andreas Dilger @ 2018-03-01 20:20 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Adrian Hunter, Dmitry Osipenko, Ulf Hansson, linux-mmc,
	linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy, linux-ext4



On Mar 1, 2018, at 9:04 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> This doesn't seem to make sense; the PC is where we are currently
> executing, and LR is the "Link Register" where the flow of control
> will be returning after the current function returns, right?  Well,
> dx_probe should *not* be returning to __wait_on_bit().  So this just
> seems... weird.
> 
> Ignoring the LR register, this stack trace looks sane...  I can't see
> which pointer could be NULL and getting dereferenced, though.  How
> easily can you reproduce the problem?  Can you either (a) translate
> the PC into a line number, or (b) better yet, if you can reproduce, add
> a series of BUG_ON's so we can see what's going on?
> 
> +	BUG_ON(frame);

I think you mean:
	BUG_ON(frame == NULL);
or
	BUG_ON(!frame);


> 	memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0]));
> 	frame->bh = ext4_read_dirblock(dir, 0, INDEX);
> 	if (IS_ERR(frame->bh))
> 		return (struct dx_frame *) frame->bh;
> 
> +	BUG_ON(frame->bh);
> +	BUG_ON(frame->bh->b_data);

Same here.

	BUG_ON(frame->bh == NULL);
	BUG_ON(frame->bh->b_data == NULL);

This is why I don't like implicit "is NULL" or "is non-zero" usage.  Lustre
used to require "== NULL" or "!= NULL" to avoid bugs like this, but had to
abandon that because of upstream code style.

> 	root = (struct dx_root *) frame->bh->b_data;
> 	if (root->info.hash_version != DX_HASH_TEA &&
> 	    root->info.hash_version != DX_HASH_HALF_MD4 &&
> 	    root->info.hash_version != DX_HASH_LEGACY) {
> 
> These are "could never happen" scenarios from looking at the code, but
> that will help explain what is going on.
> 
> If this is reliably only happening with mq, the only way I could see
> that is if something is returning an error when it previously wasn't.
> This isn't a problem we're seeing with any of our testing, though.
> 
> Cheers,
> 
> 						- Ted
> 


Cheers, Andreas
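
Putting Ted's suggested BUG_ON()s together with the corrections above, the
debugging additions to dx_probe() would read:

	BUG_ON(frame == NULL);
	memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0]));
	frame->bh = ext4_read_dirblock(dir, 0, INDEX);
	if (IS_ERR(frame->bh))
		return (struct dx_frame *) frame->bh;

	BUG_ON(frame->bh == NULL);
	BUG_ON(frame->bh->b_data == NULL);
	root = (struct dx_root *) frame->bh->b_data;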








* Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
  2018-03-01 20:20                 ` Andreas Dilger
@ 2018-03-02 16:39                   ` Dmitry Osipenko
  0 siblings, 0 replies; 42+ messages in thread
From: Dmitry Osipenko @ 2018-03-02 16:39 UTC (permalink / raw)
  To: Andreas Dilger, Theodore Ts'o
  Cc: Adrian Hunter, Ulf Hansson, linux-mmc, linux-block, linux-kernel,
	Bough Chen, Alex Lemberg, Mateusz Nowak, Yuliy Izrailov,
	Jaehoon Chung, Dong Aisheng, Das Asutosh, Zhangfei Gao,
	Sahitya Tummala, Harjani Ritesh, Venu Byravarasu, Linus Walleij,
	Shawn Lin, Bartlomiej Zolnierkiewicz, Christoph Hellwig,
	Thierry Reding, Krishna Reddy, linux-ext4

On 01.03.2018 23:20, Andreas Dilger wrote:
> 
> On Mar 1, 2018, at 9:04 AM, Theodore Ts'o <tytso@mit.edu> wrote:
>> This doesn't seem to make sense; the PC is where we are currently
>> executing, and LR is the "Link Register" where the flow of control
>> will be returning after the current function returns, right?  Well,
>> dx_probe should *not* be returning to __wait_on_bit().  So this just
>> seems... weird.
>>
>> Ignoring the LR register, this stack trace looks sane...  I can't see
>> which pointer could be NULL and getting dereferenced, though.  How
>> easily can you reproduce the problem?  Can you either (a) translate
>> the PC into a line number, or (b) better yet, if you can reproduce, add
>> a series of BUG_ON's so we can see what's going on?

Ted, thank you for the suggestion. I don't have a reproducer; the bug happens
only under some IO load and quite randomly. I've applied the BUG_ON()s, but it
may take some time to catch the bug again.

>> +	BUG_ON(frame);
> 
> I think you mean:
> 	BUG_ON(frame == NULL);
> or
> 	BUG_ON(!frame);
> 
> 
>> 	memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0]));
>> 	frame->bh = ext4_read_dirblock(dir, 0, INDEX);
>> 	if (IS_ERR(frame->bh))
>> 		return (struct dx_frame *) frame->bh;
>>
>> +	BUG_ON(frame->bh);
>> +	BUG_ON(frame->bh->b_data);
> 
> Same here.
> 
> 	BUG_ON(frame->bh == NULL);
> 	BUG_ON(frame->bh->b_data == NULL);
> 
> This is why I don't like implicit "is NULL" or "is non-zero" usage.  Lustre
> used to require "== NULL" or "!= NULL" to avoid bugs like this, but had to
> abandon that because of upstream code style.

Well spotted, thanks Andreas.

>> 	root = (struct dx_root *) frame->bh->b_data;
>> 	if (root->info.hash_version != DX_HASH_TEA &&
>> 	    root->info.hash_version != DX_HASH_HALF_MD4 &&
>> 	    root->info.hash_version != DX_HASH_LEGACY) {
>>
>> These are "could never happen" scenarios from looking at the code, but
>> that will help explain what is going on.
>>
>> If this is reliably only happening with mq, the only way I could see
>> that is if something is returning an error when it previously wasn't.
>> This isn't a problem we're seeing with any of our testing, though.


* Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)
  2018-03-01 16:04               ` Theodore Ts'o
  2018-03-01 20:20                 ` Andreas Dilger
@ 2018-03-06  0:48                 ` Dmitry Osipenko
  1 sibling, 0 replies; 42+ messages in thread
From: Dmitry Osipenko @ 2018-03-06  0:48 UTC (permalink / raw)
  To: Theodore Ts'o, Adrian Hunter, Andreas Dilger, Ulf Hansson,
	linux-mmc, linux-block, linux-kernel, Bough Chen, Alex Lemberg,
	Mateusz Nowak, Yuliy Izrailov, Jaehoon Chung, Dong Aisheng,
	Das Asutosh, Zhangfei Gao, Sahitya Tummala, Harjani Ritesh,
	Venu Byravarasu, Linus Walleij, Shawn Lin,
	Bartlomiej Zolnierkiewicz, Christoph Hellwig, Thierry Reding,
	Krishna Reddy, linux-ext4

On 01.03.2018 19:04, Theodore Ts'o wrote:
> On Thu, Mar 01, 2018 at 10:55:37AM +0200, Adrian Hunter wrote:
>> On 27/02/18 11:28, Adrian Hunter wrote:
>>> On 26/02/18 23:48, Dmitry Osipenko wrote:
>>>> But still something is wrong... I've been getting occasional EXT4 Oopses,
>>>> like the one below, and __wait_on_bit() always figures in the stacktrace.
>>>> It never happened with blk-mq disabled, though it could be a coincidence
>>>> and actually be unrelated to the blk-mq patches.
>>>
>>>> [ 6625.992337] Unable to handle kernel NULL pointer dereference at virtual
>>>> address 0000001c
>>>> [ 6625.993004] pgd = 00b30c03
>>>> [ 6625.993257] [0000001c] *pgd=00000000
>>>> [ 6625.993594] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
>>>> [ 6625.994022] Modules linked in:
>>>> [ 6625.994326] CPU: 1 PID: 19355 Comm: dpkg Not tainted
>>>> 4.16.0-rc2-next-20180220-00095-ge9c9f5689a84-dirty #2090
>>>> [ 6625.995078] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>>>> [ 6625.995595] PC is at dx_probe+0x68/0x684
>>>> [ 6625.995947] LR is at __wait_on_bit+0xac/0xc8
> 
> This doesn't seem to make sense; the PC is where we are currently
> executing, and LR is the "Link Register" where the flow of control
> will be returning after the current function returns, right?  Well,
> dx_probe should *not* be returning to __wait_on_bit().  So this just
> seems... weird.
> 
> Ignoring the LR register, this stack trace looks sane...  I can't see
> which pointer could be NULL and getting dereferenced, though.  How
> easily can you reproduce the problem?  Can you either (a) translate
> the PC into a line number, or (b) better yet, if you can reproduce, add
> a series of BUG_ON's so we can see what's going on?
> 
> +	BUG_ON(frame);
> 	memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0]));
> 	frame->bh = ext4_read_dirblock(dir, 0, INDEX);
> 	if (IS_ERR(frame->bh))
> 		return (struct dx_frame *) frame->bh;
> 
> +	BUG_ON(frame->bh);
> +	BUG_ON(frame->bh->b_data);
> 	root = (struct dx_root *) frame->bh->b_data;
> 	if (root->info.hash_version != DX_HASH_TEA &&
> 	    root->info.hash_version != DX_HASH_HALF_MD4 &&
> 	    root->info.hash_version != DX_HASH_LEGACY) {
> 
> These are "could never happen" scenarios from looking at the code, but
> that will help explain what is going on.
> 
> If this is reliably only happening with mq, the only way I could see
> that is if something is returning an error when it previously wasn't.
> This isn't a problem we're seeing with any of our testing, though.

It happened again today: the "BUG_ON(!frame->bh->b_data);" check tripped.

kernel BUG at fs/ext4/namei.c:751!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 296 Comm: cron Not tainted
4.16.0-rc2-next-20180220-00095-ge9c9f5689a84-dirty #2100
Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
PC is at dx_probe+0x308/0x694
LR is at __wait_on_bit+0xac/0xc8
pc : [<c033bc00>]    lr : [<c0bfbff4>]    psr: 60040013
sp : d545bc20  ip : c0170e88  fp : d545bc74
r10: 00000000  r9 : d545bca0  r8 : d4209300
r7 : 00000000  r6 : 00000000  r5 : d656e838  r4 : d545bcbc
r3 : 0000007b  r2 : d5830800  r1 : d5831000  r0 : d4209300
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 1552004a  DAC: 00000051
Process cron (pid: 296, stack limit = 0x4d1ebf14)
Stack: (0xd545bc20 to 0xd545c000)
bc20: 000002ea c0c019d4 60040113 014000c0 c029e640 d6cf3540 d545bc7c d545bc48
bc40: c02797f4 c0152804 d545bca4 00000007 d5830800 00000000 d656e838 00000001
bc60: d545bca0 00000000 d545bd0c d545bc78 c033d578 c033b904 c029e714 c029b088
bc80: 00000148 c0c01984 d65f6be0 00000000 d545be10 d545bd24 d545bd00 d5830800
bca0: d65f6bf8 d65f6c0c 00000007 d6547720 8420edbe c029eec8 00000000 d4209300
bcc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bce0: d545bd48 d65f6be0 d656e838 d65f6be0 d6547720 00000001 d545be10 00000000
bd00: d545bd44 d545bd10 c033d7c0 c033d1c8 d545bd34 d656e8b8 d656e838 d545be08
bd20: d656e838 00000000 d65f6be0 d656e838 d656e8b8 d6547720 d545bd8c d545bd48
bd40: c028ea50 c033d774 00000000 dead4ead ffffffff ffffffff d545bd58 d545bd58
bd60: d6d7f015 d545be08 00000000 00000000 d545bee8 d545bee8 d545bf28 00000000
bd80: d545bdd4 d545bd90 c028f310 c028e9b0 d545be08 80808080 d545be08 d6d7f010
bda0: d545bdd4 d545bdb0 c028df9c d545be08 d6d7f010 00000000 d545bee8 d545bee8
bdc0: d545bf28 00000000 d545be04 d545bdd8 c0290e24 c028f160 c0111a1c c0111674
bde0: d545be04 d545bdf0 00000001 d6d7f000 d545be08 00000001 d545beb4 d545be08
be00: c0293848 c0290da4 d6dd0310 d6547720 8420edbe 00000007 d6d7f015 0000000c
be20: d6dd0310 d6547098 d656e838 00000001 00000002 00000fe0 00000000 00000000
be40: 00000000 d545be48 c02797f4 00000ff0 d6d7f010 c102b4c8 d5522db8 d6d7f000
be60: c130bbdc 004f73f8 00000000 00000001 d545bf28 00000000 d6d7f000 00000000
be80: c0293570 00000002 ffffff9c 00000001 ffffff9c 00000001 ffffff9c d545bee8
bea0: ffffff9c 004f73f8 d545bedc d545beb8 c0293990 c02937b4 00000000 00000000
bec0: 00000000 beb93970 00000001 00000800 d545bf1c d545bee0 c028859c c0293948
bee0: 00000000 d545bfb0 00509070 00508d7c d545bfac beb93970 00000003 beb95cd0
bf00: 000000c3 c01011e4 d545a000 00000000 d545bfa4 d545bf20 c0288df4 c0288540
bf20: 000007ff c0152868 00000fff 000043d8 00000002 00001000 00000000 00000000
bf40: 00000874 00000000 0006037f 00000000 0b300031 00000000 00000000 0000006d
bf60: 00001000 00000000 5a9c7e8b 2d4cae00 5a0d222f 00000000 5a8c9273 22358b29
bf80: 5a8c8591 301168da 00000008 b6ea94fc 00030030 b6f91ab8 00000000 d545bfa8
bfa0: c0101000 c0288dc8 b6f91ab8 00000003 004f73f8 beb93970 beb93a90 3dc50800
bfc0: b6f91ab8 00000003 beb95cd0 000000c3 00509cec 00509070 00508d7c 00000002
bfe0: 000000c3 beb93968 b6ea354b b6e2ccf6 20030030 004f73f8 17bfd861 17bfdc61
[<c033bc00>] (dx_probe) from [<c033d578>] (ext4_find_entry+0x3bc/0x5ac)
[<c033d578>] (ext4_find_entry) from [<c033d7c0>] (ext4_lookup+0x58/0x1f4)
[<c033d7c0>] (ext4_lookup) from [<c028ea50>] (lookup_slow+0xac/0x15c)
[<c028ea50>] (lookup_slow) from [<c028f310>] (walk_component+0x1bc/0x2f0)
[<c028f310>] (walk_component) from [<c0290e24>] (path_lookupat+0x8c/0x1f0)
[<c0290e24>] (path_lookupat) from [<c0293848>] (filename_lookup+0xa0/0xfc)
[<c0293848>] (filename_lookup) from [<c0293990>] (user_path_at_empty+0x54/0x5c)
[<c0293990>] (user_path_at_empty) from [<c028859c>] (vfs_statx+0x68/0xc4)
[<c028859c>] (vfs_statx) from [<c0288df4>] (SyS_stat64+0x38/0x54)
[<c0288df4>] (SyS_stat64) from [<c0101000>] (ret_fast_syscall+0x0/0x54)
Exception stack(0xd545bfa8 to 0xd545bff0)
bfa0:                   b6f91ab8 00000003 004f73f8 beb93970 beb93a90 3dc50800
bfc0: b6f91ab8 00000003 beb95cd0 000000c3 00509cec 00509070 00508d7c 00000002
bfe0: 000000c3 beb93968 b6ea354b b6e2ccf6
Code: e2833094 e587300c eaffff72 e7f001f2 (e7f001f2)
---[ end trace 60fa8eaa4e57e458 ]---


Thread overview: 42+ messages
2017-11-29 13:40 [PATCH V15 00/22] mmc: Add Command Queue support Adrian Hunter
2017-11-29 13:40 ` [PATCH V15 01/22] mmc: block: No need to export mmc_cleanup_queue() Adrian Hunter
2017-11-29 13:40 ` [PATCH V15 02/22] mmc: block: Simplify cleaning up the queue Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 03/22] mmc: core: Make mmc_pre_req() and mmc_post_req() available Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 04/22] mmc: block: Add error-handling comments Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 05/22] mmc: core: Add parameter use_blk_mq Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 06/22] mmc: block: Add blk-mq support Adrian Hunter
2018-02-21 20:50   ` Dmitry Osipenko
2018-02-22  7:42     ` Adrian Hunter
2018-02-22 17:54       ` Dmitry Osipenko
2018-02-26 21:48         ` Dmitry Osipenko
2018-02-27  8:57           ` Linus Walleij
2018-02-27 12:04             ` Dmitry Osipenko
2018-02-27  9:28           ` Adrian Hunter
2018-03-01  8:55             ` EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support) Adrian Hunter
2018-03-01  9:15               ` Jose R R
2018-03-01 16:07                 ` Theodore Ts'o
2018-03-01 16:04               ` Theodore Ts'o
2018-03-01 20:20                 ` Andreas Dilger
2018-03-02 16:39                   ` Dmitry Osipenko
2018-03-06  0:48                 ` Dmitry Osipenko
2017-11-29 13:41 ` [PATCH V15 07/22] mmc: block: Add CQE support Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 08/22] mmc: cqhci: support for command queue enabled host Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 09/22] mmc: sdhci-pci: Add CQHCI support for Intel GLK Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 10/22] mmc: block: blk-mq: Add support for direct completion Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 11/22] mmc: block: blk-mq: Separate card polling from recovery Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 12/22] mmc: block: Make card_busy_detect() accumulate all response error bits Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 13/22] mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 14/22] mmc: block: Check the timeout correctly in card_busy_detect() Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 15/22] mmc: block: Check for transfer state " Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 16/22] mmc: block: Add timeout_clks when calculating timeout Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 17/22] mmc: block: Reduce polling timeout from 10 minutes to 10 seconds Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 18/22] mmc: block: blk-mq: Stop using legacy recovery Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 19/22] mmc: mmc_test: Do not use mmc_start_areq() anymore Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 20/22] mmc: core: Remove option not to use blk-mq Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 21/22] mmc: block: Remove code no longer needed after the switch to blk-mq Adrian Hunter
2017-11-29 13:41 ` [PATCH V15 22/22] mmc: core: " Adrian Hunter
2017-11-29 15:47 ` [PATCH V15 00/22] mmc: Add Command Queue support Ulf Hansson
2017-12-01 13:13   ` Adrian Hunter
2017-12-05 10:10   ` Linus Walleij
2017-12-05 15:53     ` Ulf Hansson
2017-12-11 12:28   ` Ulf Hansson
