* [PATCH 00/12 v4] multiqueue for MMC/SD
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

This switches the MMC/SD stack over to unconditionally
using the multiqueue block interface for block access.
This modernizes the MMC/SD stack and makes it possible
to enable BFQ scheduling on these single-queue devices.

This is v4 of the patch set; v3 was posted in February:
https://marc.info/?l=linux-mmc&m=148665788227015&w=2

The patches are available in a git branch:
https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git/log/?h=mmc-mq-v4.14-rc4

You can pull it to a clean kernel tree like this:
git checkout -b mmc-test v4.14-rc4
git pull git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git mmc-mq-v4.14-rc4

I have now worked on this for more than a year. I got sidetracked
into cleaning up some code, moving request allocation over to the
block layer, deleting the bounce buffer handling and refactoring
the RPMB support. With the changes to request allocation, the
patch set is a better fit and has shrunk from 16 to 12 patches as
a result.

It is still quite invasive. Yet it is something I think would
be nice to merge for v4.16...

The rationale for this approach was Arnd's suggestion to rework the
MMC/SD stack so that requests are completed as quickly as possible
when they return from the device driver, so that new requests can
be issued. We are doing this now: the polling loop that was pulling
NULL out of the request queue to drive the pipeline is gone with
the next-to-last patch ("mmc: block: issue requests in massive
parallel"). This sets the stage for MQ to go in and hammer
requests on the asynchronous issuing layer.

We use the trick of setting the queue depth to 2 to get two
parallel requests pushed down to the host. I also tried setting it
to 4; the code survives that, the queue just has three items
waiting to be submitted all the time.
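
To make the queue-depth trick concrete, here is a minimal sketch of
how a depth of 2 would typically be expressed to blk-mq. This is
not code from the series: the example_* names and the (empty) ops
struct are made up, and the mmc core types are assumed from
queue.h; only the blk-mq calls themselves are real.

#include <linux/blk-mq.h>
#include <linux/err.h>
/* struct mmc_queue and struct mmc_queue_req assumed from mmc core headers */

static const struct blk_mq_ops example_mq_ops = {
        /* .queue_rq etc. omitted for brevity */
};
static struct blk_mq_tag_set example_tag_set;

/* hypothetical init helper, for illustration only */
static int example_mmc_init_mq(struct mmc_queue *mq)
{
        int ret;

        memset(&example_tag_set, 0, sizeof(example_tag_set));
        example_tag_set.ops = &example_mq_ops;
        example_tag_set.nr_hw_queues = 1;  /* MMC/SD is a single HW queue */
        example_tag_set.queue_depth = 2;   /* one on the host, one prepared */
        example_tag_set.numa_node = NUMA_NO_NODE;
        example_tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
        example_tag_set.cmd_size = sizeof(struct mmc_queue_req);

        ret = blk_mq_alloc_tag_set(&example_tag_set);
        if (ret)
                return ret;

        mq->queue = blk_mq_init_queue(&example_tag_set);
        if (IS_ERR(mq->queue)) {
                blk_mq_free_tag_set(&example_tag_set);
                return PTR_ERR(mq->queue);
        }
        mq->queue->queuedata = mq;
        return 0;
}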

In my opinion this is also a better fit for command queueing.
Handling command queueing needs to happen in the asynchronous
submission codepath, so instead of waiting on a pending areq, we
just stack up requests in the command queue.

It sounds simple, but I bet this drives a truck through Adrian's
patch series. Sorry. :(

We are not issuing new requests from interrupt context: I still
have to post a work on a workqueue for it, since the retune and
background operations need to be checked after every command, and
as far as I know that has to happen in blocking context.
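
For reference, the pattern looks roughly like this. This is only a
sketch: the example_* context struct and functions are made up, but
queue_work(), mmc_retune_recheck() and the req_done_wq workqueue
added in patch 2 are the real pieces being combined.

#include <linux/workqueue.h>
/* struct mmc_host/mmc_request and mmc_retune_recheck() from mmc core headers */

/* hypothetical per-request context, for illustration only */
struct example_done_ctx {
        struct work_struct work;   /* set up elsewhere with INIT_WORK() */
        struct mmc_host *host;
        struct mmc_request *mrq;
};

/* runs in process (blocking) context on host->req_done_wq */
static void example_done_worker(struct work_struct *work)
{
        struct example_done_ctx *ctx =
                container_of(work, struct example_done_ctx, work);

        /* safe to recheck retune, kick BKOPS, retry the command... */
        mmc_retune_recheck(ctx->host);

        /* ...and finally report the finished request to the block layer */
}

/* called from the host driver's completion path, i.e. IRQ context */
static void example_request_done(struct example_done_ctx *ctx)
{
        queue_work(ctx->host->req_done_wq, &ctx->work);
}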

I might make a hack that strips out the retune (etc.) and instead
runs requests until something fails, reporting requests back to
the block layer in interrupt context. That would be an interesting
experiment, but for later.

We have parallelism in the pre/post hooks also with multiqueue:
all the asynchronous optimizations that were there for the old
block layer are now also there for multiqueue.

Last time I followed up with some open questions:
https://marc.info/?l=linux-mmc&m=149075698610224&w=2
I think these are now resolved.

As a result, the last patch is no longer in RFC state. I think this
works. (Famous last words; OK, there WILL be regressions, but hey,
we need to do this.)
You can see there are three steps:

- I do some necessary refactoring and need to move post-processing
  to after the requests have been completed. As you can see, this
  clearly introduces a performance regression in the dd test with
  the patch:
  "mmc: core: move the asynchronous post-processing"
  The random seek with find does not seem to be much affected.

- I continue the refactoring and get to the point of issuing
  requests immediately after every successful transfer, and the
  dd performance is restored with the patch
  "mmc: block: issue requests in massive parallel"

- Then I add multiqueue on top of the cake. At this point we are
  back at the performance we want, so we can study the effect of
  just introducing multiqueueing with the last patch
  "mmc: switch MMC/SD to use blk-mq multiqueueing v4"


PERFORMANCE BEFORE AND AFTER:

BEFORE this patch series, on Ulf's next branch ending with
commit cf653c788a29fa70e07b86492a7599c471c705de (mmc-next)
Merge: 4dda8e1f70f8 eb701ce16a45 ("Merge branch 'fixes' into next")

sync
echo 3 > /proc/sys/vm/drop_caches
sync
time dd if=/dev/mmcblk3 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.0GB) copied, 23.966583 seconds, 42.7MB/s
real    0m 23.97s
user    0m 0.01s
sys     0m 3.74s

mount /dev/mmcblk3p1 /mnt/
cd /mnt/
sync
echo 3 > /proc/sys/vm/drop_caches
sync
time find . > /dev/null
real    0m 3.24s
user    0m 0.22s
sys     0m 1.23s

sync
echo 3 > /proc/sys/vm/drop_caches
sync
iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test

                                                   random    random
   kB  reclen    write  rewrite    read    reread    read     write
20480       4     1598     1559     6782     6740     6751      536
20480       8     2134     2281    11449    11449    11407     1145
20480      16     3695     4171    17676    17677    17638     1234
20480      32     5751     7475    23622    23622    23584     3004
20480      64     6778     8648    27937    27950    27914     3445
20480     128     6073     8115    29091    29080    29070     4892
20480     256     7106     7208    29658    29670    29657     6743
20480     512     8828     9953    29911    29905    29901     7424
20480    1024     6566     7199    27233    27236    27209     6808
20480    2048     7370     7403    27492    27490    27475     7428
20480    4096     7352     7456    28124    28123    28109     7411
20480    8192     7271     7462    28371    28369    28359     7458
20480   16384     7097     7478    28540    28538    28528     7464

AFTER this patch series ending with
"mmc: switch MMC/SD to use blk-mq multiqueueing v4":

sync
echo 3 > /proc/sys/vm/drop_caches
sync
time dd if=/dev/mmcblk3 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.0GB) copied, 24.358276 seconds, 42.0MB/s
real    0m 24.36s
user    0m 0.00s
sys     0m 3.92s

mount /dev/mmcblk3p1 /mnt/
cd /mnt/
sync
echo 3 > /proc/sys/vm/drop_caches
sync
time find . > /dev/null
real    0m 3.92s
user    0m 0.26s
sys     0m 1.21s

sync
echo 3 > /proc/sys/vm/drop_caches
sync
iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test

                                                   random    random
   kB  reclen    write  rewrite    read    reread    read     write
20480       4     1614     1569     6913     6889     6876      531
20480       8     2147     2301    11628    11625    11581     1165
20480      16     3820     4256    17760    17764    17725     1549
20480      32     5814     7508    23148    23145    23123     3561
20480      64     7396     8161    27513    27527    27500     4177
20480     128     6707     9025    29160    29166    29139     5199
20480     256     7902     7860    29459    29456    29462     7307
20480     512     8061    11343    29888    29891    29881     6800
20480    1024     7076     7442    27702    27704    27700     7445
20480    2048     6846     8194    27417    27419    27418     6781
20480    4096     8115     6810    28113    28113    28109     8191
20480    8192     7254     7434    28413    28419    28414     7476
20480   16384     7090     7481    28623    28619    28625     7454


As you can see, performance is not affected; the differences are
within the noise margin.

Linus Walleij (12):
  mmc: core: move the asynchronous post-processing
  mmc: core: add a workqueue for completing requests
  mmc: core: replace waitqueue with worker
  mmc: core: do away with is_done_rcv
  mmc: core: do away with is_new_req
  mmc: core: kill off the context info
  mmc: queue: simplify queue logic
  mmc: block: shuffle retry and error handling
  mmc: queue: stop flushing the pipeline with NULL
  mmc: queue/block: pass around struct mmc_queue_req*s
  mmc: block: issue requests in massive parallel
  mmc: switch MMC/SD to use blk-mq multiqueueing v4

 drivers/mmc/core/block.c    | 539 ++++++++++++++++++++++----------------------
 drivers/mmc/core/block.h    |   5 +-
 drivers/mmc/core/bus.c      |   1 -
 drivers/mmc/core/core.c     | 194 +++++++++-------
 drivers/mmc/core/core.h     |  11 +-
 drivers/mmc/core/host.c     |   1 -
 drivers/mmc/core/mmc_test.c |  31 +--
 drivers/mmc/core/queue.c    | 238 ++++++++-----------
 drivers/mmc/core/queue.h    |  16 +-
 include/linux/mmc/core.h    |   3 +-
 include/linux/mmc/host.h    |  27 +--
 11 files changed, 503 insertions(+), 563 deletions(-)

-- 
2.13.6


* [PATCH 01/12 v4] mmc: core: move the asynchronous post-processing
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

This moves the asynchronous post-processing of a request over
to the finalization function.

The patch has a slight semantic change:

Both call sites are in the code path for if (host->areq) and in the
same sequence, but before this patch, the next request was started
before performing post-processing.

The effect is that whereas before, the post- and preprocessing
happened after starting the next request, the post-processing now
happens after the request is done and before the next one has
started, which cuts out half of the pre/post optimization.

In the later patch named "mmc: core: replace waitqueue with
worker" we move the finalization to a worker started by
mmc_request_done(), and in the patch named
"mmc: block: issue requests in massive parallel" we introduce
a forked success/failure path that can quickly complete
requests when they come back from the hardware.

These two later patches together restore the same optimization,
but in a more elegant manner that avoids the need to flush the
two-stage pipeline with NULL, something we remove between these
two patches in the commit named
"mmc: queue: stop flushing the pipeline with NULL".

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/core.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 12b271c2a912..3d1270b9aec4 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -746,6 +746,9 @@ static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
 		mmc_start_bkops(host->card, true);
 	}
 
+	/* Successfully postprocess the old request at this point */
+	mmc_post_req(host, host->areq->mrq, 0);
+
 	return status;
 }
 
@@ -790,10 +793,6 @@ struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 	if (status == MMC_BLK_SUCCESS && areq)
 		start_err = __mmc_start_data_req(host, areq->mrq);
 
-	/* Postprocess the old request at this point */
-	if (host->areq)
-		mmc_post_req(host, host->areq->mrq, 0);
-
 	/* Cancel a prepared request if it was not started. */
 	if ((status != MMC_BLK_SUCCESS || start_err) && areq)
 		mmc_post_req(host, areq->mrq, -EINVAL);
-- 
2.13.6


* [PATCH 02/12 v4] mmc: core: add a workqueue for completing requests
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

As we want to complete requests autonomously from feeding the
host with new requests, we create a workqueue to deal with this
specifically, in response to the callback from a host driver.
This is necessary to exploit parallelism properly.

This patch just adds the workqueue; later patches will make use
of it.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/core.c  | 9 +++++++++
 drivers/mmc/core/host.c  | 1 -
 include/linux/mmc/host.h | 4 ++++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 3d1270b9aec4..9c3baaddb1bd 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -2828,6 +2828,14 @@ void mmc_start_host(struct mmc_host *host)
 	host->f_init = max(freqs[0], host->f_min);
 	host->rescan_disable = 0;
 	host->ios.power_mode = MMC_POWER_UNDEFINED;
+	/* Workqueue for completing requests */
+	host->req_done_wq = alloc_workqueue("mmc%d-reqdone",
+				WQ_FREEZABLE | WQ_HIGHPRI | WQ_MEM_RECLAIM,
+				0, host->index);
+	if (!host->req_done_wq) {
+		dev_err(mmc_dev(host), "could not allocate workqueue\n");
+		return;
+	}
 
 	if (!(host->caps2 & MMC_CAP2_NO_PRESCAN_POWERUP)) {
 		mmc_claim_host(host);
@@ -2849,6 +2857,7 @@ void mmc_stop_host(struct mmc_host *host)
 
 	host->rescan_disable = 1;
 	cancel_delayed_work_sync(&host->detect);
+	destroy_workqueue(host->req_done_wq);
 
 	/* clear pm flags now and let card drivers set them as needed */
 	host->pm_flags = 0;
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index e58be39b1568..8193363a5a46 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -381,7 +381,6 @@ struct mmc_host *mmc_alloc_host(int extra, struct device *dev)
 	INIT_DELAYED_WORK(&host->detect, mmc_rescan);
 	INIT_DELAYED_WORK(&host->sdio_irq_work, sdio_irq_work);
 	setup_timer(&host->retune_timer, mmc_retune_timer, (unsigned long)host);
-
 	/*
 	 * By default, hosts do not support SGIO or large requests.
 	 * They have to set these according to their abilities.
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index c296f4351c1d..94a646eebf05 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -13,6 +13,7 @@
 #include <linux/sched.h>
 #include <linux/device.h>
 #include <linux/fault-inject.h>
+#include <linux/workqueue.h>
 
 #include <linux/mmc/core.h>
 #include <linux/mmc/card.h>
@@ -423,6 +424,9 @@ struct mmc_host {
 	struct mmc_async_req	*areq;		/* active async req */
 	struct mmc_context_info	context_info;	/* async synchronization info */
 
+	/* finalization workqueue, handles finalizing requests */
+	struct workqueue_struct	*req_done_wq;
+
 	/* Ongoing data transfer that allows commands during transfer */
 	struct mmc_request	*ongoing_mrq;
 
-- 
2.13.6


* [PATCH 03/12 v4] mmc: core: replace waitqueue with worker
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

The waitqueue in the host context is there to signal back from
mmc_request_done() through mmc_wait_data_done() that the hardware
is done with a command, and when the wait is over, the core
will typically submit the next asynchronous request that is pending
just waiting for the hardware to be available.

This is in the way for letting the mmc_request_done() trigger the
report up to the block layer that a block request is finished.

Re-jig this as a first step, removing the waitqueue and introducing
a work that will run after a completed asynchronous request,
finalizing that request, including retransmissions, and eventually
reporting back with a completion and a status code to the
asynchronous issue method.

This has the upside that we can remove the MMC_BLK_NEW_REQUEST
status code and the "new_request" state in the request queue
that is only there to make the state machine spin out
the first time we send a request.

Use the workqueue we introduced in the host for handling just
this, and then add a work and completion in the asynchronous
request to deal with this mechanism.

We introduce a pointer from mmc_request back to the asynchronous
request so these can be referenced from each other, and
augment mmc_wait_data_done() to use this pointer to get at the
areq and kick the worker since that function is only used by
asynchronous requests anyway.

This is a central change that lets us make many other changes,
since we have broken the submit and complete code paths in two,
and we can potentially remove the NULL flushing of the asynchronous
pipeline and report block requests as finished directly from
the worker.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c |  3 ++
 drivers/mmc/core/core.c  | 93 ++++++++++++++++++++++++------------------------
 drivers/mmc/core/core.h  |  2 ++
 drivers/mmc/core/queue.c |  1 -
 include/linux/mmc/core.h |  3 +-
 include/linux/mmc/host.h |  7 ++--
 6 files changed, 59 insertions(+), 50 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index ea80ff4cd7f9..5c84175e49be 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1712,6 +1712,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 	mmc_blk_data_prep(mq, mqrq, disable_multi, &do_rel_wr, &do_data_tag);
 
 	brq->mrq.cmd = &brq->cmd;
+	brq->mrq.areq = NULL;
 
 	brq->cmd.arg = blk_rq_pos(req);
 	if (!mmc_card_blockaddr(card))
@@ -1764,6 +1765,8 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 	}
 
 	mqrq->areq.err_check = mmc_blk_err_check;
+	mqrq->areq.host = card->host;
+	INIT_WORK(&mqrq->areq.finalization_work, mmc_finalize_areq);
 }
 
 static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 9c3baaddb1bd..f6a51608ab0b 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -369,10 +369,15 @@ EXPORT_SYMBOL(mmc_start_request);
  */
 static void mmc_wait_data_done(struct mmc_request *mrq)
 {
-	struct mmc_context_info *context_info = &mrq->host->context_info;
+	struct mmc_host *host = mrq->host;
+	struct mmc_context_info *context_info = &host->context_info;
+	struct mmc_async_req *areq = mrq->areq;
 
 	context_info->is_done_rcv = true;
-	wake_up_interruptible(&context_info->wait);
+	/* Schedule a work to deal with finalizing this request */
+	if (!areq)
+		pr_err("areq of the data mmc_request was NULL!\n");
+	queue_work(host->req_done_wq, &areq->finalization_work);
 }
 
 static void mmc_wait_done(struct mmc_request *mrq)
@@ -695,43 +700,34 @@ static void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
  * Returns the status of the ongoing asynchronous request, but
  * MMC_BLK_SUCCESS if no request was going on.
  */
-static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
+void mmc_finalize_areq(struct work_struct *work)
 {
+	struct mmc_async_req *areq =
+		container_of(work, struct mmc_async_req, finalization_work);
+	struct mmc_host *host = areq->host;
 	struct mmc_context_info *context_info = &host->context_info;
-	enum mmc_blk_status status;
-
-	if (!host->areq)
-		return MMC_BLK_SUCCESS;
-
-	while (1) {
-		wait_event_interruptible(context_info->wait,
-				(context_info->is_done_rcv ||
-				 context_info->is_new_req));
+	enum mmc_blk_status status = MMC_BLK_SUCCESS;
 
-		if (context_info->is_done_rcv) {
-			struct mmc_command *cmd;
+	if (context_info->is_done_rcv) {
+		struct mmc_command *cmd;
 
-			context_info->is_done_rcv = false;
-			cmd = host->areq->mrq->cmd;
+		context_info->is_done_rcv = false;
+		cmd = areq->mrq->cmd;
 
-			if (!cmd->error || !cmd->retries ||
-			    mmc_card_removed(host->card)) {
-				status = host->areq->err_check(host->card,
-							       host->areq);
-				break; /* return status */
-			} else {
-				mmc_retune_recheck(host);
-				pr_info("%s: req failed (CMD%u): %d, retrying...\n",
-					mmc_hostname(host),
-					cmd->opcode, cmd->error);
-				cmd->retries--;
-				cmd->error = 0;
-				__mmc_start_request(host, host->areq->mrq);
-				continue; /* wait for done/new event again */
-			}
+		if (!cmd->error || !cmd->retries ||
+		    mmc_card_removed(host->card)) {
+			status = areq->err_check(host->card,
+						 areq);
+		} else {
+			mmc_retune_recheck(host);
+			pr_info("%s: req failed (CMD%u): %d, retrying...\n",
+				mmc_hostname(host),
+				cmd->opcode, cmd->error);
+			cmd->retries--;
+			cmd->error = 0;
+			__mmc_start_request(host, areq->mrq);
+			return; /* wait for done/new event again */
 		}
-
-		return MMC_BLK_NEW_REQUEST;
 	}
 
 	mmc_retune_release(host);
@@ -740,17 +736,19 @@ static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
 	 * Check BKOPS urgency for each R1 response
 	 */
 	if (host->card && mmc_card_mmc(host->card) &&
-	    ((mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1) ||
-	     (mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1B)) &&
-	    (host->areq->mrq->cmd->resp[0] & R1_EXCEPTION_EVENT)) {
+	    ((mmc_resp_type(areq->mrq->cmd) == MMC_RSP_R1) ||
+	     (mmc_resp_type(areq->mrq->cmd) == MMC_RSP_R1B)) &&
+	    (areq->mrq->cmd->resp[0] & R1_EXCEPTION_EVENT)) {
 		mmc_start_bkops(host->card, true);
 	}
 
 	/* Successfully postprocess the old request at this point */
-	mmc_post_req(host, host->areq->mrq, 0);
+	mmc_post_req(host, areq->mrq, 0);
 
-	return status;
+	areq->finalization_status = status;
+	complete(&areq->complete);
 }
+EXPORT_SYMBOL(mmc_finalize_areq);
 
 /**
  *	mmc_start_areq - start an asynchronous request
@@ -780,18 +778,22 @@ struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 	if (areq)
 		mmc_pre_req(host, areq->mrq);
 
-	/* Finalize previous request */
-	status = mmc_finalize_areq(host);
+	/* Finalize previous request, if there is one */
+	if (previous) {
+		wait_for_completion(&previous->complete);
+		status = previous->finalization_status;
+	} else {
+		status = MMC_BLK_SUCCESS;
+	}
 	if (ret_stat)
 		*ret_stat = status;
 
-	/* The previous request is still going on... */
-	if (status == MMC_BLK_NEW_REQUEST)
-		return NULL;
-
 	/* Fine so far, start the new request! */
-	if (status == MMC_BLK_SUCCESS && areq)
+	if (status == MMC_BLK_SUCCESS && areq) {
+		init_completion(&areq->complete);
+		areq->mrq->areq = areq;
 		start_err = __mmc_start_data_req(host, areq->mrq);
+	}
 
 	/* Cancel a prepared request if it was not started. */
 	if ((status != MMC_BLK_SUCCESS || start_err) && areq)
@@ -3005,7 +3007,6 @@ void mmc_init_context_info(struct mmc_host *host)
 	host->context_info.is_new_req = false;
 	host->context_info.is_done_rcv = false;
 	host->context_info.is_waiting_last_req = false;
-	init_waitqueue_head(&host->context_info.wait);
 }
 
 static int __init mmc_init(void)
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 71e6c6d7ceb7..e493d9d73fe2 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -13,6 +13,7 @@
 
 #include <linux/delay.h>
 #include <linux/sched.h>
+#include <linux/workqueue.h>
 
 struct mmc_host;
 struct mmc_card;
@@ -112,6 +113,7 @@ int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq);
 
 struct mmc_async_req;
 
+void mmc_finalize_areq(struct work_struct *work);
 struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 				     struct mmc_async_req *areq,
 				     enum mmc_blk_status *ret_stat);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 4f33d277b125..c46be4402803 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -111,7 +111,6 @@ static void mmc_request_fn(struct request_queue *q)
 
 	if (cntx->is_waiting_last_req) {
 		cntx->is_new_req = true;
-		wake_up_interruptible(&cntx->wait);
 	}
 
 	if (mq->asleep)
diff --git a/include/linux/mmc/core.h b/include/linux/mmc/core.h
index 927519385482..d755ef8ea880 100644
--- a/include/linux/mmc/core.h
+++ b/include/linux/mmc/core.h
@@ -13,6 +13,7 @@
 
 struct mmc_data;
 struct mmc_request;
+struct mmc_async_req;
 
 enum mmc_blk_status {
 	MMC_BLK_SUCCESS = 0,
@@ -23,7 +24,6 @@ enum mmc_blk_status {
 	MMC_BLK_DATA_ERR,
 	MMC_BLK_ECC_ERR,
 	MMC_BLK_NOMEDIUM,
-	MMC_BLK_NEW_REQUEST,
 };
 
 struct mmc_command {
@@ -155,6 +155,7 @@ struct mmc_request {
 
 	struct completion	completion;
 	struct completion	cmd_completion;
+	struct mmc_async_req	*areq; /* pointer to areq if any */
 	void			(*done)(struct mmc_request *);/* completion function */
 	/*
 	 * Notify uppers layers (e.g. mmc block driver) that recovery is needed
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 94a646eebf05..65f23a9ea724 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -14,6 +14,7 @@
 #include <linux/device.h>
 #include <linux/fault-inject.h>
 #include <linux/workqueue.h>
+#include <linux/completion.h>
 
 #include <linux/mmc/core.h>
 #include <linux/mmc/card.h>
@@ -215,6 +216,10 @@ struct mmc_async_req {
 	 * Returns 0 if success otherwise non zero.
 	 */
 	enum mmc_blk_status (*err_check)(struct mmc_card *, struct mmc_async_req *);
+	struct work_struct finalization_work;
+	enum mmc_blk_status finalization_status;
+	struct completion complete;
+	struct mmc_host *host;
 };
 
 /**
@@ -239,13 +244,11 @@ struct mmc_slot {
  * @is_done_rcv		wake up reason was done request
  * @is_new_req		wake up reason was new request
  * @is_waiting_last_req	mmc context waiting for single running request
- * @wait		wait queue
  */
 struct mmc_context_info {
 	bool			is_done_rcv;
 	bool			is_new_req;
 	bool			is_waiting_last_req;
-	wait_queue_head_t	wait;
 };
 
 struct regulator;
-- 
2.13.6


* [PATCH 04/12 v4] mmc: core: do away with is_done_rcv
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

The "is_done_rcv" in the context info for the host is no longer
needed: it is clear from context (ha!) that as long as we are
waiting for the asynchronous request to come to completion,
we are not done receiving data, and when the finalization work
has run and completed the completion, we are indeed done.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/core.c  | 40 ++++++++++++++++------------------------
 include/linux/mmc/host.h |  2 --
 2 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index f6a51608ab0b..68125360a078 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -370,10 +370,8 @@ EXPORT_SYMBOL(mmc_start_request);
 static void mmc_wait_data_done(struct mmc_request *mrq)
 {
 	struct mmc_host *host = mrq->host;
-	struct mmc_context_info *context_info = &host->context_info;
 	struct mmc_async_req *areq = mrq->areq;
 
-	context_info->is_done_rcv = true;
 	/* Schedule a work to deal with finalizing this request */
 	if (!areq)
 		pr_err("areq of the data mmc_request was NULL!\n");
@@ -656,7 +654,7 @@ EXPORT_SYMBOL(mmc_cqe_recovery);
 bool mmc_is_req_done(struct mmc_host *host, struct mmc_request *mrq)
 {
 	if (host->areq)
-		return host->context_info.is_done_rcv;
+		return completion_done(&host->areq->complete);
 	else
 		return completion_done(&mrq->completion);
 }
@@ -705,29 +703,24 @@ void mmc_finalize_areq(struct work_struct *work)
 	struct mmc_async_req *areq =
 		container_of(work, struct mmc_async_req, finalization_work);
 	struct mmc_host *host = areq->host;
-	struct mmc_context_info *context_info = &host->context_info;
 	enum mmc_blk_status status = MMC_BLK_SUCCESS;
+	struct mmc_command *cmd;
 
-	if (context_info->is_done_rcv) {
-		struct mmc_command *cmd;
-
-		context_info->is_done_rcv = false;
-		cmd = areq->mrq->cmd;
+	cmd = areq->mrq->cmd;
 
-		if (!cmd->error || !cmd->retries ||
-		    mmc_card_removed(host->card)) {
-			status = areq->err_check(host->card,
-						 areq);
-		} else {
-			mmc_retune_recheck(host);
-			pr_info("%s: req failed (CMD%u): %d, retrying...\n",
-				mmc_hostname(host),
-				cmd->opcode, cmd->error);
-			cmd->retries--;
-			cmd->error = 0;
-			__mmc_start_request(host, areq->mrq);
-			return; /* wait for done/new event again */
-		}
+	if (!cmd->error || !cmd->retries ||
+	    mmc_card_removed(host->card)) {
+		status = areq->err_check(host->card,
+					 areq);
+	} else {
+		mmc_retune_recheck(host);
+		pr_info("%s: req failed (CMD%u): %d, retrying...\n",
+			mmc_hostname(host),
+			cmd->opcode, cmd->error);
+		cmd->retries--;
+		cmd->error = 0;
+		__mmc_start_request(host, areq->mrq);
+		return; /* wait for done/new event again */
 	}
 
 	mmc_retune_release(host);
@@ -3005,7 +2998,6 @@ void mmc_unregister_pm_notifier(struct mmc_host *host)
 void mmc_init_context_info(struct mmc_host *host)
 {
 	host->context_info.is_new_req = false;
-	host->context_info.is_done_rcv = false;
 	host->context_info.is_waiting_last_req = false;
 }
 
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 65f23a9ea724..d536325a9640 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -241,12 +241,10 @@ struct mmc_slot {
 
 /**
  * mmc_context_info - synchronization details for mmc context
- * @is_done_rcv		wake up reason was done request
  * @is_new_req		wake up reason was new request
  * @is_waiting_last_req	mmc context waiting for single running request
  */
 struct mmc_context_info {
-	bool			is_done_rcv;
 	bool			is_new_req;
 	bool			is_waiting_last_req;
 };
-- 
2.13.6


* [PATCH 05/12 v4] mmc: core: do away with is_new_req
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

The host context member "is_new_req" is only assigned values,
never checked. Delete it.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/core.c  | 1 -
 drivers/mmc/core/queue.c | 5 -----
 include/linux/mmc/host.h | 2 --
 3 files changed, 8 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 68125360a078..ad832317f25b 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -2997,7 +2997,6 @@ void mmc_unregister_pm_notifier(struct mmc_host *host)
  */
 void mmc_init_context_info(struct mmc_host *host)
 {
-	host->context_info.is_new_req = false;
 	host->context_info.is_waiting_last_req = false;
 }
 
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index c46be4402803..4a0752ef6154 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -55,7 +55,6 @@ static int mmc_queue_thread(void *d)
 		req = blk_fetch_request(q);
 		mq->asleep = false;
 		cntx->is_waiting_last_req = false;
-		cntx->is_new_req = false;
 		if (!req) {
 			/*
 			 * Dispatch queue is empty so set flags for
@@ -109,10 +108,6 @@ static void mmc_request_fn(struct request_queue *q)
 
 	cntx = &mq->card->host->context_info;
 
-	if (cntx->is_waiting_last_req) {
-		cntx->is_new_req = true;
-	}
-
 	if (mq->asleep)
 		wake_up_process(mq->thread);
 }
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index d536325a9640..ceb58b27f402 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -241,11 +241,9 @@ struct mmc_slot {
 
 /**
  * mmc_context_info - synchronization details for mmc context
- * @is_new_req		wake up reason was new request
  * @is_waiting_last_req	mmc context waiting for single running request
  */
 struct mmc_context_info {
-	bool			is_new_req;
 	bool			is_waiting_last_req;
 };
 
-- 
2.13.6


* [PATCH 06/12 v4] mmc: core: kill off the context info
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

The last member of the context info, is_waiting_last_req, is
only ever assigned values, never checked. Delete it and, as a
result, the whole context info.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c |  2 --
 drivers/mmc/core/bus.c   |  1 -
 drivers/mmc/core/core.c  | 13 -------------
 drivers/mmc/core/core.h  |  2 --
 drivers/mmc/core/queue.c |  9 +--------
 include/linux/mmc/host.h |  9 ---------
 6 files changed, 1 insertion(+), 35 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 5c84175e49be..86ec87c17e71 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -2065,13 +2065,11 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 		default:
 			/* Normal request, just issue it */
 			mmc_blk_issue_rw_rq(mq, req);
-			card->host->context_info.is_waiting_last_req = false;
 			break;
 		}
 	} else {
 		/* No request, flushing the pipeline with NULL */
 		mmc_blk_issue_rw_rq(mq, NULL);
-		card->host->context_info.is_waiting_last_req = false;
 	}
 
 out:
diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index a4b49e25fe96..45904a7e87be 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -348,7 +348,6 @@ int mmc_add_card(struct mmc_card *card)
 #ifdef CONFIG_DEBUG_FS
 	mmc_add_card_debugfs(card);
 #endif
-	mmc_init_context_info(card->host);
 
 	card->dev.of_node = mmc_of_find_child_device(card->host, 0);
 
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index ad832317f25b..865db736c717 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -2987,19 +2987,6 @@ void mmc_unregister_pm_notifier(struct mmc_host *host)
 }
 #endif
 
-/**
- * mmc_init_context_info() - init synchronization context
- * @host: mmc host
- *
- * Init struct context_info needed to implement asynchronous
- * request mechanism, used by mmc core, host driver and mmc requests
- * supplier.
- */
-void mmc_init_context_info(struct mmc_host *host)
-{
-	host->context_info.is_waiting_last_req = false;
-}
-
 static int __init mmc_init(void)
 {
 	int ret;
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index e493d9d73fe2..88b852ac8f74 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -92,8 +92,6 @@ void mmc_remove_host_debugfs(struct mmc_host *host);
 void mmc_add_card_debugfs(struct mmc_card *card);
 void mmc_remove_card_debugfs(struct mmc_card *card);
 
-void mmc_init_context_info(struct mmc_host *host);
-
 int mmc_execute_tuning(struct mmc_card *card);
 int mmc_hs200_to_hs400(struct mmc_card *card);
 int mmc_hs400_to_hs200(struct mmc_card *card);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 4a0752ef6154..2c232ba4e594 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -42,7 +42,6 @@ static int mmc_queue_thread(void *d)
 {
 	struct mmc_queue *mq = d;
 	struct request_queue *q = mq->queue;
-	struct mmc_context_info *cntx = &mq->card->host->context_info;
 
 	current->flags |= PF_MEMALLOC;
 
@@ -54,15 +53,12 @@ static int mmc_queue_thread(void *d)
 		set_current_state(TASK_INTERRUPTIBLE);
 		req = blk_fetch_request(q);
 		mq->asleep = false;
-		cntx->is_waiting_last_req = false;
 		if (!req) {
 			/*
 			 * Dispatch queue is empty so set flags for
 			 * mmc_request_fn() to wake us up.
 			 */
-			if (mq->qcnt)
-				cntx->is_waiting_last_req = true;
-			else
+			if (!mq->qcnt)
 				mq->asleep = true;
 		}
 		spin_unlock_irq(q->queue_lock);
@@ -96,7 +92,6 @@ static void mmc_request_fn(struct request_queue *q)
 {
 	struct mmc_queue *mq = q->queuedata;
 	struct request *req;
-	struct mmc_context_info *cntx;
 
 	if (!mq) {
 		while ((req = blk_fetch_request(q)) != NULL) {
@@ -106,8 +101,6 @@ static void mmc_request_fn(struct request_queue *q)
 		return;
 	}
 
-	cntx = &mq->card->host->context_info;
-
 	if (mq->asleep)
 		wake_up_process(mq->thread);
 }
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index ceb58b27f402..638f11d185bd 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -239,14 +239,6 @@ struct mmc_slot {
 	void *handler_priv;
 };
 
-/**
- * mmc_context_info - synchronization details for mmc context
- * @is_waiting_last_req	mmc context waiting for single running request
- */
-struct mmc_context_info {
-	bool			is_waiting_last_req;
-};
-
 struct regulator;
 struct mmc_pwrseq;
 
@@ -421,7 +413,6 @@ struct mmc_host {
 	struct dentry		*debugfs_root;
 
 	struct mmc_async_req	*areq;		/* active async req */
-	struct mmc_context_info	context_info;	/* async synchronization info */
 
 	/* finalization workqueue, handles finalizing requests */
 	struct workqueue_struct	*req_done_wq;
-- 
2.13.6


* [PATCH 07/12 v4] mmc: queue: simplify queue logic
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

The if() statement checking whether there is no current or previous
request is now just looking ahead at something that will be
concluded a few lines below. Simplify the logic by moving the
assignment of .asleep.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/queue.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 2c232ba4e594..023bbddc1a0b 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -53,14 +53,6 @@ static int mmc_queue_thread(void *d)
 		set_current_state(TASK_INTERRUPTIBLE);
 		req = blk_fetch_request(q);
 		mq->asleep = false;
-		if (!req) {
-			/*
-			 * Dispatch queue is empty so set flags for
-			 * mmc_request_fn() to wake us up.
-			 */
-			if (!mq->qcnt)
-				mq->asleep = true;
-		}
 		spin_unlock_irq(q->queue_lock);
 
 		if (req || mq->qcnt) {
@@ -68,6 +60,7 @@ static int mmc_queue_thread(void *d)
 			mmc_blk_issue_rq(mq, req);
 			cond_resched();
 		} else {
+			mq->asleep = true;
 			if (kthread_should_stop()) {
 				set_current_state(TASK_RUNNING);
 				break;
-- 
2.13.6


* [PATCH 08/12 v4] mmc: block: shuffle retry and error handling
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

Instead of doing retries at the same time as trying to submit new
requests, do the retries when the request is reported as completed
by the driver, in the finalization worker.

This is achieved by letting the core worker call back into the block
layer using a callback in the asynchronous request,
->report_done_status(), that passes the status back to the block
core so it can repeatedly hammer the request using single-block
requests, retries, etc. by calling back into the core layer with
mmc_restart_areq(), which just kicks the same asynchronous request
without waiting for a previous ongoing request.

The beauty of it is that the completion will not complete until the
block layer has had the opportunity to hammer a bit at the card
using a bunch of different approaches that used to be in the
while() loop in mmc_blk_issue_rw_rq().

The algorithm for recapture, retry and error handling is identical
to the one we used to have in mmc_blk_issue_rw_rq(), only augmented
to be called in another path from the core.

We have to add and initialize a pointer back to the struct mmc_queue
from the struct mmc_queue_req to find the queue from the asynchronous
request when reporting the status back to the core.

Other users of the asynchronous request that do not need retries
or the misc error handling fallbacks will work fine, since a NULL
->report_done_status() is just fine. This is currently only the
case for the test module.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c | 337 ++++++++++++++++++++++++-----------------------
 drivers/mmc/core/core.c  |  47 ++++---
 drivers/mmc/core/core.h  |   1 +
 drivers/mmc/core/queue.c |   2 +
 drivers/mmc/core/queue.h |   1 +
 include/linux/mmc/host.h |   5 +-
 6 files changed, 210 insertions(+), 183 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 86ec87c17e71..c1178fa83f75 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1811,198 +1811,207 @@ static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
 /**
  * mmc_blk_rw_try_restart() - tries to restart the current async request
  * @mq: the queue with the card and host to restart
- * @req: a new request that want to be started after the current one
+ * @mqrq: the mmc_queue_request containing the areq to be restarted
  */
-static void mmc_blk_rw_try_restart(struct mmc_queue *mq, struct request *req,
+static void mmc_blk_rw_try_restart(struct mmc_queue *mq,
 				   struct mmc_queue_req *mqrq)
 {
-	if (!req)
-		return;
+	/* Proceed and try to restart the current async request */
+	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
+	mmc_restart_areq(mq->card->host, &mqrq->areq);
+}
+
+static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status status)
+{
+	struct mmc_queue *mq;
+	struct mmc_blk_data *md;
+	struct mmc_card *card;
+	struct mmc_host *host;
+	struct mmc_queue_req *mq_rq;
+	struct mmc_blk_request *brq;
+	struct request *old_req;
+	bool req_pending = true;
+	int disable_multi = 0, retry = 0, type, retune_retry_done = 0;
 
 	/*
-	 * If the card was removed, just cancel everything and return.
+	 * An asynchronous request has been completed and we proceed
+	 * to handle the result of it.
 	 */
-	if (mmc_card_removed(mq->card)) {
-		req->rq_flags |= RQF_QUIET;
-		blk_end_request_all(req, BLK_STS_IOERR);
-		mq->qcnt--; /* FIXME: just set to 0? */
+	mq_rq =	container_of(areq, struct mmc_queue_req, areq);
+	mq = mq_rq->mq;
+	md = mq->blkdata;
+	card = mq->card;
+	host = card->host;
+	brq = &mq_rq->brq;
+	old_req = mmc_queue_req_to_req(mq_rq);
+	type = rq_data_dir(old_req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+
+	switch (status) {
+	case MMC_BLK_SUCCESS:
+	case MMC_BLK_PARTIAL:
+		/*
+		 * A block was successfully transferred.
+		 */
+		mmc_blk_reset_success(md, type);
+		req_pending = blk_end_request(old_req, BLK_STS_OK,
+					      brq->data.bytes_xfered);
+		/*
+		 * If the blk_end_request function returns non-zero even
+		 * though all data has been transferred and no errors
+		 * were returned by the host controller, it's a bug.
+		 */
+		if (status == MMC_BLK_SUCCESS && req_pending) {
+			pr_err("%s BUG rq_tot %d d_xfer %d\n",
+			       __func__, blk_rq_bytes(old_req),
+			       brq->data.bytes_xfered);
+			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
+			return;
+		}
+		break;
+	case MMC_BLK_CMD_ERR:
+		req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
+		if (mmc_blk_reset(md, card->host, type)) {
+			if (req_pending)
+				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
+			else
+				mq->qcnt--;
+			mmc_blk_rw_try_restart(mq, mq_rq);
+			return;
+		}
+		if (!req_pending) {
+			mq->qcnt--;
+			mmc_blk_rw_try_restart(mq, mq_rq);
+			return;
+		}
+		break;
+	case MMC_BLK_RETRY:
+		retune_retry_done = brq->retune_retry_done;
+		if (retry++ < 5)
+			break;
+		/* Fall through */
+	case MMC_BLK_ABORT:
+		if (!mmc_blk_reset(md, card->host, type))
+			break;
+		mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
+		mmc_blk_rw_try_restart(mq, mq_rq);
+		return;
+	case MMC_BLK_DATA_ERR: {
+		int err;
+			err = mmc_blk_reset(md, card->host, type);
+		if (!err)
+			break;
+		if (err == -ENODEV) {
+			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
+			mmc_blk_rw_try_restart(mq, mq_rq);
+			return;
+		}
+		/* Fall through */
+	}
+	case MMC_BLK_ECC_ERR:
+		if (brq->data.blocks > 1) {
+			/* Redo read one sector at a time */
+			pr_warn("%s: retrying using single block read\n",
+				old_req->rq_disk->disk_name);
+			disable_multi = 1;
+			break;
+		}
+		/*
+		 * After an error, we redo I/O one sector at a
+		 * time, so we only reach here after trying to
+		 * read a single sector.
+		 */
+		req_pending = blk_end_request(old_req, BLK_STS_IOERR,
+					      brq->data.blksz);
+		if (!req_pending) {
+			mq->qcnt--;
+			mmc_blk_rw_try_restart(mq, mq_rq);
+			return;
+		}
+		break;
+	case MMC_BLK_NOMEDIUM:
+		mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
+		mmc_blk_rw_try_restart(mq, mq_rq);
+		return;
+	default:
+		pr_err("%s: Unhandled return value (%d)",
+				old_req->rq_disk->disk_name, status);
+		mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
+		mmc_blk_rw_try_restart(mq, mq_rq);
 		return;
 	}
-	/* Else proceed and try to restart the current async request */
-	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
-	mmc_start_areq(mq->card->host, &mqrq->areq, NULL);
+
+	if (req_pending) {
+		/*
+		 * In case of a incomplete request
+		 * prepare it again and resend.
+		 */
+		mmc_blk_rw_rq_prep(mq_rq, card,
+				disable_multi, mq);
+		mmc_start_areq(card->host, areq, NULL);
+		mq_rq->brq.retune_retry_done = retune_retry_done;
+	} else {
+		/* Else, this request is done */
+		mq->qcnt--;
+	}
 }
 
 static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 {
-	struct mmc_blk_data *md = mq->blkdata;
-	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request *brq;
-	int disable_multi = 0, retry = 0, type, retune_retry_done = 0;
 	enum mmc_blk_status status;
-	struct mmc_queue_req *mqrq_cur = NULL;
-	struct mmc_queue_req *mq_rq;
-	struct request *old_req;
 	struct mmc_async_req *new_areq;
 	struct mmc_async_req *old_areq;
-	bool req_pending = true;
+	struct mmc_card *card = mq->card;
 
-	if (new_req) {
-		mqrq_cur = req_to_mmc_queue_req(new_req);
+	if (new_req)
 		mq->qcnt++;
-	}
 
 	if (!mq->qcnt)
 		return;
 
-	do {
-		if (new_req) {
-			/*
-			 * When 4KB native sector is enabled, only 8 blocks
-			 * multiple read or write is allowed
-			 */
-			if (mmc_large_sector(card) &&
-				!IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
-				pr_err("%s: Transfer size is not 4KB sector size aligned\n",
-					new_req->rq_disk->disk_name);
-				mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
-				return;
-			}
-
-			mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
-			new_areq = &mqrq_cur->areq;
-		} else
-			new_areq = NULL;
-
-		old_areq = mmc_start_areq(card->host, new_areq, &status);
-		if (!old_areq) {
-			/*
-			 * We have just put the first request into the pipeline
-			 * and there is nothing more to do until it is
-			 * complete.
-			 */
-			return;
-		}
+	/*
+	 * If the card was removed, just cancel everything and return.
+	 */
+	if (mmc_card_removed(card)) {
+		new_req->rq_flags |= RQF_QUIET;
+		blk_end_request_all(new_req, BLK_STS_IOERR);
+		mq->qcnt--; /* FIXME: just set to 0? */
+		return;
+	}
 
+	if (new_req) {
+		struct mmc_queue_req *mqrq_cur = req_to_mmc_queue_req(new_req);
 		/*
-		 * An asynchronous request has been completed and we proceed
-		 * to handle the result of it.
+		 * When 4KB native sector is enabled, only 8 blocks
+		 * multiple read or write is allowed
 		 */
-		mq_rq =	container_of(old_areq, struct mmc_queue_req, areq);
-		brq = &mq_rq->brq;
-		old_req = mmc_queue_req_to_req(mq_rq);
-		type = rq_data_dir(old_req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
-
-		switch (status) {
-		case MMC_BLK_SUCCESS:
-		case MMC_BLK_PARTIAL:
-			/*
-			 * A block was successfully transferred.
-			 */
-			mmc_blk_reset_success(md, type);
-
-			req_pending = blk_end_request(old_req, BLK_STS_OK,
-						      brq->data.bytes_xfered);
-			/*
-			 * If the blk_end_request function returns non-zero even
-			 * though all data has been transferred and no errors
-			 * were returned by the host controller, it's a bug.
-			 */
-			if (status == MMC_BLK_SUCCESS && req_pending) {
-				pr_err("%s BUG rq_tot %d d_xfer %d\n",
-				       __func__, blk_rq_bytes(old_req),
-				       brq->data.bytes_xfered);
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				return;
-			}
-			break;
-		case MMC_BLK_CMD_ERR:
-			req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
-			if (mmc_blk_reset(md, card->host, type)) {
-				if (req_pending)
-					mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				else
-					mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_RETRY:
-			retune_retry_done = brq->retune_retry_done;
-			if (retry++ < 5)
-				break;
-			/* Fall through */
-		case MMC_BLK_ABORT:
-			if (!mmc_blk_reset(md, card->host, type))
-				break;
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		case MMC_BLK_DATA_ERR: {
-			int err;
-
-			err = mmc_blk_reset(md, card->host, type);
-			if (!err)
-				break;
-			if (err == -ENODEV) {
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			/* Fall through */
-		}
-		case MMC_BLK_ECC_ERR:
-			if (brq->data.blocks > 1) {
-				/* Redo read one sector at a time */
-				pr_warn("%s: retrying using single block read\n",
-					old_req->rq_disk->disk_name);
-				disable_multi = 1;
-				break;
-			}
-			/*
-			 * After an error, we redo I/O one sector at a
-			 * time, so we only reach here after trying to
-			 * read a single sector.
-			 */
-			req_pending = blk_end_request(old_req, BLK_STS_IOERR,
-						      brq->data.blksz);
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_NOMEDIUM:
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		default:
-			pr_err("%s: Unhandled return value (%d)",
-					old_req->rq_disk->disk_name, status);
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
+		if (mmc_large_sector(card) &&
+		    !IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
+			pr_err("%s: Transfer size is not 4KB sector size aligned\n",
+			       new_req->rq_disk->disk_name);
+			mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
 			return;
 		}
 
-		if (req_pending) {
-			/*
-			 * In case of a incomplete request
-			 * prepare it again and resend.
-			 */
-			mmc_blk_rw_rq_prep(mq_rq, card,
-					disable_multi, mq);
-			mmc_start_areq(card->host,
-					&mq_rq->areq, NULL);
-			mq_rq->brq.retune_retry_done = retune_retry_done;
-		}
-	} while (req_pending);
+		mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
+		new_areq = &mqrq_cur->areq;
+		new_areq->report_done_status = mmc_blk_rw_done;
+	} else
+		new_areq = NULL;
 
-	mq->qcnt--;
+	old_areq = mmc_start_areq(card->host, new_areq, &status);
+	if (!old_areq) {
+		/*
+		 * We have just put the first request into the pipeline
+		 * and there is nothing more to do until it is
+		 * complete.
+		 */
+		return;
+	}
+	/*
+	 * FIXME: yes, we just discard the old_areq, it will be
+	 * post-processed when done, in mmc_blk_rw_done(). We clean
+	 * this up in later patches.
+	 */
 }
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 865db736c717..620dcbed15b7 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -738,12 +738,28 @@ void mmc_finalize_areq(struct work_struct *work)
 	/* Successfully postprocess the old request at this point */
 	mmc_post_req(host, areq->mrq, 0);
 
-	areq->finalization_status = status;
+	/* Call back with status, this will trigger retry etc if needed */
+	if (areq->report_done_status)
+		areq->report_done_status(areq, status);
+
+	/* This opens the gate for the next request to start on the host */
 	complete(&areq->complete);
 }
 EXPORT_SYMBOL(mmc_finalize_areq);
 
 /**
+ * mmc_restart_areq() - restart an asynchronous request
+ * @host: MMC host to restart the command on
+ * @areq: the asynchronous request to restart
+ */
+int mmc_restart_areq(struct mmc_host *host,
+		     struct mmc_async_req *areq)
+{
+	return __mmc_start_data_req(host, areq->mrq);
+}
+EXPORT_SYMBOL(mmc_restart_areq);
+
+/**
  *	mmc_start_areq - start an asynchronous request
  *	@host: MMC host to start command
  *	@areq: asynchronous request to start
@@ -763,7 +779,6 @@ struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 				     struct mmc_async_req *areq,
 				     enum mmc_blk_status *ret_stat)
 {
-	enum mmc_blk_status status;
 	int start_err = 0;
 	struct mmc_async_req *previous = host->areq;
 
@@ -772,31 +787,27 @@ struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 		mmc_pre_req(host, areq->mrq);
 
 	/* Finalize previous request, if there is one */
-	if (previous) {
+	if (previous)
 		wait_for_completion(&previous->complete);
-		status = previous->finalization_status;
-	} else {
-		status = MMC_BLK_SUCCESS;
-	}
+
+	/* Just always succeed */
 	if (ret_stat)
-		*ret_stat = status;
+		*ret_stat = MMC_BLK_SUCCESS;
 
 	/* Fine so far, start the new request! */
-	if (status == MMC_BLK_SUCCESS && areq) {
+	if (areq) {
 		init_completion(&areq->complete);
 		areq->mrq->areq = areq;
 		start_err = __mmc_start_data_req(host, areq->mrq);
+		/* Cancel a prepared request if it was not started. */
+		if (start_err) {
+			mmc_post_req(host, areq->mrq, -EINVAL);
+			host->areq = NULL;
+		} else {
+			host->areq = areq;
+		}
 	}
 
-	/* Cancel a prepared request if it was not started. */
-	if ((status != MMC_BLK_SUCCESS || start_err) && areq)
-		mmc_post_req(host, areq->mrq, -EINVAL);
-
-	if (status != MMC_BLK_SUCCESS)
-		host->areq = NULL;
-	else
-		host->areq = areq;
-
 	return previous;
 }
 EXPORT_SYMBOL(mmc_start_areq);
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 88b852ac8f74..1859804ecd80 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -112,6 +112,7 @@ int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq);
 struct mmc_async_req;
 
 void mmc_finalize_areq(struct work_struct *work);
+int mmc_restart_areq(struct mmc_host *host, struct mmc_async_req *areq);
 struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
 				     struct mmc_async_req *areq,
 				     enum mmc_blk_status *ret_stat);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 023bbddc1a0b..db1fa11d9870 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -145,6 +145,7 @@ static int mmc_init_request(struct request_queue *q, struct request *req,
 	mq_rq->sg = mmc_alloc_sg(host->max_segs, gfp);
 	if (!mq_rq->sg)
 		return -ENOMEM;
+	mq_rq->mq = mq;
 
 	return 0;
 }
@@ -155,6 +156,7 @@ static void mmc_exit_request(struct request_queue *q, struct request *req)
 
 	kfree(mq_rq->sg);
 	mq_rq->sg = NULL;
+	mq_rq->mq = NULL;
 }
 
 static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 68f68ecd94ea..dce7cedb9d0b 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -52,6 +52,7 @@ struct mmc_queue_req {
 	struct mmc_blk_request	brq;
 	struct scatterlist	*sg;
 	struct mmc_async_req	areq;
+	struct mmc_queue	*mq;
 	enum mmc_drv_op		drv_op;
 	int			drv_op_result;
 	void			*drv_op_data;
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 638f11d185bd..74859a71e14b 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -216,8 +216,11 @@ struct mmc_async_req {
 	 * Returns 0 if success otherwise non zero.
 	 */
 	enum mmc_blk_status (*err_check)(struct mmc_card *, struct mmc_async_req *);
+	/*
+	 * Report finalization status from the core to e.g. the block layer.
+	 */
+	void (*report_done_status)(struct mmc_async_req *, enum mmc_blk_status);
 	struct work_struct finalization_work;
-	enum mmc_blk_status finalization_status;
 	struct completion complete;
 	struct mmc_host *host;
 };
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 09/12 v4] mmc: queue: stop flushing the pipeline with NULL
  2017-10-26 12:57 [PATCH 00/12 v4] multiqueue for MMC/SD Linus Walleij
                   ` (7 preceding siblings ...)
  2017-10-26 12:57 ` [PATCH 08/12 v4] mmc: block: shuffle retry and error handling Linus Walleij
@ 2017-10-26 12:57 ` Linus Walleij
  2017-10-26 12:57 ` [PATCH 10/12 v4] mmc: queue/block: pass around struct mmc_queue_req*s Linus Walleij
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

Remove all the pipeline flushing: i.e. repeatedly sending NULL
down to the core layer to flush out asynchronous requests,
and also sending NULL after "special" commands to achieve the
same flush.

Instead: let the "special" commands wait for any ongoing
asynchronous transfers using the completion, and apart from
that expect the core.c and block.c layers to deal with the
ongoing requests autonomously without any "push" from the
queue.

Add a function in the core to wait for an asynchronous request
to complete.
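
Roughly, and condensed from the hunks below, the core grows a helper
that waits on the completion of any ongoing asynchronous request:

void mmc_wait_for_areq(struct mmc_host *host)
{
        if (host->areq) {
                wait_for_completion(&host->areq->complete);
                host->areq = NULL;
        }
}

so that e.g. the flush path simply becomes:

        case REQ_OP_FLUSH:
                mmc_wait_for_areq(card->host);
                mmc_blk_issue_flush(mq, req);
                break;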

Update the tests to use the new function prototypes.

This kills off some FIXMEs, such as getting rid of the mq->qcnt
queue depth variable that was introduced a while back.

It is a vital step toward multiqueue enablement that we stop
pulling NULL off the end of the request queue to flush the
asynchronous issuing mechanism.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c    | 168 ++++++++++++++++----------------------------
 drivers/mmc/core/core.c     |  50 +++++++------
 drivers/mmc/core/core.h     |   6 +-
 drivers/mmc/core/mmc_test.c |  31 ++------
 drivers/mmc/core/queue.c    |  11 ++-
 drivers/mmc/core/queue.h    |   7 --
 6 files changed, 106 insertions(+), 167 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index c1178fa83f75..ab01cab4a026 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1805,7 +1805,6 @@ static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
 	if (mmc_card_removed(card))
 		req->rq_flags |= RQF_QUIET;
 	while (blk_end_request(req, BLK_STS_IOERR, blk_rq_cur_bytes(req)));
-	mq->qcnt--;
 }
 
 /**
@@ -1873,13 +1872,10 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		if (mmc_blk_reset(md, card->host, type)) {
 			if (req_pending)
 				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			else
-				mq->qcnt--;
 			mmc_blk_rw_try_restart(mq, mq_rq);
 			return;
 		}
 		if (!req_pending) {
-			mq->qcnt--;
 			mmc_blk_rw_try_restart(mq, mq_rq);
 			return;
 		}
@@ -1923,7 +1919,6 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		req_pending = blk_end_request(old_req, BLK_STS_IOERR,
 					      brq->data.blksz);
 		if (!req_pending) {
-			mq->qcnt--;
 			mmc_blk_rw_try_restart(mq, mq_rq);
 			return;
 		}
@@ -1947,26 +1942,16 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		 */
 		mmc_blk_rw_rq_prep(mq_rq, card,
 				disable_multi, mq);
-		mmc_start_areq(card->host, areq, NULL);
+		mmc_start_areq(card->host, areq);
 		mq_rq->brq.retune_retry_done = retune_retry_done;
-	} else {
-		/* Else, this request is done */
-		mq->qcnt--;
 	}
 }
 
 static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 {
-	enum mmc_blk_status status;
-	struct mmc_async_req *new_areq;
-	struct mmc_async_req *old_areq;
 	struct mmc_card *card = mq->card;
-
-	if (new_req)
-		mq->qcnt++;
-
-	if (!mq->qcnt)
-		return;
+	struct mmc_queue_req *mqrq_cur = req_to_mmc_queue_req(new_req);
+	struct mmc_async_req *areq = &mqrq_cur->areq;
 
 	/*
 	 * If the card was removed, just cancel everything and return.
@@ -1974,44 +1959,25 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 	if (mmc_card_removed(card)) {
 		new_req->rq_flags |= RQF_QUIET;
 		blk_end_request_all(new_req, BLK_STS_IOERR);
-		mq->qcnt--; /* FIXME: just set to 0? */
 		return;
 	}
 
-	if (new_req) {
-		struct mmc_queue_req *mqrq_cur = req_to_mmc_queue_req(new_req);
-		/*
-		 * When 4KB native sector is enabled, only 8 blocks
-		 * multiple read or write is allowed
-		 */
-		if (mmc_large_sector(card) &&
-		    !IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
-			pr_err("%s: Transfer size is not 4KB sector size aligned\n",
-			       new_req->rq_disk->disk_name);
-			mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
-			return;
-		}
-
-		mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
-		new_areq = &mqrq_cur->areq;
-		new_areq->report_done_status = mmc_blk_rw_done;
-	} else
-		new_areq = NULL;
 
-	old_areq = mmc_start_areq(card->host, new_areq, &status);
-	if (!old_areq) {
-		/*
-		 * We have just put the first request into the pipeline
-		 * and there is nothing more to do until it is
-		 * complete.
-		 */
-		return;
-	}
 	/*
-	 * FIXME: yes, we just discard the old_areq, it will be
-	 * post-processed when done, in mmc_blk_rw_done(). We clean
-	 * this up in later patches.
+	 * When 4KB native sector is enabled, only 8 blocks
+	 * multiple read or write is allowed
 	 */
+	if (mmc_large_sector(card) &&
+	    !IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
+		pr_err("%s: Transfer size is not 4KB sector size aligned\n",
+		       new_req->rq_disk->disk_name);
+		mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
+		return;
+	}
+
+	mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
+	areq->report_done_status = mmc_blk_rw_done;
+	mmc_start_areq(card->host, areq);
 }
 
 void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
@@ -2020,70 +1986,56 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 	struct mmc_blk_data *md = mq->blkdata;
 	struct mmc_card *card = md->queue.card;
 
-	if (req && !mq->qcnt)
-		/* claim host only for the first request */
-		mmc_get_card(card, NULL);
+	if (!req) {
+		pr_err("%s: tried to issue NULL request\n", __func__);
+		return;
+	}
 
 	ret = mmc_blk_part_switch(card, md->part_type);
 	if (ret) {
-		if (req) {
-			blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		goto out;
+		blk_end_request_all(req, BLK_STS_IOERR);
+		return;
 	}
 
-	if (req) {
-		switch (req_op(req)) {
-		case REQ_OP_DRV_IN:
-		case REQ_OP_DRV_OUT:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * ioctl()s
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_drv_op(mq, req);
-			break;
-		case REQ_OP_DISCARD:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * discard.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_discard_rq(mq, req);
-			break;
-		case REQ_OP_SECURE_ERASE:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * secure erase.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_secdiscard_rq(mq, req);
-			break;
-		case REQ_OP_FLUSH:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * flush.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
-			mmc_blk_issue_flush(mq, req);
-			break;
-		default:
-			/* Normal request, just issue it */
-			mmc_blk_issue_rw_rq(mq, req);
-			break;
-		}
-	} else {
-		/* No request, flushing the pipeline with NULL */
-		mmc_blk_issue_rw_rq(mq, NULL);
+	switch (req_op(req)) {
+	case REQ_OP_DRV_IN:
+	case REQ_OP_DRV_OUT:
+		/*
+		 * Complete ongoing async transfer before issuing
+		 * ioctl()s
+		 */
+		mmc_wait_for_areq(card->host);
+		mmc_blk_issue_drv_op(mq, req);
+		break;
+	case REQ_OP_DISCARD:
+		/*
+		 * Complete ongoing async transfer before issuing
+		 * discard.
+		 */
+		mmc_wait_for_areq(card->host);
+		mmc_blk_issue_discard_rq(mq, req);
+		break;
+	case REQ_OP_SECURE_ERASE:
+		/*
+		 * Complete ongoing async transfer before issuing
+		 * secure erase.
+		 */
+		mmc_wait_for_areq(card->host);
+		mmc_blk_issue_secdiscard_rq(mq, req);
+		break;
+	case REQ_OP_FLUSH:
+		/*
+		 * Complete ongoing async transfer before issuing
+		 * flush.
+		 */
+		mmc_wait_for_areq(card->host);
+		mmc_blk_issue_flush(mq, req);
+		break;
+	default:
+		/* Normal request, just issue it */
+		mmc_blk_issue_rw_rq(mq, req);
+		break;
 	}
-
-out:
-	if (!mq->qcnt)
-		mmc_put_card(card, NULL);
 }
 
 static inline int mmc_blk_readonly(struct mmc_card *card)
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 620dcbed15b7..209ebb8a7f3f 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -747,6 +747,15 @@ void mmc_finalize_areq(struct work_struct *work)
 }
 EXPORT_SYMBOL(mmc_finalize_areq);
 
+void mmc_wait_for_areq(struct mmc_host *host)
+{
+	if (host->areq) {
+		wait_for_completion(&host->areq->complete);
+		host->areq = NULL;
+	}
+}
+EXPORT_SYMBOL(mmc_wait_for_areq);
+
 /**
  * mmc_restart_areq() - restart an asynchronous request
  * @host: MMC host to restart the command on
@@ -775,40 +784,37 @@ EXPORT_SYMBOL(mmc_restart_areq);
  *	return the completed request. If there is no ongoing request, NULL
  *	is returned without waiting. NULL is not an error condition.
  */
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat)
+int mmc_start_areq(struct mmc_host *host,
+		   struct mmc_async_req *areq)
 {
-	int start_err = 0;
 	struct mmc_async_req *previous = host->areq;
+	int ret;
+
+	/* Delete this check when we trust the code */
+	if (!areq)
+		pr_err("%s: NULL asynchronous request!\n", __func__);
 
 	/* Prepare a new request */
-	if (areq)
-		mmc_pre_req(host, areq->mrq);
+	mmc_pre_req(host, areq->mrq);
 
 	/* Finalize previous request, if there is one */
 	if (previous)
 		wait_for_completion(&previous->complete);
 
-	/* Just always succeed */
-	if (ret_stat)
-		*ret_stat = MMC_BLK_SUCCESS;
-
 	/* Fine so far, start the new request! */
-	if (areq) {
-		init_completion(&areq->complete);
-		areq->mrq->areq = areq;
-		start_err = __mmc_start_data_req(host, areq->mrq);
-		/* Cancel a prepared request if it was not started. */
-		if (start_err) {
-			mmc_post_req(host, areq->mrq, -EINVAL);
-			host->areq = NULL;
-		} else {
-			host->areq = areq;
-		}
+	init_completion(&areq->complete);
+	areq->mrq->areq = areq;
+	ret = __mmc_start_data_req(host, areq->mrq);
+	/* Cancel a prepared request if it was not started. */
+	if (ret) {
+		mmc_post_req(host, areq->mrq, -EINVAL);
+		host->areq = NULL;
+		pr_err("%s: failed to start request\n", __func__);
+	} else {
+		host->areq = areq;
 	}
 
-	return previous;
+	return ret;
 }
 EXPORT_SYMBOL(mmc_start_areq);
 
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 1859804ecd80..5b8d0f1147ef 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -113,9 +113,9 @@ struct mmc_async_req;
 
 void mmc_finalize_areq(struct work_struct *work);
 int mmc_restart_areq(struct mmc_host *host, struct mmc_async_req *areq);
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat);
+int mmc_start_areq(struct mmc_host *host,
+		   struct mmc_async_req *areq);
+void mmc_wait_for_areq(struct mmc_host *host);
 
 int mmc_erase(struct mmc_card *card, unsigned int from, unsigned int nr,
 		unsigned int arg);
diff --git a/drivers/mmc/core/mmc_test.c b/drivers/mmc/core/mmc_test.c
index 478869805b96..256fdce38449 100644
--- a/drivers/mmc/core/mmc_test.c
+++ b/drivers/mmc/core/mmc_test.c
@@ -839,10 +839,8 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 {
 	struct mmc_test_req *rq1, *rq2;
 	struct mmc_test_async_req test_areq[2];
-	struct mmc_async_req *done_areq;
 	struct mmc_async_req *cur_areq = &test_areq[0].areq;
 	struct mmc_async_req *other_areq = &test_areq[1].areq;
-	enum mmc_blk_status status;
 	int i;
 	int ret = RESULT_OK;
 
@@ -864,25 +862,16 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 	for (i = 0; i < count; i++) {
 		mmc_test_prepare_mrq(test, cur_areq->mrq, sg, sg_len, dev_addr,
 				     blocks, blksz, write);
-		done_areq = mmc_start_areq(test->card->host, cur_areq, &status);
+		ret = mmc_start_areq(test->card->host, cur_areq);
+		mmc_wait_for_areq(test->card->host);
 
-		if (status != MMC_BLK_SUCCESS || (!done_areq && i > 0)) {
-			ret = RESULT_FAIL;
-			goto err;
-		}
-
-		if (done_areq)
-			mmc_test_req_reset(container_of(done_areq->mrq,
+		mmc_test_req_reset(container_of(cur_areq->mrq,
 						struct mmc_test_req, mrq));
 
 		swap(cur_areq, other_areq);
 		dev_addr += blocks;
 	}
 
-	done_areq = mmc_start_areq(test->card->host, NULL, &status);
-	if (status != MMC_BLK_SUCCESS)
-		ret = RESULT_FAIL;
-
 err:
 	kfree(rq1);
 	kfree(rq2);
@@ -2360,7 +2349,6 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 	struct mmc_request *mrq;
 	unsigned long timeout;
 	bool expired = false;
-	enum mmc_blk_status blkstat = MMC_BLK_SUCCESS;
 	int ret = 0, cmd_ret;
 	u32 status = 0;
 	int count = 0;
@@ -2388,11 +2376,8 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 
 	/* Start ongoing data request */
 	if (use_areq) {
-		mmc_start_areq(host, &test_areq.areq, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS) {
-			ret = RESULT_FAIL;
-			goto out_free;
-		}
+		mmc_start_areq(host, &test_areq.areq);
+		mmc_wait_for_areq(host);
 	} else {
 		mmc_wait_for_req(host, mrq);
 	}
@@ -2425,11 +2410,7 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 	} while (repeat_cmd && R1_CURRENT_STATE(status) != R1_STATE_TRAN);
 
 	/* Wait for data request to complete */
-	if (use_areq) {
-		mmc_start_areq(host, NULL, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS)
-			ret = RESULT_FAIL;
-	} else {
+	if (!use_areq) {
 		mmc_wait_for_req_done(test->card->host, mrq);
 	}
 
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index db1fa11d9870..cf43a2d5410d 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -42,6 +42,7 @@ static int mmc_queue_thread(void *d)
 {
 	struct mmc_queue *mq = d;
 	struct request_queue *q = mq->queue;
+	bool claimed_card = false;
 
 	current->flags |= PF_MEMALLOC;
 
@@ -55,7 +56,11 @@ static int mmc_queue_thread(void *d)
 		mq->asleep = false;
 		spin_unlock_irq(q->queue_lock);
 
-		if (req || mq->qcnt) {
+		if (req) {
+			if (!claimed_card) {
+				mmc_get_card(mq->card, NULL);
+				claimed_card = true;
+			}
 			set_current_state(TASK_RUNNING);
 			mmc_blk_issue_rq(mq, req);
 			cond_resched();
@@ -72,6 +77,9 @@ static int mmc_queue_thread(void *d)
 	} while (1);
 	up(&mq->thread_sem);
 
+	if (claimed_card)
+		mmc_put_card(mq->card, NULL);
+
 	return 0;
 }
 
@@ -207,7 +215,6 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	mq->queue->exit_rq_fn = mmc_exit_request;
 	mq->queue->cmd_size = sizeof(struct mmc_queue_req);
 	mq->queue->queuedata = mq;
-	mq->qcnt = 0;
 	ret = blk_init_allocated_queue(mq->queue);
 	if (ret) {
 		blk_cleanup_queue(mq->queue);
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index dce7cedb9d0b..67ae311b107f 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -67,13 +67,6 @@ struct mmc_queue {
 	bool			asleep;
 	struct mmc_blk_data	*blkdata;
 	struct request_queue	*queue;
-	/*
-	 * FIXME: this counter is not a very reliable way of keeping
-	 * track of how many requests that are ongoing. Switch to just
-	 * letting the block core keep track of requests and per-request
-	 * associated mmc_queue_req data.
-	 */
-	int			qcnt;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 10/12 v4] mmc: queue/block: pass around struct mmc_queue_req*s
  2017-10-26 12:57 [PATCH 00/12 v4] multiqueue for MMC/SD Linus Walleij
                   ` (8 preceding siblings ...)
  2017-10-26 12:57 ` [PATCH 09/12 v4] mmc: queue: stop flushing the pipeline with NULL Linus Walleij
@ 2017-10-26 12:57 ` Linus Walleij
  2017-10-26 12:57 ` [PATCH 11/12 v4] mmc: block: issue requests in massive parallel Linus Walleij
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

Instead of passing around several pointers to mmc_queue_req,
request and mmc_queue, and reassigning them left and right, pass a
single mmc_queue_req and dereference the queue and request from
the mmc_queue_req where needed.

The struct mmc_queue_req is the thing that has a lifecycle after
all: this is what we are keeping in our queue, and what the block
layer helps us manage. Rework a bunch of functions to take a
single argument so we can see the trees and not just a big
jungle of arguments.
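
As an example of the resulting pattern (condensed from the hunks
below), the flush path now takes just the mmc_queue_req and digs out
everything else it needs from there:

static void mmc_blk_issue_flush(struct mmc_queue_req *mq_rq)
{
        struct mmc_blk_data *md = mq_rq->mq->blkdata;
        struct mmc_card *card = md->queue.card;
        int ret = mmc_flush_cache(card);

        blk_end_request_all(mmc_queue_req_to_req(mq_rq),
                            ret ? BLK_STS_IOERR : BLK_STS_OK);
}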

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c | 129 ++++++++++++++++++++++++-----------------------
 drivers/mmc/core/block.h |   5 +-
 drivers/mmc/core/queue.c |   2 +-
 3 files changed, 70 insertions(+), 66 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index ab01cab4a026..184907f5fb97 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1208,9 +1208,9 @@ static inline void mmc_blk_reset_success(struct mmc_blk_data *md, int type)
  * processed it with all other requests and then they get issued in this
  * function.
  */
-static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
+static void mmc_blk_issue_drv_op(struct mmc_queue_req *mq_rq)
 {
-	struct mmc_queue_req *mq_rq;
+	struct mmc_queue *mq = mq_rq->mq;
 	struct mmc_card *card = mq->card;
 	struct mmc_blk_data *md = mq->blkdata;
 	struct mmc_blk_ioc_data **idata;
@@ -1220,7 +1220,6 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 	int ret;
 	int i;
 
-	mq_rq = req_to_mmc_queue_req(req);
 	rpmb_ioctl = (mq_rq->drv_op == MMC_DRV_OP_IOCTL_RPMB);
 
 	switch (mq_rq->drv_op) {
@@ -1264,12 +1263,14 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 		break;
 	}
 	mq_rq->drv_op_result = ret;
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_end_request_all(mmc_queue_req_to_req(mq_rq),
+			    ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
-static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
+static void mmc_blk_issue_discard_rq(struct mmc_queue_req *mq_rq)
 {
-	struct mmc_blk_data *md = mq->blkdata;
+	struct request *req = mmc_queue_req_to_req(mq_rq);
+	struct mmc_blk_data *md = mq_rq->mq->blkdata;
 	struct mmc_card *card = md->queue.card;
 	unsigned int from, nr, arg;
 	int err = 0, type = MMC_BLK_DISCARD;
@@ -1310,10 +1311,10 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
 	blk_end_request(req, status, blk_rq_bytes(req));
 }
 
-static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
-				       struct request *req)
+static void mmc_blk_issue_secdiscard_rq(struct mmc_queue_req *mq_rq)
 {
-	struct mmc_blk_data *md = mq->blkdata;
+	struct request *req = mmc_queue_req_to_req(mq_rq);
+	struct mmc_blk_data *md = mq_rq->mq->blkdata;
 	struct mmc_card *card = md->queue.card;
 	unsigned int from, nr, arg;
 	int err = 0, type = MMC_BLK_SECDISCARD;
@@ -1380,14 +1381,15 @@ static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
 	blk_end_request(req, status, blk_rq_bytes(req));
 }
 
-static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
+static void mmc_blk_issue_flush(struct mmc_queue_req *mq_rq)
 {
-	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_blk_data *md = mq_rq->mq->blkdata;
 	struct mmc_card *card = md->queue.card;
 	int ret = 0;
 
 	ret = mmc_flush_cache(card);
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_end_request_all(mmc_queue_req_to_req(mq_rq),
+			    ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 /*
@@ -1698,18 +1700,18 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		*do_data_tag_p = do_data_tag;
 }
 
-static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
-			       struct mmc_card *card,
-			       int disable_multi,
-			       struct mmc_queue *mq)
+static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mq_rq,
+			       int disable_multi)
 {
 	u32 readcmd, writecmd;
-	struct mmc_blk_request *brq = &mqrq->brq;
-	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct mmc_queue *mq = mq_rq->mq;
+	struct mmc_card *card = mq->card;
+	struct mmc_blk_request *brq = &mq_rq->brq;
+	struct request *req = mmc_queue_req_to_req(mq_rq);
 	struct mmc_blk_data *md = mq->blkdata;
 	bool do_rel_wr, do_data_tag;
 
-	mmc_blk_data_prep(mq, mqrq, disable_multi, &do_rel_wr, &do_data_tag);
+	mmc_blk_data_prep(mq, mq_rq, disable_multi, &do_rel_wr, &do_data_tag);
 
 	brq->mrq.cmd = &brq->cmd;
 	brq->mrq.areq = NULL;
@@ -1764,9 +1766,9 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 		brq->mrq.sbc = &brq->sbc;
 	}
 
-	mqrq->areq.err_check = mmc_blk_err_check;
-	mqrq->areq.host = card->host;
-	INIT_WORK(&mqrq->areq.finalization_work, mmc_finalize_areq);
+	mq_rq->areq.err_check = mmc_blk_err_check;
+	mq_rq->areq.host = card->host;
+	INIT_WORK(&mq_rq->areq.finalization_work, mmc_finalize_areq);
 }
 
 static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
@@ -1798,10 +1800,12 @@ static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
 	return req_pending;
 }
 
-static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
-				 struct request *req,
-				 struct mmc_queue_req *mqrq)
+static void mmc_blk_rw_cmd_abort(struct mmc_queue_req *mq_rq)
 {
+	struct mmc_queue *mq = mq_rq->mq;
+	struct mmc_card *card = mq->card;
+	struct request *req = mmc_queue_req_to_req(mq_rq);
+
 	if (mmc_card_removed(card))
 		req->rq_flags |= RQF_QUIET;
 	while (blk_end_request(req, BLK_STS_IOERR, blk_rq_cur_bytes(req)));
@@ -1809,15 +1813,15 @@ static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
 
 /**
  * mmc_blk_rw_try_restart() - tries to restart the current async request
- * @mq: the queue with the card and host to restart
- * @mqrq: the mmc_queue_request containing the areq to be restarted
+ * @mq_rq: the mmc_queue_request containing the areq to be restarted
  */
-static void mmc_blk_rw_try_restart(struct mmc_queue *mq,
-				   struct mmc_queue_req *mqrq)
+static void mmc_blk_rw_try_restart(struct mmc_queue_req *mq_rq)
 {
+	struct mmc_queue *mq = mq_rq->mq;
+
 	/* Proceed and try to restart the current async request */
-	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
-	mmc_restart_areq(mq->card->host, &mqrq->areq);
+	mmc_blk_rw_rq_prep(mq_rq, 0);
+	mmc_restart_areq(mq->card->host, &mq_rq->areq);
 }
 
 static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status status)
@@ -1863,7 +1867,7 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 			pr_err("%s BUG rq_tot %d d_xfer %d\n",
 			       __func__, blk_rq_bytes(old_req),
 			       brq->data.bytes_xfered);
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
+			mmc_blk_rw_cmd_abort(mq_rq);
 			return;
 		}
 		break;
@@ -1871,12 +1875,12 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
 		if (mmc_blk_reset(md, card->host, type)) {
 			if (req_pending)
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, mq_rq);
+				mmc_blk_rw_cmd_abort(mq_rq);
+			mmc_blk_rw_try_restart(mq_rq);
 			return;
 		}
 		if (!req_pending) {
-			mmc_blk_rw_try_restart(mq, mq_rq);
+			mmc_blk_rw_try_restart(mq_rq);
 			return;
 		}
 		break;
@@ -1888,8 +1892,8 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 	case MMC_BLK_ABORT:
 		if (!mmc_blk_reset(md, card->host, type))
 			break;
-		mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-		mmc_blk_rw_try_restart(mq, mq_rq);
+		mmc_blk_rw_cmd_abort(mq_rq);
+		mmc_blk_rw_try_restart(mq_rq);
 		return;
 	case MMC_BLK_DATA_ERR: {
 		int err;
@@ -1897,8 +1901,8 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		if (!err)
 			break;
 		if (err == -ENODEV) {
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, mq_rq);
+			mmc_blk_rw_cmd_abort(mq_rq);
+			mmc_blk_rw_try_restart(mq_rq);
 			return;
 		}
 		/* Fall through */
@@ -1919,19 +1923,19 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		req_pending = blk_end_request(old_req, BLK_STS_IOERR,
 					      brq->data.blksz);
 		if (!req_pending) {
-			mmc_blk_rw_try_restart(mq, mq_rq);
+			mmc_blk_rw_try_restart(mq_rq);
 			return;
 		}
 		break;
 	case MMC_BLK_NOMEDIUM:
-		mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-		mmc_blk_rw_try_restart(mq, mq_rq);
+		mmc_blk_rw_cmd_abort(mq_rq);
+		mmc_blk_rw_try_restart(mq_rq);
 		return;
 	default:
 		pr_err("%s: Unhandled return value (%d)",
 				old_req->rq_disk->disk_name, status);
-		mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-		mmc_blk_rw_try_restart(mq, mq_rq);
+		mmc_blk_rw_cmd_abort(mq_rq);
+		mmc_blk_rw_try_restart(mq_rq);
 		return;
 	}
 
@@ -1940,25 +1944,25 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		 * In case of a incomplete request
 		 * prepare it again and resend.
 		 */
-		mmc_blk_rw_rq_prep(mq_rq, card,
-				disable_multi, mq);
+		mmc_blk_rw_rq_prep(mq_rq, disable_multi);
 		mmc_start_areq(card->host, areq);
 		mq_rq->brq.retune_retry_done = retune_retry_done;
 	}
 }
 
-static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
+static void mmc_blk_issue_rw_rq(struct mmc_queue_req *mq_rq)
 {
+	struct request *req = mmc_queue_req_to_req(mq_rq);
+	struct mmc_queue *mq = mq_rq->mq;
 	struct mmc_card *card = mq->card;
-	struct mmc_queue_req *mqrq_cur = req_to_mmc_queue_req(new_req);
-	struct mmc_async_req *areq = &mqrq_cur->areq;
+	struct mmc_async_req *areq = &mq_rq->areq;
 
 	/*
 	 * If the card was removed, just cancel everything and return.
 	 */
 	if (mmc_card_removed(card)) {
-		new_req->rq_flags |= RQF_QUIET;
-		blk_end_request_all(new_req, BLK_STS_IOERR);
+		req->rq_flags |= RQF_QUIET;
+		blk_end_request_all(req, BLK_STS_IOERR);
 		return;
 	}
 
@@ -1968,22 +1972,23 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
 	 * multiple read or write is allowed
 	 */
 	if (mmc_large_sector(card) &&
-	    !IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
+	    !IS_ALIGNED(blk_rq_sectors(req), 8)) {
 		pr_err("%s: Transfer size is not 4KB sector size aligned\n",
-		       new_req->rq_disk->disk_name);
-		mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
+		       req->rq_disk->disk_name);
+		mmc_blk_rw_cmd_abort(mq_rq);
 		return;
 	}
 
-	mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
+	mmc_blk_rw_rq_prep(mq_rq, 0);
 	areq->report_done_status = mmc_blk_rw_done;
 	mmc_start_areq(card->host, areq);
 }
 
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
+void mmc_blk_issue_rq(struct mmc_queue_req *mq_rq)
 {
 	int ret;
-	struct mmc_blk_data *md = mq->blkdata;
+	struct request *req = mmc_queue_req_to_req(mq_rq);
+	struct mmc_blk_data *md = mq_rq->mq->blkdata;
 	struct mmc_card *card = md->queue.card;
 
 	if (!req) {
@@ -2005,7 +2010,7 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 		 * ioctl()s
 		 */
 		mmc_wait_for_areq(card->host);
-		mmc_blk_issue_drv_op(mq, req);
+		mmc_blk_issue_drv_op(mq_rq);
 		break;
 	case REQ_OP_DISCARD:
 		/*
@@ -2013,7 +2018,7 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 		 * discard.
 		 */
 		mmc_wait_for_areq(card->host);
-		mmc_blk_issue_discard_rq(mq, req);
+		mmc_blk_issue_discard_rq(mq_rq);
 		break;
 	case REQ_OP_SECURE_ERASE:
 		/*
@@ -2021,7 +2026,7 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 		 * secure erase.
 		 */
 		mmc_wait_for_areq(card->host);
-		mmc_blk_issue_secdiscard_rq(mq, req);
+		mmc_blk_issue_secdiscard_rq(mq_rq);
 		break;
 	case REQ_OP_FLUSH:
 		/*
@@ -2029,11 +2034,11 @@ void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 		 * flush.
 		 */
 		mmc_wait_for_areq(card->host);
-		mmc_blk_issue_flush(mq, req);
+		mmc_blk_issue_flush(mq_rq);
 		break;
 	default:
 		/* Normal request, just issue it */
-		mmc_blk_issue_rw_rq(mq, req);
+		mmc_blk_issue_rw_rq(mq_rq);
 		break;
 	}
 }
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index 860ca7c8df86..bbc1c8029b3b 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -1,9 +1,8 @@
 #ifndef _MMC_CORE_BLOCK_H
 #define _MMC_CORE_BLOCK_H
 
-struct mmc_queue;
-struct request;
+struct mmc_queue_req;
 
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
+void mmc_blk_issue_rq(struct mmc_queue_req *mq_rq);
 
 #endif
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index cf43a2d5410d..5511e323db31 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -62,7 +62,7 @@ static int mmc_queue_thread(void *d)
 				claimed_card = true;
 			}
 			set_current_state(TASK_RUNNING);
-			mmc_blk_issue_rq(mq, req);
+			mmc_blk_issue_rq(req_to_mmc_queue_req(req));
 			cond_resched();
 		} else {
 			mq->asleep = true;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 11/12 v4] mmc: block: issue requests in massive parallel
  2017-10-26 12:57 [PATCH 00/12 v4] multiqueue for MMC/SD Linus Walleij
                   ` (9 preceding siblings ...)
  2017-10-26 12:57 ` [PATCH 10/12 v4] mmc: queue/block: pass around struct mmc_queue_req*s Linus Walleij
@ 2017-10-26 12:57 ` Linus Walleij
  2017-10-27 14:19   ` Ulf Hansson
  2017-10-26 12:57 ` [PATCH 12/12 v4] mmc: switch MMC/SD to use blk-mq multiqueueing Linus Walleij
  2017-10-26 13:34 ` [PATCH 00/12 v4] multiqueue for MMC/SD Adrian Hunter
  12 siblings, 1 reply; 22+ messages in thread
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

This makes a crucial change to the issuing mechanism for
MMC requests:

Before commit "mmc: core: move the asynchronous post-processing"
some parallelism on the read/write requests was achieved by
speculatively postprocessing a request, then re-preprocessing and
re-issuing it if something went wrong, which we would only
discover later when checking for an error.

This is kind of ugly. Instead we need a mechanism like this:

We issue requests, and when they come back from the hardware,
we know if they finished successfully or not. If the request
was successful, we complete the asynchronous request and let a
new request immediately start on the hardware. If, and only if,
it returned an error from the hardware we go down the error
path.

This is achieved by splitting the work path from the hardware
in two: a successful path ending up calling down to
mmc_blk_rw_done() and completing quickly, and an error path
calling down to mmc_blk_rw_done_error().
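
In essence, and condensed from the hunk below (the real code also
resets the error counters and complains if the request is somehow
still pending), the completion callback now just branches on status:

static void mmc_blk_rw_done(struct mmc_async_req *areq,
                            enum mmc_blk_status status)
{
        struct mmc_queue_req *mq_rq;

        /* Anything but outright success takes the error/retry path */
        if (status != MMC_BLK_SUCCESS) {
                mmc_blk_rw_done_error(areq, status);
                return;
        }

        /* Quick path: report the transferred bytes to the block layer */
        mq_rq = container_of(areq, struct mmc_queue_req, areq);
        blk_end_request(mmc_queue_req_to_req(mq_rq), BLK_STS_OK,
                        mq_rq->brq.data.bytes_xfered);
}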

This has a profound effect: we reintroduce the parallelism on
the successful path as mmc_post_req() can now be called
while the next request is in transit (just like prior to
commit "mmc: core: move the asynchronous post-processing")
and blk_end_request() is called while the next request is
already on the hardware.

The latter has the profound effect of issuing a new request
again so that we actually may have three requests
in transit at the same time: one on the hardware, one being
prepared (such as DMA flushing) and one being prepared for
issuing next by the block layer. This shows up when we
transition to multiqueue, where this can be exploited.
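
On the core side, mmc_finalize_areq() reflects this ordering on the
success path (again condensed from the hunk below): the completion is
fired before post-processing, so the next request can start on the
host while the finished one is still being reported back:

        if (status == MMC_BLK_SUCCESS) {
                /* Open the gate for the next request right away */
                host->areq = NULL;
                complete(&areq->complete);
                /* ...then post-process and report back to the block layer */
                mmc_post_req(host, areq->mrq, 0);
                if (areq->report_done_status)
                        areq->report_done_status(areq, MMC_BLK_SUCCESS);
        }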

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c | 79 +++++++++++++++++++++++++++++++++---------------
 drivers/mmc/core/core.c  | 38 +++++++++++++++++------
 2 files changed, 83 insertions(+), 34 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 184907f5fb97..f06f381146a5 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1824,7 +1824,8 @@ static void mmc_blk_rw_try_restart(struct mmc_queue_req *mq_rq)
 	mmc_restart_areq(mq->card->host, &mq_rq->areq);
 }
 
-static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status status)
+static void mmc_blk_rw_done_error(struct mmc_async_req *areq,
+				  enum mmc_blk_status status)
 {
 	struct mmc_queue *mq;
 	struct mmc_blk_data *md;
@@ -1832,7 +1833,7 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 	struct mmc_host *host;
 	struct mmc_queue_req *mq_rq;
 	struct mmc_blk_request *brq;
-	struct request *old_req;
+	struct request *req;
 	bool req_pending = true;
 	int disable_multi = 0, retry = 0, type, retune_retry_done = 0;
 
@@ -1846,33 +1847,18 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 	card = mq->card;
 	host = card->host;
 	brq = &mq_rq->brq;
-	old_req = mmc_queue_req_to_req(mq_rq);
-	type = rq_data_dir(old_req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+	req = mmc_queue_req_to_req(mq_rq);
+	type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
 
 	switch (status) {
-	case MMC_BLK_SUCCESS:
 	case MMC_BLK_PARTIAL:
-		/*
-		 * A block was successfully transferred.
-		 */
+		/* This should trigger a retransmit */
 		mmc_blk_reset_success(md, type);
-		req_pending = blk_end_request(old_req, BLK_STS_OK,
+		req_pending = blk_end_request(req, BLK_STS_OK,
 					      brq->data.bytes_xfered);
-		/*
-		 * If the blk_end_request function returns non-zero even
-		 * though all data has been transferred and no errors
-		 * were returned by the host controller, it's a bug.
-		 */
-		if (status == MMC_BLK_SUCCESS && req_pending) {
-			pr_err("%s BUG rq_tot %d d_xfer %d\n",
-			       __func__, blk_rq_bytes(old_req),
-			       brq->data.bytes_xfered);
-			mmc_blk_rw_cmd_abort(mq_rq);
-			return;
-		}
 		break;
 	case MMC_BLK_CMD_ERR:
-		req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
+		req_pending = mmc_blk_rw_cmd_err(md, card, brq, req, req_pending);
 		if (mmc_blk_reset(md, card->host, type)) {
 			if (req_pending)
 				mmc_blk_rw_cmd_abort(mq_rq);
@@ -1911,7 +1897,7 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		if (brq->data.blocks > 1) {
 			/* Redo read one sector at a time */
 			pr_warn("%s: retrying using single block read\n",
-				old_req->rq_disk->disk_name);
+				req->rq_disk->disk_name);
 			disable_multi = 1;
 			break;
 		}
@@ -1920,7 +1906,7 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		 * time, so we only reach here after trying to
 		 * read a single sector.
 		 */
-		req_pending = blk_end_request(old_req, BLK_STS_IOERR,
+		req_pending = blk_end_request(req, BLK_STS_IOERR,
 					      brq->data.blksz);
 		if (!req_pending) {
 			mmc_blk_rw_try_restart(mq_rq);
@@ -1933,7 +1919,7 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 		return;
 	default:
 		pr_err("%s: Unhandled return value (%d)",
-				old_req->rq_disk->disk_name, status);
+		       req->rq_disk->disk_name, status);
 		mmc_blk_rw_cmd_abort(mq_rq);
 		mmc_blk_rw_try_restart(mq_rq);
 		return;
@@ -1950,6 +1936,49 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq, enum mmc_blk_status stat
 	}
 }
 
+static void mmc_blk_rw_done(struct mmc_async_req *areq,
+			    enum mmc_blk_status status)
+{
+	struct mmc_queue_req *mq_rq;
+	struct request *req;
+	struct mmc_blk_request *brq;
+	struct mmc_queue *mq;
+	struct mmc_blk_data *md;
+	bool req_pending;
+	int type;
+
+	/*
+	 * Anything other than success or partial transfers are errors.
+	 */
+	if (status != MMC_BLK_SUCCESS) {
+		mmc_blk_rw_done_error(areq, status);
+		return;
+	}
+
+	/* The quick path if the request was successful */
+	mq_rq =	container_of(areq, struct mmc_queue_req, areq);
+	brq = &mq_rq->brq;
+	mq = mq_rq->mq;
+	md = mq->blkdata;
+	req = mmc_queue_req_to_req(mq_rq);
+	type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+
+	mmc_blk_reset_success(md, type);
+	req_pending = blk_end_request(req, BLK_STS_OK,
+				      brq->data.bytes_xfered);
+	/*
+	 * If the blk_end_request function returns non-zero even
+	 * though all data has been transferred and no errors
+	 * were returned by the host controller, it's a bug.
+	 */
+	if (req_pending) {
+		pr_err("%s BUG rq_tot %d d_xfer %d\n",
+		       __func__, blk_rq_bytes(req),
+		       brq->data.bytes_xfered);
+		mmc_blk_rw_cmd_abort(mq_rq);
+	}
+}
+
 static void mmc_blk_issue_rw_rq(struct mmc_queue_req *mq_rq)
 {
 	struct request *req = mmc_queue_req_to_req(mq_rq);
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 209ebb8a7f3f..0f57e9fe66b6 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -735,15 +735,35 @@ void mmc_finalize_areq(struct work_struct *work)
 		mmc_start_bkops(host->card, true);
 	}
 
-	/* Successfully postprocess the old request at this point */
-	mmc_post_req(host, areq->mrq, 0);
-
-	/* Call back with status, this will trigger retry etc if needed */
-	if (areq->report_done_status)
-		areq->report_done_status(areq, status);
-
-	/* This opens the gate for the next request to start on the host */
-	complete(&areq->complete);
+	/*
+	 * Here we postprocess the request differently depending on if
+	 * we go on the success path or error path. The success path will
+	 * immediately let new requests hit the host, whereas the error
+	 * path will hold off new requests until we have retried and
+	 * succeeded or failed the current asynchronous request.
+	 */
+	if (status == MMC_BLK_SUCCESS) {
+		/*
+		 * This immediately opens the gate for the next request
+		 * to start on the host while we perform post-processing
+		 * and report back to the block layer.
+		 */
+		host->areq = NULL;
+		complete(&areq->complete);
+		mmc_post_req(host, areq->mrq, 0);
+		if (areq->report_done_status)
+			areq->report_done_status(areq, MMC_BLK_SUCCESS);
+	} else {
+		mmc_post_req(host, areq->mrq, 0);
+		/*
+		 * Call back with error status, this will trigger retry
+		 * etc if needed
+		 */
+		if (areq->report_done_status)
+			areq->report_done_status(areq, status);
+		host->areq = NULL;
+		complete(&areq->complete);
+	}
 }
 EXPORT_SYMBOL(mmc_finalize_areq);
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 12/12 v4] mmc: switch MMC/SD to use blk-mq multiqueueing
  2017-10-26 12:57 [PATCH 00/12 v4] multiqueue for MMC/SD Linus Walleij
                   ` (10 preceding siblings ...)
  2017-10-26 12:57 ` [PATCH 11/12 v4] mmc: block: issue requests in massive parallel Linus Walleij
@ 2017-10-26 12:57 ` Linus Walleij
  2017-10-26 13:34 ` [PATCH 00/12 v4] multiqueue for MMC/SD Adrian Hunter
  12 siblings, 0 replies; 22+ messages in thread
From: Linus Walleij @ 2017-10-26 12:57 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman,
	Adrian Hunter, Linus Walleij

This switches the MMC/SD stack to use the multiqueue block
layer interface.

We kill off the kthread that was just calling blk_fetch_request()
and let blk-mq drive all traffic; nice, that is how it should work.
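
The heart of the conversion is a blk_mq_tag_set replacing the old
request function and kthread; condensed from the queue.c hunks below
it boils down to:

        mq->tag_set.ops = &mmc_mq_ops;
        /* The MMC/SD protocols have only one command pipe */
        mq->tag_set.nr_hw_queues = 1;
        /* Depth 2 mimics the old two-stage asynchronous pipeline */
        mq->tag_set.queue_depth = 2;
        mq->tag_set.cmd_size = sizeof(struct mmc_queue_req);
        /* .queue_rq() may sleep while talking to the card */
        mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;

        ret = blk_mq_alloc_tag_set(&mq->tag_set);
        if (!ret)
                mq->queue = blk_mq_init_queue(&mq->tag_set);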

Due to having switched the submission mechanics around so that
the completion of requests is now triggered from the host
callbacks, we manage to keep the same performance for linear
reads/writes as we have for the old block layer.

The open questions from earlier patch series v1 thru v3 have
been addressed:

- mmc_[get|put]_card() is now issued across requests from
  .queue_rq() to .complete() using Adrian's nifty context lock
  (see the sketch after this list). This means that the block layer
  does not compete with itself on getting access to the host, and
  we can let other users of the host come in. (For SDIO and
  mixed-mode cards.)

- Partial reads are handled by open coding calls to
  blk_update_request() as advised by Christoph.
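
The .queue_rq()/.complete() pairing mentioned in the first point
looks like this, condensed from the queue.c hunks below:

static blk_status_t mmc_queue_request(struct blk_mq_hw_ctx *hctx,
                                      const struct blk_mq_queue_data *bd)
{
        struct mmc_queue_req *mq_rq = blk_mq_rq_to_pdu(bd->rq);
        struct mmc_queue *mq = mq_rq->mq;

        /* Claim the card for the block queue context */
        mmc_get_card(mq->card, &mq->blkctx);
        mmc_blk_issue_rq(mq_rq);

        return BLK_STS_OK;
}

static void mmc_complete_request(struct request *req)
{
        struct mmc_queue *mq = req_to_mmc_queue_req(req)->mq;

        /* Release the card again when the request has completed */
        mmc_put_card(mq->card, &mq->blkctx);
}

static const struct blk_mq_ops mmc_mq_ops = {
        .queue_rq       = mmc_queue_request,
        .init_request   = mmc_init_request,
        .exit_request   = mmc_exit_request,
        .complete       = mmc_complete_request,
};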

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/mmc/core/block.c |  87 ++++++++++--------
 drivers/mmc/core/queue.c | 223 ++++++++++++++++++-----------------------------
 drivers/mmc/core/queue.h |   8 +-
 3 files changed, 139 insertions(+), 179 deletions(-)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index f06f381146a5..9e0fe07e098a 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -28,6 +28,7 @@
 #include <linux/hdreg.h>
 #include <linux/kdev_t.h>
 #include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/cdev.h>
 #include <linux/mutex.h>
 #include <linux/scatterlist.h>
@@ -93,7 +94,6 @@ static DEFINE_IDA(mmc_rpmb_ida);
  * There is one mmc_blk_data per slot.
  */
 struct mmc_blk_data {
-	spinlock_t	lock;
 	struct device	*parent;
 	struct gendisk	*disk;
 	struct mmc_queue queue;
@@ -1204,6 +1204,18 @@ static inline void mmc_blk_reset_success(struct mmc_blk_data *md, int type)
 }
 
 /*
+ * This reports status back to the block layer for a finished request.
+ */
+static void mmc_blk_complete(struct mmc_queue_req *mq_rq,
+			     blk_status_t status)
+{
+	struct request *req = mmc_queue_req_to_req(mq_rq);
+
+	blk_mq_end_request(req, status);
+	blk_mq_complete_request(req);
+}
+
+/*
  * The non-block commands come back from the block layer after it queued it and
  * processed it with all other requests and then they get issued in this
  * function.
@@ -1262,9 +1274,9 @@ static void mmc_blk_issue_drv_op(struct mmc_queue_req *mq_rq)
 		ret = -EINVAL;
 		break;
 	}
+
 	mq_rq->drv_op_result = ret;
-	blk_end_request_all(mmc_queue_req_to_req(mq_rq),
-			    ret ? BLK_STS_IOERR : BLK_STS_OK);
+	mmc_blk_complete(mq_rq, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 static void mmc_blk_issue_discard_rq(struct mmc_queue_req *mq_rq)
@@ -1308,7 +1320,7 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue_req *mq_rq)
 	else
 		mmc_blk_reset_success(md, type);
 fail:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	mmc_blk_complete(mq_rq, status);
 }
 
 static void mmc_blk_issue_secdiscard_rq(struct mmc_queue_req *mq_rq)
@@ -1378,7 +1390,7 @@ static void mmc_blk_issue_secdiscard_rq(struct mmc_queue_req *mq_rq)
 	if (!err)
 		mmc_blk_reset_success(md, type);
 out:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	mmc_blk_complete(mq_rq, status);
 }
 
 static void mmc_blk_issue_flush(struct mmc_queue_req *mq_rq)
@@ -1388,8 +1400,13 @@ static void mmc_blk_issue_flush(struct mmc_queue_req *mq_rq)
 	int ret = 0;
 
 	ret = mmc_flush_cache(card);
-	blk_end_request_all(mmc_queue_req_to_req(mq_rq),
-			    ret ? BLK_STS_IOERR : BLK_STS_OK);
+	/*
+	 * NOTE: this used to call blk_end_request_all() for both
+	 * cases in the old block layer to flush all queued
+	 * transactions. I am not sure it was even correct to
+	 * do that for the success case.
+	 */
+	mmc_blk_complete(mq_rq, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 /*
@@ -1768,7 +1785,6 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mq_rq,
 
 	mq_rq->areq.err_check = mmc_blk_err_check;
 	mq_rq->areq.host = card->host;
-	INIT_WORK(&mq_rq->areq.finalization_work, mmc_finalize_areq);
 }
 
 static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
@@ -1792,10 +1808,13 @@ static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
 		err = mmc_sd_num_wr_blocks(card, &blocks);
 		if (err)
 			req_pending = old_req_pending;
-		else
-			req_pending = blk_end_request(req, BLK_STS_OK, blocks << 9);
+		else {
+			req_pending = blk_update_request(req, BLK_STS_OK,
+							 blocks << 9);
+		}
 	} else {
-		req_pending = blk_end_request(req, BLK_STS_OK, brq->data.bytes_xfered);
+		req_pending = blk_update_request(req, BLK_STS_OK,
+						 brq->data.bytes_xfered);
 	}
 	return req_pending;
 }
@@ -1808,7 +1827,7 @@ static void mmc_blk_rw_cmd_abort(struct mmc_queue_req *mq_rq)
 
 	if (mmc_card_removed(card))
 		req->rq_flags |= RQF_QUIET;
-	while (blk_end_request(req, BLK_STS_IOERR, blk_rq_cur_bytes(req)));
+	mmc_blk_complete(mq_rq, BLK_STS_IOERR);
 }
 
 /**
@@ -1854,8 +1873,8 @@ static void mmc_blk_rw_done_error(struct mmc_async_req *areq,
 	case MMC_BLK_PARTIAL:
 		/* This should trigger a retransmit */
 		mmc_blk_reset_success(md, type);
-		req_pending = blk_end_request(req, BLK_STS_OK,
-					      brq->data.bytes_xfered);
+		req_pending = blk_update_request(req, BLK_STS_OK,
+						 brq->data.bytes_xfered);
 		break;
 	case MMC_BLK_CMD_ERR:
 		req_pending = mmc_blk_rw_cmd_err(md, card, brq, req, req_pending);
@@ -1906,11 +1925,13 @@ static void mmc_blk_rw_done_error(struct mmc_async_req *areq,
 		 * time, so we only reach here after trying to
 		 * read a single sector.
 		 */
-		req_pending = blk_end_request(req, BLK_STS_IOERR,
-					      brq->data.blksz);
+		req_pending = blk_update_request(req, BLK_STS_IOERR,
+						 brq->data.blksz);
 		if (!req_pending) {
 			mmc_blk_rw_try_restart(mq_rq);
 			return;
+		} else {
+			mmc_blk_complete(mq_rq, BLK_STS_IOERR);
 		}
 		break;
 	case MMC_BLK_NOMEDIUM:
@@ -1941,10 +1962,8 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq,
 {
 	struct mmc_queue_req *mq_rq;
 	struct request *req;
-	struct mmc_blk_request *brq;
 	struct mmc_queue *mq;
 	struct mmc_blk_data *md;
-	bool req_pending;
 	int type;
 
 	/*
@@ -1957,26 +1976,13 @@ static void mmc_blk_rw_done(struct mmc_async_req *areq,
 
 	/* The quick path if the request was successful */
 	mq_rq =	container_of(areq, struct mmc_queue_req, areq);
-	brq = &mq_rq->brq;
 	mq = mq_rq->mq;
 	md = mq->blkdata;
 	req = mmc_queue_req_to_req(mq_rq);
 	type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
 
 	mmc_blk_reset_success(md, type);
-	req_pending = blk_end_request(req, BLK_STS_OK,
-				      brq->data.bytes_xfered);
-	/*
-	 * If the blk_end_request function returns non-zero even
-	 * though all data has been transferred and no errors
-	 * were returned by the host controller, it's a bug.
-	 */
-	if (req_pending) {
-		pr_err("%s BUG rq_tot %d d_xfer %d\n",
-		       __func__, blk_rq_bytes(req),
-		       brq->data.bytes_xfered);
-		mmc_blk_rw_cmd_abort(mq_rq);
-	}
+	mmc_blk_complete(mq_rq, BLK_STS_OK);
 }
 
 static void mmc_blk_issue_rw_rq(struct mmc_queue_req *mq_rq)
@@ -1991,7 +1997,12 @@ static void mmc_blk_issue_rw_rq(struct mmc_queue_req *mq_rq)
 	 */
 	if (mmc_card_removed(card)) {
 		req->rq_flags |= RQF_QUIET;
-		blk_end_request_all(req, BLK_STS_IOERR);
+		/*
+		 * NOTE: this used to call blk_end_request_all()
+		 * to flush out all queued transactions to the now
+		 * non-present card.
+		 */
+		mmc_blk_complete(mq_rq, BLK_STS_IOERR);
 		return;
 	}
 
@@ -2017,8 +2028,9 @@ void mmc_blk_issue_rq(struct mmc_queue_req *mq_rq)
 {
 	int ret;
 	struct request *req = mmc_queue_req_to_req(mq_rq);
-	struct mmc_blk_data *md = mq_rq->mq->blkdata;
-	struct mmc_card *card = md->queue.card;
+	struct mmc_queue *mq = mq_rq->mq;
+	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_card *card = mq->card;
 
 	if (!req) {
 		pr_err("%s: tried to issue NULL request\n", __func__);
@@ -2027,7 +2039,7 @@ void mmc_blk_issue_rq(struct mmc_queue_req *mq_rq)
 
 	ret = mmc_blk_part_switch(card, md->part_type);
 	if (ret) {
-		blk_end_request_all(req, BLK_STS_IOERR);
+		mmc_blk_complete(mq_rq, BLK_STS_IOERR);
 		return;
 	}
 
@@ -2124,12 +2136,11 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 		goto err_kfree;
 	}
 
-	spin_lock_init(&md->lock);
 	INIT_LIST_HEAD(&md->part);
 	INIT_LIST_HEAD(&md->rpmbs);
 	md->usage = 1;
 
-	ret = mmc_init_queue(&md->queue, card, &md->lock, subname);
+	ret = mmc_init_queue(&md->queue, card, subname);
 	if (ret)
 		goto err_putdisk;
 
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 5511e323db31..dea6b4e3f828 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -10,6 +10,7 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/blkdev.h>
+#include <linux/blk-mq.h>
 #include <linux/freezer.h>
 #include <linux/kthread.h>
 #include <linux/scatterlist.h>
@@ -38,74 +39,6 @@ static int mmc_prep_request(struct request_queue *q, struct request *req)
 	return BLKPREP_OK;
 }
 
-static int mmc_queue_thread(void *d)
-{
-	struct mmc_queue *mq = d;
-	struct request_queue *q = mq->queue;
-	bool claimed_card = false;
-
-	current->flags |= PF_MEMALLOC;
-
-	down(&mq->thread_sem);
-	do {
-		struct request *req;
-
-		spin_lock_irq(q->queue_lock);
-		set_current_state(TASK_INTERRUPTIBLE);
-		req = blk_fetch_request(q);
-		mq->asleep = false;
-		spin_unlock_irq(q->queue_lock);
-
-		if (req) {
-			if (!claimed_card) {
-				mmc_get_card(mq->card, NULL);
-				claimed_card = true;
-			}
-			set_current_state(TASK_RUNNING);
-			mmc_blk_issue_rq(req_to_mmc_queue_req(req));
-			cond_resched();
-		} else {
-			mq->asleep = true;
-			if (kthread_should_stop()) {
-				set_current_state(TASK_RUNNING);
-				break;
-			}
-			up(&mq->thread_sem);
-			schedule();
-			down(&mq->thread_sem);
-		}
-	} while (1);
-	up(&mq->thread_sem);
-
-	if (claimed_card)
-		mmc_put_card(mq->card, NULL);
-
-	return 0;
-}
-
-/*
- * Generic MMC request handler.  This is called for any queue on a
- * particular host.  When the host is not busy, we look for a request
- * on any queue on this host, and attempt to issue it.  This may
- * not be the queue we were asked to process.
- */
-static void mmc_request_fn(struct request_queue *q)
-{
-	struct mmc_queue *mq = q->queuedata;
-	struct request *req;
-
-	if (!mq) {
-		while ((req = blk_fetch_request(q)) != NULL) {
-			req->rq_flags |= RQF_QUIET;
-			__blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		return;
-	}
-
-	if (mq->asleep)
-		wake_up_process(mq->thread);
-}
-
 static struct scatterlist *mmc_alloc_sg(int sg_len, gfp_t gfp)
 {
 	struct scatterlist *sg;
@@ -136,127 +69,158 @@ static void mmc_queue_setup_discard(struct request_queue *q,
 		queue_flag_set_unlocked(QUEUE_FLAG_SECERASE, q);
 }
 
+static blk_status_t mmc_queue_request(struct blk_mq_hw_ctx *hctx,
+				      const struct blk_mq_queue_data *bd)
+{
+	struct mmc_queue_req *mq_rq = blk_mq_rq_to_pdu(bd->rq);
+	struct mmc_queue *mq = mq_rq->mq;
+
+	/* Claim card for block queue context */
+	mmc_get_card(mq->card, &mq->blkctx);
+	mmc_blk_issue_rq(mq_rq);
+
+	return BLK_STS_OK;
+}
+
+static void mmc_complete_request(struct request *req)
+{
+	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
+	struct mmc_queue *mq = mq_rq->mq;
+
+	/* Release card for block queue context */
+	mmc_put_card(mq->card, &mq->blkctx);
+}
+
 /**
  * mmc_init_request() - initialize the MMC-specific per-request data
- * @q: the request queue
+ * @set: tag set for the request
  * @req: the request
- * @gfp: memory allocation policy
+ * @hctx_idx: hardware context index
+ * @numa_node: NUMA node
  */
-static int mmc_init_request(struct request_queue *q, struct request *req,
-			    gfp_t gfp)
+static int mmc_init_request(struct blk_mq_tag_set *set, struct request *req,
+			    unsigned int hctx_idx, unsigned int numa_node)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
-	struct mmc_queue *mq = q->queuedata;
+	struct mmc_queue *mq = set->driver_data;
 	struct mmc_card *card = mq->card;
 	struct mmc_host *host = card->host;
 
-	mq_rq->sg = mmc_alloc_sg(host->max_segs, gfp);
+	mq_rq->sg = mmc_alloc_sg(host->max_segs, GFP_KERNEL);
 	if (!mq_rq->sg)
 		return -ENOMEM;
 	mq_rq->mq = mq;
+	INIT_WORK(&mq_rq->areq.finalization_work, mmc_finalize_areq);
 
 	return 0;
 }
 
-static void mmc_exit_request(struct request_queue *q, struct request *req)
+/**
+ * mmc_exit_request() - tear down the MMC-specific per-request data
+ * @set: tag set for the request
+ * @req: the request
+ * @hctx_idx: hardware context index
+ */
+static void mmc_exit_request(struct blk_mq_tag_set *set, struct request *req,
+			     unsigned int hctx_idx)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
 
+	flush_work(&mq_rq->areq.finalization_work);
 	kfree(mq_rq->sg);
 	mq_rq->sg = NULL;
 	mq_rq->mq = NULL;
 }
 
-static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
+static void mmc_setup_queue(struct mmc_queue *mq)
 {
+	struct request_queue *q = mq->queue;
+	struct mmc_card *card = mq->card;
 	struct mmc_host *host = card->host;
 	u64 limit = BLK_BOUNCE_HIGH;
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = (u64)dma_max_pfn(mmc_dev(host)) << PAGE_SHIFT;
 
-	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);
-	queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, mq->queue);
+	blk_queue_max_segments(q, host->max_segs);
+	blk_queue_prep_rq(q, mmc_prep_request);
+	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
+	queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
 	if (mmc_can_erase(card))
-		mmc_queue_setup_discard(mq->queue, card);
-
-	blk_queue_bounce_limit(mq->queue, limit);
-	blk_queue_max_hw_sectors(mq->queue,
+		mmc_queue_setup_discard(q, card);
+	blk_queue_bounce_limit(q, limit);
+	blk_queue_max_hw_sectors(q,
 		min(host->max_blk_count, host->max_req_size / 512));
-	blk_queue_max_segments(mq->queue, host->max_segs);
-	blk_queue_max_segment_size(mq->queue, host->max_seg_size);
-
-	/* Initialize thread_sem even if it is not used */
-	sema_init(&mq->thread_sem, 1);
+	blk_queue_max_segments(q, host->max_segs);
+	blk_queue_max_segment_size(q, host->max_seg_size);
 }
 
+static const struct blk_mq_ops mmc_mq_ops = {
+	.queue_rq       = mmc_queue_request,
+	.init_request   = mmc_init_request,
+	.exit_request   = mmc_exit_request,
+	.complete	= mmc_complete_request,
+};
+
 /**
  * mmc_init_queue - initialise a queue structure.
  * @mq: mmc queue
  * @card: mmc card to attach this queue
- * @lock: queue lock
  * @subname: partition subname
  *
  * Initialise a MMC card request queue.
  */
 int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
-		   spinlock_t *lock, const char *subname)
+		   const char *subname)
 {
 	struct mmc_host *host = card->host;
-	int ret = -ENOMEM;
+	int ret;
 
 	mq->card = card;
-	mq->queue = blk_alloc_queue(GFP_KERNEL);
-	if (!mq->queue)
-		return -ENOMEM;
-	mq->queue->queue_lock = lock;
-	mq->queue->request_fn = mmc_request_fn;
-	mq->queue->init_rq_fn = mmc_init_request;
-	mq->queue->exit_rq_fn = mmc_exit_request;
-	mq->queue->cmd_size = sizeof(struct mmc_queue_req);
-	mq->queue->queuedata = mq;
-	ret = blk_init_allocated_queue(mq->queue);
+	mq->tag_set.ops = &mmc_mq_ops;
+	/* The MMC/SD protocols have only one command pipe */
+	mq->tag_set.nr_hw_queues = 1;
+	/* Set this to 2 to simulate async requests, should we use 3? */
+	mq->tag_set.queue_depth = 2;
+	mq->tag_set.cmd_size = sizeof(struct mmc_queue_req);
+	mq->tag_set.numa_node = NUMA_NO_NODE;
+	/* We use blocking requests */
+	mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING;
+	/* Should we use BLK_MQ_F_SG_MERGE? */
+	mq->tag_set.driver_data = mq;
+
+	ret = blk_mq_alloc_tag_set(&mq->tag_set);
 	if (ret) {
-		blk_cleanup_queue(mq->queue);
+		dev_err(host->parent, "failed to allocate MQ tag set\n");
 		return ret;
 	}
-
-	blk_queue_prep_rq(mq->queue, mmc_prep_request);
-
-	mmc_setup_queue(mq, card);
-
-	mq->thread = kthread_run(mmc_queue_thread, mq, "mmcqd/%d%s",
-		host->index, subname ? subname : "");
-
-	if (IS_ERR(mq->thread)) {
-		ret = PTR_ERR(mq->thread);
-		goto cleanup_queue;
+	mq->queue = blk_mq_init_queue(&mq->tag_set);
+	if (IS_ERR(mq->queue)) {
+		dev_err(host->parent, "failed to initialize block MQ\n");
+		ret = PTR_ERR(mq->queue);
+		goto cleanup_free_tag_set;
+	}
+	mq->queue->queuedata = mq;
+	mmc_setup_queue(mq);
 
 	return 0;
 
-cleanup_queue:
-	blk_cleanup_queue(mq->queue);
+cleanup_free_tag_set:
+	blk_mq_free_tag_set(&mq->tag_set);
 	return ret;
 }
 
 void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
 	/* Make sure the queue isn't suspended, as that will deadlock */
 	mmc_queue_resume(mq);
 
-	/* Then terminate our worker thread */
-	kthread_stop(mq->thread);
-
 	/* Empty the queue */
-	spin_lock_irqsave(q->queue_lock, flags);
 	q->queuedata = NULL;
 	blk_start_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-
+	blk_cleanup_queue(q);
+	blk_mq_free_tag_set(&mq->tag_set);
 	mq->card = NULL;
 }
 EXPORT_SYMBOL(mmc_cleanup_queue);
@@ -265,23 +229,16 @@ EXPORT_SYMBOL(mmc_cleanup_queue);
  * mmc_queue_suspend - suspend a MMC request queue
  * @mq: MMC queue to suspend
  *
- * Stop the block request queue, and wait for our thread to
- * complete any outstanding requests.  This ensures that we
+ * Stop the block request queue. This ensures that we
  * won't suspend while a request is being processed.
  */
 void mmc_queue_suspend(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
 	if (!mq->suspended) {
-		mq->suspended |= true;
-
-		spin_lock_irqsave(q->queue_lock, flags);
+		mq->suspended = true;
 		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-
-		down(&mq->thread_sem);
 	}
 }
 
@@ -292,16 +249,10 @@ void mmc_queue_suspend(struct mmc_queue *mq)
 void mmc_queue_resume(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
 	if (mq->suspended) {
 		mq->suspended = false;
-
-		up(&mq->thread_sem);
-
-		spin_lock_irqsave(q->queue_lock, flags);
 		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
 	}
 }
 
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 67ae311b107f..c78fbb226a90 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -61,16 +61,14 @@ struct mmc_queue_req {
 
 struct mmc_queue {
 	struct mmc_card		*card;
-	struct task_struct	*thread;
-	struct semaphore	thread_sem;
 	bool			suspended;
-	bool			asleep;
 	struct mmc_blk_data	*blkdata;
 	struct request_queue	*queue;
+	struct mmc_ctx		blkctx;
+	struct blk_mq_tag_set	tag_set;
 };
 
-extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
-			  const char *);
+extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, const char *);
 extern void mmc_cleanup_queue(struct mmc_queue *);
 extern void mmc_queue_suspend(struct mmc_queue *);
 extern void mmc_queue_resume(struct mmc_queue *);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/12 v4] multiqueue for MMC/SD
  2017-10-26 12:57 [PATCH 00/12 v4] multiqueue for MMC/SD Linus Walleij
                   ` (11 preceding siblings ...)
  2017-10-26 12:57 ` [PATCH 12/12 v4] mmc: switch MMC/SD to use blk-mq multiqueueing Linus Walleij
@ 2017-10-26 13:34 ` Adrian Hunter
  2017-10-26 14:20   ` Linus Walleij
  12 siblings, 1 reply; 22+ messages in thread
From: Adrian Hunter @ 2017-10-26 13:34 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson
  Cc: linux-block, Jens Axboe, Christoph Hellwig, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Paolo Valente, Avri Altman

On 26/10/17 15:57, Linus Walleij wrote:
> This switches the MMC/SD stack over to unconditionally
> using the multiqueue block interface for block access.
> This modernizes the MMC/SD stack and makes it possible
> to enable BFQ scheduling on these single-queue devices.
> 
> This is the v4 version of this v3 patch set from february:
> https://marc.info/?l=linux-mmc&m=148665788227015&w=2
> 
> The patches are available in a git branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git/log/?h=mmc-mq-v4.14-rc4
> 
> You can pull it to a clean kernel tree like this:
> git checkout -b mmc-test v4.14-rc4
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git mmc-mq-v4.14-rc4
> 
> I have now worked on it for more than a year. I was side
> tracked to clean up some code, move request allocation to
> be handled by the block layer, delete bounce buffer handling
> and refactoring the RPMB support. With the changes to request
> allocation, the patch set is a better fit and has shrunk
> from 16 to 12 patches as a result.

None of which was necessary for blk-mq support.

> 
> It is still quite invasive. Yet it is something I think would
> be nice to merge for v4.16...
> 
> The rationale for this approach was Arnd's suggestion to try to
> switch the MMC/SD stack around so as to complete requests as
> quickly as possible when they return from the device driver
> so that new requests can be issued. We are doing this now:
> the polling loop that was pulling NULL out of the request
> queue and driving the pipeline with a loop is gone with
> the next-to last patch ("block: issue requests in massive
> parallel"). This sets the stage for MQ to go in and hammer
> requests on the asynchronous issuing layer.
> 
> We use the trick to set the queue depth to 2 to get two
> parallel requests pushed down to the host. I tried to set this
> to 4, the code survives it, the queue just have three items
> waiting to be submitted all the time.

The queue depth also sets the number of requests, so you are strangling the
I/O scheduler.

> 
> In my opinion this is also a better fit for command queueuing.

Not true.  CQE support worked perfectly before blk-mq and did not depend on
blk-mq in any way.  Obviously the current CQE patch set actually implements
the CQE requirements for blk-mq - which this patch set does not.

> Handling command queueing needs to happen in the asynchronous
> submission codepath, so instead of waiting on a pending
> areq, we just stack up requests in the command queue.

That is how CQE has always worked.  It worked that way just fine without blk-mq.

> 
> It sounds simple but I bet this drives a truck through Adrians
> patch series. Sorry. :(

I waited a long time for your patches but I had to give up waiting when Ulf
belatedly insisted on blk-mq before CQE.  I am not sure what you are expecting;
now it seems too late.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/12 v4] multiqueue for MMC/SD
  2017-10-26 13:34 ` [PATCH 00/12 v4] multiqueue for MMC/SD Adrian Hunter
@ 2017-10-26 14:20   ` Linus Walleij
  2017-10-26 19:27       ` Hunter, Adrian
  0 siblings, 1 reply; 22+ messages in thread
From: Linus Walleij @ 2017-10-26 14:20 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente, Avri Altman

On Thu, Oct 26, 2017 at 3:34 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 26/10/17 15:57, Linus Walleij wrote:

>> I have now worked on it for more than a year. I was side
>> tracked to clean up some code, move request allocation to
>> be handled by the block layer, delete bounce buffer handling
>> and refactoring the RPMB support. With the changes to request
>> allocation, the patch set is a better fit and has shrunk
>> from 16 to 12 patches as a result.
>
> None of which was necessary for blk-mq support.

I was not smart enough to realize that it was possible to do what
you did in
commit d685f5d5fcf75c30ef009771d3067f7438cd8baf
"mmc: core: Introduce host claiming by context".
That simple and clever solution simply didn't occur to me
at all.

And now it uses that solution, as you can see :)

But since I didn't have that simple solution, the other
solution was to get rid of the lock altogether (which we should
do anyway...): getting rid of the RPMB "partition", for example,
removes some locks. (I guess I will still have to go on and
find a solution for the boot and generic partitions, but it's no
blocker for MQ anymore.)

My patch set was dependent on solving that. As I already wrote
to you on sep 13:
https://marc.info/?l=linux-mmc&m=150607944524752&w=2

My patches for allocating the struct mmc_queue_req from the
block layer were actually originally a part of this series,
so the old patches
  mmc: queue: issue struct mmc_queue_req items
  mmc: queue: get/put struct mmc_queue_req
were doing a stupid homebrew solution to what the block
layer can already do, mea culpa. (Yeah, I was educating
myself in the block layer too...)

Anyways, all of this happened in the context of moving
forward with my MQ patch set, not as random activity.

Now it looks like I'm defending myself from a project leader,
haha :D

Well for better or worse, this was how I was working.

>> We use the trick to set the queue depth to 2 to get two
>> parallel requests pushed down to the host. I tried to set this
>> to 4, the code survives it, the queue just have three items
>> waiting to be submitted all the time.
>
> The queue depth also sets the number of requests, so you are strangling the
> I/O scheduler.

Yup. Just did it to see if it survives.
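If the depth is ever raised for real, one illustrative way to avoid starving
the elevator would be to keep the tag set deep and instead bound host-side
parallelism in ->queue_rq. This is only a sketch; the in_flight counter,
MMC_INFLIGHT_MAX and the completion-side helper are invented for illustration
and are not code from the series:

/*
 * Sketch: the tag-set depth stays large for the I/O scheduler, while the
 * host only ever sees MMC_INFLIGHT_MAX requests at a time.
 */
#define MMC_INFLIGHT_MAX 2	/* illustrative */

static blk_status_t mmc_queue_request(struct blk_mq_hw_ctx *hctx,
				      const struct blk_mq_queue_data *bd)
{
	struct mmc_queue_req *mq_rq = blk_mq_rq_to_pdu(bd->rq);
	struct mmc_queue *mq = mq_rq->mq;

	if (atomic_inc_return(&mq->in_flight) > MMC_INFLIGHT_MAX) {
		atomic_dec(&mq->in_flight);
		return BLK_STS_RESOURCE;	/* blk-mq re-dispatches later */
	}

	mmc_get_card(mq->card, &mq->blkctx);
	mmc_blk_issue_rq(mq_rq);

	return BLK_STS_OK;
}

/* The completion side would drop the counter and restart dispatch: */
static void mmc_queue_request_done(struct mmc_queue *mq)
{
	atomic_dec(&mq->in_flight);
	blk_mq_run_hw_queues(mq->queue, true);
}

The point is just that the tag-set depth (what the scheduler gets to reorder)
and the number of requests in flight on the host do not have to be the same
number.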

>> In my opinion this is also a better fit for command queueuing.
>
> Not true.  CQE support worked perfectly before blk-mq and did not depend on
> blk-mq in any way.  Obviously the current CQE patch set actually implements
> the CQE requirements for blk-mq - which this patch set does not.

What I mean is that the CQE code will likely look better on top
of these refactorings.

But as I say it is a matter of taste. I just love the looks of my own code
as much as the next programmer so I can't judge that. Let's see what
the reviewers say.

>> Handling command queueing needs to happen in the asynchronous
>> submission codepath, so instead of waiting on a pending
>> areq, we just stack up requests in the command queue.
>
> That is how CQE has always worked.  It worked that way just fine without blk-mq.

Okay nice.

>> It sounds simple but I bet this drives a truck through Adrians
>> patch series. Sorry. :(
>
> I waited a long time for your patches but I had to give up waiting when Ulf
> belated insisted on blk-mq before CQE.  I am not sure what you are expecting
> now it seems too late.

Too late for what? It's just a patch set, I don't really have a deadline
for this or anything. As I explained above I have
been working on this all the time, the problem was that I was/am not
smart enough to find that solution for host claiming by context.

The host claiming by context was merged a month ago and now I have
understood that I can use that and rebased my patches on it. Slow
learner, I guess.

If you feel it is ungrateful that you have put in so much work and things
are not getting merged, and you feel your patches deserve to be
merged first (because of human nature reasons) I can empathize with
that. It's sad that your patches are at v12. Also I see that patch 4 bears
the signoffs of a significant team at CodeAurora, so they are likely
as impatient.

I would just rebase my remaining work on top of the CQE patches if
they end up being merged first, no big deal, just work.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 00/12 v4] multiqueue for MMC/SD
@ 2017-10-26 19:27       ` Hunter, Adrian
  0 siblings, 0 replies; 22+ messages in thread
From: Hunter, Adrian @ 2017-10-26 19:27 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente, Avri Altman

> -----Original Message-----
> From: Linus Walleij [mailto:linus.walleij@linaro.org]
> Sent: Thursday, October 26, 2017 5:20 PM
> To: Hunter, Adrian <adrian.hunter@intel.com>
> Cc: linux-mmc@vger.kernel.org; Ulf Hansson <ulf.hansson@linaro.org>;
> linux-block <linux-block@vger.kernel.org>; Jens Axboe <axboe@kernel.dk>;
> Christoph Hellwig <hch@lst.de>; Arnd Bergmann <arnd@arndb.de>;
> Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>; Paolo Valente
> <paolo.valente@linaro.org>; Avri Altman <Avri.Altman@sandisk.com>
> Subject: Re: [PATCH 00/12 v4] multiqueue for MMC/SD
> 
> On Thu, Oct 26, 2017 at 3:34 PM, Adrian Hunter <adrian.hunter@intel.com>
> wrote:
> > On 26/10/17 15:57, Linus Walleij wrote:
> >> In my opinion this is also a better fit for command queueuing.
> >
> > Not true.  CQE support worked perfectly before blk-mq and did not
> > depend on blk-mq in any way.  Obviously the current CQE patch set
> > actually implements the CQE requirements for blk-mq - which this patch set
> does not.
> 
> What I mean is that the CQE code will likely look better on top of these
> refactorings.
> 
> But as I say it is a matter of taste. I just love the looks of my own code as
> much as the next programmer so I can't judge that. Let's see what the
> reviewers say.

It doesn't look ready.  There still seem to be 2 task switches between
each transfer.  mmc_blk_rw_done_error() is still using the messy error
handling and doesn’t handle retries, e.g. 'retry' is a local variable so it can't
count the number of retries now that there is no loop.

> >> It sounds simple but I bet this drives a truck through Adrians patch
> >> series. Sorry. :(
> >
> > I waited a long time for your patches but I had to give up waiting
> > when Ulf belated insisted on blk-mq before CQE.  I am not sure what
> > you are expecting now it seems too late.
> 
> Too late for what? It's just a patch set, I don't really have a deadline for this or
> anything. As I explained above I have been working on this all the time, the
> problem was that I was/am not smart enough to find that solution for host
> claiming by context.

Too late to go before CQE.  All the blk-mq work is now in the CQE patchset.

> 
> The host claiming by context was merged a month ago and now I have
> understood that I can use that and rebased my patches on it. Slow learner, I
> guess.
> 
> If you feel it is ungrateful that you have put in so much work and things are
> not getting merged, and you feel your patches deserve to be merged first
> (because of human nature reasons) I can empathize with that. It's sad that
> your patches are at v12. Also I see that patch 4 bears the signoffs of a
> significant team at CodeAurora, so they are likely as impatient.

It is important that you understand that this has nothing to do with
"human nature reasons".  Linux distributions use upstream kernels.
ChromeOS has an "upstream first" policy.  Delaying features for long
periods has real-world consequences.  When people ask what kernel
they should use, we expect to reply: just use mainline.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/12 v4] multiqueue for MMC/SD
@ 2017-10-27 11:25         ` Linus Walleij
  0 siblings, 0 replies; 22+ messages in thread
From: Linus Walleij @ 2017-10-27 11:25 UTC (permalink / raw)
  To: Hunter, Adrian
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente, Avri Altman

On Thu, Oct 26, 2017 at 9:27 PM, Hunter, Adrian <adrian.hunter@intel.com> wrote:
> On Thu, Oct 26, 2017 at 3:34 PM, Adrian Hunter <adrian.hunter@intel.com>

>> What I mean is that the CQE code will likely look better on top of these
>> refactorings.
>>
>> But as I say it is a matter of taste. I just love the looks of my own code as
>> much as the next programmer so I can't judge that. Let's see what the
>> reviewers say.
>
> It doesn't look ready.  There seems still to be 2 task switches between
> each transfer.

IIUC there is a task in blk-mq submitting the requests, and that goes
all the way to starting them on the host. Then, as an interrupt comes back,
I kick the worker and that reports back to the block layer. So that
is one task switch per request and then one switch back to the MQ
layer.

But I am experimenting as we speak to get rid of the worker
for all simple cases that don't require retune or BKOPS and that's
pretty cool if it can be made to work :)
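A rough sketch of what that experiment could look like; everything here is
hypothetical (mmc_areq_to_req() and host->finalize_workqueue are placeholder
names, and the "simple" test is reduced to command/data errors plus retune),
not code from the series:

static void mmc_request_done_fast_path(struct mmc_host *host,
				       struct mmc_request *mrq)
{
	struct mmc_async_req *areq = host->areq;
	bool simple = !mrq->cmd->error &&
		      (!mrq->data || !mrq->data->error) &&
		      !host->need_retune;

	if (simple)
		/* Nothing blocking to do: report straight from IRQ context */
		blk_mq_complete_request(mmc_areq_to_req(areq));
	else
		/* Retune, BKOPS and error handling need process context */
		queue_work(host->finalize_workqueue,
			   &areq->finalization_work);
}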

> mmc_blk_rw_done_error() is still using the messy error
> handling and doesn’t handle retries e.g. 'retry' is a local variable so it can't
> count the number of retries now that there is no loop.

Right! Nice catch.

I will update the patch and put that in the struct mmc_async_req so
it survives the retry rounds.
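Roughly along these lines; the field name, the limit and the function
signature are illustrative only, not the actual follow-up patch:

#define MMC_BLK_MAX_RETRIES 5	/* illustrative */

struct mmc_async_req {
	/* ... existing members ... */
	int retries;	/* persists across re-issues, unlike the old local */
};

static void mmc_blk_rw_done_error(struct mmc_async_req *areq,
				  enum mmc_blk_status status)
{
	if (areq->retries++ >= MMC_BLK_MAX_RETRIES) {
		/* Out of retries: fail the request up to the block layer */
		return;
	}
	/* Otherwise re-issue areq->mrq and try again */
}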

>> >> It sounds simple but I bet this drives a truck through Adrians patch
>> >> series. Sorry. :(
>> >
>> > I waited a long time for your patches but I had to give up waiting
>> > when Ulf belated insisted on blk-mq before CQE.  I am not sure what
>> > you are expecting now it seems too late.
>>
>> Too late for what? It's just a patch set, I don't really have a deadline for this or
>> anything. As I explained above I have been working on this all the time, the
>> problem was that I was/am not smart enough to find that solution for host
>> claiming by context.
>
> Too late to go before CQE.  All the blk-mq work is now in the CQE patchset.

You seem to have an either/or stance.

Mine is more of a both/and.

It is not even necessary to have one set of these patches on
top of the other; they can also be mixed in some order.

A lot of factors influence this I think, like structure of code and
maintainability, performance, block layer interaction semantics,
etc etc.

We definitely need input from Ulf and Bartlomiej (who was actually
the first to work on MQ for MMC/SD).

>> The host claiming by context was merged a month ago and now I have
>> understood that I can use that and rebased my patches on it. Slow learner, I
>> guess.
>>
>> If you feel it is ungrateful that you have put in so much work and things are
>> not getting merged, and you feel your patches deserve to be merged first
>> (because of human nature reasons) I can empathize with that. It's sad that
>> your patches are at v12. Also I see that patch 4 bears the signoffs of a
>> significant team at CodeAurora, so they are likely as impatient.
>
> It is important that you understand that this has nothing to do with
> "human nature reasons".

You do come across as a bit hard-headed.

But I think it is better to focus on the code per se.

I would suggest we go and review each other's patch series to
learn from each codebase what was done in a good way for the
MMC/SD stack and what was not; you tossed out a nice review
comment above, for example.

>  Linux distributions use upstream kernels.
> ChromeOS  has an "upstream first" policy.  Delaying features for long
> periods has real-world consequences.  When people ask, what kernel
> should they use, we expect to reply, just use mainline.

We are in violent agreement.

I take it that you are working in a ChromeOS context and that, since
they have this policy, they, through their influence over Intel as a
supplier, are putting heavy pressure on you personally to get this
merged.

Is that correctly understood?

That would explain your increasing push to get this
upstream pretty well, especially if you have tech leads and
managers hovering over your shoulder every week asking how
the CQE upstream work is progressing.

It is indeed tough to juggle this with the pressure to "upstream
first" the BFQ scheduler policy that we are working on in Linaro
to increase interactivity. We need to enable this on devices
pronto and that means migrating MMC/SD to MQ and MQ only.
I have shared this motivation since the start, so it should come
as no surprise.

So I also have some pressure to "Get This Feature In Now".

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/12 v4] multiqueue for MMC/SD
  2017-10-27 11:25         ` Linus Walleij
  (?)
@ 2017-10-27 12:59         ` Adrian Hunter
  2017-10-27 14:29           ` Linus Walleij
  -1 siblings, 1 reply; 22+ messages in thread
From: Adrian Hunter @ 2017-10-27 12:59 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente, Avri Altman

On 27/10/17 14:25, Linus Walleij wrote:
> On Thu, Oct 26, 2017 at 9:27 PM, Hunter, Adrian <adrian.hunter@intel.com> wrote:
>> On Thu, Oct 26, 2017 at 3:34 PM, Adrian Hunter <adrian.hunter@intel.com>
> 
>>> What I mean is that the CQE code will likely look better on top of these
>>> refactorings.
>>>
>>> But as I say it is a matter of taste. I just love the looks of my own code as
>>> much as the next programmer so I can't judge that. Let's see what the
>>> reviewers say.
>>
>> It doesn't look ready.  There seems still to be 2 task switches between
>> each transfer.
> 
> IIUC there is a task in blk-mq submitting the requests, and that goes
> all the way to starting it on the host. Then as an interrupt comes back
> I kick the worker and that reports back to the block layer. So that
> is one task switch per request and then one switch back to the MQ
> layer.
> 
> But I am experimenting as we speak to get rid of the worker
> for all simple cases that don't require retune or BKOPS and that's
> pretty cool if it can be made to work :)

Well the CQE patches already do that.

> 
>> mmc_blk_rw_done_error() is still using the messy error
>> handling and doesn’t handle retries e.g. 'retry' is a local variable so it can't
>> count the number of retries now that there is no loop.
> 
> Right! Nice catch.
> 
> I will update the patch and put that in the struct mmc_async_req so
> it survives the retry rounds.
> 
>>>>> It sounds simple but I bet this drives a truck through Adrians patch
>>>>> series. Sorry. :(
>>>>
>>>> I waited a long time for your patches but I had to give up waiting
>>>> when Ulf belated insisted on blk-mq before CQE.  I am not sure what
>>>> you are expecting now it seems too late.
>>>
>>> Too late for what? It's just a patch set, I don't really have a deadline for this or
>>> anything. As I explained above I have been working on this all the time, the
>>> problem was that I was/am not smart enough to find that solution for host
>>> claiming by context.
>>
>> Too late to go before CQE.  All the blk-mq work is now in the CQE patchset.
> 
> You seem to have an either/or stance.
> 
> Mine if more of a both/and.
> 
> It is not even necessary to have one set of these patches on
> top of each other, they can also be mixed in some order.
> 
> A lot of factors influence this I think, like structure of code and
> maintainability, performance, block layer interaction semantics,
> etc etc.
> 
> We definately need input from Ulf and Bartlomiej (who was actually
> the first to work on MQ for MMC/SD).
> 
>>> The host claiming by context was merged a month ago and now I have
>>> understood that I can use that and rebased my patches on it. Slow learner, I
>>> guess.
>>>
>>> If you feel it is ungrateful that you have put in so much work and things are
>>> not getting merged, and you feel your patches deserve to be merged first
>>> (because of human nature reasons) I can empathize with that. It's sad that
>>> your patches are at v12. Also I see that patch 4 bears the signoffs of a
>>> significant team at CodeAurora, so they are likely as impatient.
>>
>> It is important that you understand that this has nothing to do with
>> "human nature reasons".
> 
> You do come across as a bit hard-headed.
> 
> But I think it is better to focus on the code per se.
> 
> I would suggest we go and review each others patch series to
> learn from each codebase what was done in a good way for the
> MMC/SD stack and what was not, you tossed out a nice review
> comment above for example.

I can make a few more comments about what else is broken.  Have you tried
suspend / resume?  At a glance, it looks like you are trying to use
blk_stop_queue() which is not a blk-mq function.
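For reference, a minimal sketch of blk-mq-native suspend/resume helpers,
assuming the existing mq->suspended flag (the legacy blk_stop_queue()/
blk_start_queue() calls only act on the old request_fn path):

void mmc_queue_suspend(struct mmc_queue *mq)
{
	if (!mq->suspended) {
		mq->suspended = true;
		/* Waits for in-flight ->queue_rq calls to finish */
		blk_mq_quiesce_queue(mq->queue);
	}
}

void mmc_queue_resume(struct mmc_queue *mq)
{
	if (mq->suspended) {
		mq->suspended = false;
		blk_mq_unquiesce_queue(mq->queue);
	}
}

Note that quiescing only stops new dispatches; requests already started on
the host would still have to be drained separately before suspending.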

> 
>>  Linux distributions use upstream kernels.
>> ChromeOS  has an "upstream first" policy.  Delaying features for long
>> periods has real-world consequences.  When people ask, what kernel
>> should they use, we expect to reply, just use mainline.
> 
> We are in violent agreement.
> 
> I take it that you are working on ChromeOS context and that since
> they have this policy, they, through their influence over Intel as a
> supplier is putting heavy pressure on you personally to get this
> merged.
> 
> Is that correctly understood?

No.  We just expect to base everything on mainline.

> 
> That would explain your increasing pushing to get this
> upstream pretty well, especially if you have tech leads and
> managers hovering over your shoulder every week asking how
> the CQE upstream work is progressing.
> 
> It is indeed tough to juggle this with the pressure to "upstream
> first" the BFQ scheduler policy that we are working on in Linaro
> to increase interactivity. We need to enable this on devices
> pronto and that means migrating MMC/SD to MQ and MQ only.
> I have shared this motivation since the start, so it should come
> as no surprise.

IMHO BFQ is just another example of unnecessary delay.

> 
> So I also have some pressure to "Get This Feature In Now".

It has nothing to do with pressure.  It is about what is reasonable.
Features should go in as soon as they are ready.  Ideally queued up in the
same release cycle they are submitted.  If the code doesn't work right, then
it can't go in straight away, but fake reasons for delaying things need to
stop.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 11/12 v4] mmc: block: issue requests in massive parallel
  2017-10-26 12:57 ` [PATCH 11/12 v4] mmc: block: issue requests in massive parallel Linus Walleij
@ 2017-10-27 14:19   ` Ulf Hansson
  0 siblings, 0 replies; 22+ messages in thread
From: Ulf Hansson @ 2017-10-27 14:19 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-mmc, linux-block, Jens Axboe, Christoph Hellwig,
	Arnd Bergmann, Bartlomiej Zolnierkiewicz, Paolo Valente,
	Avri Altman, Adrian Hunter

On 26 October 2017 at 14:57, Linus Walleij <linus.walleij@linaro.org> wrote:
> This makes a crucial change to the issueing mechanism for the
> MMC requests:
>
> Before commit "mmc: core: move the asynchronous post-processing"
> some parallelism on the read/write requests was achieved by
> speculatively postprocessing a request and re-preprocess and
> re-issue the request if something went wrong, which we discover
> later when checking for an error.
>
> This is kind of ugly. Instead we need a mechanism like here:
>
> We issue requests, and when they come back from the hardware,
> we know if they finished successfully or not. If the request
> was successful, we complete the asynchronous request and let a
> new request immediately start on the hardware. If, and only if,
> it returned an error from the hardware we go down the error
> path.
>
> This is achieved by splitting the work path from the hardware
> in two: a successful path ending up calling down to
> mmc_blk_rw_done() and completing quickly, and an errorpath
> calling down to mmc_blk_rw_done_error().
>
> This has a profound effect: we reintroduce the parallelism on
> the successful path as mmc_post_req() can now be called in
> while the next request is in transit (just like prior to
> commit "mmc: core: move the asynchronous post-processing")
> and blk_end_request() is called while the next request is
> already on the hardware.
>
> The latter has the profound effect of issuing a new request
> again so that we actually may have three requests
> in transit at the same time: one on the hardware, one being
> prepared (such as DMA flushing) and one being prepared for
> issuing next by the block layer. This shows up when we
> transit to multiqueue, where this can be exploited.

So this change should more or less restore the behavior we had before
this series. I would actually be interested to see a throughput
comparison against that baseline, before moving on to the last patch 12,
which converts to blk-mq.

Also, if I get things right so far, you have managed to get rid of a
waitqueue but instead introduced a worker, so from a context-switching
point of view, it would be interesting to see how/if that affects
things.

I'll do some tests myself and let you know the results.

[...]

> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 209ebb8a7f3f..0f57e9fe66b6 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -735,15 +735,35 @@ void mmc_finalize_areq(struct work_struct *work)
>                 mmc_start_bkops(host->card, true);
>         }
>
> -       /* Successfully postprocess the old request at this point */
> -       mmc_post_req(host, areq->mrq, 0);
> -
> -       /* Call back with status, this will trigger retry etc if needed */
> -       if (areq->report_done_status)
> -               areq->report_done_status(areq, status);
> -
> -       /* This opens the gate for the next request to start on the host */
> -       complete(&areq->complete);
> +       /*
> +        * Here we postprocess the request differently depending on if
> +        * we go on the success path or error path. The success path will
> +        * immediately let new requests hit the host, whereas the error
> +        * path will hold off new requests until we have retried and
> +        * succeeded or failed the current asynchronous request.
> +        */
> +       if (status == MMC_BLK_SUCCESS) {
> +               /*
> +                * This immediately opens the gate for the next request
> +                * to start on the host while we perform post-processing
> +                * and report back to the block layer.
> +                */
> +               host->areq = NULL;
> +               complete(&areq->complete);
> +               mmc_post_req(host, areq->mrq, 0);
> +               if (areq->report_done_status)
> +                       areq->report_done_status(areq, MMC_BLK_SUCCESS);
> +       } else {
> +               mmc_post_req(host, areq->mrq, 0);
> +               /*
> +                * Call back with error status, this will trigger retry
> +                * etc if needed
> +                */
> +               if (areq->report_done_status)
> +                       areq->report_done_status(areq, status);

I was trying to wrap my head around what this really means from a
request preparation point of view.

Can't we end up here having a new request being prepared, but then
doing error handling and re-trying with the current one?

It's been a long week, so I should probably stop reviewing code for now. :-)

Anyway, it seems like this error path really needs to be properly
tested/triggered, especially to make sure the above still plays
nicely.

Earlier experience also tells me that doing a card hotplug in the
middle of transactions could trigger interesting errors related to
this path.

> +               host->areq = NULL;
> +               complete(&areq->complete);
> +       }
>  }
>  EXPORT_SYMBOL(mmc_finalize_areq);
>
> --
> 2.13.6
>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 00/12 v4] multiqueue for MMC/SD
  2017-10-27 12:59         ` Adrian Hunter
@ 2017-10-27 14:29           ` Linus Walleij
  0 siblings, 0 replies; 22+ messages in thread
From: Linus Walleij @ 2017-10-27 14:29 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-mmc, Ulf Hansson, linux-block, Jens Axboe,
	Christoph Hellwig, Arnd Bergmann, Bartlomiej Zolnierkiewicz,
	Paolo Valente, Avri Altman

On Fri, Oct 27, 2017 at 2:59 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 27/10/17 14:25, Linus Walleij wrote:

>> It is indeed tough to juggle this with the pressure to "upstream
>> first" the BFQ scheduler policy that we are working on in Linaro
>> to increase interactivity. We need to enable this on devices
>> pronto and that means migrating MMC/SD to MQ and MQ only.
>> I have shared this motivation since the start, so it should come
>> as no surprise.
>
> IMHO BFQ is just another example of unnecessary delay.

I do not see it as a delay to anything; it is a motivation for
my work. I am telling you why I am still working on my patch
set, what is driving and motivating it.

I guess CQE is driving and motivating your work?

>> So I also have some pressure to "Get This Feature In Now".
>
> It has nothing to do with pressure.  It is about what is reasonable.
> Features should go in as soon as they are ready.  Ideally queued up in the
> same release cycle they are submitted.  If the code doesn't work right, then
> it can't go in straight away, but fake reasons for delaying things needs to
> stop.

I don't understand who you are addressing or accusing.

Nobody wants to delay CQE, if that is what you are implying;
I want to see it supported as much as you do.

I just prefer to see MQ happen first, and now you say your patch
set does that, which is great, so I just need to review the code
better, I guess?

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-10-27 14:29 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-26 12:57 [PATCH 00/12 v4] multiqueue for MMC/SD Linus Walleij
2017-10-26 12:57 ` [PATCH 01/12 v4] mmc: core: move the asynchronous post-processing Linus Walleij
2017-10-26 12:57 ` [PATCH 02/12 v4] mmc: core: add a workqueue for completing requests Linus Walleij
2017-10-26 12:57 ` [PATCH 03/12 v4] mmc: core: replace waitqueue with worker Linus Walleij
2017-10-26 12:57 ` [PATCH 04/12 v4] mmc: core: do away with is_done_rcv Linus Walleij
2017-10-26 12:57 ` [PATCH 05/12 v4] mmc: core: do away with is_new_req Linus Walleij
2017-10-26 12:57 ` [PATCH 06/12 v4] mmc: core: kill off the context info Linus Walleij
2017-10-26 12:57 ` [PATCH 07/12 v4] mmc: queue: simplify queue logic Linus Walleij
2017-10-26 12:57 ` [PATCH 08/12 v4] mmc: block: shuffle retry and error handling Linus Walleij
2017-10-26 12:57 ` [PATCH 09/12 v4] mmc: queue: stop flushing the pipeline with NULL Linus Walleij
2017-10-26 12:57 ` [PATCH 10/12 v4] mmc: queue/block: pass around struct mmc_queue_req*s Linus Walleij
2017-10-26 12:57 ` [PATCH 11/12 v4] mmc: block: issue requests in massive parallel Linus Walleij
2017-10-27 14:19   ` Ulf Hansson
2017-10-26 12:57 ` [PATCH 12/12 v4] mmc: switch MMC/SD to use blk-mq multiqueueing Linus Walleij
2017-10-26 13:34 ` [PATCH 00/12 v4] multiqueue for MMC/SD Adrian Hunter
2017-10-26 14:20   ` Linus Walleij
2017-10-26 19:27     ` Hunter, Adrian
2017-10-27 11:25       ` Linus Walleij
2017-10-27 12:59         ` Adrian Hunter
2017-10-27 14:29           ` Linus Walleij
