From: Linus Walleij <linus.walleij@linaro.org>
To: linux-mmc@vger.kernel.org, Ulf Hansson <ulf.hansson@linaro.org>
Cc: linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
    Christoph Hellwig <hch@lst.de>, Arnd Bergmann <arnd@arndb.de>,
    Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
    Paolo Valente <paolo.valente@linaro.org>,
    Avri Altman <Avri.Altman@sandisk.com>,
    Adrian Hunter <adrian.hunter@intel.com>,
    Linus Walleij <linus.walleij@linaro.org>
Subject: [PATCH 00/12 v4] multiqueue for MMC/SD
Date: Thu, 26 Oct 2017 14:57:45 +0200
Message-ID: <20171026125757.10200-1-linus.walleij@linaro.org>

This switches the MMC/SD stack over to unconditionally
using the multiqueue block interface for block access.
This modernizes the MMC/SD stack and makes it possible
to enable BFQ scheduling on these single-queue devices.

This is v4 of the v3 patch set posted in February:
https://marc.info/?l=linux-mmc&m=148665788227015&w=2

The patches are available in a git branch:
https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git/log/?h=mmc-mq-v4.14-rc4

You can pull it into a clean kernel tree like this:

git checkout -b mmc-test v4.14-rc4
git pull git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git mmc-mq-v4.14-rc4

I have now worked on it for more than a year. I was sidetracked
by cleaning up some code, moving request allocation into the
block layer, deleting the bounce buffer handling and refactoring
the RPMB support. With the changes to request allocation, the
patch set is a better fit and has shrunk from 16 to 12 patches
as a result.

It is still quite invasive. Yet it is something I think would
be nice to merge for v4.16...

The rationale for this approach was Arnd's suggestion to try to
switch the MMC/SD stack around so as to complete requests as
quickly as possible when they return from the device driver,
so that new requests can be issued. We are doing this now:
the polling loop that was pulling NULL out of the request
queue and driving the pipeline is gone with the next-to-last
patch ("mmc: block: issue requests in massive parallel"). This
sets the stage for MQ to go in and hammer requests on the
asynchronous issuing layer.

We use the trick of setting the queue depth to 2 to get two
parallel requests pushed down to the host. I tried setting it
to 4: the code survives it, the queue just has three items
waiting to be submitted all the time.

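As a rough illustration of the queue depth observation, here is a toy
model in Python. It is not kernel code. It assumes that the host
transfers exactly one request at a time and that the block layer
refills every free slot immediately; the function name simulate() and
its parameters are made up for this sketch.

```python
from collections import deque


def simulate(queue_depth, total_requests):
    """Return the peak number of dispatched requests that sit idle.

    Toy model: the block layer hands out at most `queue_depth` requests
    at a time, and the host transfers exactly one of them at a time.
    """
    remaining = total_requests  # requests still held by the block layer
    dispatched = deque()        # requests already handed to the host side
    next_id = 0
    peak_idle = 0

    while remaining or dispatched:
        # The block layer refills every free slot up to the queue depth.
        while remaining and len(dispatched) < queue_depth:
            dispatched.append(next_id)
            next_id += 1
            remaining -= 1

        # One request is on the bus; everything else dispatched is idle.
        peak_idle = max(peak_idle, len(dispatched) - 1)

        # The transfer finishes and frees one slot.
        dispatched.popleft()

    return peak_idle


if __name__ == "__main__":
    for depth in (2, 4):
        print(depth, simulate(depth, 100))
```

Under these assumptions a depth of 2 leaves one request idle behind
the one being transferred, and a depth of 4 leaves three.
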
In my opinion this is also a better fit for command queueing.
Handling command queueing needs to happen in the asynchronous
submission codepath, so instead of waiting on a pending
areq, we just stack up requests in the command queue.

It sounds simple but I bet this drives a truck through Adrian's
patch series. Sorry. :(

We are not issuing new requests from interrupt context: I still
have to post work on a workqueue for that, since retune and
background operations need to be checked after every command,
and as far as I know that has to happen in blocking context.

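The deferral pattern can be modelled in user space. The following
Python sketch is only an analogy and does not use the kernel workqueue
API: a queue.Queue stands in for the workqueue, the main loop plays the
device and its completion interrupt, and a thread plays the worker
that runs in blocking context. All names (run_pipeline, issue_next,
the "check" event) are hypothetical, and only one request is in
flight at a time to keep the ordering deterministic.

```python
import queue
import threading


def run_pipeline(requests):
    """Run `requests` one by one and return the ordered event log."""
    work = queue.Queue()      # stands in for the workqueue
    hardware = queue.Queue()  # requests currently on the "device"
    pending = list(requests)
    log = []

    def issue_next():
        # Hand the next request to the device, or tell it to stop.
        if pending:
            req = pending.pop(0)
            log.append(("issue", req))
            hardware.put(req)
        else:
            hardware.put(None)

    def worker():
        # Blocking context: checks that may sleep run here, and only
        # afterwards is the next request issued.
        while True:
            done = work.get()
            if done is None:
                return
            log.append(("check", done))  # e.g. retune/background ops
            issue_next()

    thread = threading.Thread(target=worker)
    thread.start()
    issue_next()  # prime the pipeline with the first request

    while True:
        req = hardware.get()
        if req is None:
            break
        # "Interrupt context": do not block, just record the completion
        # and post the follow-up work.
        log.append(("complete", req))
        work.put(req)

    work.put(None)  # shut the worker down
    thread.join()
    return log


if __name__ == "__main__":
    for event in run_pipeline(["read-0", "read-1"]):
        print(event)
```

Every request goes through issue, complete and check in that order,
and the next issue always happens from the worker, never from the
completion path.
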
I might make a hack that strips out the retune (etc.) and
instead runs requests until something fails, reporting requests
back to the block layer in interrupt context. It would be an
interesting experiment, but for later.

We have parallelism in pre/post hooks also with multiqueue.
All asynchronous optimization that was there for the old block
layer is now also there for multiqueue.

Last time I followed up with some open questions:
https://marc.info/?l=linux-mmc&m=149075698610224&w=2

I think these are now resolved.

As a result, the last patch is no longer in RFC state. I
think this works. (Famous last words, OK there WILL be
regressions but hey, we need to do this.)

You can see there are three steps:

- I do some necessary refactoring and need to move postprocessing
  to after the requests have been completed. As you can see, this
  clearly introduces a performance regression in the dd test with
  the patch:
  "mmc: core: move the asynchronous post-processing"
  The random seek test with find does not seem to be much affected.

- I continue the refactoring and get to the point of issuing
  requests immediately after every successful transfer, and the
  dd performance is restored with the patch
  "mmc: block: issue requests in massive parallel"

- Then I add multiqueue on top of the cake. So before the change
  we have the nice performance we want, and we can study the effect
  of just introducing multiqueueing in the last patch
  "mmc: switch MMC/SD to use blk-mq multiqueueing v4"

PERFORMANCE BEFORE AND AFTER:

BEFORE this patch series, on Ulf's next branch ending with
commit cf653c788a29fa70e07b86492a7599c471c705de (mmc-next)
Merge: 4dda8e1f70f8 eb701ce16a45 ("Merge branch 'fixes' into next")

sync
echo 3 > /proc/sys/vm/drop_caches
sync
time dd if=/dev/mmcblk3 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.0GB) copied, 23.966583 seconds, 42.7MB/s
real 0m 23.97s
user 0m 0.01s
sys 0m 3.74s

mount /dev/mmcblk3p1 /mnt/
cd /mnt/
sync
echo 3 > /proc/sys/vm/drop_caches
sync
time find . > /dev/null
real 0m 3.24s
user 0m 0.22s
sys 0m 1.23s

sync
echo 3 > /proc/sys/vm/drop_caches
sync
iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test
                                                  random  random
      kB  reclen   write rewrite    read  reread    read   write
   20480       4    1598    1559    6782    6740    6751     536
   20480       8    2134    2281   11449   11449   11407    1145
   20480      16    3695    4171   17676   17677   17638    1234
   20480      32    5751    7475   23622   23622   23584    3004
   20480      64    6778    8648   27937   27950   27914    3445
   20480     128    6073    8115   29091   29080   29070    4892
   20480     256    7106    7208   29658   29670   29657    6743
   20480     512    8828    9953   29911   29905   29901    7424
   20480    1024    6566    7199   27233   27236   27209    6808
   20480    2048    7370    7403   27492   27490   27475    7428
   20480    4096    7352    7456   28124   28123   28109    7411
   20480    8192    7271    7462   28371   28369   28359    7458
   20480   16384    7097    7478   28540   28538   28528    7464

AFTER this patch series ending with
"mmc: switch MMC/SD to use blk-mq multiqueueing v4":

sync
echo 3 > /proc/sys/vm/drop_caches
sync
time dd if=/dev/mmcblk3 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.0GB) copied, 24.358276 seconds, 42.0MB/s
real 0m 24.36s
user 0m 0.00s
sys 0m 3.92s

mount /dev/mmcblk3p1 /mnt/
cd /mnt/
sync
echo 3 > /proc/sys/vm/drop_caches
sync
time find . > /dev/null
real 0m 3.92s
user 0m 0.26s
sys 0m 1.21s

sync
echo 3 > /proc/sys/vm/drop_caches
sync
iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test
                                                  random  random
      kB  reclen   write rewrite    read  reread    read   write
   20480       4    1614    1569    6913    6889    6876     531
   20480       8    2147    2301   11628   11625   11581    1165
   20480      16    3820    4256   17760   17764   17725    1549
   20480      32    5814    7508   23148   23145   23123    3561
   20480      64    7396    8161   27513   27527   27500    4177
   20480     128    6707    9025   29160   29166   29139    5199
   20480     256    7902    7860   29459   29456   29462    7307
   20480     512    8061   11343   29888   29891   29881    6800
   20480    1024    7076    7442   27702   27704   27700    7445
   20480    2048    6846    8194   27417   27419   27418    6781
   20480    4096    8115    6810   28113   28113   28109    8191
   20480    8192    7254    7434   28413   28419   28414    7476
   20480   16384    7090    7481   28623   28619   28625    7454

As you can see, performance is essentially unaffected: the
differences are within the noise margin.

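For anyone who wants to put numbers on the comparison, the relative
change between the two runs can be computed with a few lines of
Python. This is only a convenience sketch: the helper name
relative_change() is made up here, the figures are copied from the
BEFORE and AFTER output above, and the iozone rows picked are just the
smallest and largest record sizes from the sequential read column.

```python
def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the benchmark output above.
MEASUREMENTS = {
    "dd elapsed (s)": (23.966583, 24.358276),
    "find elapsed (s)": (3.24, 3.92),
    "iozone read, reclen 4 (kB/s)": (6782, 6913),
    "iozone read, reclen 16384 (kB/s)": (28540, 28623),
}

if __name__ == "__main__":
    for name, (before, after) in MEASUREMENTS.items():
        print(f"{name}: {relative_change(before, after):+.2f}%")
```

Each benchmark was run once per kernel, so these percentages cannot by
themselves separate noise from a real change; repeated runs would be
needed for that.
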
Linus Walleij (12):
  mmc: core: move the asynchronous post-processing
  mmc: core: add a workqueue for completing requests
  mmc: core: replace waitqueue with worker
  mmc: core: do away with is_done_rcv
  mmc: core: do away with is_new_req
  mmc: core: kill off the context info
  mmc: queue: simplify queue logic
  mmc: block: shuffle retry and error handling
  mmc: queue: stop flushing the pipeline with NULL
  mmc: queue/block: pass around struct mmc_queue_req*s
  mmc: block: issue requests in massive parallel
  mmc: switch MMC/SD to use blk-mq multiqueueing v4

 drivers/mmc/core/block.c    | 539 ++++++++++++++++++++++----------------------
 drivers/mmc/core/block.h    |   5 +-
 drivers/mmc/core/bus.c      |   1 -
 drivers/mmc/core/core.c     | 194 +++++++++-------
 drivers/mmc/core/core.h     |  11 +-
 drivers/mmc/core/host.c     |   1 -
 drivers/mmc/core/mmc_test.c |  31 +--
 drivers/mmc/core/queue.c    | 238 ++++++++-----------
 drivers/mmc/core/queue.h    |  16 +-
 include/linux/mmc/core.h    |   3 +-
 include/linux/mmc/host.h    |  27 +--
 11 files changed, 503 insertions(+), 563 deletions(-)

--
2.13.6