linux-mmc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Baolin Wang <baolin.wang7@gmail.com>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
	Asutosh Das <asutoshd@codeaurora.org>,
	Orson Zhai <orsonzhai@gmail.com>,
	Chunyan Zhang <zhang.lyra@gmail.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Linus Walleij <linus.walleij@linaro.org>,
	Baolin Wang <baolin.wang@linaro.org>,
	"linux-mmc@vger.kernel.org" <linux-mmc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v8 0/5] Add MMC software queue support
Date: Mon, 10 Feb 2020 16:41:20 +0800	[thread overview]
Message-ID: <CADBw62rgoXjgoqaokhBqYCqAY-3vYnCq4iiVYaxZu_JJsufvDQ@mail.gmail.com> (raw)
In-Reply-To: <CAPDyKFpqtCi4H7T99PguzsUi+-803EA-y+vXGK7C0cHVbzeJ6A@mail.gmail.com>

Hi Ulf,

On Thu, Feb 6, 2020 at 11:00 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Wed, 5 Feb 2020 at 13:51, Baolin Wang <baolin.wang7@gmail.com> wrote:
> >
> > Hi All,
> >
> > Now the MMC read/write stack will always wait for previous request is
> > completed by mmc_blk_rw_wait(), before sending a new request to hardware,
> > or queue a work to complete request, that will bring context switching
> > overhead, especially for high I/O per second rates, to affect the IO
> > performance.
>
> In the regular request path (non CQE), we call mmc_blk_card_busy() to
> complete a request. For write I/O, this leads to calling
> card_busy_detect(), which starts to poll the card by sending a CMD13.
>
> At least one CMD13 will be sent to the card, before we exit the
> polling loop and a new I/O request can get submitted. However, in many
> cases, depending on the controller/host/card/request-size, my best
> guess is that *one* CMD13 might not be sufficient. At least, that is
> what I have observed on those platforms I recently have been working
> on.
>
> That said, I am wondering if you have done some measurement/profiling
> on this particular behaviour for your controller/driver? For example,
> how many CMD13 gets sent for random small writes during polling?

Ah, I had not checked how many CMD13 for random small writes before.
And I did a quick testing today, I found only 1 CMD13 gets sent for
random writes on my platform.


> Why am I asking this? Because, unless I am mistaken, when using the
> new hsq path that you introduce in $subject series, based on the cqe
> ops, then mmc_blk_card_busy() is not being called at all. In other
> words, you rely on HW busy detection from the controller/driver,
> rather than polling with CMD13. Is that correct?

Right. I think so.

> This seems like an additional reason to why you achieve significant
> improvements for the random write case. Don't you think?

Yes, agree wtih you.

> >
> > Thus this patch set will introduce the MMC software command queue support
> > based on command queue engine's interfaces, and set the queue depth as 64
> > to allow more requests can be be prepared, merged and inserted into IO
> > scheduler, but we only allow 2 requests in flight, that is enough to let
> > the irq handler always trigger the next request without a context switch,
> > as well as avoiding a long latency.
> >
> > Moreover we can expand the MMC software queue interface to support
> > MMC packed request or packed command instead of adding new interfaces,
> > according to previosus discussion.
> >
> > Below are some comparison data with fio tool. The fio command I used
> > is like below with changing the '--rw' parameter and enabling the direct
> > IO flag to measure the actual hardware transfer speed in 4K block size.
> >
> > ./fio --filename=/dev/mmcblk0p30 --direct=1 --iodepth=20 --rw=read --bs=4K --size=1G --group_reporting --numjobs=20 --name=test_read
> >
> > My eMMC card working at HS400 Enhanced strobe mode:
> > [    2.229856] mmc0: new HS400 Enhanced strobe MMC card at address 0001
> > [    2.237566] mmcblk0: mmc0:0001 HBG4a2 29.1 GiB
> > [    2.242621] mmcblk0boot0: mmc0:0001 HBG4a2 partition 1 4.00 MiB
> > [    2.249110] mmcblk0boot1: mmc0:0001 HBG4a2 partition 2 4.00 MiB
> > [    2.255307] mmcblk0rpmb: mmc0:0001 HBG4a2 partition 3 4.00 MiB, chardev (248:0)
> >
> > 1. Without MMC software queue
> > I tested 5 times for each case and output a average speed.
> >
> > 1) Sequential read:
> > Speed: 59.4MiB/s, 63.4MiB/s, 57.5MiB/s, 57.2MiB/s, 60.8MiB/s
> > Average speed: 59.66MiB/s
> >
> > 2) Random read:
> > Speed: 26.9MiB/s, 26.9MiB/s, 27.1MiB/s, 27.1MiB/s, 27.2MiB/s
> > Average speed: 27.04MiB/s
> >
> > 3) Sequential write:
> > Speed: 71.6MiB/s, 72.5MiB/s, 72.2MiB/s, 64.6MiB/s, 67.5MiB/s
> > Average speed: 69.68MiB/s
> >
> > 4) Random write:
> > Speed: 36.3MiB/s, 35.4MiB/s, 38.6MiB/s, 34MiB/s, 35.5MiB/s
> > Average speed: 35.96MiB/s
> >
> > 2. With MMC software queue
> > I tested 5 times for each case and output a average speed.
> >
> > 1) Sequential read:
> > Speed: 59.2MiB/s, 60.4MiB/s, 63.6MiB/s, 60.3MiB/s, 59.9MiB/s
> > Average speed: 60.68MiB/s
> >
> > 2) Random read:
> > Speed: 31.3MiB/s, 31.4MiB/s, 31.5MiB/s, 31.3MiB/s, 31.3MiB/s
> > Average speed: 31.36MiB/s
> >
> > 3) Sequential write:
> > Speed: 71MiB/s, 71.8MiB/s, 72.3MiB/s, 72.2MiB/s, 71MiB/s
> > Average speed: 71.66MiB/s
> >
> > 4) Random write:
> > Speed: 68.9MiB/s, 68.7MiB/s, 68.8MiB/s, 68.6MiB/s, 68.8MiB/s
> > Average speed: 68.76MiB/s
> >
> > Form above data, we can see the MMC software queue can help to improve some
> > performance obviously for random read and write, though no obvious improvement
> > for sequential read and write.
> >
> > Any comments are welcome. Thanks a lot.
> >
> > Changes from v7:
> >  - Add reviewed tag from Arnd.
> >  - Use the 'hsq' acronym for varibles and functions in the core layer.
> >  - Check the 'card->ext_csd.cmdq_en' in cqhci.c to make sure the CQE
> >  can work normally.
> >  - Add a new patch to enable the host software queue for the SD card.
> >  - Use the default MMC queue depth for host software queue.
>
> It would be nice to also have some measurements for an SD card, now
> that the series supports this. Is that possible for you test as well?

Yes, but my SD card works at high speed mode, and shows a low speed in
4k block size.
[    2.941965] mmc0: new high speed SDHC card at address b368
[    2.948325] mmcblk0: mmc0:b368 SD08G 7.42 GiB
[    2.956554]  mmcblk0: p1

And I did not see any obvious improvement or recession for my SD card
in 4k block size from below data, I think the most of the time is
spent in hardware. (But when I enabled the packed request based on
hsq, I can see some obvious improvement.)
Without hsq:
read: bw=4347KiB/s
randread: bw=3040KiB/s
write: bw=1361KiB/s
randwrite: bw=692KiB/s

With hsq:
read: bw=4246KiB/s
randread: bw=29950KiB/s
write: bw=1417KiB/s
randwrite: bw=697KiB/s

  reply	other threads:[~2020-02-10  8:41 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-05 12:50 [PATCH v8 0/5] Add MMC software queue support Baolin Wang
2020-02-05 12:50 ` [PATCH v8 1/5] mmc: Add MMC host " Baolin Wang
2020-02-11  9:44   ` Ulf Hansson
2020-02-11 12:51     ` Baolin Wang
2020-02-05 12:50 ` [PATCH v8 2/5] mmc: core: Enable the MMC host software queue for the SD card Baolin Wang
2020-02-05 12:50 ` [PATCH v8 3/5] mmc: host: sdhci: Add request_done ops for struct sdhci_ops Baolin Wang
2020-02-05 12:50 ` [PATCH v8 4/5] mmc: host: sdhci: Add a variable to defer to complete requests if needed Baolin Wang
2020-02-05 12:50 ` [PATCH v8 5/5] mmc: host: sdhci-sprd: Add software queue support Baolin Wang
2020-02-06 15:00 ` [PATCH v8 0/5] Add MMC " Ulf Hansson
2020-02-10  8:41   ` Baolin Wang [this message]
2020-02-10 13:25     ` Ulf Hansson
2020-02-11  4:47       ` Baolin Wang
2020-02-11  9:46         ` Ulf Hansson
2020-02-11 12:34           ` Baolin Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADBw62rgoXjgoqaokhBqYCqAY-3vYnCq4iiVYaxZu_JJsufvDQ@mail.gmail.com \
    --to=baolin.wang7@gmail.com \
    --cc=adrian.hunter@intel.com \
    --cc=arnd@arndb.de \
    --cc=asutoshd@codeaurora.org \
    --cc=baolin.wang@linaro.org \
    --cc=linus.walleij@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=orsonzhai@gmail.com \
    --cc=ulf.hansson@linaro.org \
    --cc=zhang.lyra@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).