linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ulf Hansson <ulf.hansson@linaro.org>
To: DooHyun Hwang <dh0421.hwang@samsung.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
	"linux-mmc@vger.kernel.org" <linux-mmc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	DTML <devicetree@vger.kernel.org>,
	Rob Herring <robh+dt@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Satya Tangirala <satyat@google.com>,
	ebiggers@google.com,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	grant.jung@samsung.com, jt77.jang@samsung.com,
	junwoo80.lee@samsung.com, jangsub.yi@samsung.com,
	sh043.lee@samsung.com, Chanwoo Lee <cw9316.lee@samsung.com>,
	sh8267.baek@samsung.com, wkon.kim@samsung.com
Subject: Re: [PATCH 2/2] mmc: core: Add no single read retries
Date: Mon, 1 Mar 2021 09:23:44 +0100	[thread overview]
Message-ID: <CAPDyKFrgAanRYCe1QckWK8vxwV=rXV3KzTRynY_mkNaRkSrj+g@mail.gmail.com> (raw)
In-Reply-To: <000001d70509$54bf59b0$fe3e0d10$@samsung.com>

On Wed, 17 Feb 2021 at 09:46, DooHyun Hwang <dh0421.hwang@samsung.com> wrote:
>
>
> On 17/02/21 8:00 am, Adrian Hunter wrote:
> >On 17/02/21 7:46 am, Adrian Hunter wrote:
> >> On 17/02/21 7:22 am, DooHyun Hwang wrote:
> >>> This makes to handle read errors faster by not retrying multiple
> >>> block read(CMD18) errors with single block reads(CMD17).
> >>>
> >>> On some bad SD Cards that have problem with read operations, it is
> >>> not helpful to retry multiple block read errors with several single
> >>> block reads, and it is delayed to treat read operations as I/O error
> >>> as much as retrying single block reads.
> >>
> >> If the issue is that it takes too long, then maybe it would be better
> >> to get
> >> mmc_blk_read_single() to give up after a certain amount of time.
> >>
> >
> >So that a device property would not be needed I mean.  Then everyone would
> >benefit.

Just wanted to confirm with Adrian's points, that we don't want a
device property for this.

In fact, the DT maintainers would reject it because it would be
considered as a software configuration, which doesn't belong in DT.

>
> Thank you for reviewing this.
>
> mmc_blk_read_single() takes a different time depending on the number of
> sectors to read and the timeout value for each CMD.
>
> I think it's difficult to set the criteria for "a certain amount of time"
> you talked about.
> And it's harder to proceed with any errors caused by giving up in
> mmc_blk_read_single() than no retrying.
>
> So, I would like to add a configurable property to skip the single block
> read retrying because if multiple block read error occurs, single block
> read retrying doesn't help for some bad SD cards.

I certainly agree that falling back to single block reads is
questionable, at least for some cases. Moreover, I am pretty sure it's
not always the SD card that should be blamed, but a broken mmc host
driver or broken HW/controller.

That said, I assume that the main reason why we fall back to retry
with single block reads, is to try to recover as much data as possible
from a broken SD card. The intent is good, but to recover data from a
broken card, we should also consider to move to a lower/legacy speed
mode and to decrease the clock rate of the interface.

For the clock rate, we already have a debugfs node, allowing us to
change the rate per mmc host. I suggest we add a few more debugfs
nodes, allowing us to restrict the speed mode and to enable/disable
single/multi block read.

If we can get these things in place to help with recovery, I wouldn't
mind us changing the default behaviour to skip the single block read
in the recovery path.

>
> This is the log to check for this patch.
> #0. time difference is about 2.37s for 8 sectors between with(#1) and without(#2)
>      single block read retrying
>      This is a test for just one CMD18.
>      When there are many I/O requests, it takes too long to handle the errors.
>
> #1. retry multiple block read (8 sectors) error with single block reads
> // It takes about 3.585671s for the I/O error.
> // issue CMD23 (+ arg 0x8)
> // issue CMD18 (+ arg 0x000320e0) and error occurs
> <7>[  316.657115]  [5:   kworker/5:1H:  324] <mmc0: starting CMD23 arg 00000008 flags 00000015>
> <7>[  316.657124]  [5:   kworker/5:1H:  324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> <7>[  316.826302] I[0:      swapper/0:    0] mmc0: req done <CMD23>: 0: 00000000 00000000 00000000 00000000
> <7>[  316.826327] I[0:      swapper/0:    0] mmc0: req done (CMD18): 0: 00000900 00000000 00000000 00000000
> <7>[  316.826362] I[0:      swapper/0:    0] mmc0:     0 bytes transferred: -110
> <7>[  316.826389] I[0:      swapper/0:    0] mmc0:     (CMD12): 0: 00000b00 00000000 00000000 00000000
> <7>[  316.826516]  [0:   kworker/0:1H:  338] mmc0: starting CMD13 arg 00010000 flags 00000195
> <7>[  316.826621] I[0:   kworker/0:1H:  338] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
> // retry CMD18 (+ arg 0x000320e0) and error occurs again. Same as above.
> <7>[  316.829224]  [5:   kworker/5:1H:  324] <mmc0: starting CMD23 arg 00000008 flags 00000015>
> <7>[  316.829237]  [5:   kworker/5:1H:  324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> <7>[  316.999588] I[0:      swapper/0:    0] mmc0: req done <CMD23>: 0: 00000000 00000000 00000000 00000000
> <7>[  316.999653] I[0:      swapper/0:    0] mmc0: req done (CMD18): 0: 00000900 00000000 00000000 00000000
> <7>[  316.999725] I[0:      swapper/0:    0] mmc0:     0 bytes transferred: -110
> <7>[  316.999789] I[0:      swapper/0:    0] mmc0:     (CMD12): 0: 00000b00 00000000 00000000 00000000
> <7>[  317.000034]  [0:   kworker/0:1H:  338] mmc0: starting CMD13 arg 00010000 flags 00000195
> <7>[  317.000370] I[0:   kworker/0:1H:  338] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
> // mmc_blk_reset() and it's completed
> <7>[  317.000523]  [0:   kworker/0:1H:  338] mmc0: clock 0Hz busmode 2 powermode 0 cs 0 Vdd 0 width 1 timing 0
> ...
> // mmc_blk_read_single() : CMD17, CMD13 and CMD12 repeats 8 times (for retrying multiple block read with 8 sectors)
> // CMD17 (+ arg 0x000320e0 ~ 0x000320e7) and timeout errors occur
> // It takes about 1.351s
> <7>[  317.200351]  [0:   kworker/0:1H:  338] mmc0: starting CMD17 arg 000320e0 flags 000000b5
> <7>[  317.368748] I[0:      swapper/0:    0] mmc0: req done (CMD17): 0: 00000900 00000000 00000000 00000000
> <7>[  317.368776] I[0:      swapper/0:    0] mmc0:     0 bytes transferred: -110
> <7>[  317.368871]  [0:   kworker/0:1H:  338] mmc0: starting CMD13 arg 00010000 flags 00000195
> <7>[  317.368932] I[0:   kworker/0:1H:  338] mmc0: sdhci: IRQ status 0x00000001
> <7>[  317.368970] I[0:   kworker/0:1H:  338] mmc0: req done (CMD13): 0: 00000b00 00000000 00000000 00000000
> <7>[  317.369020]  [0:   kworker/0:1H:  338] mmc0: starting CMD12 arg 00000000 flags 00000095
> <7>[  317.369070] I[0:   kworker/0:1H:  338] mmc0: sdhci: IRQ status 0x00000001
> <7>[  317.369108] I[0:   kworker/0:1H:  338] mmc0: req done (CMD12): 0: 00000b00 00000000 00000000 00000000
> <7>[  317.369155]  [0:   kworker/0:1H:  338] mmc0: starting CMD13 arg 00010000 flags 00000195
> <7>[  317.369204] I[0:   kworker/0:1H:  338] mmc0: sdhci: IRQ status 0x00000001
> <7>[  317.369245] I[0:   kworker/0:1H:  338] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
> <3>[  317.369298]  [0:   kworker/0:1H:  338] print_req_error: I/O error, dev mmcblk0, sector 205024
> <7>[  317.369342]  [0:   kworker/0:1H:  338] mmc0: starting CMD17 arg 000320e1 flags 000000b5
> ...
> <7>[  318.382668]  [0:   kworker/0:1H:  338] mmc0: starting CMD17 arg 000320e7 flags 000000b5
> <3>[  318.551568]  [0:   kworker/0:1H:  338] print_req_error: I/O error, dev mmcblk0, sector 205031
> // retry CMD18 (+ arg 0x000320e0) and error occurs again.
> <7>[  318.551850]  [5:   kworker/5:1H:  324] <mmc0: starting CMD23 arg 00000008 flags 00000015>
> <7>[  318.551867]  [5:   kworker/5:1H:  324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> ...
> // retry CMD18 (+ arg 0x000320e0) and error occurs again.
> <7>[  318.721767]  [5:   kworker/5:1H:  324] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> // CMD17 (+ arg 0x000320e0 ~ 0x000320e7)
> <7>[  318.891054]  [0:   kworker/0:1H:  338] mmc0: starting CMD17 arg 000320e0 flags 000000b5
> ...
> <7>[  320.073861]  [0:   kworker/0:1H:  338] mmc0: starting CMD17 arg 000320e7 flags 000000b5
> // Return I/O error for read operation finally
> <3>[  320.242786]  [0:   kworker/0:1H:  338] Buffer I/O error on dev mmcblk0, logical block 25628, async page read
>
> #2. retry multiple block read (8 sectors) error without single block reads
> // It takes about 1.205941s for the I/O error.
> // issue CMD23 (+ arg 0x8)
> // issue CMD18 (+ arg 0x000320e0) and error occurs
> <7>[  126.467114]  [7:   kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
> <7>[  126.467125]  [7:   kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> <7>[  126.636188] I[0:Measurement Wor: 9074] mmc0: req done <CMD23>: 0: 00000000 00000000 00000000 00000000
> <7>[  126.636213] I[0:Measurement Wor: 9074] mmc0: req done (CMD18): 0: 00000900 00000000 00000000 00000000
> <7>[  126.636241] I[0:Measurement Wor: 9074] mmc0:     0 bytes transferred: -110
> <7>[  126.636265] I[0:Measurement Wor: 9074] mmc0:     (CMD12): 0: 00000b00 00000000 00000000 00000000
> <7>[  126.636379]  [0:   kworker/0:1H:  336] mmc0: starting CMD13 arg 00010000 flags 00000195
> <7>[  126.636495] I[0:   kworker/0:1H:  336] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 00000000
> // retry CMD18 (+ arg 0x000320e0) and error occurs again. Same as above.
> <7>[  126.638284]  [7:   kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
> <7>[  126.638298]  [7:   kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> // mmc_blk_reset() and it's completed
> <7>[  126.807645]  [0:   kworker/0:1H:  336] mmc0: clock 0Hz busmode 2 powermode 0 cs 0 Vdd 0 width 1 timing 0
> ...
> // no mmc_blk_read_single() calling
> // retry CMD18 (+ arg 0x000320e0) and error occurs again.
> <7>[  126.993628]  [7:   kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
> <7>[  126.993643]  [7:   kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> // retry CMD18 (+ arg 0x000320e0) and error occurs again.
> <7>[  127.164836]  [7:   kworker/7:2H: 8887] <mmc0: starting CMD23 arg 00000008 flags 00000015>
> <7>[  127.164848]  [7:   kworker/7:2H: 8887] mmc0: starting CMD18 arg 000320e0 flags 000000b5
> ...
> // Return I/O error for read operation finally
> <3>[  127.673055] I[7:      swapper/7:    0] Buffer I/O error on dev mmcblk0, logical block 25628, async page read
>

Kind regards
Uffe

  reply	other threads:[~2021-03-01  8:25 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20210217053519epcas1p4204253d3af8e1c6d2a5c3acb73eb877f@epcas1p4.samsung.com>
2021-02-17  5:22 ` [PATCH 0/2] mmc: core: add a new property DooHyun Hwang
     [not found]   ` <CGME20210217053520epcas1p1af462836bd9fc690d99baed0b122d6fd@epcas1p1.samsung.com>
2021-02-17  5:22     ` [PATCH 1/2] dt-bindings: mmc: Add no-single-read-retry property DooHyun Hwang
     [not found]   ` <CGME20210217053521epcas1p2aa80cae5d52f30c8c8882f44abe8045c@epcas1p2.samsung.com>
2021-02-17  5:22     ` [PATCH 2/2] mmc: core: Add no single read retries DooHyun Hwang
2021-02-17  5:46       ` Adrian Hunter
2021-02-17  6:00         ` Adrian Hunter
2021-02-17  8:46           ` DooHyun Hwang
2021-03-01  8:23             ` Ulf Hansson [this message]
2021-02-17  5:19 [PATCH 0/2] mmc: core: add a new property DooHyun Hwang
     [not found] ` <CGME20210217053242epcas1p426e4bdc44270bb6b98eb18f33e66023d@epcas1p4.samsung.com>
2021-02-17  5:19   ` [PATCH 2/2] mmc: core: Add no single read retries DooHyun Hwang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPDyKFrgAanRYCe1QckWK8vxwV=rXV3KzTRynY_mkNaRkSrj+g@mail.gmail.com' \
    --to=ulf.hansson@linaro.org \
    --cc=adrian.hunter@intel.com \
    --cc=axboe@kernel.dk \
    --cc=cw9316.lee@samsung.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dh0421.hwang@samsung.com \
    --cc=ebiggers@google.com \
    --cc=grant.jung@samsung.com \
    --cc=gustavoars@kernel.org \
    --cc=jangsub.yi@samsung.com \
    --cc=jt77.jang@samsung.com \
    --cc=junwoo80.lee@samsung.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=robh+dt@kernel.org \
    --cc=satyat@google.com \
    --cc=sh043.lee@samsung.com \
    --cc=sh8267.baek@samsung.com \
    --cc=wkon.kim@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).