* [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC
@ 2019-07-08 19:56 Douglas Anderson
  2019-07-09  9:06   ` Krzysztof Kozlowski
  2019-07-22 13:41 ` Ulf Hansson
  0 siblings, 2 replies; 12+ messages in thread
From: Douglas Anderson @ 2019-07-08 19:56 UTC (permalink / raw)
  To: Jaehoon Chung, Ulf Hansson
  Cc: linux-samsung-soc, linux-rockchip, briannorris, mka, groeck,
	sonnyrao, Douglas Anderson, Marek Szyprowski, Alim Akhtar,
	Enric Balletbo i Serra, linux-mmc, linux-kernel

In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after
response errors.") we fixed a tuning-induced hang that I saw when
stress testing tuning on certain SD cards.  I won't re-hash that whole
commit, but the summary is that as a normal part of tuning you need to
deal with transfer errors and there were cases where these transfer
errors were putting my system into a bad state causing all future
transfers to fail.  That commit fixed handling of the transfer errors
for me.

In downstream Chrome OS my fix landed and had the same behavior for
all SD/MMC commands.  However, it looks like when the commit landed
upstream we limited it to only SD tuning commands.  Presumably this
was to try to get around problems that Alim Akhtar reported on exynos
[1].

Unfortunately while stress testing reboots (and suspend/resume) on
some rk3288-based Chromebooks I found the same problem on the eMMC on
some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
same situation.

I'm hoping that whatever problems exynos was having in the past are
somehow magically fixed now and we can make the behavior the same for
all commands.

[1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com

Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Alim Akhtar <alim.akhtar@gmail.com>
Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
---
Marek (or anyone else using exynos): is it easy for you to test this
and check if things are still broken when we land this patch?  If so,
I guess we could have a quirk to have different behavior for just
Rockchip SoCs but I'd rather avoid that if possible.

NOTE: I'm not hoping totally in vain here.  It is possible that some
of the CTO/DTO timers that landed could be the magic that would get
exynos unstuck.

 drivers/mmc/host/dw_mmc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index b53b6b7d4dd4..60c3a06e3469 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -2034,8 +2034,7 @@ static void dw_mci_tasklet_func(unsigned long priv)
 				 * delayed. Allowing the transfer to take place
 				 * avoids races and keeps things simple.
 				 */
-				if ((err != -ETIMEDOUT) &&
-				    (cmd->opcode == MMC_SEND_TUNING_BLOCK)) {
+				if (err != -ETIMEDOUT) {
 					state = STATE_SENDING_DATA;
 					continue;
 				}
-- 
2.22.0.410.gd8fdbe21b5-goog
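
For readers following the one-line change, the old and new conditions can
be compared in isolation.  The following is a minimal standalone sketch,
not driver code: the helper names wait_for_data_before() and
wait_for_data_after() are hypothetical, the opcode values are copied from
include/linux/mmc/mmc.h, and -EILSEQ is used only as an example of a
non-timeout response error.

```c
#include <errno.h>
#include <stdbool.h>

/* Opcode values as defined in include/linux/mmc/mmc.h. */
#define MMC_SEND_TUNING_BLOCK		19	/* SD tuning command */
#define MMC_SEND_TUNING_BLOCK_HS200	21	/* eMMC HS200 tuning command */

/*
 * Condition before the patch: after a response error, keep waiting for the
 * data transfer only for the SD tuning command.
 */
static bool wait_for_data_before(int err, unsigned int opcode)
{
	return err != -ETIMEDOUT && opcode == MMC_SEND_TUNING_BLOCK;
}

/*
 * Condition after the patch: keep waiting for the data transfer for any
 * command, as long as the response error was not a timeout.
 */
static bool wait_for_data_after(int err, unsigned int opcode)
{
	(void)opcode;	/* the opcode no longer matters */
	return err != -ETIMEDOUT;
}
```

With these helpers, a non-timeout error on MMC_SEND_TUNING_BLOCK_HS200 is
the case whose result differs between the two versions; timeouts are
treated the same way by both.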



* Re: [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC
  2019-07-08 19:56 [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC Douglas Anderson
@ 2019-07-09  9:06   ` Krzysztof Kozlowski
  2019-07-22 13:41 ` Ulf Hansson
  1 sibling, 0 replies; 12+ messages in thread
From: Krzysztof Kozlowski @ 2019-07-09  9:06 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Jaehoon Chung, Ulf Hansson, linux-samsung-soc, linux-rockchip,
	briannorris, mka, groeck, sonnyrao, Marek Szyprowski,
	Alim Akhtar, Enric Balletbo i Serra, linux-mmc, linux-kernel

On Tue, 9 Jul 2019 at 00:48, Douglas Anderson <dianders@chromium.org> wrote:
>
> In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after
> response errors.") we fixed a tuning-induced hang that I saw when
> stress testing tuning on certain SD cards.  I won't re-hash that whole
> commit, but the summary is that as a normal part of tuning you need to
> deal with transfer errors and there were cases where these transfer
> errors were putting my system into a bad state causing all future
> transfers to fail.  That commit fixed handling of the transfer errors
> for me.
>
> In downstream Chrome OS my fix landed and had the same behavior for
> all SD/MMC commands.  However, it looks like when the commit landed
> upstream we limited it to only SD tuning commands.  Presumably this
> was to try to get around problems that Alim Akhtar reported on exynos
> [1].
>
> Unfortunately while stress testing reboots (and suspend/resume) on
> some rk3288-based Chromebooks I found the same problem on the eMMC on
> some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
> tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
> vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
> same situation.
>
> I'm hoping that whatever problems exynos was having in the past are
> somehow magically fixed now and we can make the behavior the same for
> all commands.
>
> [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com
>
> Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Alim Akhtar <alim.akhtar@gmail.com>
> Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
> ---
> Marek (or anyone else using exynos): is it easy for you to test this
> and check if things are still broken when we land this patch?  If so,
> I guess we could have a quirk to have different behavior for just
> Rockchip SoCs but I'd rather avoid that if possible.
>
> NOTE: I'm not hoping totally in vain here.  It is possible that some
> of the CTO/DTO timers that landed could be the magic that would get
> exynos unstuck.

I have an eMMC module attached to an Odroid U3 (Exynos4412,
samsung,exynos4412-dw-mshc). What is the testing procedure? With your
patch it boots fine:
[    3.698637] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0)
[    3.703900] mmc1: new DDR MMC card at address 0001
[    3.728458] mmcblk1: mmc1:0001 008G92 7.28 GiB

Best regards,
Krzysztof


* Re: [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC
  2019-07-09  9:06   ` Krzysztof Kozlowski
@ 2019-07-09 16:38     ` Doug Anderson
  -1 siblings, 0 replies; 12+ messages in thread
From: Doug Anderson @ 2019-07-09 16:38 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Jaehoon Chung, Ulf Hansson, linux-samsung-soc,
	open list:ARM/Rockchip SoC...,
	Brian Norris, Matthias Kaehlcke, Guenter Roeck, Sonny Rao,
	Marek Szyprowski, Alim Akhtar, Enric Balletbo i Serra,
	Linux MMC List, LKML

Hi,

On Tue, Jul 9, 2019 at 2:07 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On Tue, 9 Jul 2019 at 00:48, Douglas Anderson <dianders@chromium.org> wrote:
> >
> > In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after
> > response errors.") we fixed a tuning-induced hang that I saw when
> > stress testing tuning on certain SD cards.  I won't re-hash that whole
> > commit, but the summary is that as a normal part of tuning you need to
> > deal with transfer errors and there were cases where these transfer
> > errors were putting my system into a bad state causing all future
> > transfers to fail.  That commit fixed handling of the transfer errors
> > for me.
> >
> > In downstream Chrome OS my fix landed and had the same behavior for
> > all SD/MMC commands.  However, it looks like when the commit landed
> > upstream we limited it to only SD tuning commands.  Presumably this
> > was to try to get around problems that Alim Akhtar reported on exynos
> > [1].
> >
> > Unfortunately while stress testing reboots (and suspend/resume) on
> > some rk3288-based Chromebooks I found the same problem on the eMMC on
> > some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
> > tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
> > vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
> > same situation.
> >
> > I'm hoping that whatever problems exynos was having in the past are
> > somehow magically fixed now and we can make the behavior the same for
> > all commands.
> >
> > [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com
> >
> > Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.")
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > Cc: Alim Akhtar <alim.akhtar@gmail.com>
> > Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
> > ---
> > Marek (or anyone else using exynos): is it easy for you to test this
> > and check if things are still broken when we land this patch?  If so,
> > I guess we could have a quirk to have different behavior for just
> > Rockchip SoCs but I'd rather avoid that if possible.
> >
> > NOTE: I'm not hoping totally in vain here.  It is possible that some
> > of the CTO/DTO timers that landed could be the magic that would get
> > exynos unstuck.
>
> I have an eMMC module attached to an Odroid U3 (Exynos4412,
> samsung,exynos4412-dw-mshc). What is the testing procedure? With your
> patch it boots fine:
> [    3.698637] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0)
> [    3.703900] mmc1: new DDR MMC card at address 0001
> [    3.728458] mmcblk1: mmc1:0001 008G92 7.28 GiB

To really test it, it'd be nice to see some HS200 eMMC cards enumerate
OK.  Specifically the patch adjusts the error handling and the place
where that happens mostly is during tuning.
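
One way to check this from a boot log is to pull the card description out
of the enumeration message.  The sketch below is a hypothetical userspace
helper (card_description() and is_hs200_line() are names invented here,
not kernel functions); it assumes the message has the shape seen in the
logs in this thread, "<host>: new <description> card at address <rca>".

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the text found between ": new " and " card at address" into out.
 * Returns 1 on success, 0 if the line is not an enumeration message or the
 * description does not fit into out.
 */
static int card_description(const char *line, char *out, size_t out_len)
{
	static const char prefix[] = ": new ";
	static const char suffix[] = " card at address";
	const char *start = strstr(line, prefix);
	const char *end;
	size_t len;

	if (!start)
		return 0;
	start += strlen(prefix);

	end = strstr(start, suffix);
	if (!end)
		return 0;

	len = (size_t)(end - start);
	if (len == 0 || len >= out_len)
		return 0;

	memcpy(out, start, len);
	out[len] = '\0';
	return 1;
}

/* Return 1 if the line reports a card whose description starts with HS200. */
static int is_hs200_line(const char *line)
{
	char desc[64];

	if (!card_description(line, desc, sizeof(desc)))
		return 0;
	return strncmp(desc, "HS200 ", 6) == 0;
}
```

Applied to the Odroid U3 log above, the description comes out as "DDR MMC",
so that boot alone would not have exercised the tuning path.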

I'll also try to find some time today to check a peach_pit or a
peach_pi.  I think I saw one in the pile near my desk so if it isn't
in too bad of a shape I can give mainline a shot on it.

-Doug


* Re: [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC
  2019-07-09 16:38     ` Doug Anderson
@ 2019-07-09 21:48       ` Enric Balletbo Serra
  -1 siblings, 0 replies; 12+ messages in thread
From: Enric Balletbo Serra @ 2019-07-09 21:48 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Krzysztof Kozlowski, Enric Balletbo i Serra, Ulf Hansson,
	linux-samsung-soc, Brian Norris, Linux MMC List, LKML,
	Jaehoon Chung, open list:ARM/Rockchip SoC...,
	Matthias Kaehlcke, Guenter Roeck, Alim Akhtar, Sonny Rao,
	Marek Szyprowski

Hi,

On Tue, 9 Jul 2019 at 18:38, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, Jul 9, 2019 at 2:07 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
> >
> > On Tue, 9 Jul 2019 at 00:48, Douglas Anderson <dianders@chromium.org> wrote:
> > >
> > > In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after
> > > response errors.") we fixed a tuning-induced hang that I saw when
> > > stress testing tuning on certain SD cards.  I won't re-hash that whole
> > > commit, but the summary is that as a normal part of tuning you need to
> > > deal with transfer errors and there were cases where these transfer
> > > errors were putting my system into a bad state causing all future
> > > transfers to fail.  That commit fixed handling of the transfer errors
> > > for me.
> > >
> > > In downstream Chrome OS my fix landed and had the same behavior for
> > > all SD/MMC commands.  However, it looks like when the commit landed
> > > upstream we limited it to only SD tuning commands.  Presumably this
> > > was to try to get around problems that Alim Akhtar reported on exynos
> > > [1].
> > >
> > > Unfortunately while stress testing reboots (and suspend/resume) on
> > > some rk3288-based Chromebooks I found the same problem on the eMMC on
> > > some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
> > > tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
> > > vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
> > > same situation.
> > >
> > > I'm hoping that whatever problems exynos was having in the past are
> > > somehow magically fixed now and we can make the behavior the same for
> > > all commands.
> > >
> > > [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com
> > >
> > > Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.")
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > > Cc: Alim Akhtar <alim.akhtar@gmail.com>
> > > Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
> > > ---
> > > Marek (or anyone else using exynos): is it easy for you to test this
> > > and check if things are still broken when we land this patch?  If so,
> > > I guess we could have a quirk to have different behavior for just
> > > Rockchip SoCs but I'd rather avoid that if possible.
> > >
> > > NOTE: I'm not hoping totally in vain here.  It is possible that some
> > > of the CTO/DTO timers that landed could be the magic that would get
> > > exynos unstuck.
> >
> > I have an eMMC module attached to an Odroid U3 (Exynos4412,
> > samsung,exynos4412-dw-mshc). What is the testing procedure? With your
> > patch it boots fine:
> > [    3.698637] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0)
> > [    3.703900] mmc1: new DDR MMC card at address 0001
> > [    3.728458] mmcblk1: mmc1:0001 008G92 7.28 GiB
>
> To really test it, it'd be nice to see some HS200 eMMC cards enumerate
> OK.  Specifically the patch adjusts the error handling and the place
> where that happens mostly is during tuning.
>
> I'll also try to find some time today to check a peach_pit or a
> peach_pi.  I think I saw one in the pile near my desk so if it isn't
> in too bad of a shape I can give mainline a shot on it.
>

I did a normal boot on peach_pi [1] and odroidxu3 [2] with that patch
applied, and the eMMC attached on both was detected as

 [    2.294798] mmc0: new HS400 MMC card at address 0001

I can do some stress tests tomorrow on those boards if that helps.

Cheers,
~ Enric

[1] https://storage.kernelci.org/chrome-platform/for-kernelci/ib-mfd-cros-v5.3-87-g0fe7e9d7d5a3/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-exynos5800-peach-pi.html
[2] https://storage.kernelci.org/chrome-platform/for-kernelci/ib-mfd-cros-v5.3-87-g0fe7e9d7d5a3/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-exynos5422-odroidxu3.html

> -Doug


* Re: [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC
  2019-07-09 16:38     ` Doug Anderson
@ 2019-07-09 22:02       ` Doug Anderson
  -1 siblings, 0 replies; 12+ messages in thread
From: Doug Anderson @ 2019-07-09 22:02 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Jaehoon Chung, Ulf Hansson, linux-samsung-soc,
	open list:ARM/Rockchip SoC...,
	Brian Norris, Matthias Kaehlcke, Guenter Roeck, Sonny Rao,
	Marek Szyprowski, Alim Akhtar, Enric Balletbo i Serra,
	Linux MMC List, LKML

Hi,

On Tue, Jul 9, 2019 at 9:38 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, Jul 9, 2019 at 2:07 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
> >
> > On Tue, 9 Jul 2019 at 00:48, Douglas Anderson <dianders@chromium.org> wrote:
> > >
> > > In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after
> > > response errors.") we fixed a tuning-induced hang that I saw when
> > > stress testing tuning on certain SD cards.  I won't re-hash that whole
> > > commit, but the summary is that as a normal part of tuning you need to
> > > deal with transfer errors and there were cases where these transfer
> > > errors were putting my system into a bad state causing all future
> > > transfers to fail.  That commit fixed handling of the transfer errors
> > > for me.
> > >
> > > In downstream Chrome OS my fix landed and had the same behavior for
> > > all SD/MMC commands.  However, it looks like when the commit landed
> > > upstream we limited it to only SD tuning commands.  Presumably this
> > > was to try to get around problems that Alim Akhtar reported on exynos
> > > [1].
> > >
> > > Unfortunately while stress testing reboots (and suspend/resume) on
> > > some rk3288-based Chromebooks I found the same problem on the eMMC on
> > > some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
> > > tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
> > > vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
> > > same situation.
> > >
> > > I'm hoping that whatever problems exynos was having in the past are
> > > somehow magically fixed now and we can make the behavior the same for
> > > all commands.
> > >
> > > [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com
> > >
> > > Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.")
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > > Cc: Alim Akhtar <alim.akhtar@gmail.com>
> > > Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
> > > ---
> > > Marek (or anyone else using exynos): is it easy for you to test this
> > > and check if things are still broken when we land this patch?  If so,
> > > I guess we could have a quirk to have different behavior for just
> > > Rockchip SoCs but I'd rather avoid that if possible.
> > >
> > > NOTE: I'm not hoping totally in vain here.  It is possible that some
> > > of the CTO/DTO timers that landed could be the magic that would get
> > > exynos unstuck.
> >
> > I have an eMMC module attached to an Odroid U3 (Exynos4412,
> > samsung,exynos4412-dw-mshc). What is the testing procedure? With your
> > patch it boots fine:
> > [    3.698637] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0)
> > [    3.703900] mmc1: new DDR MMC card at address 0001
> > [    3.728458] mmcblk1: mmc1:0001 008G92 7.28 GiB
>
> To really test it, it'd be nice to see some HS200 eMMC cards enumerate
> OK.  Specifically the patch adjusts the error handling and the place
> where that happens mostly is during tuning.
>
> I'll also try to find some time today to check a peach_pit or a
> peach_pi.  I think I saw one in the pile near my desk so if it isn't
> in too bad of a shape I can give mainline a shot on it.

OK, I managed to get an exynos5800-peach-pi up and running.  I put my
patch in place and am currently at 45 reboots and counting w/ no
problems.

NOTE: in my case I actually had to disable "hs400" mode on my peach-pi
but that's because the board I dug up was an early version of the
board that didn't have the strobe line connected.  However, Alim's
earlier reports of problems were with hs200 anyway and hs200 still
executes the tuning plenty of times.  His reports of problems also
said that he had problems after just a few boots.

So I'll assert that whatever problems were present 4 years ago have
indeed gone away.  I'll leave the board rebooting overnight just in
case, but otherwise I consider this fine.


-Doug


* Re: [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC
  2019-07-09 22:02       ` Doug Anderson
@ 2019-07-10 20:21         ` Doug Anderson
  -1 siblings, 0 replies; 12+ messages in thread
From: Doug Anderson @ 2019-07-10 20:21 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Jaehoon Chung, Ulf Hansson, linux-samsung-soc,
	open list:ARM/Rockchip SoC...,
	Brian Norris, Matthias Kaehlcke, Guenter Roeck, Sonny Rao,
	Marek Szyprowski, Alim Akhtar, Enric Balletbo i Serra,
	Linux MMC List, LKML

Hi,

On Tue, Jul 9, 2019 at 3:02 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, Jul 9, 2019 at 9:38 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Tue, Jul 9, 2019 at 2:07 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > >
> > > On Tue, 9 Jul 2019 at 00:48, Douglas Anderson <dianders@chromium.org> wrote:
> > > >
> > > > In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after
> > > > response errors.") we fixed a tuning-induced hang that I saw when
> > > > stress testing tuning on certain SD cards.  I won't re-hash that whole
> > > > commit, but the summary is that as a normal part of tuning you need to
> > > > deal with transfer errors and there were cases where these transfer
> > > > > errors were putting my system into a bad state causing all future
> > > > transfers to fail.  That commit fixed handling of the transfer errors
> > > > for me.
> > > >
> > > > In downstream Chrome OS my fix landed and had the same behavior for
> > > > all SD/MMC commands.  However, it looks like when the commit landed
> > > > upstream we limited it to only SD tuning commands.  Presumably this
> > > > was to try to get around problems that Alim Akhtar reported on exynos
> > > > [1].
> > > >
> > > > Unfortunately while stress testing reboots (and suspend/resume) on
> > > > some rk3288-based Chromebooks I found the same problem on the eMMC on
> > > > some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
> > > > tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
> > > > vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
> > > > same situation.
> > > >
> > > > I'm hoping that whatever problems exynos was having in the past are
> > > > somehow magically fixed now and we can make the behavior the same for
> > > > all commands.
> > > >
> > > > [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com
> > > >
> > > > Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.")
> > > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > > Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> > > > Cc: Alim Akhtar <alim.akhtar@gmail.com>
> > > > Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
> > > > ---
> > > > Marek (or anyone else using exynos): is it easy for you to test this
> > > > and check if things are still broken when we land this patch?  If so,
> > > > I guess we could have a quirk to have different behavior for just
> > > > Rockchip SoCs but I'd rather avoid that if possible.
> > > >
> > > > NOTE: I'm not hoping totally in vain here.  It is possible that some
> > > > of the CTO/DTO timers that landed could be the magic that would get
> > > > exynos unstuck.
> > >
> > > I have eMMC module attached to Odroid U3 (Exynos4412,
> > > samsung,exynos4412-dw-mshc). What is the testing procedure? With your
> > > patch it boots fine:
> > > [    3.698637] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot
> > > req 52000000Hz, actual 50000000HZ div = 0)
> > > [    3.703900] mmc1: new DDR MMC card at address 0001
> > > [    3.728458] mmcblk1: mmc1:0001 008G92 7.28 GiB
> >
> > To really test it, it'd be nice to see some HS200 eMMC cards enumerate
> > OK.  Specifically the patch adjusts the error handling and the place
> > where that happens mostly is during tuning.
> >
> > I'll also try to find some time today to check a peach_pit or a
> > peach_pi.  I think I saw one in the pile near my desk so if it isn't
> > in too bad of a shape I can give mainline a shot on it.
>
> OK, I managed to get an exynos5800-peach-pi up and running.  I put my
> patch in place and am currently at 45 reboots and counting w/ no
> problems.

In case it helps, I made it through 2379 more reboots on my peach_pi
w/ no hangs.  I'm putting the device back in mothballs now.  :-P  I
didn't go back and try to reproduce the original problems so I guess I
can't assert with 100% authority that the original issue is gone, but
my testing combined with Enric's suggests that things are working fine.

-Doug

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC
  2019-07-08 19:56 [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC Douglas Anderson
  2019-07-09  9:06   ` Krzysztof Kozlowski
@ 2019-07-22 13:41 ` Ulf Hansson
  1 sibling, 0 replies; 12+ messages in thread
From: Ulf Hansson @ 2019-07-22 13:41 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Jaehoon Chung, linux-samsung-soc, open list:ARM/Rockchip SoC...,
	Brian Norris, Matthias Kaehlcke, Guenter Roeck, Sonny Rao,
	Marek Szyprowski, Alim Akhtar, Enric Balletbo i Serra, linux-mmc,
	Linux Kernel Mailing List

On Mon, 8 Jul 2019 at 21:56, Douglas Anderson <dianders@chromium.org> wrote:
>
> In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after
> response errors.") we fixed a tuning-induced hang that I saw when
> stress testing tuning on certain SD cards.  I won't re-hash that whole
> commit, but the summary is that as a normal part of tuning you need to
> deal with transfer errors and there were cases where these transfer
> errors were putting my system into a bad state causing all future
> transfers to fail.  That commit fixed handling of the transfer errors
> for me.
>
> In downstream Chrome OS my fix landed and had the same behavior for
> all SD/MMC commands.  However, it looks like when the commit landed
> upstream we limited it to only SD tuning commands.  Presumably this
> was to try to get around problems that Alim Akhtar reported on exynos
> [1].
>
> Unfortunately while stress testing reboots (and suspend/resume) on
> some rk3288-based Chromebooks I found the same problem on the eMMC on
> some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
> tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
> vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
> same situation.
>
> I'm hoping that whatever problems exynos was having in the past are
> somehow magically fixed now and we can make the behavior the same for
> all commands.
>
> [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com
>
> Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Alim Akhtar <alim.akhtar@gmail.com>
> Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>

Applied for fixes, adding a stable tag, thanks!

Kind regards
Uffe


> ---
> Marek (or anyone else using exynos): is it easy for you to test this
> and check if things are still broken when we land this patch?  If so,
> I guess we could have a quirk to have different behavior for just
> Rockchip SoCs but I'd rather avoid that if possible.
>
> NOTE: I'm not hoping totally in vain here.  It is possible that some
> of the CTO/DTO timers that landed could be the magic that would get
> exynos unstuck.
>
>  drivers/mmc/host/dw_mmc.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> index b53b6b7d4dd4..60c3a06e3469 100644
> --- a/drivers/mmc/host/dw_mmc.c
> +++ b/drivers/mmc/host/dw_mmc.c
> @@ -2034,8 +2034,7 @@ static void dw_mci_tasklet_func(unsigned long priv)
>                                  * delayed. Allowing the transfer to take place
>                                  * avoids races and keeps things simple.
>                                  */
> -                               if ((err != -ETIMEDOUT) &&
> -                                   (cmd->opcode == MMC_SEND_TUNING_BLOCK)) {
> +                               if (err != -ETIMEDOUT) {
>                                         state = STATE_SENDING_DATA;
>                                         continue;
>                                 }
> --
> 2.22.0.410.gd8fdbe21b5-goog
>
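For anyone skimming the one-line diff above, here is a rough standalone
sketch of what the condition change means for the two tuning opcodes.
The helper names wait_for_data_old() and wait_for_data_new() are made up
for illustration and do not exist in dw_mmc.c; the opcode values are the
ones from include/linux/mmc/mmc.h.  It only models the "if" test, not the
state machine around it.

```c
#include <errno.h>
#include <stdbool.h>

/* Opcode values as defined in include/linux/mmc/mmc.h. */
#define MMC_SEND_TUNING_BLOCK        19  /* SD tuning command */
#define MMC_SEND_TUNING_BLOCK_HS200  21  /* eMMC HS200 tuning command */

/*
 * Hypothetical helper modelling the check before the patch: after a
 * response error, only the SD tuning command moved on to
 * STATE_SENDING_DATA to let the data transfer finish.
 */
static bool wait_for_data_old(int err, unsigned int opcode)
{
	return (err != -ETIMEDOUT) && (opcode == MMC_SEND_TUNING_BLOCK);
}

/*
 * Hypothetical helper modelling the check after the patch: any command
 * whose response error is not a timeout waits for the data transfer.
 */
static bool wait_for_data_new(int err)
{
	return err != -ETIMEDOUT;
}
```

With a non-timeout error such as -EILSEQ, wait_for_data_old() is false for
MMC_SEND_TUNING_BLOCK_HS200, which matches the eMMC case described in the
commit message, while wait_for_data_new() is true for every opcode.  A
timeout (-ETIMEDOUT) is treated the same way before and after.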

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-07-22 13:42 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-08 19:56 [PATCH] mmc: dw_mmc: Fix occasional hang after tuning on eMMC Douglas Anderson
2019-07-09  9:06 ` Krzysztof Kozlowski
2019-07-09  9:06   ` Krzysztof Kozlowski
2019-07-09 16:38   ` Doug Anderson
2019-07-09 16:38     ` Doug Anderson
2019-07-09 21:48     ` Enric Balletbo Serra
2019-07-09 21:48       ` Enric Balletbo Serra
2019-07-09 22:02     ` Doug Anderson
2019-07-09 22:02       ` Doug Anderson
2019-07-10 20:21       ` Doug Anderson
2019-07-10 20:21         ` Doug Anderson
2019-07-22 13:41 ` Ulf Hansson
