Regression after "do not use CMD13 to get status after speed mode switch"

All of lore.kernel.org
 help / color / mirror / Atom feed

* Regression after "do not use CMD13 to get status after speed mode switch"
@ 2016-10-17 14:32 Linus Walleij
  2016-10-18  4:06 ` Ritesh Harjani
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Linus Walleij @ 2016-10-17 14:32 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Chaotian Jing
  Cc: linux-arm-msm, Bjorn Andersson, Stephen Boyd, Andy Gross

Hi Chaotian,

my Dragonboard 8060 has a regression on detecting the eMMC after
bisecting it down to this patch (which is sadly already in v4.8):

commit 08573eaf1a70104f83fdbee9b84e5be03480e9ed
Author: Chaotian Jing <chaotian.jing@mediatek.com>
Date:   Thu May 19 16:47:41 2016 +0800
mmc: mmc: do not use CMD13 to get status after speed mode switch

Before this patch the eMMC is detected and all partitions enumerated
immediately, but after the patch it doesn't come up at all, except
sometimes, when it appears minutes (!) after boot, all of a sudden.

Do you have any ideas on what can be wrong?

This is the Qualcomm MMCI variant whid does *not* support
busydetection in hardware.

I will continue looking at it but would be happy for any hints on how to
solve it, currently I cannot mount root off the eMMC because of this.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-17 14:32 Regression after "do not use CMD13 to get status after speed mode switch" Linus Walleij
@ 2016-10-18  4:06 ` Ritesh Harjani
  2016-10-19  1:03   ` Stephen Boyd
  2016-10-18  8:36 ` Linus Walleij
  2016-10-20 15:17 ` Srinivas Kandagatla
  2 siblings, 1 reply; 19+ messages in thread
From: Ritesh Harjani @ 2016-10-18  4:06 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson, Chaotian Jing
  Cc: linux-arm-msm, Bjorn Andersson, Stephen Boyd, Andy Gross, Georgi Djakov

Hi Linus,

Recently for the same patch below was submitted for Qcom sdhci-msm driver.

https://patchwork.kernel.org/patch/9237679/

You mentioned yours is mmci driver.
See if it provides any hints.


Regards
Ritesh


On 10/17/2016 8:02 PM, Linus Walleij wrote:
> Hi Chaotian,
>
> my Dragonboard 8060 has a regression on detecting the eMMC after
> bisecting it down to this patch (which is sadly already in v4.8):
>
> commit 08573eaf1a70104f83fdbee9b84e5be03480e9ed
> Author: Chaotian Jing <chaotian.jing@mediatek.com>
> Date:   Thu May 19 16:47:41 2016 +0800
> mmc: mmc: do not use CMD13 to get status after speed mode switch
>
> Before this patch the eMMC is detected and all partitions enumerated
> immediately, but after the patch it doesn't come up at all, except
> sometimes, when it appears minutes (!) after boot, all of a sudden.
>
> Do you have any ideas on what can be wrong?
>
> This is the Qualcomm MMCI variant whid does *not* support
> busydetection in hardware.
>
> I will continue looking at it but would be happy for any hints on how to
> solve it, currently I cannot mount root off the eMMC because of this.
>
> Yours,
> Linus Walleij
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-17 14:32 Regression after "do not use CMD13 to get status after speed mode switch" Linus Walleij
  2016-10-18  4:06 ` Ritesh Harjani
@ 2016-10-18  8:36 ` Linus Walleij
  2016-10-18 13:23   ` Adrian Hunter
  2016-10-20 15:17 ` Srinivas Kandagatla
  2 siblings, 1 reply; 19+ messages in thread
From: Linus Walleij @ 2016-10-18  8:36 UTC (permalink / raw)
  To: linux-mmc, Ulf Hansson, Chaotian Jing
  Cc: linux-arm-msm, Bjorn Andersson, Stephen Boyd, Andy Gross

On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:

> Before this patch the eMMC is detected and all partitions enumerated
> immediately, but after the patch it doesn't come up at all, except
> sometimes, when it appears minutes (!) after boot, all of a sudden.

FYI this is what it looks like when it eventually happens:
root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
[  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
[  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
[  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
[  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
[  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
p14 p15 p16 p17 p18 p19 p20 p21 >

So after 627 seconds, a bit hard for users to wait this long for their
root filesystem.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-18  8:36 ` Linus Walleij
@ 2016-10-18 13:23   ` Adrian Hunter
  2016-10-19 16:41     ` Ulf Hansson
  0 siblings, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2016-10-18 13:23 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson, Chaotian Jing
  Cc: linux-arm-msm, Bjorn Andersson, Stephen Boyd, Andy Gross

On 18/10/16 11:36, Linus Walleij wrote:
> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> 
>> Before this patch the eMMC is detected and all partitions enumerated
>> immediately, but after the patch it doesn't come up at all, except
>> sometimes, when it appears minutes (!) after boot, all of a sudden.
> 
> FYI this is what it looks like when it eventually happens:
> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
> p14 p15 p16 p17 p18 p19 p20 p21 >
> 
> So after 627 seconds, a bit hard for users to wait this long for their
> root filesystem.

If the driver does not support busy detection and the eMMC card provides
zero as the cmd6 generic timeout (which it may especially as cmd6 generic
timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
waiting 10 minutes i.e.

#define MMC_OPS_TIMEOUT_MS	(10 * 60 * 1000) /* 10 minute timeout */

So removal of CMD13 polling for HS mode (as per commit
08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
combinations of eMMC cards and host drivers.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-18  4:06 ` Ritesh Harjani
@ 2016-10-19  1:03   ` Stephen Boyd
  0 siblings, 0 replies; 19+ messages in thread
From: Stephen Boyd @ 2016-10-19  1:03 UTC (permalink / raw)
  To: Ritesh Harjani, Linus Walleij, linux-mmc, Ulf Hansson, Chaotian Jing
  Cc: linux-arm-msm, Bjorn Andersson, Andy Gross, Georgi Djakov

On 10/17/2016 09:06 PM, Ritesh Harjani wrote:
> Hi Linus,
>
> Recently for the same patch below was submitted for Qcom sdhci-msm
> driver.
>
> https://patchwork.kernel.org/patch/9237679/
>
> You mentioned yours is mmci driver.
> See if it provides any hints.
>

I still see an mmc1 error on apq8074 dragonboard, but I don't see the
long 10 minute timeout. Maybe this is a different problem?

[    5.440702] mmc1: card never left busy state
[    5.440729] mmc1: error -110 whilst initialising SD card
[    5.546527] mmc1: Reset 0x1 never completed.
[    5.546548] sdhci: =========== REGISTER DUMP (mmc1)===========
[    5.549854] sdhci: Sys addr: 0x00000000 | Version:  0x00003802
[    5.555497] sdhci: Blk size: 0x00004000 | Blk cnt:  0x00000000
[    5.561310] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[    5.567127] sdhci: Present:  0x01f80000 | Host ctl: 0x00000000
[    5.572944] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[    5.578759] sdhci: Wake-up:  0x00000000 | Clock:    0x00000003
[    5.584575] sdhci: Timeout:  0x00000000 | Int stat: 0x00000000
[    5.590391] sdhci: Int enab: 0x00000000 | Sig enab: 0x00000000
[    5.596207] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[    5.602023] sdhci: Caps:     0x2329c8b2 | Caps_1:   0x00008007
[    5.607838] sdhci: Cmd:      0x00000000 | Max curr: 0x00000000
[    5.613653] sdhci: Host ctl2: 0x00000000
[    5.619468] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[    5.623549] sdhci: ===========================================


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-18 13:23   ` Adrian Hunter
@ 2016-10-19 16:41     ` Ulf Hansson
  2016-10-20  2:22       ` Chaotian Jing
  0 siblings, 1 reply; 19+ messages in thread
From: Ulf Hansson @ 2016-10-19 16:41 UTC (permalink / raw)
  To: Adrian Hunter, Linus Walleij
  Cc: linux-mmc, Chaotian Jing, linux-arm-msm, Bjorn Andersson,
	Stephen Boyd, Andy Gross

Adrian, Linus,

Thanks for looking into this and reporting!

On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 18/10/16 11:36, Linus Walleij wrote:
>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>
>>> Before this patch the eMMC is detected and all partitions enumerated
>>> immediately, but after the patch it doesn't come up at all, except
>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>>
>> FYI this is what it looks like when it eventually happens:
>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>> p14 p15 p16 p17 p18 p19 p20 p21 >
>>
>> So after 627 seconds, a bit hard for users to wait this long for their
>> root filesystem.
>
> If the driver does not support busy detection and the eMMC card provides
> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
> waiting 10 minutes i.e.
>
> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */

Urgh! Yes, I have verified that this is exactly what happens.

>
> So removal of CMD13 polling for HS mode (as per commit
> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
> combinations of eMMC cards and host drivers.

I was looking in the __mmc_switch() function, it's just a pain to walk
trough it :-) So first out I decided to clean it up and factor out the
polling parts. I will post the patches first out tomorrow morning,
running some final test right now.

Although, that of course doesn't solve our problem. As I see it we
only have a few options here.

1) In case when cmd6 generic timeout isn't available, let's assign
another empirically selected value.
2) Use a specific timeout when switching to HS mode.
3) Even if we deploy 1 (and 2), perhaps we still should allow polling
with CMD13 for switching to HS mode - unless it causes issues for some
cards/drivers combination?

BTW, I already tried 2) and it indeed solves the problem, although
depending on the selected timeout, it might delay the card detection
to process.

Thoughts?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-19 16:41     ` Ulf Hansson
@ 2016-10-20  2:22       ` Chaotian Jing
  2016-10-20  7:06         ` Ulf Hansson
  0 siblings, 1 reply; 19+ messages in thread
From: Chaotian Jing @ 2016-10-20  2:22 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Adrian Hunter, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
> Adrian, Linus,
> 
> Thanks for looking into this and reporting!
> 
> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
> > On 18/10/16 11:36, Linus Walleij wrote:
> >> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> >>
> >>> Before this patch the eMMC is detected and all partitions enumerated
> >>> immediately, but after the patch it doesn't come up at all, except
> >>> sometimes, when it appears minutes (!) after boot, all of a sudden.
> >>
> >> FYI this is what it looks like when it eventually happens:
> >> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
> >> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
> >> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
> >> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
> >> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
> >> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
> >> p14 p15 p16 p17 p18 p19 p20 p21 >
> >>
> >> So after 627 seconds, a bit hard for users to wait this long for their
> >> root filesystem.
> >
> > If the driver does not support busy detection and the eMMC card provides
> > zero as the cmd6 generic timeout (which it may especially as cmd6 generic
> > timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
> > waiting 10 minutes i.e.
> >
> > #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
> 
> Urgh! Yes, I have verified that this is exactly what happens.
> 
> >
> > So removal of CMD13 polling for HS mode (as per commit
> > 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
> > combinations of eMMC cards and host drivers.
> 
> I was looking in the __mmc_switch() function, it's just a pain to walk
> trough it :-) So first out I decided to clean it up and factor out the
> polling parts. I will post the patches first out tomorrow morning,
> running some final test right now.
> 
> Although, that of course doesn't solve our problem. As I see it we
> only have a few options here.
> 
> 1) In case when cmd6 generic timeout isn't available, let's assign
> another empirically selected value.
> 2) Use a specific timeout when switching to HS mode.
> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
> with CMD13 for switching to HS mode - unless it causes issues for some
> cards/drivers combination?
> 
> BTW, I already tried 2) and it indeed solves the problem, although
> depending on the selected timeout, it might delay the card detection
> to process.
> 
> Thoughts?

I just have a try of switching to HS mode with Hynix EMMC, the first
CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
that CMD13 cannot indicate current card status in this case.
> 
> Kind regards
> Uffe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-20  2:22       ` Chaotian Jing
@ 2016-10-20  7:06         ` Ulf Hansson
  2016-10-20  7:19           ` Adrian Hunter
  2016-10-27 10:04           ` Ulf Hansson
  0 siblings, 2 replies; 19+ messages in thread
From: Ulf Hansson @ 2016-10-20  7:06 UTC (permalink / raw)
  To: Chaotian Jing
  Cc: Adrian Hunter, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
>> Adrian, Linus,
>>
>> Thanks for looking into this and reporting!
>>
>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> > On 18/10/16 11:36, Linus Walleij wrote:
>> >> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>> >>
>> >>> Before this patch the eMMC is detected and all partitions enumerated
>> >>> immediately, but after the patch it doesn't come up at all, except
>> >>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>> >>
>> >> FYI this is what it looks like when it eventually happens:
>> >> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>> >> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>> >> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>> >> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>> >> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>> >> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>> >> p14 p15 p16 p17 p18 p19 p20 p21 >
>> >>
>> >> So after 627 seconds, a bit hard for users to wait this long for their
>> >> root filesystem.
>> >
>> > If the driver does not support busy detection and the eMMC card provides
>> > zero as the cmd6 generic timeout (which it may especially as cmd6 generic
>> > timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
>> > waiting 10 minutes i.e.
>> >
>> > #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
>>
>> Urgh! Yes, I have verified that this is exactly what happens.
>>
>> >
>> > So removal of CMD13 polling for HS mode (as per commit
>> > 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
>> > combinations of eMMC cards and host drivers.
>>
>> I was looking in the __mmc_switch() function, it's just a pain to walk
>> trough it :-) So first out I decided to clean it up and factor out the
>> polling parts. I will post the patches first out tomorrow morning,
>> running some final test right now.
>>
>> Although, that of course doesn't solve our problem. As I see it we
>> only have a few options here.
>>
>> 1) In case when cmd6 generic timeout isn't available, let's assign
>> another empirically selected value.
>> 2) Use a specific timeout when switching to HS mode.
>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
>> with CMD13 for switching to HS mode - unless it causes issues for some
>> cards/drivers combination?
>>
>> BTW, I already tried 2) and it indeed solves the problem, although
>> depending on the selected timeout, it might delay the card detection
>> to process.
>>
>> Thoughts?
>
> I just have a try of switching to HS mode with Hynix EMMC, the first
> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
> that CMD13 cannot indicate current card status in this case.

Thanks for sharing that. Okay, so clearly we have some cards that
don't supports polling with CMD13 when switching to HS mode.
One could of course add quirks for these kind of cards and do a fixed
delay for them, but then to find out which these cards are is going to
be hard.

It seems like we are left with using a fixed delay. Any ideas of what
such delay should be? And should we have one specific for switch to
the various speed modes and a different one that overrides the CMD6
generic timout, when it doesn't exist?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-20  7:06         ` Ulf Hansson
@ 2016-10-20  7:19           ` Adrian Hunter
  2016-10-20 11:00             ` Ulf Hansson
  2016-10-27 10:04           ` Ulf Hansson
  1 sibling, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2016-10-20  7:19 UTC (permalink / raw)
  To: Ulf Hansson, Chaotian Jing
  Cc: Linus Walleij, linux-mmc, linux-arm-msm, Bjorn Andersson,
	Stephen Boyd, Andy Gross

On 20/10/16 10:06, Ulf Hansson wrote:
> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
>>> Adrian, Linus,
>>>
>>> Thanks for looking into this and reporting!
>>>
>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>> On 18/10/16 11:36, Linus Walleij wrote:
>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>>>
>>>>>> Before this patch the eMMC is detected and all partitions enumerated
>>>>>> immediately, but after the patch it doesn't come up at all, except
>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>>>>>
>>>>> FYI this is what it looks like when it eventually happens:
>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
>>>>>
>>>>> So after 627 seconds, a bit hard for users to wait this long for their
>>>>> root filesystem.
>>>>
>>>> If the driver does not support busy detection and the eMMC card provides
>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
>>>> waiting 10 minutes i.e.
>>>>
>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
>>>
>>> Urgh! Yes, I have verified that this is exactly what happens.
>>>
>>>>
>>>> So removal of CMD13 polling for HS mode (as per commit
>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
>>>> combinations of eMMC cards and host drivers.
>>>
>>> I was looking in the __mmc_switch() function, it's just a pain to walk
>>> trough it :-) So first out I decided to clean it up and factor out the
>>> polling parts. I will post the patches first out tomorrow morning,
>>> running some final test right now.
>>>
>>> Although, that of course doesn't solve our problem. As I see it we
>>> only have a few options here.
>>>
>>> 1) In case when cmd6 generic timeout isn't available, let's assign
>>> another empirically selected value.
>>> 2) Use a specific timeout when switching to HS mode.
>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
>>> with CMD13 for switching to HS mode - unless it causes issues for some
>>> cards/drivers combination?
>>>
>>> BTW, I already tried 2) and it indeed solves the problem, although
>>> depending on the selected timeout, it might delay the card detection
>>> to process.
>>>
>>> Thoughts?
>>
>> I just have a try of switching to HS mode with Hynix EMMC, the first
>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
>> that CMD13 cannot indicate current card status in this case.
> 
> Thanks for sharing that. Okay, so clearly we have some cards that
> don't supports polling with CMD13 when switching to HS mode.
> One could of course add quirks for these kind of cards and do a fixed
> delay for them, but then to find out which these cards are is going to
> be hard.
> 
> It seems like we are left with using a fixed delay. Any ideas of what
> such delay should be? And should we have one specific for switch to
> the various speed modes and a different one that overrides the CMD6
> generic timout, when it doesn't exist?

We have supported polling for a long time, so presumably the broken cards
are newer ones.  What about using busy detection if that is available,
otherwise using polling if the card does not support HS200, otherwise using
the cmd6 generic timeout unless it is zero, and finally falling back to an
arbitrary waiting time.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-20  7:19           ` Adrian Hunter
@ 2016-10-20 11:00             ` Ulf Hansson
  0 siblings, 0 replies; 19+ messages in thread
From: Ulf Hansson @ 2016-10-20 11:00 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Chaotian Jing, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On 20 October 2016 at 09:19, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 20/10/16 10:06, Ulf Hansson wrote:
>> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
>>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
>>>> Adrian, Linus,
>>>>
>>>> Thanks for looking into this and reporting!
>>>>
>>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>> On 18/10/16 11:36, Linus Walleij wrote:
>>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>>>>
>>>>>>> Before this patch the eMMC is detected and all partitions enumerated
>>>>>>> immediately, but after the patch it doesn't come up at all, except
>>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>>>>>>
>>>>>> FYI this is what it looks like when it eventually happens:
>>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
>>>>>>
>>>>>> So after 627 seconds, a bit hard for users to wait this long for their
>>>>>> root filesystem.
>>>>>
>>>>> If the driver does not support busy detection and the eMMC card provides
>>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
>>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
>>>>> waiting 10 minutes i.e.
>>>>>
>>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
>>>>
>>>> Urgh! Yes, I have verified that this is exactly what happens.
>>>>
>>>>>
>>>>> So removal of CMD13 polling for HS mode (as per commit
>>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
>>>>> combinations of eMMC cards and host drivers.
>>>>
>>>> I was looking in the __mmc_switch() function, it's just a pain to walk
>>>> trough it :-) So first out I decided to clean it up and factor out the
>>>> polling parts. I will post the patches first out tomorrow morning,
>>>> running some final test right now.
>>>>
>>>> Although, that of course doesn't solve our problem. As I see it we
>>>> only have a few options here.
>>>>
>>>> 1) In case when cmd6 generic timeout isn't available, let's assign
>>>> another empirically selected value.
>>>> 2) Use a specific timeout when switching to HS mode.
>>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
>>>> with CMD13 for switching to HS mode - unless it causes issues for some
>>>> cards/drivers combination?
>>>>
>>>> BTW, I already tried 2) and it indeed solves the problem, although
>>>> depending on the selected timeout, it might delay the card detection
>>>> to process.
>>>>
>>>> Thoughts?
>>>
>>> I just have a try of switching to HS mode with Hynix EMMC, the first
>>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
>>> that CMD13 cannot indicate current card status in this case.
>>
>> Thanks for sharing that. Okay, so clearly we have some cards that
>> don't supports polling with CMD13 when switching to HS mode.
>> One could of course add quirks for these kind of cards and do a fixed
>> delay for them, but then to find out which these cards are is going to
>> be hard.
>>
>> It seems like we are left with using a fixed delay. Any ideas of what
>> such delay should be? And should we have one specific for switch to
>> the various speed modes and a different one that overrides the CMD6
>> generic timout, when it doesn't exist?
>
> We have supported polling for a long time, so presumably the broken cards
> are newer ones.  What about using busy detection if that is available,

Using HW busy detection, already takes highest precedence.

Although we can improve the behaviour a bit. When
MMC_CAP_WAIT_WHILE_BUSY isn't available but when ->card_busy() is, as
in a submitted patch [1].

> otherwise using polling if the card does not support HS200, otherwise using

Maybe we should first investigate the behaviour for some old cards!?

Perhaps it's more or less by "accident" that the cards has been
working and maybe that's because the delay needed is really small?

> the cmd6 generic timeout unless it is zero, and finally falling back to an
> arbitrary waiting time.
>

If I understand correct, you suggest we move back to allow polling for
older cards, but only when switching to high-speed mode, right?

Kind regards
Uffe

[1]
https://patchwork.kernel.org/patch/9386229/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-17 14:32 Regression after "do not use CMD13 to get status after speed mode switch" Linus Walleij
  2016-10-18  4:06 ` Ritesh Harjani
  2016-10-18  8:36 ` Linus Walleij
@ 2016-10-20 15:17 ` Srinivas Kandagatla
  2016-10-24 14:23   ` Linus Walleij
  2 siblings, 1 reply; 19+ messages in thread
From: Srinivas Kandagatla @ 2016-10-20 15:17 UTC (permalink / raw)
  To: Linus Walleij, linux-mmc, Ulf Hansson, Chaotian Jing
  Cc: linux-arm-msm, Bjorn Andersson, Stephen Boyd, Andy Gross

Hi Linus,

On 17/10/16 15:32, Linus Walleij wrote:
> Hi Chaotian,
>
> my Dragonboard 8060 has a regression on detecting the eMMC after
> bisecting it down to this patch (which is sadly already in v4.8):
>
> commit 08573eaf1a70104f83fdbee9b84e5be03480e9ed
> Author: Chaotian Jing <chaotian.jing@mediatek.com>
> Date:   Thu May 19 16:47:41 2016 +0800
> mmc: mmc: do not use CMD13 to get status after speed mode switch
>
> Before this patch the eMMC is detected and all partitions enumerated
> immediately, but after the patch it doesn't come up at all, except
> sometimes, when it appears minutes (!) after boot, all of a sudden.
>
> Do you have any ideas on what can be wrong?

I remember hitting timeout issue with new version of eMMC on SD600c-eval 
board, this patch/hack fixed it for me.

https://git.linaro.org/people/srinivas.kandagatla/linux.git/commitdiff/e42d7427b9ce77174eef079be0e04e22bdb8fd07?hp=3e4bf0b78405694fd1ab4371c73e80932be21e3a


Also one more important thing was to do with regulator mode too. 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=93b8e6fda11757a20df21f3fc18e12789841f173

Can you give it a try.. If it works I can clean it up and send it for 
review.

Thanks,
srini

>
> This is the Qualcomm MMCI variant whid does *not* support
> busydetection in hardware.
>
> I will continue looking at it but would be happy for any hints on how to
> solve it, currently I cannot mount root off the eMMC because of this.


>
> Yours,
> Linus Walleij
> --
> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-20 15:17 ` Srinivas Kandagatla
@ 2016-10-24 14:23   ` Linus Walleij
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Walleij @ 2016-10-24 14:23 UTC (permalink / raw)
  To: Srinivas Kandagatla
  Cc: linux-mmc, Ulf Hansson, Chaotian Jing, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On Thu, Oct 20, 2016 at 5:17 PM, Srinivas Kandagatla
<srinivas.kandagatla@linaro.org> wrote:

> I remember hitting timeout issue with new version of eMMC on SD600c-eval
> board, this patch/hack fixed it for me.
>
> https://git.linaro.org/people/srinivas.kandagatla/linux.git/commitdiff/e42d7427b9ce77174eef079be0e04e22bdb8fd07?hp=3e4bf0b78405694fd1ab4371c73e80932be21e3a

This doesn't work for me, sadly. I reworked the patch to use the same
codeflow as the Ux500/ST variant and posted my patches, but still no
luck (hope it helps though, and just wanted to illustrate how I think the
final patch should look).

Are you sure the MMCI derivates on the APQ8060 and MSM8660 has this
extra "PROGDONE" bit? (I guess you have the SoC reference manual for
these, I don't...)

> Also one more important thing was to do with regulator mode too.
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=93b8e6fda11757a20df21f3fc18e12789841f173

Since my rootfs does come up after the 20 minute timeout I do not thing
power surges are the cause. This problem seems SD600c-specific.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-20  7:06         ` Ulf Hansson
  2016-10-20  7:19           ` Adrian Hunter
@ 2016-10-27 10:04           ` Ulf Hansson
  2016-10-31 13:09             ` Adrian Hunter
  1 sibling, 1 reply; 19+ messages in thread
From: Ulf Hansson @ 2016-10-27 10:04 UTC (permalink / raw)
  To: Chaotian Jing
  Cc: Adrian Hunter, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On 20 October 2016 at 09:06, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
>>> Adrian, Linus,
>>>
>>> Thanks for looking into this and reporting!
>>>
>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> > On 18/10/16 11:36, Linus Walleij wrote:
>>> >> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>> >>
>>> >>> Before this patch the eMMC is detected and all partitions enumerated
>>> >>> immediately, but after the patch it doesn't come up at all, except
>>> >>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>>> >>
>>> >> FYI this is what it looks like when it eventually happens:
>>> >> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>>> >> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>>> >> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>>> >> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>>> >> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>>> >> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>>> >> p14 p15 p16 p17 p18 p19 p20 p21 >
>>> >>
>>> >> So after 627 seconds, a bit hard for users to wait this long for their
>>> >> root filesystem.
>>> >
>>> > If the driver does not support busy detection and the eMMC card provides
>>> > zero as the cmd6 generic timeout (which it may especially as cmd6 generic
>>> > timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
>>> > waiting 10 minutes i.e.
>>> >
>>> > #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
>>>
>>> Urgh! Yes, I have verified that this is exactly what happens.
>>>
>>> >
>>> > So removal of CMD13 polling for HS mode (as per commit
>>> > 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
>>> > combinations of eMMC cards and host drivers.
>>>
>>> I was looking in the __mmc_switch() function, it's just a pain to walk
>>> trough it :-) So first out I decided to clean it up and factor out the
>>> polling parts. I will post the patches first out tomorrow morning,
>>> running some final test right now.
>>>
>>> Although, that of course doesn't solve our problem. As I see it we
>>> only have a few options here.
>>>
>>> 1) In case when cmd6 generic timeout isn't available, let's assign
>>> another empirically selected value.
>>> 2) Use a specific timeout when switching to HS mode.
>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
>>> with CMD13 for switching to HS mode - unless it causes issues for some
>>> cards/drivers combination?
>>>
>>> BTW, I already tried 2) and it indeed solves the problem, although
>>> depending on the selected timeout, it might delay the card detection
>>> to process.
>>>
>>> Thoughts?
>>
>> I just have a try of switching to HS mode with Hynix EMMC, the first
>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
>> that CMD13 cannot indicate current card status in this case.
>
> Thanks for sharing that. Okay, so clearly we have some cards that
> don't supports polling with CMD13 when switching to HS mode.
> One could of course add quirks for these kind of cards and do a fixed
> delay for them, but then to find out which these cards are is going to
> be hard.
>
> It seems like we are left with using a fixed delay. Any ideas of what
> such delay should be? And should we have one specific for switch to
> the various speed modes and a different one that overrides the CMD6
> generic timout, when it doesn't exist?
>

Replying to my own earlier response, as I believe the problem could
also be related to another old commit, see below.

commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a
Author: Seungwon Jeon <tgih.jun@samsung.com>
Date:   Wed Sep 4 21:21:05 2013 +0900

    mmc: add ignorance case for CMD13 CRC error

    While speed mode is changed, CMD13 cannot be guaranteed.
    According to the spec., it is not recommended to use CMD13
    to check the busy completion of the timing change.
    If CMD13 is used in this case, CRC error must be ignored.

    Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
    Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Chris Ball <cjb@laptop.org>


The intent with this commit was not really correct. We don't want to
ignore CRC errors, but instead we should *re-try* sending CMD13 once
we get a CRC error.

Unfortunate since this commit, instead we tell the host driver to
*ignore* CRC errors and instead reads the status and returns 0
(indicating success). In the mmc core, in __mmc_switch(), it will thus
parse the status reply, even for a reply that might have been received
with a CRC error. Not good!

I am wondering whether this actually is the main problem to why we
think polling isn't working for some cases. And perhaps that was the
original problem Chaotian was trying to solve?

Thoughts?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-27 10:04           ` Ulf Hansson
@ 2016-10-31 13:09             ` Adrian Hunter
  2016-11-01  1:43               ` Chaotian Jing
  0 siblings, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2016-10-31 13:09 UTC (permalink / raw)
  To: Ulf Hansson, Chaotian Jing
  Cc: Linus Walleij, linux-mmc, linux-arm-msm, Bjorn Andersson,
	Stephen Boyd, Andy Gross

On 27/10/16 13:04, Ulf Hansson wrote:
> On 20 October 2016 at 09:06, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
>>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
>>>> Adrian, Linus,
>>>>
>>>> Thanks for looking into this and reporting!
>>>>
>>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>> On 18/10/16 11:36, Linus Walleij wrote:
>>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>>>>
>>>>>>> Before this patch the eMMC is detected and all partitions enumerated
>>>>>>> immediately, but after the patch it doesn't come up at all, except
>>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>>>>>>
>>>>>> FYI this is what it looks like when it eventually happens:
>>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
>>>>>>
>>>>>> So after 627 seconds, a bit hard for users to wait this long for their
>>>>>> root filesystem.
>>>>>
>>>>> If the driver does not support busy detection and the eMMC card provides
>>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
>>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
>>>>> waiting 10 minutes i.e.
>>>>>
>>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
>>>>
>>>> Urgh! Yes, I have verified that this is exactly what happens.
>>>>
>>>>>
>>>>> So removal of CMD13 polling for HS mode (as per commit
>>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
>>>>> combinations of eMMC cards and host drivers.
>>>>
>>>> I was looking in the __mmc_switch() function, it's just a pain to walk
>>>> trough it :-) So first out I decided to clean it up and factor out the
>>>> polling parts. I will post the patches first out tomorrow morning,
>>>> running some final test right now.
>>>>
>>>> Although, that of course doesn't solve our problem. As I see it we
>>>> only have a few options here.
>>>>
>>>> 1) In case when cmd6 generic timeout isn't available, let's assign
>>>> another empirically selected value.
>>>> 2) Use a specific timeout when switching to HS mode.
>>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
>>>> with CMD13 for switching to HS mode - unless it causes issues for some
>>>> cards/drivers combination?
>>>>
>>>> BTW, I already tried 2) and it indeed solves the problem, although
>>>> depending on the selected timeout, it might delay the card detection
>>>> to process.
>>>>
>>>> Thoughts?
>>>
>>> I just have a try of switching to HS mode with Hynix EMMC, the first
>>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
>>> that CMD13 cannot indicate current card status in this case.
>>
>> Thanks for sharing that. Okay, so clearly we have some cards that
>> don't supports polling with CMD13 when switching to HS mode.
>> One could of course add quirks for these kind of cards and do a fixed
>> delay for them, but then to find out which these cards are is going to
>> be hard.
>>
>> It seems like we are left with using a fixed delay. Any ideas of what
>> such delay should be? And should we have one specific for switch to
>> the various speed modes and a different one that overrides the CMD6
>> generic timout, when it doesn't exist?
>>
> 
> Replying to my own earlier response, as I believe the problem could
> also be related to another old commit, see below.
> 
> commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a
> Author: Seungwon Jeon <tgih.jun@samsung.com>
> Date:   Wed Sep 4 21:21:05 2013 +0900
> 
>     mmc: add ignorance case for CMD13 CRC error
> 
>     While speed mode is changed, CMD13 cannot be guaranteed.
>     According to the spec., it is not recommended to use CMD13
>     to check the busy completion of the timing change.
>     If CMD13 is used in this case, CRC error must be ignored.
> 
>     Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
>     Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
>     Signed-off-by: Chris Ball <cjb@laptop.org>
> 
> 
> The intent with this commit was not really correct. We don't want to
> ignore CRC errors, but instead we should *re-try* sending CMD13 once
> we get a CRC error.
> 
> Unfortunate since this commit, instead we tell the host driver to
> *ignore* CRC errors and instead reads the status and returns 0
> (indicating success). In the mmc core, in __mmc_switch(), it will thus
> parse the status reply, even for a reply that might have been received
> with a CRC error. Not good!

I agree: ignoring CRC errors and then expecting the status in the response
to be correct doesn't make sense.

However, it raises the question of what to do if there are always CRC errors
e.g. if it only works without CRC errors once the mode and frequency are
changed in the host controller.

> I am wondering whether this actually is the main problem to why we
> think polling isn't working for some cases. And perhaps that was the
> original problem Chaotian was trying to solve?
> 
> Thoughts?

Does Chaotian have a real problem since his driver has busy detection anyway?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-10-31 13:09             ` Adrian Hunter
@ 2016-11-01  1:43               ` Chaotian Jing
  2016-11-02  8:19                 ` Adrian Hunter
  0 siblings, 1 reply; 19+ messages in thread
From: Chaotian Jing @ 2016-11-01  1:43 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On Mon, 2016-10-31 at 15:09 +0200, Adrian Hunter wrote:
> On 27/10/16 13:04, Ulf Hansson wrote:
> > On 20 October 2016 at 09:06, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
> >>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
> >>>> Adrian, Linus,
> >>>>
> >>>> Thanks for looking into this and reporting!
> >>>>
> >>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>> On 18/10/16 11:36, Linus Walleij wrote:
> >>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> >>>>>>
> >>>>>>> Before this patch the eMMC is detected and all partitions enumerated
> >>>>>>> immediately, but after the patch it doesn't come up at all, except
> >>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
> >>>>>>
> >>>>>> FYI this is what it looks like when it eventually happens:
> >>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
> >>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
> >>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
> >>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
> >>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
> >>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
> >>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
> >>>>>>
> >>>>>> So after 627 seconds, a bit hard for users to wait this long for their
> >>>>>> root filesystem.
> >>>>>
> >>>>> If the driver does not support busy detection and the eMMC card provides
> >>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
> >>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
> >>>>> waiting 10 minutes i.e.
> >>>>>
> >>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
> >>>>
> >>>> Urgh! Yes, I have verified that this is exactly what happens.
> >>>>
> >>>>>
> >>>>> So removal of CMD13 polling for HS mode (as per commit
> >>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
> >>>>> combinations of eMMC cards and host drivers.
> >>>>
> >>>> I was looking in the __mmc_switch() function, it's just a pain to walk
> >>>> trough it :-) So first out I decided to clean it up and factor out the
> >>>> polling parts. I will post the patches first out tomorrow morning,
> >>>> running some final test right now.
> >>>>
> >>>> Although, that of course doesn't solve our problem. As I see it we
> >>>> only have a few options here.
> >>>>
> >>>> 1) In case when cmd6 generic timeout isn't available, let's assign
> >>>> another empirically selected value.
> >>>> 2) Use a specific timeout when switching to HS mode.
> >>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
> >>>> with CMD13 for switching to HS mode - unless it causes issues for some
> >>>> cards/drivers combination?
> >>>>
> >>>> BTW, I already tried 2) and it indeed solves the problem, although
> >>>> depending on the selected timeout, it might delay the card detection
> >>>> to process.
> >>>>
> >>>> Thoughts?
> >>>
> >>> I just have a try of switching to HS mode with Hynix EMMC, the first
> >>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
> >>> that CMD13 cannot indicate current card status in this case.
> >>
> >> Thanks for sharing that. Okay, so clearly we have some cards that
> >> don't supports polling with CMD13 when switching to HS mode.
> >> One could of course add quirks for these kind of cards and do a fixed
> >> delay for them, but then to find out which these cards are is going to
> >> be hard.
> >>
> >> It seems like we are left with using a fixed delay. Any ideas of what
> >> such delay should be? And should we have one specific for switch to
> >> the various speed modes and a different one that overrides the CMD6
> >> generic timout, when it doesn't exist?
> >>
> > 
> > Replying to my own earlier response, as I believe the problem could
> > also be related to another old commit, see below.
> > 
> > commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a
> > Author: Seungwon Jeon <tgih.jun@samsung.com>
> > Date:   Wed Sep 4 21:21:05 2013 +0900
> > 
> >     mmc: add ignorance case for CMD13 CRC error
> > 
> >     While speed mode is changed, CMD13 cannot be guaranteed.
> >     According to the spec., it is not recommended to use CMD13
> >     to check the busy completion of the timing change.
> >     If CMD13 is used in this case, CRC error must be ignored.
> > 
> >     Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
> >     Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
> >     Signed-off-by: Chris Ball <cjb@laptop.org>
> > 
> > 
> > The intent with this commit was not really correct. We don't want to
> > ignore CRC errors, but instead we should *re-try* sending CMD13 once
> > we get a CRC error.
> > 
> > Unfortunate since this commit, instead we tell the host driver to
> > *ignore* CRC errors and instead reads the status and returns 0
> > (indicating success). In the mmc core, in __mmc_switch(), it will thus
> > parse the status reply, even for a reply that might have been received
> > with a CRC error. Not good!
> 
> I agree: ignoring CRC errors and then expecting the status in the response
> to be correct doesn't make sense.
> 
> However, it raises the question of what to do if there are always CRC errors
> e.g. if it only works without CRC errors once the mode and frequency are
> changed in the host controller.
> 
> > I am wondering whether this actually is the main problem to why we
> > think polling isn't working for some cases. And perhaps that was the
> > original problem Chaotian was trying to solve?
> > 
> > Thoughts?
> 
> Does Chaotian have a real problem since his driver has busy detection anyway?

In fact, I have not encounter CRC errors of CMD13, I have tried several
eMMC cards, after mode switch, CMD13 will only gets 0x800 response and
we don't know if card is busy by 0x800 response.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-11-01  1:43               ` Chaotian Jing
@ 2016-11-02  8:19                 ` Adrian Hunter
  2016-11-02 10:28                   ` Chaotian Jing
  0 siblings, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2016-11-02  8:19 UTC (permalink / raw)
  To: Chaotian Jing
  Cc: Ulf Hansson, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On 01/11/16 03:43, Chaotian Jing wrote:
> On Mon, 2016-10-31 at 15:09 +0200, Adrian Hunter wrote:
>> On 27/10/16 13:04, Ulf Hansson wrote:
>>> On 20 October 2016 at 09:06, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
>>>>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
>>>>>> Adrian, Linus,
>>>>>>
>>>>>> Thanks for looking into this and reporting!
>>>>>>
>>>>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>>>> On 18/10/16 11:36, Linus Walleij wrote:
>>>>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>>>>>>
>>>>>>>>> Before this patch the eMMC is detected and all partitions enumerated
>>>>>>>>> immediately, but after the patch it doesn't come up at all, except
>>>>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>>>>>>>>
>>>>>>>> FYI this is what it looks like when it eventually happens:
>>>>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>>>>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>>>>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>>>>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>>>>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>>>>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>>>>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
>>>>>>>>
>>>>>>>> So after 627 seconds, a bit hard for users to wait this long for their
>>>>>>>> root filesystem.
>>>>>>>
>>>>>>> If the driver does not support busy detection and the eMMC card provides
>>>>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
>>>>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
>>>>>>> waiting 10 minutes i.e.
>>>>>>>
>>>>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
>>>>>>
>>>>>> Urgh! Yes, I have verified that this is exactly what happens.
>>>>>>
>>>>>>>
>>>>>>> So removal of CMD13 polling for HS mode (as per commit
>>>>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
>>>>>>> combinations of eMMC cards and host drivers.
>>>>>>
>>>>>> I was looking in the __mmc_switch() function, it's just a pain to walk
>>>>>> trough it :-) So first out I decided to clean it up and factor out the
>>>>>> polling parts. I will post the patches first out tomorrow morning,
>>>>>> running some final test right now.
>>>>>>
>>>>>> Although, that of course doesn't solve our problem. As I see it we
>>>>>> only have a few options here.
>>>>>>
>>>>>> 1) In case when cmd6 generic timeout isn't available, let's assign
>>>>>> another empirically selected value.
>>>>>> 2) Use a specific timeout when switching to HS mode.
>>>>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
>>>>>> with CMD13 for switching to HS mode - unless it causes issues for some
>>>>>> cards/drivers combination?
>>>>>>
>>>>>> BTW, I already tried 2) and it indeed solves the problem, although
>>>>>> depending on the selected timeout, it might delay the card detection
>>>>>> to process.
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>> I just have a try of switching to HS mode with Hynix EMMC, the first
>>>>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
>>>>> that CMD13 cannot indicate current card status in this case.
>>>>
>>>> Thanks for sharing that. Okay, so clearly we have some cards that
>>>> don't supports polling with CMD13 when switching to HS mode.
>>>> One could of course add quirks for these kind of cards and do a fixed
>>>> delay for them, but then to find out which these cards are is going to
>>>> be hard.
>>>>
>>>> It seems like we are left with using a fixed delay. Any ideas of what
>>>> such delay should be? And should we have one specific for switch to
>>>> the various speed modes and a different one that overrides the CMD6
>>>> generic timout, when it doesn't exist?
>>>>
>>>
>>> Replying to my own earlier response, as I believe the problem could
>>> also be related to another old commit, see below.
>>>
>>> commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a
>>> Author: Seungwon Jeon <tgih.jun@samsung.com>
>>> Date:   Wed Sep 4 21:21:05 2013 +0900
>>>
>>>     mmc: add ignorance case for CMD13 CRC error
>>>
>>>     While speed mode is changed, CMD13 cannot be guaranteed.
>>>     According to the spec., it is not recommended to use CMD13
>>>     to check the busy completion of the timing change.
>>>     If CMD13 is used in this case, CRC error must be ignored.
>>>
>>>     Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
>>>     Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
>>>     Signed-off-by: Chris Ball <cjb@laptop.org>
>>>
>>>
>>> The intent with this commit was not really correct. We don't want to
>>> ignore CRC errors, but instead we should *re-try* sending CMD13 once
>>> we get a CRC error.
>>>
>>> Unfortunate since this commit, instead we tell the host driver to
>>> *ignore* CRC errors and instead reads the status and returns 0
>>> (indicating success). In the mmc core, in __mmc_switch(), it will thus
>>> parse the status reply, even for a reply that might have been received
>>> with a CRC error. Not good!
>>
>> I agree: ignoring CRC errors and then expecting the status in the response
>> to be correct doesn't make sense.
>>
>> However, it raises the question of what to do if there are always CRC errors
>> e.g. if it only works without CRC errors once the mode and frequency are
>> changed in the host controller.
>>
>>> I am wondering whether this actually is the main problem to why we
>>> think polling isn't working for some cases. And perhaps that was the
>>> original problem Chaotian was trying to solve?
>>>
>>> Thoughts?
>>
>> Does Chaotian have a real problem since his driver has busy detection anyway?
> 
> In fact, I have not encounter CRC errors of CMD13, I have tried several
> eMMC cards, after mode switch, CMD13 will only gets 0x800 response and
> we don't know if card is busy by 0x800 response.

Does it change to 0x900 when it is not busy?

But anyway the question was: do you have busy detection in your driver?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-11-02  8:19                 ` Adrian Hunter
@ 2016-11-02 10:28                   ` Chaotian Jing
  2016-11-02 12:51                     ` Adrian Hunter
  0 siblings, 1 reply; 19+ messages in thread
From: Chaotian Jing @ 2016-11-02 10:28 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On Wed, 2016-11-02 at 10:19 +0200, Adrian Hunter wrote:
> On 01/11/16 03:43, Chaotian Jing wrote:
> > On Mon, 2016-10-31 at 15:09 +0200, Adrian Hunter wrote:
> >> On 27/10/16 13:04, Ulf Hansson wrote:
> >>> On 20 October 2016 at 09:06, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >>>> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
> >>>>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
> >>>>>> Adrian, Linus,
> >>>>>>
> >>>>>> Thanks for looking into this and reporting!
> >>>>>>
> >>>>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>>>> On 18/10/16 11:36, Linus Walleij wrote:
> >>>>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> >>>>>>>>
> >>>>>>>>> Before this patch the eMMC is detected and all partitions enumerated
> >>>>>>>>> immediately, but after the patch it doesn't come up at all, except
> >>>>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
> >>>>>>>>
> >>>>>>>> FYI this is what it looks like when it eventually happens:
> >>>>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
> >>>>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
> >>>>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
> >>>>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
> >>>>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
> >>>>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
> >>>>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
> >>>>>>>>
> >>>>>>>> So after 627 seconds, a bit hard for users to wait this long for their
> >>>>>>>> root filesystem.
> >>>>>>>
> >>>>>>> If the driver does not support busy detection and the eMMC card provides
> >>>>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
> >>>>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
> >>>>>>> waiting 10 minutes i.e.
> >>>>>>>
> >>>>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
> >>>>>>
> >>>>>> Urgh! Yes, I have verified that this is exactly what happens.
> >>>>>>
> >>>>>>>
> >>>>>>> So removal of CMD13 polling for HS mode (as per commit
> >>>>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
> >>>>>>> combinations of eMMC cards and host drivers.
> >>>>>>
> >>>>>> I was looking in the __mmc_switch() function, it's just a pain to walk
> >>>>>> trough it :-) So first out I decided to clean it up and factor out the
> >>>>>> polling parts. I will post the patches first out tomorrow morning,
> >>>>>> running some final test right now.
> >>>>>>
> >>>>>> Although, that of course doesn't solve our problem. As I see it we
> >>>>>> only have a few options here.
> >>>>>>
> >>>>>> 1) In case when cmd6 generic timeout isn't available, let's assign
> >>>>>> another empirically selected value.
> >>>>>> 2) Use a specific timeout when switching to HS mode.
> >>>>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
> >>>>>> with CMD13 for switching to HS mode - unless it causes issues for some
> >>>>>> cards/drivers combination?
> >>>>>>
> >>>>>> BTW, I already tried 2) and it indeed solves the problem, although
> >>>>>> depending on the selected timeout, it might delay the card detection
> >>>>>> to process.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>
> >>>>> I just have a try of switching to HS mode with Hynix EMMC, the first
> >>>>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
> >>>>> that CMD13 cannot indicate current card status in this case.
> >>>>
> >>>> Thanks for sharing that. Okay, so clearly we have some cards that
> >>>> don't supports polling with CMD13 when switching to HS mode.
> >>>> One could of course add quirks for these kind of cards and do a fixed
> >>>> delay for them, but then to find out which these cards are is going to
> >>>> be hard.
> >>>>
> >>>> It seems like we are left with using a fixed delay. Any ideas of what
> >>>> such delay should be? And should we have one specific for switch to
> >>>> the various speed modes and a different one that overrides the CMD6
> >>>> generic timout, when it doesn't exist?
> >>>>
> >>>
> >>> Replying to my own earlier response, as I believe the problem could
> >>> also be related to another old commit, see below.
> >>>
> >>> commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a
> >>> Author: Seungwon Jeon <tgih.jun@samsung.com>
> >>> Date:   Wed Sep 4 21:21:05 2013 +0900
> >>>
> >>>     mmc: add ignorance case for CMD13 CRC error
> >>>
> >>>     While speed mode is changed, CMD13 cannot be guaranteed.
> >>>     According to the spec., it is not recommended to use CMD13
> >>>     to check the busy completion of the timing change.
> >>>     If CMD13 is used in this case, CRC error must be ignored.
> >>>
> >>>     Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
> >>>     Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
> >>>     Signed-off-by: Chris Ball <cjb@laptop.org>
> >>>
> >>>
> >>> The intent with this commit was not really correct. We don't want to
> >>> ignore CRC errors, but instead we should *re-try* sending CMD13 once
> >>> we get a CRC error.
> >>>
> >>> Unfortunate since this commit, instead we tell the host driver to
> >>> *ignore* CRC errors and instead reads the status and returns 0
> >>> (indicating success). In the mmc core, in __mmc_switch(), it will thus
> >>> parse the status reply, even for a reply that might have been received
> >>> with a CRC error. Not good!
> >>
> >> I agree: ignoring CRC errors and then expecting the status in the response
> >> to be correct doesn't make sense.
> >>
> >> However, it raises the question of what to do if there are always CRC errors
> >> e.g. if it only works without CRC errors once the mode and frequency are
> >> changed in the host controller.
> >>
> >>> I am wondering whether this actually is the main problem to why we
> >>> think polling isn't working for some cases. And perhaps that was the
> >>> original problem Chaotian was trying to solve?
> >>>
> >>> Thoughts?
> >>
> >> Does Chaotian have a real problem since his driver has busy detection anyway?
> > 
> > In fact, I have not encounter CRC errors of CMD13, I have tried several
> > eMMC cards, after mode switch, CMD13 will only gets 0x800 response and
> > we don't know if card is busy by 0x800 response.
> 
> Does it change to 0x900 when it is not busy?
> 
No, it will not change to 0x900 when it is not busy.

> But anyway the question was: do you have busy detection in your driver?
> 
driver has busy detection ops->card_busy() but seems it's MMC core
layer's responsibility to ensure that card is not busy when driver
starts to issue commands.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-11-02 10:28                   ` Chaotian Jing
@ 2016-11-02 12:51                     ` Adrian Hunter
  2016-11-03  3:39                       ` Chaotian Jing
  0 siblings, 1 reply; 19+ messages in thread
From: Adrian Hunter @ 2016-11-02 12:51 UTC (permalink / raw)
  To: Chaotian Jing
  Cc: Ulf Hansson, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On 02/11/16 12:28, Chaotian Jing wrote:
> On Wed, 2016-11-02 at 10:19 +0200, Adrian Hunter wrote:
>> On 01/11/16 03:43, Chaotian Jing wrote:
>>> On Mon, 2016-10-31 at 15:09 +0200, Adrian Hunter wrote:
>>>> On 27/10/16 13:04, Ulf Hansson wrote:
>>>>> On 20 October 2016 at 09:06, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>>>> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
>>>>>>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
>>>>>>>> Adrian, Linus,
>>>>>>>>
>>>>>>>> Thanks for looking into this and reporting!
>>>>>>>>
>>>>>>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>>>>>> On 18/10/16 11:36, Linus Walleij wrote:
>>>>>>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Before this patch the eMMC is detected and all partitions enumerated
>>>>>>>>>>> immediately, but after the patch it doesn't come up at all, except
>>>>>>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
>>>>>>>>>>
>>>>>>>>>> FYI this is what it looks like when it eventually happens:
>>>>>>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
>>>>>>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
>>>>>>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
>>>>>>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
>>>>>>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
>>>>>>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
>>>>>>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
>>>>>>>>>>
>>>>>>>>>> So after 627 seconds, a bit hard for users to wait this long for their
>>>>>>>>>> root filesystem.
>>>>>>>>>
>>>>>>>>> If the driver does not support busy detection and the eMMC card provides
>>>>>>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
>>>>>>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
>>>>>>>>> waiting 10 minutes i.e.
>>>>>>>>>
>>>>>>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
>>>>>>>>
>>>>>>>> Urgh! Yes, I have verified that this is exactly what happens.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> So removal of CMD13 polling for HS mode (as per commit
>>>>>>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
>>>>>>>>> combinations of eMMC cards and host drivers.
>>>>>>>>
>>>>>>>> I was looking in the __mmc_switch() function, it's just a pain to walk
>>>>>>>> trough it :-) So first out I decided to clean it up and factor out the
>>>>>>>> polling parts. I will post the patches first out tomorrow morning,
>>>>>>>> running some final test right now.
>>>>>>>>
>>>>>>>> Although, that of course doesn't solve our problem. As I see it we
>>>>>>>> only have a few options here.
>>>>>>>>
>>>>>>>> 1) In case when cmd6 generic timeout isn't available, let's assign
>>>>>>>> another empirically selected value.
>>>>>>>> 2) Use a specific timeout when switching to HS mode.
>>>>>>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
>>>>>>>> with CMD13 for switching to HS mode - unless it causes issues for some
>>>>>>>> cards/drivers combination?
>>>>>>>>
>>>>>>>> BTW, I already tried 2) and it indeed solves the problem, although
>>>>>>>> depending on the selected timeout, it might delay the card detection
>>>>>>>> to process.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>
>>>>>>> I just have a try of switching to HS mode with Hynix EMMC, the first
>>>>>>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
>>>>>>> that CMD13 cannot indicate current card status in this case.
>>>>>>
>>>>>> Thanks for sharing that. Okay, so clearly we have some cards that
>>>>>> don't supports polling with CMD13 when switching to HS mode.
>>>>>> One could of course add quirks for these kind of cards and do a fixed
>>>>>> delay for them, but then to find out which these cards are is going to
>>>>>> be hard.
>>>>>>
>>>>>> It seems like we are left with using a fixed delay. Any ideas of what
>>>>>> such delay should be? And should we have one specific for switch to
>>>>>> the various speed modes and a different one that overrides the CMD6
>>>>>> generic timout, when it doesn't exist?
>>>>>>
>>>>>
>>>>> Replying to my own earlier response, as I believe the problem could
>>>>> also be related to another old commit, see below.
>>>>>
>>>>> commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a
>>>>> Author: Seungwon Jeon <tgih.jun@samsung.com>
>>>>> Date:   Wed Sep 4 21:21:05 2013 +0900
>>>>>
>>>>>     mmc: add ignorance case for CMD13 CRC error
>>>>>
>>>>>     While speed mode is changed, CMD13 cannot be guaranteed.
>>>>>     According to the spec., it is not recommended to use CMD13
>>>>>     to check the busy completion of the timing change.
>>>>>     If CMD13 is used in this case, CRC error must be ignored.
>>>>>
>>>>>     Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
>>>>>     Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
>>>>>     Signed-off-by: Chris Ball <cjb@laptop.org>
>>>>>
>>>>>
>>>>> The intent with this commit was not really correct. We don't want to
>>>>> ignore CRC errors, but instead we should *re-try* sending CMD13 once
>>>>> we get a CRC error.
>>>>>
>>>>> Unfortunate since this commit, instead we tell the host driver to
>>>>> *ignore* CRC errors and instead reads the status and returns 0
>>>>> (indicating success). In the mmc core, in __mmc_switch(), it will thus
>>>>> parse the status reply, even for a reply that might have been received
>>>>> with a CRC error. Not good!
>>>>
>>>> I agree: ignoring CRC errors and then expecting the status in the response
>>>> to be correct doesn't make sense.
>>>>
>>>> However, it raises the question of what to do if there are always CRC errors
>>>> e.g. if it only works without CRC errors once the mode and frequency are
>>>> changed in the host controller.
>>>>
>>>>> I am wondering whether this actually is the main problem to why we
>>>>> think polling isn't working for some cases. And perhaps that was the
>>>>> original problem Chaotian was trying to solve?
>>>>>
>>>>> Thoughts?
>>>>
>>>> Does Chaotian have a real problem since his driver has busy detection anyway?
>>>
>>> In fact, I have not encounter CRC errors of CMD13, I have tried several
>>> eMMC cards, after mode switch, CMD13 will only gets 0x800 response and
>>> we don't know if card is busy by 0x800 response.
>>
>> Does it change to 0x900 when it is not busy?
>>
> No, it will not change to 0x900 when it is not busy.
> 
>> But anyway the question was: do you have busy detection in your driver?
>>
> driver has busy detection ops->card_busy() but seems it's MMC core
> layer's responsibility to ensure that card is not busy when driver
> starts to issue commands.

I tried a card here.  The time between HS switch response and busy
de-assertion was only 58us i.e. practically instant.  The CMD6 response was
0x800 but the subsequent CMD13 response was 0x900.

How long does it take your failing card to switch to HS?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Regression after "do not use CMD13 to get status after speed mode switch"
  2016-11-02 12:51                     ` Adrian Hunter
@ 2016-11-03  3:39                       ` Chaotian Jing
  0 siblings, 0 replies; 19+ messages in thread
From: Chaotian Jing @ 2016-11-03  3:39 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ulf Hansson, Linus Walleij, linux-mmc, linux-arm-msm,
	Bjorn Andersson, Stephen Boyd, Andy Gross

On Wed, 2016-11-02 at 14:51 +0200, Adrian Hunter wrote:
> On 02/11/16 12:28, Chaotian Jing wrote:
> > On Wed, 2016-11-02 at 10:19 +0200, Adrian Hunter wrote:
> >> On 01/11/16 03:43, Chaotian Jing wrote:
> >>> On Mon, 2016-10-31 at 15:09 +0200, Adrian Hunter wrote:
> >>>> On 27/10/16 13:04, Ulf Hansson wrote:
> >>>>> On 20 October 2016 at 09:06, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >>>>>> On 20 October 2016 at 04:22, Chaotian Jing <chaotian.jing@mediatek.com> wrote:
> >>>>>>> On Wed, 2016-10-19 at 18:41 +0200, Ulf Hansson wrote:
> >>>>>>>> Adrian, Linus,
> >>>>>>>>
> >>>>>>>> Thanks for looking into this and reporting!
> >>>>>>>>
> >>>>>>>> On 18 October 2016 at 15:23, Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>>>>>> On 18/10/16 11:36, Linus Walleij wrote:
> >>>>>>>>>> On Mon, Oct 17, 2016 at 4:32 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Before this patch the eMMC is detected and all partitions enumerated
> >>>>>>>>>>> immediately, but after the patch it doesn't come up at all, except
> >>>>>>>>>>> sometimes, when it appears minutes (!) after boot, all of a sudden.
> >>>>>>>>>>
> >>>>>>>>>> FYI this is what it looks like when it eventually happens:
> >>>>>>>>>> root@msm8660:/ [  627.710175] mmc0: new high speed MMC card at address 0001
> >>>>>>>>>> [  627.711641] mmcblk0: mmc0:0001 SEM04G 3.69 GiB
> >>>>>>>>>> [  627.715485] mmcblk0boot0: mmc0:0001 SEM04G partition 1 1.00 MiB
> >>>>>>>>>> [  627.736654] mmcblk0boot1: mmc0:0001 SEM04G partition 2 1.00 MiB
> >>>>>>>>>> [  627.747397] mmcblk0rpmb: mmc0:0001 SEM04G partition 3 128 KiB
> >>>>>>>>>> [  627.756326]  mmcblk0: p1 p2 p3 p4 < p5 p6 p7 p8 p9 p10 p11 p12 p13
> >>>>>>>>>> p14 p15 p16 p17 p18 p19 p20 p21 >
> >>>>>>>>>>
> >>>>>>>>>> So after 627 seconds, a bit hard for users to wait this long for their
> >>>>>>>>>> root filesystem.
> >>>>>>>>>
> >>>>>>>>> If the driver does not support busy detection and the eMMC card provides
> >>>>>>>>> zero as the cmd6 generic timeout (which it may especially as cmd6 generic
> >>>>>>>>> timeout wasn't added until eMMCv4.5), then __mmc_switch() defaults to
> >>>>>>>>> waiting 10 minutes i.e.
> >>>>>>>>>
> >>>>>>>>> #define MMC_OPS_TIMEOUT_MS      (10 * 60 * 1000) /* 10 minute timeout */
> >>>>>>>>
> >>>>>>>> Urgh! Yes, I have verified that this is exactly what happens.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> So removal of CMD13 polling for HS mode (as per commit
> >>>>>>>>> 08573eaf1a70104f83fdbee9b84e5be03480e9ed) is going to be a problem for some
> >>>>>>>>> combinations of eMMC cards and host drivers.
> >>>>>>>>
> >>>>>>>> I was looking in the __mmc_switch() function, it's just a pain to walk
> >>>>>>>> trough it :-) So first out I decided to clean it up and factor out the
> >>>>>>>> polling parts. I will post the patches first out tomorrow morning,
> >>>>>>>> running some final test right now.
> >>>>>>>>
> >>>>>>>> Although, that of course doesn't solve our problem. As I see it we
> >>>>>>>> only have a few options here.
> >>>>>>>>
> >>>>>>>> 1) In case when cmd6 generic timeout isn't available, let's assign
> >>>>>>>> another empirically selected value.
> >>>>>>>> 2) Use a specific timeout when switching to HS mode.
> >>>>>>>> 3) Even if we deploy 1 (and 2), perhaps we still should allow polling
> >>>>>>>> with CMD13 for switching to HS mode - unless it causes issues for some
> >>>>>>>> cards/drivers combination?
> >>>>>>>>
> >>>>>>>> BTW, I already tried 2) and it indeed solves the problem, although
> >>>>>>>> depending on the selected timeout, it might delay the card detection
> >>>>>>>> to process.
> >>>>>>>>
> >>>>>>>> Thoughts?
> >>>>>>>
> >>>>>>> I just have a try of switching to HS mode with Hynix EMMC, the first
> >>>>>>> CMD13 gets response of 0x900, but the EMMC is still pull-low DAT0. so
> >>>>>>> that CMD13 cannot indicate current card status in this case.
> >>>>>>
> >>>>>> Thanks for sharing that. Okay, so clearly we have some cards that
> >>>>>> don't supports polling with CMD13 when switching to HS mode.
> >>>>>> One could of course add quirks for these kind of cards and do a fixed
> >>>>>> delay for them, but then to find out which these cards are is going to
> >>>>>> be hard.
> >>>>>>
> >>>>>> It seems like we are left with using a fixed delay. Any ideas of what
> >>>>>> such delay should be? And should we have one specific for switch to
> >>>>>> the various speed modes and a different one that overrides the CMD6
> >>>>>> generic timout, when it doesn't exist?
> >>>>>>
> >>>>>
> >>>>> Replying to my own earlier response, as I believe the problem could
> >>>>> also be related to another old commit, see below.
> >>>>>
> >>>>> commit a27fbf2f067b0cd6f172c8b696b9a44c58bfaa7a
> >>>>> Author: Seungwon Jeon <tgih.jun@samsung.com>
> >>>>> Date:   Wed Sep 4 21:21:05 2013 +0900
> >>>>>
> >>>>>     mmc: add ignorance case for CMD13 CRC error
> >>>>>
> >>>>>     While speed mode is changed, CMD13 cannot be guaranteed.
> >>>>>     According to the spec., it is not recommended to use CMD13
> >>>>>     to check the busy completion of the timing change.
> >>>>>     If CMD13 is used in this case, CRC error must be ignored.
> >>>>>
> >>>>>     Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
> >>>>>     Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
> >>>>>     Signed-off-by: Chris Ball <cjb@laptop.org>
> >>>>>
> >>>>>
> >>>>> The intent with this commit was not really correct. We don't want to
> >>>>> ignore CRC errors, but instead we should *re-try* sending CMD13 once
> >>>>> we get a CRC error.
> >>>>>
> >>>>> Unfortunate since this commit, instead we tell the host driver to
> >>>>> *ignore* CRC errors and instead reads the status and returns 0
> >>>>> (indicating success). In the mmc core, in __mmc_switch(), it will thus
> >>>>> parse the status reply, even for a reply that might have been received
> >>>>> with a CRC error. Not good!
> >>>>
> >>>> I agree: ignoring CRC errors and then expecting the status in the response
> >>>> to be correct doesn't make sense.
> >>>>
> >>>> However, it raises the question of what to do if there are always CRC errors
> >>>> e.g. if it only works without CRC errors once the mode and frequency are
> >>>> changed in the host controller.
> >>>>
> >>>>> I am wondering whether this actually is the main problem to why we
> >>>>> think polling isn't working for some cases. And perhaps that was the
> >>>>> original problem Chaotian was trying to solve?
> >>>>>
> >>>>> Thoughts?
> >>>>
> >>>> Does Chaotian have a real problem since his driver has busy detection anyway?
> >>>
> >>> In fact, I have not encounter CRC errors of CMD13, I have tried several
> >>> eMMC cards, after mode switch, CMD13 will only gets 0x800 response and
> >>> we don't know if card is busy by 0x800 response.
> >>
> >> Does it change to 0x900 when it is not busy?
> >>
> > No, it will not change to 0x900 when it is not busy.
> > 
> >> But anyway the question was: do you have busy detection in your driver?
> >>
> > driver has busy detection ops->card_busy() but seems it's MMC core
> > layer's responsibility to ensure that card is not busy when driver
> > starts to issue commands.
> 
> I tried a card here.  The time between HS switch response and busy
> de-assertion was only 58us i.e. practically instant.  The CMD6 response was
> 0x800 but the subsequent CMD13 response was 0x900.
> 
> How long does it take your failing card to switch to HS?
> 
It depends on EMMC chip type, some are very fast and some are take
several ms. I just test Sandisk-SDIN9D-S2, CMD13 also gets 0x800
response after busy-deassert.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2016-11-03  3:39 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-17 14:32 Regression after "do not use CMD13 to get status after speed mode switch" Linus Walleij
2016-10-18  4:06 ` Ritesh Harjani
2016-10-19  1:03   ` Stephen Boyd
2016-10-18  8:36 ` Linus Walleij
2016-10-18 13:23   ` Adrian Hunter
2016-10-19 16:41     ` Ulf Hansson
2016-10-20  2:22       ` Chaotian Jing
2016-10-20  7:06         ` Ulf Hansson
2016-10-20  7:19           ` Adrian Hunter
2016-10-20 11:00             ` Ulf Hansson
2016-10-27 10:04           ` Ulf Hansson
2016-10-31 13:09             ` Adrian Hunter
2016-11-01  1:43               ` Chaotian Jing
2016-11-02  8:19                 ` Adrian Hunter
2016-11-02 10:28                   ` Chaotian Jing
2016-11-02 12:51                     ` Adrian Hunter
2016-11-03  3:39                       ` Chaotian Jing
2016-10-20 15:17 ` Srinivas Kandagatla
2016-10-24 14:23   ` Linus Walleij

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.