* [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
@ 2023-11-18 13:46 Liming Sun
  2023-11-20  6:49 ` Adrian Hunter
  2023-11-27 13:36 ` Christian Loehle
  0 siblings, 2 replies; 9+ messages in thread
From: Liming Sun @ 2023-11-18 13:46 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson, David Thompson
  Cc: Liming Sun, linux-mmc, linux-kernel

This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
intermittent eMMC timeout issue reported on some cards under eMMC
stress test.

Reported error message:
  dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110

Signed-off-by: Liming Sun <limings@nvidia.com>
---
 drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
index 3a3bae6948a8..3c8fe8aec558 100644
--- a/drivers/mmc/host/sdhci-of-dwcmshc.c
+++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
@@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data sdhci_dwcmshc_pdata = {
 #ifdef CONFIG_ACPI
 static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
 	.ops = &sdhci_dwcmshc_ops,
-	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
+	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
+		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
 	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
 		   SDHCI_QUIRK2_ACMD23_BROKEN,
 };
-- 
2.30.1



* Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-11-18 13:46 [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC Liming Sun
@ 2023-11-20  6:49 ` Adrian Hunter
  2023-11-20 15:18   ` Liming Sun
  2023-11-27 13:36 ` Christian Loehle
  1 sibling, 1 reply; 9+ messages in thread
From: Adrian Hunter @ 2023-11-20  6:49 UTC (permalink / raw)
  To: Liming Sun, Ulf Hansson, David Thompson; +Cc: linux-mmc, linux-kernel

On 18/11/23 15:46, Liming Sun wrote:
> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> intermittent eMMC timeout issue reported on some cards under eMMC
> stress test.
> 
> Reported error message:
>   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110

Were you able to determine the root cause?  For example,
is the host controller timeout correct, is the eMMC
providing correct timeout values, is the mmc subsystem
calculating a correct value, is sdhci programming a correct
value?

If there are problems outside the host controller then we
need to address them also.

> 
> Signed-off-by: Liming Sun <limings@nvidia.com>

Fixes tag?

> ---
>  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
> index 3a3bae6948a8..3c8fe8aec558 100644
> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data sdhci_dwcmshc_pdata = {
>  #ifdef CONFIG_ACPI
>  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>  	.ops = &sdhci_dwcmshc_ops,
> -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>  		   SDHCI_QUIRK2_ACMD23_BROKEN,
>  };



* RE: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-11-20  6:49 ` Adrian Hunter
@ 2023-11-20 15:18   ` Liming Sun
  2023-11-21  8:08     ` Adrian Hunter
  0 siblings, 1 reply; 9+ messages in thread
From: Liming Sun @ 2023-11-20 15:18 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson, David Thompson; +Cc: linux-mmc, linux-kernel



> -----Original Message-----
> From: Adrian Hunter <adrian.hunter@intel.com>
> Sent: Monday, November 20, 2023 1:49 AM
> To: Liming Sun <limings@nvidia.com>; Ulf Hansson <ulf.hansson@linaro.org>;
> David Thompson <davthompson@nvidia.com>
> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
> BlueField-3 SoC
> 
> On 18/11/23 15:46, Liming Sun wrote:
> > This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> > intermittent eMMC timeout issue reported on some cards under eMMC
> > stress test.
> >
> > Reported error message:
> >   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
> 
> Were you able to determine the root cause?  For example,
> is the host controller timeout correct, is the eMMC
> providing correct timeout values, is the mmc subsystem
> calculating a correct value, is sdhci programming a correct
> value?
> 
> If there are problems outside the host controller then we
> need to address them also.

It is caused by the host controller timeout, but it is hard to tell whether the
timeout configuration provided by the card is adequate, since the failure is
intermittent under stress test and the SoC needs to work with eMMC parts from
different vendors. The UEFI eMMC driver uses a similar maximum timeout (0xe) to
avoid this issue. This commit tries to use the existing quirk, though I think
another way of adjusting the TOUT_CNT register would work as well. Any concerns
or suggestions?
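
For illustration, the effect of forcing the maximum timeout can be sketched with a simplified model of how an SDHCI timeout counter value is chosen. This is not the kernel's code: the function name calc_timeout_count(), its parameters, and MAX_TIMEOUT_COUNT are hypothetical, and the sketch assumes the usual SDHCI encoding in which a counter value n selects a timeout of 2^(13+n) timeout-clock cycles, with 0xE as the largest value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed largest timeout counter value (0xE, as mentioned above). */
#define MAX_TIMEOUT_COUNT 0xE

/*
 * Hypothetical helper: return the smallest counter value whose timeout
 * covers target_us, or the maximum when the timeout value is treated
 * as broken (the behaviour the quirk asks for in this sketch).
 */
static uint8_t calc_timeout_count(uint32_t timeout_clk_khz,
				  uint64_t target_us,
				  bool broken_timeout_val)
{
	uint8_t count = 0;
	uint64_t current_us;

	if (broken_timeout_val || timeout_clk_khz == 0)
		return MAX_TIMEOUT_COUNT;

	/* Duration of 2^13 timeout-clock cycles, in microseconds. */
	current_us = ((uint64_t)1 << 13) * 1000 / timeout_clk_khz;

	/* Each counter step doubles the timeout; clamp at the maximum. */
	while (current_us < target_us && count < MAX_TIMEOUT_COUNT) {
		count++;
		current_us <<= 1;
	}

	return count;
}
```

With a hypothetical 1 MHz timeout clock, the maximum value corresponds to 2^27 cycles, roughly 134 seconds, so forcing it effectively removes the card-derived timeout from the picture rather than correcting it.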

> 
> >
> > Signed-off-by: Liming Sun <limings@nvidia.com>
> 
> Fixes tag?

Will update it in v2.

> 
> > ---
> >  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
> b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > index 3a3bae6948a8..3c8fe8aec558 100644
> > --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> > +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
> sdhci_dwcmshc_pdata = {
> >  #ifdef CONFIG_ACPI
> >  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> >  	.ops = &sdhci_dwcmshc_ops,
> > -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> > +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> > +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> >  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> >  		   SDHCI_QUIRK2_ACMD23_BROKEN,
> >  };



* Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-11-20 15:18   ` Liming Sun
@ 2023-11-21  8:08     ` Adrian Hunter
  0 siblings, 0 replies; 9+ messages in thread
From: Adrian Hunter @ 2023-11-21  8:08 UTC (permalink / raw)
  To: Liming Sun, Ulf Hansson, David Thompson; +Cc: linux-mmc, linux-kernel

On 20/11/23 17:18, Liming Sun wrote:
> 
> 
>> -----Original Message-----
>> From: Adrian Hunter <adrian.hunter@intel.com>
>> Sent: Monday, November 20, 2023 1:49 AM
>> To: Liming Sun <limings@nvidia.com>; Ulf Hansson <ulf.hansson@linaro.org>;
>> David Thompson <davthompson@nvidia.com>
>> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>> BlueField-3 SoC
>>
>> On 18/11/23 15:46, Liming Sun wrote:
>>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
>>> intermittent eMMC timeout issue reported on some cards under eMMC
>>> stress test.
>>>
>>> Reported error message:
>>>   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>>
>> Were you able to determine the root cause?  For example,
>> is the host controller timeout correct, is the eMMC
>> providing correct timeout values, is the mmc subsystem
>> calculating a correct value, is sdhci programming a correct
>> value?
>>
>> If there are problems outside the host controller then we
>> need to address them also.
> 
> It is caused by the host controller timeout, but is hard to tell whether the
> configuration provided by the card is good enough since it's
> intermittent under stress test the SoC needs to work with different eMMC vendors. 
> In UEFI eMMC driver similar max timeout (0xe) is used to avoid such
> issue. This commit tries to use existing quirk, which I think that it would work 
> if there is another way to adjust the TOUT_CNT register. Any concern or suggestions?

If cards are providing timeout values that are too low under stress,
it would be better to fix it in the mmc subsystem so that all host
controllers can benefit.

> 
>>
>>>
>>> Signed-off-by: Liming Sun <limings@nvidia.com>
>>
>> Fixes tag?
> 
> Will update it in v2.
> 
>>
>>> ---
>>>  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
>> b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> index 3a3bae6948a8..3c8fe8aec558 100644
>>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
>> sdhci_dwcmshc_pdata = {
>>>  #ifdef CONFIG_ACPI
>>>  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>>>  	.ops = &sdhci_dwcmshc_ops,
>>> -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>>> +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
>>> +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>>>  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>>>  		   SDHCI_QUIRK2_ACMD23_BROKEN,
>>>  };
> 



* Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-11-18 13:46 [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC Liming Sun
  2023-11-20  6:49 ` Adrian Hunter
@ 2023-11-27 13:36 ` Christian Loehle
  2023-11-30 13:19   ` Liming Sun
  1 sibling, 1 reply; 9+ messages in thread
From: Christian Loehle @ 2023-11-27 13:36 UTC (permalink / raw)
  To: Liming Sun, Adrian Hunter, Ulf Hansson, David Thompson
  Cc: linux-mmc, linux-kernel

On 18/11/2023 13:46, Liming Sun wrote:
> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> intermittent eMMC timeout issue reported on some cards under eMMC
> stress test.
> 
> Reported error message:
>   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
> 
> Signed-off-by: Liming Sun <limings@nvidia.com>
> ---
>  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
> index 3a3bae6948a8..3c8fe8aec558 100644
> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data sdhci_dwcmshc_pdata = {
>  #ifdef CONFIG_ACPI
>  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>  	.ops = &sdhci_dwcmshc_ops,
> -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>  		   SDHCI_QUIRK2_ACMD23_BROKEN,
>  };

__mmc_blk_ioctl_cmd: data error ?
What stresstest are you running that issues ioctl commands?
On which commands does the timeout occur?
Anyway you should be able to increase the timeout in ioctl structure
directly, i.e. in userspace, or does that not work?
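
As a rough sketch of what raising the timeout in the ioctl structure could look like from userspace, the helper below fills a struct mmc_ioc_cmd from <linux/mmc/ioctl.h> for a single 512-byte block read-type command. The helper name prepare_ioc_cmd() and its parameter list are hypothetical; the caller is assumed to pass an opcode, argument, and response flags appropriate for the command it wants to send.

```c
#include <linux/mmc/ioctl.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: prepare a one-block data read command with an
 * explicit data timeout. Passing data_timeout_ns == 0 leaves the
 * timeout to the mmc core.
 */
static void prepare_ioc_cmd(struct mmc_ioc_cmd *ic, uint32_t opcode,
			    uint32_t arg, unsigned int flags, void *buf,
			    unsigned int data_timeout_ns)
{
	memset(ic, 0, sizeof(*ic));
	ic->write_flag = 0;		/* data flows from card to host */
	ic->opcode = opcode;
	ic->arg = arg;
	ic->flags = flags;		/* response/command type, caller-supplied */
	ic->blksz = 512;
	ic->blocks = 1;
	ic->data_timeout_ns = data_timeout_ns;
	mmc_ioc_cmd_set_data((*ic), buf);
}
```

The prepared structure would then be submitted with ioctl(fd, MMC_IOC_CMD, &ic) on the block device. Note that data_timeout_ns is an unsigned int, so the largest override that fits is a little under 4.3 seconds.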


* RE: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-11-27 13:36 ` Christian Loehle
@ 2023-11-30 13:19   ` Liming Sun
  2023-12-11 11:38     ` Adrian Hunter
  0 siblings, 1 reply; 9+ messages in thread
From: Liming Sun @ 2023-11-30 13:19 UTC (permalink / raw)
  To: Christian Loehle, Adrian Hunter, Ulf Hansson, David Thompson
  Cc: linux-mmc, linux-kernel



> -----Original Message-----
> From: Christian Loehle <christian.loehle@arm.com>
> Sent: Monday, November 27, 2023 8:36 AM
> To: Liming Sun <limings@nvidia.com>; Adrian Hunter
> <adrian.hunter@intel.com>; Ulf Hansson <ulf.hansson@linaro.org>; David
> Thompson <davthompson@nvidia.com>
> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
> BlueField-3 SoC
> 
> On 18/11/2023 13:46, Liming Sun wrote:
> > This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> > intermittent eMMC timeout issue reported on some cards under eMMC
> > stress test.
> >
> > Reported error message:
> >   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
> >
> > Signed-off-by: Liming Sun <limings@nvidia.com>
> > ---
> >  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
> b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > index 3a3bae6948a8..3c8fe8aec558 100644
> > --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> > +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
> sdhci_dwcmshc_pdata = {
> >  #ifdef CONFIG_ACPI
> >  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> >  	.ops = &sdhci_dwcmshc_ops,
> > -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> > +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> > +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> >  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> >  		   SDHCI_QUIRK2_ACMD23_BROKEN,
> >  };
> 
> __mmc_blk_ioctl_cmd: data error ?
> What stresstest are you running that issues ioctl commands?
> On which commands does the timeout occur?
> Anyway you should be able to increase the timeout in ioctl structure
> directly, i.e. in userspace, or does that not work?

The stress test runs fio with a command like:

  fio --name=randrw_stress_round_1 --ioengine=libaio --direct=1 \
      --time_based=1 --end_fsync=1 --ramp_time=5 --norandommap=1 \
      --randrepeat=0 --group_reporting=1 --numjobs=4 --iodepth=128 \
      --rw=randrw --overwrite=1 --runtime=36000 \
      --bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K/3:76K/3 \
      --filename=/dev/mmcblk0

The test tool (application) is supplied by the user or is a standard tool.


* Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-11-30 13:19   ` Liming Sun
@ 2023-12-11 11:38     ` Adrian Hunter
  2023-12-19 21:18       ` Liming Sun
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian Hunter @ 2023-12-11 11:38 UTC (permalink / raw)
  To: Liming Sun, Christian Loehle, Ulf Hansson, David Thompson
  Cc: linux-mmc, linux-kernel

On 30/11/23 15:19, Liming Sun wrote:
> 
> 
>> -----Original Message-----
>> From: Christian Loehle <christian.loehle@arm.com>
>> Sent: Monday, November 27, 2023 8:36 AM
>> To: Liming Sun <limings@nvidia.com>; Adrian Hunter
>> <adrian.hunter@intel.com>; Ulf Hansson <ulf.hansson@linaro.org>; David
>> Thompson <davthompson@nvidia.com>
>> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>> BlueField-3 SoC
>>
>> On 18/11/2023 13:46, Liming Sun wrote:
>>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
>>> intermittent eMMC timeout issue reported on some cards under eMMC
>>> stress test.
>>>
>>> Reported error message:
>>>   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>>>
>>> Signed-off-by: Liming Sun <limings@nvidia.com>
>>> ---
>>>  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
>> b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> index 3a3bae6948a8..3c8fe8aec558 100644
>>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
>> sdhci_dwcmshc_pdata = {
>>>  #ifdef CONFIG_ACPI
>>>  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>>>  	.ops = &sdhci_dwcmshc_ops,
>>> -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>>> +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
>>> +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>>>  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>>>  		   SDHCI_QUIRK2_ACMD23_BROKEN,
>>>  };
>>
>> __mmc_blk_ioctl_cmd: data error ?
>> What stresstest are you running that issues ioctl commands?
>> On which commands does the timeout occur?
>> Anyway you should be able to increase the timeout in ioctl structure
>> directly, i.e. in userspace, or does that not work?
> 
> It's running stress test with tool like "fio --name=randrw_stress_round_1 --ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K/3:76K/3 --filename=/dev/mmcblk0"
> The tool(application) is owned by user or with some standard tool.

fio does not send mmc ioctls, so I am also a bit confused about
how you get "__mmc_blk_ioctl_cmd: data error -110" ?



* RE: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-12-11 11:38     ` Adrian Hunter
@ 2023-12-19 21:18       ` Liming Sun
  2024-01-04  9:24         ` Adrian Hunter
  0 siblings, 1 reply; 9+ messages in thread
From: Liming Sun @ 2023-12-19 21:18 UTC (permalink / raw)
  To: Adrian Hunter, Christian Loehle, Ulf Hansson, David Thompson
  Cc: linux-mmc, linux-kernel



> -----Original Message-----
> From: Adrian Hunter <adrian.hunter@intel.com>
> Sent: Monday, December 11, 2023 6:39 AM
> To: Liming Sun <limings@nvidia.com>; Christian Loehle
> <christian.loehle@arm.com>; Ulf Hansson <ulf.hansson@linaro.org>; David
> Thompson <davthompson@nvidia.com>
> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
> BlueField-3 SoC
> 
> On 30/11/23 15:19, Liming Sun wrote:
> >
> >
> >> -----Original Message-----
> >> From: Christian Loehle <christian.loehle@arm.com>
> >> Sent: Monday, November 27, 2023 8:36 AM
> >> To: Liming Sun <limings@nvidia.com>; Adrian Hunter
> >> <adrian.hunter@intel.com>; Ulf Hansson <ulf.hansson@linaro.org>; David
> >> Thompson <davthompson@nvidia.com>
> >> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
> >> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk
> for
> >> BlueField-3 SoC
> >>
> >> On 18/11/2023 13:46, Liming Sun wrote:
> >>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> >>> intermittent eMMC timeout issue reported on some cards under eMMC
> >>> stress test.
> >>>
> >>> Reported error message:
> >>>   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
> >>>
> >>> Signed-off-by: Liming Sun <limings@nvidia.com>
> >>> ---
> >>>  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
> >> b/drivers/mmc/host/sdhci-of-dwcmshc.c
> >>> index 3a3bae6948a8..3c8fe8aec558 100644
> >>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> >>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> >>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
> >> sdhci_dwcmshc_pdata = {
> >>>  #ifdef CONFIG_ACPI
> >>>  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> >>>  	.ops = &sdhci_dwcmshc_ops,
> >>> -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> >>> +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> >>> +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> >>>  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> >>>  		   SDHCI_QUIRK2_ACMD23_BROKEN,
> >>>  };
> >>
> >> __mmc_blk_ioctl_cmd: data error ?
> >> What stresstest are you running that issues ioctl commands?
> >> On which commands does the timeout occur?
> >> Anyway you should be able to increase the timeout in ioctl structure
> >> directly, i.e. in userspace, or does that not work?
> >
> > It's running stress test with tool like "fio --name=randrw_stress_round_1 --
> ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --
> norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --
> iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --
> bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K
> /3:76K/3 --filename=/dev/mmcblk0"
> > The tool(application) is owned by user or with some standard tool.
> 
> fio does not send mmc ioctls, so I am also a bit confused about
> how you get "__mmc_blk_ioctl_cmd: data error -110" ?

There are other activities or background tasks going on. I assume other MMC
accesses are affected by the fio stress load and time out. Does that make sense?



* Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC
  2023-12-19 21:18       ` Liming Sun
@ 2024-01-04  9:24         ` Adrian Hunter
  0 siblings, 0 replies; 9+ messages in thread
From: Adrian Hunter @ 2024-01-04  9:24 UTC (permalink / raw)
  To: Liming Sun, Christian Loehle, Ulf Hansson, David Thompson
  Cc: linux-mmc, linux-kernel

On 19/12/23 23:18, Liming Sun wrote:
> 
> 
>> -----Original Message-----
>> From: Adrian Hunter <adrian.hunter@intel.com>
>> Sent: Monday, December 11, 2023 6:39 AM
>> To: Liming Sun <limings@nvidia.com>; Christian Loehle
>> <christian.loehle@arm.com>; Ulf Hansson <ulf.hansson@linaro.org>; David
>> Thompson <davthompson@nvidia.com>
>> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>> BlueField-3 SoC
>>
>> On 30/11/23 15:19, Liming Sun wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Christian Loehle <christian.loehle@arm.com>
>>>> Sent: Monday, November 27, 2023 8:36 AM
>>>> To: Liming Sun <limings@nvidia.com>; Adrian Hunter
>>>> <adrian.hunter@intel.com>; Ulf Hansson <ulf.hansson@linaro.org>; David
>>>> Thompson <davthompson@nvidia.com>
>>>> Cc: linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org
>>>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk
>> for
>>>> BlueField-3 SoC
>>>>
>>>> On 18/11/2023 13:46, Liming Sun wrote:
>>>>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
>>>>> intermittent eMMC timeout issue reported on some cards under eMMC
>>>>> stress test.
>>>>>
>>>>> Reported error message:
>>>>>   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>>>>>
>>>>> Signed-off-by: Liming Sun <limings@nvidia.com>
>>>>> ---
>>>>>  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>> b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> index 3a3bae6948a8..3c8fe8aec558 100644
>>>>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
>>>> sdhci_dwcmshc_pdata = {
>>>>>  #ifdef CONFIG_ACPI
>>>>>  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>>>>>  	.ops = &sdhci_dwcmshc_ops,
>>>>> -	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>>>>> +	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
>>>>> +		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>>>>>  	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>>>>>  		   SDHCI_QUIRK2_ACMD23_BROKEN,
>>>>>  };
>>>>
>>>> __mmc_blk_ioctl_cmd: data error ?
>>>> What stresstest are you running that issues ioctl commands?
>>>> On which commands does the timeout occur?
>>>> Anyway you should be able to increase the timeout in ioctl structure
>>>> directly, i.e. in userspace, or does that not work?
>>>
>>> It's running stress test with tool like "fio --name=randrw_stress_round_1 --
>> ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --
>> norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --
>> iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --
>> bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K
>> /3:76K/3 --filename=/dev/mmcblk0"
>>> The tool(application) is owned by user or with some standard tool.
>>
>> fio does not send mmc ioctls, so I am also a bit confused about
>> how you get "__mmc_blk_ioctl_cmd: data error -110" ?
> 
> There are other activities or background task going on. I assume it's other
> MMC access which are affected by the stress FIO and got timeout. Would it make sense?
> 

It depends on whether the IOCTL is overriding the timeout.  In
struct mmc_ioc_cmd there is data_timeout_ns which overrides the
mmc core data timeout calculated by mmc_set_data_timeout().  There
is also cmd_timeout_ms for commands.  You need to check whether
"__mmc_blk_ioctl_cmd: data error -110" is because data_timeout_ns
was set too low (but non-zero) by the caller of the IOCTL.
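
The override rule described above can be modelled in a few lines. This is only a sketch of the precedence, not the mmc block driver's code: both function names are hypothetical, and core_timeout_ns stands in for whatever mmc_set_data_timeout() would have calculated.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: a non-zero data_timeout_ns from the ioctl caller
 * replaces the value calculated by the mmc core; zero keeps the core value.
 */
static uint32_t effective_data_timeout_ns(uint32_t core_timeout_ns,
					  uint32_t ioctl_timeout_ns)
{
	return ioctl_timeout_ns ? ioctl_timeout_ns : core_timeout_ns;
}

/*
 * Hypothetical check for the case to look for: the caller set a
 * non-zero timeout that is shorter than the core-calculated one.
 */
static bool override_shortens_timeout(uint32_t core_timeout_ns,
				      uint32_t ioctl_timeout_ns)
{
	return ioctl_timeout_ns != 0 && ioctl_timeout_ns < core_timeout_ns;
}
```

Logging both values at the point where the ioctl is handled would show whether a failing request was running with a caller-supplied timeout shorter than the core's own calculation.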

