All of lore.kernel.org
* IMX8MM eMMC CQHCI timeout
@ 2021-10-29 20:47 Tim Harvey
  2021-11-01  1:57 ` Bough Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Harvey @ 2021-10-29 20:47 UTC (permalink / raw)
  To: Linux MMC List, Marcel Ziswiler, Fabio Estevam, Schrempf Frieder,
	Adam Ford, Haibo Chen, Lucas Stach, Peng Fan, Frank Li
  Cc: Adrian Hunter, Shawn Guo, Ulf Hansson, Sascha Hauer,
	Pengutronix Kernel Team, NXP Linux Team, Cale Collins

Greetings,

I've encountered the following MMC CQHCI timeout message a couple of
times now on IMX8MM boards with eMMC with a 5.10 based kernel:

[  224.356283] mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
[  224.362764] mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
[  224.369250] mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
[  224.375726] mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
[  224.382197] mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
[  224.388665] mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
[  224.395129] mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
[  224.401598] mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
[  224.408064] mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
[  224.414532] mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
[  224.420997] mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
[  224.427467] mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
[  224.433934] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[  224.440404] mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
[  224.446877] mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
[  224.453346] mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
[  224.459811] mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
[  224.466281] mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
[  224.472752] mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
[  224.479225] mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
[  224.485690] mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
[  224.492161] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[  224.498628] mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
[  224.505097] mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
[  224.511575] mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
[  224.518043] mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
[  224.524512] mmc2: sdhci: Host ctl2: 0x00000088
[  224.528986] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
[  224.535451] mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
[  224.543052] mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
[  224.548740] mmc2: sdhci-esdhc-imx: data debug status:  0x2200
[  224.554510] mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
[  224.560368] mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
[  224.566054] mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
[  224.571826] mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
[  224.577608] mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
[  224.583900] mmc2: sdhci: ============================================

I don't know how to trigger the issue; both times it occurred while simply
reading a file from the ext4 rootfs on the eMMC.

Some research shows:
- https://community.nxp.com/t5/i-MX-Processors/The-issues-on-quot-mmc0-cqhci-timeout-for-tag-0-quot/m-p/993779
- http://git.toradex.com/cgit/linux-toradex.git/commit/?h=toradex_5.4-2.3.x-imx&id=fd33531be843566c59a5fc655f204bbd36d7f3c6

I'm not clear if this info is up-to-date. The NXP 5.4 kernel did not
enable this feature but if I'm not mistaken CQHCI support itself
didn't land in mainline until a later kernel so it would make sense it
was not enabled at that time. I do see the NXP 5.10 kernels have this
enabled so I'm curious if it is an issue there.

Any other IMX8MM or other SoC users know what this could be about or
what I could do for a test to try to reproduce it so I can see if it
occurs in other kernel versions?

Best regards,

Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: IMX8MM eMMC CQHCI timeout
  2021-10-29 20:47 IMX8MM eMMC CQHCI timeout Tim Harvey
@ 2021-11-01  1:57 ` Bough Chen
  2021-11-03 16:49   ` Tim Harvey
  0 siblings, 1 reply; 11+ messages in thread
From: Bough Chen @ 2021-11-01  1:57 UTC (permalink / raw)
  To: tharvey, Linux MMC List, Marcel Ziswiler, Fabio Estevam,
	Schrempf Frieder, Adam Ford, Lucas Stach, Peng Fan, Frank Li
  Cc: Adrian Hunter, Shawn Guo, Ulf Hansson, Sascha Hauer,
	Pengutronix Kernel Team, dl-linux-imx, Cale Collins

Hi Tim,

I have been debugging this issue recently but, unfortunately, have not found
the root cause yet.
The values of the Doorbell, Dev queue and Dev Pend registers look abnormal.
This issue happens on all i.MX SoCs that support the cmdq feature when the
CPU load is high. I currently lack an MMC logic analyzer, which makes the
issue hard to debug, so I still need some time. Sorry about that.
If you want MMC to work stably, you can disable cmdq as a workaround.
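
As a rough aid for reading the three registers named above, the sketch below lists which bits are set in each value. It assumes the usual CQHCI layout in which every bit of Doorbell, Dev queue and Dev Pend stands for one task slot (tag); `set_bits` is a hypothetical helper, not something from the driver, and the sketch makes no claim about which pattern counts as abnormal.

```shell
#!/bin/sh
# set_bits VALUE: print the positions of the set bits in a 32-bit value.
# Hypothetical helper for illustration; assumes one bit per CQHCI task slot.
set_bits() {
    value=$(( $1 ))          # accepts hex such as 0xbf01dfff
    bit=0
    out=""
    while [ "$bit" -lt 32 ]; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            out="${out:+$out }$bit"
        fi
        bit=$(( bit + 1 ))
    done
    printf '%s\n' "$out"
}

# Values copied from the register dump in this thread.
echo "Doorbell:  $(set_bits 0xbf01dfff)"
echo "Dev queue: $(set_bits 0x00000000)"
echo "Dev Pend:  $(set_bits 0x08000000)"
```

Read this way, the dump shows 23 doorbell bits set, an empty device queue, and a single device-pending bit (27).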

Best Regards
Haibo Chen

* Re: IMX8MM eMMC CQHCI timeout
  2021-11-01  1:57 ` Bough Chen
@ 2021-11-03 16:49   ` Tim Harvey
  2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Tim Harvey @ 2021-11-03 16:49 UTC (permalink / raw)
  To: Bough Chen
  Cc: Linux MMC List, Marcel Ziswiler, Fabio Estevam, Schrempf Frieder,
	Adam Ford, Lucas Stach, Peng Fan, Frank Li, Adrian Hunter,
	Shawn Guo, Ulf Hansson, Sascha Hauer, Pengutronix Kernel Team,
	dl-linux-imx, Cale Collins

On Sun, Oct 31, 2021 at 6:57 PM Bough Chen <haibo.chen@nxp.com> wrote:
> Hi Tim,
>
> I have been debugging this issue recently but, unfortunately, have not found
> the root cause yet.
> The values of the Doorbell, Dev queue and Dev Pend registers look abnormal.
> This issue happens on all i.MX SoCs that support the cmdq feature when the
> CPU load is high. I currently lack an MMC logic analyzer, which makes the
> issue hard to debug, so I still need some time. Sorry about that.
> If you want MMC to work stably, you can disable cmdq as a workaround.
>
> Best Regards
> Haibo Chen

Haibo,

Thanks for the information. Do you know of an easy way to reproduce it
reliably for testing?

I have tried the following on an eMMC filesystem:
stress --cpu 32 --io 32 &
dd if=/dev/zero of=foo bs=1M count=1000 &
dd if=/dev/zero of=foo bs=1M count=1000 &
rm foo

I'm unable to reproduce the issue that way, and it has only happened
randomly once or twice.
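
A variation on the attempt above, sketched here only as an assumption about what a more concurrent test might look like: each writer gets its own file, the reads run in parallel (both reported failures happened on reads), and the files are removed only after every background job has finished. `io_storm` and its arguments are hypothetical, and nothing in this thread shows that this workload triggers the timeout.

```shell
#!/bin/sh
# io_storm DIR JOBS SIZE_MB  (hypothetical helper, not from this thread)
# Runs JOBS concurrent writers, then JOBS concurrent readers, inside DIR.
io_storm() {
    dir=$1
    jobs=$2
    size_mb=$3

    i=0
    while [ "$i" -lt "$jobs" ]; do
        # One file per writer so the jobs do not overwrite each other.
        dd if=/dev/zero of="$dir/storm.$i" bs=1048576 count="$size_mb" 2>/dev/null &
        i=$(( i + 1 ))
    done
    wait    # all writers have finished
    sync    # push dirty pages out to the device

    i=0
    while [ "$i" -lt "$jobs" ]; do
        # Reads may be served from the page cache unless caches are dropped.
        dd if="$dir/storm.$i" of=/dev/null bs=1048576 2>/dev/null &
        i=$(( i + 1 ))
    done
    wait    # all readers have finished

    # Clean up only after every background job is done.
    i=0
    while [ "$i" -lt "$jobs" ]; do
        rm -f "$dir/storm.$i"
        i=$(( i + 1 ))
    done
}

# Example (path and sizes are placeholders):
#   io_storm /path/on/emmc 8 256
```

Running it alongside the `stress` invocation above would combine the CPU load with the concurrent I/O.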

Perhaps we should disable CMDQ for now until you can sort this out? I
can submit a patch for that.

Best regards,

Tim


* [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support
  2021-11-03 16:49   ` Tim Harvey
@ 2021-11-03 16:54     ` Tim Harvey
  2021-11-03 17:12       ` Fabio Estevam
                         ` (3 more replies)
  2021-11-03 17:21     ` IMX8MM eMMC CQHCI timeout Marcel Ziswiler
  2021-11-04  2:13     ` Bough Chen
  2 siblings, 4 replies; 11+ messages in thread
From: Tim Harvey @ 2021-11-03 16:54 UTC (permalink / raw)
  To: Adrian Hunter, Ulf Hansson, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	BOUGH CHEN, linux-mmc, Marcel Ziswiler, Schrempf Frieder,
	Adam Ford, Lucas Stach, Peng Fan
  Cc: Tim Harvey, stable

On i.MX SoCs that support CMDQ, the following can occur under high CPU
load:

mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
mmc2: sdhci: Host ctl2: 0x00000088
mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
mmc2: sdhci-esdhc-imx: data debug status:  0x2200
mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
mmc2: sdhci: ============================================

For now, disable CMDQ support on the imx8qm/imx8qxp/imx8mm until the
issue is found and resolved.

Fixes: bb6e358169bf6 ("mmc: sdhci-esdhc-imx: add CMDQ support")
Fixes: cde5e8e9ff146 ("mmc: sdhci-esdhc-imx: Add an new esdhc_soc_data for i.MX8MM")

Cc: stable@vger.kernel.org
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
---
 drivers/mmc/host/sdhci-esdhc-imx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
index e658f0174242..60f19369de84 100644
--- a/drivers/mmc/host/sdhci-esdhc-imx.c
+++ b/drivers/mmc/host/sdhci-esdhc-imx.c
@@ -300,7 +300,6 @@ static struct esdhc_soc_data usdhc_imx8qxp_data = {
 	.flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
 			| ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
 			| ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
-			| ESDHC_FLAG_CQHCI
 			| ESDHC_FLAG_STATE_LOST_IN_LPMODE
 			| ESDHC_FLAG_CLK_RATE_LOST_IN_PM_RUNTIME,
 };
@@ -309,7 +308,6 @@ static struct esdhc_soc_data usdhc_imx8mm_data = {
 	.flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
 			| ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
 			| ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
-			| ESDHC_FLAG_CQHCI
 			| ESDHC_FLAG_STATE_LOST_IN_LPMODE,
 };
 
-- 
2.17.1



* Re: [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support
  2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
@ 2021-11-03 17:12       ` Fabio Estevam
  2021-11-04  2:06         ` Bough Chen
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Fabio Estevam @ 2021-11-03 17:12 UTC (permalink / raw)
  To: Tim Harvey
  Cc: Adrian Hunter, Ulf Hansson, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, NXP Linux Team, BOUGH CHEN, linux-mmc,
	Marcel Ziswiler, Schrempf Frieder, Adam Ford, Lucas Stach,
	Peng Fan, stable

Hi Tim,

On Wed, Nov 3, 2021 at 1:54 PM Tim Harvey <tharvey@gateworks.com> wrote:
>
> On i.MX SoCs that support CMDQ, the following can occur under high CPU
> load:
>
> mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
> mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
> mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
> mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
> mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
> mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
> mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
> mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
> mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
> mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
> mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
> mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
> mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
> mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
> mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
> mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
> mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
> mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
> mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
> mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
> mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
> mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
> mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
> mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
> mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
> mmc2: sdhci: Host ctl2: 0x00000088
> mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
> mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
> mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
> mmc2: sdhci-esdhc-imx: data debug status:  0x2200
> mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
> mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
> mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
> mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
> mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
> mmc2: sdhci: ============================================
>
> For now, disable CMDQ support on the imx8qm/imx8qxp/imx8mm until the
> issue is found and resolved.
>
> Fixes: bb6e358169bf6 ("mmc: sdhci-esdhc-imx: add CMDQ support")
> Fixes: cde5e8e9ff146 ("mmc: sdhci-esdhc-imx: Add an new esdhc_soc_data for i.MX8MM")
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Tim Harvey <tharvey@gateworks.com>

It seems this is the best approach we can take at the moment, until
the ESDHC_FLAG_CQHCI issue is debugged.

Reviewed-by: Fabio Estevam <festevam@gmail.com>


* Re: IMX8MM eMMC CQHCI timeout
  2021-11-03 16:49   ` Tim Harvey
  2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
@ 2021-11-03 17:21     ` Marcel Ziswiler
  2021-11-04  2:13     ` Bough Chen
  2 siblings, 0 replies; 11+ messages in thread
From: Marcel Ziswiler @ 2021-11-03 17:21 UTC (permalink / raw)
  To: Tim Harvey, Bough Chen
  Cc: Linux MMC List, Fabio Estevam, Schrempf Frieder, Adam Ford,
	Lucas Stach, Peng Fan, Frank Li, Adrian Hunter, Shawn Guo,
	Ulf Hansson, Sascha Hauer, Pengutronix Kernel Team, dl-linux-imx,
	Cale Collins

On Wed, 2021-11-03 at 09:49 -0700, Tim Harvey wrote:
> On Sun, Oct 31, 2021 at 6:57 PM Bough Chen <haibo.chen@nxp.com> wrote:
> > Hi Tim,
> > 
> > I have been debugging this issue recently, but unfortunately have still
> > not found the root cause.
> > The register values of Doorbell, Dev queue, and Dev Pend seem abnormal.
> > This issue happens on all i.MX SoCs that support the cmdq feature when CPU
> > load is high. I currently lack an MMC logic analyzer, which makes this
> > issue hard to debug, so I still need some time. Sorry about that.
> > If you want mmc to work stably, you can disable cmdq as a workaround.
> > 
> > Best Regards
> > Haibo Chen
> 
> Haibo,
> 
> Thanks for the information. Do you know how to easily reproduce it
> reliably for testing?
> 
> I have tried the following on an eMMC filesystem:
> stress --cpu 32 --io 32 &
> dd if=/dev/zero of=foo bs=1M count=1000 &
> dd if=/dev/zero of=foo bs=1M count=1000 &
> rm foo
> 
> I'm unable to reproduce the issue that way, and it has only happened
> randomly once or twice.

It seems to happen only under highly concurrent, heavy I/O load. We hit it reliably, e.g. when doing docker pulls.
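
For anyone wanting to try a concurrent load without docker, here is a minimal sketch of a parallel I/O generator. It only assumes that several processes writing, syncing and reading back files at the same time approximate that kind of load; nothing in this thread confirms that it triggers the timeout. The names run_io_workers, write_then_read and the cqe-load-* file names are made up for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHUNK (1024 * 1024) /* 1 MiB per write, like bs=1M in the dd test */

/* Worker body: write `mib` MiB to `path`, fsync, read it back, delete it. */
static int write_then_read(const char *path, int mib)
{
    char *buf = malloc(CHUNK);
    int fd, i, ret = -1;

    if (!buf)
        return -1;
    memset(buf, 0xa5, CHUNK);

    fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        goto out_free;

    for (i = 0; i < mib; i++)
        if (write(fd, buf, CHUNK) != CHUNK)
            goto out_close;
    if (fsync(fd) < 0)              /* force the data out to the device */
        goto out_close;
    if (lseek(fd, 0, SEEK_SET) < 0)
        goto out_close;
    /* Note: this read-back is likely served from the page cache. */
    for (i = 0; i < mib; i++)
        if (read(fd, buf, CHUNK) != CHUNK)
            goto out_close;
    ret = 0;

out_close:
    close(fd);
    unlink(path);
out_free:
    free(buf);
    return ret;
}

/*
 * Fork `nproc` workers, each using its own file under `dir`.
 * Returns the number of workers that finished without an I/O error.
 */
int run_io_workers(const char *dir, int nproc, int mib)
{
    int i, status, ok = 0, forked = 0;
    char path[256];

    for (i = 0; i < nproc; i++) {
        pid_t pid = fork();

        if (pid < 0)
            break;
        if (pid == 0) {
            snprintf(path, sizeof(path), "%s/cqe-load-%d-%d",
                     dir, (int)getpid(), i);
            _exit(write_then_read(path, mib) ? 1 : 0);
        }
        forked++;
    }
    for (i = 0; i < forked; i++)
        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            ok++;
    return ok;
}
```

Calling run_io_workers() in a loop from main(), with `dir` pointing at a directory on the eMMC and `nproc` well above the CPU count, would combine I/O concurrency with CPU pressure.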

> Perhaps we should disable CMDQ for now until you can sort this out?

We also had to disable it. There was even some inconclusive discussion on NXP's community forum at one point,
though it concerned the downstream kernel.

https://community.nxp.com/t5/i-MX-Processors/The-issues-on-quot-mmc0-cqhci-timeout-for-tag-0-quot/m-p/1330691/highlight/true#M179200

> I can submit a patch for that.
> 
> Best regards,
> 
> Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support
  2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
@ 2021-11-04  2:06         ` Bough Chen
  2021-11-04  2:06         ` Bough Chen
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Bough Chen @ 2021-11-04  2:06 UTC (permalink / raw)
  To: tharvey, Adrian Hunter, Ulf Hansson, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, dl-linux-imx, linux-mmc,
	Marcel Ziswiler, Schrempf Frieder, Adam Ford, Lucas Stach,
	Peng Fan
  Cc: tharvey, stable

[-- Attachment #1: Type: text/plain, Size: 4644 bytes --]

> -----Original Message-----
> From: Tim Harvey [mailto:tharvey@gateworks.com]
> Sent: November 4, 2021 0:54
> To: Adrian Hunter <adrian.hunter@intel.com>; Ulf Hansson
> <ulf.hansson@linaro.org>; Shawn Guo <shawnguo@kernel.org>; Sascha Hauer
> <s.hauer@pengutronix.de>; Pengutronix Kernel Team
> <kernel@pengutronix.de>; Fabio Estevam <festevam@gmail.com>;
> dl-linux-imx <linux-imx@nxp.com>; Bough Chen <haibo.chen@nxp.com>;
> linux-mmc@vger.kernel.org; Marcel Ziswiler <marcel@ziswiler.com>;
> Schrempf Frieder <frieder.schrempf@kontron.de>; Adam Ford
> <aford173@gmail.com>; Lucas Stach <l.stach@pengutronix.de>; Peng Fan
> <peng.fan@nxp.com>
> Cc: tharvey@gateworks.com; stable@vger.kernel.org
> Subject: [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support
>
> On IMX SoCs which support CMDQ the following can occur during high
> CPU load:
>
> mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
> mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
> mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
> mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
> mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
> mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
> mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
> mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
> mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
> mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
> mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
> mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
> mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
> mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
> mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
> mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
> mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
> mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
> mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
> mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
> mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
> mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
> mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
> mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
> mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
> mmc2: sdhci: Host ctl2: 0x00000088
> mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
> mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
> mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
> mmc2: sdhci-esdhc-imx: data debug status:  0x2200
> mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
> mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
> mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
> mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
> mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
> mmc2: sdhci: ============================================
>
> For now, disable CMDQ support on the imx8qm/imx8qxp/imx8mm until the
> issue is found and resolved.
>
> Fixes: bb6e358169bf6 ("mmc: sdhci-esdhc-imx: add CMDQ support")
> Fixes: cde5e8e9ff146 ("mmc: sdhci-esdhc-imx: Add an new esdhc_soc_data for
> i.MX8MM")
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Tim Harvey <tharvey@gateworks.com>

Reviewed-by: Haibo Chen <haibo.chen@nxp.com>

Best Regards
Haibo Chen
> ---
>  drivers/mmc/host/sdhci-esdhc-imx.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c
> b/drivers/mmc/host/sdhci-esdhc-imx.c
> index e658f0174242..60f19369de84 100644
> --- a/drivers/mmc/host/sdhci-esdhc-imx.c
> +++ b/drivers/mmc/host/sdhci-esdhc-imx.c
> @@ -300,7 +300,6 @@ static struct esdhc_soc_data usdhc_imx8qxp_data = {
>       .flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
>                       | ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
>                       | ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
> -                     | ESDHC_FLAG_CQHCI
>                       | ESDHC_FLAG_STATE_LOST_IN_LPMODE
>                       | ESDHC_FLAG_CLK_RATE_LOST_IN_PM_RUNTIME,
>  };
> @@ -309,7 +308,6 @@ static struct esdhc_soc_data usdhc_imx8mm_data = {
>       .flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
>                       | ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
>                       | ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
> -                     | ESDHC_FLAG_CQHCI
>                       | ESDHC_FLAG_STATE_LOST_IN_LPMODE,
>  };
>
> --
> 2.17.1
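
The patch works because the driver treats the per-SoC flags word as a capability bitmask: dropping ESDHC_FLAG_CQHCI from a table entry leaves every other feature untouched. A minimal stand-alone sketch of that pattern follows. The FLAG_* bit positions, struct soc_data and the helper names are illustrative assumptions, not the values or code in sdhci-esdhc-imx.c.

```c
#include <stdint.h>

/* Illustrative bit positions only; the real ones live in the driver. */
#define FLAG_HS400     (1u << 0)
#define FLAG_HS400_ES  (1u << 1)
#define FLAG_CQHCI     (1u << 2)

/* Hypothetical stand-in for struct esdhc_soc_data. */
struct soc_data {
    uint32_t flags;
};

/* The i.MX8MM entry as it looked before the patch ... */
static const struct soc_data imx8mm_before = {
    .flags = FLAG_HS400 | FLAG_HS400_ES | FLAG_CQHCI,
};

/* ... and after it, with only the command queue flag removed. */
static const struct soc_data imx8mm_after = {
    .flags = FLAG_HS400 | FLAG_HS400_ES,
};

/* Probe-time style check: command queueing is set up only if flagged. */
int cqe_enabled(const struct soc_data *soc)
{
    return (soc->flags & FLAG_CQHCI) != 0;
}

/* Other capabilities are tested independently of the CQHCI bit. */
int hs400_enabled(const struct soc_data *soc)
{
    return (soc->flags & FLAG_HS400) != 0;
}
```

With these tables, cqe_enabled() is true for imx8mm_before and false for imx8mm_after, while hs400_enabled() stays true for both.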


^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: IMX8MM eMMC CQHCI timeout
  2021-11-03 16:49   ` Tim Harvey
  2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
  2021-11-03 17:21     ` IMX8MM eMMC CQHCI timeout Marcel Ziswiler
@ 2021-11-04  2:13     ` Bough Chen
  2 siblings, 0 replies; 11+ messages in thread
From: Bough Chen @ 2021-11-04  2:13 UTC (permalink / raw)
  To: tharvey
  Cc: Linux MMC List, Marcel Ziswiler, Fabio Estevam, Schrempf Frieder,
	Adam Ford, Lucas Stach, Peng Fan, Frank Li, Adrian Hunter,
	Shawn Guo, Ulf Hansson, Sascha Hauer, Pengutronix Kernel Team,
	dl-linux-imx, Cale Collins

[-- Attachment #1: Type: text/plain, Size: 8581 bytes --]

> -----Original Message-----
> From: Tim Harvey [mailto:tharvey@gateworks.com]
> Sent: November 4, 2021 0:50
> To: Bough Chen <haibo.chen@nxp.com>
> Cc: Linux MMC List <linux-mmc@vger.kernel.org>; Marcel Ziswiler
> <marcel@ziswiler.com>; Fabio Estevam <festevam@gmail.com>; Schrempf
> Frieder <frieder.schrempf@kontron.de>; Adam Ford <aford173@gmail.com>;
> Lucas Stach <l.stach@pengutronix.de>; Peng Fan <peng.fan@nxp.com>; Frank
> Li <frank.li@nxp.com>; Adrian Hunter <adrian.hunter@intel.com>; Shawn Guo
> <shawnguo@kernel.org>; Ulf Hansson <ulf.hansson@linaro.org>; Sascha
> Hauer <s.hauer@pengutronix.de>; Pengutronix Kernel Team
> <kernel@pengutronix.de>; dl-linux-imx <linux-imx@nxp.com>; Cale Collins
> <ccollins@gateworks.com>
> Subject: Re: IMX8MM eMMC CQHCI timeout
> 
> On Sun, Oct 31, 2021 at 6:57 PM Bough Chen <haibo.chen@nxp.com> wrote:
> >
> > > -----Original Message-----
> > > From: Tim Harvey [mailto:tharvey@gateworks.com]
> > > Sent: October 30, 2021 4:47
> > > To: Linux MMC List <linux-mmc@vger.kernel.org>; Marcel Ziswiler
> > > <marcel@ziswiler.com>; Fabio Estevam <festevam@gmail.com>; Schrempf
> > > Frieder <frieder.schrempf@kontron.de>; Adam Ford
> > > <aford173@gmail.com>; Bough Chen <haibo.chen@nxp.com>; Lucas Stach
> > > <l.stach@pengutronix.de>; Peng Fan <peng.fan@nxp.com>; Frank Li
> > > <frank.li@nxp.com>
> > > Cc: Adrian Hunter <adrian.hunter@intel.com>; Shawn Guo
> > > <shawnguo@kernel.org>; Ulf Hansson <ulf.hansson@linaro.org>; Sascha
> > > Hauer <s.hauer@pengutronix.de>; Pengutronix Kernel Team
> > > <kernel@pengutronix.de>; dl-linux-imx <linux-imx@nxp.com>; Cale
> > > Collins <ccollins@gateworks.com>
> > > Subject: IMX8MM eMMC CQHCI timeout
> > >
> > > Greetings,
> > >
> > > I've encountered the following MMC CQHCI timeout message a couple of
> > > times now on IMX8MM boards with eMMC with a 5.10 based kernel:
> > >
> > > > [  224.356283] mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
> > > > [  224.362764] mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
> > > > [  224.369250] mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
> > > > [  224.375726] mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
> > > > [  224.382197] mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
> > > > [  224.388665] mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
> > > > [  224.395129] mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
> > > > [  224.401598] mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
> > > > [  224.408064] mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
> > > > [  224.414532] mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
> > > > [  224.420997] mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
> > > > [  224.427467] mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
> > > > [  224.433934] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
> > > > [  224.440404] mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
> > > > [  224.446877] mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
> > > > [  224.453346] mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
> > > > [  224.459811] mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
> > > > [  224.466281] mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
> > > > [  224.472752] mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
> > > > [  224.479225] mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
> > > > [  224.485690] mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
> > > > [  224.492161] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
> > > > [  224.498628] mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
> > > > [  224.505097] mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
> > > > [  224.511575] mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
> > > > [  224.518043] mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
> > > > [  224.524512] mmc2: sdhci: Host ctl2: 0x00000088
> > > > [  224.528986] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
> > > > [  224.535451] mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
> > > > [  224.543052] mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
> > > > [  224.548740] mmc2: sdhci-esdhc-imx: data debug status:  0x2200
> > > > [  224.554510] mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
> > > > [  224.560368] mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
> > > > [  224.566054] mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
> > > > [  224.571826] mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
> > > > [  224.577608] mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
> > > > [  224.583900] mmc2: sdhci: ============================================
> > >
> > > > I don't know how to make the issue occur; both times it occurred while
> > > > simply reading a file in the rootfs ext4 fs on the eMMC.
> > >
> > > Some research shows:
> > > -
> > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fco
> > > mmu
> > >
> nity.nxp.com%2Ft5%2Fi-MX-Processors%2FThe-issues-on-quot-mmc0-cqhci-
> > > tim
> > >
> eout-for-tag-0-quot%2Fm-p%2F993779&amp;data=04%7C01%7Chaibo.chen%4
> > >
> 0nxp.com%7C1dc0981634f5460a779808d99b1d5a88%7C686ea1d3bc2b4c6fa9
> > >
> 2cd99c5c301635%7C0%7C0%7C637711372651089473%7CUnknown%7CTWFp
> > >
> bGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI
> > >
> 6Mn0%3D%7C1000&amp;sdata=ITcs7%2FMy%2F1Vx1TMB2VlaY4QhibKuSFBD
> > > 6UZhzVFl%2FqY%3D&amp;reserved=0
> > > -
> > > https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgit
> > > .torad%2F&amp;data=04%7C01%7Chaibo.chen%40nxp.com%7C281983c39
> 6a442e7
> > >
> 8d2108d99ee9f858%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C6
> 37715
> > >
> 549993442194%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQ
> IjoiV2l
> > >
> uMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=CyMZIUVjzXj
> 2tD3
> > > MfO4kUAOXr5SazgtJSRlhro9wOvU%3D&amp;reserved=0
> > >
> ex.com%2Fcgit%2Flinux-toradex.git%2Fcommit%2F%3Fh%3Dtoradex_5.4-2.3.
> > > x
> -imx%26id%3Dfd33531be843566c59a5fc655f204bbd36d7f3c6&amp;data=04%
> > >
> 7C01%7Chaibo.chen%40nxp.com%7C1dc0981634f5460a779808d99b1d5a88%
> > >
> 7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637711372651089473
> > > %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzI
> iLCJ
> > >
> BTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=xaamzPb2CdW6YDzW
> > > g8uBb0PjomkoWAziu5qglvMbT2I%3D&amp;reserved=0
> > >
> > > > I'm not clear if this info is up-to-date. The NXP 5.4 kernel did not
> > > > enable this feature, but if I'm not mistaken CQHCI support itself
> > > > didn't land in mainline until a later kernel, so it would make sense it
> > > > was not enabled at that time. I do see the NXP 5.10 kernels have this
> > > > enabled, so I'm curious if it is an issue there.
> > > >
> > > > Any other IMX8MM or other SoC users know what this could be about or
> > > > what I could do for a test to try to reproduce it so I can see if it
> > > > occurs in other kernel versions?
> >
> > Hi Tim,
> >
> > > I have been debugging this issue recently, but unfortunately have still
> > > not found the root cause.
> > > The register values of Doorbell, Dev queue, and Dev Pend seem abnormal.
> > > This issue happens on all i.MX SoCs that support the cmdq feature when
> > > CPU load is high. I currently lack an MMC logic analyzer, which makes
> > > this issue hard to debug, so I still need some time. Sorry about that.
> > > If you want mmc to work stably, you can disable cmdq as a workaround.
> >
> > Best Regards
> > Haibo Chen
> 
> Haibo,
> 
> Thanks for the information. Do you know how to easily reproduce it
> reliably for testing?

Not yet. I can only hit this issue randomly, after a few hours of stress
testing under high CPU load.

My next steps are:
1. Find a way to reproduce this issue easily.
2. Get an eMMC logic analyzer.
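
Until a reproducer exists, the dump quoted above can at least be read bit by bit. The following is a minimal sketch, assuming the usual CQHCI convention that each bit of Doorbell, Dev queue and Dev Pend stands for one task slot. The helper names are made up for illustration and are not part of the cqhci driver.

```c
#include <stdint.h>
#include <stdio.h>

/* Number of set bits, i.e. number of task slots flagged in a register. */
int count_slots(uint32_t reg)
{
    int n = 0;

    while (reg) {
        reg &= reg - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Slots whose doorbell bit is set but which show up in neither the
 * device queue status nor the device pending register.
 */
uint32_t slots_not_in_device(uint32_t doorbell, uint32_t dev_queue,
                             uint32_t dev_pend)
{
    return doorbell & ~(dev_queue | dev_pend);
}

/* Print the slot numbers flagged in a register, lowest first. */
void print_slots(const char *name, uint32_t reg)
{
    int slot;

    printf("%s:", name);
    for (slot = 0; slot < 32; slot++)
        if (reg & (1u << slot))
            printf(" %d", slot);
    printf("\n");
}
```

With the values from the dump (Doorbell 0xbf01dfff, Dev queue 0x00000000, Dev Pend 0x08000000), count_slots() gives 23 slots in the doorbell and 1 in Dev Pend (slot 27), so 22 doorbell slots appear in neither device register. Whether that is the abnormality meant above is not confirmed anywhere in this thread.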


> 
> I have tried the following on an eMMC filesystem:
> stress --cpu 32 --io 32 &
> dd if=/dev/zero of=foo bs=1M count=1000 &
> dd if=/dev/zero of=foo bs=1M count=1000 &
> rm foo
> 
> I'm unable to reproduce the issue that way, and it has only happened
> randomly once or twice.
> 
> Perhaps we should disable CMDQ for now until you can sort this out? I can
> submit a patch for that.

Yes, please.

Best Regards
Haibo Chen
> 
> Best regards,
> 
> Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support
  2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
  2021-11-03 17:12       ` Fabio Estevam
  2021-11-04  2:06         ` Bough Chen
@ 2021-11-05  7:56       ` Adrian Hunter
  2021-11-15 14:54       ` Ulf Hansson
  3 siblings, 0 replies; 11+ messages in thread
From: Adrian Hunter @ 2021-11-05  7:56 UTC (permalink / raw)
  To: Tim Harvey, Ulf Hansson, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	BOUGH CHEN, linux-mmc, Marcel Ziswiler, Schrempf Frieder,
	Adam Ford, Lucas Stach, Peng Fan
  Cc: stable

On 03/11/2021 18:54, Tim Harvey wrote:
> On IMX SoCs which support CMDQ the following can occur during high
> CPU load:
> 
> mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
> mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
> mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
> mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
> mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
> mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
> mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
> mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
> mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
> mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
> mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
> mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
> mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
> mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
> mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
> mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
> mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
> mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
> mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
> mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
> mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
> mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
> mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
> mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
> mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
> mmc2: sdhci: Host ctl2: 0x00000088
> mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
> mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
> mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
> mmc2: sdhci-esdhc-imx: data debug status:  0x2200
> mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
> mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
> mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
> mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
> mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
> mmc2: sdhci: ============================================
> 
> For now, disable CMDQ support on the imx8qm/imx8qxp/imx8mm until the
> issue is found and resolved.
> 
> Fixes: bb6e358169bf6 ("mmc: sdhci-esdhc-imx: add CMDQ support")
> Fixes: cde5e8e9ff146 ("mmc: sdhci-esdhc-imx: Add an new esdhc_soc_data
> for i.MX8MM")
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Tim Harvey <tharvey@gateworks.com>

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/mmc/host/sdhci-esdhc-imx.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
> index e658f0174242..60f19369de84 100644
> --- a/drivers/mmc/host/sdhci-esdhc-imx.c
> +++ b/drivers/mmc/host/sdhci-esdhc-imx.c
> @@ -300,7 +300,6 @@ static struct esdhc_soc_data usdhc_imx8qxp_data = {
>  	.flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
>  			| ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
>  			| ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
> -			| ESDHC_FLAG_CQHCI
>  			| ESDHC_FLAG_STATE_LOST_IN_LPMODE
>  			| ESDHC_FLAG_CLK_RATE_LOST_IN_PM_RUNTIME,
>  };
> @@ -309,7 +308,6 @@ static struct esdhc_soc_data usdhc_imx8mm_data = {
>  	.flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
>  			| ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
>  			| ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
> -			| ESDHC_FLAG_CQHCI
>  			| ESDHC_FLAG_STATE_LOST_IN_LPMODE,
>  };
>  
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support
  2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
                         ` (2 preceding siblings ...)
  2021-11-05  7:56       ` Adrian Hunter
@ 2021-11-15 14:54       ` Ulf Hansson
  3 siblings, 0 replies; 11+ messages in thread
From: Ulf Hansson @ 2021-11-15 14:54 UTC (permalink / raw)
  To: Tim Harvey
  Cc: Adrian Hunter, Shawn Guo, Sascha Hauer, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, BOUGH CHEN, linux-mmc,
	Marcel Ziswiler, Schrempf Frieder, Adam Ford, Lucas Stach,
	Peng Fan, stable

On Wed, 3 Nov 2021 at 17:54, Tim Harvey <tharvey@gateworks.com> wrote:
>
> On IMX SoCs which support CMDQ the following can occur during high
> CPU load:
>
> mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
> mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
> mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
> mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
> mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
> mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
> mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
> mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
> mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
> mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
> mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
> mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
> mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
> mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
> mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
> mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
> mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
> mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
> mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
> mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
> mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
> mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
> mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
> mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
> mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
> mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
> mmc2: sdhci: Host ctl2: 0x00000088
> mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
> mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
> mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
> mmc2: sdhci-esdhc-imx: data debug status:  0x2200
> mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
> mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
> mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
> mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
> mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
> mmc2: sdhci: ============================================
>
> For now, disable CMDQ support on the imx8qm/imx8qxp/imx8mm until the
> issue is found and resolved.
>
> Fixes: bb6e358169bf6 ("mmc: sdhci-esdhc-imx: add CMDQ support")
> Fixes: cde5e8e9ff146 ("mmc: sdhci-esdhc-imx: Add an new esdhc_soc_data
> for i.MX8MM")
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Tim Harvey <tharvey@gateworks.com>

Applied for fixes, thanks!

Kind regards
Uffe


> ---
>  drivers/mmc/host/sdhci-esdhc-imx.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
> index e658f0174242..60f19369de84 100644
> --- a/drivers/mmc/host/sdhci-esdhc-imx.c
> +++ b/drivers/mmc/host/sdhci-esdhc-imx.c
> @@ -300,7 +300,6 @@ static struct esdhc_soc_data usdhc_imx8qxp_data = {
>         .flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
>                         | ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
>                         | ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
> -                       | ESDHC_FLAG_CQHCI
>                         | ESDHC_FLAG_STATE_LOST_IN_LPMODE
>                         | ESDHC_FLAG_CLK_RATE_LOST_IN_PM_RUNTIME,
>  };
> @@ -309,7 +308,6 @@ static struct esdhc_soc_data usdhc_imx8mm_data = {
>         .flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
>                         | ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS200
>                         | ESDHC_FLAG_HS400 | ESDHC_FLAG_HS400_ES
> -                       | ESDHC_FLAG_CQHCI
>                         | ESDHC_FLAG_STATE_LOST_IN_LPMODE,
>  };
>
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-11-15 14:54 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-29 20:47 IMX8MM eMMC CQHCI timeout Tim Harvey
2021-11-01  1:57 ` Bough Chen
2021-11-03 16:49   ` Tim Harvey
2021-11-03 16:54     ` [PATCH] mmc: sdhci-esdhc-imx: disable CMDQ support Tim Harvey
2021-11-03 17:12       ` Fabio Estevam
2021-11-04  2:06       ` Bough Chen
2021-11-04  2:06         ` Bough Chen
2021-11-05  7:56       ` Adrian Hunter
2021-11-15 14:54       ` Ulf Hansson
2021-11-03 17:21     ` IMX8MM eMMC CQHCI timeout Marcel Ziswiler
2021-11-04  2:13     ` Bough Chen
