All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ahmad Fatoum <a.fatoum@pengutronix.de>
To: linux-mtd@lists.infradead.org, ikegami@allied-telesis.co.jp,
	Joakim.Tjernlund@infinera.com, miquel.raynal@bootlin.com,
	vigneshr@ti.com, richard@nod.at, ikegami.t@gmail.com
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>,
	Brian Norris <computersforpeace@gmail.com>,
	David Woodhouse <dwmw2@infradead.org>,
	marek.vasut@gmail.com, cyrille.pitchen@wedev4u.fr,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Pengutronix Kernel Team <kernel@pengutronix.de>,
	linuxppc-dev@lists.ozlabs.org, Shaohui.Xie@nxp.com
Subject: [BUG] mtd: cfi_cmdset_0002: write regression since v4.17-rc1
Date: Mon, 13 Dec 2021 14:24:39 +0100	[thread overview]
Message-ID: <b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.de> (raw)

Hi,

I've been investigating a breakage on a PowerPC MPC8313: The SoC is connected
via the "Enhanced Local Bus Controller" to a 8-bit-parallel S29GL064N flash,
which is represented as a memory-mapped cfi-flash.

The regression began in v4.17-rc1 with

  dfeae1073583 ("mtd: cfi_cmdset_0002: Change write buffer to check correct value")

and causes all flash write accesses on the hardware to fail. Example output
after v5.1-rc2[1]:

  root@host:~# mount -t jffs2 /dev/mtdblock0 /mnt
  MTD do_write_buffer_wait(): software timeout, address:0x000c000b.
  jffs2: Write clean marker to block at 0x000c0000 failed: -5

This issue still persists with v5.16-rc. Reverting aforementioned patch fixes
it, but I am still looking for a change that keeps both Tokunori's and my
hardware happy.

What Tokunori's patch did is that it strengthened the success condition
for flash writes:

 - Prior to the patch, DQ polling was done until bits
   stopped toggling. This was taken as an indicator that the write succeeded
   and was reported up the stack. i.e. success condition is chip_ready()

 - After the patch, polling continues until the just written data is
   actually read back, i.e. success condition is chip_good()

This new condition never holds for me, when DQ stabilizes, it reads 0xFF,
never the just written data. The data is still written and can be read back
on subsequent reads, just not at that point of time in the poll loop.

We haven't had write issues for the years predating that patch. As the
regression has been mainline for a while, I am wondering what about my setup
that makes it pop up here, but not elsewhere?

I consulted the data sheet[2] and found Figure 27, which describes DQ polling
during embedded algorithms. DQ switches from status output to "True" (I assume
True == all bits set == 0xFF) until CS# is reasserted. 

I compared with another chip's datasheet, and it (Figure 8.4) doesn't describe
such an intermittent "True" state. In any case, the driver polls a few hundred
times, however, before giving up, so there should be enough CS# toggles.


Locally, I'll revert this patch for now. I think accepting 0xFF as a success
condition may be appropriate, but I don't yet have the rationale to back it up.

I am investigating this some more, probably with a logic trace, but I wanted
to report this in case someone has pointers and in case other people run into
the same issue.


Cheers,
Ahmad

[1] Prior to d9b8a67b3b95 ("mtd: cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer") 
    first included with v5.1-rc2, failing writes just hung indefinitely in kernel space.
    That's fixed, but the writes still fail.

[2]: 001-98525 Rev. *B, https://www.infineon.com/dgdl/Infineon-S29GL064N_S29GL032N_64_Mbit_32_Mbit_3_V_Page_Mode_MirrorBit_Flash-DataSheet-v03_00-EN.pdf?fileId=8ac78c8c7d0d8da4017d0ed556fd548b

[3]: https://www.mouser.com/datasheet/2/268/SST39VF1601C-SST39VF1602C-16-Mbit-x16-Multi-Purpos-709008.pdf
     Note that "true data" means valid data here, not all bits one.
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

WARNING: multiple messages have this Message-ID (diff)
From: Ahmad Fatoum <a.fatoum@pengutronix.de>
To: linux-mtd@lists.infradead.org, ikegami@allied-telesis.co.jp,
	Joakim.Tjernlund@infinera.com, miquel.raynal@bootlin.com,
	vigneshr@ti.com, richard@nod.at, ikegami.t@gmail.com
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>,
	Brian Norris <computersforpeace@gmail.com>,
	David Woodhouse <dwmw2@infradead.org>,
	marek.vasut@gmail.com, cyrille.pitchen@wedev4u.fr,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Pengutronix Kernel Team <kernel@pengutronix.de>,
	linuxppc-dev@lists.ozlabs.org, Shaohui.Xie@nxp.com
Subject: [BUG] mtd: cfi_cmdset_0002: write regression since v4.17-rc1
Date: Mon, 13 Dec 2021 14:24:39 +0100	[thread overview]
Message-ID: <b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.de> (raw)

Hi,

I've been investigating a breakage on a PowerPC MPC8313: The SoC is connected
via the "Enhanced Local Bus Controller" to a 8-bit-parallel S29GL064N flash,
which is represented as a memory-mapped cfi-flash.

The regression began in v4.17-rc1 with

  dfeae1073583 ("mtd: cfi_cmdset_0002: Change write buffer to check correct value")

and causes all flash write accesses on the hardware to fail. Example output
after v5.1-rc2[1]:

  root@host:~# mount -t jffs2 /dev/mtdblock0 /mnt
  MTD do_write_buffer_wait(): software timeout, address:0x000c000b.
  jffs2: Write clean marker to block at 0x000c0000 failed: -5

This issue still persists with v5.16-rc. Reverting aforementioned patch fixes
it, but I am still looking for a change that keeps both Tokunori's and my
hardware happy.

What Tokunori's patch did is that it strengthened the success condition
for flash writes:

 - Prior to the patch, DQ polling was done until bits
   stopped toggling. This was taken as an indicator that the write succeeded
   and was reported up the stack. i.e. success condition is chip_ready()

 - After the patch, polling continues until the just written data is
   actually read back, i.e. success condition is chip_good()

This new condition never holds for me, when DQ stabilizes, it reads 0xFF,
never the just written data. The data is still written and can be read back
on subsequent reads, just not at that point of time in the poll loop.

We haven't had write issues for the years predating that patch. As the
regression has been mainline for a while, I am wondering what about my setup
that makes it pop up here, but not elsewhere?

I consulted the data sheet[2] and found Figure 27, which describes DQ polling
during embedded algorithms. DQ switches from status output to "True" (I assume
True == all bits set == 0xFF) until CS# is reasserted. 

I compared with another chip's datasheet, and it (Figure 8.4) doesn't describe
such an intermittent "True" state. In any case, the driver polls a few hundred
times, however, before giving up, so there should be enough CS# toggles.


Locally, I'll revert this patch for now. I think accepting 0xFF as a success
condition may be appropriate, but I don't yet have the rationale to back it up.

I am investigating this some more, probably with a logic trace, but I wanted
to report this in case someone has pointers and in case other people run into
the same issue.


Cheers,
Ahmad

[1] Prior to d9b8a67b3b95 ("mtd: cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer") 
    first included with v5.1-rc2, failing writes just hung indefinitely in kernel space.
    That's fixed, but the writes still fail.

[2]: 001-98525 Rev. *B, https://www.infineon.com/dgdl/Infineon-S29GL064N_S29GL032N_64_Mbit_32_Mbit_3_V_Page_Mode_MirrorBit_Flash-DataSheet-v03_00-EN.pdf?fileId=8ac78c8c7d0d8da4017d0ed556fd548b

[3]: https://www.mouser.com/datasheet/2/268/SST39VF1601C-SST39VF1602C-16-Mbit-x16-Multi-Purpos-709008.pdf
     Note that "true data" means valid data here, not all bits one.
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

WARNING: multiple messages have this Message-ID (diff)
From: Ahmad Fatoum <a.fatoum@pengutronix.de>
To: linux-mtd@lists.infradead.org, ikegami@allied-telesis.co.jp,
	Joakim.Tjernlund@infinera.com, miquel.raynal@bootlin.com,
	vigneshr@ti.com, richard@nod.at, ikegami.t@gmail.com
Cc: linuxppc-dev@lists.ozlabs.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	marek.vasut@gmail.com,
	Chris Packham <chris.packham@alliedtelesis.co.nz>,
	Pengutronix Kernel Team <kernel@pengutronix.de>,
	cyrille.pitchen@wedev4u.fr, Shaohui.Xie@nxp.com,
	Brian Norris <computersforpeace@gmail.com>,
	David Woodhouse <dwmw2@infradead.org>
Subject: [BUG] mtd: cfi_cmdset_0002: write regression since v4.17-rc1
Date: Mon, 13 Dec 2021 14:24:39 +0100	[thread overview]
Message-ID: <b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.de> (raw)

Hi,

I've been investigating a breakage on a PowerPC MPC8313: The SoC is connected
via the "Enhanced Local Bus Controller" to a 8-bit-parallel S29GL064N flash,
which is represented as a memory-mapped cfi-flash.

The regression began in v4.17-rc1 with

  dfeae1073583 ("mtd: cfi_cmdset_0002: Change write buffer to check correct value")

and causes all flash write accesses on the hardware to fail. Example output
after v5.1-rc2[1]:

  root@host:~# mount -t jffs2 /dev/mtdblock0 /mnt
  MTD do_write_buffer_wait(): software timeout, address:0x000c000b.
  jffs2: Write clean marker to block at 0x000c0000 failed: -5

This issue still persists with v5.16-rc. Reverting aforementioned patch fixes
it, but I am still looking for a change that keeps both Tokunori's and my
hardware happy.

What Tokunori's patch did is that it strengthened the success condition
for flash writes:

 - Prior to the patch, DQ polling was done until bits
   stopped toggling. This was taken as an indicator that the write succeeded
   and was reported up the stack. i.e. success condition is chip_ready()

 - After the patch, polling continues until the just written data is
   actually read back, i.e. success condition is chip_good()

This new condition never holds for me, when DQ stabilizes, it reads 0xFF,
never the just written data. The data is still written and can be read back
on subsequent reads, just not at that point of time in the poll loop.

We haven't had write issues for the years predating that patch. As the
regression has been mainline for a while, I am wondering what about my setup
that makes it pop up here, but not elsewhere?

I consulted the data sheet[2] and found Figure 27, which describes DQ polling
during embedded algorithms. DQ switches from status output to "True" (I assume
True == all bits set == 0xFF) until CS# is reasserted. 

I compared with another chip's datasheet, and it (Figure 8.4) doesn't describe
such an intermittent "True" state. In any case, the driver polls a few hundred
times, however, before giving up, so there should be enough CS# toggles.


Locally, I'll revert this patch for now. I think accepting 0xFF as a success
condition may be appropriate, but I don't yet have the rationale to back it up.

I am investigating this some more, probably with a logic trace, but I wanted
to report this in case someone has pointers and in case other people run into
the same issue.


Cheers,
Ahmad

[1] Prior to d9b8a67b3b95 ("mtd: cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer") 
    first included with v5.1-rc2, failing writes just hung indefinitely in kernel space.
    That's fixed, but the writes still fail.

[2]: 001-98525 Rev. *B, https://www.infineon.com/dgdl/Infineon-S29GL064N_S29GL032N_64_Mbit_32_Mbit_3_V_Page_Mode_MirrorBit_Flash-DataSheet-v03_00-EN.pdf?fileId=8ac78c8c7d0d8da4017d0ed556fd548b

[3]: https://www.mouser.com/datasheet/2/268/SST39VF1601C-SST39VF1602C-16-Mbit-x16-Multi-Purpos-709008.pdf
     Note that "true data" means valid data here, not all bits one.
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

             reply	other threads:[~2021-12-13 13:24 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-13 13:24 Ahmad Fatoum [this message]
2021-12-13 13:24 ` [BUG] mtd: cfi_cmdset_0002: write regression since v4.17-rc1 Ahmad Fatoum
2021-12-13 13:24 ` Ahmad Fatoum
2021-12-14  7:23 ` Thorsten Leemhuis
2021-12-14  7:23   ` Thorsten Leemhuis
2021-12-14  7:23   ` Thorsten Leemhuis
2021-12-15 17:34   ` Tokunori Ikegami
2021-12-15 17:34     ` Tokunori Ikegami
2021-12-15 17:34     ` Tokunori Ikegami
2022-01-20 13:00     ` Thorsten Leemhuis
2022-01-20 13:00       ` Thorsten Leemhuis
2022-01-20 13:00       ` Thorsten Leemhuis
2022-01-28 12:55     ` Ahmad Fatoum
2022-01-28 12:55       ` Ahmad Fatoum
2022-01-28 12:55       ` Ahmad Fatoum
2022-01-29 18:01       ` Tokunori Ikegami
2022-01-29 18:01         ` Tokunori Ikegami
2022-01-29 18:01         ` Tokunori Ikegami
2022-02-07 14:28         ` Ahmad Fatoum
2022-02-07 14:28           ` Ahmad Fatoum
2022-02-07 14:28           ` Ahmad Fatoum
2022-02-13 16:47           ` Tokunori Ikegami
2022-02-13 16:47             ` Tokunori Ikegami
2022-02-13 16:47             ` Tokunori Ikegami
2022-02-14 16:22             ` Ahmad Fatoum
2022-02-14 16:22               ` Ahmad Fatoum
2022-02-14 16:22               ` Ahmad Fatoum
2022-02-14 18:46               ` Tokunori Ikegami
2022-02-14 18:46                 ` Tokunori Ikegami
2022-02-14 18:46                 ` Tokunori Ikegami
2022-02-20 12:22                 ` Tokunori Ikegami
2022-02-20 12:22                   ` Tokunori Ikegami
2022-02-20 12:22                   ` Tokunori Ikegami
2022-03-04 11:11                   ` Ahmad Fatoum
2022-03-04 11:11                     ` Ahmad Fatoum
2022-03-04 11:11                     ` Ahmad Fatoum
2022-03-06 15:49                     ` Tokunori Ikegami
2022-03-06 15:49                       ` Tokunori Ikegami
2022-03-06 15:49                       ` Tokunori Ikegami
2022-03-08  9:44                       ` Ahmad Fatoum
2022-03-08  9:44                         ` Ahmad Fatoum
2022-03-08  9:44                         ` Ahmad Fatoum
2022-03-08 16:13                         ` Tokunori Ikegami
2022-03-08 16:13                           ` Tokunori Ikegami
2022-03-08 16:13                           ` Tokunori Ikegami
2022-03-08 16:23                           ` Ahmad Fatoum
2022-03-08 16:23                             ` Ahmad Fatoum
2022-03-08 16:23                             ` Ahmad Fatoum
2022-03-08 16:40                             ` Tokunori Ikegami
2022-03-08 16:40                               ` Tokunori Ikegami
2022-03-08 16:40                               ` Tokunori Ikegami

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.de \
    --to=a.fatoum@pengutronix.de \
    --cc=Joakim.Tjernlund@infinera.com \
    --cc=Shaohui.Xie@nxp.com \
    --cc=chris.packham@alliedtelesis.co.nz \
    --cc=computersforpeace@gmail.com \
    --cc=cyrille.pitchen@wedev4u.fr \
    --cc=dwmw2@infradead.org \
    --cc=ikegami.t@gmail.com \
    --cc=ikegami@allied-telesis.co.jp \
    --cc=kernel@pengutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=marek.vasut@gmail.com \
    --cc=miquel.raynal@bootlin.com \
    --cc=richard@nod.at \
    --cc=vigneshr@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.