From: Masahiro Yamada <masahiroy@kernel.org>
To: Tim Sander <tim@krieglstein.org>
Cc: Vignesh Raghavendra <vigneshr@ti.com>,
	Marek Vasut <marek.vasut@gmail.com>,
	Richard Weinberger <richard@nod.at>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Dinh Nguyen <dinguyen@kernel.org>,
	linux-mtd <linux-mtd@lists.infradead.org>,
	Miquel Raynal <miquel.raynal@bootlin.com>,
	Brian Norris <computersforpeace@gmail.com>,
	David Woodhouse <dwmw2@infradead.org>
Subject: Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
Date: Sat, 11 Jan 2020 11:38:35 +0900
Message-ID: <CAK7LNASZMH34QcQij8CuGnOkC1_g6UShiHw3+_QBLddzf6W4XA@mail.gmail.com>
In-Reply-To: <2585494.6OhLyxUeiZ@hydra>

On Sat, Jan 11, 2020 at 7:38 AM Tim Sander <tim@krieglstein.org> wrote:
>
> Hi
> Am Freitag, 10. Januar 2020, 20:05:20 CET schrieb Masahiro Yamada:
> > On Sat, Jan 11, 2020 at 1:47 AM Tim Sander <tim@krieglstein.org> wrote:
> > > Hi Masahiro Yamada
> > >
> > > Sorry for the long delay. I have seen the patches at
> > > https://lists.infradead.org/pipermail/linux-mtd/2019-December/092852.html
> > > They seem to resolve the question about the spare_area_skip_bytes register.
> > >
> > > I have now set the register to 2, which seems to be the right choice on an
> > > Intel SocFPGA. But I am still out of luck trying to boot 5.4.5-rt3 or
> > > 5.5-rc5. I get the following messages during boot:
> > > [    1.825590] denali-nand-dt ff900000.nand: timeout while waiting for irq 0x1000
> > > [    1.832936] denali-nand-dt: probe of ff900000.nand failed with error -5
> > >
> > > But the commit c19e31d0a32dd 2017-06-13 22:45:38 predates the 4.19 kernel
> > > release (Mon Oct 22 07:37:37 2018). So it seems there is not an obvious
> > > commit which is causing the problem. Looking at the changes it might be
> > > that the timing calculations in the driver changed which might also lead
> > > to a similar error.
> > >
> > > I am booting via NFS; the bootloader is placed in NOR flash. The
> > > corresponding NAND dts entry is updated to the new format and looks
> > > like this:
> > >                 nand@ff900000 {
> > >                         #address-cells = <0x1>;
> > >                         #size-cells = <0x0>;
> > >                         compatible = "altr,socfpga-denali-nand";
> > >                         reg = <0xff900000 0x100000 0xffb80000 0x10000>;
> > >                         reg-names = "nand_data", "denali_reg";
> > >                         interrupts = <0x0 0x90 0x4>;
> > >                         clocks = <0x2d 0x1e 0x2e>;
> > >                         clock-names = "nand", "nand_x", "ecc";
> > >                         resets = <0x6 0x24>;
> > >                         status = "okay";
> > >                         nand@0 {
> > >                                 reg = <0x0>;
> > >                                 #address-cells = <0x1>;
> > >                                 #size-cells = <0x1>;
> > >                                 partition@0 {
> > >                                         label = "work";
> > >                                         reg = <0x0 0x10000000>;
> > >                                 };
> > >                         };
> > >                 };
> > >
> > > The last kernel i am able to boot is 4.19.10. I have tried booting:
> > > 5.1.21, 5.2.9, 5.3-rc8, 5.4.5-rt3 and 5.5-rc5. They all failed.
> > > Unfortunately the range is quite large for bisecting the problem. It also
> > > occurred to me that all the platforms with Intel Cyclone V in mainline
> > > are development boards which boot from SD-card not exhibiting this
> > > problem on their default boot path.
> > What will happen if you apply all of these:
> >
> > http://patchwork.ozlabs.org/project/linux-mtd/list/?series=149821
> I have applied this patch set but it does not help completely.


OK, I just wanted to eliminate any other possibility, just in case.


> The timings are
> wrong. I don't have access to the hardware now but one thing i tested before i
> left (the HW) was to write the NAND timings from the bootloader into the
> denali controller after the driver configured the timings in denali_init.
> After that the driver worked again for me.
>
> > on top of the mainline kernel,
> > and then, hack denali->clk_rate and denali->clk_x_rate as follows?
> >
> >
> > -       denali->clk_rate = clk_get_rate(dt->clk);
> > -       denali->clk_x_rate = clk_get_rate(dt->clk_x);
> > +       denali->clk_rate = 50000000;
> > +       denali->clk_x_rate = 200000000;
> >
> > If it still fails, what about this?
> >
> >        denali->clk_rate = 0;
> >        denali->clk_x_rate = 0;
> Will try the above next week. Skimming over the socfpga.dtsi it seems as if
> on the Intel SocFPGA the OSC1 has a value of 25000000 set in
> socfpga_cyclone5.dtsi (I am currently not sure about the clock tree with all
> the PLLs and I am missing the value of osc2?). Also right now it seems I am too
> tired to parse denali_setup_data_interface...


You do not need to parse denali_setup_data_interface().


There are good hints.

You said:
"The last kernel i am able to boot is 4.19.10. I have tried booting:
5.1.21, 5.2.9, 5.3-rc8, 5.4.5-rt3 and 5.5-rc5. They all failed."

There is no commit between 4.19.10 and 5.1.21
that changes denali_setup_data_interface().

So, denali_setup_data_interface() is not the
root cause.


From the information you provided,
I suspect some clock settings are wrong.
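
To illustrate why a wrong clock rate matters, here is a minimal sketch
(not the driver's actual code) of how a timing requirement given in
picoseconds could be converted into controller clock cycles. The helper
names and the 12000 ps requirement are assumptions made up for
illustration; 200 MHz and 50 MHz are the rates from the hack quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clock period in picoseconds, rounded down. */
static inline uint64_t clk_period_ps(uint64_t rate_hz)
{
	assert(rate_hz > 0);	/* a zero rate has no period */
	return 1000000000000ULL / rate_hz;
}

/* Hypothetical helper: cycles needed to cover timing_ps, rounded up. */
static inline uint32_t ps_to_cycles(uint64_t timing_ps, uint64_t rate_hz)
{
	uint64_t period = clk_period_ps(rate_hz);

	return (uint32_t)((timing_ps + period - 1) / period);
}
```

With a rate of 200 MHz the period is 5000 ps, so ps_to_cycles(12000,
200000000) gives 3 cycles. If the rate were reported as 50 MHz, the same
call gives 1 cycle, which on a real 200 MHz clock lasts only 5000 ps. So
a rate reported lower than the real one would yield timings that are too
short.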



> > > PS: Here is a snippet from an older mail I didn't send to the list yet
> > > which might be superseded by now:
> > > To get into this matter i started reading the "Intel Cyclone V HPS TRM"
> > > Section 13-20 Preserving Bad Block Markers:
> > > "You can configure the NAND flash controller to skip over a specified
> > > number of bytes when it writes the last sector in a page to the spare
> > > area. This option write the desired offset to the spare_area_skip_bytes
> > > register in the config group. For example, if the device page size is 2
> > > KB, and the device area, set the spare_area_skip_bytes register to 2.
> > > When the flash controller writes the last sector of the page that
> > > overlaps with the spare area, it spare_area_skip_bytes must be an even
> > > number. For example, if the bad block marker is a single byte, set
> > > spare_area_skip_bytes to 2."
> >
> > I did not know about this documentation.
> >
> > It says "For example" (twice),
> > so it sounds uncertain to me, though.
> >
> > Anyway, an Intel engineer checked the boot ROM code.
> > SPARE_AREA_SKIP_BYTES=2 is correct, he said.
> As far as I understand the documentation, it must be a multiple of 2. Most
> NAND flashes I know need one byte for bad block marking, so 2 seems to be a
> pretty sane value.


Most NAND flashes, but not all.

See the "Bad Block Location" in this page:

http://www.linux-mtd.infradead.org/nand-data/nanddata.html



Many devices have the BBM at the 1st byte/word,
but there are devices that have it at the 6th byte.

SPARE_AREA_SKIP_BYTES=2 for SOCFPGA
would corrupt a BBM located at the 6th byte.
So, such a device is probably not used
on SOCFPGA boards.

I am guessing that is why the UniPhier platform
adopted SPARE_AREA_SKIP_BYTES=8.
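
As a rough sketch of the reasoning above (the function name and the
zero-based offsets are my assumptions, not anything from the driver or
the TRM): a marker survives only if it lies entirely inside the skipped
region at the start of the spare area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: is a bad block marker of bbm_len bytes at
 * zero-based spare-area offset bbm_offset fully inside the region
 * that the controller skips?
 */
static inline bool bbm_preserved(size_t skip_bytes, size_t bbm_offset,
				 size_t bbm_len)
{
	assert(bbm_len > 0);		/* a marker occupies at least one byte */
	assert(skip_bytes % 2 == 0);	/* the quoted TRM text requires an even value */
	return bbm_offset + bbm_len <= skip_bytes;
}
```

Taking the "6th byte" as zero-based offset 5: bbm_preserved(2, 0, 1) is
true and bbm_preserved(2, 5, 1) is false, while bbm_preserved(8, 5, 1)
is true. That matches the guess that a skip of 8 covers both marker
positions.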





> The documentation also explains why the boot ROM's default value of
> spare_area_skip_bytes=0 is a little unfortunate: the ECC values might
> spill into the spare area where the bad block marker of the NAND is
> located.




-- 
Best Regards
Masahiro Yamada
