* mtd raw nand denali.c broken for Intel/Altera Cyclone V
@ 2019-09-06 12:38 Tim Sander
  2019-09-10  7:16 ` Masahiro Yamada
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Sander @ 2019-09-06 12:38 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Miquel Raynal, Richard Weinberger, David Woodhouse, Brian Norris,
	Marek Vasut, Vignesh Raghavendra, linux-mtd, linux-kernel,
	Dinh Nguyen

Hi

I have noticed that there are multiple breakages piling up for the denali nand
driver on the Intel/Altera Cyclone V. Unfortunately I had no time to track the
mainline kernel closely, so the breakage seems to pile up. I am a little
disappointed that Intel is not on the lookout that the kernel works on the
chips they are selling. I was really happy about the state of the platform
before concerning mainline support.

The failure starts with kernel 4.19 or stable kernel release 4.18.19. The
commit is ba4a1b62a2d742df9e9c607ac53b3bf33496508f. The problem here is that
our platform works with a zero in the SPARE_AREA_SKIP_BYTES register. But in
this case the patch assumes the default value 8, which is outright wrong
on this variant. Without this patch reverted, all blocks of the nand flash are
being marked bad :-(.

When reverting the patch ba4a1b62a2d742df9e9c607ac53b3bf33496508f I can boot
4.19.10 again.

With 5.0 it goes further down the drain and I didn't manage to boot it
even with the above patch reverted.

I also tried 5.3-rc7 with the above patch reverted and the variable t_x dirty
hacked to the value 0x1388, as I got the impression that the timing calculation
is off too. I still get an interrupt error and boot failure:

[ 0.817588] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
[ 0.823946] nand: Micron MT29F2G08ABAEAWP
[ 0.827965] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[ 1.887052] denali-nand-dt ff900000.nand: timeout while waiting for irq 0x1000
[ 2.911056] denali-nand-dt ff900000.nand: timeout while waiting for irq 0x1000

I have seen this https://lore.kernel.org/patchwork/patch/983055/ thread and
this might fix at least the 4.19 boot problem.

I would be really happy for hints on how to get the Intel Cyclone V working again.

Best regards
Tim

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
  2019-09-06 12:38 mtd raw nand denali.c broken for Intel/Altera Cyclone V Tim Sander
@ 2019-09-10  7:16 ` Masahiro Yamada
  2019-09-10 13:48   ` Tim Sander
  0 siblings, 1 reply; 14+ messages in thread
From: Masahiro Yamada @ 2019-09-10 7:16 UTC (permalink / raw)
  To: Tim Sander
  Cc: Vignesh Raghavendra, Dinh Nguyen, Richard Weinberger,
	Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal,
	Brian Norris, David Woodhouse

On Fri, Sep 6, 2019 at 9:39 PM Tim Sander <tim@krieglstein.org> wrote:
>
> Hi
>
> I have noticed that there multiple breakages piling up for the denali nand
> driver on the Intel/Altera Cyclone V. Unfortunately i had no time to track the
> mainline kernel closely. So the breakage seems to pile up. I am a little
> disapointed that Intel is not on the lookout that the kernel works on the
> chips they are selling. I was really happy about the state of the platform
> before concerning mainline support.
>
> The failure starts with kernel 4.19 or stable kernel release 4.18.19. The
> commit is ba4a1b62a2d742df9e9c607ac53b3bf33496508f.

Just for clarification, this corresponds to
0d55c668b218a1db68b5044bce4de74e1bd0f0c8 upstream.

> The problem here is that
> our platform works with a zero in the SPARE_AREA_SKIP_BYTES register.

Please clarify the scope of "our platform".
(Only you, or your company, or every individual using this chip?)

First, SPARE_AREA_SKIP_BYTES is not a property of the hardware.
Rather, it is about the OOB layout; in other words, this parameter
is defined by software.

For example, U-Boot supports the Denali NAND driver.
There, SPARE_AREA_SKIP_BYTES is a user-configurable parameter:
https://github.com/u-boot/u-boot/blob/v2019.10-rc3/drivers/mtd/nand/raw/Kconfig#L112

Your platform works with a zero in the SPARE_AREA_SKIP_BYTES register
because the NAND chip on the board was initialized with zero
set in the SPARE_AREA_SKIP_BYTES register.

If the NAND chip had been initialized with 8
set in the SPARE_AREA_SKIP_BYTES register, it would be
working with 8 in SPARE_AREA_SKIP_BYTES.

The Boot ROM is the only (semi-)software that is unconfigurable by users,
so the value of SPARE_AREA_SKIP_BYTES should be aligned with
the boot ROM.
I recommend checking the spec of the boot ROM.

(The maintainer of the platform, Dinh, is CC'ed,
so I hope he will jump in)

Second, I doubt 0 is a good value for SPARE_AREA_SKIP_BYTES.

As explained in the commit log, SPARE_AREA_SKIP_BYTES==0 means
the OOB is used for ECC without any offset.
So, the BBM marked in the factory will be destroyed.

> But in
> this case the patch assumes the default value 8 which is straight out wrong
> on this variant. Without this patch reverted all blocks of the nand flash are
> beeing marked bad :-(.
>
> When reverting the patch ba4a1b62a2d742df9e9c607ac53b3bf33496508f i can boot
> 4.19.10 again.
>
> With 5.0 the it goes further down the drain and i didn't manage to boot it
> even with the above patch reverted.
>
> I also tried 5.3-rc7 with the above patch reverted and the variable t_x dirty hacked to the
> value 0x1388 as i got the impression that the timing calculation is off too. I still get an
> interrupt error and boot failure:

git-bisect is a general way to pinpoint the problem.

BTW, if you end up hacking the clock frequency, something is already wrong.

denali->clk_rate, denali->clk_x_rate should be 50MHz, 200MHz, respectively.

If not, please check the clock driver and your DT.

> [ 0.817588] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> [ 0.823946] nand: Micron MT29F2G08ABAEAWP
> [ 0.827965] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> [ 1.887052] denali-nand-dt ff900000.nand: timeout while waiting for irq 0x1000
> [ 2.911056] denali-nand-dt ff900000.nand: timeout while waiting for irq 0x1000
>
> I have seen this https://lore.kernel.org/patchwork/patch/983055/ thread and
> this might fix at least the 4.19 boot problem.
>
> I would be really happy for hints how to get the Intel Cyclone V working again.
>
> Best regards
> Tim
>
>
>
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 14+ messages in thread
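The point about SPARE_AREA_SKIP_BYTES==0 destroying the factory bad block marker can be illustrated with a small model of the OOB layout. This is only a sketch, not code from denali.c: the struct and function names are hypothetical, and it assumes the factory BBM sits at the very start of the spare area and that the controller places ECC bytes immediately after the skipped bytes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one page's spare (OOB) area; not a driver structure. */
struct oob_layout {
	unsigned int oob_size;   /* spare bytes per page, e.g. 64 for the chip in the log */
	unsigned int skip_bytes; /* value programmed into SPARE_AREA_SKIP_BYTES */
	unsigned int bbm_len;    /* bad block marker length, assumed to start at OOB offset 0 */
};

/*
 * ECC is assumed to start right after the skipped bytes, so it overwrites
 * the marker whenever fewer bytes are skipped than the marker occupies.
 */
static bool ecc_overlaps_bbm(const struct oob_layout *l)
{
	assert(l->skip_bytes <= l->oob_size);
	return l->skip_bytes < l->bbm_len;
}
```

With a one-byte marker, a skip value of 0 reports an overlap, while 2 or 8 does not. Which non-zero value is right for a given board still depends on how the flash was originally written, as discussed above.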
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V 2019-09-10 7:16 ` Masahiro Yamada @ 2019-09-10 13:48 ` Tim Sander 2019-09-10 15:22 ` Dinh Nguyen 0 siblings, 1 reply; 14+ messages in thread From: Tim Sander @ 2019-09-10 13:48 UTC (permalink / raw) To: Masahiro Yamada Cc: Vignesh Raghavendra, Dinh Nguyen, Richard Weinberger, Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse Hi I have noticed that my SPF records where not in place after moving the server, so it seems the mail didn't go to the mailing list. Hopefully that's fixed now. Am Dienstag, 10. September 2019, 09:16:37 CEST schrieb Masahiro Yamada: > On Fri, Sep 6, 2019 at 9:39 PM Tim Sander <tim@krieglstein.org> wrote: > > Hi > > > > I have noticed that there multiple breakages piling up for the denali nand > > driver on the Intel/Altera Cyclone V. Unfortunately i had no time to track > > the mainline kernel closely. So the breakage seems to pile up. I am a > > little disapointed that Intel is not on the lookout that the kernel works > > on the chips they are selling. I was really happy about the state of the > > platform before concerning mainline support. > > > > The failure starts with kernel 4.19 or stable kernel release 4.18.19. The > > commit is ba4a1b62a2d742df9e9c607ac53b3bf33496508f. > > Just for clarification, this corresponds to > 0d55c668b218a1db68b5044bce4de74e1bd0f0c8 upstream. > > > The problem here is that > > our platform works with a zero in the SPARE_AREA_SKIP_BYTES register. > > Please clarify the scope of "our platform". > (Only you, or your company, or every individual using this chip?) The company i work for uses this chip as a base for multiple products. > First, SPARE_AREA_SKIP_BYTES is not the property of the hardware. > Rather, it is about the OOB layout, in other words, this parameter > is defined by software. > > For example, U-Boot supports the Denali NAND driver. 
> The SPARE_AREA_SKIP_BYTES is a user-configurable parameter: > https://github.com/u-boot/u-boot/blob/v2019.10-rc3/drivers/mtd/nand/raw/Kcon > fig#L112 > > > Your platform works with a zero in the SPARE_AREA_SKIP_BYTES register > because the NAND chip on the board was initialized with a zero > set to the SPARE_AREA_SKIP_BYTES register. > > If the NAND chip had been initialized with 8 > set to the SPARE_AREA_SKIP_BYTES register, it would have > been working with 8 to the SPARE_AREA_SKIP_BYTES. > > The Boot ROM is the only (semi-)software that is unconfigurable by users, > so the value of SPARE_AREA_SKIP_BYTES should be aligned with > the boot ROM. > I recommend you to check the spec of the boot ROM. We boot from NOR flash. That's why i didn't see a problem booting probably. > (The maintainer of the platform, Dihn is CC'ed, > so I hope he will jump in) Yes i hope so too. > Second, I doubt 0 is a good value for SPARE_AREA_SKIP_BYTES. > > As explained in commit log, SPARE_AREA_SKIP_BYTES==0 means > the OOB is used for ECC without any offset. > So, the BBM marked in the factory will be destroyed. Oh my! Thats bad news. > > But in > > this case the patch assumes the default value 8 which is straight out > > wrong on this variant. Without this patch reverted all blocks of the nand > > flash are beeing marked bad :-(. > > > > When reverting the patch ba4a1b62a2d742df9e9c607ac53b3bf33496508f i can > > boot 4.19.10 again. > > > > With 5.0 the it goes further down the drain and i didn't manage to boot it > > even with the above patch reverted. > > > > I also tried 5.3-rc7 with the above patch reverted and the variable t_x > > dirty hacked to the value 0x1388 as i got the impression that the timing > > calculation is off too. I still get an > > interrupt error and boot failure: > git-bisect is a general solution to pin point the problem. > > BTW, if you end up with hacking the clock frequency, something is already > wrong. 
This was just a dirty hack to verify that this is the problem.

> denali->clk_rate, denali->clk_x_rate should be 50MHz, 200MHz, respectively.
>
> If not, please check the clock driver and your DT.

We include the device tree file for this chip directly from the kernel sources,
which means that we are using the settings from the kernel tree in
linux-5.3-rc8/arch/arm/boot/dts/socfpga.dtsi

The dts entries taken verbatim from the above file are:

		nand0: nand@ff900000 {
			#address-cells = <0x1>;
			#size-cells = <0x1>;
			compatible = "altr,socfpga-denali-nand";
			reg = <0xff900000 0x100000>,
			      <0xffb80000 0x10000>;
			reg-names = "nand_data", "denali_reg";
			interrupts = <0x0 0x90 0x4>;
			clocks = <&nand_clk>, <&nand_x_clk>, <&nand_ecc_clk>;
			clock-names = "nand", "nand_x", "ecc";
			resets = <&rst NAND_RESET>;
			status = "disabled";
		};

		nand_ecc_clk: nand_ecc_clk {
			#clock-cells = <0>;
			compatible = "altr,socfpga-gate-clk";
			clocks = <&nand_x_clk>;
			clk-gate = <0xa0 9>;
		};

		nand_clk: nand_clk {
			#clock-cells = <0>;
			compatible = "altr,socfpga-gate-clk";
			clocks = <&nand_x_clk>;
			clk-gate = <0xa0 10>;
			fixed-divider = <4>;
		};

		nand_x_clk: nand_x_clk {
			#clock-cells = <0>;
			compatible = "altr,socfpga-gate-clk";
			clocks = <&f2s_periph_ref_clk>, <&main_nand_sdmmc_clk>, <&per_nand_mmc_clk>;
			clk-gate = <0xa0 9>;
		};

		f2s_periph_ref_clk: f2s_periph_ref_clk {
			#clock-cells = <0>;
			compatible = "fixed-clock";
		};

		main_nand_sdmmc_clk: main_nand_sdmmc_clk@58 {
			#clock-cells = <0>;
			compatible = "altr,socfpga-perip-clk";
			clocks = <&main_pll>;
			reg = <0x58>;
		};

		per_nand_mmc_clk: per_nand_mmc_clk@94 {
			#clock-cells = <0>;
			compatible = "altr,socfpga-perip-clk";
			clocks = <&periph_pll>;
			reg = <0x94>;
		};

		main_pll: main_pll@40 {
			#address-cells = <1>;
			#size-cells = <0>;
			#clock-cells = <0>;
			compatible = "altr,socfpga-pll-clock";
			clocks = <&osc1>;
			reg = <0x40>;
			...
		};

		periph_pll: periph_pll@80 {
			#address-cells = <1>;
			#size-cells = <0>;
			#clock-cells = <0>;
			compatible = "altr,socfpga-pll-clock";
			clocks = <&osc1>, <&osc2>, <&f2s_periph_ref_clk>;
			reg = <0x80>;
			...
		};

and from file: linux-5.3-rc8/arch/arm/boot/dts/socfpga_cyclone5.dtsi

	clkmgr@ffd04000 {
		clocks {
			osc1 {
				clock-frequency = <25000000>;
			};
		};
	};

So basically it boils down to osc1 being set to 25MHz, while osc2 and
f2s_periph_ref_clk have an undefined frequency? Currently I have no idea which
frequency in the driver results from the undefined frequencies in the device
tree. But the base frequency is at least nowhere near the 50MHz and 200MHz
you mentioned.

Best regards
Tim

Below is the hack to get the platform booting again, with the timings we
need in this case:

Subject: [PATCH 2/2] denali: hack: overwrite setup values

---
 drivers/mtd/nand/raw/denali.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/mtd/nand/raw/denali.c b/drivers/mtd/nand/raw/denali.c
index 5bfaa3863dbb..7b8bc9920f17 100644
--- a/drivers/mtd/nand/raw/denali.c
+++ b/drivers/mtd/nand/raw/denali.c
@@ -887,6 +887,15 @@ static int denali_setup_data_interface(struct nand_chip *chip, int chipnr,
 	tmp |= FIELD_PREP(CS_SETUP_CNT__VALUE, cs_setup);
 	sel->cs_setup_cnt = tmp;
 
+	sel->acc_clks = 0x4;
+	sel->re_2_re = 0x14;
+	sel->re_2_we = 0x14;
+	sel->tcwaw_and_addr_2_data = 0x3f;
+	sel->hwhr2_and_we_2_re = 0x14;
+	sel->rdwr_en_hi_cnt = 2;
+	sel->rdwr_en_lo_cnt = 4;
+	sel->cs_setup_cnt = 1;
+
 	return 0;
 }
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread
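The numbers in this exchange can be cross-checked with a little arithmetic. The sketch below is not driver code and its helper names are made up for illustration. It assumes that t_x denotes the period of the nand_x clock in picoseconds (an assumption about the driver's units, not something stated in the thread), and that the fixed-divider property on nand_clk simply divides the nand_x_clk rate.

```c
#include <assert.h>
#include <stdint.h>

/* Period of a clock in picoseconds, rounded down. */
static uint64_t clk_period_ps(uint64_t rate_hz)
{
	assert(rate_hz > 0);
	return 1000000000000ULL / rate_hz;
}

/* Rate of a child clock behind a fixed divider such as fixed-divider = <4>. */
static uint64_t divided_rate_hz(uint64_t parent_hz, unsigned int divider)
{
	assert(divider > 0);
	return parent_hz / divider;
}
```

Under those assumptions, a 200 MHz nand_x clock has a period of 5000 ps, which is exactly the hacked t_x value 0x1388, and dividing 200 MHz by 4 gives the 50 MHz expected for the nand clock. So the hack is consistent with the clock framework reporting something other than 200 MHz for nand_x on this board, which would point at the clock driver or DT rather than at the timing calculation itself.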
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V 2019-09-10 13:48 ` Tim Sander @ 2019-09-10 15:22 ` Dinh Nguyen 2019-09-11 2:37 ` Masahiro Yamada 0 siblings, 1 reply; 14+ messages in thread From: Dinh Nguyen @ 2019-09-10 15:22 UTC (permalink / raw) To: Tim Sander, Masahiro Yamada Cc: Vignesh Raghavendra, Richard Weinberger, Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse On 9/10/19 8:48 AM, Tim Sander wrote: > Hi > > I have noticed that my SPF records where not in place after moving the server, > so it seems the mail didn't go to the mailing list. Hopefully that's fixed now. > > Am Dienstag, 10. September 2019, 09:16:37 CEST schrieb Masahiro Yamada: >> On Fri, Sep 6, 2019 at 9:39 PM Tim Sander <tim@krieglstein.org> wrote: >>> Hi >>> >>> I have noticed that there multiple breakages piling up for the denali nand >>> driver on the Intel/Altera Cyclone V. Unfortunately i had no time to track >>> the mainline kernel closely. So the breakage seems to pile up. I am a >>> little disapointed that Intel is not on the lookout that the kernel works >>> on the chips they are selling. I was really happy about the state of the >>> platform before concerning mainline support. >>> >>> The failure starts with kernel 4.19 or stable kernel release 4.18.19. The >>> commit is ba4a1b62a2d742df9e9c607ac53b3bf33496508f. >> >> Just for clarification, this corresponds to >> 0d55c668b218a1db68b5044bce4de74e1bd0f0c8 upstream. >> >>> The problem here is that >>> our platform works with a zero in the SPARE_AREA_SKIP_BYTES register. >> >> Please clarify the scope of "our platform". >> (Only you, or your company, or every individual using this chip?) > The company i work for uses this chip as a base for multiple products. > >> First, SPARE_AREA_SKIP_BYTES is not the property of the hardware. >> Rather, it is about the OOB layout, in other words, this parameter >> is defined by software. 
>> >> For example, U-Boot supports the Denali NAND driver. >> The SPARE_AREA_SKIP_BYTES is a user-configurable parameter: >> https://github.com/u-boot/u-boot/blob/v2019.10-rc3/drivers/mtd/nand/raw/Kcon >> fig#L112 >> >> >> Your platform works with a zero in the SPARE_AREA_SKIP_BYTES register >> because the NAND chip on the board was initialized with a zero >> set to the SPARE_AREA_SKIP_BYTES register. >> >> If the NAND chip had been initialized with 8 >> set to the SPARE_AREA_SKIP_BYTES register, it would have >> been working with 8 to the SPARE_AREA_SKIP_BYTES. >> >> The Boot ROM is the only (semi-)software that is unconfigurable by users, >> so the value of SPARE_AREA_SKIP_BYTES should be aligned with >> the boot ROM. >> I recommend you to check the spec of the boot ROM. > We boot from NOR flash. That's why i didn't see a problem booting probably. > >> (The maintainer of the platform, Dihn is CC'ed, >> so I hope he will jump in) > Yes i hope so too. > I don't have access to a NAND device at the moment. I'll try to find one and debug. Dinh ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V 2019-09-10 15:22 ` Dinh Nguyen @ 2019-09-11 2:37 ` Masahiro Yamada 2019-09-11 7:27 ` Tim Sander 2019-09-26 9:10 ` Tim Sander 0 siblings, 2 replies; 14+ messages in thread From: Masahiro Yamada @ 2019-09-11 2:37 UTC (permalink / raw) To: Dinh Nguyen Cc: Tim Sander, Vignesh Raghavendra, Richard Weinberger, Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse Hi Dinh, On Wed, Sep 11, 2019 at 12:22 AM Dinh Nguyen <dinguyen@kernel.org> wrote: > > > > On 9/10/19 8:48 AM, Tim Sander wrote: > > Hi > > > > I have noticed that my SPF records where not in place after moving the server, > > so it seems the mail didn't go to the mailing list. Hopefully that's fixed now. > > > > Am Dienstag, 10. September 2019, 09:16:37 CEST schrieb Masahiro Yamada: > >> On Fri, Sep 6, 2019 at 9:39 PM Tim Sander <tim@krieglstein.org> wrote: > >>> Hi > >>> > >>> I have noticed that there multiple breakages piling up for the denali nand > >>> driver on the Intel/Altera Cyclone V. Unfortunately i had no time to track > >>> the mainline kernel closely. So the breakage seems to pile up. I am a > >>> little disapointed that Intel is not on the lookout that the kernel works > >>> on the chips they are selling. I was really happy about the state of the > >>> platform before concerning mainline support. > >>> > >>> The failure starts with kernel 4.19 or stable kernel release 4.18.19. The > >>> commit is ba4a1b62a2d742df9e9c607ac53b3bf33496508f. > >> > >> Just for clarification, this corresponds to > >> 0d55c668b218a1db68b5044bce4de74e1bd0f0c8 upstream. > >> > >>> The problem here is that > >>> our platform works with a zero in the SPARE_AREA_SKIP_BYTES register. > >> > >> Please clarify the scope of "our platform". > >> (Only you, or your company, or every individual using this chip?) > > The company i work for uses this chip as a base for multiple products. 
> > > >> First, SPARE_AREA_SKIP_BYTES is not the property of the hardware. > >> Rather, it is about the OOB layout, in other words, this parameter > >> is defined by software. > >> > >> For example, U-Boot supports the Denali NAND driver. > >> The SPARE_AREA_SKIP_BYTES is a user-configurable parameter: > >> https://github.com/u-boot/u-boot/blob/v2019.10-rc3/drivers/mtd/nand/raw/Kcon > >> fig#L112 > >> > >> > >> Your platform works with a zero in the SPARE_AREA_SKIP_BYTES register > >> because the NAND chip on the board was initialized with a zero > >> set to the SPARE_AREA_SKIP_BYTES register. > >> > >> If the NAND chip had been initialized with 8 > >> set to the SPARE_AREA_SKIP_BYTES register, it would have > >> been working with 8 to the SPARE_AREA_SKIP_BYTES. > >> > >> The Boot ROM is the only (semi-)software that is unconfigurable by users, > >> so the value of SPARE_AREA_SKIP_BYTES should be aligned with > >> the boot ROM. > >> I recommend you to check the spec of the boot ROM. > > We boot from NOR flash. That's why i didn't see a problem booting probably. > > > >> (The maintainer of the platform, Dihn is CC'ed, > >> so I hope he will jump in) > > Yes i hope so too. > > > > I don't have access to a NAND device at the moment. I'll try to find one > and debug. > > Dinh Dinh, Do you have answers for the following questions? - Does the SOCFPGA boot ROM support the NAND boot mode? - If so, which value does it use for SPARE_AREA_SKIP_BYTES? -- Best Regards Masahiro Yamada ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
  2019-09-11  2:37 ` Masahiro Yamada
@ 2019-09-11  7:27 ` Tim Sander
  2019-09-26  9:10 ` Tim Sander
  1 sibling, 0 replies; 14+ messages in thread
From: Tim Sander @ 2019-09-11 7:27 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Dinh Nguyen, Vignesh Raghavendra, Richard Weinberger,
	Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal,
	Brian Norris, David Woodhouse

Hi

On Wednesday, 11 September 2019, 04:37:46 CEST, Masahiro Yamada wrote:
> - Does the SOCFPGA boot ROM support the NAND boot mode?

The Cyclone V HPS TRM, section "A3 Booting and Configuration", lists QSPI,
SD/MMC and NAND as boot sources.

> - If so, which value does it use for SPARE_AREA_SKIP_BYTES?

I have no idea about this one.

Tim

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V 2019-09-11 2:37 ` Masahiro Yamada 2019-09-11 7:27 ` Tim Sander @ 2019-09-26 9:10 ` Tim Sander 2019-09-26 17:47 ` Masahiro Yamada 1 sibling, 1 reply; 14+ messages in thread From: Tim Sander @ 2019-09-26 9:10 UTC (permalink / raw) To: Masahiro Yamada Cc: Dinh Nguyen, Vignesh Raghavendra, Richard Weinberger, Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse Hi Am Mittwoch, 11. September 2019, 04:37:46 CEST schrieb Masahiro Yamada: > Hi Dinh, > > On Wed, Sep 11, 2019 at 12:22 AM Dinh Nguyen <dinguyen@kernel.org> wrote: > > On 9/10/19 8:48 AM, Tim Sander wrote: > > > Hi > > > > > > I have noticed that my SPF records where not in place after moving the > > > server, so it seems the mail didn't go to the mailing list. Hopefully > > > that's fixed now.> > > > > Am Dienstag, 10. September 2019, 09:16:37 CEST schrieb Masahiro Yamada: > > >> On Fri, Sep 6, 2019 at 9:39 PM Tim Sander <tim@krieglstein.org> wrote: > > >>> Hi > > >>> > > >>> I have noticed that there multiple breakages piling up for the denali > > >>> nand > > >>> driver on the Intel/Altera Cyclone V. Unfortunately i had no time to > > >>> track > > >>> the mainline kernel closely. So the breakage seems to pile up. I am a > > >>> little disapointed that Intel is not on the lookout that the kernel > > >>> works > > >>> on the chips they are selling. I was really happy about the state of > > >>> the > > >>> platform before concerning mainline support. > > >>> > > >>> The failure starts with kernel 4.19 or stable kernel release 4.18.19. > > >>> The > > >>> commit is ba4a1b62a2d742df9e9c607ac53b3bf33496508f. > > >> > > >> Just for clarification, this corresponds to > > >> 0d55c668b218a1db68b5044bce4de74e1bd0f0c8 upstream. > > >> > > >>> The problem here is that > > >>> our platform works with a zero in the SPARE_AREA_SKIP_BYTES register. > > >> > > >> Please clarify the scope of "our platform". 
> > >> (Only you, or your company, or every individual using this chip?) > > > > > > The company i work for uses this chip as a base for multiple products. > > > > > >> First, SPARE_AREA_SKIP_BYTES is not the property of the hardware. > > >> Rather, it is about the OOB layout, in other words, this parameter > > >> is defined by software. > > >> > > >> For example, U-Boot supports the Denali NAND driver. > > >> The SPARE_AREA_SKIP_BYTES is a user-configurable parameter: > > >> https://github.com/u-boot/u-boot/blob/v2019.10-rc3/drivers/mtd/nand/raw > > >> /Kcon fig#L112 I am using barebox for booting. I looked at the code and found a comment in denali_hw_init: * tell driver how many bit controller will skip before * writing ECC code in OOB, this register may be already * set by firmware. So we read this value out. * if this value is 0, just let it be. I have checked the barebox code and the denali register SPARE_AREA_SKIP_BYTES (offset 0x230) is read only once on booting. I have not found any occurrence of the register being set by barebox. So i would concur as the value is zero in my case that the boot ROM seems not to set the value. The code in barebox is mostly imported from linux in 2015 which is before the reorganization which happened on the linux side later on. > > >> > > >> > > >> Your platform works with a zero in the SPARE_AREA_SKIP_BYTES register > > >> because the NAND chip on the board was initialized with a zero > > >> set to the SPARE_AREA_SKIP_BYTES register. > > >> > > >> If the NAND chip had been initialized with 8 > > >> set to the SPARE_AREA_SKIP_BYTES register, it would have > > >> been working with 8 to the SPARE_AREA_SKIP_BYTES. > > >> > > >> The Boot ROM is the only (semi-)software that is unconfigurable by > > >> users, > > >> so the value of SPARE_AREA_SKIP_BYTES should be aligned with > > >> the boot ROM. > > >> I recommend you to check the spec of the boot ROM. > > > > > > We boot from NOR flash. 
That's why i didn't see a problem booting > > > probably. > > > > > >> (The maintainer of the platform, Dihn is CC'ed, > > >> so I hope he will jump in) > > > > > > Yes i hope so too. > > > > I don't have access to a NAND device at the moment. I'll try to find one > > and debug. I have hardware available to me, so i would be happy to test any ideas/ guesses. > Dinh, > Do you have answers for the following questions? > > > - Does the SOCFPGA boot ROM support the NAND boot mode? > > - If so, which value does it use for SPARE_AREA_SKIP_BYTES? Best regards Tim ^ permalink raw reply [flat|nested] 14+ messages in thread
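The two behaviours contrasted in this thread can be written down side by side. The following is a sketch only: the function names and the macro are hypothetical, and the two policies are modelled from the descriptions in these mails (the barebox comment "if this value is 0, just let it be", and the report that the patched Linux driver assumes a default of 8), not copied from either code base.

```c
#include <assert.h>

/* Default that, according to this thread, the patched driver assumes. */
#define DEFAULT_OOB_SKIP_BYTES 8u

/* Policy in the quoted barebox comment: keep whatever firmware left, even 0. */
static unsigned int skip_bytes_keep_firmware(unsigned int reg_value)
{
	return reg_value;
}

/* Policy attributed to the patched Linux driver: treat 0 as "unset". */
static unsigned int skip_bytes_with_default(unsigned int reg_value)
{
	return reg_value ? reg_value : DEFAULT_OOB_SKIP_BYTES;
}
```

For a register value of 0 the first policy yields 0 and the second yields 8, so flash written under one policy is read with a different OOB layout under the other. That would be consistent with all blocks showing up as bad after the change, although the thread does not confirm the exact mechanism.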
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V 2019-09-26 9:10 ` Tim Sander @ 2019-09-26 17:47 ` Masahiro Yamada 2020-01-10 16:46 ` Tim Sander 0 siblings, 1 reply; 14+ messages in thread From: Masahiro Yamada @ 2019-09-26 17:47 UTC (permalink / raw) To: Tim Sander Cc: Dinh Nguyen, Vignesh Raghavendra, Richard Weinberger, Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse Hi Tim, On Thu, Sep 26, 2019 at 6:10 PM Tim Sander <tim@krieglstein.org> wrote: > > Hi > > Am Mittwoch, 11. September 2019, 04:37:46 CEST schrieb Masahiro Yamada: > > Hi Dinh, > > > > On Wed, Sep 11, 2019 at 12:22 AM Dinh Nguyen <dinguyen@kernel.org> wrote: > > > On 9/10/19 8:48 AM, Tim Sander wrote: > > > > Hi > > > > > > > > I have noticed that my SPF records where not in place after moving the > > > > server, so it seems the mail didn't go to the mailing list. Hopefully > > > > that's fixed now.> > > > > > Am Dienstag, 10. September 2019, 09:16:37 CEST schrieb Masahiro Yamada: > > > >> On Fri, Sep 6, 2019 at 9:39 PM Tim Sander <tim@krieglstein.org> wrote: > > > >>> Hi > > > >>> > > > >>> I have noticed that there multiple breakages piling up for the denali > > > >>> nand > > > >>> driver on the Intel/Altera Cyclone V. Unfortunately i had no time to > > > >>> track > > > >>> the mainline kernel closely. So the breakage seems to pile up. I am a > > > >>> little disapointed that Intel is not on the lookout that the kernel > > > >>> works > > > >>> on the chips they are selling. I was really happy about the state of > > > >>> the > > > >>> platform before concerning mainline support. > > > >>> > > > >>> The failure starts with kernel 4.19 or stable kernel release 4.18.19. > > > >>> The > > > >>> commit is ba4a1b62a2d742df9e9c607ac53b3bf33496508f. > > > >> > > > >> Just for clarification, this corresponds to > > > >> 0d55c668b218a1db68b5044bce4de74e1bd0f0c8 upstream. 
> > > >> > > > >>> The problem here is that > > > >>> our platform works with a zero in the SPARE_AREA_SKIP_BYTES register. > > > >> > > > >> Please clarify the scope of "our platform". > > > >> (Only you, or your company, or every individual using this chip?) > > > > > > > > The company i work for uses this chip as a base for multiple products. > > > > > > > >> First, SPARE_AREA_SKIP_BYTES is not the property of the hardware. > > > >> Rather, it is about the OOB layout, in other words, this parameter > > > >> is defined by software. > > > >> > > > >> For example, U-Boot supports the Denali NAND driver. > > > >> The SPARE_AREA_SKIP_BYTES is a user-configurable parameter: > > > >> https://github.com/u-boot/u-boot/blob/v2019.10-rc3/drivers/mtd/nand/raw > > > >> /Kcon fig#L112 > I am using barebox for booting. I looked at the code and found a comment in > denali_hw_init: > * tell driver how many bit controller will skip before > * writing ECC code in OOB, this register may be already > * set by firmware. So we read this value out. > * if this value is 0, just let it be. > > I have checked the barebox code and the denali register SPARE_AREA_SKIP_BYTES > (offset 0x230) is read only once on booting. I have not found any occurrence of > the register being set by barebox. So i would concur as the value is zero in > my case that the boot ROM seems not to set the value. The code in barebox is > mostly imported from linux in 2015 which is before the reorganization which > happened on the linux side later on. > > > > >> > > > >> > > > >> Your platform works with a zero in the SPARE_AREA_SKIP_BYTES register > > > >> because the NAND chip on the board was initialized with a zero > > > >> set to the SPARE_AREA_SKIP_BYTES register. > > > >> > > > >> If the NAND chip had been initialized with 8 > > > >> set to the SPARE_AREA_SKIP_BYTES register, it would have > > > >> been working with 8 to the SPARE_AREA_SKIP_BYTES. 
> > > >> > > > >> The Boot ROM is the only (semi-)software that is unconfigurable by > > > >> users, > > > >> so the value of SPARE_AREA_SKIP_BYTES should be aligned with > > > >> the boot ROM. > > > >> I recommend you to check the spec of the boot ROM. > > > > > > > > We boot from NOR flash. That's why i didn't see a problem booting > > > > probably. > > > > > > > >> (The maintainer of the platform, Dihn is CC'ed, > > > >> so I hope he will jump in) > > > > > > > > Yes i hope so too. > > > > > > I don't have access to a NAND device at the moment. I'll try to find one > > > and debug. > I have hardware available to me, so i would be happy to test any ideas/ > guesses. You previously mentioned, "We boot from NOR flash. That's why i didn't see a problem booting probably." Could you try the NAND device as the boot source? - Flash the boot image into the NAND device, changing the value for SPARE_AREA_SKIP_BYTES. - Please find out the appropriate value for SPARE_AREA_SKIP_BYTES for booting successfully. -- Best Regards Masahiro Yamada ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
  2019-09-26 17:47 ` Masahiro Yamada
@ 2020-01-10 16:46 ` Tim Sander
  2020-01-10 17:13   ` Marek Vasut
  2020-01-10 19:05   ` Masahiro Yamada
  0 siblings, 2 replies; 14+ messages in thread
From: Tim Sander @ 2020-01-10 16:46 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Dinh Nguyen, Vignesh Raghavendra, Richard Weinberger,
	Linux Kernel Mailing List, Marek Vasut, linux-mtd, Miquel Raynal,
	Brian Norris, David Woodhouse

Hi Masahiro Yamada

Sorry for the large delay. I have seen the patches at
https://lists.infradead.org/pipermail/linux-mtd/2019-December/092852.html
which seem to resolve the question about the spare_area_skip_bytes register.

I have now set the register to 2, which seems to be the right choice on an Intel
SocFPGA. But still I am out of luck trying to boot 5.4.5-rt3 or 5.5-rc5. I get the
following messages during bootup:
[ 1.825590] denali-nand-dt ff900000.nand: timeout while waiting for irq 0x1000
[ 1.832936] denali-nand-dt: probe of ff900000.nand failed with error -5

But the commit c19e31d0a32dd (2017-06-13 22:45:38) predates the 4.19 kernel
release (Mon Oct 22 07:37:37 2018). So it seems there is no obvious commit
which is causing the problem. Looking at the changes, it might be that the timing
calculations in the driver changed, which might also lead to a similar error.

I am booting via NFS; the bootloader is placed in NOR flash.

The corresponding nand dts entry is updated to the new format and looks like this:

	nand@ff900000 {
		#address-cells = <0x1>;
		#size-cells = <0x0>;
		compatible = "altr,socfpga-denali-nand";
		reg = <0xff900000 0x100000 0xffb80000 0x10000>;
		reg-names = "nand_data", "denali_reg";
		interrupts = <0x0 0x90 0x4>;
		clocks = <0x2d 0x1e 0x2e>;
		clock-names = "nand", "nand_x", "ecc";
		resets = <0x6 0x24>;
		status = "okay";

		nand@0 {
			reg = <0x0>;
			#address-cells = <0x1>;
			#size-cells = <0x1>;

			partition@0 {
				label = "work";
				reg = <0x0 0x10000000>;
			};
		};
	};

The last kernel I am able to boot is 4.19.10. I have tried booting:
5.1.21, 5.2.9, 5.3-rc8, 5.4.5-rt3 and 5.5-rc5. They all failed.
Unfortunately the range is quite large for bisecting the problem.

It also occurred to me that all the platforms with Intel Cyclone V in mainline
are development boards which boot from SD card, so they do not exhibit this
problem on their default boot path.

Best regards
Tim

PS: Here is a snippet from an older mail I didn't send to the list yet, which
might be superseded by now:

To get into this matter I started reading the "Intel Cyclone V HPS TRM",
Section 13-20, Preserving Bad Block Markers:

"You can configure the NAND flash controller to skip over a specified number of
bytes when it writes the last sector in a page to the spare area. This option
write the desired offset to the spare_area_skip_bytes register in the config
group. For example, if the device page size is 2 KB, and the device area, set
the spare_area_skip_bytes register to 2. When the flash controller writes the
last sector of the page that overlaps with the spare area, it
spare_area_skip_bytes must be an even number. For example, if the bad block
marker is a single byte, set spare_area_skip_bytes to 2."

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
From: Marek Vasut @ 2020-01-10 17:13 UTC
To: Tim Sander, Masahiro Yamada
Cc: Dinh Nguyen, Vignesh Raghavendra, Richard Weinberger, Linux Kernel Mailing List, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse

On 1/10/20 5:46 PM, Tim Sander wrote:
> Hi Masahiro Yamada

Hi,

> [...]
> The last kernel i am able to boot is 4.19.10. I have tried booting:
> 5.1.21, 5.2.9, 5.3-rc8, 5.4.5-rt3 and 5.5-rc5. They all failed. Unfortunately the
> range is quite large for bisecting the problem. It also occurred to me that
> all the platforms with Intel Cyclone V in mainline are development boards
> which boot from SD-card not exhibiting this problem on their default boot path.

There are also patches for U-Boot which you need to get this whole thing
working, unless you have reset support for the Denali NAND in mainline
Linux. See
https://patchwork.ozlabs.org/project/uboot/list/?series=152289
Sadly, all of the efforts thus far crashed on various review pushback.

-- 
Best regards,
Marek Vasut
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
From: Masahiro Yamada @ 2020-01-10 19:05 UTC
To: Tim Sander
Cc: Vignesh Raghavendra, Marek Vasut, Richard Weinberger, Linux Kernel Mailing List, Dinh Nguyen, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse

On Sat, Jan 11, 2020 at 1:47 AM Tim Sander <tim@krieglstein.org> wrote:
> [...]
> The last kernel i am able to boot is 4.19.10. I have tried booting:
> 5.1.21, 5.2.9, 5.3-rc8, 5.4.5-rt3 and 5.5-rc5. They all failed. Unfortunately the
> range is quite large for bisecting the problem. It also occurred to me that
> all the platforms with Intel Cyclone V in mainline are development boards
> which boot from SD-card not exhibiting this problem on their default boot path.

What will happen if you apply all of these:

http://patchwork.ozlabs.org/project/linux-mtd/list/?series=149821

on top of the mainline kernel,
and then, hack denali->clk_rate and denali->clk_x_rate as follows?

-       denali->clk_rate = clk_get_rate(dt->clk);
-       denali->clk_x_rate = clk_get_rate(dt->clk_x);
+       denali->clk_rate = 50000000;
+       denali->clk_x_rate = 200000000;

If it still fails, what about this?

        denali->clk_rate = 0;
        denali->clk_x_rate = 0;

> PS: Here is some snippet from an older mail i didn't sent to the list yet which
> might be superseded by now:
> To get into this matter i started reading the "Intel Cyclone V HPS TRM"
> Section 13-20 Preserving Bad Block Markers:
> "You can configure the NAND flash controller to skip over a specified number of
> bytes when it writes the last sector in a page to the spare area. This option
> write the desired offset to the spare_area_skip_bytes register in the config
> group. For example, if the device page size is 2 KB, and the device
> area, set the spare_area_skip_bytes register to 2. When the flash controller
> writes the last sector of the page that overlaps with the spare area, it
> spare_area_skip_bytes must be an even number. For example, if the bad block
> marker is a single byte, set spare_area_skip_bytes to 2."

I did not know this documentation.

It says "For example" (twice),
so it sounds uncertain to me, though.

Anyway, an Intel engineer checked the boot ROM code.
SPARE_AREA_SKIP_BYTES=2 is correct, he said.

-- 
Best Regards
Masahiro Yamada
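The two hard-coded rates suggested above can be related to the timing registers with a little arithmetic. The following is only a minimal sketch, assuming the driver turns each NAND timing (given in picoseconds) into a cycle count by dividing it by the clk_x period and rounding up. The helper names clk_period_ps and ps_to_cycles are hypothetical and are not taken from denali.c.

```c
#include <assert.h>
#include <stdint.h>

/* Period of a clock in picoseconds, rounded down (hypothetical helper). */
static uint32_t clk_period_ps(uint64_t rate_hz)
{
	assert(rate_hz > 0);	/* a rate of 0 has no period */
	return (uint32_t)(1000000000000ULL / rate_hz);
}

/* Smallest number of whole clock cycles that covers a timing of t_ps. */
static uint32_t ps_to_cycles(uint32_t t_ps, uint32_t period_ps)
{
	assert(period_ps > 0);
	return (t_ps + period_ps - 1) / period_ps;
}
```

With these helpers, a clk_x rate of 200000000 Hz gives a period of 5000 ps (0x1388), which is the value Tim reported hard-coding for t_x earlier in the thread, and a clk rate of 50000000 Hz gives 20000 ps. An arbitrary, purely illustrative timing of 12000 ps would then need ps_to_cycles(12000, 5000) = 3 cycles. If clk_get_rate() returned a different rate than the hardware really runs at, every cycle count derived this way would be off by the same factor. Setting both rates to 0, as in the second suggestion, presumably makes the driver skip the calculation and keep the bootloader's register values; the sketch does not model that case.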
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
From: Tim Sander @ 2020-01-10 22:38 UTC
To: Masahiro Yamada
Cc: Vignesh Raghavendra, Marek Vasut, Richard Weinberger, Linux Kernel Mailing List, Dinh Nguyen, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse

Hi

On Friday, 10 January 2020, 20:05:20 CET, Masahiro Yamada wrote:
> [...]
> What will happen if you apply all of these:
>
> http://patchwork.ozlabs.org/project/linux-mtd/list/?series=149821

I have applied this patch set, but it does not fully solve the problem. The timings are
wrong. I don't have access to the hardware right now, but one thing I tested before I
left was to write the NAND timings from the bootloader into the
denali controller after the driver had configured the timings in denali_init.
After that the driver worked again for me.

> on top of the mainline kernel,
> and then, hack denali->clk_rate and denali->clk_x_rate as follows?
>
> -       denali->clk_rate = clk_get_rate(dt->clk);
> -       denali->clk_x_rate = clk_get_rate(dt->clk_x);
> +       denali->clk_rate = 50000000;
> +       denali->clk_x_rate = 200000000;
>
> If it still fails, what about this?
>
>         denali->clk_rate = 0;
>         denali->clk_x_rate = 0;

Will try the above next week. Skimming over socfpga.dtsi, it seems that
on the Intel SocFPGA the OSC1 has a value of 25000000 set in
socfpga_cyclone5.dtsi (I am currently not sure about the clock tree with all
the PLLs, and I am missing the value of osc2?). Also, right now it seems I am too
tired to parse denali_setup_data_interface...

> [...]
> I did not know this documentation.
>
> It says "For example" (twice),
> it sounds uncertain to me, though.
>
> Anyway, an intel engineer checked the boot ROM code.
> SPARE_AREA_SKIP_BYTES=2 is correct, he said.

As far as I understand the documentation, it must be a multiple of 2. Most
NAND flashes I know need one byte for bad block marking, so 2 seems to be a
pretty sane value. The documentation also explains why the boot ROM's default of
spare_area_skip_bytes=0 is a little unfortunate: the ECC values might spill into
the spare area where the bad block marker of the NAND is located.

Best regards
Tim
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
From: Masahiro Yamada @ 2020-01-11 2:38 UTC
To: Tim Sander
Cc: Vignesh Raghavendra, Marek Vasut, Richard Weinberger, Linux Kernel Mailing List, Dinh Nguyen, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse

On Sat, Jan 11, 2020 at 7:38 AM Tim Sander <tim@krieglstein.org> wrote:
> [...]
> I have applied this patch set but it does not help completely.

OK, I just wanted to eliminate any other possibility, just in case.

> The timings are
> wrong. I don't have access to the hardware now but one thing i tested before i
> left (the HW) was to write the NAND timings from the bootloader into the
> denali controller after the driver configured the timings in denali_init.
> After that the driver worked again for me.
> [...]
> Will try the above next week. Skimming over the socfpga.dtsi it seems as if
> on the Intel SocFPGA the OSC1 has a value of 25000000 set in
> socfpga_cyclone5.dtsi (I am currently not sure about the clock tree with all
> the plls and i am missing the value of osc2?). Also right now it seems i am to
> tired to parse denali_setup_data_interface...

You do not need to parse denali_setup_data_interface().

There are good hints.

You said:
"The last kernel i am able to boot is 4.19.10. I have tried booting:
5.1.21, 5.2.9, 5.3-rc8, 5.4.5-rt3 and 5.5-rc5. They all failed."

There is no commit between 4.19.10 and 5.1.21
that changes denali_setup_data_interface().

So, denali_setup_data_interface() is not the
root cause.

From the information you provided,
I suspect some clock settings are wrong.

> As far as i understand the documentation it must be a multiple of 2. The most
> nand flashes i know need one byte for bad block marking so 2 seems to be a
> pretty sane value.

Most NAND flashes, but not all.

See the "Bad Block Location" in this page:
http://www.linux-mtd.infradead.org/nand-data/nanddata.html

Many devices have the BBM at the 1st byte/word,
but there are devices that have it at the 6th byte.

SPARE_AREA_SKIP_BYTES=2 for SOCFPGA
would corrupt a BBM at offset 6.
So, probably such a device is not used
on SOCFPGA boards.

I am guessing that is why the UniPhier platform
adopted SPARE_AREA_SKIP_BYTES=8.

> The explanation why default value of
> spare_area_skip_bytes=0 of the boot rom is a little unfortunate is also in the
> documentation: The fact that the ECC values might spill into the spare area
> where the bad block marker of the nand is located.

-- 
Best Regards
Masahiro Yamada
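The trade-off between the skip values 0, 2 and 8 discussed above can be written down as a single comparison. This is a minimal sketch, assuming the controller leaves the first spare_area_skip_bytes bytes of the spare area untouched and starts writing ECC data right after them; the function name bbm_is_preserved and its parameters are hypothetical and do not appear in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a bad block marker of bbm_len bytes starting at spare-area
 * offset bbm_offset lies entirely inside the first skip_bytes bytes,
 * i.e. inside the region assumed to be left untouched by the controller.
 */
static bool bbm_is_preserved(uint32_t bbm_offset, uint32_t bbm_len,
			     uint32_t skip_bytes)
{
	return bbm_offset + bbm_len <= skip_bytes;
}
```

Under that assumption, a one-byte marker at offset 0 is preserved with a skip of 2 but not with a skip of 0, while a one-byte marker at the 6th byte (offset 5, or offset 6 if counted from one position later) is lost with a skip of 2 and preserved with a skip of 8.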
* Re: mtd raw nand denali.c broken for Intel/Altera Cyclone V
From: Tim Sander @ 2020-01-13 10:22 UTC
To: Masahiro Yamada
Cc: Vignesh Raghavendra, Marek Vasut, Richard Weinberger, Linux Kernel Mailing List, Dinh Nguyen, linux-mtd, Miquel Raynal, Brian Norris, David Woodhouse

On Saturday, 11 January 2020, 03:38:35 CET, Masahiro Yamada wrote:
> On Sat, Jan 11, 2020 at 7:38 AM Tim Sander <tim@krieglstein.org> wrote:
> > [...]
> > I have applied this patch set but it does not help completely.
>
> OK, I just wanted to eliminate any other possibility, just in case.

As far as I remember, I also need the linked patchset, but it does not help
completely. As far as I remember, overwriting the timings didn't help on its own,
because that's the first thing I tried, without the other patches.

> > > on top of the mainline kernel,
> > > and then, hack denali->clk_rate and denali->clk_x_rate as follows?
> > >
> > > -       denali->clk_rate = clk_get_rate(dt->clk);
> > > -       denali->clk_x_rate = clk_get_rate(dt->clk_x);
> > > +       denali->clk_rate = 50000000;
> > > +       denali->clk_x_rate = 200000000;
> > >
> > > If it still fails, what about this?
> > >
> > >         denali->clk_rate = 0;
> > >         denali->clk_x_rate = 0;

I have not tried this yet, because I have written out the values calculated by
the driver. My hope is that the error made in the timings can be deduced from the
values below. As assumed, the timings are not correct. But there is one more thing:
the timing calculation is being called twice!

[ 0.336216] 001: ffc03000.serial1: ttyS1 at MMIO 0xffc03000 (irq = 41, base_baud = 6250000) is a 16550A
[ 0.338882] 001: previous settings: acc_clks 1
[ 0.338882] 001: re_2_re: 3
[ 0.338882] 001: re_2_we: 3
[ 0.338882] 001: tcwaw_and_addr_2_data: 5
[ 0.338882] 001: hwhr2_and_we2_re: 63
[ 0.338882] 001: rdwr_en_hi_cnt: 1
[ 0.338882] 001: rdwr_en_lo_cnt: 3
[ 0.338882] 001: rdwr_en_lo_cnt: 1
[ 0.338882] 001: cs_setup_cnt: -2125159496
[ 0.338882] 001:
[ 0.340720] 001: nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
[ 0.340729] 001: nand: Micron MT29F2G08ABAEAWP
[ 0.340734] 001: nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[ 0.341084] 001: previous settings: acc_clks 1
[ 0.341084] 001: re_2_re: 2
[ 0.341084] 001: re_2_we: 2
[ 0.341084] 001: tcwaw_and_addr_2_data: 5
[ 0.341084] 001: hwhr2_and_we2_re: 2
[ 0.341084] 001: rdwr_en_hi_cnt: 1
[ 0.341084] 001: rdwr_en_lo_cnt: 3
[ 0.341084] 001: rdwr_en_lo_cnt: 0
[ 0.341084] 001: cs_setup_cnt: 1
[ 0.341084] 001: <- here the values are being overwritten with the values from the patch below!
[ 0.342438] 001: Bad block table found at page 131008, version 0x01
[ 0.343671] 001: Bad block table found at page 130944, version 0x01
[ 0.345267] 001: 1 fixed-partitions partitions found on MTD device denali-nand
[ 0.345275] 001: Creating 1 MTD partitions on "denali-nand":
[ 0.345284] 001: 0x000000000000-0x000010000000 : "work"
[ 0.351416] 000: libphy: Fixed MDIO Bus: probed

The following hack has been used to create the output and to get the system
booting by overriding the computed timing values:

 drivers/mtd/nand/raw/denali.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/mtd/nand/raw/denali.c b/drivers/mtd/nand/raw/denali.c
index fafd0a0aa8e2..5c8a92d4896f 100644
--- a/drivers/mtd/nand/raw/denali.c
+++ b/drivers/mtd/nand/raw/denali.c
@@ -886,6 +886,17 @@ static int denali_setup_data_interface(struct nand_chip *chip, int chipnr,
 	tmp |= FIELD_PREP(CS_SETUP_CNT__VALUE, cs_setup);
 	sel->cs_setup_cnt = tmp;
 
+	printk("previous settings: acc_clks %i\nre_2_re: %i\nre_2_we: %i\ntcwaw_and_addr_2_data: %i\nhwhr2_and_we2_re: %i\nrdwr_en_hi_cnt: %i\nrdwr_en_lo_cnt: %i\nrdwr_en_lo_cnt: %i\ncs_setup_cnt: %i\n ",sel->acc_clks, sel->re_2_re, sel->re_2_we, sel->tcwaw_and_addr_2_data, sel->hwhr2_and_we_2_re, sel->rdwr_en_hi_cnt, sel->rdwr_en_lo_cnt, sel->cs_setup_cnt);
+
+	sel->acc_clks = 0x4;
+	sel->re_2_re = 0x14;
+	sel->re_2_we = 0x14;
+	sel->tcwaw_and_addr_2_data = 0x3f;
+	sel->hwhr2_and_we_2_re = 0x14;
+	sel->rdwr_en_hi_cnt = 2;
+	sel->rdwr_en_lo_cnt = 4;
+	sel->cs_setup_cnt = 1;
+
 	return 0;
 }
 
-- 
2.20.1

> You do not need to parse denali_setup_data_interface().
>
> There are good hints.
>
> You said:
> "The last kernel i am able to boot is 4.19.10. I have tried booting:
> 5.1.21, 5.2.9, 5.3-rc8, 5.4.5-rt3 and 5.5-rc5. They all failed."
>
> There is no commit between 4.19.10 and 5.1.21
> that changes denali_setup_data_interface().
>
> So, denali_setup_data_interface() is not the
> root cause.
>
> From the information you provided,
> I suspect some clock settings are wrong.

I guess we agree that this is no longer just a suspicion...
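One detail worth keeping in mind when reading the dumps above: the debug printk in the hack has nine %i conversions (the rdwr_en_lo_cnt label appears twice) but only eight arguments. The second rdwr_en_lo_cnt line therefore appears to show cs_setup_cnt, and the value printed under the cs_setup_cnt label is undefined, which would be consistent with the -2125159496 in the first dump. The following is a minimal user-space sketch of a dump with one conversion per field. The struct timing_regs and the function format_timing_regs are hypothetical stand-ins, and only the eight field names are taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-chip-select timing fields in the patch. */
struct timing_regs {
	uint32_t acc_clks;
	uint32_t re_2_re;
	uint32_t re_2_we;
	uint32_t tcwaw_and_addr_2_data;
	uint32_t hwhr2_and_we_2_re;
	uint32_t rdwr_en_hi_cnt;
	uint32_t rdwr_en_lo_cnt;
	uint32_t cs_setup_cnt;
};

/* Format all eight fields: exactly one conversion per argument. */
static int format_timing_regs(char *buf, size_t len,
			      const struct timing_regs *t)
{
	return snprintf(buf, len,
			"acc_clks: %u\n"
			"re_2_re: %u\n"
			"re_2_we: %u\n"
			"tcwaw_and_addr_2_data: %u\n"
			"hwhr2_and_we_2_re: %u\n"
			"rdwr_en_hi_cnt: %u\n"
			"rdwr_en_lo_cnt: %u\n"
			"cs_setup_cnt: %u\n",
			(unsigned int)t->acc_clks,
			(unsigned int)t->re_2_re,
			(unsigned int)t->re_2_we,
			(unsigned int)t->tcwaw_and_addr_2_data,
			(unsigned int)t->hwhr2_and_we_2_re,
			(unsigned int)t->rdwr_en_hi_cnt,
			(unsigned int)t->rdwr_en_lo_cnt,
			(unsigned int)t->cs_setup_cnt);
}
```

In the kernel the same layout would go through printk or dev_info instead of snprintf; the point of the sketch is only that labels and arguments line up one to one, so each printed number can be trusted to belong to its label.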