* [PATCH 0/2] mtd: nand: raw: macronix: allow disabling block protection
@ 2023-03-23 12:45 ` Álvaro Fernández Rojas
From: Álvaro Fernández Rojas @ 2023-03-23 12:45 UTC (permalink / raw)
  To: miquel.raynal, richard, vigneshr, robh+dt,
	krzysztof.kozlowski+dt, masonccyang, linux-mtd, devicetree,
	linux-kernel
  Cc: Álvaro Fernández Rojas

Some devices hang when block protection is enabled, so let's add a boolean
property to allow disabling it.

Álvaro Fernández Rojas (2):
  dt-bindings: mtd: nand: Macronix: document new binding
  mtd: nand: raw: macronix: allow disabling block protection

 Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
 drivers/mtd/nand/raw/nand_macronix.c                    | 4 ++++
 2 files changed, 7 insertions(+)

-- 
2.30.2


______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

* [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-23 12:45 ` Álvaro Fernández Rojas
@ 2023-03-23 12:45   ` Álvaro Fernández Rojas
From: Álvaro Fernández Rojas @ 2023-03-23 12:45 UTC (permalink / raw)
  To: miquel.raynal, richard, vigneshr, robh+dt,
	krzysztof.kozlowski+dt, masonccyang, linux-mtd, devicetree,
	linux-kernel
  Cc: Álvaro Fernández Rojas

Add new "mxic,disable-block-protection" binding documentation.
This binding allows disabling block protection support for those devices not
supporting it.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
---
 Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
index ffab28a2c4d1..03f65ca32cd3 100644
--- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
+++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
@@ -16,6 +16,9 @@ in children nodes.
 Required NAND chip properties in children mode:
 - randomizer enable: should be "mxic,enable-randomizer-otp"
 
+Optional NAND chip properties in children mode:
+- block protection disable: should be "mxic,disable-block-protection"
+
 Example:
 
 	nand: nand-controller@unit-address {
-- 
2.30.2


* [PATCH 2/2] mtd: nand: raw: macronix: allow disabling block protection
  2023-03-23 12:45 ` Álvaro Fernández Rojas
@ 2023-03-23 12:45   ` Álvaro Fernández Rojas
From: Álvaro Fernández Rojas @ 2023-03-23 12:45 UTC (permalink / raw)
  To: miquel.raynal, richard, vigneshr, robh+dt,
	krzysztof.kozlowski+dt, masonccyang, linux-mtd, devicetree,
	linux-kernel
  Cc: Álvaro Fernández Rojas

Some devices hang when block protection is enabled, so let's add a boolean
property to allow disabling it.

Fixes: 03a539c7a118 ("mtd: rawnand: Macronix: Add support for block protection")
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
---
 drivers/mtd/nand/raw/nand_macronix.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/mtd/nand/raw/nand_macronix.c b/drivers/mtd/nand/raw/nand_macronix.c
index 1472f925f386..b9f0338ebdaf 100644
--- a/drivers/mtd/nand/raw/nand_macronix.c
+++ b/drivers/mtd/nand/raw/nand_macronix.c
@@ -219,9 +219,13 @@ static int mxic_nand_unlock(struct nand_chip *chip, loff_t ofs, uint64_t len)
 
 static void macronix_nand_block_protection_support(struct nand_chip *chip)
 {
+	struct device_node *dn = nand_get_flash_node(chip);
 	u8 feature[ONFI_SUBFEATURE_PARAM_LEN];
 	int ret;
 
+	if (of_property_read_bool(dn, "mxic,disable-block-protection"))
+		return;
+
 	bitmap_set(chip->parameters.get_feature_list,
 		   ONFI_FEATURE_ADDR_MXIC_PROTECTION, 1);
 
-- 
2.30.2


* Re: [PATCH 2/2] mtd: nand: raw: macronix: allow disabling block protection
  2023-03-23 12:45   ` Álvaro Fernández Rojas
@ 2023-03-23 12:47     ` Tudor Ambarus
From: Tudor Ambarus @ 2023-03-23 12:47 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, miquel.raynal, richard,
	vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hi,

On 3/23/23 12:45, Álvaro Fernández Rojas wrote:
> Some devices hang when block protection is enabled, so let's add a boolean
> property to allow disabling it.
> 

Why do they hang?

* Re: [PATCH 2/2] mtd: nand: raw: macronix: allow disabling block protection
  2023-03-23 12:47     ` Tudor Ambarus
@ 2023-03-23 12:55       ` Álvaro Fernández Rojas
From: Álvaro Fernández Rojas @ 2023-03-23 12:55 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: miquel.raynal, richard, vigneshr, robh+dt,
	krzysztof.kozlowski+dt, masonccyang, linux-mtd, devicetree,
	linux-kernel

On Thu, 23 Mar 2023 at 13:47, Tudor Ambarus
(<tudor.ambarus@linaro.org>) wrote:
>
> Hi,
>
> On 3/23/23 12:45, Álvaro Fernández Rojas wrote:
> > Some devices hang when block protection is enabled, so let's add a boolean
> > property to allow disabling it.
> >
>
> Why do they hang?

At first I thought it was because the low-level op is not supported
on BCM63268 brcmnand controllers, but after debugging, the op seemed
to be working...

This is the log with block protection disabled:
[    0.495831] bcm6368_nand 10000200.nand: there is not valid maps for state default
[    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[    0.511526] nand: Macronix MX30LF1G18AC
[    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
[    0.535912] Bad block table found at page 65472, version 0x01
[    0.544268] Bad block table found at page 65408, version 0x01
[    0.954329] 9 fixed-partitions partitions found on MTD device brcmnand.0
...

This is the log with block protection enabled:
[    0.495095] bcm6368_nand 10000200.nand: there is not valid maps for state default
[    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[    0.510772] nand: Macronix MX30LF1G18AC
[    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
[    0.539687] Bad block table not found for chip 0
[    0.550153] Bad block table not found for chip 0
[    0.555069] Scanning device for bad blocks
[    0.601213] CPU 1 Unable to handle kernel paging request at virtual address 10277f00, epc == 8039ce70, ra == 8016ad50
*** Device hangs ***

As you can see, when block protection is enabled, the bad block table
isn't found and when the device is scanned for bad blocks it just
hangs...

If you want me to debug something I would be happy to do it, but I
need some guidance here...

Best regards,
Álvaro.

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-23 12:45   ` Álvaro Fernández Rojas
@ 2023-03-24  9:40     ` Miquel Raynal
From: Miquel Raynal @ 2023-03-24  9:40 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hi Álvaro,

noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:

> Add new "mxic,disable-block-protection" binding documentation.
> This binding allows disabling block protection support for those devices not
> supporting it.
> 
> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> ---
>  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> index ffab28a2c4d1..03f65ca32cd3 100644
> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> @@ -16,6 +16,9 @@ in children nodes.
>  Required NAND chip properties in children mode:
>  - randomizer enable: should be "mxic,enable-randomizer-otp"
>  
> +Optional NAND chip properties in children mode:
> +- block protection disable: should be "mxic,disable-block-protection"
> +

Besides the fact that nowadays we prefer to see binding conversions to
yaml before adding anything, I don't think this will fly.

I'm not sure exactly what "disable block protection" means, we
already have similar properties like "lock" and "secure-regions", not
sure they will fit but I think it's worth checking.

Otherwise, why would you disable the block protection? What does it
mean exactly? I'm not in favor of a Macronix-specific property here.

Thanks,
Miquèl

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24  9:40     ` Miquel Raynal
@ 2023-03-24 10:31       ` Álvaro Fernández Rojas
From: Álvaro Fernández Rojas @ 2023-03-24 10:31 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hi Miquèl,

On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
(<miquel.raynal@bootlin.com>) wrote:
>
> Hi Álvaro,
>
> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>
> > Add new "mxic,disable-block-protection" binding documentation.
> > This binding allows disabling block protection support for those devices not
> > supporting it.
> >
> > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> > ---
> >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > index ffab28a2c4d1..03f65ca32cd3 100644
> > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > @@ -16,6 +16,9 @@ in children nodes.
> >  Required NAND chip properties in children mode:
> >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> >
> > +Optional NAND chip properties in children mode:
> > +- block protection disable: should be "mxic,disable-block-protection"
> > +
>
> Besides the fact that nowadays we prefer to see binding conversions to
> yaml before adding anything, I don't think this will fly.
>
> I'm not sure exactly what "disable block protection" means, we
> already have similar properties like "lock" and "secure-regions", not
> sure they will fit but I think it's worth checking.

As explained in 2/2, commit 03a539c7a118 introduced a regression on
Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
which hangs the device.

This is the log with block protection disabled:
[    0.495831] bcm6368_nand 10000200.nand: there is not valid maps for state default
[    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[    0.511526] nand: Macronix MX30LF1G18AC
[    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
[    0.535912] Bad block table found at page 65472, version 0x01
[    0.544268] Bad block table found at page 65408, version 0x01
[    0.954329] 9 fixed-partitions partitions found on MTD device brcmnand.0
...

This is the log with block protection enabled:
[    0.495095] bcm6368_nand 10000200.nand: there is not valid maps for state default
[    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[    0.510772] nand: Macronix MX30LF1G18AC
[    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
[    0.539687] Bad block table not found for chip 0
[    0.550153] Bad block table not found for chip 0
[    0.555069] Scanning device for bad blocks
[    0.601213] CPU 1 Unable to handle kernel paging request at virtual address 10277f00, epc == 8039ce70, ra == 8016ad50
*** Device hangs ***

Enabling macronix_nand_block_protection_support() makes the device
unable to detect the bad block table and hangs it when trying to scan
for bad blocks.

>
> Otherwise, why would you disable the block protection? What does it
> mean exactly? I'm not in favor of a Macronix-specific property here.
>
> Thanks,
> Miquèl

--
Álvaro

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 10:31       ` Álvaro Fernández Rojas
@ 2023-03-24 10:49         ` Miquel Raynal
From: Miquel Raynal @ 2023-03-24 10:49 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hi Álvaro,

noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:

> Hi Miquèl,
> 
> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> (<miquel.raynal@bootlin.com>) wrote:
> >
> > Hi Álvaro,
> >
> > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >  
> > > Add new "mxic,disable-block-protection" binding documentation.
> > > This binding allows disabling block protection support for those devices not
> > > supporting it.
> > >
> > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> > > ---
> > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > index ffab28a2c4d1..03f65ca32cd3 100644
> > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > @@ -16,6 +16,9 @@ in children nodes.
> > >  Required NAND chip properties in children mode:
> > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> > >
> > > +Optional NAND chip properties in children mode:
> > > +- block protection disable: should be "mxic,disable-block-protection"
> > > +  
> >
> > Besides the fact that nowadays we prefer to see binding conversions to
> > yaml before adding anything, I don't think this will fly.
> >
> > I'm not sure exactly what "disable block protection" means, we
> > already have similar properties like "lock" and "secure-regions", not
> > sure they will fit but I think it's worth checking.  
> 
> As explained in 2/2, commit 03a539c7a118 introduced a regression on
> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> which hangs the device.
> 
> This is the log with block protection disabled:
> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps for state default
> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> [    0.511526] nand: Macronix MX30LF1G18AC
> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> [    0.535912] Bad block table found at page 65472, version 0x01
> [    0.544268] Bad block table found at page 65408, version 0x01
> [    0.954329] 9 fixed-partitions partitions found on MTD device brcmnand.0
> ...
> 
> This is the log with block protection enabled:
> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps for state default
> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> [    0.510772] nand: Macronix MX30LF1G18AC
> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> [    0.539687] Bad block table not found for chip 0
> [    0.550153] Bad block table not found for chip 0
> [    0.555069] Scanning device for bad blocks
> [    0.601213] CPU 1 Unable to handle kernel paging request at virtual address 10277f00, epc == 8039ce70, ra == 8016ad50
> *** Device hangs ***
> 
> Enabling macronix_nand_block_protection_support() makes the device
> unable to detect the bad block table and hangs it when trying to scan
> for bad blocks.

Please trace nand_macronix.c and check:
- are get_features and set_features really supported by the
  controller driver?
- what is the state of the locking configuration in the chip when you
  boot?
- does anything lock the device by calling mxic_nand_lock()?
- finding no bbt is one thing, hanging is another: where exactly is it
  hanging? (offset in nand/ and line in the code)

> 
> >
> > Otherwise, why would you disable the block protection? What does it
> > mean exactly? I'm not in favor of a Macronix-specific property here.
> >
> > Thanks,
> > Miquèl  
> 
> --
> Álvaro


Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 10:49         ` Miquel Raynal
@ 2023-03-24 11:21           ` Álvaro Fernández Rojas
  -1 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-24 11:21 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

El vie, 24 mar 2023 a las 11:49, Miquel Raynal
(<miquel.raynal@bootlin.com>) escribió:
>
> Hi Álvaro,
>
> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>
> > Hi Miquèl,
> >
> > El vie, 24 mar 2023 a las 10:40, Miquel Raynal
> > (<miquel.raynal@bootlin.com>) escribió:
> > >
> > > Hi Álvaro,
> > >
> > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> > >
> > > > Add new "mxic,disable-block-protection" binding documentation.
> > > > This binding allows disabling block protection support for those devices not
> > > > supporting it.
> > > >
> > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> > > > ---
> > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > > index ffab28a2c4d1..03f65ca32cd3 100644
> > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > > @@ -16,6 +16,9 @@ in children nodes.
> > > >  Required NAND chip properties in children mode:
> > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> > > >
> > > > +Optional NAND chip properties in children mode:
> > > > +- block protection disable: should be "mxic,disable-block-protection"
> > > > +
> > >
> > > Besides the fact that nowadays we prefer to see binding conversions to
> > > yaml before adding anything, I don't think this will fly.
> > >
> > > I'm not sure exactly what "disable block protection" means, we
> > > already have similar properties like "lock" and "secure-regions", not
> > > sure they will fit but I think it's worth checking.
> >
> > As explained in 2/2, commit 03a539c7a118 introduced a regression on
> > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> > which hangs the device.
> >
> > This is the log with block protection disabled:
> > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps for
> > state default
> > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > [    0.511526] nand: Macronix MX30LF1G18AC
> > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > 2048, OOB size: 64
> > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > [    0.535912] Bad block table found at page 65472, version 0x01
> > [    0.544268] Bad block table found at page 65408, version 0x01
> > [    0.954329] 9 fixed-partitions partitions found on MTD device brcmnand.0
> > ...
> >
> > This is the log with block protection enabled:
> > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps for
> > state default
> > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > [    0.510772] nand: Macronix MX30LF1G18AC
> > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > 2048, OOB size: 64
> > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > [    0.539687] Bad block table not found for chip 0
> > [    0.550153] Bad block table not found for chip 0
> > [    0.555069] Scanning device for bad blocks
> > [    0.601213] CPU 1 Unable to handle kernel paging request at virtual
> > address 10277f00, epc == 8039ce70, ra == 8016ad50
> > *** Device hangs ***
> >
> > Enabling macronix_nand_block_protection_support() makes the device
> > unable to detect the bad block table and hangs it when trying to scan
> > for bad blocks.
>
> Please trace nand_macronix.c and look:
> - are the get_features and set_features really supported by the
>   controller driver?

This is what I could find by debugging:
[    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for state default
[    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[    0.512077] nand: Macronix MX30LF1G18AC
[    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
[    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
[    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
[    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
[    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
[    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
[    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
[    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
[    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
[    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
[    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
[    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00 00 00 00] -> 0
[    0.602341] macronix_nand_block_protection_support: ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
[    0.610548] macronix_nand_block_protection_support: != MXIC_BLOCK_PROTECTION_ALL_LOCK
[    0.624760] Bad block table not found for chip 0
[    0.635542] Bad block table not found for chip 0
[    0.640270] Scanning device for bad blocks

I don't know how to tell if get_features / set_features is really supported...
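
One way to read the "ll_op cmd" words in the trace is to split each one
into its control bits and its 16-bit payload. The following is a minimal
sketch, assuming the bcm6368_nand driver uses the same LLOP_* bit layout
as mainline brcmnand.c (CLE = bit 19, ALE = bit 18, WE = bit 17,
RE = bit 16, RETURN_IDLE = bit 31). The helper names are hypothetical and
not part of any driver.

```c
#include <stdint.h>

/* Assumed bit layout, taken from the LLOP_* definitions in mainline brcmnand.c. */
#define LLOP_RE          (1u << 16)  /* read enable: clock data in from the chip */
#define LLOP_WE          (1u << 17)  /* write enable: drive a cycle to the chip */
#define LLOP_ALE         (1u << 18)  /* address latch enable */
#define LLOP_CLE         (1u << 19)  /* command latch enable */
#define LLOP_RETURN_IDLE (1u << 31)  /* last operation of the sequence */
#define LLOP_DATA_MASK   0xffffu     /* payload: command, address or data */

/* Hypothetical helper: classify one "ll_op cmd" word from the debug log. */
static const char *llop_kind(uint32_t word)
{
	if ((word & LLOP_CLE) && (word & LLOP_WE))
		return "command";
	if ((word & LLOP_ALE) && (word & LLOP_WE))
		return "address";
	if (word & LLOP_WE)
		return "data-out";
	if (word & LLOP_RE)
		return "data-in";
	return "unknown";
}

/* Hypothetical helper: extract the payload carried by the word. */
static unsigned int llop_payload(uint32_t word)
{
	return word & LLOP_DATA_MASK;
}

/* Hypothetical helper: non-zero if the word ends the low-level sequence. */
static int llop_is_last(uint32_t word)
{
	return (word & LLOP_RETURN_IDLE) != 0;
}
```

Under that assumption, 0xa00ee is a command cycle carrying 0xee (the ONFI
GET FEATURES opcode), 0x600a0 is an address cycle carrying feature address
0xa0, and the four following words are data reads, the last of which
(0x80010000) also returns the bus to idle. That would suggest the
GET FEATURES sequence is at least issued on the bus by the controller.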

> - what is the state of the locking configuration in the chip when you
>   boot?

Unlocked, I guess...
How can I check that?

> - is there anything that locks the device by calling mxic_nand_lock() ?
> - finding no bbt is one thing, hanging is another, where is it hanging
>   exactly? (offset in nand/ and line in the code)

I've got no idea...

>
> >
> > >
> > > Otherwise, why would you disable the block protection? What does it
> > > mean exactly? I'm not in favor of a Macronix-specific property here.
> > >
> > > Thanks,
> > > Miquèl
> >
> > --
> > Álvaro
>
>
> Thanks,
> Miquèl

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 11:21           ` Álvaro Fernández Rojas
@ 2023-03-24 13:45             ` Miquel Raynal
  -1 siblings, 0 replies; 50+ messages in thread
From: Miquel Raynal @ 2023-03-24 13:45 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hi Álvaro,

noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:

> El vie, 24 mar 2023 a las 11:49, Miquel Raynal
> (<miquel.raynal@bootlin.com>) escribió:
> >
> > Hi Álvaro,
> >
> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >  
> > > Hi Miquèl,
> > >
> > > El vie, 24 mar 2023 a las 10:40, Miquel Raynal
> > > (<miquel.raynal@bootlin.com>) escribió:  
> > > >
> > > > Hi Álvaro,
> > > >
> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> > > >  
> > > > > Add new "mxic,disable-block-protection" binding documentation.
> > > > > This binding allows disabling block protection support for those devices not
> > > > > supporting it.
> > > > >
> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> > > > > ---
> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > > > @@ -16,6 +16,9 @@ in children nodes.
> > > > >  Required NAND chip properties in children mode:
> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> > > > >
> > > > > +Optional NAND chip properties in children mode:
> > > > > +- block protection disable: should be "mxic,disable-block-protection"
> > > > > +  
> > > >
> > > > Besides the fact that nowadays we prefer to see binding conversions to
> > > > yaml before adding anything, I don't think this will fly.
> > > >
> > > > I'm not sure exactly what "disable block protection" means, we
> > > > already have similar properties like "lock" and "secure-regions", not
> > > > sure they will fit but I think it's worth checking.  
> > >
> > > As explained in 2/2, commit 03a539c7a118 introduced a regression on
> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> > > which hangs the device.
> > >
> > > This is the log with block protection disabled:
> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps for
> > > state default
> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > > [    0.511526] nand: Macronix MX30LF1G18AC
> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > > 2048, OOB size: 64
> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > > [    0.535912] Bad block table found at page 65472, version 0x01
> > > [    0.544268] Bad block table found at page 65408, version 0x01
> > > [    0.954329] 9 fixed-partitions partitions found on MTD device brcmnand.0
> > > ...
> > >
> > > This is the log with block protection enabled:
> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps for
> > > state default
> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > > [    0.510772] nand: Macronix MX30LF1G18AC
> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > > 2048, OOB size: 64
> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > > [    0.539687] Bad block table not found for chip 0
> > > [    0.550153] Bad block table not found for chip 0
> > > [    0.555069] Scanning device for bad blocks
> > > [    0.601213] CPU 1 Unable to handle kernel paging request at virtual
> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
> > > *** Device hangs ***
> > >
> > > Enabling macronix_nand_block_protection_support() makes the device
> > > unable to detect the bad block table and hangs it when trying to scan
> > > for bad blocks.  
> >
> > Please trace nand_macronix.c and look:
> > - are the get_features and set_features really supported by the
> >   controller driver?  
> 
> This is what I could find by debugging:
> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> state default
> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> [    0.512077] nand: Macronix MX30LF1G18AC
> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> 2048, OOB size: 64
> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> 00 00 00] -> 0
> [    0.602341] macronix_nand_block_protection_support:
> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> [    0.610548] macronix_nand_block_protection_support: !=
> MXIC_BLOCK_PROTECTION_ALL_LOCK
> [    0.624760] Bad block table not found for chip 0
> [    0.635542] Bad block table not found for chip 0
> [    0.640270] Scanning device for bad blocks
> 
> I don't know how to tell if get_features / set_features is really supported...

Looks like your driver does not support exec_op but the core provides a
get/set_feature implementation.

> 
> > - what is the state of the locking configuration in the chip when you
> >   boot?  
> 
> Unlocked, I guess...
> How can I check that?

It's in your dump, the chip returns 0, meaning it's all unlocked,
apparently.
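
To make the reading of the dump explicit, here is a minimal sketch of how
the first sub-feature byte can be interpreted. The two constants are named
as in nand_macronix.c; their values (0x38 and 0x00) are quoted from memory
and should be checked against your tree. The enum and the helper functions
are hypothetical.

```c
#include <stdint.h>

/* Values as recalled from nand_macronix.c; verify against your kernel tree. */
#define MXIC_BLOCK_PROTECTION_ALL_LOCK   0x38
#define MXIC_BLOCK_PROTECTION_ALL_UNLOCK 0x00

/* Hypothetical classification of the protection sub-feature byte. */
enum mxic_bp_state {
	MXIC_BP_ALL_UNLOCKED,
	MXIC_BP_ALL_LOCKED,
	MXIC_BP_OTHER,
};

/* Only the first of the four sub-feature bytes carries the protection bits. */
static enum mxic_bp_state mxic_bp_decode(const uint8_t subfeature[4])
{
	switch (subfeature[0]) {
	case MXIC_BLOCK_PROTECTION_ALL_UNLOCK:
		return MXIC_BP_ALL_UNLOCKED;
	case MXIC_BLOCK_PROTECTION_ALL_LOCK:
		return MXIC_BP_ALL_LOCKED;
	default:
		return MXIC_BP_OTHER;
	}
}

/*
 * Mirrors the "!= MXIC_BLOCK_PROTECTION_ALL_LOCK" trace in the log: the
 * lock/unlock hooks are only kept when GET FEATURES succeeds and the chip
 * reports the all-locked value.
 */
static int mxic_bp_hooks_kept(int get_features_ret, const uint8_t subfeature[4])
{
	return get_features_ret == 0 &&
	       subfeature[0] == MXIC_BLOCK_PROTECTION_ALL_LOCK;
}
```

With the dump above (subfeature_param=[00 00 00 00], return value 0) this
yields MXIC_BP_ALL_UNLOCKED, and the lock/unlock hooks are not kept.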

> > - is there anything that locks the device by calling mxic_nand_lock() ?

So nobody locks the device I guess? Did you add traces there?

> > - finding no bbt is one thing, hanging is another, where is it hanging
> >   exactly? (offset in nand/ and line in the code)  
> 
> I've got no idea...

You can use ftrace, or just add printks along the scan path and narrow
down the location step by step.
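
The printk approach can be sketched as follows. This is a userspace
illustration only: TRACE_POINT(), scan_block() and scan_device() are
hypothetical names, and in the kernel the macro body would be something
like pr_info("%s:%d\n", __func__, __LINE__) instead of fprintf().

```c
#include <stdio.h>

/* Last checkpoint reached and total number of checkpoints hit. */
static const char *trace_last_func;
static int trace_last_line;
static int trace_hits;

/* Record and print the current position; the last line printed is the clue. */
#define TRACE_POINT()                                                   \
	do {                                                            \
		trace_last_func = __func__;                             \
		trace_last_line = __LINE__;                             \
		trace_hits++;                                           \
		fprintf(stderr, "trace: %s:%d\n", __func__, __LINE__);  \
	} while (0)

/* Hypothetical per-block step; faulty_block stands in for the bad access. */
static int scan_block(int block, int faulty_block)
{
	TRACE_POINT();                  /* reached for every block */
	if (block == faulty_block)
		return -1;              /* where the real code would never return */
	TRACE_POINT();                  /* reached only if the block was fine */
	return 0;
}

/* Hypothetical scan loop: returns the block that failed, or -1 if none did. */
static int scan_device(int nblocks, int faulty_block)
{
	for (int b = 0; b < nblocks; b++) {
		if (scan_block(b, faulty_block))
			return b;
	}
	return -1;
}
```

The last "trace:" line printed before the hang tells you which checkpoint
was reached; adding more checkpoints between it and the next one narrows
the location further.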

I looked at the patch and I don't see anything strange. Besides, I have a
datasheet for a close enough part and I don't see what could confuse the
device.

Are you really sure this patch is the problem? Is the WP pin wired on
your design?

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 13:45             ` Miquel Raynal
@ 2023-03-24 14:15               ` Álvaro Fernández Rojas
  -1 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-24 14:15 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hi Miquèl,

2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> Hi Álvaro,
>
> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
>
>> El vie, 24 mar 2023 a las 11:49, Miquel Raynal
>> (<miquel.raynal@bootlin.com>) escribió:
>> >
>> > Hi Álvaro,
>> >
>> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>> >
>> > > Hi Miquèl,
>> > >
>> > > El vie, 24 mar 2023 a las 10:40, Miquel Raynal
>> > > (<miquel.raynal@bootlin.com>) escribió:
>> > > >
>> > > > Hi Álvaro,
>> > > >
>> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>> > > >
>> > > > > Add new "mxic,disable-block-protection" binding documentation.
>> > > > > This binding allows disabling block protection support for those
>> > > > > devices not
>> > > > > supporting it.
>> > > > >
>> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
>> > > > > ---
>> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
>> > > > >  1 file changed, 3 insertions(+)
>> > > > >
>> > > > > diff --git
>> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
>> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> > > > > @@ -16,6 +16,9 @@ in children nodes.
>> > > > >  Required NAND chip properties in children mode:
>> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
>> > > > >
>> > > > > +Optional NAND chip properties in children mode:
>> > > > > +- block protection disable: should be
>> > > > > "mxic,disable-block-protection"
>> > > > > +
>> > > >
>> > > > Besides the fact that nowadays we prefer to see binding conversions
>> > > > to
>> > > > yaml before adding anything, I don't think this will fly.
>> > > >
>> > > > I'm not sure exactly what "disable block protection" means, we
>> > > > already have similar properties like "lock" and "secure-regions",
>> > > > not
>> > > > sure they will fit but I think it's worth checking.
>> > >
>> > > As explained in 2/2, commit 03a539c7a118 introduced a regression on
>> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
>> > > which hangs the device.
>> > >
>> > > This is the log with block protection disabled:
>> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
>> > > for
>> > > state default
>> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>> > > 0xf1
>> > > [    0.511526] nand: Macronix MX30LF1G18AC
>> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>> > > 2048, OOB size: 64
>> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
>> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>> > > [    0.535912] Bad block table found at page 65472, version 0x01
>> > > [    0.544268] Bad block table found at page 65408, version 0x01
>> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
>> > > brcmnand.0
>> > > ...
>> > >
>> > > This is the log with block protection enabled:
>> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
>> > > for
>> > > state default
>> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>> > > 0xf1
>> > > [    0.510772] nand: Macronix MX30LF1G18AC
>> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>> > > 2048, OOB size: 64
>> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
>> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>> > > [    0.539687] Bad block table not found for chip 0
>> > > [    0.550153] Bad block table not found for chip 0
>> > > [    0.555069] Scanning device for bad blocks
>> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
>> > > virtual
>> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
>> > > *** Device hangs ***
>> > >
>> > > Enabling macronix_nand_block_protection_support() makes the device
>> > > unable to detect the bad block table and hangs it when trying to scan
>> > > for bad blocks.
>> >
>> > Please trace nand_macronix.c and look:
>> > - are the get_features and set_features really supported by the
>> >   controller driver?
>>
>> This is what I could find by debugging:
>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
>> state default
>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
>> [    0.512077] nand: Macronix MX30LF1G18AC
>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>> 2048, OOB size: 64
>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
>> 00 00 00] -> 0
>> [    0.602341] macronix_nand_block_protection_support:
>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
>> [    0.610548] macronix_nand_block_protection_support: !=
>> MXIC_BLOCK_PROTECTION_ALL_LOCK
>> [    0.624760] Bad block table not found for chip 0
>> [    0.635542] Bad block table not found for chip 0
>> [    0.640270] Scanning device for bad blocks
>>
>> I don't know how to tell if get_features / set_features is really
>> supported...
>
> Looks like your driver does not support exec_op but the core provides a
> get/set_feature implementation.

According to Florian, low level should be supported on brcmnand
controllers >= 4.0
Also:
https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
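
For reference, the ll_op words in the trace above can be decoded with a
small sketch like the one below. The bit positions are an assumption taken
from the LLOP_* macros in the mainline brcmnand driver, and llop_describe()
is a hypothetical helper for illustration, not kernel code:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit layout assumed from the mainline brcmnand driver (LLOP_* macros). */
#define LLOP_RE          (1u << 16)  /* read enable strobe */
#define LLOP_WE          (1u << 17)  /* write enable strobe */
#define LLOP_ALE         (1u << 18)  /* address latch enable */
#define LLOP_CLE         (1u << 19)  /* command latch enable */
#define LLOP_RETURN_IDLE (1u << 31)  /* set on the last cycle of a sequence */
#define LLOP_DATA_MASK   0xffffu     /* data bus value */

/* Hypothetical helper: describe one ll_op word in a caller-supplied buffer. */
static const char *llop_describe(uint32_t word, char *buf, size_t len)
{
	const char *kind;

	if ((word & LLOP_WE) && (word & LLOP_CLE))
		kind = "CMD";
	else if ((word & LLOP_WE) && (word & LLOP_ALE))
		kind = "ADDR";
	else if (word & LLOP_WE)
		kind = "WRITE";
	else if (word & LLOP_RE)
		kind = "READ";
	else
		kind = "UNKNOWN";

	snprintf(buf, len, "%s data=0x%02x%s", kind,
		 (unsigned int)(word & LLOP_DATA_MASK),
		 (word & LLOP_RETURN_IDLE) ? " (last)" : "");
	return buf;
}
```

Under these assumptions, 0xa00ee reads as CMD 0xee (GET FEATURES), 0x600a0
as ADDR 0xa0, 0x10000 as a READ cycle and 0x80010000 as the final READ
cycle. That would match a plain GET FEATURES at address 0xa0 returning four
zero bytes.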

>
>>
>> > - what is the state of the locking configuration in the chip when you
>> >   boot?
>>
>> Unlocked, I guess...
>> How can I check that?
>
> It's in your dump, the chip returns 0, meaning it's all unlocked,
> apparently.

Well, I can read/write the device if block protection isn’t disabled,
so I guess we can confirm it’s unlocked…

>
>> > - is there anything that locks the device by calling mxic_nand_lock() ?
>
> So nobody locks the device I guess? Did you add traces there?

It doesn’t get to the point where it enables the lock/unlock functions,
since it fails when checking whether the feature is 0x38, so there’s no
point in adding those traces…
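
The decision described above can be modelled in userspace roughly as
follows. This is only a simplified sketch: the 0x38 value and the 4-byte
subfeature buffer come from the trace, while mxic_would_install_lock_ops()
is a hypothetical name and not the function in nand_macronix.c:

```c
#include <stdbool.h>
#include <stdint.h>

#define ONFI_SUBFEATURE_PARAM_LEN       4     /* bytes returned by GET FEATURES */
#define MXIC_BLOCK_PROTECTION_ALL_LOCK  0x38  /* value compared against in the trace */

/*
 * Hypothetical userspace model: "ret" is the result of the GET FEATURES
 * call and "feature" holds the returned subfeature bytes. Returns true only
 * when the lock/unlock hooks would be installed.
 */
static bool mxic_would_install_lock_ops(int ret,
		const uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN])
{
	if (ret)
		return false;	/* GET FEATURES itself failed */
	if (feature[0] != MXIC_BLOCK_PROTECTION_ALL_LOCK)
		return false;	/* chip did not report "all locked" */
	return true;
}
```

With the [00 00 00 00] reply from the trace this returns false, so in this
model the lock/unlock hooks are never reached on this board.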

>
>> > - finding no bbt is one thing, hanging is another, where is it hanging
>> >   exactly? (offset in nand/ and line in the code)
>>
>> I've got no idea...
>
> You can use ftrace or just add printks a bit everywhere and try to get
> closer and closer.

I think that after trying to get the feature it just starts reading
nonsense from the NAND and at some point it hangs due to that garbage…
Is it possible that the NAND starts behaving like this after getting
the feature due to some specific config of my device?

>
> I looked at the patch, I don't see anything strange. Besides, I have a
> close enough datasheet and I don't see what could confuse the device.
>
> Are you really sure this patch is the problem? Is the WP pin wired on
> your design?

There’s no WP pin in brcmnand controllers < 7.0

>
> Thanks,
> Miquèl
>

Thanks,
Álvaro

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 14:15               ` Álvaro Fernández Rojas
@ 2023-03-24 14:36                 ` Miquel Raynal
  -1 siblings, 0 replies; 50+ messages in thread
From: Miquel Raynal @ 2023-03-24 14:36 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel, Jaime Liao, YouChing

Hi Álvaro,

+ YouChing and Jaime from Macronix
TLDR for them: there is a misbehavior since Mason added block
protection support. Just checking if the blocks are protected seems to
misconfigure the chip entirely, see below. Any hints?

noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:

> Hi Miquèl,
> 
> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > Hi Álvaro,
> >
> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >  
> >> El vie, 24 mar 2023 a las 11:49, Miquel Raynal
> >> (<miquel.raynal@bootlin.com>) escribió:  
> >> >
> >> > Hi Álvaro,
> >> >
> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >> >  
> >> > > Hi Miquèl,
> >> > >
> >> > > El vie, 24 mar 2023 a las 10:40, Miquel Raynal
> >> > > (<miquel.raynal@bootlin.com>) escribió:  
> >> > > >
> >> > > > Hi Álvaro,
> >> > > >
> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >> > > >  
> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> >> > > > > This binding allows disabling block protection support for those
> >> > > > > devices not
> >> > > > > supporting it.
> >> > > > >
> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >> > > > > ---
> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> >> > > > >  1 file changed, 3 insertions(+)
> >> > > > >
> >> > > > > diff --git
> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> >> > > > >  Required NAND chip properties in children mode:
> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> >> > > > >
> >> > > > > +Optional NAND chip properties in children mode:
> >> > > > > +- block protection disable: should be
> >> > > > > "mxic,disable-block-protection"
> >> > > > > +  
> >> > > >
> >> > > > Besides the fact that nowadays we prefer to see binding conversions
> >> > > > to
> >> > > > yaml before adding anything, I don't think this will fly.
> >> > > >
> >> > > > I'm not sure exactly what "disable block protection" means, we
> >> > > > already have similar properties like "lock" and "secure-regions",
> >> > > > not
> >> > > > sure they will fit but I think it's worth checking.  
> >> > >
> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression on
> >> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> >> > > which hangs the device.
> >> > >
> >> > > This is the log with block protection disabled:
> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >> > > for
> >> > > state default
> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> > > 0xf1
> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> > > 2048, OOB size: 64
> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
> >> > > brcmnand.0
> >> > > ...
> >> > >
> >> > > This is the log with block protection enabled:
> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >> > > for
> >> > > state default
> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> > > 0xf1
> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> > > 2048, OOB size: 64
> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> > > [    0.539687] Bad block table not found for chip 0
> >> > > [    0.550153] Bad block table not found for chip 0
> >> > > [    0.555069] Scanning device for bad blocks
> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
> >> > > virtual
> >> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
> >> > > *** Device hangs ***
> >> > >
> >> > > Enabling macronix_nand_block_protection_support() makes the device
> >> > > unable to detect the bad block table and hangs it when trying to scan
> >> > > for bad blocks.  
> >> >
> >> > Please trace nand_macronix.c and look:
> >> > - are the get_features and set_features really supported by the
> >> >   controller driver?  
> >>
> >> This is what I could find by debugging:
> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >> state default
> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> >> [    0.512077] nand: Macronix MX30LF1G18AC
> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> 2048, OOB size: 64
> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >> 00 00 00] -> 0
> >> [    0.602341] macronix_nand_block_protection_support:
> >> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >> [    0.610548] macronix_nand_block_protection_support: !=
> >> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >> [    0.624760] Bad block table not found for chip 0
> >> [    0.635542] Bad block table not found for chip 0
> >> [    0.640270] Scanning device for bad blocks
> >>
> >> I don't know how to tell if get_features / set_features is really
> >> supported...  
> >
> > Looks like your driver does not support exec_op but the core provides a
> > get/set_feature implementation.  
> 
> According to Florian, low level should be supported on brcmnand
> controllers >= 4.0
> Also:
> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597

Just to be sure, you're using a mainline controller driver, not this
one?

> >  
> >>  
> >> > - what is the state of the locking configuration in the chip when you
> >> >   boot?  
> >>
> >> Unlocked, I guess...
> >> How can I check that?  
> >
> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> > apparently.  
> 
> Well, I can read/write the device if block protection isn’t disabled,
> so I guess we can confirm it’s unlocked…
> 
> >  
> >> > - is there anything that locks the device by calling mxic_nand_lock() ?  
> >
> > So nobody locks the device I guess? Did you add traces there?  
> 
> It doesn’t get to the point where it enables the lock/unlock functions,
> since it fails when checking whether the feature is 0x38, so there’s no
> point in adding those traces…

Right, it returns before setting these I guess.

> 
> >  
> >> > - finding no bbt is one thing, hanging is another, where is it hanging
> >> >   exactly? (offset in nand/ and line in the code)  
> >>
> >> I've got no idea...  
> >
> > You can use ftrace or just add printks a bit everywhere and try to get
> > closer and closer.  
> 
> I think that after trying to get the feature it just starts reading
> nonsense from the NAND and at some point it hangs due to that garbage…

It should refuse to mount the device somehow, but in no case should the
kernel hang.

> Is it possible that the NAND starts behaving like this after getting
> the feature due to some specific config of my device?
> 
> >
> > I looked at the patch, I don't see anything strange. Besides, I have a
> > close enough datasheet and I don't see what could confuse the device.
> >
> > Are you really sure this patch is the problem? Is the WP pin wired on
> > your design?  
> 
> There’s no WP pin in brcmnand controllers < 7.0

What about the chip?

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
@ 2023-03-24 14:36                 ` Miquel Raynal
  0 siblings, 0 replies; 50+ messages in thread
From: Miquel Raynal @ 2023-03-24 14:36 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel, Jaime Liao, YouChing

Hi Álvaro,

+ YouChing and Jaime from Macronix
TLDR for them: there is a misbehavior since Mason added block
protection support. Just checking if the blocks are protected seems to
misconfigure the chip entirely, see below. Any hints?

noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:

> Hi Miquèl,
> 
> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > Hi Álvaro,
> >
> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >  
> >> El vie, 24 mar 2023 a las 11:49, Miquel Raynal
> >> (<miquel.raynal@bootlin.com>) escribió:  
> >> >
> >> > Hi Álvaro,
> >> >
> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >> >  
> >> > > Hi Miquèl,
> >> > >
> >> > > El vie, 24 mar 2023 a las 10:40, Miquel Raynal
> >> > > (<miquel.raynal@bootlin.com>) escribió:  
> >> > > >
> >> > > > Hi Álvaro,
> >> > > >
> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >> > > >  
> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> >> > > > > This binding allows disabling block protection support for those
> >> > > > > devices not
> >> > > > > supporting it.
> >> > > > >
> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >> > > > > ---
> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> >> > > > >  1 file changed, 3 insertions(+)
> >> > > > >
> >> > > > > diff --git
> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> >> > > > >  Required NAND chip properties in children mode:
> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> >> > > > >
> >> > > > > +Optional NAND chip properties in children mode:
> >> > > > > +- block protection disable: should be
> >> > > > > "mxic,disable-block-protection"
> >> > > > > +  
> >> > > >
> >> > > > Besides the fact that nowadays we prefer to see binding conversions
> >> > > > to
> >> > > > yaml before adding anything, I don't think this will fly.
> >> > > >
> >> > > > I'm not sure exactly what "disable block protection" means, we
> >> > > > already have similar properties like "lock" and "secure-regions",
> >> > > > not
> >> > > > sure they will fit but I think it's worth checking.  
> >> > >
> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression on
> >> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> >> > > which hangs the device.
> >> > >
> >> > > This is the log with block protection disabled:
> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >> > > for
> >> > > state default
> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> > > 0xf1
> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> > > 2048, OOB size: 64
> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
> >> > > brcmnand.0
> >> > > ...
> >> > >
> >> > > This is the log with block protection enabled:
> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >> > > for
> >> > > state default
> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> > > 0xf1
> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> > > 2048, OOB size: 64
> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> > > [    0.539687] Bad block table not found for chip 0
> >> > > [    0.550153] Bad block table not found for chip 0
> >> > > [    0.555069] Scanning device for bad blocks
> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
> >> > > virtual
> >> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
> >> > > *** Device hangs ***
> >> > >
> >> > > Enabling macronix_nand_block_protection_support() makes the device
> >> > > unable to detect the bad block table and hangs it when trying to scan
> >> > > for bad blocks.  
> >> >
> >> > Please trace nand_macronix.c and look:
> >> > - are the get_features and set_features really supported by the
> >> >   controller driver?  
> >>
> >> This is what I could find by debugging:
> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >> state default
> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> >> [    0.512077] nand: Macronix MX30LF1G18AC
> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> 2048, OOB size: 64
> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >> 00 00 00] -> 0
> >> [    0.602341] macronix_nand_block_protection_support:
> >> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >> [    0.610548] macronix_nand_block_protection_support: !=
> >> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >> [    0.624760] Bad block table not found for chip 0
> >> [    0.635542] Bad block table not found for chip 0
> >> [    0.640270] Scanning device for bad blocks
> >>
> >> I don't know how to tell if get_features / set_features is really
> >> supported...  
> >
> > Looks like your driver does not support exec_op but the core provides a
> > get/set_feature implementation.  
> 
> According to Florian, low level should be supported on brcmnand
> controllers >= 4.0
> Also:
> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597

Just to be sure, you're using a mainline controller driver, not this
one?

> >  
> >>  
> >> > - what is the state of the locking configuration in the chip when you
> >> >   boot?  
> >>
> >> Unlocked, I guess...
> >> How can I check that?  
> >
> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> > apparently.  
> 
> Well, I can read/write the device if block protection isn’t disabled,
> so I guess we can confirm it’s unlocked…
> 
> >  
> >> > - is there anything that locks the device by calling mxic_nand_lock() ?  
> >
> > So nobody locks the device I guess? Did you add traces there?  
> 
> It doesn’t get to the point that it enabled the lock/unlock functions
> since it fails when checking if feature is 0x38, so there’s no point
> in adding those traces…

Right, it returns before setting these I guess.
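
For reference, the probe-time decision being discussed can be modelled in
plain userspace C roughly as below. This is only a sketch, not the kernel
code: struct model_chip and model_block_protection_probe() are hypothetical
names, the feature address 0xA0 is taken from "addr=a0" in the trace, and
0x38 is the value mentioned above as the one the check expects.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEATURE_ADDR_MXIC_PROTECTION   0xA0 /* "addr=a0" in the trace */
#define MXIC_BLOCK_PROTECTION_ALL_LOCK 0x38 /* value the check expects */

/* Hypothetical stand-in for the chip state the real driver updates. */
struct model_chip {
	bool has_lock_ops; /* lock/unlock hooks installed? */
};

/*
 * Model of the check: 'ret' is the GET FEATURES return code and 'feature'
 * holds the four subfeature bytes read back from address 0xA0.
 */
void model_block_protection_probe(struct model_chip *chip, int ret,
				  const uint8_t feature[4])
{
	chip->has_lock_ops = false;

	/* Read failed, or the chip did not report "all locked": bail out. */
	if (ret || feature[0] != MXIC_BLOCK_PROTECTION_ALL_LOCK)
		return;

	/* Only now would the lock/unlock hooks be installed. */
	chip->has_lock_ops = true;
}
```

With the bytes from the dump ([00 00 00 00], return code 0) the model takes
the early return, so no lock/unlock hook is installed and nothing on that
path should ever write to the protection feature.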

> 
> >  
> >> > - finding no bbt is one thing, hanging is another, where is it hanging
> >> >   exactly? (offset in nand/ and line in the code)  
> >>
> >> I've got no idea...  
> >
> > You can use ftrace or just add printks a bit everywhere and try to get
> > closer and closer.  
> 
> I think that after trying to get the feature it just starts reading
> nonsense from the NAND and at some point it hangs due to that garbage…

It should refuse to mount the device somehow, but in no case should the
kernel hang.

> Is it possible that the NAND starts behaving like this after getting
> the feature due to some specific config of my device?
> 
> >
> > I looked at the patch, I don't see anything strange. Besides, I have a
> > close enough datasheet and I don't see what could confuse the device.
> >
> > Are you really sure this patch is the problem? Is the WP pin wired on
> > your design?  
> 
> There’s no WP pin in brcmnand controllers < 7.0

What about the chip?

Thanks,
Miquèl

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 14:36                 ` Miquel Raynal
@ 2023-03-24 17:04                   ` Álvaro Fernández Rojas
  -1 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-24 17:04 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel, Jaime Liao, YouChing

Hi Miquèl,

2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> Hi Álvaro,
>
> + YouChing and Jaime from Macronix
> TLDR for them: there is a misbehavior since Mason added block
> protection support. Just checking if the blocks are protected seems to
> misconfigure the chip entirely, see below. Any hints?

Could it be that the NAND is stuck expecting a read 0x00 command which
isn’t sent after getting the features?

>
> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
>
>> Hi Miquèl,
>>
>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>> > Hi Álvaro,
>> >
>> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
>> >
>> >> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
>> >> (<miquel.raynal@bootlin.com>) wrote:
>> >> >
>> >> > Hi Álvaro,
>> >> >
>> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>> >> >
>> >> > > Hi Miquèl,
>> >> > >
>> >> > > On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
>> >> > > (<miquel.raynal@bootlin.com>) wrote:
>> >> > > >
>> >> > > > Hi Álvaro,
>> >> > > >
>> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>> >> > > >
>> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
>> >> > > > > This binding allows disabling block protection support for
>> >> > > > > those
>> >> > > > > devices not
>> >> > > > > supporting it.
>> >> > > > >
>> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
>> >> > > > > ---
>> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
>> >> > > > > +++
>> >> > > > >  1 file changed, 3 insertions(+)
>> >> > > > >
>> >> > > > > diff --git
>> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
>> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
>> >> > > > >  Required NAND chip properties in children mode:
>> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
>> >> > > > >
>> >> > > > > +Optional NAND chip properties in children mode:
>> >> > > > > +- block protection disable: should be
>> >> > > > > "mxic,disable-block-protection"
>> >> > > > > +
>> >> > > >
>> >> > > > Besides the fact that nowadays we prefer to see binding
>> >> > > > conversions
>> >> > > > to
>> >> > > > yaml before adding anything, I don't think this will fly.
>> >> > > >
>> >> > > > I'm not sure exactly what "disable block protection" means, we
>> >> > > > already have similar properties like "lock" and
>> >> > > > "secure-regions",
>> >> > > > not
>> >> > > > sure they will fit but I think it's worth checking.
>> >> > >
>> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression
>> >> > > on
>> >> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
>> >> > > MX30LF1G18AC
>> >> > > which hangs the device.
>> >> > >
>> >> > > This is the log with block protection disabled:
>> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
>> >> > > for
>> >> > > state default
>> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>> >> > > 0xf1
>> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
>> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>> >> > > 2048, OOB size: 64
>> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
>> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
>> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
>> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
>> >> > > brcmnand.0
>> >> > > ...
>> >> > >
>> >> > > This is the log with block protection enabled:
>> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
>> >> > > for
>> >> > > state default
>> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>> >> > > 0xf1
>> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
>> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>> >> > > 2048, OOB size: 64
>> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
>> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>> >> > > [    0.539687] Bad block table not found for chip 0
>> >> > > [    0.550153] Bad block table not found for chip 0
>> >> > > [    0.555069] Scanning device for bad blocks
>> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
>> >> > > virtual
>> >> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
>> >> > > *** Device hangs ***
>> >> > >
>> >> > > Enabling macronix_nand_block_protection_support() makes the device
>> >> > > unable to detect the bad block table and hangs it when trying to
>> >> > > scan
>> >> > > for bad blocks.
>> >> >
>> >> > Please trace nand_macronix.c and look:
>> >> > - are the get_features and set_features really supported by the
>> >> >   controller driver?
>> >>
>> >> This is what I could find by debugging:
>> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
>> >> state default
>> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>> >> 0xf1
>> >> [    0.512077] nand: Macronix MX30LF1G18AC
>> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>> >> 2048, OOB size: 64
>> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
>> >> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
>> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
>> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>> >> 0x00
>> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>> >> 0x00
>> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>> >> 0x00
>> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
>> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>> >> 0x00
>> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
>> >> 00 00 00] -> 0
>> >> [    0.602341] macronix_nand_block_protection_support:
>> >> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
>> >> [    0.610548] macronix_nand_block_protection_support: !=
>> >> MXIC_BLOCK_PROTECTION_ALL_LOCK
>> >> [    0.624760] Bad block table not found for chip 0
>> >> [    0.635542] Bad block table not found for chip 0
>> >> [    0.640270] Scanning device for bad blocks
>> >>
>> >> I don't know how to tell if get_features / set_features is really
>> >> supported...
>> >
>> > Looks like your driver does not support exec_op but the core provides a
>> > get/set_feature implementation.
>>
>> According to Florian, low level should be supported on brcmnand
>> controllers >= 4.0
>> Also:
>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
>
> Just to be sure, you're using a mainline controller driver, not this
> one?

Yes, this was just to prove that the HW I’m using has get/set features support.
I’m using OpenWrt, so it’s the Linux v5.15 driver.
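
For anyone reading the trace: the ll_op words seem to decode cleanly if one
assumes the low-level-op bit layout I recall from the mainline brcmnand
driver (data in bits 15:0, RE/WE/ALE/CLE in bits 16 to 19, return-to-idle in
bit 31). That layout is an assumption on my side, not something stated in
this thread, so please check it against your tree. llop_decode() is a
hypothetical helper, written as plain userspace C.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed bit layout of one low-level-op word (verify against your tree). */
#define LLOP_RE          (1u << 16)
#define LLOP_WE          (1u << 17)
#define LLOP_ALE         (1u << 18)
#define LLOP_CLE         (1u << 19)
#define LLOP_RETURN_IDLE (1u << 31)
#define LLOP_DATA_MASK   0xffffu

/* Describe one ll_op word in buf and return buf. */
const char *llop_decode(uint32_t word, char *buf, size_t len)
{
	const char *last = (word & LLOP_RETURN_IDLE) ? " (last)" : "";
	const char *kind;

	if ((word & LLOP_WE) && (word & LLOP_CLE))
		kind = "CMD";      /* command cycle */
	else if ((word & LLOP_WE) && (word & LLOP_ALE))
		kind = "ADDR";     /* address cycle */
	else if (word & LLOP_WE)
		kind = "WRITE";    /* data-in cycle */
	else if (word & LLOP_RE)
		kind = "READ";     /* data-out cycle */
	else
		kind = "UNKNOWN";

	/* Only write-type cycles carry a meaningful data byte in the word. */
	if (word & LLOP_WE)
		snprintf(buf, len, "%s 0x%02x%s", kind,
			 (unsigned int)(word & LLOP_DATA_MASK), last);
	else
		snprintf(buf, len, "%s%s", kind, last);

	return buf;
}
```

Under these assumptions 0xa00ee is command 0xEE (GET FEATURES), 0x600a0 is
address 0xA0, and the three 0x10000 words plus 0x80010000 are the four
parameter byte reads, the last one returning the controller to idle. In the
lines shown, no 0x00 command follows that sequence.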

>
>> >
>> >>
>> >> > - what is the state of the locking configuration in the chip when
>> >> > you
>> >> >   boot?
>> >>
>> >> Unlocked, I guess...
>> >> How can I check that?
>> >
>> > It's in your dump, the chip returns 0, meaning it's all unlocked,
>> > apparently.
>>
>> Well, I can read/write the device if block protection isn’t disabled,
>> so I guess we can confirm it’s unlocked…
>>
>> >
>> >> > - is there anything that locks the device by calling mxic_nand_lock()
>> >> > ?
>> >
>> > So nobody locks the device I guess? Did you add traces there?
>>
>> It doesn’t get to the point that it enabled the lock/unlock functions
>> since it fails when checking if feature is 0x38, so there’s no point
>> in adding those traces…
>
> Right, it returns before setting these I guess.
>
>>
>> >
>> >> > - finding no bbt is one thing, hanging is another, where is it
>> >> > hanging
>> >> >   exactly? (offset in nand/ and line in the code)
>> >>
>> >> I've got no idea...
>> >
>> > You can use ftrace or just add printks a bit everywhere and try to get
>> > closer and closer.
>>
>> I think that after trying to get the feature it just starts reading
>> nonsense from the NAND and at some point it hangs due to that garbage…
>
> It should refuse to mount the device somehow, but in no case should the
> kernel hang.

Yes, I think that this is a side effect (maybe a different bug somewhere else).

>
>> Is it possible that the NAND starts behaving like this after getting
>> the feature due to some specific config of my device?
>>
>> >
>> > I looked at the patch, I don't see anything strange. Besides, I have a
>> > close enough datasheet and I don't see what could confuse the device.
>> >
>> > Are you really sure this patch is the problem? Is the WP pin wired on
>> > your design?
>>
>> There’s no WP pin in brcmnand controllers < 7.0
>
> What about the chip?

Maybe it has a GPIO controlling that, but I don’t have that info…

>
> Thanks,
> Miquèl
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 17:04                   ` Álvaro Fernández Rojas
@ 2023-03-27  8:21                     ` Miquel Raynal
  -1 siblings, 0 replies; 50+ messages in thread
From: Miquel Raynal @ 2023-03-27  8:21 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel, Jaime Liao, YouChing

Hi Álvaro,

noltari@gmail.com wrote on Fri, 24 Mar 2023 18:04:38 +0100:

> Hi Miquèl,
> 
> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > Hi Álvaro,
> >
> > + YouChing and Jaime from Macronix
> > TLDR for them: there is a misbehavior since Mason added block
> > protection support. Just checking if the blocks are protected seems to
> > misconfigure the chip entirely, see below. Any hints?  
> 
> Could it be that the NAND is stuck expecting a read 0x00 command which
> isn’t sent after getting the features?

I have no idea; please try that. You can manually generate a READ0 by
hacking an existing read helper.

> > noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >  
> >> Hi Miquèl,
> >>
> >> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:  
> >> > Hi Álvaro,
> >> >
> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >> >  
> >> >> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >> >> (<miquel.raynal@bootlin.com>) wrote:
> >> >> >
> >> >> > Hi Álvaro,
> >> >> >
> >> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >> >> >  
> >> >> > > Hi Miquèl,
> >> >> > >
> >> >> > > On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >> >> > > (<miquel.raynal@bootlin.com>) wrote:
> >> >> > > >
> >> >> > > > Hi Álvaro,
> >> >> > > >
> >> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >> >> > > >  
> >> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> >> >> > > > > This binding allows disabling block protection support for
> >> >> > > > > those
> >> >> > > > > devices not
> >> >> > > > > supporting it.
> >> >> > > > >
> >> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >> >> > > > > ---
> >> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> >> >> > > > > +++
> >> >> > > > >  1 file changed, 3 insertions(+)
> >> >> > > > >
> >> >> > > > > diff --git
> >> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> >> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> >> >> > > > >  Required NAND chip properties in children mode:
> >> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> >> >> > > > >
> >> >> > > > > +Optional NAND chip properties in children mode:
> >> >> > > > > +- block protection disable: should be
> >> >> > > > > "mxic,disable-block-protection"
> >> >> > > > > +  
> >> >> > > >
> >> >> > > > Besides the fact that nowadays we prefer to see binding
> >> >> > > > conversions
> >> >> > > > to
> >> >> > > > yaml before adding anything, I don't think this will fly.
> >> >> > > >
> >> >> > > > I'm not sure exactly what "disable block protection" means, we
> >> >> > > > already have similar properties like "lock" and
> >> >> > > > "secure-regions",
> >> >> > > > not
> >> >> > > > sure they will fit but I think it's worth checking.  
> >> >> > >
> >> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression
> >> >> > > on
> >> >> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> >> >> > > MX30LF1G18AC
> >> >> > > which hangs the device.
> >> >> > >
> >> >> > > This is the log with block protection disabled:
> >> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >> >> > > for
> >> >> > > state default
> >> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> > > 0xf1
> >> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> >> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> > > 2048, OOB size: 64
> >> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> >> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> >> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
> >> >> > > brcmnand.0
> >> >> > > ...
> >> >> > >
> >> >> > > This is the log with block protection enabled:
> >> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >> >> > > for
> >> >> > > state default
> >> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> > > 0xf1
> >> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> >> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> > > 2048, OOB size: 64
> >> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> > > [    0.539687] Bad block table not found for chip 0
> >> >> > > [    0.550153] Bad block table not found for chip 0
> >> >> > > [    0.555069] Scanning device for bad blocks
> >> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
> >> >> > > virtual
> >> >> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
> >> >> > > *** Device hangs ***
> >> >> > >
> >> >> > > Enabling macronix_nand_block_protection_support() makes the device
> >> >> > > unable to detect the bad block table and hangs it when trying to
> >> >> > > scan
> >> >> > > for bad blocks.  
> >> >> >
> >> >> > Please trace nand_macronix.c and look:
> >> >> > - are the get_features and set_features really supported by the
> >> >> >   controller driver?  
> >> >>
> >> >> This is what I could find by debugging:
> >> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >> >> state default
> >> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> 0xf1
> >> >> [    0.512077] nand: Macronix MX30LF1G18AC
> >> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> 2048, OOB size: 64
> >> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >> >> 00 00 00] -> 0
> >> >> [    0.602341] macronix_nand_block_protection_support:
> >> >> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >> >> [    0.610548] macronix_nand_block_protection_support: !=
> >> >> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >> >> [    0.624760] Bad block table not found for chip 0
> >> >> [    0.635542] Bad block table not found for chip 0
> >> >> [    0.640270] Scanning device for bad blocks
> >> >>
> >> >> I don't know how to tell if get_features / set_features is really
> >> >> supported...  
> >> >
> >> > Looks like your driver does not support exec_op but the core provides a
> >> > get/set_feature implementation.  
> >>
> >> According to Florian, low level should be supported on brcmnand
> >> controllers >= 4.0
> >> Also:
> >> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597  
> >
> > Just to be sure, you're using a mainline controller driver, not this
> > one?  
> 
> Yes, this was just to prove that the HW I’m using has get/set features support.
> I’m using OpenWrt, so it’s linux v5.15 driver.

Ok, thanks for the confirmation.

> 
> >  
> >> >  
> >> >>  
> >> >> > - what is the state of the locking configuration in the chip when
> >> >> > you
> >> >> >   boot?  
> >> >>
> >> >> Unlocked, I guess...
> >> >> How can I check that?  
> >> >
> >> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> >> > apparently.  
> >>
> >> Well, I can read/write the device if block protection isn’t disabled,
> >> so I guess we can confirm it’s unlocked…
> >>  
> >> >  
> >> >> > - is there anything that locks the device by calling mxic_nand_lock()
> >> >> > ?  
> >> >
> >> > So nobody locks the device I guess? Did you add traces there?  
> >>
> >> It doesn’t get to the point that it enabled the lock/unlock functions
> >> since it fails when checking if feature is 0x38, so there’s no point
> >> in adding those traces…  
> >
> > Right, it returns before setting these I guess.
> >  
> >>  
> >> >  
> >> >> > - finding no bbt is one thing, hanging is another, where is it
> >> >> > hanging
> >> >> >   exactly? (offset in nand/ and line in the code)  
> >> >>
> >> >> I've got no idea...  
> >> >
> >> > You can use ftrace or just add printks a bit everywhere and try to get
> >> > closer and closer.  
> >>
> >> I think that after trying to get the feature it just starts reading
> >> nonsense from the NAND and at some point it hangs due to that garbage…  
> >
> > It should refuse to mount the device somehow, but in no case the kernel
> > should hang.  
> 
> Yes, I think that this is a side effect (maybe a different bug somewhere else).

Could be worth checking.

> 
> >  
> >> Is it possible that the NAND starts behaving like this after getting
> >> the feature due to some specific config of my device?
> >>  
> >> >
> >> > I looked at the patch, I don't see anything strange. Besides, I have a
> >> > close enough datasheet and I don't see what could confuse the device.
> >> >
> >> > Are you really sure this patch is the problem? Is the WP pin wired on
> >> > your design?  
> >>
> >> There’s no WP pin in brcmnand controllers < 7.0  
> >
> > What about the chip?  
> 
> Maybe it has a GPIO controlling that, but I don’t have that info…

I mean, on the board, is the chip connected to some kind of
pull-up/down resistor? Because it may change its behavior.

Regarding your issue, I see there is a problem, but I don't get why.
The current proposal is not satisfying, I cannot pick this up. We
need feedback from Macronix :-)
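
For reference, the decision being discussed can be modelled in a few lines.
This is only a simplified sketch of what the thread describes (read feature
address 0xA0, compare the first byte with 0x38, bail out otherwise), not the
driver code itself; get_features_fn, mxic_protection_hooks_wanted() and the
two fake chips are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define ONFI_FEATURE_ADDR_MXIC_PROTECTION	0xA0	/* "addr=a0" in the trace */
#define MXIC_BLOCK_PROTECTION_ALL_LOCK		0x38	/* value the check expects */
#define ONFI_SUBFEATURE_PARAM_LEN		4

/* Hypothetical stand-in for the controller's GET FEATURES path. */
typedef int (*get_features_fn)(uint8_t addr, uint8_t *param);

/*
 * Returns true when the lock/unlock hooks would be installed, i.e. only
 * when GET FEATURES succeeds and the chip reports "all blocks locked".
 */
static bool mxic_protection_hooks_wanted(get_features_fn get_features)
{
	uint8_t param[ONFI_SUBFEATURE_PARAM_LEN] = { 0 };

	if (get_features(ONFI_FEATURE_ADDR_MXIC_PROTECTION, param))
		return false;	/* GET FEATURES itself failed */

	return param[0] == MXIC_BLOCK_PROTECTION_ALL_LOCK;
}

/* Fake chip matching the trace: every subfeature byte reads back as 0x00. */
static int chip_reports_unlocked(uint8_t addr, uint8_t *param)
{
	(void)addr;
	for (int i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; i++)
		param[i] = 0x00;
	return 0;
}

/* Fake chip that powers up with all blocks locked. */
static int chip_reports_all_locked(uint8_t addr, uint8_t *param)
{
	(void)addr;
	param[0] = MXIC_BLOCK_PROTECTION_ALL_LOCK;
	param[1] = param[2] = param[3] = 0x00;
	return 0;
}
```

With the values from the trace (all four bytes 0x00), the model returns
false, which matches the observation that the lock/unlock functions are
never reached. So in this model the only chip access left is the
GET FEATURES read itself.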

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-03-24 17:04                   ` Álvaro Fernández Rojas
@ 2023-04-22  9:28                     ` Álvaro Fernández Rojas
  -1 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-04-22  9:28 UTC (permalink / raw)
  To: Miquel Raynal, Jaime Liao, YouChing
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hello YouChing and Jaime,

I still didn't get any feedback from you (or Macronix) on this issue.
Did you have time to look into it?

Thanks,
Álvaro.

On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
(<noltari@gmail.com>) wrote:
>
> Hi Miquèl,
>
> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > Hi Álvaro,
> >
> > + YouChing and Jaime from Macronix
> > TLDR for them: there is a misbehavior since Mason added block
> > protection support. Just checking if the blocks are protected seems to
> > misconfigure the chip entirely, see below. Any hints?
>
> Could it be that the NAND is stuck expecting a read 0x00 command which
> isn’t sent after getting the features?
>
> >
> > noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >
> >> Hi Miquèl,
> >>
> >> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >> > Hi Álvaro,
> >> >
> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >> >
> >> >> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >> >> (<miquel.raynal@bootlin.com>) wrote:
> >> >> >
> >> >> > Hi Álvaro,
> >> >> >
> >> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >> >> >
> >> >> > > Hi Miquèl,
> >> >> > >
> >> >> > > On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >> >> > > (<miquel.raynal@bootlin.com>) wrote:
> >> >> > > >
> >> >> > > > Hi Álvaro,
> >> >> > > >
> >> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >> >> > > >
> >> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> >> >> > > > > This binding allows disabling block protection support for
> >> >> > > > > those
> >> >> > > > > devices not
> >> >> > > > > supporting it.
> >> >> > > > >
> >> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >> >> > > > > ---
> >> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> >> >> > > > > +++
> >> >> > > > >  1 file changed, 3 insertions(+)
> >> >> > > > >
> >> >> > > > > diff --git
> >> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> >> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> >> >> > > > >  Required NAND chip properties in children mode:
> >> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> >> >> > > > >
> >> >> > > > > +Optional NAND chip properties in children mode:
> >> >> > > > > +- block protection disable: should be
> >> >> > > > > "mxic,disable-block-protection"
> >> >> > > > > +
> >> >> > > >
> >> >> > > > Besides the fact that nowadays we prefer to see binding
> >> >> > > > conversions
> >> >> > > > to
> >> >> > > > yaml before adding anything, I don't think this will fly.
> >> >> > > >
> >> >> > > > I'm not sure exactly what "disable block protection" means, we
> >> >> > > > already have similar properties like "lock" and
> >> >> > > > "secure-regions",
> >> >> > > > not
> >> >> > > > sure they will fit but I think it's worth checking.
> >> >> > >
> >> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression
> >> >> > > on
> >> >> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> >> >> > > MX30LF1G18AC
> >> >> > > which hangs the device.
> >> >> > >
> >> >> > > This is the log with block protection disabled:
> >> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >> >> > > for
> >> >> > > state default
> >> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> > > 0xf1
> >> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> >> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> > > 2048, OOB size: 64
> >> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> >> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> >> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
> >> >> > > brcmnand.0
> >> >> > > ...
> >> >> > >
> >> >> > > This is the log with block protection enabled:
> >> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >> >> > > for
> >> >> > > state default
> >> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> > > 0xf1
> >> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> >> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> > > 2048, OOB size: 64
> >> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> > > [    0.539687] Bad block table not found for chip 0
> >> >> > > [    0.550153] Bad block table not found for chip 0
> >> >> > > [    0.555069] Scanning device for bad blocks
> >> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
> >> >> > > virtual
> >> >> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
> >> >> > > *** Device hangs ***
> >> >> > >
> >> >> > > Enabling macronix_nand_block_protection_support() makes the device
> >> >> > > unable to detect the bad block table and hangs it when trying to
> >> >> > > scan
> >> >> > > for bad blocks.
> >> >> >
> >> >> > Please trace nand_macronix.c and look:
> >> >> > - are the get_features and set_features really supported by the
> >> >> >   controller driver?
> >> >>
> >> >> This is what I could find by debugging:
> >> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >> >> state default
> >> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> 0xf1
> >> >> [    0.512077] nand: Macronix MX30LF1G18AC
> >> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> 2048, OOB size: 64
> >> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >> >> 00 00 00] -> 0
> >> >> [    0.602341] macronix_nand_block_protection_support:
> >> >> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >> >> [    0.610548] macronix_nand_block_protection_support: !=
> >> >> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >> >> [    0.624760] Bad block table not found for chip 0
> >> >> [    0.635542] Bad block table not found for chip 0
> >> >> [    0.640270] Scanning device for bad blocks
> >> >>
> >> >> I don't know how to tell if get_features / set_features is really
> >> >> supported...
> >> >
> >> > Looks like your driver does not support exec_op but the core provides a
> >> > get/set_feature implementation.
> >>
> >> According to Florian, low level should be supported on brcmnand
> >> controllers >= 4.0
> >> Also:
> >> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >
> > Just to be sure, you're using a mainline controller driver, not this
> > one?
>
> Yes, this was just to prove that the HW I’m using has get/set features support.
> I’m using OpenWrt, so it’s the Linux v5.15 driver.
>
> >
> >> >
> >> >>
> >> >> > - what is the state of the locking configuration in the chip when
> >> >> > you
> >> >> >   boot?
> >> >>
> >> >> Unlocked, I guess...
> >> >> How can I check that?
> >> >
> >> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> >> > apparently.
> >>
> >> Well, I can read/write the device if block protection isn’t enabled,
> >> so I guess we can confirm it’s unlocked…
> >>
> >> >
> >> >> > - is there anything that locks the device by calling mxic_nand_lock()
> >> >> > ?
> >> >
> >> > So nobody locks the device I guess? Did you add traces there?
> >>
> >> It doesn’t get to the point where it enables the lock/unlock functions,
> >> since the check for the feature being 0x38 fails, so there’s no point
> >> in adding those traces…
> >
> > Right, it returns before setting these I guess.
> >
> >>
> >> >
> >> >> > - finding no bbt is one thing, hanging is another, where is it
> >> >> > hanging
> >> >> >   exactly? (offset in nand/ and line in the code)
> >> >>
> >> >> I've got no idea...
> >> >
> >> > You can use ftrace or just add printks a bit everywhere and try to get
> >> > closer and closer.
> >>
> >> I think that after trying to get the feature it just starts reading
> >> nonsense from the NAND and at some point it hangs due to that garbage…
> >
> > It should refuse to mount the device somehow, but in no case the kernel
> > should hang.
>
> Yes, I think that this is a side effect (maybe a different bug somewhere else).
>
> >
> >> Is it possible that the NAND starts behaving like this after getting
> >> the feature due to some specific config of my device?
> >>
> >> >
> >> > I looked at the patch, I don't see anything strange. Besides, I have a
> >> > close enough datasheet and I don't see what could confuse the device.
> >> >
> >> > Are you really sure this patch is the problem? Is the WP pin wired on
> >> > your design?
> >>
> >> There’s no WP pin in brcmnand controllers < 7.0
> >
> > What about the chip?
>
> Maybe it has a GPIO controlling that, but I don’t have that info…
>
> >
> > Thanks,
> > Miquèl
> >

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
@ 2023-04-22  9:28                     ` Álvaro Fernández Rojas
  0 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-04-22  9:28 UTC (permalink / raw)
  To: Miquel Raynal, Jaime Liao, YouChing
  Cc: richard, vigneshr, robh+dt, krzysztof.kozlowski+dt, masonccyang,
	linux-mtd, devicetree, linux-kernel

Hello YouChing and Jaime,

I still didn't get any feedback from you (or Macronix) on this issue.
Did you have time to look into it?

Thanks,
Álvaro.

On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
(<noltari@gmail.com>) wrote:
>
> Hi Miquèl,
>
> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > Hi Álvaro,
> >
> > + YouChing and Jaime from Macronix
> > TLDR for them: there is a misbehavior since Mason added block
> > protection support. Just checking if the blocks are protected seems to
> > misconfigure the chip entirely, see below. Any hints?
>
> Could it be that the NAND is stuck expecting a read 0x00 command which
> isn’t sent after getting the features?
>
> >
> > noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >
> >> Hi Miquèl,
> >>
> >> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >> > Hi Álvaro,
> >> >
> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >> >
> >> >> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >> >> (<miquel.raynal@bootlin.com>) wrote:
> >> >> >
> >> >> > Hi Álvaro,
> >> >> >
> >> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >> >> >
> >> >> > > Hi Miquèl,
> >> >> > >
> >> >> > > On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >> >> > > (<miquel.raynal@bootlin.com>) wrote:
> >> >> > > >
> >> >> > > > Hi Álvaro,
> >> >> > > >
> >> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >> >> > > >
> >> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> >> >> > > > > This binding allows disabling block protection support for
> >> >> > > > > those
> >> >> > > > > devices not
> >> >> > > > > supporting it.
> >> >> > > > >
> >> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >> >> > > > > ---
> >> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> >> >> > > > > +++
> >> >> > > > >  1 file changed, 3 insertions(+)
> >> >> > > > >
> >> >> > > > > diff --git
> >> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> >> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> >> >> > > > >  Required NAND chip properties in children mode:
> >> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> >> >> > > > >
> >> >> > > > > +Optional NAND chip properties in children mode:
> >> >> > > > > +- block protection disable: should be
> >> >> > > > > "mxic,disable-block-protection"
> >> >> > > > > +
> >> >> > > >
> >> >> > > > Besides the fact that nowadays we prefer to see binding
> >> >> > > > conversions
> >> >> > > > to
> >> >> > > > yaml before adding anything, I don't think this will fly.
> >> >> > > >
> >> >> > > > I'm not sure exactly what "disable block protection" means, we
> >> >> > > > already have similar properties like "lock" and
> >> >> > > > "secure-regions",
> >> >> > > > not
> >> >> > > > sure they will fit but I think it's worth checking.
> >> >> > >
> >> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression
> >> >> > > on
> >> >> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> >> >> > > MX30LF1G18AC
> >> >> > > which hangs the device.
> >> >> > >
> >> >> > > This is the log with block protection disabled:
> >> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >> >> > > for
> >> >> > > state default
> >> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> > > 0xf1
> >> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> >> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> > > 2048, OOB size: 64
> >> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> >> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> >> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
> >> >> > > brcmnand.0
> >> >> > > ...
> >> >> > >
> >> >> > > This is the log with block protection enabled:
> >> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >> >> > > for
> >> >> > > state default
> >> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> > > 0xf1
> >> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> >> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> > > 2048, OOB size: 64
> >> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> > > [    0.539687] Bad block table not found for chip 0
> >> >> > > [    0.550153] Bad block table not found for chip 0
> >> >> > > [    0.555069] Scanning device for bad blocks
> >> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
> >> >> > > virtual
> >> >> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
> >> >> > > *** Device hangs ***
> >> >> > >
> >> >> > > Enabling macronix_nand_block_protection_support() makes the device
> >> >> > > unable to detect the bad block table and hangs it when trying to
> >> >> > > scan
> >> >> > > for bad blocks.
> >> >> >
> >> >> > Please trace nand_macronix.c and look:
> >> >> > - are the get_features and set_features really supported by the
> >> >> >   controller driver?
> >> >>
> >> >> This is what I could find by debugging:
> >> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >> >> state default
> >> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >> >> 0xf1
> >> >> [    0.512077] nand: Macronix MX30LF1G18AC
> >> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >> >> 2048, OOB size: 64
> >> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >> >> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >> >> 0x00
> >> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >> >> 00 00 00] -> 0
> >> >> [    0.602341] macronix_nand_block_protection_support:
> >> >> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >> >> [    0.610548] macronix_nand_block_protection_support: !=
> >> >> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >> >> [    0.624760] Bad block table not found for chip 0
> >> >> [    0.635542] Bad block table not found for chip 0
> >> >> [    0.640270] Scanning device for bad blocks
> >> >>
> >> >> I don't know how to tell if get_features / set_features is really
> >> >> supported...
> >> >
> >> > Looks like your driver does not support exec_op but the core provides a
> >> > get/set_feature implementation.
> >>
> >> According to Florian, low level should be supported on brcmnand
> >> controllers >= 4.0
> >> Also:
> >> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >
> > Just to be sure, you're using a mainline controller driver, not this
> > one?
>
> Yes, this was just to prove that the HW I’m using has get/set features support.
> I’m using OpenWrt, so it’s linux v5.15 driver.
>
> >
> >> >
> >> >>
> >> >> > - what is the state of the locking configuration in the chip when
> >> >> > you
> >> >> >   boot?
> >> >>
> >> >> Unlocked, I guess...
> >> >> How can I check that?
> >> >
> >> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> >> > apparently.
> >>
> >> Well, I can read/write the device if block protection isn’t disabled,
> >> so I guess we can confirm it’s unlocked…
> >>
> >> >
> >> >> > - is there anything that locks the device by calling mxic_nand_lock()
> >> >> > ?
> >> >
> >> > So nobody locks the device I guess? Did you add traces there?
> >>
> >> It doesn’t get to the point where it enables the lock/unlock functions
> >> since it fails when checking if feature is 0x38, so there’s no point
> >> in adding those traces…
> >
> > Right, it returns before setting these I guess.
> >
> >>
> >> >
> >> >> > - finding no bbt is one thing, hanging is another, where is it
> >> >> > hanging
> >> >> >   exactly? (offset in nand/ and line in the code)
> >> >>
> >> >> I've got no idea...
> >> >
> >> > You can use ftrace or just add printks a bit everywhere and try to get
> >> > closer and closer.
> >>
> >> I think that after trying to get the feature it just starts reading
> >> nonsense from the NAND and at some point it hangs due to that garbage…
> >
> > It should refuse to mount the device somehow, but in no case the kernel
> > should hang.
>
> Yes, I think that this is a side effect (maybe a different bug somewhere else).
>
> >
> >> Is it possible that the NAND starts behaving like this after getting
> >> the feature due to some specific config of my device?
> >>
> >> >
> >> > I looked at the patch, I don't see anything strange. Besides, I have a
> >> > close enough datasheet and I don't see what could confuse the device.
> >> >
> >> > Are you really sure this patch is the problem? Is the WP pin wired on
> >> > your design?
> >>
> >> There’s no WP pin in brcmnand controllers < 7.0
> >
> > What about the chip?
>
> Maybe it has a GPIO controlling that, but I don’t have that info…
>
> >
> > Thanks,
> > Miquèl
> >

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-23  0:59               ` William Zhang
@ 2023-05-24  5:30                 ` liao jaime
  -1 siblings, 0 replies; 50+ messages in thread
From: liao jaime @ 2023-05-24  5:30 UTC (permalink / raw)
  To: William Zhang
  Cc: Álvaro Fernández Rojas, Florian Fainelli,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel

Hi William


>
> Hi Alvaro,
>
> On 05/17/2023 08:20 AM, Álvaro Fernández Rojas wrote:
> > Hi William,
> >
> > El mié, 17 may 2023 a las 7:30, William Zhang
> > (<william.zhang@broadcom.com>) escribió:
> >>
> >>
> >>
> >> On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
> >>> Sure,
> >>>
> >>> Here you go:
> >>> [    0.000000] Linux version 5.15.111 (noltari@atlantis)
> >>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
> >>> 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
> >>> [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
> >>> [    0.000000] MIPS: machine is Sercomm H500-s vfes
> >>> [    0.000000] 128MB of RAM installed
> >>> [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
> >>> [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
> >>> [    0.000000] Initrd not found or empty - disabling initrd
> >>> [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
> >>> [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> >>> [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> >>> linesize 16 bytes
> >>> [    0.000000] Zone ranges:
> >>> [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
> >>> [    0.000000] Movable zone start for each node
> >>> [    0.000000] Early memory node ranges
> >>> [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
> >>> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> >>> [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
> >>> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
> >>> [    0.000000] Kernel command line: earlycon
> >>> [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
> >>> bytes, linear)
> >>> [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
> >>> bytes, linear)
> >>> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> >>> [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
> >>> 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
> >>> cma-reserved)
> >>> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> >>> [    0.000000] rcu: Hierarchical RCU implementation.
> >>> [    0.000000]  Tracing variant of Tasks RCU enabled.
> >>> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
> >>> is 10 jiffies.
> >>> [    0.000000] NR_IRQS: 256
> >>> [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
> >>> [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
> >>> [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
> >>> [    0.000000] brcm,bcm63268 detected @ 400 MHz
> >>> [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
> >>> 0xffffffff, max_idle_ns: 9556302233 ns
> >>> [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
> >>> every 10737418237ns
> >>> [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> >>> [    0.074683] pid_max: default: 32768 minimum: 301
> >>> [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> >>> bytes, linear)
> >>> [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> >>> 4096 bytes, linear)
> >>> [    0.106094] rcu: Hierarchical SRCU implementation.
> >>> [    0.112665] smp: Bringing up secondary CPUs ...
> >>> [    0.119348] SMP: Booting CPU1...
> >>> [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> >>> [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> >>> linesize 16 bytes
> >>> [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> >>> [    0.182819] Synchronize counters for CPU 1:
> >>> [    0.203500] SMP: CPU1 is running
> >>> [    0.203512] done.
> >>> [    0.213401] smp: Brought up 1 node, 2 CPUs
> >>> [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> >>> 0xffffffff, max_idle_ns: 19112604462750000 ns
> >>> [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> >>> [    0.246439] pinctrl core: initialized pinctrl subsystem
> >>> [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> >>> [    0.312700] clocksource: Switched to clocksource MIPS
> >>> [    0.321061] NET: Registered PF_INET protocol family
> >>> [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> >>> bytes, linear)
> >>> [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> >>> (order: 0, 6144 bytes, linear)
> >>> [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> >>> 262144 bytes, linear)
> >>> [    0.352721] TCP established hash table entries: 1024 (order: 0,
> >>> 4096 bytes, linear)
> >>> [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> >>> [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> >>> [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> >>> [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> >>> [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> >>> [    0.395748] PCI: CLS 0 bytes, default 16
> >>> [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> >>> [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> >>> [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> >>> (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> >>> [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> >>> registered 14 power domains
> >>> [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> >>> base_baud = 1562500) is a bcm63xx_uart
> >>> [    0.479996] printk: console [ttyS0] enabled
> >>> [    0.479996] printk: console [ttyS0] enabled
> >>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> >>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> >>> [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> >>> [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> >>> state default
> >>> [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> >>> [    0.640506] nand: Macronix MX30LF1G18AC
> >>> [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>> 2048, OOB size: 64
> >>> [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>> [    0.703373] Bad block table not found for chip 0
> >>> [    0.732040] Bad block table not found for chip 0
> >>> [    0.736842] Scanning device for bad blocks
> >>> [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> >>> address 00000014, epc == 8009b300, ra == 806cc650
> >>> [    0.843628] Oops[#1]:
> >>> [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> >>> [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> >>> [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> >>> [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> >>> [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> >>>
> >>> Please, tell me if you want me to add any debugging to the log.
> >>>
> >>> Best regards,
> >>> Álvaro.
> >>>
> >>> El mar, 16 may 2023 a las 20:58, Florian Fainelli
> >>> (<f.fainelli@gmail.com>) escribió:
> >>>>
> >>>> +William,
> >>>>
> >>>> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> >>>>> Hi Jaime,
> >>>>>
> >>>>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> >>>>> forcing it to check block protection (it's not supported on that
> >>>>> device), the NAND controller stops reading/writing anything.
> >>>>>
> >>>>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> >>>>> aren't supported on BCM63268 NAND controllers and this is causing the
> >>>>> issue?
> >>>>
> >>>> Yes, this looks like what we have seen as well even with newer NAND
> >>>> controllers actually. Would it be possible to obtain a full log from
> >>>> either of you?
> >>>>
> >>>> William, is this something you have seen before as well?
> >>>>
> >> No, I haven't seen such an issue before. It is possible I didn't have these
> >> Macronix parts on my board. If I can find a board with a Macronix part,
> >> I will try it. But we don't use this feature and don't connect the PT
> >> pin on our reference board, which means the PT feature is disabled in the
> >> NAND part.
> >>
> >> Alvaro, Do you know if your 63268 board has PT pin connected or not?
> >
> > No, I don't know if PT pin is connected.
> > I would have to open the case and check, but judging from the
> > following image I would say it's not connected:
> > https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg
> >
> >> Can you check if the Macronix lock and unlock functions are being called
> >> before the hang? Or is it just the get/set feature function getting called
> >> to determine whether PT is supported? The get/set feature functions should
> >> work, as they are used by other paths.
> >
> > No, the macronix's lock/unlock functions aren't called before the hang.
> > In fact, if I comment out the nand_get_features call and replace it
> > with ret = 1 it doesn't hang:
> > https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230
> >
> I see. In fact, I saw your earlier debug log with the ll_op cmd dump for the
> nand_get_features function, and the commands went through successfully. It is
> really strange that this function call causes problems for subsequent NAND reads.
>
> Can you keep this code commented out and then, after the board boots up,
> manually write the ll_op cmd sequence to the NAND controller registers,
> following the code of brcmnand_low_level_op, assuming you have a way to
> write to controller registers from a shell? If not, you might have to hack
> the brcmnand or base NAND driver code and insert the following call under
> some special condition at run time:
> ret = nand_get_features(chip, ONFI_FEATURE_ADDR_MXIC_PROTECTION,
>                                 feature);
> Then check if the NAND read function still works. At least we can confirm
> whether the feature query function actually causes the problem. You can try
> different feature codes and see if it makes any difference.
>
> Question for Jaime: if the PT pin is not connected, would the PT feature
> check cause any issue afterwards? Or should the NAND chip just report
> the blocks as not protected?

PT is kept low internally if it is not connected, and IO2 (PT#) always reads "0"
when the block-protection status is read.
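
As a rough illustration of the status check discussed in this thread, the
sketch below classifies the first byte returned by a GET_FEATURES read of
the protection feature. The values 0xA0 (feature address), 0x38 (all blocks
locked) and 0x00 (all unlocked) are taken from the logs and replies quoted
in this thread; the macro, enum and function names are hypothetical and are
not the kernel's API.

```c
#include <stdint.h>

/* Values quoted in this thread: feature address a0, lock-all value 0x38,
 * and 0 meaning "all unlocked". The names are made up for this sketch. */
#define FEATURE_ADDR_PROTECTION 0xA0u
#define PROTECTION_ALL_LOCK     0x38u
#define PROTECTION_ALL_UNLOCK   0x00u

/* Hypothetical classification of the protection status byte. */
enum bp_state {
	BP_ALL_UNLOCKED,
	BP_ALL_LOCKED,
	BP_OTHER,
};

/* Classify the four sub-feature bytes returned by GET_FEATURES.
 * Only the first byte is examined, as in the check discussed above. */
static enum bp_state classify_protection(const uint8_t feature[4])
{
	if (feature[0] == PROTECTION_ALL_LOCK)
		return BP_ALL_LOCKED;
	if (feature[0] == PROTECTION_ALL_UNLOCK)
		return BP_ALL_UNLOCKED;
	return BP_OTHER;
}

/* Human-readable name for a state, e.g. for a debug print. */
static const char *bp_state_name(enum bp_state state)
{
	switch (state) {
	case BP_ALL_UNLOCKED:
		return "all unlocked";
	case BP_ALL_LOCKED:
		return "all locked";
	default:
		return "partially locked or unknown";
	}
}
```

With the dump quoted in this thread (subfeature_param=[00 00 00 00]), this
sketch would report "all unlocked".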

Thanks
Jaime
>
> >>
> >>
> >>>>>
> >>>>> Best regards,
> >>>>> Álvaro.
> >>>>>
> >>>>> El mié, 26 abr 2023 a las 9:24, liao jaime (<jaimeliao.tw@gmail.com>) escribió:
> >>>>>>
> >>>>>> Hi Álvaro
> >>>>>>
> >>>>>> In nand_scan_tail(), the manufacturer init function is called.
> >>>>>> In macronix_nand_init(), the block protection check is executed after
> >>>>>> flash detection.
> >>>>>> I have validated the MX30LF1G18AC on Linux kernel v5.15.
> >>>>>> I didn't hit the "device hangs" situation on my side.
> >>>>>> BP exists to prevent incorrect operations.
> >>>>>> Please check the controller settings when tracing this issue.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Jaime
> >>>>>>
> >>>>>>>
> >>>>>>> Hello YouChing and Jaime,
> >>>>>>>
> >>>>>>> I still didn't get any feedback from you (or Macronix) on this issue.
> >>>>>>> Did you have time to look into it?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Álvaro.
> >>>>>>>
> >>>>>>> El vie, 24 mar 2023 a las 18:04, Álvaro Fernández Rojas
> >>>>>>> (<noltari@gmail.com>) escribió:
> >>>>>>>>
> >>>>>>>> Hi Miquèl,
> >>>>>>>>
> >>>>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>>>> Hi Álvaro,
> >>>>>>>>>
> >>>>>>>>> + YouChing and Jaime from Macronix
> >>>>>>>>> TLDR for them: there is a misbehavior since Mason added block
> >>>>>>>>> protection support. Just checking if the blocks are protected seems to
> >>>>>>>>> misconfigure the chip entirely, see below. Any hints?
> >>>>>>>>
> >>>>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
> >>>>>>>> isn’t sent after getting the features?
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >>>>>>>>>
> >>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>
> >>>>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>
> >>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >>>>>>>>>>>
> >>>>>>>>>>>> El vie, 24 mar 2023 a las 11:49, Miquel Raynal
> >>>>>>>>>>>> (<miquel.raynal@bootlin.com>) escribió:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> El vie, 24 mar 2023 a las 10:40, Miquel Raynal
> >>>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) escribió:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
> >>>>>>>>>>>>>>>> This binding allows disabling block protection support for
> >>>>>>>>>>>>>>>> those
> >>>>>>>>>>>>>>>> devices not
> >>>>>>>>>>>>>>>> supporting it.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >>>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>>>     Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> >>>>>>>>>>>>>>>> +++
> >>>>>>>>>>>>>>>>     1 file changed, 3 insertions(+)
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> diff --git
> >>>>>>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
> >>>>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
> >>>>>>>>>>>>>>>>     Required NAND chip properties in children mode:
> >>>>>>>>>>>>>>>>     - randomizer enable: should be "mxic,enable-randomizer-otp"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
> >>>>>>>>>>>>>>>> +- block protection disable: should be
> >>>>>>>>>>>>>>>> "mxic,disable-block-protection"
> >>>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding
> >>>>>>>>>>>>>>> conversions
> >>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>> yaml before adding anything, I don't think this will fly.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
> >>>>>>>>>>>>>>> already have similar properties like "lock" and
> >>>>>>>>>>>>>>> "secure-regions",
> >>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>> sure they will fit but I think it's worth checking.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression
> >>>>>>>>>>>>>> on
> >>>>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> >>>>>>>>>>>>>> MX30LF1G18AC
> >>>>>>>>>>>>>> which hangs the device.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is the log with block protection disabled:
> >>>>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>> state default
> >>>>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>>>> 0xf1
> >>>>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
> >>>>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
> >>>>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
> >>>>>>>>>>>>>> brcmnand.0
> >>>>>>>>>>>>>> ...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is the log with block protection enabled:
> >>>>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>> state default
> >>>>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>>>> 0xf1
> >>>>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
> >>>>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
> >>>>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
> >>>>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
> >>>>>>>>>>>>>> virtual
> >>>>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
> >>>>>>>>>>>>>> *** Device hangs ***
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
> >>>>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
> >>>>>>>>>>>>>> scan
> >>>>>>>>>>>>>> for bad blocks.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Please trace nand_macronix.c and look:
> >>>>>>>>>>>>> - are the get_features and set_features really supported by the
> >>>>>>>>>>>>>      controller driver?
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is what I could find by debugging:
> >>>>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >>>>>>>>>>>> state default
> >>>>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>> 0xf1
> >>>>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >>>>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >>>>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >>>>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >>>>>>>>>>>> 00 00 00] -> 0
> >>>>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
> >>>>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >>>>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
> >>>>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >>>>>>>>>>>> [    0.624760] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.635542] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.640270] Scanning device for bad blocks
> >>>>>>>>>>>>
> >>>>>>>>>>>> I don't know how to tell if get_features / set_features is really
> >>>>>>>>>>>> supported...
> >>>>>>>>>>>
> >>>>>>>>>>> Looks like your driver does not support exec_op but the core provides a
> >>>>>>>>>>> get/set_feature implementation.
> >>>>>>>>>>
> >>>>>>>>>> According to Florian, low level should be supported on brcmnand
> >>>>>>>>>> controllers >= 4.0
> >>>>>>>>>> Also:
> >>>>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >>>>>>>>>
> >>>>>>>>> Just to be sure, you're using a mainline controller driver, not this
> >>>>>>>>> one?
> >>>>>>>>
> >>>>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
> >>>>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> - what is the state of the locking configuration in the chip when
> >>>>>>>>>>>>> you
> >>>>>>>>>>>>>      boot?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Unlocked, I guess...
> >>>>>>>>>>>> How can I check that?
> >>>>>>>>>>>
> >>>>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
> >>>>>>>>>>> apparently.
> >>>>>>>>>>
> >>>>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
> >>>>>>>>>> so I guess we can confirm it’s unlocked…
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
> >>>>>>>>>>>>> ?
> >>>>>>>>>>>
> >>>>>>>>>>> So nobody locks the device I guess? Did you add traces there?
> >>>>>>>>>>
> >>>>>>>>>> It doesn’t get to the point where it enables the lock/unlock functions
> >>>>>>>>>> since it fails when checking if feature is 0x38, so there’s no point
> >>>>>>>>>> in adding those traces…
> >>>>>>>>>
> >>>>>>>>> Right, it returns before setting these I guess.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
> >>>>>>>>>>>>> hanging
> >>>>>>>>>>>>>      exactly? (offset in nand/ and line in the code)
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've got no idea...
> >>>>>>>>>>>
> >>>>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
> >>>>>>>>>>> closer and closer.
> >>>>>>>>>>
> >>>>>>>>>> I think that after trying to get the feature it just starts reading
> >>>>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
> >>>>>>>>>
> >>>>>>>>> It should refuse to mount the device somehow, but in no case the kernel
> >>>>>>>>> should hang.
> >>>>>>>>
> >>>>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> Is it possible that the NAND starts behaving like this after getting
> >>>>>>>>>> the feature due to some specific config of my device?
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
> >>>>>>>>>>> close enough datasheet and I don't see what could confuse the device.
> >>>>>>>>>>>
> >>>>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
> >>>>>>>>>>> your design?
> >>>>>>>>>>
> >>>>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
> >>>>>>>>>
> >>>>>>>>> What about the chip?
> >>>>>>>>
> >>>>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Miquèl
> >>>>>>>>>
> >>>>
> >>>> --
> >>>> Florian
> >>>>
> >
> > --
> > Álvaro
> >

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
@ 2023-05-24  5:30                 ` liao jaime
  0 siblings, 0 replies; 50+ messages in thread
From: liao jaime @ 2023-05-24  5:30 UTC (permalink / raw)
  To: William Zhang
  Cc: Álvaro Fernández Rojas, Florian Fainelli,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel

Hi William


>
> Hi Alvaro,
>
> On 05/17/2023 08:20 AM, Álvaro Fernández Rojas wrote:
> > Hi William,
> >
> > El mié, 17 may 2023 a las 7:30, William Zhang
> > (<william.zhang@broadcom.com>) escribió:
> >>
> >>
> >>
> >> On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
> >>> Sure,
> >>>
> >>> Here you go:
> >>> [    0.000000] Linux version 5.15.111 (noltari@atlantis)
> >>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
> >>> 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
> >>> [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
> >>> [    0.000000] MIPS: machine is Sercomm H500-s vfes
> >>> [    0.000000] 128MB of RAM installed
> >>> [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
> >>> [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
> >>> [    0.000000] Initrd not found or empty - disabling initrd
> >>> [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
> >>> [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> >>> [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> >>> linesize 16 bytes
> >>> [    0.000000] Zone ranges:
> >>> [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
> >>> [    0.000000] Movable zone start for each node
> >>> [    0.000000] Early memory node ranges
> >>> [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
> >>> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> >>> [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
> >>> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
> >>> [    0.000000] Kernel command line: earlycon
> >>> [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
> >>> bytes, linear)
> >>> [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
> >>> bytes, linear)
> >>> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> >>> [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
> >>> 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
> >>> cma-reserved)
> >>> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> >>> [    0.000000] rcu: Hierarchical RCU implementation.
> >>> [    0.000000]  Tracing variant of Tasks RCU enabled.
> >>> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
> >>> is 10 jiffies.
> >>> [    0.000000] NR_IRQS: 256
> >>> [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
> >>> [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
> >>> [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
> >>> [    0.000000] brcm,bcm63268 detected @ 400 MHz
> >>> [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
> >>> 0xffffffff, max_idle_ns: 9556302233 ns
> >>> [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
> >>> every 10737418237ns
> >>> [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> >>> [    0.074683] pid_max: default: 32768 minimum: 301
> >>> [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> >>> bytes, linear)
> >>> [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> >>> 4096 bytes, linear)
> >>> [    0.106094] rcu: Hierarchical SRCU implementation.
> >>> [    0.112665] smp: Bringing up secondary CPUs ...
> >>> [    0.119348] SMP: Booting CPU1...
> >>> [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> >>> [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> >>> linesize 16 bytes
> >>> [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> >>> [    0.182819] Synchronize counters for CPU 1:
> >>> [    0.203500] SMP: CPU1 is running
> >>> [    0.203512] done.
> >>> [    0.213401] smp: Brought up 1 node, 2 CPUs
> >>> [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> >>> 0xffffffff, max_idle_ns: 19112604462750000 ns
> >>> [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> >>> [    0.246439] pinctrl core: initialized pinctrl subsystem
> >>> [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> >>> [    0.312700] clocksource: Switched to clocksource MIPS
> >>> [    0.321061] NET: Registered PF_INET protocol family
> >>> [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> >>> bytes, linear)
> >>> [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> >>> (order: 0, 6144 bytes, linear)
> >>> [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> >>> 262144 bytes, linear)
> >>> [    0.352721] TCP established hash table entries: 1024 (order: 0,
> >>> 4096 bytes, linear)
> >>> [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> >>> [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> >>> [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> >>> [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> >>> [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> >>> [    0.395748] PCI: CLS 0 bytes, default 16
> >>> [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> >>> [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> >>> [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> >>> (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> >>> [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> >>> registered 14 power domains
> >>> [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> >>> base_baud = 1562500) is a bcm63xx_uart
> >>> [    0.479996] printk: console [ttyS0] enabled
> >>> [    0.479996] printk: console [ttyS0] enabled
> >>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> >>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> >>> [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> >>> [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> >>> state default
> >>> [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> >>> [    0.640506] nand: Macronix MX30LF1G18AC
> >>> [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>> 2048, OOB size: 64
> >>> [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>> [    0.703373] Bad block table not found for chip 0
> >>> [    0.732040] Bad block table not found for chip 0
> >>> [    0.736842] Scanning device for bad blocks
> >>> [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> >>> address 00000014, epc == 8009b300, ra == 806cc650
> >>> [    0.843628] Oops[#1]:
> >>> [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> >>> [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> >>> [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> >>> [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> >>> [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> >>>
> >>> Please, tell me if you want me to add any debugging to the log.
> >>>
> >>> Best regards,
> >>> Álvaro.
> >>>
> >>> On Tue, 16 May 2023 at 20:58, Florian Fainelli
> >>> (<f.fainelli@gmail.com>) wrote:
> >>>>
> >>>> +William,
> >>>>
> >>>> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> >>>>> Hi Jaime,
> >>>>>
> >>>>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> >>>>> forcing it to check block protection (it's not supported on that
> >>>>> device), the NAND controller stops reading/writing anything.
> >>>>>
> >>>>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> >>>>> aren't supported on BCM63268 NAND controllers and this is causing the
> >>>>> issue?
> >>>>
> >>>> Yes, this looks like what we have seen as well even with newer NAND
> >>>> controllers actually. Would it be possible to obtain a full log from
> >>>> either of you?
> >>>>
> >>>> William, is this something you have seen before as well?
> >>>>
> >> No, I haven't seen such an issue before. It is possible I didn't have
> >> these Macronix parts on my boards. If I can find a board with a Macronix
> >> part, I will try it. But we don't use this feature and don't connect the PT
> >> pin on our reference board, which means the PT feature is disabled in the
> >> NAND part.
> >>
> >> Alvaro, do you know if your 63268 board has the PT pin connected or not?
> >
> > No, I don't know if PT pin is connected.
> > I would have to open the case and check, but judging from the
> > following image I would say it's not connected:
> > https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg
> >
> >> Can you check whether the Macronix lock and unlock functions are being
> >> called before the hang? Or is it just the get/set feature function being
> >> called to determine whether PT is supported? The get/set feature functions
> >> should work, as they are used by other paths.
> >
> > No, the macronix's lock/unlock functions aren't called before the hang.
> > In fact, if I comment out the nand_get_features call and replace it
> > with ret = 1 it doesn't hang:
> > https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230
> >
> I see. In fact, I saw your earlier debug log with the ll_op cmd dump for the
> nand_get_features function, and the commands went through successfully. It is
> really strange that this function call causes problems for subsequent NAND reads.
>
> Can you keep this code commented out and then, after the board boots up,
> manually write the ll_op cmd sequence to the NAND controller registers,
> following the code of brcmnand_low_level_op? This assumes you have a way to
> write to the controller registers from a shell. If not, you might have to
> hack the brcmnand or base NAND driver code and insert the following call
> under some special condition at run time:
> ret = nand_get_features(chip, ONFI_FEATURE_ADDR_MXIC_PROTECTION,
>                                 feature);
> Then check whether the NAND read function still works. At least we can confirm
> whether the feature query function actually causes the problem. You can also
> try different feature codes and see if it makes any difference.
>
> Question to Jaime: if the PT pin is not connected, would the PT feature
> check cause any issue afterwards? Or should the NAND chip just report the
> blocks as not protected?

PT will be kept low internally if not connected, and IO2 (PT#) will always read "0"
when reading the block-protection status.

Thanks
Jaime
>
> >>
> >>
> >>>>>
> >>>>> Best regards,
> >>>>> Álvaro.
> >>>>>
> >>>>> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
> >>>>>>
> >>>>>> Hi Álvaro
> >>>>>>
> >>>>>> In nand_scan_tail(), each manufacturer init function call is executed.
> >>>>>> In macronix_nand_init(), the block protection check is executed after flash detection.
> >>>>>> I have validated the MX30LF1G18AC on Linux kernel v5.15.
> >>>>>> I didn't hit the "device hangs" situation on my side.
> >>>>>> BP is there to prevent incorrect operations.
> >>>>>> Please check the controller settings to trace this issue.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Jaime
> >>>>>>
> >>>>>>>
> >>>>>>> Hello YouChing and Jaime,
> >>>>>>>
> >>>>>>> I still didn't get any feedback from you (or Macronix) on this issue.
> >>>>>>> Did you have time to look into it?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Álvaro.
> >>>>>>>
> >>>>>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
> >>>>>>> (<noltari@gmail.com>) wrote:
> >>>>>>>>
> >>>>>>>> Hi Miquèl,
> >>>>>>>>
> >>>>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>>>> Hi Álvaro,
> >>>>>>>>>
> >>>>>>>>> + YouChing and Jaime from Macronix
> >>>>>>>>> TLDR for them: there is a misbehavior since Mason added block
> >>>>>>>>> protection support. Just checking if the blocks are protected seems to
> >>>>>>>>> misconfigure the chip entirely, see below. Any hints?
> >>>>>>>>
> >>>>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
> >>>>>>>> isn’t sent after getting the features?
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >>>>>>>>>
> >>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>
> >>>>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>
> >>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >>>>>>>>>>>
> >>>>>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >>>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
> >>>>>>>>>>>>>>>> This binding allows disabling block protection support for those
> >>>>>>>>>>>>>>>> devices not supporting it.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >>>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>>>     Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> >>>>>>>>>>>>>>>>     1 file changed, 3 insertions(+)
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
> >>>>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
> >>>>>>>>>>>>>>>>     Required NAND chip properties in children mode:
> >>>>>>>>>>>>>>>>     - randomizer enable: should be "mxic,enable-randomizer-otp"
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
> >>>>>>>>>>>>>>>> +- block protection disable: should be "mxic,disable-block-protection"
> >>>>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding conversions
> >>>>>>>>>>>>>>> to yaml before adding anything, I don't think this will fly.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
> >>>>>>>>>>>>>>> already have similar properties like "lock" and "secure-regions",
> >>>>>>>>>>>>>>> not sure they will fit but I think it's worth checking.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression on
> >>>>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> >>>>>>>>>>>>>> which hangs the device.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is the log with block protection disabled:
> >>>>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>> state default
> >>>>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>>>> 0xf1
> >>>>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
> >>>>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
> >>>>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
> >>>>>>>>>>>>>> brcmnand.0
> >>>>>>>>>>>>>> ...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This is the log with block protection enabled:
> >>>>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>> state default
> >>>>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>>>> 0xf1
> >>>>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
> >>>>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
> >>>>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
> >>>>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
> >>>>>>>>>>>>>> virtual
> >>>>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
> >>>>>>>>>>>>>> *** Device hangs ***
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
> >>>>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
> >>>>>>>>>>>>>> scan
> >>>>>>>>>>>>>> for bad blocks.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Please trace nand_macronix.c and look:
> >>>>>>>>>>>>> - are the get_features and set_features really supported by the
> >>>>>>>>>>>>>      controller driver?
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is what I could find by debugging:
> >>>>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >>>>>>>>>>>> state default
> >>>>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>> 0xf1
> >>>>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >>>>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >>>>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >>>>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>>>> 0x00
> >>>>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >>>>>>>>>>>> 00 00 00] -> 0
> >>>>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
> >>>>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >>>>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
> >>>>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >>>>>>>>>>>> [    0.624760] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.635542] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.640270] Scanning device for bad blocks
> >>>>>>>>>>>>
> >>>>>>>>>>>> I don't know how to tell if get_features / set_features is really
> >>>>>>>>>>>> supported...
> >>>>>>>>>>>
> >>>>>>>>>>> Looks like your driver does not support exec_op but the core provides a
> >>>>>>>>>>> get/set_feature implementation.
> >>>>>>>>>>
> >>>>>>>>>> According to Florian, low level should be supported on brcmnand
> >>>>>>>>>> controllers >= 4.0
> >>>>>>>>>> Also:
> >>>>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >>>>>>>>>
> >>>>>>>>> Just to be sure, you're using a mainline controller driver, not this
> >>>>>>>>> one?
> >>>>>>>>
> >>>>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
> >>>>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> - what is the state of the locking configuration in the chip when
> >>>>>>>>>>>>> you
> >>>>>>>>>>>>>      boot?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Unlocked, I guess...
> >>>>>>>>>>>> How can I check that?
> >>>>>>>>>>>
> >>>>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
> >>>>>>>>>>> apparently.
> >>>>>>>>>>
> >>>>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
> >>>>>>>>>> so I guess we can confirm it’s unlocked…
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
> >>>>>>>>>>>>> ?
> >>>>>>>>>>>
> >>>>>>>>>>> So nobody locks the device I guess? Did you add traces there?
> >>>>>>>>>>
> >>>>>>>>>> It doesn’t get to the point where it enables the lock/unlock functions,
> >>>>>>>>>> since the check for whether the feature is 0x38 fails, so there’s no
> >>>>>>>>>> point in adding those traces…
> >>>>>>>>>
> >>>>>>>>> Right, it returns before setting these I guess.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
> >>>>>>>>>>>>> hanging
> >>>>>>>>>>>>>      exactly? (offset in nand/ and line in the code)
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've got no idea...
> >>>>>>>>>>>
> >>>>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
> >>>>>>>>>>> closer and closer.
> >>>>>>>>>>
> >>>>>>>>>> I think that after trying to get the feature it just starts reading
> >>>>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
> >>>>>>>>>
> >>>>>>>>> It should refuse to mount the device somehow, but in no case the kernel
> >>>>>>>>> should hang.
> >>>>>>>>
> >>>>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> Is it possible that the NAND starts behaving like this after getting
> >>>>>>>>>> the feature due to some specific config of my device?
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
> >>>>>>>>>>> close enough datasheet and I don't see what could confuse the device.
> >>>>>>>>>>>
> >>>>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
> >>>>>>>>>>> your design?
> >>>>>>>>>>
> >>>>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
> >>>>>>>>>
> >>>>>>>>> What about the chip?
> >>>>>>>>
> >>>>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Miquèl
> >>>>>>>>>
> >>>>
> >>>> --
> >>>> Florian
> >>>>
> >
> > --
> > Álvaro
> >



* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-17 15:20             ` Álvaro Fernández Rojas
@ 2023-05-23  0:59               ` William Zhang
  -1 siblings, 0 replies; 50+ messages in thread
From: William Zhang @ 2023-05-23  0:59 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: Florian Fainelli, liao jaime, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, robh+dt, krzysztof.kozlowski+dt, linux-mtd,
	devicetree, linux-kernel



Hi Alvaro,

On 05/17/2023 08:20 AM, Álvaro Fernández Rojas wrote:
> Hi William,
> 
> On Wed, 17 May 2023 at 7:30, William Zhang
> (<william.zhang@broadcom.com>) wrote:
>>
>>
>>
>> On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
>>> Sure,
>>>
>>> Here you go:
>>> [    0.000000] Linux version 5.15.111 (noltari@atlantis)
>>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
>>> 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
>>> [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
>>> [    0.000000] MIPS: machine is Sercomm H500-s vfes
>>> [    0.000000] 128MB of RAM installed
>>> [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
>>> [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
>>> [    0.000000] Initrd not found or empty - disabling initrd
>>> [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
>>> [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
>>> [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
>>> linesize 16 bytes
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
>>> [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
>>> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
>>> [    0.000000] Kernel command line: earlycon
>>> [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
>>> bytes, linear)
>>> [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
>>> bytes, linear)
>>> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
>>> [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
>>> 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
>>> cma-reserved)
>>> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
>>> [    0.000000] rcu: Hierarchical RCU implementation.
>>> [    0.000000]  Tracing variant of Tasks RCU enabled.
>>> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
>>> is 10 jiffies.
>>> [    0.000000] NR_IRQS: 256
>>> [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
>>> [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
>>> [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
>>> [    0.000000] brcm,bcm63268 detected @ 400 MHz
>>> [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
>>> 0xffffffff, max_idle_ns: 9556302233 ns
>>> [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
>>> every 10737418237ns
>>> [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
>>> [    0.074683] pid_max: default: 32768 minimum: 301
>>> [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
>>> bytes, linear)
>>> [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
>>> 4096 bytes, linear)
>>> [    0.106094] rcu: Hierarchical SRCU implementation.
>>> [    0.112665] smp: Bringing up secondary CPUs ...
>>> [    0.119348] SMP: Booting CPU1...
>>> [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
>>> [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
>>> linesize 16 bytes
>>> [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
>>> [    0.182819] Synchronize counters for CPU 1:
>>> [    0.203500] SMP: CPU1 is running
>>> [    0.203512] done.
>>> [    0.213401] smp: Brought up 1 node, 2 CPUs
>>> [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
>>> 0xffffffff, max_idle_ns: 19112604462750000 ns
>>> [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
>>> [    0.246439] pinctrl core: initialized pinctrl subsystem
>>> [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
>>> [    0.312700] clocksource: Switched to clocksource MIPS
>>> [    0.321061] NET: Registered PF_INET protocol family
>>> [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
>>> bytes, linear)
>>> [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
>>> (order: 0, 6144 bytes, linear)
>>> [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
>>> 262144 bytes, linear)
>>> [    0.352721] TCP established hash table entries: 1024 (order: 0,
>>> 4096 bytes, linear)
>>> [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
>>> [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
>>> [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
>>> [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
>>> [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
>>> [    0.395748] PCI: CLS 0 bytes, default 16
>>> [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
>>> [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
>>> [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
>>> (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
>>> [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
>>> registered 14 power domains
>>> [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
>>> base_baud = 1562500) is a bcm63xx_uart
>>> [    0.479996] printk: console [ttyS0] enabled
>>> [    0.479996] printk: console [ttyS0] enabled
>>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
>>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
>>> [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
>>> [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
>>> state default
>>> [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
>>> [    0.640506] nand: Macronix MX30LF1G18AC
>>> [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>> 2048, OOB size: 64
>>> [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>> [    0.703373] Bad block table not found for chip 0
>>> [    0.732040] Bad block table not found for chip 0
>>> [    0.736842] Scanning device for bad blocks
>>> [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
>>> address 00000014, epc == 8009b300, ra == 806cc650
>>> [    0.843628] Oops[#1]:
>>> [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
>>> [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
>>> [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
>>> [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
>>> [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
>>>
>>> Please, tell me if you want me to add any debugging to the log.
>>>
>>> Best regards,
>>> Álvaro.
>>>
>>> On Tue, 16 May 2023 at 20:58, Florian Fainelli
>>> (<f.fainelli@gmail.com>) wrote:
>>>>
>>>> +William,
>>>>
>>>> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
>>>>> Hi Jaime,
>>>>>
>>>>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
>>>>> forcing it to check block protection (it's not supported on that
>>>>> device), the NAND controller stops reading/writing anything.
>>>>>
>>>>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
>>>>> aren't supported on BCM63268 NAND controllers and this is causing the
>>>>> issue?
>>>>
>>>> Yes, this looks like what we have seen as well even with newer NAND
>>>> controllers actually. Would it be possible to obtain a full log from
>>>> either of you?
>>>>
>>>> William, is this something you have seen before as well?
>>>>
>> No, I haven't seen such an issue before. It is possible I didn't have
>> these Macronix parts on my boards. If I can find a board with a Macronix
>> part, I will try it. But we don't use this feature and don't connect the PT
>> pin on our reference board, which means the PT feature is disabled in the
>> NAND part.
>>
>> Alvaro, do you know if your 63268 board has the PT pin connected or not?
> 
> No, I don't know if PT pin is connected.
> I would have to open the case and check, but judging from the
> following image I would say it's not connected:
> https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg
> 
>> Can you check whether the Macronix lock and unlock functions are being
>> called before the hang? Or is it just the get/set feature function being
>> called to determine whether PT is supported? The get/set feature functions
>> should work, as they are used by other paths.
> 
> No, the macronix's lock/unlock functions aren't called before the hang.
> In fact, if I comment out the nand_get_features call and replace it
> with ret = 1 it doesn't hang:
> https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230
> 
I see. In fact, I saw your earlier debug log with the ll_op cmd dump for the
nand_get_features function, and the commands went through successfully. It is
really strange that this function call causes problems for subsequent NAND reads.
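
As an aside, the ll_op words in that debug log (0xa00ee, 0x600a0, 0x10000,
0x80010000) can be decoded with a few bit tests. The sketch below is plain
user-space C, not driver code. The LLOP_* bit positions are an assumption
taken from the mainline brcmnand driver and should be checked against the
tree in use; llop_classify(), llop_data(), llop_is_last() and
llop_check_logged_sequence() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of one brcmnand low-level-op word; verify against your tree. */
#define LLOP_RE          (UINT32_C(1) << 16) /* read strobe */
#define LLOP_WE          (UINT32_C(1) << 17) /* write strobe */
#define LLOP_ALE         (UINT32_C(1) << 18) /* address latch enable */
#define LLOP_CLE         (UINT32_C(1) << 19) /* command latch enable */
#define LLOP_RETURN_IDLE (UINT32_C(1) << 31) /* last op of the sequence */
#define LLOP_DATA_MASK   UINT32_C(0xffff)

enum llop_kind {
	LLOP_KIND_CMD,
	LLOP_KIND_ADDR,
	LLOP_KIND_WR,
	LLOP_KIND_RD,
	LLOP_KIND_UNKNOWN,
};

/* Hypothetical helper: classify a logged word by its strobe/latch bits. */
static enum llop_kind llop_classify(uint32_t word)
{
	if (word & LLOP_WE) {
		if (word & LLOP_CLE)
			return LLOP_KIND_CMD;
		if (word & LLOP_ALE)
			return LLOP_KIND_ADDR;
		return LLOP_KIND_WR;
	}
	if (word & LLOP_RE)
		return LLOP_KIND_RD;
	return LLOP_KIND_UNKNOWN;
}

/* Hypothetical helper: the byte placed on the bus (low 16 bits of the word). */
static uint16_t llop_data(uint32_t word)
{
	return (uint16_t)(word & LLOP_DATA_MASK);
}

/* Hypothetical helper: true when the controller returns to idle afterwards. */
static bool llop_is_last(uint32_t word)
{
	return (word & LLOP_RETURN_IDLE) != 0;
}

/* Replays the words from the debug log quoted in this thread. */
static void llop_check_logged_sequence(void)
{
	/* 0xa00ee: command cycle carrying 0xee (GET FEATURES) */
	assert(llop_classify(0xa00ee) == LLOP_KIND_CMD);
	assert(llop_data(0xa00ee) == 0xee);
	/* 0x600a0: address cycle carrying feature address 0xa0 */
	assert(llop_classify(0x600a0) == LLOP_KIND_ADDR);
	assert(llop_data(0x600a0) == 0xa0);
	/* 0x10000: read cycle, more cycles follow */
	assert(llop_classify(0x10000) == LLOP_KIND_RD);
	assert(!llop_is_last(0x10000));
	/* 0x80010000: final read cycle, controller returns to idle */
	assert(llop_classify(0x80010000) == LLOP_KIND_RD);
	assert(llop_is_last(0x80010000));
}
```

Under those assumed bit positions, the logged sequence reads as one command
cycle, one address cycle and four read cycles, which matches the four
parameter bytes printed by nand_get_features in the log.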

Can you keep this code commented out and then, after the board boots up,
manually write the ll_op cmd sequence to the NAND controller registers,
following the code of brcmnand_low_level_op? This assumes you have a way to
write to the controller registers from a shell. If not, you might have to
hack the brcmnand or base NAND driver code and insert the following call
under some special condition at run time:
ret = nand_get_features(chip, ONFI_FEATURE_ADDR_MXIC_PROTECTION,
				feature);
Then check whether the NAND read function still works. At least we can confirm
whether the feature query function actually causes the problem. You can also
try different feature codes and see if it makes any difference.
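
For reference while running that experiment, the decision the probe code
takes on the result of this call can be modelled in user space. This is only
a sketch under the assumptions stated in this thread: the lock/unlock hooks
are installed only when the query succeeds and the first parameter byte reads
0x38 (MXIC_BLOCK_PROTECTION_ALL_LOCK), and 0xa0 is the feature address shown
in the debug log. The get_features_fn type, mxic_block_protection_usable()
and the stub_* functions are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SUBFEATURE_PARAM_LEN           4    /* four parameter bytes, as in the log */
#define FEATURE_ADDR_MXIC_PROTECTION   0xa0 /* "addr=a0" in the log */
#define MXIC_BLOCK_PROTECTION_ALL_LOCK 0x38 /* value checked by the probe code */

/* Hypothetical stand-in for nand_get_features(): 0 on success, non-zero on error. */
typedef int (*get_features_fn)(uint8_t addr, uint8_t *param);

/* Returns true when the lock/unlock hooks would be installed. */
static bool mxic_block_protection_usable(get_features_fn get_features)
{
	uint8_t feature[SUBFEATURE_PARAM_LEN] = { 0 };

	assert(get_features != NULL);
	if (get_features(FEATURE_ADDR_MXIC_PROTECTION, feature) != 0)
		return false; /* query failed: leave block protection alone */
	return feature[0] == MXIC_BLOCK_PROTECTION_ALL_LOCK;
}

/* Chip answers with all zeroes, as in the log from this thread. */
static int stub_all_unlocked(uint8_t addr, uint8_t *param)
{
	(void)addr;
	memset(param, 0, SUBFEATURE_PARAM_LEN);
	return 0;
}

/* Chip reports every block locked. */
static int stub_all_locked(uint8_t addr, uint8_t *param)
{
	(void)addr;
	memset(param, 0, SUBFEATURE_PARAM_LEN);
	param[0] = MXIC_BLOCK_PROTECTION_ALL_LOCK;
	return 0;
}

/* Models the "ret = 1" experiment: the query is skipped and reports failure. */
static int stub_query_skipped(uint8_t addr, uint8_t *param)
{
	(void)addr;
	(void)param;
	return 1;
}
```

With these stubs, only stub_all_locked makes the model report block
protection as usable. The all-zero answer seen on the H500-s and the
"ret = 1" shortcut both end with the hooks left uninstalled, so the
difference between hanging and not hanging would have to come from issuing
the query itself rather than from the decision taken afterwards.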

Question to Jaime: if the PT pin is not connected, would the PT feature
check cause any issue afterwards? Or should the NAND chip just report the
blocks as not protected?

>>
>>
>>>>>
>>>>> Best regards,
>>>>> Álvaro.
>>>>>
>>>>> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
>>>>>>
>>>>>> Hi Álvaro
>>>>>>
>>>>>> In nand_scan_tail(), each manufacturer init function call is executed.
>>>>>> In macronix_nand_init(), the block protection check is executed after flash detection.
>>>>>> I have validated the MX30LF1G18AC on Linux kernel v5.15.
>>>>>> I didn't hit the "device hangs" situation on my side.
>>>>>> BP is there to prevent incorrect operations.
>>>>>> Please check the controller settings to trace this issue.
>>>>>>
>>>>>> Thanks
>>>>>> Jaime
>>>>>>
>>>>>>>
>>>>>>> Hello YouChing and Jaime,
>>>>>>>
>>>>>>> I still didn't get any feedback from you (or Macronix) on this issue.
>>>>>>> Did you have time to look into it?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Álvaro.
>>>>>>>
>>>>>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
>>>>>>> (<noltari@gmail.com>) wrote:
>>>>>>>>
>>>>>>>> Hi Miquèl,
>>>>>>>>
>>>>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>>>> Hi Álvaro,
>>>>>>>>>
>>>>>>>>> + YouChing and Jaime from Macronix
>>>>>>>>> TLDR for them: there is a misbehavior since Mason added block
>>>>>>>>> protection support. Just checking if the blocks are protected seems to
>>>>>>>>> misconfigure the chip entirely, see below. Any hints?
>>>>>>>>
>>>>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
>>>>>>>> isn’t sent after getting the features?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
>>>>>>>>>
>>>>>>>>>> Hi Miquèl,
>>>>>>>>>>
>>>>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>
>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>>>
>>>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Miquèl,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
>>>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
>>>>>>>>>>>>>>>> This binding allows disabling block protection support for those
>>>>>>>>>>>>>>>> devices not supporting it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>     Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
>>>>>>>>>>>>>>>>     1 file changed, 3 insertions(+)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
>>>>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
>>>>>>>>>>>>>>>>     Required NAND chip properties in children mode:
>>>>>>>>>>>>>>>>     - randomizer enable: should be "mxic,enable-randomizer-otp"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
>>>>>>>>>>>>>>>> +- block protection disable: should be
>>>>>>>>>>>>>>>> "mxic,disable-block-protection"
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding
>>>>>>>>>>>>>>> conversions
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> yaml before adding anything, I don't think this will fly.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
>>>>>>>>>>>>>>> already have similar properties like "lock" and
>>>>>>>>>>>>>>> "secure-regions",
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>> sure they will fit but I think it's worth checking.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
>>>>>>>>>>>>>> MX30LF1G18AC
>>>>>>>>>>>>>> which hangs the device.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is the log with block protection disabled:
>>>>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> state default
>>>>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>>>> 0xf1
>>>>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
>>>>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
>>>>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
>>>>>>>>>>>>>> brcmnand.0
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is the log with block protection enabled:
>>>>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> state default
>>>>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>>>> 0xf1
>>>>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
>>>>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
>>>>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
>>>>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
>>>>>>>>>>>>>> virtual
>>>>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
>>>>>>>>>>>>>> *** Device hangs ***
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
>>>>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
>>>>>>>>>>>>>> scan
>>>>>>>>>>>>>> for bad blocks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please trace nand_macronix.c and look:
>>>>>>>>>>>>> - are the get_features and set_features really supported by the
>>>>>>>>>>>>>      controller driver?
>>>>>>>>>>>>
>>>>>>>>>>>> This is what I could find by debugging:
>>>>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
>>>>>>>>>>>> state default
>>>>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>> 0xf1
>>>>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
>>>>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
>>>>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
>>>>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
>>>>>>>>>>>> 00 00 00] -> 0
>>>>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
>>>>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
>>>>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
>>>>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
>>>>>>>>>>>> [    0.624760] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.635542] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.640270] Scanning device for bad blocks
>>>>>>>>>>>>
>>>>>>>>>>>> I don't know how to tell if get_features / set_features is really
>>>>>>>>>>>> supported...
>>>>>>>>>>>
>>>>>>>>>>> Looks like your driver does not support exec_op but the core provides a
>>>>>>>>>>> get/set_feature implementation.
>>>>>>>>>>
>>>>>>>>>> According to Florian, low level should be supported on brcmnand
>>>>>>>>>> controllers >= 4.0
>>>>>>>>>> Also:
>>>>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
>>>>>>>>>
>>>>>>>>> Just to be sure, you're using a mainline controller driver, not this
>>>>>>>>> one?
>>>>>>>>
>>>>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
>>>>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> - what is the state of the locking configuration in the chip when
>>>>>>>>>>>>> you
>>>>>>>>>>>>>      boot?
>>>>>>>>>>>>
>>>>>>>>>>>> Unlocked, I guess...
>>>>>>>>>>>> How can I check that?
>>>>>>>>>>>
>>>>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
>>>>>>>>>>> apparently.
>>>>>>>>>>
>>>>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
>>>>>>>>>> so I guess we can confirm it’s unlocked…
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
>>>>>>>>>>>>> ?
>>>>>>>>>>>
>>>>>>>>>>> So nobody locks the device I guess? Did you add traces there?
>>>>>>>>>>
>>>>>>>>>> It doesn’t get to the point that it enabled the lock/unlock functions
>>>>>>>>>> since it fails when checking if feature is 0x38, so there’s no point
>>>>>>>>>> in adding those traces…
>>>>>>>>>
>>>>>>>>> Right, it returns before setting these I guess.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
>>>>>>>>>>>>> hanging
>>>>>>>>>>>>>      exactly? (offset in nand/ and line in the code)
>>>>>>>>>>>>
>>>>>>>>>>>> I've got no idea...
>>>>>>>>>>>
>>>>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
>>>>>>>>>>> closer and closer.
>>>>>>>>>>
>>>>>>>>>> I think that after trying to get the feature it just start reading
>>>>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
>>>>>>>>>
>>>>>>>>> It should refuse to mount the device somehow, but in no case the kernel
>>>>>>>>> should hang.
>>>>>>>>
>>>>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Is it posible that the NAND starts behaving like this after getting
>>>>>>>>>> the feature due to some specific config of my device?
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
>>>>>>>>>>> close enough datasheet and I don't see what could confuse the device.
>>>>>>>>>>>
>>>>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
>>>>>>>>>>> your design?
>>>>>>>>>>
>>>>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
>>>>>>>>>
>>>>>>>>> What about the chip?
>>>>>>>>
>>>>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Miquèl
>>>>>>>>>
>>>>
>>>> --
>>>> Florian
>>>>
> 
> --
> Álvaro
> 

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
@ 2023-05-23  0:59               ` William Zhang
  0 siblings, 0 replies; 50+ messages in thread
From: William Zhang @ 2023-05-23  0:59 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: Florian Fainelli, liao jaime, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, robh+dt, krzysztof.kozlowski+dt, linux-mtd,
	devicetree, linux-kernel

Hi Alvaro,

On 05/17/2023 08:20 AM, Álvaro Fernández Rojas wrote:
> Hi William,
> 
> On Wed, 17 May 2023 at 7:30, William Zhang
> (<william.zhang@broadcom.com>) wrote:
>>
>>
>>
>> On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
>>> Sure,
>>>
>>> Here you go:
>>> [    0.000000] Linux version 5.15.111 (noltari@atlantis)
>>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
>>> 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
>>> [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
>>> [    0.000000] MIPS: machine is Sercomm H500-s vfes
>>> [    0.000000] 128MB of RAM installed
>>> [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
>>> [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
>>> [    0.000000] Initrd not found or empty - disabling initrd
>>> [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
>>> [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
>>> [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
>>> linesize 16 bytes
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
>>> [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
>>> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
>>> [    0.000000] Kernel command line: earlycon
>>> [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
>>> bytes, linear)
>>> [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
>>> bytes, linear)
>>> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
>>> [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
>>> 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
>>> cma-reserved)
>>> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
>>> [    0.000000] rcu: Hierarchical RCU implementation.
>>> [    0.000000]  Tracing variant of Tasks RCU enabled.
>>> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
>>> is 10 jiffies.
>>> [    0.000000] NR_IRQS: 256
>>> [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
>>> [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
>>> [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
>>> [    0.000000] brcm,bcm63268 detected @ 400 MHz
>>> [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
>>> 0xffffffff, max_idle_ns: 9556302233 ns
>>> [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
>>> every 10737418237ns
>>> [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
>>> [    0.074683] pid_max: default: 32768 minimum: 301
>>> [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
>>> bytes, linear)
>>> [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
>>> 4096 bytes, linear)
>>> [    0.106094] rcu: Hierarchical SRCU implementation.
>>> [    0.112665] smp: Bringing up secondary CPUs ...
>>> [    0.119348] SMP: Booting CPU1...
>>> [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
>>> [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
>>> linesize 16 bytes
>>> [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
>>> [    0.182819] Synchronize counters for CPU 1:
>>> [    0.203500] SMP: CPU1 is running
>>> [    0.203512] done.
>>> [    0.213401] smp: Brought up 1 node, 2 CPUs
>>> [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
>>> 0xffffffff, max_idle_ns: 19112604462750000 ns
>>> [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
>>> [    0.246439] pinctrl core: initialized pinctrl subsystem
>>> [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
>>> [    0.312700] clocksource: Switched to clocksource MIPS
>>> [    0.321061] NET: Registered PF_INET protocol family
>>> [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
>>> bytes, linear)
>>> [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
>>> (order: 0, 6144 bytes, linear)
>>> [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
>>> 262144 bytes, linear)
>>> [    0.352721] TCP established hash table entries: 1024 (order: 0,
>>> 4096 bytes, linear)
>>> [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
>>> [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
>>> [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
>>> [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
>>> [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
>>> [    0.395748] PCI: CLS 0 bytes, default 16
>>> [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
>>> [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
>>> [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
>>> (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
>>> [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
>>> registered 14 power domains
>>> [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
>>> base_baud = 1562500) is a bcm63xx_uart
>>> [    0.479996] printk: console [ttyS0] enabled
>>> [    0.479996] printk: console [ttyS0] enabled
>>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
>>> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
>>> [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
>>> [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
>>> state default
>>> [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
>>> [    0.640506] nand: Macronix MX30LF1G18AC
>>> [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>> 2048, OOB size: 64
>>> [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>> [    0.703373] Bad block table not found for chip 0
>>> [    0.732040] Bad block table not found for chip 0
>>> [    0.736842] Scanning device for bad blocks
>>> [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
>>> address 00000014, epc == 8009b300, ra == 806cc650
>>> [    0.843628] Oops[#1]:
>>> [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
>>> [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
>>> [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
>>> [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
>>> [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
>>>
>>> Please, tell me if you want me to add any debugging to the log.
>>>
>>> Best regards,
>>> Álvaro.
>>>
>>> On Tue, 16 May 2023 at 20:58, Florian Fainelli
>>> (<f.fainelli@gmail.com>) wrote:
>>>>
>>>> +William,
>>>>
>>>> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
>>>>> Hi Jaime,
>>>>>
>>>>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
>>>>> forcing it to check block protection (it's not supported on that
>>>>> device), the NAND controller stops reading/writing anything.
>>>>>
>>>>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
>>>>> aren't supported on BCM63268 NAND controllers and this is causing the
>>>>> issue?
>>>>
>>>> Yes, this looks like what we have seen as well even with newer NAND
>>>> controllers actually. Would it be possible to obtain a full log from
>>>> either of you?
>>>>
>>>> William, is this something you have seen before as well?
>>>>
>> No, I haven't seen such issue before.  It is possible I didn't have this
>>    Macronix parts in my board. If I can find a board with Macronix part,
>> I will try it. But we don't use this feature and don't connect the PT
>> pin in our reference board which means the PT feature is disabled in the
>> nand part.
>>
>> Alvaro, Do you know if your 63268 board has PT pin connected or not?
> 
> No, I don't know if PT pin is connected.
> I would have to open the case and check, but judging from the
> following image I would say it's not connected:
> https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg
> 
>> Can you check if the macronix's lock and unlock function being calling
>> before the hang?   Or is it just get/set feature function getting called
>> to determine PT is supported?   The get/set feature function should work
>> as they are used by other pathes
> 
> No, the macronix's lock/unlock functions aren't called before the hang.
> In fact, if I comment out the nand_get_features call and replace it
> with ret = 1 it doesn't hang:
> https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230
> 
I see. In fact, I saw your earlier debug log with the ll_op cmd dump for
the nand_get_features function, and the commands went through
successfully. It is really strange that this function call would cause
problems for subsequent NAND reads.
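
For reference, the ll_op cmd words in that log can be decoded with a
small user-space helper. This is only a sketch: it assumes the bit
layout of the LLOP_* definitions in the mainline brcmnand driver (RE at
bit 16, WE at bit 17, ALE at bit 18, CLE at bit 19, RETURN_IDLE at bit
31, data in the low 16 bits), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror the LLOP_* bits in the mainline brcmnand driver. */
#define LLOP_RE          (UINT32_C(1) << 16)
#define LLOP_WE          (UINT32_C(1) << 17)
#define LLOP_ALE         (UINT32_C(1) << 18)
#define LLOP_CLE         (UINT32_C(1) << 19)
#define LLOP_RETURN_IDLE (UINT32_C(1) << 31)
#define LLOP_DATA_MASK   UINT32_C(0xffff)

/* Hypothetical helper: classify a logged ll_op word by its strobe bits. */
static const char *llop_kind(uint32_t word)
{
	uint32_t strobes = word & (LLOP_RE | LLOP_WE | LLOP_ALE | LLOP_CLE);

	if (strobes == (LLOP_WE | LLOP_CLE))
		return "CMD";	/* command cycle */
	if (strobes == (LLOP_WE | LLOP_ALE))
		return "ADDR";	/* address cycle */
	if (strobes == LLOP_WE)
		return "WR";	/* data write cycle */
	if (strobes == LLOP_RE)
		return "RD";	/* data read cycle */
	return "UNKNOWN";
}

/* Hypothetical helper: data byte carried by the word (low 16 bits). */
static uint32_t llop_data(uint32_t word)
{
	return word & LLOP_DATA_MASK;
}

/* Hypothetical helper: non-zero if this word ends the sequence. */
static int llop_is_last(uint32_t word)
{
	return (word & LLOP_RETURN_IDLE) != 0;
}
```

Under those assumptions, 0xa00ee is a command cycle carrying 0xee
(GET FEATURES), 0x600a0 is an address cycle carrying 0xa0, each 0x10000
is a data read, and 0x80010000 is the final data read with the
return-to-idle flag set.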

Can you keep this code commented out and then, after the board boots up,
manually write this ll_op cmd sequence to the NAND controller registers,
following the code of brcmnand_low_level_op? This assumes you have a way
to write to the controller registers from a shell. If not, you might
have to hack the brcmnand or base NAND driver code and insert the
following call under some special condition at run time:
ret = nand_get_features(chip, ONFI_FEATURE_ADDR_MXIC_PROTECTION,
				feature);
Then check whether the NAND read function still works. At least we can
confirm whether the feature query function actually causes the problem.
You can also try different feature codes and see if it makes any
difference.

Question to Jaime: if the PT pin is not connected, would the PT feature
check cause any issue afterwards? Or should the NAND chip just report
the blocks as not protected?

>>
>>
>>>>>
>>>>> Best regards,
>>>>> Álvaro.
>>>>>
>>>>> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
>>>>>>
>>>>>> Hi Álvaro
>>>>>>
>>>>>> In nand_scan_tail(), each manufacturer init function call will be execute.
>>>>>> In macronix_nand_init(), block protect will be execute after flash detect.
>>>>>> I have validate MX30LF1G18AC in Linux kernel v5.15.
>>>>>> I didn't got situation "device hangs"  on my side.
>>>>>> BP is to prevent incorrect operations.
>>>>>> Please check the controller settings for tracing this issue.
>>>>>>
>>>>>> Thanks
>>>>>> Jaime
>>>>>>
>>>>>>>
>>>>>>> Hello YouChing and Jaime,
>>>>>>>
>>>>>>> I still didn't get any feedback from you (or Macronix) on this issue.
>>>>>>> Did you have time to look into it?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Álvaro.
>>>>>>>
>>>>>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
>>>>>>> (<noltari@gmail.com>) wrote:
>>>>>>>>
>>>>>>>> Hi Miquèl,
>>>>>>>>
>>>>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>>>> Hi Álvaro,
>>>>>>>>>
>>>>>>>>> + YouChing and Jaime from Macronix
>>>>>>>>> TLDR for them: there is a misbehavior since Mason added block
>>>>>>>>> protection support. Just checking if the blocks are protected seems to
>>>>>>>>> misconfigure the chip entirely, see below. Any hints?
>>>>>>>>
>>>>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
>>>>>>>> isn’t sent after getting the features?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
>>>>>>>>>
>>>>>>>>>> Hi Miquèl,
>>>>>>>>>>
>>>>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>
>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>>>
>>>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Miquèl,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
>>>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
>>>>>>>>>>>>>>>> This binding allows disabling block protection support for
>>>>>>>>>>>>>>>> those
>>>>>>>>>>>>>>>> devices not
>>>>>>>>>>>>>>>> supporting it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>     Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
>>>>>>>>>>>>>>>> +++
>>>>>>>>>>>>>>>>     1 file changed, 3 insertions(+)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git
>>>>>>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
>>>>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
>>>>>>>>>>>>>>>>     Required NAND chip properties in children mode:
>>>>>>>>>>>>>>>>     - randomizer enable: should be "mxic,enable-randomizer-otp"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
>>>>>>>>>>>>>>>> +- block protection disable: should be
>>>>>>>>>>>>>>>> "mxic,disable-block-protection"
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding
>>>>>>>>>>>>>>> conversions
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> yaml before adding anything, I don't think this will fly.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
>>>>>>>>>>>>>>> already have similar properties like "lock" and
>>>>>>>>>>>>>>> "secure-regions",
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>> sure they will fit but I think it's worth checking.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
>>>>>>>>>>>>>> MX30LF1G18AC
>>>>>>>>>>>>>> which hangs the device.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is the log with block protection disabled:
>>>>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> state default
>>>>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>>>> 0xf1
>>>>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
>>>>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
>>>>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
>>>>>>>>>>>>>> brcmnand.0
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is the log with block protection enabled:
>>>>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> state default
>>>>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>>>> 0xf1
>>>>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
>>>>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
>>>>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
>>>>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
>>>>>>>>>>>>>> virtual
>>>>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
>>>>>>>>>>>>>> *** Device hangs ***
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
>>>>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
>>>>>>>>>>>>>> scan
>>>>>>>>>>>>>> for bad blocks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please trace nand_macronix.c and look:
>>>>>>>>>>>>> - are the get_features and set_features really supported by the
>>>>>>>>>>>>>      controller driver?
>>>>>>>>>>>>
>>>>>>>>>>>> This is what I could find by debugging:
>>>>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
>>>>>>>>>>>> state default
>>>>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>> 0xf1
>>>>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
>>>>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
>>>>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
>>>>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>>>> 0x00
>>>>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
>>>>>>>>>>>> 00 00 00] -> 0
>>>>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
>>>>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
>>>>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
>>>>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
>>>>>>>>>>>> [    0.624760] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.635542] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.640270] Scanning device for bad blocks
>>>>>>>>>>>>
>>>>>>>>>>>> I don't know how to tell if get_features / set_features is really
>>>>>>>>>>>> supported...
>>>>>>>>>>>
>>>>>>>>>>> Looks like your driver does not support exec_op but the core provides a
>>>>>>>>>>> get/set_feature implementation.
>>>>>>>>>>
>>>>>>>>>> According to Florian, low level should be supported on brcmnand
>>>>>>>>>> controllers >= 4.0
>>>>>>>>>> Also:
>>>>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
>>>>>>>>>
>>>>>>>>> Just to be sure, you're using a mainline controller driver, not this
>>>>>>>>> one?
>>>>>>>>
>>>>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
>>>>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> - what is the state of the locking configuration in the chip when
>>>>>>>>>>>>> you
>>>>>>>>>>>>>      boot?
>>>>>>>>>>>>
>>>>>>>>>>>> Unlocked, I guess...
>>>>>>>>>>>> How can I check that?
>>>>>>>>>>>
>>>>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
>>>>>>>>>>> apparently.
>>>>>>>>>>
>>>>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
>>>>>>>>>> so I guess we can confirm it’s unlocked…
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
>>>>>>>>>>>>> ?
>>>>>>>>>>>
>>>>>>>>>>> So nobody locks the device I guess? Did you add traces there?
>>>>>>>>>>
>>>>>>>>>> It doesn’t get to the point that it enabled the lock/unlock functions
>>>>>>>>>> since it fails when checking if feature is 0x38, so there’s no point
>>>>>>>>>> in adding those traces…
>>>>>>>>>
>>>>>>>>> Right, it returns before setting these I guess.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
>>>>>>>>>>>>> hanging
>>>>>>>>>>>>>      exactly? (offset in nand/ and line in the code)
>>>>>>>>>>>>
>>>>>>>>>>>> I've got no idea...
>>>>>>>>>>>
>>>>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
>>>>>>>>>>> closer and closer.
>>>>>>>>>>
>>>>>>>>>> I think that after trying to get the feature it just start reading
>>>>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
>>>>>>>>>
>>>>>>>>> It should refuse to mount the device somehow, but in no case the kernel
>>>>>>>>> should hang.
>>>>>>>>
>>>>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Is it posible that the NAND starts behaving like this after getting
>>>>>>>>>> the feature due to some specific config of my device?
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
>>>>>>>>>>> close enough datasheet and I don't see what could confuse the device.
>>>>>>>>>>>
>>>>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
>>>>>>>>>>> your design?
>>>>>>>>>>
>>>>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
>>>>>>>>>
>>>>>>>>> What about the chip?
>>>>>>>>
>>>>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Miquèl
>>>>>>>>>
>>>>
>>>> --
>>>> Florian
>>>>
> 
> --
> Álvaro
> 

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-22  8:15               ` Miquel Raynal
@ 2023-05-22  9:21                 ` liao jaime
  -1 siblings, 0 replies; 50+ messages in thread
From: liao jaime @ 2023-05-22  9:21 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Álvaro Fernández Rojas, William Zhang,
	Florian Fainelli, Richard Weinberger, Vignesh Raghavendra,
	robh+dt, krzysztof.kozlowski+dt, linux-mtd, devicetree,
	linux-kernel

Hi

>
> Hi Jaime, Álvaro,
>
> noltari@gmail.com wrote on Wed, 17 May 2023 17:20:26 +0200:
>
> > Hi William,
> >
> > El mié, 17 may 2023 a las 7:30, William Zhang
> > (<william.zhang@broadcom.com>) escribió:
> > >
> > >
> > >
> > > On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
> > > > Sure,
> > > >
> > > > Here you go:
> > > > [    0.000000] Linux version 5.15.111 (noltari@atlantis)
> > > > (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
> > > > 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
> > > > [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
> > > > [    0.000000] MIPS: machine is Sercomm H500-s vfes
> > > > [    0.000000] 128MB of RAM installed
> > > > [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
> > > > [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
> > > > [    0.000000] Initrd not found or empty - disabling initrd
> > > > [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
> > > > [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> > > > [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> > > > linesize 16 bytes
> > > > [    0.000000] Zone ranges:
> > > > [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
> > > > [    0.000000] Movable zone start for each node
> > > > [    0.000000] Early memory node ranges
> > > > [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
> > > > [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> > > > [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
> > > > [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
> > > > [    0.000000] Kernel command line: earlycon
> > > > [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
> > > > bytes, linear)
> > > > [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
> > > > bytes, linear)
> > > > [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> > > > [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
> > > > 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
> > > > cma-reserved)
> > > > [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> > > > [    0.000000] rcu: Hierarchical RCU implementation.
> > > > [    0.000000]  Tracing variant of Tasks RCU enabled.
> > > > [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
> > > > is 10 jiffies.
> > > > [    0.000000] NR_IRQS: 256
> > > > [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
> > > > [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
> > > > [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
> > > > [    0.000000] brcm,bcm63268 detected @ 400 MHz
> > > > [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
> > > > 0xffffffff, max_idle_ns: 9556302233 ns
> > > > [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
> > > > every 10737418237ns
> > > > [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> > > > [    0.074683] pid_max: default: 32768 minimum: 301
> > > > [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> > > > bytes, linear)
> > > > [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> > > > 4096 bytes, linear)
> > > > [    0.106094] rcu: Hierarchical SRCU implementation.
> > > > [    0.112665] smp: Bringing up secondary CPUs ...
> > > > [    0.119348] SMP: Booting CPU1...
> > > > [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> > > > [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> > > > linesize 16 bytes
> > > > [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> > > > [    0.182819] Synchronize counters for CPU 1:
> > > > [    0.203500] SMP: CPU1 is running
> > > > [    0.203512] done.
> > > > [    0.213401] smp: Brought up 1 node, 2 CPUs
> > > > [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> > > > 0xffffffff, max_idle_ns: 19112604462750000 ns
> > > > [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> > > > [    0.246439] pinctrl core: initialized pinctrl subsystem
> > > > [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> > > > [    0.312700] clocksource: Switched to clocksource MIPS
> > > > [    0.321061] NET: Registered PF_INET protocol family
> > > > [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> > > > bytes, linear)
> > > > [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> > > > (order: 0, 6144 bytes, linear)
> > > > [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> > > > 262144 bytes, linear)
> > > > [    0.352721] TCP established hash table entries: 1024 (order: 0,
> > > > 4096 bytes, linear)
> > > > [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> > > > [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> > > > [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> > > > [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> > > > [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> > > > [    0.395748] PCI: CLS 0 bytes, default 16
> > > > [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> > > > [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> > > > [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> > > > (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> > > > [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> > > > registered 14 power domains
> > > > [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> > > > base_baud = 1562500) is a bcm63xx_uart
> > > > [    0.479996] printk: console [ttyS0] enabled
> > > > [    0.479996] printk: console [ttyS0] enabled
> > > > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > > > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > > > [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> > > > [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> > > > state default
> > > > [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > > > [    0.640506] nand: Macronix MX30LF1G18AC
> > > > [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > > > 2048, OOB size: 64
> > > > [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> > > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > > > [    0.703373] Bad block table not found for chip 0
> > > > [    0.732040] Bad block table not found for chip 0
> > > > [    0.736842] Scanning device for bad blocks
> > > > [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> > > > address 00000014, epc == 8009b300, ra == 806cc650
> > > > [    0.843628] Oops[#1]:
> > > > [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> > > > [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> > > > [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> > > > [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> > > > [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> > > >
> > > > Please, tell me if you want me to add any debugging to the log.
> > > >
> > > > Best regards,
> > > > Álvaro.
> > > >
> > > > El mar, 16 may 2023 a las 20:58, Florian Fainelli
> > > > (<f.fainelli@gmail.com>) escribió:
> > > >>
> > > >> +William,
> > > >>
> > > >> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> > > >>> Hi Jaime,
> > > >>>
> > > >>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> > > >>> forcing it to check block protection (it's not supported on that
> > > >>> device), the NAND controller stops reading/writing anything.
> > > >>>
> > > >>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> > > >>> aren't supported on BCM63268 NAND controllers and this is causing the
> > > >>> issue?
> > > >>
> > > >> Yes, this looks like what we have seen as well even with newer NAND
> > > >> controllers actually. Would it be possible to obtain a full log from
> > > >> either of you?
> > > >>
> > > >> William, is this something you have seen before as well?
> > > >>
> > > No, I haven't seen such an issue before.  It is possible I didn't have
> > > these Macronix parts on my board. If I can find a board with a Macronix
> > > part, I will try it. But we don't use this feature and don't connect the
> > > PT pin on our reference board, which means the PT feature is disabled in
> > > the NAND part.
> > >
> > > Alvaro, do you know if your 63268 board has the PT pin connected or not?
> >
> > No, I don't know if the PT pin is connected.
> > I would have to open the case and check, but judging from the
> > following image I would say it's not connected:
> > https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg
> >
> > > Can you check if the Macronix lock and unlock functions are being called
> > > before the hang?   Or is it just the get/set feature function getting
> > > called to determine whether PT is supported?   The get/set feature
> > > functions should work, as they are used by other paths.
> >
> > No, the Macronix lock/unlock functions aren't called before the hang.
> > In fact, if I comment out the nand_get_features call and replace it
> > with ret = 1 it doesn't hang:
> > https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230
>
> This does not make any sense to me. Jaime, can you test with the exact
> same MX30LF1G18AC chip? I'm wondering whether the bug comes from the
> chip or the controller side.
Sure, I have tested the MX30LF1G18AC on a Xilinx zynq-picozed and the
spi-mxic host controller.
The test result is good.

>
> Álvaro, any chance you can try with a mainline kernel rather than
> OpenWrt's?
>
> Thanks,
> Miquèl

Thanks
Jaime

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-17 15:20             ` Álvaro Fernández Rojas
@ 2023-05-22  8:15               ` Miquel Raynal
  -1 siblings, 0 replies; 50+ messages in thread
From: Miquel Raynal @ 2023-05-22  8:15 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: William Zhang, Florian Fainelli, liao jaime, Richard Weinberger,
	Vignesh Raghavendra, robh+dt, krzysztof.kozlowski+dt, linux-mtd,
	devicetree, linux-kernel

Hi Jaime, Álvaro,

noltari@gmail.com wrote on Wed, 17 May 2023 17:20:26 +0200:

> Hi William,
> 
> El mié, 17 may 2023 a las 7:30, William Zhang
> (<william.zhang@broadcom.com>) escribió:
> >
> >
> >
> > On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:  
> > > Sure,
> > >
> > > Here you go:
> > > [    0.000000] Linux version 5.15.111 (noltari@atlantis)
> > > (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
> > > 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
> > > [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
> > > [    0.000000] MIPS: machine is Sercomm H500-s vfes
> > > [    0.000000] 128MB of RAM installed
> > > [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
> > > [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
> > > [    0.000000] Initrd not found or empty - disabling initrd
> > > [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
> > > [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> > > [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> > > linesize 16 bytes
> > > [    0.000000] Zone ranges:
> > > [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
> > > [    0.000000] Movable zone start for each node
> > > [    0.000000] Early memory node ranges
> > > [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
> > > [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> > > [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
> > > [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
> > > [    0.000000] Kernel command line: earlycon
> > > [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
> > > bytes, linear)
> > > [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
> > > bytes, linear)
> > > [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> > > [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
> > > 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
> > > cma-reserved)
> > > [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> > > [    0.000000] rcu: Hierarchical RCU implementation.
> > > [    0.000000]  Tracing variant of Tasks RCU enabled.
> > > [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
> > > is 10 jiffies.
> > > [    0.000000] NR_IRQS: 256
> > > [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
> > > [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
> > > [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
> > > [    0.000000] brcm,bcm63268 detected @ 400 MHz
> > > [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
> > > 0xffffffff, max_idle_ns: 9556302233 ns
> > > [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
> > > every 10737418237ns
> > > [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> > > [    0.074683] pid_max: default: 32768 minimum: 301
> > > [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> > > bytes, linear)
> > > [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> > > 4096 bytes, linear)
> > > [    0.106094] rcu: Hierarchical SRCU implementation.
> > > [    0.112665] smp: Bringing up secondary CPUs ...
> > > [    0.119348] SMP: Booting CPU1...
> > > [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> > > [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> > > linesize 16 bytes
> > > [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> > > [    0.182819] Synchronize counters for CPU 1:
> > > [    0.203500] SMP: CPU1 is running
> > > [    0.203512] done.
> > > [    0.213401] smp: Brought up 1 node, 2 CPUs
> > > [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> > > 0xffffffff, max_idle_ns: 19112604462750000 ns
> > > [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> > > [    0.246439] pinctrl core: initialized pinctrl subsystem
> > > [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> > > [    0.312700] clocksource: Switched to clocksource MIPS
> > > [    0.321061] NET: Registered PF_INET protocol family
> > > [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> > > bytes, linear)
> > > [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> > > (order: 0, 6144 bytes, linear)
> > > [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> > > 262144 bytes, linear)
> > > [    0.352721] TCP established hash table entries: 1024 (order: 0,
> > > 4096 bytes, linear)
> > > [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> > > [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> > > [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> > > [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> > > [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> > > [    0.395748] PCI: CLS 0 bytes, default 16
> > > [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> > > [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> > > [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> > > (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> > > [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> > > registered 14 power domains
> > > [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> > > base_baud = 1562500) is a bcm63xx_uart
> > > [    0.479996] printk: console [ttyS0] enabled
> > > [    0.479996] printk: console [ttyS0] enabled
> > > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > > [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> > > [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> > > state default
> > > [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > > [    0.640506] nand: Macronix MX30LF1G18AC
> > > [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > > 2048, OOB size: 64
> > > [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > > [    0.703373] Bad block table not found for chip 0
> > > [    0.732040] Bad block table not found for chip 0
> > > [    0.736842] Scanning device for bad blocks
> > > [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> > > address 00000014, epc == 8009b300, ra == 806cc650
> > > [    0.843628] Oops[#1]:
> > > [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> > > [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> > > [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> > > [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> > > [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> > >
> > > Please, tell me if you want me to add any debugging to the log.
> > >
> > > Best regards,
> > > Álvaro.
> > >
> > > El mar, 16 may 2023 a las 20:58, Florian Fainelli
> > > (<f.fainelli@gmail.com>) escribió:  
> > >>
> > >> +William,
> > >>
> > >> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:  
> > >>> Hi Jaime,
> > >>>
> > >>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> > >>> forcing it to check block protection (it's not supported on that
> > >>> device), the NAND controller stops reading/writing anything.
> > >>>
> > >>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> > >>> aren't supported on BCM63268 NAND controllers and this is causing the
> > >>> issue?  
> > >>
> > >> Yes, this looks like what we have seen as well even with newer NAND
> > >> controllers actually. Would it be possible to obtain a full log from
> > >> either of you?
> > >>
> > >> William, is this something you have seen before as well?
> > >>  
> > No, I haven't seen such an issue before.  It is possible I didn't have
> > these Macronix parts on my board. If I can find a board with a Macronix
> > part, I will try it. But we don't use this feature and don't connect the
> > PT pin on our reference board, which means the PT feature is disabled in
> > the NAND part.
> >
> > Alvaro, do you know if your 63268 board has the PT pin connected or not?
> 
> No, I don't know if the PT pin is connected.
> I would have to open the case and check, but judging from the
> following image I would say it's not connected:
> https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg
> 
> > Can you check if the Macronix lock and unlock functions are being called
> > before the hang?   Or is it just the get/set feature function getting
> > called to determine whether PT is supported?   The get/set feature
> > functions should work, as they are used by other paths.
> 
> No, the Macronix lock/unlock functions aren't called before the hang.
> In fact, if I comment out the nand_get_features call and replace it
> with ret = 1 it doesn't hang:
> https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230

This does not make any sense to me. Jaime, can you test with the exact
same MX30LF1G18AC chip? I'm wondering whether the bug comes from the
chip or the controller side.
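
To make the check under discussion concrete, here is a minimal user-space
sketch in C. It is not the driver code: get_features_fn and the three
fake_get_* helpers are hypothetical stand-ins for the controller's
GET_FEATURES path, 0x38 is the "all locked" value mentioned earlier in this
thread, and the 0xA0 feature address is an assumption. The point is only that
a failing or unexpected read makes the probe bail out early, which matches the
observation that forcing ret = 1 avoids the hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: 0x38 ("all locked") is quoted in this thread,
 * the 0xA0 feature address is an assumption made for this sketch. */
#define FEATURE_ADDR_PROTECTION 0xA0
#define PROTECTION_ALL_LOCK     0x38
#define SUBFEATURE_PARAM_LEN    4

/* Hypothetical stand-in for a controller's GET_FEATURES operation:
 * returns 0 on success and fills 'out', non-zero on failure. */
typedef int (*get_features_fn)(uint8_t addr, uint8_t out[SUBFEATURE_PARAM_LEN]);

/* Report block protection as supported only if the read succeeds
 * and the first sub-feature byte is the "all locked" value. */
bool block_protection_supported(get_features_fn get)
{
	uint8_t feature[SUBFEATURE_PARAM_LEN] = { 0 };
	int ret;

	assert(get != NULL);
	ret = get(FEATURE_ADDR_PROTECTION, feature);
	if (ret || feature[0] != PROTECTION_ALL_LOCK)
		return false;

	return true;
}

/* Fake chip that powers up with every block locked. */
int fake_get_locked(uint8_t addr, uint8_t out[SUBFEATURE_PARAM_LEN])
{
	(void)addr;
	out[0] = PROTECTION_ALL_LOCK;
	return 0;
}

/* Fake chip that answers, but does not report the locked state. */
int fake_get_unlocked(uint8_t addr, uint8_t out[SUBFEATURE_PARAM_LEN])
{
	(void)addr;
	out[0] = 0x00;
	return 0;
}

/* Models the "ret = 1" experiment: fail without touching the chip. */
int fake_get_error(uint8_t addr, uint8_t out[SUBFEATURE_PARAM_LEN])
{
	(void)addr;
	(void)out;
	return 1;
}
```

With fake_get_error the probe returns false without ever issuing a command,
which is the same path the ret = 1 experiment takes. It says nothing about why
the real GET_FEATURES call upsets the BCM63268 controller.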

Álvaro, any chance you can try with a mainline kernel rather than
OpenWrt's?

Thanks,
Miquèl

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-17  5:30           ` William Zhang
@ 2023-05-17 15:20             ` Álvaro Fernández Rojas
  -1 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-05-17 15:20 UTC (permalink / raw)
  To: William Zhang
  Cc: Florian Fainelli, liao jaime, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, robh+dt, krzysztof.kozlowski+dt, linux-mtd,
	devicetree, linux-kernel

Hi William,

On Wed, 17 May 2023 at 7:30, William Zhang
(<william.zhang@broadcom.com>) wrote:
>
>
>
> On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
> > Sure,
> >
> > Here you go:
> > [    0.000000] Linux version 5.15.111 (noltari@atlantis)
> > (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
> > 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
> > [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
> > [    0.000000] MIPS: machine is Sercomm H500-s vfes
> > [    0.000000] 128MB of RAM installed
> > [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
> > [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
> > [    0.000000] Initrd not found or empty - disabling initrd
> > [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
> > [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> > [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> > linesize 16 bytes
> > [    0.000000] Zone ranges:
> > [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
> > [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> > [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
> > [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
> > [    0.000000] Kernel command line: earlycon
> > [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
> > bytes, linear)
> > [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
> > bytes, linear)
> > [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> > [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
> > 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
> > cma-reserved)
> > [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> > [    0.000000] rcu: Hierarchical RCU implementation.
> > [    0.000000]  Tracing variant of Tasks RCU enabled.
> > [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
> > is 10 jiffies.
> > [    0.000000] NR_IRQS: 256
> > [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
> > [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
> > [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
> > [    0.000000] brcm,bcm63268 detected @ 400 MHz
> > [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
> > 0xffffffff, max_idle_ns: 9556302233 ns
> > [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
> > every 10737418237ns
> > [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> > [    0.074683] pid_max: default: 32768 minimum: 301
> > [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> > bytes, linear)
> > [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> > 4096 bytes, linear)
> > [    0.106094] rcu: Hierarchical SRCU implementation.
> > [    0.112665] smp: Bringing up secondary CPUs ...
> > [    0.119348] SMP: Booting CPU1...
> > [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> > [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> > linesize 16 bytes
> > [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> > [    0.182819] Synchronize counters for CPU 1:
> > [    0.203500] SMP: CPU1 is running
> > [    0.203512] done.
> > [    0.213401] smp: Brought up 1 node, 2 CPUs
> > [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> > 0xffffffff, max_idle_ns: 19112604462750000 ns
> > [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> > [    0.246439] pinctrl core: initialized pinctrl subsystem
> > [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> > [    0.312700] clocksource: Switched to clocksource MIPS
> > [    0.321061] NET: Registered PF_INET protocol family
> > [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> > bytes, linear)
> > [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> > (order: 0, 6144 bytes, linear)
> > [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> > 262144 bytes, linear)
> > [    0.352721] TCP established hash table entries: 1024 (order: 0,
> > 4096 bytes, linear)
> > [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> > [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> > [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> > [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> > [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> > [    0.395748] PCI: CLS 0 bytes, default 16
> > [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> > [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> > [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> > (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> > [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> > registered 14 power domains
> > [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> > base_baud = 1562500) is a bcm63xx_uart
> > [    0.479996] printk: console [ttyS0] enabled
> > [    0.479996] printk: console [ttyS0] enabled
> > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> > [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> > state default
> > [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > [    0.640506] nand: Macronix MX30LF1G18AC
> > [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > 2048, OOB size: 64
> > [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > [    0.703373] Bad block table not found for chip 0
> > [    0.732040] Bad block table not found for chip 0
> > [    0.736842] Scanning device for bad blocks
> > [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> > address 00000014, epc == 8009b300, ra == 806cc650
> > [    0.843628] Oops[#1]:
> > [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> > [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> > [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> > [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> > [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> >
> > Please, tell me if you want me to add any debugging to the log.
> >
> > Best regards,
> > Álvaro.
> >
> > On Tue, 16 May 2023 at 20:58, Florian Fainelli
> > (<f.fainelli@gmail.com>) wrote:
> >>
> >> +William,
> >>
> >> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> >>> Hi Jaime,
> >>>
> >>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> >>> forcing it to check block protection (it's not supported on that
> >>> device), the NAND controller stops reading/writing anything.
> >>>
> >>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> >>> aren't supported on BCM63268 NAND controllers and this is causing the
> >>> issue?
> >>
> >> Yes, this looks like what we have seen as well even with newer NAND
> >> controllers actually. Would it be possible to obtain a full log from
> >> either of you?
> >>
> >> William, is this something you have seen before as well?
> >>
> No, I haven't seen such an issue before.  It is possible I didn't have these
> Macronix parts in my boards. If I can find a board with a Macronix part,
> I will try it. But we don't use this feature and don't connect the PT
> pin in our reference board, which means the PT feature is disabled in the
> NAND part.
>
> Álvaro, do you know whether your 63268 board has the PT pin connected or not?

No, I don't know if the PT pin is connected.
I would have to open the case and check, but judging from the
following image I would say it's not connected:
https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg

> Can you check whether the Macronix lock and unlock functions are being called
> before the hang?   Or is it just the get/set feature function getting called
> to determine whether PT is supported?   The get/set feature functions should
> work, as they are used by other paths.

No, the macronix's lock/unlock functions aren't called before the hang.
In fact, if I comment out the nand_get_features call and replace it
with ret = 1 it doesn't hang:
https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230
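
For readers following the thread, the probe path discussed here can be
modelled in user space. This is only a sketch under stated assumptions: the
constants mirror the values that appear in the thread (feature address 0xa0,
"all locked" value 0x38), while get_features_fn, the three stub_* functions
and mxic_probe_block_protection() are hypothetical names standing in for the
kernel's nand_get_features() call and the check done in
macronix_nand_block_protection_support(). It is not the driver code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the thread: feature address 0xa0, "all locked" = 0x38. */
#define ONFI_FEATURE_ADDR_MXIC_PROTECTION 0xA0
#define MXIC_BLOCK_PROTECTION_ALL_LOCK    0x38
#define MXIC_BLOCK_PROTECTION_ALL_UNLOCK  0x00
#define SUBFEATURE_PARAM_LEN              4

/* Hypothetical stand-in for nand_get_features(): 0 on success, non-zero on error. */
typedef int (*get_features_fn)(uint8_t addr, uint8_t *param);

/* Returns true only when lock/unlock support would be enabled. */
bool mxic_probe_block_protection(get_features_fn get_features)
{
	uint8_t feature[SUBFEATURE_PARAM_LEN] = { MXIC_BLOCK_PROTECTION_ALL_UNLOCK };
	int ret = get_features(ONFI_FEATURE_ADDR_MXIC_PROTECTION, feature);

	/* Error, or the chip does not report "all locked" (0x38): bail out. */
	if (ret || feature[0] != MXIC_BLOCK_PROTECTION_ALL_LOCK)
		return false;

	return true;
}

/* Stub matching the debug trace in this thread: the chip answers 00 00 00 00. */
int stub_all_unlocked(uint8_t addr, uint8_t *param)
{
	(void)addr;
	param[0] = param[1] = param[2] = param[3] = 0x00;
	return 0;
}

/* Stub for a chip that reports every block locked. */
int stub_all_locked(uint8_t addr, uint8_t *param)
{
	(void)addr;
	param[0] = MXIC_BLOCK_PROTECTION_ALL_LOCK;
	param[1] = param[2] = param[3] = 0x00;
	return 0;
}

/* Stub for the "replace the call with ret = 1" experiment: no bus access at all. */
int stub_skipped(uint8_t addr, uint8_t *param)
{
	(void)addr;
	(void)param;
	return 1;
}
```

With these stubs, mxic_probe_block_protection() returns false for both
stub_all_unlocked and stub_skipped, so the software-visible result of the
probe is the same in the hanging and the non-hanging case. In this model the
only difference is whether a GET FEATURES cycle is actually issued, which
would point at the bus or controller side rather than at the lock/unlock
helpers.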

>
>
> >>>
> >>> Best regards,
> >>> Álvaro.
> >>>
> >>> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
> >>>>
> >>>> Hi Álvaro
> >>>>
> >>>> In nand_scan_tail(), each manufacturer's init function is executed.
> >>>> In macronix_nand_init(), the block protection check is executed after flash detection.
> >>>> I have validated MX30LF1G18AC on Linux kernel v5.15.
> >>>> I didn't see the "device hangs" situation on my side.
> >>>> BP is there to prevent incorrect operations.
> >>>> Please check the controller settings to trace this issue.
> >>>>
> >>>> Thanks
> >>>> Jaime
> >>>>
> >>>>>
> >>>>> Hello YouChing and Jaime,
> >>>>>
> >>>>> I still didn't get any feedback from you (or Macronix) on this issue.
> >>>>> Did you have time to look into it?
> >>>>>
> >>>>> Thanks,
> >>>>> Álvaro.
> >>>>>
> >>>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
> >>>>> (<noltari@gmail.com>) wrote:
> >>>>>>
> >>>>>> Hi Miquèl,
> >>>>>>
> >>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>> Hi Álvaro,
> >>>>>>>
> >>>>>>> + YouChing and Jaime from Macronix
> >>>>>>> TLDR for them: there is a misbehavior since Mason added block
> >>>>>>> protection support. Just checking if the blocks are protected seems to
> >>>>>>> misconfigure the chip entirely, see below. Any hints?
> >>>>>>
> >>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
> >>>>>> isn’t sent after getting the features?
> >>>>>>
> >>>>>>>
> >>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >>>>>>>
> >>>>>>>> Hi Miquèl,
> >>>>>>>>
> >>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>>>> Hi Álvaro,
> >>>>>>>>>
> >>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >>>>>>>>>
> >>>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>
> >>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
> >>>>>>>>>>>>>> This binding allows disabling block protection support for those
> >>>>>>>>>>>>>> devices not supporting it.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>    Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> >>>>>>>>>>>>>>    1 file changed, 3 insertions(+)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
> >>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
> >>>>>>>>>>>>>>    Required NAND chip properties in children mode:
> >>>>>>>>>>>>>>    - randomizer enable: should be "mxic,enable-randomizer-otp"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
> >>>>>>>>>>>>>> +- block protection disable: should be "mxic,disable-block-protection"
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding conversions
> >>>>>>>>>>>>> to yaml before adding anything, I don't think this will fly.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
> >>>>>>>>>>>>> already have similar properties like "lock" and "secure-regions",
> >>>>>>>>>>>>> not sure they will fit but I think it's worth checking.
> >>>>>>>>>>>>
> >>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression on
> >>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> >>>>>>>>>>>> which hangs the device.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is the log with block protection disabled:
> >>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>> for
> >>>>>>>>>>>> state default
> >>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>> 0xf1
> >>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
> >>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
> >>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
> >>>>>>>>>>>> brcmnand.0
> >>>>>>>>>>>> ...
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is the log with block protection enabled:
> >>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>> for
> >>>>>>>>>>>> state default
> >>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>> 0xf1
> >>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
> >>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
> >>>>>>>>>>>> virtual
> >>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
> >>>>>>>>>>>> *** Device hangs ***
> >>>>>>>>>>>>
> >>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
> >>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
> >>>>>>>>>>>> scan for bad blocks.
> >>>>>>>>>>>
> >>>>>>>>>>> Please trace nand_macronix.c and look:
> >>>>>>>>>>> - are the get_features and set_features really supported by the
> >>>>>>>>>>>     controller driver?
> >>>>>>>>>>
> >>>>>>>>>> This is what I could find by debugging:
> >>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >>>>>>>>>> state default
> >>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>> 0xf1
> >>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
> >>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >>>>>>>>>> 00 00 00] -> 0
> >>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
> >>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
> >>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >>>>>>>>>> [    0.624760] Bad block table not found for chip 0
> >>>>>>>>>> [    0.635542] Bad block table not found for chip 0
> >>>>>>>>>> [    0.640270] Scanning device for bad blocks
> >>>>>>>>>>
> >>>>>>>>>> I don't know how to tell if get_features / set_features is really
> >>>>>>>>>> supported...
> >>>>>>>>>
> >>>>>>>>> Looks like your driver does not support exec_op but the core provides a
> >>>>>>>>> get/set_feature implementation.
> >>>>>>>>
> >>>>>>>> According to Florian, low level should be supported on brcmnand
> >>>>>>>> controllers >= 4.0
> >>>>>>>> Also:
> >>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >>>>>>>
> >>>>>>> Just to be sure, you're using a mainline controller driver, not this
> >>>>>>> one?
> >>>>>>
> >>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
> >>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
> >>>>>>
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> - what is the state of the locking configuration in the chip when
> >>>>>>>>>>>     you boot?
> >>>>>>>>>>
> >>>>>>>>>> Unlocked, I guess...
> >>>>>>>>>> How can I check that?
> >>>>>>>>>
> >>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
> >>>>>>>>> apparently.
> >>>>>>>>
> >>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
> >>>>>>>> so I guess we can confirm it’s unlocked…
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()?
> >>>>>>>>>
> >>>>>>>>> So nobody locks the device I guess? Did you add traces there?
> >>>>>>>>
> >>>>>>>> It doesn’t get to the point where it enables the lock/unlock functions,
> >>>>>>>> since it fails when checking whether the feature is 0x38, so there’s no
> >>>>>>>> point in adding those traces…
> >>>>>>>
> >>>>>>> Right, it returns before setting these I guess.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
> >>>>>>>>>>>     hanging exactly? (offset in nand/ and line in the code)
> >>>>>>>>>>
> >>>>>>>>>> I've got no idea...
> >>>>>>>>>
> >>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
> >>>>>>>>> closer and closer.
> >>>>>>>>
> >>>>>>>> I think that after trying to get the feature it just starts reading
> >>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
> >>>>>>>
> >>>>>>> It should refuse to mount the device somehow, but in no case the kernel
> >>>>>>> should hang.
> >>>>>>
> >>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >>>>>>
> >>>>>>>
> >>>>>>>> Is it possible that the NAND starts behaving like this after getting
> >>>>>>>> the feature due to some specific config of my device?
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
> >>>>>>>>> close enough datasheet and I don't see what could confuse the device.
> >>>>>>>>>
> >>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
> >>>>>>>>> your design?
> >>>>>>>>
> >>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
> >>>>>>>
> >>>>>>> What about the chip?
> >>>>>>
> >>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Miquèl
> >>>>>>>
> >>
> >> --
> >> Florian
> >>

--
Álvaro
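
The "ll_op cmd" words in the debug trace quoted above can be decoded with a
small helper. This is a minimal sketch, assuming the bit layout of the
LLOP_* definitions in the mainline brcmnand driver (RE = bit 16, WE = bit 17,
ALE = bit 18, CLE = bit 19, RETURN_IDLE = bit 31, data in bits 15:0); whether
the OpenWrt bcm6368_nand build uses exactly this layout is an assumption.
The names ll_op_kind, struct ll_op and ll_op_decode() are hypothetical and
exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout assumed from the LLOP_* definitions in brcmnand.c. */
#define LLOP_RE          (UINT32_C(1) << 16)
#define LLOP_WE          (UINT32_C(1) << 17)
#define LLOP_ALE         (UINT32_C(1) << 18)
#define LLOP_CLE         (UINT32_C(1) << 19)
#define LLOP_RETURN_IDLE (UINT32_C(1) << 31)
#define LLOP_DATA_MASK   UINT32_C(0xffff)

enum ll_op_kind { LL_OP_CMD, LL_OP_ADDR, LL_OP_WR, LL_OP_RD, LL_OP_UNKNOWN };

struct ll_op {
	enum ll_op_kind kind; /* which bus cycle the word requests */
	uint16_t data;        /* command, address or data value */
	bool last;            /* RETURN_IDLE set: final cycle of the sequence */
};

/* Split one low-level operation word into its cycle type, data and "last" flag. */
struct ll_op ll_op_decode(uint32_t word)
{
	struct ll_op op;
	uint32_t strobes = word & (LLOP_RE | LLOP_WE | LLOP_ALE | LLOP_CLE);

	op.data = (uint16_t)(word & LLOP_DATA_MASK);
	op.last = (word & LLOP_RETURN_IDLE) != 0;

	if (strobes == (LLOP_WE | LLOP_CLE))
		op.kind = LL_OP_CMD;   /* command cycle */
	else if (strobes == (LLOP_WE | LLOP_ALE))
		op.kind = LL_OP_ADDR;  /* address cycle */
	else if (strobes == LLOP_WE)
		op.kind = LL_OP_WR;    /* data write cycle */
	else if (strobes == LLOP_RE)
		op.kind = LL_OP_RD;    /* data read cycle */
	else
		op.kind = LL_OP_UNKNOWN;

	return op;
}
```

Under that assumed layout, the trace reads as: 0xa00ee is a command cycle
with 0xee (ONFI GET FEATURES), 0x600a0 is an address cycle with 0xa0 (the
Macronix protection feature address), and the three 0x10000 words plus the
final 0x80010000 are four data reads, the last one returning the controller
to idle. That would match the four subfeature parameter bytes [00 00 00 00]
printed by nand_get_features in the same trace.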

^ permalink raw reply	[flat|nested] 50+ messages in thread

> > every 10737418237ns
> > [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> > [    0.074683] pid_max: default: 32768 minimum: 301
> > [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> > bytes, linear)
> > [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> > 4096 bytes, linear)
> > [    0.106094] rcu: Hierarchical SRCU implementation.
> > [    0.112665] smp: Bringing up secondary CPUs ...
> > [    0.119348] SMP: Booting CPU1...
> > [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> > [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> > linesize 16 bytes
> > [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> > [    0.182819] Synchronize counters for CPU 1:
> > [    0.203500] SMP: CPU1 is running
> > [    0.203512] done.
> > [    0.213401] smp: Brought up 1 node, 2 CPUs
> > [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> > 0xffffffff, max_idle_ns: 19112604462750000 ns
> > [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> > [    0.246439] pinctrl core: initialized pinctrl subsystem
> > [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> > [    0.312700] clocksource: Switched to clocksource MIPS
> > [    0.321061] NET: Registered PF_INET protocol family
> > [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> > bytes, linear)
> > [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> > (order: 0, 6144 bytes, linear)
> > [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> > 262144 bytes, linear)
> > [    0.352721] TCP established hash table entries: 1024 (order: 0,
> > 4096 bytes, linear)
> > [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> > [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> > [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> > [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> > [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> > [    0.395748] PCI: CLS 0 bytes, default 16
> > [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> > [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> > [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> > (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> > [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> > registered 14 power domains
> > [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> > base_baud = 1562500) is a bcm63xx_uart
> > [    0.479996] printk: console [ttyS0] enabled
> > [    0.479996] printk: console [ttyS0] enabled
> > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> > [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> > [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> > state default
> > [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > [    0.640506] nand: Macronix MX30LF1G18AC
> > [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > 2048, OOB size: 64
> > [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > [    0.703373] Bad block table not found for chip 0
> > [    0.732040] Bad block table not found for chip 0
> > [    0.736842] Scanning device for bad blocks
> > [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> > address 00000014, epc == 8009b300, ra == 806cc650
> > [    0.843628] Oops[#1]:
> > [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> > [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> > [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> > [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> > [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> >
> > Please, tell me if you want me to add any debugging to the log.
> >
> > Best regards,
> > Álvaro.
> >
> > On Tue, 16 May 2023 at 20:58, Florian Fainelli
> > (<f.fainelli@gmail.com>) wrote:
> >>
> >> +William,
> >>
> >> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> >>> Hi Jaime,
> >>>
> >>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> >>> forcing it to check block protection (it's not supported on that
> >>> device), the NAND controller stops reading/writing anything.
> >>>
> >>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> >>> aren't supported on BCM63268 NAND controllers and this is causing the
> >>> issue?
> >>
> >> Yes, this looks like what we have seen as well even with newer NAND
> >> controllers actually. Would it be possible to obtain a full log from
> >> either of you?
> >>
> >> William, is this something you have seen before as well?
> >>
> No, I haven't seen such an issue before. It is possible that I didn't have
> these Macronix parts on my boards. If I can find a board with a Macronix part,
> I will try it. But we don't use this feature and don't connect the PT
> pin on our reference board, which means the PT feature is disabled in the
> NAND part.
>
> Álvaro, do you know whether your 63268 board has the PT pin connected?

No, I don't know whether the PT pin is connected.
I would have to open the case and check, but judging from the
following image I would say it's not connected:
https://openwrt.org/_media/media/sercomm/h500s/h500s-nand.jpg

> Can you check whether the Macronix lock and unlock functions are being called
> before the hang? Or is it just the get/set feature functions being called
> to determine whether PT is supported? The get/set feature functions should
> work, as they are used by other paths.

No, the Macronix lock/unlock functions aren't called before the hang.
In fact, if I comment out the nand_get_features call and replace it
with ret = 1, it doesn't hang:
https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/drivers/mtd/nand/raw/nand_macronix.c#L229-L230
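
For illustration only, the experiment above can be modelled in user space
roughly as follows. This is a sketch, not the kernel code: the feature
address (0xa0) and the all-lock value (0x38) come from the debug trace
quoted in this thread, while probe_block_protection(), get_features_fn,
struct probe_result and get_features_all_zero() are hypothetical names
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Both values appear in the debug trace quoted in this thread. */
#define FEATURE_ADDR_MXIC_PROTECTION   0xA0
#define MXIC_BLOCK_PROTECTION_ALL_LOCK 0x38
#define SUBFEATURE_PARAM_LEN           4

/* Hypothetical stand-in for the controller's GET_FEATURES path. */
typedef int (*get_features_fn)(uint8_t addr, uint8_t *param);

struct probe_result {
	bool issued_get_features;  /* did the probe touch the NAND bus? */
	bool lock_hooks_installed; /* would lock/unlock be wired up? */
};

/*
 * Simplified model of the block protection probe. With skip_probe set,
 * it behaves like the "ret = 1" experiment: GET_FEATURES is never
 * issued and the probe bails out before installing any hooks.
 */
static struct probe_result probe_block_protection(get_features_fn get_features,
						  bool skip_probe)
{
	struct probe_result res = { false, false };
	uint8_t param[SUBFEATURE_PARAM_LEN] = { 0 };

	if (skip_probe)
		return res;

	assert(get_features != NULL);
	res.issued_get_features = true;

	/* A failed read or any value other than "all locked" ends the probe. */
	if (get_features(FEATURE_ADDR_MXIC_PROTECTION, param) != 0)
		return res;
	if (param[0] != MXIC_BLOCK_PROTECTION_ALL_LOCK)
		return res;

	res.lock_hooks_installed = true;
	return res;
}

/* Mimics the traced chip: the command succeeds and returns 00 00 00 00. */
static int get_features_all_zero(uint8_t addr, uint8_t *param)
{
	(void)addr;
	for (size_t i = 0; i < SUBFEATURE_PARAM_LEN; i++)
		param[i] = 0x00;
	return 0;
}
```

In both cases the lock/unlock hooks end up not installed; the only
difference between the hanging and the working boot in this model is
whether GET_FEATURES was issued at all.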

>
>
> >>>
> >>> Best regards,
> >>> Álvaro.
> >>>
> >>> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
> >>>>
> >>>> Hi Álvaro
> >>>>
> >>>> In nand_scan_tail(), each manufacturer's init function is executed.
> >>>> In macronix_nand_init(), block protection runs after flash detection.
> >>>> I have validated MX30LF1G18AC with Linux kernel v5.15.
> >>>> I didn't get the "device hangs" situation on my side.
> >>>> BP is there to prevent incorrect operations.
> >>>> Please check the controller settings to trace this issue.
> >>>>
> >>>> Thanks
> >>>> Jaime
> >>>>
> >>>>>
> >>>>> Hello YouChing and Jaime,
> >>>>>
> >>>>> I still didn't get any feedback from you (or Macronix) on this issue.
> >>>>> Did you have time to look into it?
> >>>>>
> >>>>> Thanks,
> >>>>> Álvaro.
> >>>>>
> >>>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
> >>>>> (<noltari@gmail.com>) wrote:
> >>>>>>
> >>>>>> Hi Miquèl,
> >>>>>>
> >>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>> Hi Álvaro,
> >>>>>>>
> >>>>>>> + YouChing and Jaime from Macronix
> >>>>>>> TLDR for them: there is a misbehavior since Mason added block
> >>>>>>> protection support. Just checking if the blocks are protected seems to
> >>>>>>> misconfigure the chip entirely, see below. Any hints?
> >>>>>>
> >>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
> >>>>>> isn’t sent after getting the features?
> >>>>>>
> >>>>>>>
> >>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >>>>>>>
> >>>>>>>> Hi Miquèl,
> >>>>>>>>
> >>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>>>> Hi Álvaro,
> >>>>>>>>>
> >>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >>>>>>>>>
> >>>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>
> >>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
> >>>>>>>>>>>>>> This binding allows disabling block protection support for
> >>>>>>>>>>>>>> those
> >>>>>>>>>>>>>> devices not
> >>>>>>>>>>>>>> supporting it.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>    Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> >>>>>>>>>>>>>> +++
> >>>>>>>>>>>>>>    1 file changed, 3 insertions(+)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> diff --git
> >>>>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
> >>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
> >>>>>>>>>>>>>>    Required NAND chip properties in children mode:
> >>>>>>>>>>>>>>    - randomizer enable: should be "mxic,enable-randomizer-otp"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
> >>>>>>>>>>>>>> +- block protection disable: should be
> >>>>>>>>>>>>>> "mxic,disable-block-protection"
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding
> >>>>>>>>>>>>> conversions to yaml before adding anything, I don't think
> >>>>>>>>>>>>> this will fly.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
> >>>>>>>>>>>>> already have similar properties like "lock" and
> >>>>>>>>>>>>> "secure-regions", not sure they will fit but I think it's
> >>>>>>>>>>>>> worth checking.
> >>>>>>>>>>>>
> >>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression
> >>>>>>>>>>>> on
> >>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> >>>>>>>>>>>> MX30LF1G18AC
> >>>>>>>>>>>> which hangs the device.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is the log with block protection disabled:
> >>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>> for
> >>>>>>>>>>>> state default
> >>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>> 0xf1
> >>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
> >>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
> >>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
> >>>>>>>>>>>> brcmnand.0
> >>>>>>>>>>>> ...
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is the log with block protection enabled:
> >>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>>>> for
> >>>>>>>>>>>> state default
> >>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>>>> 0xf1
> >>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
> >>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
> >>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
> >>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
> >>>>>>>>>>>> virtual
> >>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
> >>>>>>>>>>>> *** Device hangs ***
> >>>>>>>>>>>>
> >>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
> >>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
> >>>>>>>>>>>> scan
> >>>>>>>>>>>> for bad blocks.
> >>>>>>>>>>>
> >>>>>>>>>>> Please trace nand_macronix.c and look:
> >>>>>>>>>>> - are the get_features and set_features really supported by the
> >>>>>>>>>>>     controller driver?
> >>>>>>>>>>
> >>>>>>>>>> This is what I could find by debugging:
> >>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >>>>>>>>>> state default
> >>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>> 0xf1
> >>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
> >>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>>>> 0x00
> >>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >>>>>>>>>> 00 00 00] -> 0
> >>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
> >>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
> >>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >>>>>>>>>> [    0.624760] Bad block table not found for chip 0
> >>>>>>>>>> [    0.635542] Bad block table not found for chip 0
> >>>>>>>>>> [    0.640270] Scanning device for bad blocks
> >>>>>>>>>>
> >>>>>>>>>> I don't know how to tell if get_features / set_features is really
> >>>>>>>>>> supported...
> >>>>>>>>>
> >>>>>>>>> Looks like your driver does not support exec_op but the core provides a
> >>>>>>>>> get/set_feature implementation.
> >>>>>>>>
> >>>>>>>> According to Florian, low level should be supported on brcmnand
> >>>>>>>> controllers >= 4.0
> >>>>>>>> Also:
> >>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >>>>>>>
> >>>>>>> Just to be sure, you're using a mainline controller driver, not this
> >>>>>>> one?
> >>>>>>
> >>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
> >>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
> >>>>>>
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> - what is the state of the locking configuration in the chip when
> >>>>>>>>>>> you
> >>>>>>>>>>>     boot?
> >>>>>>>>>>
> >>>>>>>>>> Unlocked, I guess...
> >>>>>>>>>> How can I check that?
> >>>>>>>>>
> >>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
> >>>>>>>>> apparently.
> >>>>>>>>
> >>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
> >>>>>>>> so I guess we can confirm it’s unlocked…
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
> >>>>>>>>>>> ?
> >>>>>>>>>
> >>>>>>>>> So nobody locks the device I guess? Did you add traces there?
> >>>>>>>>
> >>>>>>>> It doesn’t get to the point that it enabled the lock/unlock functions
> >>>>>>>> since it fails when checking if feature is 0x38, so there’s no point
> >>>>>>>> in adding those traces…
> >>>>>>>
> >>>>>>> Right, it returns before setting these I guess.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
> >>>>>>>>>>> hanging
> >>>>>>>>>>>     exactly? (offset in nand/ and line in the code)
> >>>>>>>>>>
> >>>>>>>>>> I've got no idea...
> >>>>>>>>>
> >>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
> >>>>>>>>> closer and closer.
> >>>>>>>>
> >>>>>>>> I think that after trying to get the feature it just start reading
> >>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
> >>>>>>>
> >>>>>>> It should refuse to mount the device somehow, but in no case the kernel
> >>>>>>> should hang.
> >>>>>>
> >>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >>>>>>
> >>>>>>>
> >>>>>>>> Is it possible that the NAND starts behaving like this after getting
> >>>>>>>> the feature due to some specific config of my device?
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
> >>>>>>>>> close enough datasheet and I don't see what could confuse the device.
> >>>>>>>>>
> >>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
> >>>>>>>>> your design?
> >>>>>>>>
> >>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
> >>>>>>>
> >>>>>>> What about the chip?
> >>>>>>
> >>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Miquèl
> >>>>>>>
> >>
> >> --
> >> Florian
> >>

--
Álvaro

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-16 19:02         ` Álvaro Fernández Rojas
@ 2023-05-17  5:30           ` William Zhang
  -1 siblings, 0 replies; 50+ messages in thread
From: William Zhang @ 2023-05-17  5:30 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, Florian Fainelli
  Cc: liao jaime, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, robh+dt, krzysztof.kozlowski+dt, linux-mtd,
	devicetree, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 19493 bytes --]



On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
> Sure,
> 
> Here you go:
> [    0.000000] Linux version 5.15.111 (noltari@atlantis)
> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
> 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
> [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
> [    0.000000] MIPS: machine is Sercomm H500-s vfes
> [    0.000000] 128MB of RAM installed
> [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
> [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
> [    0.000000] Initrd not found or empty - disabling initrd
> [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
> [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> linesize 16 bytes
> [    0.000000] Zone ranges:
> [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
> [    0.000000] Kernel command line: earlycon
> [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
> bytes, linear)
> [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
> bytes, linear)
> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
> 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
> cma-reserved)
> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> [    0.000000] rcu: Hierarchical RCU implementation.
> [    0.000000]  Tracing variant of Tasks RCU enabled.
> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
> is 10 jiffies.
> [    0.000000] NR_IRQS: 256
> [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
> [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
> [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
> [    0.000000] brcm,bcm63268 detected @ 400 MHz
> [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 9556302233 ns
> [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
> every 10737418237ns
> [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> [    0.074683] pid_max: default: 32768 minimum: 301
> [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> bytes, linear)
> [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> 4096 bytes, linear)
> [    0.106094] rcu: Hierarchical SRCU implementation.
> [    0.112665] smp: Bringing up secondary CPUs ...
> [    0.119348] SMP: Booting CPU1...
> [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> linesize 16 bytes
> [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> [    0.182819] Synchronize counters for CPU 1:
> [    0.203500] SMP: CPU1 is running
> [    0.203512] done.
> [    0.213401] smp: Brought up 1 node, 2 CPUs
> [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 19112604462750000 ns
> [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> [    0.246439] pinctrl core: initialized pinctrl subsystem
> [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> [    0.312700] clocksource: Switched to clocksource MIPS
> [    0.321061] NET: Registered PF_INET protocol family
> [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> bytes, linear)
> [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> (order: 0, 6144 bytes, linear)
> [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> 262144 bytes, linear)
> [    0.352721] TCP established hash table entries: 1024 (order: 0,
> 4096 bytes, linear)
> [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> [    0.395748] PCI: CLS 0 bytes, default 16
> [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> registered 14 power domains
> [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> base_baud = 1562500) is a bcm63xx_uart
> [    0.479996] printk: console [ttyS0] enabled
> [    0.479996] printk: console [ttyS0] enabled
> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> state default
> [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> [    0.640506] nand: Macronix MX30LF1G18AC
> [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> 2048, OOB size: 64
> [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> [    0.703373] Bad block table not found for chip 0
> [    0.732040] Bad block table not found for chip 0
> [    0.736842] Scanning device for bad blocks
> [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> address 00000014, epc == 8009b300, ra == 806cc650
> [    0.843628] Oops[#1]:
> [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> 
> Please, tell me if you want me to add any debugging to the log.
> 
> Best regards,
> Álvaro.
> 
> On Tue, 16 May 2023 at 20:58, Florian Fainelli
> (<f.fainelli@gmail.com>) wrote:
>>
>> +William,
>>
>> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
>>> Hi Jaime,
>>>
>>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
>>> forcing it to check block protection (it's not supported on that
>>> device), the NAND controller stops reading/writing anything.
>>>
>>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
>>> aren't supported on BCM63268 NAND controllers and this is causing the
>>> issue?
>>
>> Yes, this looks like what we have seen as well even with newer NAND
>> controllers actually. Would it be possible to obtain a full log from
>> either of you?
>>
>> William, is this something you have seen before as well?
>>
No, I haven't seen such an issue before. It is possible that I didn't have
these Macronix parts on my boards. If I can find a board with a Macronix part,
I will try it. But we don't use this feature and don't connect the PT
pin on our reference board, which means the PT feature is disabled in the
NAND part.

Álvaro, do you know whether your 63268 board has the PT pin connected? Can
you check whether the Macronix lock and unlock functions are being called
before the hang? Or is it just the get/set feature functions being called
to determine whether PT is supported? The get/set feature functions should
work, as they are used by other paths.
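
As a reading aid for the "ll_op cmd" words in the debug trace quoted in
this thread (0xa00ee, 0x600a0, 0x10000, 0x80010000), here is a small
decoder sketch. The bit positions are an assumption based on the LLOP_*
definitions in the mainline brcmnand driver and should be checked against
the controller documentation; llop_decode(), struct llop_decoded and the
enum names are hypothetical helpers for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of a low-level op word (see the note above). */
#define LLOP_RE          (UINT32_C(1) << 16)  /* read enable strobe */
#define LLOP_WE          (UINT32_C(1) << 17)  /* write enable strobe */
#define LLOP_ALE         (UINT32_C(1) << 18)  /* address latch enable */
#define LLOP_CLE         (UINT32_C(1) << 19)  /* command latch enable */
#define LLOP_RETURN_IDLE (UINT32_C(1) << 31)  /* last cycle of the sequence */
#define LLOP_DATA_MASK   UINT32_C(0xffff)

enum llop_kind { LLOP_CMD_CYCLE, LLOP_ADDR_CYCLE, LLOP_READ_CYCLE, LLOP_OTHER };

struct llop_decoded {
	enum llop_kind kind;
	uint16_t data;   /* command or address byte for write cycles */
	int return_idle; /* non-zero on the final cycle */
};

/* Classify one low-level op word as a command, address or read cycle. */
static struct llop_decoded llop_decode(uint32_t word)
{
	struct llop_decoded d;

	d.data = (uint16_t)(word & LLOP_DATA_MASK);
	d.return_idle = (word & LLOP_RETURN_IDLE) != 0;

	if ((word & LLOP_CLE) && (word & LLOP_WE))
		d.kind = LLOP_CMD_CYCLE;
	else if ((word & LLOP_ALE) && (word & LLOP_WE))
		d.kind = LLOP_ADDR_CYCLE;
	else if (word & LLOP_RE)
		d.kind = LLOP_READ_CYCLE;
	else
		d.kind = LLOP_OTHER;
	return d;
}

/* The first traced word is a command cycle carrying 0xee (GET FEATURES). */
static_assert((LLOP_CLE | LLOP_WE | UINT32_C(0xee)) == UINT32_C(0xa00ee),
	      "command cycle word for GET FEATURES");
```

Under that assumed layout, the trace reads as one command cycle (0xee),
one address cycle (0xa0) and four read cycles, the last one flagged as
return-to-idle. That matches a single GET_FEATURES on address 0xa0 and
nothing else, so no lock or unlock sequence shows up in it.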


>>>
>>> Best regards,
>>> Álvaro.
>>>
>>> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
>>>>
>>>> Hi Álvaro
>>>>
>>>> In nand_scan_tail(), each manufacturer's init function is executed.
>>>> In macronix_nand_init(), block protection runs after flash detection.
>>>> I have validated MX30LF1G18AC with Linux kernel v5.15.
>>>> I didn't get the "device hangs" situation on my side.
>>>> BP is there to prevent incorrect operations.
>>>> Please check the controller settings to trace this issue.
>>>>
>>>> Thanks
>>>> Jaime
>>>>
>>>>>
>>>>> Hello YouChing and Jaime,
>>>>>
>>>>> I still didn't get any feedback from you (or Macronix) on this issue.
>>>>> Did you have time to look into it?
>>>>>
>>>>> Thanks,
>>>>> Álvaro.
>>>>>
>>>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
>>>>> (<noltari@gmail.com>) wrote:
>>>>>>
>>>>>> Hi Miquèl,
>>>>>>
>>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>> Hi Álvaro,
>>>>>>>
>>>>>>> + YouChing and Jaime from Macronix
>>>>>>> TLDR for them: there is a misbehavior since Mason added block
>>>>>>> protection support. Just checking if the blocks are protected seems to
>>>>>>> misconfigure the chip entirely, see below. Any hints?
>>>>>>
>>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
>>>>>> isn’t sent after getting the features?
>>>>>>
>>>>>>>
>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
>>>>>>>
>>>>>>>> Hi Miquèl,
>>>>>>>>
>>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>>>> Hi Álvaro,
>>>>>>>>>
>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
>>>>>>>>>
>>>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>
>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Miquèl,
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>>>
>>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
>>>>>>>>>>>>>> This binding allows disabling block protection support for
>>>>>>>>>>>>>> those
>>>>>>>>>>>>>> devices not
>>>>>>>>>>>>>> supporting it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>    Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
>>>>>>>>>>>>>> +++
>>>>>>>>>>>>>>    1 file changed, 3 insertions(+)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git
>>>>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
>>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
>>>>>>>>>>>>>>    Required NAND chip properties in children mode:
>>>>>>>>>>>>>>    - randomizer enable: should be "mxic,enable-randomizer-otp"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
>>>>>>>>>>>>>> +- block protection disable: should be
>>>>>>>>>>>>>> "mxic,disable-block-protection"
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding conversions
>>>>>>>>>>>>> to yaml before adding anything, I don't think this will fly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
>>>>>>>>>>>>> already have similar properties like "lock" and "secure-regions",
>>>>>>>>>>>>> not sure they will fit but I think it's worth checking.
>>>>>>>>>>>>
>>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression on
>>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
>>>>>>>>>>>> which hangs the device.
>>>>>>>>>>>>
>>>>>>>>>>>> This is the log with block protection disabled:
>>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>> for
>>>>>>>>>>>> state default
>>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>> 0xf1
>>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
>>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
>>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
>>>>>>>>>>>> brcmnand.0
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> This is the log with block protection enabled:
>>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>> for
>>>>>>>>>>>> state default
>>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>> 0xf1
>>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
>>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
>>>>>>>>>>>> virtual
>>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
>>>>>>>>>>>> *** Device hangs ***
>>>>>>>>>>>>
>>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
>>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
>>>>>>>>>>>> scan for bad blocks.
>>>>>>>>>>>
>>>>>>>>>>> Please trace nand_macronix.c and look:
>>>>>>>>>>> - are the get_features and set_features really supported by the
>>>>>>>>>>>     controller driver?
>>>>>>>>>>
>>>>>>>>>> This is what I could find by debugging:
>>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
>>>>>>>>>> state default
>>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>> 0xf1
>>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
>>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
>>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
>>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
>>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
>>>>>>>>>> 00 00 00] -> 0
>>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
>>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
>>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
>>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
>>>>>>>>>> [    0.624760] Bad block table not found for chip 0
>>>>>>>>>> [    0.635542] Bad block table not found for chip 0
>>>>>>>>>> [    0.640270] Scanning device for bad blocks
>>>>>>>>>>
>>>>>>>>>> I don't know how to tell if get_features / set_features is really
>>>>>>>>>> supported...
>>>>>>>>>
>>>>>>>>> Looks like your driver does not support exec_op but the core provides a
>>>>>>>>> get/set_feature implementation.
>>>>>>>>
>>>>>>>> According to Florian, low level should be supported on brcmnand
>>>>>>>> controllers >= 4.0
>>>>>>>> Also:
>>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
>>>>>>>
>>>>>>> Just to be sure, you're using a mainline controller driver, not this
>>>>>>> one?
>>>>>>
>>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
>>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - what is the state of the locking configuration in the chip when
>>>>>>>>>>> you
>>>>>>>>>>>     boot?
>>>>>>>>>>
>>>>>>>>>> Unlocked, I guess...
>>>>>>>>>> How can I check that?
>>>>>>>>>
>>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
>>>>>>>>> apparently.
>>>>>>>>
>>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
>>>>>>>> so I guess we can confirm it’s unlocked…
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
>>>>>>>>>>> ?
>>>>>>>>>
>>>>>>>>> So nobody locks the device I guess? Did you add traces there?
>>>>>>>>
>>>>>>>> It doesn’t get to the point where it enables the lock/unlock functions
>>>>>>>> since it fails when checking if feature is 0x38, so there’s no point
>>>>>>>> in adding those traces…
>>>>>>>
>>>>>>> Right, it returns before setting these I guess.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
>>>>>>>>>>> hanging
>>>>>>>>>>>     exactly? (offset in nand/ and line in the code)
>>>>>>>>>>
>>>>>>>>>> I've got no idea...
>>>>>>>>>
>>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
>>>>>>>>> closer and closer.
>>>>>>>>
>>>>>>>> I think that after trying to get the feature it just starts reading
>>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
>>>>>>>
>>>>>>> It should refuse to mount the device somehow, but in no case the kernel
>>>>>>> should hang.
>>>>>>
>>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
>>>>>>
>>>>>>>
>>>>>>>> Is it possible that the NAND starts behaving like this after getting
>>>>>>>> the feature due to some specific config of my device?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
>>>>>>>>> close enough datasheet and I don't see what could confuse the device.
>>>>>>>>>
>>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
>>>>>>>>> your design?
>>>>>>>>
>>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
>>>>>>>
>>>>>>> What about the chip?
>>>>>>
>>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Miquèl
>>>>>>>
>>
>> --
>> Florian
>>


* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
@ 2023-05-17  5:30           ` William Zhang
  0 siblings, 0 replies; 50+ messages in thread
From: William Zhang @ 2023-05-17  5:30 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, Florian Fainelli
  Cc: liao jaime, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, robh+dt, krzysztof.kozlowski+dt, linux-mtd,
	devicetree, linux-kernel



On 05/16/2023 12:02 PM, Álvaro Fernández Rojas wrote:
> Sure,
> 
> Here you go:
> [    0.000000] Linux version 5.15.111 (noltari@atlantis)
> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
> 12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
> [    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
> [    0.000000] MIPS: machine is Sercomm H500-s vfes
> [    0.000000] 128MB of RAM installed
> [    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
> [    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
> [    0.000000] Initrd not found or empty - disabling initrd
> [    0.000000] Reserving 0KB of memory at 4194303KB for kdump
> [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> [    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> linesize 16 bytes
> [    0.000000] Zone ranges:
> [    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> [    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
> [    0.000000] Kernel command line: earlycon
> [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
> bytes, linear)
> [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
> bytes, linear)
> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.000000] Memory: 108656K/131072K available (6902K kernel code,
> 613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
> cma-reserved)
> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> [    0.000000] rcu: Hierarchical RCU implementation.
> [    0.000000]  Tracing variant of Tasks RCU enabled.
> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
> is 10 jiffies.
> [    0.000000] NR_IRQS: 256
> [    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
> [    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
> [    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
> [    0.000000] brcm,bcm63268 detected @ 400 MHz
> [    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 9556302233 ns
> [    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
> every 10737418237ns
> [    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
> [    0.074683] pid_max: default: 32768 minimum: 301
> [    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
> bytes, linear)
> [    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
> 4096 bytes, linear)
> [    0.106094] rcu: Hierarchical SRCU implementation.
> [    0.112665] smp: Bringing up secondary CPUs ...
> [    0.119348] SMP: Booting CPU1...
> [    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
> [    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
> linesize 16 bytes
> [    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
> [    0.182819] Synchronize counters for CPU 1:
> [    0.203500] SMP: CPU1 is running
> [    0.203512] done.
> [    0.213401] smp: Brought up 1 node, 2 CPUs
> [    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 19112604462750000 ns
> [    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
> [    0.246439] pinctrl core: initialized pinctrl subsystem
> [    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> [    0.312700] clocksource: Switched to clocksource MIPS
> [    0.321061] NET: Registered PF_INET protocol family
> [    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
> bytes, linear)
> [    0.335972] tcp_listen_portaddr_hash hash table entries: 512
> (order: 0, 6144 bytes, linear)
> [    0.344721] Table-perturb hash table entries: 65536 (order: 6,
> 262144 bytes, linear)
> [    0.352721] TCP established hash table entries: 1024 (order: 0,
> 4096 bytes, linear)
> [    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
> [    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
> [    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
> [    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
> [    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
> [    0.395748] PCI: CLS 0 bytes, default 16
> [    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> [    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> [    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
> (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
> [    0.459472] bcm63xx-power-controller 1000184c.power-controller:
> registered 14 power domains
> [    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
> base_baud = 1562500) is a bcm63xx_uart
> [    0.479996] printk: console [ttyS0] enabled
> [    0.479996] printk: console [ttyS0] enabled
> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> [    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
> [    0.533435] bcm2835-rng 10002880.rng: hwrng registered
> [    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
> state default
> [    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> [    0.640506] nand: Macronix MX30LF1G18AC
> [    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> 2048, OOB size: 64
> [    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> [    0.703373] Bad block table not found for chip 0
> [    0.732040] Bad block table not found for chip 0
> [    0.736842] Scanning device for bad blocks
> [    0.832678] CPU 0 Unable to handle kernel paging request at virtual
> address 00000014, epc == 8009b300, ra == 806cc650
> [    0.843628] Oops[#1]:
> [    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
> [    0.851959] $ 0   : 00000000 00000001 00000008 00000000
> [    0.857358] $ 4   : 81808464 00000064 00000000 00000001
> [    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
> [    0.868146] $12   : 0000b79d 00000000 00000000 00009bb
> 
> Please, tell me if you want me to add any debugging to the log.
> 
> Best regards,
> Álvaro.
> 
> On Tue, 16 May 2023 at 20:58, Florian Fainelli
> (<f.fainelli@gmail.com>) wrote:
>>
>> +William,
>>
>> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
>>> Hi Jaime,
>>>
>>> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
>>> forcing it to check block protection (it's not supported on that
>>> device), the NAND controller stops reading/writing anything.
>>>
>>> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
>>> aren't supported on BCM63268 NAND controllers and this is causing the
>>> issue?
>>
>> Yes, this looks like what we have seen as well even with newer NAND
>> controllers actually. Would it be possible to obtain a full log from
>> either of you?
>>
>> William, is this something you have seen before as well?
>>
No, I haven't seen such an issue before. It is possible I didn't have these
Macronix parts on my boards. If I can find a board with a Macronix part,
I will try it. But we don't use this feature and don't connect the PT
pin on our reference board, which means the PT feature is disabled in the
NAND part.

Álvaro, do you know whether your 63268 board has the PT pin connected? Can
you check whether the Macronix lock and unlock functions are being called
before the hang? Or is it just the get/set feature function being called
to determine whether PT is supported? The get/set feature functions should
work, as they are used by other paths.
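
For reference, here is a minimal user-space sketch of what the debug
trace quoted below seems to show: the ll_op words form a GET FEATURES
(0xEE) sequence to feature address 0xA0, and the block protection check
then compares the first returned byte with 0x38. The LLOP_* bit
positions and all helper names are assumptions chosen to match the
quoted trace; this is not code taken from the controller driver or from
nand_macronix.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed low-level-op control bits, picked to fit the words in the trace. */
#define LLOP_RE          (1u << 16)  /* read-enable strobe (data-in cycle) */
#define LLOP_WE          (1u << 17)  /* write-enable strobe */
#define LLOP_ALE         (1u << 18)  /* address latch enable */
#define LLOP_CLE         (1u << 19)  /* command latch enable */
#define LLOP_RETURN_IDLE (1u << 31)  /* marks the last cycle of a sequence */

/* Values quoted in the thread. */
#define NAND_CMD_GET_FEATURES           0xee
#define FEATURE_ADDR_MXIC_PROTECTION    0xa0
#define MXIC_BLOCK_PROTECTION_ALL_LOCK  0x38

/* Hypothetical classification of one ll_op word. */
enum llop_kind { LLOP_CMD, LLOP_ADDR, LLOP_DATA_IN, LLOP_OTHER };

static inline enum llop_kind llop_classify(uint32_t word)
{
	uint32_t ctl = word & (LLOP_RE | LLOP_WE | LLOP_ALE | LLOP_CLE);

	if (ctl == (LLOP_CLE | LLOP_WE))
		return LLOP_CMD;      /* command byte written to the chip */
	if (ctl == (LLOP_ALE | LLOP_WE))
		return LLOP_ADDR;     /* address byte written to the chip */
	if (ctl == LLOP_RE)
		return LLOP_DATA_IN;  /* one byte read back from the chip */
	return LLOP_OTHER;
}

/* True when the word carries the "return to idle" marker. */
static inline bool llop_is_last(uint32_t word)
{
	return (word & LLOP_RETURN_IDLE) != 0;
}

/* Low byte of the word: the command or address byte for write cycles. */
static inline uint8_t llop_data(uint32_t word)
{
	return (uint8_t)(word & 0xff);
}

/*
 * The comparison described in the thread: protection support is only
 * assumed when the first subfeature byte reads back as "all locked".
 */
static inline bool mxic_all_locked(const uint8_t subfeature[4])
{
	return subfeature[0] == MXIC_BLOCK_PROTECTION_ALL_LOCK;
}
```

With the values from the log, 0xa00ee classifies as a command cycle
carrying 0xee, 0x600a0 as an address cycle carrying 0xa0, and the three
0x10000 words plus the final 0x80010000 as four data-in cycles. A
returned feature of [00 00 00 00] is not the all-locked value, which
would be consistent with the lock/unlock hooks never being installed.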


>>>
>>> Best regards,
>>> Álvaro.
>>>
>>> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
>>>>
>>>> Hi Álvaro
>>>>
>>>> In nand_scan_tail(), each manufacturer init function is executed.
>>>> In macronix_nand_init(), block protection is set up after flash detection.
>>>> I have validated the MX30LF1G18AC on Linux kernel v5.15.
>>>> I didn't get the "device hangs" situation on my side.
>>>> BP is there to prevent incorrect operations.
>>>> Please check the controller settings to trace this issue.
>>>>
>>>> Thanks
>>>> Jaime
>>>>
>>>>>
>>>>> Hello YouChing and Jaime,
>>>>>
>>>>> I still didn't get any feedback from you (or Macronix) on this issue.
>>>>> Did you have time to look into it?
>>>>>
>>>>> Thanks,
>>>>> Álvaro.
>>>>>
>>>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
>>>>> (<noltari@gmail.com>) wrote:
>>>>>>
>>>>>> Hi Miquèl,
>>>>>>
>>>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>> Hi Álvaro,
>>>>>>>
>>>>>>> + YouChing and Jaime from Macronix
>>>>>>> TLDR for them: there is a misbehavior since Mason added block
>>>>>>> protection support. Just checking if the blocks are protected seems to
>>>>>>> misconfigure the chip entirely, see below. Any hints?
>>>>>>
>>>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
>>>>>> isn’t sent after getting the features?
>>>>>>
>>>>>>>
>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
>>>>>>>
>>>>>>>> Hi Miquèl,
>>>>>>>>
>>>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>>>> Hi Álvaro,
>>>>>>>>>
>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
>>>>>>>>>
>>>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>
>>>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Miquèl,
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
>>>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>>>
>>>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
>>>>>>>>>>>>>> This binding allows disabling block protection support for those
>>>>>>>>>>>>>> devices not supporting it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>    Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
>>>>>>>>>>>>>> +++
>>>>>>>>>>>>>>    1 file changed, 3 insertions(+)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git
>>>>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
>>>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
>>>>>>>>>>>>>>    Required NAND chip properties in children mode:
>>>>>>>>>>>>>>    - randomizer enable: should be "mxic,enable-randomizer-otp"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +Optional NAND chip properties in children mode:
>>>>>>>>>>>>>> +- block protection disable: should be
>>>>>>>>>>>>>> "mxic,disable-block-protection"
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding conversions
>>>>>>>>>>>>> to yaml before adding anything, I don't think this will fly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
>>>>>>>>>>>>> already have similar properties like "lock" and "secure-regions",
>>>>>>>>>>>>> not sure they will fit but I think it's worth checking.
>>>>>>>>>>>>
>>>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression on
>>>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
>>>>>>>>>>>> which hangs the device.
>>>>>>>>>>>>
>>>>>>>>>>>> This is the log with block protection disabled:
>>>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>> for
>>>>>>>>>>>> state default
>>>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>> 0xf1
>>>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
>>>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
>>>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
>>>>>>>>>>>> brcmnand.0
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> This is the log with block protection enabled:
>>>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>>>> for
>>>>>>>>>>>> state default
>>>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>>>> 0xf1
>>>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
>>>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
>>>>>>>>>>>> [    0.555069] Scanning device for bad blocks
>>>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
>>>>>>>>>>>> virtual
>>>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
>>>>>>>>>>>> *** Device hangs ***
>>>>>>>>>>>>
>>>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
>>>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
>>>>>>>>>>>> scan for bad blocks.
>>>>>>>>>>>
>>>>>>>>>>> Please trace nand_macronix.c and look:
>>>>>>>>>>> - are the get_features and set_features really supported by the
>>>>>>>>>>>     controller driver?
>>>>>>>>>>
>>>>>>>>>> This is what I could find by debugging:
>>>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
>>>>>>>>>> state default
>>>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>> 0xf1
>>>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
>>>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
>>>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
>>>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
>>>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>>>> 0x00
>>>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
>>>>>>>>>> 00 00 00] -> 0
>>>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
>>>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
>>>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
>>>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
>>>>>>>>>> [    0.624760] Bad block table not found for chip 0
>>>>>>>>>> [    0.635542] Bad block table not found for chip 0
>>>>>>>>>> [    0.640270] Scanning device for bad blocks
>>>>>>>>>>
>>>>>>>>>> I don't know how to tell if get_features / set_features is really
>>>>>>>>>> supported...
>>>>>>>>>
>>>>>>>>> Looks like your driver does not support exec_op but the core provides a
>>>>>>>>> get/set_feature implementation.
>>>>>>>>
>>>>>>>> According to Florian, low level should be supported on brcmnand
>>>>>>>> controllers >= 4.0
>>>>>>>> Also:
>>>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
>>>>>>>
>>>>>>> Just to be sure, you're using a mainline controller driver, not this
>>>>>>> one?
>>>>>>
>>>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
>>>>>> I’m using OpenWrt, so it’s linux v5.15 driver.
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - what is the state of the locking configuration in the chip when
>>>>>>>>>>> you
>>>>>>>>>>>     boot?
>>>>>>>>>>
>>>>>>>>>> Unlocked, I guess...
>>>>>>>>>> How can I check that?
>>>>>>>>>
>>>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
>>>>>>>>> apparently.
>>>>>>>>
>>>>>>>> Well, I can read/write the device if block protection isn’t disabled,
>>>>>>>> so I guess we can confirm it’s unlocked…
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
>>>>>>>>>>> ?
>>>>>>>>>
>>>>>>>>> So nobody locks the device I guess? Did you add traces there?
>>>>>>>>
>>>>>>>> It doesn’t get to the point where it enables the lock/unlock functions
>>>>>>>> since it fails when checking if feature is 0x38, so there’s no point
>>>>>>>> in adding those traces…
>>>>>>>
>>>>>>> Right, it returns before setting these I guess.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
>>>>>>>>>>> hanging
>>>>>>>>>>>     exactly? (offset in nand/ and line in the code)
>>>>>>>>>>
>>>>>>>>>> I've got no idea...
>>>>>>>>>
>>>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
>>>>>>>>> closer and closer.
>>>>>>>>
>>>>>>>> I think that after trying to get the feature it just starts reading
>>>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
>>>>>>>
>>>>>>> It should refuse to mount the device somehow, but in no case the kernel
>>>>>>> should hang.
>>>>>>
>>>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
>>>>>>
>>>>>>>
>>>>>>>> Is it possible that the NAND starts behaving like this after getting
>>>>>>>> the feature due to some specific config of my device?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
>>>>>>>>> close enough datasheet and I don't see what could confuse the device.
>>>>>>>>>
>>>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
>>>>>>>>> your design?
>>>>>>>>
>>>>>>>> There’s no WP pin in brcmnand controllers < 7.0
>>>>>>>
>>>>>>> What about the chip?
>>>>>>
>>>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Miquèl
>>>>>>>
>>
>> --
>> Florian
>>


* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-16 18:58       ` Florian Fainelli
@ 2023-05-16 19:02         ` Álvaro Fernández Rojas
  -1 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-05-16 19:02 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: liao jaime, William (Zhenghao) Zhang, Miquel Raynal,
	Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel

Sure,

Here you go:
[    0.000000] Linux version 5.15.111 (noltari@atlantis)
(mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
[    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
[    0.000000] MIPS: machine is Sercomm H500-s vfes
[    0.000000] 128MB of RAM installed
[    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
[    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] Reserving 0KB of memory at 4194303KB for kdump
[    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
[    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
linesize 16 bytes
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
[    0.000000] Kernel command line: earlycon
[    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
bytes, linear)
[    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 108656K/131072K available (6902K kernel code,
613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
is 10 jiffies.
[    0.000000] NR_IRQS: 256
[    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
[    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
[    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
[    0.000000] brcm,bcm63268 detected @ 400 MHz
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 9556302233 ns
[    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
every 10737418237ns
[    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
[    0.074683] pid_max: default: 32768 minimum: 301
[    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
bytes, linear)
[    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
4096 bytes, linear)
[    0.106094] rcu: Hierarchical SRCU implementation.
[    0.112665] smp: Bringing up secondary CPUs ...
[    0.119348] SMP: Booting CPU1...
[    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
[    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
linesize 16 bytes
[    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
[    0.182819] Synchronize counters for CPU 1:
[    0.203500] SMP: CPU1 is running
[    0.203512] done.
[    0.213401] smp: Brought up 1 node, 2 CPUs
[    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
[    0.246439] pinctrl core: initialized pinctrl subsystem
[    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.312700] clocksource: Switched to clocksource MIPS
[    0.321061] NET: Registered PF_INET protocol family
[    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
bytes, linear)
[    0.335972] tcp_listen_portaddr_hash hash table entries: 512
(order: 0, 6144 bytes, linear)
[    0.344721] Table-perturb hash table entries: 65536 (order: 6,
262144 bytes, linear)
[    0.352721] TCP established hash table entries: 1024 (order: 0,
4096 bytes, linear)
[    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
[    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.395748] PCI: CLS 0 bytes, default 16
[    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
[    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
(CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    0.459472] bcm63xx-power-controller 1000184c.power-controller:
registered 14 power domains
[    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
base_baud = 1562500) is a bcm63xx_uart
[    0.479996] printk: console [ttyS0] enabled
[    0.479996] printk: console [ttyS0] enabled
[    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
[    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
[    0.533435] bcm2835-rng 10002880.rng: hwrng registered
[    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
state default
[    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[    0.640506] nand: Macronix MX30LF1G18AC
[    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
2048, OOB size: 64
[    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
[    0.703373] Bad block table not found for chip 0
[    0.732040] Bad block table not found for chip 0
[    0.736842] Scanning device for bad blocks
[    0.832678] CPU 0 Unable to handle kernel paging request at virtual
address 00000014, epc == 8009b300, ra == 806cc650
[    0.843628] Oops[#1]:
[    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
[    0.851959] $ 0   : 00000000 00000001 00000008 00000000
[    0.857358] $ 4   : 81808464 00000064 00000000 00000001
[    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
[    0.868146] $12   : 0000b79d 00000000 00000000 00009bb

Please, tell me if you want me to add any debugging to the log.

Best regards,
Álvaro.

On Tue, 16 May 2023 at 20:58, Florian Fainelli
(<f.fainelli@gmail.com>) wrote:
>
> +William,
>
> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> > Hi Jaime,
> >
> > I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> > forcing it to check block protection (it's not supported on that
> > device), the NAND controller stops reading/writing anything.
> >
> > @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> > aren't supported on BCM63268 NAND controllers and this is causing the
> > issue?
>
> Yes, this looks like what we have seen as well even with newer NAND
> controllers actually. Would it be possible to obtain a full log from
> either of you?
>
> William, is this something you have seen before as well?
>
> >
> > Best regards,
> > Álvaro.
> >
> > On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
> >>
> >> Hi Álvaro
> >>
> >> In nand_scan_tail(), each manufacturer's init function is executed.
> >> In macronix_nand_init(), the block protection check runs after flash detection.
> >> I have validated the MX30LF1G18AC on Linux kernel v5.15.
> >> I did not see the "device hangs" situation on my side.
> >> BP is meant to prevent incorrect operations.
> >> Please check the controller settings to trace this issue.
> >>
> >> Thanks
> >> Jaime
> >>
> >>>
> >>> Hello YouChing and Jaime,
> >>>
> >>> I still didn't get any feedback from you (or Macronix) on this issue.
> >>> Did you have time to look into it?
> >>>
> >>> Thanks,
> >>> Álvaro.
> >>>
> >>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
> >>> (<noltari@gmail.com>) wrote:
> >>>>
> >>>> Hi Miquèl,
> >>>>
> >>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>> Hi Álvaro,
> >>>>>
> >>>>> + YouChing and Jaime from Macronix
> >>>>> TLDR for them: there is a misbehavior since Mason added block
> >>>>> protection support. Just checking if the blocks are protected seems to
> >>>>> misconfigure the chip entirely, see below. Any hints?
> >>>>
> >>>> Could it be that the NAND is stuck expecting a read 0x00 command which
> >>>> isn’t sent after getting the features?
> >>>>
> >>>>>
> >>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >>>>>
> >>>>>> Hi Miquèl,
> >>>>>>
> >>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>> Hi Álvaro,
> >>>>>>>
> >>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >>>>>>>
> >>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Álvaro,
> >>>>>>>>>
> >>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >>>>>>>>>
> >>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>
> >>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>
> >>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >>>>>>>>>>>
> >>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
> >>>>>>>>>>>> This binding allows disabling block protection support for those
> >>>>>>>>>>>> devices not supporting it.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>   Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> >>>>>>>>>>>> +++
> >>>>>>>>>>>>   1 file changed, 3 insertions(+)
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git
> >>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
> >>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
> >>>>>>>>>>>>   Required NAND chip properties in children mode:
> >>>>>>>>>>>>   - randomizer enable: should be "mxic,enable-randomizer-otp"
> >>>>>>>>>>>>
> >>>>>>>>>>>> +Optional NAND chip properties in children mode:
> >>>>>>>>>>>> +- block protection disable: should be
> >>>>>>>>>>>> "mxic,disable-block-protection"
> >>>>>>>>>>>> +
> >>>>>>>>>>>
> >>>>>>>>>>> Besides the fact that nowadays we prefer to see binding conversions
> >>>>>>>>>>> to yaml before adding anything, I don't think this will fly.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
> >>>>>>>>>>> already have similar properties like "lock" and "secure-regions",
> >>>>>>>>>>> not sure they will fit but I think it's worth checking.
> >>>>>>>>>>
> >>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression on
> >>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> >>>>>>>>>> which hangs the device.
> >>>>>>>>>>
> >>>>>>>>>> This is the log with block protection disabled:
> >>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>> for
> >>>>>>>>>> state default
> >>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>> 0xf1
> >>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
> >>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
> >>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
> >>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
> >>>>>>>>>> brcmnand.0
> >>>>>>>>>> ...
> >>>>>>>>>>
> >>>>>>>>>> This is the log with block protection enabled:
> >>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>> for
> >>>>>>>>>> state default
> >>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>> 0xf1
> >>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
> >>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>> [    0.539687] Bad block table not found for chip 0
> >>>>>>>>>> [    0.550153] Bad block table not found for chip 0
> >>>>>>>>>> [    0.555069] Scanning device for bad blocks
> >>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
> >>>>>>>>>> virtual
> >>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
> >>>>>>>>>> *** Device hangs ***
> >>>>>>>>>>
> >>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
> >>>>>>>>>> unable to detect the bad block table and hangs it when trying to scan
> >>>>>>>>>> for bad blocks.
> >>>>>>>>>
> >>>>>>>>> Please trace nand_macronix.c and look:
> >>>>>>>>> - are the get_features and set_features really supported by the
> >>>>>>>>>    controller driver?
> >>>>>>>>
> >>>>>>>> This is what I could find by debugging:
> >>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >>>>>>>> state default
> >>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>> 0xf1
> >>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
> >>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>> 2048, OOB size: 64
> >>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >>>>>>>> 00 00 00] -> 0
> >>>>>>>> [    0.602341] macronix_nand_block_protection_support:
> >>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
> >>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >>>>>>>> [    0.624760] Bad block table not found for chip 0
> >>>>>>>> [    0.635542] Bad block table not found for chip 0
> >>>>>>>> [    0.640270] Scanning device for bad blocks
> >>>>>>>>
> >>>>>>>> I don't know how to tell if get_features / set_features is really
> >>>>>>>> supported...
> >>>>>>>
> >>>>>>> Looks like your driver does not support exec_op but the core provides a
> >>>>>>> get/set_feature implementation.
> >>>>>>
> >>>>>> According to Florian, low level should be supported on brcmnand
> >>>>>> controllers >= 4.0
> >>>>>> Also:
> >>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >>>>>
> >>>>> Just to be sure, you're using a mainline controller driver, not this
> >>>>> one?
> >>>>
> >>>> Yes, this was just to prove that the HW I’m using has get/set features support.
> >>>> I’m using OpenWrt, so it’s the Linux v5.15 driver.
> >>>>
> >>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>> - what is the state of the locking configuration in the chip when you
> >>>>>>>>>    boot?
> >>>>>>>>
> >>>>>>>> Unlocked, I guess...
> >>>>>>>> How can I check that?
> >>>>>>>
> >>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
> >>>>>>> apparently.
> >>>>>>
> >>>>>> Well, I can read/write the device if block protection is disabled,
> >>>>>> so I guess we can confirm it’s unlocked…
> >>>>>>
> >>>>>>>
> >>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()?
> >>>>>>>
> >>>>>>> So nobody locks the device I guess? Did you add traces there?
> >>>>>>
> >>>>>> It doesn’t get to the point where it enables the lock/unlock functions,
> >>>>>> since it fails the check for the feature being 0x38, so there’s no point
> >>>>>> in adding those traces…
> >>>>>
> >>>>> Right, it returns before setting these I guess.
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>> - finding no bbt is one thing, hanging is another, where is it hanging
> >>>>>>>>>    exactly? (offset in nand/ and line in the code)
> >>>>>>>>
> >>>>>>>> I've got no idea...
> >>>>>>>
> >>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
> >>>>>>> closer and closer.
> >>>>>>
> >>>>>> I think that after trying to get the feature it just starts reading
> >>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
> >>>>>
> >>>>> It should refuse to mount the device somehow, but in no case should the
> >>>>> kernel hang.
> >>>>
> >>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >>>>
> >>>>>
> >>>>>> Is it possible that the NAND starts behaving like this after getting
> >>>>>> the feature due to some specific config of my device?
> >>>>>>
> >>>>>>>
> >>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
> >>>>>>> close enough datasheet and I don't see what could confuse the device.
> >>>>>>>
> >>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
> >>>>>>> your design?
> >>>>>>
> >>>>>> There’s no WP pin in brcmnand controllers < 7.0
> >>>>>
> >>>>> What about the chip?
> >>>>
> >>>> Maybe it has a GPIO controlling that, but I don’t have that info…
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>> Miquèl
> >>>>>
>
> --
> Florian
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
@ 2023-05-16 19:02         ` Álvaro Fernández Rojas
  0 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-05-16 19:02 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: liao jaime, William (Zhenghao) Zhang, Miquel Raynal,
	Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel

Sure,

Here you go:
[    0.000000] Linux version 5.15.111 (noltari@atlantis)
(mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.3.0 r0+22899-466be0612a)
12.3.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Tue May 16 14:33:20 2023
[    0.000000] CPU0 revision is: 0002a080 (Broadcom BMIPS4350)
[    0.000000] MIPS: machine is Sercomm H500-s vfes
[    0.000000] 128MB of RAM installed
[    0.000000] earlycon: bcm63xx_uart0 at MMIO 0x10000180 (options '115200n8')
[    0.000000] printk: bootconsole [bcm63xx_uart0] enabled
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] Reserving 0KB of memory at 4194303KB for kdump
[    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
[    0.000000] Primary data cache 32kB, 2-way, VIPT, cache aliases,
linesize 16 bytes
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
[    0.000000] percpu: Embedded 11 pages/cpu s13328 r8192 d23536 u45056
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 32480
[    0.000000] Kernel command line: earlycon
[    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
bytes, linear)
[    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 108656K/131072K available (6902K kernel code,
613K rwdata, 1404K rodata, 11872K init, 215K bss, 22416K reserved, 0K
cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay
is 10 jiffies.
[    0.000000] NR_IRQS: 256
[    0.000000] irq_bcm6345_l1: registered BCM6345 L1 intc (IRQs: 128)
[    0.000000] irq_bcm6345_l1:   CPU0 (irq = 2)
[    0.000000] irq_bcm6345_l1:   CPU1 (irq = 3)
[    0.000000] brcm,bcm63268 detected @ 400 MHz
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 9556302233 ns
[    0.000002] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps
every 10737418237ns
[    0.008292] Calibrating delay loop... 398.13 BogoMIPS (lpj=1990656)
[    0.074683] pid_max: default: 32768 minimum: 301
[    0.081788] Mount-cache hash table entries: 1024 (order: 0, 4096
bytes, linear)
[    0.089319] Mountpoint-cache hash table entries: 1024 (order: 0,
4096 bytes, linear)
[    0.106094] rcu: Hierarchical SRCU implementation.
[    0.112665] smp: Bringing up secondary CPUs ...
[    0.119348] SMP: Booting CPU1...
[    8.330979] Primary instruction cache 64kB, VIPT, 4-way, linesize 16 bytes.
[    8.331017] Primary data cache 32kB, 2-way, VIPT, cache aliases,
linesize 16 bytes
[    8.331294] CPU1 revision is: 0002a080 (Broadcom BMIPS4350)
[    0.182819] Synchronize counters for CPU 1:
[    0.203500] SMP: CPU1 is running
[    0.203512] done.
[    0.213401] smp: Brought up 1 node, 2 CPUs
[    0.228870] clocksource: jiffies: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.239058] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
[    0.246439] pinctrl core: initialized pinctrl subsystem
[    0.254917] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.312700] clocksource: Switched to clocksource MIPS
[    0.321061] NET: Registered PF_INET protocol family
[    0.326879] IP idents hash table entries: 2048 (order: 2, 16384
bytes, linear)
[    0.335972] tcp_listen_portaddr_hash hash table entries: 512
(order: 0, 6144 bytes, linear)
[    0.344721] Table-perturb hash table entries: 65536 (order: 6,
262144 bytes, linear)
[    0.352721] TCP established hash table entries: 1024 (order: 0,
4096 bytes, linear)
[    0.360622] TCP bind hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    0.368005] TCP: Hash tables configured (established 1024 bind 1024)
[    0.375074] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.381862] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.389762] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.395748] PCI: CLS 0 bytes, default 16
[    0.403410] workingset: timestamp_bits=14 max_order=15 bucket_order=1
[    0.426490] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.432492] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME)
(CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    0.459472] bcm63xx-power-controller 1000184c.power-controller:
registered 14 power domains
[    0.470267] 10000180.serial: ttyS0 at MMIO 0x10000180 (irq = 8,
base_baud = 1562500) is a bcm63xx_uart
[    0.479996] printk: console [ttyS0] enabled
[    0.479996] printk: console [ttyS0] enabled
[    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
[    0.488651] printk: bootconsole [bcm63xx_uart0] disabled
[    0.533435] bcm2835-rng 10002880.rng: hwrng registered
[    0.606025] bcm6368_nand 10000200.nand: there is not valid maps for
state default
[    0.633977] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
[    0.640506] nand: Macronix MX30LF1G18AC
[    0.644551] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
2048, OOB size: 64
[    0.652359] bcm6368_nand 10000200.nand: detected 128MiB total,
128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
[    0.703373] Bad block table not found for chip 0
[    0.732040] Bad block table not found for chip 0
[    0.736842] Scanning device for bad blocks
[    0.832678] CPU 0 Unable to handle kernel paging request at virtual
address 00000014, epc == 8009b300, ra == 806cc650
[    0.843628] Oops[#1]:
[    0.845958] CPU: 0 PID: 88 Comm: hwrng Not tainted 5.15.111 #0
[    0.851959] $ 0   : 00000000 00000001 00000008 00000000
[    0.857358] $ 4   : 81808464 00000064 00000000 00000001
[    0.862753] $ 8   : 81810000 00001ff0 00001c00 815b8880
[    0.868146] $12   : 0000b79d 00000000 00000000 00009bb

Please, tell me if you want me to add any debugging to the log.

Best regards,
Álvaro.

On Tue, 16 May 2023 at 20:58, Florian Fainelli
(<f.fainelli@gmail.com>) wrote:
>
> +William,
>
> On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> > Hi Jaime,
> >
> > I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> > forcing it to check block protection (it's not supported on that
> > device), the NAND controller stops reading/writing anything.
> >
> > @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> > aren't supported on BCM63268 NAND controllers and this is causing the
> > issue?
>
> Yes, this looks like what we have seen as well even with newer NAND
> controllers actually. Would it be possible to obtain a full log from
> either of you?
>
> William, is this something you have seen before as well?
>
> >
> > Best regards,
> > Álvaro.
> >
> > On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
> >>
> >> Hi Álvaro
> >>
> >> In nand_scan_tail(), each manufacturer's init function is executed.
> >> In macronix_nand_init(), the block protection check runs after flash detection.
> >> I have validated the MX30LF1G18AC on Linux kernel v5.15.
> >> I did not see the "device hangs" situation on my side.
> >> BP is meant to prevent incorrect operations.
> >> Please check the controller settings to trace this issue.
> >>
> >> Thanks
> >> Jaime
> >>
> >>>
> >>> Hello YouChing and Jaime,
> >>>
> >>> I still didn't get any feedback from you (or Macronix) on this issue.
> >>> Did you have time to look into it?
> >>>
> >>> Thanks,
> >>> Álvaro.
> >>>
> >>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
> >>> (<noltari@gmail.com>) wrote:
> >>>>
> >>>> Hi Miquèl,
> >>>>
> >>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>> Hi Álvaro,
> >>>>>
> >>>>> + YouChing and Jaime from Macronix
> >>>>> TLDR for them: there is a misbehavior since Mason added block
> >>>>> protection support. Just checking if the blocks are protected seems to
> >>>>> misconfigure the chip entirely, see below. Any hints?
> >>>>
> >>>> Could it be that the NAND is stuck expecting a read 0x00 command which
> >>>> isn’t sent after getting the features?
> >>>>
> >>>>>
> >>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> >>>>>
> >>>>>> Hi Miquèl,
> >>>>>>
> >>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> >>>>>>> Hi Álvaro,
> >>>>>>>
> >>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> >>>>>>>
> >>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> >>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Álvaro,
> >>>>>>>>>
> >>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> >>>>>>>>>
> >>>>>>>>>> Hi Miquèl,
> >>>>>>>>>>
> >>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> >>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Álvaro,
> >>>>>>>>>>>
> >>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> >>>>>>>>>>>
> >>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
> >>>>>>>>>>>> This binding allows disabling block protection support for those
> >>>>>>>>>>>> devices not supporting it.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>   Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> >>>>>>>>>>>> +++
> >>>>>>>>>>>>   1 file changed, 3 insertions(+)
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git
> >>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
> >>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> >>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
> >>>>>>>>>>>>   Required NAND chip properties in children mode:
> >>>>>>>>>>>>   - randomizer enable: should be "mxic,enable-randomizer-otp"
> >>>>>>>>>>>>
> >>>>>>>>>>>> +Optional NAND chip properties in children mode:
> >>>>>>>>>>>> +- block protection disable: should be
> >>>>>>>>>>>> "mxic,disable-block-protection"
> >>>>>>>>>>>> +
> >>>>>>>>>>>
> >>>>>>>>>>> Besides the fact that nowadays we prefer to see binding conversions
> >>>>>>>>>>> to yaml before adding anything, I don't think this will fly.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
> >>>>>>>>>>> already have similar properties like "lock" and "secure-regions",
> >>>>>>>>>>> not sure they will fit but I think it's worth checking.
> >>>>>>>>>>
> >>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression on
> >>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix MX30LF1G18AC
> >>>>>>>>>> which hangs the device.
> >>>>>>>>>>
> >>>>>>>>>> This is the log with block protection disabled:
> >>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>> for
> >>>>>>>>>> state default
> >>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>> 0xf1
> >>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
> >>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
> >>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
> >>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
> >>>>>>>>>> brcmnand.0
> >>>>>>>>>> ...
> >>>>>>>>>>
> >>>>>>>>>> This is the log with block protection enabled:
> >>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> >>>>>>>>>> for
> >>>>>>>>>> state default
> >>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>>>> 0xf1
> >>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
> >>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>>>> 2048, OOB size: 64
> >>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>>>> [    0.539687] Bad block table not found for chip 0
> >>>>>>>>>> [    0.550153] Bad block table not found for chip 0
> >>>>>>>>>> [    0.555069] Scanning device for bad blocks
> >>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
> >>>>>>>>>> virtual
> >>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
> >>>>>>>>>> *** Device hangs ***
> >>>>>>>>>>
> >>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
> >>>>>>>>>> unable to detect the bad block table and hangs it when trying to scan
> >>>>>>>>>> for bad blocks.
> >>>>>>>>>
> >>>>>>>>> Please trace nand_macronix.c and look:
> >>>>>>>>> - are the get_features and set_features really supported by the
> >>>>>>>>>    controller driver?
> >>>>>>>>
> >>>>>>>> This is what I could find by debugging:
> >>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> >>>>>>>> state default
> >>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> >>>>>>>> 0xf1
> >>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
> >>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> >>>>>>>> 2048, OOB size: 64
> >>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> >>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> >>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> >>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> >>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> >>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> >>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> >>>>>>>> 0x00
> >>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> >>>>>>>> 00 00 00] -> 0
> >>>>>>>> [    0.602341] macronix_nand_block_protection_support:
> >>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> >>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
> >>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
> >>>>>>>> [    0.624760] Bad block table not found for chip 0
> >>>>>>>> [    0.635542] Bad block table not found for chip 0
> >>>>>>>> [    0.640270] Scanning device for bad blocks
> >>>>>>>>
> >>>>>>>> I don't know how to tell if get_features / set_features is really
> >>>>>>>> supported...
> >>>>>>>
> >>>>>>> Looks like your driver does not support exec_op but the core provides a
> >>>>>>> get/set_feature implementation.
> >>>>>>
> >>>>>> According to Florian, low level should be supported on brcmnand
> >>>>>> controllers >= 4.0
> >>>>>> Also:
> >>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> >>>>>
> >>>>> Just to be sure, you're using a mainline controller driver, not this
> >>>>> one?
> >>>>
> >>>> Yes, this was just to prove that the HW I’m using has get/set features support.
> >>>> I’m using OpenWrt, so it’s the Linux v5.15 driver.
> >>>>
> >>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>> - what is the state of the locking configuration in the chip when
> >>>>>>>>> you
> >>>>>>>>>    boot?
> >>>>>>>>
> >>>>>>>> Unlocked, I guess...
> >>>>>>>> How can I check that?
> >>>>>>>
> >>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
> >>>>>>> apparently.
> >>>>>>
> >>>>>> Well, I can read/write the device if block protection isn’t enabled,
> >>>>>> so I guess we can confirm it’s unlocked…
> >>>>>>
> >>>>>>>
> >>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
> >>>>>>>>> ?
> >>>>>>>
> >>>>>>> So nobody locks the device I guess? Did you add traces there?
> >>>>>>
> >>>>>> It doesn’t get to the point where it enables the lock/unlock functions
> >>>>>> since it fails when checking if the feature is 0x38, so there’s no point
> >>>>>> in adding those traces…
> >>>>>
> >>>>> Right, it returns before setting these I guess.
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
> >>>>>>>>> hanging
> >>>>>>>>>    exactly? (offset in nand/ and line in the code)
> >>>>>>>>
> >>>>>>>> I've got no idea...
> >>>>>>>
> >>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
> >>>>>>> closer and closer.
> >>>>>>
> >>>>>> I think that after trying to get the feature it just starts reading
> >>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
> >>>>>
> >>>>> It should refuse to mount the device somehow, but in no case the kernel
> >>>>> should hang.
> >>>>
> >>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >>>>
> >>>>>
> >>>>>> Is it possible that the NAND starts behaving like this after getting
> >>>>>> the feature due to some specific config of my device?
> >>>>>>
> >>>>>>>
> >>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
> >>>>>>> close enough datasheet and I don't see what could confuse the device.
> >>>>>>>
> >>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
> >>>>>>> your design?
> >>>>>>
> >>>>>> There’s no WP pin in brcmnand controllers < 7.0
> >>>>>
> >>>>> What about the chip?
> >>>>
> >>>> Maybe it has a GPIO controlling that, but I don’t have that info…
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>> Miquèl
> >>>>>
>
> --
> Florian
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-05-16 18:55     ` Álvaro Fernández Rojas
@ 2023-05-16 18:58       ` Florian Fainelli
  -1 siblings, 0 replies; 50+ messages in thread
From: Florian Fainelli @ 2023-05-16 18:58 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, liao jaime, William (Zhenghao) Zhang
  Cc: Miquel Raynal, Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel

+William,

On 5/16/23 11:55, Álvaro Fernández Rojas wrote:
> Hi Jaime,
> 
> I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
> forcing it to check block protection (it's not supported on that
> device), the NAND controller stops reading/writing anything.
> 
> @Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
> aren't supported on BCM63268 NAND controllers and this is causing the
> issue?

Yes, this looks like what we have seen as well even with newer NAND 
controllers actually. Would it be possible to obtain a full log from 
either of you?

William, is this something you have seen before as well?

> 
> Best regards,
> Álvaro.
> 
> On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
>>
>> Hi Álvaro
>>
>> In nand_scan_tail(), each manufacturer's init function is executed.
>> In macronix_nand_init(), the block protection check runs after flash detection.
>> I have validated the MX30LF1G18AC on Linux kernel v5.15.
>> I did not see the "device hangs" situation on my side.
>> BP (block protection) is there to prevent incorrect operations.
>> Please check the controller settings to trace this issue.
>>
>> Thanks
>> Jaime
>>
>>>
>>> Hello YouChing and Jaime,
>>>
>>> I still didn't get any feedback from you (or Macronix) on this issue.
>>> Did you have time to look into it?
>>>
>>> Thanks,
>>> Álvaro.
>>>
>>> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
>>> (<noltari@gmail.com>) wrote:
>>>>
>>>> Hi Miquèl,
>>>>
>>>> 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>> Hi Álvaro,
>>>>>
>>>>> + YouChing and Jaime from Macronix
>>>>> TLDR for them: there is a misbehavior since Mason added block
>>>>> protection support. Just checking if the blocks are protected seems to
>>>>> misconfigure the chip entirely, see below. Any hints?
>>>>
>>>> Could it be that the NAND is stuck expecting a read 0x00 command which
>>>> isn’t sent after getting the features?
>>>>
>>>>>
>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
>>>>>
>>>>>> Hi Miquèl,
>>>>>>
>>>>>> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
>>>>>>> Hi Álvaro,
>>>>>>>
>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
>>>>>>>
>>>>>>>> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>
>>>>>>>>> Hi Álvaro,
>>>>>>>>>
>>>>>>>>> noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
>>>>>>>>>
>>>>>>>>>> Hi Miquèl,
>>>>>>>>>>
>>>>>>>>>> On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
>>>>>>>>>> (<miquel.raynal@bootlin.com>) wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Álvaro,
>>>>>>>>>>>
>>>>>>>>>>> noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
>>>>>>>>>>>
>>>>>>>>>>>> Add new "mxic,disable-block-protection" binding documentation.
>>>>>>>>>>>> This binding allows disabling block protection support for
>>>>>>>>>>>> those
>>>>>>>>>>>> devices not
>>>>>>>>>>>> supporting it.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>   Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
>>>>>>>>>>>> +++
>>>>>>>>>>>>   1 file changed, 3 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git
>>>>>>>>>>>> a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>> b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>> index ffab28a2c4d1..03f65ca32cd3 100644
>>>>>>>>>>>> --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>> +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
>>>>>>>>>>>> @@ -16,6 +16,9 @@ in children nodes.
>>>>>>>>>>>>   Required NAND chip properties in children mode:
>>>>>>>>>>>>   - randomizer enable: should be "mxic,enable-randomizer-otp"
>>>>>>>>>>>>
>>>>>>>>>>>> +Optional NAND chip properties in children mode:
>>>>>>>>>>>> +- block protection disable: should be
>>>>>>>>>>>> "mxic,disable-block-protection"
>>>>>>>>>>>> +
>>>>>>>>>>>
>>>>>>>>>>> Besides the fact that nowadays we prefer to see binding
>>>>>>>>>>> conversions
>>>>>>>>>>> to
>>>>>>>>>>> yaml before adding anything, I don't think this will fly.
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure exactly what "disable block protection" means, we
>>>>>>>>>>> already have similar properties like "lock" and
>>>>>>>>>>> "secure-regions",
>>>>>>>>>>> not
>>>>>>>>>>> sure they will fit but I think it's worth checking.
>>>>>>>>>>
>>>>>>>>>> As explained in 2/2, commit 03a539c7a118 introduced a regression
>>>>>>>>>> on
>>>>>>>>>> Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
>>>>>>>>>> MX30LF1G18AC
>>>>>>>>>> which hangs the device.
>>>>>>>>>>
>>>>>>>>>> This is the log with block protection disabled:
>>>>>>>>>> [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>> for
>>>>>>>>>> state default
>>>>>>>>>> [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>> 0xf1
>>>>>>>>>> [    0.511526] nand: Macronix MX30LF1G18AC
>>>>>>>>>> [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>> [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>> [    0.535912] Bad block table found at page 65472, version 0x01
>>>>>>>>>> [    0.544268] Bad block table found at page 65408, version 0x01
>>>>>>>>>> [    0.954329] 9 fixed-partitions partitions found on MTD device
>>>>>>>>>> brcmnand.0
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> This is the log with block protection enabled:
>>>>>>>>>> [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
>>>>>>>>>> for
>>>>>>>>>> state default
>>>>>>>>>> [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>>>> 0xf1
>>>>>>>>>> [    0.510772] nand: Macronix MX30LF1G18AC
>>>>>>>>>> [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>>>> 2048, OOB size: 64
>>>>>>>>>> [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>>>> [    0.539687] Bad block table not found for chip 0
>>>>>>>>>> [    0.550153] Bad block table not found for chip 0
>>>>>>>>>> [    0.555069] Scanning device for bad blocks
>>>>>>>>>> [    0.601213] CPU 1 Unable to handle kernel paging request at
>>>>>>>>>> virtual
>>>>>>>>>> address 10277f00, epc == 8039ce70, ra == 8016ad50
>>>>>>>>>> *** Device hangs ***
>>>>>>>>>>
>>>>>>>>>> Enabling macronix_nand_block_protection_support() makes the device
>>>>>>>>>> unable to detect the bad block table and hangs it when trying to
>>>>>>>>>> scan
>>>>>>>>>> for bad blocks.
>>>>>>>>>
>>>>>>>>> Please trace nand_macronix.c and look:
>>>>>>>>> - are the get_features and set_features really supported by the
>>>>>>>>>    controller driver?
>>>>>>>>
>>>>>>>> This is what I could find by debugging:
>>>>>>>> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
>>>>>>>> state default
>>>>>>>> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
>>>>>>>> 0xf1
>>>>>>>> [    0.512077] nand: Macronix MX30LF1G18AC
>>>>>>>> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
>>>>>>>> 2048, OOB size: 64
>>>>>>>> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
>>>>>>>> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
>>>>>>>> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
>>>>>>>> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
>>>>>>>> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>> 0x00
>>>>>>>> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>> 0x00
>>>>>>>> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
>>>>>>>> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>> 0x00
>>>>>>>> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
>>>>>>>> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
>>>>>>>> 0x00
>>>>>>>> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
>>>>>>>> 00 00 00] -> 0
>>>>>>>> [    0.602341] macronix_nand_block_protection_support:
>>>>>>>> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
>>>>>>>> [    0.610548] macronix_nand_block_protection_support: !=
>>>>>>>> MXIC_BLOCK_PROTECTION_ALL_LOCK
>>>>>>>> [    0.624760] Bad block table not found for chip 0
>>>>>>>> [    0.635542] Bad block table not found for chip 0
>>>>>>>> [    0.640270] Scanning device for bad blocks
>>>>>>>>
>>>>>>>> I don't know how to tell if get_features / set_features is really
>>>>>>>> supported...
>>>>>>>
>>>>>>> Looks like your driver does not support exec_op but the core provides a
>>>>>>> get/set_feature implementation.
>>>>>>
>>>>>> According to Florian, low level should be supported on brcmnand
>>>>>> controllers >= 4.0
>>>>>> Also:
>>>>>> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
>>>>>
>>>>> Just to be sure, you're using a mainline controller driver, not this
>>>>> one?
>>>>
>>>> Yes, this was just to prove that the HW I’m using has get/set features support.
>>>> I’m using OpenWrt, so it’s the Linux v5.15 driver.
>>>>
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> - what is the state of the locking configuration in the chip when
>>>>>>>>> you
>>>>>>>>>    boot?
>>>>>>>>
>>>>>>>> Unlocked, I guess...
>>>>>>>> How can I check that?
>>>>>>>
>>>>>>> It's in your dump, the chip returns 0, meaning it's all unlocked,
>>>>>>> apparently.
>>>>>>
>>>>>> Well, I can read/write the device if block protection isn’t enabled,
>>>>>> so I guess we can confirm it’s unlocked…
>>>>>>
>>>>>>>
>>>>>>>>> - is there anything that locks the device by calling mxic_nand_lock()
>>>>>>>>> ?
>>>>>>>
>>>>>>> So nobody locks the device I guess? Did you add traces there?
>>>>>>
>>>>>> It doesn’t get to the point where it enables the lock/unlock functions
>>>>>> since it fails when checking if the feature is 0x38, so there’s no point
>>>>>> in adding those traces…
>>>>>
>>>>> Right, it returns before setting these I guess.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>> - finding no bbt is one thing, hanging is another, where is it
>>>>>>>>> hanging
>>>>>>>>>    exactly? (offset in nand/ and line in the code)
>>>>>>>>
>>>>>>>> I've got no idea...
>>>>>>>
>>>>>>> You can use ftrace or just add printks a bit everywhere and try to get
>>>>>>> closer and closer.
>>>>>>
>>>>>> I think that after trying to get the feature it just starts reading
>>>>>> nonsense from the NAND and at some point it hangs due to that garbage…
>>>>>
>>>>> It should refuse to mount the device somehow, but in no case the kernel
>>>>> should hang.
>>>>
>>>> Yes, I think that this is a side effect (maybe a different bug somewhere else).
>>>>
>>>>>
>>>>>> Is it possible that the NAND starts behaving like this after getting
>>>>>> the feature due to some specific config of my device?
>>>>>>
>>>>>>>
>>>>>>> I looked at the patch, I don't see anything strange. Besides, I have a
>>>>>>> close enough datasheet and I don't see what could confuse the device.
>>>>>>>
>>>>>>> Are you really sure this patch is the problem? Is the WP pin wired on
>>>>>>> your design?
>>>>>>
>>>>>> There’s no WP pin in brcmnand controllers < 7.0
>>>>>
>>>>> What about the chip?
>>>>
>>>> Maybe it has a GPIO controlling that, but I don’t have that info…
>>>>
>>>>>
>>>>> Thanks,
>>>>> Miquèl
>>>>>

-- 
Florian


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
  2023-04-26  7:24   ` liao jaime
@ 2023-05-16 18:55     ` Álvaro Fernández Rojas
  -1 siblings, 0 replies; 50+ messages in thread
From: Álvaro Fernández Rojas @ 2023-05-16 18:55 UTC (permalink / raw)
  To: liao jaime
  Cc: Miquel Raynal, Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel,
	Florian Fainelli

Hi Jaime,

I've reproduced the issue on a Comtrend VR-3032u (MX30LF1G08AA). After
forcing it to check block protection (it's not supported on that
device), the NAND controller stops reading/writing anything.

@Florian is it possible that low level ops (GET_FEATURES/SET_FEATURES)
aren't supported on BCM63268 NAND controllers and this is causing the
issue?

Best regards,
Álvaro.
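
For readers following along, the block protection check discussed in this thread can be sketched in isolation. This is a simplified illustration, not the kernel code: it only uses the two values that appear in the thread (feature address 0xa0 and the value 0x38 that the debug output compares against as MXIC_BLOCK_PROTECTION_ALL_LOCK). The get_features_fn callback type and the two fake chips are hypothetical stand-ins for a real controller path.

```c
#include <stdint.h>

/* Values quoted in the thread. */
#define FEATURE_ADDR_MXIC_PROTECTION    0xa0
#define MXIC_BLOCK_PROTECTION_ALL_LOCK  0x38
#define SUBFEATURE_PARAM_LEN            4

/* Hypothetical stand-in for a controller's GET_FEATURES path. */
typedef int (*get_features_fn)(uint8_t addr, uint8_t *params);

/*
 * Returns 1 if the chip reports the "all locked" value, 0 if it reports
 * anything else, and a negative value if the read itself failed.
 */
int block_protection_supported(get_features_fn get_features)
{
	uint8_t params[SUBFEATURE_PARAM_LEN] = { 0 };
	int ret;

	ret = get_features(FEATURE_ADDR_MXIC_PROTECTION, params);
	if (ret < 0)
		return ret;

	return params[0] == MXIC_BLOCK_PROTECTION_ALL_LOCK;
}

/* Fake chip answering all zeros, like the trace quoted in this thread. */
int fake_chip_all_zero(uint8_t addr, uint8_t *params)
{
	int i;

	(void)addr;
	for (i = 0; i < SUBFEATURE_PARAM_LEN; i++)
		params[i] = 0x00;
	return 0;
}

/* Fake chip answering 0x38 at the protection feature address. */
int fake_chip_all_locked(uint8_t addr, uint8_t *params)
{
	int i;

	for (i = 0; i < SUBFEATURE_PARAM_LEN; i++)
		params[i] = 0x00;
	if (addr == FEATURE_ADDR_MXIC_PROTECTION)
		params[0] = MXIC_BLOCK_PROTECTION_ALL_LOCK;
	return 0;
}
```

With fake_chip_all_zero the sketch reports the feature as unsupported, which matches the debug output quoted below (subfeature_param=[00 00 00 00]). The check is only a read, so it does not by itself explain why the controller stops working afterwards.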

On Wed, 26 Apr 2023 at 9:24, liao jaime (<jaimeliao.tw@gmail.com>) wrote:
>
> Hi Álvaro
>
> In nand_scan_tail(), each manufacturer's init function is executed.
> In macronix_nand_init(), the block protection check runs after flash detection.
> I have validated the MX30LF1G18AC on Linux kernel v5.15.
> I did not see the "device hangs" situation on my side.
> BP (block protection) is there to prevent incorrect operations.
> Please check the controller settings to trace this issue.
>
> Thanks
> Jaime
>
> >
> > Hello YouChing and Jaime,
> >
> > I still didn't get any feedback from you (or Macronix) on this issue.
> > Did you have time to look into it?
> >
> > Thanks,
> > Álvaro.
> >
> > On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
> > (<noltari@gmail.com>) wrote:
> > >
> > > Hi Miquèl,
> > >
> > > 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > > > Hi Álvaro,
> > > >
> > > > + YouChing and Jaime from Macronix
> > > > TLDR for them: there is a misbehavior since Mason added block
> > > > protection support. Just checking if the blocks are protected seems to
> > > > misconfigure the chip entirely, see below. Any hints?
> > >
> > > Could it be that the NAND is stuck expecting a read 0x00 command which
> > > isn’t sent after getting the features?
> > >
> > > >
> > > > noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> > > >
> > > >> Hi Miquèl,
> > > >>
> > > >> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > > >> > Hi Álvaro,
> > > >> >
> > > >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> > > >> >
> > > >> >> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> > > >> >> (<miquel.raynal@bootlin.com>) wrote:
> > > >> >> >
> > > >> >> > Hi Álvaro,
> > > >> >> >
> > > >> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> > > >> >> >
> > > >> >> > > Hi Miquèl,
> > > >> >> > >
> > > >> >> > > On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> > > >> >> > > (<miquel.raynal@bootlin.com>) wrote:
> > > >> >> > > >
> > > >> >> > > > Hi Álvaro,
> > > >> >> > > >
> > > >> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> > > >> >> > > >
> > > >> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> > > >> >> > > > > This binding allows disabling block protection support for
> > > >> >> > > > > those
> > > >> >> > > > > devices not
> > > >> >> > > > > supporting it.
> > > >> >> > > > >
> > > >> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> > > >> >> > > > > ---
> > > >> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> > > >> >> > > > > +++
> > > >> >> > > > >  1 file changed, 3 insertions(+)
> > > >> >> > > > >
> > > >> >> > > > > diff --git
> > > >> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > >> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > >> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> > > >> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > >> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > > >> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> > > >> >> > > > >  Required NAND chip properties in children mode:
> > > >> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> > > >> >> > > > >
> > > >> >> > > > > +Optional NAND chip properties in children mode:
> > > >> >> > > > > +- block protection disable: should be
> > > >> >> > > > > "mxic,disable-block-protection"
> > > >> >> > > > > +
> > > >> >> > > >
> > > >> >> > > > Besides the fact that nowadays we prefer to see binding conversions
> > > >> >> > > > to yaml before adding anything, I don't think this will fly.
> > > >> >> > > >
> > > >> >> > > > I'm not sure exactly what "disable block protection" means, we
> > > >> >> > > > already have similar properties like "lock" and "secure-regions",
> > > >> >> > > > not sure they will fit but I think it's worth checking.
> > > >> >> > >
> > > >> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression
> > > >> >> > > on Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> > > >> >> > > MX30LF1G18AC which hangs the device.
> > > >> >> > >
> > > >> >> > > This is the log with block protection disabled:
> > > >> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps for state default
> > > >> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > > >> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> > > >> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > > >> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > > >> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> > > >> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> > > >> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device brcmnand.0
> > > >> >> > > ...
> > > >> >> > >
> > > >> >> > > This is the log with block protection enabled:
> > > >> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps for state default
> > > >> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > > >> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> > > >> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > > >> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > > >> >> > > [    0.539687] Bad block table not found for chip 0
> > > >> >> > > [    0.550153] Bad block table not found for chip 0
> > > >> >> > > [    0.555069] Scanning device for bad blocks
> > > >> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at virtual address 10277f00, epc == 8039ce70, ra == 8016ad50
> > > >> >> > > *** Device hangs ***
> > > >> >> > >
> > > >> >> > > Enabling macronix_nand_block_protection_support() makes the device
> > > >> >> > > unable to detect the bad block table and hangs it when trying to
> > > >> >> > > scan for bad blocks.
> > > >> >> >
> > > >> >> > Please trace nand_macronix.c and look:
> > > >> >> > - are the get_features and set_features really supported by the
> > > >> >> >   controller driver?
> > > >> >>
> > > >> >> This is what I could find by debugging:
> > > >> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for state default
> > > >> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > > >> >> [    0.512077] nand: Macronix MX30LF1G18AC
> > > >> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > > >> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > > >> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> > > >> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> > > >> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > > >> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > > >> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > > >> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > > >> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > > >> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > > >> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> > > >> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > > >> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00 00 00 00] -> 0
> > > >> >> [    0.602341] macronix_nand_block_protection_support: ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> > > >> >> [    0.610548] macronix_nand_block_protection_support: != MXIC_BLOCK_PROTECTION_ALL_LOCK
> > > >> >> [    0.624760] Bad block table not found for chip 0
> > > >> >> [    0.635542] Bad block table not found for chip 0
> > > >> >> [    0.640270] Scanning device for bad blocks
> > > >> >>
> > > >> >> I don't know how to tell if get_features / set_features is really
> > > >> >> supported...
> > > >> >
> > > >> > Looks like your driver does not support exec_op but the core provides a
> > > >> > get/set_feature implementation.
> > > >>
> > > >> According to Florian, low level should be supported on brcmnand
> > > >> controllers >= 4.0
> > > >> Also:
> > > >> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> > > >
> > > > Just to be sure, you're using a mainline controller driver, not this
> > > > one?
> > >
> > > Yes, this was just to prove that the HW I’m using has get/set features support.
> > > I’m using OpenWrt, so it’s linux v5.15 driver.
> > >
> > > >
> > > >> >
> > > >> >>
> > > >> >> > - what is the state of the locking configuration in the chip when
> > > >> >> >   you boot?
> > > >> >>
> > > >> >> Unlocked, I guess...
> > > >> >> How can I check that?
> > > >> >
> > > >> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> > > >> > apparently.
> > > >>
> > > >> Well, I can read/write the device if block protection is disabled,
> > > >> so I guess we can confirm it’s unlocked…
> > > >>
> > > >> >
> > > >> >> > - is there anything that locks the device by calling mxic_nand_lock()?
> > > >> >
> > > >> > So nobody locks the device I guess? Did you add traces there?
> > > >>
> > > >> It doesn’t get to the point where it enables the lock/unlock functions
> > > >> since it fails when checking if the feature is 0x38, so there’s no point
> > > >> in adding those traces…
> > > >
> > > > Right, it returns before setting these I guess.
> > > >
> > > >>
> > > >> >
> > > >> >> > - finding no bbt is one thing, hanging is another, where is it
> > > >> >> >   hanging exactly? (offset in nand/ and line in the code)
> > > >> >>
> > > >> >> I've got no idea...
> > > >> >
> > > >> > You can use ftrace or just add printks a bit everywhere and try to get
> > > >> > closer and closer.
> > > >>
> > > >> I think that after trying to get the feature it just starts reading
> > > >> nonsense from the NAND and at some point it hangs due to that garbage…
> > > >
> > > > It should refuse to mount the device somehow, but in no case the kernel
> > > > should hang.
> > >
> > > Yes, I think that this is a side effect (maybe a different bug somewhere else).
> > >
> > > >
> > > >> Is it possible that the NAND starts behaving like this after getting
> > > >> the feature due to some specific config of my device?
> > > >>
> > > >> >
> > > >> > I looked at the patch, I don't see anything strange. Besides, I have a
> > > >> > close enough datasheet and I don't see what could confuse the device.
> > > >> >
> > > >> > Are you really sure this patch is the problem? Is the WP pin wired on
> > > >> > your design?
> > > >>
> > > >> There’s no WP pin in brcmnand controllers < 7.0
> > > >
> > > > What about the chip?
> > >
> > > Maybe it has a GPIO controlling that, but I don’t have that info…
> > >
> > > >
> > > > Thanks,
> > > > Miquèl
> > > >

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
       [not found] <CAAQoYRm3766SG7+VuwVzu_xH8aWihoKWMEp8xQGNgJ6oOtC9+g@mail.gmail.com>
@ 2023-04-26  7:24   ` liao jaime
  0 siblings, 0 replies; 50+ messages in thread
From: liao jaime @ 2023-04-26  7:24 UTC (permalink / raw)
  To: Miquel Raynal, noltari
  Cc: Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel

Hi Álvaro

In nand_scan_tail(), each manufacturer's init function is executed.
In macronix_nand_init(), the block protection check is executed after flash detection.
I have validated the MX30LF1G18AC on Linux kernel v5.15.
I didn't see the "device hangs" situation on my side.
BP is there to prevent incorrect operations.
Please check the controller settings to trace this issue.

Thanks
Jaime
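
For readers following the thread, the probe described above can be sketched in plain C. This is a minimal userspace model under stated assumptions, not the kernel code: `struct fake_chip`, `fake_get_features()` and `block_protection_supported()` are hypothetical names made up for illustration. The feature address 0xA0 and the all-locked value 0x38 come from the trace and discussion quoted below, and the `disable_bp` flag stands in for the proposed "mxic,disable-block-protection" property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FEATURE_ADDR_PROTECTION 0xA0 /* "addr=a0" in the quoted trace */
#define PROTECTION_ALL_LOCK     0x38 /* value the check compares against */
#define SUBFEATURE_LEN          4    /* GET FEATURES returns four bytes */

/* Hypothetical stand-in for a NAND chip: only the bytes it would return. */
struct fake_chip {
	uint8_t protection[SUBFEATURE_LEN];
	bool disable_bp;        /* models the proposed DT property */
	int get_features_calls; /* counts how often the chip was queried */
};

/* Hypothetical GET FEATURES: copies the stored sub-feature bytes out. */
static int fake_get_features(struct fake_chip *chip, uint8_t addr,
			     uint8_t *out)
{
	assert(chip != NULL && out != NULL);
	chip->get_features_calls++;
	if (addr != FEATURE_ADDR_PROTECTION)
		return -1;
	memcpy(out, chip->protection, SUBFEATURE_LEN);
	return 0;
}

/*
 * Returns true only when the chip answers with the all-locked value.
 * When disable_bp is set, the chip is never queried at all.
 */
static bool block_protection_supported(struct fake_chip *chip)
{
	uint8_t param[SUBFEATURE_LEN] = { 0 };

	if (chip->disable_bp)
		return false;
	if (fake_get_features(chip, FEATURE_ADDR_PROTECTION, param) != 0)
		return false;
	return param[0] == PROTECTION_ALL_LOCK;
}
```

With the all-zero answer seen in the trace, this model reports block protection as unsupported after a single query. With `disable_bp` set, it returns without sending any command to the chip, which is the behaviour the proposed property is meant to provide.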

>
> Hello YouChing and Jaime,
>
> I still didn't get any feedback from you (or Macronix) on this issue.
> Did you have time to look into it?
>
> Thanks,
> Álvaro.
>
> El vie, 24 mar 2023 a las 18:04, Álvaro Fernández Rojas
> (<noltari@gmail.com>) escribió:
> >
> > Hi Miquèl,
> >
> > 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > > Hi Álvaro,
> > >
> > > + YouChing and Jaime from Macronix
> > > TLDR for them: there is a misbehavior since Mason added block
> > > protection support. Just checking if the blocks are protected seems to
> > > misconfigure the chip entirely, see below. Any hints?
> >
> > Could it be that the NAND is stuck expecting a read 0x00 command which
> > isn’t sent after getting the features?
> >
> > >
> > > noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> > >
> > >> Hi Miquèl,
> > >>
> > >> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > >> > Hi Álvaro,
> > >> >
> > >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> > >> >
> > >> >> El vie, 24 mar 2023 a las 11:49, Miquel Raynal
> > >> >> (<miquel.raynal@bootlin.com>) escribió:
> > >> >> >
> > >> >> > Hi Álvaro,
> > >> >> >
> > >> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> > >> >> >
> > >> >> > > Hi Miquèl,
> > >> >> > >
> > >> >> > > El vie, 24 mar 2023 a las 10:40, Miquel Raynal
> > >> >> > > (<miquel.raynal@bootlin.com>) escribió:
> > >> >> > > >
> > >> >> > > > Hi Álvaro,
> > >> >> > > >
> > >> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> > >> >> > > >
> > >> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> > >> >> > > > > This binding allows disabling block protection support for
> > >> >> > > > > those devices not supporting it.
> > >> >> > > > >
> > >> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> > >> >> > > > > ---
> > >> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3 +++
> > >> >> > > > >  1 file changed, 3 insertions(+)
> > >> >> > > > >
> > >> >> > > > > diff --git a/Documentation/devicetree/bindings/mtd/nand-macronix.txt b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > >> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> > >> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > >> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > >> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> > >> >> > > > >  Required NAND chip properties in children mode:
> > >> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> > >> >> > > > >
> > >> >> > > > > +Optional NAND chip properties in children mode:
> > >> >> > > > > +- block protection disable: should be "mxic,disable-block-protection"
> > >> >> > > > > +
> > >> >> > > >
> > >> >> > > > Besides the fact that nowadays we prefer to see binding conversions
> > >> >> > > > to yaml before adding anything, I don't think this will fly.
> > >> >> > > >
> > >> >> > > > I'm not sure exactly what "disable block protection" means, we
> > >> >> > > > already have similar properties like "lock" and "secure-regions",
> > >> >> > > > not sure they will fit but I think it's worth checking.
> > >> >> > >
> > >> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression
> > >> >> > > on Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> > >> >> > > MX30LF1G18AC which hangs the device.
> > >> >> > >
> > >> >> > > This is the log with block protection disabled:
> > >> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps for state default
> > >> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > >> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> > >> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > >> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > >> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> > >> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> > >> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device brcmnand.0
> > >> >> > > ...
> > >> >> > >
> > >> >> > > This is the log with block protection enabled:
> > >> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps for state default
> > >> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > >> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> > >> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > >> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > >> >> > > [    0.539687] Bad block table not found for chip 0
> > >> >> > > [    0.550153] Bad block table not found for chip 0
> > >> >> > > [    0.555069] Scanning device for bad blocks
> > >> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at virtual address 10277f00, epc == 8039ce70, ra == 8016ad50
> > >> >> > > *** Device hangs ***
> > >> >> > >
> > >> >> > > Enabling macronix_nand_block_protection_support() makes the device
> > >> >> > > unable to detect the bad block table and hangs it when trying to
> > >> >> > > scan for bad blocks.
> > >> >> >
> > >> >> > Please trace nand_macronix.c and look:
> > >> >> > - are the get_features and set_features really supported by the
> > >> >> >   controller driver?
> > >> >>
> > >> >> This is what I could find by debugging:
> > >> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for state default
> > >> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
> > >> >> [    0.512077] nand: Macronix MX30LF1G18AC
> > >> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > >> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > >> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> > >> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> > >> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > >> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > >> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > >> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > >> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > >> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > >> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> > >> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES = 0x00
> > >> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00 00 00 00] -> 0
> > >> >> [    0.602341] macronix_nand_block_protection_support: ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> > >> >> [    0.610548] macronix_nand_block_protection_support: != MXIC_BLOCK_PROTECTION_ALL_LOCK
> > >> >> [    0.624760] Bad block table not found for chip 0
> > >> >> [    0.635542] Bad block table not found for chip 0
> > >> >> [    0.640270] Scanning device for bad blocks
> > >> >>
> > >> >> I don't know how to tell if get_features / set_features is really
> > >> >> supported...
> > >> >
> > >> > Looks like your driver does not support exec_op but the core provides a
> > >> > get/set_feature implementation.
> > >>
> > >> According to Florian, low level should be supported on brcmnand
> > >> controllers >= 4.0
> > >> Also:
> > >> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> > >
> > > Just to be sure, you're using a mainline controller driver, not this
> > > one?
> >
> > Yes, this was just to prove that the HW I’m using has get/set features support.
> > I’m using OpenWrt, so it’s linux v5.15 driver.
> >
> > >
> > >> >
> > >> >>
> > >> >> > - what is the state of the locking configuration in the chip when
> > >> >> >   you boot?
> > >> >>
> > >> >> Unlocked, I guess...
> > >> >> How can I check that?
> > >> >
> > >> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> > >> > apparently.
> > >>
> > >> Well, I can read/write the device if block protection is disabled,
> > >> so I guess we can confirm it’s unlocked…
> > >>
> > >> >
> > >> >> > - is there anything that locks the device by calling mxic_nand_lock()?
> > >> >
> > >> > So nobody locks the device I guess? Did you add traces there?
> > >>
> > >> It doesn’t get to the point where it enables the lock/unlock functions
> > >> since it fails when checking if the feature is 0x38, so there’s no point
> > >> in adding those traces…
> > >
> > > Right, it returns before setting these I guess.
> > >
> > >>
> > >> >
> > >> >> > - finding no bbt is one thing, hanging is another, where is it
> > >> >> >   hanging exactly? (offset in nand/ and line in the code)
> > >> >>
> > >> >> I've got no idea...
> > >> >
> > >> > You can use ftrace or just add printks a bit everywhere and try to get
> > >> > closer and closer.
> > >>
> > >> I think that after trying to get the feature it just starts reading
> > >> nonsense from the NAND and at some point it hangs due to that garbage…
> > >
> > > It should refuse to mount the device somehow, but in no case the kernel
> > > should hang.
> >
> > Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >
> > >
> > >> Is it possible that the NAND starts behaving like this after getting
> > >> the feature due to some specific config of my device?
> > >>
> > >> >
> > >> > I looked at the patch, I don't see anything strange. Besides, I have a
> > >> > close enough datasheet and I don't see what could confuse the device.
> > >> >
> > >> > Are you really sure this patch is the problem? Is the WP pin wired on
> > >> > your design?
> > >>
> > >> There’s no WP pin in brcmnand controllers < 7.0
> > >
> > > What about the chip?
> >
> > Maybe it has a GPIO controlling that, but I don’t have that info…
> >
> > >
> > > Thanks,
> > > Miquèl
> > >

* Re: [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding
@ 2023-04-26  7:24   ` liao jaime
  0 siblings, 0 replies; 50+ messages in thread
From: liao jaime @ 2023-04-26  7:24 UTC (permalink / raw)
  To: Miquel Raynal, noltari
  Cc: Richard Weinberger, Vignesh Raghavendra, robh+dt,
	krzysztof.kozlowski+dt, linux-mtd, devicetree, linux-kernel

Hi Álvaro

In nand_scan_tail(), each manufacturer's init function is called.
In macronix_nand_init(), the block protection check runs after flash detection.
I have validated the MX30LF1G18AC on Linux kernel v5.15.
I could not reproduce the "device hangs" situation on my side.
Block protection (BP) exists to prevent incorrect operations.
Please check the controller settings to trace this issue.

Thanks
Jaime
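
For reference, the probe-time decision discussed in this thread can be
sketched as a small user-space model. This is only an illustration, not the
code in drivers/mtd/nand/raw/nand_macronix.c: struct fake_chip,
fake_get_features() and block_protection_probe() are hypothetical names, and
the only values taken from the thread are the feature address (0xa0, from the
nand_get_features trace) and the "all locked" value (0x38). The model assumes
that lock/unlock support is enabled only when GET FEATURES succeeds and
returns 0x38.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the thread: feature address 0xa0, "all locked" = 0x38. */
#define FEATURE_ADDR_MXIC_PROTECTION 0xA0
#define BLOCK_PROTECTION_ALL_LOCK    0x38

/* Hypothetical stand-in for a NAND chip; not the kernel's struct nand_chip. */
struct fake_chip {
	uint8_t protection_reg;  /* value GET FEATURES at 0xa0 would return */
	bool get_features_fails; /* simulates a controller-level failure */
	bool lock_ops_enabled;   /* set when lock/unlock hooks would be installed */
};

/* Hypothetical GET FEATURES helper: returns 0 on success, -1 on failure. */
static int fake_get_features(const struct fake_chip *chip, uint8_t addr,
			     uint8_t *value)
{
	if (chip->get_features_fails || addr != FEATURE_ADDR_MXIC_PROTECTION)
		return -1;
	*value = chip->protection_reg;
	return 0;
}

/*
 * Assumed decision: enable lock/unlock support only if the chip reports
 * "all locked"; any other value, or a failed read, leaves it disabled.
 */
static void block_protection_probe(struct fake_chip *chip)
{
	uint8_t feature = 0;

	chip->lock_ops_enabled = false;
	if (fake_get_features(chip, FEATURE_ADDR_MXIC_PROTECTION, &feature))
		return;
	if (feature != BLOCK_PROTECTION_ALL_LOCK)
		return;
	chip->lock_ops_enabled = true;
}
```

With the value 0x00 reported in Álvaro's trace, this model returns early and
never enables the lock/unlock hooks, which matches the observation quoted
below that the driver bails out at the 0x38 check.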

>
> Hello YouChing and Jaime,
>
> I still didn't get any feedback from you (or Macronix) on this issue.
> Did you have time to look into it?
>
> Thanks,
> Álvaro.
>
> On Fri, 24 Mar 2023 at 18:04, Álvaro Fernández Rojas
> (<noltari@gmail.com>) wrote:
> >
> > Hi Miquèl,
> >
> > 2023-03-24 15:36 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > > Hi Álvaro,
> > >
> > > + YouChing and Jaime from Macronix
> > > TLDR for them: there is a misbehavior since Mason added block
> > > protection support. Just checking if the blocks are protected seems to
> > > misconfigure the chip entirely, see below. Any hints?
> >
> > Could it be that the NAND is stuck expecting a read 0x00 command which
> > isn’t sent after getting the features?
> >
> > >
> > > noltari@gmail.com wrote on Fri, 24 Mar 2023 15:15:47 +0100:
> > >
> > >> Hi Miquèl,
> > >>
> > >> 2023-03-24 14:45 GMT+01:00, Miquel Raynal <miquel.raynal@bootlin.com>:
> > >> > Hi Álvaro,
> > >> >
> > >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 12:21:11 +0100:
> > >> >
> > >> >> On Fri, 24 Mar 2023 at 11:49, Miquel Raynal
> > >> >> (<miquel.raynal@bootlin.com>) wrote:
> > >> >> >
> > >> >> > Hi Álvaro,
> > >> >> >
> > >> >> > noltari@gmail.com wrote on Fri, 24 Mar 2023 11:31:17 +0100:
> > >> >> >
> > >> >> > > Hi Miquèl,
> > >> >> > >
> > >> >> > > On Fri, 24 Mar 2023 at 10:40, Miquel Raynal
> > >> >> > > (<miquel.raynal@bootlin.com>) wrote:
> > >> >> > > >
> > >> >> > > > Hi Álvaro,
> > >> >> > > >
> > >> >> > > > noltari@gmail.com wrote on Thu, 23 Mar 2023 13:45:09 +0100:
> > >> >> > > >
> > >> >> > > > > Add new "mxic,disable-block-protection" binding documentation.
> > >> >> > > > > This binding allows disabling block protection support for
> > >> >> > > > > those
> > >> >> > > > > devices not
> > >> >> > > > > supporting it.
> > >> >> > > > >
> > >> >> > > > > Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> > >> >> > > > > ---
> > >> >> > > > >  Documentation/devicetree/bindings/mtd/nand-macronix.txt | 3
> > >> >> > > > > +++
> > >> >> > > > >  1 file changed, 3 insertions(+)
> > >> >> > > > >
> > >> >> > > > > diff --git
> > >> >> > > > > a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > >> >> > > > > b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > >> >> > > > > index ffab28a2c4d1..03f65ca32cd3 100644
> > >> >> > > > > --- a/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > >> >> > > > > +++ b/Documentation/devicetree/bindings/mtd/nand-macronix.txt
> > >> >> > > > > @@ -16,6 +16,9 @@ in children nodes.
> > >> >> > > > >  Required NAND chip properties in children mode:
> > >> >> > > > >  - randomizer enable: should be "mxic,enable-randomizer-otp"
> > >> >> > > > >
> > >> >> > > > > +Optional NAND chip properties in children mode:
> > >> >> > > > > +- block protection disable: should be
> > >> >> > > > > "mxic,disable-block-protection"
> > >> >> > > > > +
> > >> >> > > >
> > >> >> > > > Besides the fact that nowadays we prefer to see binding
> > >> >> > > > conversions
> > >> >> > > > to
> > >> >> > > > yaml before adding anything, I don't think this will fly.
> > >> >> > > >
> > >> >> > > > I'm not sure exactly what "disable block protection" means, we
> > >> >> > > > already have similar properties like "lock" and
> > >> >> > > > "secure-regions",
> > >> >> > > > not
> > >> >> > > > sure they will fit but I think it's worth checking.
> > >> >> > >
> > >> >> > > As explained in 2/2, commit 03a539c7a118 introduced a regression
> > >> >> > > on
> > >> >> > > Sercomm H500-s (BCM63268) OpenWrt devices with Macronix
> > >> >> > > MX30LF1G18AC
> > >> >> > > which hangs the device.
> > >> >> > >
> > >> >> > > This is the log with block protection disabled:
> > >> >> > > [    0.495831] bcm6368_nand 10000200.nand: there is not valid maps
> > >> >> > > for
> > >> >> > > state default
> > >> >> > > [    0.504995] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> > >> >> > > 0xf1
> > >> >> > > [    0.511526] nand: Macronix MX30LF1G18AC
> > >> >> > > [    0.515586] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > >> >> > > 2048, OOB size: 64
> > >> >> > > [    0.523516] bcm6368_nand 10000200.nand: detected 128MiB total,
> > >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > >> >> > > [    0.535912] Bad block table found at page 65472, version 0x01
> > >> >> > > [    0.544268] Bad block table found at page 65408, version 0x01
> > >> >> > > [    0.954329] 9 fixed-partitions partitions found on MTD device
> > >> >> > > brcmnand.0
> > >> >> > > ...
> > >> >> > >
> > >> >> > > This is the log with block protection enabled:
> > >> >> > > [    0.495095] bcm6368_nand 10000200.nand: there is not valid maps
> > >> >> > > for
> > >> >> > > state default
> > >> >> > > [    0.504249] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> > >> >> > > 0xf1
> > >> >> > > [    0.510772] nand: Macronix MX30LF1G18AC
> > >> >> > > [    0.514874] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > >> >> > > 2048, OOB size: 64
> > >> >> > > [    0.522780] bcm6368_nand 10000200.nand: detected 128MiB total,
> > >> >> > > 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > >> >> > > [    0.539687] Bad block table not found for chip 0
> > >> >> > > [    0.550153] Bad block table not found for chip 0
> > >> >> > > [    0.555069] Scanning device for bad blocks
> > >> >> > > [    0.601213] CPU 1 Unable to handle kernel paging request at
> > >> >> > > virtual
> > >> >> > > address 10277f00, epc == 8039ce70, ra == 8016ad50
> > >> >> > > *** Device hangs ***
> > >> >> > >
> > >> >> > > Enabling macronix_nand_block_protection_support() makes the device
> > >> >> > > unable to detect the bad block table and hangs it when trying to
> > >> >> > > scan
> > >> >> > > for bad blocks.
> > >> >> >
> > >> >> > Please trace nand_macronix.c and look:
> > >> >> > - are the get_features and set_features really supported by the
> > >> >> >   controller driver?
> > >> >>
> > >> >> This is what I could find by debugging:
> > >> >> [    0.494993] bcm6368_nand 10000200.nand: there is not valid maps for
> > >> >> state default
> > >> >> [    0.505375] nand: device found, Manufacturer ID: 0xc2, Chip ID:
> > >> >> 0xf1
> > >> >> [    0.512077] nand: Macronix MX30LF1G18AC
> > >> >> [    0.515994] nand: 128 MiB, SLC, erase size: 128 KiB, page size:
> > >> >> 2048, OOB size: 64
> > >> >> [    0.523928] bcm6368_nand 10000200.nand: detected 128MiB total,
> > >> >> 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-4
> > >> >> [    0.534415] bcm6368_nand 10000200.nand: ll_op cmd 0xa00ee
> > >> >> [    0.539988] bcm6368_nand 10000200.nand: ll_op cmd 0x600a0
> > >> >> [    0.545659] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > >> >> [    0.551214] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> > >> >> 0x00
> > >> >> [    0.557843] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > >> >> [    0.563475] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> > >> >> 0x00
> > >> >> [    0.569998] bcm6368_nand 10000200.nand: ll_op cmd 0x10000
> > >> >> [    0.575653] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> > >> >> 0x00
> > >> >> [    0.582246] bcm6368_nand 10000200.nand: ll_op cmd 0x80010000
> > >> >> [    0.588067] bcm6368_nand 10000200.nand: NAND_CMD_GET_FEATURES =
> > >> >> 0x00
> > >> >> [    0.594657] nand: nand_get_features: addr=a0 subfeature_param=[00
> > >> >> 00 00 00] -> 0
> > >> >> [    0.602341] macronix_nand_block_protection_support:
> > >> >> ONFI_FEATURE_ADDR_MXIC_PROTECTION=0
> > >> >> [    0.610548] macronix_nand_block_protection_support: !=
> > >> >> MXIC_BLOCK_PROTECTION_ALL_LOCK
> > >> >> [    0.624760] Bad block table not found for chip 0
> > >> >> [    0.635542] Bad block table not found for chip 0
> > >> >> [    0.640270] Scanning device for bad blocks
> > >> >>
> > >> >> I don't know how to tell if get_features / set_features is really
> > >> >> supported...
> > >> >
> > >> > Looks like your driver does not support exec_op but the core provides a
> > >> > get/set_feature implementation.
> > >>
> > >> According to Florian, low level should be supported on brcmnand
> > >> controllers >= 4.0
> > >> Also:
> > >> https://github.com/nomis/bcm963xx_4.12L.06B_consumer/blob/e2f23ddbb20bf75689372b6e6a5a0dc613f6e313/shared/opensource/include/bcm963xx/63268_map_part.h#L1597
> > >
> > > Just to be sure, you're using a mainline controller driver, not this
> > > one?
> >
> > Yes, this was just to prove that the HW I’m using has get/set features support.
> > I’m using OpenWrt, so it’s linux v5.15 driver.
> >
> > >
> > >> >
> > >> >>
> > >> >> > - what is the state of the locking configuration in the chip when
> > >> >> > you
> > >> >> >   boot?
> > >> >>
> > >> >> Unlocked, I guess...
> > >> >> How can I check that?
> > >> >
> > >> > It's in your dump, the chip returns 0, meaning it's all unlocked,
> > >> > apparently.
> > >>
> > >> Well, I can read/write the device if block protection isn’t disabled,
> > >> so I guess we can confirm it’s unlocked…
> > >>
> > >> >
> > >> >> > - is there anything that locks the device by calling mxic_nand_lock()
> > >> >> > ?
> > >> >
> > >> > So nobody locks the device I guess? Did you add traces there?
> > >>
> > >> It doesn’t get to the point where it enables the lock/unlock functions
> > >> since it fails when checking if feature is 0x38, so there’s no point
> > >> in adding those traces…
> > >
> > > Right, it returns before setting these I guess.
> > >
> > >>
> > >> >
> > >> >> > - finding no bbt is one thing, hanging is another, where is it
> > >> >> > hanging
> > >> >> >   exactly? (offset in nand/ and line in the code)
> > >> >>
> > >> >> I've got no idea...
> > >> >
> > >> > You can use ftrace or just add printks a bit everywhere and try to get
> > >> > closer and closer.
> > >>
> > >> I think that after trying to get the feature it just starts reading
> > >> nonsense from the NAND and at some point it hangs due to that garbage…
> > >
> > > It should refuse to mount the device somehow, but in no case the kernel
> > > should hang.
> >
> > Yes, I think that this is a side effect (maybe a different bug somewhere else).
> >
> > >
> > >> Is it possible that the NAND starts behaving like this after getting
> > >> the feature due to some specific config of my device?
> > >>
> > >> >
> > >> > I looked at the patch, I don't see anything strange. Besides, I have a
> > >> > close enough datasheet and I don't see what could confuse the device.
> > >> >
> > >> > Are you really sure this patch is the problem? Is the WP pin wired on
> > >> > your design?
> > >>
> > >> There’s no WP pin in brcmnand controllers < 7.0
> > >
> > > What about the chip?
> >
> > Maybe it has a GPIO controlling that, but I don’t have that info…
> >
> > >
> > > Thanks,
> > > Miquèl
> > >
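
The "ll_op cmd" words in the debug log quoted above can be decoded with a
short helper. This is a sketch that assumes the low-level op bit layout used
by the mainline brcmnand driver (bit 16 = RE, bit 17 = WE, bit 18 = ALE,
bit 19 = CLE, bit 31 = return to idle, low 16 bits = data). That layout is an
assumption here and has not been checked against the bcm6368_nand driver that
produced the log; llop_decode() and its types are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of a low-level op word (see the note above). */
#define LLOP_RE          (1u << 16) /* read strobe */
#define LLOP_WE          (1u << 17) /* write strobe */
#define LLOP_ALE         (1u << 18) /* address latch enable */
#define LLOP_CLE         (1u << 19) /* command latch enable */
#define LLOP_RETURN_IDLE (1u << 31) /* last op of the sequence */
#define LLOP_DATA_MASK   0xFFFFu

enum llop_kind {
	LLOP_KIND_CMD,     /* WE + CLE: command cycle */
	LLOP_KIND_ADDR,    /* WE + ALE: address cycle */
	LLOP_KIND_WRITE,   /* WE only: data write */
	LLOP_KIND_READ,    /* RE only: data read */
	LLOP_KIND_UNKNOWN,
};

struct llop {
	enum llop_kind kind;
	uint16_t data; /* command, address or data byte(s) */
	bool last;     /* true if the word asks to return to idle */
};

/* Split a raw ll_op word into its cycle type, payload and "last" flag. */
static struct llop llop_decode(uint32_t word)
{
	struct llop op = {
		.kind = LLOP_KIND_UNKNOWN,
		.data = (uint16_t)(word & LLOP_DATA_MASK),
		.last = (word & LLOP_RETURN_IDLE) != 0,
	};
	uint32_t strobes = word & (LLOP_RE | LLOP_WE | LLOP_ALE | LLOP_CLE);

	if (strobes == (LLOP_WE | LLOP_CLE))
		op.kind = LLOP_KIND_CMD;
	else if (strobes == (LLOP_WE | LLOP_ALE))
		op.kind = LLOP_KIND_ADDR;
	else if (strobes == LLOP_WE)
		op.kind = LLOP_KIND_WRITE;
	else if (strobes == LLOP_RE)
		op.kind = LLOP_KIND_READ;

	return op;
}
```

Under that assumption, 0xa00ee is a command cycle carrying 0xee, 0x600a0 is an
address cycle carrying 0xa0, and the 0x10000 words are data reads, with
0x80010000 marking the last read. That is the shape of a GET FEATURES
sequence for feature address 0xa0, four parameter bytes long.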

end of thread, other threads:[~2023-05-24  5:31 UTC | newest]

Thread overview:
2023-03-23 12:45 [PATCH 0/2] mtd: nand: raw: macronix: allow disabling block protection Álvaro Fernández Rojas
2023-03-23 12:45 ` [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding Álvaro Fernández Rojas
2023-03-24  9:40   ` Miquel Raynal
2023-03-24 10:31     ` Álvaro Fernández Rojas
2023-03-24 10:49       ` Miquel Raynal
2023-03-24 11:21         ` Álvaro Fernández Rojas
2023-03-24 13:45           ` Miquel Raynal
2023-03-24 14:15             ` Álvaro Fernández Rojas
2023-03-24 14:36               ` Miquel Raynal
2023-03-24 17:04                 ` Álvaro Fernández Rojas
2023-03-27  8:21                   ` Miquel Raynal
2023-04-22  9:28                   ` Álvaro Fernández Rojas
2023-03-23 12:45 ` [PATCH 2/2] mtd: nand: raw: macronix: allow disabling block protection Álvaro Fernández Rojas
2023-03-23 12:47   ` Tudor Ambarus
2023-03-23 12:55     ` Álvaro Fernández Rojas
     [not found] <CAAQoYRm3766SG7+VuwVzu_xH8aWihoKWMEp8xQGNgJ6oOtC9+g@mail.gmail.com>
2023-04-26  7:24 ` [PATCH 1/2] dt-bindings: mtd: nand: Macronix: document new binding liao jaime
2023-05-16 18:55   ` Álvaro Fernández Rojas
2023-05-16 18:58     ` Florian Fainelli
2023-05-16 19:02       ` Álvaro Fernández Rojas
2023-05-17  5:30         ` William Zhang
2023-05-17 15:20           ` Álvaro Fernández Rojas
2023-05-22  8:15             ` Miquel Raynal
2023-05-22  9:21               ` liao jaime
2023-05-23  0:59             ` William Zhang
2023-05-24  5:30               ` liao jaime