linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
@ 2022-12-02  7:19 Francesco Dolcini
  2022-12-02  9:14 ` Miquel Raynal
  2022-12-02 12:43 ` Greg KH
  0 siblings, 2 replies; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-02  7:19 UTC (permalink / raw)
  To: Miquel Raynal, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Marek Vasut
  Cc: Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable

From: Francesco Dolcini <francesco.dolcini@toradex.com>

Add a fallback mechanism to handle the case in which #size-cells is set
to <0>. According to the DT binding the nand controller node should have
set it to 0 and this is not compatible with the legacy way of
specifying partitions directly as child nodes of the nand-controller node.

This fixes a boot failure on colibri-imx7 and potentially other boards.

Cc: stable@vger.kernel.org
Fixes: 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
Link: https://lore.kernel.org/all/Y4dgBTGNWpM6SQXI@francesco-nb.int.toradex.com/
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
---
 drivers/mtd/parsers/ofpart_core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/mtd/parsers/ofpart_core.c b/drivers/mtd/parsers/ofpart_core.c
index 192190c42fc8..aa3b7fa61e50 100644
--- a/drivers/mtd/parsers/ofpart_core.c
+++ b/drivers/mtd/parsers/ofpart_core.c
@@ -122,6 +122,17 @@ static int parse_fixed_partitions(struct mtd_info *master,
 
 		a_cells = of_n_addr_cells(pp);
 		s_cells = of_n_size_cells(pp);
+		if (s_cells == 0) {
+			/*
+			 * Use #size-cells = <1> for backward compatibility
+			 * in case #size-cells is set to <0> and firmware adds
+			 * OF partitions without setting it.
+			 */
+			pr_warn_once("%s: ofpart partition %pOF (%pOF) #size-cells is <0>, using <1> for backward compatibility.\n",
+				     master->name, pp,
+				     mtd_node);
+			s_cells = 1;
+		}
 		if (len / 4 != a_cells + s_cells) {
 			pr_debug("%s: ofpart partition %pOF (%pOF) error parsing reg property.\n",
 				 master->name, pp,
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02  7:19 [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0 Francesco Dolcini
@ 2022-12-02  9:14 ` Miquel Raynal
  2022-12-02 10:12   ` Francesco Dolcini
  2022-12-02 12:43 ` Greg KH
  1 sibling, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02  9:14 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd, Marek Vasut,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable

Hi Francesco,

francesco@dolcini.it wrote on Fri,  2 Dec 2022 08:19:00 +0100:

> From: Francesco Dolcini <francesco.dolcini@toradex.com>
> 
> Add a fallback mechanism to handle the case in which #size-cells is set
> to <0>. According to the DT binding the nand controller node should have
> set it to 0 and this is not compatible with the legacy way of
> specifying partitions directly as child nodes of the nand-controller node.

I understand the problem, I understand the fix, but I have to say, I
strongly dislike it :) Touching an mtd core driver to fix a single
broken use case like that is... problematic, for the least.

I am sorry but if a 6.0 kernel breaks because:
- a legacy scheme is used in the description
- u-boot still does not conform to the DT standard
- a patch tries to make a tool happy
Then the solution is clearly not an 'mtd core' mainline patch.

If you really want to workaround U-Boot, either you revert that patch
or you just fix the DT description instead. The parent/child/partitions
scheme has been enforced for maybe 5 years now and for a good reason: a
NAND controller with partitions does not make _any_ sense. There are
plenty of examples out there, imx7-colibri.dtsi has received many
updates since its introduction (for the best), so why not this one?

Cheers,
Miquèl

> This fixes a boot failure on colibri-imx7 and potentially other boards.
> 
> Cc: stable@vger.kernel.org
> Fixes: 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> Link: https://lore.kernel.org/all/Y4dgBTGNWpM6SQXI@francesco-nb.int.toradex.com/
> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
> ---
>  drivers/mtd/parsers/ofpart_core.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/mtd/parsers/ofpart_core.c b/drivers/mtd/parsers/ofpart_core.c
> index 192190c42fc8..aa3b7fa61e50 100644
> --- a/drivers/mtd/parsers/ofpart_core.c
> +++ b/drivers/mtd/parsers/ofpart_core.c
> @@ -122,6 +122,17 @@ static int parse_fixed_partitions(struct mtd_info *master,
>  
>  		a_cells = of_n_addr_cells(pp);
>  		s_cells = of_n_size_cells(pp);
> +		if (s_cells == 0) {
> +			/*
> +			 * Use #size-cells = <1> for backward compatibility
> +			 * in case #size-cells is set to <0> and firmware adds
> +			 * OF partitions without setting it.
> +			 */
> +			pr_warn_once("%s: ofpart partition %pOF (%pOF) #size-cells is <0>, using <1> for backward compatibility.\n",
> +				     master->name, pp,
> +				     mtd_node);
> +			s_cells = 1;
> +		}
>  		if (len / 4 != a_cells + s_cells) {
>  			pr_debug("%s: ofpart partition %pOF (%pOF) error parsing reg property.\n",
>  				 master->name, pp,


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02  9:14 ` Miquel Raynal
@ 2022-12-02 10:12   ` Francesco Dolcini
  2022-12-02 10:24     ` Francesco Dolcini
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-02 10:12 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Marek Vasut, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable

Hello Miquel,

On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote:
> francesco@dolcini.it wrote on Fri,  2 Dec 2022 08:19:00 +0100:
> > From: Francesco Dolcini <francesco.dolcini@toradex.com>
> > 
> > Add a fallback mechanism to handle the case in which #size-cells is set
> > to <0>. According to the DT binding the nand controller node should have
> > set it to 0 and this is not compatible with the legacy way of
> > specifying partitions directly as child nodes of the nand-controller node.
> 
> I understand the problem, I understand the fix, but I have to say, I
> strongly dislike it :) Touching an mtd core driver to fix a single
> broken use case like that is... problematic, for the least.
I just noticed it 2 days after this patch was backported to a stable
kernel, I am just the first one to notice, we are not talking about a single
use case.

> I am sorry but if a 6.0 kernel breaks because:
Not only kernel 6.0 is currently broken. This patch is going to be
backported to any stable kernel given the fixes tag it has.

> If you really want to workaround U-Boot, either you revert that patch
> or you just fix the DT description instead. The parent/child/partitions
> scheme has been enforced for maybe 5 years now and for a good reason: a
> NAND controller with partitions does not make _any_ sense. There are
> plenty of examples out there, imx7-colibri.dtsi has received many
> updates since its introduction (for the best), so why not this one?

I can and I will update imx7-colibri.dtsi (patch coming), but is this
good enough given the kind of boot failure regression this introduce? We
are going to have old u-boot around that will not work with it, and the
reality is that there are tons of reasons why people do update the linux
kernel and dts everyday, but never ever update the bootloader.

We cannot tell
  "All users of the XXX kernel series must upgrade."
and at the same time introduce a regression that break the boot and
ignore it.

Francesco


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 10:12   ` Francesco Dolcini
@ 2022-12-02 10:24     ` Francesco Dolcini
  2022-12-02 10:53       ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-02 10:24 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Marek Vasut, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable

On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote:
> Hello Miquel,
> 
> On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote:
> > francesco@dolcini.it wrote on Fri,  2 Dec 2022 08:19:00 +0100:
> > > From: Francesco Dolcini <francesco.dolcini@toradex.com>
> > > 
> > > Add a fallback mechanism to handle the case in which #size-cells is set
> > > to <0>. According to the DT binding the nand controller node should have
> > > set it to 0 and this is not compatible with the legacy way of
> > > specifying partitions directly as child nodes of the nand-controller node.
> > 
> > I understand the problem, I understand the fix, but I have to say, I
> > strongly dislike it :) Touching an mtd core driver to fix a single
> > broken use case like that is... problematic, for the least.
> I just noticed it 2 days after this patch was backported to a stable
> kernel, I am just the first one to notice, we are not talking about a single
> use case.
> 
> > I am sorry but if a 6.0 kernel breaks because:
> Not only kernel 6.0 is currently broken. This patch is going to be
> backported to any stable kernel given the fixes tag it has.
> 
> > If you really want to workaround U-Boot, either you revert that patch
> > or you just fix the DT description instead. The parent/child/partitions
> > scheme has been enforced for maybe 5 years now and for a good reason: a
> > NAND controller with partitions does not make _any_ sense. There are
> > plenty of examples out there, imx7-colibri.dtsi has received many
> > updates since its introduction (for the best), so why not this one?
> 
> I can and I will update imx7-colibri.dtsi (patch coming), but is this
> good enough given the kind of boot failure regression this introduce? We
> are going to have old u-boot around that will not work with it, and the

Just another piece of information, support for the partitions node in
U-Boot was added in version v2022.04 [1], we are not talking about ancient
old legacy stuff.

If I add the partitions node as a child of my nand controller, as I was
planning to do and I wrote 10 lines above, I will create a new flavor of
non-booting system with U-Boot older than v2022.04 :-/

U-Boot older than v2022.04 will update the nand controller node never
the less, the partition node will still be there and Linux will use it,
but it will be empty since nobody populate it.

Francesco

[1] commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 10:24     ` Francesco Dolcini
@ 2022-12-02 10:53       ` Miquel Raynal
  2022-12-02 11:23         ` Francesco Dolcini
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 10:53 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd, Marek Vasut,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable

Hi Francesco,

francesco@dolcini.it wrote on Fri, 2 Dec 2022 11:24:29 +0100:

> On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote:
> > Hello Miquel,
> > 
> > On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote:  
> > > francesco@dolcini.it wrote on Fri,  2 Dec 2022 08:19:00 +0100:  
> > > > From: Francesco Dolcini <francesco.dolcini@toradex.com>
> > > > 
> > > > Add a fallback mechanism to handle the case in which #size-cells is set
> > > > to <0>. According to the DT binding the nand controller node should have
> > > > set it to 0 and this is not compatible with the legacy way of
> > > > specifying partitions directly as child nodes of the nand-controller node.  
> > > 
> > > I understand the problem, I understand the fix, but I have to say, I
> > > strongly dislike it :) Touching an mtd core driver to fix a single
> > > broken use case like that is... problematic, for the least.  
> > I just noticed it 2 days after this patch was backported to a stable
> > kernel, I am just the first one to notice, we are not talking about a single
> > use case.
> >   
> > > I am sorry but if a 6.0 kernel breaks because:  
> > Not only kernel 6.0 is currently broken. This patch is going to be
> > backported to any stable kernel given the fixes tag it has.
> >   
> > > If you really want to workaround U-Boot, either you revert that patch
> > > or you just fix the DT description instead. The parent/child/partitions
> > > scheme has been enforced for maybe 5 years now and for a good reason: a
> > > NAND controller with partitions does not make _any_ sense. There are
> > > plenty of examples out there, imx7-colibri.dtsi has received many
> > > updates since its introduction (for the best), so why not this one?  
> > 
> > I can and I will update imx7-colibri.dtsi (patch coming),

:thumb_up:

> > but is this
> > good enough given the kind of boot failure regression this introduce? We
> > are going to have old u-boot around that will not work with it, and the  
> 
> Just another piece of information, support for the partitions node in
> U-Boot was added in version v2022.04 [1], we are not talking about ancient
> old legacy stuff.

If it is so recent, then this is what needs to be fixed, and it should
not bother "many" people because 2022.04 is not so old.

So I am a bit lost, IIUC what is currently broken is:
- U-Boot > 2022.04 and any version of Linux with the backport?

> If I add the partitions node as a child of my nand controller, as I was
> planning to do and I wrote 10 lines above, I will create a new flavor of
> non-booting system with U-Boot older than v2022.04 :-/

I think there is a little confusion here. You are referring to the NAND
controller node, the commit refers to the NAND chip node. What this
commit does looks fine because it just tries to use the partitions {}
node rather than the NAND chip node and if the partitions {} node
already exist, I expect #address-cells and #size-cells to be defined
and be != 0 already.

Here is a proper description:

nand-controller {
	#address-cells = <1>;
	#size-cells = <0>;
	nand-chip {
		partitions {
			#address-cells = <1 or 2>;
			#size-cells = <1 or 2>;
			partition@x { };
			partition@y { };
		};
	};

	/* Here you can very well have another nand-chip node with
	 * another reg property which represents its own CS and another
	 * set of partitions.
	 */
};

> U-Boot older than v2022.04 will update the nand controller node never
> the less, the partition node will still be there and Linux will use it,
> but it will be empty since nobody populate it.
> 
> Francesco
> 
> [1] commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")


Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 10:53       ` Miquel Raynal
@ 2022-12-02 11:23         ` Francesco Dolcini
  2022-12-02 14:05           ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-02 11:23 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Marek Vasut, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable, u-boot

+ u-boot list

On Fri, Dec 02, 2022 at 11:53:27AM +0100, Miquel Raynal wrote:
> francesco@dolcini.it wrote on Fri, 2 Dec 2022 11:24:29 +0100:
> > On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote:
> > > On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote:  
> > > > francesco@dolcini.it wrote on Fri,  2 Dec 2022 08:19:00 +0100:  
> > > > > From: Francesco Dolcini <francesco.dolcini@toradex.com>
> > > > > 
> > > > > Add a fallback mechanism to handle the case in which #size-cells is set
> > > > > to <0>. According to the DT binding the nand controller node should have
> > > > > set it to 0 and this is not compatible with the legacy way of
> > > > > specifying partitions directly as child nodes of the nand-controller node.  
> > > > 
> > > > I understand the problem, I understand the fix, but I have to say, I
> > > > strongly dislike it :) Touching an mtd core driver to fix a single
> > > > broken use case like that is... problematic, for the least.  
> > > I just noticed it 2 days after this patch was backported to a stable
> > > kernel, I am just the first one to notice, we are not talking about a single
> > > use case.
> > >   
> > > > I am sorry but if a 6.0 kernel breaks because:  
> > > Not only kernel 6.0 is currently broken. This patch is going to be
> > > backported to any stable kernel given the fixes tag it has.
> > >   
> > > > If you really want to workaround U-Boot, either you revert that patch
> > > > or you just fix the DT description instead. The parent/child/partitions
> > > > scheme has been enforced for maybe 5 years now and for a good reason: a
> > > > NAND controller with partitions does not make _any_ sense. There are
> > > > plenty of examples out there, imx7-colibri.dtsi has received many
> > > > updates since its introduction (for the best), so why not this one?  
> > > 
> > > I can and I will update imx7-colibri.dtsi (patch coming),
> 
> :thumb_up:
> 
> > > but is this
> > > good enough given the kind of boot failure regression this introduce? We
> > > are going to have old u-boot around that will not work with it, and the  
> > 
> > Just another piece of information, support for the partitions node in
> > U-Boot was added in version v2022.04 [1], we are not talking about ancient
> > old legacy stuff.
> 
> If it is so recent, then this is what needs to be fixed, and it should
> not bother "many" people because 2022.04 is not so old.
> 
> So I am a bit lost, IIUC what is currently broken is:
> - U-Boot > 2022.04 and any version of Linux with the backport?
> 
> > If I add the partitions node as a child of my nand controller, as I was
> > planning to do and I wrote 10 lines above, I will create a new flavor of
> > non-booting system with U-Boot older than v2022.04 :-/
> 
> I think there is a little confusion here. You are referring to the NAND
I guess I have not explained myself well enough :-)

U-Boot is creating the partitions in the dtb, they are not defined in
the source dts file (this is common practice with multiple boards).

Before v2022.04 it was always updating the nand-controller node,
starting from v2022.04 if there is a dedicated `partitions` node it uses
it. This is just the reverse of what ofpart_core.c is doing (if the
partitions node is there it assumes the partitions should go into it,
otherwise it proceeds with the legacy way).

Let's have a concrete example with colibri-imx7.

Current status:
 - The nand-controller node does not include any partitions child, any
   U-Boot version will just add the partition directly as child of the
   nand controller. This is where I am hitting this boot regression now.

Potential change I envisioned here:
 - I add the partitions node to the nand-controller, e.g.

--- a/arch/arm/boot/dts/imx7-colibri.dtsi
+++ b/arch/arm/boot/dts/imx7-colibri.dtsi
@@ -380,6 +380,12 @@ &gpmi {
        nand-on-flash-bbt;
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_gpmi_nand>;
+
+       partitions {
+               compatible = "fixed-partitions";
+               #address-cells = <1>;
+               #size-cells = <1>;
+       };
 };

 - U-Boot >= v2022.04 will just work fine creating the partitions as
   currently described in the bindings.
 - U-Boot < v2022.04 will still create the partitions as child of the
   nand-controller node. Linux will see that a `partitions` node exists
   but it will be empty, leading to a boot failure in case mtd is used
   as boot device.


> controller node, the commit refers to the NAND chip node. What this
> commit does looks fine because it just tries to use the partitions {}
> node rather than the NAND chip node and if the partitions {} node
> already exist, I expect #address-cells and #size-cells to be defined
> and be != 0 already.
yes, this commit is perfectly fine I agree.

The reality is that people is using newer kernel with older U-Boot, and
I do not think that deliberately breaking this use case is what the
Linux kernel should do.

I do not think that I can push a change in the DTS that will break
booting any board using an older U-Boot.

Francesco


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02  7:19 [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0 Francesco Dolcini
  2022-12-02  9:14 ` Miquel Raynal
@ 2022-12-02 12:43 ` Greg KH
  1 sibling, 0 replies; 48+ messages in thread
From: Greg KH @ 2022-12-02 12:43 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Miquel Raynal, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Marek Vasut, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable

On Fri, Dec 02, 2022 at 08:19:00AM +0100, Francesco Dolcini wrote:
> From: Francesco Dolcini <francesco.dolcini@toradex.com>
> 
> Add a fallback mechanism to handle the case in which #size-cells is set
> to <0>. According to the DT binding the nand controller node should have
> set it to 0 and this is not compatible with the legacy way of
> specifying partitions directly as child nodes of the nand-controller node.
> 
> This fixes a boot failure on colibri-imx7 and potentially other boards.
> 
> Cc: stable@vger.kernel.org
> Fixes: 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> Link: https://lore.kernel.org/all/Y4dgBTGNWpM6SQXI@francesco-nb.int.toradex.com/
> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
> ---
>  drivers/mtd/parsers/ofpart_core.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 11:23         ` Francesco Dolcini
@ 2022-12-02 14:05           ` Miquel Raynal
  2022-12-02 14:31             ` Marek Vasut
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 14:05 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd, Marek Vasut,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

Hi Francesco,

francesco@dolcini.it wrote on Fri, 2 Dec 2022 12:23:37 +0100:

> + u-boot list
> 
> On Fri, Dec 02, 2022 at 11:53:27AM +0100, Miquel Raynal wrote:
> > francesco@dolcini.it wrote on Fri, 2 Dec 2022 11:24:29 +0100:  
> > > On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote:  
> > > > On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote:    
> > > > > francesco@dolcini.it wrote on Fri,  2 Dec 2022 08:19:00 +0100:    
> > > > > > From: Francesco Dolcini <francesco.dolcini@toradex.com>
> > > > > > 
> > > > > > Add a fallback mechanism to handle the case in which #size-cells is set
> > > > > > to <0>. According to the DT binding the nand controller node should have
> > > > > > set it to 0 and this is not compatible with the legacy way of
> > > > > > specifying partitions directly as child nodes of the nand-controller node.    
> > > > > 
> > > > > I understand the problem, I understand the fix, but I have to say, I
> > > > > strongly dislike it :) Touching an mtd core driver to fix a single
> > > > > broken use case like that is... problematic, for the least.    
> > > > I just noticed it 2 days after this patch was backported to a stable
> > > > kernel, I am just the first one to notice, we are not talking about a single
> > > > use case.
> > > >     
> > > > > I am sorry but if a 6.0 kernel breaks because:    
> > > > Not only kernel 6.0 is currently broken. This patch is going to be
> > > > backported to any stable kernel given the fixes tag it has.
> > > >     
> > > > > If you really want to workaround U-Boot, either you revert that patch
> > > > > or you just fix the DT description instead. The parent/child/partitions
> > > > > scheme has been enforced for maybe 5 years now and for a good reason: a
> > > > > NAND controller with partitions does not make _any_ sense. There are
> > > > > plenty of examples out there, imx7-colibri.dtsi has received many
> > > > > updates since its introduction (for the best), so why not this one?    
> > > > 
> > > > I can and I will update imx7-colibri.dtsi (patch coming),  
> > 
> > :thumb_up:
> >   
> > > > but is this
> > > > good enough given the kind of boot failure regression this introduce? We
> > > > are going to have old u-boot around that will not work with it, and the    
> > > 
> > > Just another piece of information, support for the partitions node in
> > > U-Boot was added in version v2022.04 [1], we are not talking about ancient
> > > old legacy stuff.  
> > 
> > If it is so recent, then this is what needs to be fixed, and it should
> > not bother "many" people because 2022.04 is not so old.
> > 
> > So I am a bit lost, IIUC what is currently broken is:
> > - U-Boot > 2022.04 and any version of Linux with the backport?
> >   
> > > If I add the partitions node as a child of my nand controller, as I was
> > > planning to do and I wrote 10 lines above, I will create a new flavor of
> > > non-booting system with U-Boot older than v2022.04 :-/  
> > 
> > I think there is a little confusion here. You are referring to the NAND  
> I guess I have not explained myself well enough :-)

Ok, there is still a confusion. Even though I think your logic still
applies, I want to emphasis on how wrong it is to define partitions in
the NAND _controller_ node rather than the NAND _chip_ node. And I
think this might have an impact on our final choice.

> U-Boot is creating the partitions in the dtb, they are not defined in
> the source dts file (this is common practice with multiple boards).

That fdt_fixup_mtdparts() thing is a mistake. The original idea is:

1. Define wrong nodes in your DT
2. Fix your DT at run time in U-Boot
3. Provide the "fixed" DT to Linux

Now step #2 now produces wrong FDT. So what, we should darken even
more the of partition driver in Linux to workaround it? At most what we
can do is warn the user so that people don't loose time understanding
what happens, but I am against supporting this, ever.

> Before v2022.04 it was always updating the nand-controller node,
> starting from v2022.04 if there is a dedicated `partitions` node it uses
> it.

Sounds reasonable.

> This is just the reverse of what ofpart_core.c is doing (if the
> partitions node is there it assumes the partitions should go into it,
> otherwise it proceeds with the legacy way).

Yes, that's how we handle legacy bindings.

> Let's have a concrete example with colibri-imx7.
> 
> Current status:
>  - The nand-controller node does not include any partitions child, any
>    U-Boot version will just add the partition directly as child of the
>    nand controller. This is where I am hitting this boot regression now.

Not exactly. It worked until now because your original DT already
included #size-cells = <1> I believe. It does not do that anymore and
that is why you get your boot regression: because the DT was modified.

The reason why the DT got modified however is interesting. The commit
log says the goal is to comply with modern bindings, which is great.
But if you break how your board boots, then you should probably not do
that. And if we really care about complying with the bindings, there
is something much more interesting than fixing a single property:
distinguishing the NAND controller vs. the NAND chip(s), which has been
enforced since 2016 (which probably predates the imx7-colibri.dtsi, but
whatever):
2d472aba15ff ("mtd: nand: document the NAND controller/NAND chip DT representation")

> Potential change I envisioned here:
>  - I add the partitions node to the nand-controller, e.g.
> 
> --- a/arch/arm/boot/dts/imx7-colibri.dtsi
> +++ b/arch/arm/boot/dts/imx7-colibri.dtsi
> @@ -380,6 +380,12 @@ &gpmi {
>         nand-on-flash-bbt;
>         pinctrl-names = "default";
>         pinctrl-0 = <&pinctrl_gpmi_nand>;
> +
> +       partitions {
> +               compatible = "fixed-partitions";
> +               #address-cells = <1>;
> +               #size-cells = <1>;
> +       };
>  };
> 
>  - U-Boot >= v2022.04 will just work fine creating the partitions as
>    currently described in the bindings.
>  - U-Boot < v2022.04 will still create the partitions as child of the
>    nand-controller node. Linux will see that a `partitions` node exists
>    but it will be empty, leading to a boot failure in case mtd is used
>    as boot device.
> 
> 
> > controller node, the commit refers to the NAND chip node. What this
> > commit does looks fine because it just tries to use the partitions {}
> > node rather than the NAND chip node and if the partitions {} node
> > already exist, I expect #address-cells and #size-cells to be defined
> > and be != 0 already.  
> yes, this commit is perfectly fine I agree.
> 
> The reality is that people is using newer kernel with older U-Boot, and
> I do not think that deliberately breaking this use case is what the
> Linux kernel should do.

Agreed.

> I do not think that I can push a change in the DTS that will break
> booting any board using an older U-Boot.

That's however the initial cause of this discovery. A DT change broke
your boot flow. I'm saying "your" boot flow because I am not sure it
affects "any" board.

For now it only affects the imx7 colibri boards because of:
753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")

But all these boards could be affected in the same way because of some
machine code playing with fdt_fixup_mtdparts():
* arch/arm/mach-uniphier/fdt-fixup.c
* board/compulab/cm_fx6/cm_fx6.c
* board/gateworks/gw_ventana/gw_ventana.c
* board/isee/igep003x/board.c
* board/isee/igep00x0/igep00x0.c
* board/phytec/phycore_am335x_r2/board.c
* board/st/stm32mp1/stm32mp1.c
* board/toradex/colibri-imx6ull/colibri-imx6ull.c
* board/toradex/colibri_imx7/colibri_imx7.c
* board/toradex/colibri_vf/colibri_vf.c
That's of course way too much possible failures.

I still strongly disagree with the initial proposal but what I think we
can do is:

1. To prevent future breakages: 
  Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
  kernel should work.

2. To help tracking down situations like that:
  Keep the warning in ofpart.c but continue to fail.

3. To fix the current situation:
   Immediately revert commit (and prevent it from being backported):
   753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
   This way your own boot flow is fixed in the short term.

4. There is no reason to partially fix a DT like what the above did
   besides trying to avoid warnings emitted by the DT check tools. If
   complying with modern bindings is a goal (and I think it should
   be), then we can modernize this DT without breaking the boot flow:
   Instead of only setting #size-cell = <0>, you can as well define
   in your DT a subnode to define the NAND chip. NAND chips are not
   supposed to have #size-cells properties, but in the past they did,
   which means #address-cells and #size-cells are allowed (and marked
   deprecated in the schema). So in practice, the dt-schema will not
   warn you if they are there, which means you can still set
   #size-cell = <1>.

   Please mind, the tools have been updated very recently to match
   what I am describing above, so they will likely still report
   errors until v6.2-rc1, see:
   https://lore.kernel.org/linux-mtd/20221114090315.848208-1-miquel.raynal@bootlin.com/

Does this sound reasonable?

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 14:05           ` Miquel Raynal
@ 2022-12-02 14:31             ` Marek Vasut
  2022-12-02 15:00               ` Miquel Raynal
  2022-12-02 15:56               ` Thorsten Leemhuis
  0 siblings, 2 replies; 48+ messages in thread
From: Marek Vasut @ 2022-12-02 14:31 UTC (permalink / raw)
  To: Miquel Raynal, Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 12/2/22 15:05, Miquel Raynal wrote:
> Hi Francesco,

Hi,

[...]

> I still strongly disagree with the initial proposal but what I think we
> can do is:
> 
> 1. To prevent future breakages:
>    Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
>    kernel should work.
> 
> 2. To help tracking down situations like that:
>    Keep the warning in ofpart.c but continue to fail.
> 
> 3. To fix the current situation:
>     Immediately revert commit (and prevent it from being backported):
>     753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>     This way your own boot flow is fixed in the short term.

Here I disagree, the fix is correct and I think we shouldn't proliferate 
incorrect DTs which don't match the binding document. Rather, if a 
bootloader generates incorrect (new) DT entries, I believe the driver 
should implement a fixup and warn user about this. PC does that as well 
with broken ACPI tables as far as I can tell.

I'm not convinced making a DT non-compliant with bindings again, only to 
work around a problem induced by bootloader, is the right approach here.

This would be setting a dangerous example, where anyone could request a 
DT fix to be reverted because their random bootloader does the wrong 
thing and with valid DT clean up, something broke.

> 4. There is no reason to partially fix a DT like what the above did
>     besides trying to avoid warnings emitted by the DT check tools.

Note that the 3. does not partially fix the DT, it fixes the node fully.

>     If
>     complying with modern bindings is a goal (and I think it should
>     be), then we can modernize this DT without breaking the boot flow:
>     Instead of only setting #size-cell = <0>, you can as well define
>     in your DT a subnode to define the NAND chip. NAND chips are not
>     supposed to have #size-cells properties, but in the past they did,
>     which means #address-cells and #size-cells are allowed (and marked
>     deprecated in the schema). So in practice, the dt-schema will not
>     warn you if they are there, which means you can still set
>     #size-cell = <1>.

I am really not convinced we should hack around this on the DT end and 
try to push some sort of convoluted workaround there, instead of fixing 
it on the driver side (and bootloader side, in the long run).

>     Please mind, the tools have been updated very recently to match
>     what I am describing above, so they will likely still report
>     errors until v6.2-rc1, see:
>     https://lore.kernel.org/linux-mtd/20221114090315.848208-1-miquel.raynal@bootlin.com/
> 
> Does this sound reasonable?

[...]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 14:31             ` Marek Vasut
@ 2022-12-02 15:00               ` Miquel Raynal
  2022-12-02 15:23                 ` Marek Vasut
  2022-12-02 15:56               ` Thorsten Leemhuis
  1 sibling, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 15:00 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hi Marek,

marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:

> On 12/2/22 15:05, Miquel Raynal wrote:
> > Hi Francesco,  
> 
> Hi,
> 
> [...]
> 
> > I still strongly disagree with the initial proposal but what I think we
> > can do is:
> > 
> > 1. To prevent future breakages:
> >    Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
> >    kernel should work.
> > 
> > 2. To help tracking down situations like that:
> >    Keep the warning in ofpart.c but continue to fail.
> > 
> > 3. To fix the current situation:
> >     Immediately revert commit (and prevent it from being backported):
> >     753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> >     This way your own boot flow is fixed in the short term.  
> 
> Here I disagree, the fix is correct and I think we shouldn't
> proliferate incorrect DTs which don't match the binding document.

I agree we should not proliferate incorrect DTs, so let's use a modern
description then, with a controller and a child node which defines the
chip.

> Rather, if a bootloader generates incorrect (new) DT entries, I
> believe the driver should implement a fixup and warn user about this.
> PC does that as well with broken ACPI tables as far as I can tell.
> 
> I'm not convinced making a DT non-compliant with bindings again,

I am sorry to say so, but while warnings reported by the tools
should be fixed, it's not because the tool does not scream at you that
the description is valid. We are actively working on enhancing the
schema so that "all" improper descriptions get warnings (see the series
pointed earlier), but in no way this change makes the node compliant
with modern bindings.

I'm not saying the fix is wrong, but let's be pragmatic, it currently
leads to boot failures.

> only to work around a problem induced by bootloader, is the right approach
> here.

When a patch breaks a board and there is no straight fix, you revert
it, then you think harder. That's what I am saying. This is a temporary
solution.

> This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke.

Please, you know this is not valid DT clean up. We've been decoupling
controller and chip description since 2016. What I am proposing is a
valid DT cleanup, not to the latest standard, but way closer than the
current solution.

> > 4. There is no reason to partially fix a DT like what the above did
> >     besides trying to avoid warnings emitted by the DT check tools.  
> 
> Note that the 3. does not partially fix the DT, it fixes the node fully.
> 
> >     If
> >     complying with modern bindings is a goal (and I think it should
> >     be), then we can modernize this DT without breaking the boot flow:
> >     Instead of only setting #size-cell = <0>, you can as well define
> >     in your DT a subnode to define the NAND chip. NAND chips are not
> >     supposed to have #size-cells properties, but in the past they did,
> >     which means #address-cells and #size-cells are allowed (and marked
> >     deprecated in the schema). So in practice, the dt-schema will not
> >     warn you if they are there, which means you can still set
> >     #size-cell = <1>.  
> 
> I am really not convinced we should hack around this on the DT end and try to push some sort of convoluted workaround there,

"convoluted workaround" :-)

> instead of fixing it on the driver side (and bootloader side, in the
> long run).
>
> >     Please mind, the tools have been updated very recently to match

s/tools/schema/

> >     what I am describing above, so they will likely still report
> >     errors until v6.2-rc1, see:
> >     https://lore.kernel.org/linux-mtd/20221114090315.848208-1-miquel.raynal@bootlin.com/
> > 
> > Does this sound reasonable?  
> 
> [...]


Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 15:00               ` Miquel Raynal
@ 2022-12-02 15:23                 ` Marek Vasut
  2022-12-02 15:49                   ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-02 15:23 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On 12/2/22 16:00, Miquel Raynal wrote:
> Hi Marek,

Hi,

> marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:
> 
>> On 12/2/22 15:05, Miquel Raynal wrote:
>>> Hi Francesco,
>>
>> Hi,
>>
>> [...]
>>
>>> I still strongly disagree with the initial proposal but what I think we
>>> can do is:
>>>
>>> 1. To prevent future breakages:
>>>     Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
>>>     kernel should work.
>>>
>>> 2. To help tracking down situations like that:
>>>     Keep the warning in ofpart.c but continue to fail.
>>>
>>> 3. To fix the current situation:
>>>      Immediately revert commit (and prevent it from being backported):
>>>      753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>>>      This way your own boot flow is fixed in the short term.
>>
>> Here I disagree, the fix is correct and I think we shouldn't
>> proliferate incorrect DTs which don't match the binding document.
> 
> I agree we should not proliferate incorrect DTs, so let's use a modern
> description then

Yes please !

> , with a controller and a child node which defines the
> chip.

But what if there is no chip connected to the controller node ?

If I understand the proposal here right (please correct me if I'm 
wrong), then:

1) This is the original, old, wrong binding:
&gpmi {
   #size-cells = <1>;
   ...
   partition@N { ... };
};


2) This is the newer, but still wrong binding:
&gpmi {
   #size-cells = <0>;
   ...
   partitions {
     partition@N { ... };
   };
};

3) This is the newest binding, what we want:
&gpmi {
   #size-cells = <0>;
   ...
   nand-chip {
     partitions {
       partition@N { ... };
     };
   };
};

But if there is no physical nand chip connected to the controller, would 
we end up with empty nand-chip node in DT, like this?
&gpmi {
   #size-cells = <X>;
   ...
   nand-chip { /* empty */ };
};
What would be the gpmi controller size cells (X) in that case, still 0, 
right ? So how does that help solve this problem, wouldn't U-Boot still 
populate the partitions directly under the gpmi node or into partitions 
sub-node ?

>> Rather, if a bootloader generates incorrect (new) DT entries, I
>> believe the driver should implement a fixup and warn user about this.
>> PC does that as well with broken ACPI tables as far as I can tell.
>>
>> I'm not convinced making a DT non-compliant with bindings again,
> 
> I am sorry to say so, but while warnings reported by the tools
> should be fixed, it's not because the tool does not scream at you that
> the description is valid. We are actively working on enhancing the
> schema so that "all" improper descriptions get warnings (see the series
> pointed earlier), but in no way this change makes the node compliant
> with modern bindings.
> 
> I'm not saying the fix is wrong, but let's be pragmatic, it currently
> leads to boot failures.

I fully agree that we do have a problem, and that it trickled into 
stable makes it even worse. Maybe I don't fully understand the thing 
with nand-chip proposal, see my question above, esp. the last part.

>> only to work around a problem induced by bootloader, is the right approach
>> here.
> 
> When a patch breaks a board and there is no straight fix, you revert
> it, then you think harder. That's what I am saying. This is a temporary
> solution.

Isn't this patch the straight fix, at least until the bootloader can be 
updated to generate the nand-chip node correctly ?

>> This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke.
> 
> Please, you know this is not valid DT clean up. We've been decoupling
> controller and chip description since 2016. What I am proposing is a
> valid DT cleanup, not to the latest standard, but way closer than the
> current solution.

I think I really need one more explanation of the nand-chip part above.

[...]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 15:23                 ` Marek Vasut
@ 2022-12-02 15:49                   ` Miquel Raynal
  2022-12-02 16:01                     ` Miquel Raynal
  2022-12-02 16:17                     ` Marek Vasut
  0 siblings, 2 replies; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 15:49 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hi Marek,

marex@denx.de wrote on Fri, 2 Dec 2022 16:23:29 +0100:

> On 12/2/22 16:00, Miquel Raynal wrote:
> > Hi Marek,  
> 
> Hi,
> 
> > marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:
> >   
> >> On 12/2/22 15:05, Miquel Raynal wrote:  
> >>> Hi Francesco,  
> >>
> >> Hi,
> >>
> >> [...]
> >>  
> >>> I still strongly disagree with the initial proposal but what I think we
> >>> can do is:
> >>>
> >>> 1. To prevent future breakages:
> >>>     Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
> >>>     kernel should work.
> >>>
> >>> 2. To help tracking down situations like that:
> >>>     Keep the warning in ofpart.c but continue to fail.
> >>>
> >>> 3. To fix the current situation:
> >>>      Immediately revert commit (and prevent it from being backported):
> >>>      753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> >>>      This way your own boot flow is fixed in the short term.  
> >>
> >> Here I disagree, the fix is correct and I think we shouldn't
> >> proliferate incorrect DTs which don't match the binding document.  
> > 
> > I agree we should not proliferate incorrect DTs, so let's use a modern
> > description then  
> 
> Yes please !
> 
> > , with a controller and a child node which defines the
> > chip.  
> 
> But what if there is no chip connected to the controller node ?
> 
> If I understand the proposal here right (please correct me if I'm wrong), then:

Good idea to summarize.

> 
> 1) This is the original, old, wrong binding:
> &gpmi {
>    #size-cells = <1>;
>    ...
>    partition@N { ... };
> };

Yes.

> 
> 
> 2) This is the newer, but still wrong binding:
> &gpmi {
>    #size-cells = <0>;
>    ...
>    partitions {
>      partition@N { ... };
>    };
> };

Well, this is wrong description, but it would work (for compat reasons,
even though I don't think this is considered valid DT by the schemas).

> 
> 3) This is the newest binding, what we want:
> &gpmi {
>    #size-cells = <0>;
>    ...
>    nand-chip {
>      partitions {
>        partition@N { ... };
>      };
>    };
> };

Yes

> 
> But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this?
> &gpmi {
>    #size-cells = <X>;
>    ...
>    nand-chip { /* empty */ };
> };

Is this really a concern? If there is no NAND chip, the controller
should be disabled, no? I guess technically you could even use the
status property in the nand-chip node...

However, it should not be empty, at the very least a reg property
should indicate on which CS it is wired, as expected there:
https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next

But, as nand-chip.yaml references mtd.yaml, you can as well use
whatever is described here:
https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next

> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ?

The commit that was pointed in the original fix clearly stated that the
NAND chip node was targeted, not the NAND controller node. I hope this
is correctly supported in U-Boot though. So if there is a NAND chip
subnode, I suppose U-Boot would try to create the partitions that are
inside, or even in the sub "partitions" container.

> >> Rather, if a bootloader generates incorrect (new) DT entries, I
> >> believe the driver should implement a fixup and warn user about this.
> >> PC does that as well with broken ACPI tables as far as I can tell.
> >>
> >> I'm not convinced making a DT non-compliant with bindings again,  
> > 
> > I am sorry to say so, but while warnings reported by the tools
> > should be fixed, it's not because the tool does not scream at you that
> > the description is valid. We are actively working on enhancing the
> > schema so that "all" improper descriptions get warnings (see the series
> > pointed earlier), but in no way this change makes the node compliant
> > with modern bindings.
> > 
> > I'm not saying the fix is wrong, but let's be pragmatic, it currently
> > leads to boot failures.  
> 
> I fully agree that we do have a problem, and that it trickled into stable makes it even worse. Maybe I don't fully understand the thing with nand-chip proposal, see my question above, esp. the last part.
> 
> >> only to work around a problem induced by bootloader, is the right approach
> >> here.  
> > 
> > When a patch breaks a board and there is no straight fix, you revert
> > it, then you think harder. That's what I am saying. This is a temporary
> > solution.  
> 
> Isn't this patch the straight fix, at least until the bootloader can be updated to generate the nand-chip node correctly ?
> 
> >> This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke.  
> > 
> > Please, you know this is not valid DT clean up. We've been decoupling
> > controller and chip description since 2016. What I am proposing is a
> > valid DT cleanup, not to the latest standard, but way closer than the
> > current solution.  
> 
> I think I really need one more explanation of the nand-chip part above.

I hope things are clearer now.

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 14:31             ` Marek Vasut
  2022-12-02 15:00               ` Miquel Raynal
@ 2022-12-02 15:56               ` Thorsten Leemhuis
  2022-12-04 12:50                 ` Marek Vasut
  1 sibling, 1 reply; 48+ messages in thread
From: Thorsten Leemhuis @ 2022-12-02 15:56 UTC (permalink / raw)
  To: Marek Vasut, Miquel Raynal, Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 02.12.22 15:31, Marek Vasut wrote:
> On 12/2/22 15:05, Miquel Raynal wrote:
> [...]
>> 3. To fix the current situation:
>>     Immediately revert commit (and prevent it from being backported):
>>     753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>>     This way your own boot flow is fixed in the short term.
> 
> Here I disagree, the fix is correct and I think we shouldn't proliferate
> incorrect DTs which don't match the binding document. Rather, if a
> bootloader generates incorrect (new) DT entries, I believe the driver
> should implement a fixup and warn user about this. PC does that as well
> with broken ACPI tables as far as I can tell.

Well, that might be the right solution in the long run, that's up for
others to decide, but we need to fix this *quickly*. For two reasons
actually: the 6.1 release is near and the change was backported to
stable already.

For details wrt to the "quickly", see "Prioritize work on fixing
regressions" here:
https://docs.kernel.org/process/handling-regressions.html

IOW: Ideally it should be fixed by Sunday.

I'll hence likely soon will point Linus to this and suggest to revert
this, unless there are strong reasons against that or some sort of
agreement on a better solution.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 15:49                   ` Miquel Raynal
@ 2022-12-02 16:01                     ` Miquel Raynal
  2022-12-02 16:17                     ` Marek Vasut
  1 sibling, 0 replies; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 16:01 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot


miquel.raynal@bootlin.com wrote on Fri, 2 Dec 2022 16:49:04 +0100:

> Hi Marek,
> 
> marex@denx.de wrote on Fri, 2 Dec 2022 16:23:29 +0100:
> 
> > On 12/2/22 16:00, Miquel Raynal wrote:  
> > > Hi Marek,    
> > 
> > Hi,
> >   
> > > marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:
> > >     
> > >> On 12/2/22 15:05, Miquel Raynal wrote:    
> > >>> Hi Francesco,    
> > >>
> > >> Hi,
> > >>
> > >> [...]
> > >>    
> > >>> I still strongly disagree with the initial proposal but what I think we
> > >>> can do is:
> > >>>
> > >>> 1. To prevent future breakages:
> > >>>     Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
> > >>>     kernel should work.
> > >>>
> > >>> 2. To help tracking down situations like that:
> > >>>     Keep the warning in ofpart.c but continue to fail.
> > >>>
> > >>> 3. To fix the current situation:
> > >>>      Immediately revert commit (and prevent it from being backported):
> > >>>      753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> > >>>      This way your own boot flow is fixed in the short term.    
> > >>
> > >> Here I disagree, the fix is correct and I think we shouldn't
> > >> proliferate incorrect DTs which don't match the binding document.    
> > > 
> > > I agree we should not proliferate incorrect DTs, so let's use a modern
> > > description then    
> > 
> > Yes please !
> >   
> > > , with a controller and a child node which defines the
> > > chip.    
> > 
> > But what if there is no chip connected to the controller node ?
> > 
> > If I understand the proposal here right (please correct me if I'm wrong), then:  
> 
> Good idea to summarize.
> 
> > 
> > 1) This is the original, old, wrong binding:
> > &gpmi {
> >    #size-cells = <1>;
> >    ...
> >    partition@N { ... };
> > };  
> 
> Yes.
> 
> > 
> > 
> > 2) This is the newer, but still wrong binding:
> > &gpmi {
> >    #size-cells = <0>;
> >    ...
> >    partitions {
> >      partition@N { ... };
> >    };
> > };  
> 
> Well, this is wrong description, but it would work (for compat reasons,
> even though I don't think this is considered valid DT by the schemas).
> 
> > 
> > 3) This is the newest binding, what we want:
> > &gpmi {
> >    #size-cells = <0>;
> >    ...
> >    nand-chip {
> >      partitions {
> >        partition@N { ... };
> >      };
> >    };
> > };  
> 
> Yes

Perhaps I should also mention that #size-cells expected to be 0 has
nothing to do with the "partitions" container (otherwise #address-cells
would be 0 as well). This value is however asking for an address-only
reg property describing which NAND chip should be addressed and how,
basically the NAND controller CS because you can wire your NAND to
any CS.

> > But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this?
> > &gpmi {
> >    #size-cells = <X>;
> >    ...
> >    nand-chip { /* empty */ };
> > };  
> 
> Is this really a concern? If there is no NAND chip, the controller
> should be disabled, no? I guess technically you could even use the
> status property in the nand-chip node...
> 
> However, it should not be empty, at the very least a reg property
> should indicate on which CS it is wired, as expected there:
> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next
> 
> But, as nand-chip.yaml references mtd.yaml, you can as well use
> whatever is described here:
> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next
> 
> > What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ?  
> 
> The commit that was pointed in the original fix clearly stated that the
> NAND chip node was targeted, not the NAND controller node. I hope this
> is correctly supported in U-Boot though. So if there is a NAND chip
> subnode, I suppose U-Boot would try to create the partitions that are
> inside, or even in the sub "partitions" container.
> 
> > >> Rather, if a bootloader generates incorrect (new) DT entries, I
> > >> believe the driver should implement a fixup and warn user about this.
> > >> PC does that as well with broken ACPI tables as far as I can tell.
> > >>
> > >> I'm not convinced making a DT non-compliant with bindings again,    
> > > 
> > > I am sorry to say so, but while warnings reported by the tools
> > > should be fixed, it's not because the tool does not scream at you that
> > > the description is valid. We are actively working on enhancing the
> > > schema so that "all" improper descriptions get warnings (see the series
> > > pointed earlier), but in no way this change makes the node compliant
> > > with modern bindings.
> > > 
> > > I'm not saying the fix is wrong, but let's be pragmatic, it currently
> > > leads to boot failures.    
> > 
> > I fully agree that we do have a problem, and that it trickled into stable makes it even worse. Maybe I don't fully understand the thing with nand-chip proposal, see my question above, esp. the last part.
> >   
> > >> only to work around a problem induced by bootloader, is the right approach
> > >> here.    
> > > 
> > > When a patch breaks a board and there is no straight fix, you revert
> > > it, then you think harder. That's what I am saying. This is a temporary
> > > solution.    
> > 
> > Isn't this patch the straight fix, at least until the bootloader can be updated to generate the nand-chip node correctly ?
> >   
> > >> This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke.    
> > > 
> > > Please, you know this is not valid DT clean up. We've been decoupling
> > > controller and chip description since 2016. What I am proposing is a
> > > valid DT cleanup, not to the latest standard, but way closer than the
> > > current solution.    
> > 
> > I think I really need one more explanation of the nand-chip part above.  
> 
> I hope things are clearer now.
> 
> Thanks,
> Miquèl


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 15:49                   ` Miquel Raynal
  2022-12-02 16:01                     ` Miquel Raynal
@ 2022-12-02 16:17                     ` Marek Vasut
  2022-12-02 16:42                       ` Miquel Raynal
  2022-12-02 16:45                       ` Francesco Dolcini
  1 sibling, 2 replies; 48+ messages in thread
From: Marek Vasut @ 2022-12-02 16:17 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On 12/2/22 16:49, Miquel Raynal wrote:
> Hi Marek,

Hi,

>> On 12/2/22 16:00, Miquel Raynal wrote:
>>> Hi Marek,
>>
>> Hi,
>>
>>> marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:
>>>    
>>>> On 12/2/22 15:05, Miquel Raynal wrote:
>>>>> Hi Francesco,
>>>>
>>>> Hi,
>>>>
>>>> [...]
>>>>   
>>>>> I still strongly disagree with the initial proposal but what I think we
>>>>> can do is:
>>>>>
>>>>> 1. To prevent future breakages:
>>>>>      Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
>>>>>      kernel should work.
>>>>>
>>>>> 2. To help tracking down situations like that:
>>>>>      Keep the warning in ofpart.c but continue to fail.
>>>>>
>>>>> 3. To fix the current situation:
>>>>>       Immediately revert commit (and prevent it from being backported):
>>>>>       753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>>>>>       This way your own boot flow is fixed in the short term.
>>>>
>>>> Here I disagree, the fix is correct and I think we shouldn't
>>>> proliferate incorrect DTs which don't match the binding document.
>>>
>>> I agree we should not proliferate incorrect DTs, so let's use a modern
>>> description then
>>
>> Yes please !
>>
>>> , with a controller and a child node which defines the
>>> chip.
>>
>> But what if there is no chip connected to the controller node ?
>>
>> If I understand the proposal here right (please correct me if I'm wrong), then:
> 
> Good idea to summarize.
> 
>>
>> 1) This is the original, old, wrong binding:
>> &gpmi {
>>     #size-cells = <1>;
>>     ...
>>     partition@N { ... };
>> };
> 
> Yes.
> 
>>
>>
>> 2) This is the newer, but still wrong binding:
>> &gpmi {
>>     #size-cells = <0>;
>>     ...
>>     partitions {
>>       partition@N { ... };
>>     };
>> };
> 
> Well, this is wrong description, but it would work (for compat reasons,
> even though I don't think this is considered valid DT by the schemas).
> 
>>
>> 3) This is the newest binding, what we want:
>> &gpmi {
>>     #size-cells = <0>;
>>     ...
>>     nand-chip {
>>       partitions {
>>         partition@N { ... };
>>       };
>>     };
>> };
> 
> Yes
> 
>>
>> But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this?
>> &gpmi {
>>     #size-cells = <X>;
>>     ...
>>     nand-chip { /* empty */ };
>> };
> 
> Is this really a concern?

I don't know, maybe it is not.

> If there is no NAND chip, the controller
> should be disabled, no? I guess technically you could even use the
> status property in the nand-chip node...

Sure.

> However, it should not be empty, at the very least a reg property
> should indicate on which CS it is wired, as expected there:
> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next

OK, I see your point. So basically this?

&gpmi {
   #size-cells = <1>;
   ...
   nand-chip@0 {
     reg = <0>;
   };
};

btw. the GPMI NAND controller supports only one chipselect, so the reg 
in nand-chip node makes little sense.

> But, as nand-chip.yaml references mtd.yaml, you can as well use
> whatever is described here:
> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next
> 
>> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ?
> 
> The commit that was pointed in the original fix clearly stated that the
> NAND chip node was targeted

I think this is another miscommunication here. The commit

753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")

modifies the size-cells of the NAND controller. The nand-chip is not 
involved in this at all . In the examples above, it's the "&gpmi" node 
size-cells that is modified.

> , not the NAND controller node. I hope this
> is correctly supported in U-Boot though. So if there is a NAND chip
> subnode, I suppose U-Boot would try to create the partitions that are
> inside, or even in the sub "partitions" container.

My understanding is that U-Boot checks the nand-controller node 
size-cells, not the nand-chip{} or partitions{} subnode size-cells .

Francesco, can you please share the DT, including the U-Boot generated 
partitions, which is passed to Linux on Colibri MX7 ? I think that 
should make all confusion go away.

(or am I the only one who's still confused here?)

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 16:17                     ` Marek Vasut
@ 2022-12-02 16:42                       ` Miquel Raynal
  2022-12-02 16:52                         ` Marek Vasut
  2022-12-02 17:20                         ` Francesco Dolcini
  2022-12-02 16:45                       ` Francesco Dolcini
  1 sibling, 2 replies; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 16:42 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hi Marek,

marex@denx.de wrote on Fri, 2 Dec 2022 17:17:59 +0100:

> On 12/2/22 16:49, Miquel Raynal wrote:
> > Hi Marek,  
> 
> Hi,
> 
> >> On 12/2/22 16:00, Miquel Raynal wrote:  
> >>> Hi Marek,  
> >>
> >> Hi,
> >>  
> >>> marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100:  
> >>>    >>>> On 12/2/22 15:05, Miquel Raynal wrote:  
> >>>>> Hi Francesco,  
> >>>>
> >>>> Hi,
> >>>>
> >>>> [...]  
> >>>>   >>>>> I still strongly disagree with the initial proposal but what I think we  
> >>>>> can do is:
> >>>>>
> >>>>> 1. To prevent future breakages:
> >>>>>      Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any
> >>>>>      kernel should work.
> >>>>>
> >>>>> 2. To help tracking down situations like that:
> >>>>>      Keep the warning in ofpart.c but continue to fail.
> >>>>>
> >>>>> 3. To fix the current situation:
> >>>>>       Immediately revert commit (and prevent it from being backported):
> >>>>>       753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> >>>>>       This way your own boot flow is fixed in the short term.  
> >>>>
> >>>> Here I disagree, the fix is correct and I think we shouldn't
> >>>> proliferate incorrect DTs which don't match the binding document.  
> >>>
> >>> I agree we should not proliferate incorrect DTs, so let's use a modern
> >>> description then  
> >>
> >> Yes please !
> >>  
> >>> , with a controller and a child node which defines the
> >>> chip.  
> >>
> >> But what if there is no chip connected to the controller node ?
> >>
> >> If I understand the proposal here right (please correct me if I'm wrong), then:  
> > 
> > Good idea to summarize.
> >   
> >>
> >> 1) This is the original, old, wrong binding:
> >> &gpmi {
> >>     #size-cells = <1>;
> >>     ...
> >>     partition@N { ... };
> >> };  
> > 
> > Yes.
> >   
> >>
> >>
> >> 2) This is the newer, but still wrong binding:
> >> &gpmi {
> >>     #size-cells = <0>;
> >>     ...
> >>     partitions {
> >>       partition@N { ... };
> >>     };
> >> };  
> > 
> > Well, this is wrong description, but it would work (for compat reasons,
> > even though I don't think this is considered valid DT by the schemas).
> >   
> >>
> >> 3) This is the newest binding, what we want:
> >> &gpmi {
> >>     #size-cells = <0>;
> >>     ...
> >>     nand-chip {
> >>       partitions {
> >>         partition@N { ... };
> >>       };
> >>     };
> >> };  
> > 
> > Yes
> >   
> >>
> >> But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this?
> >> &gpmi {
> >>     #size-cells = <X>;
> >>     ...
> >>     nand-chip { /* empty */ };
> >> };  
> > 
> > Is this really a concern?  
> 
> I don't know, maybe it is not.
> 
> > If there is no NAND chip, the controller
> > should be disabled, no? I guess technically you could even use the
> > status property in the nand-chip node...  
> 
> Sure.
> 
> > However, it should not be empty, at the very least a reg property
> > should indicate on which CS it is wired, as expected there:
> > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next  
> 
> OK, I see your point. So basically this?
> 
> &gpmi {
>    #size-cells = <1>;
>    ...
>    nand-chip@0 {
>      reg = <0>;
>    };
> };
> 
> btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense.

I randomly opened a reference manual (IMX6DQL.pdf), they say:

	"Up to four NAND devices, supported by four chip-selects and one
	 ganged ready/ busy."

Anyway, the NAND controller generic bindings which require this reg
property, what the controller or the driver actually supports, or even
how it is used on current designs is not relevant here.

> > But, as nand-chip.yaml references mtd.yaml, you can as well use
> > whatever is described here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next
> >   
> >> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ?  
> > 
> > The commit that was pointed in the original fix clearly stated that the
> > NAND chip node was targeted  
> 
> I think this is another miscommunication here. The commit
> 
> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> 
> modifies the size-cells of the NAND controller. The nand-chip is not involved in this at all . In the examples above, it's the "&gpmi" node size-cells that is modified.

Yes I know. I was referring to this commit, sorry:
36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")

The log says:

	Listing MTD partitions directly in the flash mode has been
	deprecated for a while for kernel Device Trees. Look for a node "partitions" in the
	found flash nodes and use it instead of the flash node itself for the
	partition list when it exists, so Device Trees following the current
	best practices can be fixed up.

Which (I hope) means U-boot will equivalently try to play with the
partitions container, either in the controller node or in the chip node.

> > , not the NAND controller node. I hope this
> > is correctly supported in U-Boot though. So if there is a NAND chip
> > subnode, I suppose U-Boot would try to create the partitions that are
> > inside, or even in the sub "partitions" container.  
> 
> My understanding is that U-Boot checks the nand-controller node size-cells, not the nand-chip{} or partitions{} subnode size-cells .

I don't think U-Boot cares.

> Francesco, can you please share the DT, including the U-Boot generated partitions, which is passed to Linux on Colibri MX7 ? I think that should make all confusion go away.

Please also do it with the NAND chip described. If, when the NAND chip
is described U-Boot tries to create partitions in the controller node,
then the situation is even worse than I thought. But I believe
describing the node like a suggest in the DT should prevent the boot
failure while still allowing a rather good description of the hardware.

BTW I still think the relevant action right now is to revert the DT
patch.

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 16:17                     ` Marek Vasut
  2022-12-02 16:42                       ` Miquel Raynal
@ 2022-12-02 16:45                       ` Francesco Dolcini
  2022-12-02 17:05                         ` Miquel Raynal
  1 sibling, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-02 16:45 UTC (permalink / raw)
  To: Marek Vasut, Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On Fri, Dec 02, 2022 at 05:17:59PM +0100, Marek Vasut wrote:
> On 12/2/22 16:49, Miquel Raynal wrote:
> > , not the NAND controller node. I hope this
> > is correctly supported in U-Boot though. So if there is a NAND chip
> > subnode, I suppose U-Boot would try to create the partitions that are
> > inside, or even in the sub "partitions" container.
> 
> My understanding is that U-Boot checks the nand-controller node size-cells,
> not the nand-chip{} or partitions{} subnode size-cells .
Not 100% correct.

 - U-Boot before v2022.04 updates the nand-controller{} node, no matter what.
 - U-Boot starting from v2022.04 looks for `partitions{}` into the
   nand-controller{} node, and creates the partition into it if found.
   If not found it behaves the same way as the previous versions.
   See commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")

I'd like to stress once more the fact that we cannot expect old U-Boot
to be updated in the field, and they will keep generating the partitions
as child of the nand-controller node whatever we do with the dts file.

I think that this should be treated the same way as any other fixup we
might have for broken firmware, especially considering that this used to
"work" (yes, I can agree that it horrible, but I cannot change the past)
without even a warning since the imx7 support was first introduced in
the linux kernel years ago.

> Francesco, can you please share the DT, including the U-Boot generated
> partitions, which is passed to Linux on Colibri MX7 ? I think that should
> make all confusion go away.

The device tree part is easy, just
arch/arm/boot/dts/imx7d-colibri-eval-v3.dts.

and the nand-controller node is coming from

#include "imx7d.dtsi"

plus

&gpmi {
	fsl,use-minimum-ecc;
	nand-ecc-mode = "hw";
	nand-on-flash-bbt;
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_gpmi_nand>;
};

The partitions nodes are generated 100% by U-Boot, nothing is present in
the dts source files.

With this DTS file as input, whatever U-Boot version is used I have the
following generated:

root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/
#address-cells          dma-names               nand-on-flash-bbt       pinctrl-0
#size-cells             dmas                    partition@0             pinctrl-names
assigned-clock-parents  fsl,use-minimum-ecc     partition@200000        reg
assigned-clocks         interrupt-names         partition@380000        reg-names
clock-names             interrupts              partition@400000        status
clocks                  name                    partition@80000
compatible              nand-ecc-mode           phandle

root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/partition@*
/proc/device-tree/soc/nand-controller@33002000/partition@0:
label  name   reg

/proc/device-tree/soc/nand-controller@33002000/partition@200000:
label      name       read_only  reg

/proc/device-tree/soc/nand-controller@33002000/partition@380000:
label  name   reg

/proc/device-tree/soc/nand-controller@33002000/partition@400000:
label  name   reg

/proc/device-tree/soc/nand-controller@33002000/partition@80000:
label      name       read_only  reg



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 16:42                       ` Miquel Raynal
@ 2022-12-02 16:52                         ` Marek Vasut
  2022-12-02 16:57                           ` Miquel Raynal
  2022-12-02 17:20                         ` Francesco Dolcini
  1 sibling, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-02 16:52 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On 12/2/22 17:42, Miquel Raynal wrote:
> Hi Marek,

Hi,

[...]

>>> However, it should not be empty, at the very least a reg property
>>> should indicate on which CS it is wired, as expected there:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next
>>
>> OK, I see your point. So basically this?
>>
>> &gpmi {
>>     #size-cells = <1>;
>>     ...
>>     nand-chip@0 {
>>       reg = <0>;
>>     };
>> };
>>
>> btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense.
> 
> I randomly opened a reference manual (IMX6DQL.pdf), they say:
> 
> 	"Up to four NAND devices, supported by four chip-selects and one
> 	 ganged ready/ busy."

Doh, and MX7D has the same controller, so size-cells = <1>; makes sense 
with nand-chip@N {} .

> Anyway, the NAND controller generic bindings which require this reg
> property, what the controller or the driver actually supports, or even
> how it is used on current designs is not relevant here.
> 
>>> But, as nand-chip.yaml references mtd.yaml, you can as well use
>>> whatever is described here:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next
>>>    
>>>> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ?
>>>
>>> The commit that was pointed in the original fix clearly stated that the
>>> NAND chip node was targeted
>>
>> I think this is another miscommunication here. The commit
>>
>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>>
>> modifies the size-cells of the NAND controller. The nand-chip is not involved in this at all . In the examples above, it's the "&gpmi" node size-cells that is modified.
> 
> Yes I know. I was referring to this commit, sorry:
> 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")
> 
> The log says:
> 
> 	Listing MTD partitions directly in the flash mode has been
> 	deprecated for a while for kernel Device Trees. Look for a node "partitions" in the
> 	found flash nodes and use it instead of the flash node itself for the
> 	partition list when it exists, so Device Trees following the current
> 	best practices can be fixed up.
> 
> Which (I hope) means U-boot will equivalently try to play with the
> partitions container, either in the controller node or in the chip node.
> 
>>> , not the NAND controller node. I hope this
>>> is correctly supported in U-Boot though. So if there is a NAND chip
>>> subnode, I suppose U-Boot would try to create the partitions that are
>>> inside, or even in the sub "partitions" container.
>>
>> My understanding is that U-Boot checks the nand-controller node size-cells, not the nand-chip{} or partitions{} subnode size-cells .
> 
> I don't think U-Boot cares.
> 
>> Francesco, can you please share the DT, including the U-Boot generated partitions, which is passed to Linux on Colibri MX7 ? I think that should make all confusion go away.
> 
> Please also do it with the NAND chip described. If, when the NAND chip
> is described U-Boot tries to create partitions in the controller node,
> then the situation is even worse than I thought. But I believe
> describing the node like a suggest in the DT should prevent the boot
> failure while still allowing a rather good description of the hardware.
> 
> BTW I still think the relevant action right now is to revert the DT
> patch.

I am starting to bank toward that variant as well (thanks for clarifying 
the rationale in the discussion, that helped a lot).

But then, the follow up fix would be what exactly, update the binding 
document to require #size-cells = <1>; ?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 16:52                         ` Marek Vasut
@ 2022-12-02 16:57                           ` Miquel Raynal
  2022-12-02 17:08                             ` Marek Vasut
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 16:57 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hi Marek,

marex@denx.de wrote on Fri, 2 Dec 2022 17:52:05 +0100:

> On 12/2/22 17:42, Miquel Raynal wrote:
> > Hi Marek,  
> 
> Hi,
> 
> [...]
> 
> >>> However, it should not be empty, at the very least a reg property
> >>> should indicate on which CS it is wired, as expected there:
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next  
> >>
> >> OK, I see your point. So basically this?
> >>
> >> &gpmi {
> >>     #size-cells = <1>;
> >>     ...
> >>     nand-chip@0 {
> >>       reg = <0>;
> >>     };
> >> };
> >>
> >> btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense.  
> > 
> > I randomly opened a reference manual (IMX6DQL.pdf), they say:
> > 
> > 	"Up to four NAND devices, supported by four chip-selects and one
> > 	 ganged ready/ busy."  
> 
> Doh, and MX7D has the same controller, so size-cells = <1>; makes sense with nand-chip@N {} .

Actually #address-cells is here for that. You need to point at one CS,
so in most cases this is:

controller {
	#address-cells = <1>;
	#size-cells = <0>;
	chip@N {
		reg = <N>;
	};
};

> 
> > Anyway, the NAND controller generic bindings which require this reg
> > property, what the controller or the driver actually supports, or even
> > how it is used on current designs is not relevant here.
> >   
> >>> But, as nand-chip.yaml references mtd.yaml, you can as well use
> >>> whatever is described here:
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next  
> >>>    >>>> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ?  
> >>>
> >>> The commit that was pointed in the original fix clearly stated that the
> >>> NAND chip node was targeted  
> >>
> >> I think this is another miscommunication here. The commit
> >>
> >> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
> >>
> >> modifies the size-cells of the NAND controller. The nand-chip is not involved in this at all . In the examples above, it's the "&gpmi" node size-cells that is modified.  
> > 
> > Yes I know. I was referring to this commit, sorry:
> > 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")
> > 
> > The log says:
> > 
> > 	Listing MTD partitions directly in the flash mode has been
> > 	deprecated for a while for kernel Device Trees. Look for a node "partitions" in the
> > 	found flash nodes and use it instead of the flash node itself for the
> > 	partition list when it exists, so Device Trees following the current
> > 	best practices can be fixed up.
> > 
> > Which (I hope) means U-boot will equivalently try to play with the
> > partitions container, either in the controller node or in the chip node.
> >   
> >>> , not the NAND controller node. I hope this
> >>> is correctly supported in U-Boot though. So if there is a NAND chip
> >>> subnode, I suppose U-Boot would try to create the partitions that are
> >>> inside, or even in the sub "partitions" container.  
> >>
> >> My understanding is that U-Boot checks the nand-controller node size-cells, not the nand-chip{} or partitions{} subnode size-cells .  
> > 
> > I don't think U-Boot cares.
> >   
> >> Francesco, can you please share the DT, including the U-Boot generated partitions, which is passed to Linux on Colibri MX7 ? I think that should make all confusion go away.  
> > 
> > Please also do it with the NAND chip described. If, when the NAND chip
> > is described U-Boot tries to create partitions in the controller node,
> > then the situation is even worse than I thought. But I believe
> > describing the node like a suggest in the DT should prevent the boot
> > failure while still allowing a rather good description of the hardware.
> > 
> > BTW I still think the relevant action right now is to revert the DT
> > patch.  
> 
> I am starting to bank toward that variant as well (thanks for clarifying the rationale in the discussion, that helped a lot).
> 
> But then, the follow up fix would be what exactly, update the binding document to require #size-cells = <1>; ?


Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 16:45                       ` Francesco Dolcini
@ 2022-12-02 17:05                         ` Miquel Raynal
  0 siblings, 0 replies; 48+ messages in thread
From: Miquel Raynal @ 2022-12-02 17:05 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Marek Vasut, Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

Hi Francesco,

francesco@dolcini.it wrote on Fri, 2 Dec 2022 17:45:37 +0100:

> On Fri, Dec 02, 2022 at 05:17:59PM +0100, Marek Vasut wrote:
> > On 12/2/22 16:49, Miquel Raynal wrote:  
> > > , not the NAND controller node. I hope this
> > > is correctly supported in U-Boot though. So if there is a NAND chip
> > > subnode, I suppose U-Boot would try to create the partitions that are
> > > inside, or even in the sub "partitions" container.  
> > 
> > My understanding is that U-Boot checks the nand-controller node size-cells,
> > not the nand-chip{} or partitions{} subnode size-cells .  
> Not 100% correct.
> 
>  - U-Boot before v2022.04 updates the nand-controller{} node, no matter what.
>  - U-Boot starting from v2022.04 looks for `partitions{}` into the
>    nand-controller{} node, and creates the partition into it if found.
>    If not found it behaves the same way as the previous versions.
>    See commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")
> 
> I'd like to stress once more the fact that we cannot expect old U-Boot
> to be updated in the field, and they will keep generating the partitions
> as child of the nand-controller node whatever we do with the dts file.
> 
> I think that this should be treated the same way as any other fixup we
> might have for broken firmware, especially considering that this used to
> "work" (yes, I can agree that it horrible, but I cannot change the past)
> without even a warning since the imx7 support was first introduced in
> the linux kernel years ago.
> 
> > Francesco, can you please share the DT, including the U-Boot generated
> > partitions, which is passed to Linux on Colibri MX7 ? I think that should
> > make all confusion go away.  
> 
> The device tree part is easy, just
> arch/arm/boot/dts/imx7d-colibri-eval-v3.dts.
> 
> and the nand-controller node is coming from
> 
> #include "imx7d.dtsi"
> 
> plus
> 
> &gpmi {
> 	fsl,use-minimum-ecc;
> 	nand-ecc-mode = "hw";
> 	nand-on-flash-bbt;
> 	pinctrl-names = "default";
> 	pinctrl-0 = <&pinctrl_gpmi_nand>;
> };
> 
> The partitions nodes are generated 100% by U-Boot, nothing is present in
> the dts source files.

I hope if you provide a NAND chip child node, the partitions are created
at the right location, otherwise this is so, so wrong...

> 
> With this DTS file as input, whatever U-Boot version is used I have the
> following generated:
> 
> root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/
> #address-cells          dma-names               nand-on-flash-bbt       pinctrl-0
> #size-cells             dmas                    partition@0             pinctrl-names
> assigned-clock-parents  fsl,use-minimum-ecc     partition@200000        reg
> assigned-clocks         interrupt-names         partition@380000        reg-names
> clock-names             interrupts              partition@400000        status
> clocks                  name                    partition@80000
> compatible              nand-ecc-mode           phandle
> 
> root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/partition@*
> /proc/device-tree/soc/nand-controller@33002000/partition@0:
> label  name   reg
> 
> /proc/device-tree/soc/nand-controller@33002000/partition@200000:
> label      name       read_only  reg
> 
> /proc/device-tree/soc/nand-controller@33002000/partition@380000:
> label  name   reg
> 
> /proc/device-tree/soc/nand-controller@33002000/partition@400000:
> label  name   reg
> 
> /proc/device-tree/soc/nand-controller@33002000/partition@80000:
> label      name       read_only  reg
> 
> 


Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 16:57                           ` Miquel Raynal
@ 2022-12-02 17:08                             ` Marek Vasut
  2022-12-05 11:26                               ` Francesco Dolcini
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-02 17:08 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On 12/2/22 17:57, Miquel Raynal wrote:
> Hi Marek,

Hi,

>> On 12/2/22 17:42, Miquel Raynal wrote:
>>> Hi Marek,
>>
>> Hi,
>>
>> [...]
>>
>>>>> However, it should not be empty, at the very least a reg property
>>>>> should indicate on which CS it is wired, as expected there:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next
>>>>
>>>> OK, I see your point. So basically this?
>>>>
>>>> &gpmi {
>>>>      #size-cells = <1>;
>>>>      ...
>>>>      nand-chip@0 {
>>>>        reg = <0>;
>>>>      };
>>>> };
>>>>
>>>> btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense.
>>>
>>> I randomly opened a reference manual (IMX6DQL.pdf), they say:
>>>
>>> 	"Up to four NAND devices, supported by four chip-selects and one
>>> 	 ganged ready/ busy."
>>
>> Doh, and MX7D has the same controller, so size-cells = <1>; makes sense with nand-chip@N {} .
> 
> Actually #address-cells is here for that. You need to point at one CS,
> so in most cases this is:
> 
> controller {
> 	#address-cells = <1>;
> 	#size-cells = <0>;
> 	chip@N {
> 		reg = <N>;
> 	};
> };

Right ... thanks for spotting this.

But then the size-cells in the controller node should be zero. And 
753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") is 
therefore correct ?

But that correction in 753395ea1e45 breaks setup where old U-Boot 
injects partition@N directly into the nand-controller node, without 
updating the nand-controller node size-cells, i.e. this:
nand-controller {
	#address-cells = <1>;
	#size-cells = <0>;

+       partition@0 { ... };
};
Because the size-cells is 0 in that case, and U-Boot does not update the 
size-cells .

It used to work before because Linux DT contained size-cells=<1> in the 
nand-controller node before 753395ea1e45 ("ARM: dts: imx7: Fix NAND 
controller size-cells").

But here I would say this is a firmware bug and it might have to be 
handled like a firmware bug, i.e. with fixup in the partition parser. I 
seem to be changing my opinion here again.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 16:42                       ` Miquel Raynal
  2022-12-02 16:52                         ` Marek Vasut
@ 2022-12-02 17:20                         ` Francesco Dolcini
  2022-12-05 11:30                           ` Francesco Dolcini
  1 sibling, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-02 17:20 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Marek Vasut, Francesco Dolcini, Richard Weinberger,
	Vignesh Raghavendra, linux-mtd, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable, u-boot

On Fri, Dec 02, 2022 at 05:42:55PM +0100, Miquel Raynal wrote:
> Please also do it with the NAND chip described. If, when the NAND chip
> is described U-Boot tries to create partitions in the controller node,
> then the situation is even worse than I thought. But I believe

It's like that for U-Boot older than v2022.04 ... and IMO we cannot
ignore it.

Said that from the code U-Boot looks into a `partition{}` node only as a
direct child of the nand-controller, if there is a nand-chip in between
the nand-controller{} and the partitions{} it will just ignore it.

I could try to see what it is doing exactly, but I would need a little
bit more time, I just tried changing the DTS as wrote I got a non
bootable system.

Francesco


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 15:56               ` Thorsten Leemhuis
@ 2022-12-04 12:50                 ` Marek Vasut
  2022-12-04 12:59                   ` Thorsten Leemhuis
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-04 12:50 UTC (permalink / raw)
  To: Thorsten Leemhuis, Miquel Raynal, Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 12/2/22 16:56, Thorsten Leemhuis wrote:
> On 02.12.22 15:31, Marek Vasut wrote:
>> On 12/2/22 15:05, Miquel Raynal wrote:
>> [...]
>>> 3. To fix the current situation:
>>>      Immediately revert commit (and prevent it from being backported):
>>>      753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>>>      This way your own boot flow is fixed in the short term.
>>
>> Here I disagree, the fix is correct and I think we shouldn't proliferate
>> incorrect DTs which don't match the binding document. Rather, if a
>> bootloader generates incorrect (new) DT entries, I believe the driver
>> should implement a fixup and warn user about this. PC does that as well
>> with broken ACPI tables as far as I can tell.
> 
> Well, that might be the right solution in the long run, that's up for
> others to decide, but we need to fix this *quickly*. For two reasons
> actually: the 6.1 release is near and the change was backported to
> stable already.
> 
> For details wrt to the "quickly", see "Prioritize work on fixing
> regressions" here:
> https://docs.kernel.org/process/handling-regressions.html
> 
> IOW: Ideally it should be fixed by Sunday.
> 
> I'll hence likely soon will point Linus to this and suggest to revert
> this, unless there are strong reasons against that or some sort of
> agreement on a better solution.

You might want to wait until everyone is back on Monday, the discussion 
is still ongoing, but it seems to be getting to a conclusion.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-04 12:50                 ` Marek Vasut
@ 2022-12-04 12:59                   ` Thorsten Leemhuis
  2022-12-04 15:50                     ` Marek Vasut
  0 siblings, 1 reply; 48+ messages in thread
From: Thorsten Leemhuis @ 2022-12-04 12:59 UTC (permalink / raw)
  To: Marek Vasut, Miquel Raynal, Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 04.12.22 13:50, Marek Vasut wrote:
> On 12/2/22 16:56, Thorsten Leemhuis wrote:
>> On 02.12.22 15:31, Marek Vasut wrote:
>>> On 12/2/22 15:05, Miquel Raynal wrote:
>>> [...]
>>>> 3. To fix the current situation:
>>>>      Immediately revert commit (and prevent it from being backported):
>>>>      753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>>>>      This way your own boot flow is fixed in the short term.
>>>
>>> Here I disagree, the fix is correct and I think we shouldn't proliferate
>>> incorrect DTs which don't match the binding document. Rather, if a
>>> bootloader generates incorrect (new) DT entries, I believe the driver
>>> should implement a fixup and warn user about this. PC does that as well
>>> with broken ACPI tables as far as I can tell.
>>
>> Well, that might be the right solution in the long run, that's up for
>> others to decide, but we need to fix this *quickly*. For two reasons
>> actually: the 6.1 release is near and the change was backported to
>> stable already.
>>
>> For details wrt to the "quickly", see "Prioritize work on fixing
>> regressions" here:
>> https://docs.kernel.org/process/handling-regressions.html
>>
>> IOW: Ideally it should be fixed by Sunday.
>>
>> I'll hence likely soon will point Linus to this and suggest to revert
>> this, unless there are strong reasons against that or some sort of
>> agreement on a better solution.
> 
> You might want to wait until everyone is back on Monday, the discussion
> is still ongoing, but it seems to be getting to a conclusion.

Yeah, came to a similar conclusion, but want to mentioned it
nevertheless and already have this prepared (together will appropriate
links to the discussion):

```
A regression causing boot failures on iMX7 (due to a backport this is
also affecting 6.0.y) could be fixed with a quick revert as well. But
looks like there is no need for it, after some back and forth the
developers that care are close to come to an agreement how to fix the
problem properly soonish:
```

Ciao, Thorsten

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-04 12:59                   ` Thorsten Leemhuis
@ 2022-12-04 15:50                     ` Marek Vasut
  0 siblings, 0 replies; 48+ messages in thread
From: Marek Vasut @ 2022-12-04 15:50 UTC (permalink / raw)
  To: Thorsten Leemhuis, Miquel Raynal, Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 12/4/22 13:59, Thorsten Leemhuis wrote:
> On 04.12.22 13:50, Marek Vasut wrote:
>> On 12/2/22 16:56, Thorsten Leemhuis wrote:
>>> On 02.12.22 15:31, Marek Vasut wrote:
>>>> On 12/2/22 15:05, Miquel Raynal wrote:
>>>> [...]
>>>>> 3. To fix the current situation:
>>>>>       Immediately revert commit (and prevent it from being backported):
>>>>>       753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells")
>>>>>       This way your own boot flow is fixed in the short term.
>>>>
>>>> Here I disagree, the fix is correct and I think we shouldn't proliferate
>>>> incorrect DTs which don't match the binding document. Rather, if a
>>>> bootloader generates incorrect (new) DT entries, I believe the driver
>>>> should implement a fixup and warn user about this. PC does that as well
>>>> with broken ACPI tables as far as I can tell.
>>>
>>> Well, that might be the right solution in the long run, that's up for
>>> others to decide, but we need to fix this *quickly*. For two reasons
>>> actually: the 6.1 release is near and the change was backported to
>>> stable already.
>>>
>>> For details wrt to the "quickly", see "Prioritize work on fixing
>>> regressions" here:
>>> https://docs.kernel.org/process/handling-regressions.html
>>>
>>> IOW: Ideally it should be fixed by Sunday.
>>>
>>> I'll hence likely soon will point Linus to this and suggest to revert
>>> this, unless there are strong reasons against that or some sort of
>>> agreement on a better solution.
>>
>> You might want to wait until everyone is back on Monday, the discussion
>> is still ongoing, but it seems to be getting to a conclusion.
> 
> Yeah, came to a similar conclusion, but want to mentioned it
> nevertheless and already have this prepared (together will appropriate
> links to the discussion):
> 
> ```
> A regression causing boot failures on iMX7 (due to a backport this is
> also affecting 6.0.y) could be fixed with a quick revert as well. But
> looks like there is no need for it, after some back and forth the
> developers that care are close to come to an agreement how to fix the
> problem properly soonish:
> ```

ACK, thanks

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 17:08                             ` Marek Vasut
@ 2022-12-05 11:26                               ` Francesco Dolcini
  2022-12-05 13:49                                 ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-05 11:26 UTC (permalink / raw)
  To: Marek Vasut, Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote:
> But here I would say this is a firmware bug and it might have to be handled
> like a firmware bug, i.e. with fixup in the partition parser. I seem to be
> changing my opinion here again.

I was thinking at this over the weekend, and I came to the following
ideas:

 - we need some improvement on the fixup we already have in the
   partition parser. We cannot ignore the fdt produced by U-Boot - as
   bad as it is.
 - the proposed fixup is fine for the immediate need, but it is
   not going to be enough to cover the general issue with the U-Boot
   generated partitions. U-Boot might keep generating partitions as direct
   child of the nand controller even when a partitions{} node is
   available. In this case the current parser just fails since it looks
   only into it and it will find it empty.
 - the current U-Boot only handle partitions{} as a direct child of the
   nand-controller, the nand-chip is ignored. This is not the way it is
   supposed to work. U-Boot code would need to be improved.

With all of that said I think that Miquel is right

> When a patch breaks a board and there is no straight fix, you revert
> it, then you think harder. That's what I am saying. This is a temporary
> solution.

?

Francesco



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-02 17:20                         ` Francesco Dolcini
@ 2022-12-05 11:30                           ` Francesco Dolcini
  2022-12-05 15:28                             ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-05 11:30 UTC (permalink / raw)
  To: Miquel Raynal, Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hello Miquel

On Fri, Dec 02, 2022 at 06:20:02PM +0100, Francesco Dolcini wrote:
> On Fri, Dec 02, 2022 at 05:42:55PM +0100, Miquel Raynal wrote:
> > Please also do it with the NAND chip described. If, when the NAND chip
> > is described U-Boot tries to create partitions in the controller node,
> > then the situation is even worse than I thought. But I believe
> 
> It's like that for U-Boot older than v2022.04 ... and IMO we cannot
> ignore it.
> 
> Said that from the code U-Boot looks into a `partition{}` node only as a
> direct child of the nand-controller, if there is a nand-chip in between
> the nand-controller{} and the partitions{} it will just ignore it.
> 
> I could try to see what it is doing exactly, but I would need a little
> bit more time, I just tried changing the DTS as wrote I got a non
> bootable system.

If I have a nand-chip { partitions {} } described in the dts U-Boot
(even the latest one) ignores it and generates the partition as child of
the nand controller, the linux parser however see that partitions{}
exists, even if empty, and ignore the partition directly defined as
child of the nand controller.

TL;DR: parser fails and boot fails according to that.

Francesco


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-05 11:26                               ` Francesco Dolcini
@ 2022-12-05 13:49                                 ` Miquel Raynal
  2022-12-05 16:25                                   ` Marek Vasut
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-05 13:49 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Marek Vasut, Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

Hi Francesco,

francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100:

> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote:
> > But here I would say this is a firmware bug and it might have to be handled
> > like a firmware bug, i.e. with fixup in the partition parser. I seem to be
> > changing my opinion here again.  
> 
> I was thinking at this over the weekend, and I came to the following
> ideas:
> 
>  - we need some improvement on the fixup we already have in the
>    partition parser. We cannot ignore the fdt produced by U-Boot - as
>    bad as it is.
>  - the proposed fixup is fine for the immediate need, but it is
>    not going to be enough to cover the general issue with the U-Boot
>    generated partitions. U-Boot might keep generating partitions as direct
>    child of the nand controller even when a partitions{} node is
>    available. In this case the current parser just fails since it looks
>    only into it and it will find it empty.
>  - the current U-Boot only handle partitions{} as a direct child of the
>    nand-controller, the nand-chip is ignored. This is not the way it is
>    supposed to work. U-Boot code would need to be improved.

I've been thinking about it this weekend as well and the current fix
which "just set" s_cell to 1 seems risky for me, it is typically the
type of quick & dirty fix that might even break other board (nobody
knew that U-Boot current logic expected #size-cells to be set in the
DT, what if another "broken" DT expects the opposite...), not
mentioning potential issues with big storages (> 4GiB).

All in all, I really think we should revert the DT change now, reverting
as little to no drawbacks besides a dt_binding_check warning and gives
us time to deal with it properly (both in U-Boot and Linux).

> With all of that said I think that Miquel is right
> 
> > When a patch breaks a board and there is no straight fix, you revert
> > it, then you think harder. That's what I am saying. This is a temporary
> > solution.  
> 
> ?
> 
> Francesco
> 
> 

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-05 11:30                           ` Francesco Dolcini
@ 2022-12-05 15:28                             ` Miquel Raynal
  0 siblings, 0 replies; 48+ messages in thread
From: Miquel Raynal @ 2022-12-05 15:28 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Marek Vasut, Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

Hi Francesco,

francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:30:16 +0100:

> Hello Miquel
> 
> On Fri, Dec 02, 2022 at 06:20:02PM +0100, Francesco Dolcini wrote:
> > On Fri, Dec 02, 2022 at 05:42:55PM +0100, Miquel Raynal wrote:  
> > > Please also do it with the NAND chip described. If, when the NAND chip
> > > is described U-Boot tries to create partitions in the controller node,
> > > then the situation is even worse than I thought. But I believe  
> > 
> > It's like that for U-Boot older than v2022.04 ... and IMO we cannot
> > ignore it.
> > 
> > Said that from the code U-Boot looks into a `partition{}` node only as a
> > direct child of the nand-controller, if there is a nand-chip in between
> > the nand-controller{} and the partitions{} it will just ignore it.
> > 
> > I could try to see what it is doing exactly, but I would need a little
> > bit more time, I just tried changing the DTS as wrote I got a non
> > bootable system.  
> 
> If I have a nand-chip { partitions {} } described in the dts U-Boot
> (even the latest one) ignores it and generates the partition as child of
> the nand controller, the linux parser however see that partitions{}
> exists, even if empty, and ignore the partition directly defined as
> child of the nand controller.
> 
> TL;DR: parser fails and boot fails according to that.

Yeah I get that. For me the longterm goal should be to just kill that
function. We have proper DT support today, Linux knows how to read the
mtdparts cmdline variable, so there is no need for anything else. I
guess in U-Boot we should just:
- warn users of this function that the function is deprecated and they
  should update their machine support
- just migrate to another solution on the colibri board

What do you think?

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-05 13:49                                 ` Miquel Raynal
@ 2022-12-05 16:25                                   ` Marek Vasut
  2022-12-15  7:16                                     ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-05 16:25 UTC (permalink / raw)
  To: Miquel Raynal, Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 12/5/22 14:49, Miquel Raynal wrote:
> Hi Francesco,

Hi,

> francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100:
> 
>> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote:
>>> But here I would say this is a firmware bug and it might have to be handled
>>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be
>>> changing my opinion here again.
>>
>> I was thinking at this over the weekend, and I came to the following
>> ideas:
>>
>>   - we need some improvement on the fixup we already have in the
>>     partition parser. We cannot ignore the fdt produced by U-Boot - as
>>     bad as it is.
>>   - the proposed fixup is fine for the immediate need, but it is
>>     not going to be enough to cover the general issue with the U-Boot
>>     generated partitions. U-Boot might keep generating partitions as direct
>>     child of the nand controller even when a partitions{} node is
>>     available. In this case the current parser just fails since it looks
>>     only into it and it will find it empty.
>>   - the current U-Boot only handle partitions{} as a direct child of the
>>     nand-controller, the nand-chip is ignored. This is not the way it is
>>     supposed to work. U-Boot code would need to be improved.
> 
> I've been thinking about it this weekend as well and the current fix
> which "just set" s_cell to 1 seems risky for me, it is typically the
> type of quick & dirty fix that might even break other board (nobody
> knew that U-Boot current logic expected #size-cells to be set in the
> DT, what if another "broken" DT expects the opposite...)

Then with the current configuration, such broken DT would not work, 
since current DT does set #size-cells=<1> (wrongly).

> , not
> mentioning potential issues with big storages (> 4GiB).
> 
> All in all, I really think we should revert the DT change now, reverting
> as little to no drawbacks besides a dt_binding_check warning and gives
> us time to deal with it properly (both in U-Boot and Linux).

I am really not happy with this, but if that's marked as intermediate 
fix, go for it.

How do we deal with this in the long run however? Parser-side fix like 
this one, maybe with better heuristics ?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-05 16:25                                   ` Marek Vasut
@ 2022-12-15  7:16                                     ` Miquel Raynal
  2022-12-15  7:45                                       ` Marek Vasut
  2022-12-16  7:45                                       ` Francesco Dolcini
  0 siblings, 2 replies; 48+ messages in thread
From: Miquel Raynal @ 2022-12-15  7:16 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hi Marek & Francesco,

marex@denx.de wrote on Mon, 5 Dec 2022 17:25:11 +0100:

> On 12/5/22 14:49, Miquel Raynal wrote:
> > Hi Francesco,  
> 
> Hi,
> 
> > francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100:
> >   
> >> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote:  
> >>> But here I would say this is a firmware bug and it might have to be handled
> >>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be
> >>> changing my opinion here again.  
> >>
> >> I was thinking at this over the weekend, and I came to the following
> >> ideas:
> >>
> >>   - we need some improvement on the fixup we already have in the
> >>     partition parser. We cannot ignore the fdt produced by U-Boot - as
> >>     bad as it is.
> >>   - the proposed fixup is fine for the immediate need, but it is
> >>     not going to be enough to cover the general issue with the U-Boot
> >>     generated partitions. U-Boot might keep generating partitions as direct
> >>     child of the nand controller even when a partitions{} node is
> >>     available. In this case the current parser just fails since it looks
> >>     only into it and it will find it empty.
> >>   - the current U-Boot only handle partitions{} as a direct child of the
> >>     nand-controller, the nand-chip is ignored. This is not the way it is
> >>     supposed to work. U-Boot code would need to be improved.  
> > 
> > I've been thinking about it this weekend as well and the current fix
> > which "just set" s_cell to 1 seems risky for me, it is typically the
> > type of quick & dirty fix that might even break other board (nobody
> > knew that U-Boot current logic expected #size-cells to be set in the
> > DT, what if another "broken" DT expects the opposite...)  
> 
> Then with the current configuration, such broken DT would not work, since current DT does set #size-cells=<1> (wrongly).
> 
> > , not
> > mentioning potential issues with big storages (> 4GiB).
> > 
> > All in all, I really think we should revert the DT change now, reverting
> > as little to no drawbacks besides a dt_binding_check warning and gives
> > us time to deal with it properly (both in U-Boot and Linux).  
> 
> I am really not happy with this, but if that's marked as intermediate fix, go for it.
> 
> How do we deal with this in the long run however? Parser-side fix like this one, maybe with better heuristics ?

Yesterday while talking about an ACPI mis-description which needed
fixing, I realized fixing up what the firmware provides to Linux should
preferably be handled as early as possible. So my first first idea was
to avoid using the broken "fixup mtdparts" function in U-Boot and I am
still convinced this is what we should do in priority. However, as
rightly pointed in this thread, we need to take care about the case
where someone would use a newer DT (let's say, with the reverted changed
reverted again) with an old U-Boot. I am still against piggy hacks in
the generic ofpart.c driver, but what we could do however is a DT
fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very
much like this:
https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
Plus a warning there saying "your dt is broken, update your firmware".

So next time someone stumbles upon this issue, we can tell them "fix
your bootloader", and apply the same hack in their board family (there
are three or four IIRC which might be concerned some day).

That would fix all cases and only have an impact on the affected boards.

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-15  7:16                                     ` Miquel Raynal
@ 2022-12-15  7:45                                       ` Marek Vasut
  2022-12-15  8:04                                         ` Miquel Raynal
  2022-12-16  7:45                                       ` Francesco Dolcini
  1 sibling, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-15  7:45 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On 12/15/22 08:16, Miquel Raynal wrote:
> Hi Marek & Francesco,

Hi,

> marex@denx.de wrote on Mon, 5 Dec 2022 17:25:11 +0100:
> 
>> On 12/5/22 14:49, Miquel Raynal wrote:
>>> Hi Francesco,
>>
>> Hi,
>>
>>> francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100:
>>>    
>>>> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote:
>>>>> But here I would say this is a firmware bug and it might have to be handled
>>>>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be
>>>>> changing my opinion here again.
>>>>
>>>> I was thinking at this over the weekend, and I came to the following
>>>> ideas:
>>>>
>>>>    - we need some improvement on the fixup we already have in the
>>>>      partition parser. We cannot ignore the fdt produced by U-Boot - as
>>>>      bad as it is.
>>>>    - the proposed fixup is fine for the immediate need, but it is
>>>>      not going to be enough to cover the general issue with the U-Boot
>>>>      generated partitions. U-Boot might keep generating partitions as direct
>>>>      child of the nand controller even when a partitions{} node is
>>>>      available. In this case the current parser just fails since it looks
>>>>      only into it and it will find it empty.
>>>>    - the current U-Boot only handle partitions{} as a direct child of the
>>>>      nand-controller, the nand-chip is ignored. This is not the way it is
>>>>      supposed to work. U-Boot code would need to be improved.
>>>
>>> I've been thinking about it this weekend as well and the current fix
>>> which "just set" s_cell to 1 seems risky for me, it is typically the
>>> type of quick & dirty fix that might even break other board (nobody
>>> knew that U-Boot current logic expected #size-cells to be set in the
>>> DT, what if another "broken" DT expects the opposite...)
>>
>> Then with the current configuration, such broken DT would not work, since current DT does set #size-cells=<1> (wrongly).
>>
>>> , not
>>> mentioning potential issues with big storages (> 4GiB).
>>>
>>> All in all, I really think we should revert the DT change now, reverting
>>> as little to no drawbacks besides a dt_binding_check warning and gives
>>> us time to deal with it properly (both in U-Boot and Linux).
>>
>> I am really not happy with this, but if that's marked as intermediate fix, go for it.
>>
>> How do we deal with this in the long run however? Parser-side fix like this one, maybe with better heuristics ?
> 
> Yesterday while talking about an ACPI mis-description which needed
> fixing, I realized fixing up what the firmware provides to Linux should
> preferably be handled as early as possible. So my first first idea was
> to avoid using the broken "fixup mtdparts" function in U-Boot and I am
> still convinced this is what we should do in priority. However, as
> rightly pointed in this thread, we need to take care about the case
> where someone would use a newer DT (let's say, with the reverted changed
> reverted again) with an old U-Boot. I am still against piggy hacks in
> the generic ofpart.c driver, but what we could do however is a DT
> fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very
> much like this:
> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
> Plus a warning there saying "your dt is broken, update your firmware".

This does not work, because the old U-Boot fixup_mtdparts() may be 
applied on any machine, it is not colibri mx7 specific. Also, new 
arch-side workaround are really not welcome by the architecture 
maintainers as far as I can tell.

> So next time someone stumbles upon this issue, we can tell them "fix
> your bootloader", and apply the same hack in their board family (there
> are three or four IIRC which might be concerned some day).

There are also those machines we do not even know about which might be 
generating bogus DT using old U-Boot and fixup_mtdparts(), so, unless 
there is some all-arch fixup implementation, we wouldn't be able to fix 
them all on arch side. I think the all-arch fixup implementation would 
be the driver one, i.e. this patch as it is (or maybe with some 
improvement).

> That would fix all cases and only have an impact on the affected boards.

Sadly, it does only fix the known cases, not the unknown cases like 
downstream forks which never get any bootloader updates ever, and which 
you can't find in upstream U-Boot, and which you therefore cannot easily 
catch in the arch side fixup.

[...]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-15  7:45                                       ` Marek Vasut
@ 2022-12-15  8:04                                         ` Miquel Raynal
  2022-12-16  0:36                                           ` Marek Vasut
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-15  8:04 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hi Marek,

marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100:

> On 12/15/22 08:16, Miquel Raynal wrote:
> > Hi Marek & Francesco,  
> 
> Hi,
> 
> > marex@denx.de wrote on Mon, 5 Dec 2022 17:25:11 +0100:
> >   
> >> On 12/5/22 14:49, Miquel Raynal wrote:  
> >>> Hi Francesco,  
> >>
> >> Hi,
> >>  
> >>> francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100:  
> >>>    >>>> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote:  
> >>>>> But here I would say this is a firmware bug and it might have to be handled
> >>>>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be
> >>>>> changing my opinion here again.  
> >>>>
> >>>> I was thinking at this over the weekend, and I came to the following
> >>>> ideas:
> >>>>
> >>>>    - we need some improvement on the fixup we already have in the
> >>>>      partition parser. We cannot ignore the fdt produced by U-Boot - as
> >>>>      bad as it is.
> >>>>    - the proposed fixup is fine for the immediate need, but it is
> >>>>      not going to be enough to cover the general issue with the U-Boot
> >>>>      generated partitions. U-Boot might keep generating partitions as direct
> >>>>      child of the nand controller even when a partitions{} node is
> >>>>      available. In this case the current parser just fails since it looks
> >>>>      only into it and it will find it empty.
> >>>>    - the current U-Boot only handle partitions{} as a direct child of the
> >>>>      nand-controller, the nand-chip is ignored. This is not the way it is
> >>>>      supposed to work. U-Boot code would need to be improved.  
> >>>
> >>> I've been thinking about it this weekend as well and the current fix
> >>> which "just set" s_cell to 1 seems risky for me, it is typically the
> >>> type of quick & dirty fix that might even break other board (nobody
> >>> knew that U-Boot current logic expected #size-cells to be set in the
> >>> DT, what if another "broken" DT expects the opposite...)  
> >>
> >> Then with the current configuration, such broken DT would not work, since current DT does set #size-cells=<1> (wrongly).
> >>  
> >>> , not
> >>> mentioning potential issues with big storages (> 4GiB).
> >>>
> >>> All in all, I really think we should revert the DT change now, reverting
> >>> as little to no drawbacks besides a dt_binding_check warning and gives
> >>> us time to deal with it properly (both in U-Boot and Linux).  
> >>
> >> I am really not happy with this, but if that's marked as intermediate fix, go for it.
> >>
> >> How do we deal with this in the long run however? Parser-side fix like this one, maybe with better heuristics ?  
> > 
> > Yesterday while talking about an ACPI mis-description which needed
> > fixing, I realized fixing up what the firmware provides to Linux should
> > preferably be handled as early as possible. So my first first idea was
> > to avoid using the broken "fixup mtdparts" function in U-Boot and I am
> > still convinced this is what we should do in priority. However, as
> > rightly pointed in this thread, we need to take care about the case
> > where someone would use a newer DT (let's say, with the reverted changed
> > reverted again) with an old U-Boot. I am still against piggy hacks in
> > the generic ofpart.c driver, but what we could do however is a DT
> > fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very
> > much like this:
> > https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
> > Plus a warning there saying "your dt is broken, update your firmware".  
> 
> This does not work, because the old U-Boot fixup_mtdparts() may be applied on any machine,

No: https://elixir.bootlin.com/u-boot/latest/A/ident/fdt_fixup_mtdparts
And we should make our best so its use does not proliferate.
It's not like there is half a dozen of good ways to describe and forward
partitions today.

> it is not colibri mx7 specific. Also, new arch-side workaround are
> really not welcome by the architecture maintainers as far as I can
> tell.

So what? Let's propose the change and see what the maintainers have to
say. I am open to discussion.

As I said, it is not colibri mx7 specific, there are a few boards which
might be affected, they are all clearly identifiable with a compatible.
It's not the entire planet either.

> > So next time someone stumbles upon this issue, we can tell them "fix
> > your bootloader", and apply the same hack in their board family (there
> > are three or four IIRC which might be concerned some day).  
> 
> There are also those machines we do not even know about which might be generating bogus DT using old U-Boot and fixup_mtdparts(), so, unless there is some all-arch fixup implementation, we wouldn't be able to fix them all on arch side. I think the all-arch fixup implementation would be the driver one, i.e. this patch as it is (or maybe with some improvement).

If we don't know about them, as you say, I don't feel concerned.

If something is buggy, people will report it, we will point them in the
right direction so they can fix their firmware and propose a similar
fix in their case which will involve adding a new machine compatible to
the list of boards that should tweak the #size-cell property.

> > That would fix all cases and only have an impact on the affected boards.  
> 
> Sadly, it does only fix the known cases, not the unknown cases like downstream forks which never get any bootloader updates ever, and which you can't find in upstream U-Boot, and which you therefore cannot easily catch in the arch side fixup.

And ?

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-15  8:04                                         ` Miquel Raynal
@ 2022-12-16  0:36                                           ` Marek Vasut
  2022-12-16  7:52                                             ` Francesco Dolcini
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-16  0:36 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

On 12/15/22 09:04, Miquel Raynal wrote:
> Hi Marek,

Hi,

[...]

>>> Yesterday while talking about an ACPI mis-description which needed
>>> fixing, I realized fixing up what the firmware provides to Linux should
>>> preferably be handled as early as possible. So my first first idea was
>>> to avoid using the broken "fixup mtdparts" function in U-Boot and I am
>>> still convinced this is what we should do in priority. However, as
>>> rightly pointed in this thread, we need to take care about the case
>>> where someone would use a newer DT (let's say, with the reverted changed
>>> reverted again) with an old U-Boot. I am still against piggy hacks in
>>> the generic ofpart.c driver, but what we could do however is a DT
>>> fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very
>>> much like this:
>>> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
>>> Plus a warning there saying "your dt is broken, update your firmware".
>>
>> This does not work, because the old U-Boot fixup_mtdparts() may be applied on any machine,
> 
> No: https://elixir.bootlin.com/u-boot/latest/A/ident/fdt_fixup_mtdparts

These are the boards from vendors who upstreamed their properly.

This does NOT take into account either boards which are using downstream 
U-Boot, or older U-Boot e.g. because they can not easily update.

> And we should make our best so its use does not proliferate.

I am not disputing that, but that's a separate U-Boot side fix which I 
hope Francesco would submit soon, AND, more importantly, the code is 
already in at least two U-Boot releases, so it's done, it's not going 
away any time soon.

> It's not like there is half a dozen of good ways to describe and forward
> partitions today.

That's really not what I am disputing here, the approach to describing 
partitions is crystal clear as far as I can tell.

>> it is not colibri mx7 specific. Also, new arch-side workaround are
>> really not welcome by the architecture maintainers as far as I can
>> tell.
> 
> So what? Let's propose the change and see what the maintainers have to
> say. I am open to discussion.

Why is there such strong opposition toward generic fix in the OF 
partition parser ?

> As I said, it is not colibri mx7 specific, there are a few boards which
> might be affected,

... that you know about ...

> they are all clearly identifiable with a compatible.
> It's not the entire planet either.

Neither of us can make this statement with certainty, because neither of 
us knows what hardware is running the affected version of U-Boot.

>>> So next time someone stumbles upon this issue, we can tell them "fix
>>> your bootloader", and apply the same hack in their board family (there
>>> are three or four IIRC which might be concerned some day).
>>
>> There are also those machines we do not even know about which might be generating bogus DT using old U-Boot and fixup_mtdparts(), so, unless there is some all-arch fixup implementation, we wouldn't be able to fix them all on arch side. I think the all-arch fixup implementation would be the driver one, i.e. this patch as it is (or maybe with some improvement).
> 
> If we don't know about them, as you say, I don't feel concerned.
> 
> If something is buggy, people will report it, we will point them in the
> right direction so they can fix their firmware and propose a similar
> fix in their case which will involve adding a new machine compatible to
> the list of boards that should tweak the #size-cell property.

Why is a potentially lengthy list of compatible strings in arch code, 
which every single user has to find _after_ their system completely 
fails to boot and forces them to perform potentially difficult recovery, 
potentially after an update to new linux-stable release no less -- over 
-- 4 liner generic fix in OF partition parser, which covers all the 
systems, does not cause systems to fail to boot completely, does not 
force users to suffer through recovery, does not require a list of 
compatibles in arch code, and rather only gracefully prints a warning ?

I very much prefer the second solution over the first.

And one more thing, the list of compatibles in arch code does not really 
work anyway, since once user updates their bootloader, the compatible 
won't change and the arch-side workaround would still be applied, which 
is not desired at that point anymore.

>>> That would fix all cases and only have an impact on the affected boards.
>>
>> Sadly, it does only fix the known cases, not the unknown cases like downstream forks which never get any bootloader updates ever, and which you can't find in upstream U-Boot, and which you therefore cannot easily catch in the arch side fixup.
> 
> And ?

I was under the impression Linux was supposed to deliver the best 
possible experience to its users even on not-perfect hardware, and if 
there are any quirks, the kernel should try to fix them up or work 
around them as best as it can, not dismiss them as broken hardware and 
fail to boot outright.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-15  7:16                                     ` Miquel Raynal
  2022-12-15  7:45                                       ` Marek Vasut
@ 2022-12-16  7:45                                       ` Francesco Dolcini
  2022-12-16 10:46                                         ` Marek Vasut
  1 sibling, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-16  7:45 UTC (permalink / raw)
  To: Miquel Raynal, Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hello Marek and Miquel,

On Thu, Dec 15, 2022 at 08:16:04AM +0100, Miquel Raynal wrote:
> So my first first idea was to avoid using the broken "fixup mtdparts"
> function in U-Boot and I am still convinced this is what we should do
> in priority.

This is something that was already discussed, but I was not really
thinking much on it till now. Do you think that the whole idea of
editing the MTD partitions from the firmware is wrong and we should just
pass the partition on the command line OR that the current
implementation is broken and can/should be fixed?

> I am still against piggy hacks in the generic ofpart.c driver, but
> what we could do however is a DT fixup in the init_machine (or the
> dt_fixup) hook for imx7 Colibri, very much like this:
> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
> Plus a warning there saying "your dt is broken, update your firmware".

I have a couple of concerns/question with this approach:
 - do we have a single point to handle this? Different architectures are
   affected by these issue. Duplicating the fixup code in multiple place
   does not seems a great idea
 - If we believe that the device tree is wrong, in the i.MX7 case
   because of #size-cells should be set to 0 and not 1, we should not
   alter the FDT. Other part of the code could rely on this being
   correctly set to 0 moving forward.

If I understood you are proposing to have a fixup at the machine level
that is converting a valid nand-controller node definition to a "broken"
one. Unless I misunderstood you and you are thinking about rewriting the
whole MTD partition from a broken definition to a proper one.

On Thu, Dec 15, 2022 at 09:04:46AM +0100, Miquel Raynal wrote:
> marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100:
> > Sadly, it does only fix the known cases, not the unknown cases like
> > downstream forks which never get any bootloader updates ever, and
> > which you can't find in upstream U-Boot, and which you therefore
> > cannot easily catch in the arch side fixup.
> 
> And ?

I'm not personally and directly concerned, since the machine I care are
all available upstream and known, however this is a general problem with
U-Boot code being at the same time widely used on a range of embedded
products and producing a broken MTD partition list.

I think we will just silently break boards and just creating a lot of
issues to people. We would just introduce regression to the users, being
aware of it and deliberately decide to not care and move the problem to
someone else. I do not think this is a good way to go.

Francesco



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16  0:36                                           ` Marek Vasut
@ 2022-12-16  7:52                                             ` Francesco Dolcini
  0 siblings, 0 replies; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-16  7:52 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Miquel Raynal, Francesco Dolcini, Richard Weinberger,
	Vignesh Raghavendra, linux-mtd, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable, u-boot

On Fri, Dec 16, 2022 at 01:36:03AM +0100, Marek Vasut wrote:
> On 12/15/22 09:04, Miquel Raynal wrote:
> > > > That would fix all cases and only have an impact on the affected
> > > > boards.
> > > 
> > > Sadly, it does only fix the known cases, not the unknown cases
> > > like downstream forks which never get any bootloader updates ever,
> > > and which you can't find in upstream U-Boot, and which you
> > > therefore cannot easily catch in the arch side fixup.
> > 
> > And ?
> 
> I was under the impression Linux was supposed to deliver the best possible
> experience to its users even on not-perfect hardware, and if there are any
> quirks, the kernel should try to fix them up or work around them as best as
> it can, not dismiss them as broken hardware and fail to boot outright.

I would say something more on this.

We are not talking about Linux not working well on some hardware, we are
talking about breaking hardware that was working fine since ever.
I believe that the Linux has a quite strong point of view on such kind
of regression.

Quoting Linus
> If the kernel used to work for you, the rule is that it continues to work for you.

Francesco


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16  7:45                                       ` Francesco Dolcini
@ 2022-12-16 10:46                                         ` Marek Vasut
  2022-12-16 11:01                                           ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-16 10:46 UTC (permalink / raw)
  To: Francesco Dolcini, Miquel Raynal
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 12/16/22 08:45, Francesco Dolcini wrote:
> Hello Marek and Miquel,

Hi,

> On Thu, Dec 15, 2022 at 08:16:04AM +0100, Miquel Raynal wrote:
>> So my first first idea was to avoid using the broken "fixup mtdparts"
>> function in U-Boot and I am still convinced this is what we should do
>> in priority.
> 
> This is something that was already discussed, but I was not really
> thinking much on it till now. Do you think that the whole idea of
> editing the MTD partitions from the firmware is wrong and we should just
> pass the partition on the command line OR that the current
> implementation is broken and can/should be fixed?

No, patching the partition layout into DT is fine. Firmwares of all 
kinds have been patching various parts of the DT before passing it to OS 
since forever, or more recently, merging multiple DT fragments and 
passing the composite DT to Linux.

As far as I recall, OF predates Linux and the OF tree has been usually 
assembled by the Forth firmware of that era from various chunks stored 
in different parts of the system. So this patching is fundamental part 
of the design since the beginning.

It is difficult to describe complex structure like the partition mapping 
on kernel command line, it should really be in DT or other such 
structure, so patching it into the DT is fine. The only detail here is, 
it should be patched into the DT correctly ... and ... if old firmwares 
do not do that, Linux should fix it up. You don't throw away your old PC 
just because it doesn't have perfect ACPI tables one would expect today, 
I don't see why we should do that with DT machines.

>> I am still against piggy hacks in the generic ofpart.c driver, but
>> what we could do however is a DT fixup in the init_machine (or the
>> dt_fixup) hook for imx7 Colibri, very much like this:
>> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
>> Plus a warning there saying "your dt is broken, update your firmware".
> 
> I have a couple of concerns/question with this approach:
>   - do we have a single point to handle this? Different architectures are
>     affected by these issue. Duplicating the fixup code in multiple place
>     does not seems a great idea
>   - If we believe that the device tree is wrong, in the i.MX7 case
>     because of #size-cells should be set to 0 and not 1, we should not
>     alter the FDT. Other part of the code could rely on this being
>     correctly set to 0 moving forward.
> 
> If I understood you are proposing to have a fixup at the machine level
> that is converting a valid nand-controller node definition to a "broken"
> one. Unless I misunderstood you and you are thinking about rewriting the
> whole MTD partition from a broken definition to a proper one.
> 
> On Thu, Dec 15, 2022 at 09:04:46AM +0100, Miquel Raynal wrote:
>> marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100:
>>> Sadly, it does only fix the known cases, not the unknown cases like
>>> downstream forks which never get any bootloader updates ever, and
>>> which you can't find in upstream U-Boot, and which you therefore
>>> cannot easily catch in the arch side fixup.
>>
>> And ?
> 
> I'm not personally and directly concerned, since the machine I care are
> all available upstream and known, however this is a general problem with
> U-Boot code being at the same time widely used on a range of embedded
> products and producing a broken MTD partition list.
> 
> I think we will just silently break boards and just creating a lot of
> issues to people. We would just introduce regression to the users, being
> aware of it and deliberately decide to not care and move the problem to
> someone else. I do not think this is a good way to go.

I agree.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16 10:46                                         ` Marek Vasut
@ 2022-12-16 11:01                                           ` Miquel Raynal
  2022-12-16 12:37                                             ` Francesco Dolcini
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-16 11:01 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot


marex@denx.de wrote on Fri, 16 Dec 2022 11:46:18 +0100:

> On 12/16/22 08:45, Francesco Dolcini wrote:
> > Hello Marek and Miquel,  
> 
> Hi,
> 
> > On Thu, Dec 15, 2022 at 08:16:04AM +0100, Miquel Raynal wrote:  
> >> So my first first idea was to avoid using the broken "fixup mtdparts"
> >> function in U-Boot and I am still convinced this is what we should do
> >> in priority.  
> > 
> > This is something that was already discussed, but I was not really
> > thinking much on it till now. Do you think that the whole idea of
> > editing the MTD partitions from the firmware is wrong and we should just
> > pass the partition on the command line OR that the current
> > implementation is broken and can/should be fixed?  
> 
> No, patching the partition layout into DT is fine. Firmwares of all kinds have been patching various parts of the DT before passing it to OS since forever, or more recently, merging multiple DT fragments and passing the composite DT to Linux.
> 
> As far as I recall, OF predates Linux and the OF tree has been usually assembled by the Forth firmware of that era from various chunks stored in different parts of the system. So this patching is fundamental part of the design since the beginning.
> 
> It is difficult to describe complex structure like the partition mapping on kernel command line, it should really be in DT or other such structure, so patching it into the DT is fine.

I think describing it in the DT is fine and welcome.
I think patching it in the DT is ugly. My 2cts.

> The only detail here is, it should be patched into the DT correctly  ... and ... if old firmwares do not do that, Linux should fix it up.  You don't throw away your old PC just because it doesn't have perfect  ACPI tables one would expect today, I don't see why we should do that  with DT machines.
> 
> >> I am still against piggy hacks in the generic ofpart.c driver, but
> >> what we could do however is a DT fixup in the init_machine (or the
> >> dt_fixup) hook for imx7 Colibri, very much like this:
> >> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
> >> Plus a warning there saying "your dt is broken, update your firmware".  
> > 
> > I have a couple of concerns/question with this approach:
> >   - do we have a single point to handle this? Different architectures are
> >     affected by these issue. Duplicating the fixup code in multiple place
> >     does not seems a great idea
> >   - If we believe that the device tree is wrong, in the i.MX7 case
> >     because of #size-cells should be set to 0 and not 1, we should not
> >     alter the FDT. Other part of the code could rely on this being
> >     correctly set to 0 moving forward.
> > 
> > If I understood you are proposing to have a fixup at the machine level
> > that is converting a valid nand-controller node definition to a "broken"
> > one. Unless I misunderstood you and you are thinking about rewriting the
> > whole MTD partition from a broken definition to a proper one.

No, quite the opposite.

Either size-cell is wrong which makes the description totally
inconsistent (if size-cell is there, it must have a use, otherwise why
do we keep it?) and we must fix it, or it is right and we should not
touch it.

What I propose is to check very early whether the description is
consistent on the board known to have this problem. If the description
is wrong, we fix it and the generic parser can then do its work
properly.

> > 
> > On Thu, Dec 15, 2022 at 09:04:46AM +0100, Miquel Raynal wrote:  
> >> marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100:  
> >>> Sadly, it does only fix the known cases, not the unknown cases like
> >>> downstream forks which never get any bootloader updates ever, and
> >>> which you can't find in upstream U-Boot, and which you therefore
> >>> cannot easily catch in the arch side fixup.  
> >>
> >> And ?  
> > 
> > I'm not personally and directly concerned, since the machine I care are
> > all available upstream and known, however this is a general problem with
> > U-Boot code being at the same time widely used on a range of embedded
> > products and producing a broken MTD partition list.
> > 
> > I think we will just silently break boards and just creating a lot of
> > issues to people. We would just introduce regression to the users, being
> > aware of it and deliberately decide to not care and move the problem to
> > someone else. I do not think this is a good way to go.  

What? Since when my proposal is breaking boards? My proposal leads to a
situation where:
- If you have a board that has an inconsistent description but worked,
  it will still work.
- If you have a board that has a consistent description and worked, it
  will still work.
- If your have a board that has an inconsistent description and got
  broken *recently* by another change (typically you "fix" the DT in
  Linux to comply with the bindings), then you get a warning that leads
  you on the right path, you then update your bootloader if you can,
  but either way you add your machine compatible to the list of devices
  which need the early fix and your boot is fixed.
- If you add support for a new board with an old kernel and have an
  inconsistent description it does not "just work because we have an
  automatic piggy hack in the driver". I am against it.

Whether or not it is acceptable by arch maintainer is something else,
we won't know until we include them in the loop with a proper patch.

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16 11:01                                           ` Miquel Raynal
@ 2022-12-16 12:37                                             ` Francesco Dolcini
  2022-12-16 13:37                                               ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-16 12:37 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Marek Vasut, Francesco Dolcini, Richard Weinberger,
	Vignesh Raghavendra, linux-mtd, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable, u-boot

On Fri, Dec 16, 2022 at 12:01:55PM +0100, Miquel Raynal wrote:
> marex@denx.de wrote on Fri, 16 Dec 2022 11:46:18 +0100:
> > On 12/16/22 08:45, Francesco Dolcini wrote:
> > > On Thu, Dec 15, 2022 at 08:16:04AM +0100, Miquel Raynal wrote:  
> > >> I am still against piggy hacks in the generic ofpart.c driver, but
> > >> what we could do however is a DT fixup in the init_machine (or the
> > >> dt_fixup) hook for imx7 Colibri, very much like this:
> > >> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
> > >> Plus a warning there saying "your dt is broken, update your firmware".  
> > > 
> > > I have a couple of concerns/question with this approach:
> > >   - do we have a single point to handle this? Different architectures are
> > >     affected by these issue. Duplicating the fixup code in multiple place
> > >     does not seems a great idea
> > >   - If we believe that the device tree is wrong, in the i.MX7 case
> > >     because of #size-cells should be set to 0 and not 1, we should not
> > >     alter the FDT. Other part of the code could rely on this being
> > >     correctly set to 0 moving forward.
> > > 
> > > If I understood you are proposing to have a fixup at the machine level
> > > that is converting a valid nand-controller node definition to a "broken"
> > > one. Unless I misunderstood you and you are thinking about rewriting the
> > > whole MTD partition from a broken definition to a proper one.
> 
> No, quite the opposite.
>
> Either size-cell is wrong which makes the description totally
> inconsistent (if size-cell is there, it must have a use, otherwise why
> do we keep it?) and we must fix it, or it is right and we should not
> touch it.
> 
> What I propose is to check very early whether the description is
> consistent on the board known to have this problem. If the description
> is wrong, we fix it and the generic parser can then do its work
> properly.

What if we add `nand-chip{}` children in the future (the i.MX nand
controller has nothing implemented not described in the schema so far,
but it is something that is supported by the hw)? Will this idea still
works?

> > > On Thu, Dec 15, 2022 at 09:04:46AM +0100, Miquel Raynal wrote:  
> > >> marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100:  
> > >>> Sadly, it does only fix the known cases, not the unknown cases like
> > >>> downstream forks which never get any bootloader updates ever, and
> > >>> which you can't find in upstream U-Boot, and which you therefore
> > >>> cannot easily catch in the arch side fixup.  
> > >>
> > >> And ?  
> > > 
> > > I'm not personally and directly concerned, since the machine I care are
> > > all available upstream and known, however this is a general problem with
> > > U-Boot code being at the same time widely used on a range of embedded
> > > products and producing a broken MTD partition list.
> > > 
> > > I think we will just silently break boards and just creating a lot of
> > > issues to people. We would just introduce regression to the users, being
> > > aware of it and deliberately decide to not care and move the problem to
> > > someone else. I do not think this is a good way to go.  
> 
> What?

Let me rephrase, I was not clear enough.

> Since when my proposal is breaking boards? My proposal leads to a
> situation where:
> - If you have a board that has an inconsistent description but worked,
>   it will still work.
> - If you have a board that has a consistent description and worked, it
>   will still work.
> - If your have a board that has an inconsistent description and got
>   broken *recently* by another change (typically you "fix" the DT in
>   Linux to comply with the bindings), then you get a warning that leads
>   you on the right path, you then update your bootloader if you can,
>   but either way you add your machine compatible to the list of devices
>   which need the early fix and your boot is fixed.

This implies that we can proactively catch all the affected boards. I do
not believe this is reasonable and because of that my comment before
about creating regression to the users.

Francesco



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16 12:37                                             ` Francesco Dolcini
@ 2022-12-16 13:37                                               ` Miquel Raynal
  2022-12-16 14:32                                                 ` Marek Vasut
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-16 13:37 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Marek Vasut, Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

Hi Francesco,

francesco@dolcini.it wrote on Fri, 16 Dec 2022 13:37:31 +0100:

> On Fri, Dec 16, 2022 at 12:01:55PM +0100, Miquel Raynal wrote:
> > marex@denx.de wrote on Fri, 16 Dec 2022 11:46:18 +0100:  
> > > On 12/16/22 08:45, Francesco Dolcini wrote:  
> > > > On Thu, Dec 15, 2022 at 08:16:04AM +0100, Miquel Raynal wrote:    
> > > >> I am still against piggy hacks in the generic ofpart.c driver, but
> > > >> what we could do however is a DT fixup in the init_machine (or the
> > > >> dt_fixup) hook for imx7 Colibri, very much like this:
> > > >> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
> > > >> Plus a warning there saying "your dt is broken, update your firmware".    
> > > > 
> > > > I have a couple of concerns/question with this approach:
> > > >   - do we have a single point to handle this? Different architectures are
> > > >     affected by these issue. Duplicating the fixup code in multiple place
> > > >     does not seems a great idea
> > > >   - If we believe that the device tree is wrong, in the i.MX7 case
> > > >     because of #size-cells should be set to 0 and not 1, we should not
> > > >     alter the FDT. Other part of the code could rely on this being
> > > >     correctly set to 0 moving forward.
> > > > 
> > > > If I understood you are proposing to have a fixup at the machine level
> > > > that is converting a valid nand-controller node definition to a "broken"
> > > > one. Unless I misunderstood you and you are thinking about rewriting the
> > > > whole MTD partition from a broken definition to a proper one.  
> > 
> > No, quite the opposite.
> >
> > Either size-cell is wrong which makes the description totally
> > inconsistent (if size-cell is there, it must have a use, otherwise why
> > do we keep it?) and we must fix it, or it is right and we should not
> > touch it.
> > 
> > What I propose is to check very early whether the description is
> > consistent on the board known to have this problem. If the description
> > is wrong, we fix it and the generic parser can then do its work
> > properly.  
> 
> What if we add `nand-chip{}` children in the future (the i.MX nand
> controller has nothing implemented not described in the schema so far,
> but it is something that is supported by the hw)? Will this idea still
> works?

I think yes. I mean, moving to a

nand-controller { nand-chip { partitions { part@x part@y } } }

scheme is what we should eventually find on all maintained boards, but
I would say, at the very least, the description must be coherent.

But my previous answer was only focusing on the case where you change
something in the kernel or in the DT that breaks the board because of
the mess fdt_fixup_mtdparts() brings.

> > > > On Thu, Dec 15, 2022 at 09:04:46AM +0100, Miquel Raynal wrote:    
> > > >> marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100:    
> > > >>> Sadly, it does only fix the known cases, not the unknown cases like
> > > >>> downstream forks which never get any bootloader updates ever, and
> > > >>> which you can't find in upstream U-Boot, and which you therefore
> > > >>> cannot easily catch in the arch side fixup.    
> > > >>
> > > >> And ?    
> > > > 
> > > > I'm not personally and directly concerned, since the machine I care are
> > > > all available upstream and known, however this is a general problem with
> > > > U-Boot code being at the same time widely used on a range of embedded
> > > > products and producing a broken MTD partition list.
> > > > 
> > > > I think we will just silently break boards and just creating a lot of
> > > > issues to people. We would just introduce regression to the users, being
> > > > aware of it and deliberately decide to not care and move the problem to
> > > > someone else. I do not think this is a good way to go.    
> > 
> > What?  
> 
> Let me rephrase, I was not clear enough.
> 
> > Since when my proposal is breaking boards? My proposal leads to a
> > situation where:
> > - If you have a board that has an inconsistent description but worked,
> >   it will still work.
> > - If you have a board that has a consistent description and worked, it
> >   will still work.
> > - If your have a board that has an inconsistent description and got
> >   broken *recently* by another change (typically you "fix" the DT in
> >   Linux to comply with the bindings), then you get a warning that leads
> >   you on the right path, you then update your bootloader if you can,
> >   but either way you add your machine compatible to the list of devices
> >   which need the early fix and your boot is fixed.  
> 
> This implies that we can proactively catch all the affected boards. I do
> not believe this is reasonable and because of that my comment before
> about creating regression to the users.

I really don't understand the reasoning here.

What I say is: let's fix the boards known to be incorrectly described
when we break them so they continue working with a broken firmware.

What regression could this possibly bring? I don't care about catching
the 2k boards out there which work but wrongly describe their
partitions. If they work, they will continue working.

You and Marek say: let's blindly always change a property in the DT, no
matter if the board is broken, even if we don't know if this is the
right thing to do, and apply this to the entire world.

But with this approach you're not worried about regressions.

I am sorry it does not stand.

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16 13:37                                               ` Miquel Raynal
@ 2022-12-16 14:32                                                 ` Marek Vasut
  2022-12-16 15:35                                                   ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2022-12-16 14:32 UTC (permalink / raw)
  To: Miquel Raynal, Francesco Dolcini
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 12/16/22 14:37, Miquel Raynal wrote:

Hi,

[...]

>>> What?
>>
>> Let me rephrase, I was not clear enough.
>>
>>> Since when my proposal is breaking boards? My proposal leads to a
>>> situation where:
>>> - If you have a board that has an inconsistent description but worked,
>>>    it will still work.
>>> - If you have a board that has a consistent description and worked, it
>>>    will still work.
>>> - If your have a board that has an inconsistent description and got
>>>    broken *recently* by another change (typically you "fix" the DT in
>>>    Linux to comply with the bindings), then you get a warning that leads
>>>    you on the right path, you then update your bootloader if you can,
>>>    but either way you add your machine compatible to the list of devices
>>>    which need the early fix and your boot is fixed.
>>
>> This implies that we can proactively catch all the affected boards. I do
>> not believe this is reasonable and because of that my comment before
>> about creating regression to the users.
> 
> I really don't understand the reasoning here.
> 
> What I say is: let's fix the boards known to be incorrectly described
> when we break them so they continue working with a broken firmware.

The second part of the message, as far as I understand it, is "ignore 
problems this will cause to users of boards we do not know about, let 
them run into unbootable systems after some linux kernel update, and 
once they suffer through system recovery, make them add compatible 
string to the arch-side workaround".

> What regression could this possibly bring? I don't care about catching
> the 2k boards out there which work but wrongly describe their
> partitions. If they work, they will continue working.

Those boards would start failing once the Linux-side DT size-cells is 
corrected.

Also, this got missed in the previous discussion. If you use only board 
compatible string in arch-side workaround, the workaround would be 
applied even on systems with updated bootloaders, which is likely not 
what we want.

> You and Marek say: let's blindly always change a property in the DT, no
> matter if the board is broken, even if we don't know if this is the
> right thing to do, and apply this to the entire world.

As far as I can tell, if we have partitions in the NAND controller node 
and size-cells=0, then the right thing to do is to override size-cells 
to 1 , because partitions with size-cells=0 make no sense.

If the heuristics here needs to be improved somehow, let's discuss that.

> But with this approach you're not worried about regressions.
> 
> I am sorry it does not stand.

[...]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16 14:32                                                 ` Marek Vasut
@ 2022-12-16 15:35                                                   ` Miquel Raynal
  2022-12-16 16:30                                                     ` Francesco Dolcini
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2022-12-16 15:35 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot

Hi Marek,

marex@denx.de wrote on Fri, 16 Dec 2022 15:32:28 +0100:

> On 12/16/22 14:37, Miquel Raynal wrote:
> 
> Hi,
> 
> [...]
> 
> >>> What?  
> >>
> >> Let me rephrase, I was not clear enough.
> >>  
> >>> Since when my proposal is breaking boards? My proposal leads to a
> >>> situation where:
> >>> - If you have a board that has an inconsistent description but worked,
> >>>    it will still work.
> >>> - If you have a board that has a consistent description and worked, it
> >>>    will still work.
> >>> - If your have a board that has an inconsistent description and got
> >>>    broken *recently* by another change (typically you "fix" the DT in
> >>>    Linux to comply with the bindings), then you get a warning that leads
> >>>    you on the right path, you then update your bootloader if you can,
> >>>    but either way you add your machine compatible to the list of devices
> >>>    which need the early fix and your boot is fixed.  
> >>
> >> This implies that we can proactively catch all the affected boards. I do
> >> not believe this is reasonable and because of that my comment before
> >> about creating regression to the users.  
> > 
> > I really don't understand the reasoning here.
> > 
> > What I say is: let's fix the boards known to be incorrectly described
> > when we break them so they continue working with a broken firmware.  
> 
> The second part of the message, as far as I understand it, is "ignore problems this will cause to users of boards we do not know about, let them run into unbootable systems after some linux kernel update, 

Now you know what kernel update will break them, so you can prevent it
from happening. 

For boards without even a dtsi in the kernel, should we care?

> and once they suffer through system recovery, make them add compatible
> string to the arch-side workaround".
> 
> > What regression could this possibly bring? I don't care about catching
> > the 2k boards out there which work but wrongly describe their
> > partitions. If they work, they will continue working.  
> 
> Those boards would start failing once the Linux-side DT size-cells is corrected.
> 
> Also, this got missed in the previous discussion. If you use only board compatible string in arch-side workaround, the workaround would be applied even on systems with updated bootloaders, which is likely not what we want.

If the heuristics here needs to be improved somehow, let's discuss that ;)

> > You and Marek say: let's blindly always change a property in the DT, no
> > matter if the board is broken, even if we don't know if this is the
> > right thing to do, and apply this to the entire world.  
> 
> As far as I can tell, if we have partitions in the NAND controller node and size-cells=0, then the right thing to do is to override size-cells to 1 , because partitions with size-cells=0 make no sense.

How do you know the firmware did not set #size-cell=0 on purpose to
avoid using the partitions? How would this be more stupid than
updating partitions without setting #size-cell to the right value? Can
you be so sure every board in the world will never do that? And how do
you handle the 64-bit case as well? You _do_no_know_ what applies in
all the cases, guessing is dangerous here, you'll just support new
broken cases or even break existing setups! What you do know however is
what applies for a number of cases your clearly identified. Applying
this logic to *all* cases in simply _broken_ and utterly stupid (tm). I
don't know how to express that so we stop bike-shedding.

> If the heuristics here needs to be improved somehow, let's discuss that.
> 
> > But with this approach you're not worried about regressions.
> > 
> > I am sorry it does not stand.  
> 
> [...]

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16 15:35                                                   ` Miquel Raynal
@ 2022-12-16 16:30                                                     ` Francesco Dolcini
  2023-01-02  9:40                                                       ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2022-12-16 16:30 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Marek Vasut, Francesco Dolcini, Richard Weinberger,
	Vignesh Raghavendra, linux-mtd, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable, u-boot

On Fri, Dec 16, 2022 at 04:35:01PM +0100, Miquel Raynal wrote:
> marex@denx.de wrote on Fri, 16 Dec 2022 15:32:28 +0100:
> > The second part of the message, as far as I understand it, is
> > "ignore problems this will cause to users of boards we do not know
> > about, let them run into unbootable systems after some linux kernel
> > update, 
> 
> Now you know what kernel update will break them, so you can prevent it
> from happening. 
> 
> For boards without even a dtsi in the kernel, should we care?

Would caring for those boards not be just exact the same as caring for
some UEFI/ACPI mess for which no source code is normally available and
nobody really known at which point the various vendors have forked their
source code from some Intel or AMD or whatever reference code?

IMHO we should care for the multiple reason I have already written in my
previous emails.

And honestly, just as a side comment, I would feel way more happy
to know that the elevator control system in the elevator I use everyday
or the chemical industrial plan HMI next to my home is running an up to
date Linux system that is not affected by known security vulnerabilities
and they did stop updating it just because there was some random bug
preventing the updated kernel to boot and nobody had the time/skill to
investigate and fix it. [1]

Francesco

[1] this is pure fictional, I have no elevator in my condo and no
industrial plant next to my home :-), yet I know that this non upstream
boards we are talking about are used every day in such environments ...


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2022-12-16 16:30                                                     ` Francesco Dolcini
@ 2023-01-02  9:40                                                       ` Miquel Raynal
  2023-01-05 11:33                                                         ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2023-01-02  9:40 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Marek Vasut, Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

Hi Francesco,

francesco@dolcini.it wrote on Fri, 16 Dec 2022 17:30:18 +0100:

> On Fri, Dec 16, 2022 at 04:35:01PM +0100, Miquel Raynal wrote:
> > marex@denx.de wrote on Fri, 16 Dec 2022 15:32:28 +0100:  
> > > The second part of the message, as far as I understand it, is
> > > "ignore problems this will cause to users of boards we do not know
> > > about, let them run into unbootable systems after some linux kernel
> > > update,   
> > 
> > Now you know what kernel update will break them, so you can prevent it
> > from happening. 
> > 
> > For boards without even a dtsi in the kernel, should we care?  
> 
> Would caring for those boards not be just exact the same as caring for
> some UEFI/ACPI mess for which no source code is normally available and
> nobody really known at which point the various vendors have forked their
> source code from some Intel or AMD or whatever reference code?

I am sorry I don't know UEFI/ACPI well enough to discuss it.

> IMHO we should care for the multiple reason I have already written in my
> previous emails.
> 
> And honestly, just as a side comment, I would feel way more happy
> to know that the elevator control system in the elevator I use everyday
> or the chemical industrial plan HMI next to my home is running an up to
> date Linux system that is not affected by known security vulnerabilities
> and they did stop updating it just because there was some random bug
> preventing the updated kernel to boot and nobody had the time/skill to
> investigate and fix it. [1]

The issue comes from a very specific U-Boot function that should have
never existed. I hope people working on chemical plants do not make
use of these and will not disregard the "your DT is broken there [...]"
warning we plan to add right before their updated board will fail. We
are not living people in the dark, I agreed for a warning, but I don't
think applying the proposed fix blindly is wise and future-proof.

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2023-01-02  9:40                                                       ` Miquel Raynal
@ 2023-01-05 11:33                                                         ` Miquel Raynal
  2023-01-05 12:47                                                           ` Francesco Dolcini
  0 siblings, 1 reply; 48+ messages in thread
From: Miquel Raynal @ 2023-01-05 11:33 UTC (permalink / raw)
  To: Francesco Dolcini
  Cc: Marek Vasut, Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

Hi Francesco,

miquel.raynal@bootlin.com wrote on Mon, 2 Jan 2023 10:40:04 +0100:

> Hi Francesco,
> 
> francesco@dolcini.it wrote on Fri, 16 Dec 2022 17:30:18 +0100:
> 
> > On Fri, Dec 16, 2022 at 04:35:01PM +0100, Miquel Raynal wrote:  
> > > marex@denx.de wrote on Fri, 16 Dec 2022 15:32:28 +0100:    
> > > > The second part of the message, as far as I understand it, is
> > > > "ignore problems this will cause to users of boards we do not know
> > > > about, let them run into unbootable systems after some linux kernel
> > > > update,     
> > > 
> > > Now you know what kernel update will break them, so you can prevent it
> > > from happening. 
> > > 
> > > For boards without even a dtsi in the kernel, should we care?    
> > 
> > Would caring for those boards not be just exact the same as caring for
> > some UEFI/ACPI mess for which no source code is normally available and
> > nobody really known at which point the various vendors have forked their
> > source code from some Intel or AMD or whatever reference code?  
> 
> I am sorry I don't know UEFI/ACPI well enough to discuss it.
> 
> > IMHO we should care for the multiple reason I have already written in my
> > previous emails.
> > 
> > And honestly, just as a side comment, I would feel way more happy
> > to know that the elevator control system in the elevator I use everyday
> > or the chemical industrial plan HMI next to my home is running an up to
> > date Linux system that is not affected by known security vulnerabilities
> > and they did stop updating it just because there was some random bug
> > preventing the updated kernel to boot and nobody had the time/skill to
> > investigate and fix it. [1]  
> 
> The issue comes from a very specific U-Boot function that should have
> never existed. I hope people working on chemical plants do not make
> use of these and will not disregard the "your DT is broken there [...]"
> warning we plan to add right before their updated board will fail. We
> are not living people in the dark, I agreed for a warning, but I don't
> think applying the proposed fix blindly is wise and future-proof.

Let's move forward with this. Let's assume my fears are baseless. We
might consider the situation where someone tries to hide the partitions
by setting #size-cell to 0 even wronger and too unlikely. Hopefully we
will not break any other existing setups by applying an always-on fix.

I would still like to see U-Boot partitions handling evolve, at least:
- fix #size-cells in fdt_fixup_mtd()
- avoid the fdt_fixup_mtd() call from Collibri boards (ie. an example
  that can be followed by the other users)

On Linux side let's fix #size-cells like you proposed without filtering
against a list of compatibles. We however need to improve the
heuristics:
- Do it only when there are partitions declared within a NAND
  controller node.
- Change the warning to avoid mentioning backward compatibility, just
  mention this is utterly wrong and thus the value will be set to 1
  instead of 0.
- Mention in the comment above this only works on systems with <4GiB
  chips.
If you think about other conditions please feel free to add them.

Do you concur?

Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2023-01-05 11:33                                                         ` Miquel Raynal
@ 2023-01-05 12:47                                                           ` Francesco Dolcini
  2023-01-05 14:51                                                             ` Marek Vasut
  0 siblings, 1 reply; 48+ messages in thread
From: Francesco Dolcini @ 2023-01-05 12:47 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Francesco Dolcini, Marek Vasut, Richard Weinberger,
	Vignesh Raghavendra, linux-mtd, Francesco Dolcini, Shawn Guo,
	linux-arm-kernel, stable, u-boot

Hello Miquel,

On Thu, Jan 05, 2023 at 12:33:34PM +0100, Miquel Raynal wrote:
> miquel.raynal@bootlin.com wrote on Mon, 2 Jan 2023 10:40:04 +0100:
> > francesco@dolcini.it wrote on Fri, 16 Dec 2022 17:30:18 +0100:
> > > On Fri, Dec 16, 2022 at 04:35:01PM +0100, Miquel Raynal wrote:  
> > > > marex@denx.de wrote on Fri, 16 Dec 2022 15:32:28 +0100:    
> > > > > The second part of the message, as far as I understand it, is
> > > > > "ignore problems this will cause to users of boards we do not know
> > > > > about, let them run into unbootable systems after some linux kernel
> > > > > update,     
> > > > 
> > > > Now you know what kernel update will break them, so you can prevent it
> > > > from happening. 
> > > > 
> > > > For boards without even a dtsi in the kernel, should we care?    
> > > 
> > > Would caring for those boards not be just exact the same as caring for
> > > some UEFI/ACPI mess for which no source code is normally available and
> > > nobody really known at which point the various vendors have forked their
> > > source code from some Intel or AMD or whatever reference code?  
> > 
> > I am sorry I don't know UEFI/ACPI well enough to discuss it.
> > 
> > > IMHO we should care for the multiple reason I have already written in my
> > > previous emails.
> > > 
> > > And honestly, just as a side comment, I would feel way more happy
> > > to know that the elevator control system in the elevator I use everyday
> > > or the chemical industrial plan HMI next to my home is running an up to
> > > date Linux system that is not affected by known security vulnerabilities
> > > and they did stop updating it just because there was some random bug
> > > preventing the updated kernel to boot and nobody had the time/skill to
> > > investigate and fix it. [1]  
> > 
> > The issue comes from a very specific U-Boot function that should have
> > never existed. I hope people working on chemical plants do not make
> > use of these and will not disregard the "your DT is broken there [...]"
> > warning we plan to add right before their updated board will fail. We
> > are not living people in the dark, I agreed for a warning, but I don't
> > think applying the proposed fix blindly is wise and future-proof.
> 
> Let's move forward with this. Let's assume my fears are baseless. We
> might consider the situation where someone tries to hide the partitions
> by setting #size-cell to 0 even wronger and too unlikely. Hopefully we
> will not break any other existing setups by applying an always-on fix.

Nice, good!

> I would still like to see U-Boot partitions handling evolve, at least:
> - fix #size-cells in fdt_fixup_mtd()
> - avoid the fdt_fixup_mtd() call from Collibri boards (ie. an example
>   that can be followed by the other users)

Fine, I can do it. 

However I am just not 100% sure about your proposal, I wonder if we
should just deprecate this function or we should fix it.
The exact end result will depend on the discussion with the U-Boot
folks, but I absolutely agree that the current situation needs to
change. I'll keep you in CC on those patches.

> On Linux side let's fix #size-cells like you proposed without filtering
> against a list of compatibles. We however need to improve the
> heuristics:
> - Do it only when there are partitions declared within a NAND
>   controller node.
> - Change the warning to avoid mentioning backward compatibility, just
>   mention this is utterly wrong and thus the value will be set to 1
>   instead of 0.
> - Mention in the comment above this only works on systems with <4GiB
>   chips.
> If you think about other conditions please feel free to add them.
> 
> Do you concur?
Yes, I do agree.

Side comment, I have been recently busy with other life AND work priorities
and this task was just idling on the bottom of my backlog. I do not see
the situation improving that much in the next few weeks.

Said that patches coming, I am committed to have this sorted out before
the next Linux Kernel merge window, for U-Boot the merge window opens in
3 days and I am already late, let's see, this might be as well
considered a fix that is fine for a late integration.

Francesco


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2023-01-05 12:47                                                           ` Francesco Dolcini
@ 2023-01-05 14:51                                                             ` Marek Vasut
  2023-01-05 15:03                                                               ` Miquel Raynal
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Vasut @ 2023-01-05 14:51 UTC (permalink / raw)
  To: Francesco Dolcini, Miquel Raynal
  Cc: Richard Weinberger, Vignesh Raghavendra, linux-mtd,
	Francesco Dolcini, Shawn Guo, linux-arm-kernel, stable, u-boot

On 1/5/23 13:47, Francesco Dolcini wrote:
> Hello Miquel,

Hi,

[...]

>> Let's move forward with this. Let's assume my fears are baseless. We
>> might consider the situation where someone tries to hide the partitions
>> by setting #size-cell to 0 even wronger and too unlikely. Hopefully we
>> will not break any other existing setups by applying an always-on fix.
> 
> Nice, good!

Indeed

>> I would still like to see U-Boot partitions handling evolve, at least:
>> - fix #size-cells in fdt_fixup_mtd()
>> - avoid the fdt_fixup_mtd() call from Collibri boards (ie. an example
>>    that can be followed by the other users)
> 
> Fine, I can do it.
> 
> However I am just not 100% sure about your proposal, I wonder if we
> should just deprecate this function or we should fix it.

I would say fix it.

> The exact end result will depend on the discussion with the U-Boot
> folks, but I absolutely agree that the current situation needs to
> change. I'll keep you in CC on those patches.
> 
>> On Linux side let's fix #size-cells like you proposed without filtering
>> against a list of compatibles. We however need to improve the
>> heuristics:
>> - Do it only when there are partitions declared within a NAND
>>    controller node.
>> - Change the warning to avoid mentioning backward compatibility, just
>>    mention this is utterly wrong and thus the value will be set to 1
>>    instead of 0.
>> - Mention in the comment above this only works on systems with <4GiB
>>    chips.
>> If you think about other conditions please feel free to add them.
>>
>> Do you concur?
> Yes, I do agree.

Same here, agreed, thanks.

[...]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0
  2023-01-05 14:51                                                             ` Marek Vasut
@ 2023-01-05 15:03                                                               ` Miquel Raynal
  0 siblings, 0 replies; 48+ messages in thread
From: Miquel Raynal @ 2023-01-05 15:03 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Francesco Dolcini, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, Francesco Dolcini, Shawn Guo, linux-arm-kernel,
	stable, u-boot


marex@denx.de wrote on Thu, 5 Jan 2023 15:51:10 +0100:

> On 1/5/23 13:47, Francesco Dolcini wrote:
> > Hello Miquel,  
> 
> Hi,
> 
> [...]
> 
> >> Let's move forward with this. Let's assume my fears are baseless. We
> >> might consider the situation where someone tries to hide the partitions
> >> by setting #size-cell to 0 even wronger and too unlikely. Hopefully we
> >> will not break any other existing setups by applying an always-on fix.  
> > 
> > Nice, good!  
> 
> Indeed
> 
> >> I would still like to see U-Boot partitions handling evolve, at least:
> >> - fix #size-cells in fdt_fixup_mtd()
> >> - avoid the fdt_fixup_mtd() call from Collibri boards (ie. an example
> >>    that can be followed by the other users)  
> > 
> > Fine, I can do it.
> > 
> > However I am just not 100% sure about your proposal, I wonder if we
> > should just deprecate this function or we should fix it.  
> 
> I would say fix it.

Well, I think these are two orthogonal changes. The function should be
deprecated *and* fixed for the existing users?

> > The exact end result will depend on the discussion with the U-Boot
> > folks, but I absolutely agree that the current situation needs to
> > change. I'll keep you in CC on those patches.
> >   
> >> On Linux side let's fix #size-cells like you proposed without filtering
> >> against a list of compatibles. We however need to improve the
> >> heuristics:
> >> - Do it only when there are partitions declared within a NAND
> >>    controller node.
> >> - Change the warning to avoid mentioning backward compatibility, just
> >>    mention this is utterly wrong and thus the value will be set to 1
> >>    instead of 0.
> >> - Mention in the comment above this only works on systems with <4GiB
> >>    chips.
> >> If you think about other conditions please feel free to add them.
> >>
> >> Do you concur?  
> > Yes, I do agree.  
> 
> Same here, agreed, thanks.
> 
> [...]


Thanks,
Miquèl

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2023-01-05 22:10 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-02  7:19 [PATCH v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0 Francesco Dolcini
2022-12-02  9:14 ` Miquel Raynal
2022-12-02 10:12   ` Francesco Dolcini
2022-12-02 10:24     ` Francesco Dolcini
2022-12-02 10:53       ` Miquel Raynal
2022-12-02 11:23         ` Francesco Dolcini
2022-12-02 14:05           ` Miquel Raynal
2022-12-02 14:31             ` Marek Vasut
2022-12-02 15:00               ` Miquel Raynal
2022-12-02 15:23                 ` Marek Vasut
2022-12-02 15:49                   ` Miquel Raynal
2022-12-02 16:01                     ` Miquel Raynal
2022-12-02 16:17                     ` Marek Vasut
2022-12-02 16:42                       ` Miquel Raynal
2022-12-02 16:52                         ` Marek Vasut
2022-12-02 16:57                           ` Miquel Raynal
2022-12-02 17:08                             ` Marek Vasut
2022-12-05 11:26                               ` Francesco Dolcini
2022-12-05 13:49                                 ` Miquel Raynal
2022-12-05 16:25                                   ` Marek Vasut
2022-12-15  7:16                                     ` Miquel Raynal
2022-12-15  7:45                                       ` Marek Vasut
2022-12-15  8:04                                         ` Miquel Raynal
2022-12-16  0:36                                           ` Marek Vasut
2022-12-16  7:52                                             ` Francesco Dolcini
2022-12-16  7:45                                       ` Francesco Dolcini
2022-12-16 10:46                                         ` Marek Vasut
2022-12-16 11:01                                           ` Miquel Raynal
2022-12-16 12:37                                             ` Francesco Dolcini
2022-12-16 13:37                                               ` Miquel Raynal
2022-12-16 14:32                                                 ` Marek Vasut
2022-12-16 15:35                                                   ` Miquel Raynal
2022-12-16 16:30                                                     ` Francesco Dolcini
2023-01-02  9:40                                                       ` Miquel Raynal
2023-01-05 11:33                                                         ` Miquel Raynal
2023-01-05 12:47                                                           ` Francesco Dolcini
2023-01-05 14:51                                                             ` Marek Vasut
2023-01-05 15:03                                                               ` Miquel Raynal
2022-12-02 17:20                         ` Francesco Dolcini
2022-12-05 11:30                           ` Francesco Dolcini
2022-12-05 15:28                             ` Miquel Raynal
2022-12-02 16:45                       ` Francesco Dolcini
2022-12-02 17:05                         ` Miquel Raynal
2022-12-02 15:56               ` Thorsten Leemhuis
2022-12-04 12:50                 ` Marek Vasut
2022-12-04 12:59                   ` Thorsten Leemhuis
2022-12-04 15:50                     ` Marek Vasut
2022-12-02 12:43 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).