linux-kernel.vger.kernel.org archive mirror
* [PATCH] PCI: imx6: install the fault handler only if we are really running on a compatible device
@ 2023-02-28  8:43 H. Nikolaus Schaller
  2023-03-01  2:57 ` Hongxing Zhu
  2023-03-08 18:49 ` Bjorn Helgaas
  0 siblings, 2 replies; 4+ messages in thread
From: H. Nikolaus Schaller @ 2023-02-28  8:43 UTC (permalink / raw)
  To: Richard Zhu, Lucas Stach, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Bjorn Helgaas, Shawn Guo,
	Sascha Hauer
  Cc: Rob Herring, Pengutronix Kernel Team, Fabio Estevam,
	NXP Linux Team, linux-pci, linux-arm-kernel, linux-kernel,
	letux-kernel, kernel, H. Nikolaus Schaller

commit bb38919ec56e ("PCI: imx6: Add support for i.MX6 PCIe controller")
added a fault hook to this driver in the probe function. So it was only
installed if needed.

commit bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO")
moved it from probe to driver init which installs the hook unconditionally
as soon as the driver is compiled into a kernel.

When this driver is compiled as a module, the hook is not registered
until after the driver has been matched with a .compatible and
loaded.

commit 415b6185c541 ("PCI: imx6: Fix config read timeout handling")
extended the fault handling code.

commit 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
added some protection for non-ARM architectures, but this does not
protect non-i.MX ARM architectures.

Since fault handlers can be triggered on any architecture for different
reasons, there is no guarantee that they will be triggered only for the
assumed situation, leading to improper error handling (i.MX6-specific
imx6q_pcie_abort_handler) on foreign systems.

I had seen strange L3 imprecise external abort messages several times on
OMAP4 and OMAP5 devices and couldn't make sense of them until I realized
they were related to this unused imx6q driver because I had
CONFIG_PCI_IMX6=y.

Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed
to run on different ARM SoC and be differentiated only by device tree
binaries. So turning off CONFIG_PCI_IMX6 is not a solution.

Therefore we check the compatible in the init function before registering
the fault handler.

Fixes: bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO")
Fixes: 415b6185c541 ("PCI: imx6: Fix config read timeout handling")
Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")

Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
---
 drivers/pci/controller/dwc/pci-imx6.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index 1dde5c579edc8..89774aa187ae8 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -1402,6 +1402,15 @@ DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_SYNOPSYS, 0xabcd,
 static int __init imx6_pcie_init(void)
 {
 #ifdef CONFIG_ARM
+	const struct of_device_id *reboot_id;
+	struct device_node *np;
+
+	np = of_find_matching_node_and_match(NULL, imx6_pcie_of_match,
+					     &reboot_id);
+	if (!np)
+		return -ENODEV;
+	of_node_put(np);
+
 	/*
 	 * Since probe() can be deferred we need to make sure that
 	 * hook_fault_code is not called after __init memory is freed
-- 
2.38.1



* RE: [PATCH] PCI: imx6: install the fault handler only if we are really running on a compatible device
  2023-02-28  8:43 [PATCH] PCI: imx6: install the fault handler only if we are really running on a compatible device H. Nikolaus Schaller
@ 2023-03-01  2:57 ` Hongxing Zhu
  2023-03-08 18:49 ` Bjorn Helgaas
  1 sibling, 0 replies; 4+ messages in thread
From: Hongxing Zhu @ 2023-03-01  2:57 UTC (permalink / raw)
  To: H. Nikolaus Schaller, Lucas Stach, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Bjorn Helgaas, Shawn Guo,
	Sascha Hauer
  Cc: Rob Herring, Pengutronix Kernel Team, Fabio Estevam,
	dl-linux-imx, linux-pci, linux-arm-kernel, linux-kernel,
	letux-kernel, kernel



> -----Original Message-----
> From: H. Nikolaus Schaller <hns@goldelico.com>
> Sent: February 28, 2023 16:44
> To: Hongxing Zhu <hongxing.zhu@nxp.com>; Lucas Stach
> <l.stach@pengutronix.de>; Lorenzo Pieralisi <lpieralisi@kernel.org>; Krzysztof
> Wilczyński <kw@linux.com>; Bjorn Helgaas <bhelgaas@google.com>; Shawn
> Guo <shawnguo@kernel.org>; Sascha Hauer <s.hauer@pengutronix.de>
> Cc: Rob Herring <robh@kernel.org>; Pengutronix Kernel Team
> <kernel@pengutronix.de>; Fabio Estevam <festevam@gmail.com>; dl-linux-imx
> <linux-imx@nxp.com>; linux-pci@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org;
> letux-kernel@openphoenux.org; kernel@pyra-handheld.com; H. Nikolaus
> Schaller <hns@goldelico.com>
> Subject: [PATCH] PCI: imx6: install the fault handler only if we are really running
> on a compatible device
> 
> commit bb38919ec56e ("PCI: imx6: Add support for i.MX6 PCIe controller")
> added a fault hook to this driver in the probe function. So it was only installed if
> needed.
> 
> commit bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO") moved
> it from probe to driver init which installs the hook unconditionally as soon as the
> driver is compiled into a kernel.
> 
> When this driver is compiled as a module, the hook is not registered until after
> the driver has been matched with a .compatible and loaded.
> 
> commit 415b6185c541 ("PCI: imx6: Fix config read timeout handling") extended
> the fault handling code.
> 
> commit 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") added some
> protection for non-ARM architectures, but this does not protect non-i.MX ARM
> architectures.
> 
> Since fault handlers can be triggered on any architecture for different reasons,
> there is no guarantee that they will be triggered only for the assumed situation,
> leading to improper error handling (i.MX6-specific
> imx6q_pcie_abort_handler) on foreign systems.
> 
> I had seen strange L3 imprecise external abort messages several times on
> OMAP4 and OMAP5 devices and couldn't make sense of them until I realized
> they were related to this unused imx6q driver because I had
> CONFIG_PCI_IMX6=y.
> 
> Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed to
> run on different ARM SoC and be differentiated only by device tree binaries. So
> turning off CONFIG_PCI_IMX6 is not a solution.
> 
> Therefore we check the compatible in the init function before registering the
> fault handler.
> 
> Fixes: bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO")
> Fixes: 415b6185c541 ("PCI: imx6: Fix config read timeout handling")
> Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
> 
> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Hi H.Nikolaus:
I'm fine with these changes. Thanks.
Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>

Best Regards
Richard Zhu
> ---
>  drivers/pci/controller/dwc/pci-imx6.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
> index 1dde5c579edc8..89774aa187ae8 100644
> --- a/drivers/pci/controller/dwc/pci-imx6.c
> +++ b/drivers/pci/controller/dwc/pci-imx6.c
> @@ -1402,6 +1402,15 @@ DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_SYNOPSYS, 0xabcd,
>  static int __init imx6_pcie_init(void)
>  {
>  #ifdef CONFIG_ARM
> +	const struct of_device_id *reboot_id;
> +	struct device_node *np;
> +
> +	np = of_find_matching_node_and_match(NULL, imx6_pcie_of_match,
> +					     &reboot_id);
> +	if (!np)
> +		return -ENODEV;
> +	of_node_put(np);
> +
>  	/*
>  	 * Since probe() can be deferred we need to make sure that
>  	 * hook_fault_code is not called after __init memory is freed
> --
> 2.38.1



* Re: [PATCH] PCI: imx6: install the fault handler only if we are really running on a compatible device
  2023-02-28  8:43 [PATCH] PCI: imx6: install the fault handler only if we are really running on a compatible device H. Nikolaus Schaller
  2023-03-01  2:57 ` Hongxing Zhu
@ 2023-03-08 18:49 ` Bjorn Helgaas
  2023-03-08 20:40   ` H. Nikolaus Schaller
  1 sibling, 1 reply; 4+ messages in thread
From: Bjorn Helgaas @ 2023-03-08 18:49 UTC (permalink / raw)
  To: H. Nikolaus Schaller
  Cc: Richard Zhu, Lucas Stach, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Bjorn Helgaas, Shawn Guo,
	Sascha Hauer, Rob Herring, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-pci, linux-arm-kernel,
	linux-kernel, letux-kernel, kernel

On Tue, Feb 28, 2023 at 09:43:54AM +0100, H. Nikolaus Schaller wrote:
> commit bb38919ec56e ("PCI: imx6: Add support for i.MX6 PCIe controller")
> added a fault hook to this driver in the probe function. So it was only
> installed if needed.
> 
> commit bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO")
> moved it from probe to driver init which installs the hook unconditionally
> as soon as the driver is compiled into a kernel.
> 
> When this driver is compiled as a module, the hook is not registered
> until after the driver has been matched with a .compatible and
> loaded.
> 
> commit 415b6185c541 ("PCI: imx6: Fix config read timeout handling")
> extended the fault handling code.
> 
> commit 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
> added some protection for non-ARM architectures, but this does not
> protect non-i.MX ARM architectures.

Are *all* these commits relevant?  Question also applies to Fixes:
below.

> Since fault handlers can be triggered on any architecture for different
> reasons, there is no guarantee that they will be triggered only for the
> assumed situation, leading to improper error handling (i.MX6-specific
> imx6q_pcie_abort_handler) on foreign systems.
> 
> I had seen strange L3 imprecise external abort messages several times on
> OMAP4 and OMAP5 devices and couldn't make sense of them until I realized
> they were related to this unused imx6q driver because I had
> CONFIG_PCI_IMX6=y.

Apparently imx6q_pcie_abort_handler() assumes it is always called
because of a PCI abort?  If so, that sounds problematic.

If non-PCI imprecise aborts happen on OMAP4 and OMAP5 where imx6q is
unused and imx6q_pcie_abort_handler() is not appropriate, I assume
similar non-PCI aborts can also happen on systems where imx6q *is*
used.

So imx6q_pcie_abort_handler() may be trying to fixup non-PCI aborts
when it shouldn't?

> Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed
> to run on different ARM SoC and be differentiated only by device tree
> binaries. So turning off CONFIG_PCI_IMX6 is not a solution.
> 
> Therefore we check the compatible in the init function before registering
> the fault handler.
> 
> Fixes: bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO")
> Fixes: 415b6185c541 ("PCI: imx6: Fix config read timeout handling")
> Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
> 
> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
> ---
>  drivers/pci/controller/dwc/pci-imx6.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
> index 1dde5c579edc8..89774aa187ae8 100644
> --- a/drivers/pci/controller/dwc/pci-imx6.c
> +++ b/drivers/pci/controller/dwc/pci-imx6.c
> @@ -1402,6 +1402,15 @@ DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_SYNOPSYS, 0xabcd,
>  static int __init imx6_pcie_init(void)
>  {
>  #ifdef CONFIG_ARM
> +	const struct of_device_id *reboot_id;
> +	struct device_node *np;
> +
> +	np = of_find_matching_node_and_match(NULL, imx6_pcie_of_match,
> +					     &reboot_id);

Since you don't need reboot_id, I think you should use
of_find_matching_node() instead.

> +	if (!np)
> +		return -ENODEV;
> +	of_node_put(np);
> +
>  	/*
>  	 * Since probe() can be deferred we need to make sure that
>  	 * hook_fault_code is not called after __init memory is freed
> -- 
> 2.38.1
> 


* Re: [PATCH] PCI: imx6: install the fault handler only if we are really running on a compatible device
  2023-03-08 18:49 ` Bjorn Helgaas
@ 2023-03-08 20:40   ` H. Nikolaus Schaller
  0 siblings, 0 replies; 4+ messages in thread
From: H. Nikolaus Schaller @ 2023-03-08 20:40 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Richard Zhu, Lucas Stach, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Bjorn Helgaas, Shawn Guo,
	Sascha Hauer, Rob Herring, Pengutronix Kernel Team,
	Fabio Estevam, NXP Linux Team, linux-pci, linux-arm-kernel,
	linux-kernel, letux-kernel, kernel

Hi Bjorn,

> Am 08.03.2023 um 19:49 schrieb Bjorn Helgaas <helgaas@kernel.org>:
> 
> On Tue, Feb 28, 2023 at 09:43:54AM +0100, H. Nikolaus Schaller wrote:
>> commit bb38919ec56e ("PCI: imx6: Add support for i.MX6 PCIe controller")
>> added a fault hook to this driver in the probe function. So it was only
>> installed if needed.
>> 
>> commit bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO")
>> moved it from probe to driver init which installs the hook unconditionally
>> as soon as the driver is compiled into a kernel.
>> 
>> When this driver is compiled as a module, the hook is not registered
>> until after the driver has been matched with a .compatible and
>> loaded.
>> 
>> commit 415b6185c541 ("PCI: imx6: Fix config read timeout handling")
>> extended the fault handling code.
>> 
>> commit 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
>> added some protection for non-ARM architectures, but this does not
>> protect non-i.MX ARM architectures.
> 
> Are *all* these commits relevant?

Yes, it was correct when introduced by commit bb38919ec56e, for a good reason.
It was broken by bde4a5a00e76, and all later attempts made it worse.

>  Question also applies to Fixes:
> below.

It fixes everything between bde4a5a00e76 and HEAD. Well, one can argue that
commit bde4a5a00e76 alone could be sufficient for Fixes:

I don't know whether that would be a problem, because I have no overview of
the side effects.

> 
>> Since fault handlers can be triggered on any architecture for different
>> reasons, there is no guarantee that they will be triggered only for the
>> assumed situation, leading to improper error handling (i.MX6-specific
>> imx6q_pcie_abort_handler) on foreign systems.
>> 
>> I had seen strange L3 imprecise external abort messages several times on
>> OMAP4 and OMAP5 devices and couldn't make sense of them until I realized
>> they were related to this unused imx6q driver because I had
>> CONFIG_PCI_IMX6=y.
> 
> Apparently imx6q_pcie_abort_handler() assumes it is always called
> because of a PCI abort?  If so, that sounds problematic.

> 
> If non-PCI imprecise aborts happen on OMAP4 and OMAP5 where imx6q is
> unused and imx6q_pcie_abort_handler() is not appropriate, I assume
> similar non-PCI aborts can also happen on systems where imx6q *is*
> used.

As far as I know the reasons why imprecise aborts occur may be SoC specific.

I have no experience with i.MX6, so I cannot judge this. My goal is to shield
other architectures from this fault handler, whether it is correct or not.

> So imx6q_pcie_abort_handler() may be trying to fixup non-PCI aborts
> when it shouldn't?

Yes, at least if it is triggered on OMAP4/OMAP5 by accessing non-existing
registers in some subsystems (e.g. through devmem2).

> 
>> Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed
>> to run on different ARM SoC and be differentiated only by device tree
>> binaries. So turning off CONFIG_PCI_IMX6 is not a solution.
>> 
>> Therefore we check the compatible in the init function before registering
>> the fault handler.
>> 
>> Fixes: bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO")
>> Fixes: 415b6185c541 ("PCI: imx6: Fix config read timeout handling")
>> Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
>> 
>> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
>> ---
>> drivers/pci/controller/dwc/pci-imx6.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>> 
>> diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
>> index 1dde5c579edc8..89774aa187ae8 100644
>> --- a/drivers/pci/controller/dwc/pci-imx6.c
>> +++ b/drivers/pci/controller/dwc/pci-imx6.c
>> @@ -1402,6 +1402,15 @@ DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_SYNOPSYS, 0xabcd,
>> static int __init imx6_pcie_init(void)
>> {
>> #ifdef CONFIG_ARM
>> +	const struct of_device_id *reboot_id;
>> +	struct device_node *np;
>> +
>> +	np = of_find_matching_node_and_match(NULL, imx6_pcie_of_match,
>> +					     &reboot_id);
> 
> Since you don't need reboot_id, I think you should use
> of_find_matching_node() instead.

Well, I used it for debugging, but for production code it has indeed no benefit.

of_find_matching_node() is just a static inline wrapper for
of_find_matching_node_and_match() with a NULL match parameter, but we can
save one stack slot.

I'll send a v2 soon.
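
To illustrate the idea, here is a minimal userspace sketch of the guarded
init. It is a model under stated assumptions, not the kernel code and not
the v2 patch: all model_* names are hypothetical stand-ins for
of_find_matching_node_and_match(), of_find_matching_node() and
hook_fault_code(), and the compatible strings in the table are only
illustrative entries.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct device_node / struct of_device_id. */
struct model_node {
	const char *compatible;
};

struct model_match {
	const char *compatible;	/* NULL terminates the table */
};

/*
 * Models of_find_matching_node_and_match(): return the first node whose
 * compatible string is listed in the table and, if requested, report
 * which table entry matched.
 */
static const struct model_node *
model_find_node_and_match(const struct model_node *nodes, size_t count,
			  const struct model_match *table,
			  const struct model_match **match)
{
	const struct model_match *m;
	size_t i;

	assert(table != NULL);
	if (match)
		*match = NULL;

	for (i = 0; i < count; i++) {
		for (m = table; m->compatible; m++) {
			if (strcmp(nodes[i].compatible, m->compatible) == 0) {
				if (match)
					*match = m;
				return &nodes[i];
			}
		}
	}
	return NULL;
}

/* Models of_find_matching_node(): same lookup, match entry not wanted. */
static const struct model_node *
model_find_node(const struct model_node *nodes, size_t count,
		const struct model_match *table)
{
	return model_find_node_and_match(nodes, count, table, NULL);
}

/* Illustrative match table; the real one is imx6_pcie_of_match. */
static const struct model_match model_imx6_match[] = {
	{ "fsl,imx6q-pcie" },
	{ "fsl,imx8mq-pcie" },
	{ NULL }
};

static int model_hook_installed;	/* stands in for hook_fault_code() */

/* Models the guarded init: install the hook only if some node matches. */
static int model_init(const struct model_node *nodes, size_t count)
{
	if (!model_find_node(nodes, count, model_imx6_match))
		return -ENODEV;

	model_hook_installed = 1;
	return 0;
}
```

With a node list that has no matching compatible, model_init() returns
-ENODEV and leaves the hook untouched, which is the behaviour wanted on
OMAP4/OMAP5. With a matching node it installs the hook as before.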

> 
>> +	if (!np)
>> +		return -ENODEV;
>> +	of_node_put(np);
>> +
>> 	/*
>> 	 * Since probe() can be deferred we need to make sure that
>> 	 * hook_fault_code is not called after __init memory is freed
>> -- 
>> 2.38.1
>> 

BR and thanks,
Nikolaus

