All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alice Guo (OSS)" <alice.guo@oss.nxp.com>
To: Dominique MARTINET <dominique.martinet@atmark-techno.com>,
	Shawn Guo <shawnguo@kernel.org>,
	Krzysztof Kozlowski <krzk@kernel.org>
Cc: "robh+dt@kernel.org" <robh+dt@kernel.org>,
	"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>,
	Peng Fan <peng.fan@nxp.com>, dl-linux-imx <linux-imx@nxp.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [EXT] regression due to soc_device_match not handling defer (Was: [PATCH v4 4/4] soc: imx8m: change to use platform driver)
Date: Tue, 30 Mar 2021 02:41:23 +0000	[thread overview]
Message-ID: <AM6PR04MB60536EF0DEEE6EB64CF29390E27D9@AM6PR04MB6053.eurprd04.prod.outlook.com> (raw)
In-Reply-To: <YGGZJjAxA1IO+/VU@atmark-techno.com>

Hi,

Thanks for reporting this issue, I'll check and add a fix to handle defer probe.

Best regards,
Alice Guo

> -----Original Message-----
> From: Dominique MARTINET <dominique.martinet@atmark-techno.com>
> Sent: 2021年3月29日 17:09
> To: Alice Guo <alice.guo@nxp.com>; Shawn Guo <shawnguo@kernel.org>;
> Krzysztof Kozlowski <krzk@kernel.org>
> Cc: robh+dt@kernel.org; devicetree@vger.kernel.org; Peng Fan
> <peng.fan@nxp.com>; dl-linux-imx <linux-imx@nxp.com>;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org
> Subject: [EXT] regression due to soc_device_match not handling defer (Was:
> [PATCH v4 4/4] soc: imx8m: change to use platform driver)
> 
> Caution: EXT Email
> 
> Hi,
> 
> First thanks for the patch, it fixes the kexec hang I was looking at...
> 
> Unfortunately it means the soc is now init much later and other piece of drivers
> that depend on querying the soc will fail, I am getting a BUG in the caam crypto
> driver from 7d981405d0fd ("soc: imx8m: change to use platform driver") +
> ce58459d8c7f ("arm64: dts: imx8m: add SoC ID
> compatible") on the imx8mp evk as follow:
> 
> [    2.575505] caam 30900000.crypto: caam pkc algorithms registered in
> /proc/crypto
> [    2.588986] caam 30900000.crypto: registering rng-caam
> [    2.594168] caam_jr 30901000.jr: job ring error: irqstate: 00000103
> [    2.600492] ------------[ cut here ]------------
> [    2.605109] kernel BUG at drivers/crypto/caam/jr.c:187!
> [    2.610338] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [    2.615829] Modules linked in:
> [    2.618895] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc5 #8
> [    2.625168] Hardware name: NXP i.MX8MPlus EVK board (DT)
> [    2.630482] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
> [    2.636493] pc : caam_jr_interrupt+0x150/0x154
> [    2.640944] lr : caam_jr_interrupt+0x150/0x154
> [    2.645396] sp : ffff800010003e90
> [    2.648709] x29: ffff800010003e90 x28: ffff800011f82e80
> [    2.654035] x27: ffff800011cd2000 x26: ffff0000c1988400
> [    2.659353] x25: ffff800011cd2000 x24: ffff800011f789e0
> [    2.664674] x23: 000000000000002e x22: ffff800012261000
> [    2.669994] x21: ffff0000c1979410 x20: ffff0000c1988a80
> [    2.675315] x19: 0000000000000103 x18: 0000000000000000
> [    2.680637] x17: 0000000000000007 x16: 000000000000000e
> [    2.685958] x15: 0000000000000030 x14: ffffffffffffffff
> [    2.691279] x13: ffff800090003aa7 x12: ffff800010003aaf
> [    2.696601] x11: 0000000000000003 x10: 0101010101010101
> [    2.701921] x9 : ffff8000100039d0 x8 : ffff800011fa3830
> [    2.707242] x7 : ffff800011ffb830 x6 : ffff800011ffb830
> [    2.712560] x5 : 0000000000000000 x4 : 0000000000000000
> [    2.717881] x3 : 0000000000000000 x2 : 0000000000000000
> [    2.723203] x1 : 0000000000000000 x0 : ffff800011f82e80
> [    2.728528] Call trace:
> [    2.730976]  caam_jr_interrupt+0x150/0x154
> [    2.735080]  __handle_irq_event_percpu+0x6c/0x280
> [    2.739791]  handle_irq_event+0x70/0x160
> [    2.743716]  handle_fasteoi_irq+0xb0/0x200
> [    2.747822]  __handle_domain_irq+0x8c/0xf0
> [    2.751924]  gic_handle_irq+0xd8/0x160
> [    2.755683]  el1_irq+0xb4/0x180
> [    2.758829]  arch_cpu_idle+0x18/0x30
> [    2.762412]  default_idle_call+0x4c/0x1d0
> [    2.766431]  do_idle+0x238/0x2b0
> [    2.769664]  cpu_startup_entry+0x30/0x7c
> [    2.773595]  rest_init+0xe4/0xf4
> [    2.776832]  arch_call_rest_init+0x1c/0x28
> [    2.780937]  start_kernel+0x570/0x5a8
> [    2.784602]  0x0
> [    2.786452] Code: aa1503e0 90005c41 91108021 940da95d (d4210000)
> [    2.792560] ---[ end trace 968b8515172abc2e ]---
> [    2.797181] Kernel panic - not syncing: Oops - BUG: Fatal exception in
> interrupt
> [    2.804580] SMP: stopping secondary CPUs
> [    2.808508] Kernel Offset: disabled
> [    2.811998] CPU features: 0x00240002,2000200c
> [    2.816360] Memory Limit: none
> [    2.819419] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception
> in interrupt ]---
> 
> 
> This particular crash can be fixed by making the caam driver delay as well if the
> device isn't inited yet, e.g. this works:
> --------
> diff --git a/drivers/base/soc.c b/drivers/base/soc.c index
> 0af5363a582c..f179f9e55b49 100644
> --- a/drivers/base/soc.c
> +++ b/drivers/base/soc.c
> @@ -36,6 +36,7 @@ static DEVICE_ATTR(family,            0444,
> soc_info_show,  NULL);
>  static DEVICE_ATTR(serial_number,      0444, soc_info_show,  NULL);
>  static DEVICE_ATTR(soc_id,             0444, soc_info_show,  NULL);
>  static DEVICE_ATTR(revision,           0444, soc_info_show,  NULL);
> +static int init_done;
> 
>  struct device *soc_device_to_device(struct soc_device *soc_dev)  { @@
> -157,6 +158,7 @@ struct soc_device *soc_device_register(struct
> soc_device_attribute *soc_dev_attr
>                 return ERR_PTR(ret);
>         }
> 
> +       init_done = true;
>         return soc_dev;
> 
>  out3:
> @@ -243,6 +245,9 @@ const struct soc_device_attribute
> *soc_device_match(  {
>         int ret = 0;
> 
> +       if (!init_done)
> +               return ERR_PTR(-EPROBE_DEFER);
> +
>         if (!matches)
>                 return NULL;
> 
> diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c index
> ca0361b2dbb0..d08f8cc4131f 100644
> --- a/drivers/crypto/caam/ctrl.c
> +++ b/drivers/crypto/caam/ctrl.c
> @@ -635,6 +635,9 @@ static int caam_probe(struct platform_device *pdev)
>         nprop = pdev->dev.of_node;
> 
>         imx_soc_match = soc_device_match(caam_imx_soc_table);
> +       if (IS_ERR(imx_soc_match))
> +               return PTR_ERR(imx_soc_match);
> +
>         caam_imx = (bool)imx_soc_match;
> 
>         if (imx_soc_match) {
> -------
> 
> But it obviously doesn't cover the ~50 (!) other soc_device_match users in the
> code base which would all need to start handling potential error return code.
> 
> (I also had problems with other drivers when trying to backport the patch to the
> 5.4.70_2.3.0 imx kernel but I just gave up on that one)
> 
> 
> I think having all drivers handle potential EPROBE_DEFER errors is the best way
> forward, how do you propose to move on?
> I can do some but have no way of testing most of these so am a bit reluctant
> to...
> 
> Thanks,
> --
> Dominique

WARNING: multiple messages have this Message-ID (diff)
From: "Alice Guo (OSS)" <alice.guo@oss.nxp.com>
To: Dominique MARTINET <dominique.martinet@atmark-techno.com>,
	Shawn Guo <shawnguo@kernel.org>,
	Krzysztof Kozlowski <krzk@kernel.org>
Cc: "robh+dt@kernel.org" <robh+dt@kernel.org>,
	"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>,
	Peng Fan <peng.fan@nxp.com>, dl-linux-imx <linux-imx@nxp.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [EXT] regression due to soc_device_match not handling defer (Was: [PATCH v4 4/4] soc: imx8m: change to use platform driver)
Date: Tue, 30 Mar 2021 02:41:23 +0000	[thread overview]
Message-ID: <AM6PR04MB60536EF0DEEE6EB64CF29390E27D9@AM6PR04MB6053.eurprd04.prod.outlook.com> (raw)
In-Reply-To: <YGGZJjAxA1IO+/VU@atmark-techno.com>

Hi,

Thanks for reporting this issue, I'll check and add a fix to handle defer probe.

Best regards,
Alice Guo

> -----Original Message-----
> From: Dominique MARTINET <dominique.martinet@atmark-techno.com>
> Sent: 2021年3月29日 17:09
> To: Alice Guo <alice.guo@nxp.com>; Shawn Guo <shawnguo@kernel.org>;
> Krzysztof Kozlowski <krzk@kernel.org>
> Cc: robh+dt@kernel.org; devicetree@vger.kernel.org; Peng Fan
> <peng.fan@nxp.com>; dl-linux-imx <linux-imx@nxp.com>;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org
> Subject: [EXT] regression due to soc_device_match not handling defer (Was:
> [PATCH v4 4/4] soc: imx8m: change to use platform driver)
> 
> Caution: EXT Email
> 
> Hi,
> 
> First thanks for the patch, it fixes the kexec hang I was looking at...
> 
> Unfortunately it means the soc is now init much later and other piece of drivers
> that depend on querying the soc will fail, I am getting a BUG in the caam crypto
> driver from 7d981405d0fd ("soc: imx8m: change to use platform driver") +
> ce58459d8c7f ("arm64: dts: imx8m: add SoC ID
> compatible") on the imx8mp evk as follow:
> 
> [    2.575505] caam 30900000.crypto: caam pkc algorithms registered in
> /proc/crypto
> [    2.588986] caam 30900000.crypto: registering rng-caam
> [    2.594168] caam_jr 30901000.jr: job ring error: irqstate: 00000103
> [    2.600492] ------------[ cut here ]------------
> [    2.605109] kernel BUG at drivers/crypto/caam/jr.c:187!
> [    2.610338] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [    2.615829] Modules linked in:
> [    2.618895] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc5 #8
> [    2.625168] Hardware name: NXP i.MX8MPlus EVK board (DT)
> [    2.630482] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
> [    2.636493] pc : caam_jr_interrupt+0x150/0x154
> [    2.640944] lr : caam_jr_interrupt+0x150/0x154
> [    2.645396] sp : ffff800010003e90
> [    2.648709] x29: ffff800010003e90 x28: ffff800011f82e80
> [    2.654035] x27: ffff800011cd2000 x26: ffff0000c1988400
> [    2.659353] x25: ffff800011cd2000 x24: ffff800011f789e0
> [    2.664674] x23: 000000000000002e x22: ffff800012261000
> [    2.669994] x21: ffff0000c1979410 x20: ffff0000c1988a80
> [    2.675315] x19: 0000000000000103 x18: 0000000000000000
> [    2.680637] x17: 0000000000000007 x16: 000000000000000e
> [    2.685958] x15: 0000000000000030 x14: ffffffffffffffff
> [    2.691279] x13: ffff800090003aa7 x12: ffff800010003aaf
> [    2.696601] x11: 0000000000000003 x10: 0101010101010101
> [    2.701921] x9 : ffff8000100039d0 x8 : ffff800011fa3830
> [    2.707242] x7 : ffff800011ffb830 x6 : ffff800011ffb830
> [    2.712560] x5 : 0000000000000000 x4 : 0000000000000000
> [    2.717881] x3 : 0000000000000000 x2 : 0000000000000000
> [    2.723203] x1 : 0000000000000000 x0 : ffff800011f82e80
> [    2.728528] Call trace:
> [    2.730976]  caam_jr_interrupt+0x150/0x154
> [    2.735080]  __handle_irq_event_percpu+0x6c/0x280
> [    2.739791]  handle_irq_event+0x70/0x160
> [    2.743716]  handle_fasteoi_irq+0xb0/0x200
> [    2.747822]  __handle_domain_irq+0x8c/0xf0
> [    2.751924]  gic_handle_irq+0xd8/0x160
> [    2.755683]  el1_irq+0xb4/0x180
> [    2.758829]  arch_cpu_idle+0x18/0x30
> [    2.762412]  default_idle_call+0x4c/0x1d0
> [    2.766431]  do_idle+0x238/0x2b0
> [    2.769664]  cpu_startup_entry+0x30/0x7c
> [    2.773595]  rest_init+0xe4/0xf4
> [    2.776832]  arch_call_rest_init+0x1c/0x28
> [    2.780937]  start_kernel+0x570/0x5a8
> [    2.784602]  0x0
> [    2.786452] Code: aa1503e0 90005c41 91108021 940da95d (d4210000)
> [    2.792560] ---[ end trace 968b8515172abc2e ]---
> [    2.797181] Kernel panic - not syncing: Oops - BUG: Fatal exception in
> interrupt
> [    2.804580] SMP: stopping secondary CPUs
> [    2.808508] Kernel Offset: disabled
> [    2.811998] CPU features: 0x00240002,2000200c
> [    2.816360] Memory Limit: none
> [    2.819419] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception
> in interrupt ]---
> 
> 
> This particular crash can be fixed by making the caam driver delay as well if the
> device isn't inited yet, e.g. this works:
> --------
> diff --git a/drivers/base/soc.c b/drivers/base/soc.c index
> 0af5363a582c..f179f9e55b49 100644
> --- a/drivers/base/soc.c
> +++ b/drivers/base/soc.c
> @@ -36,6 +36,7 @@ static DEVICE_ATTR(family,            0444,
> soc_info_show,  NULL);
>  static DEVICE_ATTR(serial_number,      0444, soc_info_show,  NULL);
>  static DEVICE_ATTR(soc_id,             0444, soc_info_show,  NULL);
>  static DEVICE_ATTR(revision,           0444, soc_info_show,  NULL);
> +static int init_done;
> 
>  struct device *soc_device_to_device(struct soc_device *soc_dev)  { @@
> -157,6 +158,7 @@ struct soc_device *soc_device_register(struct
> soc_device_attribute *soc_dev_attr
>                 return ERR_PTR(ret);
>         }
> 
> +       init_done = true;
>         return soc_dev;
> 
>  out3:
> @@ -243,6 +245,9 @@ const struct soc_device_attribute
> *soc_device_match(  {
>         int ret = 0;
> 
> +       if (!init_done)
> +               return ERR_PTR(-EPROBE_DEFER);
> +
>         if (!matches)
>                 return NULL;
> 
> diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c index
> ca0361b2dbb0..d08f8cc4131f 100644
> --- a/drivers/crypto/caam/ctrl.c
> +++ b/drivers/crypto/caam/ctrl.c
> @@ -635,6 +635,9 @@ static int caam_probe(struct platform_device *pdev)
>         nprop = pdev->dev.of_node;
> 
>         imx_soc_match = soc_device_match(caam_imx_soc_table);
> +       if (IS_ERR(imx_soc_match))
> +               return PTR_ERR(imx_soc_match);
> +
>         caam_imx = (bool)imx_soc_match;
> 
>         if (imx_soc_match) {
> -------
> 
> But it obviously doesn't cover the ~50 (!) other soc_device_match users in the
> code base which would all need to start handling potential error return code.
> 
> (I also had problems with other drivers when trying to backport the patch to the
> 5.4.70_2.3.0 imx kernel but I just gave up on that one)
> 
> 
> I think having all drivers handle potential EPROBE_DEFER errors is the best way
> forward, how do you propose to move on?
> I can do some but have no way of testing most of these so am a bit reluctant
> to...
> 
> Thanks,
> --
> Dominique
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-03-30  2:42 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-20 10:11 [PATCH v4 1/4] dt-bindings: soc: imx8m: add DT Binding doc for soc unique ID Alice Guo
2020-11-20 10:11 ` Alice Guo
2020-11-20 10:11 ` [PATCH v4 2/4] arm64: dts: imx8m: add SoC ID compatible Alice Guo
2020-11-20 10:11   ` Alice Guo
2020-11-20 10:11 ` [PATCH v4 3/4] arm64: dts: imx8m: add NVMEM provider and consumer to read soc unique ID Alice Guo
2020-11-20 10:11   ` Alice Guo
2020-11-20 10:53   ` Krzysztof Kozlowski
2020-11-20 10:53     ` Krzysztof Kozlowski
2020-11-20 10:11 ` [PATCH v4 4/4] soc: imx8m: change to use platform driver Alice Guo
2020-11-20 10:11   ` Alice Guo
2020-11-20 10:46   ` Krzysztof Kozlowski
2020-11-20 10:46     ` Krzysztof Kozlowski
2021-03-29  9:08   ` regression due to soc_device_match not handling defer (Was: [PATCH v4 4/4] soc: imx8m: change to use platform driver) Dominique MARTINET
2021-03-29  9:08     ` Dominique MARTINET
2021-03-30  2:41     ` Alice Guo (OSS) [this message]
2021-03-30  2:41       ` [EXT] " Alice Guo (OSS)
2021-04-15  1:27       ` Dominique MARTINET
2021-04-15  1:27         ` Dominique MARTINET
2021-04-15  1:33         ` Peng Fan
2021-04-15  1:33           ` Peng Fan
2021-06-24 10:36           ` Lucas Stach
2021-06-24 10:36             ` Lucas Stach
2021-06-29  2:39             ` Peng Fan (OSS)
2021-06-29  2:39               ` Peng Fan (OSS)
2020-11-20 10:50 ` [PATCH v4 1/4] dt-bindings: soc: imx8m: add DT Binding doc for soc unique ID Krzysztof Kozlowski
2020-11-20 10:50   ` Krzysztof Kozlowski
2020-11-23  4:45   ` [EXT] " Alice Guo
2020-11-23  4:45     ` Alice Guo
2020-11-23  9:04     ` Krzysztof Kozlowski
2020-11-23  9:04       ` Krzysztof Kozlowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM6PR04MB60536EF0DEEE6EB64CF29390E27D9@AM6PR04MB6053.eurprd04.prod.outlook.com \
    --to=alice.guo@oss.nxp.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dominique.martinet@atmark-techno.com \
    --cc=krzk@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-imx@nxp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peng.fan@nxp.com \
    --cc=robh+dt@kernel.org \
    --cc=shawnguo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.