linux-kernel.vger.kernel.org archive mirror
* [PATCH] iommu/mediatek: Fix crash on isr after kexec()
@ 2022-11-25 16:28 Ricardo Ribalda
  2022-11-25 17:02 ` Robin Murphy
  2022-11-28  6:44 ` Yong Wu (吴勇)
  0 siblings, 2 replies; 5+ messages in thread
From: Ricardo Ribalda @ 2022-11-25 16:28 UTC (permalink / raw)
  To: Joerg Roedel, Matthias Brugger, Yong Wu, Will Deacon, Robin Murphy
  Cc: iommu, linux-kernel, linux-arm-kernel, Ricardo Ribalda, linux-mediatek

If the system is rebooted via kexec(), the IRQ handler might be triggered
before the domain is initialized, resulting in an invalid memory access
error.

Fix:
[    0.500930] Unable to handle kernel read from unreadable memory at virtual address 0000000000000070
[    0.501166] Call trace:
[    0.501174]  report_iommu_fault+0x28/0xfc
[    0.501180]  mtk_iommu_isr+0x10c/0x1c0

Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
To: Yong Wu <yong.wu@mediatek.com>
To: Joerg Roedel <joro@8bytes.org>
To: Will Deacon <will@kernel.org>
To: Robin Murphy <robin.murphy@arm.com>
To: Matthias Brugger <matthias.bgg@gmail.com>
Cc: iommu@lists.linux.dev
Cc: linux-mediatek@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/iommu/mtk_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 2ab2ecfe01f8..17f6be5a5097 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -454,7 +454,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 		fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
 	}
 
-	if (report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
+	if (dom && report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
 			       write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
 		dev_err_ratelimited(
 			bank->parent_dev,

---
base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
change-id: 20221125-mtk-iommu-13023f971298

Best regards,
-- 
Ricardo Ribalda <ribalda@chromium.org>
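
The guard added by this patch relies on the short-circuit evaluation of `&&`. A minimal user-space sketch of that behavior follows; `fake_domain`, `fake_bank`, `fake_report_fault()` and `fake_isr_v1()` are hypothetical stand-ins invented for illustration, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct fake_domain { int handler_calls; };
struct fake_bank { struct fake_domain *dom; };	/* NULL until a domain is attached */

/* Models a reporting function that dereferences the domain it is given. */
static int fake_report_fault(struct fake_domain *domain)
{
	domain->handler_calls++;
	return -1;	/* pretend that no handler resolved the fault */
}

/* Returns true when the error-logging branch would be taken. */
static bool fake_isr_v1(struct fake_bank *bank)
{
	struct fake_domain *dom = bank->dom;

	/* && short-circuits: fake_report_fault() is never called when dom is NULL. */
	if (dom && fake_report_fault(dom))
		return true;
	return false;
}
```

With this form of the check, a fault that arrives before a domain exists is neither dereferenced nor logged.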


* Re: [PATCH] iommu/mediatek: Fix crash on isr after kexec()
  2022-11-25 16:28 [PATCH] iommu/mediatek: Fix crash on isr after kexec() Ricardo Ribalda
@ 2022-11-25 17:02 ` Robin Murphy
  2022-11-25 17:15   ` Ricardo Ribalda
  2022-11-28  6:44 ` Yong Wu (吴勇)
  1 sibling, 1 reply; 5+ messages in thread
From: Robin Murphy @ 2022-11-25 17:02 UTC (permalink / raw)
  To: Ricardo Ribalda, Joerg Roedel, Matthias Brugger, Yong Wu, Will Deacon
  Cc: iommu, linux-kernel, linux-arm-kernel, linux-mediatek

On 2022-11-25 16:28, Ricardo Ribalda wrote:
> If the system is rebooted via kexec(), the IRQ handler might be triggered
> before the domain is initialized, resulting in an invalid memory access
> error.
> 
> Fix:
> [    0.500930] Unable to handle kernel read from unreadable memory at virtual address 0000000000000070
> [    0.501166] Call trace:
> [    0.501174]  report_iommu_fault+0x28/0xfc
> [    0.501180]  mtk_iommu_isr+0x10c/0x1c0

Hmm, shouldn't we clear any pending faults at probe in 
mtk_iommu_hw_init(), before the IRQ is requested? mtk_iommu_isr() might 
still want to be robust against a spurious interrupt, but then it can 
simply return without doing anything at all if the domain is NULL, since 
we'll know that's the case.

Thanks,
Robin.

(It might be nice if request_irq() had a flag to say "if this IRQ looks 
pending already just clear it" for drivers that know it could only be 
spurious at that point; kexec seems to lead to this problem quite a lot...)
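
The two steps suggested above (drop latched faults at probe, then bail out of the handler early when there is no domain) can be sketched in user-space C as below. Everything here is a hypothetical model: `fake_bank`, `fake_hw_init()` and `fake_isr()` are invented names, and acknowledging the interrupt before the early return is an assumption of this sketch, not something stated in the thread.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical model of one IOMMU bank; not the driver's real structure. */
struct fake_bank {
	bool int_pending;	/* models a latched fault interrupt */
	void *dom;		/* NULL until a domain is attached */
	int faults_reported;
};

enum fake_irqreturn { FAKE_IRQ_NONE, FAKE_IRQ_HANDLED };

/* Probe-time step: drop anything left latched by the previous kernel. */
static void fake_hw_init(struct fake_bank *bank)
{
	bank->int_pending = false;
}

static enum fake_irqreturn fake_isr(struct fake_bank *bank)
{
	if (!bank->int_pending)
		return FAKE_IRQ_NONE;

	/* Assumption of this sketch: acknowledge first so the line is released. */
	bank->int_pending = false;

	/* No domain yet: treat the interrupt as spurious and report nothing. */
	if (!bank->dom)
		return FAKE_IRQ_HANDLED;

	bank->faults_reported++;
	return FAKE_IRQ_HANDLED;
}
```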

> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> ---
> To: Yong Wu <yong.wu@mediatek.com>
> To: Joerg Roedel <joro@8bytes.org>
> To: Will Deacon <will@kernel.org>
> To: Robin Murphy <robin.murphy@arm.com>
> To: Matthias Brugger <matthias.bgg@gmail.com>
> Cc: iommu@lists.linux.dev
> Cc: linux-mediatek@lists.infradead.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> ---
>   drivers/iommu/mtk_iommu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 2ab2ecfe01f8..17f6be5a5097 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -454,7 +454,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
>   		fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
>   	}
>   
> -	if (report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
> +	if (dom && report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
>   			       write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
>   		dev_err_ratelimited(
>   			bank->parent_dev,
> 
> ---
> base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> change-id: 20221125-mtk-iommu-13023f971298
> 
> Best regards,


* Re: [PATCH] iommu/mediatek: Fix crash on isr after kexec()
  2022-11-25 17:02 ` Robin Murphy
@ 2022-11-25 17:15   ` Ricardo Ribalda
  0 siblings, 0 replies; 5+ messages in thread
From: Ricardo Ribalda @ 2022-11-25 17:15 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Joerg Roedel, Matthias Brugger, Yong Wu, Will Deacon, iommu,
	linux-kernel, linux-arm-kernel, linux-mediatek

Hi Robin


Thanks for your review!

On Fri, 25 Nov 2022 at 18:02, Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2022-11-25 16:28, Ricardo Ribalda wrote:
> > If the system is rebooted via kexec(), the IRQ handler might be triggered
> > before the domain is initialized, resulting in an invalid memory access
> > error.
> >
> > Fix:
> > [    0.500930] Unable to handle kernel read from unreadable memory at virtual address 0000000000000070
> > [    0.501166] Call trace:
> > [    0.501174]  report_iommu_fault+0x28/0xfc
> > [    0.501180]  mtk_iommu_isr+0x10c/0x1c0
>
> Hmm, shouldn't we clear any pending faults at probe in
> mtk_iommu_hw_init(), before the IRQ is requested? mtk_iommu_isr() might
> still want to be robust against a spurious interrupt, but then it can
> simply return without doing anything at all if the domain is NULL, since
> we'll know that's the case.
>
> Thanks,
> Robin.
>
> (It might be nice if request_irq() had a flag to say "if this IRQ looks
> pending already just clear it" for drivers that know it could only be
> spurious at that point; kexec seems to lead to this problem quite a lot...)

It is not only about the "last" IRQ before kexec. The peripherals
behind the IOMMU might still be active and producing faults, and
therefore IRQs.

I tried this:

@@ -886,6 +886,11 @@ static int mtk_iommu_hw_init(const struct
mtk_iommu_data *data, unsigned int ban
                         upper_32_bits(data->protect_base);
        writel_relaxed(regval, bankx->base + REG_MMU_IVRP_PADDR);

+       /* Clear previous IRQs */
+       regval = readl_relaxed(bankx->base + REG_MMU_INT_CONTROL0);
+       regval |= F_INT_CLR_BIT;
+       writel_relaxed(regval, bankx->base + REG_MMU_INT_CONTROL0);
+
        if (devm_request_irq(bankx->pdev, bankx->irq, mtk_iommu_isr, 0,
                             dev_name(bankx->pdev), (void *)bankx)) {
                writel_relaxed(0, bankx->base + REG_MMU_PT_BASE_ADDR);

And I still get the same crash.
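
Why a one-shot clear may not be enough can be illustrated with a toy model: a device left running by the previous kernel can latch a new fault after the clear but before any domain is attached. The names below (`fake_iommu`, `fake_clear_pending()`, `fake_device_fault()`, `fake_isr_would_see_null_domain()`) are hypothetical and only model the ordering, not the hardware.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant state; not the driver's real structure. */
struct fake_iommu {
	bool int_pending;	/* a fault interrupt is latched */
	bool domain_ready;	/* set once a domain has been attached */
};

/* Models the one-shot "clear previous IRQs" write done at probe time. */
static void fake_clear_pending(struct fake_iommu *mmu)
{
	mmu->int_pending = false;
}

/* Models a still-active device hitting a translation fault. */
static void fake_device_fault(struct fake_iommu *mmu)
{
	mmu->int_pending = true;
}

/* True when the handler would run while no domain exists yet. */
static bool fake_isr_would_see_null_domain(const struct fake_iommu *mmu)
{
	return mmu->int_pending && !mmu->domain_ready;
}
```

In this model the handler still needs its own NULL check, because the window between the clear and the domain attach stays open for as long as devices keep issuing transactions.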


>
> > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > ---
> > To: Yong Wu <yong.wu@mediatek.com>
> > To: Joerg Roedel <joro@8bytes.org>
> > To: Will Deacon <will@kernel.org>
> > To: Robin Murphy <robin.murphy@arm.com>
> > To: Matthias Brugger <matthias.bgg@gmail.com>
> > Cc: iommu@lists.linux.dev
> > Cc: linux-mediatek@lists.infradead.org
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >   drivers/iommu/mtk_iommu.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 2ab2ecfe01f8..17f6be5a5097 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -454,7 +454,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
> >               fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
> >       }
> >
> > -     if (report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
> > +     if (dom && report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
> >                              write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
> >               dev_err_ratelimited(
> >                       bank->parent_dev,
> >
> > ---
> > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> > change-id: 20221125-mtk-iommu-13023f971298
> >
> > Best regards,



-- 
Ricardo Ribalda


* Re: [PATCH] iommu/mediatek: Fix crash on isr after kexec()
  2022-11-25 16:28 [PATCH] iommu/mediatek: Fix crash on isr after kexec() Ricardo Ribalda
  2022-11-25 17:02 ` Robin Murphy
@ 2022-11-28  6:44 ` Yong Wu (吴勇)
  2022-11-28 22:14   ` Ricardo Ribalda
  1 sibling, 1 reply; 5+ messages in thread
From: Yong Wu (吴勇) @ 2022-11-28  6:44 UTC (permalink / raw)
  To: robin.murphy, joro, ribalda, matthias.bgg, will
  Cc: linux-arm-kernel, linux-kernel, linux-mediatek, iommu

On Fri, 2022-11-25 at 17:28 +0100, Ricardo Ribalda wrote:
> If the system is rebooted via kexec(), the IRQ handler might be triggered
> before the domain is initialized, resulting in an invalid memory access
> error.
> 
> Fix:
> [    0.500930] Unable to handle kernel read from unreadable memory at virtual address 0000000000000070
> [    0.501166] Call trace:
> [    0.501174]  report_iommu_fault+0x28/0xfc
> [    0.501180]  mtk_iommu_isr+0x10c/0x1c0
> 
> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> ---
> To: Yong Wu <yong.wu@mediatek.com>
> To: Joerg Roedel <joro@8bytes.org>
> To: Will Deacon <will@kernel.org>
> To: Robin Murphy <robin.murphy@arm.com>
> To: Matthias Brugger <matthias.bgg@gmail.com>
> Cc: iommu@lists.linux.dev
> Cc: linux-mediatek@lists.infradead.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/iommu/mtk_iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 2ab2ecfe01f8..17f6be5a5097 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -454,7 +454,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
>  		fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
>  	}
>  
> -	if (report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
> +	if (dom && report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,


On which SoC does this issue happen? Does it happen with the upstream
kernel or with a downstream kernel?

Normally each port enables the IOMMU by default. Let's print the error
log even when "dom" is NULL, to check which port fails here, and then
analyse that port's behavior.

if (!dom || report_iommu_fault(xx))
     dev_err_ratelimited(xx)

>  			       write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
>  		dev_err_ratelimited(
>  			bank->parent_dev,
> 
> ---
> base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> change-id: 20221125-mtk-iommu-13023f971298
> 
> Best regards,
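
The condition suggested above, `!dom || report_iommu_fault(xx)`, differs from the original patch in that a NULL domain now reaches the logging branch instead of skipping it. A small user-space sketch of that truth table follows; `fake_domain`, `fake_report_fault()` and `fake_should_log_v2()` are hypothetical names, and the stub simply returns whatever result it is configured with.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain with a configurable handler result. */
struct fake_domain {
	int handler_result;	/* 0 means "fault handled", nonzero means "not handled" */
	int handler_calls;
};

/* Models a reporting function that dereferences the domain it is given. */
static int fake_report_fault(struct fake_domain *domain)
{
	domain->handler_calls++;
	return domain->handler_result;
}

/* Log when there is no domain at all, or when the fault was not handled. */
static bool fake_should_log_v2(struct fake_domain *dom)
{
	/* || short-circuits: fake_report_fault() is skipped when dom is NULL. */
	return !dom || fake_report_fault(dom) != 0;
}
```

The NULL case stays safe because `||` stops evaluating as soon as `!dom` is true, while the log still records the faulting port for later analysis.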


* Re: [PATCH] iommu/mediatek: Fix crash on isr after kexec()
  2022-11-28  6:44 ` Yong Wu (吴勇)
@ 2022-11-28 22:14   ` Ricardo Ribalda
  0 siblings, 0 replies; 5+ messages in thread
From: Ricardo Ribalda @ 2022-11-28 22:14 UTC (permalink / raw)
  To: Yong Wu (吴勇)
  Cc: robin.murphy, joro, matthias.bgg, will, linux-arm-kernel,
	linux-kernel, linux-mediatek, iommu

Hi Yong


On Mon, 28 Nov 2022 at 07:44, Yong Wu (吴勇) <Yong.Wu@mediatek.com> wrote:
>
> On Fri, 2022-11-25 at 17:28 +0100, Ricardo Ribalda wrote:
> > If the system is rebooted via kexec(), the IRQ handler might be triggered
> > before the domain is initialized, resulting in an invalid memory access
> > error.
> >
> > Fix:
> > [    0.500930] Unable to handle kernel read from unreadable memory at virtual address 0000000000000070
> > [    0.501166] Call trace:
> > [    0.501174]  report_iommu_fault+0x28/0xfc
> > [    0.501180]  mtk_iommu_isr+0x10c/0x1c0
> >
> > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > ---
> > To: Yong Wu <yong.wu@mediatek.com>
> > To: Joerg Roedel <joro@8bytes.org>
> > To: Will Deacon <will@kernel.org>
> > To: Robin Murphy <robin.murphy@arm.com>
> > To: Matthias Brugger <matthias.bgg@gmail.com>
> > Cc: iommu@lists.linux.dev
> > Cc: linux-mediatek@lists.infradead.org
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  drivers/iommu/mtk_iommu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 2ab2ecfe01f8..17f6be5a5097 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -454,7 +454,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
> >               fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
> >       }
> >
> > -     if (report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
> > +     if (dom && report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
>
>
> On which SoC does this issue happen? Does it happen with the upstream
> kernel or with a downstream kernel?

I am using chromeos-5.10 and chromeos-5.15 (which are pretty much upstream).

I have seen this issue on at least MT8195 and MT8183.


>
> Normally each port enables the IOMMU by default. Let's print the error
> log even when "dom" is NULL, to check which port fails here, and then
> analyse that port's behavior.
>
> if (!dom || report_iommu_fault(xx))
>      dev_err_ratelimited(xx)

I am sending a v2 with the change.

Thanks!


>
> >                              write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
> >               dev_err_ratelimited(
> >                       bank->parent_dev,
> >
> > ---
> > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> > change-id: 20221125-mtk-iommu-13023f971298
> >
> > Best regards,



-- 
Ricardo Ribalda
