arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel

* arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
@ 2020-03-23 15:21 Prabhakar Kushwaha
  2020-03-23 16:58 ` Robin Murphy
  0 siblings, 1 reply; 8+ messages in thread
From: Prabhakar Kushwaha @ 2020-03-23 15:21 UTC (permalink / raw)
  To: linux-arm-kernel, kexec mailing list
  Cc: Bjorn Helgaas, Bhupesh Sharma, will.deacon,
	Ganapatrao Prabhakerrao Kulkarni, Will Deacon

Hi All,

I am facing an issue with the kdump kernel on Marvell's ARM64 ThunderX2.
The network card continuously reports the following AER error:
[  100.839168] igb 0000:09:00.1: AER: aer_status: 0x00004000, aer_mask: 0x00000000
[  100.846463] igb 0000:09:00.1: AER:    [14] CmpltTO                (First)
[  100.861491] igb 0000:09:00.1: AER: aer_layer=Transaction Layer, aer_agent=Requester ID
[  100.869400] igb 0000:09:00.1: AER: aer_uncor_severity: 0x00062011
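
For reference, the register values in this log can be decoded by hand. The
following is a minimal sketch in plain C, not kernel code; the helper names
are made up for illustration. It assumes the standard PCIe AER layout, in
which bit 14 of the Uncorrectable Error Status register is Completion
Timeout (0x00004000, PCI_ERR_UNC_COMP_TIME in the Linux headers) and a set
bit in the Uncorrectable Error Severity register marks that error as fatal.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position of Completion Timeout ("CmpltTO") in the AER
 * Uncorrectable Error Status/Severity registers. */
#define AER_UNC_COMP_TIME_BIT 14

/* Return nonzero if the given bit is set in a 32-bit AER register value. */
static int aer_bit_set(uint32_t reg, unsigned int bit)
{
	assert(bit < 32);
	return (reg >> bit) & 1u;
}

/* Nonzero if aer_status flags a Completion Timeout. */
static int aer_is_completion_timeout(uint32_t status)
{
	return aer_bit_set(status, AER_UNC_COMP_TIME_BIT);
}

/* Nonzero if a Completion Timeout is flagged and the severity register
 * classifies that error as fatal (severity bit set = fatal). */
static int aer_completion_timeout_is_fatal(uint32_t status, uint32_t severity)
{
	return aer_is_completion_timeout(status) &&
	       aer_bit_set(severity, AER_UNC_COMP_TIME_BIT);
}
```

With the values from the log, aer_is_completion_timeout(0x00004000) is
nonzero and aer_completion_timeout_is_fatal(0x00004000, 0x00062011) is
zero, so under these assumptions the timeout is a non-fatal uncorrectable
error.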

The error is not 100% reproducible; it occurs in roughly 1 out of 4 attempts.

It goes away in either of the following two scenarios:
A) Put the IOMMU in bypass mode via the boot argument iommu.passthrough=1
B) Wait ~100 ms in arm_smmu_device_reset() in drivers/iommu/arm-smmu-v3.c:
        if (reg & CR0_SMMUEN) {
                dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
                WARN_ON(is_kdump_kernel() && !disable_bypass);
                mdelay(100);  /* added delay */
                arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
        }

From A), it is clear that the problem is related to the IOMMU.
From B), it looks like the network card is still active while the kdump
kernel boots and has sent some requests over PCIe. Because the GBPA_ABORT
bit is set, no response/completion comes back to the PCIe controller,
hence the "CmpltTO" error.

Ideally, there should be a check for active transactions before the
GBPA_ABORT bit is set. If that is not possible, there should be a wait to
ensure that no pending transactions are left.

Why has no such delay been considered?

--pk

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
  2020-03-23 15:21 arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel Prabhakar Kushwaha
@ 2020-03-23 16:58 ` Robin Murphy
  2020-03-26 13:36   ` Prabhakar Kushwaha
  0 siblings, 1 reply; 8+ messages in thread
From: Robin Murphy @ 2020-03-23 16:58 UTC (permalink / raw)
  To: Prabhakar Kushwaha, linux-arm-kernel, kexec mailing list
  Cc: will.deacon, Bhupesh Sharma, Bjorn Helgaas,
	Ganapatrao Prabhakerrao Kulkarni, Will Deacon

On 2020-03-23 3:21 pm, Prabhakar Kushwaha wrote:
> Ideally before setting GBPA_ABORT bit, there should be some check for
> active transaction. if it is not possible, a wait should be done to
> assure that no more pending transaction left.

In general there is no way to check for active transactions, and even if 
there were, waiting for them to finish could mean waiting forever (if, 
say, a device is continuously streaming to/from a ring buffer).

> why any such delay has not been considered?

The main aim here is to block any DMA left over from the crashed kernel 
as quickly as possible, to minimise any further potential corruption of 
memory (consider if a device was left writing to an IOMMU virtual 
address that happened to have the same value as some physical address in 
the crash kernel's reserved memory). The fact that an arbitrary delay 
happens to give a 'nicer' result in one particular situation on one 
particular platform is neither here nor there in general.
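
The aliasing hazard described above can be illustrated with a small sketch.
This is plain C with hypothetical names (struct mem_range,
stale_dma_hits_crash_kernel) and made-up addresses, not SMMU driver code. It
assumes that with translation disabled and bypass permitted, the address
emitted by the device is used unchanged as a physical address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the crash kernel's reserved physical range. */
struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Nonzero if addr falls inside [base, base + size). */
static int range_contains(const struct mem_range *r, uint64_t addr)
{
	assert(r->size <= UINT64_MAX - r->base); /* range must not wrap */
	return addr >= r->base && addr - r->base < r->size;
}

/*
 * A device left running by the crashed kernel keeps writing to an I/O
 * virtual address. If its traffic is allowed to bypass translation, that
 * value is treated as a physical address. Returns nonzero if such a stale
 * write would land in the crash kernel's reserved memory.
 */
static int stale_dma_hits_crash_kernel(const struct mem_range *crashk,
				       uint64_t stale_iova)
{
	return range_contains(crashk, stale_iova);
}
```

For example, with a reserved range of 512 MiB at 0x80000000, a stale write
to 0x80001000 would hit the crash kernel's memory, which is the case that
aborting incoming transactions as early as possible is meant to prevent.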

Besides, this is *crash* kernel, so yeah, expect errors - something's 
already gone badly wrong to get us here, and everything from then on is 
merely a best-effort attempt to salvage what we can. Does it even make 
sense to have AER enabled at this point?

Robin.


* Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
  2020-03-23 16:58 ` Robin Murphy
@ 2020-03-26 13:36   ` Prabhakar Kushwaha
  2020-03-26 14:19     ` Robin Murphy
  0 siblings, 1 reply; 8+ messages in thread
From: Prabhakar Kushwaha @ 2020-03-26 13:36 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Ganapatrao Prabhakerrao Kulkarni, kexec mailing list,
	Bhupesh Sharma, will.deacon, Bjorn Helgaas, Will Deacon,
	linux-arm-kernel

On Mon, Mar 23, 2020 at 10:28 PM Robin Murphy <robin.murphy@arm.com> wrote:
> The main aim here is to block any DMA left over from the crashed kernel
> as quickly as possible, to minimise any further potential corruption of
> memory (consider if a device was left writing to an IOMMU virtual
> address that happened to have the same value as some physical address in
> the crash kernel's reserved memory). The fact that an arbitrary delay
> happens to give a 'nicer' result in one particular situation on one
> particular platform is neither here nor there in general.
>

I agree.
But we are depending on the kdump boot time and expecting devices to
reach an idle state before the GBPA_ABORT bit is set.
Adding a delay would be fair and would make this independent of the kdump
boot time.

> Besides, this is *crash* kernel, so yeah, expect errors - something's
> already gone badly wrong to get us here, and everything from then on is
> merely a best-effort attempt to salvage what we can. Does it even make
> sense to have AER enabled at this point?
>

I tried disabling AER in the kdump kernel, but it did not help: the
network device gets out of sync with respect to its tx unit, which causes
it to hang, and it never recovers from there. The same can happen with
other devices such as SATA.

--pk


* Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
  2020-03-26 13:36   ` Prabhakar Kushwaha
@ 2020-03-26 14:19     ` Robin Murphy
  2020-03-26 15:35       ` Prabhakar Kushwaha
  0 siblings, 1 reply; 8+ messages in thread
From: Robin Murphy @ 2020-03-26 14:19 UTC (permalink / raw)
  To: Prabhakar Kushwaha
  Cc: Ganapatrao Prabhakerrao Kulkarni, kexec mailing list,
	Bhupesh Sharma, will.deacon, Bjorn Helgaas, Will Deacon,
	linux-arm-kernel

On 2020-03-26 1:36 pm, Prabhakar Kushwaha wrote:
> I agree.
> But we are depending upon kdump boot time and expecting devices to
> reach to idle state before setting GBPA_ABORT bit.

So (ideally) stop depending on that, because like I said it's fragile 
and doesn't generalise.

> adding a delay will be fair and make it independent of kdump boot time.

And what delay value is "fair" and appropriate for any device on any 
system in any circumstance?

>> Besides, this is *crash* kernel, so yeah, expect errors - something's
>> already gone badly wrong to get us here, and everything from then on is
>> merely a best-effort attempt to salvage what we can. Does it even make
>> sense to have AER enabled at this point?
>>
> 
> i tried by disabling AER in kdump kernel. but it did not helped as
> network device become out of sync with respect to tx unit causing it
> to be hanged and it never recovered from there.  Same can happen with
> other devices like SATA etc

Any devices that the kdump kernel wants to use need to be fully reset to 
get them into a sane state anyway, don't they? I mean, what if the crash 
was *caused* by one of those devices going wrong in the first place?
Any devices that kdump *doesn't* care about shouldn't matter, since 
nothing should be unmasking their interrupts regardless of what state 
they're in.

Assume some descriptor or pagetable entry got corrupted that caused your 
network device to access an invalid physical address downstream of the 
SMMU and get an abort from that *before* the kdump kernel starts - is 
waiting an extra 100ms at any point after that going to help?

Robin.


* Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
  2020-03-26 14:19     ` Robin Murphy
@ 2020-03-26 15:35       ` Prabhakar Kushwaha
  2020-03-26 16:13         ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Prabhakar Kushwaha @ 2020-03-26 15:35 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Ganapatrao Prabhakerrao Kulkarni, kexec mailing list,
	Bhupesh Sharma, will.deacon, Bjorn Helgaas, Will Deacon,
	linux-arm-kernel

On Thu, Mar 26, 2020 at 7:49 PM Robin Murphy <robin.murphy@arm.com> wrote:
> > adding a delay will be fair and make it independent of kdump boot time.
>
> And what delay value is "fair" and appropriate for any device on any
> system in any circumstance?
>

It is a tough question. 1 second could be considered.

> Any devices that the kdump kernel wants to use need to be fully reset to
> get them into a sane state anyway, don't they? I mean, what if the crash
> was *caused* by one of those devices going wrong in the first place?
> Any devices that kdump *doesn't* care about shouldn't matter, since
> nothing should be unmasking their interrupts regardless of what state
> they're in.
>
> Assume some descriptor or pagetable entry got corrupted that caused your
> network device to access an invalid physical address downstream of the
> SMMU and get an abort from that *before* the kdump kernel starts - is
> waiting an extra 100ms at any point after that going to help?
>
I agree with you. In the above scenario, where the device is faulty or has
done something wrong, waiting even for hours is a waste.

I was going through other IOMMU drivers, as this problem is a generic
one, and I found the following patch:

commit 091d42e43d21b6ca7ec39bf5f9e17bc0bd8d4312
Author: Joerg Roedel <jroedel@suse.de>
Date:   Fri Jun 12 11:56:10 2015 +0200
    iommu/vt-d: Copy translation tables from old kernel

    If we are in a kdump kernel and find translation enabled in
    the iommu, try to copy the translation tables from the old
    kernel to preserve the mappings until the device driver
    takes over.

    This supports old and the extended root-entry and
    context-table formats.

    Tested-by: ZhenHua Li <zhen-hual@hp.com>
    Tested-by: Baoquan He <bhe@redhat.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>

I believe a similar kind of solution is required for the SMMU as well.

--pk


* Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
  2020-03-26 15:35       ` Prabhakar Kushwaha
@ 2020-03-26 16:13         ` Marc Zyngier
  2020-03-26 16:55           ` Prabhakar Kushwaha
  0 siblings, 1 reply; 8+ messages in thread
From: Marc Zyngier @ 2020-03-26 16:13 UTC (permalink / raw)
  To: Prabhakar Kushwaha
  Cc: Ganapatrao Prabhakerrao Kulkarni, Will Deacon, will.deacon,
	Bhupesh Sharma, kexec mailing list, Bjorn Helgaas, Robin Murphy,
	linux-arm-kernel

On 2020-03-26 15:35, Prabhakar Kushwaha wrote:
> I was just going through other iommu drivers as this problem is
> generic one and i found following patch
> 
> commit 091d42e43d21b6ca7ec39bf5f9e17bc0bd8d4312
> Author: Joerg Roedel <jroedel@suse.de>
> Date:   Fri Jun 12 11:56:10 2015 +0200
>     iommu/vt-d: Copy translation tables from old kernel
> 
>     If we are in a kdump kernel and find translation enabled in
>     the iommu, try to copy the translation tables from the old
>     kernel to preserve the mappings until the device driver
>     takes over.
> 
>     This supports old and the extended root-entry and
>     context-table formats.
> 
>     Tested-by: ZhenHua Li <zhen-hual@hp.com>
>     Tested-by: Baoquan He <bhe@redhat.com>
>     Signed-off-by: Joerg Roedel <jroedel@suse.de>
> 
> I believe, similar kind of solution is required for SMMU also.

There's a much more general problem: how to preserve pre-boot DMA
configurations because they are important to the new kernel (for
whatever reason).

And in a number of cases, it makes perfect sense: framebuffer
scanning out boot animations, ongoing DMA for other *cough* agents
*cough* in the system...

But I really don't like the idea of preserving the page tables across
a kdump kernel because for all we know, these page tables could be
horribly corrupted and mostly only make sense in the context of
the driver that created them. Oh wait, this driver has long died,
along with the rest of the original kernel.

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
  2020-03-26 16:13         ` Marc Zyngier
@ 2020-03-26 16:55           ` Prabhakar Kushwaha
  2020-03-26 17:19             ` Marc Zyngier
  0 siblings, 1 reply; 8+ messages in thread
From: Prabhakar Kushwaha @ 2020-03-26 16:55 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Ganapatrao Prabhakerrao Kulkarni, Will Deacon, will.deacon,
	Bhupesh Sharma, kexec mailing list, Bjorn Helgaas, Robin Murphy,
	linux-arm-kernel

On Thu, Mar 26, 2020 at 9:43 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On 2020-03-26 15:35, Prabhakar Kushwaha wrote:
> > On Thu, Mar 26, 2020 at 7:49 PM Robin Murphy <robin.murphy@arm.com>
> > wrote:
> >>
> >> On 2020-03-26 1:36 pm, Prabhakar Kushwaha wrote:
> >> > On Mon, Mar 23, 2020 at 10:28 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >> >>
> >> >> On 2020-03-23 3:21 pm, Prabhakar Kushwaha wrote:
> >> >>> Hi All,
> >> >>>
> >> >>> I am facing issue on Marvell's ARM64 Thunder X2 with kdump kernel.
> >> >>> Here network card is continuously giving following AER error
> >> >>> [  100.839168] igb 0000:09:00.1: AER: aer_status: 0x00004000,
> >> >>> aer_mask: 0x00000000
> >> >>> [  100.846463] igb 0000:09:00.1: AER:    [14] CmpltTO                (First)
> >> >>> [  100.861491] igb 0000:09:00.1: AER: aer_layer=Transaction Layer,
> >> >>> aer_agent=Requester ID
> >> >>> [  100.869400] igb 0000:09:00.1: AER: aer_uncor_severity: 0x00062011
> >> >>>
> >> >>> This error is not 100% reproducible. It happens 1 out of 4 try.
> >> >>>
> >> >>> This error goes away in following two scenarios
> >> >>> A) Set iommu in bypass mode via bootargs iommu.passthrough=1
> >> >>> B) Wait for ~100ms in arm_smmu_device_reset of  drivers/iommu/arm-smmu-v3.c
> >> >>>           if (reg & CR0_SMMUEN) {
> >> >>>                   dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> >> >>>                   WARN_ON(is_kdump_kernel() && !disable_bypass);
> >> >>>                   mdelay(100);  <-- Added delay
> >> >>>                   arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> >> >>>           }
> >> >>>
> >> >>>   From A), it is clear that it is related to IOMMU
> >> >>>   From B), looks like during boot of kdump kernel, network card is still
> >> >>> active and it has sent some request over PCIe.
> >> >>> as GPBA_ABORT bit is set, no response/completion coming to PCIe
> >> >>> controller hence "CmpltTO" error.
> >> >>>
> >> >>> Ideally before setting GPBA_ABORT bit, there should be some check for
> >> >>> active transaction. if it is not possible, a wait should be done to
> >> >>> assure that no more pending transaction left.
> >> >>
> >> >> In general there is no way to check for active transactions, and even if
> >> >> there were, waiting for them to finish could mean waiting forever (if,
> >> >> say, a device is continuously streaming to/from a ring buffer).
> >> >>
> >> >>> why any such delay has not been considered?
> >> >>
> >> >> The main aim here is to block any DMA left over from the crashed kernel
> >> >> as quickly as possible, to minimise any further potential corruption of
> >> >> memory (consider if a device was left writing to an IOMMU virtual
> >> >> address that happened to have the same value as some physical address in
> >> >> the crash kernel's reserved memory). The fact that an arbitrary delay
> >> >> happens to give a 'nicer' result in one particular situation on one
> >> >> particular platform is neither here nor there in general.
> >> >>
> >> >
> >> > I agree.
> >> > But we are depending upon kdump boot time and expecting devices to
> >> > reach to idle state before setting GBPA_ABORT bit.
> >>
> >> So (ideally) stop depending on that, because like I said it's fragile
> >> and doesn't generalise.
> >>
> >> > adding a delay will be fair and make it independent of kdump boot time.
> >>
> >> And what delay value is "fair" and appropriate for any device on any
> >> system in any circumstance?
> >>
> >
> >  it is tough question.  1sec can be thought of.
> >
> >> >> Besides, this is *crash* kernel, so yeah, expect errors - something's
> >> >> already gone badly wrong to get us here, and everything from then on is
> >> >> merely a best-effort attempt to salvage what we can. Does it even make
> >> >> sense to have AER enabled at this point?
> >> >>
> >> >
> >> > i tried by disabling AER in kdump kernel. but it did not helped as
> >> > network device become out of sync with respect to tx unit causing it
> >> > to be hanged and it never recovered from there.  Same can happen with
> >> > other devices like SATA etc
> >>
> >> Any devices that the kdump kernel wants to use need to be fully reset
> >> to get them into a sane state anyway, don't they? I mean, what if the
> >> crash was *caused* by one of those devices going wrong in the first
> >> place? Any devices that kdump *doesn't* care about shouldn't matter,
> >> since nothing should be unmasking their interrupts regardless of what
> >> state they're in.
> >>
> >> Assume some descriptor or pagetable entry got corrupted that caused
> >> your network device to access an invalid physical address downstream
> >> of the SMMU and get an abort from that *before* the kdump kernel
> >> starts - is waiting an extra 100ms at any point after that going to
> >> help?
> >>
> > I agree with you. In the above scenario, where the device is faulty or
> > has done something wrong, waiting even for hours is a waste.
> >
> > I was going through other IOMMU drivers, as this problem is a generic
> > one, and I found the following patch:
> >
> > commit 091d42e43d21b6ca7ec39bf5f9e17bc0bd8d4312
> > Author: Joerg Roedel <jroedel@suse.de>
> > Date:   Fri Jun 12 11:56:10 2015 +0200
> >     iommu/vt-d: Copy translation tables from old kernel
> >
> >     If we are in a kdump kernel and find translation enabled in
> >     the iommu, try to copy the translation tables from the old
> >     kernel to preserve the mappings until the device driver
> >     takes over.
> >
> >     This supports old and the extended root-entry and
> >     context-table formats.
> >
> >     Tested-by: ZhenHua Li <zhen-hual@hp.com>
> >     Tested-by: Baoquan He <bhe@redhat.com>
> >     Signed-off-by: Joerg Roedel <jroedel@suse.de>
> >
> > I believe a similar kind of solution is required for the SMMU as well.
>
> There's a much more general problem: how to preserve pre-boot DMA
> configurations because they are important to the new kernel (for
> whatever reason).
>
> And in a number of cases, it makes perfect sense: framebuffer
> scanning out boot animations, ongoing DMA for other *cough* agents
> *cough* in the system...
>
> But I really don't like the idea of preserving the page tables across
> a kdump kernel because for all we know, these page tables could be
> horribly corrupted and mostly only make sense in the context of
> the driver that created them.

If I am correct, a similar approach is used in the GIC-ITS for LPI tables.
The probability of corruption is still there.

> Oh wait, this driver has long died,
> along with the rest of the original kernel.
>

:(
If this is the case, then the chance of a foolproof and generic solution
is very small.

At the least, a delay should be considered before setting the GBPA_ABORT
bit, to give in-flight DMA a chance to complete.

--pk

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
  2020-03-26 16:55           ` Prabhakar Kushwaha
@ 2020-03-26 17:19             ` Marc Zyngier
  0 siblings, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2020-03-26 17:19 UTC (permalink / raw)
  To: Prabhakar Kushwaha
  Cc: Ganapatrao Prabhakerrao Kulkarni, Will Deacon, will.deacon,
	Bhupesh Sharma, kexec mailing list, Bjorn Helgaas, Robin Murphy,
	linux-arm-kernel

On Thu, 26 Mar 2020 22:25:38 +0530
Prabhakar Kushwaha <prabhakar.pkin@gmail.com> wrote:

> On Thu, Mar 26, 2020 at 9:43 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On 2020-03-26 15:35, Prabhakar Kushwaha wrote:  
> > > On Thu, Mar 26, 2020 at 7:49 PM Robin Murphy <robin.murphy@arm.com>
> > > wrote:  
> > >>
> > >> On 2020-03-26 1:36 pm, Prabhakar Kushwaha wrote:  
> > >> > On Mon, Mar 23, 2020 at 10:28 PM Robin Murphy <robin.murphy@arm.com> wrote:  
> > >> >>
> > >> >> On 2020-03-23 3:21 pm, Prabhakar Kushwaha wrote:  
> > >> >>> Hi All,
> > >> >>>
> > >> >>> I am facing an issue on Marvell's ARM64 Thunder X2 with the kdump kernel.
> > >> >>> Here the network card continuously reports the following AER error:
> > >> >>> [  100.839168] igb 0000:09:00.1: AER: aer_status: 0x00004000,
> > >> >>> aer_mask: 0x00000000
> > >> >>> [  100.846463] igb 0000:09:00.1: AER:    [14] CmpltTO                (First)
> > >> >>> [  100.861491] igb 0000:09:00.1: AER: aer_layer=Transaction Layer,
> > >> >>> aer_agent=Requester ID
> > >> >>> [  100.869400] igb 0000:09:00.1: AER: aer_uncor_severity: 0x00062011
> > >> >>>
> > >> >>> This error is not 100% reproducible. It happens in about 1 out of 4 tries.
> > >> >>>
> > >> >>> This error goes away in the following two scenarios:
> > >> >>> A) Set iommu in bypass mode via bootargs iommu.passthrough=1
> > >> >>> B) Wait for ~100ms in arm_smmu_device_reset of  drivers/iommu/arm-smmu-v3.c
> > >> >>>           if (reg & CR0_SMMUEN) {
> > >> >>>                   dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> > >> >>>                   WARN_ON(is_kdump_kernel() && !disable_bypass);
> > >> >>>                   mdelay(100);  <-- Added delay
> > >> >>>                   arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> > >> >>>           }
> > >> >>>
> > >> >>> From A), it is clear that the problem is related to the IOMMU.
> > >> >>> From B), it looks like the network card is still active during boot
> > >> >>> of the kdump kernel and has sent some requests over PCIe. As the
> > >> >>> GBPA_ABORT bit is set, no response/completion comes back to the PCIe
> > >> >>> controller, hence the "CmpltTO" error.
> > >> >>>
> > >> >>> Ideally, before setting the GBPA_ABORT bit, there should be some check
> > >> >>> for active transactions. If that is not possible, a wait should be done
> > >> >>> to ensure that no pending transactions are left.
> > >> >>
> > >> >> In general there is no way to check for active transactions, and even if
> > >> >> there were, waiting for them to finish could mean waiting forever (if,
> > >> >> say, a device is continuously streaming to/from a ring buffer).
> > >> >>  
> > >> >>> why any such delay has not been considered?  
> > >> >>
> > >> >> The main aim here is to block any DMA left over from the crashed kernel
> > >> >> as quickly as possible, to minimise any further potential corruption of
> > >> >> memory (consider if a device was left writing to an IOMMU virtual
> > >> >> address that happened to have the same value as some physical address in
> > >> >> the crash kernel's reserved memory). The fact that an arbitrary delay
> > >> >> happens to give a 'nicer' result in one particular situation on one
> > >> >> particular platform is neither here nor there in general.
> > >> >>  
> > >> >
> > >> > I agree.
> > >> > But we are depending upon kdump boot time and expecting devices to
> > >> > reach to idle state before setting GBPA_ABORT bit.  
> > >>
> > >> So (ideally) stop depending on that, because like I said it's fragile
> > >> and doesn't generalise.
> > >>  
> > >> > adding a delay will be fair and make it independent of kdump boot time.  
> > >>
> > >> And what delay value is "fair" and appropriate for any device on any
> > >> system in any circumstance?
> > >>  
> > >
> > > It is a tough question. 1 second could be considered.
> > >  
> > >> >> Besides, this is *crash* kernel, so yeah, expect errors - something's
> > >> >> already gone badly wrong to get us here, and everything from then on is
> > >> >> merely a best-effort attempt to salvage what we can. Does it even make
> > >> >> sense to have AER enabled at this point?
> > >> >>  
> > >> >
> > >> > I tried disabling AER in the kdump kernel, but it did not help: the
> > >> > network device went out of sync with respect to its tx unit, which
> > >> > caused it to hang, and it never recovered. The same can happen with
> > >> > other devices such as SATA.
> > >>
> > >> Any devices that the kdump kernel wants to use need to be fully reset
> > >> to get them into a sane state anyway, don't they? I mean, what if the
> > >> crash was *caused* by one of those devices going wrong in the first
> > >> place? Any devices that kdump *doesn't* care about shouldn't matter,
> > >> since nothing should be unmasking their interrupts regardless of what
> > >> state they're in.
> > >>
> > >> Assume some descriptor or pagetable entry got corrupted that caused
> > >> your network device to access an invalid physical address downstream
> > >> of the SMMU and get an abort from that *before* the kdump kernel
> > >> starts - is waiting an extra 100ms at any point after that going to
> > >> help?
> > >>  
> > > > I agree with you. In the above scenario, where the device is faulty or
> > > > has done something wrong, waiting even for hours is a waste.
> > > >
> > > > I was going through other IOMMU drivers, as this problem is a generic
> > > > one, and I found the following patch:
> > >
> > > commit 091d42e43d21b6ca7ec39bf5f9e17bc0bd8d4312
> > > Author: Joerg Roedel <jroedel@suse.de>
> > > Date:   Fri Jun 12 11:56:10 2015 +0200
> > >     iommu/vt-d: Copy translation tables from old kernel
> > >
> > >     If we are in a kdump kernel and find translation enabled in
> > >     the iommu, try to copy the translation tables from the old
> > >     kernel to preserve the mappings until the device driver
> > >     takes over.
> > >
> > >     This supports old and the extended root-entry and
> > >     context-table formats.
> > >
> > >     Tested-by: ZhenHua Li <zhen-hual@hp.com>
> > >     Tested-by: Baoquan He <bhe@redhat.com>
> > >     Signed-off-by: Joerg Roedel <jroedel@suse.de>
> > >
> > > > I believe a similar kind of solution is required for the SMMU as well.
> >
> > There's a much more general problem: how to preserve pre-boot DMA
> > configurations because they are important to the new kernel (for
> > whatever reason).
> >
> > And in a number of cases, it makes perfect sense: framebuffer
> > scanning out boot animations, ongoing DMA for other *cough* agents
> > *cough* in the system...
> >
> > But I really don't like the idea of preserving the page tables across
> > a kdump kernel because for all we know, these page tables could be
> > horribly corrupted and mostly only make sense in the context of
> > the driver that created them.  
> 
> If I am correct, a similar approach is used in the GIC-ITS for LPI tables.
> The probability of corruption is still there.

You are incorrect.

The memory is reused, in the sense that we cannot use another set of
redistributor tables (which are *not* ITS tables) once LPIs are
enabled. But none of the *data* is reused at all, and we happily
reprogram everything. So we know for sure that nothing will be written
outside of the pending tables, and nothing will be read outside of the
property table. Also, this memory is *physical*, as the GIC is not
translated by the SMMU.

So no, there is no corruption in this case. Well tried though! ;-)

> 
> > Oh wait, this driver has long died,
> > along with the rest of the original kernel.
> >  
> 
> :(
> If this is the case, then the chance of a foolproof and generic solution
> is very small.

You're missing the very point of kdump: We use it when everything else
has failed. The system is in an unrecoverable situation, and we have no
idea of what is going on. All we're trying to do is to snapshot whatever
is left of it.

How could it be foolproof?

> At the least, a delay should be considered before setting the GBPA_ABORT
> bit, to give in-flight DMA a chance to complete.

As Robin pointed out, there is no such delay. How long are you going to
wait? Until the whole of the memory is corrupted? Also, how do you know
when the DMA stops?  A framebuffer scans out the whole screen 60 times a
second (and that's a pretty bad display...). You could wait forever!
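
To put rough numbers on that, here is a minimal sketch in plain C. It is
not taken from the SMMU driver: the function names and the model (a
device that issues one DMA burst per fixed period, starting at time 0,
and never stops by itself) are assumptions made purely for illustration.
With a 60Hz scanout the period is about 16667us, so neither the 100ms
delay nor the 1s delay discussed above ends with the device idle.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: the device issues one DMA burst every period_us
 * microseconds, at t = 0, period_us, 2 * period_us, ... and never stops.
 */

/* Number of bursts that land inside the delay window [0, delay_us). */
static uint64_t bursts_during_delay(uint64_t period_us, uint64_t delay_us)
{
	if (period_us == 0)
		return 0;	/* a period of 0 means "no periodic DMA" here */

	/* ceil(delay_us / period_us) without using floating point */
	return (delay_us + period_us - 1) / period_us;
}

/* Time between the end of the delay and the next burst arriving. */
static uint64_t quiet_time_after_delay(uint64_t period_us, uint64_t delay_us)
{
	uint64_t rem;

	if (period_us == 0)
		return UINT64_MAX;	/* no periodic DMA: quiet forever */

	rem = delay_us % period_us;
	return rem ? period_us - rem : 0;
}

/* Self-check for a ~60Hz scanout (period of roughly 16667us). */
static void delay_model_selfcheck(void)
{
	const uint64_t period_us = 16667;

	/* A 100ms delay sees 6 bursts; a 1s delay sees 60. */
	assert(bursts_during_delay(period_us, 100000) == 6);
	assert(bursts_during_delay(period_us, 1000000) == 60);

	/* In both cases the next burst is less than one period away. */
	assert(quiet_time_after_delay(period_us, 100000) < period_us);
	assert(quiet_time_after_delay(period_us, 1000000) < period_us);
}
```

Calling delay_model_selfcheck() from a test program exercises both
helpers. Whatever delay is picked, the next burst in this model is less
than one period away, so a longer delay only lets more DMA through
before it is blocked.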

The best thing to do for a screaming device is to shut it up as quickly
as possible, and the SMMU is a good tool for that.

	M.
-- 
Jazz is not dead. It just smells funny...

end of thread, other threads:[~2020-03-26 17:19 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
2020-03-23 15:21 arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel Prabhakar Kushwaha
2020-03-23 16:58 ` Robin Murphy
2020-03-26 13:36   ` Prabhakar Kushwaha
2020-03-26 14:19     ` Robin Murphy
2020-03-26 15:35       ` Prabhakar Kushwaha
2020-03-26 16:13         ` Marc Zyngier
2020-03-26 16:55           ` Prabhakar Kushwaha
2020-03-26 17:19             ` Marc Zyngier