From: Prabhakar Kushwaha <prabhakar.pkin@gmail.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Ganapatrao Prabhakerrao Kulkarni <gkulkarni@marvell.com>,
Will Deacon <will@kernel.org>,
will.deacon@arm.com, Bhupesh Sharma <bhsharma@redhat.com>,
kexec mailing list <kexec@lists.infradead.org>,
Bjorn Helgaas <helgaas@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel
Date: Thu, 26 Mar 2020 22:25:38 +0530 [thread overview]
Message-ID: <CAJ2QiJJ8nB=2df2++1KahTD8Cqe7f5JY1tp=tZ07QomvgFwjnA@mail.gmail.com> (raw)
In-Reply-To: <81616336e3b8c5f083b26125bf47575c@kernel.org>
On Thu, Mar 26, 2020 at 9:43 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On 2020-03-26 15:35, Prabhakar Kushwaha wrote:
> > On Thu, Mar 26, 2020 at 7:49 PM Robin Murphy <robin.murphy@arm.com>
> > wrote:
> >>
> >> On 2020-03-26 1:36 pm, Prabhakar Kushwaha wrote:
> >> > On Mon, Mar 23, 2020 at 10:28 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >> >>
> >> >> On 2020-03-23 3:21 pm, Prabhakar Kushwaha wrote:
> >> >>> Hi All,
> >> >>>
> >> >>> I am facing issue on Marvell's ARM64 Thunder X2 with kdump kernel.
> >> >>> Here network card is continuously giving following AER error
> >> >>> [ 100.839168] igb 0000:09:00.1: AER: aer_status: 0x00004000,
> >> >>> aer_mask: 0x00000000
> >> >>> [ 100.846463] igb 0000:09:00.1: AER: [14] CmpltTO (First)
> >> >>> [ 100.861491] igb 0000:09:00.1: AER: aer_layer=Transaction Layer,
> >> >>> aer_agent=Requester ID
> >> >>> [ 100.869400] igb 0000:09:00.1: AER: aer_uncor_severity: 0x00062011
> >> >>>
> >> >>> This error is not 100% reproducible. It happens 1 out of 4 try.
> >> >>>
> >> >>> This error goes away in following two scenarios
> >> >>> A) Set iommu in bypass mode via bootargs iommu.passthrough=1
> >> >>> B) Wait for ~100ms in arm_smmu_device_reset of drivers/iommu/arm-smmu-v3.c
> >> >>> if (reg & CR0_SMMUEN) {
> >> >>> dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
> >> >>> WARN_ON(is_kdump_kernel() && !disable_bypass);
> >> >>> mdelay(100); <-- Added delay
> >> >>> arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> >> >>> }
> >> >>>
> >> >>> From A), it is clear that it is related to IOMMU
> >> >>> From B), looks like during boot of kdump kernel, network card is still
> >> >>> active and it has sent some request over PCIe.
> >> >>> as GPBA_ABORT bit is set, no response/completion coming to PCIe
> >> >>> controller hence "CmpltTO" error.
> >> >>>
> >> >>> Ideally before setting GPBA_ABORT bit, there should be some check for
> >> >>> active transaction. if it is not possible, a wait should be done to
> >> >>> assure that no more pending transaction left.
> >> >>
> >> >> In general there is no way to check for active transactions, and even if
> >> >> there were, waiting for them to finish could mean waiting forever (if,
> >> >> say, a device is continuously streaming to/from a ring buffer).
> >> >>
> >> >>> why any such delay has not been considered?
> >> >>
> >> >> The main aim here is to block any DMA left over from the crashed kernel
> >> >> as quickly as possible, to minimise any further potential corruption of
> >> >> memory (consider if a device was left writing to an IOMMU virtual
> >> >> address that happened to have the same value as some physical address in
> >> >> the crash kernel's reserved memory). The fact that an arbitrary delay
> >> >> happens to give a 'nicer' result in one particular situation on one
> >> >> particular platform is neither here nor there in general.
> >> >>
> >> >
> >> > I agree.
> >> > But we are depending upon kdump boot time and expecting devices to
> >> > reach to idle state before setting GBPA_ABORT bit.
> >>
> >> So (ideally) stop depending on that, because like I said it's fragile
> >> and doesn't generalise.
> >>
> >> > adding a delay will be fair and make it independent of kdump boot time.
> >>
> >> And what delay value is "fair" and appropriate for any device on any
> >> system in any circumstance?
> >>
> >
> > it is tough question. 1sec can be thought of.
> >
> >> >> Besides, this is *crash* kernel, so yeah, expect errors - something's
> >> >> already gone badly wrong to get us here, and everything from then on is
> >> >> merely a best-effort attempt to salvage what we can. Does it even make
> >> >> sense to have AER enabled at this point?
> >> >>
> >> >
> >> > i tried by disabling AER in kdump kernel. but it did not helped as
> >> > network device become out of sync with respect to tx unit causing it
> >> > to be hanged and it never recovered from there. Same can happen with
> >> > other devices like SATA etc
> >>
> >> Any devices that the kdump kernel wants to use need to be fully reset
> >> to
> >> get them into a sane state anyway, don't they? I mean, what if the
> >> crash
> >> was *caused* by once of those devices going wrong in the first place?
> >> Any devices that kdump *doesn't* care about shouldn't matter, since
> >> nothing should be unmasking their interrupts regardless of what state
> >> they're in.
> >>
> >> Assume some descriptor or pagetable entry got corrupted that caused
> >> your
> >> network device to access an invalid physical address downstream of the
> >> SMMU and get an abort from that *before* the kdump kernel starts - is
> >> waiting an extra 100ms at any point after that going to help?
> >>
> > I agree with you. in above scenaro, where device if faulty or done
> > something wrong, waiting even hours is waste.
> >
> > I was just going through other iommu drivers as this problem is
> > generic one and i found following patch
> >
> > commit 091d42e43d21b6ca7ec39bf5f9e17bc0bd8d4312
> > Author: Joerg Roedel <jroedel@suse.de>
> > Date: Fri Jun 12 11:56:10 2015 +0200
> > iommu/vt-d: Copy translation tables from old kernel
> >
> > If we are in a kdump kernel and find translation enabled in
> > the iommu, try to copy the translation tables from the old
> > kernel to preserve the mappings until the device driver
> > takes over.
> >
> > This supports old and the extended root-entry and
> > context-table formats.
> >
> > Tested-by: ZhenHua Li <zhen-hual@hp.com>
> > Tested-by: Baoquan He <bhe@redhat.com>
> > Signed-off-by: Joerg Roedel <jroedel@suse.de>
> >
> > I believe, similar kind of solution is required for SMMU also.
>
> There's a much more general problem: how to preserve pre-boot DMA
> configurations because they are important to the new kernel (for
> whatever reason).
>
> And in a number of cases, it makes perfect sense: framebuffer
> scanning out boot animations, ongoing DMA for other *cough* agents
> *cough* in the system...
>
> But I really don't like the idea of preserving the page tables across
> a kdump kernel because for all we know, these page tables could be
> horribly corrupted and mostly only make sense in the context of
> the driver that created them.
If I am correct, similar approach is used in GIC-ITS for LPI tables.
Probability of corruption is still there.
> Oh wait, this driver has long died,
> along with the rest of the original kernel.
>
:(
if this is the case than chance of foolproof and generic solution is very less.
At least a delay should be considered before setting SMMU_ABORT bit
for giving a chance of DMA getting completed.
--pk
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2020-03-26 16:56 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-23 15:21 arm64: Getting continuous PCIe "CmpltTO" AER from network card in kdump kernel Prabhakar Kushwaha
2020-03-23 16:58 ` Robin Murphy
2020-03-26 13:36 ` Prabhakar Kushwaha
2020-03-26 14:19 ` Robin Murphy
2020-03-26 15:35 ` Prabhakar Kushwaha
2020-03-26 16:13 ` Marc Zyngier
2020-03-26 16:55 ` Prabhakar Kushwaha [this message]
2020-03-26 17:19 ` Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJ2QiJJ8nB=2df2++1KahTD8Cqe7f5JY1tp=tZ07QomvgFwjnA@mail.gmail.com' \
--to=prabhakar.pkin@gmail.com \
--cc=bhsharma@redhat.com \
--cc=gkulkarni@marvell.com \
--cc=helgaas@kernel.org \
--cc=kexec@lists.infradead.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=robin.murphy@arm.com \
--cc=will.deacon@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).