All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Thompson <daniel.thompson@linaro.org>
To: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Jon Nettleton <jon@solid-run.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Diana Madalina Craciun <diana.craciun@nxp.com>,
	Ioana Ciornei <ioana.ciornei@nxp.com>,
	leoyang.li@nxp.com
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case
Date: Wed, 17 Nov 2021 17:00:13 +0000	[thread overview]
Message-ID: <20211117170013.otj3gwfqnmvtlxlu@maple.lan> (raw)
In-Reply-To: <ef23386b-5b83-a791-e2f0-a72ec610836a@nxp.com>

On Wed, Nov 17, 2021 at 05:30:32PM +0200, Laurentiu Tudor wrote:
> On 11/17/2021 3:59 PM, Daniel Thompson wrote:
> > On Wed, Nov 17, 2021 at 03:07:51PM +0200, Laurentiu Tudor wrote:
> >> On 11/12/2021 7:31 PM, Daniel Thompson wrote:
> >>> On Thu, Nov 11, 2021 at 06:36:58PM +0100, Jon Nettleton wrote:
> >>>> On Thu, Nov 11, 2021 at 6:23 PM Daniel Thompson
> >>>> <daniel.thompson@linaro.org> wrote:
> >>>> The correct solution for the problem you are seeing is the ACPI
> >>>> maintainers figuring out how to land the IORT RMR patchset.  Until
> >>>> that is done the only workaround is setting "arm-smmu.disable_bypass=0
> >>>> iommu.passthrough=1" on the kernel commandline.  The latter option is
> >>>> required since 5.15 and I haven't had time or energy to figure out
> >>>> why.  The proper solution is to just land the IORT RMR patchset and
> >>>> let HoneyComb run with the SMMU enabled.
> >>>
> >>> Thanks for the update. I'll probably adopt iommu.passthrough=1 for now.
> >>> That allows me to adopt a distro kernel when it updates to v5.15.
> >>
> >> The "iommu.passthrough=1" kernel arg shouldn't be needed. By chance, do
> >> you remember what errors were you seeing? What was failing?
> > 
> > For all testing of v5.15 I had "arm-smmu.disable_bypass=0" set because I
> > was guided to enable that by the error messages in older kernels ;-) .
> > 
> > Anyhow without "iommu.passthrough=1" (and without the patch from this thread
> > reverted) then the logs are being massively spammed with error messages:
> > 
> > ~~~
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> > arm_smmu_context_fault: 1697259 callbacks suppressed
> > ~~~
> > 
> > This results a relatively simple workstation (LX2 + nVidia GT-710 + USB
> > for networking) becoming unresponsive. How long to fail is a little
> > unpredictable. I assumed that the weight of such dense log messages
> > eventually gets into a timing pattern that prevented any useful
> > interrupts from being serviced... but that is only a guess.
> > 
> 
> Few comments here:
>  - I'm suspecting that the PCI video card is triggering the smmu faults.
> Would it be possible to give it a try with the card out and without
> "iommu.passthrough=1"?

The PCIe video card does not cause the smmu faults. These still manifest
when the card is removed (and with same IOVA).


>  - the IOVAs look weird to me, they should look something like
> 0xffffxxxxxx or so. Maybe there are issues in the nvidia driver?

I guess there could be, but why would a problem that bisects down to
a change in the fsl-mc-bus initialization configuration alter the
behaviour of the PCIe graphics driver?


>  - Would it be possible to share a full boot log? I'm thinking that it
> would be interesting to see how the devices are allocated in iommu groups.

See
https://gist.github.com/daniel-thompson/07489561f14965fd1af7d5bd4340f54b

It contains three files, all gathered with the GPU removed:

 * Logs from unmodified v5.15 with iommu.passthrough=1 set
   (networking is good).
 * Logs from v5.15 patched with the revert I shared earlier in
   the thread (networking is good).
 * Logs from v5.15 without iommu.passthough=1 set (many SMMU messages,
   networking is broken).


Daniel.

  reply	other threads:[~2021-11-17 17:00 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-15 14:07 [PATCH 1/8] bus: fsl-mc: fix arg in call to dprc_scan_objects() laurentiu.tudor
2021-07-15 14:07 ` [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case laurentiu.tudor
2021-11-11 17:23   ` Daniel Thompson
2021-11-11 17:36     ` Jon Nettleton
2021-11-12 17:31       ` Daniel Thompson
2021-11-17 13:07         ` Laurentiu Tudor
2021-11-17 13:22           ` Jon Nettleton
2021-11-17 13:59           ` Daniel Thompson
2021-11-17 15:30             ` Laurentiu Tudor
2021-11-17 17:00               ` Daniel Thompson [this message]
2021-11-18 12:41                 ` Laurentiu Tudor
2021-11-17 13:03     ` Laurentiu Tudor
2021-11-17 14:46       ` Daniel Thompson
2021-07-15 14:07 ` [PATCH 3/8] bus: fsl-mc: fully resume the firmware laurentiu.tudor
2021-07-15 14:07 ` [PATCH 4/8] bus: fsl-mc: add .shutdown() op for the bus driver laurentiu.tudor
2021-07-15 14:07 ` [PATCH 5/8] bus: fsl-mc: pause the MC firmware before IOMMU setup laurentiu.tudor
2021-07-15 14:07 ` [PATCH 6/8] bus: fsl-mc: pause the MC firmware when unloading laurentiu.tudor
2021-07-15 14:07 ` [PATCH 7/8] bus: fsl-mc: rescan devices if endpoint not found laurentiu.tudor
2021-07-15 14:07 ` [PATCH 8/8] bus: fsl-mc: fix mmio base address for child DPRCs laurentiu.tudor
2021-07-21 16:11 ` [PATCH 1/8] bus: fsl-mc: fix arg in call to dprc_scan_objects() Diana Madalina Craciun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211117170013.otj3gwfqnmvtlxlu@maple.lan \
    --to=daniel.thompson@linaro.org \
    --cc=diana.craciun@nxp.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=ioana.ciornei@nxp.com \
    --cc=jon@solid-run.com \
    --cc=laurentiu.tudor@nxp.com \
    --cc=leoyang.li@nxp.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.