From: Jerry Snitselaar <jsnitsel@redhat.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: iommu@lists.linux-foundation.org, Will Deacon <will@kernel.org>
Subject: Re: arm-smmu.1.auto: Unhandled context fault starting with 5.4-rc1
Date: Mon, 17 Feb 2020 07:58:12 -0700
Message-ID: <20200217145812.w77idhj3u6jgaeam@cantor>
In-Reply-To: <efb6da9c-51a3-c35c-1bbf-ae6808006beb@arm.com>
On Mon Feb 17 20, Robin Murphy wrote:
>On 16/02/2020 10:11 pm, Jerry Snitselaar wrote:
>>On Fri Feb 14 20, Robin Murphy wrote:
>>>Hi Jerry,
>>>
>>>On 2020-02-14 8:13 pm, Jerry Snitselaar wrote:
>>>>Hi Will,
>>>>
>>>>On a Gigabyte system with Cavium CN8xx, when doing a fio test against
>>>>an nvme drive we are seeing the following:
>>>>
>>>>[ 637.161194] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x8010003f6000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.174329] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x801000036000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.186887] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x8010002ee000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.199275] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x8010003c7000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.211885] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x801000392000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.224580] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x801000018000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.237241] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x801000360000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.249657] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x8010000ba000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.262120] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x80100003e000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>>[ 637.274468] arm-smmu arm-smmu.1.auto: Unhandled context
>>>>fault: fsr=0x80000402, iova=0x801000304000, fsynr=0x70091,
>>>>cbfrsynra=0x9000, cb=7
>>>
>>>Those "IOVAs" don't look much like IOVAs from the DMA allocator -
>>>if they were physical addresses, would they correspond to an
>>>expected region of the physical memory map?
>>>
>>>I would suspect that this is most likely misbehaviour in the NVMe
>>>driver (issuing a write to a non-DMA-mapped address), and the SMMU
>>>is just doing its job in blocking and reporting it.
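[Editor's sketch] The register values in the log above are consistent with the "write to a non-DMA-mapped address" reading. The following is a minimal decoder, assuming the bit positions used by the Linux arm-smmu driver headers (MULTI = bit 31 and TF = bit 1 in FSR, WNR = bit 4 in FSYNR0); the function name and the returned keys are hypothetical, and the bit positions should be checked against your own kernel tree.

```python
# Bit positions are assumptions recalled from the Linux arm-smmu driver
# headers; verify them against your kernel tree before relying on them.
FSR_MULTI = 1 << 31   # more than one fault was recorded
FSR_TF = 1 << 1       # translation fault: no valid mapping for the address
FSYNR0_WNR = 1 << 4   # "write not read": the faulting access was a write


def decode_fault(fsr: int, fsynr: int) -> dict:
    """Return the few fault attributes that matter for this thread."""
    return {
        "multiple_faults": bool(fsr & FSR_MULTI),
        "translation_fault": bool(fsr & FSR_TF),
        "is_write": bool(fsynr & FSYNR0_WNR),
    }


if __name__ == "__main__":
    # Values copied from the log lines quoted above.
    print(decode_fault(0x80000402, 0x70091))
```

Under those assumptions, every logged fault decodes as a write that hit an unmapped address, with more than one fault recorded. Other fields of the two registers are deliberately left undecoded here.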
>>>
>>>>I also reproduced with 5.5-rc7, and will check 5.6-rc1 later
>>>>today. I couldn't narrow it down further into 5.4-rc1.
>>>>I don't know smmu or the code well, any thoughts on where to
>>>>start digging into this?
>>>>
>>>>fio test that is being run is:
>>>>
>>>>#fio -filename=/dev/nvme0n1 -iodepth=64 -thread -rw=randwrite
>>>>-ioengine=libaio -bs=4k -runtime=43200 -size=-group_reporting
>>>>-name=mytest -numjobs=32
>>>
>>>Just to clarify, do other tests work OK on the same device?
>>>
>>>Thanks,
>>>Robin.
>>>
>>
>>I was able to get back on the system today. I think I know what the
>>problem is:
>>
>>[ 0.036189] iommu: Gigabyte R120-T34-00 detected, force iommu
>>passthrough mode
>>[ 6.324282] iommu: Default domain type: Translated
>>
>>So the new default domain code in 5.4 overrides the iommu quirk code
>>setting default passthrough. Testing a quick patch that tracks whether
>>the default domain was set in the quirk code, and leaves it alone if
>>it was. So far it seems to be working.
>
>Ah, OK. Could you point me at that quirk code? I can't seem to track
>it down in mainline, and seeing this much leaves me dubious that it's
>even correct - matching a particular board implies that it's a
>firmware issue (as far as I'm aware the SMMUs in CN88xx SoCs are
>usable in general), but if the firmware description is wrong to the
>point that DMA ops translation doesn't work, then no other translation
>(e.g. VFIO) is likely to work either. In that case it's simply not
>safe to enable the SMMU at all, and fudging the default domain type
>merely hides one symptom of the problem.
>
>Robin.
>
Ugh. It is a RHEL-only patch, but for some reason it is also applied to
the ark kernel builds. Sorry for the noise.
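
[Editor's sketch] The interaction described in this thread, where a board quirk forces passthrough and later default-domain setup overwrites it, can be modelled in a few lines. This is not the RHEL patch or any kernel code; it is a toy model, and every name in it (IommuState, set_by_quirk, apply_board_quirk, set_default_domain_type) is hypothetical. It only illustrates the "track whether the quirk set it, and leave it alone if so" idea.

```python
from dataclasses import dataclass

PASSTHROUGH = "Passthrough"
TRANSLATED = "Translated"


@dataclass
class IommuState:
    def_domain_type: str = TRANSLATED
    set_by_quirk: bool = False  # hypothetical flag: did a quirk pick the type?


def apply_board_quirk(state: IommuState, board: str, quirk_boards: set) -> None:
    """Early boot: force passthrough on boards listed in the quirk table."""
    if board in quirk_boards:
        state.def_domain_type = PASSTHROUGH
        state.set_by_quirk = True


def set_default_domain_type(state: IommuState, configured_type: str) -> str:
    """Later setup: apply the configured default unless a quirk already chose."""
    if not state.set_by_quirk:
        state.def_domain_type = configured_type
    return state.def_domain_type


if __name__ == "__main__":
    state = IommuState()
    apply_board_quirk(state, "R120-T34-00", {"R120-T34-00"})
    print(set_default_domain_type(state, TRANSLATED))
```

Without the set_by_quirk check, the second call would always win, which matches the "Default domain type: Translated" line in the boot log. Robin's objection still applies to a fix of this shape: it preserves the quirk's effect but does not address whatever firmware problem made the quirk necessary.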