From: Lu Baolu <baolu.lu@linux.intel.com>
To: "Longpeng (Mike,
Cloud Infrastructure Service Product Dept.)"
<longpeng2@huawei.com>,
dwmw2@infradead.org, joro@8bytes.org, will@kernel.org,
alex.williamson@redhat.com
Cc: baolu.lu@linux.intel.com, iommu@lists.linux-foundation.org,
LKML <linux-kernel@vger.kernel.org>,
"Gonglei (Arei)" <arei.gonglei@huawei.com>,
chenjiashang@huawei.com
Subject: Re: A problem of Intel IOMMU hardware ?
Date: Wed, 17 Mar 2021 13:16:58 +0800 [thread overview]
Message-ID: <692186fd-42b8-4054-ead2-f6c6b1bf5b2d@linux.intel.com> (raw)
In-Reply-To: <670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com>

Hi Longpeng,
On 3/17/21 11:16 AM, Longpeng (Mike, Cloud Infrastructure Service
Product Dept.) wrote:
> Hi guys,
>
> We find that the Intel IOMMU cache (i.e. the IOTLB) may work incorrectly in a
> special situation, causing DMA to fail or to return wrong data.
>
> The reproducer (based on Alex's vfio testsuite[1]) is attached; it reproduces
> the problem with high probability (~50%).
>
> The machine we used is:
> processor : 47
> vendor_id : GenuineIntel
> cpu family : 6
> model : 85
> model name : Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
> stepping : 4
> microcode : 0x2000069
>
> And the iommu capability reported is:
> ver 1:0 cap 8d2078c106f0466 ecap f020df
> (caching mode = 0, page-selective invalidation = 1)
>
> (The problem is also on 'Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz' and
> 'Intel(R) Xeon(R) Platinum 8378A CPU @ 3.00GHz')
>
> We run the reproducer on Linux 4.18 and it works as follows:
>
> Step 1. Allocate 4G of *2M-hugetlb* memory (N.B. no problem with 4K-page mappings)
I don't understand what 2M-hugetlb means here exactly. The IOMMU hardware
supports both 2M and 1G super pages, and the mapped physical memory is 4G.
Why couldn't it use 1G super pages?
> Step 2. DMA Map 4G memory
> Step 3.
> while (1) {
> {UNMAP, 0x0, 0xa0000}, ------------------------------------ (a)
> {UNMAP, 0xc0000, 0xbff40000},
Have these two ranges been mapped before? Does the IOMMU driver
complain when you try to unmap a range that has never been
mapped? The IOMMU driver implicitly assumes that mapping and
unmapping are paired.
> {MAP, 0x0, 0xc0000000}, --------------------------------- (b)
> use GDB to pause here, and then do a DMA read at IOVA=0;
IOVA 0 seems to be a special one. Have you verified with addresses
other than IOVA 0?
> sometimes the DMA succeeds (as expected),
> but sometimes it fails (reporting not-present).
> {UNMAP, 0x0, 0xc0000000}, --------------------------------- (c)
> {MAP, 0x0, 0xa0000},
> {MAP, 0xc0000, 0xbff40000},
> }
>
> The DMA read operations should succeed between (b) and (c); at the very
> least, they should NOT report not-present!
>
> After analyzing the problem, we think it may be caused by the Intel IOMMU
> IOTLB. It seems the DMA remapping hardware still uses the IOTLB or other
> caches left over from (a).
>
> When doing the DMA unmap at (a), the IOTLB is flushed:
> intel_iommu_unmap
> domain_unmap
> iommu_flush_iotlb_psi
>
> When doing the DMA map at (b), there is no need to flush the IOTLB, according
> to the capabilities of this IOMMU:
> intel_iommu_map
> domain_pfn_mapping
> domain_mapping
> __mapping_notify_one
> if (cap_caching_mode(iommu->cap)) // FALSE
> iommu_flush_iotlb_psi
That's true. IOTLB flushing is not needed when a PTE changes from
non-present to present, unless caching mode is in use.
> But the problem disappears if we FORCE a flush here. So we suspect the IOMMU
> hardware.
>
> Do you have any suggestions?
Best regards,
baolu