linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lu Baolu <baolu.lu@linux.intel.com>
To: "Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.)" 
	<longpeng2@huawei.com>,
	dwmw2@infradead.org, joro@8bytes.org, will@kernel.org,
	alex.williamson@redhat.com
Cc: baolu.lu@linux.intel.com, iommu@lists.linux-foundation.org,
	LKML <linux-kernel@vger.kernel.org>,
	"Gonglei (Arei)" <arei.gonglei@huawei.com>,
	chenjiashang@huawei.com
Subject: Re: A problem of Intel IOMMU hardware ?
Date: Wed, 17 Mar 2021 13:16:58 +0800	[thread overview]
Message-ID: <692186fd-42b8-4054-ead2-f6c6b1bf5b2d@linux.intel.com> (raw)
In-Reply-To: <670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com>

Hi Longpeng,

On 3/17/21 11:16 AM, Longpeng (Mike, Cloud Infrastructure Service 
Product Dept.) wrote:
> Hi guys,
> 
> We find the Intel iommu cache (i.e. iotlb) maybe works wrong in a special
> situation, it would cause DMA fails or get wrong data.
> 
> The reproducer (based on Alex's vfio testsuite[1]) is in attachment, it can
> reproduce the problem with high probability (~50%).
> 
> The machine we used is:
> processor	: 47
> vendor_id	: GenuineIntel
> cpu family	: 6
> model		: 85
> model name	: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
> stepping	: 4
> microcode	: 0x2000069
> 
> And the iommu capability reported is:
> ver 1:0 cap 8d2078c106f0466 ecap f020df
> (caching mode = 0 , page-selective invalidation = 1)
> 
> (The problem is also on 'Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz' and
> 'Intel(R) Xeon(R) Platinum 8378A CPU @ 3.00GHz')
> 
> We run the reproducer on Linux 4.18 and it works as follow:
> 
> Step 1. alloc 4G *2M-hugetlb* memory (N.B. no problem with 4K-page mapping)

I don't understand 2M-hugetlb here means exactly. The IOMMU hardware
supports both 2M and 1G super page. The mapping physical memory is 4G.
Why couldn't it use 1G super page?

> Step 2. DMA Map 4G memory
> Step 3.
>      while (1) {
>          {UNMAP, 0x0, 0xa0000}, ------------------------------------ (a)
>          {UNMAP, 0xc0000, 0xbff40000},

Have these two ranges been mapped before? Does the IOMMU driver
complains when you trying to unmap a range which has never been
mapped? The IOMMU driver implicitly assumes that mapping and
unmapping are paired.

>          {MAP,   0x0, 0xc0000000}, --------------------------------- (b)
>                  use GDB to pause at here, and then DMA read IOVA=0,

IOVA 0 seems to be a special one. Have you verified with other addresses
than IOVA 0?

>                  sometimes DMA success (as expected),
>                  but sometimes DMA error (report not-present).
>          {UNMAP, 0x0, 0xc0000000}, --------------------------------- (c)
>          {MAP,   0x0, 0xa0000},
>          {MAP,   0xc0000, 0xbff40000},
>      }
> 
> The DMA read operations sholud success between (b) and (c), it should NOT report
> not-present at least!
> 
> After analysis the problem, we think maybe it's caused by the Intel iommu iotlb.
> It seems the DMA Remapping hardware still uses the IOTLB or other caches of (a).
> 
> When do DMA unmap at (a), the iotlb will be flush:
>      intel_iommu_unmap
>          domain_unmap
>              iommu_flush_iotlb_psi
> 
> When do DMA map at (b), no need to flush the iotlb according to the capability
> of this iommu:
>      intel_iommu_map
>          domain_pfn_mapping
>              domain_mapping
>                  __mapping_notify_one
>                      if (cap_caching_mode(iommu->cap)) // FALSE
>                          iommu_flush_iotlb_psi

That's true. The iotlb flushing is not needed in case of PTE been
changed from non-present to present unless caching mode.

> But the problem will disappear if we FORCE flush here. So we suspect the iommu
> hardware.
> 
> Do you have any suggestion ?

Best regards,
baolu

  reply	other threads:[~2021-03-17  5:27 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-17  3:16 A problem of Intel IOMMU hardware ? Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-17  5:16 ` Lu Baolu [this message]
2021-03-17  9:40   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-17 15:18   ` Alex Williamson
2021-03-18  2:58     ` Lu Baolu
2021-03-18  4:46       ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-18  7:48         ` Nadav Amit
2021-03-17  5:46 ` Nadav Amit
2021-03-17  9:35   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-17 18:12     ` Nadav Amit
2021-03-18  3:03       ` Lu Baolu
2021-03-18  8:20       ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-18  8:27         ` Tian, Kevin
2021-03-18  8:38           ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-18  8:43             ` Tian, Kevin
2021-03-18  8:54               ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-18  8:56             ` Tian, Kevin
2021-03-18  9:25               ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-18 16:46                 ` Nadav Amit
2021-03-21 23:51                   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-22  0:27                   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-03-27  2:31                   ` Lu Baolu
2021-03-27  4:36                     ` Nadav Amit
2021-03-27  5:27                       ` Lu Baolu
2021-03-19  0:15               ` Lu Baolu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=692186fd-42b8-4054-ead2-f6c6b1bf5b2d@linux.intel.com \
    --to=baolu.lu@linux.intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=chenjiashang@huawei.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longpeng2@huawei.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).