Hi guys,

We have found that the Intel IOMMU cache (i.e. the IOTLB) may misbehave in a specific situation, causing DMA to fail or to read wrong data. The reproducer (based on Alex's vfio testsuite [1]) is in the attachment; it reproduces the problem with high probability (~50%).

The machine we used is:

  processor   : 47
  vendor_id   : GenuineIntel
  cpu family  : 6
  model       : 85
  model name  : Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
  stepping    : 4
  microcode   : 0x2000069

And the IOMMU capability reported is:

  ver 1:0 cap 8d2078c106f0466 ecap f020df
  (caching mode = 0, page-selective invalidation = 1)

(The problem also reproduces on 'Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz' and 'Intel(R) Xeon(R) Platinum 8378A CPU @ 3.00GHz'.)

We run the reproducer on Linux 4.18 and it works as follows:

Step 1. Allocate 4G of *2M-hugetlb* memory (N.B. there is no problem with 4K-page mappings).

Step 2. DMA-map the 4G of memory.

Step 3. Loop forever (a sketch of this loop in terms of the VFIO ioctls follows at the end of this mail):

  while (1) {
      {UNMAP, 0x0, 0xa0000},        ---------------------------- (a)
      {UNMAP, 0xc0000, 0xbff40000},
      {MAP,   0x0, 0xc0000000},     ---------------------------- (b)
          Use GDB to pause here, then issue a DMA read at IOVA=0.
          Sometimes the DMA succeeds (as expected), but sometimes
          it fails (the device reports not-present).
      {UNMAP, 0x0, 0xc0000000},     ---------------------------- (c)
      {MAP,   0x0, 0xa0000},
      {MAP,   0xc0000, 0xbff40000},
  }

The DMA reads issued between (b) and (c) should succeed; at the very least they should NOT report not-present!

After analyzing the problem, we suspect the Intel IOMMU IOTLB: the DMA remapping hardware appears to still be using IOTLB (or other cache) entries that should have been invalidated at (a).

When we do the DMA unmap at (a), the IOTLB is flushed:

  intel_iommu_unmap
      domain_unmap
          iommu_flush_iotlb_psi

When we do the DMA map at (b), there is no need to flush the IOTLB according to the capability reported by this IOMMU:

  intel_iommu_map
      domain_pfn_mapping
          domain_mapping
              __mapping_notify_one
                  if (cap_caching_mode(iommu->cap)) // FALSE
                      iommu_flush_iotlb_psi

However, the problem disappears if we FORCE a flush here. So we suspect the IOMMU hardware.

Do you have any suggestions?
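For reference, the Step 3 loop is essentially the following (a minimal sketch in the style of [1], assuming 'container' is a VFIO container fd already set up with VFIO_TYPE1_IOMMU and 'buf' is the 4G hugetlb region from Step 1, e.g. mmap'ed with MAP_HUGETLB; the helper names are ours, the complete reproducer is in the attachment):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static void dma_map(int container, void *buf, uint64_t iova, uint64_t size)
{
        struct vfio_iommu_type1_dma_map map = {
                .argsz = sizeof(map),
                .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
                .vaddr = (uintptr_t)buf + iova, /* buffer offset == IOVA */
                .iova  = iova,
                .size  = size,
        };
        ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
}

static void dma_unmap(int container, uint64_t iova, uint64_t size)
{
        struct vfio_iommu_type1_dma_unmap unmap = {
                .argsz = sizeof(unmap),
                .iova  = iova,
                .size  = size,
        };
        ioctl(container, VFIO_IOMMU_UNMAP_DMA, &unmap);
}

static void step3_loop(int container, void *buf)
{
        while (1) {
                dma_unmap(container, 0x0, 0xa0000);           /* (a) */
                dma_unmap(container, 0xc0000, 0xbff40000);
                dma_map(container, buf, 0x0, 0xc0000000);     /* (b) */
                /* pause here (e.g. under GDB) and trigger a DMA
                 * read from the device at IOVA 0 */
                dma_unmap(container, 0x0, 0xc0000000);        /* (c) */
                dma_map(container, buf, 0x0, 0xa0000);
                dma_map(container, buf, 0xc0000, 0xbff40000);
        }
}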
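The "FORCE flush" experiment mentioned above is roughly the change below (a sketch of __mapping_notify_one() in drivers/iommu/intel-iommu.c as of 4.18, quoted from memory; the extra iommu_flush_iotlb_psi() in the else branch is our debugging hack, not upstream code):

static void __mapping_notify_one(struct intel_iommu *iommu,
                                 struct dmar_domain *domain,
                                 unsigned long pfn, unsigned int pages)
{
        /* Non-present to present mapping: upstream only does a
         * page-selective IOTLB flush when caching mode is set. */
        if (cap_caching_mode(iommu->cap))
                iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
        else {
                iommu_flush_write_buffer(iommu);
                /* Debugging hack: force the IOTLB flush even though
                 * caching mode = 0.  With this change the not-present
                 * errors above are no longer reproducible. */
                iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
        }
}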