On Fri, 2019-11-22 at 11:04 +0800, Lu Baolu wrote:
> Intel VT-d 3.0 introduces more caches and interfaces for software to
> flush when it runs in the scalable mode. Currently the various cache
> flush helpers are scattered around. This consolidates them by putting
> them in the existing iommu_flush structure.
>
> /* struct iommu_flush - Intel IOMMU cache invalidation ops
>  *
>  * @cc_inv: invalidate context cache
>  * @iotlb_inv: invalidate IOTLB and paging structure caches when software
>  *             has changed second-level tables
>  * @p_iotlb_inv: invalidate IOTLB and paging structure caches when software
>  *               has changed first-level tables
>  * @pc_inv: invalidate pasid cache
>  * @dev_tlb_inv: invalidate cached mappings used by requests-without-PASID
>  *               from the Device-TLB on an endpoint device
>  * @p_dev_tlb_inv: invalidate cached mappings used by requests-with-PASID
>  *                 from the Device-TLB on an endpoint device
>  */
> struct iommu_flush {
>         void (*cc_inv)(struct intel_iommu *iommu, u16 did,
>                        u16 sid, u8 fm, u64 type);
>         void (*iotlb_inv)(struct intel_iommu *iommu, u16 did, u64 addr,
>                           unsigned int size_order, u64 type);
>         void (*p_iotlb_inv)(struct intel_iommu *iommu, u16 did, u32 pasid,
>                             u64 addr, unsigned long npages, bool ih);
>         void (*pc_inv)(struct intel_iommu *iommu, u16 did, u32 pasid,
>                        u64 granu);
>         void (*dev_tlb_inv)(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>                             u16 qdep, u64 addr, unsigned int mask);
>         void (*p_dev_tlb_inv)(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>                               u32 pasid, u16 qdep, u64 addr,
>                               unsigned long npages);
> };
>
> The name of each cache flush op is defined according to spec section
> 6.5, so that people can easily look them up in the spec.

Hm, indirect function calls are quite expensive these days. I would
have preferred to go in the opposite direction, since we surely aren't
going to have *many* of these implementations. Currently there's only
one for register-based and one for queued invalidation, right?

Even if VT-d 3.0 throws an extra version in, I think I'd prefer to
take out the indirection completely and have an if/then helper, along
the lines of the sketch below.

Would love to see a microbenchmark of unmap operations before and
after this patch series with retpoline enabled, to see the effect.
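To be concrete about the if/then helper I mean, here's a rough sketch
(untested, and the helper name is mine; I'm keying off iommu->qi, which
is how the driver already tells queued invalidation apart from the
register-based path):

static void iommu_flush_iotlb(struct intel_iommu *iommu, u16 did,
                              u64 addr, unsigned int size_order, u64 type)
{
        /* Queued invalidation, if it has been set up... */
        if (iommu->qi)
                qi_flush_iotlb(iommu, did, addr, size_order, type);
        else
                /* ...otherwise fall back to the register-based flush. */
                __iommu_flush_iotlb(iommu, did, addr, size_order, type);
}

With only two (or even three) implementations that's a cheap,
predictable branch, rather than a retpoline'd indirect call on every
flush.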
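And the microbenchmark needn't be anything fancy. Something like this
hypothetical helper (untested, names invented), timing a tight
map+unmap loop on a test domain, run once with retpoline enabled and
once with it disabled, would tell us what we want to know:

#include <linux/iommu.h>
#include <linux/ktime.h>
#include <linux/printk.h>

static void unmap_microbench(struct iommu_domain *domain,
                             unsigned long iova, phys_addr_t paddr,
                             size_t size, int iters)
{
        ktime_t start = ktime_get();
        int i;

        for (i = 0; i < iters; i++) {
                /* Map and immediately unmap; the unmap path is what
                 * exercises the flush helpers we care about here. */
                iommu_map(domain, iova, paddr, size,
                          IOMMU_READ | IOMMU_WRITE);
                iommu_unmap(domain, iova, size);
        }

        pr_info("avg map+unmap: %lld ns\n",
                ktime_to_ns(ktime_sub(ktime_get(), start)) / iters);
}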