On Tue, Jun 01, 2021 at 07:09:21PM +0800, Lu Baolu wrote:
> Hi Jason,
>
> On 2021/5/29 7:36, Jason Gunthorpe wrote:
> > > /*
> > >  * Bind an user-managed I/O page table with the IOMMU
> > >  *
> > >  * Because user page table is untrusted, IOASID nesting must be enabled
> > >  * for this ioasid so the kernel can enforce its DMA isolation policy
> > >  * through the parent ioasid.
> > >  *
> > >  * Pgtable binding protocol is different from DMA mapping. The latter
> > >  * has the I/O page table constructed by the kernel and updated
> > >  * according to user MAP/UNMAP commands. With pgtable binding the
> > >  * whole page table is created and updated by userspace, thus different
> > >  * set of commands are required (bind, iotlb invalidation, page fault, etc.).
> > >  *
> > >  * Because the page table is directly walked by the IOMMU, the user
> > >  * must use a format compatible to the underlying hardware. It can
> > >  * check the format information through IOASID_GET_INFO.
> > >  *
> > >  * The page table is bound to the IOMMU according to the routing
> > >  * information of each attached device under the specified IOASID. The
> > >  * routing information (RID and optional PASID) is registered when a
> > >  * device is attached to this IOASID through VFIO uAPI.
> > >  *
> > >  * Input parameters:
> > >  *	- child_ioasid;
> > >  *	- address of the user page table;
> > >  *	- formats (vendor, address_width, etc.);
> > >  *
> > >  * Return: 0 on success, -errno on failure.
> > >  */
> > > #define IOASID_BIND_PGTABLE		_IO(IOASID_TYPE, IOASID_BASE + 9)
> > > #define IOASID_UNBIND_PGTABLE	_IO(IOASID_TYPE, IOASID_BASE + 10)
> > Also feels backwards, why wouldn't we specify this, and the required
> > page table format, during alloc time?
> >
>
> Thinking of the required page table format, perhaps we should shed more
> light on the page table of an IOASID. So far, an IOASID might represent
> one of the following page tables (might be more):
>
>  1) an IOMMU format page table (a.k.a. iommu_domain)
>  2) a user application CPU page table (SVA for example)
>  3) a KVM EPT (future option)
>  4) a VM guest managed page table (nesting mode)
>
> This version only covers 1) and 4). Do you think we need to support 2),

Isn't (2) the equivalent of using the host-managed pagetable and then
doing a giant MAP of all your user address space into it?  But maybe
we should identify that case explicitly in case the host can optimize
it.

> 3) and beyond? If so, it seems that we need some in-kernel helpers and
> uAPIs to support pre-installing a page table to IOASID. From this point
> of view an IOASID is actually not just a variant of iommu_domain, but an
> I/O page table representation in a broader sense.
>
> Best regards,
> baolu
>

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson