On Mon, Oct 11, 2021 at 02:17:48PM -0300, Jason Gunthorpe wrote:
> On Mon, Oct 11, 2021 at 04:37:38PM +1100, david@gibson.dropbear.id.au wrote:
> > > PASID support will already require that a device can be multi-bound to
> > > many IOAS's, couldn't PPC do the same with the windows?
> >
> > I don't see how that would make sense.  The device has no awareness of
> > multiple windows the way it does of PASIDs.  It just sends
> > transactions over the bus with the IOVAs it's told.  If those IOVAs
> > lie within one of the windows, the IOMMU picks them up and translates
> > them.  If they don't, it doesn't.
>
> To my mind that address centric routing is awareness.

I don't really understand that position.  A PASID capable device has
to be built to be PASID capable, and will generally have registers
into which you store PASIDs to use.

Any 64-bit DMA capable device can use the POWER IOMMU just fine - it's
up to the driver to program it with addresses that will be translated
(and in Linux the driver will get those from the DMA subsystem).

> If the HW can attach multiple non-overlapping IOAS's to the same
> device then the HW is routing to the correct IOAS by using the address
> bits. This is not much different from the prior discussion we had
> where we were thinking of the PASID as an 80 bit address

Ah... that might be a workable approach.  And it even helps me get my
head around multiple attachment which I was struggling with before.

So, the rule would be that you can attach multiple IOASes to a device,
as long as none of them overlap.  The non-overlapping could be because
each IOAS covers a disjoint address range, or it could be because
there's some attached information - such as a PASID - to disambiguate.

What remains a question is where the disambiguating information comes
from in each case: does it come from properties of the IOAS,
properties of the device, or from extra parameters supplied at attach
time.
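
The attach rule above can be sketched roughly as follows.  This is
only an illustrative model, not any existing kernel or iommufd
interface: the IOAS and Device classes, their fields, and the window
addresses are all hypothetical, and it assumes (as just one of the
three options listed above) that the PASID is supplied as an extra
parameter at attach time.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class IOAS:
    """Hypothetical I/O address space covering the IOVA range [start, end)."""
    name: str
    start: int
    end: int


@dataclass
class Device:
    """Hypothetical device holding a list of (ioas, pasid) attachments."""
    name: str
    attachments: List[Tuple[IOAS, Optional[int]]] = field(default_factory=list)

    def attach(self, ioas: IOAS, pasid: Optional[int] = None) -> None:
        """Attach an IOAS, rejecting it if it is ambiguous with an existing one."""
        for other, other_pasid in self.attachments:
            # A different PASID disambiguates, so the IOVA ranges may overlap.
            if pasid != other_pasid:
                continue
            # Same PASID (or none on both): half-open ranges must be disjoint.
            if ioas.start < other.end and other.start < ioas.end:
                raise ValueError(f"{ioas.name} overlaps {other.name}")
        self.attachments.append((ioas, pasid))

    def route(self, iova: int, pasid: Optional[int] = None) -> Optional[IOAS]:
        """Return the IOAS that would translate this transaction, if any."""
        for ioas, ioas_pasid in self.attachments:
            if ioas_pasid == pasid and ioas.start <= iova < ioas.end:
                return ioas
        return None  # outside every window: nothing translates it


# Two disjoint windows on one device (addresses are made up for illustration).
dev = Device("dev0")
low = IOAS("low-window", 0x0, 0x8000_0000)
high = IOAS("high-window", 1 << 59, (1 << 59) + (1 << 40))
dev.attach(low)
dev.attach(high)
```

With this model, routing by address alone selects the window: an IOVA
inside low-window resolves to that IOAS, and an IOVA outside both
windows resolves to nothing.  If the disambiguating information came
from the IOAS or the device instead, the pasid argument would move
from attach() into one of those objects, but the overlap check itself
would stay the same.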
IIUC, the current draft suggests that for PASIDs the disambiguating
information always comes at attach time.  Obviously the more
consistency we can have here the better.

I can also see an additional problem in implementation, once we start
looking at hot-adding devices to existing address spaces.  Suppose our
software (maybe qemu) wants to set up a single DMA view for a bunch of
devices, that has such a split window.  It can set up IOASes easily
enough for the two windows, then it needs to attach them.  Presumably,
it attaches them one at a time, which means that each device (or
group) goes through an interim state where it's attached to one, but
not the other.  That can probably be achieved by using an extra IOMMU
domain (or the local equivalent) in the hardware for that interim
state.  However it means we have to repeatedly create and destroy that
extra domain for each device after the first we add, rather than
simply adding each device to the domain which has both windows.

[I think this doesn't arise on POWER when running under PowerVM.  That
 has no concept like IOMMU domains, and instead the mapping is always
 done per "partitionable endpoint" (PE), essentially a group.  That
 means it's just a question of whether we mirror mappings on both
 windows into a given PE or just those from one IOAS.  It's not an
 unreasonable extension/combination of existing hardware quirks to
 consider, though]

> The fact the PPC HW actually has multiple page table roots and those
> roots even have different page tables layouts while still connected to
> the same device suggests this is not even an unnatural modelling
> approach...
>
> Jason
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson