Another idea is seperate remote TLB invalidate into two instructions: - sfence.vma.b.asyc - sfence.vma.b.barrier // wait all async TLB invalidate operations finished for all harts. (I remember who mentioned me separate them into two instructions after session. Anup? Is the idea right ?) Actually, I never consider asyc TLB invalidate before, because current our light iommu did not need it. Thx all people attend the session :) Let's continue the talk. Guo Ren 于 2019年9月12日周四 22:59写道: > Thx Will for reply. > > On Thu, Sep 12, 2019 at 3:03 PM Will Deacon wrote: > > > > On Sun, Sep 08, 2019 at 07:52:55AM +0800, Guo Ren wrote: > > > On Mon, Jun 24, 2019 at 6:40 PM Will Deacon wrote: > > > > > I'll keep my system use the same ASID for SMP + IOMMU :P > > > > > > > > You will want a separate allocator for that: > > > > > > > > > https://lkml.kernel.org/r/20190610184714.6786-2-jean-philippe.brucker@arm.com > > > > > > Yes, it is hard to maintain ASID between IOMMU and CPUMMU or different > > > system, because it's difficult to synchronize the IO_ASID when the CPU > > > ASID is rollover. > > > But we could still use hardware broadcast TLB invalidation instruction > > > to uniformly manage the ASID and IO_ASID, or OTHER_ASID in our IOMMU. > > > > That's probably a bad idea, because you'll likely stall execution on the > > CPU until the IOTLB has completed invalidation. In the case of ATS, I > think > > an endpoint ATC is permitted to take over a minute to respond. In > reality, I > > suspect the worst you'll ever see would be in the msec range, but that's > > still an unacceptable period of time to hold a CPU. > Just as I've said in the session that IOTLB invalidate delay is > another topic, My main proposal is to introduce stage1.pgd and > stage2.pgd as address space identifiers between different TLB systems > based on vmid, asid. My last part of sildes will show you how to > translate stage1/2.pgd to as/vmid in PCI ATS system and the method > could work with SMMU-v3 and intel Vt-d. (It's regret for me there is > no time to show you the whole slides.) > > In our light IOMMU implementation, there's no IOTLB invalidate delay > problem. Becasue IOMMU is very close to CPU MMU and interconnect's > delay is the same with SMP CPUs MMU (no PCI, VM supported). > > To solve the problem, we could define a async mode in sfence.vma.b to > slove the problem and finished with per_cpu_irq/exception. > > -- > Best Regards > Guo Ren > > ML: https://lore.kernel.org/linux-csky/ >