This needs to be discussed and debated at length; proposing edits to the spec at this stage is putting the cart before the horse!

We shouldn’t change the definition of the existing SFENCE.VMA instruction to accomplish this. It’s also not abundantly clear to me that this should be an instruction: TLB shootdown looks more like MMIO.

On Thu, Sep 19, 2019 at 5:36 AM Guo Ren wrote:

> From: Guo Ren
>
> The patch is for https://github.com/riscv/riscv-isa-manual
>
> The proposal has been talked in LPC-2019 RISC-V MC ref [1]. Here is the
> formal patch.
>
> Introduction
> ============
>
> Using the Hardware TLB broadcast invalidation instruction to maintain the
> system TLB is a good choice and it'll simplify the system software design.
> The proposal hopes to add a broadcast mode to the sfence.vma in the
> riscv-privilege specification. To support the sfence.vma broadcast mode,
> there are two modification introduced below:
>
> 1) Add PGD.PPN (root page table's PPN) as the unique identifier of the
>    address space in addition to asid/vmid. Compared to the dynamically
>    changed asid/vmid, PGD.PPN is fixed throughout the address space life
>    cycle. This feature enables uniform address space identification
>    between different TLB systems (actually, it's difficult to unify the
>    asid/vmid between the CPU system and the IOMMU system, because their
>    mechanisms are different)
>
> 2) Modify the definition of the sfence.vma instruction from synchronous
>    mode to asynchronous mode, which means that the completion of the TLB
>    operation is not guaranteed when the sfence.vma instruction retires.
>    It needs to be completed by checking the flag bit on the hart. The
>    sfence.vma request finish can notify the software by generating an
>    interrupt. This function alleviates the large delay of TLB invalidation
>    in the PCI ATS system.
>
> Add S1/S2.PGD.PPN for ASID/VMID
> ===============================
>
> PGD is global directory (defined in linux) and PPN is physical page number
> (defined in riscv-spec). PGD.PPN corresponds to the root page table pointer
> of the address space, i.e. mm->pgd (linux concept).
>
> In CPU/IOMMU TLB, we use asid/vmid to distinguish the address space of
> process or virtual machine. Due to the limitation of id encoding, it can
> only represent a part(window) of the address space. S1/S2.PGD.PPN are the
> root page table's PPNs of the address spaces and S1/S2.PGD.PPN are the
> unique identifier of the address spaces.
>
> For the CPU SMP system, you can use context switch to perform the necessary
> software mechanism to ensure that the asid/vmid on all harts is consistent
> (please refer to the arm64 asid mechanism). In this way, the TLB broadcast
> invalidation instruction can determine the address space processed on all
> harts by asid/vmid.
>
> Different from the CPU SMP system, there is no context switch for the
> DMA-IOMMU system, so the unification with the CPU asid/vmid cannot be
> guaranteed. So we need a unique identifier for the address space to
> establish a communication bridge between the TLBs of different systems.
>
> That is PGD.PPN (for virtualization scenarios: S1/S2.PGD.PPN)
>
> current:
>   sfence.vma  rs1 = vaddr, rs2 = asid
>   hfence.vvma rs1 = vaddr, rs2 = asid
>   hfence.gvma rs1 = gaddr, rs2 = vmid
>
> proposed:
>   sfence.vma  rs1 = vaddr, rs2 = mode:ppn:asid
>   hfence.vvma rs1 = vaddr, rs2 = mode:ppn:asid
>   hfence.gvma rs1 = gaddr, rs2 = mode:ppn:vmid
>
> mode      - broadcast | local
> ppn       - the PPN of the root page table of the address space
> vmid/asid - the window identifier of the address space
>
> At the Linux Plumbers Conference 2019 RISC-V MC, ref:[1], we showed two
> IOMMU examples to explain how it works with hardware.
>
> 1) In a lightweight IOMMU system (up to 64 address spaces), the hardware
>    could directly convert PGD.PPN into DID (IOMMU ASID)
>
> 2) For the PCI ATS scenario, its IO ASID/VMID encoding space can support
>    a very large number of address spaces. We use two reverse mapping
>    tables to let the hardware translate S1/S2.PGD.PPN into IO ASID/VMID.
>
> ASYNC BROADCAST SFENCE.VMA
> ===========================
>
> To support the high latency broadcast sfence.vma operation in the PCI ATS
> usage scenario, we modify the sfence.vma from synchronous mode to
> asynchronous mode. (A simpler implementation may provide only synchronous
> mode in hardware; software written for asynchronous mode still works.)
>
> To implement the asynchronous mode, 3 features are added:
> 1) sstatus:TLBI
>    A "status bit - TLBI" is added to the sstatus register. The TLBI status
>    bit indicates if there are still outstanding sfence.vma requests on the
>    current hart.
>    Value:
>      1: sfence.vma requests are not completed.
>      0: all sfence.vma requests completed, request queue is empty.
>
> 2) sstatus:TLBIC
>    A "control bit - TLBIC" is added to the sstatus register. The TLBIC
>    control bit is controlled by software.
>    "Write 1" will trigger the current hart check to see if there are still
>    outstanding sfence.vma requests. If there are unfinished requests, an
>    interrupt will be generated when the request is completed, notifying
>    the software that all of the current sfence.vma requests have been
>    completed.
>    "Write 0" will cause nothing.
>
> 3) supervisor interrupt register (sip & sie):TLBI finish interrupt
>    A per-hart interrupt is added to supervisor interrupt registers.
>    When all sfence.vma requests are completed and sstatus:TLBIC has been
>    triggered, hart will receive a TLBI finish interrupt. Just like timer,
>    software and external interrupt's definition in sip & sie.
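
For discussion purposes, the three quoted features can be written down as a small software model. This is only a sketch of one possible reading of the commit message: the struct, field, and function names are hypothetical, and the rule that a TLBIC write with nothing outstanding arms no interrupt is inferred from "If there are unfinished requests" rather than stated outright.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one hart's asynchronous-fence state.
 * Only TLBI, TLBIC and the finish interrupt come from the proposal;
 * every name below is illustrative. */
struct hart_model {
    unsigned pending;    /* outstanding sfence.vma requests */
    bool armed;          /* set by a write of 1 to sstatus:TLBIC */
    bool irq_pending;    /* models the TLBI finish interrupt pending bit */
};

/* sstatus:TLBI reads 1 while any request is still outstanding. */
static bool read_tlbi(const struct hart_model *h)
{
    return h->pending != 0;
}

/* Issuing an sfence.vma queues one request on the hart. */
static void issue_fence(struct hart_model *h)
{
    h->pending++;
}

/* Writing 1 arms the interrupt only if work is outstanding;
 * writing 0 has no effect. */
static void write_tlbic(struct hart_model *h, bool value)
{
    if (value && h->pending != 0)
        h->armed = true;
}

/* One completion arrives; the interrupt fires when the queue drains
 * and TLBIC had been written. */
static void complete_one(struct hart_model *h)
{
    assert(h->pending > 0);
    h->pending--;
    if (h->pending == 0 && h->armed) {
        h->armed = false;
        h->irq_pending = true;
    }
}
```

Note that under this reading a hart that never writes TLBIC never sees the interrupt, which differs from the supervisor.tex wording later in the patch (interrupt on every 1-to-0 transition of TLBI).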
>
> Fake code:
>
> flush_tlb_page(vma, addr) {
>         asid = cpu_asid(vma->vm_mm);
>         ppn = PFN_DOWN(__pa(vma->vm_mm->pgd)); // pgd is a kernel virtual address
>
>         sfence.vma (addr, 1|PPN_OFFSET(ppn)|asid); //1. start request
>
>         while(sstatus:TLBI) if (time_out() > 1ms) break; //2. loop check
>
>         while (sstatus:TLBI) {
>                 ...
>                 set sstatus:TLBIC;
>                 wait_TLBI_finish_interrupt(); //3. wait irq, io_schedule
>         }
> }
>
> Here we give a 2-level check:
> 1) loop check sstatus:TLBI; the CPU can still respond to interrupts.
> 2) set sstatus:TLBIC and wait for irq; the CPU schedules out to another task.
>
> ACE-DVM Example
> ===============
>
> Honestly, "broadcasting addr, asid, vmid, S1/S2.PGD.PPN to interconnects"
> and "ASYNC SFENCE.VMA" could be implemented by ACE-DVM protocol ref [2].
>
> There are 3 types of transactions in DVM:
>
> - DVM operation
>   Send all information to the interconnect, including addr, asid,
>   S1.PGD.PPN, vmid, S2.PGD.PPN.
>
> - DVM synchronization
>   Check that all DVM operations have been completed. If not, it will use
>   state machine to wait DVM complete requests.
>
> - DVM complete
>   Return transaction from components, eg: IOMMU. If hart has received all
>   DVM completes which are triggered by sfence.vma instructions and
>   "sstatus:TLBIC" has been set, a TLBI finish interrupt is triggered.
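
To make the two-level wait in the quoted fake code concrete, here is a minimal sketch that runs against a simulated hart instead of real CSRs. Everything in it is an assumption for illustration: `sim_hart`, `sim_tick`, and `wait_for_fences` are hypothetical names, one "tick" retires one outstanding request, and the spin budget stands in for the 1 ms timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for hardware state. */
struct sim_hart {
    unsigned pending;   /* outstanding requests; nonzero means TLBI = 1 */
    bool armed;         /* TLBIC has been written with 1 */
    bool irq;           /* TLBI finish interrupt delivered */
};

/* Each tick retires one request and raises the interrupt when the
 * queue drains while TLBIC is armed. */
static void sim_tick(struct sim_hart *h)
{
    if (h->pending == 0)
        return;
    h->pending--;
    if (h->pending == 0 && h->armed) {
        h->armed = false;
        h->irq = true;
    }
}

enum wait_path { WAIT_SPIN, WAIT_IRQ };

/* Level 1: bounded polling of TLBI. Level 2: arm TLBIC and wait for
 * the finish interrupt. Returns which level observed completion. */
static enum wait_path wait_for_fences(struct sim_hart *h, unsigned spin_budget)
{
    while (h->pending != 0 && spin_budget > 0) {
        sim_tick(h);
        spin_budget--;
    }
    if (h->pending == 0)
        return WAIT_SPIN;

    h->armed = true;            /* set sstatus:TLBIC */
    while (!h->irq)
        sim_tick(h);            /* stands in for io_schedule() */
    h->irq = false;             /* acknowledge the interrupt */
    assert(h->pending == 0);
    return WAIT_IRQ;
}
```

With a generous budget short queues finish in the polling loop; long ones fall through to the interrupt path, which is the trade-off the two levels are meant to capture.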
>
> (Actually, we do not need to implement the above functions strictly
> according to the ACE specification :P )
>
> 1: https://www.linuxplumbersconf.org/event/4/contributions/307/
> 2: AMBA AXI and ACE Protocol Specification - Distributed Virtual Memory
>    Transactions
>
> Signed-off-by: Guo Ren
> Reviewed-by: Li Feiteng
> ---
> src/hypervisor.tex | 43 ++++++++-------
> src/supervisor.tex | 155
> +++++++++++++++++++++++++++++++++++++++++------------
> 2 files changed, 143 insertions(+), 55 deletions(-)
>
> diff --git a/src/hypervisor.tex b/src/hypervisor.tex
> index 47b90b2..3718819 100644
> --- a/src/hypervisor.tex
> +++ b/src/hypervisor.tex
> @@ -1094,15 +1094,15 @@ The hypervisor extension adds two new privileged
> fence instructions.
> \multicolumn{1}{c|}{opcode} \\
> \hline
> 7 & 5 & 5 & 3 & 5 & 7 \\
> -HFENCE.GVMA & vmid & gaddr & PRIV & 0 & SYSTEM \\
> -HFENCE.VVMA & asid & vaddr & PRIV & 0 & SYSTEM \\
> +HFENCE.GVMA & mode:ppn:vmid & gaddr & PRIV & 0 & SYSTEM \\
> +HFENCE.VVMA & mode:ppn:asid & vaddr & PRIV & 0 & SYSTEM \\
> \end{tabular}
> \end{center}
>
> The hypervisor memory-management fence instructions, HFENCE.GVMA and
> HFENCE.VVMA, are valid only in HS-mode when {\tt mstatus}.TVM=0, or in
> M-mode
> (irrespective of {\tt mstatus}.TVM).
> -These instructions perform a function similar to SFENCE.VMA
> +These instructions perform a function similar to SFENCE.VMA
> (broadcast/local)
> (Section~\ref{sec:sfence.vma}), except applying to the guest-physical
> memory-management data structures controlled by CSR {\tt hgatp}
> (HFENCE.GVMA)
> or the VS-level memory-management data structures controlled by CSR {\tt
> vsatp}
> @@ -1136,11 +1136,10 @@ An HFENCE.VVMA instruction applies only to a
> single virtual machine, identified
> by the setting of {\tt hgatp}.VMID when HFENCE.VVMA executes.
> \end{commentary}
>
> -When {\em rs2}$\neq${\tt x0}, bits XLEN-1:ASIDMAX of the value held in
> {\em
> -rs2} are reserved for future use and should be zeroed by software and
> ignored
> -by current implementations.
> -Furthermore, if ASIDLEN~$<$~ASIDMAX, the implementation shall ignore bits
> -ASIDMAX-1:ASIDLEN of the value held in {\em rs2}.
> +When {\em rs2}$\neq${\tt x0}, bits contain 3 informations: mode, ppn,
> asid.
> +1) mode control HFENCE.VVMA broadcast or not.
> +2) ppn is the root page talbe's PPN of the asid address space.
> +3) asid is the identifier of process in virtual machine.
>
> \begin{commentary}
> Simpler implementations of HFENCE.VVMA can ignore the guest virtual
> address in
> @@ -1168,11 +1167,10 @@ physical addresses in PMP address registers
> (Section~\ref{sec:pmp}) and in page
> table entries (Sections \ref{sec:sv32}, \ref{sec:sv39},
> and~\ref{sec:sv48}).
> \end{commentary}
>
> -When {\em rs2}$\neq${\tt x0}, bits XLEN-1:VMIDMAX of the value held in
> {\em
> -rs2} are reserved for future use and should be zeroed by software and
> ignored
> -by current implementations.
> -Furthermore, if VMIDLEN~$<$~VMIDMAX, the implementation shall ignore bits
> -VMIDMAX-1:VMIDLEN of the value held in {\em rs2}.
> +When {\em rs2}$\neq${\tt x0}, bits contain 3 informations: mode, vmid,
> ppn.
> +1) mode control HFENCE.GVMA broadcast or not.
> +2) ppn is the root page talbe's PPN of the vmid address space.
> +3) vmid is the identifier of virtual machine.
>
> \begin{commentary}
> Simpler implementations of HFENCE.GVMA can ignore the guest physical
> address in
> @@ -1567,21 +1565,22 @@ register.
> \subsection{Memory-Management Fences}
>
> The behavior of the SFENCE.VMA instruction is affected by the current
> -virtualization mode V. When V=0, the virtual-address argument is an
> HS-level
> -virtual address, and the ASID argument is an HS-level ASID.
> +virtualization mode V.
When V=0, the rs1 argument is an HS-level
> +virtual address, and the rs2 argument is an HS-level ASID and root page
> table's PPN.
> The instruction orders stores only to HS-level address-translation
> structures
> with subsequent HS-level address translations.
>
> -When V=1, the virtual-address argument to SFENCE.VMA is a guest virtual
> -address within the current virtual machine, and the ASID argument is a
> VS-level
> -ASID within the current virtual machine.
> +When V=1, the rs1 argument to SFENCE.VMA is a guest virtual
> +address within the current virtual machine, and the rs2 argument is a
> VS-level
> +ASID and root page table's PPN within the current virtual machine.
> The current virtual machine is identified by the VMID field of CSR {\tt
> hgatp},
> -and the effective ASID can be considered to be the combination of this
> VMID
> -with the VS-level ASID.
> +and the effective ASID and root page table's PPN can be considered to be
> the
> +combination of this VMID and root page table's PPN with the VS-level ASID
> and
> +root page table's PPN.
> The SFENCE.VMA instruction orders stores only to the VS-level
> address-translation structures with subsequent VS-level address
> translations
> -for the same virtual machine, i.e., only when {\tt hgatp}.VMID is the
> same as
> -when the SFENCE.VMA executed.
> +for the same virtual machine, i.e., only when {\tt hgatp}.VMID and {\\tt
> hgatp}.PPN is
> +the same as when the SFENCE.VMA executed.
>
> Hypervisor instructions HFENCE.GVMA and HFENCE.VVMA provide additional
> memory-management fences to complement SFENCE.VMA.
> diff --git a/src/supervisor.tex b/src/supervisor.tex
> index ba3ced5..2877b7a 100644
> --- a/src/supervisor.tex
> +++ b/src/supervisor.tex
> @@ -47,10 +47,12 @@ register keeps track of the processor's current
> operating state.
> \begin{center}
> \setlength{\tabcolsep}{4pt}
> \scalebox{0.95}{
> -\begin{tabular}{cWcccccWccccWcc}
> +\begin{tabular}{cccWcccccWccccWcc}
> \\
> \instbit{31} &
> -\instbitrange{30}{20} &
> +\instbit{30} &
> +\instbit{29} &
> +\instbitrange{28}{20} &
> \instbit{19} &
> \instbit{18} &
> \instbit{17} &
> @@ -66,6 +68,8 @@ register keeps track of the processor's current
> operating state.
> \instbit{0} \\
> \hline
> \multicolumn{1}{|c|}{SD} &
> +\multicolumn{1}{|c|}{TLBI} &
> +\multicolumn{1}{|c|}{TLBIC} &
> \multicolumn{1}{c|}{\wpri} &
> \multicolumn{1}{c|}{MXR} &
> \multicolumn{1}{c|}{SUM} &
> @@ -82,7 +86,7 @@ register keeps track of the processor's current
> operating state.
> \multicolumn{1}{c|}{\wpri}
> \\
> \hline
> -1 & 11 & 1 & 1 & 1 & 2 & 2 & 4 & 1 & 1 & 1 & 1 & 3 & 1 & 1 \\
> +1 & 1 & 1 & 10 & 1 & 1 & 1 & 2 & 2 & 4 & 1 & 1 & 1 & 1 & 3 & 1 & 1 \\
> \end{tabular}}
> \end{center}
> }
> @@ -95,10 +99,12 @@ register keeps track of the processor's current
> operating state.
> {\footnotesize
> \begin{center}
> \setlength{\tabcolsep}{4pt}
> -\begin{tabular}{cMFScccc}
> +\begin{tabular}{cccMFScccc}
> \\
> \instbit{SXLEN-1} &
> -\instbitrange{SXLEN-2}{34} &
> +\instbit{SXLEN-2} &
> +\instbit{SXLEN-3} &
> +\instbitrange{SXLEN-4}{34} &
> \instbitrange{33}{32} &
> \instbitrange{31}{20} &
> \instbit{19} &
> @@ -107,6 +113,8 @@ register keeps track of the processor's current
> operating state.
> \\
> \hline
> \multicolumn{1}{|c|}{SD} &
> +\multicolumn{1}{|c|}{TLBI} &
> +\multicolumn{1}{|c|}{TLBIC} &
> \multicolumn{1}{c|}{\wpri} &
> \multicolumn{1}{c|}{UXL[1:0]} &
> \multicolumn{1}{c|}{\wpri} &
> @@ -115,7 +123,7 @@ register keeps track of the processor's current
> operating state.
> \multicolumn{1}{c|}{\wpri} &
> \\
> \hline
> -1 & SXLEN-35 & 2 & 12 & 1 & 1 & 1 & \\
> +1 & 1 & 1 & SXLEN-37 & 2 & 12 & 1 & 1 & 1 & \\
> \end{tabular}
> \begin{tabular}{cWWFccccWcc}
> \\
> @@ -152,6 +160,17 @@ register keeps track of the processor's current
> operating state.
> \label{sstatusreg}
> \end{figure*}
>
> +The TLBI (read-only) bit indicates that any async sfence.vma operations
> are
> +still pended on the hart. The value:0 means that there is no sfence.vma
> +operations pending and value:1 means that there are still sfence.vma
> operations
> +pending on the hart.
> +
> +When the sstatus:TLBIC bit is written 1, it triggers the hardware to
> check if
> +there are any TLB invalidate operations being pended. When all operations
> are
> +finished, a TLB Invalidate finish interrupt will be triggered
> +(see Section~\ref{sipreg}). When the sstatus:TLBIC bit is written 0, it
> will
> +cause nothing. Reading sstatus:TLBIC bit will alaways return 0.
> +
> The SPP bit indicates the privilege level at which a hart was executing
> before
> entering supervisor mode. When a trap is taken, SPP is set to 0 if the
> trap
> originated from user mode, or 1 otherwise. When an SRET instruction
> @@ -329,8 +348,10 @@ SXLEN-bit read/write register containing interrupt
> enable bits.
> {\footnotesize
> \begin{center}
> \setlength{\tabcolsep}{4pt}
> -\begin{tabular}{KcFcFcc}
> -\instbitrange{SXLEN-1}{10} &
> +\begin{tabular}{KcFcFcFcc}
> +\instbitrange{SXLEN-1}{14} &
> +\instbit{13} &
> +\instbitrange{12}{10} &
> \instbit{9} &
> \instbitrange{8}{6} &
> \instbit{5} &
> @@ -339,6 +360,8 @@ SXLEN-bit read/write register containing interrupt
> enable bits.
> \instbit{0} \\
> \hline
> \multicolumn{1}{|c|}{\wpri} &
> +\multicolumn{1}{c|}{STLBIP} &
> +\multicolumn{1}{|c|}{\wpri} &
> \multicolumn{1}{c|}{SEIP} &
> \multicolumn{1}{c|}{\wpri} &
> \multicolumn{1}{c|}{STIP} &
> @@ -346,7 +369,7 @@ SXLEN-bit read/write register containing interrupt
> enable bits.
> \multicolumn{1}{c|}{SSIP} &
> \multicolumn{1}{c|}{\wpri} \\
> \hline
> -SXLEN-10 & 1 & 3 & 1 & 3 & 1 & 1 \\
> +SXLEN-14 & 1 & 3 & 1 & 3 & 1 & 3 & 1 & 1 \\
> \end{tabular}
> \end{center}
> }
> @@ -359,8 +382,10 @@ SXLEN-10 & 1 & 3 & 1 & 3 & 1 & 1 \\
> {\footnotesize
> \begin{center}
> \setlength{\tabcolsep}{4pt}
> -\begin{tabular}{KcFcFcc}
> -\instbitrange{SXLEN-1}{10} &
> +\begin{tabular}{KcFcFcFcc}
> +\instbitrange{SXLEN-1}{14} &
> +\instbit{13} &
> +\instbitrange{12}{10} &
> \instbit{9} &
> \instbitrange{8}{6} &
> \instbit{5} &
> @@ -369,6 +394,8 @@ SXLEN-10 & 1 & 3 & 1 & 3 & 1 & 1 \\
> \instbit{0} \\
> \hline
> \multicolumn{1}{|c|}{\wpri} &
> +\multicolumn{1}{c|}{STLBIE} &
> +\multicolumn{1}{|c|}{\wpri} &
> \multicolumn{1}{c|}{SEIE} &
> \multicolumn{1}{c|}{\wpri} &
> \multicolumn{1}{c|}{STIE} &
> @@ -376,7 +403,7 @@ SXLEN-10 & 1 & 3 & 1 & 3 & 1 & 1 \\
> \multicolumn{1}{c|}{SSIE} &
> \multicolumn{1}{c|}{\wpri} \\
> \hline
> -SXLEN-10 & 1 & 3 & 1 & 3 & 1 & 1 \\
> +SXLEN-14 & 1 & 3 & 1 & 3 & 1 & 3 & 1 & 1 \\
> \end{tabular}
> \end{center}
> }
> @@ -410,6 +437,12 @@ when the SEIE bit in the {\tt sie} register is
> clear. The implementation
> should provide facilities to mask, unmask, and query the cause of external
> interrupts.
>
> +A supervisor-level TLB Invalidate finish interrupt is pending if the
> STLBIP bit
> +in the {\tt sip} register is set. Supervisor-level TLB Invalidate finish
> +interrupts are disabled when the STLBIE bit in the {\tt sie} register is
> clear.
> +When hart tlb invalidate operations are finished, hardware will change
> sstatus:TLBI
> +bit from 1 to 0 and trigger TLB Invalidate finish interrupt.
> +
> \begin{commentary}
> The {\tt sip} and {\tt sie} registers are subsets of the {\tt mip} and
> {\tt
> mie} registers. Reading any field, or writing any writable field, of {\tt
> @@ -598,7 +631,9 @@ so is only guaranteed to hold supported exception
> codes.
> 1 & 5 & Supervisor timer interrupt \\
> 1 & 6--8 & {\em Reserved} \\
> 1 & 9 & Supervisor external interrupt \\
> - 1 & 10--15 & {\em Reserved} \\
> + 1 & 10--11 & {\em Reserved} \\
> + 1 & 12 & Supervisor TLBI finish interrupt \\
> + 1 & 13--15 & {\em Reserved} \\
> 1 & $\ge$16 & {\em Available for platform use} \\ \hline
> 0 & 0 & Instruction address misaligned \\
> 0 & 1 & Instruction access fault \\
> @@ -884,7 +919,7 @@ provided.
> \multicolumn{1}{c|}{opcode} \\
> \hline
> 7 & 5 & 5 & 3 & 5 & 7 \\
> -SFENCE.VMA & asid & vaddr & PRIV & 0 & SYSTEM \\
> +SFENCE.VMA & mode:ppn:asid & vaddr & LOCAL & 0 & SYSTEM \\
> \end{tabular}
> \end{center}
>
> @@ -899,21 +934,70 @@ from that hart to the memory-management data
> structures.
> Further details on the behavior of this instruction are
> described in Section~\ref{virt-control} and Section~\ref{pmp-vmem}.
>
> +SFENCE.VMA is defined as an asynchronous completion instruction, which
> means
> +that the TLB operation is not guaranteed to complete when the instruction
> retires.
> +Software need check sstatus:TLBI to determine all TLB operations complete.
> +The sstatus:TLBI described in Section~\ref{sstatus}. When hardware change
> +sstatus:TLBI bit from 1 to 0, the TLB Invalidate finish interrupt will be
> +triggered.
> +
> \begin{commentary}
> -The SFENCE.VMA is used to flush any local hardware caches related to
> +The SFENCE.VMA is used to flush any local/remote hardware caches related
> to
> address translation. It is specified as a fence rather than a TLB
> flush to provide cleaner semantics with respect to which instructions
> are affected by the flush operation and to support a wider variety of
> dynamic caching structures and memory-management schemes. SFENCE.VMA
> is also used by higher privilege levels to synchronize page table
> -writes and the address translation hardware.
> +writes and the address translation hardware. There is a mode bit to
> determine
> +sfence.vma would broadcast on interconnect or not.
> \end{commentary}
>
> -SFENCE.VMA orders only the local hart's implicit references to the
> -memory-management data structures.
> +\begin{figure}[h!]
> +{\footnotesize
> +\begin{center}
> +\begin{tabular}{c@{}E@{}K}
> +\instbit{31} &
> +\instbitrange{30}{9} &
> +\instbitrange{8}{0} \\
> +\hline
> +\multicolumn{1}{|c|}{{\tt MODE}} &
> +\multicolumn{1}{|c|}{{\tt PPN (root page table)}} &
> +\multicolumn{1}{|c|}{{\tt ASID}} \\
> +\hline
> +1 & 22 & 9 \\
> +\end{tabular}
> +\end{center}
> +}
> +\vspace{-0.1in}
> +\caption{RV32 sfence.vma rs2 format.}
> +\label{rv32satp}
> +\end{figure}
> +
> +\begin{figure}[h!]
> +{\footnotesize
> +\begin{center}
> +\begin{tabular}{@{}S@{}T@{}U}
> +\instbitrange{63}{60} &
> +\instbitrange{59}{16} &
> +\instbitrange{15}{0} \\
> +\hline
> +\multicolumn{1}{|c|}{{\tt MODE}} &
> +\multicolumn{1}{|c|}{{\tt PPN (root page table)}} &
> +\multicolumn{1}{|c|}{{\tt ASID}} \\
> +\hline
> +4 & 44 & 16 \\
> +\end{tabular}
> +\end{center}
> +}
> +\vspace{-0.1in}
> +\caption{RV64 sfence.vma rs2 format, for MODE values, only highest bit:63
> is
> +valid and others are reserved.}
> +\label{rv64satp}
> +\end{figure}
>
> \begin{commentary}
> -Consequently, other harts must be notified separately when the
> +The mode's highest bit could control sfence.vma behavior with 1:broadcast
> or 0:local.
> +If only have mode:local, other harts must be notified separately when the
> memory-management data structures have been modified.
> One approach is to use 1)
> a local data fence to ensure local writes are visible globally, then
> @@ -928,8 +1012,17 @@ modified for a single address mapping (i.e., one
> page or superpage), {\em rs1}
> can specify a virtual address within that mapping to effect a translation
> fence for that mapping only. Furthermore, for the common case that the
> translation data structures have only been modified for a single
> address-space
> -identifier, {\em rs2} can specify the address space.
The behavior of
> -SFENCE.VMA depends on {\em rs1} and {\em rs2} as follows:
> +identifier, {\em rs2} can specify the address space with {\tt satp} format
> +which include asid and root page table's PPN information.
> +
> +\begin{commentary}
> +We use ASID and root page table's PPN to determine address space and the
> format
> +stored in rs2 is similar with {\tt satp} described in
> Section~\ref{sec:satp}.
> +ASID are used by local harts and root page table's PPN of the asid are
> used by
> +other different TLB systems, eg: IOMMU.
> +\end{commentary}
> +
> +The behavior of SFENCE.VMA depends on {\em rs1} and {\em rs2} as follows:
>
> \begin{itemize}
> \item If {\em rs1}={\tt x0} and {\em rs2}={\tt x0}, the fence orders all
> @@ -939,23 +1032,18 @@ SFENCE.VMA depends on {\em rs1} and {\em rs2} as
> follows:
> all reads and writes made to any level of the page tables, but only
> for the address space identified by integer register {\em rs2}.
> Accesses to {\em global} mappings (see
> Section~\ref{sec:translation})
> - are not ordered.
> + are not ordered. The mode field in rs2 is determine broadcast or
> local.
> \item If {\em rs1}$\neq${\tt x0} and {\em rs2}={\tt x0}, the fence orders
> only reads and writes made to the leaf page table entry
> corresponding
> to the virtual address in {\em rs1}, for all address spaces.
> \item If {\em rs1}$\neq${\tt x0} and {\em rs2}$\neq${\tt x0}, the fence
> orders only reads and writes made to the leaf page table entry
> corresponding to the virtual address in {\em rs1}, for the address
> - space identified by integer register {\em rs2}.
> + space identified by integer register {\em rs2}. The mode field in
> rs2
> + is determine broadcast or local.
> Accesses to global mappings are not ordered.
> \end{itemize}
>
> -When {\em rs2}$\neq${\tt x0}, bits SXLEN-1:ASIDMAX of the value held in
> {\em
> -rs2} are reserved for future use and should be zeroed by software and
> ignored
> -by current implementations.
Furthermore, if ASIDLEN~$<$~ASIDMAX, the
> -implementation shall ignore bits ASIDMAX-1:ASIDLEN of the value held in
> {\em
> -rs2}.
> -
> \begin{commentary}
> Simpler implementations can ignore the virtual address in {\em rs1} and
> the ASID value in {\em rs2} and always perform a global fence.
> @@ -994,7 +1082,7 @@ can execute the same SFENCE.VMA instruction while a
> different ASID is loaded
> into {\tt satp}, provided the next time {\tt satp} is loaded with the
> recycled
> ASID, it is simultaneously loaded with the new page table.
>
> -\item If the implementation does not provide ASIDs, or software chooses to
> +\item If the implementation does not provide ASIDs and PPNs, or software
> chooses to
> always use ASID 0, then after every {\tt satp} write, software should
> execute
> SFENCE.VMA with {\em rs1}={\tt x0}. In the common case that no global
> translations have been modified, {\em rs2} should be set to a register
> other than
> @@ -1003,13 +1091,14 @@ not flushed.
>
> \item If software modifies a non-leaf PTE, it should execute SFENCE.VMA
> with
> {\em rs1}={\tt x0}. If any PTE along the traversal path had its G bit
> set,
> -{\em rs2} must be {\tt x0}; otherwise, {\em rs2} should be set to the
> ASID for
> -which the translation is being modified.
> +{\em rs2} must be {\tt x0}; otherwise, {\em rs2} should be set to the
> ASID and
> +root page table's PPN for which the translation is being modified.
>
> \item If software modifies a leaf PTE, it should execute SFENCE.VMA with
> {\em
> rs1} set to a virtual address within the page. If any PTE along the
> traversal
> path had its G bit set, {\em rs2} must be {\tt x0}; otherwise, {\em rs2}
> -should be set to the ASID for which the translation is being modified.
> +should be set to the ASID and root page table's PPN for which the
> translation
> +is being modified.
>
> \item For the special cases of increasing the permissions on a leaf PTE
> and
> changing an invalid PTE to a valid leaf, software may choose to execute
> --
> 2.7.4
>
> View/Reply Online (#810):
> https://lists.riscv.org/g/tech-privileged/message/810
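
For readers trying to follow the proposed RV64 rs2 layout in the quoted supervisor.tex hunk (MODE in bits 63:60 with only bit 63 defined, root page table PPN in 59:16, ASID in 15:0), the packing can be sketched as below. The macro and function names are hypothetical and exist only for this illustration; nothing here is part of the patch or of any existing header.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from the quoted RV64 rs2 figure. */
#define RS2_ASID_BITS   16
#define RS2_PPN_BITS    44
#define RS2_PPN_SHIFT   RS2_ASID_BITS            /* PPN occupies bits 59:16 */
#define RS2_BROADCAST   (UINT64_C(1) << 63)      /* highest MODE bit */

/* Pack mode, root page table PPN and ASID into an rs2 value. */
static uint64_t make_rs2(int broadcast, uint64_t root_ppn, uint64_t asid)
{
    assert(asid < (UINT64_C(1) << RS2_ASID_BITS));
    assert(root_ppn < (UINT64_C(1) << RS2_PPN_BITS));
    return (broadcast ? RS2_BROADCAST : 0)
         | (root_ppn << RS2_PPN_SHIFT)
         | asid;
}

/* Extract the root page table PPN (bits 59:16). */
static uint64_t rs2_ppn(uint64_t rs2)
{
    return (rs2 >> RS2_PPN_SHIFT) & ((UINT64_C(1) << RS2_PPN_BITS) - 1);
}

/* Extract the ASID (bits 15:0). */
static uint64_t rs2_asid(uint64_t rs2)
{
    return rs2 & ((UINT64_C(1) << RS2_ASID_BITS) - 1);
}

/* Nonzero when the broadcast bit is set. */
static int rs2_is_broadcast(uint64_t rs2)
{
    return (rs2 & RS2_BROADCAST) != 0;
}
```

One consequence worth noting: a local fence for ASID 0 with root PPN 0 packs to the value 0, which is still distinct from rs2 = x0 only because the instruction tests the register number, not the register value.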