On 02/06/2021 19:19, Richard Henderson wrote: > On 6/2/21 12:58 PM, Bruno Piazera Larsen wrote: >>> For the use from ppc_cpu_get_phys_page_debug, you'd pass in >>> cpu_mmu_index(env, false). >> >> ppc_cpu_get_phys_page_debug has 2 calls to ppc_xlate, one using the >> data MMU, the other using the instruction MMU. I'm guessing I should >> pass both, right? > > Yes. > >> But here we have another bit that confuses me: cpu_mmu_index returns >> 0 if in user mode, or uses the information stored in env to get it, >> so I don't see how that would be different from getting directly. >> Unless the point is to have ppc_*_xlate be generic and pc_*_debug >> knows the info in env is correct. Is that it? > > The issue is that > > (1) ppc_*_xlate should perform the lookup requested, and mmu_idx >     does not *necessarily* correspond to the current contents of >     env->msr et al.  See (2). > > (2) There is a secondary call to ppc_radix64_partition_scoped_xlate >     for which the second stage page table should be read >     with hypervisor permissions, and not the permissions of the >     original memory access. > >     Note that ppc_radix64_check_prot checks msr_pr directly. > >     Thus the second stage lookup should use mmu_idx = 5 >     (HV kernel virtual mode).  If I understand things correctly... > >> >>> >>> >>>> +    const short HV = 1, IR = 2, DR = 3; >>>> +    bool MSR[3]; >>>> +    MSR[HV] = dmmu_idx & 2, >>>> +    MSR[IR] = immu_idx & 4, >>>> +    MSR[DR] = dmmu_idx & 4; >>> >>> There's no point in the array.  Just use three different scalars >>> (real_mode, hv, and pr (note that pr is the major portion of the bug >>> as reported)). Additionally, you'll not be distinguishing immu_idx >>> and dmmu_idx, but using the single idx that's given. >> >> Ah, yeah, that's the "more complex than necessary, but it was easy >> for me to read" part. Scalars are a good solution. In this function >> in specific, PR doesn't actually show up anywhere, so I would >> actually only need 2. Anyway, will start working on this. > > Oh, I'll note that your constants above are wrong.  I think that you > should have some common routines in (mmu-)internal.h: > > /* >  * These correspond to the mmu_idx values computed in >  * hreg_compute_hflags_value.  See the tables therein. >  */ > static inline bool mmuidx_pr(int idx) { return idx & 1; } > static inline bool mmuidx_real(int idx) { return idx & 2; } > static inline bool mmuidx_hv(int idx) { return idx & 4; } > > because you'll want to use these past mmu-radix64.c. > > Then you also have a single place to adjust if the mmu_idx are > reordered at a later date. > > > r~ I just tried sending mmu_idx all the way down, but I ran into a very weird bug of gcc. If we have to add one more parameter that GCC can't just optimize away we get at least a slow down of 5x for the first test of check-acceptance (could be more, but the test times out after 900 seconds, so I'm not sure). One way that I managed to get around that is saving the current MSR, setting it to 5, and restoring after the xlate call. The code ended up something like: int new_idx = (5<msr & clr; clr = ~clr; /* set new msr so we don't need to send the mmu_idx */ env->msr = (env->msr & clr) | new_idx; ret = ppc_radix64_partition_scoped_xlate(...); /* restore old mmu_idx */ env->msr = (env->msr & clr) | old_idx; That way we continue to use the env as the way to send the variable and avoid what I think is a problem with the compiler's optimization. Would this be acceptable (with proper documentation in the form of comments, ofc) or do we have to find a better solution that doesn't touch the values of env? I personally don't like it, but I couldn't find a better solution. If you're fine with it, we can use it, otherwise I'll keep looking. -- Bruno Piazera Larsen Instituto de Pesquisas ELDORADO Departamento Computação Embarcada Analista de Software Trainee Aviso Legal - Disclaimer If