On Mon, 19 Mar 2018, Alex Bennée wrote:

>
> Peter Maydell writes:
>
>> On 19 March 2018 at 17:46, Victor Kamensky wrote:
>>> In v2.11.1 of qemu, that we use, we already have
>>> b29fd33db578decacd14f34933b29aece3e7c25e. Previous testing
>>> and collected log was done with it present.
>>>
>>> But my understanding is that eret would happen when target exits
>>> an interrupt, here I don't think it enters one.
>>>
>>> Consider that target explicitly disables interrupts and while it is
>>> disabled, arm_cpu_exec_interrupt function calls arm_excp_unmasked
>>> and it returns false, so arm_cpu_do_interrupt is not called. Main
>>> loop resumes execution, and one of the blocks explicitly
>>> reenables interrupts and sequence continues without ever returning to
>>> main loop.
>>>
>>> For example, if I apply below patch, it boots fine. But I am not sure
>>> in what other places similar thing is needed, and whether below
>>> is complete and correct:
>>>
>>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>>> index 91a9300..19128c5 100644
>>> --- a/target/arm/helper.c
>>> +++ b/target/arm/helper.c
>>> @@ -2948,6 +2948,14 @@ static CPAccessResult aa64_daif_access(CPUARMState
>>> *env, const ARMCPRegInfo *ri,
>>>  static void aa64_daif_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>>                              uint64_t value)
>>>  {
>>> +    if (env->daif & ~(value & PSTATE_DAIF)) {
>>> +        /* reenabling interrupts */
>>> +        CPUState *cs = CPU(arm_env_get_cpu(env));
>>> +        if (cs->interrupt_request) {
>>> +            /* there is pending one, let's drop back into main loop */
>>> +            cs->icount_decr.u16.high = -1;
>>> +        }
>>> +    }
>>>      env->daif = value & PSTATE_DAIF;
>>>  }
>>
>> target/arm/translate-a64.c:handle_sys() is setting
>>     s->base.is_jmp = DISAS_UPDATE;
>> which it thinks will end the TB, specifically because system
>> register writes might do things like unmask interrupts or
>> otherwise require main loop processing.
>
> For the DAIFclear and eret paths we set DISAS_EXIT. What is the
> handle_sys path that should be doing this? Is this a direct setting of
> DAIF?

Yes, the one that is translated into an aa64_daif_write helper
invocation, i.e. something like: 'msr daif, x25'

The reason I went after the aa64_daif_write function in my experiment
is that I saw it was the last one to hit the daif watchpoint, clearing
it before the system hung. Here is the backtrace from just before the
system got stuck. After the backtrace, the first entry is
interrupt_request, followed by daif, and then by cp15.hcr_el2 and
cp15.scr_el3.

Old value = 128
New value = 0
aa64_daif_write (env=0x18c8430, ri=0x18f07d0, value=0)
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/target/arm/helper.c:2952
2952    }
#0  aa64_daif_write (env=0x18c8430, ri=0x18f07d0, value=0)
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/target/arm/helper.c:2952
#1  0x00000000005c8f43 in helper_set_cp_reg64 (env=0x18c8430, rip=0x18f07d0, value=0)
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/target/arm/op_helper.c:842
#2  0x00007fffec05cec7 in code_gen_buffer ()
#3  0x000000000048aee9 in cpu_tb_exec (cpu=0x18c0190, itb=0x7fffec0393c0 )
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/accel/tcg/cpu-exec.c:167
#4  0x000000000048bd82 in cpu_loop_exec_tb (cpu=0x18c0190, tb=0x7fffec0393c0 , last_tb=0x7fffec00faf8, tb_exit=0x7fffec00faf0)
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/accel/tcg/cpu-exec.c:627
#5  0x000000000048c091 in cpu_exec (cpu=0x18c0190)
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/accel/tcg/cpu-exec.c:736
#6  0x000000000044a883 in tcg_cpu_exec (cpu=0x18c0190)
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/cpus.c:1270
#7  0x000000000044ad82 in qemu_tcg_cpu_thread_fn (arg=0x18c0190)
    at /wd6/oe/20180311/build/tmp-glibc/work/x86_64-linux/qemu-native/2.11.1-r0/qemu-2.11.1/cpus.c:1475
#8  0x00007ffff79616ba in start_thread (arg=0x7fffec010700) at pthread_create.c:333
#9  0x00007ffff59bc41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
$7778 = 0x2
$7779 = 0x0
$7780 = 0x0
$7781 = 0x0

Note that, IMO, dealing with aa64_daif_write alone may not be
sufficient, because besides daif the interrupt-unmasked check
(arm_excp_unmasked) also looks at cp15.hcr_el2 and cp15.scr_el3.
Those could also be the reason an interrupt is masked, and they could
change too, requiring an exit to the main loop if an interrupt is
pending.

Thanks,
Victor

>>
>> The changes that prompted b29fd33db578dec stopped this working.
>> I suspect what we want is for the case DISAS_UPDATE in
>> aarch64_tr_tb_stop() to fall through into DISAS_EXIT, not
>> DISAS_JUMP. (The AArch32 code gets this right, amazingly.)
>>
>> thanks
>> -- PMM
>
>
> --
> Alex Bennée
>
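
To make the failure mode discussed in this thread concrete (interrupts get unmasked inside a chained block while a request is already pending, and nothing forces a return to the main loop), here is a toy model in C. It is only a sketch and not QEMU code: ToyCPU, main_loop_check, daif_write, blocks_run_before_irq, CHAIN_LEN and the exit_request flag are all hypothetical names invented for illustration. The only thing borrowed from the log above is the mask value 0x80 (the "Old value = 128").

```c
#include <stdbool.h>

/* Toy model only: none of these names are QEMU APIs. */
enum { MASK_I = 0x80, CHAIN_LEN = 5 };

typedef struct {
    unsigned daif;      /* interrupt mask bits (0x80 = masked) */
    bool irq_pending;   /* stands in for a pending interrupt request */
    bool exit_request;  /* stands in for "drop back into the main loop" */
    int irqs_taken;     /* number of interrupts actually delivered */
} ToyCPU;

/* Main-loop check: deliver a pending interrupt only if it is unmasked. */
static void main_loop_check(ToyCPU *cpu)
{
    if (cpu->irq_pending && !(cpu->daif & MASK_I)) {
        cpu->irq_pending = false;
        cpu->irqs_taken++;
    }
}

/*
 * Mask-register write. When exit_on_unmask is set, unmasking while an
 * interrupt is pending also requests a return to the main loop.
 */
static void daif_write(ToyCPU *cpu, unsigned value, bool exit_on_unmask)
{
    bool unmasking = (cpu->daif & ~value & MASK_I) != 0;

    if (exit_on_unmask && unmasking && cpu->irq_pending) {
        cpu->exit_request = true;
    }
    cpu->daif = value & MASK_I;
}

/*
 * Run CHAIN_LEN chained blocks; block 0 unmasks interrupts. Chained
 * blocks only return to the main loop when exit_request is set.
 * Returns how many blocks ran before the interrupt was taken, or -1 if
 * the whole chain finished without delivering it.
 */
static int blocks_run_before_irq(bool exit_on_unmask)
{
    ToyCPU cpu = { .daif = MASK_I, .irq_pending = true };
    int executed = 0;

    main_loop_check(&cpu);  /* still masked, so nothing is delivered */

    for (int block = 0; block < CHAIN_LEN; block++) {
        if (block == 0) {
            daif_write(&cpu, 0, exit_on_unmask);
        }
        executed++;
        if (cpu.exit_request) {
            cpu.exit_request = false;
            main_loop_check(&cpu);
            if (cpu.irqs_taken > 0) {
                return executed;
            }
        }
    }
    return -1;
}
```

In this model blocks_run_before_irq(true) returns 1 (the interrupt is taken right after the unmasking block), while blocks_run_before_irq(false) returns -1 (the chain runs to completion with the interrupt still pending). The same reasoning would apply to any other register that feeds the masking decision, which is why handling only the daif write may be incomplete.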