All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
@ 2017-10-06 17:36 Peter Maydell
  2017-10-10  9:53 ` Alex Bennée
  2017-11-03 16:11 ` Peter Maydell
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Maydell @ 2017-10-06 17:36 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Alex Bennée, Richard Henderson

Running the test program
http://people.linaro.org/~peter.maydell/thumb-over-page
(source at http://people.linaro.org/~peter.maydell/thumb-over-page.c)
in the usermode emulator:
 ./build/x86/arm-linux-user/qemu-arm ~/linaro/qemu-misc-tests/thumb-over-page

results in an assertion failure:
write_insns1: T32 insn crossing page boundary
Calling into buffer at 0x6fff9
qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
tb_lock: Assertion `!have_tb_lock' failed.
qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
tb_lock: Assertion `!have_tb_lock' failed.
Segmentation fault (core dumped)

It ought to exit successfully:
write_insns1: T32 insn crossing page boundary
Calling into buffer at 0x6fff9
got sig 11
fault pc 0x6fffe r0 0x1
e104462:xenial:qemu$

(so this is a regression).

Here's a backtrace:

qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
tb_lock: Assertion `!have_tb_lock' failed.

Thread 1 "qemu-arm" received signal SIGABRT, Aborted.
0x00007ffff6851428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6851428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff685302a in __GI_abort () at abort.c:89
#2  0x00007ffff6849bd7 in __assert_fail_base (fmt=<optimised out>,
    assertion=assertion@entry=0x55555570a0ae "!have_tb_lock",
    file=file@entry=0x55555570a020
"/home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c",
line=line@entry=169,
    function=function@entry=0x55555570a208 <__PRETTY_FUNCTION__.27063>
"tb_lock") at assert.c:92
#3  0x00007ffff6849c82 in __GI___assert_fail (assertion=0x55555570a0ae
"!have_tb_lock",
    file=0x55555570a020
"/home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c",
    line=169, function=0x55555570a208 <__PRETTY_FUNCTION__.27063>
"tb_lock") at assert.c:101
#4  0x00005555555cd50c in tb_lock ()
    at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169
#5  0x00005555555cda34 in cpu_restore_state (cpu=0x555557a1d930,
retaddr=93824992991167)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:353
#6  0x00005555555d0765 in handle_cpu_signal (pc=93824992991165,
address=458752, is_write=0,
    old_set=0x7fffffffd2a8) at
/home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/user-exec.c:125
#7  0x00005555555d0808 in cpu_arm_signal_handler (host_signum=11,
pinfo=0x7fffffffd2b0,
    puc=0x7fffffffd180) at
/home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/user-exec.c:230
#8  0x00005555555fce44 in host_signal_handler (host_signum=11,
info=0x7fffffffd2b0,
    puc=0x7fffffffd180) at
/home/petmay01/linaro/qemu-from-laptop/qemu/linux-user/signal.c:646
#9  <signal handler called>
#10 0x000055555560d7bd in lduw_he_p (ptr=0x7ffefee1b000)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/include/qemu/bswap.h:317
#11 0x000055555560d836 in lduw_le_p (ptr=0x7ffefee1b000)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/include/qemu/bswap.h:359
#12 0x000055555561f868 in cpu_lduw_code (env=0x555557a25bc0, ptr=458752)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/include/exec/cpu_ldst_useronly_template.h:68
#13 0x000055555561f8fd in arm_lduw_code (env=0x555557a25bc0,
addr=458752, sctlr_b=false)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/arm_ldst.h:50
#14 0x000055555563c059 in disas_thumb2_insn (env=0x555557a25bc0,
s=0x7fffffffd9e0, insn_hw1=61952)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:9739
#15 0x00005555556416c7 in disas_thumb_insn (env=0x555557a25bc0,
s=0x7fffffffd9e0)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:11821
#16 0x0000555555641f3f in thumb_tr_translate_insn
(dcbase=0x7fffffffd9e0, cpu=0x555557a1d930)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:12104
#17 0x00005555555d0218 in translator_loop (ops=0x555555982480
<thumb_translator_ops>,
    db=0x7fffffffd9e0, cpu=0x555557a1d930, tb=0x555555a21cc0
<static_code_gen_buffer+206880>)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translator.c:104
#18 0x0000555555642446 in gen_intermediate_code (cpu=0x555557a1d930,
    tb=0x555555a21cc0 <static_code_gen_buffer+206880>)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:12300
#19 0x00005555555ceac0 in tb_gen_code (cpu=0x555557a1d930, pc=458750,
cs_base=0, flags=524417,
    cflags=0) at
/home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:1283
#20 0x00005555555cba65 in tb_find (cpu=0x555557a1d930,
    last_tb=0x555555a21bc0 <static_code_gen_buffer+206624>, tb_exit=1)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/cpu-exec.c:402
#21 0x00005555555cc18a in cpu_exec (cpu=0x555557a1d930)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/cpu-exec.c:710
#22 0x00005555555d36ea in cpu_loop (env=0x555557a25bc0)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/linux-user/main.c:570
#23 0x00005555555d59f9 in main (argc=2, argv=0x7fffffffe458,
envp=0x7fffffffe470)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/linux-user/main.c:4858

This is probably partly because of the silly way we handle guest
faults trying to read code in the translator.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-06 17:36 [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode Peter Maydell
@ 2017-10-10  9:53 ` Alex Bennée
  2017-10-10 10:07   ` Peter Maydell
  2017-11-03 16:11 ` Peter Maydell
  1 sibling, 1 reply; 9+ messages in thread
From: Alex Bennée @ 2017-10-10  9:53 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers, Richard Henderson


Peter Maydell <peter.maydell@linaro.org> writes:

> Running the test program
> http://people.linaro.org/~peter.maydell/thumb-over-page
> (source at http://people.linaro.org/~peter.maydell/thumb-over-page.c)
> in the usermode emulator:
>  ./build/x86/arm-linux-user/qemu-arm
> ~/linaro/qemu-misc-tests/thumb-over-page

Does this fail when run via system mode as well?

>
> results in an assertion failure:
> write_insns1: T32 insn crossing page boundary
> Calling into buffer at 0x6fff9
> qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
> tb_lock: Assertion `!have_tb_lock' failed.
> qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
> tb_lock: Assertion `!have_tb_lock' failed.
> Segmentation fault (core dumped)
>
> It ought to exit successfully:
> write_insns1: T32 insn crossing page boundary
> Calling into buffer at 0x6fff9
> got sig 11
> fault pc 0x6fffe r0 0x1
> e104462:xenial:qemu$
>
> (so this is a regression).

OK I'll have a look at how we broke this.

>
> Here's a backtrace:
>
> qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
> tb_lock: Assertion `!have_tb_lock' failed.
>
> Thread 1 "qemu-arm" received signal SIGABRT, Aborted.
> 0x00007ffff6851428 in __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:54
> 54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  0x00007ffff6851428 in __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:54
> #1  0x00007ffff685302a in __GI_abort () at abort.c:89
> #2  0x00007ffff6849bd7 in __assert_fail_base (fmt=<optimised out>,
>     assertion=assertion@entry=0x55555570a0ae "!have_tb_lock",
>     file=file@entry=0x55555570a020
> "/home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c",
> line=line@entry=169,
>     function=function@entry=0x55555570a208 <__PRETTY_FUNCTION__.27063>
> "tb_lock") at assert.c:92
> #3  0x00007ffff6849c82 in __GI___assert_fail (assertion=0x55555570a0ae
> "!have_tb_lock",
>     file=0x55555570a020
> "/home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c",
>     line=169, function=0x55555570a208 <__PRETTY_FUNCTION__.27063>
> "tb_lock") at assert.c:101
> #4  0x00005555555cd50c in tb_lock ()
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169
> #5  0x00005555555cda34 in cpu_restore_state (cpu=0x555557a1d930,
> retaddr=93824992991167)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:353
> #6  0x00005555555d0765 in handle_cpu_signal (pc=93824992991165,
> address=458752, is_write=0,
>     old_set=0x7fffffffd2a8) at
> /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/user-exec.c:125
> #7  0x00005555555d0808 in cpu_arm_signal_handler (host_signum=11,
> pinfo=0x7fffffffd2b0,
>     puc=0x7fffffffd180) at
> /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/user-exec.c:230
> #8  0x00005555555fce44 in host_signal_handler (host_signum=11,
> info=0x7fffffffd2b0,
>     puc=0x7fffffffd180) at
> /home/petmay01/linaro/qemu-from-laptop/qemu/linux-user/signal.c:646
> #9  <signal handler called>
> #10 0x000055555560d7bd in lduw_he_p (ptr=0x7ffefee1b000)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/include/qemu/bswap.h:317
> #11 0x000055555560d836 in lduw_le_p (ptr=0x7ffefee1b000)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/include/qemu/bswap.h:359
> #12 0x000055555561f868 in cpu_lduw_code (env=0x555557a25bc0, ptr=458752)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/include/exec/cpu_ldst_useronly_template.h:68
> #13 0x000055555561f8fd in arm_lduw_code (env=0x555557a25bc0,
> addr=458752, sctlr_b=false)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/arm_ldst.h:50
> #14 0x000055555563c059 in disas_thumb2_insn (env=0x555557a25bc0,
> s=0x7fffffffd9e0, insn_hw1=61952)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:9739
> #15 0x00005555556416c7 in disas_thumb_insn (env=0x555557a25bc0,
> s=0x7fffffffd9e0)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:11821
> #16 0x0000555555641f3f in thumb_tr_translate_insn
> (dcbase=0x7fffffffd9e0, cpu=0x555557a1d930)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:12104
> #17 0x00005555555d0218 in translator_loop (ops=0x555555982480
> <thumb_translator_ops>,
>     db=0x7fffffffd9e0, cpu=0x555557a1d930, tb=0x555555a21cc0
> <static_code_gen_buffer+206880>)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translator.c:104
> #18 0x0000555555642446 in gen_intermediate_code (cpu=0x555557a1d930,
>     tb=0x555555a21cc0 <static_code_gen_buffer+206880>)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/target/arm/translate.c:12300
> #19 0x00005555555ceac0 in tb_gen_code (cpu=0x555557a1d930, pc=458750,
> cs_base=0, flags=524417,
>     cflags=0) at
> /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:1283
> #20 0x00005555555cba65 in tb_find (cpu=0x555557a1d930,
>     last_tb=0x555555a21bc0 <static_code_gen_buffer+206624>, tb_exit=1)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/cpu-exec.c:402
> #21 0x00005555555cc18a in cpu_exec (cpu=0x555557a1d930)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/cpu-exec.c:710
> #22 0x00005555555d36ea in cpu_loop (env=0x555557a25bc0)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/linux-user/main.c:570
> #23 0x00005555555d59f9 in main (argc=2, argv=0x7fffffffe458,
> envp=0x7fffffffe470)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/linux-user/main.c:4858
>
> This is probably partly because of the silly way we handle guest
> faults trying to read code in the translator.
>
> thanks
> -- PMM


--
Alex Bennée

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-10  9:53 ` Alex Bennée
@ 2017-10-10 10:07   ` Peter Maydell
  2017-10-10 10:41     ` Paolo Bonzini
  2017-10-10 10:54     ` Alex Bennée
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Maydell @ 2017-10-10 10:07 UTC (permalink / raw)
  To: Alex Bennée; +Cc: QEMU Developers, Richard Henderson

On 10 October 2017 at 10:53, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
>> Running the test program
>> http://people.linaro.org/~peter.maydell/thumb-over-page
>> (source at http://people.linaro.org/~peter.maydell/thumb-over-page.c)
>> in the usermode emulator:
>>  ./build/x86/arm-linux-user/qemu-arm
>> ~/linaro/qemu-misc-tests/thumb-over-page
>
> Does this fail when run via system mode as well?

Nope, only usermode. (Makes sense, since the handle_cpu_signal()
codepath is only used in usermode emulation.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-10 10:07   ` Peter Maydell
@ 2017-10-10 10:41     ` Paolo Bonzini
  2017-10-10 10:52       ` Peter Maydell
  2017-10-10 10:54     ` Alex Bennée
  1 sibling, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2017-10-10 10:41 UTC (permalink / raw)
  To: Peter Maydell, Alex Bennée; +Cc: QEMU Developers, Richard Henderson

On 10/10/2017 12:07, Peter Maydell wrote:
> On 10 October 2017 at 10:53, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>>> Running the test program
>>> http://people.linaro.org/~peter.maydell/thumb-over-page
>>> (source at http://people.linaro.org/~peter.maydell/thumb-over-page.c)
>>> in the usermode emulator:
>>>  ./build/x86/arm-linux-user/qemu-arm
>>> ~/linaro/qemu-misc-tests/thumb-over-page
>>
>> Does this fail when run via system mode as well?
> 
> Nope, only usermode. (Makes sense, since the handle_cpu_signal()
> codepath is only used in usermode emulation.)

I've seen the same on x86.  Using the program counter from translate.c 
here looks very fishy:

    /* Now we have a real cpu fault.  Since this is the exact location of
     * the exception, we must undo the adjustment done by cpu_restore_state
     * for handling call return addresses.  */
    cpu_restore_state(cpu, pc + GETPC_ADJ);

and cpu_restore_state would just return false because tb_find_pc fails.  Maybe
something like this?

diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 492ea0826c..66a4351b96 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -119,10 +119,13 @@ static inline int handle_cpu_signal(uintptr_t pc, unsigned long address,
         return 1; /* the MMU fault was handled without causing real CPU fault */
     }
 
-    /* Now we have a real cpu fault.  Since this is the exact location of
-     * the exception, we must undo the adjustment done by cpu_restore_state
-     * for handling call return addresses.  */
-    cpu_restore_state(cpu, pc + GETPC_ADJ);
+    if (pc >= (uintptr_t)tcg_ctx.code_gen_buffer &&
+        pc < (uintptr_t)tcg_ctx.code_gen_ptr) {
+        /* Now we have a real cpu fault.  Since this is the exact location of
+         * the exception, we must undo the adjustment done by cpu_restore_state
+         * for handling call return addresses.  */
+        cpu_restore_state(cpu, pc + GETPC_ADJ);
+    }
 
     sigprocmask(SIG_SETMASK, old_set, NULL);
     cpu_loop_exit(cpu);

(very rough idea)

Paolo

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-10 10:41     ` Paolo Bonzini
@ 2017-10-10 10:52       ` Peter Maydell
  2017-10-10 11:01         ` Paolo Bonzini
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Maydell @ 2017-10-10 10:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alex Bennée, QEMU Developers, Richard Henderson

On 10 October 2017 at 11:41, Paolo Bonzini <pbonzini@redhat.com> wrote:
> I've seen the same on x86.  Using the program counter from translate.c
> here looks very fishy:
>
>     /* Now we have a real cpu fault.  Since this is the exact location of
>      * the exception, we must undo the adjustment done by cpu_restore_state
>      * for handling call return addresses.  */
>     cpu_restore_state(cpu, pc + GETPC_ADJ);

This is the right thing if the signal happened directly from
translated code (as it will for guest reads/writes). This
bit of code expects that cpu_restore_state() will just do nothing
if the pc value isn't actually in a TB.

> and cpu_restore_state would just return false because tb_find_pc fails.  Maybe
> something like this?
>
> diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
> index 492ea0826c..66a4351b96 100644
> --- a/accel/tcg/user-exec.c
> +++ b/accel/tcg/user-exec.c
> @@ -119,10 +119,13 @@ static inline int handle_cpu_signal(uintptr_t pc, unsigned long address,
>          return 1; /* the MMU fault was handled without causing real CPU fault */
>      }
>
> -    /* Now we have a real cpu fault.  Since this is the exact location of
> -     * the exception, we must undo the adjustment done by cpu_restore_state
> -     * for handling call return addresses.  */
> -    cpu_restore_state(cpu, pc + GETPC_ADJ);
> +    if (pc >= (uintptr_t)tcg_ctx.code_gen_buffer &&
> +        pc < (uintptr_t)tcg_ctx.code_gen_ptr) {
> +        /* Now we have a real cpu fault.  Since this is the exact location of
> +         * the exception, we must undo the adjustment done by cpu_restore_state
> +         * for handling call return addresses.  */
> +        cpu_restore_state(cpu, pc + GETPC_ADJ);
> +    }

I think that would be better inside cpu_restore_state(), because it's
an internal detail of the TCG accelerator that it happens to put
all its code inside those bounds.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-10 10:07   ` Peter Maydell
  2017-10-10 10:41     ` Paolo Bonzini
@ 2017-10-10 10:54     ` Alex Bennée
  2017-10-10 11:07       ` Peter Maydell
  1 sibling, 1 reply; 9+ messages in thread
From: Alex Bennée @ 2017-10-10 10:54 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers, Richard Henderson


Peter Maydell <peter.maydell@linaro.org> writes:

> On 10 October 2017 at 10:53, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>>> Running the test program
>>> http://people.linaro.org/~peter.maydell/thumb-over-page
>>> (source at http://people.linaro.org/~peter.maydell/thumb-over-page.c)
>>> in the usermode emulator:
>>>  ./build/x86/arm-linux-user/qemu-arm
>>> ~/linaro/qemu-misc-tests/thumb-over-page
>>
>> Does this fail when run via system mode as well?
>
> Nope, only usermode. (Makes sense, since the handle_cpu_signal()
> codepath is only used in usermode emulation.)

But surely system emulation has to deal with the same rolling over a
page issue. How is it's resolution different?

>
> thanks
> -- PMM


--
Alex Bennée

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-10 10:52       ` Peter Maydell
@ 2017-10-10 11:01         ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2017-10-10 11:01 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Alex Bennée, QEMU Developers, Richard Henderson

On 10/10/2017 12:52, Peter Maydell wrote:
> On 10 October 2017 at 11:41, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> I've seen the same on x86.  Using the program counter from translate.c
>> here looks very fishy:
>>
>>     /* Now we have a real cpu fault.  Since this is the exact location of
>>      * the exception, we must undo the adjustment done by cpu_restore_state
>>      * for handling call return addresses.  */
>>     cpu_restore_state(cpu, pc + GETPC_ADJ);
> 
> This is the right thing if the signal happened directly from
> translated code (as it will for guest reads/writes). This
> bit of code expects that cpu_restore_state() will just do nothing
> if the pc value isn't actually in a TB.

Yes, it is right if it happens from translated code.  But
cpu_restore_code currently expects either retaddr == 0 or retaddr from a
TB.  And generalizing those expectations...

>> and cpu_restore_state would just return false because tb_find_pc fails.  Maybe
>> something like this?
>>
>> diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
>> index 492ea0826c..66a4351b96 100644
>> --- a/accel/tcg/user-exec.c
>> +++ b/accel/tcg/user-exec.c
>> @@ -119,10 +119,13 @@ static inline int handle_cpu_signal(uintptr_t pc, unsigned long address,
>>          return 1; /* the MMU fault was handled without causing real CPU fault */
>>      }
>>
>> -    /* Now we have a real cpu fault.  Since this is the exact location of
>> -     * the exception, we must undo the adjustment done by cpu_restore_state
>> -     * for handling call return addresses.  */
>> -    cpu_restore_state(cpu, pc + GETPC_ADJ);
>> +    if (pc >= (uintptr_t)tcg_ctx.code_gen_buffer &&
>> +        pc < (uintptr_t)tcg_ctx.code_gen_ptr) {
>> +        /* Now we have a real cpu fault.  Since this is the exact location of
>> +         * the exception, we must undo the adjustment done by cpu_restore_state
>> +         * for handling call return addresses.  */
>> +        cpu_restore_state(cpu, pc + GETPC_ADJ);
>> +    }
> 
> I think that would be better inside cpu_restore_state(), because it's
> an internal detail of the TCG accelerator that it happens to put
> all its code inside those bounds.

... makes sense too, and it can replace the existing check for !retaddr
(introduced in commit d8b2239bcd, "translate-all: exit cpu_restore_state
early if translating", 2017-03-09) which only works for softmmu.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-10 10:54     ` Alex Bennée
@ 2017-10-10 11:07       ` Peter Maydell
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Maydell @ 2017-10-10 11:07 UTC (permalink / raw)
  To: Alex Bennée; +Cc: QEMU Developers, Richard Henderson

On 10 October 2017 at 11:54, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
>> On 10 October 2017 at 10:53, Alex Bennée <alex.bennee@linaro.org> wrote:
>>>
>>> Peter Maydell <peter.maydell@linaro.org> writes:
>>>
>>>> Running the test program
>>>> http://people.linaro.org/~peter.maydell/thumb-over-page
>>>> (source at http://people.linaro.org/~peter.maydell/thumb-over-page.c)
>>>> in the usermode emulator:
>>>>  ./build/x86/arm-linux-user/qemu-arm
>>>> ~/linaro/qemu-misc-tests/thumb-over-page
>>>
>>> Does this fail when run via system mode as well?
>>
>> Nope, only usermode. (Makes sense, since the handle_cpu_signal()
>> codepath is only used in usermode emulation.)
>
> But surely system emulation has to deal with the same rolling over a
> page issue. How is it's resolution different?

In system emulation, the attempted load of code is via cpu_lduw_code(),
which in turn calls cpu_lduw_code_ra() passing it a retaddr parameter
of 0. This is then propagated through the softmmu functions until
it eventually reaches tlb_fill(), which then passes it to
cpu_restore_state() as the retaddr.

In usermode, we don't have the softmmu at all, so cpu_lduw_code()
turns into a straightforward host memory access, which is then
caught by the signal handler. The signal handler can't tell
whether it was invoked as a result of translated code or not,
so it just passes the PC of the host memory access to
cpu_restore_state(). cpu_restore_state() ought to check whether
the PC it is passed is within the TCG codegen buffer.

Incidentally cpu_restore_state currently says:
  /* A retaddr of zero is invalid so we really shouldn't have ended
   * up here.

but this isn't true -- it's an expected result for the case where
a zero retaddr was used, and we should deal with that in
cpu_restore_state() rather than making every caller do that test.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode
  2017-10-06 17:36 [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode Peter Maydell
  2017-10-10  9:53 ` Alex Bennée
@ 2017-11-03 16:11 ` Peter Maydell
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Maydell @ 2017-11-03 16:11 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Alex Bennée, Richard Henderson

On 6 October 2017 at 18:36, Peter Maydell <peter.maydell@linaro.org> wrote:
> Running the test program
> http://people.linaro.org/~peter.maydell/thumb-over-page
> (source at http://people.linaro.org/~peter.maydell/thumb-over-page.c)
> in the usermode emulator:
>  ./build/x86/arm-linux-user/qemu-arm ~/linaro/qemu-misc-tests/thumb-over-page
>
> results in an assertion failure:
> write_insns1: T32 insn crossing page boundary
> Calling into buffer at 0x6fff9
> qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
> tb_lock: Assertion `!have_tb_lock' failed.
> qemu-arm: /home/petmay01/linaro/qemu-from-laptop/qemu/accel/tcg/translate-all.c:169:
> tb_lock: Assertion `!have_tb_lock' failed.
> Segmentation fault (core dumped)
>
> It ought to exit successfully:
> write_insns1: T32 insn crossing page boundary
> Calling into buffer at 0x6fff9
> got sig 11
> fault pc 0x6fffe r0 0x1
> e104462:xenial:qemu$

Ping! This still asserts -- what's our plan for fixing it?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-11-03 16:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-06 17:36 [Qemu-devel] tcg/translate-all.c:169: tb_lock: Assertion `!have_tb_lock' failed when doing cpu_restore_state in usermode Peter Maydell
2017-10-10  9:53 ` Alex Bennée
2017-10-10 10:07   ` Peter Maydell
2017-10-10 10:41     ` Paolo Bonzini
2017-10-10 10:52       ` Peter Maydell
2017-10-10 11:01         ` Paolo Bonzini
2017-10-10 10:54     ` Alex Bennée
2017-10-10 11:07       ` Peter Maydell
2017-11-03 16:11 ` Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.