* Enable cross-page block chaining for user mode tcg
@ 2023-03-15 14:40 Wu, Fei
  2023-03-16  1:55 ` Wu, Fei
  0 siblings, 1 reply; 2+ messages in thread
From: Wu, Fei @ 2023-03-15 14:40 UTC (permalink / raw)
  To: qemu-devel, richard.henderson

Block chaining is one of the key performance factors of tcg. Currently
tcg doesn't allow chaining across a page boundary; an example can be
found in gen_goto_tb() in target/riscv/translate.c.
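
To make the current restriction concrete, here is a minimal standalone
sketch of the same-page test, modelled on the check in
translator_use_goto_tb() shown in the patch further down. The 4 KiB page
size and the names SKETCH_PAGE_BITS, SKETCH_PAGE_MASK, same_page() and
use_goto_tb() are assumptions made for illustration; they are not QEMU
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB guest pages. QEMU derives the real value per target. */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_MASK (~(((uint64_t)1 << SKETCH_PAGE_BITS) - 1))

static_assert(SKETCH_PAGE_BITS > 0 && SKETCH_PAGE_BITS < 64,
              "page bits must fit in a 64-bit address");

/* True when dest lies on the same guest page as the first PC of the TB. */
static bool same_page(uint64_t pc_first, uint64_t dest)
{
    /* XOR keeps only the differing bits; the mask drops the in-page offset. */
    return ((pc_first ^ dest) & SKETCH_PAGE_MASK) == 0;
}

/* Hypothetical model of the proposed policy: user mode may always chain. */
static bool use_goto_tb(bool user_only, uint64_t pc_first, uint64_t dest)
{
    return user_only || same_page(pc_first, dest);
}
```

With these definitions, a jump from 0x1ffc to 0x2000 is rejected by
same_page() but accepted by use_goto_tb() when user_only is true.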

For user-mode tcg, it's possible to enable cross-page chaining with
some care. Assume there are chains like this:
    preceding page -> 1st page -> 2nd page
                      Nth page -> 2nd page

There are 2 situations to consider:
1. The 1st page should no longer jump directly to the 2nd page once a
new breakpoint is added to the 2nd page, otherwise the breakpoint might
not be hit. One way to address this is to call tb_flush() when gdb
commands are received, invalidating all the TBs, and to make sure that
each TB translated afterwards contains only a single instruction. Then,
whether or not the newly JIT-ed TBs are chained, the tcg core loop
always gets a chance to check for a breakpoint on each instruction.
There could be other methods, but current tcg already does this.

2. The protection of the 2nd page is changed by mprotect/munmap, e.g.
from executable (X) to non-executable (NX). It is an error if the 1st
page jumps to the 2nd page without checking the new protection. The
point here is to invalidate the TBs in the 2nd page and to unlink all
the TBs which jump to it, including those in the 1st page and others
(Nth in the above chart). This is already done in page_set_flags(). A
small testcase that runs on a user-mode guest:

        /* Definitions of FUNC, func_add and pagesize are omitted. */
        void *page = mmap(NULL, pagesize,
                          PROT_READ | PROT_WRITE | PROT_EXEC,
                          MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
        memcpy(page, func_add, pagesize);
        f = (FUNC)page;

        f(1, 1); // good
        mprotect(page, pagesize, PROT_READ | PROT_EXEC);
        f(1, 2); // good
        mprotect(page, pagesize, PROT_READ);  /* drop PROT_EXEC */
        f(1, 3); // segfault
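
The invalidate-and-unlink requirement in situation 2 can be pictured
with a toy model. This is only a sketch under simplifying assumptions
(one outgoing chain per TB, a flat array of TBs, 4 KiB pages); struct
toy_tb, toy_chain() and toy_invalidate_page() are hypothetical names and
do not reflect how page_set_flags() or the TB data structures are
actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_BITS 12   /* assumption: 4 KiB guest pages */

/* Hypothetical, much-simplified stand-in for a translation block. */
struct toy_tb {
    uint64_t pc;      /* guest address of the first instruction */
    int jmp_dest;     /* index of the TB this one is chained to, -1 if none */
    bool valid;
};

static uint64_t page_of(uint64_t addr)
{
    return addr >> SKETCH_PAGE_BITS;
}

/* Chain tbs[from] directly to tbs[to], possibly across pages. */
static void toy_chain(struct toy_tb *tbs, int from, int to)
{
    assert(tbs[from].valid && tbs[to].valid);
    tbs[from].jmp_dest = to;
}

/*
 * Model of a page losing PROT_EXEC: every TB on that page is
 * invalidated, and every TB chained to one of them is unlinked so
 * that its next jump goes back through the lookup path.
 */
static void toy_invalidate_page(struct toy_tb *tbs, size_t n, uint64_t page)
{
    for (size_t i = 0; i < n; i++) {
        if (tbs[i].valid && page_of(tbs[i].pc) == page) {
            tbs[i].valid = false;
            tbs[i].jmp_dest = -1;
        }
    }
    for (size_t i = 0; i < n; i++) {
        int d = tbs[i].jmp_dest;
        if (d >= 0 && !tbs[d].valid) {
            tbs[i].jmp_dest = -1;   /* unlink the stale chain */
        }
    }
}
```

In the chart above, chaining the 1st-page and Nth-page TBs to a 2nd-page
TB and then invalidating the 2nd page leaves both callers valid but
unchained.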

So it looks like the current tcg implementation is ready to enable
cross-page chaining for user-mode. Is that correct?

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 7bda43ff61..822644c7a4 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -25,8 +25,12 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
         return false;
     }

+#ifdef CONFIG_USER_ONLY
+    return true;
+#else
     /* Check for the dest on the same page as the start of the TB.  */
     return ((db->pc_first ^ dest) & TARGET_PAGE_MASK) == 0;
+#endif
 }

 void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,


Thanks,
Fei.



* Re: Enable cross-page block chaining for user mode tcg
  2023-03-15 14:40 Enable cross-page block chaining for user mode tcg Wu, Fei
@ 2023-03-16  1:55 ` Wu, Fei
  0 siblings, 0 replies; 2+ messages in thread
From: Wu, Fei @ 2023-03-16  1:55 UTC (permalink / raw)
  To: qemu-devel, richard.henderson

On 3/15/2023 10:40 PM, Wu, Fei wrote:
> Block chaining is one of the key performance factors of tcg. Currently
> tcg doesn't allow chaining across page boundary, an example can be found
> in gen_goto_tb() in target/riscv/translate.c.
> 
> For user-mode tcg, it's possible to enable cross-page chaining with
> careful attentions, assume there are chains like this:
>     preceding page -> 1st page -> 2nd page
>                       Nth page -> 2nd page
> 
> There are 2 situations to consider:
> 1. First page should not jump to 2nd page directly anymore, if there is
> a new breakpoint added to 3rd page, otherwise the breakpoint might not
> be hit. One method to address this problem is when receiving gdb
> commands, call tb_flush() to invalidate all the TBs, and make sure each
> TB can only contain single instruction later, no matter the new JIT-ed
> TBs use chain or not, the tcg core loop always has the chance to check
> if there is any breakpoint on each instruction. There could be other
> methods, but current tcg has already done this.
> 
"3rd page" is a typo; it should be "2nd page".

With the patch at the bottom:
* TBs in the page where the breakpoint is added always contain a single
instruction; the instruction count of TBs in other pages is not affected.
* The single-instruction TBs in the same page as the breakpoint do
generate lookup_tb_ptr because of the flag CF_NO_GOTO_TB.
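
The two observations above are consistent with a per-page rule along
the following lines. This is a sketch only: the flag values, the 4 KiB
page size and the helper toy_cflags_for_pc() are made up for
illustration, and only the name CF_NO_GOTO_TB is taken from this
message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_BITS  12       /* assumption: 4 KiB guest pages */
#define SKETCH_COUNT_MASK 0x1ffu   /* hypothetical: max instructions per TB */
#define SKETCH_NO_GOTO_TB 0x200u   /* hypothetical stand-in for CF_NO_GOTO_TB */

/*
 * Return the compile flags to use for a TB starting at pc. If any
 * breakpoint lies on the same page as pc, limit the TB to a single
 * instruction and forbid direct chaining; otherwise leave cflags alone.
 */
static uint32_t toy_cflags_for_pc(uint32_t cflags, uint64_t pc,
                                  const uint64_t *bps, size_t n_bps)
{
    assert(bps != NULL || n_bps == 0);

    for (size_t i = 0; i < n_bps; i++) {
        if ((bps[i] >> SKETCH_PAGE_BITS) == (pc >> SKETCH_PAGE_BITS)) {
            return (cflags & ~SKETCH_COUNT_MASK) | SKETCH_NO_GOTO_TB | 1;
        }
    }
    return cflags;
}
```

With a breakpoint at 0x2010, a TB starting at 0x2ff0 is restricted to one
instruction without direct chaining, while a TB starting at 0x3000 keeps
its original flags.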

I tried adding a breakpoint and a conditional breakpoint, and ignoring
a breakpoint, using the following testcase, and checked the output of
'-d in_asm,op'; everything works. If you have any comments or any tests
for me to try, please let me know.

--
#define A1    ++a;
#define A10   A1 A1 A1 A1 A1 A1 A1 A1 A1 A1
#define A100  A10 A10 A10 A10 A10 A10 A10 A10 A10 A10
#define A500  A100 A100 A100 A100 A100
#define A1000 A100 A100 A100 A100 A100 A100 A100 A100 A100 A100

long func0(long a) {
        A1000;
        return a;
}

long func1(long a) {
        int i;
        for (i = 0; i < 1000; ++i) {
                A1000;
        }
        return a;
}

int main() {
        long a = 0;
        long sum = 0;

        while (1) {
                sum += func1(a);
        }
        return 0;
}

Thanks,
Fei.

> 2. The protection of 2nd page has changed by mprotect/munmap, e.g. from
> executable (X) to non-executable (NX), it's an error if the 1st page
> jumps to 2nd page without checking the new protection. The point here is
> to invalidate TBs in 2nd page and unlink all the TBs which jumps to it,
> including 1st page and others(Nth in above chart). This is already done
> in page_set_flags(). A small testcase runs on user-mode guest:
> 
>         void *page = mmap(NULL, pagesize,
> 			  PROT_READ | PROT_WRITE | PROT_EXEC,
>                           MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>         memcpy(page, func_add, pagesize);
>         f = (FUNC)page;
> 
> 	f(1, 1); // good
> 	mprotect(f, pagesize, PROT_READ | PROT_EXEC);
> 	f(1, 2); // good
> 	mprotect(f, pagesize, PROT_READ);
> 	f(1, 3); // segfault
> 
> So it looks like current tcg implementation is ready to enable
> cross-page chaining for user-mode. Correct?
> 
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index 7bda43ff61..822644c7a4 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -25,8 +25,12 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
>          return false;
>      }
> 
> +#ifdef CONFIG_USER_ONLY
> +    return true;
> +#else
>      /* Check for the dest on the same page as the start of the TB.  */
>      return ((db->pc_first ^ dest) & TARGET_PAGE_MASK) == 0;
> +#endif
>  }
> 
>  void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
> 
> 
> Thanks,
> Fei.


