* [Qemu-devel] [PULL 0/2] tcg-next patches
From: Richard Henderson @ 2018-05-09 15:48 UTC
  To: qemu-devel; +Cc: Richard Henderson, peter.maydell

The following changes since commit e5cd695266c5709308aa95b1baae499e4b5d4544:

  Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging (2018-05-08 17:05:58 +0100)

are available in the Git repository at:

  https://github.com/rth7680/qemu.git tags/tcg-next-pull-request

for you to fetch changes up to abebf92597186be2bc48d487235da28b1127860f:

  tcg: Limit the number of ops in a TB (2018-05-09 08:30:57 -0700)

----------------------------------------------------------------
Queued TCG patches

----------------------------------------------------------------

Peter Maydell (1):
  tcg/i386: Fix dup_vec in non-AVX2 codepath

Richard Henderson (1):
  tcg: Limit the number of ops in a TB

 tcg/tcg.h                 | 8 +++++++-
 tcg/i386/tcg-target.inc.c | 6 +++---
 tcg/tcg.c                 | 3 +++
 3 files changed, 13 insertions(+), 4 deletions(-)

-- 
2.17.0

* [Qemu-devel] [PULL 1/2] tcg/i386: Fix dup_vec in non-AVX2 codepath
From: Richard Henderson @ 2018-05-09 15:48 UTC
  To: qemu-devel; +Cc: Richard Henderson, peter.maydell, qemu-stable

From: Peter Maydell <peter.maydell@linaro.org>

The VPUNPCKL* instructions are all "non-destructive source",
indicated by "NDS" in the encoding string in the x86 ISA manual.
This means that they take two source operands, one of which is
encoded in the VEX.vvvv field. We were incorrectly treating them
as if they were destructive-source and passing 0 as the 'v'
argument of tcg_out_vex_modrm(). This meant we were always
using %xmm0 as one of the source operands, causing incorrect
results if the register allocator happened to want to use
something else. For instance the input AArch64 insn:
 DUP v26.16b, w21
which becomes TCG IR ops:
 dup_vec v128,e8,tmp2,x21
 st_vec v128,e8,tmp2,env,$0xa40
was assembled to:
0x607c568c:  c4 c1 7a 7e 86 e8 00 00  vmovq    0xe8(%r14), %xmm0
0x607c5694:  00
0x607c5695:  c5 f9 60 c8              vpunpcklbw %xmm0, %xmm0, %xmm1
0x607c5699:  c5 f9 61 c9              vpunpcklwd %xmm1, %xmm0, %xmm1
0x607c569d:  c5 f9 70 c9 00           vpshufd  $0, %xmm1, %xmm1
0x607c56a2:  c4 c1 7a 7f 8e 40 0a 00  vmovdqu  %xmm1, 0xa40(%r14)
0x607c56aa:  00

when the vpunpcklwd insn should be "%xmm1, %xmm1, %xmm1".
This resulted in our incorrectly setting the output vector to
q26=0000320000003200:0000320000003200
when given an input of x21 == 0000000002803200
rather than the expected all-zeroes.

Pass the correct source register number to tcg_out_vex_modrm()
for these insns.

Fixes: 770c2fc7bb70804a
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180504153431.5169-1-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.inc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index d7e59e79c5..5357909fff 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -854,11 +854,11 @@ static void tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
         switch (vece) {
         case MO_8:
             /* ??? With zero in a register, use PSHUFB.  */
-            tcg_out_vex_modrm(s, OPC_PUNPCKLBW, r, 0, a);
+            tcg_out_vex_modrm(s, OPC_PUNPCKLBW, r, a, a);
             a = r;
             /* FALLTHRU */
         case MO_16:
-            tcg_out_vex_modrm(s, OPC_PUNPCKLWD, r, 0, a);
+            tcg_out_vex_modrm(s, OPC_PUNPCKLWD, r, a, a);
             a = r;
             /* FALLTHRU */
         case MO_32:
@@ -867,7 +867,7 @@ static void tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
             tcg_out8(s, 0);
             break;
         case MO_64:
-            tcg_out_vex_modrm(s, OPC_PUNPCKLQDQ, r, 0, a);
+            tcg_out_vex_modrm(s, OPC_PUNPCKLQDQ, r, a, a);
             break;
         default:
             g_assert_not_reached();
-- 
2.17.0
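
The operand mix-up described in the commit message above can be reproduced
with a small software model of the three instructions. This is only a
sketch: Vec128, vec_from_u64, vec_lo64, vec_hi64 and dup8_model are
hypothetical names invented for illustration and are not part of QEMU. The
"buggy" flag models the first source of vpunpcklwd being read from %xmm0
(VEX.vvvv = 0) instead of from the register that holds the intermediate
result.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit register model; b[0] is the least significant byte. */
typedef struct { uint8_t b[16]; } Vec128;

static Vec128 vec_from_u64(uint64_t x)
{
    Vec128 v;
    memset(&v, 0, sizeof(v));
    for (int i = 0; i < 8; i++) {
        v.b[i] = (uint8_t)(x >> (8 * i));
    }
    return v;
}

static uint64_t vec_lo64(Vec128 v)
{
    uint64_t x = 0;
    for (int i = 0; i < 8; i++) {
        x |= (uint64_t)v.b[i] << (8 * i);
    }
    return x;
}

static uint64_t vec_hi64(Vec128 v)
{
    uint64_t x = 0;
    for (int i = 0; i < 8; i++) {
        x |= (uint64_t)v.b[8 + i] << (8 * i);
    }
    return x;
}

/* VPUNPCKLBW: interleave the low 8 bytes of src1 (VEX.vvvv) and src2 (r/m). */
static Vec128 punpcklbw(Vec128 src1, Vec128 src2)
{
    Vec128 d;
    for (int i = 0; i < 8; i++) {
        d.b[2 * i] = src1.b[i];
        d.b[2 * i + 1] = src2.b[i];
    }
    return d;
}

/* VPUNPCKLWD: interleave the low 4 words of src1 (VEX.vvvv) and src2 (r/m). */
static Vec128 punpcklwd(Vec128 src1, Vec128 src2)
{
    Vec128 d;
    for (int i = 0; i < 4; i++) {
        memcpy(&d.b[4 * i], &src1.b[2 * i], 2);
        memcpy(&d.b[4 * i + 2], &src2.b[2 * i], 2);
    }
    return d;
}

/* VPSHUFD $0: broadcast the lowest 32-bit lane to all four lanes. */
static Vec128 pshufd0(Vec128 src)
{
    Vec128 d;
    for (int i = 0; i < 4; i++) {
        memcpy(&d.b[4 * i], &src.b[0], 4);
    }
    return d;
}

/* Model of the MO_8 sequence; buggy=true reads src1 of vpunpcklwd from xmm0. */
static Vec128 dup8_model(uint64_t x, bool buggy)
{
    Vec128 xmm0 = vec_from_u64(x);           /* vmovq */
    Vec128 xmm1 = punpcklbw(xmm0, xmm0);     /* correct only because a == xmm0 */
    xmm1 = punpcklwd(buggy ? xmm0 : xmm1, xmm1);
    return pshufd0(xmm1);
}
```

With x21 == 0000000002803200 the buggy path yields
0000320000003200:0000320000003200, matching the q26 value quoted in the
commit message, while the fixed path yields all zeroes.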

* [Qemu-devel] [PULL 2/2] tcg: Limit the number of ops in a TB
From: Richard Henderson @ 2018-05-09 15:48 UTC
  To: qemu-devel; +Cc: Richard Henderson, peter.maydell, qemu-stable

Commit 6001f7729e12 partially addressed the branch displacement
overflow caused by 15fa08f845.

However, gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqtbX.c
is a testcase that contains a TB so large as to overflow anyway.
The limit here of 8000 ops produces a maximum output TB size of
24112 bytes on a ppc64le host with that test case.  This is still
much less than the maximum forward branch distance of 32764 bytes.

Cc: qemu-stable@nongnu.org
Fixes: 15fa08f845 ("tcg: Dynamically allocate TCGOps")
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.h | 8 +++++++-
 tcg/tcg.c | 3 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 75fbad128b..88378be310 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -655,6 +655,7 @@ struct TCGContext {
     int nb_globals;
     int nb_temps;
     int nb_indirects;
+    int nb_ops;
 
     /* goto_tb support */
     tcg_insn_unit *code_buf;
@@ -844,7 +845,12 @@ static inline TCGOp *tcg_last_op(void)
 /* Test for whether to terminate the TB for using too many opcodes.  */
 static inline bool tcg_op_buf_full(void)
 {
-    return false;
+    /* This is not a hard limit, it merely stops translation when
+     * we have produced "enough" opcodes.  We want to limit TB size
+     * such that a RISC host can reasonably use a 16-bit signed
+     * branch within the TB.
+     */
+    return tcg_ctx->nb_ops >= 8000;
 }
 
 /* pool based memory allocation */
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 551caf1c53..6eeebe0624 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -866,6 +866,7 @@ void tcg_func_start(TCGContext *s)
     /* No temps have been previously allocated for size or locality.  */
     memset(s->free_temps, 0, sizeof(s->free_temps));
 
+    s->nb_ops = 0;
     s->nb_labels = 0;
     s->current_frame_offset = s->frame_start;
 
@@ -1956,6 +1957,7 @@ void tcg_op_remove(TCGContext *s, TCGOp *op)
 {
     QTAILQ_REMOVE(&s->ops, op, link);
     QTAILQ_INSERT_TAIL(&s->free_ops, op, link);
+    s->nb_ops--;
 
 #ifdef CONFIG_PROFILER
     atomic_set(&s->prof.del_op_count, s->prof.del_op_count + 1);
@@ -1975,6 +1977,7 @@ static TCGOp *tcg_op_alloc(TCGOpcode opc)
     }
     memset(op, 0, offsetof(TCGOp, link));
     op->opc = opc;
+    s->nb_ops++;
 
     return op;
 }
-- 
2.17.0
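
The counting scheme in the patch above can be sketched in isolation as
follows. This is a simplified model, not QEMU code: Ctx, the ctx_* helpers,
MAX_OPS_PER_TB and translate_model are hypothetical names (the patch itself
hard-codes 8000 in tcg_op_buf_full()), and the fixed ops-per-instruction
count is an assumption made for illustration. The point it shows is that the
limit is soft: the check runs between guest instructions, so the final count
may exceed 8000 by whatever the last instruction emitted.

```c
#include <stdbool.h>

/* Hypothetical name; the patch hard-codes this value. */
#define MAX_OPS_PER_TB 8000

/* Stand-in for the nb_ops field that the patch adds to TCGContext. */
typedef struct { int nb_ops; } Ctx;

/* Mirrors the reset in tcg_func_start(). */
static void ctx_func_start(Ctx *s) { s->nb_ops = 0; }

/* Mirrors the increment in tcg_op_alloc(). */
static void ctx_op_alloc(Ctx *s) { s->nb_ops++; }

/* Mirrors the decrement in tcg_op_remove(). */
static void ctx_op_remove(Ctx *s) { s->nb_ops--; }

/* Mirrors tcg_op_buf_full(): true once "enough" ops have been produced. */
static bool ctx_op_buf_full(const Ctx *s)
{
    return s->nb_ops >= MAX_OPS_PER_TB;
}

/*
 * Toy translation loop: each guest instruction emits ops_per_insn ops
 * (an assumption for illustration). Translation stops when the guest
 * instructions run out or the op buffer reports full.
 * Returns the number of guest instructions translated.
 */
static int translate_model(Ctx *s, int guest_insns, int ops_per_insn)
{
    int translated = 0;

    ctx_func_start(s);
    while (translated < guest_insns && !ctx_op_buf_full(s)) {
        for (int i = 0; i < ops_per_insn; i++) {
            ctx_op_alloc(s);
        }
        translated++;
    }
    return translated;
}
```

For scale, the figures in the commit message work out to 24112 / 8000, or
roughly 3 bytes of host code per op, and 32764 is the largest forward
displacement of a 16-bit signed, 4-byte-aligned branch (32768 - 4).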

* Re: [Qemu-devel] [PULL 0/2] tcg-next patches
From: Peter Maydell @ 2018-05-11 14:41 UTC
  To: Richard Henderson; +Cc: QEMU Developers, Richard Henderson

On 9 May 2018 at 16:48, Richard Henderson <richard.henderson@linaro.org> wrote:
> The following changes since commit e5cd695266c5709308aa95b1baae499e4b5d4544:
>
>   Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging (2018-05-08 17:05:58 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/rth7680/qemu.git tags/tcg-next-pull-request
>
> for you to fetch changes up to abebf92597186be2bc48d487235da28b1127860f:
>
>   tcg: Limit the number of ops in a TB (2018-05-09 08:30:57 -0700)
>
> ----------------------------------------------------------------
> Queued TCG patches
>
> ----------------------------------------------------------------
>
> Peter Maydell (1):
>   tcg/i386: Fix dup_vec in non-AVX2 codepath
>
> Richard Henderson (1):
>   tcg: Limit the number of ops in a TB
>
>  tcg/tcg.h                 | 8 +++++++-
>  tcg/i386/tcg-target.inc.c | 6 +++---
>  tcg/tcg.c                 | 3 +++
>  3 files changed, 13 insertions(+), 4 deletions(-)
>

Applied, thanks.

-- PMM