qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation
@ 2019-08-15  2:08 Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 01/46] target/i386: Push rex_r into DisasContext Jan Bobek
                   ` (46 more replies)
  0 siblings, 47 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

The previous version can be found at [1]. Changes compared to v2:

  - Expanded the instruction operand infrastructure a bit; I am now
    fairly confident that it is powerful enough to accommodate for all
    the use cases I will need. It's still a bit clunky to work with at
    times, but I am happy with it for now.

  - Reduced the number of various INSN_* (now called OPCODE_*) macro
    variants using variadic macros.

  - Implemented translation for instructions up to SSE3.

Cheers,
 -Jan

References:
  1. https://lists.nongnu.org/archive/html/qemu-devel/2019-08/msg01790.html

Jan Bobek (43):
  target/i386: reduce scope of variable aflag
  target/i386: use dflag from DisasContext
  target/i386: use prefix from DisasContext
  target/i386: use pc_start from DisasContext
  target/i386: make variable b1 const
  target/i386: make variable is_xmm const
  target/i386: add vector register file alignment constraints
  target/i386: introduce gen_(ld,st)d_env_A0
  target/i386: introduce gen_sse_ng
  target/i386: disable unused function warning temporarily
  target/i386: introduce mnemonic aliases for several gvec operations
  target/i386: introduce function ck_cpuid
  target/i386: introduce instruction operand infrastructure
  target/i386: introduce generic operand alias
  target/i386: introduce generic either-or operand
  target/i386: introduce generic load-store operand
  target/i386: introduce tcg_temp operands
  target/i386: introduce modrm operand
  target/i386: introduce operands for decoding modrm fields
  target/i386: introduce operand for direct-only r/m field
  target/i386: introduce operand vex_v
  target/i386: introduce Ib (immediate) operand
  target/i386: introduce M* (memptr) operands
  target/i386: introduce G*, R*, E* (general register) operands
  target/i386: introduce P*, N*, Q* (MMX) operands
  target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands
  target/i386: introduce code generators
  target/i386: introduce helper-based code generator macros
  target/i386: introduce gvec-based code generator macros
  target/i386: introduce sse-opcode.inc.h
  target/i386: introduce instruction translator macros
  target/i386: introduce MMX translators
  target/i386: introduce MMX code generators
  target/i386: introduce MMX instructions to sse-opcode.inc.h
  target/i386: introduce SSE translators
  target/i386: introduce SSE code generators
  target/i386: introduce SSE instructions to sse-opcode.inc.h
  target/i386: introduce SSE2 translators
  target/i386: introduce SSE2 code generators
  target/i386: introduce SSE2 instructions to sse-opcode.inc.h
  target/i386: introduce SSE3 translators
  target/i386: introduce SSE3 code generators
  target/i386: introduce SSE3 instructions to sse-opcode.inc.h

Richard Henderson (3):
  target/i386: Push rex_r into DisasContext
  target/i386: Push rex_w into DisasContext
  target/i386: Simplify gen_exception arguments

 target/i386/cpu.h            |    6 +-
 target/i386/sse-opcode.inc.h |  699 +++++++++
 target/i386/translate.c      | 2808 ++++++++++++++++++++++++++++++----
 3 files changed, 3189 insertions(+), 324 deletions(-)
 create mode 100644 target/i386/sse-opcode.inc.h

-- 
2.20.1



^ permalink raw reply	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 01/46] target/i386: Push rex_r into DisasContext
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w " Jan Bobek
                   ` (45 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Bennée, Richard Henderson, Richard Henderson

From: Richard Henderson <rth@twiddle.net>

Treat this value the same as we do for rex_b and rex_x.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target/i386/translate.c | 85 +++++++++++++++++++++--------------------
 1 file changed, 44 insertions(+), 41 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 03150a86e2..d74dbfd585 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -43,10 +43,12 @@
 #define CODE64(s) ((s)->code64)
 #define REX_X(s) ((s)->rex_x)
 #define REX_B(s) ((s)->rex_b)
+#define REX_R(s) ((s)->rex_r)
 #else
 #define CODE64(s) 0
 #define REX_X(s) 0
 #define REX_B(s) 0
+#define REX_R(s) 0
 #endif
 
 #ifdef TARGET_X86_64
@@ -98,7 +100,7 @@ typedef struct DisasContext {
 #ifdef TARGET_X86_64
     int lma;    /* long mode active */
     int code64; /* 64 bit code segment */
-    int rex_x, rex_b;
+    int rex_x, rex_b, rex_r;
 #endif
     int vex_l;  /* vex vector length */
     int vex_v;  /* vex vvvv register, without 1's complement.  */
@@ -3037,7 +3039,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
 };
 
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
-                    target_ulong pc_start, int rex_r)
+                    target_ulong pc_start)
 {
     int b1, op1_offset, op2_offset, is_xmm, val;
     int modrm, mod, rm, reg;
@@ -3107,8 +3109,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
     modrm = x86_ldub_code(env, s);
     reg = ((modrm >> 3) & 7);
-    if (is_xmm)
-        reg |= rex_r;
+    if (is_xmm) {
+        reg |= REX_R(s);
+    }
     mod = (modrm >> 6) & 3;
     if (sse_fn_epp == SSE_SPECIAL) {
         b |= (b1 << 8);
@@ -3642,7 +3645,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_ld16u_tl(s->T0, cpu_env,
                                 offsetof(CPUX86State,fpregs[rm].mmx.MMX_W(val)));
             }
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             gen_op_mov_reg_v(s, ot, reg, s->T0);
             break;
         case 0x1d6: /* movq ea, xmm */
@@ -3686,7 +3689,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                  offsetof(CPUX86State, fpregs[rm].mmx));
                 gen_helper_pmovmskb_mmx(s->tmp2_i32, cpu_env, s->ptr0);
             }
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
 
@@ -3698,7 +3701,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             modrm = x86_ldub_code(env, s);
             rm = modrm & 7;
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
             if (b1 >= 2) {
                 goto unknown_op;
@@ -3774,7 +3777,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             /* Various integer extensions at 0f 38 f[0-f].  */
             b = modrm | (b1 << 8);
             modrm = x86_ldub_code(env, s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
 
             switch (b) {
             case 0x3f0: /* crc32 Gd,Eb */
@@ -4128,7 +4131,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             b = modrm;
             modrm = x86_ldub_code(env, s);
             rm = modrm & 7;
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
             if (b1 >= 2) {
                 goto unknown_op;
@@ -4148,7 +4151,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3)
                     gen_lea_modrm(env, s, modrm);
-                reg = ((modrm >> 3) & 7) | rex_r;
+                reg = ((modrm >> 3) & 7) | REX_R(s);
                 val = x86_ldub_code(env, s);
                 switch (b) {
                 case 0x14: /* pextrb */
@@ -4317,7 +4320,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             /* Various integer extensions at 0f 3a f[0-f].  */
             b = modrm | (b1 << 8);
             modrm = x86_ldub_code(env, s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
 
             switch (b) {
             case 0x3f0: /* rorx Gy,Ey, Ib */
@@ -4491,14 +4494,15 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     TCGMemOp ot, aflag, dflag;
     int modrm, reg, rm, mod, op, opreg, val;
     target_ulong next_eip, tval;
-    int rex_w, rex_r;
     target_ulong pc_start = s->base.pc_next;
+    int rex_w;
 
     s->pc_start = s->pc = pc_start;
     s->override = -1;
 #ifdef TARGET_X86_64
     s->rex_x = 0;
     s->rex_b = 0;
+    s->rex_r = 0;
     s->x86_64_hregs = false;
 #endif
     s->rip_offset = 0; /* for relative ip address */
@@ -4511,7 +4515,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     prefixes = 0;
     rex_w = -1;
-    rex_r = 0;
 
  next_byte:
     b = x86_ldub_code(env, s);
@@ -4555,9 +4558,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (CODE64(s)) {
             /* REX prefix */
             rex_w = (b >> 3) & 1;
-            rex_r = (b & 0x4) << 1;
+            s->rex_r = (b & 0x4) << 1;
             s->rex_x = (b & 0x2) << 2;
-            REX_B(s) = (b & 0x1) << 3;
+            s->rex_b = (b & 0x1) << 3;
             /* select uniform byte register addressing */
             s->x86_64_hregs = true;
             goto next_byte;
@@ -4590,8 +4593,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             if (s->x86_64_hregs) {
                 goto illegal_op;
             }
+            s->rex_r = (~vex2 >> 4) & 8;
 #endif
-            rex_r = (~vex2 >> 4) & 8;
             if (b == 0xc5) {
                 /* 2-byte VEX prefix: RVVVVlpp, implied 0f leading opcode byte */
                 vex3 = vex2;
@@ -4681,7 +4684,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             switch(f) {
             case 0: /* OP Ev, Gv */
                 modrm = x86_ldub_code(env, s);
-                reg = ((modrm >> 3) & 7) | rex_r;
+                reg = ((modrm >> 3) & 7) | REX_R(s);
                 mod = (modrm >> 6) & 3;
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3) {
@@ -4703,7 +4706,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             case 1: /* OP Gv, Ev */
                 modrm = x86_ldub_code(env, s);
                 mod = (modrm >> 6) & 3;
-                reg = ((modrm >> 3) & 7) | rex_r;
+                reg = ((modrm >> 3) & 7) | REX_R(s);
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3) {
                     gen_lea_modrm(env, s, modrm);
@@ -5123,7 +5126,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         ot = mo_b_d(b, dflag);
 
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
 
         gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
         gen_op_mov_v_reg(s, ot, s->T1, reg);
@@ -5195,7 +5198,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x6b:
         ot = dflag;
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         if (b == 0x69)
             s->rip_offset = insn_const_size(ot);
         else if (b == 0x6b)
@@ -5247,7 +5250,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1c1: /* xadd Ev, Gv */
         ot = mo_b_d(b, dflag);
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
         gen_op_mov_v_reg(s, ot, s->T0, reg);
         if (mod == 3) {
@@ -5279,7 +5282,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
             ot = mo_b_d(b, dflag);
             modrm = x86_ldub_code(env, s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
             oldv = tcg_temp_new();
             newv = tcg_temp_new();
@@ -5502,7 +5505,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x89: /* mov Gv, Ev */
         ot = mo_b_d(b, dflag);
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
 
         /* generate a generic store */
         gen_ldst_modrm(env, s, modrm, ot, reg, 1);
@@ -5528,7 +5531,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x8b: /* mov Ev, Gv */
         ot = mo_b_d(b, dflag);
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
 
         gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
         gen_op_mov_reg_v(s, ot, reg, s->T0);
@@ -5578,7 +5581,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             s_ot = b & 8 ? MO_SIGN | ot : ot;
 
             modrm = x86_ldub_code(env, s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
             rm = (modrm & 7) | REX_B(s);
 
@@ -5617,7 +5620,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         mod = (modrm >> 6) & 3;
         if (mod == 3)
             goto illegal_op;
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         {
             AddressParts a = gen_lea_modrm_0(env, s, modrm);
             TCGv ea = gen_lea_modrm_1(s, a);
@@ -5699,7 +5702,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x87: /* xchg Ev, Gv */
         ot = mo_b_d(b, dflag);
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
         if (mod == 3) {
             rm = (modrm & 7) | REX_B(s);
@@ -5736,7 +5739,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     do_lxx:
         ot = dflag != MO_16 ? MO_32 : MO_16;
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
         if (mod == 3)
             goto illegal_op;
@@ -5819,7 +5822,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         modrm = x86_ldub_code(env, s);
         mod = (modrm >> 6) & 3;
         rm = (modrm & 7) | REX_B(s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
             opreg = OR_TMP0;
@@ -6674,7 +6677,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         }
         ot = dflag;
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         gen_cmovcc1(env, s, ot, b, modrm, reg);
         break;
 
@@ -6824,7 +6827,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     do_btx:
         ot = dflag;
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
         rm = (modrm & 7) | REX_B(s);
         gen_op_mov_v_reg(s, MO_32, s->T1, reg);
@@ -6929,7 +6932,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1bd: /* bsr / lzcnt */
         ot = dflag;
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
         gen_extu(ot, s->T0);
 
@@ -7693,7 +7696,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             d_ot = dflag;
 
             modrm = x86_ldub_code(env, s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
             rm = (modrm & 7) | REX_B(s);
 
@@ -7767,7 +7770,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             ot = dflag != MO_16 ? MO_32 : MO_16;
             modrm = x86_ldub_code(env, s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
             t0 = tcg_temp_local_new();
             gen_update_cc_op(s);
@@ -7808,7 +7811,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         modrm = x86_ldub_code(env, s);
         if (s->flags & HF_MPX_EN_MASK) {
             mod = (modrm >> 6) & 3;
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             if (prefixes & PREFIX_REPZ) {
                 /* bndcl */
                 if (reg >= 4
@@ -7898,7 +7901,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         modrm = x86_ldub_code(env, s);
         if (s->flags & HF_MPX_EN_MASK) {
             mod = (modrm >> 6) & 3;
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             if (mod != 3 && (prefixes & PREFIX_REPZ)) {
                 /* bndmk */
                 if (reg >= 4
@@ -8012,7 +8015,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
              * are assumed to be 1's, regardless of actual values.
              */
             rm = (modrm & 7) | REX_B(s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             if (CODE64(s))
                 ot = MO_64;
             else
@@ -8069,7 +8072,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
              * are assumed to be 1's, regardless of actual values.
              */
             rm = (modrm & 7) | REX_B(s);
-            reg = ((modrm >> 3) & 7) | rex_r;
+            reg = ((modrm >> 3) & 7) | REX_R(s);
             if (CODE64(s))
                 ot = MO_64;
             else
@@ -8112,7 +8115,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         mod = (modrm >> 6) & 3;
         if (mod == 3)
             goto illegal_op;
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
         /* generate a generic store */
         gen_ldst_modrm(env, s, modrm, ot, reg, 1);
         break;
@@ -8338,7 +8341,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             goto illegal_op;
 
         modrm = x86_ldub_code(env, s);
-        reg = ((modrm >> 3) & 7) | rex_r;
+        reg = ((modrm >> 3) & 7) | REX_R(s);
 
         if (s->prefix & PREFIX_DATA) {
             ot = MO_16;
@@ -8366,7 +8369,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1c2:
     case 0x1c4 ... 0x1c6:
     case 0x1d0 ... 0x1fe:
-        gen_sse(env, s, b, pc_start, rex_r);
+        gen_sse(env, s, b, pc_start);
         break;
     default:
         goto unknown_op;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 01/46] target/i386: Push rex_r into DisasContext Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  7:30   ` Aleksandar Markovic
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag Jan Bobek
                   ` (44 subsequent siblings)
  46 siblings, 1 reply; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Bennée, Richard Henderson, Richard Henderson

From: Richard Henderson <rth@twiddle.net>

Treat this the same as we already do for other rex bits.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target/i386/translate.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index d74dbfd585..c0866c2797 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -44,11 +44,13 @@
 #define REX_X(s) ((s)->rex_x)
 #define REX_B(s) ((s)->rex_b)
 #define REX_R(s) ((s)->rex_r)
+#define REX_W(s) ((s)->rex_w)
 #else
 #define CODE64(s) 0
 #define REX_X(s) 0
 #define REX_B(s) 0
 #define REX_R(s) 0
+#define REX_W(s) -1
 #endif
 
 #ifdef TARGET_X86_64
@@ -100,7 +102,7 @@ typedef struct DisasContext {
 #ifdef TARGET_X86_64
     int lma;    /* long mode active */
     int code64; /* 64 bit code segment */
-    int rex_x, rex_b, rex_r;
+    int rex_x, rex_b, rex_r, rex_w;
 #endif
     int vex_l;  /* vex vector length */
     int vex_v;  /* vex vvvv register, without 1's complement.  */
@@ -4495,7 +4497,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     int modrm, reg, rm, mod, op, opreg, val;
     target_ulong next_eip, tval;
     target_ulong pc_start = s->base.pc_next;
-    int rex_w;
 
     s->pc_start = s->pc = pc_start;
     s->override = -1;
@@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     s->rex_x = 0;
     s->rex_b = 0;
     s->rex_r = 0;
+    s->rex_w = -1;
     s->x86_64_hregs = false;
 #endif
     s->rip_offset = 0; /* for relative ip address */
@@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     }
 
     prefixes = 0;
-    rex_w = -1;
 
  next_byte:
     b = x86_ldub_code(env, s);
@@ -4557,7 +4558,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x40 ... 0x4f:
         if (CODE64(s)) {
             /* REX prefix */
-            rex_w = (b >> 3) & 1;
+            s->rex_w = (b >> 3) & 1;
             s->rex_r = (b & 0x4) << 1;
             s->rex_x = (b & 0x2) << 2;
             s->rex_b = (b & 0x1) << 3;
@@ -4606,7 +4607,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 s->rex_b = (~vex2 >> 2) & 8;
 #endif
                 vex3 = x86_ldub_code(env, s);
-                rex_w = (vex3 >> 7) & 1;
+#ifdef TARGET_X86_64
+                s->rex_w = (vex3 >> 7) & 1;
+#endif
                 switch (vex2 & 0x1f) {
                 case 0x01: /* Implied 0f leading opcode bytes.  */
                     b = x86_ldub_code(env, s) | 0x100;
@@ -4631,9 +4634,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     /* Post-process prefixes.  */
     if (CODE64(s)) {
         /* In 64-bit mode, the default data size is 32-bit.  Select 64-bit
-           data with rex_w, and 16-bit data with 0x66; rex_w takes precedence
+           data with REX_W, and 16-bit data with 0x66; REX_W takes precedence
            over 0x66 if both are present.  */
-        dflag = (rex_w > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 : MO_32);
+        dflag = (REX_W(s) > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 : MO_32);
         /* In 64-bit mode, 0x67 selects 32-bit addressing.  */
         aflag = (prefixes & PREFIX_ADR ? MO_32 : MO_64);
     } else {
@@ -5029,7 +5032,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 /* operand size for jumps is 64 bit */
                 ot = MO_64;
             } else if (op == 3 || op == 5) {
-                ot = dflag != MO_16 ? MO_32 + (rex_w == 1) : MO_16;
+                ot = dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
             } else if (op == 6) {
                 /* default push size is 64 bit */
                 ot = mo_pushpop(s, dflag);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 01/46] target/i386: Push rex_r into DisasContext Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w " Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  7:16   ` Aleksandar Markovic
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 04/46] target/i386: use dflag from DisasContext Jan Bobek
                   ` (43 subsequent siblings)
  46 siblings, 1 reply; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Jan Bobek, Alex Bennée, Richard Henderson, Richard Henderson

The variable aflag is not used in most of disas_insn; make this clear
by explicitly reducing its scope to the block where it is used.

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c0866c2797..bda96277e4 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4493,11 +4493,14 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     CPUX86State *env = cpu->env_ptr;
     int b, prefixes;
     int shift;
-    TCGMemOp ot, aflag, dflag;
+    TCGMemOp ot, dflag;
     int modrm, reg, rm, mod, op, opreg, val;
     target_ulong next_eip, tval;
     target_ulong pc_start = s->base.pc_next;
 
+    {
+    TCGMemOp aflag;
+
     s->pc_start = s->pc = pc_start;
     s->override = -1;
 #ifdef TARGET_X86_64
@@ -4657,6 +4660,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     s->prefix = prefixes;
     s->aflag = aflag;
     s->dflag = dflag;
+    }
 
     /* now check op code */
  reswitch:
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 04/46] target/i386: use dflag from DisasContext
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (2 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 05/46] target/i386: use prefix " Jan Bobek
                   ` (42 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Jan Bobek, Alex Bennée, Richard Henderson, Richard Henderson

There already is a variable dflag in DisasContext, so reduce the scope
of the local variable dflag to enforce use of the one in DisasContext.

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 184 ++++++++++++++++++++--------------------
 1 file changed, 92 insertions(+), 92 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index bda96277e4..bb13877df7 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4493,13 +4493,13 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     CPUX86State *env = cpu->env_ptr;
     int b, prefixes;
     int shift;
-    TCGMemOp ot, dflag;
+    TCGMemOp ot;
     int modrm, reg, rm, mod, op, opreg, val;
     target_ulong next_eip, tval;
     target_ulong pc_start = s->base.pc_next;
 
     {
-    TCGMemOp aflag;
+    TCGMemOp aflag, dflag;
 
     s->pc_start = s->pc = pc_start;
     s->override = -1;
@@ -4686,7 +4686,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             op = (b >> 3) & 7;
             f = (b >> 1) & 3;
 
-            ot = mo_b_d(b, dflag);
+            ot = mo_b_d(b, s->dflag);
 
             switch(f) {
             case 0: /* OP Ev, Gv */
@@ -4744,7 +4744,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         {
             int val;
 
-            ot = mo_b_d(b, dflag);
+            ot = mo_b_d(b, s->dflag);
 
             modrm = x86_ldub_code(env, s);
             mod = (modrm >> 6) & 3;
@@ -4781,16 +4781,16 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         /**************************/
         /* inc, dec, and other misc arith */
     case 0x40 ... 0x47: /* inc Gv */
-        ot = dflag;
+        ot = s->dflag;
         gen_inc(s, ot, OR_EAX + (b & 7), 1);
         break;
     case 0x48 ... 0x4f: /* dec Gv */
-        ot = dflag;
+        ot = s->dflag;
         gen_inc(s, ot, OR_EAX + (b & 7), -1);
         break;
     case 0xf6: /* GRP3 */
     case 0xf7:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
 
         modrm = x86_ldub_code(env, s);
         mod = (modrm >> 6) & 3;
@@ -5022,7 +5022,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0xfe: /* GRP4 */
     case 0xff: /* GRP5 */
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
 
         modrm = x86_ldub_code(env, s);
         mod = (modrm >> 6) & 3;
@@ -5036,10 +5036,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 /* operand size for jumps is 64 bit */
                 ot = MO_64;
             } else if (op == 3 || op == 5) {
-                ot = dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
+                ot = s->dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
             } else if (op == 6) {
                 /* default push size is 64 bit */
-                ot = mo_pushpop(s, dflag);
+                ot = mo_pushpop(s, s->dflag);
             }
         }
         if (mod != 3) {
@@ -5067,7 +5067,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             break;
         case 2: /* call Ev */
             /* XXX: optimize if memory (no 'and' is necessary) */
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tcg_gen_ext16u_tl(s->T0, s->T0);
             }
             next_eip = s->pc - s->cs_base;
@@ -5085,19 +5085,19 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             if (s->pe && !s->vm86) {
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_lcall_protected(cpu_env, s->tmp2_i32, s->T1,
-                                           tcg_const_i32(dflag - 1),
+                                           tcg_const_i32(s->dflag - 1),
                                            tcg_const_tl(s->pc - s->cs_base));
             } else {
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_lcall_real(cpu_env, s->tmp2_i32, s->T1,
-                                      tcg_const_i32(dflag - 1),
+                                      tcg_const_i32(s->dflag - 1),
                                       tcg_const_i32(s->pc - s->cs_base));
             }
             tcg_gen_ld_tl(s->tmp4, cpu_env, offsetof(CPUX86State, eip));
             gen_jr(s, s->tmp4);
             break;
         case 4: /* jmp Ev */
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tcg_gen_ext16u_tl(s->T0, s->T0);
             }
             gen_op_jmp_v(s->T0);
@@ -5130,7 +5130,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0x84: /* test Ev, Gv */
     case 0x85:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
 
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
@@ -5143,7 +5143,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0xa8: /* test eAX, Iv */
     case 0xa9:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         val = insn_get(env, s, ot);
 
         gen_op_mov_v_reg(s, ot, s->T0, OR_EAX);
@@ -5153,7 +5153,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
 
     case 0x98: /* CWDE/CBW */
-        switch (dflag) {
+        switch (s->dflag) {
 #ifdef TARGET_X86_64
         case MO_64:
             gen_op_mov_v_reg(s, MO_32, s->T0, R_EAX);
@@ -5176,7 +5176,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         }
         break;
     case 0x99: /* CDQ/CWD */
-        switch (dflag) {
+        switch (s->dflag) {
 #ifdef TARGET_X86_64
         case MO_64:
             gen_op_mov_v_reg(s, MO_64, s->T0, R_EAX);
@@ -5203,7 +5203,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1af: /* imul Gv, Ev */
     case 0x69: /* imul Gv, Ev, I */
     case 0x6b:
-        ot = dflag;
+        ot = s->dflag;
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
         if (b == 0x69)
@@ -5255,7 +5255,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x1c0:
     case 0x1c1: /* xadd Ev, Gv */
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
@@ -5287,7 +5287,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         {
             TCGv oldv, newv, cmpv;
 
-            ot = mo_b_d(b, dflag);
+            ot = mo_b_d(b, s->dflag);
             modrm = x86_ldub_code(env, s);
             reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
@@ -5348,7 +5348,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
 #ifdef TARGET_X86_64
-            if (dflag == MO_64) {
+            if (s->dflag == MO_64) {
                 if (!(s->cpuid_ext_features & CPUID_EXT_CX16)) {
                     goto illegal_op;
                 }
@@ -5388,7 +5388,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             }
             gen_helper_rdrand(s->T0, cpu_env);
             rm = (modrm & 7) | REX_B(s);
-            gen_op_mov_reg_v(s, dflag, rm, s->T0);
+            gen_op_mov_reg_v(s, s->dflag, rm, s->T0);
             set_cc_op(s, CC_OP_EFLAGS);
             if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
                 gen_io_end();
@@ -5425,7 +5425,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x68: /* push Iv */
     case 0x6a:
-        ot = mo_pushpop(s, dflag);
+        ot = mo_pushpop(s, s->dflag);
         if (b == 0x68)
             val = insn_get(env, s, ot);
         else
@@ -5510,7 +5510,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         /* mov */
     case 0x88:
     case 0x89: /* mov Gv, Ev */
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
 
@@ -5519,7 +5519,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xc6:
     case 0xc7: /* mov Ev, Iv */
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         modrm = x86_ldub_code(env, s);
         mod = (modrm >> 6) & 3;
         if (mod != 3) {
@@ -5536,7 +5536,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x8a:
     case 0x8b: /* mov Ev, Gv */
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
 
@@ -5568,7 +5568,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (reg >= 6)
             goto illegal_op;
         gen_op_movl_T0_seg(s, reg);
-        ot = mod == 3 ? dflag : MO_16;
+        ot = mod == 3 ? s->dflag : MO_16;
         gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
         break;
 
@@ -5581,7 +5581,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             TCGMemOp s_ot;
 
             /* d_ot is the size of destination */
-            d_ot = dflag;
+            d_ot = s->dflag;
             /* ot is the size of source */
             ot = (b & 1) + MO_8;
             /* s_ot is the sign+size of source */
@@ -5632,7 +5632,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             AddressParts a = gen_lea_modrm_0(env, s, modrm);
             TCGv ea = gen_lea_modrm_1(s, a);
             gen_lea_v_seg(s, s->aflag, ea, -1, -1);
-            gen_op_mov_reg_v(s, dflag, reg, s->A0);
+            gen_op_mov_reg_v(s, s->dflag, reg, s->A0);
         }
         break;
 
@@ -5643,7 +5643,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         {
             target_ulong offset_addr;
 
-            ot = mo_b_d(b, dflag);
+            ot = mo_b_d(b, s->dflag);
             switch (s->aflag) {
 #ifdef TARGET_X86_64
             case MO_64:
@@ -5681,7 +5681,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xb8 ... 0xbf: /* mov R, Iv */
 #ifdef TARGET_X86_64
-        if (dflag == MO_64) {
+        if (s->dflag == MO_64) {
             uint64_t tmp;
             /* 64 bit case */
             tmp = x86_ldq_code(env, s);
@@ -5691,7 +5691,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         } else
 #endif
         {
-            ot = dflag;
+            ot = s->dflag;
             val = insn_get(env, s, ot);
             reg = (b & 7) | REX_B(s);
             tcg_gen_movi_tl(s->T0, val);
@@ -5701,13 +5701,13 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0x91 ... 0x97: /* xchg R, EAX */
     do_xchg_reg_eax:
-        ot = dflag;
+        ot = s->dflag;
         reg = (b & 7) | REX_B(s);
         rm = R_EAX;
         goto do_xchg_reg;
     case 0x86:
     case 0x87: /* xchg Ev, Gv */
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
@@ -5744,7 +5744,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1b5: /* lgs Gv */
         op = R_GS;
     do_lxx:
-        ot = dflag != MO_16 ? MO_32 : MO_16;
+        ot = s->dflag != MO_16 ? MO_32 : MO_16;
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
@@ -5772,7 +5772,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         shift = 2;
     grp2:
         {
-            ot = mo_b_d(b, dflag);
+            ot = mo_b_d(b, s->dflag);
             modrm = x86_ldub_code(env, s);
             mod = (modrm >> 6) & 3;
             op = (modrm >> 3) & 7;
@@ -5825,7 +5825,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         op = 1;
         shift = 0;
     do_shiftd:
-        ot = dflag;
+        ot = s->dflag;
         modrm = x86_ldub_code(env, s);
         mod = (modrm >> 6) & 3;
         rm = (modrm & 7) | REX_B(s);
@@ -5987,7 +5987,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 }
                 break;
             case 0x0c: /* fldenv mem */
-                gen_helper_fldenv(cpu_env, s->A0, tcg_const_i32(dflag - 1));
+                gen_helper_fldenv(cpu_env, s->A0, tcg_const_i32(s->dflag - 1));
                 break;
             case 0x0d: /* fldcw mem */
                 tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
@@ -5995,7 +5995,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 gen_helper_fldcw(cpu_env, s->tmp2_i32);
                 break;
             case 0x0e: /* fnstenv mem */
-                gen_helper_fstenv(cpu_env, s->A0, tcg_const_i32(dflag - 1));
+                gen_helper_fstenv(cpu_env, s->A0, tcg_const_i32(s->dflag - 1));
                 break;
             case 0x0f: /* fnstcw mem */
                 gen_helper_fnstcw(s->tmp2_i32, cpu_env);
@@ -6010,10 +6010,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 gen_helper_fpop(cpu_env);
                 break;
             case 0x2c: /* frstor mem */
-                gen_helper_frstor(cpu_env, s->A0, tcg_const_i32(dflag - 1));
+                gen_helper_frstor(cpu_env, s->A0, tcg_const_i32(s->dflag - 1));
                 break;
             case 0x2e: /* fnsave mem */
-                gen_helper_fsave(cpu_env, s->A0, tcg_const_i32(dflag - 1));
+                gen_helper_fsave(cpu_env, s->A0, tcg_const_i32(s->dflag - 1));
                 break;
             case 0x2f: /* fnstsw mem */
                 gen_helper_fnstsw(s->tmp2_i32, cpu_env);
@@ -6355,7 +6355,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0xa4: /* movsS */
     case 0xa5:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_movs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
@@ -6365,7 +6365,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0xaa: /* stosS */
     case 0xab:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_stos(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
@@ -6374,7 +6374,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xac: /* lodsS */
     case 0xad:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_lods(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
@@ -6383,7 +6383,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xae: /* scasS */
     case 0xaf:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         if (prefixes & PREFIX_REPNZ) {
             gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
         } else if (prefixes & PREFIX_REPZ) {
@@ -6395,7 +6395,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0xa6: /* cmpsS */
     case 0xa7:
-        ot = mo_b_d(b, dflag);
+        ot = mo_b_d(b, s->dflag);
         if (prefixes & PREFIX_REPNZ) {
             gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
         } else if (prefixes & PREFIX_REPZ) {
@@ -6406,7 +6406,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x6c: /* insS */
     case 0x6d:
-        ot = mo_b_d32(b, dflag);
+        ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base, 
                      SVM_IOIO_TYPE_MASK | svm_is_rep(prefixes) | 4);
@@ -6421,7 +6421,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x6e: /* outsS */
     case 0x6f:
-        ot = mo_b_d32(b, dflag);
+        ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base,
                      svm_is_rep(prefixes) | 4);
@@ -6440,7 +6440,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
     case 0xe4:
     case 0xe5:
-        ot = mo_b_d32(b, dflag);
+        ot = mo_b_d32(b, s->dflag);
         val = x86_ldub_code(env, s);
         tcg_gen_movi_tl(s->T0, val);
         gen_check_io(s, ot, pc_start - s->cs_base,
@@ -6459,7 +6459,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xe6:
     case 0xe7:
-        ot = mo_b_d32(b, dflag);
+        ot = mo_b_d32(b, s->dflag);
         val = x86_ldub_code(env, s);
         tcg_gen_movi_tl(s->T0, val);
         gen_check_io(s, ot, pc_start - s->cs_base,
@@ -6480,7 +6480,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xec:
     case 0xed:
-        ot = mo_b_d32(b, dflag);
+        ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base,
                      SVM_IOIO_TYPE_MASK | svm_is_rep(prefixes));
@@ -6498,7 +6498,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xee:
     case 0xef:
-        ot = mo_b_d32(b, dflag);
+        ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base,
                      svm_is_rep(prefixes));
@@ -6542,21 +6542,21 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (s->pe && !s->vm86) {
             gen_update_cc_op(s);
             gen_jmp_im(s, pc_start - s->cs_base);
-            gen_helper_lret_protected(cpu_env, tcg_const_i32(dflag - 1),
+            gen_helper_lret_protected(cpu_env, tcg_const_i32(s->dflag - 1),
                                       tcg_const_i32(val));
         } else {
             gen_stack_A0(s);
             /* pop offset */
-            gen_op_ld_v(s, dflag, s->T0, s->A0);
+            gen_op_ld_v(s, s->dflag, s->T0, s->A0);
             /* NOTE: keeping EIP updated is not a problem in case of
                exception */
             gen_op_jmp_v(s->T0);
             /* pop selector */
-            gen_add_A0_im(s, 1 << dflag);
-            gen_op_ld_v(s, dflag, s->T0, s->A0);
+            gen_add_A0_im(s, 1 << s->dflag);
+            gen_op_ld_v(s, s->dflag, s->T0, s->A0);
             gen_op_movl_seg_T0_vm(s, R_CS);
             /* add stack offset */
-            gen_stack_update(s, val + (2 << dflag));
+            gen_stack_update(s, val + (2 << s->dflag));
         }
         gen_eob(s);
         break;
@@ -6567,17 +6567,17 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_svm_check_intercept(s, pc_start, SVM_EXIT_IRET);
         if (!s->pe) {
             /* real mode */
-            gen_helper_iret_real(cpu_env, tcg_const_i32(dflag - 1));
+            gen_helper_iret_real(cpu_env, tcg_const_i32(s->dflag - 1));
             set_cc_op(s, CC_OP_EFLAGS);
         } else if (s->vm86) {
             if (s->iopl != 3) {
                 gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
             } else {
-                gen_helper_iret_real(cpu_env, tcg_const_i32(dflag - 1));
+                gen_helper_iret_real(cpu_env, tcg_const_i32(s->dflag - 1));
                 set_cc_op(s, CC_OP_EFLAGS);
             }
         } else {
-            gen_helper_iret_protected(cpu_env, tcg_const_i32(dflag - 1),
+            gen_helper_iret_protected(cpu_env, tcg_const_i32(s->dflag - 1),
                                       tcg_const_i32(s->pc - s->cs_base));
             set_cc_op(s, CC_OP_EFLAGS);
         }
@@ -6585,14 +6585,14 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xe8: /* call im */
         {
-            if (dflag != MO_16) {
+            if (s->dflag != MO_16) {
                 tval = (int32_t)insn_get(env, s, MO_32);
             } else {
                 tval = (int16_t)insn_get(env, s, MO_16);
             }
             next_eip = s->pc - s->cs_base;
             tval += next_eip;
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tval &= 0xffff;
             } else if (!CODE64(s)) {
                 tval &= 0xffffffff;
@@ -6609,7 +6609,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
             if (CODE64(s))
                 goto illegal_op;
-            ot = dflag;
+            ot = s->dflag;
             offset = insn_get(env, s, ot);
             selector = insn_get(env, s, MO_16);
 
@@ -6618,13 +6618,13 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         }
         goto do_lcall;
     case 0xe9: /* jmp im */
-        if (dflag != MO_16) {
+        if (s->dflag != MO_16) {
             tval = (int32_t)insn_get(env, s, MO_32);
         } else {
             tval = (int16_t)insn_get(env, s, MO_16);
         }
         tval += s->pc - s->cs_base;
-        if (dflag == MO_16) {
+        if (s->dflag == MO_16) {
             tval &= 0xffff;
         } else if (!CODE64(s)) {
             tval &= 0xffffffff;
@@ -6638,7 +6638,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
             if (CODE64(s))
                 goto illegal_op;
-            ot = dflag;
+            ot = s->dflag;
             offset = insn_get(env, s, ot);
             selector = insn_get(env, s, MO_16);
 
@@ -6649,7 +6649,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xeb: /* jmp Jb */
         tval = (int8_t)insn_get(env, s, MO_8);
         tval += s->pc - s->cs_base;
-        if (dflag == MO_16) {
+        if (s->dflag == MO_16) {
             tval &= 0xffff;
         }
         gen_jmp(s, tval);
@@ -6658,7 +6658,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         tval = (int8_t)insn_get(env, s, MO_8);
         goto do_jcc;
     case 0x180 ... 0x18f: /* jcc Jv */
-        if (dflag != MO_16) {
+        if (s->dflag != MO_16) {
             tval = (int32_t)insn_get(env, s, MO_32);
         } else {
             tval = (int16_t)insn_get(env, s, MO_16);
@@ -6666,7 +6666,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     do_jcc:
         next_eip = s->pc - s->cs_base;
         tval += next_eip;
-        if (dflag == MO_16) {
+        if (s->dflag == MO_16) {
             tval &= 0xffff;
         }
         gen_bnd_jmp(s);
@@ -6682,7 +6682,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (!(s->cpuid_features & CPUID_CMOV)) {
             goto illegal_op;
         }
-        ot = dflag;
+        ot = s->dflag;
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
         gen_cmovcc1(env, s, ot, b, modrm, reg);
@@ -6707,7 +6707,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         } else {
             ot = gen_pop_T0(s);
             if (s->cpl == 0) {
-                if (dflag != MO_16) {
+                if (s->dflag != MO_16) {
                     gen_helper_write_eflags(cpu_env, s->T0,
                                             tcg_const_i32((TF_MASK | AC_MASK |
                                                            ID_MASK | NT_MASK |
@@ -6722,7 +6722,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 }
             } else {
                 if (s->cpl <= s->iopl) {
-                    if (dflag != MO_16) {
+                    if (s->dflag != MO_16) {
                         gen_helper_write_eflags(cpu_env, s->T0,
                                                 tcg_const_i32((TF_MASK |
                                                                AC_MASK |
@@ -6739,7 +6739,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                                                               & 0xffff));
                     }
                 } else {
-                    if (dflag != MO_16) {
+                    if (s->dflag != MO_16) {
                         gen_helper_write_eflags(cpu_env, s->T0,
                                            tcg_const_i32((TF_MASK | AC_MASK |
                                                           ID_MASK | NT_MASK)));
@@ -6799,7 +6799,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         /************************/
         /* bit operations */
     case 0x1ba: /* bt/bts/btr/btc Gv, im */
-        ot = dflag;
+        ot = s->dflag;
         modrm = x86_ldub_code(env, s);
         op = (modrm >> 3) & 7;
         mod = (modrm >> 6) & 3;
@@ -6832,7 +6832,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1bb: /* btc */
         op = 3;
     do_btx:
-        ot = dflag;
+        ot = s->dflag;
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
@@ -6937,7 +6937,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x1bc: /* bsf / tzcnt */
     case 0x1bd: /* bsr / lzcnt */
-        ot = dflag;
+        ot = s->dflag;
         modrm = x86_ldub_code(env, s);
         reg = ((modrm >> 3) & 7) | REX_R(s);
         gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
@@ -7111,7 +7111,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x62: /* bound */
         if (CODE64(s))
             goto illegal_op;
-        ot = dflag;
+        ot = s->dflag;
         modrm = x86_ldub_code(env, s);
         reg = (modrm >> 3) & 7;
         mod = (modrm >> 6) & 3;
@@ -7129,7 +7129,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1c8 ... 0x1cf: /* bswap reg */
         reg = (b & 7) | REX_B(s);
 #ifdef TARGET_X86_64
-        if (dflag == MO_64) {
+        if (s->dflag == MO_64) {
             gen_op_mov_v_reg(s, MO_64, s->T0, reg);
             tcg_gen_bswap64_i64(s->T0, s->T0);
             gen_op_mov_reg_v(s, MO_64, reg, s->T0);
@@ -7159,7 +7159,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             tval = (int8_t)insn_get(env, s, MO_8);
             next_eip = s->pc - s->cs_base;
             tval += next_eip;
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tval &= 0xffff;
             }
 
@@ -7243,7 +7243,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (!s->pe) {
             gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
         } else {
-            gen_helper_sysexit(cpu_env, tcg_const_i32(dflag - 1));
+            gen_helper_sysexit(cpu_env, tcg_const_i32(s->dflag - 1));
             gen_eob(s);
         }
         break;
@@ -7262,7 +7262,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (!s->pe) {
             gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
         } else {
-            gen_helper_sysret(cpu_env, tcg_const_i32(dflag - 1));
+            gen_helper_sysret(cpu_env, tcg_const_i32(s->dflag - 1));
             /* condition codes are modified only in long mode */
             if (s->lma) {
                 set_cc_op(s, CC_OP_EFLAGS);
@@ -7301,7 +7301,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_svm_check_intercept(s, pc_start, SVM_EXIT_LDTR_READ);
             tcg_gen_ld32u_tl(s->T0, cpu_env,
                              offsetof(CPUX86State, ldt.selector));
-            ot = mod == 3 ? dflag : MO_16;
+            ot = mod == 3 ? s->dflag : MO_16;
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
             break;
         case 2: /* lldt */
@@ -7322,7 +7322,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_svm_check_intercept(s, pc_start, SVM_EXIT_TR_READ);
             tcg_gen_ld32u_tl(s->T0, cpu_env,
                              offsetof(CPUX86State, tr.selector));
-            ot = mod == 3 ? dflag : MO_16;
+            ot = mod == 3 ? s->dflag : MO_16;
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
             break;
         case 3: /* ltr */
@@ -7366,7 +7366,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_op_st_v(s, MO_16, s->T0, s->A0);
             gen_add_A0_im(s, 2);
             tcg_gen_ld_tl(s->T0, cpu_env, offsetof(CPUX86State, gdt.base));
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tcg_gen_andi_tl(s->T0, s->T0, 0xffffff);
             }
             gen_op_st_v(s, CODE64(s) + MO_32, s->T0, s->A0);
@@ -7421,7 +7421,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_op_st_v(s, MO_16, s->T0, s->A0);
             gen_add_A0_im(s, 2);
             tcg_gen_ld_tl(s->T0, cpu_env, offsetof(CPUX86State, idt.base));
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tcg_gen_andi_tl(s->T0, s->T0, 0xffffff);
             }
             gen_op_st_v(s, CODE64(s) + MO_32, s->T0, s->A0);
@@ -7571,7 +7571,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_op_ld_v(s, MO_16, s->T1, s->A0);
             gen_add_A0_im(s, 2);
             gen_op_ld_v(s, CODE64(s) + MO_32, s->T0, s->A0);
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tcg_gen_andi_tl(s->T0, s->T0, 0xffffff);
             }
             tcg_gen_st_tl(s->T0, cpu_env, offsetof(CPUX86State, gdt.base));
@@ -7588,7 +7588,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_op_ld_v(s, MO_16, s->T1, s->A0);
             gen_add_A0_im(s, 2);
             gen_op_ld_v(s, CODE64(s) + MO_32, s->T0, s->A0);
-            if (dflag == MO_16) {
+            if (s->dflag == MO_16) {
                 tcg_gen_andi_tl(s->T0, s->T0, 0xffffff);
             }
             tcg_gen_st_tl(s->T0, cpu_env, offsetof(CPUX86State, idt.base));
@@ -7700,7 +7700,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (CODE64(s)) {
             int d_ot;
             /* d_ot is the size of destination */
-            d_ot = dflag;
+            d_ot = s->dflag;
 
             modrm = x86_ldub_code(env, s);
             reg = ((modrm >> 3) & 7) | REX_R(s);
@@ -7775,7 +7775,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             TCGv t0;
             if (!s->pe || s->vm86)
                 goto illegal_op;
-            ot = dflag != MO_16 ? MO_32 : MO_16;
+            ot = s->dflag != MO_16 ? MO_32 : MO_16;
             modrm = x86_ldub_code(env, s);
             reg = ((modrm >> 3) & 7) | REX_R(s);
             gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
@@ -8117,7 +8117,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1c3: /* MOVNTI reg, mem */
         if (!(s->cpuid_features & CPUID_SSE2))
             goto illegal_op;
-        ot = mo_64_32(dflag);
+        ot = mo_64_32(s->dflag);
         modrm = x86_ldub_code(env, s);
         mod = (modrm >> 6) & 3;
         if (mod == 3)
@@ -8353,7 +8353,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (s->prefix & PREFIX_DATA) {
             ot = MO_16;
         } else {
-            ot = mo_64_32(dflag);
+            ot = mo_64_32(s->dflag);
         }
 
         gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 05/46] target/i386: use prefix from DisasContext
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (3 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 04/46] target/i386: use dflag from DisasContext Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 06/46] target/i386: Simplify gen_exception arguments Jan Bobek
                   ` (41 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Jan Bobek, Alex Bennée, Richard Henderson, Richard Henderson

Reduce scope of the local variable prefixes to enforce use of prefix
from DisasContext instead.

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 113 ++++++++++++++++++++--------------------
 1 file changed, 57 insertions(+), 56 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index bb13877df7..40a4844b64 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4491,7 +4491,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 {
     CPUX86State *env = cpu->env_ptr;
-    int b, prefixes;
+    int b;
     int shift;
     TCGMemOp ot;
     int modrm, reg, rm, mod, op, opreg, val;
@@ -4499,6 +4499,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     target_ulong pc_start = s->base.pc_next;
 
     {
+    int prefixes;
     TCGMemOp aflag, dflag;
 
     s->pc_start = s->pc = pc_start;
@@ -6356,7 +6357,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xa4: /* movsS */
     case 0xa5:
         ot = mo_b_d(b, s->dflag);
-        if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+        if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_movs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_movs(s, ot);
@@ -6366,7 +6367,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xaa: /* stosS */
     case 0xab:
         ot = mo_b_d(b, s->dflag);
-        if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+        if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_stos(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_stos(s, ot);
@@ -6375,7 +6376,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xac: /* lodsS */
     case 0xad:
         ot = mo_b_d(b, s->dflag);
-        if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+        if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_lods(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_lods(s, ot);
@@ -6384,9 +6385,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xae: /* scasS */
     case 0xaf:
         ot = mo_b_d(b, s->dflag);
-        if (prefixes & PREFIX_REPNZ) {
+        if (s->prefix & PREFIX_REPNZ) {
             gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
-        } else if (prefixes & PREFIX_REPZ) {
+        } else if (s->prefix & PREFIX_REPZ) {
             gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
         } else {
             gen_scas(s, ot);
@@ -6396,9 +6397,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xa6: /* cmpsS */
     case 0xa7:
         ot = mo_b_d(b, s->dflag);
-        if (prefixes & PREFIX_REPNZ) {
+        if (s->prefix & PREFIX_REPNZ) {
             gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
-        } else if (prefixes & PREFIX_REPZ) {
+        } else if (s->prefix & PREFIX_REPZ) {
             gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
         } else {
             gen_cmps(s, ot);
@@ -6409,8 +6410,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base, 
-                     SVM_IOIO_TYPE_MASK | svm_is_rep(prefixes) | 4);
-        if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+                     SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix) | 4);
+        if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_ins(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_ins(s, ot);
@@ -6424,8 +6425,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base,
-                     svm_is_rep(prefixes) | 4);
-        if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
+                     svm_is_rep(s->prefix) | 4);
+        if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
             gen_repz_outs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_outs(s, ot);
@@ -6444,7 +6445,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         val = x86_ldub_code(env, s);
         tcg_gen_movi_tl(s->T0, val);
         gen_check_io(s, ot, pc_start - s->cs_base,
-                     SVM_IOIO_TYPE_MASK | svm_is_rep(prefixes));
+                     SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix));
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
             gen_io_start();
         }
@@ -6463,7 +6464,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         val = x86_ldub_code(env, s);
         tcg_gen_movi_tl(s->T0, val);
         gen_check_io(s, ot, pc_start - s->cs_base,
-                     svm_is_rep(prefixes));
+                     svm_is_rep(s->prefix));
         gen_op_mov_v_reg(s, ot, s->T1, R_EAX);
 
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
@@ -6483,7 +6484,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base,
-                     SVM_IOIO_TYPE_MASK | svm_is_rep(prefixes));
+                     SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix));
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
             gen_io_start();
         }
@@ -6501,7 +6502,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
         gen_check_io(s, ot, pc_start - s->cs_base,
-                     svm_is_rep(prefixes));
+                     svm_is_rep(s->prefix));
         gen_op_mov_v_reg(s, ot, s->T1, R_EAX);
 
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
@@ -6944,7 +6945,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_extu(ot, s->T0);
 
         /* Note that lzcnt and tzcnt are in different extensions.  */
-        if ((prefixes & PREFIX_REPZ)
+        if ((s->prefix & PREFIX_REPZ)
             && (b & 1
                 ? s->cpuid_ext3_features & CPUID_EXT3_ABM
                 : s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_BMI1)) {
@@ -7037,14 +7038,14 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         /* misc */
     case 0x90: /* nop */
         /* XXX: correct lock test for all insn */
-        if (prefixes & PREFIX_LOCK) {
+        if (s->prefix & PREFIX_LOCK) {
             goto illegal_op;
         }
         /* If REX_B is set, then this is xchg eax, r8d, not a nop.  */
         if (REX_B(s)) {
             goto do_xchg_reg_eax;
         }
-        if (prefixes & PREFIX_REPZ) {
+        if (s->prefix & PREFIX_REPZ) {
             gen_update_cc_op(s);
             gen_jmp_im(s, pc_start - s->cs_base);
             gen_helper_pause(cpu_env, tcg_const_i32(s->pc - pc_start));
@@ -7607,7 +7608,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
             break;
         case 0xee: /* rdpkru */
-            if (prefixes & PREFIX_LOCK) {
+            if (s->prefix & PREFIX_LOCK) {
                 goto illegal_op;
             }
             tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
@@ -7615,7 +7616,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             tcg_gen_extr_i64_tl(cpu_regs[R_EAX], cpu_regs[R_EDX], s->tmp1_i64);
             break;
         case 0xef: /* wrpkru */
-            if (prefixes & PREFIX_LOCK) {
+            if (s->prefix & PREFIX_LOCK) {
                 goto illegal_op;
             }
             tcg_gen_concat_tl_i64(s->tmp1_i64, cpu_regs[R_EAX],
@@ -7819,18 +7820,18 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (s->flags & HF_MPX_EN_MASK) {
             mod = (modrm >> 6) & 3;
             reg = ((modrm >> 3) & 7) | REX_R(s);
-            if (prefixes & PREFIX_REPZ) {
+            if (s->prefix & PREFIX_REPZ) {
                 /* bndcl */
                 if (reg >= 4
-                    || (prefixes & PREFIX_LOCK)
+                    || (s->prefix & PREFIX_LOCK)
                     || s->aflag == MO_16) {
                     goto illegal_op;
                 }
                 gen_bndck(env, s, modrm, TCG_COND_LTU, cpu_bndl[reg]);
-            } else if (prefixes & PREFIX_REPNZ) {
+            } else if (s->prefix & PREFIX_REPNZ) {
                 /* bndcu */
                 if (reg >= 4
-                    || (prefixes & PREFIX_LOCK)
+                    || (s->prefix & PREFIX_LOCK)
                     || s->aflag == MO_16) {
                     goto illegal_op;
                 }
@@ -7838,14 +7839,14 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 tcg_gen_not_i64(notu, cpu_bndu[reg]);
                 gen_bndck(env, s, modrm, TCG_COND_GTU, notu);
                 tcg_temp_free_i64(notu);
-            } else if (prefixes & PREFIX_DATA) {
+            } else if (s->prefix & PREFIX_DATA) {
                 /* bndmov -- from reg/mem */
                 if (reg >= 4 || s->aflag == MO_16) {
                     goto illegal_op;
                 }
                 if (mod == 3) {
                     int reg2 = (modrm & 7) | REX_B(s);
-                    if (reg2 >= 4 || (prefixes & PREFIX_LOCK)) {
+                    if (reg2 >= 4 || (s->prefix & PREFIX_LOCK)) {
                         goto illegal_op;
                     }
                     if (s->flags & HF_MPX_IU_MASK) {
@@ -7874,7 +7875,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 /* bndldx */
                 AddressParts a = gen_lea_modrm_0(env, s, modrm);
                 if (reg >= 4
-                    || (prefixes & PREFIX_LOCK)
+                    || (s->prefix & PREFIX_LOCK)
                     || s->aflag == MO_16
                     || a.base < -1) {
                     goto illegal_op;
@@ -7909,10 +7910,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (s->flags & HF_MPX_EN_MASK) {
             mod = (modrm >> 6) & 3;
             reg = ((modrm >> 3) & 7) | REX_R(s);
-            if (mod != 3 && (prefixes & PREFIX_REPZ)) {
+            if (mod != 3 && (s->prefix & PREFIX_REPZ)) {
                 /* bndmk */
                 if (reg >= 4
-                    || (prefixes & PREFIX_LOCK)
+                    || (s->prefix & PREFIX_LOCK)
                     || s->aflag == MO_16) {
                     goto illegal_op;
                 }
@@ -7937,22 +7938,22 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 /* bnd registers are now in-use */
                 gen_set_hflag(s, HF_MPX_IU_MASK);
                 break;
-            } else if (prefixes & PREFIX_REPNZ) {
+            } else if (s->prefix & PREFIX_REPNZ) {
                 /* bndcn */
                 if (reg >= 4
-                    || (prefixes & PREFIX_LOCK)
+                    || (s->prefix & PREFIX_LOCK)
                     || s->aflag == MO_16) {
                     goto illegal_op;
                 }
                 gen_bndck(env, s, modrm, TCG_COND_GTU, cpu_bndu[reg]);
-            } else if (prefixes & PREFIX_DATA) {
+            } else if (s->prefix & PREFIX_DATA) {
                 /* bndmov -- to reg/mem */
                 if (reg >= 4 || s->aflag == MO_16) {
                     goto illegal_op;
                 }
                 if (mod == 3) {
                     int reg2 = (modrm & 7) | REX_B(s);
-                    if (reg2 >= 4 || (prefixes & PREFIX_LOCK)) {
+                    if (reg2 >= 4 || (s->prefix & PREFIX_LOCK)) {
                         goto illegal_op;
                     }
                     if (s->flags & HF_MPX_IU_MASK) {
@@ -7979,7 +7980,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 /* bndstx */
                 AddressParts a = gen_lea_modrm_0(env, s, modrm);
                 if (reg >= 4
-                    || (prefixes & PREFIX_LOCK)
+                    || (s->prefix & PREFIX_LOCK)
                     || s->aflag == MO_16
                     || a.base < -1) {
                     goto illegal_op;
@@ -8027,7 +8028,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 ot = MO_64;
             else
                 ot = MO_32;
-            if ((prefixes & PREFIX_LOCK) && (reg == 0) &&
+            if ((s->prefix & PREFIX_LOCK) && (reg == 0) &&
                 (s->cpuid_ext3_features & CPUID_EXT3_CR8LEG)) {
                 reg = 8;
             }
@@ -8131,7 +8132,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         switch (modrm) {
         CASE_MODRM_MEM_OP(0): /* fxsave */
             if (!(s->cpuid_features & CPUID_FXSR)
-                || (prefixes & PREFIX_LOCK)) {
+                || (s->prefix & PREFIX_LOCK)) {
                 goto illegal_op;
             }
             if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
@@ -8144,7 +8145,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
         CASE_MODRM_MEM_OP(1): /* fxrstor */
             if (!(s->cpuid_features & CPUID_FXSR)
-                || (prefixes & PREFIX_LOCK)) {
+                || (s->prefix & PREFIX_LOCK)) {
                 goto illegal_op;
             }
             if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
@@ -8183,8 +8184,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
         CASE_MODRM_MEM_OP(4): /* xsave */
             if ((s->cpuid_ext_features & CPUID_EXT_XSAVE) == 0
-                || (prefixes & (PREFIX_LOCK | PREFIX_DATA
-                                | PREFIX_REPZ | PREFIX_REPNZ))) {
+                || (s->prefix & (PREFIX_LOCK | PREFIX_DATA
+                                 | PREFIX_REPZ | PREFIX_REPNZ))) {
                 goto illegal_op;
             }
             gen_lea_modrm(env, s, modrm);
@@ -8195,8 +8196,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
         CASE_MODRM_MEM_OP(5): /* xrstor */
             if ((s->cpuid_ext_features & CPUID_EXT_XSAVE) == 0
-                || (prefixes & (PREFIX_LOCK | PREFIX_DATA
-                                | PREFIX_REPZ | PREFIX_REPNZ))) {
+                || (s->prefix & (PREFIX_LOCK | PREFIX_DATA
+                                 | PREFIX_REPZ | PREFIX_REPNZ))) {
                 goto illegal_op;
             }
             gen_lea_modrm(env, s, modrm);
@@ -8211,10 +8212,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             break;
 
         CASE_MODRM_MEM_OP(6): /* xsaveopt / clwb */
-            if (prefixes & PREFIX_LOCK) {
+            if (s->prefix & PREFIX_LOCK) {
                 goto illegal_op;
             }
-            if (prefixes & PREFIX_DATA) {
+            if (s->prefix & PREFIX_DATA) {
                 /* clwb */
                 if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_CLWB)) {
                     goto illegal_op;
@@ -8224,7 +8225,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 /* xsaveopt */
                 if ((s->cpuid_ext_features & CPUID_EXT_XSAVE) == 0
                     || (s->cpuid_xsave_features & CPUID_XSAVE_XSAVEOPT) == 0
-                    || (prefixes & (PREFIX_REPZ | PREFIX_REPNZ))) {
+                    || (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ))) {
                     goto illegal_op;
                 }
                 gen_lea_modrm(env, s, modrm);
@@ -8235,10 +8236,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             break;
 
         CASE_MODRM_MEM_OP(7): /* clflush / clflushopt */
-            if (prefixes & PREFIX_LOCK) {
+            if (s->prefix & PREFIX_LOCK) {
                 goto illegal_op;
             }
-            if (prefixes & PREFIX_DATA) {
+            if (s->prefix & PREFIX_DATA) {
                 /* clflushopt */
                 if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_CLFLUSHOPT)) {
                     goto illegal_op;
@@ -8258,8 +8259,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         case 0xd0 ... 0xd7: /* wrfsbase (f3 0f ae /2) */
         case 0xd8 ... 0xdf: /* wrgsbase (f3 0f ae /3) */
             if (CODE64(s)
-                && (prefixes & PREFIX_REPZ)
-                && !(prefixes & PREFIX_LOCK)
+                && (s->prefix & PREFIX_REPZ)
+                && !(s->prefix & PREFIX_LOCK)
                 && (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_FSGSBASE)) {
                 TCGv base, treg, src, dst;
 
@@ -8288,10 +8289,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             goto unknown_op;
 
         case 0xf8: /* sfence / pcommit */
-            if (prefixes & PREFIX_DATA) {
+            if (s->prefix & PREFIX_DATA) {
                 /* pcommit */
                 if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_PCOMMIT)
-                    || (prefixes & PREFIX_LOCK)) {
+                    || (s->prefix & PREFIX_LOCK)) {
                     goto illegal_op;
                 }
                 break;
@@ -8299,21 +8300,21 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             /* fallthru */
         case 0xf9 ... 0xff: /* sfence */
             if (!(s->cpuid_features & CPUID_SSE)
-                || (prefixes & PREFIX_LOCK)) {
+                || (s->prefix & PREFIX_LOCK)) {
                 goto illegal_op;
             }
             tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);
             break;
         case 0xe8 ... 0xef: /* lfence */
             if (!(s->cpuid_features & CPUID_SSE)
-                || (prefixes & PREFIX_LOCK)) {
+                || (s->prefix & PREFIX_LOCK)) {
                 goto illegal_op;
             }
             tcg_gen_mb(TCG_MO_LD_LD | TCG_BAR_SC);
             break;
         case 0xf0 ... 0xf7: /* mfence */
             if (!(s->cpuid_features & CPUID_SSE2)
-                || (prefixes & PREFIX_LOCK)) {
+                || (s->prefix & PREFIX_LOCK)) {
                 goto illegal_op;
             }
             tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
@@ -8341,8 +8342,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_eob(s);
         break;
     case 0x1b8: /* SSE4.2 popcnt */
-        if ((prefixes & (PREFIX_REPZ | PREFIX_LOCK | PREFIX_REPNZ)) !=
-             PREFIX_REPZ)
+        if ((s->prefix & (PREFIX_REPZ | PREFIX_LOCK | PREFIX_REPNZ)) !=
+            PREFIX_REPZ)
             goto illegal_op;
         if (!(s->cpuid_ext_features & CPUID_EXT_POPCNT))
             goto illegal_op;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 06/46] target/i386: Simplify gen_exception arguments
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (4 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 05/46] target/i386: use prefix " Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 07/46] target/i386: use pc_start from DisasContext Jan Bobek
                   ` (40 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Bennée, Richard Henderson, Richard Henderson

From: Richard Henderson <rth@twiddle.net>

We can compute cur_eip from values present within DisasContext.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target/i386/translate.c | 89 ++++++++++++++++++++---------------------
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 40a4844b64..7532d65778 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -1272,10 +1272,10 @@ static void gen_helper_fp_arith_STN_ST0(int op, int opreg)
     }
 }
 
-static void gen_exception(DisasContext *s, int trapno, target_ulong cur_eip)
+static void gen_exception(DisasContext *s, int trapno)
 {
     gen_update_cc_op(s);
-    gen_jmp_im(s, cur_eip);
+    gen_jmp_im(s, s->pc_start - s->cs_base);
     gen_helper_raise_exception(cpu_env, tcg_const_i32(trapno));
     s->base.is_jmp = DISAS_NORETURN;
 }
@@ -1284,7 +1284,7 @@ static void gen_exception(DisasContext *s, int trapno, target_ulong cur_eip)
    the instruction is known, but it isn't allowed in the current cpu mode.  */
 static void gen_illegal_opcode(DisasContext *s)
 {
-    gen_exception(s, EXCP06_ILLOP, s->pc_start - s->cs_base);
+    gen_exception(s, EXCP06_ILLOP);
 }
 
 /* if d == OR_TMP0, it means memory operand (address in A0) */
@@ -3040,8 +3040,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
     [0xdf] = AESNI_OP(aeskeygenassist),
 };
 
-static void gen_sse(CPUX86State *env, DisasContext *s, int b,
-                    target_ulong pc_start)
+static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 {
     int b1, op1_offset, op2_offset, is_xmm, val;
     int modrm, mod, rm, reg;
@@ -3076,7 +3075,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     }
     /* simple MMX/SSE operation */
     if (s->flags & HF_TS_MASK) {
-        gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+        gen_exception(s, EXCP07_PREX);
         return;
     }
     if (s->flags & HF_EM_MASK) {
@@ -4515,7 +4514,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     s->vex_l = 0;
     s->vex_v = 0;
     if (sigsetjmp(s->jmpbuf, 0) != 0) {
-        gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+        gen_exception(s, EXCP0D_GPF);
         return s->pc;
     }
 
@@ -5854,7 +5853,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (s->flags & (HF_EM_MASK | HF_TS_MASK)) {
             /* if CR0.EM or CR0.TS are set, generate an FPU exception */
             /* XXX: what to do if illegal op ? */
-            gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+            gen_exception(s, EXCP07_PREX);
             break;
         }
         modrm = x86_ldub_code(env, s);
@@ -6572,7 +6571,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             set_cc_op(s, CC_OP_EFLAGS);
         } else if (s->vm86) {
             if (s->iopl != 3) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
             } else {
                 gen_helper_iret_real(cpu_env, tcg_const_i32(s->dflag - 1));
                 set_cc_op(s, CC_OP_EFLAGS);
@@ -6694,7 +6693,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x9c: /* pushf */
         gen_svm_check_intercept(s, pc_start, SVM_EXIT_PUSHF);
         if (s->vm86 && s->iopl != 3) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_update_cc_op(s);
             gen_helper_read_eflags(s->T0, cpu_env);
@@ -6704,7 +6703,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x9d: /* popf */
         gen_svm_check_intercept(s, pc_start, SVM_EXIT_POPF);
         if (s->vm86 && s->iopl != 3) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             ot = gen_pop_T0(s);
             if (s->cpl == 0) {
@@ -7021,7 +7020,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             goto illegal_op;
         val = x86_ldub_code(env, s);
         if (val == 0) {
-            gen_exception(s, EXCP00_DIVZ, pc_start - s->cs_base);
+            gen_exception(s, EXCP00_DIVZ);
         } else {
             gen_helper_aam(cpu_env, tcg_const_i32(val));
             set_cc_op(s, CC_OP_LOGICB);
@@ -7055,7 +7054,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x9b: /* fwait */
         if ((s->flags & (HF_MP_MASK | HF_TS_MASK)) ==
             (HF_MP_MASK | HF_TS_MASK)) {
-            gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+            gen_exception(s, EXCP07_PREX);
         } else {
             gen_helper_fwait(cpu_env);
         }
@@ -7066,7 +7065,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xcd: /* int N */
         val = x86_ldub_code(env, s);
         if (s->vm86 && s->iopl != 3) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_interrupt(s, val, pc_start - s->cs_base, s->pc - s->cs_base);
         }
@@ -7089,13 +7088,13 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             if (s->cpl <= s->iopl) {
                 gen_helper_cli(cpu_env);
             } else {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
             }
         } else {
             if (s->iopl == 3) {
                 gen_helper_cli(cpu_env);
             } else {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
             }
         }
         break;
@@ -7106,7 +7105,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_jmp_im(s, s->pc - s->cs_base);
             gen_eob_inhibit_irq(s, true);
         } else {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         }
         break;
     case 0x62: /* bound */
@@ -7198,7 +7197,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x130: /* wrmsr */
     case 0x132: /* rdmsr */
         if (s->cpl != 0) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_update_cc_op(s);
             gen_jmp_im(s, pc_start - s->cs_base);
@@ -7231,7 +7230,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (CODE64(s) && env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1)
             goto illegal_op;
         if (!s->pe) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_helper_sysenter(cpu_env);
             gen_eob(s);
@@ -7242,7 +7241,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (CODE64(s) && env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1)
             goto illegal_op;
         if (!s->pe) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_helper_sysexit(cpu_env, tcg_const_i32(s->dflag - 1));
             gen_eob(s);
@@ -7261,7 +7260,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x107: /* sysret */
         if (!s->pe) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_helper_sysret(cpu_env, tcg_const_i32(s->dflag - 1));
             /* condition codes are modified only in long mode */
@@ -7283,7 +7282,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0xf4: /* hlt */
         if (s->cpl != 0) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_update_cc_op(s);
             gen_jmp_im(s, pc_start - s->cs_base);
@@ -7309,7 +7308,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             if (!s->pe || s->vm86)
                 goto illegal_op;
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
             } else {
                 gen_svm_check_intercept(s, pc_start, SVM_EXIT_LDTR_WRITE);
                 gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
@@ -7330,7 +7329,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             if (!s->pe || s->vm86)
                 goto illegal_op;
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
             } else {
                 gen_svm_check_intercept(s, pc_start, SVM_EXIT_TR_WRITE);
                 gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
@@ -7446,7 +7445,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             tcg_gen_concat_tl_i64(s->tmp1_i64, cpu_regs[R_EAX],
@@ -7463,7 +7462,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_update_cc_op(s);
@@ -7488,7 +7487,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_update_cc_op(s);
@@ -7501,7 +7500,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_update_cc_op(s);
@@ -7516,7 +7515,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_update_cc_op(s);
@@ -7530,7 +7529,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_update_cc_op(s);
@@ -7554,7 +7553,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_update_cc_op(s);
@@ -7564,7 +7563,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
         CASE_MODRM_MEM_OP(2): /* lgdt */
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_svm_check_intercept(s, pc_start, SVM_EXIT_GDTR_WRITE);
@@ -7581,7 +7580,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
         CASE_MODRM_MEM_OP(3): /* lidt */
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_svm_check_intercept(s, pc_start, SVM_EXIT_IDTR_WRITE);
@@ -7626,7 +7625,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             break;
         CASE_MODRM_OP(6): /* lmsw */
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_svm_check_intercept(s, pc_start, SVM_EXIT_WRITE_CR0);
@@ -7638,7 +7637,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 
         CASE_MODRM_MEM_OP(7): /* invlpg */
             if (s->cpl != 0) {
-                gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                gen_exception(s, EXCP0D_GPF);
                 break;
             }
             gen_update_cc_op(s);
@@ -7653,7 +7652,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 #ifdef TARGET_X86_64
             if (CODE64(s)) {
                 if (s->cpl != 0) {
-                    gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+                    gen_exception(s, EXCP0D_GPF);
                 } else {
                     tcg_gen_mov_tl(s->T0, cpu_seg_base[R_GS]);
                     tcg_gen_ld_tl(cpu_seg_base[R_GS], cpu_env,
@@ -7690,7 +7689,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x108: /* invd */
     case 0x109: /* wbinvd */
         if (s->cpl != 0) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_svm_check_intercept(s, pc_start, (b & 2) ? SVM_EXIT_INVD : SVM_EXIT_WBINVD);
             /* nothing to do */
@@ -8014,7 +8013,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x120: /* mov reg, crN */
     case 0x122: /* mov crN, reg */
         if (s->cpl != 0) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             modrm = x86_ldub_code(env, s);
             /* Ignore the mod bits (assume (modrm&0xc0)==0xc0).
@@ -8071,7 +8070,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x121: /* mov reg, drN */
     case 0x123: /* mov drN, reg */
         if (s->cpl != 0) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             modrm = x86_ldub_code(env, s);
             /* Ignore the mod bits (assume (modrm&0xc0)==0xc0).
@@ -8105,7 +8104,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x106: /* clts */
         if (s->cpl != 0) {
-            gen_exception(s, EXCP0D_GPF, pc_start - s->cs_base);
+            gen_exception(s, EXCP0D_GPF);
         } else {
             gen_svm_check_intercept(s, pc_start, SVM_EXIT_WRITE_CR0);
             gen_helper_clts(cpu_env);
@@ -8136,7 +8135,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
-                gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+                gen_exception(s, EXCP07_PREX);
                 break;
             }
             gen_lea_modrm(env, s, modrm);
@@ -8149,7 +8148,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
-                gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+                gen_exception(s, EXCP07_PREX);
                 break;
             }
             gen_lea_modrm(env, s, modrm);
@@ -8161,7 +8160,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->flags & HF_TS_MASK) {
-                gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+                gen_exception(s, EXCP07_PREX);
                 break;
             }
             gen_lea_modrm(env, s, modrm);
@@ -8174,7 +8173,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (s->flags & HF_TS_MASK) {
-                gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+                gen_exception(s, EXCP07_PREX);
                 break;
             }
             gen_lea_modrm(env, s, modrm);
@@ -8377,7 +8376,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1c2:
     case 0x1c4 ... 0x1c6:
     case 0x1d0 ... 0x1fe:
-        gen_sse(env, s, b, pc_start);
+        gen_sse(env, s, b);
         break;
     default:
         goto unknown_op;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 07/46] target/i386: use pc_start from DisasContext
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (5 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 06/46] target/i386: Simplify gen_exception arguments Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 08/46] target/i386: make variable b1 const Jan Bobek
                   ` (39 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

The variable pc_start is already a member of DisasContext. Remove the
superfluous local variable.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 131 ++++++++++++++++++++--------------------
 1 file changed, 65 insertions(+), 66 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 7532d65778..b1ba2fc3e5 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4495,13 +4495,12 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     TCGMemOp ot;
     int modrm, reg, rm, mod, op, opreg, val;
     target_ulong next_eip, tval;
-    target_ulong pc_start = s->base.pc_next;
 
     {
     int prefixes;
     TCGMemOp aflag, dflag;
 
-    s->pc_start = s->pc = pc_start;
+    s->pc_start = s->pc = s->base.pc_next;
     s->override = -1;
 #ifdef TARGET_X86_64
     s->rex_x = 0;
@@ -6357,7 +6356,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xa5:
         ot = mo_b_d(b, s->dflag);
         if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-            gen_repz_movs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+            gen_repz_movs(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_movs(s, ot);
         }
@@ -6367,7 +6366,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xab:
         ot = mo_b_d(b, s->dflag);
         if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-            gen_repz_stos(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+            gen_repz_stos(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_stos(s, ot);
         }
@@ -6376,7 +6375,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xad:
         ot = mo_b_d(b, s->dflag);
         if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-            gen_repz_lods(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+            gen_repz_lods(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_lods(s, ot);
         }
@@ -6385,9 +6384,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xaf:
         ot = mo_b_d(b, s->dflag);
         if (s->prefix & PREFIX_REPNZ) {
-            gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
+            gen_repz_scas(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 1);
         } else if (s->prefix & PREFIX_REPZ) {
-            gen_repz_scas(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
+            gen_repz_scas(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 0);
         } else {
             gen_scas(s, ot);
         }
@@ -6397,9 +6396,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xa7:
         ot = mo_b_d(b, s->dflag);
         if (s->prefix & PREFIX_REPNZ) {
-            gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 1);
+            gen_repz_cmps(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 1);
         } else if (s->prefix & PREFIX_REPZ) {
-            gen_repz_cmps(s, ot, pc_start - s->cs_base, s->pc - s->cs_base, 0);
+            gen_repz_cmps(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 0);
         } else {
             gen_cmps(s, ot);
         }
@@ -6408,10 +6407,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x6d:
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
-        gen_check_io(s, ot, pc_start - s->cs_base, 
+        gen_check_io(s, ot, s->pc_start - s->cs_base,
                      SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix) | 4);
         if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-            gen_repz_ins(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+            gen_repz_ins(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_ins(s, ot);
             if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
@@ -6423,10 +6422,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x6f:
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
-        gen_check_io(s, ot, pc_start - s->cs_base,
+        gen_check_io(s, ot, s->pc_start - s->cs_base,
                      svm_is_rep(s->prefix) | 4);
         if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-            gen_repz_outs(s, ot, pc_start - s->cs_base, s->pc - s->cs_base);
+            gen_repz_outs(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base);
         } else {
             gen_outs(s, ot);
             if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
@@ -6443,7 +6442,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         ot = mo_b_d32(b, s->dflag);
         val = x86_ldub_code(env, s);
         tcg_gen_movi_tl(s->T0, val);
-        gen_check_io(s, ot, pc_start - s->cs_base,
+        gen_check_io(s, ot, s->pc_start - s->cs_base,
                      SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix));
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
             gen_io_start();
@@ -6462,7 +6461,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         ot = mo_b_d32(b, s->dflag);
         val = x86_ldub_code(env, s);
         tcg_gen_movi_tl(s->T0, val);
-        gen_check_io(s, ot, pc_start - s->cs_base,
+        gen_check_io(s, ot, s->pc_start - s->cs_base,
                      svm_is_rep(s->prefix));
         gen_op_mov_v_reg(s, ot, s->T1, R_EAX);
 
@@ -6482,7 +6481,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xed:
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
-        gen_check_io(s, ot, pc_start - s->cs_base,
+        gen_check_io(s, ot, s->pc_start - s->cs_base,
                      SVM_IOIO_TYPE_MASK | svm_is_rep(s->prefix));
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
             gen_io_start();
@@ -6500,7 +6499,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0xef:
         ot = mo_b_d32(b, s->dflag);
         tcg_gen_ext16u_tl(s->T0, cpu_regs[R_EDX]);
-        gen_check_io(s, ot, pc_start - s->cs_base,
+        gen_check_io(s, ot, s->pc_start - s->cs_base,
                      svm_is_rep(s->prefix));
         gen_op_mov_v_reg(s, ot, s->T1, R_EAX);
 
@@ -6541,7 +6540,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     do_lret:
         if (s->pe && !s->vm86) {
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_lret_protected(cpu_env, tcg_const_i32(s->dflag - 1),
                                       tcg_const_i32(val));
         } else {
@@ -6564,7 +6563,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         val = 0;
         goto do_lret;
     case 0xcf: /* iret */
-        gen_svm_check_intercept(s, pc_start, SVM_EXIT_IRET);
+        gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_IRET);
         if (!s->pe) {
             /* real mode */
             gen_helper_iret_real(cpu_env, tcg_const_i32(s->dflag - 1));
@@ -6691,7 +6690,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         /************************/
         /* flags */
     case 0x9c: /* pushf */
-        gen_svm_check_intercept(s, pc_start, SVM_EXIT_PUSHF);
+        gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_PUSHF);
         if (s->vm86 && s->iopl != 3) {
             gen_exception(s, EXCP0D_GPF);
         } else {
@@ -6701,7 +6700,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         }
         break;
     case 0x9d: /* popf */
-        gen_svm_check_intercept(s, pc_start, SVM_EXIT_POPF);
+        gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_POPF);
         if (s->vm86 && s->iopl != 3) {
             gen_exception(s, EXCP0D_GPF);
         } else {
@@ -7046,8 +7045,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         }
         if (s->prefix & PREFIX_REPZ) {
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
-            gen_helper_pause(cpu_env, tcg_const_i32(s->pc - pc_start));
+            gen_jmp_im(s, s->pc_start - s->cs_base);
+            gen_helper_pause(cpu_env, tcg_const_i32(s->pc - s->pc_start));
             s->base.is_jmp = DISAS_NORETURN;
         }
         break;
@@ -7060,27 +7059,27 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         }
         break;
     case 0xcc: /* int3 */
-        gen_interrupt(s, EXCP03_INT3, pc_start - s->cs_base, s->pc - s->cs_base);
+        gen_interrupt(s, EXCP03_INT3, s->pc_start - s->cs_base, s->pc - s->cs_base);
         break;
     case 0xcd: /* int N */
         val = x86_ldub_code(env, s);
         if (s->vm86 && s->iopl != 3) {
             gen_exception(s, EXCP0D_GPF);
         } else {
-            gen_interrupt(s, val, pc_start - s->cs_base, s->pc - s->cs_base);
+            gen_interrupt(s, val, s->pc_start - s->cs_base, s->pc - s->cs_base);
         }
         break;
     case 0xce: /* into */
         if (CODE64(s))
             goto illegal_op;
         gen_update_cc_op(s);
-        gen_jmp_im(s, pc_start - s->cs_base);
-        gen_helper_into(cpu_env, tcg_const_i32(s->pc - pc_start));
+        gen_jmp_im(s, s->pc_start - s->cs_base);
+        gen_helper_into(cpu_env, tcg_const_i32(s->pc - s->pc_start));
         break;
 #ifdef WANT_ICEBP
     case 0xf1: /* icebp (undocumented, exits to external debugger) */
-        gen_svm_check_intercept(s, pc_start, SVM_EXIT_ICEBP);
-        gen_debug(s, pc_start - s->cs_base);
+        gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_ICEBP);
+        gen_debug(s, s->pc_start - s->cs_base);
         break;
 #endif
     case 0xfa: /* cli */
@@ -7200,7 +7199,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_exception(s, EXCP0D_GPF);
         } else {
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             if (b & 2) {
                 gen_helper_rdmsr(cpu_env);
             } else {
@@ -7210,7 +7209,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x131: /* rdtsc */
         gen_update_cc_op(s);
-        gen_jmp_im(s, pc_start - s->cs_base);
+        gen_jmp_im(s, s->pc_start - s->cs_base);
         if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
             gen_io_start();
         }
@@ -7222,7 +7221,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         break;
     case 0x133: /* rdpmc */
         gen_update_cc_op(s);
-        gen_jmp_im(s, pc_start - s->cs_base);
+        gen_jmp_im(s, s->pc_start - s->cs_base);
         gen_helper_rdpmc(cpu_env);
         break;
     case 0x134: /* sysenter */
@@ -7251,8 +7250,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x105: /* syscall */
         /* XXX: is it usable in real mode ? */
         gen_update_cc_op(s);
-        gen_jmp_im(s, pc_start - s->cs_base);
-        gen_helper_syscall(cpu_env, tcg_const_i32(s->pc - pc_start));
+        gen_jmp_im(s, s->pc_start - s->cs_base);
+        gen_helper_syscall(cpu_env, tcg_const_i32(s->pc - s->pc_start));
         /* TF handling for the syscall insn is different. The TF bit is  checked
            after the syscall insn completes. This allows #DB to not be
            generated after one has entered CPL0 if TF is set in FMASK.  */
@@ -7277,7 +7276,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 #endif
     case 0x1a2: /* cpuid */
         gen_update_cc_op(s);
-        gen_jmp_im(s, pc_start - s->cs_base);
+        gen_jmp_im(s, s->pc_start - s->cs_base);
         gen_helper_cpuid(cpu_env);
         break;
     case 0xf4: /* hlt */
@@ -7285,8 +7284,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             gen_exception(s, EXCP0D_GPF);
         } else {
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
-            gen_helper_hlt(cpu_env, tcg_const_i32(s->pc - pc_start));
+            gen_jmp_im(s, s->pc_start - s->cs_base);
+            gen_helper_hlt(cpu_env, tcg_const_i32(s->pc - s->pc_start));
             s->base.is_jmp = DISAS_NORETURN;
         }
         break;
@@ -7298,7 +7297,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         case 0: /* sldt */
             if (!s->pe || s->vm86)
                 goto illegal_op;
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_LDTR_READ);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_LDTR_READ);
             tcg_gen_ld32u_tl(s->T0, cpu_env,
                              offsetof(CPUX86State, ldt.selector));
             ot = mod == 3 ? s->dflag : MO_16;
@@ -7310,7 +7309,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             if (s->cpl != 0) {
                 gen_exception(s, EXCP0D_GPF);
             } else {
-                gen_svm_check_intercept(s, pc_start, SVM_EXIT_LDTR_WRITE);
+                gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_LDTR_WRITE);
                 gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_lldt(cpu_env, s->tmp2_i32);
@@ -7319,7 +7318,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         case 1: /* str */
             if (!s->pe || s->vm86)
                 goto illegal_op;
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_TR_READ);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_TR_READ);
             tcg_gen_ld32u_tl(s->T0, cpu_env,
                              offsetof(CPUX86State, tr.selector));
             ot = mod == 3 ? s->dflag : MO_16;
@@ -7331,7 +7330,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             if (s->cpl != 0) {
                 gen_exception(s, EXCP0D_GPF);
             } else {
-                gen_svm_check_intercept(s, pc_start, SVM_EXIT_TR_WRITE);
+                gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_TR_WRITE);
                 gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_ltr(cpu_env, s->tmp2_i32);
@@ -7359,7 +7358,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         modrm = x86_ldub_code(env, s);
         switch (modrm) {
         CASE_MODRM_MEM_OP(0): /* sgdt */
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_GDTR_READ);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_GDTR_READ);
             gen_lea_modrm(env, s, modrm);
             tcg_gen_ld32u_tl(s->T0,
                              cpu_env, offsetof(CPUX86State, gdt.limit));
@@ -7377,7 +7376,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             tcg_gen_mov_tl(s->A0, cpu_regs[R_EAX]);
             gen_extu(s->aflag, s->A0);
             gen_add_A0_ds_seg(s);
@@ -7389,8 +7388,8 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
-            gen_helper_mwait(cpu_env, tcg_const_i32(s->pc - pc_start));
+            gen_jmp_im(s, s->pc_start - s->cs_base);
+            gen_helper_mwait(cpu_env, tcg_const_i32(s->pc - s->pc_start));
             gen_eob(s);
             break;
 
@@ -7415,7 +7414,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             break;
 
         CASE_MODRM_MEM_OP(1): /* sidt */
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_IDTR_READ);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_IDTR_READ);
             gen_lea_modrm(env, s, modrm);
             tcg_gen_ld32u_tl(s->T0, cpu_env, offsetof(CPUX86State, idt.limit));
             gen_op_st_v(s, MO_16, s->T0, s->A0);
@@ -7466,9 +7465,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 break;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_vmrun(cpu_env, tcg_const_i32(s->aflag - 1),
-                             tcg_const_i32(s->pc - pc_start));
+                             tcg_const_i32(s->pc - s->pc_start));
             tcg_gen_exit_tb(NULL, 0);
             s->base.is_jmp = DISAS_NORETURN;
             break;
@@ -7478,7 +7477,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_vmmcall(cpu_env);
             break;
 
@@ -7491,7 +7490,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 break;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_vmload(cpu_env, tcg_const_i32(s->aflag - 1));
             break;
 
@@ -7504,7 +7503,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 break;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_vmsave(cpu_env, tcg_const_i32(s->aflag - 1));
             break;
 
@@ -7533,7 +7532,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 break;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_clgi(cpu_env);
             break;
 
@@ -7544,7 +7543,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_skinit(cpu_env);
             break;
 
@@ -7557,7 +7556,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 break;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_helper_invlpga(cpu_env, tcg_const_i32(s->aflag - 1));
             break;
 
@@ -7566,7 +7565,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 gen_exception(s, EXCP0D_GPF);
                 break;
             }
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_GDTR_WRITE);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_GDTR_WRITE);
             gen_lea_modrm(env, s, modrm);
             gen_op_ld_v(s, MO_16, s->T1, s->A0);
             gen_add_A0_im(s, 2);
@@ -7583,7 +7582,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 gen_exception(s, EXCP0D_GPF);
                 break;
             }
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_IDTR_WRITE);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_IDTR_WRITE);
             gen_lea_modrm(env, s, modrm);
             gen_op_ld_v(s, MO_16, s->T1, s->A0);
             gen_add_A0_im(s, 2);
@@ -7596,7 +7595,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             break;
 
         CASE_MODRM_OP(4): /* smsw */
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_READ_CR0);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_READ_CR0);
             tcg_gen_ld_tl(s->T0, cpu_env, offsetof(CPUX86State, cr[0]));
             if (CODE64(s)) {
                 mod = (modrm >> 6) & 3;
@@ -7628,7 +7627,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 gen_exception(s, EXCP0D_GPF);
                 break;
             }
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_WRITE_CR0);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_WRITE_CR0);
             gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
             gen_helper_lmsw(cpu_env, s->T0);
             gen_jmp_im(s, s->pc - s->cs_base);
@@ -7641,7 +7640,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 break;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             gen_lea_modrm(env, s, modrm);
             gen_helper_invlpg(cpu_env, s->A0);
             gen_jmp_im(s, s->pc - s->cs_base);
@@ -7670,7 +7669,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             gen_update_cc_op(s);
-            gen_jmp_im(s, pc_start - s->cs_base);
+            gen_jmp_im(s, s->pc_start - s->cs_base);
             if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
                 gen_io_start();
             }
@@ -7691,7 +7690,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (s->cpl != 0) {
             gen_exception(s, EXCP0D_GPF);
         } else {
-            gen_svm_check_intercept(s, pc_start, (b & 2) ? SVM_EXIT_INVD : SVM_EXIT_WBINVD);
+            gen_svm_check_intercept(s, s->pc_start, (b & 2) ? SVM_EXIT_INVD : SVM_EXIT_WBINVD);
             /* nothing to do */
         }
         break;
@@ -8038,7 +8037,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             case 4:
             case 8:
                 gen_update_cc_op(s);
-                gen_jmp_im(s, pc_start - s->cs_base);
+                gen_jmp_im(s, s->pc_start - s->cs_base);
                 if (b & 2) {
                     if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
                         gen_io_start();
@@ -8088,14 +8087,14 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                 goto illegal_op;
             }
             if (b & 2) {
-                gen_svm_check_intercept(s, pc_start, SVM_EXIT_WRITE_DR0 + reg);
+                gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_WRITE_DR0 + reg);
                 gen_op_mov_v_reg(s, ot, s->T0, rm);
                 tcg_gen_movi_i32(s->tmp2_i32, reg);
                 gen_helper_set_dr(cpu_env, s->tmp2_i32, s->T0);
                 gen_jmp_im(s, s->pc - s->cs_base);
                 gen_eob(s);
             } else {
-                gen_svm_check_intercept(s, pc_start, SVM_EXIT_READ_DR0 + reg);
+                gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_READ_DR0 + reg);
                 tcg_gen_movi_i32(s->tmp2_i32, reg);
                 gen_helper_get_dr(s->T0, cpu_env, s->tmp2_i32);
                 gen_op_mov_reg_v(s, ot, rm, s->T0);
@@ -8106,7 +8105,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         if (s->cpl != 0) {
             gen_exception(s, EXCP0D_GPF);
         } else {
-            gen_svm_check_intercept(s, pc_start, SVM_EXIT_WRITE_CR0);
+            gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_WRITE_CR0);
             gen_helper_clts(cpu_env);
             /* abort block because static cpu state changed */
             gen_jmp_im(s, s->pc - s->cs_base);
@@ -8332,7 +8331,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
         gen_nop_modrm(env, s, modrm);
         break;
     case 0x1aa: /* rsm */
-        gen_svm_check_intercept(s, pc_start, SVM_EXIT_RSM);
+        gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_RSM);
         if (!(s->flags & HF_SMM_MASK))
             goto illegal_op;
         gen_update_cc_op(s);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 08/46] target/i386: make variable b1 const
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (6 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 07/46] target/i386: use pc_start from DisasContext Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 09/46] target/i386: make variable is_xmm const Jan Bobek
                   ` (38 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

The variable b1 does not change value once assigned. Make this fact
explicit by marking it const.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index b1ba2fc3e5..8bf39b73c4 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -3042,7 +3042,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
 
 static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 {
-    int b1, op1_offset, op2_offset, is_xmm, val;
+    int op1_offset, op2_offset, is_xmm, val;
     int modrm, mod, rm, reg;
     SSEFunc_0_epp sse_fn_epp;
     SSEFunc_0_eppi sse_fn_eppi;
@@ -3051,14 +3051,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
     TCGMemOp ot;
 
     b &= 0xff;
-    if (s->prefix & PREFIX_DATA)
-        b1 = 1;
-    else if (s->prefix & PREFIX_REPZ)
-        b1 = 2;
-    else if (s->prefix & PREFIX_REPNZ)
-        b1 = 3;
-    else
-        b1 = 0;
+    const int b1 =
+        s->prefix & PREFIX_DATA ? 1
+        : s->prefix & PREFIX_REPZ ? 2
+        : s->prefix & PREFIX_REPNZ ? 3
+        : 0;
     sse_fn_epp = sse_op_table1[b][b1];
     if (!sse_fn_epp) {
         goto unknown_op;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 09/46] target/i386: make variable is_xmm const
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (7 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 08/46] target/i386: make variable b1 const Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 10/46] target/i386: add vector register file alignment constraints Jan Bobek
                   ` (37 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

The variable is_xmm does not change value after assignment, so make
this fact explicit by marking it const.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 8bf39b73c4..c5ec309fe2 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -3042,7 +3042,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
 
 static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 {
-    int op1_offset, op2_offset, is_xmm, val;
+    int op1_offset, op2_offset, val;
     int modrm, mod, rm, reg;
     SSEFunc_0_epp sse_fn_epp;
     SSEFunc_0_eppi sse_fn_eppi;
@@ -3056,20 +3056,15 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
         : s->prefix & PREFIX_REPZ ? 2
         : s->prefix & PREFIX_REPNZ ? 3
         : 0;
+    const int is_xmm =
+        (0x10 <= b && b <= 0x5f)
+        || b == 0xc6
+        || b == 0xc2
+        || !!b1;
     sse_fn_epp = sse_op_table1[b][b1];
     if (!sse_fn_epp) {
         goto unknown_op;
     }
-    if ((b <= 0x5f && b >= 0x10) || b == 0xc6 || b == 0xc2) {
-        is_xmm = 1;
-    } else {
-        if (b1 == 0) {
-            /* MMX case */
-            is_xmm = 0;
-        } else {
-            is_xmm = 1;
-        }
-    }
     /* simple MMX/SSE operation */
     if (s->flags & HF_TS_MASK) {
         gen_exception(s, EXCP07_PREX);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 10/46] target/i386: add vector register file alignment constraints
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (8 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 09/46] target/i386: make variable is_xmm const Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 11/46] target/i386: introduce gen_(ld, st)d_env_A0 Jan Bobek
                   ` (36 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

gvec operations require that all vectors be aligned on 16-byte
boundary; make sure the MM/XMM/YMM/ZMM register file is aligned as
neccessary.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/cpu.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8b3dc5533e..cb407b86ba 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1199,9 +1199,9 @@ typedef struct CPUX86State {
     float_status mmx_status; /* for 3DNow! float ops */
     float_status sse_status;
     uint32_t mxcsr;
-    ZMMReg xmm_regs[CPU_NB_REGS == 8 ? 8 : 32];
-    ZMMReg xmm_t0;
-    MMXReg mmx_t0;
+    ZMMReg xmm_regs[CPU_NB_REGS == 8 ? 8 : 32] QEMU_ALIGNED(16);
+    ZMMReg xmm_t0 QEMU_ALIGNED(16);
+    MMXReg mmx_t0 QEMU_ALIGNED(8);
 
     XMMReg ymmh_regs[CPU_NB_REGS];
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 11/46] target/i386: introduce gen_(ld, st)d_env_A0
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (9 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 10/46] target/i386: add vector register file alignment constraints Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 12/46] target/i386: introduce gen_sse_ng Jan Bobek
                   ` (35 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Similar in spirit to the already present gen_(ld,st)(q,o)_env_A0, it
will prove useful in later commits for smaller-sized vector loads.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c5ec309fe2..258351fce3 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2652,6 +2652,18 @@ static void gen_jmp(DisasContext *s, target_ulong eip)
     gen_jmp_tb(s, eip, 0);
 }
 
+static inline void gen_ldd_env_A0(DisasContext *s, int offset)
+{
+    tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, s->mem_index, MO_LEUL);
+    tcg_gen_st_i32(s->tmp2_i32, cpu_env, offset);
+}
+
+static inline void gen_std_env_A0(DisasContext *s, int offset)
+{
+    tcg_gen_ld_i32(s->tmp2_i32, cpu_env, offset);
+    tcg_gen_qemu_st_i32(s->tmp2_i32, s->A0, s->mem_index, MO_LEUL);
+}
+
 static inline void gen_ldq_env_A0(DisasContext *s, int offset)
 {
     tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, s->mem_index, MO_LEQ);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 12/46] target/i386: introduce gen_sse_ng
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (10 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 11/46] target/i386: introduce gen_(ld, st)d_env_A0 Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 13/46] target/i386: disable unused function warning temporarily Jan Bobek
                   ` (34 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

This function serves as the point-of-intercept for all newly
implemented instructions. If no new implementation exists, fall back
to gen_sse.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 258351fce3..fdc7cb0054 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4489,6 +4489,33 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
     }
 }
 
+static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
+{
+    enum {
+        P_NP = 0,
+        P_66 = 1 << (0 + 8),
+        P_F3 = 1 << (1 + 8),
+        P_F2 = 1 << (2 + 8),
+        W_0  = 0 << (3 + 8),
+        W_1  = 1 << (3 + 8),
+        M_NA = 0,
+        M_0F = 1 << (4 + 8),
+    };
+
+    switch ((b & 0xff) | M_0F
+            | (s->prefix & PREFIX_DATA ? P_66 : 0)
+            | (s->prefix & PREFIX_REPZ ? P_F3 : 0)
+            | (s->prefix & PREFIX_REPNZ ? P_F2 : 0)
+            | (REX_W(s) > 0 ? W_1 : W_0)) {
+
+    default:
+        gen_sse(env, s, b);
+        return;
+    }
+
+    g_assert_not_reached();
+}
+
 /* convert one instruction. s->base.is_jmp is set if the translation must
    be stopped. Return the next pc value */
 static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
@@ -8379,7 +8406,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1c2:
     case 0x1c4 ... 0x1c6:
     case 0x1d0 ... 0x1fe:
-        gen_sse(env, s, b);
+        gen_sse_ng(env, s, b);
         break;
     default:
         goto unknown_op;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 13/46] target/i386: disable unused function warning temporarily
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (11 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 12/46] target/i386: introduce gen_sse_ng Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 14/46] target/i386: introduce mnemonic aliases for several gvec operations Jan Bobek
                   ` (33 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Some functions added later are generated by preprocessor macros and
end up being unused (e.g. not all operands can serve as a destination
operand). Disable unused function warnings for the new code until I
figure out how I want to solve this particular issue.

Note: This changeset is intended for development only and shall not be
included in the final patch series.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index fdc7cb0054..e9741cd7f7 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4489,6 +4489,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
     }
 }
 
+/* XXX TODO get rid of this eventually */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wunused-function"
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
@@ -4515,6 +4519,7 @@ static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 
     g_assert_not_reached();
 }
+#pragma GCC diagnostic pop
 
 /* convert one instruction. s->base.is_jmp is set if the translation must
    be stopped. Return the next pc value */
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 14/46] target/i386: introduce mnemonic aliases for several gvec operations
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (12 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 13/46] target/i386: disable unused function warning temporarily Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid Jan Bobek
                   ` (32 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

It is helpful to introduce aliases for some general gvec operations as
it makes a couple of instruction code generators simpler (added
later).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index e9741cd7f7..6296a02991 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4493,6 +4493,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wunused-function"
 
+#define tcg_gen_gvec_andn(vece, dofs, aofs, bofs, oprsz, maxsz) \
+    tcg_gen_gvec_andc(vece, dofs, bofs, aofs, oprsz, maxsz)
+#define tcg_gen_gvec_cmpeq(vece, dofs, aofs, bofs, oprsz, maxsz)        \
+    tcg_gen_gvec_cmp(TCG_COND_EQ, vece, dofs, aofs, bofs, oprsz, maxsz)
+#define tcg_gen_gvec_cmpgt(vece, dofs, aofs, bofs, oprsz, maxsz)        \
+    tcg_gen_gvec_cmp(TCG_COND_GT, vece, dofs, aofs, bofs, oprsz, maxsz)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (13 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 14/46] target/i386: introduce mnemonic aliases for several gvec operations Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15 15:01   ` Aleksandar Markovic
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 16/46] target/i386: introduce instruction operand infrastructure Jan Bobek
                   ` (31 subsequent siblings)
  46 siblings, 1 reply; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Introduce a helper function to take care of instruction CPUID checks.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 48 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 6296a02991..0cffa2226b 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4500,6 +4500,54 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 #define tcg_gen_gvec_cmpgt(vece, dofs, aofs, bofs, oprsz, maxsz)        \
     tcg_gen_gvec_cmp(TCG_COND_GT, vece, dofs, aofs, bofs, oprsz, maxsz)
 
+typedef enum {
+    CK_CPUID_MMX = 1,
+    CK_CPUID_3DNOW,
+    CK_CPUID_SSE,
+    CK_CPUID_SSE2,
+    CK_CPUID_CLFLUSH,
+    CK_CPUID_SSE3,
+    CK_CPUID_SSSE3,
+    CK_CPUID_SSE4_1,
+    CK_CPUID_SSE4_2,
+    CK_CPUID_SSE4A,
+    CK_CPUID_AVX,
+    CK_CPUID_AVX2,
+} CkCpuidFeat;
+
+static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
+{
+    switch (feat) {
+    case CK_CPUID_MMX:
+        return !(s->cpuid_features & CPUID_MMX)
+            || !(s->cpuid_ext2_features & CPUID_EXT2_MMX);
+    case CK_CPUID_3DNOW:
+        return !(s->cpuid_ext2_features & CPUID_EXT2_3DNOW);
+    case CK_CPUID_SSE:
+        return !(s->cpuid_features & CPUID_SSE);
+    case CK_CPUID_SSE2:
+        return !(s->cpuid_features & CPUID_SSE2);
+    case CK_CPUID_CLFLUSH:
+        return !(s->cpuid_features & CPUID_CLFLUSH);
+    case CK_CPUID_SSE3:
+        return !(s->cpuid_ext_features & CPUID_EXT_SSE3);
+    case CK_CPUID_SSSE3:
+        return !(s->cpuid_ext_features & CPUID_EXT_SSSE3);
+    case CK_CPUID_SSE4_1:
+        return !(s->cpuid_ext_features & CPUID_EXT_SSE41);
+    case CK_CPUID_SSE4_2:
+        return !(s->cpuid_ext_features & CPUID_EXT_SSE42);
+    case CK_CPUID_SSE4A:
+        return !(s->cpuid_ext3_features & CPUID_EXT3_SSE4A);
+    case CK_CPUID_AVX:
+        return !(s->cpuid_ext_features & CPUID_EXT_AVX);
+    case CK_CPUID_AVX2:
+        return !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2);
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 16/46] target/i386: introduce instruction operand infrastructure
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (14 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 17/46] target/i386: introduce generic operand alias Jan Bobek
                   ` (30 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

insnop_arg_t, insnop_ctxt_t and init, prepare and finalize functions
form the basis of instruction operand decoding. Introduce macros for
defining a generic instruction operand; use cases for operand decoding
will be introduced later.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 0cffa2226b..9d00b36406 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4548,6 +4548,34 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
     }
 }
 
+/*
+ * Instruction operand
+ */
+#define insnop_arg_t(opT)    insnop_ ## opT ## _arg_t
+#define insnop_ctxt_t(opT)   insnop_ ## opT ## _ctxt_t
+#define insnop_init(opT)     insnop_ ## opT ## _init
+#define insnop_prepare(opT)  insnop_ ## opT ## _prepare
+#define insnop_finalize(opT) insnop_ ## opT ## _finalize
+
+#define INSNOP_INIT(opT)                                        \
+    static int insnop_init(opT)(insnop_ctxt_t(opT) *ctxt,       \
+                                CPUX86State *env,               \
+                                DisasContext *s,                \
+                                int modrm, bool is_write)
+
+#define INSNOP_PREPARE(opT)                                             \
+    static insnop_arg_t(opT) insnop_prepare(opT)(insnop_ctxt_t(opT) *ctxt, \
+                                                 CPUX86State *env,      \
+                                                 DisasContext *s,       \
+                                                 int modrm, bool is_write)
+
+#define INSNOP_FINALIZE(opT)                                    \
+    static void insnop_finalize(opT)(insnop_ctxt_t(opT) *ctxt,  \
+                                     CPUX86State *env,          \
+                                     DisasContext *s,           \
+                                     int modrm, bool is_write,  \
+                                     insnop_arg_t(opT) arg)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 17/46] target/i386: introduce generic operand alias
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (15 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 16/46] target/i386: introduce instruction operand infrastructure Jan Bobek
@ 2019-08-15  2:08 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 18/46] target/i386: introduce generic either-or operand Jan Bobek
                   ` (29 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

It turns out it is useful to be able to declare operand name
aliases. Introduce a macro to capture this functionality.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 9d00b36406..8989e6504c 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4576,6 +4576,26 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
                                      int modrm, bool is_write,  \
                                      insnop_arg_t(opT) arg)
 
+/*
+ * Operand alias
+ */
+#define DEF_INSNOP_ALIAS(opT, opT2)                                     \
+    typedef insnop_arg_t(opT2) insnop_arg_t(opT);                       \
+    typedef insnop_ctxt_t(opT2) insnop_ctxt_t(opT);                     \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        return insnop_init(opT2)(ctxt, env, s, modrm, is_write);        \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        return insnop_prepare(opT2)(ctxt, env, s, modrm, is_write);     \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+        insnop_finalize(opT2)(ctxt, env, s, modrm, is_write, arg);      \
+    }
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 18/46] target/i386: introduce generic either-or operand
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (16 preceding siblings ...)
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 17/46] target/i386: introduce generic operand alias Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 19/46] target/i386: introduce generic load-store operand Jan Bobek
                   ` (28 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

The either-or operand attempts to decode one operand, and if it fails,
it falls back to a second operand. It is unifying, meaning that
insnop_arg_t of the second operand must be implicitly castable to
insnop_arg_t of the first operand.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 46 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 8989e6504c..a0b883c680 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4596,6 +4596,52 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
         insnop_finalize(opT2)(ctxt, env, s, modrm, is_write, arg);      \
     }
 
+/*
+ * Generic unifying either-or operand
+ */
+#define DEF_INSNOP_EITHER(opT, opT1, opT2)                              \
+    typedef insnop_arg_t(opT1) insnop_arg_t(opT);                       \
+    typedef struct {                                                    \
+        bool is_ ## opT1;                                               \
+        union {                                                         \
+            insnop_ctxt_t(opT1) ctxt_ ## opT1;                          \
+            insnop_ctxt_t(opT2) ctxt_ ## opT2;                          \
+        };                                                              \
+    } insnop_ctxt_t(opT);                                               \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        int ret = insnop_init(opT1)(&ctxt->ctxt_ ## opT1,               \
+                                    env, s, modrm, is_write);           \
+        if (!ret) {                                                     \
+            ctxt->is_ ## opT1 = 1;                                      \
+            return 0;                                                   \
+        }                                                               \
+        ret = insnop_init(opT2)(&ctxt->ctxt_ ## opT2,                   \
+                                env, s, modrm, is_write);               \
+        if (!ret) {                                                     \
+            ctxt->is_ ## opT1 = 0;                                      \
+            return 0;                                                   \
+        }                                                               \
+        return ret;                                                     \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        return (ctxt->is_ ## opT1                                       \
+                ? insnop_prepare(opT1)(&ctxt->ctxt_ ## opT1,            \
+                                       env, s, modrm, is_write)         \
+                : insnop_prepare(opT2)(&ctxt->ctxt_ ## opT2,            \
+                                       env, s, modrm, is_write));       \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+        (ctxt->is_ ## opT1                                              \
+         ? insnop_finalize(opT1)(&ctxt->ctxt_ ## opT1,                  \
+                                 env, s, modrm, is_write, arg)          \
+         : insnop_finalize(opT2)(&ctxt->ctxt_ ## opT2,                  \
+                                 env, s, modrm, is_write, arg));        \
+    }
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 19/46] target/i386: introduce generic load-store operand
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (17 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 18/46] target/i386: introduce generic either-or operand Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 20/46] target/i386: introduce tcg_temp operands Jan Bobek
                   ` (27 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

This operand attempts to capture the "indirect" or "memory" operand in
a generic way. It significatly reduces the amount code that needs to
be written in order to read operands from memory to temporary storage
and write them back.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 54 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index a0b883c680..99f46be34e 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4642,6 +4642,60 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
                                  env, s, modrm, is_write, arg));        \
     }
 
+/*
+ * Generic load-store operand
+ */
+#define insnop_ldst(opTarg, opTptr)             \
+    insnop_ldst_ ## opTarg ## opTptr
+
+#define INSNOP_LDST(opTarg, opTptr)                                     \
+    static void insnop_ldst(opTarg, opTptr)(CPUX86State *env,           \
+                                            DisasContext *s,            \
+                                            int modrm, bool is_write,   \
+                                            insnop_arg_t(opTarg) arg,   \
+                                            insnop_arg_t(opTptr) ptr)
+
+#define DEF_INSNOP_LDST(opT, opTarg, opTptr)                            \
+    typedef insnop_arg_t(opTarg) insnop_arg_t(opT);                     \
+    typedef struct {                                                    \
+        insnop_ctxt_t(opTarg) arg;                                      \
+        insnop_ctxt_t(opTptr) ptr;                                      \
+    } insnop_ctxt_t(opT);                                               \
+                                                                        \
+    /* forward declaration */                                           \
+    INSNOP_LDST(opTarg, opTptr);                                        \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        int ret = insnop_init(opTarg)(&ctxt->arg, env, s, modrm, is_write); \
+        if (!ret) {                                                     \
+            ret = insnop_init(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+        }                                                               \
+        return ret;                                                     \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        const insnop_arg_t(opTarg) arg =                                \
+            insnop_prepare(opTarg)(&ctxt->arg, env, s, modrm, is_write); \
+        if (!is_write) {                                                \
+            const insnop_arg_t(opTptr) ptr =                            \
+                insnop_prepare(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+            insnop_ldst(opTarg, opTptr)(env, s, modrm, is_write, arg, ptr); \
+            insnop_finalize(opTptr)(&ctxt->ptr, env, s, modrm, is_write, ptr); \
+        }                                                               \
+        return arg;                                                     \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+        if (is_write) {                                                 \
+            const insnop_arg_t(opTptr) ptr =                            \
+                insnop_prepare(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+            insnop_ldst(opTarg, opTptr)(env, s, modrm, is_write, arg, ptr); \
+            insnop_finalize(opTptr)(&ctxt->ptr, env, s, modrm, is_write, ptr); \
+        }                                                               \
+        insnop_finalize(opTarg)(&ctxt->arg, env, s, modrm, is_write, arg); \
+    }
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 20/46] target/i386: introduce tcg_temp operands
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (18 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 19/46] target/i386: introduce generic load-store operand Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 21/46] target/i386: introduce modrm operand Jan Bobek
                   ` (26 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

TCG temporary operands allocate a 32-bit or 64-bit TCG temporary, and
later automatically free it.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 44 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 99f46be34e..7fc5149d29 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4696,6 +4696,50 @@ static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
         insnop_finalize(opTarg)(&ctxt->arg, env, s, modrm, is_write, arg); \
     }
 
+/*
+ * tcg_temp_i32
+ *
+ * Operand which allocates a 32-bit TCG temporary and frees it
+ * automatically after use.
+ */
+typedef TCGv_i32 insnop_arg_t(tcg_temp_i32);
+typedef struct {} insnop_ctxt_t(tcg_temp_i32);
+
+INSNOP_INIT(tcg_temp_i32)
+{
+    return 0;
+}
+INSNOP_PREPARE(tcg_temp_i32)
+{
+    return tcg_temp_new_i32();
+}
+INSNOP_FINALIZE(tcg_temp_i32)
+{
+    tcg_temp_free_i32(arg);
+}
+
+/*
+ * tcg_temp_i64
+ *
+ * Operand which allocates a 64-bit TCG temporary and frees it
+ * automatically after use.
+ */
+typedef TCGv_i64 insnop_arg_t(tcg_temp_i64);
+typedef struct {} insnop_ctxt_t(tcg_temp_i64);
+
+INSNOP_INIT(tcg_temp_i64)
+{
+    return 0;
+}
+INSNOP_PREPARE(tcg_temp_i64)
+{
+    return tcg_temp_new_i64();
+}
+INSNOP_FINALIZE(tcg_temp_i64)
+{
+    tcg_temp_free_i64(arg);
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 21/46] target/i386: introduce modrm operand
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (19 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 20/46] target/i386: introduce tcg_temp operands Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 22/46] target/i386: introduce operands for decoding modrm fields Jan Bobek
                   ` (25 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

This permits the ModR/M byte to be passed raw into the code generator,
effectively allowing to short-circuit the operand decoding mechanism
and do the decoding work manually in the code generator.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 7fc5149d29..25c25a30fb 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4740,6 +4740,26 @@ INSNOP_FINALIZE(tcg_temp_i64)
     tcg_temp_free_i64(arg);
 }
 
+/*
+ * modrm
+ *
+ * Operand whose value is the ModR/M byte.
+ */
+typedef int insnop_arg_t(modrm);
+typedef struct {} insnop_ctxt_t(modrm);
+
+INSNOP_INIT(modrm)
+{
+    return 0;
+}
+INSNOP_PREPARE(modrm)
+{
+    return modrm;
+}
+INSNOP_FINALIZE(modrm)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 22/46] target/i386: introduce operands for decoding modrm fields
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (20 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 21/46] target/i386: introduce modrm operand Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 23/46] target/i386: introduce operand for direct-only r/m field Jan Bobek
                   ` (24 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

The old code uses bitshifts and bitwise-and all over the place for
decoding ModR/M fields. Avoid doing that by introducing proper
decoding operands.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 62 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 25c25a30fb..e4515e81df 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4760,6 +4760,68 @@ INSNOP_FINALIZE(modrm)
 {
 }
 
+/*
+ * modrm_mod
+ *
+ * Operand whose value is the MOD field of the ModR/M byte.
+ */
+typedef int insnop_arg_t(modrm_mod);
+typedef struct {} insnop_ctxt_t(modrm_mod);
+
+INSNOP_INIT(modrm_mod)
+{
+    return 0;
+}
+INSNOP_PREPARE(modrm_mod)
+{
+    return (modrm >> 6) & 3;
+}
+INSNOP_FINALIZE(modrm_mod)
+{
+}
+
+/*
+ * modrm_reg
+ *
+ * Operand whose value is the REG field of the ModR/M byte, extended
+ * with the REX.R bit if REX prefix is present.
+ */
+typedef int insnop_arg_t(modrm_reg);
+typedef struct {} insnop_ctxt_t(modrm_reg);
+
+INSNOP_INIT(modrm_reg)
+{
+    return 0;
+}
+INSNOP_PREPARE(modrm_reg)
+{
+    return ((modrm >> 3) & 7) | REX_R(s);
+}
+INSNOP_FINALIZE(modrm_reg)
+{
+}
+
+/*
+ * modrm_rm
+ *
+ * Operand whose value is the RM field of the ModR/M byte, extended
+ * with the REX.B bit if REX prefix is present.
+ */
+typedef int insnop_arg_t(modrm_rm);
+typedef struct {} insnop_ctxt_t(modrm_rm);
+
+INSNOP_INIT(modrm_rm)
+{
+    return 0;
+}
+INSNOP_PREPARE(modrm_rm)
+{
+    return (modrm & 7) | REX_B(s);
+}
+INSNOP_FINALIZE(modrm_rm)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 23/46] target/i386: introduce operand for direct-only r/m field
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (21 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 22/46] target/i386: introduce operands for decoding modrm fields Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 24/46] target/i386: introduce operand vex_v Jan Bobek
                   ` (23 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Many operands can only decode successfully if the ModR/M byte has the
direct form (i.e. MOD=3). Capture this common aspect by introducing a
special direct-only r/m field operand.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index e4515e81df..c918065b96 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4822,6 +4822,43 @@ INSNOP_FINALIZE(modrm_rm)
 {
 }
 
+/*
+ * modrm_rm_direct
+ *
+ * Equivalent of modrm_rm, but only decodes successfully if
+ * the ModR/M byte has the direct form (i.e. MOD=3).
+ */
+typedef insnop_arg_t(modrm_rm) insnop_arg_t(modrm_rm_direct);
+typedef struct {
+    insnop_ctxt_t(modrm_rm) rm;
+} insnop_ctxt_t(modrm_rm_direct);
+
+INSNOP_INIT(modrm_rm_direct)
+{
+    int ret;
+    insnop_ctxt_t(modrm_mod) modctxt;
+
+    ret = insnop_init(modrm_mod)(&modctxt, env, s, modrm, 0);
+    if (!ret) {
+        const int mod = insnop_prepare(modrm_mod)(&modctxt, env, s, modrm, 0);
+        if (mod == 3) {
+            ret = insnop_init(modrm_rm)(&ctxt->rm, env, s, modrm, is_write);
+        } else {
+            ret = 1;
+        }
+        insnop_finalize(modrm_mod)(&modctxt, env, s, modrm, 0, mod);
+    }
+    return ret;
+}
+INSNOP_PREPARE(modrm_rm_direct)
+{
+    return insnop_prepare(modrm_rm)(&ctxt->rm, env, s, modrm, is_write);
+}
+INSNOP_FINALIZE(modrm_rm_direct)
+{
+    insnop_finalize(modrm_rm)(&ctxt->rm, env, s, modrm, is_write, arg);
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 24/46] target/i386: introduce operand vex_v
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (22 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 23/46] target/i386: introduce operand for direct-only r/m field Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 25/46] target/i386: introduce Ib (immediate) operand Jan Bobek
                   ` (22 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

This operand yields value of the VEX.vvvv field.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c918065b96..4562a097fa 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4859,6 +4859,26 @@ INSNOP_FINALIZE(modrm_rm_direct)
     insnop_finalize(modrm_rm)(&ctxt->rm, env, s, modrm, is_write, arg);
 }
 
+/*
+ * vex_v
+ *
+ * Operand whose value is the VVVV field of the VEX prefix.
+ */
+typedef int insnop_arg_t(vex_v);
+typedef struct {} insnop_ctxt_t(vex_v);
+
+INSNOP_INIT(vex_v)
+{
+    return !(s->prefix & PREFIX_VEX);
+}
+INSNOP_PREPARE(vex_v)
+{
+    return s->vex_v;
+}
+INSNOP_FINALIZE(vex_v)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 25/46] target/i386: introduce Ib (immediate) operand
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (23 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 24/46] target/i386: introduce operand vex_v Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 26/46] target/i386: introduce M* (memptr) operands Jan Bobek
                   ` (21 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Introduce the immediate-byte operand, which loads a byte from the
instruction stream and passes its value as the operand.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 4562a097fa..78e8f7a212 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4879,6 +4879,24 @@ INSNOP_FINALIZE(vex_v)
 {
 }
 
+/*
+ * Immediate operand
+ */
+typedef uint8_t insnop_arg_t(Ib);
+typedef struct {} insnop_ctxt_t(Ib);
+
+INSNOP_INIT(Ib)
+{
+    return 0;
+}
+INSNOP_PREPARE(Ib)
+{
+    return x86_ldub_code(env, s);
+}
+INSNOP_FINALIZE(Ib)
+{
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 26/46] target/i386: introduce M* (memptr) operands
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (24 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 25/46] target/i386: introduce Ib (immediate) operand Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 27/46] target/i386: introduce G*, R*, E* (general register) operands Jan Bobek
                   ` (20 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

The memory-pointer operand decodes the indirect form of ModR/M byte,
loads the effective address into a register and passes that register
as the operand.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 78e8f7a212..2374876b38 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4897,6 +4897,42 @@ INSNOP_FINALIZE(Ib)
 {
 }
 
+/*
+ * Memory-pointer operand
+ */
+typedef TCGv insnop_arg_t(M);
+typedef struct {} insnop_ctxt_t(M);
+
+INSNOP_INIT(M)
+{
+    int ret;
+    insnop_ctxt_t(modrm_mod) modctxt;
+
+    ret = insnop_init(modrm_mod)(&modctxt, env, s, modrm, is_write);
+    if (!ret) {
+        const int mod =
+            insnop_prepare(modrm_mod)(&modctxt, env, s, modrm, is_write);
+        ret = !(mod != 3);
+        insnop_finalize(modrm_mod)(&modctxt, env, s, modrm, is_write, mod);
+    }
+    return ret;
+}
+INSNOP_PREPARE(M)
+{
+    gen_lea_modrm(env, s, modrm);
+    return s->A0;
+}
+INSNOP_FINALIZE(M)
+{
+}
+
+DEF_INSNOP_ALIAS(Mb, M)
+DEF_INSNOP_ALIAS(Mw, M)
+DEF_INSNOP_ALIAS(Md, M)
+DEF_INSNOP_ALIAS(Mq, M)
+DEF_INSNOP_ALIAS(Mdq, M)
+DEF_INSNOP_ALIAS(Mqq, M)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 27/46] target/i386: introduce G*, R*, E* (general register) operands
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (25 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 26/46] target/i386: introduce M* (memptr) operands Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 28/46] target/i386: introduce P*, N*, Q* (MMX) operands Jan Bobek
                   ` (19 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

These address the general-purpose register file. The corresponding
32-bit or 64-bit register is passed as the operand value.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 78 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 2374876b38..779b692942 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4933,6 +4933,84 @@ DEF_INSNOP_ALIAS(Mq, M)
 DEF_INSNOP_ALIAS(Mdq, M)
 DEF_INSNOP_ALIAS(Mqq, M)
 
+/*
+ * 32-bit general register operands
+ */
+DEF_INSNOP_LDST(Gd, tcg_temp_i32, modrm_reg)
+DEF_INSNOP_LDST(Rd, tcg_temp_i32, modrm_rm_direct)
+
+INSNOP_LDST(tcg_temp_i32, modrm_reg)
+{
+    assert(0 <= ptr && ptr < CPU_NB_REGS);
+    if (is_write) {
+        tcg_gen_extu_i32_tl(cpu_regs[ptr], arg);
+    } else {
+        tcg_gen_trunc_tl_i32(arg, cpu_regs[ptr]);
+    }
+}
+INSNOP_LDST(tcg_temp_i32, modrm_rm_direct)
+{
+    insnop_ldst(tcg_temp_i32, modrm_reg)(env, s, modrm, is_write, arg, ptr);
+}
+
+DEF_INSNOP_LDST(MEd, tcg_temp_i32, Md)
+DEF_INSNOP_EITHER(Ed, Rd, MEd)
+DEF_INSNOP_LDST(MRdMw, tcg_temp_i32, Mw)
+DEF_INSNOP_EITHER(RdMw, Rd, MRdMw)
+
+INSNOP_LDST(tcg_temp_i32, Md)
+{
+    if (is_write) {
+        tcg_gen_qemu_st_i32(arg, ptr, s->mem_index, MO_LEUL);
+    } else {
+        tcg_gen_qemu_ld_i32(arg, ptr, s->mem_index, MO_LEUL);
+    }
+}
+INSNOP_LDST(tcg_temp_i32, Mw)
+{
+    if (is_write) {
+        tcg_gen_qemu_st_i32(arg, ptr, s->mem_index, MO_LEUW);
+    } else {
+        tcg_gen_qemu_ld_i32(arg, ptr, s->mem_index, MO_LEUW);
+    }
+}
+
+/*
+ * 64-bit general register operands
+ */
+DEF_INSNOP_LDST(Gq, tcg_temp_i64, modrm_reg)
+DEF_INSNOP_LDST(Rq, tcg_temp_i64, modrm_rm_direct)
+
+INSNOP_LDST(tcg_temp_i64, modrm_reg)
+{
+#ifdef TARGET_X86_64
+    assert(0 <= ptr && ptr < CPU_NB_REGS);
+    if (is_write) {
+        tcg_gen_mov_i64(cpu_regs[ptr], arg);
+    } else {
+        tcg_gen_mov_i64(arg, cpu_regs[ptr]);
+    }
+#else /* !TARGET_X86_64 */
+    g_assert_not_reached();
+#endif /* !TARGET_X86_64 */
+}
+INSNOP_LDST(tcg_temp_i64, modrm_rm_direct)
+{
+    insnop_ldst(tcg_temp_i64, modrm_reg)(env, s, modrm, is_write, arg, ptr);
+}
+
+DEF_INSNOP_LDST(MEq, tcg_temp_i64, Mq)
+DEF_INSNOP_EITHER(Eq, Rq, MEq)
+
+INSNOP_LDST(tcg_temp_i64, Mq)
+{
+    if (is_write) {
+        tcg_gen_qemu_st_i64(arg, ptr, s->mem_index, MO_LEQ);
+    } else {
+        tcg_gen_qemu_ld_i64(arg, ptr, s->mem_index, MO_LEQ);
+    }
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 28/46] target/i386: introduce P*, N*, Q* (MMX) operands
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (26 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 27/46] target/i386: introduce G*, R*, E* (general register) operands Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 29/46] target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands Jan Bobek
                   ` (18 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

These address the MMX-technology register file; the corresponding
cpu_env offset is passed as the operand value. Notably, offset of the
entire register is pased at all times, regardless of the operand-size
suffix.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 79 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 779b692942..bd3c7f9356 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5011,6 +5011,85 @@ INSNOP_LDST(tcg_temp_i64, Mq)
     }
 }
 
+/*
+ * MMX-technology register operands
+ */
+#define DEF_INSNOP_MM(opT, opTmmid)                                     \
+    typedef unsigned int insnop_arg_t(opT);                             \
+    typedef struct {                                                    \
+        insnop_ctxt_t(opTmmid) mmid;                                    \
+    } insnop_ctxt_t(opT);                                               \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        return insnop_init(opTmmid)(&ctxt->mmid, env, s, modrm, is_write); \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        const insnop_arg_t(opTmmid) mmid =                              \
+            insnop_prepare(opTmmid)(&ctxt->mmid, env, s, modrm, is_write); \
+        const insnop_arg_t(opT) arg =                                   \
+            offsetof(CPUX86State, fpregs[mmid & 7].mmx);                \
+        insnop_finalize(opTmmid)(&ctxt->mmid, env, s, modrm, is_write, mmid); \
+        return arg;                                                     \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+    }
+
+typedef unsigned int insnop_arg_t(mm_t0);
+typedef struct {} insnop_ctxt_t(mm_t0);
+
+INSNOP_INIT(mm_t0)
+{
+    return 0;
+}
+INSNOP_PREPARE(mm_t0)
+{
+    return offsetof(CPUX86State, mmx_t0);
+}
+INSNOP_FINALIZE(mm_t0)
+{
+}
+
+DEF_INSNOP_MM(P, modrm_reg)
+DEF_INSNOP_ALIAS(Pd, P)
+DEF_INSNOP_ALIAS(Pq, P)
+
+DEF_INSNOP_MM(N, modrm_rm_direct)
+DEF_INSNOP_ALIAS(Nd, N)
+DEF_INSNOP_ALIAS(Nq, N)
+
+DEF_INSNOP_LDST(MQd, mm_t0, Md)
+DEF_INSNOP_LDST(MQq, mm_t0, Mq)
+DEF_INSNOP_EITHER(Qd, Nd, MQd)
+DEF_INSNOP_EITHER(Qq, Nq, MQq)
+
+INSNOP_LDST(mm_t0, Md)
+{
+    const insnop_arg_t(mm_t0) ofs =
+        offsetof(MMXReg, MMX_L(0));
+
+    assert(ptr == s->A0);
+    if (is_write) {
+        gen_std_env_A0(s, arg + ofs);
+    } else {
+        gen_ldd_env_A0(s, arg + ofs);
+    }
+}
+INSNOP_LDST(mm_t0, Mq)
+{
+    const insnop_arg_t(mm_t0) ofs =
+        offsetof(MMXReg, MMX_Q(0));
+
+    assert(ptr == s->A0);
+    if (is_write) {
+        gen_stq_env_A0(s, arg + ofs);
+    } else {
+        gen_ldq_env_A0(s, arg + ofs);
+    }
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 29/46] target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (27 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 28/46] target/i386: introduce P*, N*, Q* (MMX) operands Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 30/46] target/i386: introduce code generators Jan Bobek
                   ` (17 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

These address the SSE/AVX-technology register file. Offset of the
entire corresponding register is passed as the operand value,
regardless of operand-size suffix.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 117 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 117 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index bd3c7f9356..69233fd0f8 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4930,6 +4930,7 @@ DEF_INSNOP_ALIAS(Mb, M)
 DEF_INSNOP_ALIAS(Mw, M)
 DEF_INSNOP_ALIAS(Md, M)
 DEF_INSNOP_ALIAS(Mq, M)
+DEF_INSNOP_ALIAS(Mhq, M)
 DEF_INSNOP_ALIAS(Mdq, M)
 DEF_INSNOP_ALIAS(Mqq, M)
 
@@ -5090,6 +5091,122 @@ INSNOP_LDST(mm_t0, Mq)
     }
 }
 
+/*
+ * SSE/AVX-technology registers
+ */
+#define DEF_INSNOP_XMM(opT, opTxmmid)                                   \
+    typedef unsigned int insnop_arg_t(opT);                             \
+    typedef struct {                                                    \
+        insnop_ctxt_t(opTxmmid) xmmid;                                  \
+    } insnop_ctxt_t(opT);                                               \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        return insnop_init(opTxmmid)(&ctxt->xmmid, env, s, modrm, is_write); \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        const insnop_arg_t(opTxmmid) xmmid =                            \
+            insnop_prepare(opTxmmid)(&ctxt->xmmid, env, s, modrm, is_write); \
+        const insnop_arg_t(opT) arg =                                   \
+            offsetof(CPUX86State, xmm_regs[xmmid]);                     \
+        insnop_finalize(opTxmmid)(&ctxt->xmmid, env, s,                 \
+                                  modrm, is_write, xmmid);              \
+        return arg;                                                     \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+    }
+
+typedef unsigned int insnop_arg_t(xmm_t0);
+typedef struct {} insnop_ctxt_t(xmm_t0);
+
+INSNOP_INIT(xmm_t0)
+{
+    return 0;
+}
+INSNOP_PREPARE(xmm_t0)
+{
+    return offsetof(CPUX86State, xmm_t0);
+}
+INSNOP_FINALIZE(xmm_t0)
+{
+}
+
+DEF_INSNOP_XMM(V, modrm_reg)
+DEF_INSNOP_ALIAS(Vd, V)
+DEF_INSNOP_ALIAS(Vq, V)
+DEF_INSNOP_ALIAS(Vdq, V)
+DEF_INSNOP_ALIAS(Vqq, V)
+
+DEF_INSNOP_XMM(U, modrm_rm_direct)
+DEF_INSNOP_ALIAS(Ud, U)
+DEF_INSNOP_ALIAS(Uq, U)
+DEF_INSNOP_ALIAS(Udq, U)
+DEF_INSNOP_ALIAS(Uqq, U)
+
+DEF_INSNOP_XMM(H, vex_v)
+DEF_INSNOP_ALIAS(Hd, H)
+DEF_INSNOP_ALIAS(Hq, H)
+DEF_INSNOP_ALIAS(Hdq, H)
+DEF_INSNOP_ALIAS(Hqq, H)
+
+DEF_INSNOP_LDST(MUd, xmm_t0, Md)
+DEF_INSNOP_LDST(MUq, xmm_t0, Mq)
+DEF_INSNOP_LDST(MWdq, xmm_t0, Mdq)
+DEF_INSNOP_LDST(MUdqMhq, xmm_t0, Mhq)
+DEF_INSNOP_EITHER(Wd, Ud, MUd)
+DEF_INSNOP_EITHER(Wq, Uq, MUq)
+DEF_INSNOP_EITHER(Wdq, Udq, MWdq)
+DEF_INSNOP_EITHER(UdqMq, Udq, MUq)
+DEF_INSNOP_EITHER(UdqMhq, Udq, MUdqMhq)
+
+INSNOP_LDST(xmm_t0, Md)
+{
+    const insnop_arg_t(xmm_t0) ofs =
+        offsetof(ZMMReg, ZMM_L(0));
+
+    assert(ptr == s->A0);
+    if (is_write) {
+        gen_std_env_A0(s, arg + ofs);
+    } else {
+        gen_ldd_env_A0(s, arg + ofs);
+    }
+}
+INSNOP_LDST(xmm_t0, Mq)
+{
+    const insnop_arg_t(xmm_t0) ofs =
+        offsetof(ZMMReg, ZMM_Q(0));
+
+    assert(ptr == s->A0);
+    if (is_write) {
+        gen_stq_env_A0(s, arg + ofs);
+    } else {
+        gen_ldq_env_A0(s, arg + ofs);
+    }
+}
+INSNOP_LDST(xmm_t0, Mdq)
+{
+    assert(ptr == s->A0);
+    if (is_write) {
+        gen_sto_env_A0(s, arg);
+    } else {
+        gen_ldo_env_A0(s, arg);
+    }
+}
+INSNOP_LDST(xmm_t0, Mhq)
+{
+    const insnop_arg_t(xmm_t0) ofs =
+        offsetof(ZMMReg, ZMM_Q(1));
+
+    assert(ptr == s->A0);
+    if (is_write) {
+        gen_stq_env_A0(s, arg + ofs);
+    } else {
+        gen_ldq_env_A0(s, arg + ofs);
+    }
+}
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 30/46] target/i386: introduce code generators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (28 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 29/46] target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 31/46] target/i386: introduce helper-based code generator macros Jan Bobek
                   ` (16 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

In this context, "code generators" are functions that receive decoded
instruction operands and emit TCG ops implementing the correct
instruction functionality. Introduce the naming macros first, actual
generator macros will be added later.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 69233fd0f8..b5f609e147 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5207,6 +5207,44 @@ INSNOP_LDST(xmm_t0, Mhq)
     }
 }
 
+/*
+ * Code generators
+ */
+#define gen_insn(mnem, argc, ...)               \
+    glue(gen_insn, argc)(mnem, ## __VA_ARGS__)
+#define gen_insn0(mnem)                         \
+    gen_ ## mnem ## _0
+#define gen_insn1(mnem, opT1)                   \
+    gen_ ## mnem ## _1 ## opT1
+#define gen_insn2(mnem, opT1, opT2)             \
+    gen_ ## mnem ## _2 ## opT1 ## opT2
+#define gen_insn3(mnem, opT1, opT2, opT3)       \
+    gen_ ## mnem ## _3 ## opT1 ## opT2 ## opT3
+#define gen_insn4(mnem, opT1, opT2, opT3, opT4)         \
+    gen_ ## mnem ## _4 ## opT1 ## opT2 ## opT3 ## opT4
+
+#define GEN_INSN0(mnem)                         \
+    static void gen_insn0(mnem)(                \
+        CPUX86State *env, DisasContext *s)
+#define GEN_INSN1(mnem, opT1)                   \
+    static void gen_insn1(mnem, opT1)(          \
+        CPUX86State *env, DisasContext *s,      \
+        insnop_arg_t(opT1) arg1)
+#define GEN_INSN2(mnem, opT1, opT2)                             \
+    static void gen_insn2(mnem, opT1, opT2)(                    \
+        CPUX86State *env, DisasContext *s,                      \
+        insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2)
+#define GEN_INSN3(mnem, opT1, opT2, opT3)                       \
+    static void gen_insn3(mnem, opT1, opT2, opT3)(              \
+        CPUX86State *env, DisasContext *s,                      \
+        insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,       \
+        insnop_arg_t(opT3) arg3)
+#define GEN_INSN4(mnem, opT1, opT2, opT3, opT4)                 \
+    static void gen_insn4(mnem, opT1, opT2, opT3, opT4)(        \
+        CPUX86State *env, DisasContext *s,                      \
+        insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,       \
+        insnop_arg_t(opT3) arg3, insnop_arg_t(opT4) arg4)
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 31/46] target/i386: introduce helper-based code generator macros
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (29 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 30/46] target/i386: introduce code generators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 32/46] target/i386: introduce gvec-based " Jan Bobek
                   ` (15 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Code generators defined using these macros rely on a helper function
(as emitted by gen_helper_*).

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 106 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index b5f609e147..b28d651b82 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5245,6 +5245,112 @@ INSNOP_LDST(xmm_t0, Mhq)
         insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,       \
         insnop_arg_t(opT3) arg3, insnop_arg_t(opT4) arg4)
 
+#define DEF_GEN_INSN0_HELPER(mnem, helper)      \
+    GEN_INSN0(mnem)                             \
+    {                                           \
+        gen_helper_ ## helper(cpu_env);         \
+    }
+
+#define DEF_GEN_INSN2_HELPER_EPD(mnem, helper, opT1, opT2)      \
+    GEN_INSN2(mnem, opT1, opT2)                                 \
+    {                                                           \
+        const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();           \
+                                                                \
+        tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);              \
+        gen_helper_ ## helper(cpu_env, arg1_ptr, arg2);         \
+                                                                \
+        tcg_temp_free_ptr(arg1_ptr);                            \
+    }
+#define DEF_GEN_INSN2_HELPER_DEP(mnem, helper, opT1, opT2)      \
+    GEN_INSN2(mnem, opT1, opT2)                                 \
+    {                                                           \
+        const TCGv_ptr arg2_ptr = tcg_temp_new_ptr();           \
+                                                                \
+        tcg_gen_addi_ptr(arg2_ptr, cpu_env, arg2);              \
+        gen_helper_ ## helper(arg1, cpu_env, arg2_ptr);         \
+                                                                \
+        tcg_temp_free_ptr(arg2_ptr);                            \
+    }
+#ifdef TARGET_X86_64
+#define DEF_GEN_INSN2_HELPER_EPQ(mnem, helper, opT1, opT2)      \
+    DEF_GEN_INSN2_HELPER_EPD(mnem, helper, opT1, opT2)
+#define DEF_GEN_INSN2_HELPER_QEP(mnem, helper, opT1, opT2)      \
+    DEF_GEN_INSN2_HELPER_DEP(mnem, helper, opT1, opT2)
+#else /* !TARGET_X86_64 */
+#define DEF_GEN_INSN2_HELPER_EPQ(mnem, helper, opT1, opT2)      \
+    GEN_INSN2(mnem, opT1, opT2)                                 \
+    {                                                           \
+        g_assert_not_reached();                                 \
+    }
+#define DEF_GEN_INSN2_HELPER_QEP(mnem, helper, opT1, opT2)      \
+    GEN_INSN2(mnem, opT1, opT2)                                 \
+    {                                                           \
+        g_assert_not_reached();                                 \
+    }
+#endif /* !TARGET_X86_64 */
+#define DEF_GEN_INSN2_HELPER_EPP(mnem, helper, opT1, opT2)      \
+    GEN_INSN2(mnem, opT1, opT2)                                 \
+    {                                                           \
+        const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();           \
+        const TCGv_ptr arg2_ptr = tcg_temp_new_ptr();           \
+                                                                \
+        tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);              \
+        tcg_gen_addi_ptr(arg2_ptr, cpu_env, arg2);              \
+        gen_helper_ ## helper(cpu_env, arg1_ptr, arg2_ptr);     \
+                                                                \
+        tcg_temp_free_ptr(arg1_ptr);                            \
+        tcg_temp_free_ptr(arg2_ptr);                            \
+    }
+
+#define DEF_GEN_INSN3_HELPER_EPP(mnem, helper, opT1, opT2, opT3)        \
+    GEN_INSN3(mnem, opT1, opT2, opT3)                                   \
+    {                                                                   \
+        const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();                   \
+        const TCGv_ptr arg3_ptr = tcg_temp_new_ptr();                   \
+                                                                        \
+        assert(arg1 == arg2);                                           \
+        tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);                      \
+        tcg_gen_addi_ptr(arg3_ptr, cpu_env, arg3);                      \
+        gen_helper_ ## helper(cpu_env, arg1_ptr, arg3_ptr);             \
+                                                                        \
+        tcg_temp_free_ptr(arg1_ptr);                                    \
+        tcg_temp_free_ptr(arg3_ptr);                                    \
+    }
+#define DEF_GEN_INSN3_HELPER_PPI(mnem, helper, opT1, opT2, opT3)        \
+    GEN_INSN3(mnem, opT1, opT2, opT3)                                   \
+    {                                                                   \
+        const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();                   \
+        const TCGv_ptr arg2_ptr = tcg_temp_new_ptr();                   \
+        const TCGv_i32 arg3_r32 = tcg_temp_new_i32();                   \
+                                                                        \
+        tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);                      \
+        tcg_gen_addi_ptr(arg2_ptr, cpu_env, arg2);                      \
+        tcg_gen_movi_i32(arg3_r32, arg3);                               \
+        gen_helper_ ## helper(arg1_ptr, arg2_ptr, arg3_r32);            \
+                                                                        \
+        tcg_temp_free_ptr(arg1_ptr);                                    \
+        tcg_temp_free_ptr(arg2_ptr);                                    \
+        tcg_temp_free_i32(arg3_r32);                                    \
+    }
+
+#define DEF_GEN_INSN4_HELPER_PPI(mnem, helper, opT1, opT2, opT3, opT4)  \
+    GEN_INSN4(mnem, opT1, opT2, opT3, opT4)                             \
+    {                                                                   \
+        const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();                   \
+        const TCGv_ptr arg3_ptr = tcg_temp_new_ptr();                   \
+        const TCGv_i32 arg4_r32 = tcg_temp_new_i32();                   \
+                                                                        \
+        assert(arg1 == arg2);                                           \
+        tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);                      \
+        tcg_gen_addi_ptr(arg3_ptr, cpu_env, arg3);                      \
+        tcg_gen_movi_i32(arg4_r32, arg4);                               \
+        gen_helper_ ## helper(arg1_ptr, arg3_ptr, arg4_r32);            \
+                                                                        \
+        tcg_temp_free_ptr(arg1_ptr);                                    \
+        tcg_temp_free_ptr(arg3_ptr);                                    \
+        tcg_temp_free_i32(arg4_r32);                                    \
+    }
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 32/46] target/i386: introduce gvec-based code generator macros
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (30 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 31/46] target/i386: introduce helper-based code generator macros Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 33/46] target/i386: introduce sse-opcode.inc.h Jan Bobek
                   ` (14 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Code generators defined using these macros rely on a gvec operation
(i.e. tcg_gen_gvec_*).

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index b28d651b82..75652afb45 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -23,6 +23,7 @@
 #include "disas/disas.h"
 #include "exec/exec-all.h"
 #include "tcg-op.h"
+#include "tcg-op-gvec.h"
 #include "exec/cpu_ldst.h"
 #include "exec/translator.h"
 
@@ -5351,6 +5352,18 @@ INSNOP_LDST(xmm_t0, Mhq)
         tcg_temp_free_i32(arg4_r32);                                    \
     }
 
+#define DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece, oprsz, maxsz)  \
+    GEN_INSN2(mnem, opT1, opT2)                                         \
+    {                                                                   \
+        tcg_gen_gvec_ ## gvec(vece, arg1, arg2, oprsz, maxsz);          \
+    }
+
+#define DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece, oprsz, maxsz) \
+    GEN_INSN3(mnem, opT1, opT2, opT3)                                   \
+    {                                                                   \
+        tcg_gen_gvec_ ## gvec(vece, arg1, arg2, arg3, oprsz, maxsz);    \
+    }
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 33/46] target/i386: introduce sse-opcode.inc.h
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (31 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 32/46] target/i386: introduce gvec-based " Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 34/46] target/i386: introduce instruction translator macros Jan Bobek
                   ` (13 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

This header is intended to eventually list all supported instructions
along with some useful details (e.g. mnemonics, opcode, operands etc.)
It shall be used (along with some preprocessor magic) anytime we need
to automatically generate code for every instruction.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/sse-opcode.inc.h | 69 ++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)
 create mode 100644 target/i386/sse-opcode.inc.h

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
new file mode 100644
index 0000000000..c5e81a6a80
--- /dev/null
+++ b/target/i386/sse-opcode.inc.h
@@ -0,0 +1,69 @@
+#define FMTI____     (0, 0, 0, )
+#define FMTI__R__    (1, 1, 0, r)
+#define FMTI__RR__   (2, 2, 0, rr)
+#define FMTI__W__    (1, 0, 1, w)
+#define FMTI__WR__   (2, 1, 1, wr)
+#define FMTI__WRR__  (3, 2, 1, wrr)
+#define FMTI__WRRR__ (4, 3, 1, wrrr)
+
+#define FMTI__(prop, fmti) FMTI_ ## prop ## __ fmti
+
+#define FMTI_ARGC__(argc, argc_rd, argc_wr, lower)    argc
+#define FMTI_ARGC_RD__(argc, argc_rd, argc_wr, lower) argc_rd
+#define FMTI_ARGC_WR__(argc, argc_rd, argc_wr, lower) argc_wr
+#define FMTI_LOWER__(argc, argc_rd, argc_wr, lower)   lower
+
+#define FMT_ARGC(fmt)    FMTI__(ARGC, FMTI__ ## fmt ## __)
+#define FMT_ARGC_RD(fmt) FMTI__(ARGC_RD, FMTI__ ## fmt ## __)
+#define FMT_ARGC_WR(fmt) FMTI__(ARGC_WR, FMTI__ ## fmt ## __)
+#define FMT_LOWER(fmt)   FMTI__(LOWER, FMTI__ ## fmt ## __)
+#define FMT_UPPER(fmt)   fmt
+
+#ifndef OPCODE
+#   define OPCODE(mnem, opcode, feat, fmt, ...)
+#endif /* OPCODE */
+
+#ifndef OPCODE_GRP
+#   define OPCODE_GRP(grpname, opcode)
+#endif /* OPCODE_GRP */
+
+#ifndef OPCODE_GRP_BEGIN
+#   define OPCODE_GRP_BEGIN(grpname)
+#endif /* OPCODE_GRP_BEGIN */
+
+#ifndef OPCODE_GRPMEMB
+#   define OPCODE_GRPMEMB(grpname, mnem, opcode, feat, fmt, ...)
+#endif /* OPCODE_GRPMEMB */
+
+#ifndef OPCODE_GRP_END
+#   define OPCODE_GRP_END(grpname)
+#endif /* OPCODE_GRP_END */
+
+#undef FMTI____
+#undef FMTI__R__
+#undef FMTI__RR__
+#undef FMTI__W__
+#undef FMTI__WR__
+#undef FMTI__WRR__
+#undef FMTI__WRRR__
+
+#undef FMTI__
+
+#undef FMTI_ARGC__
+#undef FMTI_ARGC_RD__
+#undef FMTI_ARGC_WR__
+#undef FMTI_LOWER__
+
+#undef FMT_ARGC
+#undef FMT_ARGC_RD
+#undef FMT_ARGC_WR
+#undef FMT_LOWER
+#undef FMT_UPPER
+
+#undef LEG
+#undef VEX
+#undef OPCODE
+#undef OPCODE_GRP
+#undef OPCODE_GRP_BEGIN
+#undef OPCODE_GRPMEMB
+#undef OPCODE_GRP_END
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 34/46] target/i386: introduce instruction translator macros
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (32 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 33/46] target/i386: introduce sse-opcode.inc.h Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 35/46] target/i386: introduce MMX translators Jan Bobek
                   ` (12 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Instruction "translators" are responsible for decoding and loading
instruction operands, calling the passed-in code generator, and
storing the operands back (if applicable). Once a translator returns,
the instruction has been translated to TCG ops, hence the name.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 237 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 237 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 75652afb45..76c27d0380 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5364,6 +5364,228 @@ INSNOP_LDST(xmm_t0, Mhq)
         tcg_gen_gvec_ ## gvec(vece, arg1, arg2, arg3, oprsz, maxsz);    \
     }
 
+/*
+ * Instruction translators
+ */
+#define translate_insn(argc, ...)               \
+    glue(translate_insn, argc)(__VA_ARGS__)
+#define translate_insn0()                       \
+    translate_insn_0
+#define translate_insn1(opT1)                   \
+    translate_insn_1 ## opT1
+#define translate_insn2(opT1, opT2)             \
+    translate_insn_2 ## opT1 ## opT2
+#define translate_insn3(opT1, opT2, opT3)       \
+    translate_insn_3 ## opT1 ## opT2 ## opT3
+#define translate_insn4(opT1, opT2, opT3, opT4)         \
+    translate_insn_4 ## opT1 ## opT2 ## opT3 ## opT4
+#define translate_group(grpname)                \
+    translate_group_ ## grpname
+
+static void translate_insn0()(
+    CPUX86State *env, DisasContext *s, int modrm,
+    int ck_cpuid_feat, unsigned int argc_wr,
+    void (*gen_insn_fp)(CPUX86State *, DisasContext *))
+{
+    if (ck_cpuid(env, s, ck_cpuid_feat)) {
+        gen_illegal_opcode(s);
+        return;
+    }
+
+    (*gen_insn_fp)(env, s);
+}
+
+#define DEF_TRANSLATE_INSN1(opT1)                                       \
+    static void translate_insn1(opT1)(                                  \
+        CPUX86State *env, DisasContext *s, int modrm,                   \
+        int ck_cpuid_feat, unsigned int argc_wr,                        \
+        void (*gen_insn1_fp)(CPUX86State *, DisasContext *,             \
+                             insnop_arg_t(opT1)))                       \
+    {                                                                   \
+        insnop_ctxt_t(opT1) ctxt1;                                      \
+                                                                        \
+        const bool is_write1 = (1 <= argc_wr);                          \
+                                                                        \
+        int ret = ck_cpuid(env, s, ck_cpuid_feat);                      \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT1)(&ctxt1, env, s, modrm, is_write1);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            const insnop_arg_t(opT1) arg1 =                             \
+                insnop_prepare(opT1)(&ctxt1, env, s, modrm, is_write1); \
+                                                                        \
+            (*gen_insn1_fp)(env, s, arg1);                              \
+                                                                        \
+            insnop_finalize(opT1)(&ctxt1, env, s, modrm, is_write1, arg1); \
+        } else {                                                        \
+            gen_illegal_opcode(s);                                      \
+        }                                                               \
+    }
+
+#define DEF_TRANSLATE_INSN2(opT1, opT2)                                 \
+    static void translate_insn2(opT1, opT2)(                            \
+        CPUX86State *env, DisasContext *s, int modrm,                   \
+        int ck_cpuid_feat, unsigned int argc_wr,                        \
+        void (*gen_insn2_fp)(CPUX86State *, DisasContext *,             \
+                             insnop_arg_t(opT1), insnop_arg_t(opT2)))   \
+    {                                                                   \
+        insnop_ctxt_t(opT1) ctxt1;                                      \
+        insnop_ctxt_t(opT2) ctxt2;                                      \
+                                                                        \
+        const bool is_write1 = (1 <= argc_wr);                          \
+        const bool is_write2 = (2 <= argc_wr);                          \
+                                                                        \
+        int ret = ck_cpuid(env, s, ck_cpuid_feat);                      \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT1)(&ctxt1, env, s, modrm, is_write1);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT2)(&ctxt2, env, s, modrm, is_write2);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            const insnop_arg_t(opT1) arg1 =                             \
+                insnop_prepare(opT1)(&ctxt1, env, s, modrm, is_write1); \
+            const insnop_arg_t(opT2) arg2 =                             \
+                insnop_prepare(opT2)(&ctxt2, env, s, modrm, is_write2); \
+                                                                        \
+            (*gen_insn2_fp)(env, s, arg1, arg2);                        \
+                                                                        \
+            insnop_finalize(opT1)(&ctxt1, env, s, modrm, is_write1, arg1); \
+            insnop_finalize(opT2)(&ctxt2, env, s, modrm, is_write2, arg2); \
+        } else {                                                        \
+            gen_illegal_opcode(s);                                      \
+        }                                                               \
+    }
+
+#define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)                           \
+    static void translate_insn3(opT1, opT2, opT3)(                      \
+        CPUX86State *env, DisasContext *s, int modrm,                   \
+        int ck_cpuid_feat, unsigned int argc_wr,                        \
+        void (*gen_insn3_fp)(CPUX86State *, DisasContext *,             \
+                             insnop_arg_t(opT1), insnop_arg_t(opT2),    \
+                             insnop_arg_t(opT3)))                       \
+    {                                                                   \
+        insnop_ctxt_t(opT1) ctxt1;                                      \
+        insnop_ctxt_t(opT2) ctxt2;                                      \
+        insnop_ctxt_t(opT3) ctxt3;                                      \
+                                                                        \
+        const bool is_write1 = (1 <= argc_wr);                          \
+        const bool is_write2 = (2 <= argc_wr);                          \
+        const bool is_write3 = (3 <= argc_wr);                          \
+                                                                        \
+        int ret = ck_cpuid(env, s, ck_cpuid_feat);                      \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT1)(&ctxt1, env, s, modrm, is_write1);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT2)(&ctxt2, env, s, modrm, is_write2);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT3)(&ctxt3, env, s, modrm, is_write3);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            const insnop_arg_t(opT1) arg1 =                             \
+                insnop_prepare(opT1)(&ctxt1, env, s, modrm, is_write1); \
+            const insnop_arg_t(opT2) arg2 =                             \
+                insnop_prepare(opT2)(&ctxt2, env, s, modrm, is_write2); \
+            const insnop_arg_t(opT3) arg3 =                             \
+                insnop_prepare(opT3)(&ctxt3, env, s, modrm, is_write3); \
+                                                                        \
+            (*gen_insn3_fp)(env, s, arg1, arg2, arg3);                  \
+                                                                        \
+            insnop_finalize(opT1)(&ctxt1, env, s, modrm, is_write1, arg1); \
+            insnop_finalize(opT2)(&ctxt2, env, s, modrm, is_write2, arg2); \
+            insnop_finalize(opT3)(&ctxt3, env, s, modrm, is_write3, arg3); \
+        } else {                                                        \
+            gen_illegal_opcode(s);                                      \
+        }                                                               \
+    }
+
+#define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4)                     \
+    static void translate_insn4(opT1, opT2, opT3, opT4)(                \
+        CPUX86State *env, DisasContext *s, int modrm,                   \
+        int ck_cpuid_feat, unsigned int argc_wr,                        \
+        void (*gen_insn4_fp)(CPUX86State *, DisasContext *,             \
+                             insnop_arg_t(opT1), insnop_arg_t(opT2),    \
+                             insnop_arg_t(opT3), insnop_arg_t(opT4)))   \
+    {                                                                   \
+        insnop_ctxt_t(opT1) ctxt1;                                      \
+        insnop_ctxt_t(opT2) ctxt2;                                      \
+        insnop_ctxt_t(opT3) ctxt3;                                      \
+        insnop_ctxt_t(opT4) ctxt4;                                      \
+                                                                        \
+        const bool is_write1 = (1 <= argc_wr);                          \
+        const bool is_write2 = (2 <= argc_wr);                          \
+        const bool is_write3 = (3 <= argc_wr);                          \
+        const bool is_write4 = (4 <= argc_wr);                          \
+                                                                        \
+        int ret = ck_cpuid(env, s, ck_cpuid_feat);                      \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT1)(&ctxt1, env, s, modrm, is_write1);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT2)(&ctxt2, env, s, modrm, is_write2);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT3)(&ctxt3, env, s, modrm, is_write3);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            ret = insnop_init(opT4)(&ctxt4, env, s, modrm, is_write4);  \
+        }                                                               \
+        if (!ret) {                                                     \
+            const insnop_arg_t(opT1) arg1 =                             \
+                insnop_prepare(opT1)(&ctxt1, env, s, modrm, is_write1); \
+            const insnop_arg_t(opT2) arg2 =                             \
+                insnop_prepare(opT2)(&ctxt2, env, s, modrm, is_write2); \
+            const insnop_arg_t(opT3) arg3 =                             \
+                insnop_prepare(opT3)(&ctxt3, env, s, modrm, is_write3); \
+            const insnop_arg_t(opT4) arg4 =                             \
+                insnop_prepare(opT4)(&ctxt4, env, s, modrm, is_write4); \
+                                                                        \
+            (*gen_insn4_fp)(env, s, arg1, arg2, arg3, arg4);            \
+                                                                        \
+            insnop_finalize(opT1)(&ctxt1, env, s, modrm, is_write1, arg1); \
+            insnop_finalize(opT2)(&ctxt2, env, s, modrm, is_write2, arg2); \
+            insnop_finalize(opT3)(&ctxt3, env, s, modrm, is_write3, arg3); \
+            insnop_finalize(opT4)(&ctxt4, env, s, modrm, is_write4, arg4); \
+        } else {                                                        \
+            gen_illegal_opcode(s);                                      \
+        }                                                               \
+    }
+
+#define OPCODE_GRP_BEGIN(grpname)                                       \
+    static void translate_group(grpname)(                               \
+        CPUX86State *env, DisasContext *s, int modrm)                   \
+    {                                                                   \
+        insnop_ctxt_t(modrm_reg) regctxt;                               \
+                                                                        \
+        int ret = insnop_init(modrm_reg)(&regctxt, env, s, modrm, 0);   \
+        if (!ret) {                                                     \
+            const insnop_arg_t(modrm_reg) reg =                         \
+                insnop_prepare(modrm_reg)(&regctxt, env, s, modrm, 0);  \
+                                                                        \
+            switch (reg & 7) {
+#define OPCODE_GRPMEMB(grpname, mnem, opcode, feat, fmt, ...)           \
+            case opcode:                                                \
+                translate_insn(FMT_ARGC(fmt), ## __VA_ARGS__)(          \
+                    env, s, modrm, CK_CPUID_ ## feat, FMT_ARGC_WR(fmt), \
+                    gen_insn(mnem, FMT_ARGC(fmt), ## __VA_ARGS__));     \
+                break;
+#define OPCODE_GRP_END(grpname)                                         \
+            default:                                                    \
+                ret = 1;                                                \
+                break;                                                  \
+            }                                                           \
+                                                                        \
+            insnop_finalize(modrm_reg)(&regctxt, env, s, modrm, 0, reg); \
+        }                                                               \
+                                                                        \
+        if (ret) {                                                      \
+            gen_illegal_opcode(s);                                      \
+        }                                                               \
+    }
+#include "sse-opcode.inc.h"
+
 static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
 {
     enum {
@@ -5383,6 +5605,21 @@ static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
             | (s->prefix & PREFIX_REPNZ ? P_F2 : 0)
             | (REX_W(s) > 0 ? W_1 : W_0)) {
 
+#define LEG(p, m, w, opcode)                    \
+    case opcode | M_ ## m | P_ ## p | W_ ## w:
+#define OPCODE(mnem, cases, feat, fmt, ...)                             \
+    cases {                                                             \
+        const int modrm = 0 < FMT_ARGC(fmt) ? x86_ldub_code(env, s) : -1; \
+        translate_insn(FMT_ARGC(fmt), ## __VA_ARGS__)(                  \
+            env, s, modrm, CK_CPUID_ ## feat, FMT_ARGC_WR(fmt),         \
+            gen_insn(mnem, FMT_ARGC(fmt), ## __VA_ARGS__));             \
+    } return;
+#define OPCODE_GRP(grpname, cases)                  \
+    cases {                                         \
+        const int modrm = x86_ldub_code(env, s);    \
+        translate_group(grpname)(env, s, modrm);    \
+    } return;
+#include "sse-opcode.inc.h"
     default:
         gen_sse(env, s, b);
         return;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 35/46] target/i386: introduce MMX translators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (33 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 34/46] target/i386: introduce instruction translator macros Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 36/46] target/i386: introduce MMX code generators Jan Bobek
                   ` (11 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Use the translator macros to define instruction translators required
by MMX instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 76c27d0380..4fecb0d240 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5457,6 +5457,15 @@ static void translate_insn0()(
         }                                                               \
     }
 
+DEF_TRANSLATE_INSN2(Ed, Pq)
+DEF_TRANSLATE_INSN2(Eq, Pq)
+DEF_TRANSLATE_INSN2(Gd, Nq)
+DEF_TRANSLATE_INSN2(Gq, Nq)
+DEF_TRANSLATE_INSN2(Pq, Ed)
+DEF_TRANSLATE_INSN2(Pq, Eq)
+DEF_TRANSLATE_INSN2(Pq, Qq)
+DEF_TRANSLATE_INSN2(Qq, Pq)
+
 #define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)                           \
     static void translate_insn3(opT1, opT2, opT3)(                      \
         CPUX86State *env, DisasContext *s, int modrm,                   \
@@ -5501,6 +5510,13 @@ static void translate_insn0()(
         }                                                               \
     }
 
+DEF_TRANSLATE_INSN3(Gd, Nq, Ib)
+DEF_TRANSLATE_INSN3(Gq, Nq, Ib)
+DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
+DEF_TRANSLATE_INSN3(Pq, Pq, Qd)
+DEF_TRANSLATE_INSN3(Pq, Pq, Qq)
+DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
+
 #define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4)                     \
     static void translate_insn4(opT1, opT2, opT3, opT4)(                \
         CPUX86State *env, DisasContext *s, int modrm,                   \
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 36/46] target/i386: introduce MMX code generators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (34 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 35/46] target/i386: introduce MMX translators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 37/46] target/i386: introduce MMX instructions to sse-opcode.inc.h Jan Bobek
                   ` (10 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Define code generators required for MMX instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 111 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 4fecb0d240..a02e9cd0d2 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5357,12 +5357,123 @@ INSNOP_LDST(xmm_t0, Mhq)
     {                                                                   \
         tcg_gen_gvec_ ## gvec(vece, arg1, arg2, oprsz, maxsz);          \
     }
+#define DEF_GEN_INSN2_GVEC_MM(mnem, gvec, opT1, opT2, vece)       \
+    DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece,              \
+                       sizeof(MMXReg), sizeof(MMXReg))
 
 #define DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece, oprsz, maxsz) \
     GEN_INSN3(mnem, opT1, opT2, opT3)                                   \
     {                                                                   \
         tcg_gen_gvec_ ## gvec(vece, arg1, arg2, arg3, oprsz, maxsz);    \
     }
+#define DEF_GEN_INSN3_GVEC_MM(mnem, gvec, opT1, opT2, opT3, vece)   \
+    DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece,          \
+                       sizeof(MMXReg), sizeof(MMXReg))
+
+GEN_INSN2(movq, Pq, Eq);        /* forward declaration */
+GEN_INSN2(movd, Pq, Ed)
+{
+    const insnop_arg_t(Eq) arg2_r64 = tcg_temp_new_i64();
+    tcg_gen_extu_i32_i64(arg2_r64, arg2);
+    gen_insn2(movq, Pq, Eq)(env, s, arg1, arg2_r64);
+    tcg_temp_free_i64(arg2_r64);
+}
+
+GEN_INSN2(movd, Ed, Pq)
+{
+    const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_L(0));
+    tcg_gen_ld_i32(arg1, cpu_env, arg2 + ofs);
+}
+
+GEN_INSN2(movq, Pq, Eq)
+{
+    const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_Q(0));
+    tcg_gen_st_i64(arg2, cpu_env, arg1 + ofs);
+}
+
+GEN_INSN2(movq, Eq, Pq)
+{
+    const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_Q(0));
+    tcg_gen_ld_i64(arg1, cpu_env, arg2 + ofs);
+}
+
+DEF_GEN_INSN2_GVEC_MM(movq, mov, Pq, Qq, MO_64)
+DEF_GEN_INSN2_GVEC_MM(movq, mov, Qq, Pq, MO_64)
+
+DEF_GEN_INSN3_GVEC_MM(paddb, add, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(paddw, add, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(paddd, add, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(paddsb, ssadd, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(paddsw, ssadd, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(paddusb, usadd, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(paddusw, usadd, Pq, Pq, Qq, MO_16)
+
+DEF_GEN_INSN3_GVEC_MM(psubb, sub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(psubw, sub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(psubd, sub, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(psubsb, sssub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(psubsw, sssub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(psubusb, ussub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(psubusw, ussub, Pq, Pq, Qq, MO_16)
+
+DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmulhw, pmulhw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmaddwd, pmaddwd_mmx, Pq, Pq, Qq)
+
+DEF_GEN_INSN3_GVEC_MM(pcmpeqb, cmpeq, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(pcmpeqw, cmpeq, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(pcmpeqd, cmpeq, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(pcmpgtb, cmpgt, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(pcmpgtw, cmpgt, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_MM(pcmpgtd, cmpgt, Pq, Pq, Qq, MO_32)
+
+DEF_GEN_INSN3_GVEC_MM(pand, and, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_MM(pandn, andn, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_MM(por, or, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_MM(pxor, xor, Pq, Pq, Qq, MO_64)
+
+DEF_GEN_INSN3_HELPER_EPP(psllw, psllw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pslld, pslld_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psllq, psllq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrlw, psrlw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrld, psrld_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrlq, psrlq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psraw, psraw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrad, psrad_mmx, Pq, Pq, Qq)
+
+#define DEF_GEN_PSHIFT_IMM_MM(mnem, opT1, opT2)                         \
+    GEN_INSN3(mnem, opT1, opT2, Ib)                                     \
+    {                                                                   \
+        const uint64_t arg3_ui64 = (uint8_t)arg3;                       \
+        const insnop_arg_t(Eq) arg3_r64 = s->tmp1_i64;                  \
+        const insnop_arg_t(Qq) arg3_mm =                                \
+            offsetof(CPUX86State, mmx_t0.MMX_Q(0));                     \
+                                                                        \
+        tcg_gen_movi_i64(arg3_r64, arg3_ui64);                          \
+        gen_insn2(movq, Pq, Eq)(env, s, arg3_mm, arg3_r64);             \
+        gen_insn3(mnem, Pq, Pq, Qq)(env, s, arg1, arg2, arg3_mm);       \
+    }
+
+DEF_GEN_PSHIFT_IMM_MM(psllw, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(pslld, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(psllq, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(psrlw, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(psrld, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(psrlq, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(psraw, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_MM(psrad, Nq, Nq)
+
+DEF_GEN_INSN3_HELPER_EPP(packsswb, packsswb_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(packssdw, packssdw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(packuswb, packuswb_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(punpcklbw, punpcklbw_mmx, Pq, Pq, Qd)
+DEF_GEN_INSN3_HELPER_EPP(punpcklwd, punpcklwd_mmx, Pq, Pq, Qd)
+DEF_GEN_INSN3_HELPER_EPP(punpckldq, punpckldq_mmx, Pq, Pq, Qd)
+DEF_GEN_INSN3_HELPER_EPP(punpckhbw, punpckhbw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(punpckhwd, punpckhwd_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(punpckhdq, punpckhdq_mmx, Pq, Pq, Qq)
+
+DEF_GEN_INSN0_HELPER(emms, emms)
 
 /*
  * Instruction translators
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 37/46] target/i386: introduce MMX instructions to sse-opcode.inc.h
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (35 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 36/46] target/i386: introduce MMX code generators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 38/46] target/i386: introduce SSE translators Jan Bobek
                   ` (9 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Add all MMX instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/sse-opcode.inc.h | 131 +++++++++++++++++++++++++++++++++++
 1 file changed, 131 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index c5e81a6a80..36963e5a7c 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -39,6 +39,137 @@
 #   define OPCODE_GRP_END(grpname)
 #endif /* OPCODE_GRP_END */
 
+/* NP 0F 6E /r: MOVD mm,r/m32 */
+OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
+/* NP 0F 7E /r: MOVD r/m32,mm */
+OPCODE(movd, LEG(NP, 0F, 0, 0x7e), MMX, WR, Ed, Pq)
+/* NP REX.W + 0F 6E /r: MOVQ mm,r/m64 */
+OPCODE(movq, LEG(NP, 0F, 1, 0x6e), MMX, WR, Pq, Eq)
+/* NP REX.W + 0F 7E /r: MOVQ r/m64,mm */
+OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
+/* NP 0F 6F /r: MOVQ mm, mm/m64 */
+OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
+/* NP 0F 7F /r: MOVQ mm/m64, mm */
+OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
+/* NP 0F FC /r: PADDB mm, mm/m64 */
+OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F FD /r: PADDW mm, mm/m64 */
+OPCODE(paddw, LEG(NP, 0F, 0, 0xfd), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F FE /r: PADDD mm, mm/m64 */
+OPCODE(paddd, LEG(NP, 0F, 0, 0xfe), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F EC /r: PADDSB mm, mm/m64 */
+OPCODE(paddsb, LEG(NP, 0F, 0, 0xec), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F ED /r: PADDSW mm, mm/m64 */
+OPCODE(paddsw, LEG(NP, 0F, 0, 0xed), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DC /r: PADDUSB mm,mm/m64 */
+OPCODE(paddusb, LEG(NP, 0F, 0, 0xdc), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DD /r: PADDUSW mm,mm/m64 */
+OPCODE(paddusw, LEG(NP, 0F, 0, 0xdd), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F8 /r: PSUBB mm, mm/m64 */
+OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F9 /r: PSUBW mm, mm/m64 */
+OPCODE(psubw, LEG(NP, 0F, 0, 0xf9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F FA /r: PSUBD mm, mm/m64 */
+OPCODE(psubd, LEG(NP, 0F, 0, 0xfa), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E8 /r: PSUBSB mm, mm/m64 */
+OPCODE(psubsb, LEG(NP, 0F, 0, 0xe8), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E9 /r: PSUBSW mm, mm/m64 */
+OPCODE(psubsw, LEG(NP, 0F, 0, 0xe9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D8 /r: PSUBUSB mm, mm/m64 */
+OPCODE(psubusb, LEG(NP, 0F, 0, 0xd8), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D9 /r: PSUBUSW mm, mm/m64 */
+OPCODE(psubusw, LEG(NP, 0F, 0, 0xd9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D5 /r: PMULLW mm, mm/m64 */
+OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E5 /r: PMULHW mm, mm/m64 */
+OPCODE(pmulhw, LEG(NP, 0F, 0, 0xe5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F5 /r: PMADDWD mm, mm/m64 */
+OPCODE(pmaddwd, LEG(NP, 0F, 0, 0xf5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 74 /r: PCMPEQB mm,mm/m64 */
+OPCODE(pcmpeqb, LEG(NP, 0F, 0, 0x74), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 75 /r: PCMPEQW mm,mm/m64 */
+OPCODE(pcmpeqw, LEG(NP, 0F, 0, 0x75), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 76 /r: PCMPEQD mm,mm/m64 */
+OPCODE(pcmpeqd, LEG(NP, 0F, 0, 0x76), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 64 /r: PCMPGTB mm,mm/m64 */
+OPCODE(pcmpgtb, LEG(NP, 0F, 0, 0x64), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 65 /r: PCMPGTW mm,mm/m64 */
+OPCODE(pcmpgtw, LEG(NP, 0F, 0, 0x65), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 66 /r: PCMPGTD mm,mm/m64 */
+OPCODE(pcmpgtd, LEG(NP, 0F, 0, 0x66), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DB /r: PAND mm, mm/m64 */
+OPCODE(pand, LEG(NP, 0F, 0, 0xdb), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F DF /r: PANDN mm, mm/m64 */
+OPCODE(pandn, LEG(NP, 0F, 0, 0xdf), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F EB /r: POR mm, mm/m64 */
+OPCODE(por, LEG(NP, 0F, 0, 0xeb), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F EF /r: PXOR mm, mm/m64 */
+OPCODE(pxor, LEG(NP, 0F, 0, 0xef), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F1 /r: PSLLW mm, mm/m64 */
+OPCODE(psllw, LEG(NP, 0F, 0, 0xf1), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F2 /r: PSLLD mm, mm/m64 */
+OPCODE(pslld, LEG(NP, 0F, 0, 0xf2), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F F3 /r: PSLLQ mm, mm/m64 */
+OPCODE(psllq, LEG(NP, 0F, 0, 0xf3), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D1 /r: PSRLW mm, mm/m64 */
+OPCODE(psrlw, LEG(NP, 0F, 0, 0xd1), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D2 /r: PSRLD mm, mm/m64 */
+OPCODE(psrld, LEG(NP, 0F, 0, 0xd2), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F D3 /r: PSRLQ mm, mm/m64 */
+OPCODE(psrlq, LEG(NP, 0F, 0, 0xd3), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E1 /r: PSRAW mm,mm/m64 */
+OPCODE(psraw, LEG(NP, 0F, 0, 0xe1), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E2 /r: PSRAD mm,mm/m64 */
+OPCODE(psrad, LEG(NP, 0F, 0, 0xe2), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 63 /r: PACKSSWB mm1, mm2/m64 */
+OPCODE(packsswb, LEG(NP, 0F, 0, 0x63), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 6B /r: PACKSSDW mm1, mm2/m64 */
+OPCODE(packssdw, LEG(NP, 0F, 0, 0x6b), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 67 /r: PACKUSWB mm, mm/m64 */
+OPCODE(packuswb, LEG(NP, 0F, 0, 0x67), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 68 /r: PUNPCKHBW mm, mm/m64 */
+OPCODE(punpckhbw, LEG(NP, 0F, 0, 0x68), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 69 /r: PUNPCKHWD mm, mm/m64 */
+OPCODE(punpckhwd, LEG(NP, 0F, 0, 0x69), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 6A /r: PUNPCKHDQ mm, mm/m64 */
+OPCODE(punpckhdq, LEG(NP, 0F, 0, 0x6a), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 60 /r: PUNPCKLBW mm, mm/m32 */
+OPCODE(punpcklbw, LEG(NP, 0F, 0, 0x60), MMX, WRR, Pq, Pq, Qd)
+/* NP 0F 61 /r: PUNPCKLWD mm, mm/m32 */
+OPCODE(punpcklwd, LEG(NP, 0F, 0, 0x61), MMX, WRR, Pq, Pq, Qd)
+/* NP 0F 62 /r: PUNPCKLDQ mm, mm/m32 */
+OPCODE(punpckldq, LEG(NP, 0F, 0, 0x62), MMX, WRR, Pq, Pq, Qd)
+/* NP 0F 77: EMMS */
+OPCODE(emms, LEG(NP, 0F, 0, 0x77), MMX, )
+
+OPCODE_GRP(grp12_LEG_NP, LEG(NP, 0F, 0, 0x71))
+OPCODE_GRP_BEGIN(grp12_LEG_NP)
+    /* NP 0F 71 /6 ib: PSLLW mm1, imm8 */
+    OPCODE_GRPMEMB(grp12_LEG_NP, psllw, 6, MMX, WRR, Nq, Nq, Ib)
+    /* NP 0F 71 /2 ib: PSRLW mm, imm8 */
+    OPCODE_GRPMEMB(grp12_LEG_NP, psrlw, 2, MMX, WRR, Nq, Nq, Ib)
+    /* NP 0F 71 /4 ib: PSRAW mm,imm8 */
+    OPCODE_GRPMEMB(grp12_LEG_NP, psraw, 4, MMX, WRR, Nq, Nq, Ib)
+OPCODE_GRP_END(grp12_LEG_NP)
+
+OPCODE_GRP(grp13_LEG_NP, LEG(NP, 0F, 0, 0x72))
+OPCODE_GRP_BEGIN(grp13_LEG_NP)
+    /* NP 0F 72 /6 ib: PSLLD mm, imm8 */
+    OPCODE_GRPMEMB(grp13_LEG_NP, pslld, 6, MMX, WRR, Nq, Nq, Ib)
+    /* NP 0F 72 /2 ib: PSRLD mm, imm8 */
+    OPCODE_GRPMEMB(grp13_LEG_NP, psrld, 2, MMX, WRR, Nq, Nq, Ib)
+    /* NP 0F 72 /4 ib: PSRAD mm,imm8 */
+    OPCODE_GRPMEMB(grp13_LEG_NP, psrad, 4, MMX, WRR, Nq, Nq, Ib)
+OPCODE_GRP_END(grp13_LEG_NP)
+
+OPCODE_GRP(grp14_LEG_NP, LEG(NP, 0F, 0, 0x73))
+OPCODE_GRP_BEGIN(grp14_LEG_NP)
+    /* NP 0F 73 /6 ib: PSLLQ mm, imm8 */
+    OPCODE_GRPMEMB(grp14_LEG_NP, psllq, 6, MMX, WRR, Nq, Nq, Ib)
+    /* NP 0F 73 /2 ib: PSRLQ mm, imm8 */
+    OPCODE_GRPMEMB(grp14_LEG_NP, psrlq, 2, MMX, WRR, Nq, Nq, Ib)
+OPCODE_GRP_END(grp14_LEG_NP)
+
 #undef FMTI____
 #undef FMTI__R__
 #undef FMTI__RR__
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 38/46] target/i386: introduce SSE translators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (36 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 37/46] target/i386: introduce MMX instructions to sse-opcode.inc.h Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 39/46] target/i386: introduce SSE code generators Jan Bobek
                   ` (8 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Use the translator macros to define translators required by SSE
instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index a02e9cd0d2..ef64fe606f 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5533,6 +5533,9 @@ static void translate_insn0()(
         }                                                               \
     }
 
+DEF_TRANSLATE_INSN1(Mb)
+DEF_TRANSLATE_INSN1(Md)
+
 #define DEF_TRANSLATE_INSN2(opT1, opT2)                                 \
     static void translate_insn2(opT1, opT2)(                            \
         CPUX86State *env, DisasContext *s, int modrm,                   \
@@ -5571,11 +5574,29 @@ static void translate_insn0()(
 DEF_TRANSLATE_INSN2(Ed, Pq)
 DEF_TRANSLATE_INSN2(Eq, Pq)
 DEF_TRANSLATE_INSN2(Gd, Nq)
+DEF_TRANSLATE_INSN2(Gd, Udq)
+DEF_TRANSLATE_INSN2(Gd, Wd)
 DEF_TRANSLATE_INSN2(Gq, Nq)
+DEF_TRANSLATE_INSN2(Gq, Udq)
+DEF_TRANSLATE_INSN2(Gq, Wd)
+DEF_TRANSLATE_INSN2(Mdq, Vdq)
+DEF_TRANSLATE_INSN2(Mq, Pq)
+DEF_TRANSLATE_INSN2(Mq, Vdq)
+DEF_TRANSLATE_INSN2(Mq, Vq)
 DEF_TRANSLATE_INSN2(Pq, Ed)
 DEF_TRANSLATE_INSN2(Pq, Eq)
+DEF_TRANSLATE_INSN2(Pq, Nq)
 DEF_TRANSLATE_INSN2(Pq, Qq)
+DEF_TRANSLATE_INSN2(Pq, Wq)
 DEF_TRANSLATE_INSN2(Qq, Pq)
+DEF_TRANSLATE_INSN2(Vd, Ed)
+DEF_TRANSLATE_INSN2(Vd, Eq)
+DEF_TRANSLATE_INSN2(Vd, Wd)
+DEF_TRANSLATE_INSN2(Vdq, Qq)
+DEF_TRANSLATE_INSN2(Vdq, Wdq)
+DEF_TRANSLATE_INSN2(Vq, UdqMhq)
+DEF_TRANSLATE_INSN2(Wd, Vd)
+DEF_TRANSLATE_INSN2(Wdq, Vdq)
 
 #define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)                           \
     static void translate_insn3(opT1, opT2, opT3)(                      \
@@ -5627,6 +5648,9 @@ DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qd)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qq)
 DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
+DEF_TRANSLATE_INSN3(Vd, Vd, Wd)
+DEF_TRANSLATE_INSN3(Vdq, Vdq, Wdq)
+DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
 
 #define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4)                     \
     static void translate_insn4(opT1, opT2, opT3, opT4)(                \
@@ -5680,6 +5704,11 @@ DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
         }                                                               \
     }
 
+DEF_TRANSLATE_INSN4(Pq, Pq, RdMw, Ib)
+DEF_TRANSLATE_INSN4(Vd, Vd, Wd, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, Wd, modrm_mod)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, Wdq, Ib)
+
 #define OPCODE_GRP_BEGIN(grpname)                                       \
     static void translate_group(grpname)(                               \
         CPUX86State *env, DisasContext *s, int modrm)                   \
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 39/46] target/i386: introduce SSE code generators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (37 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 38/46] target/i386: introduce SSE translators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 40/46] target/i386: introduce SSE instructions to sse-opcode.inc.h Jan Bobek
                   ` (7 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Introduce code generators required by SSE instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 319 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 319 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index ef64fe606f..3d526ee470 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5360,6 +5360,9 @@ INSNOP_LDST(xmm_t0, Mhq)
 #define DEF_GEN_INSN2_GVEC_MM(mnem, gvec, opT1, opT2, vece)       \
     DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece,              \
                        sizeof(MMXReg), sizeof(MMXReg))
+#define DEF_GEN_INSN2_GVEC_XMM(mnem, gvec, opT1, opT2, vece)      \
+    DEF_GEN_INSN2_GVEC(mnem, gvec, opT1, opT2, vece,              \
+                       sizeof(XMMReg), sizeof(XMMReg))
 
 #define DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece, oprsz, maxsz) \
     GEN_INSN3(mnem, opT1, opT2, opT3)                                   \
@@ -5369,6 +5372,9 @@ INSNOP_LDST(xmm_t0, Mhq)
 #define DEF_GEN_INSN3_GVEC_MM(mnem, gvec, opT1, opT2, opT3, vece)   \
     DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece,          \
                        sizeof(MMXReg), sizeof(MMXReg))
+#define DEF_GEN_INSN3_GVEC_XMM(mnem, gvec, opT1, opT2, opT3, vece)  \
+    DEF_GEN_INSN3_GVEC(mnem, gvec, opT1, opT2, opT3, vece,          \
+                       sizeof(XMMReg), sizeof(XMMReg))
 
 GEN_INSN2(movq, Pq, Eq);        /* forward declaration */
 GEN_INSN2(movd, Pq, Ed)
@@ -5399,6 +5405,90 @@ GEN_INSN2(movq, Eq, Pq)
 
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Pq, Qq, MO_64)
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Qq, Pq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movups, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movups, mov, Wdq, Vdq, MO_64)
+
+GEN_INSN2(movss, Wd, Vd);       /* forward declaration */
+GEN_INSN4(movss, Vdq, Vdq, Wd, modrm_mod)
+{
+    assert(arg1 == arg2);
+
+    if (arg4 == 3) {
+        /* merging movss */
+        gen_insn2(movss, Wd, Vd)(env, s, arg1, arg3);
+    } else {
+        /* zero-extending movss */
+        const TCGv_i32 r32 = tcg_temp_new_i32();
+        const TCGv_i64 r64 = tcg_temp_new_i64();
+
+        tcg_gen_ld_i32(r32, cpu_env, arg3 + offsetof(ZMMReg, ZMM_L(0)));
+        tcg_gen_extu_i32_i64(r64, r32);
+        tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+
+        tcg_gen_movi_i64(r64, 0);
+        tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+
+        tcg_temp_free_i32(r32);
+        tcg_temp_free_i64(r64);
+    }
+}
+
+GEN_INSN2(movss, Wd, Vd)
+{
+    const insnop_arg_t(Wd) dofs = offsetof(ZMMReg, ZMM_L(0));
+    const insnop_arg_t(Vd) aofs = offsetof(ZMMReg, ZMM_L(0));
+    gen_op_movl(s, arg1 + dofs, arg2 + aofs);
+}
+
+GEN_INSN2(movhlps, Vq, UdqMhq)
+{
+    const size_t dofs = offsetof(ZMMReg, ZMM_Q(0));
+    const size_t aofs = offsetof(ZMMReg, ZMM_Q(1));
+    gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+}
+
+GEN_INSN2(movlps, Mq, Vq)
+{
+    assert(arg1 == s->A0);
+    gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
+}
+
+GEN_INSN3(movlhps, Vdq, Vq, Wq)
+{
+    assert(arg1 == arg2);
+
+    const size_t dofs = offsetof(ZMMReg, ZMM_Q(1));
+    const size_t aofs = offsetof(ZMMReg, ZMM_Q(0));
+    gen_op_movq(s, arg1 + dofs, arg3 + aofs);
+}
+
+GEN_INSN2(movhps, Mq, Vdq)
+{
+    assert(arg1 == s->A0);
+    gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(1)));
+}
+
+DEF_GEN_INSN2_HELPER_DEP(pmovmskb, pmovmskb_mmx, Gd, Nq)
+
+GEN_INSN2(pmovmskb, Gq, Nq)
+{
+    const TCGv_i32 arg1_r32 = tcg_temp_new_i32();
+    gen_insn2(pmovmskb, Gd, Nq)(env, s, arg1_r32, arg2);
+    tcg_gen_extu_i32_i64(arg1, arg1_r32);
+    tcg_temp_free_i32(arg1_r32);
+}
+
+DEF_GEN_INSN2_HELPER_DEP(movmskps, movmskps, Gd, Udq)
+
+GEN_INSN2(movmskps, Gq, Udq)
+{
+    const TCGv_i32 arg1_r32 = tcg_temp_new_i32();
+    gen_insn2(movmskps, Gd, Udq)(env, s, arg1_r32, arg2);
+    tcg_gen_extu_i32_i64(arg1, arg1_r32);
+    tcg_temp_free_i32(arg1_r32);
+}
 
 DEF_GEN_INSN3_GVEC_MM(paddb, add, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddw, add, Pq, Pq, Qq, MO_16)
@@ -5407,6 +5497,8 @@ DEF_GEN_INSN3_GVEC_MM(paddsb, ssadd, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddsw, ssadd, Pq, Pq, Qq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(paddusb, usadd, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddusw, usadd, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_HELPER_EPP(addps, addps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(addss, addss, Vd, Vd, Wd)
 
 DEF_GEN_INSN3_GVEC_MM(psubb, sub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubw, sub, Pq, Pq, Qq, MO_16)
@@ -5415,11 +5507,38 @@ DEF_GEN_INSN3_GVEC_MM(psubsb, sssub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubsw, sssub, Pq, Pq, Qq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(psubusb, ussub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubusw, ussub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_HELPER_EPP(subps, subps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(subss, subss, Vd, Vd, Wd)
 
 DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(pmulhw, pmulhw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmulhuw, pmulhuw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(mulps, mulps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(mulss, mulss, Vd, Vd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(pmaddwd, pmaddwd_mmx, Pq, Pq, Qq)
 
+DEF_GEN_INSN3_HELPER_EPP(divps, divps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(divss, divss, Vd, Vd, Wd)
+
+DEF_GEN_INSN2_HELPER_EPP(rcpps, rcpps, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(rcpss, rcpss, Vd, Wd)
+DEF_GEN_INSN2_HELPER_EPP(sqrtps, sqrtps, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(sqrtss, sqrtss, Vd, Wd)
+DEF_GEN_INSN2_HELPER_EPP(rsqrtps, rsqrtps, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(rsqrtss, rsqrtss, Vd, Wd)
+
+DEF_GEN_INSN3_GVEC_MM(pminub, umin, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(pminsw, smin, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_HELPER_EPP(minps, minps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(minss, minss, Vd, Vd, Wd)
+DEF_GEN_INSN3_GVEC_MM(pmaxub, umax, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_MM(pmaxsw, smax, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_HELPER_EPP(maxps, maxps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(maxss, maxss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(pavgb, pavgb_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pavgw, pavgw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psadbw, psadbw_mmx, Pq, Pq, Qq)
+
 DEF_GEN_INSN3_GVEC_MM(pcmpeqb, cmpeq, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(pcmpeqw, cmpeq, Pq, Pq, Qq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(pcmpeqd, cmpeq, Pq, Pq, Qq, MO_32)
@@ -5427,10 +5546,98 @@ DEF_GEN_INSN3_GVEC_MM(pcmpgtb, cmpgt, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(pcmpgtw, cmpgt, Pq, Pq, Qq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(pcmpgtd, cmpgt, Pq, Pq, Qq, MO_32)
 
+DEF_GEN_INSN3_HELPER_EPP(cmpeqps, cmpeqps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpeqss, cmpeqss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpltps, cmpltps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpltss, cmpltss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpleps, cmpleps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpless, cmpless, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpunordps, cmpunordps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpunordss, cmpunordss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpneqps, cmpneqps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpneqss, cmpneqss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpnltps, cmpnltps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpnltss, cmpnltss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpnleps, cmpnleps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpnless, cmpnless, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpordps, cmpordps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpordss, cmpordss, Vd, Vd, Wd)
+
+GEN_INSN4(cmpps, Vdq, Vdq, Wdq, Ib)
+{
+    switch (arg4 & 7) {
+    case 0:
+        gen_insn3(cmpeqps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 1:
+        gen_insn3(cmpltps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 2:
+        gen_insn3(cmpleps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 3:
+        gen_insn3(cmpunordps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 4:
+        gen_insn3(cmpneqps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 5:
+        gen_insn3(cmpnltps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 6:
+        gen_insn3(cmpnleps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 7:
+        gen_insn3(cmpordps, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    }
+
+    g_assert_not_reached();
+}
+
+GEN_INSN4(cmpss, Vd, Vd, Wd, Ib)
+{
+    switch (arg4 & 7) {
+    case 0:
+        gen_insn3(cmpeqss, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    case 1:
+        gen_insn3(cmpltss, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    case 2:
+        gen_insn3(cmpless, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    case 3:
+        gen_insn3(cmpunordss, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    case 4:
+        gen_insn3(cmpneqss, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    case 5:
+        gen_insn3(cmpnltss, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    case 6:
+        gen_insn3(cmpnless, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    case 7:
+        gen_insn3(cmpordss, Vd, Vd, Wd)(env, s, arg1, arg2, arg3);
+        return;
+    }
+
+    g_assert_not_reached();
+}
+
+DEF_GEN_INSN2_HELPER_EPP(comiss, comiss, Vd, Wd)
+DEF_GEN_INSN2_HELPER_EPP(ucomiss, ucomiss, Vd, Wd)
+
 DEF_GEN_INSN3_GVEC_MM(pand, and, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(andps, and, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(pandn, andn, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(andnps, andn, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(por, or, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(orps, or, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(pxor, xor, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(xorps, xor, Vdq, Vdq, Wdq, MO_64)
 
 DEF_GEN_INSN3_HELPER_EPP(psllw, psllw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(pslld, pslld_mmx, Pq, Pq, Qq)
@@ -5473,8 +5680,120 @@ DEF_GEN_INSN3_HELPER_EPP(punpckhbw, punpckhbw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(punpckhwd, punpckhwd_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(punpckhdq, punpckhdq_mmx, Pq, Pq, Qq)
 
+DEF_GEN_INSN3_HELPER_EPP(unpcklps, punpckldq_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(unpckhps, punpckhdq_xmm, Vdq, Vdq, Wdq)
+
+DEF_GEN_INSN3_HELPER_PPI(pshufw, pshufw_mmx, Pq, Qq, Ib)
+DEF_GEN_INSN4_HELPER_PPI(shufps, shufps, Vdq, Vdq, Wdq, Ib)
+
+GEN_INSN4(pinsrw, Pq, Pq, RdMw, Ib)
+{
+    assert(arg1 == arg2);
+
+    const size_t ofs = offsetof(MMXReg, MMX_W(arg4 & 3));
+    tcg_gen_st16_i32(arg3, cpu_env, arg1 + ofs);
+}
+
+GEN_INSN3(pextrw, Gd, Nq, Ib)
+{
+    const size_t ofs = offsetof(MMXReg, MMX_W(arg3 & 3));
+    tcg_gen_ld16u_i32(arg1, cpu_env, arg2 + ofs);
+}
+
+GEN_INSN3(pextrw, Gq, Nq, Ib)
+{
+    const size_t ofs = offsetof(MMXReg, MMX_W(arg3 & 3));
+    tcg_gen_ld16u_i64(arg1, cpu_env, arg2 + ofs);
+}
+
+DEF_GEN_INSN2_HELPER_EPP(cvtpi2ps, cvtpi2ps, Vdq, Qq)
+DEF_GEN_INSN2_HELPER_EPD(cvtsi2ss, cvtsi2ss, Vd, Ed)
+DEF_GEN_INSN2_HELPER_EPQ(cvtsi2ss, cvtsq2ss, Vd, Eq)
+DEF_GEN_INSN2_HELPER_EPP(cvtps2pi, cvtps2pi, Pq, Wq)
+DEF_GEN_INSN2_HELPER_DEP(cvtss2si, cvtss2si, Gd, Wd)
+DEF_GEN_INSN2_HELPER_QEP(cvtss2si, cvtss2sq, Gq, Wd)
+DEF_GEN_INSN2_HELPER_EPP(cvttps2pi, cvttps2pi, Pq, Wq)
+DEF_GEN_INSN2_HELPER_DEP(cvttss2si, cvttss2si, Gd, Wd)
+DEF_GEN_INSN2_HELPER_QEP(cvttss2si, cvttss2sq, Gq, Wd)
+
+GEN_INSN2(maskmovq, Pq, Nq)
+{
+    const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();
+    const TCGv_ptr arg2_ptr = tcg_temp_new_ptr();
+
+    tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
+    gen_extu(s->aflag, s->A0);
+    gen_add_A0_ds_seg(s);
+
+    tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);
+    tcg_gen_addi_ptr(arg2_ptr, cpu_env, arg2);
+    gen_helper_maskmov_mmx(cpu_env, arg1_ptr, arg2_ptr, s->A0);
+
+    tcg_temp_free_ptr(arg1_ptr);
+    tcg_temp_free_ptr(arg2_ptr);
+}
+
+GEN_INSN2(movntps, Mdq, Vdq)
+{
+    assert(arg1 == s->A0);
+    gen_sto_env_A0(s, arg2);
+}
+
+GEN_INSN2(movntq, Mq, Pq)
+{
+    assert(arg1 == s->A0);
+    gen_stq_env_A0(s, arg2);
+}
+
 DEF_GEN_INSN0_HELPER(emms, emms)
 
+GEN_INSN0(sfence)
+{
+    if (s->prefix & PREFIX_LOCK) {
+        gen_illegal_opcode(s);
+    } else {
+        tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);
+    }
+}
+
+GEN_INSN1(ldmxcsr, Md)
+{
+    if ((s->flags & HF_EM_MASK) || !(s->flags & HF_OSFXSR_MASK)) {
+        gen_illegal_opcode(s);
+    } else if (s->flags & HF_TS_MASK) {
+        gen_exception(s, EXCP07_PREX);
+    } else {
+        tcg_gen_qemu_ld_i32(s->tmp2_i32, arg1, s->mem_index, MO_LEUL);
+        gen_helper_ldmxcsr(cpu_env, s->tmp2_i32);
+    }
+}
+
+GEN_INSN1(stmxcsr, Md)
+{
+    if ((s->flags & HF_EM_MASK) || !(s->flags & HF_OSFXSR_MASK)) {
+        gen_illegal_opcode(s);
+    } else if (s->flags & HF_TS_MASK) {
+        gen_exception(s, EXCP07_PREX);
+    } else {
+        const size_t ofs = offsetof(CPUX86State, mxcsr);
+        tcg_gen_ld_i32(s->tmp2_i32, cpu_env, ofs);
+        tcg_gen_qemu_st_i32(s->tmp2_i32, arg1, s->mem_index, MO_LEUL);
+    }
+}
+
+GEN_INSN1(prefetcht0, Mb)
+{
+}
+GEN_INSN1(prefetcht1, Mb)
+{
+}
+GEN_INSN1(prefetcht2, Mb)
+{
+}
+GEN_INSN1(prefetchnta, Mb)
+{
+}
+
 /*
  * Instruction translators
  */
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 40/46] target/i386: introduce SSE instructions to sse-opcode.inc.h
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (38 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 39/46] target/i386: introduce SSE code generators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 41/46] target/i386: introduce SSE2 translators Jan Bobek
                   ` (6 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Add all the SSE instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/sse-opcode.inc.h | 158 +++++++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index 36963e5a7c..39947aeb51 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -51,6 +51,36 @@ OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
 OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
 /* NP 0F 7F /r: MOVQ mm/m64, mm */
 OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
+/* NP 0F 28 /r: MOVAPS xmm1, xmm2/m128 */
+OPCODE(movaps, LEG(NP, 0F, 0, 0x28), SSE, WR, Vdq, Wdq)
+/* NP 0F 29 /r: MOVAPS xmm2/m128, xmm1 */
+OPCODE(movaps, LEG(NP, 0F, 0, 0x29), SSE, WR, Wdq, Vdq)
+/* NP 0F 10 /r: MOVUPS xmm1, xmm2/m128 */
+OPCODE(movups, LEG(NP, 0F, 0, 0x10), SSE, WR, Vdq, Wdq)
+/* NP 0F 11 /r: MOVUPS xmm2/m128, xmm1 */
+OPCODE(movups, LEG(NP, 0F, 0, 0x11), SSE, WR, Wdq, Vdq)
+/* F3 0F 10 /r: MOVSS xmm1, xmm2/m32 */
+OPCODE(movss, LEG(F3, 0F, 0, 0x10), SSE, WRRR, Vdq, Vdq, Wd, modrm_mod)
+/* F3 0F 11 /r: MOVSS xmm2/m32, xmm1 */
+OPCODE(movss, LEG(F3, 0F, 0, 0x11), SSE, WR, Wd, Vd)
+/* NP 0F 12 /r: MOVHLPS xmm1, xmm2 */
+/* NP 0F 12 /r: MOVLPS xmm1, m64 */
+OPCODE(movhlps, LEG(NP, 0F, 0, 0x12), SSE, WR, Vq, UdqMhq)
+/* 0F 13 /r: MOVLPS m64, xmm1 */
+OPCODE(movlps, LEG(NP, 0F, 0, 0x13), SSE, WR, Mq, Vq)
+/* NP 0F 16 /r: MOVLHPS xmm1, xmm2 */
+/* NP 0F 16 /r: MOVHPS xmm1, m64 */
+OPCODE(movlhps, LEG(NP, 0F, 0, 0x16), SSE, WRR, Vdq, Vq, Wq)
+/* NP 0F 17 /r: MOVHPS m64, xmm1 */
+OPCODE(movhps, LEG(NP, 0F, 0, 0x17), SSE, WR, Mq, Vdq)
+/* NP 0F D7 /r: PMOVMSKB r32, mm */
+OPCODE(pmovmskb, LEG(NP, 0F, 0, 0xd7), SSE, WR, Gd, Nq)
+/* NP REX.W 0F D7 /r: PMOVMSKB r64, mm */
+OPCODE(pmovmskb, LEG(NP, 0F, 1, 0xd7), SSE, WR, Gq, Nq)
+/* NP 0F 50 /r: MOVMSKPS r32, xmm */
+OPCODE(movmskps, LEG(NP, 0F, 0, 0x50), SSE, WR, Gd, Udq)
+/* NP REX.W 0F 50 /r: MOVMSKPS r64, xmm */
+OPCODE(movmskps, LEG(NP, 0F, 1, 0x50), SSE, WR, Gq, Udq)
 /* NP 0F FC /r: PADDB mm, mm/m64 */
 OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F FD /r: PADDW mm, mm/m64 */
@@ -65,6 +95,10 @@ OPCODE(paddsw, LEG(NP, 0F, 0, 0xed), MMX, WRR, Pq, Pq, Qq)
 OPCODE(paddusb, LEG(NP, 0F, 0, 0xdc), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F DD /r: PADDUSW mm,mm/m64 */
 OPCODE(paddusw, LEG(NP, 0F, 0, 0xdd), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 58 /r: ADDPS xmm1, xmm2/m128 */
+OPCODE(addps, LEG(NP, 0F, 0, 0x58), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 58 /r: ADDSS xmm1, xmm2/m32 */
+OPCODE(addss, LEG(F3, 0F, 0, 0x58), SSE, WRR, Vd, Vd, Wd)
 /* NP 0F F8 /r: PSUBB mm, mm/m64 */
 OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F F9 /r: PSUBW mm, mm/m64 */
@@ -79,12 +113,60 @@ OPCODE(psubsw, LEG(NP, 0F, 0, 0xe9), MMX, WRR, Pq, Pq, Qq)
 OPCODE(psubusb, LEG(NP, 0F, 0, 0xd8), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F D9 /r: PSUBUSW mm, mm/m64 */
 OPCODE(psubusw, LEG(NP, 0F, 0, 0xd9), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 5C /r: SUBPS xmm1, xmm2/m128 */
+OPCODE(subps, LEG(NP, 0F, 0, 0x5c), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 5C /r: SUBSS xmm1, xmm2/m32 */
+OPCODE(subss, LEG(F3, 0F, 0, 0x5c), SSE, WRR, Vd, Vd, Wd)
 /* NP 0F D5 /r: PMULLW mm, mm/m64 */
 OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F E5 /r: PMULHW mm, mm/m64 */
 OPCODE(pmulhw, LEG(NP, 0F, 0, 0xe5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F E4 /r: PMULHUW mm1, mm2/m64 */
+OPCODE(pmulhuw, LEG(NP, 0F, 0, 0xe4), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F 59 /r: MULPS xmm1, xmm2/m128 */
+OPCODE(mulps, LEG(NP, 0F, 0, 0x59), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 59 /r: MULSS xmm1,xmm2/m32 */
+OPCODE(mulss, LEG(F3, 0F, 0, 0x59), SSE, WRR, Vd, Vd, Wd)
 /* NP 0F F5 /r: PMADDWD mm, mm/m64 */
 OPCODE(pmaddwd, LEG(NP, 0F, 0, 0xf5), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 5E /r: DIVPS xmm1, xmm2/m128 */
+OPCODE(divps, LEG(NP, 0F, 0, 0x5e), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 5E /r: DIVSS xmm1, xmm2/m32 */
+OPCODE(divss, LEG(F3, 0F, 0, 0x5e), SSE, WRR, Vd, Vd, Wd)
+/* NP 0F 53 /r: RCPPS xmm1, xmm2/m128 */
+OPCODE(rcpps, LEG(NP, 0F, 0, 0x53), SSE, WR, Vdq, Wdq)
+/* F3 0F 53 /r: RCPSS xmm1, xmm2/m32 */
+OPCODE(rcpss, LEG(F3, 0F, 0, 0x53), SSE, WR, Vd, Wd)
+/* NP 0F 51 /r: SQRTPS xmm1, xmm2/m128 */
+OPCODE(sqrtps, LEG(NP, 0F, 0, 0x51), SSE, WR, Vdq, Wdq)
+/* F3 0F 51 /r: SQRTSS xmm1, xmm2/m32 */
+OPCODE(sqrtss, LEG(F3, 0F, 0, 0x51), SSE, WR, Vd, Wd)
+/* NP 0F 52 /r: RSQRTPS xmm1, xmm2/m128 */
+OPCODE(rsqrtps, LEG(NP, 0F, 0, 0x52), SSE, WR, Vdq, Wdq)
+/* F3 0F 52 /r: RSQRTSS xmm1, xmm2/m32 */
+OPCODE(rsqrtss, LEG(F3, 0F, 0, 0x52), SSE, WR, Vd, Wd)
+/* NP 0F DA /r: PMINUB mm1, mm2/m64 */
+OPCODE(pminub, LEG(NP, 0F, 0, 0xda), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F EA /r: PMINSW mm1, mm2/m64 */
+OPCODE(pminsw, LEG(NP, 0F, 0, 0xea), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F 5D /r: MINPS xmm1, xmm2/m128 */
+OPCODE(minps, LEG(NP, 0F, 0, 0x5d), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 5D /r: MINSS xmm1,xmm2/m32 */
+OPCODE(minss, LEG(F3, 0F, 0, 0x5d), SSE, WRR, Vd, Vd, Wd)
+/* NP 0F DE /r: PMAXUB mm1, mm2/m64 */
+OPCODE(pmaxub, LEG(NP, 0F, 0, 0xde), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F EE /r: PMAXSW mm1, mm2/m64 */
+OPCODE(pmaxsw, LEG(NP, 0F, 0, 0xee), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F 5F /r: MAXPS xmm1, xmm2/m128 */
+OPCODE(maxps, LEG(NP, 0F, 0, 0x5f), SSE, WRR, Vdq, Vdq, Wdq)
+/* F3 0F 5F /r: MAXSS xmm1, xmm2/m32 */
+OPCODE(maxss, LEG(F3, 0F, 0, 0x5f), SSE, WRR, Vd, Vd, Wd)
+/* NP 0F E0 /r: PAVGB mm1, mm2/m64 */
+OPCODE(pavgb, LEG(NP, 0F, 0, 0xe0), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F E3 /r: PAVGW mm1, mm2/m64 */
+OPCODE(pavgw, LEG(NP, 0F, 0, 0xe3), SSE, WRR, Pq, Pq, Qq)
+/* NP 0F F6 /r: PSADBW mm1, mm2/m64 */
+OPCODE(psadbw, LEG(NP, 0F, 0, 0xf6), SSE, WRR, Pq, Pq, Qq)
 /* NP 0F 74 /r: PCMPEQB mm,mm/m64 */
 OPCODE(pcmpeqb, LEG(NP, 0F, 0, 0x74), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F 75 /r: PCMPEQW mm,mm/m64 */
@@ -97,14 +179,30 @@ OPCODE(pcmpgtb, LEG(NP, 0F, 0, 0x64), MMX, WRR, Pq, Pq, Qq)
 OPCODE(pcmpgtw, LEG(NP, 0F, 0, 0x65), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F 66 /r: PCMPGTD mm,mm/m64 */
 OPCODE(pcmpgtd, LEG(NP, 0F, 0, 0x66), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F C2 /r ib: CMPPS xmm1, xmm2/m128, imm8 */
+OPCODE(cmpps, LEG(NP, 0F, 0, 0xc2), SSE, WRRR, Vdq, Vdq, Wdq, Ib)
+/* F3 0F C2 /r ib: CMPSS xmm1, xmm2/m32, imm8 */
+OPCODE(cmpss, LEG(F3, 0F, 0, 0xc2), SSE, WRRR, Vd, Vd, Wd, Ib)
+/* NP 0F 2E /r: UCOMISS xmm1, xmm2/m32 */
+OPCODE(ucomiss, LEG(NP, 0F, 0, 0x2e), SSE, RR, Vd, Wd)
+/* NP 0F 2F /r: COMISS xmm1, xmm2/m32 */
+OPCODE(comiss, LEG(NP, 0F, 0, 0x2f), SSE, RR, Vd, Wd)
 /* NP 0F DB /r: PAND mm, mm/m64 */
 OPCODE(pand, LEG(NP, 0F, 0, 0xdb), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 54 /r: ANDPS xmm1, xmm2/m128 */
+OPCODE(andps, LEG(NP, 0F, 0, 0x54), SSE, WRR, Vdq, Vdq, Wdq)
 /* NP 0F DF /r: PANDN mm, mm/m64 */
 OPCODE(pandn, LEG(NP, 0F, 0, 0xdf), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 55 /r: ANDNPS xmm1, xmm2/m128 */
+OPCODE(andnps, LEG(NP, 0F, 0, 0x55), SSE, WRR, Vdq, Vdq, Wdq)
 /* NP 0F EB /r: POR mm, mm/m64 */
 OPCODE(por, LEG(NP, 0F, 0, 0xeb), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 56 /r: ORPS xmm1, xmm2/m128 */
+OPCODE(orps, LEG(NP, 0F, 0, 0x56), SSE, WRR, Vdq, Vdq, Wdq)
 /* NP 0F EF /r: PXOR mm, mm/m64 */
 OPCODE(pxor, LEG(NP, 0F, 0, 0xef), MMX, WRR, Pq, Pq, Qq)
+/* NP 0F 57 /r: XORPS xmm1, xmm2/m128 */
+OPCODE(xorps, LEG(NP, 0F, 0, 0x57), SSE, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F1 /r: PSLLW mm, mm/m64 */
 OPCODE(psllw, LEG(NP, 0F, 0, 0xf1), MMX, WRR, Pq, Pq, Qq)
 /* NP 0F F2 /r: PSLLD mm, mm/m64 */
@@ -139,6 +237,44 @@ OPCODE(punpcklbw, LEG(NP, 0F, 0, 0x60), MMX, WRR, Pq, Pq, Qd)
 OPCODE(punpcklwd, LEG(NP, 0F, 0, 0x61), MMX, WRR, Pq, Pq, Qd)
 /* NP 0F 62 /r: PUNPCKLDQ mm, mm/m32 */
 OPCODE(punpckldq, LEG(NP, 0F, 0, 0x62), MMX, WRR, Pq, Pq, Qd)
+/* NP 0F 14 /r: UNPCKLPS xmm1, xmm2/m128 */
+OPCODE(unpcklps, LEG(NP, 0F, 0, 0x14), SSE, WRR, Vdq, Vdq, Wdq)
+/* NP 0F 15 /r: UNPCKHPS xmm1, xmm2/m128 */
+OPCODE(unpckhps, LEG(NP, 0F, 0, 0x15), SSE, WRR, Vdq, Vdq, Wdq)
+/* NP 0F 70 /r ib: PSHUFW mm1, mm2/m64, imm8 */
+OPCODE(pshufw, LEG(NP, 0F, 0, 0x70), SSE, WRR, Pq, Qq, Ib)
+/* NP 0F C6 /r ib: SHUFPS xmm1, xmm3/m128, imm8 */
+OPCODE(shufps, LEG(NP, 0F, 0, 0xc6), SSE, WRRR, Vdq, Vdq, Wdq, Ib)
+/* NP 0F C4 /r ib: PINSRW mm, r32/m16, imm8 */
+OPCODE(pinsrw, LEG(NP, 0F, 0, 0xc4), SSE, WRRR, Pq, Pq, RdMw, Ib)
+/* NP 0F C5 /r ib: PEXTRW r32, mm, imm8 */
+OPCODE(pextrw, LEG(NP, 0F, 0, 0xc5), SSE, WRR, Gd, Nq, Ib)
+/* NP REX.W 0F C5 /r ib: PEXTRW r64, mm, imm8 */
+OPCODE(pextrw, LEG(NP, 0F, 1, 0xc5), SSE, WRR, Gq, Nq, Ib)
+/* NP 0F 2A /r: CVTPI2PS xmm, mm/m64 */
+OPCODE(cvtpi2ps, LEG(NP, 0F, 0, 0x2a), SSE, WR, Vdq, Qq)
+/* F3 0F 2A /r: CVTSI2SS xmm1,r/m32 */
+OPCODE(cvtsi2ss, LEG(F3, 0F, 0, 0x2a), SSE, WR, Vd, Ed)
+/* F3 REX.W 0F 2A /r: CVTSI2SS xmm1,r/m64 */
+OPCODE(cvtsi2ss, LEG(F3, 0F, 1, 0x2a), SSE, WR, Vd, Eq)
+/* NP 0F 2D /r: CVTPS2PI mm, xmm/m64 */
+OPCODE(cvtps2pi, LEG(NP, 0F, 0, 0x2d), SSE, WR, Pq, Wq)
+/* F3 0F 2D /r: CVTSS2SI r32,xmm1/m32 */
+OPCODE(cvtss2si, LEG(F3, 0F, 0, 0x2d), SSE, WR, Gd, Wd)
+/* F3 REX.W 0F 2D /r: CVTSS2SI r64,xmm1/m32 */
+OPCODE(cvtss2si, LEG(F3, 0F, 1, 0x2d), SSE, WR, Gq, Wd)
+/* NP 0F 2C /r: CVTTPS2PI mm, xmm/m64 */
+OPCODE(cvttps2pi, LEG(NP, 0F, 0, 0x2c), SSE, WR, Pq, Wq)
+/* F3 0F 2C /r: CVTTSS2SI r32,xmm1/m32 */
+OPCODE(cvttss2si, LEG(F3, 0F, 0, 0x2c), SSE, WR, Gd, Wd)
+/* F3 REX.W 0F 2C /r: CVTTSS2SI r64,xmm1/m32 */
+OPCODE(cvttss2si, LEG(F3, 0F, 1, 0x2c), SSE, WR, Gq, Wd)
+/* NP 0F F7 /r: MASKMOVQ mm1, mm2 */
+OPCODE(maskmovq, LEG(NP, 0F, 0, 0xf7), SSE, RR, Pq, Nq)
+/* NP 0F 2B /r: MOVNTPS m128, xmm1 */
+OPCODE(movntps, LEG(NP, 0F, 0, 0x2b), SSE, WR, Mdq, Vdq)
+/* NP 0F E7 /r: MOVNTQ m64, mm */
+OPCODE(movntq, LEG(NP, 0F, 0, 0xe7), SSE, WR, Mq, Pq)
 /* NP 0F 77: EMMS */
 OPCODE(emms, LEG(NP, 0F, 0, 0x77), MMX, )
 
@@ -170,6 +306,28 @@ OPCODE_GRP_BEGIN(grp14_LEG_NP)
     OPCODE_GRPMEMB(grp14_LEG_NP, psrlq, 2, MMX, WRR, Nq, Nq, Ib)
 OPCODE_GRP_END(grp14_LEG_NP)
 
+OPCODE_GRP(grp15_LEG_NP, LEG(NP, 0F, 0, 0xae))
+OPCODE_GRP_BEGIN(grp15_LEG_NP)
+    /* NP 0F AE /7: SFENCE */
+    OPCODE_GRPMEMB(grp15_LEG_NP, sfence, 7, SSE, )
+    /* NP 0F AE /2: LDMXCSR m32 */
+    OPCODE_GRPMEMB(grp15_LEG_NP, ldmxcsr, 2, SSE, R, Md)
+    /* NP 0F AE /3: STMXCSR m32 */
+    OPCODE_GRPMEMB(grp15_LEG_NP, stmxcsr, 3, SSE, W, Md)
+OPCODE_GRP_END(grp15_LEG_NP)
+
+OPCODE_GRP(grp16_LEG_NP, LEG(NP, 0F, 0, 0x18))
+OPCODE_GRP_BEGIN(grp16_LEG_NP)
+    /* 0F 18 /1: PREFETCHT0 m8 */
+    OPCODE_GRPMEMB(grp16_LEG_NP, prefetcht0, 1, SSE, R, Mb)
+    /* 0F 18 /2: PREFETCHT1 m8 */
+    OPCODE_GRPMEMB(grp16_LEG_NP, prefetcht1, 2, SSE, R, Mb)
+    /* 0F 18 /3: PREFETCHT2 m8 */
+    OPCODE_GRPMEMB(grp16_LEG_NP, prefetcht2, 3, SSE, R, Mb)
+    /* 0F 18 /0: PREFETCHNTA m8 */
+    OPCODE_GRPMEMB(grp16_LEG_NP, prefetchnta, 0, SSE, R, Mb)
+OPCODE_GRP_END(grp16_LEG_NP)
+
 #undef FMTI____
 #undef FMTI__R__
 #undef FMTI__RR__
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 41/46] target/i386: introduce SSE2 translators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (39 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 40/46] target/i386: introduce SSE instructions to sse-opcode.inc.h Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 42/46] target/i386: introduce SSE2 code generators Jan Bobek
                   ` (5 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Use the translator macros to define translators required by SSE2
instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 3d526ee470..177bedd0ef 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5891,14 +5891,20 @@ DEF_TRANSLATE_INSN1(Md)
     }
 
 DEF_TRANSLATE_INSN2(Ed, Pq)
+DEF_TRANSLATE_INSN2(Ed, Vdq)
 DEF_TRANSLATE_INSN2(Eq, Pq)
+DEF_TRANSLATE_INSN2(Eq, Vdq)
 DEF_TRANSLATE_INSN2(Gd, Nq)
 DEF_TRANSLATE_INSN2(Gd, Udq)
 DEF_TRANSLATE_INSN2(Gd, Wd)
+DEF_TRANSLATE_INSN2(Gd, Wq)
 DEF_TRANSLATE_INSN2(Gq, Nq)
 DEF_TRANSLATE_INSN2(Gq, Udq)
 DEF_TRANSLATE_INSN2(Gq, Wd)
+DEF_TRANSLATE_INSN2(Gq, Wq)
+DEF_TRANSLATE_INSN2(Md, Gd)
 DEF_TRANSLATE_INSN2(Mdq, Vdq)
+DEF_TRANSLATE_INSN2(Mq, Gq)
 DEF_TRANSLATE_INSN2(Mq, Pq)
 DEF_TRANSLATE_INSN2(Mq, Vdq)
 DEF_TRANSLATE_INSN2(Mq, Vq)
@@ -5906,16 +5912,33 @@ DEF_TRANSLATE_INSN2(Pq, Ed)
 DEF_TRANSLATE_INSN2(Pq, Eq)
 DEF_TRANSLATE_INSN2(Pq, Nq)
 DEF_TRANSLATE_INSN2(Pq, Qq)
+DEF_TRANSLATE_INSN2(Pq, Uq)
+DEF_TRANSLATE_INSN2(Pq, Wdq)
 DEF_TRANSLATE_INSN2(Pq, Wq)
 DEF_TRANSLATE_INSN2(Qq, Pq)
+DEF_TRANSLATE_INSN2(UdqMq, Vq)
 DEF_TRANSLATE_INSN2(Vd, Ed)
 DEF_TRANSLATE_INSN2(Vd, Eq)
 DEF_TRANSLATE_INSN2(Vd, Wd)
+DEF_TRANSLATE_INSN2(Vd, Wq)
+DEF_TRANSLATE_INSN2(Vdq, Ed)
+DEF_TRANSLATE_INSN2(Vdq, Eq)
+DEF_TRANSLATE_INSN2(Vdq, Nq)
 DEF_TRANSLATE_INSN2(Vdq, Qq)
+DEF_TRANSLATE_INSN2(Vdq, Udq)
 DEF_TRANSLATE_INSN2(Vdq, Wdq)
+DEF_TRANSLATE_INSN2(Vdq, Wq)
+DEF_TRANSLATE_INSN2(Vq, Ed)
+DEF_TRANSLATE_INSN2(Vq, Eq)
+DEF_TRANSLATE_INSN2(Vq, Mq)
 DEF_TRANSLATE_INSN2(Vq, UdqMhq)
+DEF_TRANSLATE_INSN2(Vq, Wd)
+DEF_TRANSLATE_INSN2(Vq, Wq)
 DEF_TRANSLATE_INSN2(Wd, Vd)
 DEF_TRANSLATE_INSN2(Wdq, Vdq)
+DEF_TRANSLATE_INSN2(Wq, Vq)
+DEF_TRANSLATE_INSN2(Wq, Wd)
+DEF_TRANSLATE_INSN2(modrm_mod, modrm)
 
 #define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)                           \
     static void translate_insn3(opT1, opT2, opT3)(                      \
@@ -5962,14 +5985,21 @@ DEF_TRANSLATE_INSN2(Wdq, Vdq)
     }
 
 DEF_TRANSLATE_INSN3(Gd, Nq, Ib)
+DEF_TRANSLATE_INSN3(Gd, Udq, Ib)
 DEF_TRANSLATE_INSN3(Gq, Nq, Ib)
+DEF_TRANSLATE_INSN3(Gq, Udq, Ib)
 DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qd)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qq)
 DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
+DEF_TRANSLATE_INSN3(Udq, Udq, Ib)
 DEF_TRANSLATE_INSN3(Vd, Vd, Wd)
+DEF_TRANSLATE_INSN3(Vdq, Vd, Mq)
+DEF_TRANSLATE_INSN3(Vdq, Vdq, Mq)
 DEF_TRANSLATE_INSN3(Vdq, Vdq, Wdq)
 DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
+DEF_TRANSLATE_INSN3(Vdq, Wdq, Ib)
+DEF_TRANSLATE_INSN3(Vq, Vq, Wq)
 
 #define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4)                     \
     static void translate_insn4(opT1, opT2, opT3, opT4)(                \
@@ -6025,8 +6055,11 @@ DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
 
 DEF_TRANSLATE_INSN4(Pq, Pq, RdMw, Ib)
 DEF_TRANSLATE_INSN4(Vd, Vd, Wd, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, RdMw, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Wd, modrm_mod)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Wdq, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Vdq, Wq, modrm_mod)
+DEF_TRANSLATE_INSN4(Vq, Vq, Wq, Ib)
 
 #define OPCODE_GRP_BEGIN(grpname)                                       \
     static void translate_group(grpname)(                               \
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 42/46] target/i386: introduce SSE2 code generators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (40 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 41/46] target/i386: introduce SSE2 translators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h Jan Bobek
                   ` (4 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Introduce code generators required by SSE2 instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 444 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 442 insertions(+), 2 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 177bedd0ef..7ec082e79d 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5391,6 +5391,21 @@ GEN_INSN2(movd, Ed, Pq)
     tcg_gen_ld_i32(arg1, cpu_env, arg2 + ofs);
 }
 
+GEN_INSN2(movq, Vdq, Eq);       /* forward declaration */
+GEN_INSN2(movd, Vdq, Ed)
+{
+    const insnop_arg_t(Eq) arg2_r64 = tcg_temp_new_i64();
+    tcg_gen_extu_i32_i64(arg2_r64, arg2);
+    gen_insn2(movq, Vdq, Eq)(env, s, arg1, arg2_r64);
+    tcg_temp_free_i64(arg2_r64);
+}
+
+GEN_INSN2(movd, Ed, Vdq)
+{
+    const insnop_arg_t(Vdq) ofs = offsetof(ZMMReg, ZMM_L(0));
+    tcg_gen_ld_i32(arg1, cpu_env, arg2 + ofs);
+}
+
 GEN_INSN2(movq, Pq, Eq)
 {
     const insnop_arg_t(Pq) ofs = offsetof(MMXReg, MMX_Q(0));
@@ -5403,12 +5418,53 @@ GEN_INSN2(movq, Eq, Pq)
     tcg_gen_ld_i64(arg1, cpu_env, arg2 + ofs);
 }
 
+GEN_INSN2(movq, Vdq, Eq)
+{
+    const insnop_arg_t(Vdq) ofs0 = offsetof(ZMMReg, ZMM_Q(0));
+    tcg_gen_st_i64(arg2, cpu_env, arg1 + ofs0);
+
+    const insnop_arg_t(Vdq) ofs1 = offsetof(ZMMReg, ZMM_Q(1));
+    tcg_gen_movi_i64(arg2, 0);
+    tcg_gen_st_i64(arg2, cpu_env, arg1 + ofs1);
+}
+
+GEN_INSN2(movq, Eq, Vdq)
+{
+    const insnop_arg_t(Vdq) ofs = offsetof(ZMMReg, ZMM_Q(0));
+    tcg_gen_ld_i64(arg1, cpu_env, arg2 + ofs);
+}
+
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Pq, Qq, MO_64)
 DEF_GEN_INSN2_GVEC_MM(movq, mov, Qq, Pq, MO_64)
+
+GEN_INSN2(movq, Vdq, Wq)
+{
+    const insnop_arg_t(Vdq) dofs = offsetof(ZMMReg, ZMM_Q(0));
+    const insnop_arg_t(Wq) aofs = offsetof(ZMMReg, ZMM_Q(0));
+    gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+
+    const TCGv_i64 r64z = tcg_const_i64(0);
+    tcg_gen_st_i64(r64z, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+    tcg_temp_free_i64(r64z);
+}
+
+GEN_INSN2(movq, UdqMq, Vq)
+{
+    gen_insn2(movq, Vdq, Wq)(env, s, arg1, arg2);
+}
+
 DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Vdq, Wdq, MO_64)
 DEF_GEN_INSN2_GVEC_XMM(movaps, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movapd, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movapd, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqa, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqa, mov, Wdq, Vdq, MO_64)
 DEF_GEN_INSN2_GVEC_XMM(movups, mov, Vdq, Wdq, MO_64)
 DEF_GEN_INSN2_GVEC_XMM(movups, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movupd, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movupd, mov, Wdq, Vdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqu, mov, Vdq, Wdq, MO_64)
+DEF_GEN_INSN2_GVEC_XMM(movdqu, mov, Wdq, Vdq, MO_64)
 
 GEN_INSN2(movss, Wd, Vd);       /* forward declaration */
 GEN_INSN4(movss, Vdq, Vdq, Wd, modrm_mod)
@@ -5442,6 +5498,44 @@ GEN_INSN2(movss, Wd, Vd)
     gen_op_movl(s, arg1 + dofs, arg2 + aofs);
 }
 
+GEN_INSN2(movsd, Wq, Vq);       /* forward declaration */
+GEN_INSN4(movsd, Vdq, Vdq, Wq, modrm_mod)
+{
+    assert(arg1 == arg2);
+
+    if (arg4 == 3) {
+        /* merging movsd */
+        gen_insn2(movsd, Wq, Vq)(env, s, arg1, arg3);
+    } else {
+        /* zero-extending movsd */
+        gen_insn2(movq, Vdq, Wq)(env, s, arg1, arg3);
+    }
+}
+
+GEN_INSN2(movsd, Wq, Vq)
+{
+    const size_t ofs = offsetof(ZMMReg, ZMM_Q(0));
+    gen_op_movq(s, arg1 + ofs, arg2 + ofs);
+}
+
+GEN_INSN2(movq2dq, Vdq, Nq)
+{
+    const insnop_arg_t(Vdq) dofs = offsetof(ZMMReg, ZMM_Q(0));
+    const insnop_arg_t(Nq) aofs = offsetof(MMXReg, MMX_Q(0));
+    gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+
+    const TCGv_i64 r64z = tcg_const_i64(0);
+    tcg_gen_st_i64(r64z, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+    tcg_temp_free_i64(r64z);
+}
+
+GEN_INSN2(movdq2q, Pq, Uq)
+{
+    const insnop_arg_t(Pq) dofs = offsetof(MMXReg, MMX_Q(0));
+    const insnop_arg_t(Uq) aofs = offsetof(ZMMReg, ZMM_Q(0));
+    gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+}
+
 GEN_INSN2(movhlps, Vq, UdqMhq)
 {
     const size_t dofs = offsetof(ZMMReg, ZMM_Q(0));
@@ -5455,6 +5549,17 @@ GEN_INSN2(movlps, Mq, Vq)
     gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
 }
 
+GEN_INSN2(movlpd, Vq, Mq)
+{
+    assert(arg2 == s->A0);
+    gen_ldq_env_A0(s, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+}
+
+GEN_INSN2(movlpd, Mq, Vq)
+{
+    gen_insn2(movlps, Mq, Vq)(env, s, arg1, arg2);
+}
+
 GEN_INSN3(movlhps, Vdq, Vq, Wq)
 {
     assert(arg1 == arg2);
@@ -5470,6 +5575,18 @@ GEN_INSN2(movhps, Mq, Vdq)
     gen_stq_env_A0(s, arg2 + offsetof(ZMMReg, ZMM_Q(1)));
 }
 
+GEN_INSN3(movhpd, Vdq, Vd, Mq)
+{
+    assert(arg1 == arg2);
+    assert(arg3 == s->A0);
+    gen_ldq_env_A0(s, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+}
+
+GEN_INSN2(movhpd, Mq, Vdq)
+{
+    gen_insn2(movhps, Mq, Vdq)(env, s, arg1, arg2);
+}
+
 DEF_GEN_INSN2_HELPER_DEP(pmovmskb, pmovmskb_mmx, Gd, Nq)
 
 GEN_INSN2(pmovmskb, Gq, Nq)
@@ -5480,6 +5597,16 @@ GEN_INSN2(pmovmskb, Gq, Nq)
     tcg_temp_free_i32(arg1_r32);
 }
 
+DEF_GEN_INSN2_HELPER_DEP(pmovmskb, pmovmskb_xmm, Gd, Udq)
+
+GEN_INSN2(pmovmskb, Gq, Udq)
+{
+    const TCGv_i32 arg1_r32 = tcg_temp_new_i32();
+    gen_insn2(pmovmskb, Gd, Udq)(env, s, arg1_r32, arg2);
+    tcg_gen_extu_i32_i64(arg1, arg1_r32);
+    tcg_temp_free_i32(arg1_r32);
+}
+
 DEF_GEN_INSN2_HELPER_DEP(movmskps, movmskps, Gd, Udq)
 
 GEN_INSN2(movmskps, Gq, Udq)
@@ -5490,78 +5617,155 @@ GEN_INSN2(movmskps, Gq, Udq)
     tcg_temp_free_i32(arg1_r32);
 }
 
+DEF_GEN_INSN2_HELPER_DEP(movmskpd, movmskpd, Gd, Udq)
+
+GEN_INSN2(movmskpd, Gq, Udq)
+{
+    const TCGv_i32 arg1_r32 = tcg_temp_new_i32();
+    gen_insn2(movmskpd, Gd, Udq)(env, s, arg1_r32, arg2);
+    tcg_gen_extu_i32_i64(arg1, arg1_r32);
+    tcg_temp_free_i32(arg1_r32);
+}
+
 DEF_GEN_INSN3_GVEC_MM(paddb, add, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(paddb, add, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddw, add, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(paddw, add, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(paddd, add, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_XMM(paddd, add, Vdq, Vdq, Wdq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(paddq, add, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(paddq, add, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(paddsb, ssadd, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(paddsb, ssadd, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddsw, ssadd, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(paddsw, ssadd, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(paddusb, usadd, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(paddusb, usadd, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddusw, usadd, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(paddusw, usadd, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_HELPER_EPP(addps, addps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(addss, addss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(addpd, addpd, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(addsd, addsd, Vq, Vq, Wq)
 
 DEF_GEN_INSN3_GVEC_MM(psubb, sub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(psubb, sub, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubw, sub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(psubw, sub, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(psubd, sub, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_XMM(psubd, sub, Vdq, Vdq, Wdq, MO_32)
+DEF_GEN_INSN3_GVEC_MM(psubq, sub, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(psubq, sub, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(psubsb, sssub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(psubsb, sssub, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubsw, sssub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(psubsw, sssub, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(psubusb, ussub, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(psubusb, ussub, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(psubusw, ussub, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(psubusw, ussub, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_HELPER_EPP(subps, subps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(subpd, subpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(subss, subss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(subsd, subsd, Vq, Vq, Wq)
 
 DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(pmulhw, pmulhw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmulhw, pmulhw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(pmulhuw, pmulhuw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmulhuw, pmulhuw_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(pmuludq, pmuludq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmuludq, pmuludq_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(mulps, mulps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(mulpd, mulpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(mulss, mulss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(mulsd, mulsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(pmaddwd, pmaddwd_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pmaddwd, pmaddwd_xmm, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_HELPER_EPP(divps, divps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(divss, divss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(divpd, divpd, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(divsd, divsd, Vq, Vq, Wq)
 
 DEF_GEN_INSN2_HELPER_EPP(rcpps, rcpps, Vdq, Wdq)
 DEF_GEN_INSN2_HELPER_EPP(rcpss, rcpss, Vd, Wd)
 DEF_GEN_INSN2_HELPER_EPP(sqrtps, sqrtps, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(sqrtpd, sqrtpd, Vdq, Wdq)
 DEF_GEN_INSN2_HELPER_EPP(sqrtss, sqrtss, Vd, Wd)
+DEF_GEN_INSN2_HELPER_EPP(sqrtsd, sqrtsd, Vq, Wq)
 DEF_GEN_INSN2_HELPER_EPP(rsqrtps, rsqrtps, Vdq, Wdq)
 DEF_GEN_INSN2_HELPER_EPP(rsqrtss, rsqrtss, Vd, Wd)
 
 DEF_GEN_INSN3_GVEC_MM(pminub, umin, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(pminub, umin, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(pminsw, smin, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(pminsw, smin, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_HELPER_EPP(minps, minps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(minpd, minpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(minss, minss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(minsd, minsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_GVEC_MM(pmaxub, umax, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(pmaxub, umax, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(pmaxsw, smax, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(pmaxsw, smax, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_HELPER_EPP(maxps, maxps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(maxpd, maxpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(maxss, maxss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(maxsd, maxsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(pavgb, pavgb_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pavgb, pavgb_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(pavgw, pavgw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pavgw, pavgw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(psadbw, psadbw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psadbw, psadbw_xmm, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_GVEC_MM(pcmpeqb, cmpeq, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(pcmpeqb, cmpeq, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(pcmpeqw, cmpeq, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(pcmpeqw, cmpeq, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(pcmpeqd, cmpeq, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_XMM(pcmpeqd, cmpeq, Vdq, Vdq, Wdq, MO_32)
 DEF_GEN_INSN3_GVEC_MM(pcmpgtb, cmpgt, Pq, Pq, Qq, MO_8)
+DEF_GEN_INSN3_GVEC_XMM(pcmpgtb, cmpgt, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(pcmpgtw, cmpgt, Pq, Pq, Qq, MO_16)
+DEF_GEN_INSN3_GVEC_XMM(pcmpgtw, cmpgt, Vdq, Vdq, Wdq, MO_16)
 DEF_GEN_INSN3_GVEC_MM(pcmpgtd, cmpgt, Pq, Pq, Qq, MO_32)
+DEF_GEN_INSN3_GVEC_XMM(pcmpgtd, cmpgt, Vdq, Vdq, Wdq, MO_32)
 
 DEF_GEN_INSN3_HELPER_EPP(cmpeqps, cmpeqps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpeqpd, cmpeqpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpeqss, cmpeqss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpeqsd, cmpeqsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(cmpltps, cmpltps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpltpd, cmpltpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpltss, cmpltss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpltsd, cmpltsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(cmpleps, cmpleps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmplepd, cmplepd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpless, cmpless, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmplesd, cmplesd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(cmpunordps, cmpunordps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpunordpd, cmpunordpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpunordss, cmpunordss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpunordsd, cmpunordsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(cmpneqps, cmpneqps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpneqpd, cmpneqpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpneqss, cmpneqss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpneqsd, cmpneqsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(cmpnltps, cmpnltps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpnltpd, cmpnltpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpnltss, cmpnltss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpnltsd, cmpnltsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(cmpnleps, cmpnleps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpnlepd, cmpnlepd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpnless, cmpnless, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpnlesd, cmpnlesd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(cmpordps, cmpordps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(cmpordpd, cmpordpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(cmpordss, cmpordss, Vd, Vd, Wd)
+DEF_GEN_INSN3_HELPER_EPP(cmpordsd, cmpordsd, Vq, Vq, Wq)
 
 GEN_INSN4(cmpps, Vdq, Vdq, Wdq, Ib)
 {
@@ -5595,6 +5799,38 @@ GEN_INSN4(cmpps, Vdq, Vdq, Wdq, Ib)
     g_assert_not_reached();
 }
 
+GEN_INSN4(cmppd, Vdq, Vdq, Wdq, Ib)
+{
+    switch (arg4 & 7) {
+    case 0:
+        gen_insn3(cmpeqpd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 1:
+        gen_insn3(cmpltpd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 2:
+        gen_insn3(cmplepd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 3:
+        gen_insn3(cmpunordpd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 4:
+        gen_insn3(cmpneqpd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 5:
+        gen_insn3(cmpnltpd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 6:
+        gen_insn3(cmpnlepd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    case 7:
+        gen_insn3(cmpordpd, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3);
+        return;
+    }
+
+    g_assert_not_reached();
+}
+
 GEN_INSN4(cmpss, Vd, Vd, Wd, Ib)
 {
     switch (arg4 & 7) {
@@ -5627,26 +5863,78 @@ GEN_INSN4(cmpss, Vd, Vd, Wd, Ib)
     g_assert_not_reached();
 }
 
+GEN_INSN4(cmpsd, Vq, Vq, Wq, Ib)
+{
+    switch (arg4 & 7) {
+    case 0:
+        gen_insn3(cmpeqsd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    case 1:
+        gen_insn3(cmpltsd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    case 2:
+        gen_insn3(cmplesd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    case 3:
+        gen_insn3(cmpunordsd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    case 4:
+        gen_insn3(cmpneqsd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    case 5:
+        gen_insn3(cmpnltsd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    case 6:
+        gen_insn3(cmpnlesd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    case 7:
+        gen_insn3(cmpordsd, Vq, Vq, Wq)(env, s, arg1, arg2, arg3);
+        return;
+    }
+
+    g_assert_not_reached();
+}
+
 DEF_GEN_INSN2_HELPER_EPP(comiss, comiss, Vd, Wd)
+DEF_GEN_INSN2_HELPER_EPP(comisd, comisd, Vq, Wq)
 DEF_GEN_INSN2_HELPER_EPP(ucomiss, ucomiss, Vd, Wd)
+DEF_GEN_INSN2_HELPER_EPP(ucomisd, ucomisd, Vq, Wq)
 
 DEF_GEN_INSN3_GVEC_MM(pand, and, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(pand, and, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_XMM(andps, and, Vdq, Vdq, Wdq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(andpd, and, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(pandn, andn, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(pandn, andn, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_XMM(andnps, andn, Vdq, Vdq, Wdq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(andnpd, andn, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(por, or, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(por, or, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_XMM(orps, or, Vdq, Vdq, Wdq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(orpd, or, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_MM(pxor, xor, Pq, Pq, Qq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(pxor, xor, Vdq, Vdq, Wdq, MO_64)
 DEF_GEN_INSN3_GVEC_XMM(xorps, xor, Vdq, Vdq, Wdq, MO_64)
+DEF_GEN_INSN3_GVEC_XMM(xorpd, xor, Vdq, Vdq, Wdq, MO_64)
 
 DEF_GEN_INSN3_HELPER_EPP(psllw, psllw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psllw, psllw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(pslld, pslld_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(pslld, pslld_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(psllq, psllq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psllq, psllq_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(pslldq, pslldq_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(psrlw, psrlw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrlw, psrlw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(psrld, psrld_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrld, psrld_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(psrlq, psrlq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrlq, psrlq_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(psrldq, psrldq_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(psraw, psraw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psraw, psraw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(psrad, psrad_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(psrad, psrad_xmm, Vdq, Vdq, Wdq)
 
 #define DEF_GEN_PSHIFT_IMM_MM(mnem, opT1, opT2)                         \
     GEN_INSN3(mnem, opT1, opT2, Ib)                                     \
@@ -5660,31 +5948,70 @@ DEF_GEN_INSN3_HELPER_EPP(psrad, psrad_mmx, Pq, Pq, Qq)
         gen_insn2(movq, Pq, Eq)(env, s, arg3_mm, arg3_r64);             \
         gen_insn3(mnem, Pq, Pq, Qq)(env, s, arg1, arg2, arg3_mm);       \
     }
+#define DEF_GEN_PSHIFT_IMM_XMM(mnem, opT1, opT2)                        \
+    GEN_INSN3(mnem, opT1, opT2, Ib)                                     \
+    {                                                                   \
+        const uint64_t arg3_ui64 = (uint8_t)arg3;                       \
+        const insnop_arg_t(Eq) arg3_r64 = s->tmp1_i64;                  \
+        const insnop_arg_t(Wdq) arg3_xmm =                              \
+            offsetof(CPUX86State, xmm_t0.ZMM_Q(0));                     \
+                                                                        \
+        tcg_gen_movi_i64(arg3_r64, arg3_ui64);                          \
+        gen_insn2(movq, Vdq, Eq)(env, s, arg3_xmm, arg3_r64);           \
+        gen_insn3(mnem, Vdq, Vdq, Wdq)(env, s, arg1, arg2, arg3_xmm);   \
+    }
 
 DEF_GEN_PSHIFT_IMM_MM(psllw, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(psllw, Udq, Udq)
 DEF_GEN_PSHIFT_IMM_MM(pslld, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(pslld, Udq, Udq)
 DEF_GEN_PSHIFT_IMM_MM(psllq, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(psllq, Udq, Udq)
+DEF_GEN_PSHIFT_IMM_XMM(pslldq, Udq, Udq)
 DEF_GEN_PSHIFT_IMM_MM(psrlw, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(psrlw, Udq, Udq)
 DEF_GEN_PSHIFT_IMM_MM(psrld, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(psrld, Udq, Udq)
 DEF_GEN_PSHIFT_IMM_MM(psrlq, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(psrlq, Udq, Udq)
+DEF_GEN_PSHIFT_IMM_XMM(psrldq, Udq, Udq)
 DEF_GEN_PSHIFT_IMM_MM(psraw, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(psraw, Udq, Udq)
 DEF_GEN_PSHIFT_IMM_MM(psrad, Nq, Nq)
+DEF_GEN_PSHIFT_IMM_XMM(psrad, Udq, Udq)
 
 DEF_GEN_INSN3_HELPER_EPP(packsswb, packsswb_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(packsswb, packsswb_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(packssdw, packssdw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(packssdw, packssdw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(packuswb, packuswb_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(packuswb, packuswb_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(punpcklbw, punpcklbw_mmx, Pq, Pq, Qd)
+DEF_GEN_INSN3_HELPER_EPP(punpcklbw, punpcklbw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(punpcklwd, punpcklwd_mmx, Pq, Pq, Qd)
+DEF_GEN_INSN3_HELPER_EPP(punpcklwd, punpcklwd_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(punpckldq, punpckldq_mmx, Pq, Pq, Qd)
+DEF_GEN_INSN3_HELPER_EPP(punpckldq, punpckldq_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(punpcklqdq, punpcklqdq_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(punpckhbw, punpckhbw_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(punpckhbw, punpckhbw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(punpckhwd, punpckhwd_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(punpckhwd, punpckhwd_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(punpckhdq, punpckhdq_mmx, Pq, Pq, Qq)
+DEF_GEN_INSN3_HELPER_EPP(punpckhdq, punpckhdq_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(punpckhqdq, punpckhqdq_xmm, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_HELPER_EPP(unpcklps, punpckldq_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(unpcklpd, punpcklqdq_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(unpckhps, punpckhdq_xmm, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(unpckhpd, punpckhqdq_xmm, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_HELPER_PPI(pshufw, pshufw_mmx, Pq, Qq, Ib)
+DEF_GEN_INSN3_HELPER_PPI(pshuflw, pshuflw_xmm, Vdq, Wdq, Ib)
+DEF_GEN_INSN3_HELPER_PPI(pshufhw, pshufhw_xmm, Vdq, Wdq, Ib)
+DEF_GEN_INSN3_HELPER_PPI(pshufd, pshufd_xmm, Vdq, Wdq, Ib)
 DEF_GEN_INSN4_HELPER_PPI(shufps, shufps, Vdq, Vdq, Wdq, Ib)
+DEF_GEN_INSN4_HELPER_PPI(shufpd, shufpd, Vdq, Vdq, Wdq, Ib)
 
 GEN_INSN4(pinsrw, Pq, Pq, RdMw, Ib)
 {
@@ -5694,6 +6021,14 @@ GEN_INSN4(pinsrw, Pq, Pq, RdMw, Ib)
     tcg_gen_st16_i32(arg3, cpu_env, arg1 + ofs);
 }
 
+GEN_INSN4(pinsrw, Vdq, Vdq, RdMw, Ib)
+{
+    assert(arg1 == arg2);
+
+    const size_t ofs = offsetof(ZMMReg, ZMM_W(arg4 & 7));
+    tcg_gen_st16_i32(arg3, cpu_env, arg1 + ofs);
+}
+
 GEN_INSN3(pextrw, Gd, Nq, Ib)
 {
     const size_t ofs = offsetof(MMXReg, MMX_W(arg3 & 3));
@@ -5706,15 +6041,47 @@ GEN_INSN3(pextrw, Gq, Nq, Ib)
     tcg_gen_ld16u_i64(arg1, cpu_env, arg2 + ofs);
 }
 
+GEN_INSN3(pextrw, Gd, Udq, Ib)
+{
+    const size_t ofs = offsetof(ZMMReg, ZMM_W(arg3 & 7));
+    tcg_gen_ld16u_i32(arg1, cpu_env, arg2 + ofs);
+}
+
+GEN_INSN3(pextrw, Gq, Udq, Ib)
+{
+    const size_t ofs = offsetof(ZMMReg, ZMM_W(arg3 & 7));
+    tcg_gen_ld16u_i64(arg1, cpu_env, arg2 + ofs);
+}
+
 DEF_GEN_INSN2_HELPER_EPP(cvtpi2ps, cvtpi2ps, Vdq, Qq)
 DEF_GEN_INSN2_HELPER_EPD(cvtsi2ss, cvtsi2ss, Vd, Ed)
 DEF_GEN_INSN2_HELPER_EPQ(cvtsi2ss, cvtsq2ss, Vd, Eq)
+DEF_GEN_INSN2_HELPER_EPP(cvtpi2pd, cvtpi2pd, Vdq, Qq)
+DEF_GEN_INSN2_HELPER_EPD(cvtsi2sd, cvtsi2sd, Vq, Ed)
+DEF_GEN_INSN2_HELPER_EPQ(cvtsi2sd, cvtsq2sd, Vq, Eq)
 DEF_GEN_INSN2_HELPER_EPP(cvtps2pi, cvtps2pi, Pq, Wq)
 DEF_GEN_INSN2_HELPER_DEP(cvtss2si, cvtss2si, Gd, Wd)
 DEF_GEN_INSN2_HELPER_QEP(cvtss2si, cvtss2sq, Gq, Wd)
+DEF_GEN_INSN2_HELPER_EPP(cvtpd2pi, cvtpd2pi, Pq, Wdq)
+DEF_GEN_INSN2_HELPER_DEP(cvtsd2si, cvtsd2si, Gd, Wq)
+DEF_GEN_INSN2_HELPER_QEP(cvtsd2si, cvtsd2sq, Gq, Wq)
 DEF_GEN_INSN2_HELPER_EPP(cvttps2pi, cvttps2pi, Pq, Wq)
 DEF_GEN_INSN2_HELPER_DEP(cvttss2si, cvttss2si, Gd, Wd)
 DEF_GEN_INSN2_HELPER_QEP(cvttss2si, cvttss2sq, Gq, Wd)
+DEF_GEN_INSN2_HELPER_EPP(cvttpd2pi, cvttpd2pi, Pq, Wdq)
+DEF_GEN_INSN2_HELPER_DEP(cvttsd2si, cvttsd2si, Gd, Wq)
+DEF_GEN_INSN2_HELPER_QEP(cvttsd2si, cvttsd2sq, Gq, Wq)
+
+DEF_GEN_INSN2_HELPER_EPP(cvtpd2dq, cvtpd2dq, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(cvttpd2dq, cvttpd2dq, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(cvtdq2pd, cvtdq2pd, Vdq, Wq)
+DEF_GEN_INSN2_HELPER_EPP(cvtps2pd, cvtps2pd, Vdq, Wq)
+DEF_GEN_INSN2_HELPER_EPP(cvtpd2ps, cvtpd2ps, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(cvtss2sd, cvtss2sd, Vq, Wd)
+DEF_GEN_INSN2_HELPER_EPP(cvtsd2ss, cvtsd2ss, Vd, Wq)
+DEF_GEN_INSN2_HELPER_EPP(cvtdq2ps, cvtdq2ps, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(cvtps2dq, cvtps2dq, Vdq, Wdq)
+DEF_GEN_INSN2_HELPER_EPP(cvttps2dq, cvttps2dq, Vdq, Wdq)
 
 GEN_INSN2(maskmovq, Pq, Nq)
 {
@@ -5733,26 +6100,99 @@ GEN_INSN2(maskmovq, Pq, Nq)
     tcg_temp_free_ptr(arg2_ptr);
 }
 
+GEN_INSN2(maskmovdqu, Vdq, Udq)
+{
+    const TCGv_ptr arg1_ptr = tcg_temp_new_ptr();
+    const TCGv_ptr arg2_ptr = tcg_temp_new_ptr();
+
+    tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
+    gen_extu(s->aflag, s->A0);
+    gen_add_A0_ds_seg(s);
+
+    tcg_gen_addi_ptr(arg1_ptr, cpu_env, arg1);
+    tcg_gen_addi_ptr(arg2_ptr, cpu_env, arg2);
+    gen_helper_maskmov_xmm(cpu_env, arg1_ptr, arg2_ptr, s->A0);
+
+    tcg_temp_free_ptr(arg1_ptr);
+    tcg_temp_free_ptr(arg2_ptr);
+}
+
 GEN_INSN2(movntps, Mdq, Vdq)
 {
     assert(arg1 == s->A0);
     gen_sto_env_A0(s, arg2);
 }
 
+GEN_INSN2(movntpd, Mdq, Vdq)
+{
+    assert(arg1 == s->A0);
+    gen_sto_env_A0(s, arg2);
+}
+
+GEN_INSN2(movnti, Md, Gd)
+{
+    tcg_gen_qemu_st_i32(arg2, arg1, s->mem_index, MO_LEUL);
+}
+
+GEN_INSN2(movnti, Mq, Gq)
+{
+    tcg_gen_qemu_st_i64(arg2, arg1, s->mem_index, MO_LEQ);
+}
+
 GEN_INSN2(movntq, Mq, Pq)
 {
     assert(arg1 == s->A0);
     gen_stq_env_A0(s, arg2);
 }
 
+GEN_INSN2(movntdq, Mdq, Vdq)
+{
+    assert(arg1 == s->A0);
+    gen_sto_env_A0(s, arg2);
+}
+
+GEN_INSN0(pause)
+{
+    /* handled in disas_insn at the moment */
+    g_assert_not_reached();
+}
+
 DEF_GEN_INSN0_HELPER(emms, emms)
 
-GEN_INSN0(sfence)
+GEN_INSN2(sfence_clflush, modrm_mod, modrm)
+{
+    if (arg1 == 3) {
+        /* sfence */
+        if (s->prefix & PREFIX_LOCK) {
+            gen_illegal_opcode(s);
+        } else {
+            tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);
+        }
+    } else {
+        /* clflush */
+        if (ck_cpuid(env, s, CK_CPUID_CLFLUSH)) {
+            gen_illegal_opcode(s);
+        } else {
+            gen_nop_modrm(env, s, arg2);
+        }
+    }
+}
+
+GEN_INSN0(lfence)
+{
+    if (s->prefix & PREFIX_LOCK) {
+        gen_illegal_opcode(s);
+    } else {
+        tcg_gen_mb(TCG_MO_LD_LD | TCG_BAR_SC);
+    }
+ }
+
+GEN_INSN0(mfence)
 {
     if (s->prefix & PREFIX_LOCK) {
         gen_illegal_opcode(s);
     } else {
-        tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);
+        tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
     }
 }
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (41 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 42/46] target/i386: introduce SSE2 code generators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  8:00   ` Aleksandar Markovic
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 44/46] target/i386: introduce SSE3 translators Jan Bobek
                   ` (3 subsequent siblings)
  46 siblings, 1 reply; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Add all the SSE2 instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/sse-opcode.inc.h | 323 ++++++++++++++++++++++++++++++++++-
 1 file changed, 322 insertions(+), 1 deletion(-)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index 39947aeb51..efa67b7ce2 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -43,241 +43,535 @@
 OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
 /* NP 0F 7E /r: MOVD r/m32,mm */
 OPCODE(movd, LEG(NP, 0F, 0, 0x7e), MMX, WR, Ed, Pq)
+/* 66 0F 6E /r: MOVD xmm,r/m32 */
+OPCODE(movd, LEG(66, 0F, 0, 0x6e), SSE2, WR, Vdq, Ed)
+/* 66 0F 7E /r: MOVD r/m32,xmm */
+OPCODE(movd, LEG(66, 0F, 0, 0x7e), SSE2, WR, Ed, Vdq)
 /* NP REX.W + 0F 6E /r: MOVQ mm,r/m64 */
 OPCODE(movq, LEG(NP, 0F, 1, 0x6e), MMX, WR, Pq, Eq)
 /* NP REX.W + 0F 7E /r: MOVQ r/m64,mm */
 OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
+/* 66 REX.W 0F 6E /r: MOVQ xmm,r/m64 */
+OPCODE(movq, LEG(66, 0F, 1, 0x6e), SSE2, WR, Vdq, Eq)
+/* 66 REX.W 0F 7E /r: MOVQ r/m64,xmm */
+OPCODE(movq, LEG(66, 0F, 1, 0x7e), SSE2, WR, Eq, Vdq)
 /* NP 0F 6F /r: MOVQ mm, mm/m64 */
 OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
 /* NP 0F 7F /r: MOVQ mm/m64, mm */
 OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
+/* F3 0F 7E /r: MOVQ xmm1, xmm2/m64 */
+OPCODE(movq, LEG(F3, 0F, 0, 0x7e), SSE2, WR, Vdq, Wq)
+/* 66 0F D6 /r: MOVQ xmm2/m64, xmm1 */
+OPCODE(movq, LEG(66, 0F, 0, 0xd6), SSE2, WR, UdqMq, Vq)
 /* NP 0F 28 /r: MOVAPS xmm1, xmm2/m128 */
 OPCODE(movaps, LEG(NP, 0F, 0, 0x28), SSE, WR, Vdq, Wdq)
 /* NP 0F 29 /r: MOVAPS xmm2/m128, xmm1 */
 OPCODE(movaps, LEG(NP, 0F, 0, 0x29), SSE, WR, Wdq, Vdq)
+/* 66 0F 28 /r: MOVAPD xmm1, xmm2/m128 */
+OPCODE(movapd, LEG(66, 0F, 0, 0x28), SSE2, WR, Vdq, Wdq)
+/* 66 0F 29 /r: MOVAPD xmm2/m128, xmm1 */
+OPCODE(movapd, LEG(66, 0F, 0, 0x29), SSE2, WR, Wdq, Vdq)
+/* 66 0F 6F /r: MOVDQA xmm1, xmm2/m128 */
+OPCODE(movdqa, LEG(66, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
+/* 66 0F 7F /r: MOVDQA xmm2/m128, xmm1 */
+OPCODE(movdqa, LEG(66, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
 /* NP 0F 10 /r: MOVUPS xmm1, xmm2/m128 */
 OPCODE(movups, LEG(NP, 0F, 0, 0x10), SSE, WR, Vdq, Wdq)
 /* NP 0F 11 /r: MOVUPS xmm2/m128, xmm1 */
 OPCODE(movups, LEG(NP, 0F, 0, 0x11), SSE, WR, Wdq, Vdq)
+/* 66 0F 10 /r: MOVUPD xmm1, xmm2/m128 */
+OPCODE(movupd, LEG(66, 0F, 0, 0x10), SSE2, WR, Vdq, Wdq)
+/* 66 0F 11 /r: MOVUPD xmm2/m128, xmm1 */
+OPCODE(movupd, LEG(66, 0F, 0, 0x11), SSE2, WR, Wdq, Vdq)
+/* F3 0F 6F /r: MOVDQU xmm1,xmm2/m128 */
+OPCODE(movdqu, LEG(F3, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
+/* F3 0F 7F /r: MOVDQU xmm2/m128,xmm1 */
+OPCODE(movdqu, LEG(F3, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
 /* F3 0F 10 /r: MOVSS xmm1, xmm2/m32 */
 OPCODE(movss, LEG(F3, 0F, 0, 0x10), SSE, WRRR, Vdq, Vdq, Wd, modrm_mod)
 /* F3 0F 11 /r: MOVSS xmm2/m32, xmm1 */
 OPCODE(movss, LEG(F3, 0F, 0, 0x11), SSE, WR, Wd, Vd)
+/* F2 0F 10 /r: MOVSD xmm1, xmm2/m64 */
+OPCODE(movsd, LEG(F2, 0F, 0, 0x10), SSE2, WRRR, Vdq, Vdq, Wq, modrm_mod)
+/* F2 0F 11 /r: MOVSD xmm1/m64, xmm2 */
+OPCODE(movsd, LEG(F2, 0F, 0, 0x11), SSE2, WR, Wq, Vq)
+/* F3 0F D6 /r: MOVQ2DQ xmm, mm */
+OPCODE(movq2dq, LEG(F3, 0F, 0, 0xd6), SSE2, WR, Vdq, Nq)
+/* F2 0F D6 /r: MOVDQ2Q mm, xmm */
+OPCODE(movdq2q, LEG(F2, 0F, 0, 0xd6), SSE2, WR, Pq, Uq)
 /* NP 0F 12 /r: MOVHLPS xmm1, xmm2 */
 /* NP 0F 12 /r: MOVLPS xmm1, m64 */
 OPCODE(movhlps, LEG(NP, 0F, 0, 0x12), SSE, WR, Vq, UdqMhq)
 /* 0F 13 /r: MOVLPS m64, xmm1 */
 OPCODE(movlps, LEG(NP, 0F, 0, 0x13), SSE, WR, Mq, Vq)
+/* 66 0F 12 /r: MOVLPD xmm1,m64 */
+OPCODE(movlpd, LEG(66, 0F, 0, 0x12), SSE2, WR, Vq, Mq)
+/* 66 0F 13 /r: MOVLPD m64,xmm1 */
+OPCODE(movlpd, LEG(66, 0F, 0, 0x13), SSE2, WR, Mq, Vq)
 /* NP 0F 16 /r: MOVLHPS xmm1, xmm2 */
 /* NP 0F 16 /r: MOVHPS xmm1, m64 */
 OPCODE(movlhps, LEG(NP, 0F, 0, 0x16), SSE, WRR, Vdq, Vq, Wq)
 /* NP 0F 17 /r: MOVHPS m64, xmm1 */
 OPCODE(movhps, LEG(NP, 0F, 0, 0x17), SSE, WR, Mq, Vdq)
+/* 66 0F 16 /r: MOVHPD xmm1, m64 */
+OPCODE(movhpd, LEG(66, 0F, 0, 0x16), SSE2, WRR, Vdq, Vd, Mq)
+/* 66 0F 17 /r: MOVHPD m64, xmm1 */
+OPCODE(movhpd, LEG(66, 0F, 0, 0x17), SSE2, WR, Mq, Vdq)
 /* NP 0F D7 /r: PMOVMSKB r32, mm */
 OPCODE(pmovmskb, LEG(NP, 0F, 0, 0xd7), SSE, WR, Gd, Nq)
 /* NP REX.W 0F D7 /r: PMOVMSKB r64, mm */
 OPCODE(pmovmskb, LEG(NP, 0F, 1, 0xd7), SSE, WR, Gq, Nq)
+/* 66 0F D7 /r: PMOVMSKB r32, xmm */
+OPCODE(pmovmskb, LEG(66, 0F, 0, 0xd7), SSE2, WR, Gd, Udq)
+/* 66 REX.W 0F D7 /r: PMOVMSKB r64, xmm */
+OPCODE(pmovmskb, LEG(66, 0F, 1, 0xd7), SSE2, WR, Gq, Udq)
 /* NP 0F 50 /r: MOVMSKPS r32, xmm */
 OPCODE(movmskps, LEG(NP, 0F, 0, 0x50), SSE, WR, Gd, Udq)
 /* NP REX.W 0F 50 /r: MOVMSKPS r64, xmm */
 OPCODE(movmskps, LEG(NP, 0F, 1, 0x50), SSE, WR, Gq, Udq)
+/* 66 0F 50 /r: MOVMSKPD r32, xmm */
+OPCODE(movmskpd, LEG(66, 0F, 0, 0x50), SSE2, WR, Gd, Udq)
+/* 66 REX.W 0F 50 /r: MOVMSKPD r64, xmm */
+OPCODE(movmskpd, LEG(66, 0F, 1, 0x50), SSE2, WR, Gq, Udq)
 /* NP 0F FC /r: PADDB mm, mm/m64 */
 OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F FC /r: PADDB xmm1, xmm2/m128 */
+OPCODE(paddb, LEG(66, 0F, 0, 0xfc), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F FD /r: PADDW mm, mm/m64 */
 OPCODE(paddw, LEG(NP, 0F, 0, 0xfd), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F FD /r: PADDW xmm1, xmm2/m128 */
+OPCODE(paddw, LEG(66, 0F, 0, 0xfd), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F FE /r: PADDD mm, mm/m64 */
 OPCODE(paddd, LEG(NP, 0F, 0, 0xfe), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F FE /r: PADDD xmm1, xmm2/m128 */
+OPCODE(paddd, LEG(66, 0F, 0, 0xfe), SSE2, WRR, Vdq, Vdq, Wdq)
+/* NP 0F D4 /r: PADDQ mm, mm/m64 */
+OPCODE(paddq, LEG(NP, 0F, 0, 0xd4), SSE2, WRR, Pq, Pq, Qq)
+/* 66 0F D4 /r: PADDQ xmm1, xmm2/m128 */
+OPCODE(paddq, LEG(66, 0F, 0, 0xd4), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F EC /r: PADDSB mm, mm/m64 */
 OPCODE(paddsb, LEG(NP, 0F, 0, 0xec), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F EC /r: PADDSB xmm1, xmm2/m128 */
+OPCODE(paddsb, LEG(66, 0F, 0, 0xec), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F ED /r: PADDSW mm, mm/m64 */
 OPCODE(paddsw, LEG(NP, 0F, 0, 0xed), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F ED /r: PADDSW xmm1, xmm2/m128 */
+OPCODE(paddsw, LEG(66, 0F, 0, 0xed), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F DC /r: PADDUSB mm,mm/m64 */
 OPCODE(paddusb, LEG(NP, 0F, 0, 0xdc), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F DC /r: PADDUSB xmm1,xmm2/m128 */
+OPCODE(paddusb, LEG(66, 0F, 0, 0xdc), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F DD /r: PADDUSW mm,mm/m64 */
 OPCODE(paddusw, LEG(NP, 0F, 0, 0xdd), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F DD /r: PADDUSW xmm1,xmm2/m128 */
+OPCODE(paddusw, LEG(66, 0F, 0, 0xdd), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 58 /r: ADDPS xmm1, xmm2/m128 */
 OPCODE(addps, LEG(NP, 0F, 0, 0x58), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 58 /r: ADDPD xmm1, xmm2/m128 */
+OPCODE(addpd, LEG(66, 0F, 0, 0x58), SSE2, WRR, Vdq, Vdq, Wdq)
 /* F3 0F 58 /r: ADDSS xmm1, xmm2/m32 */
 OPCODE(addss, LEG(F3, 0F, 0, 0x58), SSE, WRR, Vd, Vd, Wd)
+/* F2 0F 58 /r: ADDSD xmm1, xmm2/m64 */
+OPCODE(addsd, LEG(F2, 0F, 0, 0x58), SSE2, WRR, Vq, Vq, Wq)
 /* NP 0F F8 /r: PSUBB mm, mm/m64 */
 OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F F8 /r: PSUBB xmm1, xmm2/m128 */
+OPCODE(psubb, LEG(66, 0F, 0, 0xf8), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F9 /r: PSUBW mm, mm/m64 */
 OPCODE(psubw, LEG(NP, 0F, 0, 0xf9), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F F9 /r: PSUBW xmm1, xmm2/m128 */
+OPCODE(psubw, LEG(66, 0F, 0, 0xf9), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F FA /r: PSUBD mm, mm/m64 */
 OPCODE(psubd, LEG(NP, 0F, 0, 0xfa), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F FA /r: PSUBD xmm1, xmm2/m128 */
+OPCODE(psubd, LEG(66, 0F, 0, 0xfa), SSE2, WRR, Vdq, Vdq, Wdq)
+/* NP 0F FB /r: PSUBQ mm1, mm2/m64 */
+OPCODE(psubq, LEG(NP, 0F, 0, 0xfb), SSE2, WRR, Pq, Pq, Qq)
+/* 66 0F FB /r: PSUBQ xmm1, xmm2/m128 */
+OPCODE(psubq, LEG(66, 0F, 0, 0xfb), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F E8 /r: PSUBSB mm, mm/m64 */
 OPCODE(psubsb, LEG(NP, 0F, 0, 0xe8), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F E8 /r: PSUBSB xmm1, xmm2/m128 */
+OPCODE(psubsb, LEG(66, 0F, 0, 0xe8), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F E9 /r: PSUBSW mm, mm/m64 */
 OPCODE(psubsw, LEG(NP, 0F, 0, 0xe9), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F E9 /r: PSUBSW xmm1, xmm2/m128 */
+OPCODE(psubsw, LEG(66, 0F, 0, 0xe9), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F D8 /r: PSUBUSB mm, mm/m64 */
 OPCODE(psubusb, LEG(NP, 0F, 0, 0xd8), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F D8 /r: PSUBUSB xmm1, xmm2/m128 */
+OPCODE(psubusb, LEG(66, 0F, 0, 0xd8), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F D9 /r: PSUBUSW mm, mm/m64 */
 OPCODE(psubusw, LEG(NP, 0F, 0, 0xd9), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F D9 /r: PSUBUSW xmm1, xmm2/m128 */
+OPCODE(psubusw, LEG(66, 0F, 0, 0xd9), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 5C /r: SUBPS xmm1, xmm2/m128 */
 OPCODE(subps, LEG(NP, 0F, 0, 0x5c), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 5C /r: SUBPD xmm1, xmm2/m128 */
+OPCODE(subpd, LEG(66, 0F, 0, 0x5c), SSE2, WRR, Vdq, Vdq, Wdq)
 /* F3 0F 5C /r: SUBSS xmm1, xmm2/m32 */
 OPCODE(subss, LEG(F3, 0F, 0, 0x5c), SSE, WRR, Vd, Vd, Wd)
+/* F2 0F 5C /r: SUBSD xmm1, xmm2/m64 */
+OPCODE(subsd, LEG(F2, 0F, 0, 0x5c), SSE2, WRR, Vq, Vq, Wq)
 /* NP 0F D5 /r: PMULLW mm, mm/m64 */
 OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F D5 /r: PMULLW xmm1, xmm2/m128 */
+OPCODE(pmullw, LEG(66, 0F, 0, 0xd5), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F E5 /r: PMULHW mm, mm/m64 */
 OPCODE(pmulhw, LEG(NP, 0F, 0, 0xe5), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F E5 /r: PMULHW xmm1, xmm2/m128 */
+OPCODE(pmulhw, LEG(66, 0F, 0, 0xe5), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F E4 /r: PMULHUW mm1, mm2/m64 */
 OPCODE(pmulhuw, LEG(NP, 0F, 0, 0xe4), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F E4 /r: PMULHUW xmm1, xmm2/m128 */
+OPCODE(pmulhuw, LEG(66, 0F, 0, 0xe4), SSE2, WRR, Vdq, Vdq, Wdq)
+/* NP 0F F4 /r: PMULUDQ mm1, mm2/m64 */
+OPCODE(pmuludq, LEG(NP, 0F, 0, 0xf4), SSE2, WRR, Pq, Pq, Qq)
+/* 66 0F F4 /r: PMULUDQ xmm1, xmm2/m128 */
+OPCODE(pmuludq, LEG(66, 0F, 0, 0xf4), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 59 /r: MULPS xmm1, xmm2/m128 */
 OPCODE(mulps, LEG(NP, 0F, 0, 0x59), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 59 /r: MULPD xmm1, xmm2/m128 */
+OPCODE(mulpd, LEG(66, 0F, 0, 0x59), SSE2, WRR, Vdq, Vdq, Wdq)
 /* F3 0F 59 /r: MULSS xmm1,xmm2/m32 */
 OPCODE(mulss, LEG(F3, 0F, 0, 0x59), SSE, WRR, Vd, Vd, Wd)
+/* F2 0F 59 /r: MULSD xmm1,xmm2/m64 */
+OPCODE(mulsd, LEG(F2, 0F, 0, 0x59), SSE2, WRR, Vq, Vq, Wq)
 /* NP 0F F5 /r: PMADDWD mm, mm/m64 */
 OPCODE(pmaddwd, LEG(NP, 0F, 0, 0xf5), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F F5 /r: PMADDWD xmm1, xmm2/m128 */
+OPCODE(pmaddwd, LEG(66, 0F, 0, 0xf5), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 5E /r: DIVPS xmm1, xmm2/m128 */
 OPCODE(divps, LEG(NP, 0F, 0, 0x5e), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 5E /r: DIVPD xmm1, xmm2/m128 */
+OPCODE(divpd, LEG(66, 0F, 0, 0x5e), SSE2, WRR, Vdq, Vdq, Wdq)
 /* F3 0F 5E /r: DIVSS xmm1, xmm2/m32 */
 OPCODE(divss, LEG(F3, 0F, 0, 0x5e), SSE, WRR, Vd, Vd, Wd)
+/* F2 0F 5E /r: DIVSD xmm1, xmm2/m64 */
+OPCODE(divsd, LEG(F2, 0F, 0, 0x5e), SSE2, WRR, Vq, Vq, Wq)
 /* NP 0F 53 /r: RCPPS xmm1, xmm2/m128 */
 OPCODE(rcpps, LEG(NP, 0F, 0, 0x53), SSE, WR, Vdq, Wdq)
 /* F3 0F 53 /r: RCPSS xmm1, xmm2/m32 */
 OPCODE(rcpss, LEG(F3, 0F, 0, 0x53), SSE, WR, Vd, Wd)
 /* NP 0F 51 /r: SQRTPS xmm1, xmm2/m128 */
 OPCODE(sqrtps, LEG(NP, 0F, 0, 0x51), SSE, WR, Vdq, Wdq)
+/* 66 0F 51 /r: SQRTPD xmm1, xmm2/m128 */
+OPCODE(sqrtpd, LEG(66, 0F, 0, 0x51), SSE2, WR, Vdq, Wdq)
 /* F3 0F 51 /r: SQRTSS xmm1, xmm2/m32 */
 OPCODE(sqrtss, LEG(F3, 0F, 0, 0x51), SSE, WR, Vd, Wd)
+/* F2 0F 51 /r: SQRTSD xmm1,xmm2/m64 */
+OPCODE(sqrtsd, LEG(F2, 0F, 0, 0x51), SSE2, WR, Vq, Wq)
 /* NP 0F 52 /r: RSQRTPS xmm1, xmm2/m128 */
 OPCODE(rsqrtps, LEG(NP, 0F, 0, 0x52), SSE, WR, Vdq, Wdq)
 /* F3 0F 52 /r: RSQRTSS xmm1, xmm2/m32 */
 OPCODE(rsqrtss, LEG(F3, 0F, 0, 0x52), SSE, WR, Vd, Wd)
 /* NP 0F DA /r: PMINUB mm1, mm2/m64 */
 OPCODE(pminub, LEG(NP, 0F, 0, 0xda), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F DA /r: PMINUB xmm1, xmm2/m128 */
+OPCODE(pminub, LEG(66, 0F, 0, 0xda), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F EA /r: PMINSW mm1, mm2/m64 */
 OPCODE(pminsw, LEG(NP, 0F, 0, 0xea), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F EA /r: PMINSW xmm1, xmm2/m128 */
+OPCODE(pminsw, LEG(66, 0F, 0, 0xea), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 5D /r: MINPS xmm1, xmm2/m128 */
 OPCODE(minps, LEG(NP, 0F, 0, 0x5d), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 5D /r: MINPD xmm1, xmm2/m128 */
+OPCODE(minpd, LEG(66, 0F, 0, 0x5d), SSE2, WRR, Vdq, Vdq, Wdq)
 /* F3 0F 5D /r: MINSS xmm1,xmm2/m32 */
 OPCODE(minss, LEG(F3, 0F, 0, 0x5d), SSE, WRR, Vd, Vd, Wd)
+/* F2 0F 5D /r: MINSD xmm1, xmm2/m64 */
+OPCODE(minsd, LEG(F2, 0F, 0, 0x5d), SSE2, WRR, Vq, Vq, Wq)
 /* NP 0F DE /r: PMAXUB mm1, mm2/m64 */
 OPCODE(pmaxub, LEG(NP, 0F, 0, 0xde), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F DE /r: PMAXUB xmm1, xmm2/m128 */
+OPCODE(pmaxub, LEG(66, 0F, 0, 0xde), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F EE /r: PMAXSW mm1, mm2/m64 */
 OPCODE(pmaxsw, LEG(NP, 0F, 0, 0xee), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F EE /r: PMAXSW xmm1, xmm2/m128 */
+OPCODE(pmaxsw, LEG(66, 0F, 0, 0xee), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 5F /r: MAXPS xmm1, xmm2/m128 */
 OPCODE(maxps, LEG(NP, 0F, 0, 0x5f), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 5F /r: MAXPD xmm1, xmm2/m128 */
+OPCODE(maxpd, LEG(66, 0F, 0, 0x5f), SSE2, WRR, Vdq, Vdq, Wdq)
 /* F3 0F 5F /r: MAXSS xmm1, xmm2/m32 */
 OPCODE(maxss, LEG(F3, 0F, 0, 0x5f), SSE, WRR, Vd, Vd, Wd)
+/* F2 0F 5F /r: MAXSD xmm1, xmm2/m64 */
+OPCODE(maxsd, LEG(F2, 0F, 0, 0x5f), SSE2, WRR, Vq, Vq, Wq)
 /* NP 0F E0 /r: PAVGB mm1, mm2/m64 */
 OPCODE(pavgb, LEG(NP, 0F, 0, 0xe0), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F E0 /r: PAVGB xmm1, xmm2/m128 */
+OPCODE(pavgb, LEG(66, 0F, 0, 0xe0), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F E3 /r: PAVGW mm1, mm2/m64 */
 OPCODE(pavgw, LEG(NP, 0F, 0, 0xe3), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F E3 /r: PAVGW xmm1, xmm2/m128 */
+OPCODE(pavgw, LEG(66, 0F, 0, 0xe3), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F6 /r: PSADBW mm1, mm2/m64 */
 OPCODE(psadbw, LEG(NP, 0F, 0, 0xf6), SSE, WRR, Pq, Pq, Qq)
+/* 66 0F F6 /r: PSADBW xmm1, xmm2/m128 */
+OPCODE(psadbw, LEG(66, 0F, 0, 0xf6), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 74 /r: PCMPEQB mm,mm/m64 */
 OPCODE(pcmpeqb, LEG(NP, 0F, 0, 0x74), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 74 /r: PCMPEQB xmm1,xmm2/m128 */
+OPCODE(pcmpeqb, LEG(66, 0F, 0, 0x74), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 75 /r: PCMPEQW mm,mm/m64 */
 OPCODE(pcmpeqw, LEG(NP, 0F, 0, 0x75), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 75 /r: PCMPEQW xmm1,xmm2/m128 */
+OPCODE(pcmpeqw, LEG(66, 0F, 0, 0x75), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 76 /r: PCMPEQD mm,mm/m64 */
 OPCODE(pcmpeqd, LEG(NP, 0F, 0, 0x76), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 76 /r: PCMPEQD xmm1,xmm2/m128 */
+OPCODE(pcmpeqd, LEG(66, 0F, 0, 0x76), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 64 /r: PCMPGTB mm,mm/m64 */
 OPCODE(pcmpgtb, LEG(NP, 0F, 0, 0x64), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 64 /r: PCMPGTB xmm1,xmm2/m128 */
+OPCODE(pcmpgtb, LEG(66, 0F, 0, 0x64), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 65 /r: PCMPGTW mm,mm/m64 */
 OPCODE(pcmpgtw, LEG(NP, 0F, 0, 0x65), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 65 /r: PCMPGTW xmm1,xmm2/m128 */
+OPCODE(pcmpgtw, LEG(66, 0F, 0, 0x65), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 66 /r: PCMPGTD mm,mm/m64 */
 OPCODE(pcmpgtd, LEG(NP, 0F, 0, 0x66), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 66 /r: PCMPGTD xmm1,xmm2/m128 */
+OPCODE(pcmpgtd, LEG(66, 0F, 0, 0x66), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F C2 /r ib: CMPPS xmm1, xmm2/m128, imm8 */
 OPCODE(cmpps, LEG(NP, 0F, 0, 0xc2), SSE, WRRR, Vdq, Vdq, Wdq, Ib)
+/* 66 0F C2 /r ib: CMPPD xmm1, xmm2/m128, imm8 */
+OPCODE(cmppd, LEG(66, 0F, 0, 0xc2), SSE2, WRRR, Vdq, Vdq, Wdq, Ib)
 /* F3 0F C2 /r ib: CMPSS xmm1, xmm2/m32, imm8 */
 OPCODE(cmpss, LEG(F3, 0F, 0, 0xc2), SSE, WRRR, Vd, Vd, Wd, Ib)
+/* F2 0F C2 /r ib: CMPSD xmm1, xmm2/m64, imm8 */
+OPCODE(cmpsd, LEG(F2, 0F, 0, 0xc2), SSE2, WRRR, Vq, Vq, Wq, Ib)
 /* NP 0F 2E /r: UCOMISS xmm1, xmm2/m32 */
 OPCODE(ucomiss, LEG(NP, 0F, 0, 0x2e), SSE, RR, Vd, Wd)
+/* 66 0F 2E /r: UCOMISD xmm1, xmm2/m64 */
+OPCODE(ucomisd, LEG(66, 0F, 0, 0x2e), SSE2, RR, Vq, Wq)
 /* NP 0F 2F /r: COMISS xmm1, xmm2/m32 */
 OPCODE(comiss, LEG(NP, 0F, 0, 0x2f), SSE, RR, Vd, Wd)
+/* 66 0F 2F /r: COMISD xmm1, xmm2/m64 */
+OPCODE(comisd, LEG(66, 0F, 0, 0x2f), SSE2, RR, Vq, Wq)
 /* NP 0F DB /r: PAND mm, mm/m64 */
 OPCODE(pand, LEG(NP, 0F, 0, 0xdb), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F DB /r: PAND xmm1, xmm2/m128 */
+OPCODE(pand, LEG(66, 0F, 0, 0xdb), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 54 /r: ANDPS xmm1, xmm2/m128 */
 OPCODE(andps, LEG(NP, 0F, 0, 0x54), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 54 /r: ANDPD xmm1, xmm2/m128 */
+OPCODE(andpd, LEG(66, 0F, 0, 0x54), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F DF /r: PANDN mm, mm/m64 */
 OPCODE(pandn, LEG(NP, 0F, 0, 0xdf), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F DF /r: PANDN xmm1, xmm2/m128 */
+OPCODE(pandn, LEG(66, 0F, 0, 0xdf), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 55 /r: ANDNPS xmm1, xmm2/m128 */
 OPCODE(andnps, LEG(NP, 0F, 0, 0x55), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 55 /r: ANDNPD xmm1, xmm2/m128 */
+OPCODE(andnpd, LEG(66, 0F, 0, 0x55), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F EB /r: POR mm, mm/m64 */
 OPCODE(por, LEG(NP, 0F, 0, 0xeb), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F EB /r: POR xmm1, xmm2/m128 */
+OPCODE(por, LEG(66, 0F, 0, 0xeb), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 56 /r: ORPS xmm1, xmm2/m128 */
 OPCODE(orps, LEG(NP, 0F, 0, 0x56), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 56 /r: ORPD xmm1, xmm2/m128 */
+OPCODE(orpd, LEG(66, 0F, 0, 0x56), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F EF /r: PXOR mm, mm/m64 */
 OPCODE(pxor, LEG(NP, 0F, 0, 0xef), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F EF /r: PXOR xmm1, xmm2/m128 */
+OPCODE(pxor, LEG(66, 0F, 0, 0xef), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 57 /r: XORPS xmm1, xmm2/m128 */
 OPCODE(xorps, LEG(NP, 0F, 0, 0x57), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 57 /r: XORPD xmm1, xmm2/m128 */
+OPCODE(xorpd, LEG(66, 0F, 0, 0x57), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F1 /r: PSLLW mm, mm/m64 */
 OPCODE(psllw, LEG(NP, 0F, 0, 0xf1), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F F1 /r: PSLLW xmm1, xmm2/m128 */
+OPCODE(psllw, LEG(66, 0F, 0, 0xf1), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F2 /r: PSLLD mm, mm/m64 */
 OPCODE(pslld, LEG(NP, 0F, 0, 0xf2), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F F2 /r: PSLLD xmm1, xmm2/m128 */
+OPCODE(pslld, LEG(66, 0F, 0, 0xf2), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F3 /r: PSLLQ mm, mm/m64 */
 OPCODE(psllq, LEG(NP, 0F, 0, 0xf3), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F F3 /r: PSLLQ xmm1, xmm2/m128 */
+OPCODE(psllq, LEG(66, 0F, 0, 0xf3), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F D1 /r: PSRLW mm, mm/m64 */
 OPCODE(psrlw, LEG(NP, 0F, 0, 0xd1), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F D1 /r: PSRLW xmm1, xmm2/m128 */
+OPCODE(psrlw, LEG(66, 0F, 0, 0xd1), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F D2 /r: PSRLD mm, mm/m64 */
 OPCODE(psrld, LEG(NP, 0F, 0, 0xd2), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F D2 /r: PSRLD xmm1, xmm2/m128 */
+OPCODE(psrld, LEG(66, 0F, 0, 0xd2), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F D3 /r: PSRLQ mm, mm/m64 */
 OPCODE(psrlq, LEG(NP, 0F, 0, 0xd3), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F D3 /r: PSRLQ xmm1, xmm2/m128 */
+OPCODE(psrlq, LEG(66, 0F, 0, 0xd3), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F E1 /r: PSRAW mm,mm/m64 */
 OPCODE(psraw, LEG(NP, 0F, 0, 0xe1), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F E1 /r: PSRAW xmm1,xmm2/m128 */
+OPCODE(psraw, LEG(66, 0F, 0, 0xe1), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F E2 /r: PSRAD mm,mm/m64 */
 OPCODE(psrad, LEG(NP, 0F, 0, 0xe2), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F E2 /r: PSRAD xmm1,xmm2/m128 */
+OPCODE(psrad, LEG(66, 0F, 0, 0xe2), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 63 /r: PACKSSWB mm1, mm2/m64 */
 OPCODE(packsswb, LEG(NP, 0F, 0, 0x63), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 63 /r: PACKSSWB xmm1, xmm2/m128 */
+OPCODE(packsswb, LEG(66, 0F, 0, 0x63), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 6B /r: PACKSSDW mm1, mm2/m64 */
 OPCODE(packssdw, LEG(NP, 0F, 0, 0x6b), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 6B /r: PACKSSDW xmm1, xmm2/m128 */
+OPCODE(packssdw, LEG(66, 0F, 0, 0x6b), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 67 /r: PACKUSWB mm, mm/m64 */
 OPCODE(packuswb, LEG(NP, 0F, 0, 0x67), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 67 /r: PACKUSWB xmm1, xmm2/m128 */
+OPCODE(packuswb, LEG(66, 0F, 0, 0x67), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 68 /r: PUNPCKHBW mm, mm/m64 */
 OPCODE(punpckhbw, LEG(NP, 0F, 0, 0x68), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 68 /r: PUNPCKHBW xmm1, xmm2/m128 */
+OPCODE(punpckhbw, LEG(66, 0F, 0, 0x68), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 69 /r: PUNPCKHWD mm, mm/m64 */
 OPCODE(punpckhwd, LEG(NP, 0F, 0, 0x69), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 69 /r: PUNPCKHWD xmm1, xmm2/m128 */
+OPCODE(punpckhwd, LEG(66, 0F, 0, 0x69), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 6A /r: PUNPCKHDQ mm, mm/m64 */
 OPCODE(punpckhdq, LEG(NP, 0F, 0, 0x6a), MMX, WRR, Pq, Pq, Qq)
+/* 66 0F 6A /r: PUNPCKHDQ xmm1, xmm2/m128 */
+OPCODE(punpckhdq, LEG(66, 0F, 0, 0x6a), SSE2, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 6D /r: PUNPCKHQDQ xmm1, xmm2/m128 */
+OPCODE(punpckhqdq, LEG(66, 0F, 0, 0x6d), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 60 /r: PUNPCKLBW mm, mm/m32 */
 OPCODE(punpcklbw, LEG(NP, 0F, 0, 0x60), MMX, WRR, Pq, Pq, Qd)
+/* 66 0F 60 /r: PUNPCKLBW xmm1, xmm2/m128 */
+OPCODE(punpcklbw, LEG(66, 0F, 0, 0x60), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 61 /r: PUNPCKLWD mm, mm/m32 */
 OPCODE(punpcklwd, LEG(NP, 0F, 0, 0x61), MMX, WRR, Pq, Pq, Qd)
+/* 66 0F 61 /r: PUNPCKLWD xmm1, xmm2/m128 */
+OPCODE(punpcklwd, LEG(66, 0F, 0, 0x61), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 62 /r: PUNPCKLDQ mm, mm/m32 */
 OPCODE(punpckldq, LEG(NP, 0F, 0, 0x62), MMX, WRR, Pq, Pq, Qd)
+/* 66 0F 62 /r: PUNPCKLDQ xmm1, xmm2/m128 */
+OPCODE(punpckldq, LEG(66, 0F, 0, 0x62), SSE2, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 6C /r: PUNPCKLQDQ xmm1, xmm2/m128 */
+OPCODE(punpcklqdq, LEG(66, 0F, 0, 0x6c), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 14 /r: UNPCKLPS xmm1, xmm2/m128 */
 OPCODE(unpcklps, LEG(NP, 0F, 0, 0x14), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 14 /r: UNPCKLPD xmm1, xmm2/m128 */
+OPCODE(unpcklpd, LEG(66, 0F, 0, 0x14), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 15 /r: UNPCKHPS xmm1, xmm2/m128 */
 OPCODE(unpckhps, LEG(NP, 0F, 0, 0x15), SSE, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 15 /r: UNPCKHPD xmm1, xmm2/m128 */
+OPCODE(unpckhpd, LEG(66, 0F, 0, 0x15), SSE2, WRR, Vdq, Vdq, Wdq)
 /* NP 0F 70 /r ib: PSHUFW mm1, mm2/m64, imm8 */
 OPCODE(pshufw, LEG(NP, 0F, 0, 0x70), SSE, WRR, Pq, Qq, Ib)
+/* F2 0F 70 /r ib: PSHUFLW xmm1, xmm2/m128, imm8 */
+OPCODE(pshuflw, LEG(F2, 0F, 0, 0x70), SSE2, WRR, Vdq, Wdq, Ib)
+/* F3 0F 70 /r ib: PSHUFHW xmm1, xmm2/m128, imm8 */
+OPCODE(pshufhw, LEG(F3, 0F, 0, 0x70), SSE2, WRR, Vdq, Wdq, Ib)
+/* 66 0F 70 /r ib: PSHUFD xmm1, xmm2/m128, imm8 */
+OPCODE(pshufd, LEG(66, 0F, 0, 0x70), SSE2, WRR, Vdq, Wdq, Ib)
 /* NP 0F C6 /r ib: SHUFPS xmm1, xmm3/m128, imm8 */
 OPCODE(shufps, LEG(NP, 0F, 0, 0xc6), SSE, WRRR, Vdq, Vdq, Wdq, Ib)
+/* 66 0F C6 /r ib: SHUFPD xmm1, xmm2/m128, imm8 */
+OPCODE(shufpd, LEG(66, 0F, 0, 0xc6), SSE2, WRRR, Vdq, Vdq, Wdq, Ib)
 /* NP 0F C4 /r ib: PINSRW mm, r32/m16, imm8 */
 OPCODE(pinsrw, LEG(NP, 0F, 0, 0xc4), SSE, WRRR, Pq, Pq, RdMw, Ib)
+/* 66 0F C4 /r ib: PINSRW xmm, r32/m16, imm8 */
+OPCODE(pinsrw, LEG(66, 0F, 0, 0xc4), SSE2, WRRR, Vdq, Vdq, RdMw, Ib)
 /* NP 0F C5 /r ib: PEXTRW r32, mm, imm8 */
 OPCODE(pextrw, LEG(NP, 0F, 0, 0xc5), SSE, WRR, Gd, Nq, Ib)
 /* NP REX.W 0F C5 /r ib: PEXTRW r64, mm, imm8 */
 OPCODE(pextrw, LEG(NP, 0F, 1, 0xc5), SSE, WRR, Gq, Nq, Ib)
+/* 66 0F C5 /r ib: PEXTRW r32, xmm, imm8 */
+OPCODE(pextrw, LEG(66, 0F, 0, 0xc5), SSE2, WRR, Gd, Udq, Ib)
+/* 66 REX.W 0F C5 /r ib: PEXTRW r64, xmm, imm8 */
+OPCODE(pextrw, LEG(66, 0F, 1, 0xc5), SSE2, WRR, Gq, Udq, Ib)
 /* NP 0F 2A /r: CVTPI2PS xmm, mm/m64 */
 OPCODE(cvtpi2ps, LEG(NP, 0F, 0, 0x2a), SSE, WR, Vdq, Qq)
 /* F3 0F 2A /r: CVTSI2SS xmm1,r/m32 */
 OPCODE(cvtsi2ss, LEG(F3, 0F, 0, 0x2a), SSE, WR, Vd, Ed)
 /* F3 REX.W 0F 2A /r: CVTSI2SS xmm1,r/m64 */
 OPCODE(cvtsi2ss, LEG(F3, 0F, 1, 0x2a), SSE, WR, Vd, Eq)
+/* 66 0F 2A /r: CVTPI2PD xmm, mm/m64 */
+OPCODE(cvtpi2pd, LEG(66, 0F, 0, 0x2a), SSE2, WR, Vdq, Qq)
+/* F2 0F 2A /r: CVTSI2SD xmm1,r32/m32 */
+OPCODE(cvtsi2sd, LEG(F2, 0F, 0, 0x2a), SSE2, WR, Vq, Ed)
+/* F2 REX.W 0F 2A /r: CVTSI2SD xmm1,r/m64 */
+OPCODE(cvtsi2sd, LEG(F2, 0F, 1, 0x2a), SSE2, WR, Vq, Eq)
 /* NP 0F 2D /r: CVTPS2PI mm, xmm/m64 */
 OPCODE(cvtps2pi, LEG(NP, 0F, 0, 0x2d), SSE, WR, Pq, Wq)
 /* F3 0F 2D /r: CVTSS2SI r32,xmm1/m32 */
 OPCODE(cvtss2si, LEG(F3, 0F, 0, 0x2d), SSE, WR, Gd, Wd)
 /* F3 REX.W 0F 2D /r: CVTSS2SI r64,xmm1/m32 */
 OPCODE(cvtss2si, LEG(F3, 0F, 1, 0x2d), SSE, WR, Gq, Wd)
+/* 66 0F 2D /r: CVTPD2PI mm, xmm/m128 */
+OPCODE(cvtpd2pi, LEG(66, 0F, 0, 0x2d), SSE2, WR, Pq, Wdq)
+/* F2 0F 2D /r: CVTSD2SI r32,xmm1/m64 */
+OPCODE(cvtsd2si, LEG(F2, 0F, 0, 0x2d), SSE2, WR, Gd, Wq)
+/* F2 REX.W 0F 2D /r: CVTSD2SI r64,xmm1/m64 */
+OPCODE(cvtsd2si, LEG(F2, 0F, 1, 0x2d), SSE2, WR, Gq, Wq)
 /* NP 0F 2C /r: CVTTPS2PI mm, xmm/m64 */
 OPCODE(cvttps2pi, LEG(NP, 0F, 0, 0x2c), SSE, WR, Pq, Wq)
 /* F3 0F 2C /r: CVTTSS2SI r32,xmm1/m32 */
 OPCODE(cvttss2si, LEG(F3, 0F, 0, 0x2c), SSE, WR, Gd, Wd)
 /* F3 REX.W 0F 2C /r: CVTTSS2SI r64,xmm1/m32 */
 OPCODE(cvttss2si, LEG(F3, 0F, 1, 0x2c), SSE, WR, Gq, Wd)
+/* 66 0F 2C /r: CVTTPD2PI mm, xmm/m128 */
+OPCODE(cvttpd2pi, LEG(66, 0F, 0, 0x2c), SSE2, WR, Pq, Wdq)
+/* F2 0F 2C /r: CVTTSD2SI r32,xmm1/m64 */
+OPCODE(cvttsd2si, LEG(F2, 0F, 0, 0x2c), SSE2, WR, Gd, Wq)
+/* F2 REX.W 0F 2C /r: CVTTSD2SI r64,xmm1/m64 */
+OPCODE(cvttsd2si, LEG(F2, 0F, 1, 0x2c), SSE2, WR, Gq, Wq)
+/* F2 0F E6 /r: CVTPD2DQ xmm1, xmm2/m128 */
+OPCODE(cvtpd2dq, LEG(F2, 0F, 0, 0xe6), SSE2, WR, Vdq, Wdq)
+/* 66 0F E6 /r: CVTTPD2DQ xmm1, xmm2/m128 */
+OPCODE(cvttpd2dq, LEG(66, 0F, 0, 0xe6), SSE2, WR, Vdq, Wdq)
+/* F3 0F E6 /r: CVTDQ2PD xmm1, xmm2/m64 */
+OPCODE(cvtdq2pd, LEG(F3, 0F, 0, 0xe6), SSE2, WR, Vdq, Wq)
+/* NP 0F 5A /r: CVTPS2PD xmm1, xmm2/m64 */
+OPCODE(cvtps2pd, LEG(NP, 0F, 0, 0x5a), SSE2, WR, Vdq, Wq)
+/* 66 0F 5A /r: CVTPD2PS xmm1, xmm2/m128 */
+OPCODE(cvtpd2ps, LEG(66, 0F, 0, 0x5a), SSE2, WR, Vdq, Wdq)
+/* F3 0F 5A /r: CVTSS2SD xmm1, xmm2/m32 */
+OPCODE(cvtss2sd, LEG(F3, 0F, 0, 0x5a), SSE2, WR, Vq, Wd)
+/* F2 0F 5A /r: CVTSD2SS xmm1, xmm2/m64 */
+OPCODE(cvtsd2ss, LEG(F2, 0F, 0, 0x5a), SSE2, WR, Vd, Wq)
+/* NP 0F 5B /r: CVTDQ2PS xmm1, xmm2/m128 */
+OPCODE(cvtdq2ps, LEG(NP, 0F, 0, 0x5b), SSE2, WR, Vdq, Wdq)
+/* 66 0F 5B /r: CVTPS2DQ xmm1, xmm2/m128 */
+OPCODE(cvtps2dq, LEG(66, 0F, 0, 0x5b), SSE2, WR, Vdq, Wdq)
+/* F3 0F 5B /r: CVTTPS2DQ xmm1, xmm2/m128 */
+OPCODE(cvttps2dq, LEG(F3, 0F, 0, 0x5b), SSE2, WR, Vdq, Wdq)
 /* NP 0F F7 /r: MASKMOVQ mm1, mm2 */
 OPCODE(maskmovq, LEG(NP, 0F, 0, 0xf7), SSE, RR, Pq, Nq)
+/* 66 0F F7 /r: MASKMOVDQU xmm1, xmm2 */
+OPCODE(maskmovdqu, LEG(66, 0F, 0, 0xf7), SSE2, RR, Vdq, Udq)
 /* NP 0F 2B /r: MOVNTPS m128, xmm1 */
 OPCODE(movntps, LEG(NP, 0F, 0, 0x2b), SSE, WR, Mdq, Vdq)
+/* 66 0F 2B /r: MOVNTPD m128, xmm1 */
+OPCODE(movntpd, LEG(66, 0F, 0, 0x2b), SSE2, WR, Mdq, Vdq)
+/* NP 0F C3 /r: MOVNTI m32, r32 */
+OPCODE(movnti, LEG(NP, 0F, 0, 0xc3), SSE2, WR, Md, Gd)
+/* NP REX.W + 0F C3 /r: MOVNTI m64, r64 */
+OPCODE(movnti, LEG(NP, 0F, 1, 0xc3), SSE2, WR, Mq, Gq)
 /* NP 0F E7 /r: MOVNTQ m64, mm */
 OPCODE(movntq, LEG(NP, 0F, 0, 0xe7), SSE, WR, Mq, Pq)
+/* 66 0F E7 /r: MOVNTDQ m128, xmm1 */
+OPCODE(movntdq, LEG(66, 0F, 0, 0xe7), SSE2, WR, Mdq, Vdq)
+/* F3 90: PAUSE */
+OPCODE(pause, LEG(F3, NA, 0, 0x90), SSE2, )
 /* NP 0F 77: EMMS */
 OPCODE(emms, LEG(NP, 0F, 0, 0x77), MMX, )
 
+OPCODE_GRP(grp12_LEG_66, LEG(66, 0F, 0, 0x71))
+OPCODE_GRP_BEGIN(grp12_LEG_66)
+    /* 66 0F 71 /6 ib: PSLLW xmm1, imm8 */
+    OPCODE_GRPMEMB(grp12_LEG_66, psllw, 6, SSE2, WRR, Udq, Udq, Ib)
+    /* 66 0F 71 /2 ib: PSRLW xmm1, imm8 */
+    OPCODE_GRPMEMB(grp12_LEG_66, psrlw, 2, SSE2, WRR, Udq, Udq, Ib)
+    /* 66 0F 71 /4 ib: PSRAW xmm1,imm8 */
+    OPCODE_GRPMEMB(grp12_LEG_66, psraw, 4, SSE2, WRR, Udq, Udq, Ib)
+OPCODE_GRP_END(grp12_LEG_66)
+
 OPCODE_GRP(grp12_LEG_NP, LEG(NP, 0F, 0, 0x71))
 OPCODE_GRP_BEGIN(grp12_LEG_NP)
     /* NP 0F 71 /6 ib: PSLLW mm1, imm8 */
@@ -288,6 +582,16 @@ OPCODE_GRP_BEGIN(grp12_LEG_NP)
     OPCODE_GRPMEMB(grp12_LEG_NP, psraw, 4, MMX, WRR, Nq, Nq, Ib)
 OPCODE_GRP_END(grp12_LEG_NP)
 
+OPCODE_GRP(grp13_LEG_66, LEG(66, 0F, 0, 0x72))
+OPCODE_GRP_BEGIN(grp13_LEG_66)
+    /* 66 0F 72 /6 ib: PSLLD xmm1, imm8 */
+    OPCODE_GRPMEMB(grp13_LEG_66, pslld, 6, SSE2, WRR, Udq, Udq, Ib)
+    /* 66 0F 72 /2 ib: PSRLD xmm1, imm8 */
+    OPCODE_GRPMEMB(grp13_LEG_66, psrld, 2, SSE2, WRR, Udq, Udq, Ib)
+    /* 66 0F 72 /4 ib: PSRAD xmm1,imm8 */
+    OPCODE_GRPMEMB(grp13_LEG_66, psrad, 4, SSE2, WRR, Udq, Udq, Ib)
+OPCODE_GRP_END(grp13_LEG_66)
+
 OPCODE_GRP(grp13_LEG_NP, LEG(NP, 0F, 0, 0x72))
 OPCODE_GRP_BEGIN(grp13_LEG_NP)
     /* NP 0F 72 /6 ib: PSLLD mm, imm8 */
@@ -298,6 +602,18 @@ OPCODE_GRP_BEGIN(grp13_LEG_NP)
     OPCODE_GRPMEMB(grp13_LEG_NP, psrad, 4, MMX, WRR, Nq, Nq, Ib)
 OPCODE_GRP_END(grp13_LEG_NP)
 
+OPCODE_GRP(grp14_LEG_66, LEG(66, 0F, 0, 0x73))
+OPCODE_GRP_BEGIN(grp14_LEG_66)
+    /* 66 0F 73 /6 ib: PSLLQ xmm1, imm8 */
+    OPCODE_GRPMEMB(grp14_LEG_66, psllq, 6, SSE2, WRR, Udq, Udq, Ib)
+    /* 66 0F 73 /7 ib: PSLLDQ xmm1, imm8 */
+    OPCODE_GRPMEMB(grp14_LEG_66, pslldq, 7, SSE2, WRR, Udq, Udq, Ib)
+    /* 66 0F 73 /2 ib: PSRLQ xmm1, imm8 */
+    OPCODE_GRPMEMB(grp14_LEG_66, psrlq, 2, SSE2, WRR, Udq, Udq, Ib)
+    /* 66 0F 73 /3 ib: PSRLDQ xmm1, imm8 */
+    OPCODE_GRPMEMB(grp14_LEG_66, psrldq, 3, SSE2, WRR, Udq, Udq, Ib)
+OPCODE_GRP_END(grp14_LEG_66)
+
 OPCODE_GRP(grp14_LEG_NP, LEG(NP, 0F, 0, 0x73))
 OPCODE_GRP_BEGIN(grp14_LEG_NP)
     /* NP 0F 73 /6 ib: PSLLQ mm, imm8 */
@@ -309,7 +625,12 @@ OPCODE_GRP_END(grp14_LEG_NP)
 OPCODE_GRP(grp15_LEG_NP, LEG(NP, 0F, 0, 0xae))
 OPCODE_GRP_BEGIN(grp15_LEG_NP)
     /* NP 0F AE /7: SFENCE */
-    OPCODE_GRPMEMB(grp15_LEG_NP, sfence, 7, SSE, )
+    /* NP 0F AE /7: CLFLUSH m8 */
+    OPCODE_GRPMEMB(grp15_LEG_NP, sfence_clflush, 7, SSE, RR, modrm_mod, modrm)
+    /* NP 0F AE /5: LFENCE */
+    OPCODE_GRPMEMB(grp15_LEG_NP, lfence, 5, SSE2, )
+    /* NP 0F AE /6: MFENCE */
+    OPCODE_GRPMEMB(grp15_LEG_NP, mfence, 6, SSE2, )
     /* NP 0F AE /2: LDMXCSR m32 */
     OPCODE_GRPMEMB(grp15_LEG_NP, ldmxcsr, 2, SSE, R, Md)
     /* NP 0F AE /3: STMXCSR m32 */
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 44/46] target/i386: introduce SSE3 translators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (42 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 45/46] target/i386: introduce SSE3 code generators Jan Bobek
                   ` (2 subsequent siblings)
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Use the translator macros to define translators required by SSE3
instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 7ec082e79d..c72138014a 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -6363,6 +6363,7 @@ DEF_TRANSLATE_INSN2(Vd, Wd)
 DEF_TRANSLATE_INSN2(Vd, Wq)
 DEF_TRANSLATE_INSN2(Vdq, Ed)
 DEF_TRANSLATE_INSN2(Vdq, Eq)
+DEF_TRANSLATE_INSN2(Vdq, Mdq)
 DEF_TRANSLATE_INSN2(Vdq, Nq)
 DEF_TRANSLATE_INSN2(Vdq, Qq)
 DEF_TRANSLATE_INSN2(Vdq, Udq)
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 45/46] target/i386: introduce SSE3 code generators
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (43 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 44/46] target/i386: introduce SSE3 translators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h Jan Bobek
  2019-08-15  3:06 ` [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation no-reply
  46 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Introduce code generators required by SSE3 instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/translate.c | 64 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index c72138014a..9da3fbb611 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5627,6 +5627,63 @@ GEN_INSN2(movmskpd, Gq, Udq)
     tcg_temp_free_i32(arg1_r32);
 }
 
+GEN_INSN2(lddqu, Vdq, Mdq)
+{
+    assert(arg2 == s->A0);
+    gen_ldo_env_A0(s, arg1);
+}
+
+GEN_INSN2(movshdup, Vdq, Wdq)
+{
+    const TCGv_i32 r32 = tcg_temp_new_i32();
+
+    tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(1)));
+    tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(0)));
+    if (arg1 != arg2) {
+        tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(1)));
+    }
+
+    tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(3)));
+    tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(2)));
+    if (arg1 != arg2) {
+        tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(3)));
+    }
+
+    tcg_temp_free_i32(r32);
+}
+
+GEN_INSN2(movsldup, Vdq, Wdq)
+{
+    const TCGv_i32 r32 = tcg_temp_new_i32();
+
+    tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(0)));
+    if (arg1 != arg2) {
+        tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(0)));
+    }
+    tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(1)));
+
+    tcg_gen_ld_i32(r32, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(2)));
+    if (arg1 != arg2) {
+        tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(2)));
+    }
+    tcg_gen_st_i32(r32, cpu_env, arg1 + offsetof(ZMMReg, ZMM_L(3)));
+
+    tcg_temp_free_i32(r32);
+}
+
+GEN_INSN2(movddup, Vdq, Wq)
+{
+    const TCGv_i64 r64 = tcg_temp_new_i64();
+
+    tcg_gen_ld_i64(r64, cpu_env, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
+    if (arg1 != arg2) {
+        tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+    }
+    tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+
+    tcg_temp_free_i64(r64);
+}
+
 DEF_GEN_INSN3_GVEC_MM(paddb, add, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_XMM(paddb, add, Vdq, Vdq, Wdq, MO_8)
 DEF_GEN_INSN3_GVEC_MM(paddw, add, Pq, Pq, Qq, MO_16)
@@ -5647,6 +5704,8 @@ DEF_GEN_INSN3_HELPER_EPP(addps, addps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(addss, addss, Vd, Vd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(addpd, addpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(addsd, addsd, Vq, Vq, Wq)
+DEF_GEN_INSN3_HELPER_EPP(haddps, haddps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(haddpd, haddpd, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_GVEC_MM(psubb, sub, Pq, Pq, Qq, MO_8)
 DEF_GEN_INSN3_GVEC_XMM(psubb, sub, Vdq, Vdq, Wdq, MO_8)
@@ -5668,6 +5727,11 @@ DEF_GEN_INSN3_HELPER_EPP(subps, subps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(subpd, subpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(subss, subss, Vd, Vd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(subsd, subsd, Vq, Vq, Wq)
+DEF_GEN_INSN3_HELPER_EPP(hsubps, hsubps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(hsubpd, hsubpd, Vdq, Vdq, Wdq)
+
+DEF_GEN_INSN3_HELPER_EPP(addsubps, addsubps, Vdq, Vdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(addsubpd, addsubpd, Vdq, Vdq, Wdq)
 
 DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_xmm, Vdq, Vdq, Wdq)
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (44 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 45/46] target/i386: introduce SSE3 code generators Jan Bobek
@ 2019-08-15  2:09 ` Jan Bobek
       [not found]   ` <CAL1e-=gZF1+=Gduqm4TwS0p-G6rvb4q+rw+hL9nzAz3P-r3+BQ@mail.gmail.com>
  2019-08-15  3:06 ` [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation no-reply
  46 siblings, 1 reply; 60+ messages in thread
From: Jan Bobek @ 2019-08-15  2:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Bobek, Alex Bennée, Richard Henderson

Add all the SSE3 instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 target/i386/sse-opcode.inc.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index efa67b7ce2..0cfe6fbe31 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -133,6 +133,14 @@ OPCODE(movmskps, LEG(NP, 0F, 1, 0x50), SSE, WR, Gq, Udq)
 OPCODE(movmskpd, LEG(66, 0F, 0, 0x50), SSE2, WR, Gd, Udq)
 /* 66 REX.W 0F 50 /r: MOVMSKPD r64, xmm */
 OPCODE(movmskpd, LEG(66, 0F, 1, 0x50), SSE2, WR, Gq, Udq)
+/* F2 0F F0 /r: LDDQU xmm1, m128 */
+OPCODE(lddqu, LEG(F2, 0F, 0, 0xf0), SSE3, WR, Vdq, Mdq)
+/* F3 0F 16 /r: MOVSHDUP xmm1, xmm2/m128 */
+OPCODE(movshdup, LEG(F3, 0F, 0, 0x16), SSE3, WR, Vdq, Wdq)
+/* F3 0F 12 /r: MOVSLDUP xmm1, xmm2/m128 */
+OPCODE(movsldup, LEG(F3, 0F, 0, 0x12), SSE3, WR, Vdq, Wdq)
+/* F2 0F 12 /r: MOVDDUP xmm1, xmm2/m64 */
+OPCODE(movddup, LEG(F2, 0F, 0, 0x12), SSE3, WR, Vdq, Wq)
 /* NP 0F FC /r: PADDB mm, mm/m64 */
 OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
 /* 66 0F FC /r: PADDB xmm1, xmm2/m128 */
@@ -173,6 +181,10 @@ OPCODE(addpd, LEG(66, 0F, 0, 0x58), SSE2, WRR, Vdq, Vdq, Wdq)
 OPCODE(addss, LEG(F3, 0F, 0, 0x58), SSE, WRR, Vd, Vd, Wd)
 /* F2 0F 58 /r: ADDSD xmm1, xmm2/m64 */
 OPCODE(addsd, LEG(F2, 0F, 0, 0x58), SSE2, WRR, Vq, Vq, Wq)
+/* F2 0F 7C /r: HADDPS xmm1, xmm2/m128 */
+OPCODE(haddps, LEG(F2, 0F, 0, 0x7c), SSE3, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 7C /r: HADDPD xmm1, xmm2/m128 */
+OPCODE(haddpd, LEG(66, 0F, 0, 0x7c), SSE3, WRR, Vdq, Vdq, Wdq)
 /* NP 0F F8 /r: PSUBB mm, mm/m64 */
 OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
 /* 66 0F F8 /r: PSUBB xmm1, xmm2/m128 */
@@ -213,6 +225,14 @@ OPCODE(subpd, LEG(66, 0F, 0, 0x5c), SSE2, WRR, Vdq, Vdq, Wdq)
 OPCODE(subss, LEG(F3, 0F, 0, 0x5c), SSE, WRR, Vd, Vd, Wd)
 /* F2 0F 5C /r: SUBSD xmm1, xmm2/m64 */
 OPCODE(subsd, LEG(F2, 0F, 0, 0x5c), SSE2, WRR, Vq, Vq, Wq)
+/* F2 0F 7D /r: HSUBPS xmm1, xmm2/m128 */
+OPCODE(hsubps, LEG(F2, 0F, 0, 0x7d), SSE3, WRR, Vdq, Vdq, Wdq)
+/* 66 0F 7D /r: HSUBPD xmm1, xmm2/m128 */
+OPCODE(hsubpd, LEG(66, 0F, 0, 0x7d), SSE3, WRR, Vdq, Vdq, Wdq)
+/* F2 0F D0 /r: ADDSUBPS xmm1, xmm2/m128 */
+OPCODE(addsubps, LEG(F2, 0F, 0, 0xd0), SSE3, WRR, Vdq, Vdq, Wdq)
+/* 66 0F D0 /r: ADDSUBPD xmm1, xmm2/m128 */
+OPCODE(addsubpd, LEG(66, 0F, 0, 0xd0), SSE3, WRR, Vdq, Vdq, Wdq)
 /* NP 0F D5 /r: PMULLW mm, mm/m64 */
 OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
 /* 66 0F D5 /r: PMULLW xmm1, xmm2/m128 */
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation
  2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
                   ` (45 preceding siblings ...)
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h Jan Bobek
@ 2019-08-15  3:06 ` no-reply
  46 siblings, 0 replies; 60+ messages in thread
From: no-reply @ 2019-08-15  3:06 UTC (permalink / raw)
  To: jan.bobek; +Cc: alex.bennee, richard.henderson, jan.bobek, qemu-devel

Patchew URL: https://patchew.org/QEMU/20190815020928.9679-1-jan.bobek@gmail.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation
Message-id: 20190815020928.9679-1-jan.bobek@gmail.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20190815020928.9679-1-jan.bobek@gmail.com -> patchew/20190815020928.9679-1-jan.bobek@gmail.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' (https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' (https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out '22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out '90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out '20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' (https://github.com/openssl/openssl) registered for path 'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out '50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography': checked out '09403100de2f6f1cdd0d484dcb8e620f1c335c8f'
Cloning into 'roms/ipxe'...
Submodule path 'roms/ipxe': checked out 'de4565cbe76ea9f7913a01f331be3ee901bb6e17'
Cloning into 'roms/openbios'...
Submodule path 'roms/openbios': checked out 'c79e0ecb84f4f1ee3f73f521622e264edd1bf174'
Cloning into 'roms/openhackware'...
Submodule path 'roms/openhackware': checked out 'c559da7c8eec5e45ef1f67978827af6f0b9546f5'
Cloning into 'roms/opensbi'...
Submodule path 'roms/opensbi': checked out 'ce228ee0919deb9957192d723eecc8aaae2697c6'
Cloning into 'roms/qemu-palcode'...
Submodule path 'roms/qemu-palcode': checked out 'bf0e13698872450164fa7040da36a95d2d4b326f'
Cloning into 'roms/seabios'...
Submodule path 'roms/seabios': checked out 'a5cab58e9a3fb6e168aba919c5669bea406573b4'
Cloning into 'roms/seabios-hppa'...
Submodule path 'roms/seabios-hppa': checked out '0f4fe84658165e96ce35870fd19fc634e182e77b'
Cloning into 'roms/sgabios'...
Submodule path 'roms/sgabios': checked out 'cbaee52287e5f32373181cff50a00b6c4ac9015a'
Cloning into 'roms/skiboot'...
Submodule path 'roms/skiboot': checked out '261ca8e779e5138869a45f174caa49be6a274501'
Cloning into 'roms/u-boot'...
Submodule path 'roms/u-boot': checked out 'd3689267f92c5956e09cc7d1baa4700141662bff'
Cloning into 'roms/u-boot-sam460ex'...
Submodule path 'roms/u-boot-sam460ex': checked out '60b3916f33e617a815973c5a6df77055b2e3a588'
Cloning into 'slirp'...
Submodule path 'slirp': checked out '126c04acbabd7ad32c2b018fe10dfac2a3bc1210'
Cloning into 'tests/fp/berkeley-softfloat-3'...
Submodule path 'tests/fp/berkeley-softfloat-3': checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'tests/fp/berkeley-testfloat-3'...
Submodule path 'tests/fp/berkeley-testfloat-3': checked out '5a59dcec19327396a011a17fd924aed4fec416b3'
Cloning into 'ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out '6b3d716e2b6472eb7189d3220552280ef3d832ce'
Switched to a new branch 'test'
ca9528e target/i386: introduce SSE3 instructions to sse-opcode.inc.h
a60cf87 target/i386: introduce SSE3 code generators
d6b90c5 target/i386: introduce SSE3 translators
d07a81b target/i386: introduce SSE2 instructions to sse-opcode.inc.h
9a52ea2 target/i386: introduce SSE2 code generators
96646bc target/i386: introduce SSE2 translators
d5e067e target/i386: introduce SSE instructions to sse-opcode.inc.h
66abf9f target/i386: introduce SSE code generators
f8c0699 target/i386: introduce SSE translators
71021da target/i386: introduce MMX instructions to sse-opcode.inc.h
09b6b40 target/i386: introduce MMX code generators
43fff80 target/i386: introduce MMX translators
b1715f9 target/i386: introduce instruction translator macros
fd93e87 target/i386: introduce sse-opcode.inc.h
ab711a3 target/i386: introduce gvec-based code generator macros
4168b75 target/i386: introduce helper-based code generator macros
3fac287 target/i386: introduce code generators
3a30642 target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands
cc18af8 target/i386: introduce P*, N*, Q* (MMX) operands
1175822 target/i386: introduce G*, R*, E* (general register) operands
10f4ae7 target/i386: introduce M* (memptr) operands
728b398 target/i386: introduce Ib (immediate) operand
a331fff target/i386: introduce operand vex_v
22f4f48 target/i386: introduce operand for direct-only r/m field
f132d77 target/i386: introduce operands for decoding modrm fields
ed424a5 target/i386: introduce modrm operand
d712698 target/i386: introduce tcg_temp operands
818fcff target/i386: introduce generic load-store operand
8380c80 target/i386: introduce generic either-or operand
a03a2f6 target/i386: introduce generic operand alias
f3b669c target/i386: introduce instruction operand infrastructure
4b391ad target/i386: introduce function ck_cpuid
8123ca7 target/i386: introduce mnemonic aliases for several gvec operations
92c895f target/i386: disable unused function warning temporarily
ef32f4c target/i386: introduce gen_sse_ng
f232b7a target/i386: introduce gen_(ld, st)d_env_A0
93c6baf target/i386: add vector register file alignment constraints
255b027 target/i386: make variable is_xmm const
d413859 target/i386: make variable b1 const
fe6f670 target/i386: use pc_start from DisasContext
83feecf target/i386: Simplify gen_exception arguments
1649460 target/i386: use prefix from DisasContext
9ab47b3 target/i386: use dflag from DisasContext
7046842 target/i386: reduce scope of variable aflag
713bb8c target/i386: Push rex_w into DisasContext
c403330 target/i386: Push rex_r into DisasContext

=== OUTPUT BEGIN ===
1/46 Checking commit c403330f5c0f (target/i386: Push rex_r into DisasContext)
2/46 Checking commit 713bb8c7bc73 (target/i386: Push rex_w into DisasContext)
3/46 Checking commit 70468427f9c5 (target/i386: reduce scope of variable aflag)
4/46 Checking commit 9ab47b327e93 (target/i386: use dflag from DisasContext)
5/46 Checking commit 164946033ade (target/i386: use prefix from DisasContext)
6/46 Checking commit 83feecf4ffec (target/i386: Simplify gen_exception arguments)
7/46 Checking commit fe6f6709834f (target/i386: use pc_start from DisasContext)
WARNING: line over 80 characters
#65: FILE: target/i386/translate.c:6387:
+            gen_repz_scas(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 1);

WARNING: line over 80 characters
#68: FILE: target/i386/translate.c:6389:
+            gen_repz_scas(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 0);

WARNING: line over 80 characters
#77: FILE: target/i386/translate.c:6399:
+            gen_repz_cmps(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 1);

WARNING: line over 80 characters
#80: FILE: target/i386/translate.c:6401:
+            gen_repz_cmps(s, ot, s->pc_start - s->cs_base, s->pc - s->cs_base, 0);

WARNING: line over 80 characters
#198: FILE: target/i386/translate.c:7062:
+        gen_interrupt(s, EXCP03_INT3, s->pc_start - s->cs_base, s->pc - s->cs_base);

ERROR: line over 90 characters
#484: FILE: target/i386/translate.c:7693:
+            gen_svm_check_intercept(s, s->pc_start, (b & 2) ? SVM_EXIT_INVD : SVM_EXIT_WBINVD);

WARNING: line over 80 characters
#502: FILE: target/i386/translate.c:8090:
+                gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_WRITE_DR0 + reg);

WARNING: line over 80 characters
#510: FILE: target/i386/translate.c:8097:
+                gen_svm_check_intercept(s, s->pc_start, SVM_EXIT_READ_DR0 + reg);

total: 1 errors, 7 warnings, 464 lines checked

Patch 7/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

8/46 Checking commit d413859dd1da (target/i386: make variable b1 const)
9/46 Checking commit 255b027543f1 (target/i386: make variable is_xmm const)
10/46 Checking commit 93c6baf75eec (target/i386: add vector register file alignment constraints)
11/46 Checking commit f232b7a29836 (target/i386: introduce gen_(ld, st)d_env_A0)
12/46 Checking commit ef32f4c8330d (target/i386: introduce gen_sse_ng)
13/46 Checking commit 92c895f5d577 (target/i386: disable unused function warning temporarily)
14/46 Checking commit 8123ca7ad80b (target/i386: introduce mnemonic aliases for several gvec operations)
15/46 Checking commit 4b391ad16666 (target/i386: introduce function ck_cpuid)
16/46 Checking commit f3b669ca511e (target/i386: introduce instruction operand infrastructure)
ERROR: spaces required around that '*' (ctx:WxV)
#33: FILE: target/i386/translate.c:4561:
+    static int insnop_init(opT)(insnop_ctxt_t(opT) *ctxt,       \
                                                    ^

ERROR: spaces required around that '*' (ctx:WxV)
#39: FILE: target/i386/translate.c:4567:
+    static insnop_arg_t(opT) insnop_prepare(opT)(insnop_ctxt_t(opT) *ctxt, \
                                                                     ^

ERROR: spaces required around that '*' (ctx:WxV)
#45: FILE: target/i386/translate.c:4573:
+    static void insnop_finalize(opT)(insnop_ctxt_t(opT) *ctxt,  \
                                                         ^

total: 3 errors, 0 warnings, 34 lines checked

Patch 16/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

17/46 Checking commit a03a2f667f10 (target/i386: introduce generic operand alias)
ERROR: Macros with multiple statements should be enclosed in a do - while loop
#24: FILE: target/i386/translate.c:4582:
+#define DEF_INSNOP_ALIAS(opT, opT2)                                     \
+    typedef insnop_arg_t(opT2) insnop_arg_t(opT);                       \
+    typedef insnop_ctxt_t(opT2) insnop_ctxt_t(opT);                     \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        return insnop_init(opT2)(ctxt, env, s, modrm, is_write);        \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        return insnop_prepare(opT2)(ctxt, env, s, modrm, is_write);     \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+        insnop_finalize(opT2)(ctxt, env, s, modrm, is_write, arg);      \
+    }

total: 1 errors, 0 warnings, 26 lines checked

Patch 17/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

18/46 Checking commit 8380c80b34c9 (target/i386: introduce generic either-or operand)
ERROR: Macros with multiple statements should be enclosed in a do - while loop
#26: FILE: target/i386/translate.c:4602:
+#define DEF_INSNOP_EITHER(opT, opT1, opT2)                              \
+    typedef insnop_arg_t(opT1) insnop_arg_t(opT);                       \
+    typedef struct {                                                    \
+        bool is_ ## opT1;                                               \
+        union {                                                         \
+            insnop_ctxt_t(opT1) ctxt_ ## opT1;                          \
+            insnop_ctxt_t(opT2) ctxt_ ## opT2;                          \
+        };                                                              \
+    } insnop_ctxt_t(opT);                                               \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        int ret = insnop_init(opT1)(&ctxt->ctxt_ ## opT1,               \
+                                    env, s, modrm, is_write);           \
+        if (!ret) {                                                     \
+            ctxt->is_ ## opT1 = 1;                                      \
+            return 0;                                                   \
+        }                                                               \
+        ret = insnop_init(opT2)(&ctxt->ctxt_ ## opT2,                   \
+                                env, s, modrm, is_write);               \
+        if (!ret) {                                                     \
+            ctxt->is_ ## opT1 = 0;                                      \
+            return 0;                                                   \
+        }                                                               \
+        return ret;                                                     \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        return (ctxt->is_ ## opT1                                       \
+                ? insnop_prepare(opT1)(&ctxt->ctxt_ ## opT1,            \
+                                       env, s, modrm, is_write)         \
+                : insnop_prepare(opT2)(&ctxt->ctxt_ ## opT2,            \
+                                       env, s, modrm, is_write));       \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+        (ctxt->is_ ## opT1                                              \
+         ? insnop_finalize(opT1)(&ctxt->ctxt_ ## opT1,                  \
+                                 env, s, modrm, is_write, arg)          \
+         : insnop_finalize(opT2)(&ctxt->ctxt_ ## opT2,                  \
+                                 env, s, modrm, is_write, arg));        \
+    }

total: 1 errors, 0 warnings, 52 lines checked

Patch 18/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

19/46 Checking commit 818fcffce866 (target/i386: introduce generic load-store operand)
ERROR: Macros with multiple statements should be enclosed in a do - while loop
#36: FILE: target/i386/translate.c:4658:
+#define DEF_INSNOP_LDST(opT, opTarg, opTptr)                            \
+    typedef insnop_arg_t(opTarg) insnop_arg_t(opT);                     \
+    typedef struct {                                                    \
+        insnop_ctxt_t(opTarg) arg;                                      \
+        insnop_ctxt_t(opTptr) ptr;                                      \
+    } insnop_ctxt_t(opT);                                               \
+                                                                        \
+    /* forward declaration */                                           \
+    INSNOP_LDST(opTarg, opTptr);                                        \
+                                                                        \
+    INSNOP_INIT(opT)                                                    \
+    {                                                                   \
+        int ret = insnop_init(opTarg)(&ctxt->arg, env, s, modrm, is_write); \
+        if (!ret) {                                                     \
+            ret = insnop_init(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+        }                                                               \
+        return ret;                                                     \
+    }                                                                   \
+    INSNOP_PREPARE(opT)                                                 \
+    {                                                                   \
+        const insnop_arg_t(opTarg) arg =                                \
+            insnop_prepare(opTarg)(&ctxt->arg, env, s, modrm, is_write); \
+        if (!is_write) {                                                \
+            const insnop_arg_t(opTptr) ptr =                            \
+                insnop_prepare(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+            insnop_ldst(opTarg, opTptr)(env, s, modrm, is_write, arg, ptr); \
+            insnop_finalize(opTptr)(&ctxt->ptr, env, s, modrm, is_write, ptr); \
+        }                                                               \
+        return arg;                                                     \
+    }                                                                   \
+    INSNOP_FINALIZE(opT)                                                \
+    {                                                                   \
+        if (is_write) {                                                 \
+            const insnop_arg_t(opTptr) ptr =                            \
+                insnop_prepare(opTptr)(&ctxt->ptr, env, s, modrm, is_write); \
+            insnop_ldst(opTarg, opTptr)(env, s, modrm, is_write, arg, ptr); \
+            insnop_finalize(opTptr)(&ctxt->ptr, env, s, modrm, is_write, ptr); \
+        }                                                               \
+        insnop_finalize(opTarg)(&ctxt->arg, env, s, modrm, is_write, arg); \
+    }

WARNING: Block comments use a leading /* on a separate line
#43: FILE: target/i386/translate.c:4665:
+    /* forward declaration */                                           \

total: 1 errors, 1 warnings, 60 lines checked

Patch 19/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

20/46 Checking commit d712698a82c6 (target/i386: introduce tcg_temp operands)
21/46 Checking commit ed424a5e2f6b (target/i386: introduce modrm operand)
22/46 Checking commit f132d772e756 (target/i386: introduce operands for decoding modrm fields)
23/46 Checking commit 22f4f4831251 (target/i386: introduce operand for direct-only r/m field)
24/46 Checking commit a331fff9f3fe (target/i386: introduce operand vex_v)
25/46 Checking commit 728b39849029 (target/i386: introduce Ib (immediate) operand)
26/46 Checking commit 10f4ae72ee5c (target/i386: introduce M* (memptr) operands)
27/46 Checking commit 1175822a21bf (target/i386: introduce G*, R*, E* (general register) operands)
28/46 Checking commit cc18af84223d (target/i386: introduce P*, N*, Q* (MMX) operands)
29/46 Checking commit 3a306429bb23 (target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands)
30/46 Checking commit 3fac28747b8b (target/i386: introduce code generators)
31/46 Checking commit 4168b75af461 (target/i386: introduce helper-based code generator macros)
32/46 Checking commit ab711a3d4dd0 (target/i386: introduce gvec-based code generator macros)
33/46 Checking commit fd93e87c50cc (target/i386: introduce sse-opcode.inc.h)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#16: 
new file mode 100644

ERROR: space prohibited before that close parenthesis ')'
#21: FILE: target/i386/sse-opcode.inc.h:1:
+#define FMTI____     (0, 0, 0, )

total: 1 errors, 1 warnings, 69 lines checked

Patch 33/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

34/46 Checking commit b1715f917915 (target/i386: introduce instruction translator macros)
ERROR: Macros with multiple statements should be enclosed in a do - while loop
#224: FILE: target/i386/translate.c:5568:
+#define OPCODE_GRPMEMB(grpname, mnem, opcode, feat, fmt, ...)           \
+            case opcode:                                                \
+                translate_insn(FMT_ARGC(fmt), ## __VA_ARGS__)(          \
+                    env, s, modrm, CK_CPUID_ ## feat, FMT_ARGC_WR(fmt), \
+                    gen_insn(mnem, FMT_ARGC(fmt), ## __VA_ARGS__));     \
+                break;

ERROR: Macros with multiple statements should be enclosed in a do - while loop
#230: FILE: target/i386/translate.c:5574:
+#define OPCODE_GRP_END(grpname)                                         \
+            default:                                                    \
+                ret = 1;                                                \
+                break;                                                  \
+            }                                                           \
+                                                                        \
+            insnop_finalize(modrm_reg)(&regctxt, env, s, modrm, 0, reg); \
+        }                                                               \
+                                                                        \
+        if (ret) {                                                      \
+            gen_illegal_opcode(s);                                      \
+        }                                                               \
+    }

ERROR: spaces required around that ':' (ctx:VxE)
#231: FILE: target/i386/translate.c:5575:
+            default:                                                    \
                    ^

ERROR: Macros with complex values should be enclosed in parenthesis
#252: FILE: target/i386/translate.c:5608:
+#define LEG(p, m, w, opcode)                    \
+    case opcode | M_ ## m | P_ ## p | W_ ## w:

ERROR: Macros with multiple statements should be enclosed in a do - while loop
#254: FILE: target/i386/translate.c:5610:
+#define OPCODE(mnem, cases, feat, fmt, ...)                             \
+    cases {                                                             \
+        const int modrm = 0 < FMT_ARGC(fmt) ? x86_ldub_code(env, s) : -1; \
+        translate_insn(FMT_ARGC(fmt), ## __VA_ARGS__)(                  \
+            env, s, modrm, CK_CPUID_ ## feat, FMT_ARGC_WR(fmt),         \
+            gen_insn(mnem, FMT_ARGC(fmt), ## __VA_ARGS__));             \
+    } return;

ERROR: Macros with multiple statements should be enclosed in a do - while loop
#261: FILE: target/i386/translate.c:5617:
+#define OPCODE_GRP(grpname, cases)                  \
+    cases {                                         \
+        const int modrm = x86_ldub_code(env, s);    \
+        translate_group(grpname)(env, s, modrm);    \
+    } return;

total: 6 errors, 0 warnings, 249 lines checked

Patch 34/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

35/46 Checking commit 43fff80712ee (target/i386: introduce MMX translators)
36/46 Checking commit 09b6b40b1692 (target/i386: introduce MMX code generators)
37/46 Checking commit 71021da7cc61 (target/i386: introduce MMX instructions to sse-opcode.inc.h)
ERROR: space prohibited before that close parenthesis ')'
#121: FILE: target/i386/sse-opcode.inc.h:143:
+OPCODE(emms, LEG(NP, 0F, 0, 0x77), MMX, )

total: 1 errors, 0 warnings, 137 lines checked

Patch 37/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

38/46 Checking commit f8c069933d15 (target/i386: introduce SSE translators)
39/46 Checking commit 66abf9f5d42c (target/i386: introduce SSE code generators)
40/46 Checking commit d5e067eb3687 (target/i386: introduce SSE instructions to sse-opcode.inc.h)
ERROR: space prohibited before that close parenthesis ')'
#208: FILE: target/i386/sse-opcode.inc.h:312:
+    OPCODE_GRPMEMB(grp15_LEG_NP, sfence, 7, SSE, )

total: 1 errors, 0 warnings, 208 lines checked

Patch 40/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

41/46 Checking commit 96646bcf26a8 (target/i386: introduce SSE2 translators)
42/46 Checking commit 9a52ea2a9ce7 (target/i386: introduce SSE2 code generators)
43/46 Checking commit d07a81b490ae (target/i386: introduce SSE2 instructions to sse-opcode.inc.h)
ERROR: space prohibited before that close parenthesis ')'
#535: FILE: target/i386/sse-opcode.inc.h:561:
+OPCODE(pause, LEG(F3, NA, 0, 0x90), SSE2, )

ERROR: space prohibited before that close parenthesis ')'
#596: FILE: target/i386/sse-opcode.inc.h:631:
+    OPCODE_GRPMEMB(grp15_LEG_NP, lfence, 5, SSE2, )

ERROR: space prohibited before that close parenthesis ')'
#598: FILE: target/i386/sse-opcode.inc.h:633:
+    OPCODE_GRPMEMB(grp15_LEG_NP, mfence, 6, SSE2, )

total: 3 errors, 0 warnings, 582 lines checked

Patch 43/46 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

44/46 Checking commit d6b90c53aa61 (target/i386: introduce SSE3 translators)
45/46 Checking commit a60cf87145ed (target/i386: introduce SSE3 code generators)
46/46 Checking commit ca9528e6142c (target/i386: introduce SSE3 instructions to sse-opcode.inc.h)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20190815020928.9679-1-jan.bobek@gmail.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag Jan Bobek
@ 2019-08-15  7:16   ` Aleksandar Markovic
  0 siblings, 0 replies; 60+ messages in thread
From: Aleksandar Markovic @ 2019-08-15  7:16 UTC (permalink / raw)
  To: Jan Bobek
  Cc: Alex Bennée, Richard Henderson, qemu-devel, Richard Henderson

15.08.2019. 04.10, "Jan Bobek" <jan.bobek@gmail.com> је написао/ла:
>
> The variable aflag is not used in most of disas_insn; make this clear
> by explicitly reducing its scope to the block where it is used.
>
> Suggested-by: Richard Henderson <rth@twiddle.net>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
> ---

Jan, the new block between { and } should be indented.

Yours,
Aleksandar

>  target/i386/translate.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index c0866c2797..bda96277e4 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -4493,11 +4493,14 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>      CPUX86State *env = cpu->env_ptr;
>      int b, prefixes;
>      int shift;
> -    TCGMemOp ot, aflag, dflag;
> +    TCGMemOp ot, dflag;
>      int modrm, reg, rm, mod, op, opreg, val;
>      target_ulong next_eip, tval;
>      target_ulong pc_start = s->base.pc_next;
>
> +    {
> +    TCGMemOp aflag;
> +
>      s->pc_start = s->pc = pc_start;
>      s->override = -1;
>  #ifdef TARGET_X86_64
> @@ -4657,6 +4660,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>      s->prefix = prefixes;
>      s->aflag = aflag;
>      s->dflag = dflag;
> +    }
>
>      /* now check op code */
>   reswitch:
> --
> 2.20.1
>
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w " Jan Bobek
@ 2019-08-15  7:30   ` Aleksandar Markovic
  2019-08-15  9:55     ` Richard Henderson
  0 siblings, 1 reply; 60+ messages in thread
From: Aleksandar Markovic @ 2019-08-15  7:30 UTC (permalink / raw)
  To: Jan Bobek
  Cc: Alex Bennée, Richard Henderson, qemu-devel, Richard Henderson

15.08.2019. 04.13, "Jan Bobek" <jan.bobek@gmail.com> је написао/ла:
>
> From: Richard Henderson <rth@twiddle.net>
>
> Treat this the same as we already do for other rex bits.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target/i386/translate.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index d74dbfd585..c0866c2797 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -44,11 +44,13 @@
>  #define REX_X(s) ((s)->rex_x)
>  #define REX_B(s) ((s)->rex_b)
>  #define REX_R(s) ((s)->rex_r)
> +#define REX_W(s) ((s)->rex_w)
>  #else
>  #define CODE64(s) 0
>  #define REX_X(s) 0
>  #define REX_B(s) 0
>  #define REX_R(s) 0
> +#define REX_W(s) -1

The commit message says "treat rex_w the same as other rex bits". Why is
then REX_W() treated differently here?

Yours,
Aleksandar

>  #endif
>
>  #ifdef TARGET_X86_64
> @@ -100,7 +102,7 @@ typedef struct DisasContext {
>  #ifdef TARGET_X86_64
>      int lma;    /* long mode active */
>      int code64; /* 64 bit code segment */
> -    int rex_x, rex_b, rex_r;
> +    int rex_x, rex_b, rex_r, rex_w;
>  #endif
>      int vex_l;  /* vex vector length */
>      int vex_v;  /* vex vvvv register, without 1's complement.  */
> @@ -4495,7 +4497,6 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>      int modrm, reg, rm, mod, op, opreg, val;
>      target_ulong next_eip, tval;
>      target_ulong pc_start = s->base.pc_next;
> -    int rex_w;
>
>      s->pc_start = s->pc = pc_start;
>      s->override = -1;
> @@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>      s->rex_x = 0;
>      s->rex_b = 0;
>      s->rex_r = 0;
> +    s->rex_w = -1;
>      s->x86_64_hregs = false;
>  #endif
>      s->rip_offset = 0; /* for relative ip address */
> @@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>      }
>
>      prefixes = 0;
> -    rex_w = -1;
>
>   next_byte:
>      b = x86_ldub_code(env, s);
> @@ -4557,7 +4558,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>      case 0x40 ... 0x4f:
>          if (CODE64(s)) {
>              /* REX prefix */
> -            rex_w = (b >> 3) & 1;
> +            s->rex_w = (b >> 3) & 1;
>              s->rex_r = (b & 0x4) << 1;
>              s->rex_x = (b & 0x2) << 2;
>              s->rex_b = (b & 0x1) << 3;
> @@ -4606,7 +4607,9 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>                  s->rex_b = (~vex2 >> 2) & 8;
>  #endif
>                  vex3 = x86_ldub_code(env, s);
> -                rex_w = (vex3 >> 7) & 1;
> +#ifdef TARGET_X86_64
> +                s->rex_w = (vex3 >> 7) & 1;
> +#endif
>                  switch (vex2 & 0x1f) {
>                  case 0x01: /* Implied 0f leading opcode bytes.  */
>                      b = x86_ldub_code(env, s) | 0x100;
> @@ -4631,9 +4634,9 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>      /* Post-process prefixes.  */
>      if (CODE64(s)) {
>          /* In 64-bit mode, the default data size is 32-bit.  Select
64-bit
> -           data with rex_w, and 16-bit data with 0x66; rex_w takes
precedence
> +           data with REX_W, and 16-bit data with 0x66; REX_W takes
precedence
>             over 0x66 if both are present.  */
> -        dflag = (rex_w > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 :
MO_32);
> +        dflag = (REX_W(s) > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 :
MO_32);
>          /* In 64-bit mode, 0x67 selects 32-bit addressing.  */
>          aflag = (prefixes & PREFIX_ADR ? MO_32 : MO_64);
>      } else {
> @@ -5029,7 +5032,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>                  /* operand size for jumps is 64 bit */
>                  ot = MO_64;
>              } else if (op == 3 || op == 5) {
> -                ot = dflag != MO_16 ? MO_32 + (rex_w == 1) : MO_16;
> +                ot = dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
>              } else if (op == 6) {
>                  /* default push size is 64 bit */
>                  ot = mo_pushpop(s, dflag);
> --
> 2.20.1
>
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h
  2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h Jan Bobek
@ 2019-08-15  8:00   ` Aleksandar Markovic
  0 siblings, 0 replies; 60+ messages in thread
From: Aleksandar Markovic @ 2019-08-15  8:00 UTC (permalink / raw)
  To: Jan Bobek; +Cc: Alex Bennée, Richard Henderson, qemu-devel

15.08.2019. 04.51, "Jan Bobek" <jan.bobek@gmail.com> је написао/ла:
>
> Add all the SSE2 instruction entries to sse-opcode.inc.h.
>
> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
> ---

I gather the order of items in this file is based on instruction
functionality. This means, however, that, for example, SSE2 items will be
scattered all over the place.

In that light, consider providing a comment somewhere close to the top of
the file similar to this:

/*
* SSE2 instructions
* -----------------
*
*   MOVD xmm,r/m32
*   MOVD r/m32,xmm
*   MOVQ xmm,r/m64
*   MOVQ xmm2/m64, xmm1
... etc ...
*
*/

(the same for SSE3, MMX)

That would provide the reader with a nice overview per instruction set
generation, and would also serve you as an a convenient checkpoint against
specifications.

Yours,
Aleksandar

>  target/i386/sse-opcode.inc.h | 323 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 322 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
> index 39947aeb51..efa67b7ce2 100644
> --- a/target/i386/sse-opcode.inc.h
> +++ b/target/i386/sse-opcode.inc.h
> @@ -43,241 +43,535 @@
>  OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
>  /* NP 0F 7E /r: MOVD r/m32,mm */
>  OPCODE(movd, LEG(NP, 0F, 0, 0x7e), MMX, WR, Ed, Pq)
> +/* 66 0F 6E /r: MOVD xmm,r/m32 */
> +OPCODE(movd, LEG(66, 0F, 0, 0x6e), SSE2, WR, Vdq, Ed)
> +/* 66 0F 7E /r: MOVD r/m32,xmm */
> +OPCODE(movd, LEG(66, 0F, 0, 0x7e), SSE2, WR, Ed, Vdq)
>  /* NP REX.W + 0F 6E /r: MOVQ mm,r/m64 */
>  OPCODE(movq, LEG(NP, 0F, 1, 0x6e), MMX, WR, Pq, Eq)
>  /* NP REX.W + 0F 7E /r: MOVQ r/m64,mm */
>  OPCODE(movq, LEG(NP, 0F, 1, 0x7e), MMX, WR, Eq, Pq)
> +/* 66 REX.W 0F 6E /r: MOVQ xmm,r/m64 */
> +OPCODE(movq, LEG(66, 0F, 1, 0x6e), SSE2, WR, Vdq, Eq)
> +/* 66 REX.W 0F 7E /r: MOVQ r/m64,xmm */
> +OPCODE(movq, LEG(66, 0F, 1, 0x7e), SSE2, WR, Eq, Vdq)
>  /* NP 0F 6F /r: MOVQ mm, mm/m64 */
>  OPCODE(movq, LEG(NP, 0F, 0, 0x6f), MMX, WR, Pq, Qq)
>  /* NP 0F 7F /r: MOVQ mm/m64, mm */
>  OPCODE(movq, LEG(NP, 0F, 0, 0x7f), MMX, WR, Qq, Pq)
> +/* F3 0F 7E /r: MOVQ xmm1, xmm2/m64 */
> +OPCODE(movq, LEG(F3, 0F, 0, 0x7e), SSE2, WR, Vdq, Wq)
> +/* 66 0F D6 /r: MOVQ xmm2/m64, xmm1 */
> +OPCODE(movq, LEG(66, 0F, 0, 0xd6), SSE2, WR, UdqMq, Vq)
>  /* NP 0F 28 /r: MOVAPS xmm1, xmm2/m128 */
>  OPCODE(movaps, LEG(NP, 0F, 0, 0x28), SSE, WR, Vdq, Wdq)
>  /* NP 0F 29 /r: MOVAPS xmm2/m128, xmm1 */
>  OPCODE(movaps, LEG(NP, 0F, 0, 0x29), SSE, WR, Wdq, Vdq)
> +/* 66 0F 28 /r: MOVAPD xmm1, xmm2/m128 */
> +OPCODE(movapd, LEG(66, 0F, 0, 0x28), SSE2, WR, Vdq, Wdq)
> +/* 66 0F 29 /r: MOVAPD xmm2/m128, xmm1 */
> +OPCODE(movapd, LEG(66, 0F, 0, 0x29), SSE2, WR, Wdq, Vdq)
> +/* 66 0F 6F /r: MOVDQA xmm1, xmm2/m128 */
> +OPCODE(movdqa, LEG(66, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
> +/* 66 0F 7F /r: MOVDQA xmm2/m128, xmm1 */
> +OPCODE(movdqa, LEG(66, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
>  /* NP 0F 10 /r: MOVUPS xmm1, xmm2/m128 */
>  OPCODE(movups, LEG(NP, 0F, 0, 0x10), SSE, WR, Vdq, Wdq)
>  /* NP 0F 11 /r: MOVUPS xmm2/m128, xmm1 */
>  OPCODE(movups, LEG(NP, 0F, 0, 0x11), SSE, WR, Wdq, Vdq)
> +/* 66 0F 10 /r: MOVUPD xmm1, xmm2/m128 */
> +OPCODE(movupd, LEG(66, 0F, 0, 0x10), SSE2, WR, Vdq, Wdq)
> +/* 66 0F 11 /r: MOVUPD xmm2/m128, xmm1 */
> +OPCODE(movupd, LEG(66, 0F, 0, 0x11), SSE2, WR, Wdq, Vdq)
> +/* F3 0F 6F /r: MOVDQU xmm1,xmm2/m128 */
> +OPCODE(movdqu, LEG(F3, 0F, 0, 0x6f), SSE2, WR, Vdq, Wdq)
> +/* F3 0F 7F /r: MOVDQU xmm2/m128,xmm1 */
> +OPCODE(movdqu, LEG(F3, 0F, 0, 0x7f), SSE2, WR, Wdq, Vdq)
>  /* F3 0F 10 /r: MOVSS xmm1, xmm2/m32 */
>  OPCODE(movss, LEG(F3, 0F, 0, 0x10), SSE, WRRR, Vdq, Vdq, Wd, modrm_mod)
>  /* F3 0F 11 /r: MOVSS xmm2/m32, xmm1 */
>  OPCODE(movss, LEG(F3, 0F, 0, 0x11), SSE, WR, Wd, Vd)
> +/* F2 0F 10 /r: MOVSD xmm1, xmm2/m64 */
> +OPCODE(movsd, LEG(F2, 0F, 0, 0x10), SSE2, WRRR, Vdq, Vdq, Wq, modrm_mod)
> +/* F2 0F 11 /r: MOVSD xmm1/m64, xmm2 */
> +OPCODE(movsd, LEG(F2, 0F, 0, 0x11), SSE2, WR, Wq, Vq)
> +/* F3 0F D6 /r: MOVQ2DQ xmm, mm */
> +OPCODE(movq2dq, LEG(F3, 0F, 0, 0xd6), SSE2, WR, Vdq, Nq)
> +/* F2 0F D6 /r: MOVDQ2Q mm, xmm */
> +OPCODE(movdq2q, LEG(F2, 0F, 0, 0xd6), SSE2, WR, Pq, Uq)
>  /* NP 0F 12 /r: MOVHLPS xmm1, xmm2 */
>  /* NP 0F 12 /r: MOVLPS xmm1, m64 */
>  OPCODE(movhlps, LEG(NP, 0F, 0, 0x12), SSE, WR, Vq, UdqMhq)
>  /* 0F 13 /r: MOVLPS m64, xmm1 */
>  OPCODE(movlps, LEG(NP, 0F, 0, 0x13), SSE, WR, Mq, Vq)
> +/* 66 0F 12 /r: MOVLPD xmm1,m64 */
> +OPCODE(movlpd, LEG(66, 0F, 0, 0x12), SSE2, WR, Vq, Mq)
> +/* 66 0F 13 /r: MOVLPD m64,xmm1 */
> +OPCODE(movlpd, LEG(66, 0F, 0, 0x13), SSE2, WR, Mq, Vq)
>  /* NP 0F 16 /r: MOVLHPS xmm1, xmm2 */
>  /* NP 0F 16 /r: MOVHPS xmm1, m64 */
>  OPCODE(movlhps, LEG(NP, 0F, 0, 0x16), SSE, WRR, Vdq, Vq, Wq)
>  /* NP 0F 17 /r: MOVHPS m64, xmm1 */
>  OPCODE(movhps, LEG(NP, 0F, 0, 0x17), SSE, WR, Mq, Vdq)
> +/* 66 0F 16 /r: MOVHPD xmm1, m64 */
> +OPCODE(movhpd, LEG(66, 0F, 0, 0x16), SSE2, WRR, Vdq, Vd, Mq)
> +/* 66 0F 17 /r: MOVHPD m64, xmm1 */
> +OPCODE(movhpd, LEG(66, 0F, 0, 0x17), SSE2, WR, Mq, Vdq)
>  /* NP 0F D7 /r: PMOVMSKB r32, mm */
>  OPCODE(pmovmskb, LEG(NP, 0F, 0, 0xd7), SSE, WR, Gd, Nq)
>  /* NP REX.W 0F D7 /r: PMOVMSKB r64, mm */
>  OPCODE(pmovmskb, LEG(NP, 0F, 1, 0xd7), SSE, WR, Gq, Nq)
> +/* 66 0F D7 /r: PMOVMSKB r32, xmm */
> +OPCODE(pmovmskb, LEG(66, 0F, 0, 0xd7), SSE2, WR, Gd, Udq)
> +/* 66 REX.W 0F D7 /r: PMOVMSKB r64, xmm */
> +OPCODE(pmovmskb, LEG(66, 0F, 1, 0xd7), SSE2, WR, Gq, Udq)
>  /* NP 0F 50 /r: MOVMSKPS r32, xmm */
>  OPCODE(movmskps, LEG(NP, 0F, 0, 0x50), SSE, WR, Gd, Udq)
>  /* NP REX.W 0F 50 /r: MOVMSKPS r64, xmm */
>  OPCODE(movmskps, LEG(NP, 0F, 1, 0x50), SSE, WR, Gq, Udq)
> +/* 66 0F 50 /r: MOVMSKPD r32, xmm */
> +OPCODE(movmskpd, LEG(66, 0F, 0, 0x50), SSE2, WR, Gd, Udq)
> +/* 66 REX.W 0F 50 /r: MOVMSKPD r64, xmm */
> +OPCODE(movmskpd, LEG(66, 0F, 1, 0x50), SSE2, WR, Gq, Udq)
>  /* NP 0F FC /r: PADDB mm, mm/m64 */
>  OPCODE(paddb, LEG(NP, 0F, 0, 0xfc), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F FC /r: PADDB xmm1, xmm2/m128 */
> +OPCODE(paddb, LEG(66, 0F, 0, 0xfc), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F FD /r: PADDW mm, mm/m64 */
>  OPCODE(paddw, LEG(NP, 0F, 0, 0xfd), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F FD /r: PADDW xmm1, xmm2/m128 */
> +OPCODE(paddw, LEG(66, 0F, 0, 0xfd), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F FE /r: PADDD mm, mm/m64 */
>  OPCODE(paddd, LEG(NP, 0F, 0, 0xfe), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F FE /r: PADDD xmm1, xmm2/m128 */
> +OPCODE(paddd, LEG(66, 0F, 0, 0xfe), SSE2, WRR, Vdq, Vdq, Wdq)
> +/* NP 0F D4 /r: PADDQ mm, mm/m64 */
> +OPCODE(paddq, LEG(NP, 0F, 0, 0xd4), SSE2, WRR, Pq, Pq, Qq)
> +/* 66 0F D4 /r: PADDQ xmm1, xmm2/m128 */
> +OPCODE(paddq, LEG(66, 0F, 0, 0xd4), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F EC /r: PADDSB mm, mm/m64 */
>  OPCODE(paddsb, LEG(NP, 0F, 0, 0xec), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F EC /r: PADDSB xmm1, xmm2/m128 */
> +OPCODE(paddsb, LEG(66, 0F, 0, 0xec), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F ED /r: PADDSW mm, mm/m64 */
>  OPCODE(paddsw, LEG(NP, 0F, 0, 0xed), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F ED /r: PADDSW xmm1, xmm2/m128 */
> +OPCODE(paddsw, LEG(66, 0F, 0, 0xed), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F DC /r: PADDUSB mm,mm/m64 */
>  OPCODE(paddusb, LEG(NP, 0F, 0, 0xdc), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F DC /r: PADDUSB xmm1,xmm2/m128 */
> +OPCODE(paddusb, LEG(66, 0F, 0, 0xdc), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F DD /r: PADDUSW mm,mm/m64 */
>  OPCODE(paddusw, LEG(NP, 0F, 0, 0xdd), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F DD /r: PADDUSW xmm1,xmm2/m128 */
> +OPCODE(paddusw, LEG(66, 0F, 0, 0xdd), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 58 /r: ADDPS xmm1, xmm2/m128 */
>  OPCODE(addps, LEG(NP, 0F, 0, 0x58), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 58 /r: ADDPD xmm1, xmm2/m128 */
> +OPCODE(addpd, LEG(66, 0F, 0, 0x58), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* F3 0F 58 /r: ADDSS xmm1, xmm2/m32 */
>  OPCODE(addss, LEG(F3, 0F, 0, 0x58), SSE, WRR, Vd, Vd, Wd)
> +/* F2 0F 58 /r: ADDSD xmm1, xmm2/m64 */
> +OPCODE(addsd, LEG(F2, 0F, 0, 0x58), SSE2, WRR, Vq, Vq, Wq)
>  /* NP 0F F8 /r: PSUBB mm, mm/m64 */
>  OPCODE(psubb, LEG(NP, 0F, 0, 0xf8), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F F8 /r: PSUBB xmm1, xmm2/m128 */
> +OPCODE(psubb, LEG(66, 0F, 0, 0xf8), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F F9 /r: PSUBW mm, mm/m64 */
>  OPCODE(psubw, LEG(NP, 0F, 0, 0xf9), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F F9 /r: PSUBW xmm1, xmm2/m128 */
> +OPCODE(psubw, LEG(66, 0F, 0, 0xf9), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F FA /r: PSUBD mm, mm/m64 */
>  OPCODE(psubd, LEG(NP, 0F, 0, 0xfa), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F FA /r: PSUBD xmm1, xmm2/m128 */
> +OPCODE(psubd, LEG(66, 0F, 0, 0xfa), SSE2, WRR, Vdq, Vdq, Wdq)
> +/* NP 0F FB /r: PSUBQ mm1, mm2/m64 */
> +OPCODE(psubq, LEG(NP, 0F, 0, 0xfb), SSE2, WRR, Pq, Pq, Qq)
> +/* 66 0F FB /r: PSUBQ xmm1, xmm2/m128 */
> +OPCODE(psubq, LEG(66, 0F, 0, 0xfb), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F E8 /r: PSUBSB mm, mm/m64 */
>  OPCODE(psubsb, LEG(NP, 0F, 0, 0xe8), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F E8 /r: PSUBSB xmm1, xmm2/m128 */
> +OPCODE(psubsb, LEG(66, 0F, 0, 0xe8), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F E9 /r: PSUBSW mm, mm/m64 */
>  OPCODE(psubsw, LEG(NP, 0F, 0, 0xe9), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F E9 /r: PSUBSW xmm1, xmm2/m128 */
> +OPCODE(psubsw, LEG(66, 0F, 0, 0xe9), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F D8 /r: PSUBUSB mm, mm/m64 */
>  OPCODE(psubusb, LEG(NP, 0F, 0, 0xd8), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F D8 /r: PSUBUSB xmm1, xmm2/m128 */
> +OPCODE(psubusb, LEG(66, 0F, 0, 0xd8), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F D9 /r: PSUBUSW mm, mm/m64 */
>  OPCODE(psubusw, LEG(NP, 0F, 0, 0xd9), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F D9 /r: PSUBUSW xmm1, xmm2/m128 */
> +OPCODE(psubusw, LEG(66, 0F, 0, 0xd9), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 5C /r: SUBPS xmm1, xmm2/m128 */
>  OPCODE(subps, LEG(NP, 0F, 0, 0x5c), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 5C /r: SUBPD xmm1, xmm2/m128 */
> +OPCODE(subpd, LEG(66, 0F, 0, 0x5c), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* F3 0F 5C /r: SUBSS xmm1, xmm2/m32 */
>  OPCODE(subss, LEG(F3, 0F, 0, 0x5c), SSE, WRR, Vd, Vd, Wd)
> +/* F2 0F 5C /r: SUBSD xmm1, xmm2/m64 */
> +OPCODE(subsd, LEG(F2, 0F, 0, 0x5c), SSE2, WRR, Vq, Vq, Wq)
>  /* NP 0F D5 /r: PMULLW mm, mm/m64 */
>  OPCODE(pmullw, LEG(NP, 0F, 0, 0xd5), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F D5 /r: PMULLW xmm1, xmm2/m128 */
> +OPCODE(pmullw, LEG(66, 0F, 0, 0xd5), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F E5 /r: PMULHW mm, mm/m64 */
>  OPCODE(pmulhw, LEG(NP, 0F, 0, 0xe5), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F E5 /r: PMULHW xmm1, xmm2/m128 */
> +OPCODE(pmulhw, LEG(66, 0F, 0, 0xe5), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F E4 /r: PMULHUW mm1, mm2/m64 */
>  OPCODE(pmulhuw, LEG(NP, 0F, 0, 0xe4), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F E4 /r: PMULHUW xmm1, xmm2/m128 */
> +OPCODE(pmulhuw, LEG(66, 0F, 0, 0xe4), SSE2, WRR, Vdq, Vdq, Wdq)
> +/* NP 0F F4 /r: PMULUDQ mm1, mm2/m64 */
> +OPCODE(pmuludq, LEG(NP, 0F, 0, 0xf4), SSE2, WRR, Pq, Pq, Qq)
> +/* 66 0F F4 /r: PMULUDQ xmm1, xmm2/m128 */
> +OPCODE(pmuludq, LEG(66, 0F, 0, 0xf4), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 59 /r: MULPS xmm1, xmm2/m128 */
>  OPCODE(mulps, LEG(NP, 0F, 0, 0x59), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 59 /r: MULPD xmm1, xmm2/m128 */
> +OPCODE(mulpd, LEG(66, 0F, 0, 0x59), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* F3 0F 59 /r: MULSS xmm1,xmm2/m32 */
>  OPCODE(mulss, LEG(F3, 0F, 0, 0x59), SSE, WRR, Vd, Vd, Wd)
> +/* F2 0F 59 /r: MULSD xmm1,xmm2/m64 */
> +OPCODE(mulsd, LEG(F2, 0F, 0, 0x59), SSE2, WRR, Vq, Vq, Wq)
>  /* NP 0F F5 /r: PMADDWD mm, mm/m64 */
>  OPCODE(pmaddwd, LEG(NP, 0F, 0, 0xf5), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F F5 /r: PMADDWD xmm1, xmm2/m128 */
> +OPCODE(pmaddwd, LEG(66, 0F, 0, 0xf5), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 5E /r: DIVPS xmm1, xmm2/m128 */
>  OPCODE(divps, LEG(NP, 0F, 0, 0x5e), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 5E /r: DIVPD xmm1, xmm2/m128 */
> +OPCODE(divpd, LEG(66, 0F, 0, 0x5e), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* F3 0F 5E /r: DIVSS xmm1, xmm2/m32 */
>  OPCODE(divss, LEG(F3, 0F, 0, 0x5e), SSE, WRR, Vd, Vd, Wd)
> +/* F2 0F 5E /r: DIVSD xmm1, xmm2/m64 */
> +OPCODE(divsd, LEG(F2, 0F, 0, 0x5e), SSE2, WRR, Vq, Vq, Wq)
>  /* NP 0F 53 /r: RCPPS xmm1, xmm2/m128 */
>  OPCODE(rcpps, LEG(NP, 0F, 0, 0x53), SSE, WR, Vdq, Wdq)
>  /* F3 0F 53 /r: RCPSS xmm1, xmm2/m32 */
>  OPCODE(rcpss, LEG(F3, 0F, 0, 0x53), SSE, WR, Vd, Wd)
>  /* NP 0F 51 /r: SQRTPS xmm1, xmm2/m128 */
>  OPCODE(sqrtps, LEG(NP, 0F, 0, 0x51), SSE, WR, Vdq, Wdq)
> +/* 66 0F 51 /r: SQRTPD xmm1, xmm2/m128 */
> +OPCODE(sqrtpd, LEG(66, 0F, 0, 0x51), SSE2, WR, Vdq, Wdq)
>  /* F3 0F 51 /r: SQRTSS xmm1, xmm2/m32 */
>  OPCODE(sqrtss, LEG(F3, 0F, 0, 0x51), SSE, WR, Vd, Wd)
> +/* F2 0F 51 /r: SQRTSD xmm1,xmm2/m64 */
> +OPCODE(sqrtsd, LEG(F2, 0F, 0, 0x51), SSE2, WR, Vq, Wq)
>  /* NP 0F 52 /r: RSQRTPS xmm1, xmm2/m128 */
>  OPCODE(rsqrtps, LEG(NP, 0F, 0, 0x52), SSE, WR, Vdq, Wdq)
>  /* F3 0F 52 /r: RSQRTSS xmm1, xmm2/m32 */
>  OPCODE(rsqrtss, LEG(F3, 0F, 0, 0x52), SSE, WR, Vd, Wd)
>  /* NP 0F DA /r: PMINUB mm1, mm2/m64 */
>  OPCODE(pminub, LEG(NP, 0F, 0, 0xda), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F DA /r: PMINUB xmm1, xmm2/m128 */
> +OPCODE(pminub, LEG(66, 0F, 0, 0xda), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F EA /r: PMINSW mm1, mm2/m64 */
>  OPCODE(pminsw, LEG(NP, 0F, 0, 0xea), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F EA /r: PMINSW xmm1, xmm2/m128 */
> +OPCODE(pminsw, LEG(66, 0F, 0, 0xea), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 5D /r: MINPS xmm1, xmm2/m128 */
>  OPCODE(minps, LEG(NP, 0F, 0, 0x5d), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 5D /r: MINPD xmm1, xmm2/m128 */
> +OPCODE(minpd, LEG(66, 0F, 0, 0x5d), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* F3 0F 5D /r: MINSS xmm1,xmm2/m32 */
>  OPCODE(minss, LEG(F3, 0F, 0, 0x5d), SSE, WRR, Vd, Vd, Wd)
> +/* F2 0F 5D /r: MINSD xmm1, xmm2/m64 */
> +OPCODE(minsd, LEG(F2, 0F, 0, 0x5d), SSE2, WRR, Vq, Vq, Wq)
>  /* NP 0F DE /r: PMAXUB mm1, mm2/m64 */
>  OPCODE(pmaxub, LEG(NP, 0F, 0, 0xde), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F DE /r: PMAXUB xmm1, xmm2/m128 */
> +OPCODE(pmaxub, LEG(66, 0F, 0, 0xde), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F EE /r: PMAXSW mm1, mm2/m64 */
>  OPCODE(pmaxsw, LEG(NP, 0F, 0, 0xee), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F EE /r: PMAXSW xmm1, xmm2/m128 */
> +OPCODE(pmaxsw, LEG(66, 0F, 0, 0xee), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 5F /r: MAXPS xmm1, xmm2/m128 */
>  OPCODE(maxps, LEG(NP, 0F, 0, 0x5f), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 5F /r: MAXPD xmm1, xmm2/m128 */
> +OPCODE(maxpd, LEG(66, 0F, 0, 0x5f), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* F3 0F 5F /r: MAXSS xmm1, xmm2/m32 */
>  OPCODE(maxss, LEG(F3, 0F, 0, 0x5f), SSE, WRR, Vd, Vd, Wd)
> +/* F2 0F 5F /r: MAXSD xmm1, xmm2/m64 */
> +OPCODE(maxsd, LEG(F2, 0F, 0, 0x5f), SSE2, WRR, Vq, Vq, Wq)
>  /* NP 0F E0 /r: PAVGB mm1, mm2/m64 */
>  OPCODE(pavgb, LEG(NP, 0F, 0, 0xe0), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F E0 /r: PAVGB xmm1, xmm2/m128 */
> +OPCODE(pavgb, LEG(66, 0F, 0, 0xe0), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F E3 /r: PAVGW mm1, mm2/m64 */
>  OPCODE(pavgw, LEG(NP, 0F, 0, 0xe3), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F E3 /r: PAVGW xmm1, xmm2/m128 */
> +OPCODE(pavgw, LEG(66, 0F, 0, 0xe3), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F F6 /r: PSADBW mm1, mm2/m64 */
>  OPCODE(psadbw, LEG(NP, 0F, 0, 0xf6), SSE, WRR, Pq, Pq, Qq)
> +/* 66 0F F6 /r: PSADBW xmm1, xmm2/m128 */
> +OPCODE(psadbw, LEG(66, 0F, 0, 0xf6), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 74 /r: PCMPEQB mm,mm/m64 */
>  OPCODE(pcmpeqb, LEG(NP, 0F, 0, 0x74), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 74 /r: PCMPEQB xmm1,xmm2/m128 */
> +OPCODE(pcmpeqb, LEG(66, 0F, 0, 0x74), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 75 /r: PCMPEQW mm,mm/m64 */
>  OPCODE(pcmpeqw, LEG(NP, 0F, 0, 0x75), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 75 /r: PCMPEQW xmm1,xmm2/m128 */
> +OPCODE(pcmpeqw, LEG(66, 0F, 0, 0x75), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 76 /r: PCMPEQD mm,mm/m64 */
>  OPCODE(pcmpeqd, LEG(NP, 0F, 0, 0x76), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 76 /r: PCMPEQD xmm1,xmm2/m128 */
> +OPCODE(pcmpeqd, LEG(66, 0F, 0, 0x76), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 64 /r: PCMPGTB mm,mm/m64 */
>  OPCODE(pcmpgtb, LEG(NP, 0F, 0, 0x64), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 64 /r: PCMPGTB xmm1,xmm2/m128 */
> +OPCODE(pcmpgtb, LEG(66, 0F, 0, 0x64), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 65 /r: PCMPGTW mm,mm/m64 */
>  OPCODE(pcmpgtw, LEG(NP, 0F, 0, 0x65), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 65 /r: PCMPGTW xmm1,xmm2/m128 */
> +OPCODE(pcmpgtw, LEG(66, 0F, 0, 0x65), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 66 /r: PCMPGTD mm,mm/m64 */
>  OPCODE(pcmpgtd, LEG(NP, 0F, 0, 0x66), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 66 /r: PCMPGTD xmm1,xmm2/m128 */
> +OPCODE(pcmpgtd, LEG(66, 0F, 0, 0x66), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F C2 /r ib: CMPPS xmm1, xmm2/m128, imm8 */
>  OPCODE(cmpps, LEG(NP, 0F, 0, 0xc2), SSE, WRRR, Vdq, Vdq, Wdq, Ib)
> +/* 66 0F C2 /r ib: CMPPD xmm1, xmm2/m128, imm8 */
> +OPCODE(cmppd, LEG(66, 0F, 0, 0xc2), SSE2, WRRR, Vdq, Vdq, Wdq, Ib)
>  /* F3 0F C2 /r ib: CMPSS xmm1, xmm2/m32, imm8 */
>  OPCODE(cmpss, LEG(F3, 0F, 0, 0xc2), SSE, WRRR, Vd, Vd, Wd, Ib)
> +/* F2 0F C2 /r ib: CMPSD xmm1, xmm2/m64, imm8 */
> +OPCODE(cmpsd, LEG(F2, 0F, 0, 0xc2), SSE2, WRRR, Vq, Vq, Wq, Ib)
>  /* NP 0F 2E /r: UCOMISS xmm1, xmm2/m32 */
>  OPCODE(ucomiss, LEG(NP, 0F, 0, 0x2e), SSE, RR, Vd, Wd)
> +/* 66 0F 2E /r: UCOMISD xmm1, xmm2/m64 */
> +OPCODE(ucomisd, LEG(66, 0F, 0, 0x2e), SSE2, RR, Vq, Wq)
>  /* NP 0F 2F /r: COMISS xmm1, xmm2/m32 */
>  OPCODE(comiss, LEG(NP, 0F, 0, 0x2f), SSE, RR, Vd, Wd)
> +/* 66 0F 2F /r: COMISD xmm1, xmm2/m64 */
> +OPCODE(comisd, LEG(66, 0F, 0, 0x2f), SSE2, RR, Vq, Wq)
>  /* NP 0F DB /r: PAND mm, mm/m64 */
>  OPCODE(pand, LEG(NP, 0F, 0, 0xdb), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F DB /r: PAND xmm1, xmm2/m128 */
> +OPCODE(pand, LEG(66, 0F, 0, 0xdb), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 54 /r: ANDPS xmm1, xmm2/m128 */
>  OPCODE(andps, LEG(NP, 0F, 0, 0x54), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 54 /r: ANDPD xmm1, xmm2/m128 */
> +OPCODE(andpd, LEG(66, 0F, 0, 0x54), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F DF /r: PANDN mm, mm/m64 */
>  OPCODE(pandn, LEG(NP, 0F, 0, 0xdf), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F DF /r: PANDN xmm1, xmm2/m128 */
> +OPCODE(pandn, LEG(66, 0F, 0, 0xdf), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 55 /r: ANDNPS xmm1, xmm2/m128 */
>  OPCODE(andnps, LEG(NP, 0F, 0, 0x55), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 55 /r: ANDNPD xmm1, xmm2/m128 */
> +OPCODE(andnpd, LEG(66, 0F, 0, 0x55), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F EB /r: POR mm, mm/m64 */
>  OPCODE(por, LEG(NP, 0F, 0, 0xeb), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F EB /r: POR xmm1, xmm2/m128 */
> +OPCODE(por, LEG(66, 0F, 0, 0xeb), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 56 /r: ORPS xmm1, xmm2/m128 */
>  OPCODE(orps, LEG(NP, 0F, 0, 0x56), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 56 /r: ORPD xmm1, xmm2/m128 */
> +OPCODE(orpd, LEG(66, 0F, 0, 0x56), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F EF /r: PXOR mm, mm/m64 */
>  OPCODE(pxor, LEG(NP, 0F, 0, 0xef), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F EF /r: PXOR xmm1, xmm2/m128 */
> +OPCODE(pxor, LEG(66, 0F, 0, 0xef), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 57 /r: XORPS xmm1, xmm2/m128 */
>  OPCODE(xorps, LEG(NP, 0F, 0, 0x57), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 57 /r: XORPD xmm1, xmm2/m128 */
> +OPCODE(xorpd, LEG(66, 0F, 0, 0x57), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F F1 /r: PSLLW mm, mm/m64 */
>  OPCODE(psllw, LEG(NP, 0F, 0, 0xf1), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F F1 /r: PSLLW xmm1, xmm2/m128 */
> +OPCODE(psllw, LEG(66, 0F, 0, 0xf1), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F F2 /r: PSLLD mm, mm/m64 */
>  OPCODE(pslld, LEG(NP, 0F, 0, 0xf2), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F F2 /r: PSLLD xmm1, xmm2/m128 */
> +OPCODE(pslld, LEG(66, 0F, 0, 0xf2), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F F3 /r: PSLLQ mm, mm/m64 */
>  OPCODE(psllq, LEG(NP, 0F, 0, 0xf3), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F F3 /r: PSLLQ xmm1, xmm2/m128 */
> +OPCODE(psllq, LEG(66, 0F, 0, 0xf3), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F D1 /r: PSRLW mm, mm/m64 */
>  OPCODE(psrlw, LEG(NP, 0F, 0, 0xd1), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F D1 /r: PSRLW xmm1, xmm2/m128 */
> +OPCODE(psrlw, LEG(66, 0F, 0, 0xd1), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F D2 /r: PSRLD mm, mm/m64 */
>  OPCODE(psrld, LEG(NP, 0F, 0, 0xd2), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F D2 /r: PSRLD xmm1, xmm2/m128 */
> +OPCODE(psrld, LEG(66, 0F, 0, 0xd2), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F D3 /r: PSRLQ mm, mm/m64 */
>  OPCODE(psrlq, LEG(NP, 0F, 0, 0xd3), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F D3 /r: PSRLQ xmm1, xmm2/m128 */
> +OPCODE(psrlq, LEG(66, 0F, 0, 0xd3), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F E1 /r: PSRAW mm,mm/m64 */
>  OPCODE(psraw, LEG(NP, 0F, 0, 0xe1), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F E1 /r: PSRAW xmm1,xmm2/m128 */
> +OPCODE(psraw, LEG(66, 0F, 0, 0xe1), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F E2 /r: PSRAD mm,mm/m64 */
>  OPCODE(psrad, LEG(NP, 0F, 0, 0xe2), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F E2 /r: PSRAD xmm1,xmm2/m128 */
> +OPCODE(psrad, LEG(66, 0F, 0, 0xe2), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 63 /r: PACKSSWB mm1, mm2/m64 */
>  OPCODE(packsswb, LEG(NP, 0F, 0, 0x63), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 63 /r: PACKSSWB xmm1, xmm2/m128 */
> +OPCODE(packsswb, LEG(66, 0F, 0, 0x63), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 6B /r: PACKSSDW mm1, mm2/m64 */
>  OPCODE(packssdw, LEG(NP, 0F, 0, 0x6b), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 6B /r: PACKSSDW xmm1, xmm2/m128 */
> +OPCODE(packssdw, LEG(66, 0F, 0, 0x6b), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 67 /r: PACKUSWB mm, mm/m64 */
>  OPCODE(packuswb, LEG(NP, 0F, 0, 0x67), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 67 /r: PACKUSWB xmm1, xmm2/m128 */
> +OPCODE(packuswb, LEG(66, 0F, 0, 0x67), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 68 /r: PUNPCKHBW mm, mm/m64 */
>  OPCODE(punpckhbw, LEG(NP, 0F, 0, 0x68), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 68 /r: PUNPCKHBW xmm1, xmm2/m128 */
> +OPCODE(punpckhbw, LEG(66, 0F, 0, 0x68), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 69 /r: PUNPCKHWD mm, mm/m64 */
>  OPCODE(punpckhwd, LEG(NP, 0F, 0, 0x69), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 69 /r: PUNPCKHWD xmm1, xmm2/m128 */
> +OPCODE(punpckhwd, LEG(66, 0F, 0, 0x69), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 6A /r: PUNPCKHDQ mm, mm/m64 */
>  OPCODE(punpckhdq, LEG(NP, 0F, 0, 0x6a), MMX, WRR, Pq, Pq, Qq)
> +/* 66 0F 6A /r: PUNPCKHDQ xmm1, xmm2/m128 */
> +OPCODE(punpckhdq, LEG(66, 0F, 0, 0x6a), SSE2, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 6D /r: PUNPCKHQDQ xmm1, xmm2/m128 */
> +OPCODE(punpckhqdq, LEG(66, 0F, 0, 0x6d), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 60 /r: PUNPCKLBW mm, mm/m32 */
>  OPCODE(punpcklbw, LEG(NP, 0F, 0, 0x60), MMX, WRR, Pq, Pq, Qd)
> +/* 66 0F 60 /r: PUNPCKLBW xmm1, xmm2/m128 */
> +OPCODE(punpcklbw, LEG(66, 0F, 0, 0x60), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 61 /r: PUNPCKLWD mm, mm/m32 */
>  OPCODE(punpcklwd, LEG(NP, 0F, 0, 0x61), MMX, WRR, Pq, Pq, Qd)
> +/* 66 0F 61 /r: PUNPCKLWD xmm1, xmm2/m128 */
> +OPCODE(punpcklwd, LEG(66, 0F, 0, 0x61), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 62 /r: PUNPCKLDQ mm, mm/m32 */
>  OPCODE(punpckldq, LEG(NP, 0F, 0, 0x62), MMX, WRR, Pq, Pq, Qd)
> +/* 66 0F 62 /r: PUNPCKLDQ xmm1, xmm2/m128 */
> +OPCODE(punpckldq, LEG(66, 0F, 0, 0x62), SSE2, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 6C /r: PUNPCKLQDQ xmm1, xmm2/m128 */
> +OPCODE(punpcklqdq, LEG(66, 0F, 0, 0x6c), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 14 /r: UNPCKLPS xmm1, xmm2/m128 */
>  OPCODE(unpcklps, LEG(NP, 0F, 0, 0x14), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 14 /r: UNPCKLPD xmm1, xmm2/m128 */
> +OPCODE(unpcklpd, LEG(66, 0F, 0, 0x14), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 15 /r: UNPCKHPS xmm1, xmm2/m128 */
>  OPCODE(unpckhps, LEG(NP, 0F, 0, 0x15), SSE, WRR, Vdq, Vdq, Wdq)
> +/* 66 0F 15 /r: UNPCKHPD xmm1, xmm2/m128 */
> +OPCODE(unpckhpd, LEG(66, 0F, 0, 0x15), SSE2, WRR, Vdq, Vdq, Wdq)
>  /* NP 0F 70 /r ib: PSHUFW mm1, mm2/m64, imm8 */
>  OPCODE(pshufw, LEG(NP, 0F, 0, 0x70), SSE, WRR, Pq, Qq, Ib)
> +/* F2 0F 70 /r ib: PSHUFLW xmm1, xmm2/m128, imm8 */
> +OPCODE(pshuflw, LEG(F2, 0F, 0, 0x70), SSE2, WRR, Vdq, Wdq, Ib)
> +/* F3 0F 70 /r ib: PSHUFHW xmm1, xmm2/m128, imm8 */
> +OPCODE(pshufhw, LEG(F3, 0F, 0, 0x70), SSE2, WRR, Vdq, Wdq, Ib)
> +/* 66 0F 70 /r ib: PSHUFD xmm1, xmm2/m128, imm8 */
> +OPCODE(pshufd, LEG(66, 0F, 0, 0x70), SSE2, WRR, Vdq, Wdq, Ib)
>  /* NP 0F C6 /r ib: SHUFPS xmm1, xmm3/m128, imm8 */
>  OPCODE(shufps, LEG(NP, 0F, 0, 0xc6), SSE, WRRR, Vdq, Vdq, Wdq, Ib)
> +/* 66 0F C6 /r ib: SHUFPD xmm1, xmm2/m128, imm8 */
> +OPCODE(shufpd, LEG(66, 0F, 0, 0xc6), SSE2, WRRR, Vdq, Vdq, Wdq, Ib)
>  /* NP 0F C4 /r ib: PINSRW mm, r32/m16, imm8 */
>  OPCODE(pinsrw, LEG(NP, 0F, 0, 0xc4), SSE, WRRR, Pq, Pq, RdMw, Ib)
> +/* 66 0F C4 /r ib: PINSRW xmm, r32/m16, imm8 */
> +OPCODE(pinsrw, LEG(66, 0F, 0, 0xc4), SSE2, WRRR, Vdq, Vdq, RdMw, Ib)
>  /* NP 0F C5 /r ib: PEXTRW r32, mm, imm8 */
>  OPCODE(pextrw, LEG(NP, 0F, 0, 0xc5), SSE, WRR, Gd, Nq, Ib)
>  /* NP REX.W 0F C5 /r ib: PEXTRW r64, mm, imm8 */
>  OPCODE(pextrw, LEG(NP, 0F, 1, 0xc5), SSE, WRR, Gq, Nq, Ib)
> +/* 66 0F C5 /r ib: PEXTRW r32, xmm, imm8 */
> +OPCODE(pextrw, LEG(66, 0F, 0, 0xc5), SSE2, WRR, Gd, Udq, Ib)
> +/* 66 REX.W 0F C5 /r ib: PEXTRW r64, xmm, imm8 */
> +OPCODE(pextrw, LEG(66, 0F, 1, 0xc5), SSE2, WRR, Gq, Udq, Ib)
>  /* NP 0F 2A /r: CVTPI2PS xmm, mm/m64 */
>  OPCODE(cvtpi2ps, LEG(NP, 0F, 0, 0x2a), SSE, WR, Vdq, Qq)
>  /* F3 0F 2A /r: CVTSI2SS xmm1,r/m32 */
>  OPCODE(cvtsi2ss, LEG(F3, 0F, 0, 0x2a), SSE, WR, Vd, Ed)
>  /* F3 REX.W 0F 2A /r: CVTSI2SS xmm1,r/m64 */
>  OPCODE(cvtsi2ss, LEG(F3, 0F, 1, 0x2a), SSE, WR, Vd, Eq)
> +/* 66 0F 2A /r: CVTPI2PD xmm, mm/m64 */
> +OPCODE(cvtpi2pd, LEG(66, 0F, 0, 0x2a), SSE2, WR, Vdq, Qq)
> +/* F2 0F 2A /r: CVTSI2SD xmm1,r32/m32 */
> +OPCODE(cvtsi2sd, LEG(F2, 0F, 0, 0x2a), SSE2, WR, Vq, Ed)
> +/* F2 REX.W 0F 2A /r: CVTSI2SD xmm1,r/m64 */
> +OPCODE(cvtsi2sd, LEG(F2, 0F, 1, 0x2a), SSE2, WR, Vq, Eq)
>  /* NP 0F 2D /r: CVTPS2PI mm, xmm/m64 */
>  OPCODE(cvtps2pi, LEG(NP, 0F, 0, 0x2d), SSE, WR, Pq, Wq)
>  /* F3 0F 2D /r: CVTSS2SI r32,xmm1/m32 */
>  OPCODE(cvtss2si, LEG(F3, 0F, 0, 0x2d), SSE, WR, Gd, Wd)
>  /* F3 REX.W 0F 2D /r: CVTSS2SI r64,xmm1/m32 */
>  OPCODE(cvtss2si, LEG(F3, 0F, 1, 0x2d), SSE, WR, Gq, Wd)
> +/* 66 0F 2D /r: CVTPD2PI mm, xmm/m128 */
> +OPCODE(cvtpd2pi, LEG(66, 0F, 0, 0x2d), SSE2, WR, Pq, Wdq)
> +/* F2 0F 2D /r: CVTSD2SI r32,xmm1/m64 */
> +OPCODE(cvtsd2si, LEG(F2, 0F, 0, 0x2d), SSE2, WR, Gd, Wq)
> +/* F2 REX.W 0F 2D /r: CVTSD2SI r64,xmm1/m64 */
> +OPCODE(cvtsd2si, LEG(F2, 0F, 1, 0x2d), SSE2, WR, Gq, Wq)
>  /* NP 0F 2C /r: CVTTPS2PI mm, xmm/m64 */
>  OPCODE(cvttps2pi, LEG(NP, 0F, 0, 0x2c), SSE, WR, Pq, Wq)
>  /* F3 0F 2C /r: CVTTSS2SI r32,xmm1/m32 */
>  OPCODE(cvttss2si, LEG(F3, 0F, 0, 0x2c), SSE, WR, Gd, Wd)
>  /* F3 REX.W 0F 2C /r: CVTTSS2SI r64,xmm1/m32 */
>  OPCODE(cvttss2si, LEG(F3, 0F, 1, 0x2c), SSE, WR, Gq, Wd)
> +/* 66 0F 2C /r: CVTTPD2PI mm, xmm/m128 */
> +OPCODE(cvttpd2pi, LEG(66, 0F, 0, 0x2c), SSE2, WR, Pq, Wdq)
> +/* F2 0F 2C /r: CVTTSD2SI r32,xmm1/m64 */
> +OPCODE(cvttsd2si, LEG(F2, 0F, 0, 0x2c), SSE2, WR, Gd, Wq)
> +/* F2 REX.W 0F 2C /r: CVTTSD2SI r64,xmm1/m64 */
> +OPCODE(cvttsd2si, LEG(F2, 0F, 1, 0x2c), SSE2, WR, Gq, Wq)
> +/* F2 0F E6 /r: CVTPD2DQ xmm1, xmm2/m128 */
> +OPCODE(cvtpd2dq, LEG(F2, 0F, 0, 0xe6), SSE2, WR, Vdq, Wdq)
> +/* 66 0F E6 /r: CVTTPD2DQ xmm1, xmm2/m128 */
> +OPCODE(cvttpd2dq, LEG(66, 0F, 0, 0xe6), SSE2, WR, Vdq, Wdq)
> +/* F3 0F E6 /r: CVTDQ2PD xmm1, xmm2/m64 */
> +OPCODE(cvtdq2pd, LEG(F3, 0F, 0, 0xe6), SSE2, WR, Vdq, Wq)
> +/* NP 0F 5A /r: CVTPS2PD xmm1, xmm2/m64 */
> +OPCODE(cvtps2pd, LEG(NP, 0F, 0, 0x5a), SSE2, WR, Vdq, Wq)
> +/* 66 0F 5A /r: CVTPD2PS xmm1, xmm2/m128 */
> +OPCODE(cvtpd2ps, LEG(66, 0F, 0, 0x5a), SSE2, WR, Vdq, Wdq)
> +/* F3 0F 5A /r: CVTSS2SD xmm1, xmm2/m32 */
> +OPCODE(cvtss2sd, LEG(F3, 0F, 0, 0x5a), SSE2, WR, Vq, Wd)
> +/* F2 0F 5A /r: CVTSD2SS xmm1, xmm2/m64 */
> +OPCODE(cvtsd2ss, LEG(F2, 0F, 0, 0x5a), SSE2, WR, Vd, Wq)
> +/* NP 0F 5B /r: CVTDQ2PS xmm1, xmm2/m128 */
> +OPCODE(cvtdq2ps, LEG(NP, 0F, 0, 0x5b), SSE2, WR, Vdq, Wdq)
> +/* 66 0F 5B /r: CVTPS2DQ xmm1, xmm2/m128 */
> +OPCODE(cvtps2dq, LEG(66, 0F, 0, 0x5b), SSE2, WR, Vdq, Wdq)
> +/* F3 0F 5B /r: CVTTPS2DQ xmm1, xmm2/m128 */
> +OPCODE(cvttps2dq, LEG(F3, 0F, 0, 0x5b), SSE2, WR, Vdq, Wdq)
>  /* NP 0F F7 /r: MASKMOVQ mm1, mm2 */
>  OPCODE(maskmovq, LEG(NP, 0F, 0, 0xf7), SSE, RR, Pq, Nq)
> +/* 66 0F F7 /r: MASKMOVDQU xmm1, xmm2 */
> +OPCODE(maskmovdqu, LEG(66, 0F, 0, 0xf7), SSE2, RR, Vdq, Udq)
>  /* NP 0F 2B /r: MOVNTPS m128, xmm1 */
>  OPCODE(movntps, LEG(NP, 0F, 0, 0x2b), SSE, WR, Mdq, Vdq)
> +/* 66 0F 2B /r: MOVNTPD m128, xmm1 */
> +OPCODE(movntpd, LEG(66, 0F, 0, 0x2b), SSE2, WR, Mdq, Vdq)
> +/* NP 0F C3 /r: MOVNTI m32, r32 */
> +OPCODE(movnti, LEG(NP, 0F, 0, 0xc3), SSE2, WR, Md, Gd)
> +/* NP REX.W + 0F C3 /r: MOVNTI m64, r64 */
> +OPCODE(movnti, LEG(NP, 0F, 1, 0xc3), SSE2, WR, Mq, Gq)
>  /* NP 0F E7 /r: MOVNTQ m64, mm */
>  OPCODE(movntq, LEG(NP, 0F, 0, 0xe7), SSE, WR, Mq, Pq)
> +/* 66 0F E7 /r: MOVNTDQ m128, xmm1 */
> +OPCODE(movntdq, LEG(66, 0F, 0, 0xe7), SSE2, WR, Mdq, Vdq)
> +/* F3 90: PAUSE */
> +OPCODE(pause, LEG(F3, NA, 0, 0x90), SSE2, )
>  /* NP 0F 77: EMMS */
>  OPCODE(emms, LEG(NP, 0F, 0, 0x77), MMX, )
>
> +OPCODE_GRP(grp12_LEG_66, LEG(66, 0F, 0, 0x71))
> +OPCODE_GRP_BEGIN(grp12_LEG_66)
> +    /* 66 0F 71 /6 ib: PSLLW xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp12_LEG_66, psllw, 6, SSE2, WRR, Udq, Udq, Ib)
> +    /* 66 0F 71 /2 ib: PSRLW xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp12_LEG_66, psrlw, 2, SSE2, WRR, Udq, Udq, Ib)
> +    /* 66 0F 71 /4 ib: PSRAW xmm1,imm8 */
> +    OPCODE_GRPMEMB(grp12_LEG_66, psraw, 4, SSE2, WRR, Udq, Udq, Ib)
> +OPCODE_GRP_END(grp12_LEG_66)
> +
>  OPCODE_GRP(grp12_LEG_NP, LEG(NP, 0F, 0, 0x71))
>  OPCODE_GRP_BEGIN(grp12_LEG_NP)
>      /* NP 0F 71 /6 ib: PSLLW mm1, imm8 */
> @@ -288,6 +582,16 @@ OPCODE_GRP_BEGIN(grp12_LEG_NP)
>      OPCODE_GRPMEMB(grp12_LEG_NP, psraw, 4, MMX, WRR, Nq, Nq, Ib)
>  OPCODE_GRP_END(grp12_LEG_NP)
>
> +OPCODE_GRP(grp13_LEG_66, LEG(66, 0F, 0, 0x72))
> +OPCODE_GRP_BEGIN(grp13_LEG_66)
> +    /* 66 0F 72 /6 ib: PSLLD xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp13_LEG_66, pslld, 6, SSE2, WRR, Udq, Udq, Ib)
> +    /* 66 0F 72 /2 ib: PSRLD xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp13_LEG_66, psrld, 2, SSE2, WRR, Udq, Udq, Ib)
> +    /* 66 0F 72 /4 ib: PSRAD xmm1,imm8 */
> +    OPCODE_GRPMEMB(grp13_LEG_66, psrad, 4, SSE2, WRR, Udq, Udq, Ib)
> +OPCODE_GRP_END(grp13_LEG_66)
> +
>  OPCODE_GRP(grp13_LEG_NP, LEG(NP, 0F, 0, 0x72))
>  OPCODE_GRP_BEGIN(grp13_LEG_NP)
>      /* NP 0F 72 /6 ib: PSLLD mm, imm8 */
> @@ -298,6 +602,18 @@ OPCODE_GRP_BEGIN(grp13_LEG_NP)
>      OPCODE_GRPMEMB(grp13_LEG_NP, psrad, 4, MMX, WRR, Nq, Nq, Ib)
>  OPCODE_GRP_END(grp13_LEG_NP)
>
> +OPCODE_GRP(grp14_LEG_66, LEG(66, 0F, 0, 0x73))
> +OPCODE_GRP_BEGIN(grp14_LEG_66)
> +    /* 66 0F 73 /6 ib: PSLLQ xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp14_LEG_66, psllq, 6, SSE2, WRR, Udq, Udq, Ib)
> +    /* 66 0F 73 /7 ib: PSLLDQ xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp14_LEG_66, pslldq, 7, SSE2, WRR, Udq, Udq, Ib)
> +    /* 66 0F 73 /2 ib: PSRLQ xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp14_LEG_66, psrlq, 2, SSE2, WRR, Udq, Udq, Ib)
> +    /* 66 0F 73 /3 ib: PSRLDQ xmm1, imm8 */
> +    OPCODE_GRPMEMB(grp14_LEG_66, psrldq, 3, SSE2, WRR, Udq, Udq, Ib)
> +OPCODE_GRP_END(grp14_LEG_66)
> +
>  OPCODE_GRP(grp14_LEG_NP, LEG(NP, 0F, 0, 0x73))
>  OPCODE_GRP_BEGIN(grp14_LEG_NP)
>      /* NP 0F 73 /6 ib: PSLLQ mm, imm8 */
> @@ -309,7 +625,12 @@ OPCODE_GRP_END(grp14_LEG_NP)
>  OPCODE_GRP(grp15_LEG_NP, LEG(NP, 0F, 0, 0xae))
>  OPCODE_GRP_BEGIN(grp15_LEG_NP)
>      /* NP 0F AE /7: SFENCE */
> -    OPCODE_GRPMEMB(grp15_LEG_NP, sfence, 7, SSE, )
> +    /* NP 0F AE /7: CLFLUSH m8 */
> +    OPCODE_GRPMEMB(grp15_LEG_NP, sfence_clflush, 7, SSE, RR, modrm_mod,
modrm)
> +    /* NP 0F AE /5: LFENCE */
> +    OPCODE_GRPMEMB(grp15_LEG_NP, lfence, 5, SSE2, )
> +    /* NP 0F AE /6: MFENCE */
> +    OPCODE_GRPMEMB(grp15_LEG_NP, mfence, 6, SSE2, )
>      /* NP 0F AE /2: LDMXCSR m32 */
>      OPCODE_GRPMEMB(grp15_LEG_NP, ldmxcsr, 2, SSE, R, Md)
>      /* NP 0F AE /3: STMXCSR m32 */
> --
> 2.20.1
>
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h
       [not found]   ` <CAL1e-=gZF1+=Gduqm4TwS0p-G6rvb4q+rw+hL9nzAz3P-r3+BQ@mail.gmail.com>
@ 2019-08-15  9:49     ` Richard Henderson
  2019-08-15 10:07       ` Aleksandar Markovic
  0 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2019-08-15  9:49 UTC (permalink / raw)
  To: Aleksandar Markovic, Jan Bobek; +Cc: Alex Bennée, qemu-devel

On 8/15/19 8:02 AM, Aleksandar Markovic wrote:
> A question for you: What about FISTTP, MONITOR, MWAIT, that belong to SSE3, but
> are not mentioned in this patch?
> 

They are also not vector instructions, which is the subject of this patch set.


r~


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext
  2019-08-15  7:30   ` Aleksandar Markovic
@ 2019-08-15  9:55     ` Richard Henderson
  2019-08-15 10:19       ` Aleksandar Markovic
  0 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2019-08-15  9:55 UTC (permalink / raw)
  To: Aleksandar Markovic, Jan Bobek
  Cc: Alex Bennée, qemu-devel, Richard Henderson

On 8/15/19 8:30 AM, Aleksandar Markovic wrote:
> 
> 15.08.2019. 04.13, "Jan Bobek" <jan.bobek@gmail.com
> <mailto:jan.bobek@gmail.com>> је написао/ла:
>>
>> From: Richard Henderson <rth@twiddle.net <mailto:rth@twiddle.net>>
>>
>> Treat this the same as we already do for other rex bits.
>>
>> Signed-off-by: Richard Henderson <rth@twiddle.net <mailto:rth@twiddle.net>>
>> ---
>>  target/i386/translate.c | 19 +++++++++++--------
>>  1 file changed, 11 insertions(+), 8 deletions(-)
>>
>> diff --git a/target/i386/translate.c b/target/i386/translate.c
>> index d74dbfd585..c0866c2797 100644
>> --- a/target/i386/translate.c
>> +++ b/target/i386/translate.c
>> @@ -44,11 +44,13 @@
>>  #define REX_X(s) ((s)->rex_x)
>>  #define REX_B(s) ((s)->rex_b)
>>  #define REX_R(s) ((s)->rex_r)
>> +#define REX_W(s) ((s)->rex_w)
>>  #else
>>  #define CODE64(s) 0
>>  #define REX_X(s) 0
>>  #define REX_B(s) 0
>>  #define REX_R(s) 0
>> +#define REX_W(s) -1
> 
> The commit message says "treat rex_w the same as other rex bits". Why is then
> REX_W() treated differently here?

"Treated the same" in terms of being referenced by a macro instead of a local
variable.  As for the -1, if you look at the rest of the patch you can clearly
see it preserves existing behaviour.

>> @@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s,
> CPUState *cpu)
>>      s->rex_x = 0;
>>      s->rex_b = 0;
>>      s->rex_r = 0;
>> +    s->rex_w = -1;
>>      s->x86_64_hregs = false;
>>  #endif
>>      s->rip_offset = 0; /* for relative ip address */
>> @@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s,
> CPUState *cpu)
>>      }
>>
>>      prefixes = 0;
>> -    rex_w = -1;


r~


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h
  2019-08-15  9:49     ` Richard Henderson
@ 2019-08-15 10:07       ` Aleksandar Markovic
  2019-08-21  5:04         ` Jan Bobek
  0 siblings, 1 reply; 60+ messages in thread
From: Aleksandar Markovic @ 2019-08-15 10:07 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, Jan Bobek, qemu-devel

15.08.2019. 11.55, "Richard Henderson" <richard.henderson@linaro.org> је
написао/ла:
>
> On 8/15/19 8:02 AM, Aleksandar Markovic wrote:
> > A question for you: What about FISTTP, MONITOR, MWAIT, that belong to
SSE3, but
> > are not mentioned in this patch?
> >
>
> They are also not vector instructions, which is the subject of this patch
set.
>

The subject of the patch and the patch set says "SSE3", not "vector", read
it again.

>
> r~

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext
  2019-08-15  9:55     ` Richard Henderson
@ 2019-08-15 10:19       ` Aleksandar Markovic
  2019-08-21  5:12         ` Jan Bobek
  0 siblings, 1 reply; 60+ messages in thread
From: Aleksandar Markovic @ 2019-08-15 10:19 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Jan Bobek, qemu-devel, Richard Henderson

15.08.2019. 11.55, "Richard Henderson" <richard.henderson@linaro.org> је
написао/ла:
>
> On 8/15/19 8:30 AM, Aleksandar Markovic wrote:
> >
> > 15.08.2019. 04.13, "Jan Bobek" <jan.bobek@gmail.com
> > <mailto:jan.bobek@gmail.com>> је написао/ла:
> >>
> >> From: Richard Henderson <rth@twiddle.net <mailto:rth@twiddle.net>>
> >>
> >> Treat this the same as we already do for other rex bits.
> >>
> >> Signed-off-by: Richard Henderson <rth@twiddle.net <mailto:
rth@twiddle.net>>
> >> ---
> >>  target/i386/translate.c | 19 +++++++++++--------
> >>  1 file changed, 11 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/target/i386/translate.c b/target/i386/translate.c
> >> index d74dbfd585..c0866c2797 100644
> >> --- a/target/i386/translate.c
> >> +++ b/target/i386/translate.c
> >> @@ -44,11 +44,13 @@
> >>  #define REX_X(s) ((s)->rex_x)
> >>  #define REX_B(s) ((s)->rex_b)
> >>  #define REX_R(s) ((s)->rex_r)
> >> +#define REX_W(s) ((s)->rex_w)
> >>  #else
> >>  #define CODE64(s) 0
> >>  #define REX_X(s) 0
> >>  #define REX_B(s) 0
> >>  #define REX_R(s) 0
> >> +#define REX_W(s) -1
> >
> > The commit message says "treat rex_w the same as other rex bits". Why
is then
> > REX_W() treated differently here?
>
> "Treated the same" in terms of being referenced by a macro instead of a
local
> variable.  As for the -1, if you look at the rest of the patch you can
clearly
> see it preserves existing behaviour.
>

That is exactly what I dislike about your commit messages: they often
introduce ambiguity, without any real need, and with really bad
consequences to the reader. Is adding "in terms of being referenced by a
macro instead of a local
variable" to the commit message that hard?

When writing commit messages, you need to try to put yourself in the shoes
of the reader.

Aleksandar

> >> @@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s,
> > CPUState *cpu)
> >>      s->rex_x = 0;
> >>      s->rex_b = 0;
> >>      s->rex_r = 0;
> >> +    s->rex_w = -1;
> >>      s->x86_64_hregs = false;
> >>  #endif
> >>      s->rip_offset = 0; /* for relative ip address */
> >> @@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s,
> > CPUState *cpu)
> >>      }
> >>
> >>      prefixes = 0;
> >> -    rex_w = -1;
>
>
> r~

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid
  2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid Jan Bobek
@ 2019-08-15 15:01   ` Aleksandar Markovic
  2019-08-15 15:16     ` Richard Henderson
  2019-08-21  5:07     ` Jan Bobek
  0 siblings, 2 replies; 60+ messages in thread
From: Aleksandar Markovic @ 2019-08-15 15:01 UTC (permalink / raw)
  To: Jan Bobek; +Cc: Alex Bennée, Richard Henderson, qemu-devel

15.08.2019. 04.23, "Jan Bobek" <jan.bobek@gmail.com> је написао/ла:
>
> Introduce a helper function to take care of instruction CPUID checks.
>
> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
> ---

Jan, what is the origin of "CK"? If it is a QEMU internal thing, perhaps
use "CHECK".

The function should be called check_cpuid(), imho. I know, Richard would
like c_ci(), or simpler cc(), better.

Aleksandar

>  target/i386/translate.c | 48 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 48 insertions(+)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 6296a02991..0cffa2226b 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -4500,6 +4500,54 @@ static void gen_sse(CPUX86State *env, DisasContext
*s, int b)
>  #define tcg_gen_gvec_cmpgt(vece, dofs, aofs, bofs, oprsz, maxsz)        \
>      tcg_gen_gvec_cmp(TCG_COND_GT, vece, dofs, aofs, bofs, oprsz, maxsz)
>
> +typedef enum {
> +    CK_CPUID_MMX = 1,
> +    CK_CPUID_3DNOW,
> +    CK_CPUID_SSE,
> +    CK_CPUID_SSE2,
> +    CK_CPUID_CLFLUSH,
> +    CK_CPUID_SSE3,
> +    CK_CPUID_SSSE3,
> +    CK_CPUID_SSE4_1,
> +    CK_CPUID_SSE4_2,
> +    CK_CPUID_SSE4A,
> +    CK_CPUID_AVX,
> +    CK_CPUID_AVX2,
> +} CkCpuidFeat;
> +
> +static int ck_cpuid(CPUX86State *env, DisasContext *s, CkCpuidFeat feat)
> +{
> +    switch (feat) {
> +    case CK_CPUID_MMX:
> +        return !(s->cpuid_features & CPUID_MMX)
> +            || !(s->cpuid_ext2_features & CPUID_EXT2_MMX);
> +    case CK_CPUID_3DNOW:
> +        return !(s->cpuid_ext2_features & CPUID_EXT2_3DNOW);
> +    case CK_CPUID_SSE:
> +        return !(s->cpuid_features & CPUID_SSE);
> +    case CK_CPUID_SSE2:
> +        return !(s->cpuid_features & CPUID_SSE2);
> +    case CK_CPUID_CLFLUSH:
> +        return !(s->cpuid_features & CPUID_CLFLUSH);
> +    case CK_CPUID_SSE3:
> +        return !(s->cpuid_ext_features & CPUID_EXT_SSE3);
> +    case CK_CPUID_SSSE3:
> +        return !(s->cpuid_ext_features & CPUID_EXT_SSSE3);
> +    case CK_CPUID_SSE4_1:
> +        return !(s->cpuid_ext_features & CPUID_EXT_SSE41);
> +    case CK_CPUID_SSE4_2:
> +        return !(s->cpuid_ext_features & CPUID_EXT_SSE42);
> +    case CK_CPUID_SSE4A:
> +        return !(s->cpuid_ext3_features & CPUID_EXT3_SSE4A);
> +    case CK_CPUID_AVX:
> +        return !(s->cpuid_ext_features & CPUID_EXT_AVX);
> +    case CK_CPUID_AVX2:
> +        return !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2);
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
>  static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
>  {
>      enum {
> --
> 2.20.1
>
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid
  2019-08-15 15:01   ` Aleksandar Markovic
@ 2019-08-15 15:16     ` Richard Henderson
  2019-08-21  5:07     ` Jan Bobek
  1 sibling, 0 replies; 60+ messages in thread
From: Richard Henderson @ 2019-08-15 15:16 UTC (permalink / raw)
  To: Aleksandar Markovic, Jan Bobek; +Cc: Alex Bennée, qemu-devel

On August 15, 2019 4:01:33 PM GMT+01:00, Aleksandar Markovic
>
>The function should be called check_cpuid(), imho. I know, Richard
>would
>like c_ci(), or simpler cc(), better.

Now you're just playing the fool.  Cut it out.


r~


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h
  2019-08-15 10:07       ` Aleksandar Markovic
@ 2019-08-21  5:04         ` Jan Bobek
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-21  5:04 UTC (permalink / raw)
  To: Aleksandar Markovic, Richard Henderson; +Cc: Alex Bennée, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 649 bytes --]

On 8/15/19 6:07 AM, Aleksandar Markovic wrote:
> 
> 15.08.2019. 11.55, "Richard Henderson" <richard.henderson@linaro.org <mailto:richard.henderson@linaro.org>> је написао/ла:
>>
>> On 8/15/19 8:02 AM, Aleksandar Markovic wrote:
>> > A question for you: What about FISTTP, MONITOR, MWAIT, that belong to SSE3, but
>> > are not mentioned in this patch?
>> >
>>
>> They are also not vector instructions, which is the subject of this patch set.
>>
> 
> The subject of the patch and the patch set says "SSE3", not "vector", read it again.

Richard is right, I will clarify the series' subject and the
commit message.

-Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid
  2019-08-15 15:01   ` Aleksandar Markovic
  2019-08-15 15:16     ` Richard Henderson
@ 2019-08-21  5:07     ` Jan Bobek
  1 sibling, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-21  5:07 UTC (permalink / raw)
  To: Aleksandar Markovic; +Cc: Alex Bennée, Richard Henderson, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 646 bytes --]

On 8/15/19 11:01 AM, Aleksandar Markovic wrote:
> 
> 15.08.2019. 04.23, "Jan Bobek" <jan.bobek@gmail.com <mailto:jan.bobek@gmail.com>> је написао/ла:
>>
>> Introduce a helper function to take care of instruction CPUID checks.
>>
>> Signed-off-by: Jan Bobek <jan.bobek@gmail.com <mailto:jan.bobek@gmail.com>>
>> ---
> 
> Jan, what is the origin of "CK"? If it is a QEMU internal thing, perhaps use "CHECK".
> 
> The function should be called check_cpuid(), imho. I know, Richard would like c_ci(), or simpler cc(), better.

It was completely my initiative to name it like that. I'll rename
it to check_cpuid().

-Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w into DisasContext
  2019-08-15 10:19       ` Aleksandar Markovic
@ 2019-08-21  5:12         ` Jan Bobek
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Bobek @ 2019-08-21  5:12 UTC (permalink / raw)
  To: Aleksandar Markovic, Richard Henderson
  Cc: Alex Bennée, qemu-devel, Richard Henderson


[-- Attachment #1.1: Type: text/plain, Size: 2326 bytes --]

On 8/15/19 6:19 AM, Aleksandar Markovic wrote:
> 
> 15.08.2019. 11.55, "Richard Henderson" <richard.henderson@linaro.org <mailto:richard.henderson@linaro.org>> је написао/ла:
>>
>> On 8/15/19 8:30 AM, Aleksandar Markovic wrote:
>> >
>> > 15.08.2019. 04.13, "Jan Bobek" <jan.bobek@gmail.com <mailto:jan.bobek@gmail.com>
>> > <mailto:jan.bobek@gmail.com <mailto:jan.bobek@gmail.com>>> је написао/ла:
>> >>
>> >> From: Richard Henderson <rth@twiddle.net <mailto:rth@twiddle.net> <mailto:rth@twiddle.net <mailto:rth@twiddle.net>>>
>> >>
>> >> Treat this the same as we already do for other rex bits.
>> >>
>> >> Signed-off-by: Richard Henderson <rth@twiddle.net <mailto:rth@twiddle.net> <mailto:rth@twiddle.net <mailto:rth@twiddle.net>>>
>> >> ---
>> >>  target/i386/translate.c | 19 +++++++++++--------
>> >>  1 file changed, 11 insertions(+), 8 deletions(-)
>> >>
>> >> diff --git a/target/i386/translate.c b/target/i386/translate.c
>> >> index d74dbfd585..c0866c2797 100644
>> >> --- a/target/i386/translate.c
>> >> +++ b/target/i386/translate.c
>> >> @@ -44,11 +44,13 @@
>> >>  #define REX_X(s) ((s)->rex_x)
>> >>  #define REX_B(s) ((s)->rex_b)
>> >>  #define REX_R(s) ((s)->rex_r)
>> >> +#define REX_W(s) ((s)->rex_w)
>> >>  #else
>> >>  #define CODE64(s) 0
>> >>  #define REX_X(s) 0
>> >>  #define REX_B(s) 0
>> >>  #define REX_R(s) 0
>> >> +#define REX_W(s) -1
>> >
>> > The commit message says "treat rex_w the same as other rex bits". Why is then
>> > REX_W() treated differently here?
>>
>> "Treated the same" in terms of being referenced by a macro instead of a local
>> variable.  As for the -1, if you look at the rest of the patch you can clearly
>> see it preserves existing behaviour.
>>
> 
> That is exactly what I dislike about your commit messages: they often introduce ambiguity, without any real need, and with really bad consequences to the reader. Is adding "in terms of being referenced by a macro instead of a local
> variable" to the commit message that hard?
> 
> When writing commit messages, you need to try to put yourself in the shoes of the reader.

FWIW, personally I don't find it confusing. I think even just the
first couple of lines of the patch make it quite clear what's
going on. Just my 2 cents.

-Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2019-08-21  5:13 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-15  2:08 [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 01/46] target/i386: Push rex_r into DisasContext Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 02/46] target/i386: Push rex_w " Jan Bobek
2019-08-15  7:30   ` Aleksandar Markovic
2019-08-15  9:55     ` Richard Henderson
2019-08-15 10:19       ` Aleksandar Markovic
2019-08-21  5:12         ` Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 03/46] target/i386: reduce scope of variable aflag Jan Bobek
2019-08-15  7:16   ` Aleksandar Markovic
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 04/46] target/i386: use dflag from DisasContext Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 05/46] target/i386: use prefix " Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 06/46] target/i386: Simplify gen_exception arguments Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 07/46] target/i386: use pc_start from DisasContext Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 08/46] target/i386: make variable b1 const Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 09/46] target/i386: make variable is_xmm const Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 10/46] target/i386: add vector register file alignment constraints Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 11/46] target/i386: introduce gen_(ld, st)d_env_A0 Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 12/46] target/i386: introduce gen_sse_ng Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 13/46] target/i386: disable unused function warning temporarily Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 14/46] target/i386: introduce mnemonic aliases for several gvec operations Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 15/46] target/i386: introduce function ck_cpuid Jan Bobek
2019-08-15 15:01   ` Aleksandar Markovic
2019-08-15 15:16     ` Richard Henderson
2019-08-21  5:07     ` Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 16/46] target/i386: introduce instruction operand infrastructure Jan Bobek
2019-08-15  2:08 ` [Qemu-devel] [RFC PATCH v3 17/46] target/i386: introduce generic operand alias Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 18/46] target/i386: introduce generic either-or operand Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 19/46] target/i386: introduce generic load-store operand Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 20/46] target/i386: introduce tcg_temp operands Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 21/46] target/i386: introduce modrm operand Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 22/46] target/i386: introduce operands for decoding modrm fields Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 23/46] target/i386: introduce operand for direct-only r/m field Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 24/46] target/i386: introduce operand vex_v Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 25/46] target/i386: introduce Ib (immediate) operand Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 26/46] target/i386: introduce M* (memptr) operands Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 27/46] target/i386: introduce G*, R*, E* (general register) operands Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 28/46] target/i386: introduce P*, N*, Q* (MMX) operands Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 29/46] target/i386: introduce H*, V*, U*, W* (SSE/AVX) operands Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 30/46] target/i386: introduce code generators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 31/46] target/i386: introduce helper-based code generator macros Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 32/46] target/i386: introduce gvec-based " Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 33/46] target/i386: introduce sse-opcode.inc.h Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 34/46] target/i386: introduce instruction translator macros Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 35/46] target/i386: introduce MMX translators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 36/46] target/i386: introduce MMX code generators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 37/46] target/i386: introduce MMX instructions to sse-opcode.inc.h Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 38/46] target/i386: introduce SSE translators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 39/46] target/i386: introduce SSE code generators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 40/46] target/i386: introduce SSE instructions to sse-opcode.inc.h Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 41/46] target/i386: introduce SSE2 translators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 42/46] target/i386: introduce SSE2 code generators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 43/46] target/i386: introduce SSE2 instructions to sse-opcode.inc.h Jan Bobek
2019-08-15  8:00   ` Aleksandar Markovic
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 44/46] target/i386: introduce SSE3 translators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 45/46] target/i386: introduce SSE3 code generators Jan Bobek
2019-08-15  2:09 ` [Qemu-devel] [RFC PATCH v3 46/46] target/i386: introduce SSE3 instructions to sse-opcode.inc.h Jan Bobek
     [not found]   ` <CAL1e-=gZF1+=Gduqm4TwS0p-G6rvb4q+rw+hL9nzAz3P-r3+BQ@mail.gmail.com>
2019-08-15  9:49     ` Richard Henderson
2019-08-15 10:07       ` Aleksandar Markovic
2019-08-21  5:04         ` Jan Bobek
2019-08-15  3:06 ` [Qemu-devel] [RFC PATCH v3 00/46] rewrite MMX/SSE/SSE2/SSE3 instruction translation no-reply

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).